AWS object storage service with 99.999999999% durability, unlimited scalability, and multiple storage classes for cost optimization.
Amazon S3 (Simple Storage Service) is AWS's object storage service, offering eleven nines of durability (99.999999999%) and 99.99% availability. It stores any amount of data — from bytes to petabytes — with HTTP/HTTPS access and REST APIs. It's the foundation of countless AWS architectures, from data lakes to static content distribution.
S3 organizes data into buckets (containers) and objects (files with metadata). Each object can be up to 5 TB and is identified by a unique key within its bucket. The service replicates data automatically; versioning and cross-region replication are available as opt-in features.
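As a concrete sketch of the bucket/object model, here is a minimal boto3 example. The bucket name and key layout are hypothetical; note that a single PUT accepts objects up to 5 GB, while the 5 TB maximum requires multipart upload:

```python
def build_key(prefix: str, filename: str) -> str:
    """Compose an object key; S3 has no real folders — the slash-delimited
    key is the only hierarchy within a bucket."""
    return f"{prefix.rstrip('/')}/{filename}"


def upload_object(bucket: str, key: str, body: bytes) -> None:
    """Store an object with a single PUT (up to 5 GB; larger objects,
    up to 5 TB, need multipart upload)."""
    import boto3  # imported here so build_key works without the SDK installed
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body)
```

The key prefix (`reports/2024/` and the like) is what lifecycle rules and event filters later match against, so choosing a consistent key scheme up front pays off.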
S3's distributed architecture enables virtually unlimited scalability without manual intervention. Data is automatically replicated across multiple availability zones within a region, ensuring durability and availability even during hardware failures or natural disasters.
S3 offers multiple storage classes optimized for different access patterns:
| Class | Availability | Typical use | Savings vs Standard |
|---|---|---|---|
| Standard | 99.99% | Frequent access | Base |
| Intelligent-Tiering | 99.9% | Variable access | Up to 68% (automatic) |
| Standard-IA | 99.9% | Infrequent access | Up to 40% + retrieval |
| One Zone-IA | 99.5% | Recreatable data | 20% less than Standard-IA |
| Glacier Instant | 99.9% | Archives with instant access | Up to 68% vs Standard-IA |
| Glacier Flexible | 99.99% | Archives, retrieval in minutes to 12 hours | Up to 90% vs Standard |
| Glacier Deep Archive | 99.99% | Archives, retrieval within 12-48 hours | Up to 95% vs Standard |
Intelligent-Tiering automatically monitors access patterns and moves objects between frequent and infrequent access tiers. It charges a small monitoring fee but can generate significant savings on workloads with unpredictable access patterns.
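You can opt an object into Intelligent-Tiering (or any other class) at upload time via the `StorageClass` parameter. A sketch with boto3 — the helper and bucket name are illustrative, but the class constants are the ones the S3 API accepts:

```python
# StorageClass values accepted by put_object (relevant subset).
STORAGE_CLASSES = {
    "STANDARD", "INTELLIGENT_TIERING", "STANDARD_IA",
    "ONEZONE_IA", "GLACIER_IR", "GLACIER", "DEEP_ARCHIVE",
}


def upload_with_class(bucket: str, key: str, body: bytes,
                      storage_class: str = "INTELLIGENT_TIERING") -> None:
    """Upload an object directly into the given storage class."""
    if storage_class not in STORAGE_CLASSES:
        raise ValueError(f"unknown storage class: {storage_class}")
    import boto3  # imported here so the validation works without the SDK
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=body, StorageClass=storage_class)
```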
Lifecycle policies automate object transitions between storage classes and deletion:
```json
{
  "Rules": [
    {
      "ID": "DataArchiving",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "logs/"
      },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}
```

This policy moves logs to Standard-IA after 30 days, to Glacier after 90 days, to Deep Archive after one year, and deletes them after 7 years (2,555 days).
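A policy like this can be attached programmatically. A sketch with boto3, assuming a hypothetical bucket name — the payload mirrors the JSON shown above:

```python
def lifecycle_config(prefix: str = "logs/") -> dict:
    """The archiving policy above, expressed as a boto3 payload."""
    return {
        "Rules": [{
            "ID": "DataArchiving",
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
            "Expiration": {"Days": 2555},  # ~7 years
        }]
    }


def apply_lifecycle(bucket: str) -> None:
    """Attach the policy to a bucket (bucket name is hypothetical)."""
    import boto3  # imported here so lifecycle_config works without the SDK
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=lifecycle_config())
```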
S3 can send notifications when specific events occur:
```json
{
  "LambdaFunctionConfigurations": [
    {
      "Id": "ProcessImageUpload",
      "LambdaFunctionArn": "arn:aws:lambda:region:account:function:ProcessImage",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            { "Name": "prefix", "Value": "images/" },
            { "Name": "suffix", "Value": ".jpg" }
          ]
        }
      }
    }
  ]
}
```

Common patterns include triggering Lambda functions to process uploads (as above), sending events to SQS queues for decoupled processing, and fanning notifications out through SNS or EventBridge.
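On the receiving side, the Lambda function gets the notification as a JSON event. A minimal handler sketch that extracts the uploaded objects — the record layout follows the S3 event message format, the handler logic itself is illustrative:

```python
import urllib.parse


def handler(event: dict, context=None) -> list:
    """Return 'bucket/key' for each object in an S3 notification event."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the event payload.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        results.append(f"{bucket}/{key}")
    return results
```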
S3 Select enables running simple SQL queries directly on CSV, JSON, and Parquet objects without downloading the entire file:
```sql
SELECT s.name, s.age FROM s3object s
WHERE s.age > 25 AND s.department = 'Engineering'
```

This significantly reduces data transfer costs and improves performance for exploratory analytics. It's especially useful for filtering large CSV or JSON logs before download, ad-hoc inspection of data lake objects, and lightweight queries where spinning up Athena or EMR would be overkill.
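The same query can be issued from code. A hedged sketch with boto3's `select_object_content` — bucket and key are hypothetical, and the input serialization assumes a CSV with a header row:

```python
def select_params(bucket: str, key: str, expression: str) -> dict:
    """Request parameters for select_object_content on a CSV object
    that has a header row."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def run_select(bucket: str, key: str, expression: str) -> bytes:
    """Execute the query; the response is an event stream whose
    'Records' events carry the matching rows."""
    import boto3  # imported here so select_params works without the SDK
    resp = boto3.client("s3").select_object_content(
        **select_params(bucket, key, expression))
    return b"".join(ev["Records"]["Payload"]
                    for ev in resp["Payload"] if "Records" in ev)
```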
Access control:
- IAM policies and bucket policies define who can perform which actions, following least privilege.
- Block Public Access settings prevent accidental public exposure at the account or bucket level.
- Presigned URLs grant temporary, scoped access to individual objects.

Encryption:
- Server-side encryption with S3-managed keys (SSE-S3) or KMS keys (SSE-KMS), enforceable as a bucket default.
- Client-side encryption when keys must never leave your control.
- TLS for all data in transit.

Monitoring and auditing:
- CloudTrail records API-level activity; server access logs capture request-level detail.
- CloudWatch metrics and S3 Storage Lens surface usage and cost trends.

Backup and recovery:
- Versioning preserves and restores overwritten or deleted objects.
- Object Lock provides WORM (write-once-read-many) protection for compliance.
- Cross-Region Replication keeps an independent copy in another region.
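Two of these protections — versioning and default encryption — take a single API call each to enable. A sketch with boto3, assuming a hypothetical bucket name:

```python
def default_encryption_rule(algorithm: str = "AES256") -> dict:
    """Bucket-default SSE rule: 'AES256' is SSE-S3; 'aws:kms' selects SSE-KMS."""
    return {"Rules": [{"ApplyServerSideEncryptionByDefault":
                       {"SSEAlgorithm": algorithm}}]}


def harden_bucket(bucket: str) -> None:
    """Enable versioning and default encryption on a bucket."""
    import boto3  # imported here so the rule builder works without the SDK
    s3 = boto3.client("s3")
    s3.put_bucket_versioning(
        Bucket=bucket, VersioningConfiguration={"Status": "Enabled"})
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=default_encryption_rule())
```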
S3 integrates natively with AWS services to create robust architectures: CloudFront for global content delivery, Lambda for event-driven processing, Athena for in-place SQL analytics, Glue for ETL and cataloging, and EventBridge for routing object events across services.
S3 is AWS's most fundamental service — not just for its 11 nines of durability, but for its role as the backbone of virtually every cloud architecture. As a staff engineer, mastering S3 means understanding how to optimize costs through storage classes, implement defense-in-depth security, and design data pipelines that scale.
The difference between basic and expert S3 usage can represent significant savings in storage costs — for example, Glacier Deep Archive costs up to 95% less than Standard. Misconfigured lifecycle policies are one of the main causes of AWS cost overruns. Intelligent-Tiering, S3 Select, and event notifications are tools that separate amateur from enterprise-grade architectures.