Jonatan Mata, jonmatum.com
© 2026 Jonatan Mata. All rights reserved.

AWS Well-Architected Framework

AWS framework with six pillars of best practices for designing and operating reliable, secure, efficient, and cost-effective cloud systems.

Status: evergreen. Tags: #aws #well-architected #best-practices #cloud #architecture #pillars

What it is

The AWS Well-Architected Framework is a set of best practices organized into six pillars for evaluating and improving cloud architectures. Teams can use it much like an AWS-specific maturity model, measuring their workloads against AWS's documented best practices.

The framework provides a common language for discussing architectural trade-offs and offers concrete tools to identify improvement areas. It's not a rigid methodology, but a set of guiding questions that help make informed decisions about architecture, operations, and resource optimization.

The six fundamental pillars

Operational excellence

Focuses on running and monitoring systems to deliver business value and continuously improve processes and procedures.

Implementation examples:

  • Deployment automation: Use AWS CodePipeline with CloudFormation for consistent deployments and automatic rollbacks
  • Proactive observability: Implement CloudWatch dashboards with alerts based on business metrics, not just technical ones
  • Automated runbooks: Create Systems Manager automation documents for common incident responses
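
The observability bullet can be illustrated with an alarm on a business metric rather than a technical one. This is a minimal sketch, assuming the application already publishes a custom metric; the namespace `Shop/Business`, the metric name `CheckoutConversionRate`, the alarm name, and the SNS topic ARN are hypothetical. The dictionary keys are parameters of CloudWatch's PutMetricAlarm API.

```python
from typing import Any, Dict


def conversion_alarm_params(threshold_pct: float, sns_topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for a CloudWatch alarm on a business metric."""
    return {
        "AlarmName": "checkout-conversion-low",   # hypothetical name
        "Namespace": "Shop/Business",             # hypothetical custom namespace
        "MetricName": "CheckoutConversionRate",   # hypothetical custom metric
        "Statistic": "Average",
        "Period": 300,                            # seconds per datapoint
        "EvaluationPeriods": 3,                   # 3 x 5 min = 15 min below threshold
        "Threshold": threshold_pct,
        "ComparisonOperator": "LessThanThreshold",
        "TreatMissingData": "breaching",          # no data is treated as a problem
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical topic ARN for illustration only.
params = conversion_alarm_params(2.0, "arn:aws:sns:us-east-1:123456789012:oncall")
print(params["AlarmName"], params["Threshold"])
```

With boto3 installed and credentials configured, these arguments could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. Treating missing data as breaching is a design choice: a business metric that stops arriving is often a symptom in itself.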

Security

Protects information, systems, and assets while delivering business value through risk assessments and mitigation strategies.

Implementation examples:

  • Least privilege principle: Use AWS IAM roles with function-specific policies and automatic credential rotation
  • Encryption in transit and at rest: Enforce TLS for data in transit, and use AWS KMS with workload-specific keys and default encryption in S3 for data at rest
  • Threat detection: Configure GuardDuty with Security Hub for event correlation and automated response
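
As an illustration of the least privilege bullet, the sketch below builds an IAM policy document that lets one function read a single DynamoDB table, and adds a small check that flags overly broad grants. The table ARN and the helper names `table_read_policy` and `has_broad_grants` are assumptions made for this example, not part of any AWS SDK.

```python
import json
from typing import Any, Dict, List


def table_read_policy(table_arn: str) -> Dict[str, Any]:
    """Return an IAM policy document that only allows reads on one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadSingleTable",
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                "Resource": [table_arn, f"{table_arn}/index/*"],
            }
        ],
    }


def _as_list(value: Any) -> List[str]:
    """IAM allows a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def has_broad_grants(policy: Dict[str, Any]) -> bool:
    """True if an Allow statement grants every action of a service or every resource."""
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        if any(a == "*" or a.endswith(":*") for a in actions):
            return True
        if "*" in resources:
            return True
    return False


# Hypothetical table ARN for illustration only.
policy = table_read_policy("arn:aws:dynamodb:us-east-1:123456789012:table/orders")
print(json.dumps(policy, indent=2))
print(has_broad_grants(policy))
```

A check like this is deliberately crude. In practice, IAM Access Analyzer or a policy linter would give more complete coverage.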

Reliability

The ability of a workload to perform its intended function correctly and consistently when expected.

Implementation examples:

  • Automatic recovery: Use Auto Scaling groups with custom health checks across multiple Availability Zones
  • Backup and restore: Implement AWS Backup with automatic retention policies and scheduled restore testing
  • Circuit breakers: Use AWS Step Functions retries with exponential backoff and a fallback to alternative services
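
The retry-and-fallback idea in the last bullet can be sketched as a Step Functions task state expressed as a Python dictionary. `Retry`, `Catch`, `IntervalSeconds`, `MaxAttempts`, and `BackoffRate` are fields of the Amazon States Language; the function ARN and the fallback state name are hypothetical. Retries plus a catch cover only part of the pattern: a full circuit breaker would also track failure state and stop calling the failing service for a while.

```python
from typing import Any, Dict, List


def task_with_retry_and_fallback(function_arn: str, fallback_state: str) -> Dict[str, Any]:
    """Build a Task state that retries with backoff, then routes to a fallback state."""
    return {
        "Type": "Task",
        "Resource": function_arn,
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed", "States.Timeout"],
                "IntervalSeconds": 2,   # wait before the first retry
                "MaxAttempts": 4,       # maximum number of retries
                "BackoffRate": 2.0,     # multiplier applied to each later wait
            }
        ],
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": fallback_state}],
        "End": True,
    }


def retry_delays(retrier: Dict[str, Any]) -> List[float]:
    """Seconds waited before each retry attempt, following the backoff rule."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


# Hypothetical ARN and state name for illustration only.
state = task_with_retry_and_fallback(
    "arn:aws:lambda:us-east-1:123456789012:function:charge-card",
    "UseFallbackService",
)
print(retry_delays(state["Retry"][0]))  # [2.0, 4.0, 8.0, 16.0]
```

The delay list makes the worst case visible: with these values a failing call waits 30 seconds across all retries before the fallback runs.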

Performance efficiency

Using computing resources efficiently to meet system requirements and maintain that efficiency as demand changes.

Implementation examples:

  • Dynamic right-sizing: Use Compute Optimizer recommendations to right-size compute, AWS Lambda for variable workloads, and Reserved Instances for predictable loads
  • Intelligent caching: Implement ElastiCache with TTL based on access patterns and CloudFront for static content
  • Serverless architecture: Migrate processing functions to Lambda with DynamoDB for automatic scalability
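
The caching bullet describes a cache-aside pattern with a time to live (TTL). The sketch below shows the logic with an in-process dictionary standing in for ElastiCache; the class name `TTLCache` and the injectable clock are assumptions for this example. With a real Redis-compatible cache, the expiry would be set on the key instead of being tracked in Python.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache-aside helper: return a fresh cached value or load and store it."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # cache hit, still fresh
        value = loader(key)                      # miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value


def load_product(sku: str) -> Dict[str, str]:
    """Hypothetical stand-in for a database read."""
    return {"sku": sku, "name": f"Product {sku}"}


cache = TTLCache(ttl_seconds=60)
print(cache.get_or_load("sku-1", load_product))  # loads from the source
print(cache.get_or_load("sku-1", load_product))  # served from the cache
```

Choosing the TTL from access patterns means trading staleness for load: frequently read, rarely changed items tolerate a long TTL, while prices or stock levels usually need a short one.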

Cost optimization

Running systems to deliver business value at the lowest price point possible.

Implementation examples:

  • Instance strategy: Combine On-Demand, Reserved Instances, and Spot Instances based on workload criticality (for example, 20% / 60% / 20%)
  • Storage lifecycle: Use S3 Intelligent-Tiering, plus lifecycle rules that transition archival data to S3 Glacier storage classes
  • Proactive monitoring: Implement Cost Anomaly Detection with automatic alerts and AWS Budgets with corrective actions
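
To see what a purchasing mix like 20% On-Demand, 60% Reserved, and 20% Spot does to the bill, the arithmetic can be sketched as a blended hourly rate. The On-Demand price and the discount percentages below are assumptions chosen for illustration, not AWS pricing; real discounts vary by instance type, term, and Spot market conditions.

```python
import math
from typing import Dict


def blended_hourly_cost(
    on_demand_rate: float, mix: Dict[str, float], discounts: Dict[str, float]
) -> float:
    """Weighted hourly cost per instance for a mix of purchasing models."""
    if not math.isclose(sum(mix.values()), 1.0):
        raise ValueError("mix shares must sum to 1.0")
    return sum(
        share * on_demand_rate * (1.0 - discounts[model])
        for model, share in mix.items()
    )


mix = {"on_demand": 0.20, "reserved": 0.60, "spot": 0.20}
# Assumed discounts relative to the On-Demand price (illustrative only).
discounts = {"on_demand": 0.0, "reserved": 0.40, "spot": 0.70}

cost = blended_hourly_cost(0.10, mix, discounts)  # assumed $0.10/hour On-Demand
savings = 1.0 - cost / 0.10
print(f"blended: ${cost:.3f}/hour, savings: {savings:.0%}")
```

Under these assumed numbers the blend comes to about $0.062 per hour, roughly 38% below running everything On-Demand.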

Sustainability

Minimizing the environmental impacts of running workloads in the cloud.

Implementation examples:

  • Efficient processors: Migrate to Graviton3 instances, which AWS states use up to 60% less energy than comparable EC2 instances for the same performance
  • Utilization optimization: Use Spot Instances for batch workloads and shut down non-critical resources outside business hours
  • Green regions: Select AWS regions with higher renewable energy percentage for non-latency-sensitive workloads
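
Shutting down non-critical resources outside business hours comes down to a schedule check and some simple arithmetic. This is a sketch under assumed hours (weekdays, 08:00 to 20:00 local time); the function names are hypothetical, and the start and stop calls themselves are left out.

```python
from datetime import datetime

HOURS_PER_WEEK = 7 * 24  # 168


def should_run(now: datetime, start_hour: int = 8, end_hour: int = 20) -> bool:
    """True on weekdays between start_hour (inclusive) and end_hour (exclusive)."""
    return now.weekday() < 5 and start_hour <= now.hour < end_hour


def weekly_running_hours(start_hour: int = 8, end_hour: int = 20, workdays: int = 5) -> int:
    """Hours per week a resource stays up under the schedule."""
    return (end_hour - start_hour) * workdays


hours = weekly_running_hours()
reduction = 1 - hours / HOURS_PER_WEEK
print(f"{hours} of {HOURS_PER_WEEK} hours, {reduction:.0%} fewer running hours")
```

With these assumed hours a development environment runs 60 of 168 hours per week, about 64% less than an always-on one, which reduces both energy use and cost.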

Well-Architected review process

Who participates?

  • Solutions architect (review leader)
  • Workload owner (product owner or tech lead)
  • Operations engineer (SRE or DevOps)
  • Security specialist (for critical workloads)
  • Finance representative (for cost analysis)

Recommended cadence

  • New workloads: Before detailed design and before production
  • Existing workloads: Every 6-12 months or after significant architectural changes
  • Critical workloads: Quarterly with monthly light reviews
  • Post-incident: Within 2 weeks after major incidents
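
The cadence above can be turned into a small scheduling rule. The sketch below assumes the upper bound of 12 months (365 days) for existing workloads, 91 days for quarterly reviews, and 14 days for the post-incident window; the names `REVIEW_INTERVAL_DAYS` and `next_review_due` are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

# Assumed day counts derived from the cadence list.
REVIEW_INTERVAL_DAYS = {"existing": 365, "critical": 91}
POST_INCIDENT_DAYS = 14


def next_review_due(
    kind: str, last_review: date, last_major_incident: Optional[date] = None
) -> date:
    """Return the date by which the next review should happen."""
    due = last_review + timedelta(days=REVIEW_INTERVAL_DAYS[kind])
    # A major incident after the last review pulls the deadline forward.
    if last_major_incident is not None and last_major_incident > last_review:
        due = min(due, last_major_incident + timedelta(days=POST_INCIDENT_DAYS))
    return due


print(next_review_due("critical", date(2024, 1, 1)))
print(next_review_due("existing", date(2024, 1, 1), date(2024, 2, 1)))
```

Significant architectural changes are not modeled here. They would need a similar trigger that brings the next review forward.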

Process phases

  1. Preparation (1-2 weeks): Gather documentation, current metrics, and identify stakeholders
  2. Review (4-6 hours): Collaborative session using the AWS Well-Architected Tool and its guiding questions
  3. Analysis (1 week): Prioritize findings based on business impact and technical effort
  4. Action plan (2 weeks): Create roadmap with specific timelines and owners
  5. Follow-up (ongoing): Monthly progress reviews and plan adjustments
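
The analysis phase prioritizes findings by business impact and technical effort. One simple way to do that, shown here as a sketch, is to rank by the ratio of impact to effort. The 1 to 5 scales, the `Finding` fields, and the sample findings are assumptions for this example, not part of the Well-Architected Tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    title: str
    pillar: str
    business_impact: int  # assumed scale: 1 (low) to 5 (high)
    effort: int           # assumed scale: 1 (small) to 5 (large)


def prioritize(findings: List[Finding]) -> List[Finding]:
    """Order findings by impact-to-effort ratio, highest first; ties by title."""
    return sorted(findings, key=lambda f: (-f.business_impact / f.effort, f.title))


# Hypothetical findings from a review session.
findings = [
    Finding("No restore testing for backups", "Reliability", 4, 4),
    Finding("Root account without MFA", "Security", 5, 1),
    Finding("Unused EBS volumes", "Cost optimization", 3, 1),
]

for finding in prioritize(findings):
    print(f"{finding.business_impact / finding.effort:.1f}  {finding.pillar}: {finding.title}")
```

A ratio favors quick wins. Teams that must address high-risk items regardless of effort may prefer to sort by impact first and use effort only as a tie-breaker.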

Specialized lenses

Serverless Lens

Specific focus for serverless architectures that emphasizes:

  • Event-driven design: Optimization of EventBridge and SQS for decoupling
  • Cold start optimization: Warming strategies and provisioned concurrency in Lambda
  • Distributed observability: X-Ray tracing for debugging complex flows
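
For the cold start bullet, a common starting estimate for provisioned concurrency is the request rate multiplied by the average duration, plus some headroom. The sketch below uses integer arithmetic to avoid rounding surprises. The traffic figures and the 10% headroom are assumptions, and the function name is hypothetical.

```python
def provisioned_concurrency(
    requests_per_second: int, avg_duration_ms: int, headroom_pct: int = 10
) -> int:
    """Estimate concurrent executions: rate x duration, plus headroom, rounded up."""
    numerator = requests_per_second * avg_duration_ms * (100 + headroom_pct)
    # Divide by 1000 (ms to s) and by 100 (percent), rounding up with integers.
    return -(-numerator // 100_000)


# Assumed traffic: 200 requests/second at 250 ms average duration.
print(provisioned_concurrency(200, 250))
```

For the assumed traffic this gives 55 concurrent executions: 50 from the rate and duration, plus 10% headroom. Measured concurrency metrics should replace the estimate once the function is live.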

SaaS Lens

Specific considerations for multi-tenant applications:

  • Tenant isolation: Isolation strategies at data, compute, and network levels
  • Billing and metering: Implementation of cost allocation tags and usage tracking
  • Onboarding automation: Automatic resource provisioning per tenant
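
Data-level tenant isolation and cost allocation can be sketched for a shared DynamoDB table. The example assumes a pooled table whose partition key value is the tenant ID; `dynamodb:LeadingKeys` is an IAM condition key that restricts access to items with matching partition key values. The tag keys `TenantId` and `Tier`, the ARN, and the helper names are hypothetical.

```python
from typing import Any, Dict, List


def tenant_scoped_policy(table_arn: str, tenant_id: str) -> Dict[str, Any]:
    """IAM policy limiting access to items whose partition key equals tenant_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
                "Resource": table_arn,
                "Condition": {
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": [tenant_id]
                    }
                },
            }
        ],
    }


def cost_allocation_tags(tenant_id: str, tier: str) -> List[Dict[str, str]]:
    """Tags to attach to per-tenant resources so spend can be grouped by tenant."""
    return [
        {"Key": "TenantId", "Value": tenant_id},  # hypothetical tag key
        {"Key": "Tier", "Value": tier},           # hypothetical tag key
    ]


# Hypothetical table ARN and tenant for illustration only.
policy = tenant_scoped_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/saas-data", "tenant-42"
)
print(policy["Statement"][0]["Condition"])
print(cost_allocation_tags("tenant-42", "premium"))
```

Generating the policy during onboarding ties all three bullets together: the same tenant ID drives isolation, tagging, and provisioning.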

Decision table: pillar prioritization

Workload type         | Primary pillar            | Secondary pillar | Rationale
Startup MVP           | Cost → Performance        | Reliability      | Optimize burn rate, iterate quickly
Critical e-commerce   | Reliability → Security    | Performance      | Downtime = direct revenue loss
Financial application | Security → Reliability    | Operational      | Strict compliance and regulation
Batch workload        | Cost → Sustainability     | Performance      | Non-time-sensitive processing
Public API            | Performance → Reliability | Security         | Critical user experience
Internal application  | Operational → Cost        | Performance      | Development team efficiency
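
The decision table can also be encoded as a lookup so that a review agenda starts with the pillars that matter most for a workload. This sketch copies the table's rows; the names `PRIORITIES` and `review_order` are hypothetical, and "Operational" is written out as "Operational excellence".

```python
from typing import Dict, List, Tuple

# workload type -> (primary pillars in order, secondary pillar)
PRIORITIES: Dict[str, Tuple[List[str], str]] = {
    "Startup MVP": (["Cost", "Performance"], "Reliability"),
    "Critical e-commerce": (["Reliability", "Security"], "Performance"),
    "Financial application": (["Security", "Reliability"], "Operational excellence"),
    "Batch workload": (["Cost", "Sustainability"], "Performance"),
    "Public API": (["Performance", "Reliability"], "Security"),
    "Internal application": (["Operational excellence", "Cost"], "Performance"),
}


def review_order(workload_type: str) -> List[str]:
    """Pillars to review first for a workload type, primary before secondary."""
    primary, secondary = PRIORITIES[workload_type]
    return primary + [secondary]


print(review_order("Critical e-commerce"))
```

The remaining pillars still get reviewed. The ordering only decides where the session spends its time first.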

Practical example: e-commerce architecture

Consider an e-commerce platform with React frontend, API Gateway, Lambda functions, DynamoDB, and S3:

Operational excellence: CI/CD with blue-green deployments, monitoring business metrics (conversion, checkout time)

Security: WAF on CloudFront, PII encryption in DynamoDB, granular IAM roles per function

Reliability: Multi-AZ deployment, DynamoDB Global Tables, S3 Cross-Region Replication for critical assets

Performance: CloudFront for static assets, DynamoDB Accelerator (DAX) for product cache, Lambda provisioned concurrency for critical APIs

Cost: S3 Intelligent-Tiering for images, Spot Instances for analytics processing, reserved capacity for DynamoDB

Sustainability: Arm-based (Graviton) architecture for Lambda functions, lifecycle policies for logs, regions with renewable energy

Why it matters

The Well-Architected Framework is the de facto standard for evaluating architectures on AWS. Its six pillars provide a common language for discussing architectural trade-offs and establishing clear priorities. For engineering teams, it represents the difference between ad-hoc architectures and systems designed with strategic intent. The framework not only identifies problems but provides a clear roadmap for continuous improvement, connecting technical decisions with business objectives.
