Jonatan Mata · jonmatum.com
© 2026 Jonatan Mata. All rights reserved. v2.1.1
Concepts

Cost Optimization

Practices and strategies to minimize cloud spending without sacrificing performance, including right-sizing, reservations, spot instances, and eliminating idle resources.

evergreen · #cost-optimization #finops #cloud #aws #savings #efficiency

What it is

Cloud cost optimization is the continuous process of reducing spending without negatively impacting performance or availability. It's one of the pillars of the AWS Well-Architected Framework and a discipline known as FinOps (Financial Operations).

Unlike traditional cost reduction, cloud optimization requires a dynamic and automated approach. Resources can scale up or down based on demand, prices change constantly, and new services offer better cost-performance ratios. This complexity makes optimization a shared responsibility between engineering, operations, and finance.

The goal isn't simply to spend less, but to maximize the value obtained per dollar invested. This means finding the optimal balance between cost, performance, availability, and user experience.

FinOps framework

FinOps is an operational framework that combines systems, best practices, and culture to increase an organization's ability to understand cloud costs and make informed business decisions. It's structured in three iterative phases:

Inform

  • Real-time visibility into spending and usage
  • Accurate cost allocation by team, project, or application
  • Benchmarking and trend analysis

Optimize

  • Right-sizing resources based on actual metrics
  • Selecting appropriate purchase models
  • Eliminating waste and orphaned resources

Operate

  • Automating optimization policies
  • Continuous governance and proactive alerts
  • Building financial accountability culture in engineering teams

Optimization strategies by service

EC2 and Compute

Right-sizing: Analyze CPU, memory, network, and storage metrics to adjust instance sizes. AWS Compute Optimizer provides recommendations based on CloudWatch historical data.
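As a minimal sketch of the right-sizing idea (the 20% headroom and power-of-two sizing are simplifying assumptions for illustration, not Compute Optimizer's actual model):

```python
def rightsize(current_vcpus: int, p95_cpu_pct: float, headroom: float = 0.2) -> int:
    """Suggest a vCPU count so p95 utilization stays below (1 - headroom).

    Illustrative heuristic only; the real recommendation engine also weighs
    memory, network, and EBS throughput from CloudWatch history.
    """
    # Capacity actually needed at peak, plus a safety margin
    needed = current_vcpus * (p95_cpu_pct / 100) / (1 - headroom)
    # Instance families roughly double in size (1, 2, 4, 8, ...);
    # round up to the next power of two
    size = 1
    while size < needed:
        size *= 2
    return size

# A 16-vCPU instance peaking at 18% CPU fits comfortably on 4 vCPUs
print(rightsize(16, 18))  # 4
```

The same function recommends upsizing when utilization is high: `rightsize(8, 90)` returns 16.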

Purchase models:

Model              | Typical discount | Commitment | Use case
On-demand          | 0%               | None       | Unpredictable workloads, development
Savings Plans      | 30-70%           | 1-3 years  | Consistent usage, instance flexibility
Reserved Instances | 30-70%           | 1-3 years  | Stable workloads, specific instances
Spot Instances     | 60-90%           | None      | Interruption-tolerant workloads

Auto Scaling: Configure policies that scale resources based on actual demand, not estimates.

Serverless

The serverless model aligns cost with actual usage by design:

  • Pay per invocation and actual execution duration
  • Scales to zero when there's no traffic
  • No idle infrastructure cost
  • Memory and timeout optimization to reduce per-invocation costs
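The per-invocation billing model above can be sketched as a simple estimator. The rates match AWS's published us-east-1 x86 Lambda pricing at the time of writing, but verify current per-region prices before relying on them; the free tier is ignored:

```python
GB_SECOND_RATE = 0.0000166667     # USD per GB-second (us-east-1 x86; verify)
REQUEST_RATE = 0.20 / 1_000_000   # USD per request (verify current pricing)

def lambda_monthly_cost(invocations: int, avg_duration_ms: float,
                        memory_mb: int) -> float:
    """Estimate monthly Lambda cost: compute (GB-seconds) plus requests."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * GB_SECOND_RATE + invocations * REQUEST_RATE

# 10M invocations/month, 120 ms average at 512 MB: roughly $12/month
cost = lambda_monthly_cost(10_000_000, 120, 512)
```

This also shows why memory tuning matters: if doubling memory to 1024 MB more than halves the duration (CPU scales with memory), the compute portion of the bill goes down.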

Storage

S3: Use appropriate storage classes (Standard, IA, Glacier) and configure lifecycle policies for automatic transitions.

EBS: Remove orphaned volumes, use gp3 instead of gp2, and configure snapshots with appropriate retention.
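The gp2-to-gp3 migration is usually a pure win, since gp3's baseline (3,000 IOPS, 125 MB/s) already exceeds what small gp2 volumes get. A quick savings estimate, using us-east-1 rates as of writing (verify current per-region pricing):

```python
GP2_RATE = 0.10  # USD per GB-month (us-east-1 at time of writing; verify)
GP3_RATE = 0.08  # USD per GB-month at baseline performance

def gp3_migration_savings(size_gb: int, volumes: int = 1) -> float:
    """Monthly savings from migrating gp2 volumes to gp3 at baseline
    performance; roughly a 20% reduction with no downtime required."""
    return (GP2_RATE - GP3_RATE) * size_gb * volumes

# Fifty 100 GB volumes: about $100/month back
print(round(gp3_migration_savings(100, volumes=50), 2))  # 100.0
```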

Databases

RDS: Right-size instances, use Reserved Instances for stable workloads, and Aurora Serverless for variable workloads.

DynamoDB: On-demand mode for unpredictable traffic, provisioned mode with Auto Scaling for stable workloads.
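The on-demand vs. provisioned choice comes down to average utilization of provisioned capacity. A sketch of the break-even arithmetic; rates are passed in rather than hardcoded because DynamoDB pricing changes by region and over time:

```python
def dynamodb_breakeven_utilization(ondemand_per_million: float,
                                   provisioned_per_unit_hour: float) -> float:
    """Fraction of provisioned capacity a table must consume, on average,
    for provisioned mode to cost less than on-demand.

    One provisioned capacity unit can serve 3,600 requests per hour.
    """
    ondemand_per_request = ondemand_per_million / 1_000_000
    provisioned_per_request_full = provisioned_per_unit_hour / 3600
    return provisioned_per_request_full / ondemand_per_request

# Example rates ($1.25 per million on-demand writes, $0.00065 per WCU-hour,
# illustrative only): break-even lands around 14% average utilization
print(f"{dynamodb_breakeven_utilization(1.25, 0.00065):.1%}")  # 14.4%
```

Below that utilization, bursty or unpredictable traffic is cheaper on-demand; above it, provisioned mode with Auto Scaling wins.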

Decision framework: Reserved vs Spot vs On-demand


Decision criteria

Reserved Instances/Savings Plans:

  • Consistent usage greater than 75% of the time
  • Ability to commit for 1-3 years
  • Stable production workloads

Spot Instances:

  • Fault-tolerant applications
  • Batch processing or data analysis
  • Systems with automatic checkpointing

On-demand:

  • Development and testing
  • Unpredictable workloads
  • Critical applications without interruption tolerance
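The criteria above can be collapsed into a first-pass decision function. A deliberately simplified sketch: real decisions weigh per-workload economics, Spot capacity availability, and mixed fleets rather than a single answer:

```python
def purchase_model(utilization: float, can_commit_years: bool,
                   interruption_tolerant: bool) -> str:
    """First-pass mapping of the decision criteria to a purchase model."""
    if interruption_tolerant:
        return "spot"           # 60-90% discount, capacity may be reclaimed
    if utilization > 0.75 and can_commit_years:
        return "savings-plan"   # or Reserved Instances for specific shapes
    return "on-demand"          # dev/test, unpredictable, or critical work

print(purchase_model(0.90, True, False))   # savings-plan
print(purchase_model(0.30, False, True))   # spot
print(purchase_model(0.40, False, False))  # on-demand
```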

Tagging strategy

A consistent tagging system is fundamental for cost allocation and optimization:

# Example tagging strategy
required_tags:
  Environment: [prod, staging, dev]
  Team: [platform, data, frontend]
  Project: [user-auth, analytics, billing]
  CostCenter: [engineering, marketing, sales]
  
optional_tags:
  Owner: email_address
  Schedule: [24x7, business-hours, weekend-off]
  Backup: [daily, weekly, none]

Implementation can be automated with Infrastructure as Code and AWS Organizations policies.
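As a sketch of what such automated enforcement checks, here is a minimal validator for the required tags above. In practice this logic lives in an AWS Config rule or a Service Control Policy rather than standalone code:

```python
# Allowed values mirror the tagging strategy above
REQUIRED_TAGS = {
    "Environment": {"prod", "staging", "dev"},
    "Team": {"platform", "data", "frontend"},
    "Project": {"user-auth", "analytics", "billing"},
    "CostCenter": {"engineering", "marketing", "sales"},
}

def tag_violations(tags: dict[str, str]) -> list[str]:
    """Return human-readable violations of the required-tag policy."""
    problems = []
    for key, allowed in REQUIRED_TAGS.items():
        if key not in tags:
            problems.append(f"missing tag: {key}")
        elif tags[key] not in allowed:
            problems.append(f"invalid value for {key}: {tags[key]!r}")
    return problems

print(tag_violations({"Environment": "prod", "Team": "data",
                      "Project": "billing", "CostCenter": "engineering"}))  # []
print(tag_violations({"Environment": "production"}))
```

Resources with violations can then be flagged in cost reports, denied at provisioning time, or auto-remediated.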

Tools and automation

AWS native

  • Cost Explorer: Historical analysis and forecasting
  • Budgets: Proactive budget alerts
  • Trusted Advisor: Optimization recommendations
  • Compute Optimizer: ML-based right-sizing

Observability

Observability is key for continuous optimization:

  • Cost per transaction metrics
  • Spending anomaly alerts
  • Efficiency dashboards per service

Automation

  • Lambda functions for automatic shutdown of development resources
  • EventBridge rules to detect orphaned resources
  • AWS Config rules for tagging compliance
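The shutdown automation can be reduced to a pure decision function that a scheduled Lambda sweep applies per instance, driven by the Schedule tag from the tagging strategy. The business-hours window (08:00-20:00 UTC, weekdays) is a hypothetical policy, not an AWS default:

```python
def should_stop(schedule_tag: str, hour_utc: int, is_weekend: bool) -> bool:
    """Decide whether a scheduled sweep should stop an instance right now,
    based on its Schedule tag. Assumed policy: 'business-hours' means
    08:00-20:00 UTC on weekdays."""
    if schedule_tag == "24x7":
        return False
    if schedule_tag == "weekend-off":
        return is_weekend
    if schedule_tag == "business-hours":
        return is_weekend or not (8 <= hour_utc < 20)
    return False  # unknown tag: fail safe, leave the instance running

# A dev box tagged business-hours at 23:00 UTC on a Tuesday gets stopped
print(should_stop("business-hours", 23, False))  # True
```

Keeping the decision separate from the API calls (describe instances, stop instances) makes the policy unit-testable without touching AWS.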

Why it matters

In mature organizations, cost is an engineering metric as important as latency or availability. Without active optimization, cloud spending grows rapidly year over year. FinOps practices aren't the exclusive responsibility of finance; they require engineering teams to understand the economic impact of their architectural decisions. A well-optimized system can operate with significantly less cost than an unoptimized one, freeing budget for innovation and new features.

References

  • AWS Cloud Financial Management | Amazon Web Services — AWS, 2024. Official cost management tools.
  • The FinOps Foundation — FinOps Foundation, 2024. Community and resources for FinOps.
  • FinOps Framework Overview — FinOps Foundation, 2024. Complete FinOps reference framework.
  • Cost Optimization Pillar, AWS Well-Architected Framework — AWS, 2024. Official cost optimization pillar guide.
  • Principles of cloud cost optimization | Google Cloud Blog — Google Cloud, 2023. Multi-cloud optimization principles.
  • Amazon EC2 – Secure and resizable compute capacity — AWS, 2024. Official EC2 pricing models documentation.
  • Analyzing your costs and usage with AWS Cost Explorer — AWS, 2024. Complete Cost Explorer guide.

Related content

  • AWS Well-Architected Framework

    AWS framework with six pillars of best practices for designing and operating reliable, secure, efficient, and cost-effective cloud systems.

  • Serverless

    Cloud computing model where the provider manages infrastructure automatically, allowing code execution without provisioning or managing servers, paying only for actual usage.

  • Observability

    Ability to understand a system's internal state from its external outputs: logs, metrics, and traces, enabling problem diagnosis without direct system access.

  • Infrastructure as Code

    Practice of defining and managing infrastructure through versioned configuration files instead of manual processes. Foundation of modern operations automation.

  • From Prototype to Production: A Serverless Second Brain on AWS

    Architecture design for scaling a personal second brain to a production system with AWS serverless — from the current prototype to specialized use cases in legal, research, and community building.

  • Serverless Second Brain

    Production-ready serverless backend for a personal knowledge graph — DynamoDB, Lambda, Bedrock, MCP, Step Functions. The implementation of the architecture described in the 'From Prototype to Production' essay.

  • Prompt Caching

    Technique that stores the internal computation of reused prompt prefixes across LLM calls, reducing costs by up to 90% and latency by up to 85% in applications with repetitive context.

  • AWS Bedrock

    AWS serverless service providing access to foundation models from multiple providers (Anthropic, Meta, Mistral, Amazon) via unified API, without managing ML infrastructure.

  • AI Observability

    Practices and tools for monitoring, tracing, and debugging AI systems in production, covering token metrics, latency, response quality, costs, and hallucination detection.
