Pattern providing a single entry point for multiple microservices, handling routing, authentication, rate limiting, and response aggregation.
The API Gateway pattern provides a single entry point for clients needing to access multiple microservices. Instead of the client knowing and calling each service directly, the gateway acts as an intelligent proxy that routes, aggregates, transforms, and protects requests.
An API Gateway is fundamentally different from a simple load balancer or reverse proxy. While the latter distribute traffic, the gateway adds application logic: centralized authentication, protocol transformation, response aggregation from multiple services, and enforcement of security and rate limiting policies.
The pattern emerges naturally in microservices architectures where the complexity of coordinating multiple services from the client becomes unmanageable. Instead of each client application implementing service discovery logic, error handling, and authentication for dozens of APIs, the gateway centralizes these responsibilities.
Routing and service discovery: The gateway maintains a registry of available services and their locations, and routes requests based on paths, headers, or payload content. This allows backend services to change location without affecting clients.
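The path-based routing described above can be sketched as a longest-prefix lookup. This is a minimal illustration, not how any particular gateway implements it; the `ROUTES` table, the service URLs, and the `resolve_upstream` helper are hypothetical names chosen for the example.

```python
from typing import Optional

# Hypothetical routing table: path prefix -> backend base URL.
ROUTES = {
    "/api/users": "http://user-service:8080",
    "/api/users/avatars": "http://media-service:8080",
    "/api/orders": "http://order-service:8080",
}


def resolve_upstream(path: str, routes: dict = ROUTES) -> Optional[str]:
    """Return the upstream URL for the longest matching prefix, or None."""
    best = None
    for prefix in routes:
        # Match whole path segments only, so "/api/users" does not
        # accidentally capture "/api/users-admin".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return None if best is None else routes[best] + path
```

Because clients only ever see the gateway path, moving `user-service` to a new host means changing one entry in the table rather than every client.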
Authentication and authorization: The gateway centralizes identity verification using protocols such as OAuth 2.0 and OpenID Connect, validating JWTs and applying authorization policies before forwarding requests to internal services.
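To make the token validation step concrete, here is a standard-library-only sketch of verifying an HS256-signed JWT (signature, pinned algorithm, and `exp` claim). It assumes a shared secret for simplicity; real gateways more commonly verify asymmetric signatures (for example RS256) against the identity provider's published keys, using a maintained JWT library. The function names `sign_hs256` and `verify_hs256` are made up for this example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Helper for the demo: issue an HS256-signed token."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        # Pin the algorithm; never let the token choose it.
        if header.get("alg") != "HS256":
            return None
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
        ).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
        exp = claims.get("exp")
        if exp is not None and (time.time() if now is None else now) >= exp:
            return None
        return claims
    except (ValueError, TypeError, AttributeError):
        # Malformed token: wrong segment count, bad base64, or bad JSON.
        return None
```

A gateway would run a check like this once per request and forward only the verified claims (for example as headers) to internal services.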
Rate limiting and circuit breaking: The gateway enforces rate limits per client, API key, or endpoint to protect backend services from overload. Circuit breakers detect degraded services and fail fast instead of propagating latency.
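Per-client rate limiting is often implemented with a token bucket: each client key gets a bucket that refills at a steady rate and allows short bursts up to its capacity. The sketch below is one possible in-memory version, assuming a single gateway process; the class names are illustrative, and a clustered gateway would need shared state (for example in Redis) instead.

```python
import time


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client key (API key, user ID, IP address, ...)."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.buckets = {}

    def allow(self, client_key: str) -> bool:
        if client_key not in self.buckets:
            self.buckets[client_key] = TokenBucket(self.rate, self.burst, self.clock)
        return self.buckets[client_key].allow()
```

When `allow` returns `False`, the gateway would typically respond with HTTP 429 rather than forwarding the request.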
Aggregation and transformation: The gateway combines responses from multiple services into a single response, reducing the number of client round trips, and translates between protocols and data formats (for example, REST to GraphQL or JSON to XML).
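Response aggregation usually means calling several backends concurrently and merging the results, tolerating partial failure. The following sketch uses `asyncio.gather` for the fan-out; the three `fetch_*` coroutines are stand-ins for real HTTP calls, and the field names are invented for the example.

```python
import asyncio


async def fetch_profile(user_id: str) -> dict:
    await asyncio.sleep(0)  # stands in for a network call
    return {"id": user_id, "name": "Ada"}


async def fetch_orders(user_id: str) -> list:
    await asyncio.sleep(0)
    return [{"order_id": 1, "total": 30}]


async def fetch_recommendations(user_id: str) -> list:
    # Simulates a backend that is currently unavailable.
    raise ConnectionError("recommendation service down")


async def aggregate_user_view(user_id: str, timeout: float = 1.0) -> dict:
    """Call all backends concurrently and merge them into one response."""
    calls = {
        "profile": fetch_profile,
        "orders": fetch_orders,
        "recommendations": fetch_recommendations,
    }
    results = await asyncio.gather(
        *(asyncio.wait_for(fn(user_id), timeout) for fn in calls.values()),
        return_exceptions=True,  # one failing backend must not fail the whole view
    )
    # Failed or timed-out sections are reported as None instead of raising.
    return {
        name: None if isinstance(result, Exception) else result
        for name, result in zip(calls, results)
    }
```

The client makes one request and receives one document; whether a missing section should be `None`, omitted, or an error is a design decision that depends on the client.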
Amazon API Gateway (commonly called AWS API Gateway) is a fully managed service that integrates natively with Lambda and other AWS services. It automatically handles scaling, availability, and monitoring.
```yaml
# Example configuration with AWS SAM
Resources:
  ApiGateway:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Auth:
        DefaultAuthorizer: CognitoAuthorizer
        Authorizers:
          CognitoAuthorizer:
            UserPoolArn: !GetAtt UserPool.Arn
      MethodSettings:
        - ResourcePath: "/*"
          HttpMethod: "*"
          ThrottlingBurstLimit: 200
          ThrottlingRateLimit: 100
```

Kong is an open-source gateway built on NGINX that offers extensible plugins for authentication, rate limiting, transformations, and observability.
```yaml
# Kong declarative configuration with rate limiting and response caching
_format_version: "3.0"
services:
  - name: user-service
    url: http://user-service:8080
    plugins:
      - name: rate-limiting
        config:
          minute: 100
          hour: 1000
      - name: proxy-cache
        config:
          strategy: memory  # required by the proxy-cache plugin
          response_code: [200, 301, 404]
          request_method: [GET, HEAD]
          cache_ttl: 300
routes:
  - name: users-route
    service: user-service
    paths: ["/api/users"]
```

NGINX Plus adds API management capabilities to NGINX, including dynamic rate limiting, active health checks, and a monitoring dashboard.
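```nginx
# NGINX Plus configuration with rate limiting
# (these directives belong inside the http {} context)
upstream user_service {
    zone user_service 64k;
    # Passive health checks: after 3 failures a server is skipped for 30s
    server user-service-1:8080 max_fails=3 fail_timeout=30s;
    server user-service-2:8080 max_fails=3 fail_timeout=30s;
}

limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

# Conditions the active health check below treats as healthy
match server_ok {
    status 200;
}

server {
    listen 80;

    location /api/users {
        limit_req zone=api_limit burst=20 nodelay;
        # Reject throttled requests with 429 so the 503 fallback does not catch them
        limit_req_status 429;

        # Fallback on upstream errors (fail fast; not a full circuit breaker)
        error_page 502 503 504 = @fallback;

        proxy_pass http://user_service;
        proxy_set_header Authorization $http_authorization;

        # Active health check (NGINX Plus only)
        health_check uri=/health match=server_ok;
    }

    location @fallback {
        default_type application/json;
        return 503 '{"error": "Service temporarily unavailable"}';
    }
}
```

| Aspect | API Gateway | Backend for Frontend |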
|---|---|---|
| Purpose | Single entry point for all clients | Client-specific adapter |
| Granularity | One instance for entire organization | One instance per frontend application |
| Responsibility | Routing, authentication, rate limiting | Specific aggregation and transformation |
| Complexity | High — must serve multiple use cases | Low — optimized for specific client |
| Coupling | Low with clients, high with backend services | High with specific client, low with others |
| Scalability | Potential bottleneck | Scales independently per client |
| Use cases | Shared microservices, public APIs | Mobile vs web, different UI versions |
In practice, many organizations combine both patterns: an API Gateway for cross-cutting functionality (authentication, rate limiting) and multiple BFFs for client-specific aggregation.
In modern architectures, the API Gateway complements a service mesh like Istio or Linkerd. The gateway handles north-south traffic (client to services), while the service mesh handles east-west traffic (service to service).
```yaml
# Istio Gateway + VirtualService
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: api-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: api-tls-secret
      hosts:
        - api.company.com
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-routes
spec:
  hosts:
    - api.company.com
  gateways:
    - api-gateway
  http:
    - match:
        - uri:
            prefix: /api/users
      route:
        - destination:
            host: user-service
            port:
              number: 8080
      # Fault injection for resilience testing: delays 0.1% of requests by 5s
      fault:
        delay:
          percentage:
            value: 0.1
          fixedDelay: 5s
      retries:
        attempts: 3
        perTryTimeout: 2s
```

Adaptive rate limiting: The gateway adjusts limits dynamically based on backend health, automatically reducing traffic when services show signs of stress.
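One way to make a limit respond to backend health is an additive-increase, multiplicative-decrease (AIMD) rule: cut the limit sharply when the error rate in an observation window crosses a threshold, and raise it slowly while the backend looks healthy. This is a sketch of that idea only; the `AdaptiveLimit` class, its thresholds, and the choice of error rate as the stress signal are assumptions, and latency or queue depth could be used instead.

```python
class AdaptiveLimit:
    """Adjusts a requests-per-window limit from observed backend error rates."""

    def __init__(self, initial=100, floor=10, ceiling=1000,
                 error_threshold=0.05, backoff=0.5, step=10):
        self.limit = initial
        self.floor = floor
        self.ceiling = ceiling
        self.error_threshold = error_threshold
        self.backoff = backoff
        self.step = step

    def update(self, requests: int, errors: int) -> int:
        """Feed in one observation window and return the new limit."""
        if requests == 0:
            return self.limit  # no signal, keep the current limit
        if errors / requests > self.error_threshold:
            # Backend under stress: shed load quickly.
            self.limit = max(self.floor, int(self.limit * self.backoff))
        else:
            # Backend healthy: probe for more capacity gradually.
            self.limit = min(self.ceiling, self.limit + self.step)
        return self.limit
```

The resulting `limit` would then feed whatever enforcement mechanism the gateway already uses for static limits.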
Fallback responses: The gateway detects failures in backend services and serves cached or default responses to maintain availability.
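Failure detection with a fallback is commonly built as a circuit breaker: after a number of consecutive failures the breaker opens and stops calling the backend, serving the last good response (or a default) until a reset timeout allows a trial call. The sketch below is a simplified single-threaded version with hypothetical names and thresholds, not the implementation of any specific gateway.

```python
import time


class CircuitBreaker:
    """Fails fast while a backend is unhealthy and serves a fallback instead."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None   # None means the circuit is closed
        self.last_good = None   # most recent successful response

    def call(self, fn, default=None):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                return self._fallback(default)  # open: do not touch the backend
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial call.
            if self.failures >= self.failure_threshold or self.opened_at is not None:
                self.opened_at = self.clock()
            return self._fallback(default)
        self.failures = 0
        self.opened_at = None
        self.last_good = result
        return result

    def _fallback(self, default):
        return self.last_good if self.last_good is not None else default
```

Serving `last_good` trades freshness for availability, so it suits read-mostly data; for writes, a default error response is usually the safer fallback.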
Version translation: The gateway transforms requests and responses between API versions, allowing services to evolve without breaking existing clients.
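As an illustration of version translation, suppose a backend moved from a v1 shape with a single `name` field to a v2 shape with split names and a nested `contact` object. Both shapes and both function names here are invented for the example. The gateway can keep v1 clients working by mapping requests forward and responses back:

```python
def request_v1_to_v2(body: dict) -> dict:
    """Translate a v1 client request into the v2 shape the backend expects."""
    # Split on the first space only; everything after it is the last name.
    first, _, last = body.get("name", "").partition(" ")
    return {
        "first_name": first,
        "last_name": last,
        "contact": {"email": body.get("email")},
    }


def response_v2_to_v1(body: dict) -> dict:
    """Translate a v2 backend response into the shape v1 clients expect."""
    parts = (body.get("first_name"), body.get("last_name"))
    return {
        "name": " ".join(p for p in parts if p),
        "email": body.get("contact", {}).get("email"),
    }
```

Mappings like these should stay purely structural; once they need domain rules, they are drifting toward business logic that belongs in a service or a BFF.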
Monolithic gateway: A gateway containing domain-specific business logic. This violates separation of concerns and creates coupling.
Excessive aggregation: Performing complex joins or business logic in the gateway. This logic belongs in dedicated services or a BFF.
Single point of failure: Not implementing high availability in the gateway. A failed gateway means the entire application becomes inaccessible.
Heavy data transformation: Performing expensive transformations in the gateway that should be done in specialized services or on the client.
From a senior engineering perspective, the API Gateway solves the fundamental problem of operational complexity in distributed architectures. Without it, each frontend team must implement service discovery, error handling, authentication, and data aggregation — duplicating effort and creating inconsistencies.
The gateway centralizes security policies and observability, allowing platform teams to implement organizational controls without modifying every microservice. This helps with compliance regimes such as GDPR or PCI DSS, where a single place to enforce access controls and collect audit logs makes auditing simpler.
However, it introduces additional latency and becomes a critical point of failure. The decision to implement a gateway must balance operational simplicity against the risk of creating a bottleneck. In organizations with few services, the additional complexity may not be justified.
Related concepts:
Microservices: Architectural style structuring an application as a collection of small, independently deployable services, each with its own business logic and data.
Amazon API Gateway: AWS managed service for creating, publishing, and managing REST, HTTP, and WebSocket APIs that act as entry points to Lambda functions and other backend services.
Backend for Frontend (BFF): Architectural pattern where each client type has its own dedicated backend adapting microservice APIs to that client's specific needs.
OAuth 2.0 and OpenID Connect: Industry standards for delegated authorization (OAuth 2.0) and federated authentication (OpenID Connect), enabling third-party login and secure API access.
Service mesh: Infrastructure layer dedicated to managing communication between microservices, providing observability, security, and traffic control transparently.
Serverless: Cloud computing model where the provider manages infrastructure automatically, allowing code execution without provisioning or managing servers, paying only for actual usage.
Strangler Fig pattern: Incremental migration strategy that gradually replaces a legacy system with new components, progressively routing traffic until the old system can be retired.
API design: Principles and practices for designing clear, consistent, and evolvable programming interfaces that facilitate integration between systems.