Production-ready serverless backend for a personal knowledge graph — DynamoDB, Lambda, Bedrock, MCP, Step Functions. The implementation of the architecture described in the 'From Prototype to Production' essay.
The production implementation of the serverless second brain described in the essay of the same name. While the essay defines the architecture — memory, compute, and interface layers with "two doors" for humans and agents — this repository is the code that brings it to life.
The system separates three layers with clear responsibilities:
Bedrock provides classification (Claude) and embeddings (Titan, 1,024 dimensions).
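As an illustration of the embedding side, this is roughly the request shape sent to Titan Text Embeddings V2 through the Bedrock runtime (a sketch: the model id and field names follow the public Titan V2 API, but the helper itself is hypothetical):

```typescript
// Build the Bedrock InvokeModel input for a Titan V2 embedding request.
// Titan V2 accepts a "dimensions" parameter (256 | 512 | 1024).
const TITAN_MODEL_ID = "amazon.titan-embed-text-v2:0";

function titanEmbeddingRequest(text: string) {
  return {
    modelId: TITAN_MODEL_ID,
    contentType: "application/json",
    // The response body carries an "embedding" array of 1,024 numbers.
    body: JSON.stringify({ inputText: text, dimensions: 1024 }),
  };
}
```

The actual call would pass this object to `InvokeModelCommand` in `@aws-sdk/client-bedrock-runtime` and parse `embedding` out of the response body.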
The project follows four phases defined in the essay. All four phases are complete and deployed to dev:
| Phase | Components | Status |
|---|---|---|
| 1 — Capture | Terraform foundation, DynamoDB, S3, Capture Lambda, API Gateway, Step Functions, data migration | ✅ Complete |
| 2 — Read | Search Lambda (hybrid keyword + semantic), Graph Lambda, CloudFront + S3 frontend | ✅ Complete |
| 3 — Agent | AgentCore MCP Gateway with 6 tools, Connect Lambda, Flag Lambda, write safety | ✅ Complete |
| 4 — Surfacing | Surfacing Lambda with 5 analyzers, EventBridge daily digest, SNS email | ✅ Complete |
| Cross-cutting | Benchmarks, domain-agnostic config, observability | 🔲 Pending |
Human door (API Gateway REST):
- `POST /capture` — ingests text, classifies with Bedrock Claude, generates embeddings with Titan, persists to DynamoDB + S3 via Step Functions
- `GET /search?q=` — hybrid keyword + semantic search with cosine similarity over 1,024-dimension embeddings
- `GET /graph` — full knowledge graph (nodes + bidirectional edges)
- `GET /nodes/{id}` — single node with edges and related nodes
- `GET /health` — health check

Agent door (AgentCore MCP Gateway):
- `read_node` — reads a node by slug with metadata, edges, and related nodes
- `list_nodes` — lists nodes with filters by type, status, and tags
- `search` — hybrid keyword + semantic search
- `add_node` — creates a seed node with automatic Bedrock classification
- `connect_nodes` — creates bidirectional edges with audit trail
- `flag_stale` — flags a node for human review without modifying it

Surfacing (daily digest):
- Configurable thresholds (`STALE_DAYS`, `MIN_EDGES`, `SIMILARITY_THRESHOLD`)

Infrastructure:
Resilience (deep review #3):
- `invokeWithRetry()` with exponential backoff for Bedrock throttling (1s, 2s, 4s)
- … (`connect_nodes`)
- `batchGetNodes()` eliminates N+1 queries in the Graph Lambda
- `Content-Type` set on all responses

The "agent door" exposes Lambda functions as MCP tools via Bedrock AgentCore. Any MCP-compatible agent can discover and use the tools semantically:
| MCP Tool | Lambda | Operation | Write |
|---|---|---|---|
| `read_node` | Graph | Read node + edges + related | No |
| `list_nodes` | Graph | List/filter nodes | No |
| `search` | Search | Hybrid keyword + semantic search | No |
| `add_node` | Capture | Create seed node with AI classification | Yes |
| `connect_nodes` | Connect | Create bidirectional edge | Yes |
| `flag_stale` | Flag | Flag node for review | Yes |
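On the wire, an MCP client invokes one of these tools with a JSON-RPC 2.0 `tools/call` request. The payload below is an illustrative sketch for the `search` tool; the argument name (`query`) is an assumption, and the gateway's published tool schema is authoritative:

```typescript
// Example MCP tools/call request body (JSON-RPC 2.0) targeting the
// "search" tool exposed by the AgentCore gateway.
const toolCall = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "search",
    arguments: { query: "serverless event-driven patterns" },
  },
};
```

The gateway routes the call to the Search Lambda and returns the tool result in the JSON-RPC response.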
Write operations follow strict controls:
- `AUDIT#` item in DynamoDB with actor, action, changes, and 90-day TTL
- `seed` — human review required for promotion
- `flag_stale` never modifies the node — it only marks it for human review
- Actor recorded on every write (`agent:{session_id}` or `api`)
- `connect_nodes` verifies both nodes exist before creating edges

```typescript
// Audit trail on every write operation
const audit: AuditItem = {
  PK: `AUDIT#${now}`,
  SK: `NODE#${slug}`,
  action: "connect",
  actor,
  changes: { source, target, edge_type, weight },
  ttl: Math.floor(Date.now() / 1000) + 90 * 86400, // 90-day TTL
};
await putAudit(audit);
```

Single-table design with four item types:
| PK | SK | Data |
|---|---|---|
| `NODE#serverless` | `META` | Type, status, titles, summaries, tags, timestamps |
| `NODE#serverless` | `EDGE#aws-lambda` | Relationship type, weight, direction |
| `NODE#serverless` | `EMBED` | 1,024-dimension vector (Titan V2) |
| `AUDIT#2026-03-19T10:30:00Z` | `NODE#serverless` | Action, actor, diff |
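A sketch of how this layout is read back: a single Query on the partition key returns a node's `META`, `EDGE#`, and `EMBED` items together. Key and attribute names follow the table above; the helper names and table name are hypothetical:

```typescript
// Build the DynamoDB Query input that fetches every item for one node.
function nodeQueryInput(table: string, slug: string) {
  return {
    TableName: table,
    KeyConditionExpression: "PK = :pk",
    ExpressionAttributeValues: { ":pk": `NODE#${slug}` },
  };
}

// Split the returned items back into the three node-item types.
function partitionItems(items: { SK: string }[]) {
  return {
    meta: items.find((i) => i.SK === "META"),
    edges: items.filter((i) => i.SK.startsWith("EDGE#")),
    embedding: items.find((i) => i.SK === "EMBED"),
  };
}
```

One round trip per node is the payoff of the single-table design: metadata, edges, and the embedding never require separate queries.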
Two GSIs enable inverse queries and status filters:
- `GSI2PK` (status) — "all seeds not updated in 7 days"

Step Functions orchestrates the full pipeline with automatic retry on Bedrock throttling:
Each step is a separate Lambda invocation. Express Workflow (synchronous) keeps the response within API Gateway timeout.
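The Bedrock retry behavior (1 s, 2 s, 4 s backoff) can be sketched as a generic helper. This is an illustration, not the repo's exact `invokeWithRetry()`: detecting throttling via `error.name === "ThrottlingException"` and the parameter defaults are assumptions:

```typescript
// Retry an async call with exponential backoff on Bedrock throttling.
// Delays with the default baseDelayMs: 1000ms, 2000ms, 4000ms.
async function invokeWithRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const name = (err as { name?: string }).name;
      // Rethrow anything that isn't throttling, or once retries are exhausted.
      if (name !== "ThrottlingException" || attempt >= maxRetries) throw err;
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The same pattern exists natively in the state machine's `Retry` blocks; wrapping SDK calls in a helper like this covers throttling inside a single Lambda invocation.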
The Search Lambda combines keyword matching and semantic similarity:
```typescript
// Hybrid search: keyword + semantic
const keywordResults = await queryByKeywords(query, table);
const queryEmbedding = await generateEmbedding(query);
const allEmbeddings = await scanEmbeddings(table);

const semanticResults = allEmbeddings
  .map((item) => ({
    slug: item.PK.replace("NODE#", ""),
    score: cosineSimilarity(queryEmbedding, item.embedding),
  }))
  .sort((a, b) => b.score - a.score);

// Combine scores with configurable weights
const combined = mergeResults(keywordResults, semanticResults, {
  keywordWeight: 0.3,
  semanticWeight: 0.7,
});
```

At current scale (~160 nodes, ~700 KB of vectors) the in-memory scan is sufficient. The scalability benchmark (issue #12) will evaluate alternatives for 10K+ nodes.
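The two scoring helpers the search snippet relies on can be sketched as follows. `cosineSimilarity` is the standard formula; `mergeResults` here is a plausible weighted merge and not necessarily the repo's exact implementation:

```typescript
interface Scored {
  slug: string;
  score: number;
}

// Standard cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Combine the two ranked lists into one weighted score per slug.
function mergeResults(
  keyword: Scored[],
  semantic: Scored[],
  weights: { keywordWeight: number; semanticWeight: number },
): Scored[] {
  const combined = new Map<string, number>();
  for (const r of keyword) {
    combined.set(r.slug, (combined.get(r.slug) ?? 0) + weights.keywordWeight * r.score);
  }
  for (const r of semantic) {
    combined.set(r.slug, (combined.get(r.slug) ?? 0) + weights.semanticWeight * r.score);
  }
  return [...combined.entries()]
    .map(([slug, score]) => ({ slug, score }))
    .sort((a, b) => b.score - a.score);
}
```

With the 0.3/0.7 weights above, a node matching both keyword and semantic search outranks one matching only either signal.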
All infrastructure is defined with Terraform using reusable modules:
```
infra/
  bootstrap/           → State backend (S3 + DynamoDB lock)
  modules/
    dynamodb/          → Single-table design + GSIs
    lambda/            → Compute functions
    api-gateway/       → Human door (REST)
    step-functions/    → Pipeline orchestration
    s3/                → Content and frontend
    cloudfront/        → CDN + security headers
    iam/               → Roles and policies
    agentcore-gateway/ → Agent door (MCP)
    sns/               → Notifications
  environments/
    dev/               → Dev config (deployed)
    prod/              → Prod config
```
CI/CD uses GitHub Actions with OIDC — no static AWS credentials:
- `terraform-plan.yml` — plan on PRs
- `terraform-apply.yml` — apply on merge to `main`
- `lambda-deploy.yml` — function packaging and deployment

Scales to zero. No minimum costs beyond S3 storage:
| Load | Monthly cost |
|---|---|
| Idle (0 req/day) | ~$0.51 |
| Moderate (100 req/day) | ~$2.44 |
| High (1,000 req/day) | ~$11.21 |
The repository includes 12 ADRs (Architecture Decision Records) in docs/decisions/ documenting every technical decision with context, alternatives evaluated, benchmark data, and revisit criteria:
| ADR | Decision |
|---|---|
| 001 | Lambda packaging (zip) with no web framework |
| 002 | Write safety — 6 controls for MCP agent mutations |
| 003 | Cognito authentication and visibility model (proposed) |
| 004 | DynamoDB single-table design with 2 GSIs |
| 005 | Hybrid keyword + semantic search with configurable weights |
| 006 | Step Functions Express for capture pipeline |
| 007 | AgentCore Gateway over self-hosted MCP server |
| 008 | In-memory embedding scan (temporary, until ~5K nodes) |
| 009 | Spec-Driven Development — 7 steering files before code |
| 010 | Bedrock token optimization — 20 recent slugs vs all |
| 011 | CloudFront + S3 over Vercel/Amplify |
| 012 | GitHub Actions OIDC over static credentials |
This project translates a reference architecture into deployable code. The goal is for any builder to take the repository, configure their domain (legal, research, education) in `terraform.tfvars`, and deploy a complete second brain with `terraform apply`. The essay explains the "why" behind each decision; the code implements the "how."
All four phases already demonstrate the architecture works: capture with automatic classification, hybrid semantic search, a bidirectional knowledge graph, an MCP door for AI agents with write safety controls, and a proactive daily digest that identifies forgotten seeds and missing connections — all serverless, all in Terraform, scaling to zero at ~$0.51/month idle cost.
- **Serverless** — Cloud computing model where the provider manages infrastructure automatically, allowing code to run without provisioning or managing servers, paying only for actual usage.
- **AWS Lambda** — AWS serverless compute service that runs code in response to events without provisioning or managing servers, automatically scaling from zero to thousands of concurrent executions.
- **Amazon DynamoDB** — AWS serverless NoSQL database with single-digit millisecond latency at any scale, ideal for applications requiring high performance and automatic scalability.
- **Amazon Bedrock** — AWS serverless service providing access to foundation models from multiple providers (Anthropic, Meta, Mistral, Amazon) via a unified API, without managing ML infrastructure.
- **Amazon API Gateway** — AWS managed service for creating, publishing, and managing REST, HTTP, and WebSocket APIs that act as entry points to Lambda functions and other backend services.
- **AWS Step Functions** — AWS serverless orchestration service that coordinates multiple services into visual workflows using Amazon States Language (ASL), with built-in error handling, retries, and parallel execution.
- **Amazon EventBridge** — AWS serverless event bus connecting applications using events, enabling decoupled event-driven architectures with rule-based routing.
- **Amazon SNS** — AWS pub/sub messaging service that distributes messages to multiple subscribers simultaneously, enabling fan-out patterns and notifications at scale.
- **Amazon S3** — AWS object storage service with 99.999999999% durability, unlimited scalability, and multiple storage classes for cost optimization.
- **AWS IAM** — AWS identity and access management service controlling who can do what in your account, with granular policies based on the principle of least privilege.
- **Knowledge graphs** — Data structures representing knowledge as networks of entities and relationships, enabling reasoning, connection discovery, and semantic queries over complex domains.
- **Model Context Protocol (MCP)** — Open protocol created by Anthropic that standardizes how AI applications connect with external tools, data, and services through a universal interface.
- **Infrastructure as Code (IaC)** — Practice of defining and managing infrastructure through versioned configuration files instead of manual processes. Foundation of modern operations automation.
- **Cloud cost optimization** — Practices and strategies to minimize cloud spending without sacrificing performance, including right-sizing, reservations, spot instances, and eliminating idle resources.
- **AWS Well-Architected Framework** — AWS framework with six pillars of best practices for designing and operating reliable, secure, efficient, and cost-effective cloud systems.