Architecture design for scaling a personal second brain to a production system with AWS serverless — from the current prototype to specialized use cases in legal, research, and community building.
In "Building a Second Brain in Public" I documented how a weekend in March 2026 became a knowledge system with over 150 nodes, 450+ edges, and ~200,000 bilingual words. The prototype runs on Vercel — instant deploys, automatic SSL, global CDN. For a static site with MDX content, it's perfect.
But the prototype has clear limits. Capture requires creating MDX files with 11 frontmatter fields and committing — 15 to 30 minutes per concept. AI agents can read the full graph via MCP, but they can't write to it. Proactive surfacing is limited to a weekly QA agent that detects quality problems, not content relevant to the user. And semantic search failed in production because the WASM embedding model was too heavy to load in the browser.
These aren't bugs — they're the natural boundaries of a static system. Crossing them requires a stateful backend. And as an engineer who has built production systems on AWS serverless for years, that stack is where I have the most experience, the most control, and — being honest — the most curiosity to explore the limits.
This essay isn't a deployment tutorial. It's the architecture design for taking a personal second brain to production — and an exploration of what happens when that design is applied to specialized domains where structured knowledge has real value.
The principle that worked in the prototype — separating memory, compute, and interface — stays. What changes is that each layer gains capabilities the static system can't offer.
DynamoDB stores the structured metadata — the equivalent of the current MDX frontmatter plus graph edges and embedding vectors. The main table uses a single-table design:
| PK | SK | Data |
|---|---|---|
| NODE#serverless | META | Type, status, titles, summaries, tags, timestamps |
| NODE#serverless | EDGE#aws-lambda | Relationship type, weight, direction |
| NODE#serverless | EDGE#aws-api-gateway | Relationship type, weight, direction |
| NODE#serverless | EMBED | 1,024-dimension vector (Titan Text Embeddings V2) |
| AUDIT#2026-03-19T10:30:00Z | NODE#serverless | Action, author, diff |
A GSI inverts PK/SK for reverse queries: "what nodes point to serverless?" Another GSI projects status for queries like "all seeds not updated in 7 days."
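A sketch of how this key design answers both directions of a graph query, using an in-memory list as a stand-in for the table (a real implementation would issue boto3 `Query` calls; the items are illustrative):

```python
# In-memory stand-in for the single-table design: each item keyed by (PK, SK).
ITEMS = [
    {"PK": "NODE#serverless", "SK": "META", "data": "metadata"},
    {"PK": "NODE#serverless", "SK": "EDGE#aws-lambda", "data": "related-to"},
    {"PK": "NODE#serverless", "SK": "EDGE#aws-api-gateway", "data": "related-to"},
    {"PK": "NODE#aws-lambda", "SK": "EDGE#serverless", "data": "related-to"},
]

def outgoing_edges(node_id):
    """Main table query: PK = NODE#<id>, SK begins_with EDGE#."""
    pk = f"NODE#{node_id}"
    return [i for i in ITEMS if i["PK"] == pk and i["SK"].startswith("EDGE#")]

def incoming_edges(node_id):
    """GSI1 query (PK/SK inverted): all items whose SK is EDGE#<id>,
    i.e. "what nodes point to this one?"."""
    sk = f"EDGE#{node_id}"
    return [i for i in ITEMS if i["SK"] == sk]

print(len(outgoing_edges("serverless")))  # 2 outgoing edges
print(len(incoming_edges("serverless")))  # 1 incoming edge
```

The same access-pattern logic maps one-to-one onto a `KeyConditionExpression` on the main table and on GSI1.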
S3 stores long-form content — the MDX body in Spanish and English. DynamoDB has a 400KB item limit; an evergreen concept with 1,500 bilingual words fits, but separating content from metadata keeps graph queries fast and cheap.
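The split can be sketched as a single capture function that writes the body to S3 and only a pointer into the metadata item (in-memory dicts stand in for the AWS clients; key names are illustrative, not the project's schema):

```python
# Sketch of the metadata/content split: DynamoDB holds the pointer, S3 the body.
S3, TABLE = {}, {}

def capture(node_id, body_es, body_en, **frontmatter):
    key = f"content/{node_id}.mdx"
    S3[key] = {"es": body_es, "en": body_en}      # long-form bilingual content
    TABLE[("NODE#" + node_id, "META")] = {        # small, queryable metadata item
        **frontmatter,
        "content_key": key,
    }
    return key

capture("serverless", "cuerpo en español", "body in English",
        type="evergreen", status="published")
print(TABLE[("NODE#serverless", "META")]["content_key"])  # content/serverless.mdx
```

Graph traversals touch only the small metadata items; the S3 object is fetched once, at render time.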
Why not Aurora Serverless with Postgres? Three reasons:
Each Lambda function has a single responsibility:
These Lambda functions serve as tools for both the REST API (consumed by the SPA) and AgentCore Gateway, which automatically exposes them as MCP tools — read_node, add_node, connect_nodes, search, flag_stale — with no additional protocol code. Any MCP-compatible AI client can discover and invoke these tools through Gateway.
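A minimal sketch of that tool surface, with plain Python stubs in place of the real Lambda handlers (Gateway generates this name-to-function mapping automatically; the decorator and payloads here are illustrative, and only two of the five tools are stubbed):

```python
# Registry mimicking the tool surface AgentCore Gateway derives from Lambdas.
TOOLS = {}

def tool(name):
    """Register a function under an MCP tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("read_node")
def read_node(node_id: str) -> dict:
    return {"id": node_id, "status": "evergreen"}  # stub payload

@tool("flag_stale")
def flag_stale(days: int) -> list:
    return [f"seed not updated in {days} days"]  # stub payload

def invoke(name, **kwargs):
    """What an MCP client does after discovery: call a tool by name."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)

print(sorted(TOOLS))  # discovered tool names
print(invoke("read_node", node_id="serverless"))
```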
AgentCore Runtime hosts the AI agent in microVMs with session isolation. The agent reasons with Bedrock Claude and invokes tools registered in Gateway to interact with the graph.
Step Functions orchestrates multi-step flows — for example, the full capture pipeline: validate input → generate metadata with Bedrock → compute embeddings → persist → notify. If Bedrock fails due to throttling, Step Functions retries with exponential backoff without custom code.
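As a sketch, that retry policy lives entirely in the state machine definition. The resource names, error string, and backoff numbers here are illustrative, not taken from the project:

```hcl
# Hypothetical fragment: the pipeline's Bedrock step with declarative retries.
resource "aws_sfn_state_machine" "capture_pipeline" {
  name     = "SecondBrain-CapturePipeline"
  role_arn = aws_iam_role.sfn_exec.arn

  definition = jsonencode({
    StartAt = "GenerateMetadata"
    States = {
      GenerateMetadata = {
        Type     = "Task"
        Resource = aws_lambda_function.generate_metadata.arn
        Retry = [{
          ErrorEquals     = ["ThrottlingException"]
          IntervalSeconds = 2
          MaxAttempts     = 5
          BackoffRate     = 2.0 # 2s, 4s, 8s, 16s, 32s
        }]
        Next = "ComputeEmbeddings"
      }
      ComputeEmbeddings = {
        Type     = "Task"
        Resource = aws_lambda_function.embed.arn
        End      = true
      }
    }
  })
}
```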
CloudFront serves the static frontend from S3 — the same exported Next.js that runs on Vercel today, but with full control over headers, cache, and edge functions. API Gateway REST exposes the compute endpoints for the SPA with throttling, API keys, and IAM authorization. AgentCore Gateway exposes the same Lambda functions as MCP tools with OAuth authentication, semantic tool discovery, and automatic protocol translation.
The "two doors" — one interface for humans, another for AI agents — materialize:
The question every serverless architect must answer: how much does it cost at rest and under load?
| Service | Idle (0 req/day) | Moderate use (100 req/day) | High use (1,000 req/day) |
|---|---|---|---|
| DynamoDB on-demand | $0.00 | ~$0.25 | ~$2.50 |
| Lambda | $0.00 | ~$0.01 | ~$0.10 |
| API Gateway | $0.00 | ~$0.04 | ~$0.35 |
| S3 + CloudFront | ~$0.50 | ~$0.60 | ~$1.00 |
| Bedrock (embeddings) | $0.00 | ~$0.50 | ~$2.00 |
| Bedrock (chat/agent) | $0.00 | ~$1.00 | ~$5.00 |
| EventBridge + SNS | ~$0.01 | ~$0.01 | ~$0.01 |
| Step Functions | $0.00 | ~$0.03 | ~$0.25 |
| Total | ~$0.51 | ~$2.44 | ~$11.21 |
At rest, the system costs less than a coffee. Under moderate use — the pattern for a personal second brain — less than $3/month. Even under high load, it stays below $12/month. Estimates based on AWS pricing for us-east-1, March 2026.
The key is that serverless scales to zero. No servers running waiting for requests. No databases with capacity minimums. Every penny corresponds to real work.
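The totals in the table are plain sums over the per-service estimates; a small model keeps them checkable:

```python
# Monthly cost model from the table above (us-east-1 estimates, March 2026).
COSTS = {
    "dynamodb":       {"idle": 0.00, "moderate": 0.25, "high": 2.50},
    "lambda":         {"idle": 0.00, "moderate": 0.01, "high": 0.10},
    "api_gateway":    {"idle": 0.00, "moderate": 0.04, "high": 0.35},
    "s3_cloudfront":  {"idle": 0.50, "moderate": 0.60, "high": 1.00},
    "bedrock_embed":  {"idle": 0.00, "moderate": 0.50, "high": 2.00},
    "bedrock_chat":   {"idle": 0.00, "moderate": 1.00, "high": 5.00},
    "events_sns":     {"idle": 0.01, "moderate": 0.01, "high": 0.01},
    "step_functions": {"idle": 0.00, "moderate": 0.03, "high": 0.25},
}

def total(tier: str) -> float:
    """Sum a pricing tier across all services, rounded to cents."""
    return round(sum(svc[tier] for svc in COSTS.values()), 2)

print(total("idle"), total("moderate"), total("high"))  # 0.51 2.44 11.21
```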
All infrastructure is defined with Terraform — the same approach that already manages the project's current IAM. A single terraform apply stands up the complete system:
```hcl
resource "aws_dynamodb_table" "knowledge_graph" {
  name         = "SecondBrain-KnowledgeGraph"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "PK"
  range_key    = "SK"

  attribute {
    name = "PK"
    type = "S"
  }

  attribute {
    name = "SK"
    type = "S"
  }

  # GSI for reverse queries (what points to this node?)
  global_secondary_index {
    name            = "GSI1"
    hash_key        = "SK"
    range_key       = "PK"
    projection_type = "ALL"
  }

  point_in_time_recovery { enabled = true }
}

resource "aws_lambda_function" "capture" {
  function_name = "SecondBrain-Capture"
  runtime       = "nodejs22.x"
  handler       = "capture.handler"
  filename      = "lambda.zip"
  role          = aws_iam_role.lambda_exec.arn
  timeout       = 30
  memory_size   = 512

  environment {
    variables = { TABLE_NAME = aws_dynamodb_table.knowledge_graph.name }
  }
}
```

Terraform offers practical advantages for this project: remote state in S3 with DynamoDB locking, an explicit plan/apply cycle that shows exactly what will change before executing, and a provider ecosystem covering all AWS services, including AgentCore. The project's existing infrastructure — the IAM role for content agents with GitHub Actions OIDC — already runs on Terraform.
The current prototype is a general technical knowledge second brain — software engineering, cloud, AI concepts. But the serverless architecture I described has nothing technology-specific. It's a knowledge graph system with capture, semantic search, agents, and proactive surfacing. That opens the door to domains where structured knowledge has much higher value.
The base case. A professional — engineer, designer, product manager — builds their public second brain. They share what they learn, connect ideas, let AI agents consume their graph. The value is in accumulation and connections.
This is what jonmatum.com does today. The serverless version adds low-friction capture (an endpoint instead of a commit), real semantic search, and agents that can write. Cost stays in pennies.
A law firm handles thousands of documents — rulings, contracts, regulations, precedents. Today they live in SharePoint folders or document management systems that search by full text. A legal second brain would change the paradigm:
The differentiating value: connections between legal documents are the real knowledge of an experienced lawyer. A junior with graph access can navigate relationships that would take years of experience to discover.
The architecture is identical — what changes are node types, relationships, and the agent's prompt. DynamoDB stores metadata, S3 stores full documents, Bedrock generates embeddings and contextual responses.
An R&D team produces papers, prototypes, datasets, experimental results. Knowledge fragments across Notion, Google Drive, Slack, and researchers' memories. An R&D second brain:
The value: in teams of 5+ researchers, tacit knowledge — who worked on what, what was tried and failed, what connections exist between research lines — gets lost. The graph captures it and makes it navigable.
An educational institution or educational content creator:
All these cases share the same architecture:
What changes between domains:
| Component | General | Legal | R&D | Education |
|---|---|---|---|---|
| Node types | Concept, note, experiment | Ruling, statute, contract | Paper, experiment, dataset | Concept, exercise, resource |
| Relationships | "related to" | "cites," "repeals" | "reproduces," "refutes" | "prerequisite of" |
| Agent prompt | Staff+ engineer | Senior lawyer | Principal researcher | Personalized tutor |
| Surfacing | Forgotten seeds | Contradicting precedents | Cross-referenced results | Learning gaps |
The AWS infrastructure is identical. A terraform apply with different variables.
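To make "different variables" concrete, one way the domain parameterization could look (variable names and defaults are hypothetical, not the project's actual module interface):

```hcl
# Hypothetical domain parameters feeding node validation and the agent prompt.
variable "node_types" {
  type    = list(string)
  default = ["concept", "note", "experiment"] # legal: ["ruling", "statute", "contract"]
}

variable "relationship_types" {
  type    = list(string)
  default = ["related-to"] # legal: ["cites", "repeals"]
}

variable "agent_persona" {
  type    = string
  default = "Staff+ engineer" # legal: "senior lawyer"; education: "personalized tutor"
}
```

Deploying the legal variant would then be a matter of `terraform apply -var-file=legal.tfvars` against the same modules.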
This design doesn't exist in a vacuum. It's born from the intersection of two communities I care about: the AWS community and the PKM (Personal Knowledge Management) community.
The AWS community has a strong tradition of "builders" — engineers who build in public, share architectures, and contribute to open source projects. AWS Community Builders, User Groups, re:Invent talks — all share a principle: show what you build, explain why you made those decisions, and let others learn from your mistakes.
A serverless second brain on AWS is exactly that kind of project. It's not a product — it's a reference prototype. An architecture that other builders can take, adapt to their domain, and deploy with terraform apply. The code is open source. Architecture decisions are documented in essays like this one. Costs are transparent.
What I want to explore with the community:
These are questions a single builder can't answer. They need data from multiple implementations, at multiple scales, with multiple usage patterns. That's exactly what a community of builders produces.
The current prototype — Vercel, MDX, agents in GitHub Actions — keeps working. I'm not migrating it tomorrow. But the design is ready, and the pieces are clear:
Each phase is independent and deployable separately. Each adds value without requiring the next. And each is a topic for a future essay — with code, real costs, and lessons learned.
The second brain isn't the destination — it's the vehicle. What matters is the knowledge it structures, the connections it reveals, and the decisions it informs. The serverless architecture is simply the most efficient way I know to keep that vehicle running — no servers to maintain, no fixed costs, and the ability to scale from a curious engineer to a full research team.
The gap between a personal second brain and a production knowledge system isn't about technology — it's about architecture. The pieces exist: DynamoDB for graphs, Bedrock for AI, Lambda for compute, MCP for interoperability. What's missing is the design that connects them with the right constraints: zero idle cost, automatic scaling, and the separation between memory, compute, and interface that allows each layer to evolve independently.
This essay is that design. It's not definitive — it's a starting point for building, measuring, and learning in public.
**Serverless**: Cloud computing model where the provider manages infrastructure automatically, allowing code execution without provisioning or managing servers, paying only for actual usage.
**AWS Lambda**: AWS serverless compute service that runs code in response to events without provisioning or managing servers, automatically scaling from zero to thousands of concurrent executions.
**Amazon DynamoDB**: AWS serverless NoSQL database with single-digit millisecond latency at any scale, ideal for applications requiring high performance and automatic scalability.
**Amazon API Gateway**: AWS managed service for creating, publishing, and managing REST, HTTP, and WebSocket APIs that act as entry points to Lambda functions and other backend services.
**Amazon Bedrock**: AWS serverless service providing access to foundation models from multiple providers (Anthropic, Meta, Mistral, Amazon) via unified API, without managing ML infrastructure.
**Amazon S3**: AWS object storage service with 99.999999999% durability, unlimited scalability, and multiple storage classes for cost optimization.
**Amazon SNS**: AWS pub/sub messaging service that distributes messages to multiple subscribers simultaneously, enabling fan-out patterns and notifications at scale.
**Amazon EventBridge**: AWS serverless event bus connecting applications using events, enabling decoupled event-driven architectures with rule-based routing.
**AWS Step Functions**: AWS serverless orchestration service that coordinates multiple services into visual workflows, with built-in error handling, retries, and parallel execution.
**AWS IAM**: AWS identity and access management service controlling who can do what in your account, with granular policies based on the principle of least privilege.
**Knowledge graphs**: Data structures representing knowledge as networks of entities and relationships, enabling reasoning, connection discovery, and semantic queries over complex domains.
**Model Context Protocol (MCP)**: Open protocol created by Anthropic that standardizes how AI applications connect with external tools, data, and services through a universal interface.
**Cost optimization**: Practices and strategies to minimize cloud spending without sacrificing performance, including right-sizing, reservations, spot instances, and eliminating idle resources.
**AWS Well-Architected Framework**: AWS framework with six pillars of best practices for designing and operating reliable, secure, efficient, and cost-effective cloud systems.
**Infrastructure as Code (IaC)**: Practice of defining and managing infrastructure through versioned configuration files instead of manual processes. Foundation of modern operations automation.