Architecture design for scaling a personal second brain to a production system with AWS serverless — from the current prototype to specialized use cases in legal, research, and community building.
In "Building a Second Brain in Public" I documented how a weekend in March 2026 became a knowledge system with over 150 nodes, 450+ edges, and ~200,000 bilingual words. The prototype runs on Vercel — instant deploys, automatic SSL, global CDN. For a static site with MDX content, it's perfect.
But the prototype has clear limits. Capture requires creating MDX files with 11 frontmatter fields and committing — 15 to 30 minutes per concept. AI agents can read the full graph via MCP, but they can't write to it. Proactive surfacing is limited to a weekly QA agent that detects quality problems, not content relevant to the user. And semantic search failed in production because the WASM embedding model was too heavy to load in the browser.
These aren't bugs — they're the natural boundaries of a static system. Crossing them requires a stateful backend. And as an engineer who has built production systems on AWS serverless for years, that stack is where I have the most experience, the most control, and — being honest — the most curiosity to explore the limits.
This essay isn't a deployment tutorial. It's the architecture design for taking a personal second brain to production — and an exploration of what happens when that design is applied to specialized domains where structured knowledge has real value.
The principle that worked in the prototype — separating memory, compute, and interface — stays. What changes is that each layer gains capabilities the static system can't offer.
DynamoDB stores the structured metadata — the equivalent of the current MDX frontmatter plus graph edges and embedding vectors. The main table uses a single-table design:
| PK | SK | Data |
|---|---|---|
| NODE#serverless | META | Type, status, titles, summaries, tags, timestamps |
| NODE#serverless | EDGE#aws-lambda | Relationship type, weight, direction |
| NODE#serverless | EDGE#aws-api-gateway | Relationship type, weight, direction |
| NODE#serverless | EMBED | 1,024-dimension vector (Titan Text Embeddings V2) |
| AUDIT#2026-03-19T10:30:00Z | NODE#serverless | Action, author, diff |
A GSI inverts PK/SK for reverse queries: "what nodes point to serverless?" Another GSI projects status for queries like "all seeds not updated in 7 days."
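A sketch of how this key design answers both directions of a graph query, using an in-memory list as a stand-in for the table (a real implementation would issue boto3 `Query` calls; the items are illustrative):

```python
# In-memory stand-in for the single-table design: each item keyed by (PK, SK).
ITEMS = [
    {"PK": "NODE#serverless", "SK": "META", "data": "metadata"},
    {"PK": "NODE#serverless", "SK": "EDGE#aws-lambda", "data": "related-to"},
    {"PK": "NODE#serverless", "SK": "EDGE#aws-api-gateway", "data": "related-to"},
    {"PK": "NODE#aws-lambda", "SK": "EDGE#serverless", "data": "related-to"},
]

def outgoing_edges(node_id):
    """Main table query: PK = NODE#<id>, SK begins_with EDGE#."""
    pk = f"NODE#{node_id}"
    return [i for i in ITEMS if i["PK"] == pk and i["SK"].startswith("EDGE#")]

def incoming_edges(node_id):
    """GSI1 query (PK/SK inverted): all items whose SK is EDGE#<id>,
    i.e. "what nodes point to this one?"."""
    sk = f"EDGE#{node_id}"
    return [i for i in ITEMS if i["SK"] == sk]

print(len(outgoing_edges("serverless")))  # 2 outgoing edges
print(len(incoming_edges("serverless")))  # 1 incoming edge
```

The same access-pattern logic maps one-to-one onto a `KeyConditionExpression` on the main table and on GSI1.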
S3 stores long-form content — the MDX body in Spanish and English. DynamoDB has a 400KB item limit; an evergreen concept with 1,500 bilingual words fits, but separating content from metadata keeps graph queries fast and cheap.
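The split can be sketched as a single capture function that writes the body to S3 and only a pointer into the metadata item (in-memory dicts stand in for the AWS clients; key names are illustrative, not the project's schema):

```python
# Sketch of the metadata/content split: DynamoDB holds the pointer, S3 the body.
S3, TABLE = {}, {}

def capture(node_id, body_es, body_en, **frontmatter):
    key = f"content/{node_id}.mdx"
    S3[key] = {"es": body_es, "en": body_en}      # long-form bilingual content
    TABLE[("NODE#" + node_id, "META")] = {        # small, queryable metadata item
        **frontmatter,
        "content_key": key,
    }
    return key

capture("serverless", "cuerpo en español", "body in English",
        type="evergreen", status="published")
print(TABLE[("NODE#serverless", "META")]["content_key"])  # content/serverless.mdx
```

Graph traversals touch only the small metadata items; the S3 object is fetched once, at render time.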
Why not Aurora Serverless with Postgres? Three reasons:
Each Lambda function has a single responsibility:
These Lambda functions serve as tools for both the REST API (consumed by the SPA) and AgentCore Gateway, which automatically exposes them as MCP tools — read_node, add_node, connect_nodes, search, flag_stale — with no additional protocol code. Any MCP-compatible AI client can discover and invoke these tools through Gateway.
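A minimal sketch of that tool surface, with plain Python stubs in place of the real Lambda handlers (Gateway generates this name-to-function mapping automatically; the decorator and payloads here are illustrative, and only two of the five tools are stubbed):

```python
# Registry mimicking the tool surface AgentCore Gateway derives from Lambdas.
TOOLS = {}

def tool(name):
    """Register a function under an MCP tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("read_node")
def read_node(node_id: str) -> dict:
    return {"id": node_id, "status": "evergreen"}  # stub payload

@tool("flag_stale")
def flag_stale(days: int) -> list:
    return [f"seed not updated in {days} days"]  # stub payload

def invoke(name, **kwargs):
    """What an MCP client does after discovery: call a tool by name."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)

print(sorted(TOOLS))  # discovered tool names
print(invoke("read_node", node_id="serverless"))
```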
AgentCore Runtime hosts the AI agent in microVMs with session isolation. The agent reasons with Bedrock Claude and invokes tools registered in Gateway to interact with the graph.
Step Functions orchestrates multi-step flows — for example, the full capture pipeline: validate input → generate metadata with Bedrock → compute embeddings → persist → notify. If Bedrock fails due to throttling, Step Functions retries with exponential backoff without custom code.
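As a sketch, that retry policy lives entirely in the state machine definition. The resource names, error string, and backoff numbers here are illustrative, not taken from the project:

```hcl
# Hypothetical fragment: the pipeline's Bedrock step with declarative retries.
resource "aws_sfn_state_machine" "capture_pipeline" {
  name     = "SecondBrain-CapturePipeline"
  role_arn = aws_iam_role.sfn_exec.arn

  definition = jsonencode({
    StartAt = "GenerateMetadata"
    States = {
      GenerateMetadata = {
        Type     = "Task"
        Resource = aws_lambda_function.generate_metadata.arn
        Retry = [{
          ErrorEquals     = ["ThrottlingException"]
          IntervalSeconds = 2
          MaxAttempts     = 5
          BackoffRate     = 2.0 # 2s, 4s, 8s, 16s, 32s
        }]
        Next = "ComputeEmbeddings"
      }
      ComputeEmbeddings = {
        Type     = "Task"
        Resource = aws_lambda_function.embed.arn
        End      = true
      }
    }
  })
}
```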
CloudFront serves the static frontend from S3 — the same exported Next.js that runs on Vercel today, but with full control over headers, cache, and edge functions. API Gateway REST exposes the compute endpoints for the SPA with throttling, API keys, and IAM authorization. AgentCore Gateway exposes the same Lambda functions as MCP tools with OAuth authentication, semantic tool discovery, and automatic protocol translation.
The "two doors" — one interface for humans, another for AI agents — materialize:
The question every serverless architect must answer: how much does it cost at rest and under load?
| Service | Idle (0 req/day) | Moderate use (100 req/day) | High use (1,000 req/day) |
|---|---|---|---|
| DynamoDB on-demand | $0.00 | ~$0.25 | ~$2.50 |
| Lambda | $0.00 | ~$0.01 | ~$0.10 |
| API Gateway | $0.00 | ~$0.04 | ~$0.35 |
| S3 + CloudFront | ~$0.50 | ~$0.60 | ~$1.00 |
| Bedrock (embeddings) | $0.00 | ~$0.50 | ~$2.00 |
| Bedrock (chat/agent) | $0.00 | ~$1.00 | ~$5.00 |
| EventBridge + SNS | ~$0.01 | ~$0.01 | ~$0.01 |
| Step Functions | $0.00 | ~$0.03 | ~$0.25 |
| Total | ~$0.51 | ~$2.44 | ~$11.21 |
At rest, the system costs less than a coffee. Under moderate use — the pattern for a personal second brain — less than $3/month. Even under high load, it stays below $12/month. Estimates based on AWS pricing for us-east-1, March 2026.
The key is that serverless scales to zero. No servers running waiting for requests. No databases with capacity minimums. Every penny corresponds to real work.
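The totals in the table are plain sums over the per-service estimates; a small model keeps them checkable:

```python
# Monthly cost model from the table above (us-east-1 estimates, March 2026).
COSTS = {
    "dynamodb":       {"idle": 0.00, "moderate": 0.25, "high": 2.50},
    "lambda":         {"idle": 0.00, "moderate": 0.01, "high": 0.10},
    "api_gateway":    {"idle": 0.00, "moderate": 0.04, "high": 0.35},
    "s3_cloudfront":  {"idle": 0.50, "moderate": 0.60, "high": 1.00},
    "bedrock_embed":  {"idle": 0.00, "moderate": 0.50, "high": 2.00},
    "bedrock_chat":   {"idle": 0.00, "moderate": 1.00, "high": 5.00},
    "events_sns":     {"idle": 0.01, "moderate": 0.01, "high": 0.01},
    "step_functions": {"idle": 0.00, "moderate": 0.03, "high": 0.25},
}

def total(tier: str) -> float:
    """Sum a pricing tier across all services, rounded to cents."""
    return round(sum(svc[tier] for svc in COSTS.values()), 2)

print(total("idle"), total("moderate"), total("high"))  # 0.51 2.44 11.21
```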
All infrastructure is defined with Terraform — the same approach that already manages the project's current IAM. A single terraform apply stands up the complete system:
```hcl
resource "aws_dynamodb_table" "knowledge_graph" {
  name         = "SecondBrain-KnowledgeGraph"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "PK"
  range_key    = "SK"

  attribute {
    name = "PK"
    type = "S"
  }

  attribute {
    name = "SK"
    type = "S"
  }

  # GSI for reverse queries (what points to this node?)
  global_secondary_index {
    name            = "GSI1"
    hash_key        = "SK"
    range_key       = "PK"
    projection_type = "ALL"
  }

  point_in_time_recovery { enabled = true }
}

resource "aws_lambda_function" "capture" {
  function_name = "SecondBrain-Capture"
  runtime       = "nodejs22.x"
  handler       = "capture.handler"
  filename      = "lambda.zip"
  role          = aws_iam_role.lambda_exec.arn
  timeout       = 30
  memory_size   = 512

  environment {
    variables = { TABLE_NAME = aws_dynamodb_table.knowledge_graph.name }
  }
}
```

Terraform offers practical advantages for this project: remote state in S3 with DynamoDB locking, an explicit plan/apply cycle that shows exactly what will change before executing, and a provider ecosystem covering all AWS services, including AgentCore. The project's existing infrastructure — the IAM role for content agents with GitHub Actions OIDC — already runs on Terraform.
The current prototype is a general technical knowledge second brain — software engineering, cloud, AI concepts. But the serverless architecture I described has nothing technology-specific. It's a knowledge graph system with capture, semantic search, agents, and proactive surfacing. That opens the door to domains where structured knowledge has much higher value.
The base case. A professional — engineer, designer, product manager — builds their public second brain. They share what they learn, connect ideas, let AI agents consume their graph. The value is in accumulation and connections.
This is what jonmatum.com does today. The serverless version adds low-friction capture (an endpoint instead of a commit), real semantic search, and agents that can write. Cost stays in pennies.
A law firm handles thousands of documents — rulings, contracts, regulations, precedents. Today they live in SharePoint folders or document management systems that search by full text. A legal second brain would change the paradigm:
The differentiating value: connections between legal documents are the real knowledge of an experienced lawyer. A junior with graph access can navigate relationships that would take years of experience to discover.
The architecture is identical — what changes are node types, relationships, and the agent's prompt. DynamoDB stores metadata, S3 stores full documents, Bedrock generates embeddings and contextual responses.
An R&D team produces papers, prototypes, datasets, experimental results. Knowledge fragments across Notion, Google Drive, Slack, and researchers' memories. An R&D second brain:
The value: in teams of 5+ researchers, tacit knowledge — who worked on what, what was tried and failed, what connections exist between research lines — gets lost. The graph captures it and makes it navigable.
An educational institution or educational content creator:
All these cases share the same architecture:
What changes between domains:
| Component | General | Legal | R&D | Education |
|---|---|---|---|---|
| Node types | Concept, note, experiment | Ruling, statute, contract | Paper, experiment, dataset | Concept, exercise, resource |
| Relationships | "related to" | "cites," "repeals" | "reproduces," "refutes" | "prerequisite of" |
| Agent prompt | Staff+ engineer | Senior lawyer | Principal researcher | Personalized tutor |
| Surfacing | Forgotten seeds | Contradicting precedents | Cross-referenced results | Learning gaps |
The AWS infrastructure is identical. A terraform apply with different variables.
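To make "different variables" concrete, one way the domain parameterization could look (variable names and defaults are hypothetical, not the project's actual module interface):

```hcl
# Hypothetical domain parameters feeding node validation and the agent prompt.
variable "node_types" {
  type    = list(string)
  default = ["concept", "note", "experiment"] # legal: ["ruling", "statute", "contract"]
}

variable "relationship_types" {
  type    = list(string)
  default = ["related-to"] # legal: ["cites", "repeals"]
}

variable "agent_persona" {
  type    = string
  default = "Staff+ engineer" # legal: "senior lawyer"; education: "personalized tutor"
}
```

Deploying the legal variant would then be a matter of `terraform apply -var-file=legal.tfvars` against the same modules.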
This design doesn't exist in a vacuum. It's born from the intersection of two communities I care about: the AWS community and the PKM (Personal Knowledge Management) community.
The AWS community has a strong tradition of "builders" — engineers who build in public, share architectures, and contribute to open source projects. AWS Community Builders, User Groups, re:Invent talks — all share a principle: show what you build, explain why you made those decisions, and let others learn from your mistakes.
A serverless second brain on AWS is exactly that kind of project. It's not a product — it's a reference prototype. An architecture that other builders can take, adapt to their domain, and deploy with terraform apply. The code is open source. Architecture decisions are documented in essays like this one. Costs are transparent.
What I want to explore with the community:
These are questions a single builder can't answer. They need data from multiple implementations, at multiple scales, with multiple usage patterns. That's exactly what a community of builders produces.
The current prototype — Vercel, MDX, agents in GitHub Actions — keeps working. I'm not migrating it tomorrow. But the design is ready, and the pieces are clear:
Each phase is independent and deployable separately. Each adds value without requiring the next. And each is a topic for a future essay — with code, real costs, and lessons learned.
The second brain isn't the destination — it's the vehicle. What matters is the knowledge it structures, the connections it reveals, and the decisions it informs. The serverless architecture is simply the most efficient way I know to keep that vehicle running — no servers to maintain, no fixed costs, and the ability to scale from a curious engineer to a full research team.
The gap between a personal second brain and a production knowledge system isn't about technology — it's about architecture. The pieces exist: DynamoDB for graphs, Bedrock for AI, Lambda for compute, MCP for interoperability. What's missing is the design that connects them with the right constraints: zero idle cost, automatic scaling, and the separation between memory, compute, and interface that allows each layer to evolve independently.
This essay is that design. It's not definitive — it's a starting point for building, measuring, and learning in public.
**Serverless**: Cloud computing model where the provider manages infrastructure automatically, allowing code execution without provisioning or managing servers, paying only for actual usage.
**AWS Lambda**: AWS serverless compute service that runs code in response to events without provisioning or managing servers, automatically scaling from zero to thousands of concurrent executions.
**Amazon DynamoDB**: AWS serverless NoSQL database with single-digit millisecond latency at any scale, ideal for applications requiring high performance and automatic scalability.
**Amazon API Gateway**: AWS managed service for creating, publishing, and managing REST, HTTP, and WebSocket APIs that act as entry points to Lambda functions and other backend services.
**Amazon Bedrock**: AWS serverless service providing access to foundation models from multiple providers (Anthropic, Meta, Mistral, Amazon) via unified API, without managing ML infrastructure.
**Amazon S3**: AWS object storage service with 99.999999999% durability, unlimited scalability, and multiple storage classes for cost optimization.
**Amazon SNS**: AWS pub/sub messaging service that distributes messages to multiple subscribers simultaneously, enabling fan-out patterns and notifications at scale.
**Amazon EventBridge**: AWS serverless event bus connecting applications using events, enabling decoupled event-driven architectures with rule-based routing.
**AWS Step Functions**: AWS serverless orchestration service that coordinates multiple services into visual workflows, with built-in error handling, retries, and parallel execution.
**AWS IAM**: AWS identity and access management service controlling who can do what in your account, with granular policies based on the principle of least privilege.
**Knowledge graphs**: Data structures representing knowledge as networks of entities and relationships, enabling reasoning, connection discovery, and semantic queries over complex domains.
**Model Context Protocol (MCP)**: Open protocol created by Anthropic that standardizes how AI applications connect with external tools, data, and services through a universal interface.
**Cost optimization**: Practices and strategies to minimize cloud spending without sacrificing performance, including right-sizing, reservations, spot instances, and eliminating idle resources.
**AWS Well-Architected Framework**: AWS framework with six pillars of best practices for designing and operating reliable, secure, efficient, and cost-effective cloud systems.
**Infrastructure as Code (IaC)**: Practice of defining and managing infrastructure through versioned configuration files instead of manual processes. Foundation of modern operations automation.