
© 2026 Jonatan Mata · alpha · v0.1.0

Essays

From Prototype to Production: A Serverless Second Brain on AWS

Architecture design for scaling a personal second brain to a production system with AWS serverless — from the current prototype to specialized use cases in legal, research, and community building.

evergreen · #second-brain · #aws · #serverless · #architecture · #production · #community · #bedrock · #dynamodb · #lambda · #mcp

The starting point

In "Building a Second Brain in Public" I documented how a weekend in March 2026 became a knowledge system with over 150 nodes, 450+ edges, and ~200,000 bilingual words. The prototype runs on Vercel — instant deploys, automatic SSL, global CDN. For a static site with MDX content, it's perfect.

But the prototype has clear limits. Capture requires creating MDX files with 11 frontmatter fields and committing — 15 to 30 minutes per concept. AI agents can read the full graph via MCP, but they can't write to it. Proactive surfacing is limited to a weekly QA agent that detects quality problems, not content relevant to the user. And semantic search failed in production because the WASM embedding model was too heavy to load in the browser.

These aren't bugs — they're the natural boundaries of a static system. Crossing them requires a stateful backend. And as an engineer who has built production systems on AWS serverless for years, that stack is where I have the most experience, the most control, and — being honest — the most curiosity to explore the limits.

This essay isn't a deployment tutorial. It's the architecture design for taking a personal second brain to production — and an exploration of what happens when that design is applied to specialized domains where structured knowledge has real value.

The architecture: each service, one responsibility

The principle that worked in the prototype — separating memory, compute, and interface — stays. What changes is that each layer gains capabilities the static system can't offer.

[Diagram: the three-layer architecture — memory, compute, interface]

Memory layer: DynamoDB + S3

DynamoDB stores the structured metadata — the equivalent of the current MDX frontmatter plus graph edges and embedding vectors. The main table uses a single-table design:

| PK | SK | Data |
| --- | --- | --- |
| NODE#serverless | META | Type, status, titles, summaries, tags, timestamps |
| NODE#serverless | EDGE#aws-lambda | Relationship type, weight, direction |
| NODE#serverless | EDGE#aws-api-gateway | Relationship type, weight, direction |
| NODE#serverless | EMBED | 1,024-dimension vector (Titan Text Embeddings V2) |
| AUDIT#2026-03-19T10:30:00Z | NODE#serverless | Action, author, diff |

A GSI inverts PK/SK for reverse queries: "what nodes point to serverless?" Another GSI keys on status for queries like "all seeds not updated in 7 days."
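To make the layout concrete, the key construction can be sketched as follows. This is a minimal illustration, not the real codebase: the helper names and item shapes are my assumptions; only the PK/SK/GSI conventions come from the table above.

```typescript
// Sketch of single-table key construction for the knowledge graph.
// PK/SK conventions follow the table above; helper names are illustrative.

type MetaItem = { PK: string; SK: "META"; title: string; status: string };
type EdgeItem = { PK: string; SK: string; relation: string; weight: number };

const nodeKey = (slug: string) => `NODE#${slug}`;

// META item: one per node, holds the frontmatter-equivalent metadata
function metaItem(slug: string, title: string, status: string): MetaItem {
  return { PK: nodeKey(slug), SK: "META", title, status };
}

// EDGE item: one per outgoing relationship
function edgeItem(from: string, to: string, relation: string, weight = 1): EdgeItem {
  return { PK: nodeKey(from), SK: `EDGE#${to}`, relation, weight };
}

// Forward query: Query(PK = "NODE#slug", SK begins_with "EDGE#")
// Reverse query via GSI1 (hash_key = SK): Query(SK = "EDGE#serverless")
// answers "what nodes point to serverless?"
const meta = metaItem("serverless", "Serverless", "evergreen");
const edge = edgeItem("aws-lambda", "serverless", "related to");
```

The reverse query works because GSI1 swaps the key roles: every edge pointing at a node shares the same SK value, so one query on the index returns all inbound edges.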

S3 stores long-form content — the MDX body in Spanish and English. DynamoDB has a 400KB item limit; an evergreen concept with 1,500 bilingual words fits, but separating content from metadata keeps graph queries fast and cheap.

Why not Aurora Serverless with Postgres? Three reasons:

  1. Data model: a knowledge graph is naturally key-value with relationships — DynamoDB's single-table pattern models it without an ORM or schema migrations
  2. Predictable latency: Aurora Serverless v2 has supported scaling to 0 ACUs since late 2024, but a paused cluster incurs reconnection latency on wake-up. DynamoDB responds in consistent milliseconds, with no pauses or resumes
  3. Operational simplicity: DynamoDB requires no VPCs, subnets, or security groups. A single Terraform resource with IAM — no intermediate network layers

Compute layer: Lambda + Step Functions + EventBridge

Each Lambda function has a single responsibility:

  • Capture: receives text + optional URL, generates frontmatter with Bedrock, classifies type, suggests tags and cross-refs, persists to DynamoDB + S3
  • Search: hybrid search — keyword matching in DynamoDB + cosine similarity against pre-computed embeddings with Bedrock Titan
  • Graph: reads edges from DynamoDB, builds the graph in memory, responds with JSON for D3
  • Surfacing: daily cron via EventBridge that identifies forgotten seeds, nodes with few connections, concepts that should be related but aren't
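The semantic half of the Search function reduces to cosine similarity over pre-computed vectors. A minimal sketch follows; the real system would use 1,024-dimension Titan embeddings, and the ranking function here is my assumption about how results get ordered:

```typescript
// Cosine similarity between a query embedding and stored node embeddings.
// Real vectors come from Bedrock Titan (1,024 dims); these are toy 3-dim
// vectors for illustration.

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored embeddings against the query vector, highest similarity first
function rank(query: number[], nodes: { slug: string; embed: number[] }[]) {
  return nodes
    .map((n) => ({ slug: n.slug, score: cosineSimilarity(query, n.embed) }))
    .sort((x, y) => y.score - x.score);
}

const results = rank([1, 0, 0], [
  { slug: "serverless", embed: [0.9, 0.1, 0] },
  { slug: "unrelated", embed: [0, 1, 0] },
]);
```

At personal-graph scale (hundreds of nodes), a brute-force scan like this inside one Lambda invocation is cheap; a vector index only becomes necessary at much larger node counts.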

These Lambda functions serve as tools for both the REST API (consumed by the SPA) and AgentCore Gateway, which automatically exposes them as MCP tools — read_node, add_node, connect_nodes, search, flag_stale — with no additional protocol code. Any MCP-compatible AI client can discover and invoke these tools through Gateway.
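The tool surface Gateway exposes can be sketched as plain definitions. The tool names come from the list above; the descriptions and schema fields are illustrative assumptions following MCP's JSON Schema convention:

```typescript
// Sketch of the MCP tool definitions Gateway would expose.
// Tool names from the essay; descriptions and schemas are illustrative.

type ToolDef = {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string }>;
    required?: string[];
  };
};

const tools: ToolDef[] = [
  {
    name: "read_node",
    description: "Fetch a node's metadata and content by slug",
    inputSchema: { type: "object", properties: { slug: { type: "string" } }, required: ["slug"] },
  },
  {
    name: "add_node",
    description: "Create a new seed node from raw text",
    inputSchema: { type: "object", properties: { text: { type: "string" } }, required: ["text"] },
  },
  {
    name: "connect_nodes",
    description: "Create an edge between two existing nodes",
    inputSchema: {
      type: "object",
      properties: { from: { type: "string" }, to: { type: "string" }, relation: { type: "string" } },
      required: ["from", "to"],
    },
  },
  {
    name: "search",
    description: "Hybrid keyword + semantic search over the graph",
    inputSchema: { type: "object", properties: { query: { type: "string" } }, required: ["query"] },
  },
  {
    name: "flag_stale",
    description: "List seeds not updated in the given number of days",
    inputSchema: { type: "object", properties: { days: { type: "number" } } },
  },
];
```

The point of Gateway is that these definitions are derived from the Lambda functions themselves; an MCP client discovers them at runtime rather than hardcoding the list.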

AgentCore Runtime hosts the AI agent in microVMs with session isolation. The agent reasons with Bedrock Claude and invokes tools registered in Gateway to interact with the graph.

Step Functions orchestrates multi-step flows — for example, the full capture pipeline: validate input → generate metadata with Bedrock → compute embeddings → persist → notify. If Bedrock fails due to throttling, Step Functions retries with exponential backoff without custom code.
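The retry behavior can be illustrated with a small simulation. This is not the Step Functions state machine itself (that would be Amazon States Language JSON); it's a sketch of the same semantics, with illustrative backoff parameters and a stubbed Bedrock call that throttles twice before succeeding:

```typescript
// Sketch of Step Functions-style retry with exponential backoff.
// The backoff parameters (3 attempts, 2x rate) are illustrative.

type Step<I, O> = (input: I) => O;

function withRetry<I, O>(step: Step<I, O>, maxAttempts = 3, backoffRate = 2, baseDelayMs = 100): Step<I, O> {
  return (input: I) => {
    let delay = baseDelayMs;
    for (let attempt = 1; ; attempt++) {
      try {
        return step(input);
      } catch (err) {
        if (attempt >= maxAttempts) throw err;
        // Step Functions would wait `delay` ms here before the next attempt
        delay *= backoffRate;
      }
    }
  };
}

// Stub Bedrock call that throttles twice before succeeding
let calls = 0;
const flakyBedrock = (text: string) => {
  calls++;
  if (calls < 3) throw new Error("ThrottlingException");
  return { title: text.slice(0, 20), tags: ["seed"] };
};

const generateMetadata = withRetry(flakyBedrock);
const metadata = generateMetadata("Serverless second brain capture");
// succeeds on the third attempt
```

In the real state machine this logic lives in a declarative Retry block on the Bedrock task state, which is exactly why no custom code is needed.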

Interface layer: CloudFront + API Gateway + AgentCore Gateway

CloudFront serves the static frontend from S3 — the same exported Next.js that runs on Vercel today, but with full control over headers, cache, and edge functions. API Gateway REST exposes the compute endpoints for the SPA with throttling, API keys, and IAM authorization. AgentCore Gateway exposes the same Lambda functions as MCP tools with OAuth authentication, semantic tool discovery, and automatic protocol translation.

The "two doors" — one interface for humans, another for AI agents — materialize:

  • Human door: CloudFront → SPA + API Gateway REST for search, interactive graph, tag navigation
  • Agent door: AgentCore Gateway → MCP tools with read and write, semantic discovery, and OAuth authorization

The cost: pennies, not dollars

The question every serverless architect must answer: how much does it cost at rest and under load?

| Service | Idle (0 req/day) | Moderate use (100 req/day) | High use (1,000 req/day) |
| --- | --- | --- | --- |
| DynamoDB on-demand | $0.00 | ~$0.25 | ~$2.50 |
| Lambda | $0.00 | ~$0.01 | ~$0.10 |
| API Gateway | $0.00 | ~$0.04 | ~$0.35 |
| S3 + CloudFront | ~$0.50 | ~$0.60 | ~$1.00 |
| Bedrock (embeddings) | $0.00 | ~$0.50 | ~$2.00 |
| Bedrock (chat/agent) | $0.00 | ~$1.00 | ~$5.00 |
| EventBridge + SNS | ~$0.01 | ~$0.01 | ~$0.01 |
| Step Functions | $0.00 | ~$0.03 | ~$0.25 |
| **Total** | **~$0.51** | **~$2.44** | **~$11.21** |

At rest, the system costs less than a coffee. Under moderate use — the pattern for a personal second brain — less than $3/month. Even under high load, it stays below $12/month. Estimates based on AWS pricing for us-east-1, March 2026.

The key is that serverless scales to zero. No servers running waiting for requests. No databases with capacity minimums. Every penny corresponds to real work.

Infrastructure as code: Terraform as the foundation

All infrastructure is defined with Terraform — the same approach that already manages the project's current IAM. A single terraform apply stands up the complete system:

resource "aws_dynamodb_table" "knowledge_graph" {
  name         = "SecondBrain-KnowledgeGraph"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "PK"
  range_key    = "SK"
 
  attribute {
    name = "PK"
    type = "S"
  }
  attribute {
    name = "SK"
    type = "S"
  }
 
  # GSI for reverse queries (what points to this node?)
  global_secondary_index {
    name            = "GSI1"
    hash_key        = "SK"
    range_key       = "PK"
    projection_type = "ALL"
  }
 
  point_in_time_recovery { enabled = true }
}
 
resource "aws_lambda_function" "capture" {
  function_name = "SecondBrain-Capture"
  runtime       = "nodejs22.x"
  handler       = "capture.handler"
  filename      = "lambda.zip"
  role          = aws_iam_role.lambda_exec.arn
  timeout       = 30
  memory_size   = 512
 
  environment {
    variables = { TABLE_NAME = aws_dynamodb_table.knowledge_graph.name }
  }
}

Terraform offers practical advantages for this project: remote state in S3 with DynamoDB locking, explicit plan/apply that shows exactly what changes before executing, and a provider ecosystem covering all AWS services including AgentCore. The project's existing infrastructure — the IAM role for content agents with GitHub Actions OIDC — already runs on Terraform.
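With those resources in place, the core logic of the Capture function might look roughly like this. It's a sketch, not the real handler: the AWS clients are injected as interfaces so the flow is visible (and testable) without DynamoDB, S3, or Bedrock, and every function and field name here is my assumption.

```typescript
// Sketch of the Capture Lambda's core flow with dependencies injected.
// In the deployed function these would be real DynamoDB, S3, and Bedrock
// clients; the interface shapes are illustrative assumptions.

interface Deps {
  generateFrontmatter: (text: string) => { slug: string; title: string; tags: string[] };
  putItem: (item: Record<string, unknown>) => void;
  putObject: (key: string, body: string) => void;
}

function captureHandler(input: { text: string; url?: string }, deps: Deps) {
  // 1. Bedrock generates the frontmatter-equivalent metadata
  const meta = deps.generateFrontmatter(input.text);
  // 2. Metadata goes to DynamoDB as the node's META item
  deps.putItem({
    PK: `NODE#${meta.slug}`,
    SK: "META",
    title: meta.title,
    tags: meta.tags,
    sourceUrl: input.url,
  });
  // 3. Long-form content goes to S3, keyed by slug
  deps.putObject(`content/${meta.slug}.mdx`, input.text);
  return { slug: meta.slug };
}

// In-memory stubs standing in for the AWS clients
const items: Record<string, unknown>[] = [];
const objects: Record<string, string> = {};
const result = captureHandler(
  { text: "Notes on DynamoDB single-table design" },
  {
    generateFrontmatter: (t) => ({ slug: "dynamodb-notes", title: t.slice(0, 30), tags: ["seed"] }),
    putItem: (i) => { items.push(i); },
    putObject: (k, b) => { objects[k] = b; },
  }
);
```

Injecting the clients keeps the Lambda handler a thin adapter around pure logic, which is what makes the same functions reusable behind both API Gateway and AgentCore Gateway.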

Use cases: from general knowledge to specialized

The current prototype is a general technical knowledge second brain — software engineering, cloud, AI concepts. But the serverless architecture described here has nothing domain-specific about it. It's a knowledge graph system with capture, semantic search, agents, and proactive surfacing. That opens the door to domains where structured knowledge has much higher value.

Public and general knowledge

The base case. A professional — engineer, designer, product manager — builds their public second brain. They share what they learn, connect ideas, let AI agents consume their graph. The value is in accumulation and connections.

This is what jonmatum.com does today. The serverless version adds low-friction capture (an endpoint instead of a commit), real semantic search, and agents that can write. Cost stays in pennies.

Legal: case law as a knowledge graph

A law firm handles thousands of documents — rulings, contracts, regulations, precedents. Today they live in SharePoint folders or document management systems that search by full text. A legal second brain would change the paradigm:

  • Nodes: each ruling, statute, contract template, legal opinion
  • Edges: "cites," "contradicts," "extends," "repeals," "applies principle of"
  • Capture: a lawyer reads a new ruling and adds it via an endpoint — the agent auto-classifies, extracts citations, connects with existing precedents
  • Surfacing: "this ruling from last week contradicts the precedent you used in case X"
  • MCP agent: a legal assistant that navigates the graph to find relevant precedents, identifies contradictions, and generates argumentation drafts

The differentiating value: connections between legal documents are the real knowledge of an experienced lawyer. A junior with graph access can navigate relationships that would take years of experience to discover.

The architecture is identical — what changes are node types, relationships, and the agent's prompt. DynamoDB stores metadata, S3 stores full documents, Bedrock generates embeddings and contextual responses.

Research and development: the connected lab

An R&D team produces papers, prototypes, datasets, experimental results. Knowledge fragments across Notion, Google Drive, Slack, and researchers' memories. An R&D second brain:

  • Nodes: papers, experiments, datasets, hypotheses, results
  • Edges: "reproduces," "refutes," "extends," "uses dataset from," "inspired by"
  • Capture: a researcher finishes an experiment and logs results — the agent automatically connects with previous hypotheses and related papers
  • Surfacing: "María's experiment last week got results similar to your January hypothesis — should you collaborate?"
  • MCP agent: a research assistant that cross-references experimental results with existing literature

The value: in teams of 5+ researchers, tacit knowledge — who worked on what, what was tried and failed, what connections exist between research lines — gets lost. The graph captures it and makes it navigable.

Education: the living curriculum

An educational institution or educational content creator:

  • Nodes: curriculum concepts, exercises, assessments, resources
  • Edges: "prerequisite of," "complements," "assesses understanding of"
  • Capture: a teacher adds a new resource — the agent connects it with the concepts it covers
  • Surfacing: "3 students failed on concept X — here are alternative resources covering the prerequisites"
  • MCP agent: a tutor that navigates the graph to create personalized learning paths

The common pattern

All these cases share the same architecture:

[Diagram: the shared architecture across domains]

What changes between domains:

| Component | General | Legal | R&D | Education |
| --- | --- | --- | --- | --- |
| Node types | Concept, note, experiment | Ruling, statute, contract | Paper, experiment, dataset | Concept, exercise, resource |
| Relationships | "related to" | "cites," "repeals" | "reproduces," "refutes" | "prerequisite of" |
| Agent prompt | Staff+ engineer | Senior lawyer | Principal researcher | Personalized tutor |
| Surfacing | Forgotten seeds | Contradicting precedents | Cross-referenced results | Learning gaps |

The AWS infrastructure is identical. A terraform apply with different variables.
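The "different variables" claim can be made concrete: each domain reduces to a small configuration object over the same stack. The values below come from the comparison table; the field names and the shape of the config are illustrative assumptions.

```typescript
// Sketch of per-domain configuration: same infrastructure, different
// variables. Values taken from the comparison table; shape is illustrative.

type DomainConfig = {
  nodeTypes: string[];
  relations: string[];
  agentPrompt: string;
};

const domains: Record<string, DomainConfig> = {
  general: {
    nodeTypes: ["concept", "note", "experiment"],
    relations: ["related to"],
    agentPrompt: "Staff+ engineer",
  },
  legal: {
    nodeTypes: ["ruling", "statute", "contract"],
    relations: ["cites", "repeals"],
    agentPrompt: "Senior lawyer",
  },
  rnd: {
    nodeTypes: ["paper", "experiment", "dataset"],
    relations: ["reproduces", "refutes"],
    agentPrompt: "Principal researcher",
  },
  education: {
    nodeTypes: ["concept", "exercise", "resource"],
    relations: ["prerequisite of"],
    agentPrompt: "Personalized tutor",
  },
};

// Rough equivalent of `terraform apply` with per-domain variables:
// pick a config and deploy the same stack
function deployVars(domain: keyof typeof domains) {
  return { stack: "second-brain", ...domains[domain] };
}
```

Everything else — the table schema, the Lambda functions, the gateways — stays untouched between domains.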

Building in community

This design doesn't exist in a vacuum. It's born from the intersection of two communities I care about: the AWS community and the PKM (Personal Knowledge Management) community.

The AWS community has a strong tradition of "builders" — engineers who build in public, share architectures, and contribute to open source projects. AWS Community Builders, User Groups, re:Invent talks — all share a principle: show what you build, explain why you made those decisions, and let others learn from your mistakes.

A serverless second brain on AWS is exactly that kind of project. It's not a product — it's a reference prototype. An architecture that other builders can take, adapt to their domain, and deploy with terraform apply. The code is open source. Architecture decisions are documented in essays like this one. Costs are transparent.

What I want to explore with the community:

  1. The knowledge graph pattern in DynamoDB — single table with GSIs for bidirectional relationships. Does it work at 10,000 nodes? At 100,000? Where are the limits?
  2. MCP agents with write access — most MCP implementations are read-only. What happens when the agent can mutate the graph? What quality controls are needed?
  3. Serverless semantic search — Bedrock Titan Embeddings + DynamoDB vs. OpenSearch Serverless vs. pgvector on Aurora. What's the crossover point in cost and latency?
  4. The real cost of Bedrock in production — the penny estimates are theoretical. How does it behave with real usage, throttling, and model cold starts?

These are questions a single builder can't answer. They need data from multiple implementations, at multiple scales, with multiple usage patterns. That's exactly what a community of builders produces.

What comes next

The current prototype — Vercel, MDX, agents in GitHub Actions — keeps working. I'm not migrating it tomorrow. But the design is ready, and the pieces are clear:

  1. Phase 1: Capture API — a Lambda + API Gateway endpoint that accepts text and creates a seed in DynamoDB. The low-friction capture that's missing today
  2. Phase 2: Semantic search — embeddings with Bedrock Titan, stored in DynamoDB, queried by cosine similarity in Lambda
  3. Phase 3: MCP agent with write access — Bedrock AgentCore Runtime hosts the agent, AgentCore Gateway exposes Lambda tools as MCP. The agent door stops being read-only
  4. Phase 4: Proactive surfacing — EventBridge + Lambda + SNS for daily digests with forgotten seeds, suggested connections, and relevant content

Each phase is independent and deployable separately. Each adds value without requiring the next. And each is a topic for a future essay — with code, real costs, and lessons learned.

The second brain isn't the destination — it's the vehicle. What matters is the knowledge it structures, the connections it reveals, and the decisions it informs. The serverless architecture is simply the most efficient way I know to keep that vehicle running — no servers to maintain, no fixed costs, and the ability to scale from a curious engineer to a full research team.

Why it matters

The gap between a personal second brain and a production knowledge system isn't about technology — it's about architecture. The pieces exist: DynamoDB for graphs, Bedrock for AI, Lambda for compute, MCP for interoperability. What's missing is the design that connects them with the right constraints: zero idle cost, automatic scaling, and the separation between memory, compute, and interface that allows each layer to evolve independently.

This essay is that design. It's not definitive — it's a starting point for building, measuring, and learning in public.

References

  • AWS Serverless Applications Lens — AWS Well-Architected Framework. Best practices guide for serverless architectures.
  • Amazon Bedrock AgentCore — AWS. Platform for deploying AI agents with serverless Runtime, MCP Gateway, and identity management.
  • Model Context Protocol — Specification — Anthropic/Linux Foundation. Protocol specification for agent interoperability.
  • Amazon DynamoDB Developer Guide — AWS. Official DynamoDB documentation.
  • Building a Second Brain — Tiago Forte, 2022. The book that popularized the concept and the PARA method.
  • AWS Serverless — AWS. Official AWS serverless portfolio page.

Related content

  • Serverless

    Cloud computing model where the provider manages infrastructure automatically, allowing code execution without provisioning or managing servers, paying only for actual usage.

  • AWS Lambda

    AWS serverless compute service that runs code in response to events without provisioning or managing servers, automatically scaling from zero to thousands of concurrent executions.

  • AWS DynamoDB

    AWS serverless NoSQL database with single-digit millisecond latency at any scale, ideal for applications requiring high performance and automatic scalability.

  • AWS API Gateway

    AWS managed service for creating, publishing, and managing REST, HTTP, and WebSocket APIs that act as entry points to Lambda functions and other backend services.

  • AWS Bedrock

    AWS serverless service providing access to foundation models from multiple providers (Anthropic, Meta, Mistral, Amazon) via unified API, without managing ML infrastructure.

  • AWS S3

    AWS object storage service with 99.999999999% durability, unlimited scalability, and multiple storage classes for cost optimization.

  • AWS SNS

    AWS pub/sub messaging service that distributes messages to multiple subscribers simultaneously, enabling fan-out patterns and notifications at scale.

  • AWS EventBridge

    AWS serverless event bus connecting applications using events, enabling decoupled event-driven architectures with rule-based routing.

  • AWS Step Functions

    AWS serverless orchestration service that coordinates multiple services into visual workflows, with built-in error handling, retries, and parallel execution.

  • AWS IAM

    AWS identity and access management service controlling who can do what in your account, with granular policies based on the principle of least privilege.

  • Knowledge Graphs

    Data structures representing knowledge as networks of entities and relationships, enabling reasoning, connection discovery, and semantic queries over complex domains.

  • Model Context Protocol (MCP)

    Open protocol created by Anthropic that standardizes how AI applications connect with external tools, data, and services through a universal interface.

  • Cost Optimization

    Practices and strategies to minimize cloud spending without sacrificing performance, including right-sizing, reservations, spot instances, and eliminating idle resources.

  • AWS Well-Architected Framework

    AWS framework with six pillars of best practices for designing and operating reliable, secure, efficient, and cost-effective cloud systems.

  • Infrastructure as Code

    Practice of defining and managing infrastructure through versioned configuration files instead of manual processes. Foundation of modern operations automation.
