
© 2026 Jonatan Mata · alpha · v0.1.0

Experiments

Content Agent with Strands and Bedrock

Three-agent system that automates the bilingual MDX content lifecycle: deterministic QA auditing, surgical fixes, and full upgrades — all orchestrated with Strands Agents, Claude Sonnet 4 on Amazon Bedrock, and GitHub Actions with a diamond workflow pattern.

evergreen · #strands-agents #bedrock #claude #github-actions #terraform #oidc #automation #content-pipeline #python #multi-agent

What it is

A three-agent AI system that automates the full content lifecycle for a bilingual knowledge base (Spanish/English). The agents run in GitHub Actions, use Strands Agents as the orchestration framework, and Claude Sonnet 4 on Amazon Bedrock as the language model.

The system implements a continuous feedback loop: a QA agent audits content and opens issues → a fix agent applies surgical corrections → a content agent generates full upgrades → a human reviews and approves the PRs.

System architecture

[Diagram: three-agent architecture (QA audit → issues → fixes/upgrades → PRs → human review)]

The three agents

QA Agent — deterministic auditor

The QA agent (agents/qa_agent.py) runs structural checks without an LLM and optionally a deep review with Claude. It does not modify files — it only opens issues.

Structural checks (no LLM, no cost):

  • Word count (700+ for evergreen concepts)
  • Reference count (5+) and tier diversity (2+)
  • Cross-references (3+) and broken cross-refs
  • Required sections (¿Por qué importa?)
  • Mermaid diagram accessibility (accTitle, accDescr)
  • English headings in Spanish files
  • External links that should be internal (/concepts/slug)
  • Placeholder stubs in .en.mdx files
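Checks like these reduce to plain string and regex work, which is why they cost nothing to run. A minimal sketch of two of them (function names and return shape are illustrative, not the actual qa_agent internals):

```python
import re

def check_word_count(body: str, minimum: int = 700) -> list[str]:
    """Flag evergreen concepts that fall below the word-count floor."""
    count = len(body.split())
    return [] if count >= minimum else [f"word count {count} < {minimum}"]

def check_required_sections(body: str) -> list[str]:
    """Flag files missing the mandatory '¿Por qué importa?' heading."""
    if re.search(r"^#+\s*¿Por qué importa\?", body, re.MULTILINE):
        return []
    return ["missing section: ¿Por qué importa?"]
```

Each check returns a list of findings, so the auditor can concatenate results across checks and concepts.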

Deep review (--deep, uses Bedrock):

  • Filler text (generic interchangeable sentences)
  • Unsourced claims (statistics, pricing, dates)
  • Weak sections that restate the definition
  • Generic "¿Por qué importa?" without specific tradeoffs
  • Pseudocode instead of runnable examples
  • Comparison tables with abstract categories
# Local execution
python -m agents.qa_agent --dry-run --status evergreen    # audit without creating issues
python -m agents.qa_agent --deep --slug serverless        # LLM review of one concept
python -m agents.qa_agent --discover                      # JSON matrix for CI
python -m agents.qa_agent --single git                    # audit one + create issue

QA Fix Agent — surgical corrections

The fix agent (agents/qa_fix_agent.py) processes QA issues with minimal changes. It does not rewrite content — it only fixes what the issue describes.

Fix strategies by finding type:

Finding                             Strategy
refs — missing references           Find primary source, add to ES + EN, verify URL
ref_tiers — low diversity           Identify missing tier, add reference from that tier
xrefs — few cross-refs              Read content, find related concepts, add to frontmatter
broken_xref — broken ref            Remove non-existent slug or replace with valid one
heading — English heading           Translate to Spanish keeping heading level
ext_link — external link            Replace external URL with /concepts/slug in ES + EN
missing_section — missing section   Add section with substantive content
mermaid — no accessibility          Add accTitle: and accDescr: to diagram

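A natural way to route findings to strategies is a dispatch table keyed on finding type; a sketch under the assumption of one handler function per row (handler names and bodies are illustrative):

```python
def fix_refs(issue: dict) -> str:
    """Illustrative handler: add a missing reference."""
    return f"add reference to {issue['slug']}"

def fix_broken_xref(issue: dict) -> str:
    """Illustrative handler: drop or replace a broken cross-ref."""
    return f"remove broken xref in {issue['slug']}"

# one handler per finding type; unknown types fall through to a no-op
FIX_STRATEGIES = {
    "refs": fix_refs,
    "broken_xref": fix_broken_xref,
}

def dispatch(issue: dict) -> str:
    handler = FIX_STRATEGIES.get(issue["type"])
    if handler is None:
        return f"skip: no strategy for {issue['type']}"
    return handler(issue)
```

Keeping the mapping explicit makes it obvious which finding types the fix agent is allowed to touch, and everything else is skipped rather than rewritten.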
python -m agents.qa_fix_agent --issue 175       # fix a single QA issue
python -m agents.qa_fix_agent --batch 5         # fix 5 issues
python -m agents.qa_fix_agent --dry-run         # test without LLM

Content Agent — full upgrades

The content agent (agents/content_agent.py) generates full rewrites to bring content from seed/growing to evergreen quality. It processes both upgrade: and qa: issues.

python -m agents.content_agent --issue 143      # process one issue
python -m agents.content_agent --batch 3        # process 3 issues
python -m agents.content_agent --dry-run        # test without LLM

Shared tools

All three agents share four tools defined with the Strands @tool decorator:

import os

import httpx
from strands import Agent, tool
from strands.models import BedrockModel

@tool
def verify_url(url: str) -> str:
    """Verify a URL returns HTTP 200."""
    r = httpx.head(url, follow_redirects=True, timeout=10)
    return f"{url} → HTTP {r.status_code}"

@tool
def read_file(path: str) -> str:
    """Read a file from the repository."""
    with open(os.path.join(os.environ["REPO_ROOT"], path)) as f:
        return f.read()

@tool
def write_file(path: str, content: str) -> str:
    """Write content to a file."""
    with open(os.path.join(os.environ["REPO_ROOT"], path), "w") as f:
        f.write(content)
    return f"Written: {path}"

@tool
def list_concept_files() -> str:
    """List existing concepts for cross-references."""
    # returns available slugs for the frontmatter concepts: array
    # (the content directory below is illustrative)
    root = os.path.join(os.environ["REPO_ROOT"], "src/content/concepts")
    slugs = sorted(f.split(".")[0] for f in os.listdir(root) if f.endswith(".mdx"))
    return "\n".join(slugs)
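Wiring these tools into an agent takes only a few more lines with Strands. A sketch of the construction (it requires strands-agents installed plus AWS credentials; the model ID and prompt are placeholders):

```python
from strands import Agent
from strands.models import BedrockModel

model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0")
agent = Agent(
    model=model,
    system_prompt=open("AGENTS.md").read(),  # quality rules as the system prompt
    tools=[verify_url, read_file, write_file, list_concept_files],
)
result = agent("Audit concepts/git.mdx and report findings.")
```

Because the @tool docstrings carry the schema, nothing else is needed for the model to call the tools.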

Diamond pattern in GitHub Actions

All three workflows use the same execution pattern — a "diamond" that discovers work, distributes it in parallel, and consolidates results:

[Diagram: diamond pattern (plan → parallel matrix jobs → summary)]

Plan — discovers what to process (open issues or concepts with findings), generates a JSON matrix.

Matrix — each item runs in an isolated job. If one fails, the others continue (fail-fast: false).

Summary — downloads artifacts from all jobs, writes a summary to the GitHub job summary.
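The plan step boils down to serializing a matrix that `fromJson(needs.plan.outputs.matrix)` can consume. A sketch of what the discovery script writes to GITHUB_OUTPUT (item shape is illustrative):

```python
import json
import os

def emit_matrix(items: list[dict]) -> str:
    """Serialize the job matrix and expose it as a step output in CI."""
    matrix = json.dumps({"include": items})
    out = os.environ.get("GITHUB_OUTPUT")  # set by the Actions runner
    if out:
        with open(out, "a") as f:
            f.write(f"matrix={matrix}\n")
    return matrix
```

Each dict in `items` becomes one isolated matrix job in the work stage.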

Rate limit management

Workflows that use an LLM (content agent, QA fix agent) serialize matrix jobs (max-parallel: 1) to avoid Bedrock throttling. The structural QA agent — which does not use an LLM — keeps high parallelism (max-parallel: 5). The QA agent in deep mode serializes to 1.

# content-agent.yml — serialized to avoid throttling
strategy:
  fail-fast: false
  max-parallel: 1
  matrix: ${{ fromJson(needs.plan.outputs.matrix) }}
 
# content-qa.yml — dynamic based on mode
strategy:
  fail-fast: false
  max-parallel: ${{ inputs.deep == true && 1 || 5 }}
  matrix: ${{ fromJson(needs.plan.outputs.matrix) }}

As a fallback, agents include a short retry (one extra attempt after a fixed 10-second wait) that otherwise fails fast to avoid burning unnecessary CI minutes.
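That fallback fits in a few lines; a sketch with hypothetical names:

```python
import time

def invoke_with_retry(call, attempts: int = 2, wait_seconds: float = 10.0):
    """Call `call()`; on failure, wait once and retry, then give up fast."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise  # fail fast instead of burning CI minutes
            time.sleep(wait_seconds)
```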

Tuned timeouts

Workflow        Plan     Work      Summary
Content Agent   3 min    15 min    2 min
QA Fix Agent    3 min    10 min    2 min
QA Audit        3 min    5 min     2 min

Authentication without static secrets

Authentication uses OIDC — GitHub Actions obtains an ephemeral JWT and exchanges it for temporary AWS credentials:

resource "aws_iam_role" "content_agent" {
  name = "jonmatum-content-agent"
 
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.github.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
        StringLike = {
          "token.actions.githubusercontent.com:sub" = "repo:jonmatum/jonmatum.com:*"
        }
      }
    }]
  })
}
 
resource "aws_iam_role_policy" "bedrock_invoke" {
  role = aws_iam_role.content_agent.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"]
      Resource = [
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-sonnet-4-*",
        "arn:aws:bedrock:*:*:inference-profile/us.anthropic.claude-sonnet-4-*"
      ]
    }]
  })
}

No stored API keys, no credential rotation, no long-lived credentials to leak. The role only allows the two Bedrock invoke actions on Claude Sonnet 4 models.
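On the workflow side, assuming the standard aws-actions/configure-aws-credentials action, the job needs the id-token permission plus the role ARN (the account ID below is a placeholder):

```yaml
permissions:
  id-token: write   # required to request the OIDC token
  contents: read

steps:
  - uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::123456789012:role/jonmatum-content-agent
      aws-region: us-east-1
```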

Measured results

Data from system runs:

Metric                            Value
Concepts audited per QA cycle     40 evergreen
Typical findings per audit        5-8 concepts
Time per structural audit         under 2 min (no LLM)
Time per deep review (LLM)        ~1 min per concept
Time per full upgrade             ~5 min per concept
Time per surgical fix             ~2 min per concept
Cost per upgrade (Sonnet 4)       ~$0.10-0.15
Cost per surgical fix             ~$0.03-0.05
Cost per deep review              ~$0.05
Successful validation rate        ~85% (15% fail lint and are discarded)

Lessons learned

  • OIDC eliminates credential management: the GitHub Actions → AWS integration with OIDC is configured in 20 lines of Terraform. No rotation, no leaks.
  • The system prompt is the quality control: AGENTS.md rules are injected as the system prompt — the agent follows structure, language, and reference standards without additional logic.
  • Post-generation validation is mandatory: without pnpm validate + pnpm lint:content after writing, the agent generates frontmatter with missing fields or content that fails the linter.
  • Serialization avoids throttling: max-parallel: 1 on LLM workflows eliminates Bedrock rate limit issues without spending CI minutes on retries.
  • Specialized agents outperform a generic one: separating auditing (deterministic, free), fixing (surgical, cheap), and upgrading (full, more expensive) optimizes cost and quality per task type.
  • MDX and angle brackets do not mix: <50% or <10 in MDX content is parsed as JSX. Agents must use prose: "less than 50%", "over 10".
  • Strands tool model simplifies integration: defining @tool with docstrings auto-generates the schema for the model — no manual JSON Schema needed.
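The MDX angle-bracket pitfall above is easy to catch mechanically; a minimal lint sketch (the regex is illustrative, not the actual lint:content rule, and would need exclusions for code fences):

```python
import re

# `<` followed by a digit is what MDX tries to parse as a JSX tag
JSX_TRAP = re.compile(r"<\s*\d")

def find_angle_bracket_traps(text: str) -> list[int]:
    """Return 1-based line numbers where `<digit` would break MDX parsing."""
    return [i for i, line in enumerate(text.splitlines(), start=1)
            if JSX_TRAP.search(line)]
```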

Why it matters

This system demonstrates that a set of specialized agents with simple tools (file read/write, HTTP verification) can maintain a knowledge base autonomously. The pattern is replicable: any repository with structured content and documented quality rules can implement the same loop — audit, fix, upgrade, review. The human shifts from writer to editor: reviewing PRs instead of writing content from scratch.

References

  • Strands Agents — Amazon Bedrock — Strands, 2025. Strands integration with Bedrock.
  • Strands Agents — Custom Tools — Strands, 2025. Custom tool definition with @tool.
  • Amazon Bedrock Documentation — AWS, 2025. Official service documentation.
  • GitHub OIDC with AWS — GitHub, 2025. OIDC configuration for GitHub Actions with AWS.
  • Terraform AWS IAM OIDC Provider — HashiCorp, 2025. Terraform resource for OIDC.
  • GitHub Actions — Using a matrix for your jobs — GitHub, 2025. Matrix strategy documentation for workflows.

Related content

  • Strands Agents

    Open source SDK from AWS for building AI agents with a model-driven approach. Functional agents in a few lines of code, with multi-model support, custom tools, MCP, multi-agent, and built-in observability.

  • AI Agents

    Autonomous systems that combine language models with reasoning, memory, and tool use to execute complex multi-step tasks with minimal human intervention.

  • GitHub Actions

    GitHub's native CI/CD platform. Declarative YAML workflows that automate build, test, deploy, and any development lifecycle task — directly from the repository.

  • CI/CD

    Continuous Integration and Continuous Delivery/Deployment — practices that automate code integration, testing, and delivery to production. Foundation of modern software engineering.

  • Infrastructure as Code

    Practice of defining and managing infrastructure through versioned configuration files instead of manual processes. Foundation of modern operations automation.

  • Terraform

    HashiCorp's Infrastructure as Code tool that enables defining, provisioning, and managing multi-cloud infrastructure through declarative HCL files.

  • Multi-Agent Systems

    Architectures where multiple specialized AI agents collaborate, compete, or coordinate to solve complex problems that exceed a single agent's capability.

  • Agentic Workflows

    Design patterns where AI agents execute complex multi-step tasks autonomously, combining reasoning, tool use, and iterative decision-making.
