Three-agent system that automates the bilingual MDX content lifecycle: deterministic QA auditing, surgical fixes, and full upgrades — all orchestrated with Strands Agents, Claude Sonnet 4 on Amazon Bedrock, and GitHub Actions with a diamond workflow pattern.
The system's three agents automate the full content lifecycle for a bilingual knowledge base (Spanish/English). They run in GitHub Actions, use Strands Agents as the orchestration framework, and Claude Sonnet 4 on Amazon Bedrock as the language model.
The system implements a continuous feedback loop: a QA agent audits content and opens issues → a fix agent applies surgical corrections → a content agent generates full upgrades → a human reviews and approves the PRs.
The QA agent (agents/qa_agent.py) runs structural checks without an LLM and optionally a deep review with Claude. It does not modify files — it only opens issues.
Structural checks (no LLM, no cost):
- Required sections are present (e.g., ¿Por qué importa?)
- Mermaid diagrams carry accessibility labels (accTitle, accDescr)
- Internal links use the /concepts/slug format
- Each concept has its English counterpart (.en.mdx files)

Deep review (--deep) additionally sends the content to Claude on Bedrock for a qualitative pass.
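One of these structural checks, the Mermaid accessibility rule, needs no LLM at all. A hypothetical sketch (the function name and exact behavior are illustrative, not the project's actual code):

```python
import re

MERMAID_OPEN = "`" * 3 + "mermaid"  # the opening fence marker of a Mermaid block

def check_mermaid_accessibility(mdx: str) -> list[str]:
    """Flag Mermaid diagrams missing accTitle/accDescr labels.

    Sketch only; the real QA agent runs more checks than this.
    """
    pattern = re.escape(MERMAID_OPEN) + r"\n(.*?)" + re.escape("`" * 3)
    findings = []
    for i, block in enumerate(re.findall(pattern, mdx, flags=re.DOTALL), start=1):
        if "accTitle:" not in block or "accDescr:" not in block:
            findings.append(f"mermaid: diagram {i} lacks accTitle/accDescr")
    return findings
```

Because the check is pure string inspection, it costs nothing to run on every concept in CI.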
```bash
# Local execution
python -m agents.qa_agent --dry-run --status evergreen  # audit without creating issues
python -m agents.qa_agent --deep --slug serverless      # LLM review of one concept
python -m agents.qa_agent --discover                    # JSON matrix for CI
python -m agents.qa_agent --single git                  # audit one + create issue
```

The fix agent (agents/qa_fix_agent.py) processes QA issues with minimal changes. It does not rewrite content — it only fixes what the issue describes.
Fix strategies by finding type:
| Finding | Strategy |
|---|---|
| `refs` — missing references | Find primary source, add to ES + EN, verify URL |
| `ref_tiers` — low diversity | Identify missing tier, add reference from that tier |
| `xrefs` — few cross-refs | Read content, find related concepts, add to frontmatter |
| `broken_xref` — broken ref | Remove non-existent slug or replace with valid one |
| `heading` — English heading | Translate to Spanish, keeping the heading level |
| `ext_link` — external link | Replace external URL with /concepts/slug in ES + EN |
| `missing_section` — missing section | Add section with substantive content |
| `mermaid` — no accessibility | Add accTitle: and accDescr: to diagram |
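The finding-to-strategy mapping above can be wired into the fix agent's prompt as a simple dispatch table. A sketch (the strings are abbreviated from the table, and the function shape is an assumption, not the project's actual code):

```python
# Illustrative dispatch table: each QA finding type maps to an instruction
# that constrains the model to a minimal, surgical change.
FIX_STRATEGIES = {
    "refs": "Find a primary source, add it to ES + EN, verify the URL.",
    "ref_tiers": "Identify the missing tier and add a reference from it.",
    "xrefs": "Read the content and add related concepts to the frontmatter.",
    "broken_xref": "Remove the non-existent slug or replace it with a valid one.",
    "heading": "Translate the heading to Spanish, keeping its level.",
    "ext_link": "Replace the external URL with /concepts/slug in ES + EN.",
    "missing_section": "Add the section with substantive content.",
    "mermaid": "Add accTitle: and accDescr: to the diagram.",
}

def build_fix_prompt(finding_type: str, detail: str) -> str:
    """Compose the minimal-change instruction for one finding."""
    strategy = FIX_STRATEGIES.get(finding_type, "Fix only what the issue describes.")
    return f"Finding: {finding_type} ({detail}). Strategy: {strategy}"
```

Keeping the strategies as data rather than prose in one giant prompt makes it easy to audit exactly what each finding type allows the agent to touch.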
```bash
python -m agents.qa_fix_agent --issue 175  # fix a single QA issue
python -m agents.qa_fix_agent --batch 5    # fix 5 issues
python -m agents.qa_fix_agent --dry-run    # test without LLM
```

The content agent (agents/content_agent.py) generates full rewrites to bring content from seed/growing to evergreen quality. It processes both upgrade: and qa: issues.
```bash
python -m agents.content_agent --issue 143  # process one issue
python -m agents.content_agent --batch 3    # process 3 issues
python -m agents.content_agent --dry-run    # test without LLM
```

All three agents share four tools defined with the Strands @tool decorator:
```python
import os
import pathlib

import httpx
from strands import Agent, tool
from strands.models import BedrockModel


@tool
def verify_url(url: str) -> str:
    """Verify a URL returns HTTP 200."""
    r = httpx.head(url, follow_redirects=True, timeout=10)
    return f"{url} → HTTP {r.status_code}"


@tool
def read_file(path: str) -> str:
    """Read a file from the repository."""
    with open(os.path.join(os.environ["REPO_ROOT"], path)) as f:
        return f.read()


@tool
def write_file(path: str, content: str) -> str:
    """Write content to a file."""
    with open(os.path.join(os.environ["REPO_ROOT"], path), "w") as f:
        f.write(content)
    return f"Written: {path}"


@tool
def list_concept_files() -> str:
    """List existing concepts for cross-references."""
    # Returns the available slugs for the frontmatter concepts: array.
    # Sketch implementation: scan for .mdx files under the repo root.
    root = pathlib.Path(os.environ["REPO_ROOT"])
    slugs = sorted({p.stem.removesuffix(".en") for p in root.rglob("*.mdx")})
    return "\n".join(slugs)
```

All three workflows use the same execution pattern — a "diamond" that discovers work, distributes it in parallel, and consolidates results:
1. **Plan** — discovers what to process (open issues or concepts with findings) and generates a JSON matrix.
2. **Matrix** — each item runs in an isolated job. If one fails, the others continue (`fail-fast: false`).
3. **Summary** — downloads artifacts from all jobs and writes a consolidated report to the GitHub job summary.
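The Plan step can be sketched as a small script that emits the matrix GitHub Actions consumes via fromJson. The `include` key is the standard Actions matrix shape; the item fields and function name are assumptions:

```python
import json

def build_matrix(open_issues: list[dict], limit: int = 5) -> str:
    """Emit a GitHub Actions matrix as JSON for fromJson().

    Sketch: the real plan step would read open issues from the GitHub API.
    """
    items = [
        {"issue": i["number"], "slug": i["slug"]}
        for i in open_issues[:limit]
    ]
    return json.dumps({"include": items})

# In the workflow, the plan job would expose this as an output, e.g.:
#   echo "matrix=$(python -m agents.plan)" >> "$GITHUB_OUTPUT"
```

Capping the batch with `limit` keeps a single run's LLM spend bounded even when many issues are open.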
Workflows that use an LLM (content agent, QA fix agent) serialize matrix jobs (max-parallel: 1) to avoid Bedrock throttling. The structural QA agent — which does not use an LLM — keeps high parallelism (max-parallel: 5). The QA agent in deep mode serializes to 1.
```yaml
# content-agent.yml — serialized to avoid throttling
strategy:
  fail-fast: false
  max-parallel: 1
  matrix: ${{ fromJson(needs.plan.outputs.matrix) }}
```

```yaml
# content-qa.yml — dynamic based on mode
strategy:
  fail-fast: false
  max-parallel: ${{ inputs.deep == true && 1 || 5 }}
  matrix: ${{ fromJson(needs.plan.outputs.matrix) }}
```

As a fallback, agents include a short retry with backoff (1 attempt, 10 seconds) that fails fast to avoid burning unnecessary CI minutes.
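That fallback (one retry after 10 seconds, then fail fast) can be sketched as a tiny wrapper; the decorator name is an assumption, not the project's actual code:

```python
import time

def with_single_retry(fn, *, delay: float = 10.0):
    """Call fn; on failure wait `delay` seconds, retry once, then fail fast."""
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            time.sleep(delay)  # single backoff before the one retry
            return fn(*args, **kwargs)  # second failure propagates to CI
    return wrapper
```

Anything more aggressive (exponential backoff, many attempts) would just burn CI minutes on a throttled Bedrock endpoint that `max-parallel: 1` already avoids.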
| Workflow | Plan | Work | Summary |
|---|---|---|---|
| Content Agent | 3 min | 15 min | 2 min |
| QA Fix Agent | 3 min | 10 min | 2 min |
| QA Audit | 3 min | 5 min | 2 min |
Authentication uses OIDC — GitHub Actions obtains an ephemeral JWT and exchanges it for temporary AWS credentials:
```hcl
resource "aws_iam_role" "content_agent" {
  name = "jonmatum-content-agent"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.github.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
        StringLike = {
          "token.actions.githubusercontent.com:sub" = "repo:jonmatum/jonmatum.com:*"
        }
      }
    }]
  })
}

resource "aws_iam_role_policy" "bedrock_invoke" {
  role = aws_iam_role.content_agent.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"]
      Resource = [
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-sonnet-4-*",
        "arn:aws:bedrock:*:*:inference-profile/us.anthropic.claude-sonnet-4-*"
      ]
    }]
  })
}
```

No stored API keys, no credential rotation, no leak risk. The role only allows InvokeModel on Bedrock.
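On the workflow side, the OIDC exchange needs only two pieces: the `id-token: write` permission so the job can request the JWT, and a credentials step that assumes the Terraform-defined role. A sketch (the account ID and region are placeholders, not the project's actual values):

```yaml
permissions:
  id-token: write   # allow the job to request the OIDC JWT
  contents: read

steps:
  - uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::123456789012:role/jonmatum-content-agent
      aws-region: us-east-1
```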
Data from system runs:
| Metric | Value |
|---|---|
| Concepts audited per QA cycle | 40 evergreen |
| Typical findings per audit | 5-8 concepts |
| Time per structural audit | Under 2 min (no LLM) |
| Time per deep review (LLM) | ~1 min per concept |
| Time per full upgrade | ~5 min per concept |
| Time per surgical fix | ~2 min per concept |
| Cost per upgrade (Sonnet 4) | ~$0.10-0.15 |
| Cost per surgical fix | ~$0.03-0.05 |
| Cost per deep review | ~$0.05 |
| Successful validation rate | ~85% (15% fail lint and are discarded) |
Key lessons:

- The agents run `pnpm validate` + `pnpm lint:content` after writing, because the model sometimes generates frontmatter with missing fields or content that fails the linter.
- `max-parallel: 1` on LLM workflows eliminates Bedrock rate limit issues without spending CI minutes on retries.
- `<50%` or `<10` in MDX content is parsed as JSX. Agents must use prose: "less than 50%", "over 10".
- `@tool` with docstrings auto-generates the schema for the model — no manual JSON Schema needed.

This system demonstrates that a set of specialized agents with simple tools (file read/write, HTTP verification) can maintain a knowledge base autonomously. The pattern is replicable: any repository with structured content and documented quality rules can implement the same loop — audit, fix, upgrade, review. The human shifts from writer to editor: reviewing PRs instead of writing content from scratch.
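The MDX pitfall (a raw `<` before a number being parsed as JSX) can also be guarded mechanically before writing. A hypothetical pre-write check, not the project's actual code:

```python
import re

# MDX treats a raw "<" as the start of a JSX expression, so "<50%" breaks the build.
RAW_LT = re.compile(r"<\s*\d")

def find_raw_comparisons(mdx: str) -> list[int]:
    """Return 1-based line numbers where '<digit' would trip the MDX parser."""
    return [
        n for n, line in enumerate(mdx.splitlines(), start=1)
        if RAW_LT.search(line)
    ]
```

A check like this is deliberately over-eager (it would also flag code samples); as a pre-write lint it just tells the agent where to rephrase in prose.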
Related concepts:

- **Strands Agents** — Open source SDK from AWS for building AI agents with a model-driven approach. Functional agents in a few lines of code, with multi-model support, custom tools, MCP, multi-agent, and built-in observability.
- **AI agents** — Autonomous systems that combine language models with reasoning, memory, and tool use to execute complex multi-step tasks with minimal human intervention.
- **GitHub Actions** — GitHub's native CI/CD platform. Declarative YAML workflows that automate build, test, deploy, and any development lifecycle task, directly from the repository.
- **CI/CD** — Continuous Integration and Continuous Delivery/Deployment: practices that automate code integration, testing, and delivery to production. Foundation of modern software engineering.
- **Infrastructure as Code** — Practice of defining and managing infrastructure through versioned configuration files instead of manual processes. Foundation of modern operations automation.
- **Terraform** — HashiCorp's Infrastructure as Code tool that enables defining, provisioning, and managing multi-cloud infrastructure through declarative HCL files.
- **Multi-agent systems** — Architectures where multiple specialized AI agents collaborate, compete, or coordinate to solve complex problems that exceed a single agent's capability.
- **Agentic patterns** — Design patterns where AI agents execute complex multi-step tasks autonomously, combining reasoning, tool use, and iterative decision-making.