AI Agents

All agents share a common invocation pattern (see call_agent() in src/agents/base.py) and use prompts stored in the database.

Agent Overview

| Agent | Model | Purpose |
|---|---|---|
| Brief Generator | Sonnet | Structures article outlines from backlog items, injects trusted sources |
| Researcher | Sonnet | Gathers evidence via PubMed, web search, vault notes |
| Writer | Sonnet | Produces full article drafts from brief + research dossier |
| Factuality Checker | Sonnet | Validates medical claim accuracy against research dossier |
| SEO Optimizer | Haiku | Checks keyword usage, heading structure, meta descriptions |
| Style Checker | Haiku | Evaluates readability, tone, structure compliance, AI-pattern detection |
| Synthesis | Sonnet | Reconciles quality gate feedback, produces revised drafts |
| Triage | Haiku | Classifies research news items for newsworthiness |
| Digest Writer | Sonnet | Produces per-language research digest articles |
| Image Generator | Haiku | Composes DALL-E prompts from article context, generates featured images |

Model Selection

Default model assignments are defined in src/config.py and can be overridden via prompt versions in the database:

| Tier | Model | Agents |
|---|---|---|
| Sonnet (reasoning/creative) | claude-sonnet-4-6 | Brief Generator, Researcher, Writer, Factuality Checker, Synthesis, Digest Writer |
| Haiku (classification/scoring) | claude-haiku-4-5 | SEO Optimizer, Style Checker, Triage, Image Generator |

All prompts are versioned in the database and managed via Alembic data migrations. See Manage Prompts for details.

Structured Output

Most agents use Anthropic's structured output mode. Instead of extracting JSON from free-text responses, call_agent() passes a structured_output_schema parameter derived from the Pydantic model's JSON schema. The API guarantees valid JSON matching the schema, eliminating the need for JSON extraction and repair logic.
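In practice this means the Pydantic model is the single source of truth for the output shape. A minimal sketch, assuming a hypothetical BriefOutput model (the field names and the call_agent() signature shown in comments are illustrative, not the project's actual API):

```python
from pydantic import BaseModel

class BriefOutput(BaseModel):
    """Hypothetical output model for the Brief Generator."""
    title: str
    outline: list[str]
    key_sources_to_consult: list[str]

# Pydantic emits a standard JSON Schema for the model.
schema = BriefOutput.model_json_schema()

# Illustrative invocation; the real signature lives in src/agents/base.py:
# raw = call_agent(
#     agent="brief_generator",
#     messages=messages,
#     structured_output_schema=schema,
# )
# brief = BriefOutput.model_validate_json(raw)  # guaranteed to parse
```

Because the API enforces the schema, the validation step can never fail on malformed JSON, only on application-level constraints.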

The schema is transformed by schema_transform.py to comply with Anthropic's restricted JSON Schema subset (no oneOf, forced additionalProperties: false, unsupported keywords moved to description hints).
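The kind of rewriting schema_transform.py performs can be sketched as a recursive pass over the schema dict. This is not the actual module, just an illustration of the three transformations named above; the choice of anyOf as a oneOf fallback and the set of "unsupported" keywords are assumptions:

```python
# Keywords assumed unsupported for this sketch; the real set lives in schema_transform.py.
UNSUPPORTED = {"minLength", "maxLength", "pattern", "minimum", "maximum"}

def transform(schema: dict) -> dict:
    """Rewrite a JSON Schema into a restricted subset: drop oneOf, force
    additionalProperties: false on objects, move unsupported keywords
    into description hints."""
    out, hints = {}, []
    for key, value in schema.items():
        if key == "oneOf":
            # oneOf is disallowed; anyOf is one possible fallback.
            out["anyOf"] = [transform(v) for v in value]
        elif key in UNSUPPORTED:
            hints.append(f"{key}={value}")
        elif isinstance(value, dict):
            out[key] = transform(value)
        elif isinstance(value, list):
            out[key] = [transform(v) if isinstance(v, dict) else v for v in value]
        else:
            out[key] = value
    if out.get("type") == "object":
        out["additionalProperties"] = False
    if hints:
        desc = out.get("description", "")
        out["description"] = (desc + " " if desc else "") + "(" + ", ".join(hints) + ")"
    return out
```

Moving constraints into the description keeps them visible to the model as guidance even though the API no longer enforces them.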

Language Support

The Factuality Checker supports both English and German content with language-specific prompts (v3) for improved cultural and linguistic accuracy.

Trusted Sources

The Brief Generator queries the trusted_sources table for active sources filtered by article language and injects them as a <trusted_sources> XML block in the user message. This seeds key_sources_to_consult with authoritative sources (Cochrane, NICE, ATA, etc.) without restricting the researcher's search freedom. Sources with language="both" (scientific) are included for all articles; language-specific sources (medical journalism) are filtered by the backlog item's language.
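The filtering and injection logic can be sketched as below. The row field names (active, language, name, url) and the XML attribute layout are assumptions for illustration; the real query runs against the trusted_sources table:

```python
def build_trusted_sources_block(sources: list[dict], article_language: str) -> str:
    """Render active trusted sources matching the article language (or
    language='both') as a <trusted_sources> XML block for the user message."""
    selected = [
        s for s in sources
        if s["active"] and s["language"] in ("both", article_language)
    ]
    lines = [f'  <source name="{s["name"]}" url="{s["url"]}"/>' for s in selected]
    return "<trusted_sources>\n" + "\n".join(lines) + "\n</trusted_sources>"
```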

Content-Type Caps

The Brief Generator and Researcher cap dossier sources and keywords by content type to prevent oversized briefs:

  • Cornerstone: up to 15 sources, 8 secondary keywords
  • Satellite: up to 8 sources, 5 secondary keywords
  • Research news: up to 5 sources, 3 secondary keywords
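The caps above amount to a small lookup table; a minimal sketch (the key names and helper are illustrative, not the project's actual code):

```python
# Caps per content type, mirroring the list above.
CONTENT_TYPE_CAPS = {
    "cornerstone": {"max_sources": 15, "max_secondary_keywords": 8},
    "satellite": {"max_sources": 8, "max_secondary_keywords": 5},
    "research_news": {"max_sources": 5, "max_secondary_keywords": 3},
}

def apply_caps(content_type: str, sources: list, keywords: list) -> tuple[list, list]:
    """Truncate dossier sources and secondary keywords to the content type's caps."""
    caps = CONTENT_TYPE_CAPS[content_type]
    return sources[: caps["max_sources"]], keywords[: caps["max_secondary_keywords"]]
```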

Deterministic SEO Checks

The SEO quality gate combines LLM-based evaluation with deterministic Python checks computed by src/agents/seo_checks.py. These run without any LLM cost and measure:

  • Keyword density -- primary keyword occurrence rate with hyphen normalization and proximity matching for German compound words
  • Secondary keyword counts -- per-keyword occurrence counts
  • Heading hierarchy -- validates heading levels don't skip (e.g., H1 to H3 without H2)
  • Word count -- total words in the draft
  • FAQ detection -- checks for FAQ headings, Rank Math FAQ blocks, or 3+ consecutive H3 headings ending with ?

Results are passed to the SEO optimizer agent as deterministic_checks alongside the LLM evaluation.
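As one example of such a deterministic check, the heading-hierarchy rule can be implemented with a single regex pass over the draft. This is a sketch of the idea, assuming markdown-style headings, not the actual code in src/agents/seo_checks.py:

```python
import re

def heading_hierarchy_issues(markdown: str) -> list[str]:
    """Flag heading levels that skip, e.g. an H1 followed directly by an H3."""
    issues = []
    prev_level = 0
    for match in re.finditer(r"^(#{1,6})\s", markdown, flags=re.MULTILINE):
        level = len(match.group(1))
        if prev_level and level > prev_level + 1:
            issues.append(
                f"H{prev_level} followed by H{level} (skipped H{prev_level + 1})"
            )
        prev_level = level
    return issues
```

Checks like this cost nothing per run, so they can be recomputed on every revision and handed to the SEO optimizer agent verbatim.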

Agent Modules

Each agent is defined in src/agents/ with a corresponding module:

src/agents/
  base.py                  # call_agent() -- shared invocation logic
  schemas.py               # Pydantic models for all agent I/O
  schema_transform.py      # JSON Schema transformer for Anthropic structured output
  prompts.py               # Prompt loading with TTL cache
  brief_generator.py       # generate_brief()
  researcher.py            # research()
  writer.py                # write_article()
  factuality_checker.py    # check_factuality()
  seo_optimizer.py         # optimize_seo()
  seo_checks.py            # Deterministic SEO checks (no LLM)
  style_checker.py         # check_style()
  synthesis.py             # synthesize()
  triage.py                # triage_items()
  digest_writer.py         # write_digest()
  readability.py           # check_readability()
  image_generator.py       # generate_featured_image()
  cost_tracker.py          # Token usage tracking and cost calculation