AI Agents
All agents follow the agent pattern and use prompts stored in the database.
Agent Overview
| Agent | Model | Purpose |
|---|---|---|
| Brief Generator | Sonnet | Structures article outlines from backlog items, injects trusted sources |
| Researcher | Sonnet | Gathers evidence via PubMed, web search, vault notes |
| Writer | Sonnet | Produces full article drafts from brief + research dossier |
| Factuality Checker | Sonnet | Validates medical claim accuracy against research dossier |
| SEO Optimizer | Haiku | Checks keyword usage, heading structure, meta descriptions |
| Style Checker | Haiku | Evaluates readability, tone, structure compliance, AI-pattern detection |
| Synthesis | Sonnet | Reconciles quality gate feedback, produces revised drafts |
| Triage | Haiku | Classifies research news items for newsworthiness |
| Digest Writer | Sonnet | Produces per-language research digest articles |
| Image Generator | Haiku | Composes DALL-E prompts from article context, generates featured images |
Model Selection
Default model assignments are defined in src/config.py and can be overridden via prompt versions in the database:
| Tier | Model | Agents |
|---|---|---|
| Sonnet (reasoning/creative) | claude-sonnet-4-6 | Brief Generator, Researcher, Writer, Factuality Checker, Synthesis, Digest Writer |
| Haiku (classification/scoring) | claude-haiku-4-5 | SEO Optimizer, Style Checker, Triage, Image Generator |
All prompts are versioned in the database and managed via Alembic data migrations. See Manage Prompts for details.
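The lookup described above can be sketched as a default table plus an optional override. This is a minimal illustration, not the contents of src/config.py: the names DEFAULT_MODELS and resolve_model, and the "model" key on a prompt version, are assumptions. Only the model IDs and the agent-to-tier assignments come from the table.

```python
from typing import Optional

SONNET = "claude-sonnet-4-6"
HAIKU = "claude-haiku-4-5"

# Hypothetical default table; the agent keys are illustrative names.
DEFAULT_MODELS = {
    "brief_generator": SONNET,
    "researcher": SONNET,
    "writer": SONNET,
    "factuality_checker": SONNET,
    "synthesis": SONNET,
    "digest_writer": SONNET,
    "seo_optimizer": HAIKU,
    "style_checker": HAIKU,
    "triage": HAIKU,
    "image_generator": HAIKU,
}


def resolve_model(agent: str, prompt_version: Optional[dict] = None) -> str:
    """Return the model for an agent, preferring a prompt-version override.

    `prompt_version` stands in for a database row; the "model" key is assumed.
    """
    override = (prompt_version or {}).get("model")
    return override or DEFAULT_MODELS[agent]
```

With no prompt version, `resolve_model("triage")` falls back to the Haiku default; passing `{"model": "claude-sonnet-4-6"}` would return the override instead.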
Structured Output
Most agents use Anthropic's structured output mode. Instead of extracting JSON from free-text responses, call_agent() passes a structured_output_schema parameter derived from the Pydantic model's JSON schema. The API guarantees valid JSON matching the schema, eliminating the need for JSON extraction and repair logic.
The schema is transformed by schema_transform.py to comply with Anthropic's restricted JSON Schema subset (no oneOf, forced additionalProperties: false, unsupported keywords moved to description hints).
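The three transformations named above can be illustrated on a plain schema dictionary. This is a rough sketch, not the code in schema_transform.py: the function name transform, the UNSUPPORTED keyword list, and the choice to rewrite oneOf as anyOf are all assumptions. In the pipeline the input would presumably come from the Pydantic model's `model_json_schema()`.

```python
# Assumed list of keywords the restricted subset rejects; the real list may differ.
UNSUPPORTED = ("minimum", "maximum", "minLength", "maxLength", "pattern")


def transform(schema: dict) -> dict:
    """Return an adapted copy of `schema`; the input is not mutated."""
    out = dict(schema)

    # Assumption: oneOf is rewritten as anyOf rather than dropped.
    if "oneOf" in out:
        out["anyOf"] = out.pop("oneOf")

    # Move unsupported keywords into the description as plain-text hints.
    hints = []
    for key in UNSUPPORTED:
        if key in out:
            hints.append(f"{key}: {out.pop(key)}")
    if hints:
        base = out.get("description", "")
        suffix = "(" + ", ".join(hints) + ")"
        out["description"] = f"{base} {suffix}" if base else suffix

    # Force closed objects.
    if out.get("type") == "object":
        out["additionalProperties"] = False

    # Recurse only into locations that hold subschemas.
    for key in ("properties", "$defs"):
        if key in out:
            out[key] = {name: transform(sub) for name, sub in out[key].items()}
    if isinstance(out.get("items"), dict):
        out["items"] = transform(out["items"])
    for key in ("anyOf", "allOf"):
        if key in out:
            out[key] = [transform(sub) for sub in out[key]]
    return out
```

Recursing only into `properties`, `$defs`, `items`, `anyOf`, and `allOf` avoids mistaking a property that happens to be named `minimum` for a schema keyword.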
Language Support
The Factuality Checker supports both English and German content with language-specific prompts (v3) for improved cultural and linguistic accuracy.
Trusted Sources
The Brief Generator queries the trusted_sources table for active sources filtered by article language and injects them as a <trusted_sources> XML block in the user message. This seeds key_sources_to_consult with authoritative sources (Cochrane, NICE, ATA, etc.) without restricting the researcher's search freedom. Sources with language="both" (scientific) are included for all articles; language-specific sources (medical journalism) are filtered by the backlog item's language.
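As an illustration of the filtering and injection described above, the sketch below selects sources by language and renders them as a `<trusted_sources>` block. The TrustedSource fields, the function names, and the inner `<source>` element layout are assumptions; only the table name, the block name, and the language="both" rule come from the text. The URLs are placeholders.

```python
from dataclasses import dataclass
from typing import List
from xml.sax.saxutils import escape, quoteattr


@dataclass
class TrustedSource:
    """Hypothetical shape of a trusted_sources row."""
    name: str
    url: str
    language: str  # "en", "de", or "both"
    active: bool = True


def select_sources(sources: List[TrustedSource], article_language: str) -> List[TrustedSource]:
    """Keep active sources that match the article language or apply to both."""
    return [
        s for s in sources
        if s.active and s.language in (article_language, "both")
    ]


def render_block(sources: List[TrustedSource]) -> str:
    """Render the selected sources as an XML block for the user message."""
    lines = ["<trusted_sources>"]
    for s in sources:
        # The <source> element and its url attribute are an assumed layout.
        lines.append(f"  <source url={quoteattr(s.url)}>{escape(s.name)}</source>")
    lines.append("</trusted_sources>")
    return "\n".join(lines)


catalog = [
    TrustedSource("Cochrane", "https://example.org/cochrane", "both"),
    TrustedSource("German health magazine", "https://example.org/de-magazine", "de"),
    TrustedSource("Retired source", "https://example.org/retired", "en", active=False),
]
block = render_block(select_sources(catalog, "en"))
```

For an English article, only the first entry is rendered: the German-only source is filtered out by language and the third by its inactive flag.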
Content-Type Caps
The Brief Generator and Researcher cap dossier sources and keywords by content type to prevent oversized briefs:
- Cornerstone: up to 15 sources, 8 secondary keywords
- Satellite: up to 8 sources, 5 secondary keywords
- Research news: up to 5 sources, 3 secondary keywords
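The caps above lend themselves to a small lookup table. The sketch below is one way to express them; the names CONTENT_TYPE_CAPS and apply_caps and the content-type keys are hypothetical, while the numeric limits are taken from the list.

```python
from typing import List, Tuple

# Limits from the list above; the dictionary keys are illustrative names.
CONTENT_TYPE_CAPS = {
    "cornerstone": {"sources": 15, "secondary_keywords": 8},
    "satellite": {"sources": 8, "secondary_keywords": 5},
    "research_news": {"sources": 5, "secondary_keywords": 3},
}


def apply_caps(
    content_type: str,
    sources: List[str],
    secondary_keywords: List[str],
) -> Tuple[List[str], List[str]]:
    """Truncate sources and secondary keywords to the caps for a content type."""
    caps = CONTENT_TYPE_CAPS[content_type]
    return (
        sources[: caps["sources"]],
        secondary_keywords[: caps["secondary_keywords"]],
    )
```

Truncation keeps the first entries, which assumes the lists are already ordered by priority.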
Deterministic SEO Checks
The SEO quality gate combines LLM-based evaluation with deterministic Python checks computed by src/agents/seo_checks.py. These run without any LLM cost and measure:
- Keyword density -- primary keyword occurrence rate with hyphen normalization and proximity matching for German compound words
- Secondary keyword counts -- per-keyword occurrence counts
- Heading hierarchy -- validates that heading levels are not skipped (e.g., H1 followed directly by H3 without an H2)
- Word count -- total words in the draft
- FAQ detection -- checks for FAQ headings, Rank Math FAQ blocks, or 3+ consecutive H3 headings ending with ?
Results are passed to the SEO optimizer agent as deterministic_checks alongside the LLM evaluation.
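To make a few of these checks concrete, here is a minimal sketch. It is not the code in src/agents/seo_checks.py. It assumes the draft is Markdown with `#`-style headings, reports density as occurrences per 100 words, and omits both the German proximity matching and the Rank Math block detection. All function names are hypothetical.

```python
import re
from typing import List, Tuple

HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.*\S)[ \t]*$", re.MULTILINE)


def headings(markdown: str) -> List[Tuple[int, str]]:
    """Return (level, title) for each Markdown heading, in document order."""
    return [(len(m.group(1)), m.group(2)) for m in HEADING_RE.finditer(markdown)]


def skipped_levels(markdown: str) -> List[Tuple[int, int]]:
    """Return (previous, current) level pairs that jump more than one level deeper."""
    levels = [level for level, _ in headings(markdown)]
    return [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1]


def _words(text: str) -> List[str]:
    # Hyphen normalization: "low-carb" and "low carb" yield the same tokens.
    return re.findall(r"\w+", text.lower().replace("-", " "))


def keyword_density(text: str, keyword: str) -> float:
    """Occurrences of the keyword phrase per 100 words (assumed definition)."""
    words, target = _words(text), _words(keyword)
    if not words or not target:
        return 0.0
    n = len(target)
    hits = sum(words[i:i + n] == target for i in range(len(words) - n + 1))
    return 100.0 * hits / len(words)


def has_faq(markdown: str) -> bool:
    """True for an FAQ heading or 3+ consecutive H3 headings ending with '?'."""
    found = headings(markdown)
    if any(re.search(r"\bfaq\b", title, re.IGNORECASE) for _, title in found):
        return True
    run = 0
    for level, title in found:
        run = run + 1 if level == 3 and title.endswith("?") else 0
        if run >= 3:
            return True
    return False
```

Here "consecutive" means consecutive among headings, ignoring the body text between them; the real check may define it differently.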
Agent Modules
Each agent has a corresponding module in src/agents/, alongside the shared helper modules:
src/agents/
    base.py                  # call_agent() -- shared invocation logic
    schemas.py               # Pydantic models for all agent I/O
    schema_transform.py      # JSON Schema transformer for Anthropic structured output
    prompts.py               # Prompt loading with TTL cache
    brief_generator.py       # generate_brief()
    researcher.py            # research()
    writer.py                # write_article()
    factuality_checker.py    # check_factuality()
    seo_optimizer.py         # optimize_seo()
    seo_checks.py            # Deterministic SEO checks (no LLM)
    style_checker.py         # check_style()
    synthesis.py             # synthesize()
    triage.py                # triage_items()
    digest_writer.py         # write_digest()
    readability.py           # check_readability()
    image_generator.py       # generate_featured_image()
    cost_tracker.py          # Token usage tracking and cost calculation
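The listing notes that prompts.py loads prompts through a TTL cache. The sketch below shows the general technique only; the class name TTLCache, the loader callback, and the 300-second default are assumptions, not the module's actual interface. The clock is injectable so that expiry can be exercised without sleeping.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Cache loader results per key and reload them once they are older than the TTL."""

    def __init__(
        self,
        loader: Callable[[str], str],
        ttl_seconds: float = 300.0,  # assumed default, not taken from the source
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the loader
        value = self._loader(key)  # e.g. a database query for the active prompt
        self._entries[key] = (now + self._ttl, value)
        return value
```

A cache like this trades freshness for fewer database reads: a newly activated prompt version would only be picked up after the current entry expires.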