AI Agents

All agents share a common invocation pattern (see call_agent() in src/agents/base.py) and use prompts stored in the database.

Agent Overview

| Agent | Model | Purpose |
|---|---|---|
| Brief Generator | Sonnet | Structures article outlines from backlog items, injects trusted sources |
| Researcher | Sonnet | Gathers evidence via PubMed, web search, vault notes |
| Writer | Sonnet | Produces full article drafts from brief + research dossier |
| Factuality Checker | Sonnet | Validates medical claim accuracy against research dossier |
| SEO Optimizer | Haiku | Checks keyword usage, heading structure, meta descriptions |
| Style Checker | Haiku | Evaluates readability, tone, structure compliance, AI-pattern detection |
| Synthesis | Sonnet | Reconciles quality gate feedback, produces revised drafts |
| Triage | Haiku | Classifies research news items for newsworthiness |
| Digest Writer | Sonnet | Produces per-language research digest articles |
| Image Generator | Haiku | Composes DALL-E prompts from article context, generates featured images |

Model Selection

Default model assignments are defined in src/config.py and can be overridden via prompt versions in the database:

| Tier | Model | Agents |
|---|---|---|
| Sonnet (reasoning/creative) | claude-sonnet-4-6 | Brief Generator, Researcher, Writer, Factuality Checker, Synthesis, Digest Writer |
| Haiku (classification/scoring) | claude-haiku-4-5 | SEO Optimizer, Style Checker, Triage, Image Generator |

All prompts are versioned in the database and managed via Alembic data migrations. See Manage Prompts for details.

Structured Output

Most agents use Anthropic's structured output mode. Instead of extracting JSON from free-text responses, call_agent() passes a structured_output_schema parameter derived from the Pydantic model's JSON schema. The API guarantees valid JSON matching the schema, eliminating the need for JSON extraction and repair logic.
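In practice this means the Pydantic model is the single source of truth for the output shape. A minimal sketch, assuming a hypothetical BriefOutput model (the field names and the call_agent() signature shown in comments are illustrative, not the project's actual API):

```python
from pydantic import BaseModel

class BriefOutput(BaseModel):
    """Hypothetical output model for the Brief Generator."""
    title: str
    outline: list[str]
    key_sources_to_consult: list[str]

# Pydantic emits a standard JSON Schema for the model.
schema = BriefOutput.model_json_schema()

# Illustrative invocation; the real signature lives in src/agents/base.py:
# raw = call_agent(
#     agent="brief_generator",
#     messages=messages,
#     structured_output_schema=schema,
# )
# brief = BriefOutput.model_validate_json(raw)  # guaranteed to parse
```

Because the API enforces the schema, the validation step can never fail on malformed JSON, only on application-level constraints.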

The schema is transformed by schema_transform.py to comply with Anthropic's restricted JSON Schema subset (no oneOf, forced additionalProperties: false, unsupported keywords moved to description hints).
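The kind of rewriting schema_transform.py performs can be sketched as a recursive pass over the schema dict. This is not the actual module, just an illustration of the three transformations named above; the choice of anyOf as a oneOf fallback and the set of "unsupported" keywords are assumptions:

```python
# Keywords assumed unsupported for this sketch; the real set lives in schema_transform.py.
UNSUPPORTED = {"minLength", "maxLength", "pattern", "minimum", "maximum"}

def transform(schema: dict) -> dict:
    """Rewrite a JSON Schema into a restricted subset: drop oneOf, force
    additionalProperties: false on objects, move unsupported keywords
    into description hints."""
    out, hints = {}, []
    for key, value in schema.items():
        if key == "oneOf":
            # oneOf is disallowed; anyOf is one possible fallback.
            out["anyOf"] = [transform(v) for v in value]
        elif key in UNSUPPORTED:
            hints.append(f"{key}={value}")
        elif isinstance(value, dict):
            out[key] = transform(value)
        elif isinstance(value, list):
            out[key] = [transform(v) if isinstance(v, dict) else v for v in value]
        else:
            out[key] = value
    if out.get("type") == "object":
        out["additionalProperties"] = False
    if hints:
        desc = out.get("description", "")
        out["description"] = (desc + " " if desc else "") + "(" + ", ".join(hints) + ")"
    return out
```

Moving constraints into the description keeps them visible to the model as guidance even though the API no longer enforces them.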

Language Support

The Factuality Checker supports both English and German content with language-specific prompts (v3) for improved cultural and linguistic accuracy.

Trusted Sources

The Brief Generator queries the trusted_sources table for active sources filtered by article language and injects them as a <trusted_sources> XML block in the user message. This seeds key_sources_to_consult with authoritative sources (Cochrane, NICE, ATA, etc.) without restricting the researcher's search freedom. Sources with language="both" (scientific) are included for all articles; language-specific sources (medical journalism) are filtered by the backlog item's language.
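The filtering and injection logic can be sketched as below. The row field names (active, language, name, url) and the XML attribute layout are assumptions for illustration; the real query runs against the trusted_sources table:

```python
def build_trusted_sources_block(sources: list[dict], article_language: str) -> str:
    """Render active trusted sources matching the article language (or
    language='both') as a <trusted_sources> XML block for the user message."""
    selected = [
        s for s in sources
        if s["active"] and s["language"] in ("both", article_language)
    ]
    lines = [f'  <source name="{s["name"]}" url="{s["url"]}"/>' for s in selected]
    return "<trusted_sources>\n" + "\n".join(lines) + "\n</trusted_sources>"
```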

Content-Type Caps

The Brief Generator and Researcher cap dossier sources and keywords by content type to prevent oversized briefs:

  • Cornerstone: up to 15 sources, 8 secondary keywords
  • Satellite: up to 8 sources, 5 secondary keywords
  • Research news: up to 5 sources, 3 secondary keywords
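The caps above amount to a small lookup table; a minimal sketch (the key names and helper are illustrative, not the project's actual code):

```python
# Caps per content type, mirroring the list above.
CONTENT_TYPE_CAPS = {
    "cornerstone": {"max_sources": 15, "max_secondary_keywords": 8},
    "satellite": {"max_sources": 8, "max_secondary_keywords": 5},
    "research_news": {"max_sources": 5, "max_secondary_keywords": 3},
}

def apply_caps(content_type: str, sources: list, keywords: list) -> tuple[list, list]:
    """Truncate dossier sources and secondary keywords to the content type's caps."""
    caps = CONTENT_TYPE_CAPS[content_type]
    return sources[: caps["max_sources"]], keywords[: caps["max_secondary_keywords"]]
```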

Deterministic SEO Checks

The SEO quality gate combines LLM-based evaluation with deterministic Python checks computed by src/agents/seo_checks.py. These run without any LLM cost and measure:

  • Keyword density -- primary keyword occurrence rate with hyphen normalization and proximity matching for German compound words
  • Secondary keyword counts -- per-keyword occurrence counts
  • Heading hierarchy -- validates heading levels don't skip (e.g., H1 to H3 without H2)
  • Word count -- total words in the draft
  • FAQ detection -- checks for FAQ headings, Rank Math FAQ blocks, or 3+ consecutive H3 headings ending with ?

Results are passed to the SEO optimizer agent as deterministic_checks alongside the LLM evaluation.
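As one example of such a deterministic check, the heading-hierarchy rule can be implemented with a single regex pass over the draft. This is a sketch of the idea, assuming markdown-style headings, not the actual code in src/agents/seo_checks.py:

```python
import re

def heading_hierarchy_issues(markdown: str) -> list[str]:
    """Flag heading levels that skip, e.g. an H1 followed directly by an H3."""
    issues = []
    prev_level = 0
    for match in re.finditer(r"^(#{1,6})\s", markdown, flags=re.MULTILINE):
        level = len(match.group(1))
        if prev_level and level > prev_level + 1:
            issues.append(
                f"H{prev_level} followed by H{level} (skipped H{prev_level + 1})"
            )
        prev_level = level
    return issues
```

Checks like this cost nothing per run, so they can be recomputed on every revision and handed to the SEO optimizer agent verbatim.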

Agent Modules

Each agent is defined in src/agents/ with a corresponding module:

src/agents/
  base.py                  # call_agent() -- shared invocation logic
  schemas.py               # Pydantic models for all agent I/O
  schema_transform.py      # JSON Schema transformer for Anthropic structured output
  prompts.py               # Prompt loading with TTL cache
  brief_generator.py       # generate_brief()
  researcher.py            # research()
  writer.py                # write_article()
  factuality_checker.py    # check_factuality()
  seo_optimizer.py         # optimize_seo()
  seo_checks.py            # Deterministic SEO checks (no LLM)
  style_checker.py         # check_style()
  synthesis.py             # synthesize()
  triage.py                # triage_items()
  digest_writer.py         # write_digest()
  readability.py           # check_readability()
  image_generator.py       # generate_featured_image()
  cost_tracker.py          # Token usage tracking and cost calculation