Architecture

System Overview

```mermaid
graph TB
    subgraph External
        WP[WordPress]
        PubMed[PubMed API]
        BraveSearch[Brave Search]
        S2[Semantic Scholar]
        CT[ClinicalTrials.gov]
        Vault[Obsidian Vault]
        Anthropic[Anthropic Claude API]
        OpenAI[OpenAI DALL-E 3]
    end

    subgraph Storage
        DB[(Neon Postgres)]
        R2[Cloudflare R2]
    end

    subgraph Observability
        Langfuse[Langfuse]
        Sentry[Sentry]
    end

    subgraph Pipeline["Pipeline (Prefect)"]
        BF[Brief Flow]
        AF[Article Flow]
        RF[Resume Flow]
        PF[Publish Flow]
        RNF[Research News Flow]
        DF[Digest Flow]
    end

    subgraph Agents
        BG[Brief Generator]
        RES[Researcher]
        WR[Writer]
        FC[Factuality Checker]
        SEO[SEO Optimizer]
        SC[Style Checker]
        SYN[Synthesis]
        TRI[Triage]
        DW[Digest Writer]
        IMG[Image Generator]
    end

    subgraph Dashboard["Dashboard (Streamlit)"]
        UI[Operator Dashboard]
    end

    BF --> BG
    AF --> RES --> WR --> FC & SEO & SC --> SYN
    RF --> AF
    PF --> WP
    RNF --> TRI --> DF --> DW
    RES --> PubMed & BraveSearch & S2 & CT & Vault
    BG & RES & WR & FC & SEO & SC & SYN & TRI & DW --> Anthropic
    IMG --> OpenAI
    IMG --> R2
    PF --> R2
    AF & BF & PF & RNF & DF & RF --> DB
    UI --> DB
    UI --> R2
    Anthropic -.->|traces| Langfuse
    AF -.->|errors| Sentry
```

Component Roles

Pipeline (Prefect)

Prefect flows handle orchestration -- sequencing agent calls, managing retries, and tracking runs. No business logic lives in flows. Each flow is a thin wrapper that calls agents and persists results. The Prefect server runs self-hosted inside the Fly.io container (not Prefect Cloud), backed by a dedicated Neon Postgres database.
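The "thin wrapper" convention can be sketched as follows. This is an illustrative stand-in, not the actual flow code: the function and helper names are hypothetical, and in the real codebase the flow would carry Prefect's @flow decorator (with @task-decorated steps handling retries):

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical sketch of the thin-flow convention: the flow only
# sequences agent calls and persists results -- no business logic.
# In the real codebase this would be decorated with Prefect's @flow.
async def produce_article_flow(
    article_id: int,
    run_agent: Callable[[str, int], Awaitable[dict[str, Any]]],
    persist: Callable[[int, str, dict[str, Any]], Awaitable[None]],
) -> dict[str, Any]:
    results: dict[str, Any] = {}
    for agent_name in ("researcher", "writer"):  # sequencing only
        output = await run_agent(agent_name, article_id)
        await persist(article_id, agent_name, output)  # state to Postgres
        results[agent_name] = output
    return results
```

Keeping flows this thin means agents can be tested without an orchestrator, and a flow run can be replayed from persisted state.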

Nine deployments are registered across two work pools: generate-briefs, produce-article, batch-produce, resume-article, publish-articles, republish-articles, research-news-scan, and produce-digests on the content pool, plus recover-stuck-articles on a dedicated recovery pool.

Agents

Agents are the core business logic layer. Each agent:

  • Loads its prompt from the database via call_agent()
  • Sends a request to the Anthropic API (with structured output schema when applicable)
  • Returns a validated Pydantic model
  • Records cost and token usage to the database
  • Emits a Langfuse generation observation (when configured)

See Agent Pattern for the full convention.
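The steps above can be sketched as a single function. Everything here is illustrative: the helper names (load_prompt, llm, record_usage) and BriefOutput are hypothetical, and the real agents validate into Pydantic models rather than this dataclass stand-in:

```python
import json
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical stand-in for the real Pydantic output model.
@dataclass
class BriefOutput:
    title: str
    keywords: list[str]

def call_agent(
    agent_name: str,
    payload: dict[str, Any],
    load_prompt: Callable[[str], str],
    llm: Callable[[str, str], str],        # returns a JSON string
    record_usage: Callable[[str, int], None],
) -> BriefOutput:
    prompt = load_prompt(agent_name)        # 1. prompt comes from the DB
    raw = llm(prompt, json.dumps(payload))  # 2. structured-output request
    data = json.loads(raw)
    result = BriefOutput(**data)            # 3. validate into a typed model
    record_usage(agent_name, len(raw))      # 4. persist cost/token usage
    return result
```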

Tools

Tools provide agents with external data access:

  • PubMed -- searches medical literature (search_pubmed, fetch_abstract, fetch_paper)
  • Semantic Scholar -- searches academic papers with citation counts and abstracts
  • ClinicalTrials.gov -- searches clinical trials by condition
  • Web Search -- queries Brave Search API
  • Vault Reader -- searches research notes via pgvector hybrid search (semantic + full-text)
  • Keyword Data -- looks up backlog items by keyword and content architecture by pillar
  • Image Generator -- creates featured images via DALL-E 3 with pillar-specific color palettes
  • R2 Storage -- uploads/downloads featured images to/from Cloudflare R2
  • Source Scanner -- scans registered external sources for new items
  • URL Validator -- HEAD-checks source URLs, removes unreachable sources from dossiers
  • Source Enrichment -- constructs publication URLs from PMID/DOI identifiers
  • Citation Formatter -- generates deterministic citation labels from source metadata
  • Sanitizer -- strips control characters, collapses whitespace, truncates external content
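As an example of the smaller tools, the sanitizer's three responsibilities could look like this minimal sketch (the function name and defaults are assumptions, not the actual implementation):

```python
import re

# Hypothetical sketch of the sanitizer tool: strip control characters,
# collapse whitespace runs, and truncate overly long external content.
def sanitize(text: str, max_len: int = 2000) -> str:
    # Drop C0/C1 control characters (keeping tab and newline, which the
    # whitespace pass normalizes anyway).
    text = re.sub(r"[\x00-\x08\x0b-\x1f\x7f-\x9f]", "", text)
    # Collapse any whitespace run, including newlines, to a single space.
    text = re.sub(r"\s+", " ", text).strip()
    return text[:max_len]
```

Running sanitization before any external content reaches a prompt keeps tool output from injecting stray control sequences or blowing the token budget.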

WordPress Publisher

The publishing layer converts articles to WordPress format:

  • Featured image download from Cloudflare R2 and upload to WordPress media library
  • Markdown to Gutenberg blocks (headings, paragraphs, tables, lists, blockquotes, callouts, citations)
  • ACF custom field mapping (summary, reading time, evidence level, review date)
  • Polylang language association (EN/DE) with translation pair linking
  • Taxonomy assignment via LLM classification (topic, audience, evidence_level)
  • Custom post type routing (satellite, cornerstone, research_news)
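The Markdown-to-Gutenberg conversion produces WordPress's serialized block format (HTML wrapped in `<!-- wp:... -->` comment delimiters). A minimal sketch covering only headings and paragraphs, with the function name assumed:

```python
# Minimal sketch of the Markdown-to-Gutenberg step, handling only H2
# headings and paragraphs; the real converter also covers tables,
# lists, blockquotes, callouts, and citations.
def to_gutenberg(markdown: str) -> str:
    blocks = []
    for chunk in markdown.strip().split("\n\n"):
        chunk = chunk.strip()
        if chunk.startswith("## "):
            text = chunk[3:]
            blocks.append(
                f"<!-- wp:heading --><h2>{text}</h2><!-- /wp:heading -->"
            )
        else:
            blocks.append(
                f"<!-- wp:paragraph --><p>{chunk}</p><!-- /wp:paragraph -->"
            )
    return "\n".join(blocks)
```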

Database

Neon Postgres stores all pipeline state across 14 tables:

  • Content backlog and article lifecycle (with status state machine)
  • Article images (metadata + R2 object keys; binary data stored in Cloudflare R2)
  • Quality scores per gate per iteration
  • Agent prompts (versioned, keyed by agent + content_type)
  • Content templates (structural guidance per content type and pillar)
  • Run tracking and cost data
  • Research news sources and triaged items
  • Trusted sources catalog (authoritative sources for brief generation)
  • Vault note embeddings (pgvector hybrid search)
  • Dashboard users and audit log
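The status state machine mentioned above can be sketched as a transition table plus a guard. The status names are taken from the Data Flow section; the transition table itself (and the "published" terminal status) is illustrative:

```python
# Illustrative sketch of the article status state machine: each status
# maps to the set of statuses it may legally advance to.
TRANSITIONS: dict[str, set[str]] = {
    "brief_generating": {"brief_pending"},
    "brief_pending": {"brief_approved"},
    "brief_approved": {"ready_for_review"},   # via the article flow
    "ready_for_review": {"published"},        # hypothetical terminal status
}

def advance(current: str, new: str) -> str:
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new
```

Centralizing transitions like this lets both the flows and the dashboard reject out-of-order status changes before they hit the database.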

Observability

Three-layer observability stack:

  • structlog -- JSON-formatted structured logging with context binding (article_id, agent_name, flow_name)
  • Langfuse -- LLM-specific tracing with generation observations, session grouping, tags, and quality scores
  • Sentry -- error tracking with httpx and SQLAlchemy integrations
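The context-binding idea can be mimicked with the standard library alone. This is not the real logging setup (which uses structlog's bind/contextvars support), just a sketch of the pattern: bound key/value pairs ride along on every subsequent log line:

```python
import contextvars
import json
import sys

# Stdlib mimic of structlog-style context binding: values bound once
# (article_id, agent_name, flow_name) appear on every later log line.
_ctx: contextvars.ContextVar[dict] = contextvars.ContextVar("log_ctx", default={})

def bind(**kwargs) -> None:
    # Build a new dict rather than mutating, so each context is isolated.
    _ctx.set({**_ctx.get(), **kwargs})

def log(event: str, **fields) -> str:
    line = json.dumps({"event": event, **_ctx.get(), **fields})
    print(line, file=sys.stderr)
    return line
```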

See Observability for detailed documentation.

Dashboard

The Streamlit dashboard provides operator visibility across 8 pages: pipeline overview, article management, research news, review/publishing, analytics, source management, prompt management, and user administration. It uses its own database connection pool (NullPool) to avoid Streamlit/asyncpg event loop conflicts.
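The NullPool choice amounts to an engine-configuration fragment along these lines (the DSN is a placeholder; this is a sketch, not the dashboard's actual setup code):

```python
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy.pool import NullPool

# NullPool opens and closes a connection per checkout, so no pooled
# asyncpg connection is ever reused across Streamlit script re-runs
# (each of which may run on a different event loop).
engine = create_async_engine(
    "postgresql+asyncpg://user:pass@host/db",  # placeholder DSN
    poolclass=NullPool,
)
```

The trade-off is a new connection per query, which Neon's connection handling keeps cheap enough for an operator-facing dashboard.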

Data Flow

A typical article production follows this path:

  1. Backlog item created (dashboard or CSV import)
  2. Brief flow generates a structured article brief (status: brief_generating to brief_pending)
  3. Operator approves the brief (status: brief_approved)
  4. Article flow runs the full lifecycle:
    • Researcher gathers evidence (PubMed, web, vault)
    • Writer produces a draft from brief + dossier
    • Quality gates run in parallel (factuality, SEO, style -- config-driven per content type)
    • Synthesis reconciles feedback and produces a revised draft
    • If any gate fails, the writer rewrites (up to 3 iterations)
    • Image generation for satellite and cornerstone articles
  5. Operator reviews in the dashboard (status: ready_for_review)
  6. Publish flow pushes approved articles to WordPress as drafts
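The gate-and-rewrite portion of step 4 can be sketched as a bounded loop. The callables and names here are hypothetical; in the real pipeline the gates are agents and the writer rewrite consumes the synthesis feedback:

```python
from typing import Callable, Mapping

# Illustrative sketch of the quality-gate loop: run the configured
# gates against a draft and rewrite until all pass or the cap is hit.
MAX_ITERATIONS = 3

def run_quality_loop(
    draft: str,
    gates: Mapping[str, Callable[[str], bool]],
    rewrite: Callable[[str, list[str]], str],
) -> tuple[str, int]:
    for iteration in range(1, MAX_ITERATIONS + 1):
        failed = [name for name, gate in gates.items() if not gate(draft)]
        if not failed:
            return draft, iteration          # all gates passed
        draft = rewrite(draft, failed)       # writer revises against feedback
    return draft, MAX_ITERATIONS             # cap reached; flagged for review
```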

Key Design Decisions

  • Prompts as data -- stored in DB, versioned via Alembic, not hardcoded
  • Structured output -- Anthropic's structured output mode guarantees valid JSON matching Pydantic schemas
  • SQLAlchemy Core only -- no ORM, direct select/insert/update with Table + MetaData
  • Async by default -- asyncpg for Postgres, httpx for HTTP
  • Strict typing -- mypy strict mode, Pydantic v2 strict validation
  • Cost ceilings -- per-flow token budgets enforced at runtime via ContextVar
  • Config-driven quality gates -- active gates and pass thresholds are configurable per content type
  • pgvector hybrid search -- vault knowledge base uses semantic (vector, 70% weight) + full-text search (30% weight)
  • Prompt caching -- system prompts sent with cache_control: {"type": "ephemeral"} for Anthropic server-side caching (5-minute TTL, refreshed on each hit; cache reads billed at 10% of the input rate)
  • Self-hosted Prefect -- Prefect server co-located with worker in Fly.io container, backed by dedicated Neon database
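The ContextVar-based cost ceiling can be sketched minimally; the names here are illustrative, not the pipeline's actual API:

```python
from contextvars import ContextVar

# Minimal sketch of a per-flow token budget enforced via ContextVar:
# each flow run sets its ceiling, and every agent call charges against
# it, failing fast once the budget is exhausted.
_budget: ContextVar[int] = ContextVar("token_budget")

class BudgetExceeded(RuntimeError):
    pass

def start_flow(max_tokens: int) -> None:
    _budget.set(max_tokens)

def charge(tokens: int) -> int:
    remaining = _budget.get() - tokens
    if remaining < 0:
        raise BudgetExceeded("per-flow token budget exhausted")
    _budget.set(remaining)
    return remaining
```

Because ContextVar state is scoped to the running task context, concurrent flow runs each track their own budget without shared mutable state.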