Naluma Content Pipeline

Automated content production pipeline for a tinnitus information website. AI agents research medical literature, draft articles, run quality checks, and publish to WordPress, with a human operator reviewing every article before it is published.

How It Works

graph LR
    A[Backlog Item] --> B[Brief Generation]
    B --> C[Research]
    C --> D[Writing]
    D --> E[Quality Gates]
    E --> F{Pass?}
    F -->|No| G[Rewrite]
    G --> E
    F -->|Yes| H[Human Review]
    H --> I[Publish to WordPress]

The pipeline turns a content backlog item into a published article through a series of AI agent steps:

  1. Brief Generation — structures the article outline, keywords, and target audience
  2. Research — gathers evidence from PubMed, web search, and an Obsidian vault
  3. Writing — produces a full article draft from the brief and research dossier
  4. Quality Gates — factuality, SEO, and readability checks run in parallel
  5. Synthesis — reconciles feedback, triggers rewrites if needed (up to 3 iterations)
  6. Human Review — operator approves or requests changes
  7. Publish — pushes approved articles to WordPress as drafts
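Steps 4 and 5 above can be sketched as a small control loop. This is a minimal illustration, not the project's actual code: the gate functions, the `rewrite` helper, and the `GateResult` type are hypothetical stand-ins for the real agents, and "up to 3 iterations" is read here as at most three rewrites.

```python
import asyncio
from dataclasses import dataclass

MAX_REWRITES = 3  # assumption: "up to 3 iterations" means at most 3 rewrites


@dataclass
class GateResult:
    gate: str
    passed: bool
    feedback: str = ""


# Hypothetical stand-ins for the factuality, SEO, and readability agents.
async def check_factuality(draft: str) -> GateResult:
    return GateResult("factuality", "[citation needed]" not in draft, "add sources")


async def check_seo(draft: str) -> GateResult:
    return GateResult("seo", "tinnitus" in draft.lower(), "mention the target keyword")


async def check_readability(draft: str) -> GateResult:
    longest = max((len(s.split()) for s in draft.split(".")), default=0)
    return GateResult("readability", longest <= 30, "shorten long sentences")


async def run_quality_gates(draft: str) -> list[GateResult]:
    # The three checks do not depend on each other, so they run concurrently.
    return list(await asyncio.gather(
        check_factuality(draft), check_seo(draft), check_readability(draft)
    ))


async def rewrite(draft: str, failures: list[GateResult]) -> str:
    # Placeholder rewrite; a real writer agent would call an LLM here.
    if any(f.gate == "factuality" for f in failures):
        draft = draft.replace("[citation needed]", "(source added)")
    if any(f.gate == "seo" for f in failures):
        draft = "Tinnitus overview. " + draft
    return draft


async def synthesize(draft: str) -> tuple[str, bool, int]:
    """Return (final draft, ready for human review, rewrites used)."""
    for rewrites in range(MAX_REWRITES + 1):
        failures = [r for r in await run_quality_gates(draft) if not r.passed]
        if not failures:
            return draft, True, rewrites
        if rewrites < MAX_REWRITES:
            draft = await rewrite(draft, failures)
    return draft, False, MAX_REWRITES
```

Calling `asyncio.run(synthesize(text))` returns the final draft together with a flag saying whether it passed every gate. A draft that still fails after the last rewrite comes back with the flag set to `False`, so the caller can decide how to escalate it.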

A separate research news pipeline scans RSS feeds weekly, triages relevance, and produces digest articles.
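The scan-and-triage part of the research news pipeline might look roughly like the sketch below. It is an assumption-heavy illustration: the keyword list, the sample feed, and the function names are invented for this example, and a simple keyword match stands in for whatever relevance classification the real pipeline performs.

```python
import xml.etree.ElementTree as ET

# Hypothetical relevance keywords; the real triage criteria are not documented here.
KEYWORDS = ("tinnitus", "hearing loss", "cochlear")

# Made-up RSS 2.0 feed used only to exercise the functions below.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example journal feed</title>
  <item><title>New trial of sound therapy for tinnitus</title>
        <description>Randomised study in adults.</description></item>
  <item><title>Advances in knee surgery</title>
        <description>Orthopaedic review.</description></item>
</channel></rss>"""


def parse_items(rss_xml: str) -> list[dict[str, str]]:
    """Extract title and summary from every <item> in an RSS document."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "summary": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]


def is_relevant(item: dict[str, str]) -> bool:
    # Keyword match as a cheap stand-in for a model-based relevance check.
    text = f"{item['title']} {item['summary']}".lower()
    return any(keyword in text for keyword in KEYWORDS)


def triage(rss_xml: str) -> list[dict[str, str]]:
    """Keep only the feed items that look relevant to the site's topic."""
    return [item for item in parse_items(rss_xml) if is_relevant(item)]
```

The items returned by `triage(SAMPLE_FEED)` would then be the input for the digest article.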

Tech Stack

| Layer | Technology |
| --- | --- |
| Language | Python 3.12, strict typing (mypy) |
| Orchestration | Prefect 3.x |
| LLM | Anthropic Claude (Sonnet for drafting, Haiku for classification/review) |
| Database | Neon Postgres, SQLAlchemy Core (async via asyncpg + pgvector) |
| Validation | Pydantic v2 (strict mode) |
| Dashboard | Streamlit |
| Documentation | MkDocs Material (Cloudflare Pages) |
| Publishing | WordPress REST API + Polylang (EN/DE) |
| Deployment | Docker, Fly.io |
| CI/CD | GitHub Actions |
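As a rough illustration of the Publishing layer, the sketch below builds (but does not send) a request that would create a post with `status` set to `draft` through the WordPress REST API posts endpoint. The function name, the example site URL, and the use of an application password with Basic authentication are assumptions for this example; the project's own WordPress client and its Polylang handling may work differently.

```python
import base64
import json
import urllib.request


def build_draft_request(
    site_url: str, user: str, app_password: str, title: str, html: str
) -> urllib.request.Request:
    """Build a POST request that creates a WordPress post as a draft."""
    # "draft" keeps the post unpublished until someone publishes it in WordPress.
    payload = {"title": title, "content": html, "status": "draft"}
    # Assumption: the site accepts an application password via Basic auth.
    token = base64.b64encode(f"{user}:{app_password}".encode()).decode()
    return urllib.request.Request(
        url=f"{site_url.rstrip('/')}/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect in unit tests without touching a live site.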

Project Structure

src/
  agents/       # AI agents (researcher, writer, quality checkers, synthesis, image, etc.)
  tools/        # PubMed, web search, vault reader (pgvector), image gen, sanitizer
  pipeline/     # Prefect flows (article, brief, batch, publish, research news, cornerstone)
  wordpress/    # WordPress client, Gutenberg converter, ACF, taxonomy, Polylang
  db/           # SQLAlchemy tables + pgvector, queries, Alembic migrations

dashboard/      # Streamlit operator dashboard

docs/
  site/         # MkDocs Material source (Diátaxis structure)

tests/
  unit/         # Isolated tests, no DB
  integration/  # Real DB tests
  e2e/          # Full pipeline tests with mocked agents
  evaluation/   # LLM-as-judge prompt evaluations (not run in CI)