Agents API¶
agents
¶
AI agents for the naluma content pipeline.
Each agent module exposes a single async entry-point function that calls
call_agent() from base.py and returns a validated Pydantic model
from schemas.py.
Modules:

- `base`: Generic Anthropic API caller with tool loop, retry, and cost tracking.
- `researcher`: Gathers evidence from web, PubMed, Semantic Scholar, and vault.
- `writer`: Produces full article drafts from research dossiers.
- `synthesis`: Merges quality-gate feedback into revised drafts.
- `humanizer`: Strips residual AI writing patterns for natural tone.
- `factuality_checker`: Verifies claims against sources.
- `seo_optimizer`: Evaluates and improves SEO signals.
- `seo_checks`: Deterministic SEO validation checks (separate from LLM-based optimizer).
- `style_checker`: Evaluates draft style and readability.
- `triage`: Classifies and routes content items.
- `digest_writer`: Produces research news digests.
- `brief_generator`: Creates article briefs from backlog items.
- `readability`: Readability scoring utilities.
- `cost_tracker`: Per-call and per-flow cost accounting.
- `image_generator`: AI image generation for articles.
- `ai_patterns`: Prohibited vocabulary maps and prompt pattern blocks.
- `prompts`: Database-backed prompt loader with TTL cache.
- `schemas`: Pydantic I/O models for all agents.
ai_patterns
¶
Shared AI pattern library — vocabulary, constructions, and hype terms to avoid.
Used by writer, synthesis, style checker, humanizer, and digest writer agents. The key design principle (Anthropic best practice): include replacement guidance alongside banned words — tell the model what TO write, not just what to avoid.
Import format_pattern_block(language) to inject the full pattern guidance into
any prompt. Pass "en" or "de" to get language-specific content.
format_pattern_block(language)
¶
Format all pattern guidance into a single text block for prompt injection.
Args:
language: "en" for English patterns, "de" for German patterns.
Returns: A multi-section formatted string ready to embed in a system or user prompt.
Raises:
ValueError: If language is not "en" or "de".
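The design principle above — pair every banned term with replacement guidance — can be sketched with a hypothetical pattern map (the terms and guidance below are illustrative, not the module's actual vocabulary):

```python
# Hypothetical sketch: pair each banned term with replacement guidance,
# then format the whole map into a single block for prompt injection.
BANNED_WITH_REPLACEMENTS = {
    "en": {
        "delve into": "use a plain verb such as 'examine' or 'look at'",
        "game-changer": "state the concrete benefit instead",
    },
}


def format_pattern_block(language: str) -> str:
    """Format banned terms plus replacement guidance for prompt injection."""
    if language not in BANNED_WITH_REPLACEMENTS:
        raise ValueError(f"unsupported language: {language}")
    lines = ["Avoid these patterns (write the replacement instead):"]
    for banned, guidance in BANNED_WITH_REPLACEMENTS[language].items():
        lines.append(f'- "{banned}" -> {guidance}')
    return "\n".join(lines)
```

The key point is that each entry tells the model what TO write, not only what to avoid.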
base
¶
Generic Anthropic API caller with tool use loop, retry, and cost tracking.
Provides call_agent — the single entry point for all agent interactions
with the Anthropic API. Handles prompt loading, tool dispatch, exponential
backoff on transient errors, cost recording, and structured logging.
AgentResponse(content, tool_results=list(), usage=dict(), cost_usd=0.0, model='', duration_ms=0, stop_reason='')
dataclass
¶
Structured result from a call_agent invocation.
AgentLoopLimitError(agent_name, iterations)
¶
Bases: Exception
Raised when the tool use loop exceeds MAX_TOOL_ITERATIONS.
JsonExtractionError(response_text)
¶
Bases: Exception
Raised when JSON cannot be extracted from an LLM response.
ExtractionResult(json_str, method, was_repaired)
dataclass
¶
Result of a JSON extraction attempt.
extract_json(text)
¶
Extract JSON from an LLM response that may contain markdown fencing.
Handles three cases:
1. Pure JSON (starts with `{` or `[`) — trailing prose stripped via `raw_decode`, then repaired.
2. Markdown code block (`` ```json ... ``` ``) — contents extracted.
3. JSON embedded in prose — first `{`/`[` to matching last `}`/`]` extracted.
After extraction, repairs common LLM JSON errors (unescaped quotes).
Returns an `ExtractionResult` with `json_str`, `method`, and
`was_repaired`. Raises `JsonExtractionError` when no JSON can be
found.
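A minimal standalone sketch of the three-case strategy (a simplified stand-in, not the actual implementation, which also repairs unescaped quotes):

```python
import json
import re


def extract_json_sketch(text: str) -> str:
    """Minimal three-case JSON extraction from an LLM response."""
    stripped = text.strip()
    # Case 1: pure JSON -- raw_decode stops at the end of the first
    # valid document, dropping any trailing prose.
    if stripped.startswith(("{", "[")):
        obj, _ = json.JSONDecoder().raw_decode(stripped)
        return json.dumps(obj)
    # Case 2: markdown code fence (optionally tagged ```json).
    fence = re.search(r"```(?:json)?\s*(.*?)```", stripped, re.DOTALL)
    if fence:
        return fence.group(1).strip()
    # Case 3: JSON embedded in prose -- first brace/bracket to the last.
    starts = [i for i in (stripped.find("{"), stripped.find("[")) if i != -1]
    if not starts:
        raise ValueError("no JSON found")
    end = max(stripped.rfind("}"), stripped.rfind("]"))
    return stripped[min(starts):end + 1]
```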
parse_agent_output(response_content, model_class, *, agent_name='')
¶
Extract JSON from an LLM response and validate it against a Pydantic model.
Deprecated: all agents now use `validate_structured_output` with
`structured_output_schema`. This function is retained only for
`tests/evaluation/judge.py`, which still uses `extract_json`.
Raises:
JsonExtractionError: If no JSON can be extracted from the response.
ValueError: If the extracted JSON does not match the model schema.
validate_structured_output(response_content, model_class, *, agent_name)
¶
Validate a structured output response against a Pydantic model.
Used by agents that pass structured_output_schema to call_agent().
The API guarantees valid JSON matching the schema, so no extraction
is needed — just model_validate_json() with standardized error logging
and Sentry context.
Guards:

- Empty/whitespace-only responses raise `ValueError` immediately (Sentry CONTENT-AUTOMATION-13: factuality_checker returned "").
- Trailing-comma JSON is repaired before re-validation (Sentry CONTENT-AUTOMATION-14: style_checker returned JSON with a trailing comma).
Raises: ValueError: If validation fails (wraps the original exception).
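The two guards can be sketched with plain `json` standing in for Pydantic validation (a simplified stand-in for the real path, which calls `model_validate_json`):

```python
import json
import re


def validate_with_repair(raw: str) -> dict:
    """Guarded validation: reject empty responses, repair trailing commas."""
    if not raw.strip():
        # Empty/whitespace-only responses fail fast with a clear error.
        raise ValueError("empty structured-output response")
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Strip commas that sit directly before a closing brace/bracket,
        # then re-validate once.
        repaired = re.sub(r",\s*([}\]])", r"\1", raw)
        return json.loads(repaired)
```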
call_agent(agent_name, content_type, messages, *, article_id=None, tools=None, tool_definitions=None, max_tokens=4096, max_iterations=MAX_TOOL_ITERATIONS, flow_name=None, session=None, structured_output_schema=None, assistant_prefill=None, on_iteration_end=None)
async
¶
Call an agent via the Anthropic API with full orchestration.
Loads the prompt, calls the API (with retries), runs the tool use loop
if needed, records cost, and returns a structured AgentResponse.
Raises:
PromptNotFoundError: If no active prompt exists for the agent.
AgentLoopLimitError: If the tool loop exceeds MAX_TOOL_ITERATIONS.
APIStatusError: On non-retryable API errors or exhausted retries.
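The retry behaviour can be illustrated with a synchronous sketch (the real loop is async and retries Anthropic-specific transient errors; `ConnectionError` stands in here for a retryable failure):

```python
import random
import time


def call_with_backoff(fn, *, max_retries=3, base_delay=1.0):
    """Retry fn on transient errors with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries:
                # Retries exhausted: surface the last error to the caller.
                raise
            # Delays grow 1x, 2x, 4x, ... of base_delay, plus jitter.
            time.sleep(base_delay * 2 ** attempt + random.random() * base_delay)
```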
brief_generator
¶
Brief generator agent for the naluma content pipeline.
Takes a BacklogItem and ContentTemplate, invokes the LLM with tool
access to web search, content architecture, and vault notes, and returns a
fully validated ArticleBrief.
generate_brief(backlog_item, template, *, article_id=None)
async
¶
Generate an ArticleBrief from a backlog item and content template.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `backlog_item` | `BacklogItem` | The content backlog entry to produce a brief for. | *required* |
| `template` | `ContentTemplate` | The structural template that defines expected article sections. | *required* |
| `article_id` | `UUID \| None` | Optional article UUID for cost tracking. | `None` |

Returns:

| Type | Description |
|---|---|
| `ArticleBrief` | A fully validated brief ready for the writer agent. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If the LLM response cannot be parsed into an `ArticleBrief`. |
cost_tracker
¶
Cost tracking for agent calls — pricing, recording, and breakdown queries.
Provides calculate_cost to convert token usage into USD, record_agent_cost
to atomically accumulate costs on articles, and get_article_cost_breakdown to
query per-agent aggregates from the quality_scores table.
calculate_cost(model, tokens_in, tokens_out, cache_read_input_tokens=0, cache_creation_input_tokens=0)
¶
Return the estimated cost in USD for the given token counts.
Uses MODEL_PRICING for known models. Unknown models trigger a
warning log and fall back to the most expensive pricing.
Anthropic prompt caching rates:

- Cache read tokens: 10% of the normal input rate.
- Cache creation tokens: 125% of the normal input rate.
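With hypothetical per-million-token rates standing in for `MODEL_PRICING`, the caching arithmetic looks like:

```python
# Illustrative per-million-token rates; real values live in MODEL_PRICING.
INPUT_RATE_PER_MTOK = 3.00
OUTPUT_RATE_PER_MTOK = 15.00


def calculate_cost_sketch(tokens_in, tokens_out,
                          cache_read=0, cache_creation=0) -> float:
    """Apply the caching multipliers: reads at 10%, creation at 125%."""
    cost = (
        tokens_in / 1_000_000 * INPUT_RATE_PER_MTOK
        + tokens_out / 1_000_000 * OUTPUT_RATE_PER_MTOK
        + cache_read / 1_000_000 * INPUT_RATE_PER_MTOK * 0.10
        + cache_creation / 1_000_000 * INPUT_RATE_PER_MTOK * 1.25
    )
    return round(cost, 6)
```

So at these rates a million cached-read input tokens cost a tenth of a million fresh input tokens, while writing the cache costs a 25% premium.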
record_agent_cost(session, article_id, agent_name, tokens_in, tokens_out, cost_usd, flow_name=None, cache_read_input_tokens=0, cache_creation_input_tokens=0)
async
¶
Atomically accumulate cost_usd and token counts on the article.
After recording, optionally enforces the flow cost ceiling via
check_cost_limit if flow_name is provided.
get_article_cost_breakdown(session, article_id)
async
¶
Return per-agent cost aggregates from quality_scores.
Each dict contains agent_name, cost_usd, tokens_in, tokens_out.
Returns an empty list when no scores exist.
digest_writer
¶
Digest writer agent — generates patient-accessible research news digests.
Pure generation agent with no tool use. Constructs a structured user message
from triaged source items with their triage angles, calls the LLM, and parses
the response into a validated DigestOutput.
write_digest(newsworthy_items, triage_results, language, *, article_id=None, rewrite_feedback=None)
async
¶
Generate research news digest from triaged items.
Args:
newsworthy_items: Source items that passed triage.
triage_results: Triage results with suggested angles.
language: Target language code (e.g., "en", "de").
article_id: Optional article ID for cost tracking.
rewrite_feedback: Optional structural feedback from a previous failed pre-check, included as rewrite instructions.
Returns: A validated DigestOutput with editorial intro, item summaries, and SEO metadata.
Raises: ValueError: If the LLM response cannot be parsed as DigestOutput.
factuality_checker
¶
Factuality checker agent — verifies claims in article drafts.
Pure reasoning agent that cross-references the draft against the research
dossier and returns a validated `FactualityOutput`. Does not use
external tools; all verification is dossier-based.
check_factuality(draft, dossier, brief, *, article_id=None)
async
¶
Verify factual claims in draft against dossier.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `draft` | `str` | The article draft markdown to fact-check. | *required* |
| `dossier` | `ResearchDossier` | The research dossier containing sources and evidence gathered earlier. | *required* |
| `brief` | `ArticleBrief` | The article brief describing the topic, keywords, and targets. | *required* |
| `article_id` | `UUID \| None` | Optional article UUID for cost tracking. | `None` |

Returns:

| Type | Description |
|---|---|
| `FactualityOutput` | Validated factuality check output including claim counts, issues, and verification notes. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If the agent response cannot be parsed into a `FactualityOutput`. |
humanizer
¶
Humanizer polish agent — strips residual AI patterns from final drafts.
Sonnet-based agent that runs after quality gates pass. Replaces banned AI vocabulary and construction patterns without altering factual claims, statistics, source references, or medical terminology.
Uses a user-message instruction to direct the model to begin with the first heading, and a preamble-extraction fallback to strip any reasoning that leaks through.
polish_draft(draft, language, content_type='all', *, article_id=None)
async
¶
Run a lightweight humanizer polish on a final draft.
This is called after quality gates pass, before storing the article as READY_FOR_REVIEW. It replaces banned AI vocabulary and construction patterns without altering factual content.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `draft` | `str` | The final article draft in markdown. | *required* |
| `language` | `str` | Article language code ("en" or "de"). | *required* |
| `content_type` | `str` | Content type for prompt lookup (default "all"). | `'all'` |
| `article_id` | `UUID \| None` | Optional article ID for cost tracking. | `None` |

Returns:

| Type | Description |
|---|---|
| `str` | The polished draft, or the original if no changes were needed. |
image_generator
¶
Image generator agent — generates two featured image candidates per article.
Claude reads article context and returns two scene descriptions via structured output. Python code assembles full DALL-E prompts from the fixed template and pillar color palette, then generates both images in parallel.
generate_featured_images(article_id, brief, revised_draft, title, seo_title)
async
¶
Generate two contextual featured image candidates for an article.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `article_id` | `UUID` | UUID of the article for cost tracking. | *required* |
| `brief` | `dict[str, object]` | The article brief as a dict. | *required* |
| `revised_draft` | `str` | The synthesised/final draft text. | *required* |
| `title` | `str` | The article title. | *required* |
| `seo_title` | `str` | The SEO-optimised title. | *required* |

Returns:

| Type | Description |
|---|---|
| `DualImageResult` | Contains one or two generated image results. |
prompts
¶
Prompt loader with TTL cache for agent system prompts.
Wraps the raw get_active_prompt DB query with an in-memory cache
to avoid database round-trips on every agent call. The cache can be
cleared immediately via invalidate_prompt_cache() (e.g. after a
dashboard edit) or left to expire naturally after
PROMPT_CACHE_TTL_SECONDS (default 300 s).
PromptNotFoundError(agent_name, content_type)
¶
Bases: Exception
Raised when no active prompt exists for the given agent/content-type.
get_prompt(agent_name, content_type)
async
¶
Return the active AgentPrompt for agent_name / content_type.
Results are cached in-memory for up to PROMPT_CACHE_TTL_SECONDS.
On cache miss (or expiry) the database is queried via
get_active_prompt, which falls back to content_type='all'
automatically.
Raises: PromptNotFoundError: If no active prompt exists at all.
invalidate_prompt_cache(agent_name=None)
¶
Clear cached prompts.
If agent_name is None every entry is removed. Otherwise only
entries whose key starts with the given agent name are dropped.
readability
¶
Deterministic readability and style checks — pure Python, no LLM cost.
Computes Flesch-Kincaid grade level, average sentence length, and within-target checks that complement the LLM-based evaluation in the style checker agent.
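The Flesch-Kincaid grade level is `0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59`; a rough sketch with a crude vowel-group syllable counter (the module's actual tokenisation may differ):

```python
import re


def count_syllables(word: str) -> int:
    """Crude syllable estimate: count contiguous vowel groups."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Standard FK grade formula over naive sentence/word splits."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * n_words / sentences + 11.8 * syllables / n_words - 15.59
```

Short sentences of one-syllable words score below grade 0, which is why patient-accessible targets favour them.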
run_style_checks(draft, brief)
¶
Compute deterministic readability metrics for a draft.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
draft
|
str
|
The article draft in markdown. |
required |
brief
|
ArticleBrief
|
The article brief. The |
required |
Returns:
| Type | Description |
|---|---|
StyleCheckResult
|
Flesch-Kincaid grade, average sentence length, and within-target flag. |
check_digest_headlines(draft)
¶
Check that H2 headlines are concise descriptive titles, not sentences.
Returns (passed, reason) tuple.
researcher
¶
Researcher agent — gathers evidence for article production.
Invokes PubMed, Semantic Scholar, ClinicalTrials.gov, web search, and
Obsidian vault tools via the base call_agent loop, then returns a
validated `ResearchDossier`.
Progressive checkpointing: the researcher periodically saves its findings
via update_research_notes. After each checkpoint the on_iteration_end
callback compresses the conversation context, and forced guardrails ensure
checkpoints happen even when the model ignores prompt guidance. If the
iteration cap is hit, the last checkpoint is converted to a fallback dossier.
SourceMeta(article_title='', authors='', doi='', publication_name='', publication_url='', article_url='', publication_year='')
dataclass
¶
Bibliographic metadata extracted deterministically from a tool result.
research(brief, *, article_id=None, vault_context=None)
async
¶
Run the researcher agent against brief and return a validated dossier.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `brief` | `ArticleBrief` | The article brief describing the topic, keywords, and sources. | *required* |
| `article_id` | `UUID \| None` | Optional article UUID for cost tracking. | `None` |
| `vault_context` | `list[dict[str, object]] \| None` | Pre-retrieved vault chunks from pgvector similarity search, included in the prompt context when provided. | `None` |

Returns:

| Type | Description |
|---|---|
| `ResearchDossier` | Validated research output including sources, evidence, and FAQ suggestions. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If the agent response cannot be parsed into a `ResearchDossier`. |
| `AgentLoopLimitError` | If the iteration cap is hit and no checkpoint is available. |
schema_transform
¶
Transform Pydantic JSON schemas for Anthropic structured output.
Local copy of the transformation logic from the Anthropic SDK
(anthropic.lib._parse._transform.transform_schema). Inlined to
decouple from a private API that may change without notice on SDK
upgrades.
The transformer converts standard JSON Schema (as emitted by
BaseModel.model_json_schema()) into the restricted subset accepted
by the Anthropic structured-output endpoint:
- Strips unsupported keywords (`minimum`, `maximum`, `additionalProperties`, etc.) and appends them to `description` as soft hints.
- Converts `oneOf` → `anyOf`.
- Forces `additionalProperties: false` on all objects.
- Recursively processes `$defs`, `items`, `properties`.
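The four rules can be sketched as a recursive transform (a simplified stand-in; the real `transform_schema` handles a larger keyword set):

```python
# Illustrative subset of keywords the target endpoint rejects.
UNSUPPORTED = {"minimum", "maximum", "additionalProperties"}


def transform_sketch(schema: dict) -> dict:
    """Recursively restrict a JSON schema for structured output."""
    out: dict = {}
    hints: list[str] = []
    for key, value in schema.items():
        if key in UNSUPPORTED:
            hints.append(f"{key}={value}")  # demote to a soft hint
        elif key == "oneOf":
            out["anyOf"] = [transform_sketch(v) for v in value]
        elif key in ("$defs", "properties"):
            out[key] = {k: transform_sketch(v) for k, v in value.items()}
        elif key == "items":
            out[key] = transform_sketch(value)
        else:
            out[key] = value
    if out.get("type") == "object":
        out["additionalProperties"] = False
    if hints:
        desc = out.get("description", "")
        out["description"] = (desc + " " if desc else "") + "; ".join(hints)
    return out
```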
transform_schema(json_schema)
¶
Transform a JSON schema for Anthropic structured-output compatibility.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `json_schema` | `dict[str, Any]` | A standard JSON Schema dict (e.g. from `BaseModel.model_json_schema()`). | *required* |

Returns:

| Type | Description |
|---|---|
| `dict` | A transformed copy with unsupported keywords moved into `description`. |
schemas
¶
Agent I/O schemas for the naluma content pipeline.
Every agent in the pipeline has typed input/output contracts defined here.
All models use Pydantic v2 with strict validation (ConfigDict(strict=True)).
AeoSpec
¶
Bases: BaseModel
Answer Engine Optimisation specification for an article brief.
BriefSection
¶
Bases: BaseModel
A section in the article structure.
ContentTemplateRef
¶
Bases: BaseModel
Content template reference embedded in a brief.
BriefSourceDirective
¶
Bases: BaseModel
Research direction for the researcher agent.
ArticleBrief
¶
Bases: BaseModel
Complete brief handed to the writer agent.
ResearchSource
¶
Bases: BaseModel
A single source discovered during the research phase.
ResearchDossier
¶
Bases: BaseModel
Complete research output handed to the writer agent.
FaqItem
¶
Bases: BaseModel
A single FAQ question-answer pair.
WriterOutput
¶
Bases: BaseModel
Output from the writer agent.
QualityFeedback
¶
Bases: BaseModel
Structured feedback from a quality gate agent.
QualityGateOutput
¶
Bases: BaseModel
Base output for all quality gate agents.
FactualityIssue
¶
Bases: BaseModel
A single factual issue flagged by the factuality checker.
FactualityOutput
¶
Bases: QualityGateOutput
Validated output from the factuality checker gate.
KeywordCount
¶
Bases: BaseModel
Count of a secondary keyword in the draft.
SeoCheckResult
¶
Bases: BaseModel
Deterministic SEO check results (non-LLM).
SeoLlmEvaluation
¶
Bases: BaseModel
LLM-based qualitative SEO evaluation.
SeoSuggestedFix
¶
Bases: BaseModel
A single actionable SEO fix suggestion.
SeoOutput
¶
Bases: QualityGateOutput
Scored SEO output combining deterministic checks and LLM evaluation.
StyleCheckResult
¶
Bases: BaseModel
Deterministic readability check results (non-LLM).
ElementPresence
¶
Bases: BaseModel
Whether a required structural element is present.
StyleStructureCompliance
¶
Bases: BaseModel
Structure compliance section of style checker output.
StyleVoiceIssue
¶
Bases: BaseModel
A single voice/tone issue flagged by the style checker.
StyleVoiceEvaluation
¶
Bases: BaseModel
Voice and tone evaluation from the style checker.
StyleAiPattern
¶
Bases: BaseModel
A single AI-writing pattern detected by the humanizer check.
StyleHumanizerCheck
¶
Bases: BaseModel
Humanizer check results from the style checker.
StyleMedicalIssue
¶
Bases: BaseModel
A single medical language issue flagged by the style checker.
StyleMedicalLanguage
¶
Bases: BaseModel
Medical language evaluation from the style checker.
StyleOutput
¶
Bases: QualityGateOutput
Validated style checker output with readability, structure, voice, and humanizer results.
SynthesisChange
¶
Bases: BaseModel
A single change made during synthesis.
SynthesisConflictResolution
¶
Bases: BaseModel
A conflict resolution between quality gates.
SynthesisOutput
¶
Bases: BaseModel
Output from the synthesis agent that merges quality feedback into the draft.
TriageResult
¶
Bases: BaseModel
Triage decision for a single discovered source item.
TriageOutput
¶
Bases: BaseModel
Wrapper for structured output — contains the list of triage results.
DigestSourceCitation
¶
Bases: BaseModel
Structured source metadata for a single digest item citation.
DigestItemSummary
¶
Bases: BaseModel
Summary of a single item in a research digest.
DigestOutput
¶
Bases: BaseModel
Complete research digest output.
ImageScene
¶
Bases: BaseModel
A single scene description for DALL-E prompt assembly.
ImageScenesOutput
¶
Bases: BaseModel
Structured output from Claude: two scene descriptions.
SingleImageResult
¶
Bases: BaseModel
Result for one generated image.
DualImageResult
¶
Bases: BaseModel
Result containing one or two generated images.
seo_checks
¶
Deterministic SEO checks — pure Python, no LLM cost.
Computes measurable SEO metrics (keyword density, heading hierarchy, etc.) that complement the LLM-based evaluation in the SEO optimizer agent.
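One such measurable metric, keyword density, can be sketched as occurrences per 100 words (an illustrative formula; the module's exact definition may differ):

```python
import re


def keyword_density(draft: str, keyword: str) -> float:
    """Occurrences of keyword per 100 words of draft text."""
    words = re.findall(r"\w+", draft.lower())
    if not words:
        return 0.0
    # Case-insensitive phrase match; multi-word keywords count as one hit.
    hits = len(re.findall(re.escape(keyword.lower()), draft.lower()))
    return 100.0 * hits / len(words)
```

Being pure Python, checks like this run on every draft revision at zero LLM cost.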
run_seo_checks(draft, brief, seo_title, meta_description)
¶
Compute all deterministic SEO metrics for a draft.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `draft` | `str` | The article draft in markdown. | *required* |
| `brief` | `ArticleBrief` | The article brief containing keyword and target metadata. | *required* |
| `seo_title` | `str` | The proposed SEO title tag. | *required* |
| `meta_description` | `str` | The proposed meta description. | *required* |

Returns:

| Type | Description |
|---|---|
| `SeoCheckResult` | All measurable SEO check results. |
seo_optimizer
¶
SEO optimizer agent — evaluates and scores article SEO quality.
Pure evaluation agent with no tool use. Runs deterministic SEO checks first,
then calls the LLM with the draft, brief, SEO metadata, and check results
to produce a scored SeoOutput.
optimize_seo(draft, brief, seo_title, meta_description, *, article_id=None)
async
¶
Evaluate SEO quality of an article draft.
Runs deterministic checks first, then calls the LLM for qualitative
evaluation. Returns a scored SeoOutput combining both.
Args:
draft: The article draft in markdown.
brief: The article brief with keyword and target metadata.
seo_title: The proposed SEO title tag.
meta_description: The proposed meta description.
article_id: Optional article ID for cost tracking.
Returns:
A validated SeoOutput with score, feedback, and suggested fixes.
Raises:
ValueError: If the LLM response cannot be parsed as SeoOutput.
style_checker
¶
Style checker agent — evaluates draft style, readability, and voice compliance.
Pure evaluation agent with no tool use. Runs deterministic readability checks first, then constructs a structured user message with the draft, brief, and deterministic results for LLM-based qualitative evaluation.
check_style(draft, brief, *, article_id=None)
async
¶
Evaluate the style, readability, and voice compliance of a draft.
Runs deterministic readability checks via run_style_checks, then
sends the draft, brief, and deterministic results to the LLM for
qualitative evaluation.
Args:
draft: The article draft in markdown.
brief: The article brief specifying topic, structure, and voice targets.
article_id: Optional article ID for cost tracking.
Returns:
A validated StyleOutput with readability metrics, structure
compliance, voice evaluation, and humanizer check results.
Raises:
ValueError: If the LLM response cannot be parsed as StyleOutput.
synthesis
¶
Synthesis agent — merges quality-gate feedback into a revised article draft.
Pure reasoning agent with no tool use. Constructs a structured user message
from the draft, factuality/SEO/style gate outputs, and the original brief,
calls the LLM, and parses the response into a validated SynthesisOutput.
synthesize(draft, factuality_output, seo_output, style_output, brief, *, faq_items=None, dossier=None, article_id=None)
async
¶
Merge quality-gate feedback into a revised article draft.
Args:
draft: The current article draft markdown.
factuality_output: Output from the factuality checker gate.
seo_output: Output from the SEO optimizer gate. Pass None for content
types that skip the SEO gate (e.g. digests); the <seo_report>
block will be omitted from the synthesis prompt entirely.
style_output: Output from the style checker gate.
brief: The original article brief for reference.
dossier: Optional research dossier for factuality-fix cross-referencing.
article_id: Optional article ID for cost tracking.
Returns:
A validated SynthesisOutput containing the revised draft,
change log, and conflict resolutions.
Raises:
ValueError: If the LLM response cannot be parsed as SynthesisOutput.
triage
¶
Triage agent — scores source items for newsworthiness.
Evaluates pending source items on relevance, novelty, and accessibility
using the triage prompt. Optionally fetches full paper abstracts via
the fetch_paper tool. Persists triage decisions for each item.
triage_items(items, recent_digest_summaries, *, article_id=None)
async
¶
Score pending source items for newsworthiness.
Calls the triage agent (Haiku model) with XML-tagged items and recent digest summaries. Persists triage decisions via update_triage_status().
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `items` | `list[SourceItem]` | Source items to evaluate for newsworthiness. | *required* |
| `recent_digest_summaries` | `list[str]` | Summaries of recently published digests for novelty assessment. | *required* |
| `article_id` | `UUID \| None` | Optional article UUID for cost tracking. | `None` |

Returns:

| Type | Description |
|---|---|
| `list[TriageResult]` | A list of validated triage decisions, one per input item. |

Raises:

| Type | Description |
|---|---|
| `ValidationError` | If the structured output cannot be validated into a `TriageOutput`. |
writer
¶
Writer agent — produces a full article draft from a brief and research dossier.
Pure generation agent with no tool use. Constructs a structured user message
from the ArticleBrief and ResearchDossier, calls the LLM, and parses
the response into a validated WriterOutput.
write_article(brief, dossier, parent_cornerstone=None, *, rewrite_instructions=None, article_id=None)
async
¶
Generate an article draft from a brief and research dossier.
Args:
brief: The article brief specifying topic, structure, and SEO targets.
dossier: Research output with sources, evidence, and patient perspective.
parent_cornerstone: Optional parent cornerstone article text for internal linking context.
rewrite_instructions: Optional instructions for rewriting a previous draft.
article_id: Optional article ID for cost tracking.
Returns:
A validated WriterOutput containing the draft, meta description,
SEO title, and suggested slug.
Raises:
ValueError: If the LLM response cannot be parsed as WriterOutput.