Portfolio by Johnny Rice · Informatics demo

Opening the Black Box: Making AI Agent Decisions Visible

Users cannot trust what they cannot see. Trace panels, provenance badges, and structured answer headings make agent reasoning inspectable.

transparency · trace · provenance

When a human analyst writes a report, you can ask them where they got a number. When an AI agent writes one, the number arrives in a paragraph with no footnote, no query log, and no way to distinguish a database lookup from a confident guess. The black-box problem is not that the agent is wrong; it is that you have no way to check.

In regulated industries, this is a non-starter. Pharmaceutical companies reviewing competitive intelligence, clinicians evaluating trial options, and compliance officers auditing data pipelines all need to see the receipts.

What "visible" means in practice

Visibility in an agentic system has three layers:

  1. What tools ran (transparency)
  2. Where the data came from (provenance)
  3. Which parts of the answer are fact vs. inference (structure)

ClariTrial implements all three.

Layer 1: the trace panel

Every assistant message in ClariTrial carries metadata: the model ID, the prompt version, the user's selected intent mode, and a step-by-step tool trace. Users can expand a "How this answer was built" panel below any response to see exactly which tools the lead model called, in what order, with a truncated preview of each tool's input.
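The metadata described above can be sketched as a small data model. This is a minimal sketch, not ClariTrial's actual schema; the class and field names are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    """One entry in the tool trace (hypothetical shape)."""
    tool: str                  # tool name, e.g. "trial_discovery"
    input_preview: str         # truncated preview of the tool's input
    nested: list["ToolCall"] = field(default_factory=list)  # specialist sub-calls

@dataclass
class MessageMeta:
    """Per-message metadata carried by every assistant response."""
    model_id: str
    prompt_version: str
    intent_mode: str           # the user's selected intent mode
    trace: list[ToolCall] = field(default_factory=list)
```

Keeping the trace as structured data rather than preformatted text lets the UI decide how to render it, and lets a compliance layer inspect it programmatically.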

When the lead delegates to a specialist subagent (e.g., trial_discovery or evidence_synthesis), the specialist's internal tool calls are extracted and displayed as indented, nested lines. This means a user can see not just "the agent consulted a specialist" but "the specialist searched ClinicalTrials.gov for PROTAC AND phase 3, then fetched trial detail for NCT05233033."

The trace also adapts to context. When a response is served from cache (a repeated question that was already answered), the panel shows a "Cached" badge and explains that no live tool trace is available because the original pipeline ran on a previous request.
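Flattening a nested trace into display lines, with the cached fallback, might look like the following sketch. The dict keys ("tool", "input_preview", "nested") and the cached-badge wording are assumptions for illustration:

```python
def render_trace(trace, cached=False):
    """Flatten a (possibly nested) tool trace into indented display lines.

    `trace` is a list of dicts with hypothetical keys "tool",
    "input_preview", and "nested" (specialist sub-calls).
    """
    if cached:
        # No live trace exists: the original pipeline ran on a previous request.
        return ["Cached — no live tool trace for this response."]
    lines = []

    def walk(calls, depth):
        for call in calls:
            preview = call.get("input_preview", "")[:60]  # truncate long inputs
            lines.append("  " * depth + f'{call["tool"]}({preview})')
            walk(call.get("nested", []), depth + 1)       # indent specialist calls

    walk(trace, 0)
    return lines
```

The recursion is what makes specialist delegation visible: a subagent's calls render one level deeper than the lead's call that triggered them.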

Layer 2: provenance badges

Below each assistant message, colored source badges indicate which data sources contributed to the answer: ClinicalTrials.gov, AACT SQL, PubMed, OpenFDA, WHO ICTRP, or curated data. These are extracted from provenance objects attached to tool results at the source level, not inferred from the model's text.

This matters because the model might mention PubMed in its narrative without having actually queried PubMed. The badges reflect what the tools actually did, not what the model says it did.
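Extracting badges from tool-result provenance rather than from the narrative could look like this sketch, where the "provenance" and "source" keys are assumed names, not ClariTrial's actual fields:

```python
def collect_source_badges(tool_results):
    """Collect unique data-source badges from provenance objects attached
    to tool results, ignoring whatever the model's text claims."""
    seen = []
    for result in tool_results:
        for prov in result.get("provenance", []):
            source = prov.get("source")
            if source and source not in seen:
                seen.append(source)  # preserve first-seen order for display
    return seen
```

Because the badges come only from tool results, a model that name-drops PubMed without querying it produces no PubMed badge.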

Layer 3: answer structure

The system prompt requires the model to organize tool-grounded responses with three headings:

  • Facts: direct data pulls, tied to a source (NCT IDs, phases, enrollment numbers, SQL row fields, PubMed PMIDs).
  • Summary: aggregates computed from those facts (counts, sorted lists) without new claims.
  • Interpretation: mechanism hypotheses, tradeoffs, and "what this might mean" language, clearly labeled.

A post-response compliance check verifies that these headings appear when tools were called. If the model skips the structure, the trace panel shows a warning: "Facts/Summary/Interpretation headings not detected."
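A heading check of this kind can be sketched as a simple pattern match; the function name and the tolerance for markdown-style heading markers are assumptions:

```python
import re

REQUIRED_HEADINGS = ("Facts", "Summary", "Interpretation")

def check_answer_structure(answer_text, tools_were_called):
    """Post-response compliance check (sketch): when tools ran, verify the
    three required headings appear. Returns a warning string or None."""
    if not tools_were_called:
        return None  # structure is only required for tool-grounded answers
    missing = [h for h in REQUIRED_HEADINGS
               if not re.search(rf"^\s*#*\s*{h}\b", answer_text, re.MULTILINE)]
    if missing:
        return "Facts/Summary/Interpretation headings not detected."
    return None
```

The check is deliberately shallow: it verifies the headings exist, not that content is correctly classified under them, which keeps it cheap enough to run on every response.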

Why this matters beyond clinical trials

The three-layer pattern (tool trace, source provenance, answer structure) is portable. A financial AI agent could show which market data feeds it queried, badge the sources (Bloomberg, SEC filings, internal models), and separate reported figures from projected estimates. A legal research agent could trace which case law databases it searched, badge the jurisdictions, and separate holdings from dicta.

The implementation cost is modest: metadata on tool results, a serialization step that extracts and formats traces, and UI components that render them. The trust return is significant, because every time a user expands the trace panel and confirms the agent did what it claimed, confidence compounds.

The honest-limits principle

Visibility also means being honest about what you cannot show. ClariTrial's methodology sheet (accessible from the chat header) explains the system's source scope, notes that it does not ingest proprietary deal rooms or patent databases, and states that outputs are not a substitute for clinical or regulatory review.

Cached responses explicitly say they lack live trace data. Draft-analysis responses (where the model's language crosses from data reporting into guidance) trigger a visible banner: "Verify before acting."

Transparency is not just showing what the system did. It is also showing what the system did not do, and what the user should verify elsewhere.