Context & History
Every year millions of scientific papers appear, widening the gap between the volume of published research and what any individual can read. The original Consensus platform acted as a vertical search engine, indexing papers and providing citation‑backed summaries. While useful, that approach left users to do the heavy work of interpreting and connecting results. To close the gap, the team redesigned the product around a multi‑agent workflow called Scholar Agent, which mirrors how a researcher plans, searches, reads, and synthesises evidence. This shift, powered by GPT‑5 and the Responses API, moves the system from simple retrieval toward a full‑featured research companion.
Implementation & Best Practices
Before constructing the individual agents, outline the end‑to‑end workflow: define the research question format, decide which evidence sources the system may access, set quality thresholds for citation relevance, and map each stage to a dedicated agent. Establish a routing layer that directs sub‑tasks to the appropriate agent and handles fallback when no suitable evidence exists. Once this roadmap is clear, you can develop each agent with a narrow focus, enforce strict tool‑calling contracts, and build evaluation pipelines that check citation traceability and factual accuracy.
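A minimal sketch of such a routing layer follows, assuming sub-tasks are tagged with a stage name and search results arrive as dictionaries with "source" and "score" keys. WorkflowConfig, SubTask, Router, no_evidence_fallback, the source names, and the 0.6 threshold are all hypothetical and chosen for illustration; the article does not describe how the Consensus system implements routing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class WorkflowConfig:
    # Illustrative values only; none of these come from the Consensus system.
    allowed_sources: Tuple[str, ...] = (
        "consensus_index",
        "private_library",
        "citation_graph",
    )
    min_relevance: float = 0.6  # hypothetical citation-relevance threshold


@dataclass(frozen=True)
class SubTask:
    stage: str     # e.g. "plan", "search", "read", "analyse"
    payload: dict  # stage-specific input


Handler = Callable[[dict], dict]


def no_evidence_fallback(task: SubTask) -> dict:
    """Hypothetical fallback used when a stage cannot produce usable evidence."""
    return {"status": "no_evidence", "stage": task.stage}


class Router:
    """Directs each sub-task to the handler registered for its stage."""

    def __init__(self, config: WorkflowConfig, handlers: Dict[str, Handler]):
        self._config = config
        self._handlers = handlers

    def dispatch(self, task: SubTask) -> dict:
        handler = self._handlers.get(task.stage)
        if handler is None:
            # No agent is mapped to this stage.
            return no_evidence_fallback(task)
        result = handler(task.payload)
        if task.stage == "search":
            # Keep only hits from permitted sources that meet the threshold.
            kept = [
                hit for hit in result.get("hits", [])
                if hit["source"] in self._config.allowed_sources
                and hit["score"] >= self._config.min_relevance
            ]
            if not kept:
                return no_evidence_fallback(task)
            result = {**result, "hits": kept}
        return result
```

Keeping the thresholds in one configuration object means the quality bar can be tuned without touching any agent code.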
Agent Architecture Overview
The system consists of four core agents plus a routing controller. Each agent receives a concise instruction set and returns structured data that the next agent consumes. Keeping responsibilities separate reduces error propagation and makes debugging straightforward.
Planning Agent
The planning agent parses the user query, breaks it into sub‑questions, and decides which actions to perform next. Limiting its scope to planning keeps the model from drifting into content generation before any evidence has been retrieved. Key Takeaway: Clear planning limits hallucinations.
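One way to keep the planning stage narrow is to accept only a structured plan and reject anything else. The sketch below assumes the model emits JSON with a "steps" list; the schema, the PlanStep type, and the ALLOWED_ACTIONS set are hypothetical and not taken from the Scholar Agent implementation.

```python
import json
from dataclasses import dataclass
from typing import List

# Assumed action vocabulary: one entry per downstream agent.
ALLOWED_ACTIONS = {"search", "read", "analyse"}


@dataclass(frozen=True)
class PlanStep:
    sub_question: str
    action: str


def parse_plan(raw_json: str) -> List[PlanStep]:
    """Validate a planner response and return its steps.

    Raises ValueError if the plan is empty or names an action that no
    downstream agent handles.
    """
    data = json.loads(raw_json)
    steps = []
    for item in data.get("steps", []):
        action = item["action"]
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unexpected action: {action!r}")
        steps.append(PlanStep(sub_question=item["sub_question"], action=action))
    if not steps:
        raise ValueError("plan contains no steps")
    return steps
```

A plan that fails validation can be sent back to the planner or routed to the fallback path instead of reaching the search stage.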
Search Agent
Using the plan, this agent queries the Consensus index, the user’s private library, and the citation graph to retrieve relevant documents. It returns a list of paper IDs, titles, and relevance scores.
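Because the same paper can surface from more than one source, the three result lists need to be merged before they are handed on. The following is a sketch under assumptions: SearchHit, merge_hits, and the rule of keeping the highest score per paper ID are illustrative choices, not the documented behaviour of the search agent.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SearchHit:
    paper_id: str
    title: str
    relevance: float  # assumed to be comparable across sources
    source: str       # e.g. "consensus_index", "private_library"


def merge_hits(*result_lists: Iterable[SearchHit], top_k: int = 10) -> List[SearchHit]:
    """Deduplicate hits by paper ID and return the top_k by relevance."""
    best: Dict[str, SearchHit] = {}
    for hits in result_lists:
        for hit in hits:
            current = best.get(hit.paper_id)
            # Keep whichever copy of the paper scored highest.
            if current is None or hit.relevance > current.relevance:
                best[hit.paper_id] = hit
    ranked = sorted(best.values(), key=lambda h: h.relevance, reverse=True)
    return ranked[:top_k]
```

If scores from different sources are not on a common scale, they would need normalising before a merge like this is meaningful.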
Reading Agent
For each selected paper, the reading agent extracts key sections, methods, and results, converting them into a uniform JSON format. Papers can be processed in batches for efficiency.
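The uniform format could look something like the record below. This is only a sketch: the field names in PaperExtract and the batch helper are assumptions made for illustration, since the article does not specify the actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Iterator, List, Sequence


@dataclass
class PaperExtract:
    # Hypothetical schema; the real field names are not given in the text.
    paper_id: str
    key_sections: List[str] = field(default_factory=list)
    methods: str = ""
    results: str = ""


def to_uniform_json(extracts: Sequence[PaperExtract]) -> str:
    """Serialise extracts so every paper has the same keys in the same order."""
    return json.dumps([asdict(e) for e in extracts], sort_keys=True)


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` paper IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Every record carries the same keys even when a section is missing, so the analysis stage does not need per-paper special cases.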
Analysis Agent
The analysis agent synthesises the extracted data, creates outlines, and generates any required visualisations. It then assembles the final answer, ensuring every claim links back to a source in the context pack.
Responses API Integration
The routing layer calls the Responses API to invoke each agent as a separate request. This design gives fine‑grained cost control and lets developers monitor latency per step. Switching from Chat Completions to the Responses API also simplifies error handling because each call returns a structured response object.
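A per-agent call with latency tracking might be wrapped as follows. The sketch assumes `client` is an instance of `OpenAI` from the official `openai` Python package, whose `responses.create` method accepts `model`, `instructions`, and `input` and returns an object exposing `output_text`. StepResult and call_agent are hypothetical helper names, and the way Consensus actually structures these requests is not described in the article.

```python
import time
from dataclasses import dataclass
from typing import Any


@dataclass
class StepResult:
    agent: str
    text: str
    latency_s: float


def call_agent(
    client: Any,
    agent: str,
    instructions: str,
    user_input: str,
    model: str = "gpt-5",
) -> StepResult:
    """Invoke one agent as its own Responses API request and time it.

    `client` is assumed to behave like `openai.OpenAI()`; it is passed in
    so the function can be exercised with a stub in tests.
    """
    start = time.perf_counter()
    response = client.responses.create(
        model=model,
        instructions=instructions,  # the agent's narrow instruction set
        input=user_input,           # output of the previous stage
    )
    elapsed = time.perf_counter() - start
    return StepResult(agent=agent, text=response.output_text, latency_s=elapsed)
```

Because each stage produces its own StepResult, slow or expensive steps can be identified individually rather than inferred from end-to-end timings.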
Evaluation and Hallucination Control
After the answer is generated, an automated evaluator checks that every citation appears in the context pack and that the answer does not contain unsupported statements. If the quality check fails, the system returns a polite refusal with suggestions for query refinement.
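The citation check can be approximated with a few lines of string processing. The sketch below assumes citations appear inline as bracketed paper IDs such as [p1] and treats any sentence without a citation as unsupported; both the marker format and that heuristic are assumptions, as are the names evaluate_answer, respond, and the refusal wording.

```python
import re
from typing import Dict, List, Set

# Assumed inline citation format: a paper ID in square brackets, e.g. [p1].
CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")

REFUSAL = (
    "I could not find enough supporting evidence to answer this reliably. "
    "Try narrowing the question or naming a specific population or outcome."
)


def evaluate_answer(answer: str, context_pack: Set[str]) -> Dict[str, List[str]]:
    """Report citations missing from the context pack and uncited sentences."""
    # Crude sentence split on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    unknown: List[str] = []
    uncited: List[str] = []
    for sentence in sentences:
        cited = CITATION.findall(sentence)
        if not cited:
            uncited.append(sentence)
        unknown.extend(c for c in cited if c not in context_pack)
    return {"unknown_citations": unknown, "uncited_sentences": uncited}


def respond(answer: str, context_pack: Set[str]) -> str:
    """Return the answer if it passes both checks, otherwise a refusal."""
    report = evaluate_answer(answer, context_pack)
    if report["unknown_citations"] or report["uncited_sentences"]:
        return REFUSAL
    return answer
```

A string-level check like this only verifies traceability; confirming that a cited paper actually supports the claim would need a separate, model-based or human review step.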
Model Selection Guidance
Choosing a model that balances context length with tool‑calling reliability is essential. For most research tasks, GPT‑5’s extended context window and stable tool‑calling behaviour provide a good baseline, as discussed in Choosing the right AI model for your project. When newer models become available, test them against the same evaluation suite before replacing the production model.