## Snapshot
|  | Context7 | Ref |
|---|---|---|
| Approach | Simple batch retrieval | Iterative search + read with sessions |
| Token Usage | 10K tokens per query | 50-70% savings on average, up to 95% |
| Tools | resolve-library-id, get-library-docs | ref_search_documentation, ref_read_url |
| Advanced MCP features | None | Stateful search session, pre-built prompts |
| Scrape any URL | No | Yes |
## Search Philosophy
### Context7’s Approach
Context7 uses a naive RAG strategy: given a query, it fetches the most relevant documents up to a fixed token budget (typically 10K). This “dump and hope” approach assumes whatever you need is somewhere in that bundle (the flow is sketched in code after the lists below).

Strengths:
- Simple, predictable behavior
- One of the most popular MCP servers, and a good demonstration of why MCP is valuable

Weaknesses:
- Doesn’t match how agents (or humans) actually search, which is iterative
- The fixed token budget may be too much or too little for a given query
- Costly to iterate (another 10K tokens per retry)
- Cannot avoid returning duplicate results across queries
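To make the pattern concrete, here is a minimal sketch of the batch-retrieval flow. It assumes a generic `callTool` helper standing in for whatever MCP client or host you use; the tool names come from the snapshot table above, but the argument names are illustrative, not Context7’s exact schema:

```typescript
// Hypothetical helper standing in for your MCP client's tool-call API.
declare function callTool(name: string, args: Record<string, unknown>): Promise<string>;

async function fetchDocsContext7(library: string, topic: string): Promise<string> {
  // Step 1: resolve a free-form library name to Context7's internal ID.
  const libraryId = await callTool("resolve-library-id", { libraryName: library });

  // Step 2: pull one fixed-size bundle of docs and hope the answer is in it.
  // The full budget is spent whether the question needs 500 tokens or 20K,
  // and a retry costs the same amount again.
  return callTool("get-library-docs", {
    libraryId,      // illustrative argument names
    topic,
    tokens: 10_000, // fixed per-query budget
  });
}
```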
### Ref’s Approach
Ref uses agentic search backed by MCP sessions: it exposes search and read tools (ref_search_documentation and ref_read_url), allowing agents to:
- Issue queries and get an overview of results
- Selectively read only the documents that look relevant
- Iterate efficiently, with session state carried between calls (sketched in code below)

Session-powered features:
- Never returns the same link twice - agents can refer back to prior results already in context
- On-the-fly extraction - large pages are automatically filtered to the relevant sections (e.g., 90K token Figma docs → 5K relevant tokens)
- Pre-fetching - search results are cached for faster reads

Results:
- 50-70% average token savings vs the 10K-per-query baseline
- Up to 95% savings on some queries (500 tokens vs 10K)
- No drop in recall quality
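Here is a sketch of the iterative flow under the same assumptions as the earlier block: `callTool` is a hypothetical MCP client helper, the argument names are illustrative, and `pickRelevantUrls` / `enoughContext` are stand-ins for the judgment the calling agent applies, not part of Ref’s API:

```typescript
// Hypothetical helper standing in for your MCP client's tool-call API.
declare function callTool(name: string, args: Record<string, unknown>): Promise<string>;

// Stand-ins for agent judgment; not part of Ref's API.
declare function pickRelevantUrls(overview: string): string[];
declare function enoughContext(docs: string[]): boolean;

async function fetchDocsRef(query: string): Promise<string[]> {
  const collected: string[] = [];

  // Search, skim the overview, read only what looks relevant, repeat.
  for (let attempt = 0; attempt < 3; attempt++) {
    // Server-side session state means a repeated search never resurfaces
    // links the agent has already been shown.
    const overview = await callTool("ref_search_documentation", { query });

    const urls = pickRelevantUrls(overview);
    if (urls.length === 0) break;

    for (const url of urls) {
      // Reads come back filtered to the sections relevant to the search,
      // not the raw multi-thousand-token page.
      collected.push(await callTool("ref_read_url", { url }));
    }

    if (enoughContext(collected)) break; // the agent decides when to stop
  }
  return collected;
}
```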
## Evaluation Metrics
Ref optimizes for token usage because:
- Tokens cost money
- Context rot: irrelevant tokens degrade output quality
- Agents build context over multiple searches, so session-level metrics matter more than single-query precision (a rough worked example follows this list)
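As a rough illustration of why session-level accounting matters, here is the arithmetic using only the numbers quoted in this post (10K tokens per query for the fixed-budget approach, 50-70% average savings for Ref); the session length is an assumption:

```typescript
// Illustrative arithmetic only; 5 queries per session is an assumption.
const queriesPerSession = 5;

// Fixed-budget retrieval: every query costs the full bundle, retries included.
const naiveTokens = queriesPerSession * 10_000; // 50,000 tokens

// Iterative search + read at the quoted 50-70% average savings.
const refTokensAt50 = naiveTokens * 0.5; // 25,000 tokens remaining at 50% savings
const refTokensAt70 = naiveTokens * 0.3; // 15,000 tokens remaining at 70% savings

console.log({ naiveTokens, refTokensAt50, refTokensAt70 });
```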
## Why Ref Wins
**Dramatic token savings:** 50-70% average reduction vs naive RAG approaches, with up to 95% savings on some queries. This translates into real cost savings and better output quality through reduced context rot.

**Matches how agents work:** Ref’s search + read tools align with how frontier models are trained. OpenAI explicitly requires this pattern for Deep Research integration, signaling this is the future of agentic search.

**Session-powered intelligence:**
- Never returns duplicate links across queries
- Automatically extracts relevant sections from large docs (e.g., 90K token pages → 5K relevant tokens)
- Enables efficient iteration without re-fetching prior results