The Data Map - How AI Agents Navigate Financial Data with agentii.ai
The hardest problem in AI-powered financial research isn't getting data — it's getting the right data.
An SEC 10-K filing can span 200+ pages. A clinical trial record contains dozens of nested data points. A single earnings calendar event links to multiple filing types across different fiscal periods. Throw "search for LLY revenue guidance" at a naive AI agent and it either drowns in irrelevant pages or misses the critical disclosure buried on page 147.
agentii.ai solves this with two things: a data map that AI agents can navigate programmatically, and citation watermarks that give every data point forensic provenance back to its source SEC filing page.
The Four-Verb Taxonomy
Every agentii.ai API endpoint name starts with one of four verb prefixes. Your agent learns this taxonomy once and can then navigate the entire data surface:
- search_: Find things (search_documents, search_xbrl_facts, search_companies, search_earnings_calendar). These are your entry points.
- list_: Discover what's available (list_coverage, list_sources, list_domains). Before you search, you orient.
- read_: Get the full content (read_source_outline, read_source_pages). Use the outline to choose pages, and call read_source_pages only when you know exactly which pages you need.
- get_: Retrieve structured records (get_company_profile, get_company_financials). Single-entity lookups.
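To make the taxonomy concrete, here is a minimal Python sketch that groups tool names by their verb prefix. The tool names come from the list above; the helper `group_by_verb` and the "other" bucket are illustrative assumptions, not part of the agentii.ai API.

```python
from collections import defaultdict

# Verb prefixes taken from the taxonomy described above.
VERBS = ("search", "list", "read", "get")


def group_by_verb(tool_names):
    """Group tool names by leading verb; unknown prefixes go under 'other'."""
    groups = defaultdict(list)
    for name in tool_names:
        verb = name.split("_", 1)[0]
        groups[verb if verb in VERBS else "other"].append(name)
    return dict(groups)


tools = [
    "search_documents", "search_xbrl_facts", "list_coverage",
    "read_source_outline", "read_source_pages", "get_company_profile",
]
grouped = group_by_verb(tools)
print(grouped["read"])
```

An agent could apply the same grouping to whatever tools/list returns to decide which tools are entry points and which are follow-up reads.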
The Three-Layer Agentic Search Protocol
For unstructured documents (SEC filings, clinical trial data), agentii.ai provides a three-layer protocol that uses roughly 99% fewer tokens than naive page-by-page loading:
Layer 1 — Document Discovery: Use search_documents to find candidate filings without reading content. Pre-computed labels classify 8-K disclosure types, so your agent knows a filing is about "regulation-fd" or "material-impairment" before opening it.
Layer 2 — Page Map: Use read_source_outline to scan ALL pages' descriptions and keywords without loading page content. This returns a page map — typically 200 entries for a large 10-K — letting your agent pinpoint the 3-5 pages that actually matter.
Layer 3 — Deep Read: Use read_source_pages to load full page content for ONLY those selected pages. Every data point includes a citation_id in the format agentii://source/... that resolves to the exact SEC filing page.
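The three layers can be sketched in Python as one function that takes a generic `call_tool(name, args)` callable. Only the three tool names come from the text above. The argument names (`query`, `source_id`, `pages`), the response fields (`page`, `description`) and the keyword filter are assumptions made for illustration, and `fake_call_tool` is a stand-in that returns canned data rather than calling the real service.

```python
def three_layer_read(call_tool, query, keywords, max_pages=5):
    """Discover a filing, scan its page map, then load only matching pages."""
    # Layer 1: find candidate filings without reading their content.
    docs = call_tool("search_documents", {"query": query})
    if not docs:
        return []
    source_id = docs[0]["source_id"]  # hypothetical field name

    # Layer 2: scan page descriptions only, no page content is loaded here.
    outline = call_tool("read_source_outline", {"source_id": source_id})
    wanted = [
        entry["page"]
        for entry in outline
        if any(k.lower() in entry["description"].lower() for k in keywords)
    ][:max_pages]
    if not wanted:
        return []

    # Layer 3: load full content for the selected pages only.
    return call_tool("read_source_pages", {"source_id": source_id, "pages": wanted})


def fake_call_tool(name, args):
    """Canned responses so the sketch runs offline; shapes are assumed."""
    if name == "search_documents":
        return [{"source_id": "demo-10k"}]
    if name == "read_source_outline":
        return [
            {"page": 1, "description": "Cover page"},
            {"page": 2, "description": "Revenue guidance and outlook"},
            {"page": 3, "description": "Legal proceedings"},
        ]
    if name == "read_source_pages":
        return [
            {"page": p, "content": f"(text of page {p})",
             "citation_id": "agentii://source/..."}
            for p in args["pages"]
        ]
    raise ValueError(f"unknown tool: {name}")


pages = three_layer_read(fake_call_tool, "LLY revenue guidance", ["revenue"])
print([p["page"] for p in pages])
```

The point of the design is that the expensive call, read_source_pages, runs once and only for pages the outline already flagged as relevant.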
Citation Watermarks: Trust Infrastructure
Every API response from agentii.ai includes citation watermarks — agentii://source/... URIs that resolve to actual SEC filing pages. This is not a "sources" section at the bottom of a report. It's per-data-point provenance.
When Claude Code produces a DCF model with "LLY 2024 revenue: $45.0B," the agentii://source/... citation tells you exactly which 10-K page that number came from. No competitor — FactSet, S&P Global, Polygon, Yahoo Finance — offers page-level data provenance.
This matters for three reasons:
- Auditability: Every number can be traced to its source in under 10 seconds
- Trust: AI-generated research with forensic provenance is more credible than black-box outputs
- Compliance: For institutional users, citation watermarks create an audit trail for AI-assisted analysis
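One way to enforce per-data-point provenance on the consuming side is to reject any figure that arrives without a citation. The Python sketch below assumes each data point is a dict with a `citation_id` key, as described for Layer 3. The dict layout, the `uncited` helper and the sample records are illustrative, and the part of the URI after agentii://source/ is treated as opaque.

```python
CITATION_PREFIX = "agentii://source/"


def uncited(data_points):
    """Return data points whose citation_id is missing or lacks the prefix."""
    return [
        dp for dp in data_points
        if not str(dp.get("citation_id", "")).startswith(CITATION_PREFIX)
    ]


# Made-up sample records; the second one deliberately has no citation.
points = [
    {"label": "revenue", "value": "45.0B", "citation_id": "agentii://source/..."},
    {"label": "example_metric", "value": "n/a"},
]
missing = uncited(points)
print([dp["label"] for dp in missing])
```

A pipeline could run a check like this before a number is allowed into a model or report, so that uncited values are flagged instead of silently used.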
Getting Started
Your AI agent is one .mcp.json file away from 24+ financial data tools:
{
  "mcpServers": {
    "agentii": {
      "url": "https://mcp.agentii.ai/mcp",
      "headers": { "Authorization": "Bearer ${AGENTII_API_KEY}" }
    }
  }
}
Save this as .mcp.json in your agent's project root. Restart your agent. Run tools/list. You'll see 24+ tools auto-discovered.
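If the tools do not show up, a common cause is an unset API key. The Python sketch below checks that the config is valid JSON and reports any `${NAME}` placeholders that have no value in the environment. It assumes your agent expands `${...}` references from environment variables, which depends on the agent you use; the helper `missing_env_vars` is hypothetical.

```python
import json
import re

# Matches ${NAME} placeholders such as ${AGENTII_API_KEY}.
ENV_REF = re.compile(r"\$\{(\w+)\}")


def missing_env_vars(config_text, env):
    """Return names referenced as ${NAME} in the config that env does not set."""
    json.loads(config_text)  # raises ValueError if the config is not valid JSON
    names = ENV_REF.findall(config_text)
    return sorted({n for n in names if not env.get(n)})


config_text = '''{
  "mcpServers": {
    "agentii": {
      "url": "https://mcp.agentii.ai/mcp",
      "headers": { "Authorization": "Bearer ${AGENTII_API_KEY}" }
    }
  }
}'''

print(missing_env_vars(config_text, {}))
print(missing_env_vars(config_text, {"AGENTII_API_KEY": "test"}))
```

In practice you would read the text from your .mcp.json file and pass `os.environ` as the second argument.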
Then try your first query: "Use agentii.ai to search LLY's latest 10-K filing and tell me the key financial highlights."
The agent will use the three-layer protocol — discover the filing, scan the page map, deep-read the relevant pages — and return a citation-backed summary in seconds.