🚀 Getting Started
PageIndex Cloud Service Architecture
PageIndex provides a cloud service. You upload documents to the PageIndex cloud, where they are processed into a tree-structured index. Once processing completes, you can chat with the document in two ways:
- Bring your own LLM or agent. Connect PageIndex as a retrieval tool via MCP, and your LLM calls it automatically.
- Use the LLM PageIndex provides. Call the Chat API to get answers directly; no LLM setup required on your side.
Your Documents
│
▼
┌───────────────────────────────┐
│ PageIndex Cloud │
│ Document Processing │
└───────────────┬───────────────┘
│
┌────────────┴────────────┐
▼ ▼
┌────────────────────────┐ ┌────────────────────────┐
│ Your LLM / Agent │ │ Chat API │
│ (via MCP) │ │ (PageIndex LLM) │
└────────────────────────┘  └────────────────────────┘

What is MCP? MCP (Model Context Protocol) is an open standard that lets LLM agents call external tools. If you are building your own AI agent with a framework such as LangChain, CrewAI, or the OpenAI Agents SDK, MCP lets your LLM call PageIndex as tools automatically, with no custom integration code needed. See the PageIndex MCP page for setup.
Step 1. Upload & Process a Document
To get started, visit the PageIndex Developer Dashboard and generate your API key.
This step is the same for both MCP and Chat API integrations.
Install the SDK
pip install -U pageindex

Initialize the client
from pageindex import PageIndexClient
pi_client = PageIndexClient(api_key="YOUR_API_KEY")

Submit a document
result = pi_client.submit_document("./2023-annual-report.pdf")
doc_id = result["doc_id"]

Check processing status
status = pi_client.get_document(doc_id)["status"]
if status == "completed":
    print("Document processing completed")

Once processing is complete, the document is available to both integrations below using the same doc_id.
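Processing time varies with document size, so in practice you poll the status until it flips to completed. A minimal sketch of such a loop, assuming the interval and timeout values are your choice (they are illustrative, not SDK defaults) and that a "failed" status exists for unprocessable documents (an assumption, not a documented value):

```python
import time

def wait_for_processing(client, doc_id, interval=5, timeout=600):
    """Poll get_document() until processing finishes, fails, or times out."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = client.get_document(doc_id)["status"]
        if status == "completed":
            return True
        if status == "failed":  # assumed status name; check the API reference
            raise RuntimeError(f"Processing failed for document {doc_id}")
        time.sleep(interval)
    raise TimeoutError(f"Document {doc_id} not ready after {timeout}s")
```

Call it as `wait_for_processing(pi_client, doc_id)` right after submitting; once it returns, the doc_id is ready for either integration below.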
See Document Processing for the full document management reference.
Step 2. Choose Your Integration
Option A: Use with Your Own LLM / Agent via MCP
Connect PageIndex to your own LLM or agent framework via MCP. Your LLM receives PageIndex as a callable tool and uses it automatically for document retrieval.
- Works with Claude Agent SDK, OpenAI Agents SDK, LangChain, CrewAI, Google ADK, and any MCP-compatible client.
- Your LLM stays in control — PageIndex handles the retrieval.
See the MCP Setup Guide for configuration and integration examples.
Option B: Chat API (PageIndex provides the LLM)
Use PageIndex’s own LLM to answer questions about your documents directly. No LLM setup required on your side.
response = pi_client.chat_completions(
messages=[{"role": "user", "content": "What are the key findings in this document?"}],
doc_id=doc_id
)
print(response["choices"][0]["message"]["content"])

See the Chat API reference for streaming support and more usage details.
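Because messages follows the familiar chat-completions shape, multi-turn conversations work by carrying the history forward on each call. A minimal sketch, assuming only the chat_completions call shown above (the ask helper is our own name, not part of the SDK):

```python
def ask(client, doc_id, history, question):
    """Append a user turn, call the Chat API, and record the assistant reply."""
    history.append({"role": "user", "content": question})
    response = client.chat_completions(messages=history, doc_id=doc_id)
    answer = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": answer})
    return answer
```

Each call sends the full history, so follow-up questions like "summarize that in one sentence" can refer back to earlier answers.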