⚙️ PageIndex Python SDK
Submit documents for PageIndex Processing
Chat with your PageIndex-processed documents
Initialize
To get started, visit the PageIndex Developer Dashboard and generate your API key.
pip install -U pageindex

from pageindex import PageIndexClient
pi_client = PageIndexClient(api_key="YOUR_API_KEY")
Submit a Document
Upload a PDF for processing. The returned doc_id is used for subsequent operations.
result = pi_client.submit_document("./2023-annual-report.pdf")
doc_id = result["doc_id"]
Check if processing has completed:
status = pi_client.get_document(doc_id)["status"]
if status == "completed":
    print('Document processing completed')
See Document Processing for full parameters and response format.
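Processing is asynchronous, so a single status check may run before the document is ready. The sketch below polls `get_document` until the status is "completed". It is only an illustration: the helper name `wait_until_completed`, the polling interval, and the timeout are assumptions rather than part of the SDK, and "completed" is the only status value taken from this page.

```python
import time


def wait_until_completed(client, doc_id, poll_seconds=5.0, timeout_seconds=600.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll client.get_document(doc_id) until its status is "completed".

    Returns True once processing has completed, or False if the timeout
    elapses first. poll_seconds and timeout_seconds are illustrative
    defaults (assumptions), not values recommended by PageIndex.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = client.get_document(doc_id)["status"]
        if status == "completed":
            return True
        if clock() >= deadline:
            # Give up; any status other than "completed" is treated as not ready.
            return False
        sleep(poll_seconds)
```

With the client created above, a call would look like `wait_until_completed(pi_client, doc_id)`. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.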
Chat API (beta)
Ask questions about one or more processed documents. The Chat API runs a full agentic, reasoning-based RAG workflow under the hood using PageIndex.
Chat with a specific document
response = pi_client.chat_completions(
    messages=[{"role": "user", "content": "What are the key findings in this document?"}],
    doc_id="pi-abc123def456"
)
print(response["choices"][0]["message"]["content"])
Chat with multiple documents
response = pi_client.chat_completions(
    messages=[{"role": "user", "content": "Compare these two documents on the results"}],
    doc_id=["pi-abc123def456", "pi-abc123ghi789"]
)
print(response["choices"][0]["message"]["content"])
See Chat API (beta) for the full reference.
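Both calls above share the same shape: one user message goes in, and the answer text is read from the first choice. A small wrapper can capture that pattern. This is a minimal sketch: `ask` is a hypothetical helper name, not an SDK function, and it relies only on the `chat_completions` arguments and response fields shown on this page.

```python
def ask(client, question, doc_id):
    """Send one user question and return the answer text.

    doc_id may be a single document ID string or a list of IDs,
    mirroring the single-document and multi-document calls above.
    """
    response = client.chat_completions(
        messages=[{"role": "user", "content": question}],
        doc_id=doc_id,
    )
    choices = response["choices"]
    if not choices:
        # Defensive check (an assumption): fail clearly rather than with an IndexError.
        raise ValueError("chat response contained no choices")
    return choices[0]["message"]["content"]
```

For example, `ask(pi_client, "What are the key findings in this document?", doc_id)` would return the same string that the `print` calls above display.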
Want results instantly with streaming? The Chat API also supports streaming responses.
You can build a customized agentic retrieval API by prompting the Chat API. See this notebook for an example.
Get Document Tree Index
Get the hierarchical tree index generated for a processed document.
tree_result = pi_client.get_tree(doc_id)["result"]
See Document Processing for more details on the tree structure.
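To inspect a hierarchical index like this, it often helps to print it as an indented outline. The sketch below does that under explicit assumptions: this page does not specify the node format, so the code assumes the result is a list of dictionaries, each with a title under `"title"` and child nodes under `"nodes"`. Both key names are parameters so they can be changed to match the actual response, and `iter_tree` and `format_outline` are hypothetical helpers, not SDK functions.

```python
def iter_tree(nodes, depth=0, children_key="nodes"):
    """Yield (depth, node) for every node, parents before their children."""
    for node in nodes:
        yield depth, node
        # A missing or empty child list means the node is a leaf.
        children = node.get(children_key) or []
        yield from iter_tree(children, depth + 1, children_key)


def format_outline(nodes, title_key="title", children_key="nodes"):
    """Return the tree as text, indenting two spaces per level.

    title_key and children_key are assumed field names; adjust them
    to whatever the tree response actually contains.
    """
    lines = []
    for depth, node in iter_tree(nodes, children_key=children_key):
        lines.append("  " * depth + str(node.get(title_key, "(untitled)")))
    return "\n".join(lines)
```

If those assumptions hold for the returned data, `print(format_outline(tree_result))` would show the document's section hierarchy.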
SDK Full Reference
- Document Processing: Submit documents for PageIndex processing.
- Chat API (beta): Chat with your PageIndex-processed documents.
Also see OCR for structure-preserving PDF to markdown conversion.