
⚙️ PageIndex Python SDK

Initialize

To get started, visit the PageIndex Developer Dashboard and generate your API key.

pip install -U pageindex

from pageindex import PageIndexClient

pi_client = PageIndexClient(api_key="YOUR_API_KEY")

Submit a Document

Upload a PDF for processing. The returned doc_id is used for subsequent operations.

result = pi_client.submit_document("./2023-annual-report.pdf")
doc_id = result["doc_id"]

Check if processing has completed:

status = pi_client.get_document(doc_id)["status"]
if status == "completed":
    print('Document processing completed')
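
Since the status check is a one-off call, a script that submits a document and then needs its results will typically poll until processing finishes. The following is a minimal sketch of such a loop, not part of the SDK: the helper name `wait_until_completed` and the `timeout` and `interval` parameters are hypothetical, and the only SDK behavior it relies on is `get_document(doc_id)["status"]` returning `"completed"` as shown above. Other status values, such as a failure state, are not handled here because this page does not describe them.

```python
import time


def wait_until_completed(client, doc_id, timeout=600.0, interval=5.0):
    """Poll get_document() until the status is "completed" or time runs out.

    `client` is any object with a get_document(doc_id) method that returns
    a dict containing a "status" key (hypothetical helper, not an SDK call).
    Returns True if processing completed, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = client.get_document(doc_id)["status"]
        if status == "completed":
            return True
        if time.monotonic() >= deadline:
            return False
        # Wait before asking again to avoid hammering the API.
        time.sleep(interval)
```

With the client from the Initialize section, this could be called as `wait_until_completed(pi_client, doc_id)` before requesting the tree or starting a chat.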

See Document Processing for full parameters and response format.

Chat API (beta)

Ask questions about one or more processed documents. The Chat API runs a full agentic, reasoning-based RAG workflow under the hood using PageIndex.

Chat with a specific document

response = pi_client.chat_completions(
    messages=[{"role": "user", "content": "What are the key findings in this document?"}],
    doc_id="pi-abc123def456",
)
print(response["choices"][0]["message"]["content"])

Chat with multiple documents

response = pi_client.chat_completions(
    messages=[{"role": "user", "content": "Compare these two documents on the results"}],
    doc_id=["pi-abc123def456", "pi-abc123ghi789"],
)
print(response["choices"][0]["message"]["content"])
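
Both calls share the same shape, so the request and the answer extraction can be wrapped in one small function. This is a sketch under stated assumptions: the helper name `ask` is hypothetical, and it assumes the response is a dict with the `choices[0]["message"]["content"]` layout used in the examples above. It passes `doc_id` through unchanged, so either a single ID string or a list of IDs works.

```python
def ask(client, question, doc_id):
    """Send one user question and return the text of the first choice.

    `doc_id` may be a single document ID or a list of IDs, matching the
    two chat_completions() calls shown above (hypothetical helper).
    """
    response = client.chat_completions(
        messages=[{"role": "user", "content": question}],
        doc_id=doc_id,
    )
    choices = response["choices"]
    if not choices:
        # Fail with a clear message instead of an IndexError.
        raise ValueError("chat response contained no choices")
    return choices[0]["message"]["content"]
```

For example, `ask(pi_client, "What are the key findings in this document?", "pi-abc123def456")` would return the answer text directly.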

See Chat API (beta) for the full reference.

Want results instantly with streaming? The Chat API also supports streaming responses.

You can build a customized agentic retrieval API by prompting the Chat API. See this notebook for an example.

Get Document Tree Index

Get the hierarchical tree index generated for a processed document.

tree_result = pi_client.get_tree(doc_id)["result"]
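
Because the index is hierarchical, a recursive walk is a natural way to inspect it. The sketch below turns a tree into an indented outline. The node layout is an assumption, not something this page specifies: it assumes `tree_result` is a list of dicts, each with a `"title"` key and an optional `"nodes"` list of child nodes. Check the Document Processing reference for the actual field names before relying on it; `outline` is a hypothetical helper.

```python
def outline(nodes, depth=0):
    """Return the tree as a list of indented title strings.

    Assumes each node is a dict with a "title" key and an optional
    "nodes" list of children (assumed layout, see the lead-in).
    """
    lines = []
    for node in nodes:
        title = node.get("title", "(untitled)")
        lines.append("  " * depth + title)
        # Recurse into child nodes; treat a missing or None value as empty.
        lines.extend(outline(node.get("nodes") or [], depth + 1))
    return lines
```

Printing the result with `print("\n".join(outline(tree_result)))` gives a quick table-of-contents view of the document.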

See Document Processing for more details on the tree structure.

SDK Full Reference

Also see OCR for structure-preserving PDF to markdown conversion.
