🚀 Quickstart: PageIndex SDK

To get started, get your 🔑 API key from the PageIndex API Dashboard.

Below is a brief introduction to using the Python SDK, including example code for common operations.

Install the SDK

pip install pageindex

Initialize the Client

from pageindex import PageIndexClient
pi_client = PageIndexClient(api_key="YOUR_API_KEY")
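To keep the key out of source code, one option is to read it from an environment variable. The sketch below is only an illustration: the helper `load_api_key` and the variable name `PAGEINDEX_API_KEY` are hypothetical and not part of the SDK.

```python
import os


def load_api_key(env_var: str = "PAGEINDEX_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is unset.

    Both this helper and the default variable name are assumptions made
    for illustration; the SDK itself only needs the key as a string.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key


# Usage (requires `pip install pageindex`):
# from pageindex import PageIndexClient
# pi_client = PageIndexClient(api_key=load_api_key())
```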

Submit a document

result = pi_client.submit_document("./2023-annual-report.pdf")
doc_id = result["doc_id"]

Check processing status

status = pi_client.get_document(doc_id)["status"]
if status == "completed":
    print('File processing completed')
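Processing may take a while, so a single status check is often not enough. A minimal polling sketch follows. The helper `wait_until_completed`, the polling interval, and the timeout are assumptions rather than SDK features. Only the `"completed"` status comes from the example above; any other status value is treated as "not ready yet", and failure states are not handled because they are not documented here.

```python
import time
from typing import Callable


def wait_until_completed(
    get_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 600.0,
) -> bool:
    """Call get_status() repeatedly until it returns "completed".

    Returns True once the status is "completed", or False if `timeout`
    seconds pass first. `interval` is the pause between checks.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == "completed":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage with the client from the steps above:
# ready = wait_until_completed(lambda: pi_client.get_document(doc_id)["status"])
```

Passing a callable instead of the client keeps the helper independent of the SDK, so it can be exercised with a stub.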

Once document processing is completed, the following services will be available.

PageIndex Chat (beta)

Ask questions directly about one or more documents. The Chat API includes a full reasoning-based RAG workflow using PageIndex.

Chat with a specific document

You can chat with a specific document by providing its document ID with the query.

response = pi_client.chat_completions(
    messages=[{"role": "user", "content": "What are the key findings in this document?"}],
    doc_id="pi-abc123def456"
)
print(response["choices"][0]["message"]["content"])

Get page-level references by prompting

To get page-level references, simply ask for them in the prompt.

response = pi_client.chat_completions(
    messages=[{"role": "user", "content": "Which page discusses the evaluation methods?"}],
    doc_id="pi-abc123def456"
)
print(response["choices"][0]["message"]["content"])

You can also get a customized retrieval API by simply prompting the Chat API. See this notebook for an example.

Compare multiple documents

You can chat with multiple documents by providing a list of document IDs.

response = pi_client.chat_completions(
    messages=[{"role": "user", "content": "Compare these two documents on the results"}],
    doc_id=["pi-abc123def456", "pi-abc123ghi789"]
)
print(response["choices"][0]["message"]["content"])
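The chat examples all follow the same pattern: send one user message with a `doc_id`, then read the first choice's message content. That pattern could be wrapped in a small helper such as the one sketched here. The function `ask` is hypothetical and not part of the SDK; it relies only on the `chat_completions` call and the response shape shown in the examples above.

```python
from typing import Any, List, Union


def ask(client: Any, question: str, doc_id: Union[str, List[str]]) -> str:
    """Send one user question about one or more documents and return the answer text.

    `client` is expected to expose chat_completions(messages=..., doc_id=...)
    as in the examples above. `doc_id` may be a single ID or a list of IDs.
    """
    response = client.chat_completions(
        messages=[{"role": "user", "content": question}],
        doc_id=doc_id,
    )
    # Response shape taken from the examples: choices -> message -> content.
    return response["choices"][0]["message"]["content"]


# Usage:
# print(ask(pi_client, "Which page discusses the evaluation methods?", "pi-abc123def456"))
# print(ask(pi_client, "Compare these two documents on the results",
#           ["pi-abc123def456", "pi-abc123ghi789"]))
```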

Don’t want to wait for generation? See PageIndex Chat API for streaming responses.

PageIndex Tree

Retrieve the PageIndex tree once processing is complete.

tree_result = pi_client.get_tree(doc_id)["result"]

See PageIndex tree generation for more details.
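Once retrieved, the tree can be walked recursively, for example to print an outline. The sketch below assumes that `tree_result` is a list of node dictionaries, each with a `title` string and an optional `nodes` list of children. Those field names are assumptions that this page does not confirm, so check them against the tree generation documentation. The helper `iter_titles` is hypothetical.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_titles(
    nodes: List[Dict[str, Any]], depth: int = 0
) -> Iterator[Tuple[int, str]]:
    """Yield (depth, title) pairs in depth-first, document order.

    Assumes each node is a dict with a "title" key and an optional
    "nodes" key holding child nodes (assumed field names).
    """
    for node in nodes:
        yield depth, node.get("title", "")
        # A missing or None "nodes" value is treated as "no children".
        yield from iter_titles(node.get("nodes") or [], depth + 1)


# Usage with the result from get_tree:
# for depth, title in iter_titles(tree_result):
#     print("  " * depth + title)
```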

Additional Features

👉 See the full API Reference for more examples including:

  • PageIndex OCR: Convert PDFs to structure-preserving Markdown.

