🚀 Quickstart: PageIndex API
To get started with PageIndex, first get your 🔑 API key.
The PageIndex API consists of two main components:
- PageIndex Tree Generation: Upload a document to generate a PageIndex tree index.
- PageIndex Retrieval: Ask a query to retrieve relevant content from a document.
Below is a brief introduction to using the API, including example response formats.
🌲 PageIndex Tree Generation
Use this API to extract and return a PageIndex structure from a document.
Currently accepts PDF files only (more formats coming soon).
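Since only PDF files are accepted, a quick local check before uploading can save a wasted request. The sketch below is a heuristic, not part of the PageIndex API: `looks_like_pdf` is a hypothetical helper name, and it relies only on the fact that PDF files normally begin with the bytes `%PDF-`.

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file starts with the PDF header bytes `%PDF-`.

    This is a cheap pre-flight heuristic; it does not validate the rest
    of the file.
    """
    with Path(path).open("rb") as handle:
        return handle.read(5) == b"%PDF-"
```

A file that passes this check can still be rejected by the server, so the response to the upload request remains the authoritative answer.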
Endpoints:
Endpoint | Method | Description |
---|---|---|
https://api.vectify.ai/pageindex/ | POST | Submit a document to generate a PageIndex tree |
https://api.vectify.ai/pageindex/{doc_id}/ | GET | Get status and tree generation result of a document |
https://api.vectify.ai/pageindex/{doc_id}/ | DELETE | Delete a document and its PageIndex tree |
Example (Python):
Submit document for PageIndex tree generation
import requests

api_key = "YOUR_API_KEY"
file_path = "./2023-annual-report.pdf"

with open(file_path, "rb") as file:
    submit_response = requests.post(
        "https://api.vectify.ai/pageindex/",
        headers={"api_key": api_key},
        files={"file": file},
    )

doc_id = submit_response.json().get("doc_id")
Check processing status and retrieve result
status_response = requests.get(
    f"https://api.vectify.ai/pageindex/{doc_id}/",
    headers={"api_key": api_key},
)
status_data = status_response.json()

if status_data.get("status") == "completed":
    print("PageIndex Tree Structure:", status_data.get("result"))
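The status check above runs once, so it prints nothing if tree generation is still in progress. A minimal polling sketch follows, under stated assumptions: `fetch_tree_status` and `poll_until_completed` are hypothetical helper names, the interval and timeout defaults are arbitrary, and any status other than `"completed"` is treated as "still running" because this page does not list the other status values (including how failures are reported).

```python
import time


def fetch_tree_status(doc_id, api_key):
    """Fetch the current status payload for a document with one GET request."""
    # Imported lazily so the polling helper below has no hard dependency.
    import requests

    response = requests.get(
        f"https://api.vectify.ai/pageindex/{doc_id}/",
        headers={"api_key": api_key},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


def poll_until_completed(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() repeatedly until it reports status == "completed".

    Returns the final payload, or raises TimeoutError once `timeout`
    seconds have passed without completion.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_status()
        if data.get("status") == "completed":
            return data
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not completed after {timeout} seconds")
        sleep(interval)


# Usage (placeholder values):
# data = poll_until_completed(lambda: fetch_tree_status("YOUR_DOC_ID", "YOUR_API_KEY"))
# print("PageIndex Tree Structure:", data.get("result"))
```

Taking the fetch step as a callable keeps the loop independent of the HTTP client, so the same helper can wrap any status endpoint that returns a `status` field.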
🔎 PageIndex Retrieval
Use this API to retrieve relevant content from a document. Retrieval requires a completed PageIndex tree generation, identified by its doc_id.
Currently, only single-document retrieval is supported. Multi-document retrieval is coming soon. See also the Doc Search page for document search examples.
Endpoints:
Endpoint | Method | Description |
---|---|---|
https://api.vectify.ai/pageindex/{doc_id}?query=YOUR_QUERY_TEXT | GET | Submit a query to retrieve relevant content from a document |
https://api.vectify.ai/pageindex/retrieval/{retrieval_id}/ | GET | Get retrieval result and status |
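Because the query text travels in the URL's query string, characters such as spaces, `&`, and `?` need to be encoded before the request is sent. A minimal sketch using only the standard library; `build_retrieval_url` is a hypothetical helper name, not part of the PageIndex API:

```python
from urllib.parse import quote, urlencode


def build_retrieval_url(doc_id, query):
    """Build the query-submission URL with the query text safely encoded."""
    # Encode the path segment too, in case the ID contains reserved characters.
    base = f"https://api.vectify.ai/pageindex/{quote(doc_id, safe='')}"
    # urlencode form-encodes the value: spaces become "+", "?" becomes "%3F".
    return f"{base}?{urlencode({'query': query})}"


url = build_retrieval_url("abc123", "What are the main risk factors?")
print(url)
# https://api.vectify.ai/pageindex/abc123?query=What+are+the+main+risk+factors%3F
```

Passing the query through the `params=` argument of `requests.get` applies the same encoding, so building the URL by hand is only needed with clients that lack such a feature.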
Example (Python):
Submit retrieval query
import requests
api_key = "YOUR_API_KEY"
doc_id = "YOUR_PAGEINDEX_DOC_ID"
query = "What are the main risk factors?"
retrieval_response = requests.get(
    f"https://api.vectify.ai/pageindex/{doc_id}",
    params={"query": query},
    headers={"api_key": api_key},
)
retrieval_id = retrieval_response.json().get("retrieval_id")
Check status and retrieve result
status_response = requests.get(
    f"https://api.vectify.ai/pageindex/retrieval/{retrieval_id}/",
    headers={"api_key": api_key},
)
status_data = status_response.json()

if status_data.get("status") == "completed":
    print("Retrieved Content:", status_data.get("retrieved_nodes"))
👉 See the full API Reference for optional parameters, examples, and integration guides.