🌲 PageIndex Document Processing
PageIndex generates a hierarchical “table of contents” tree that preserves the original document’s logical flow and organizational structure. This LLM-optimized tree enables precise navigation and is ready for reasoning-based RAG. See our cookbook for a practical example.
Currently accepts PDF files only (more formats coming soon).
Submit Document for Processing
- Uploads a PDF document to generate a PageIndex hierarchical tree.
- Returns a document identifier (doc_id) for subsequent operations.
Parameters
| Name | Type | Required | Description | Default |
|---|---|---|---|---|
| file | Blob \| Buffer \| ArrayBuffer | yes | File content to upload | - |
| fileName | string | yes | Name of the file | - |
| options.mode | string | no | Processing mode | - |
| options.folderId | string | no | Target folder ID. Falls back to folderScope if not set | - |
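Because only PDF files are accepted, it can be worth checking a file locally before uploading it. The sketch below tests the leading bytes for the `%PDF-` signature and then calls `submitDocument` with an options object. `isPdf`, `submitPdf`, and the `SubmitClient` interface are hypothetical names used for illustration, and passing the options as a third argument is an assumption inferred from the parameter table.

```typescript
// Hypothetical minimal client shape, inferred from the examples in this section.
// A Node.js Buffer (as returned by readFileSync) is a Uint8Array subclass.
interface SubmitClient {
  api: {
    submitDocument(
      file: Uint8Array,
      fileName: string,
      options?: { mode?: string; folderId?: string }, // assumed third argument
    ): Promise<{ doc_id: string }>;
  };
}

// PDF files begin with the ASCII bytes "%PDF-".
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d];

function isPdf(bytes: Uint8Array): boolean {
  return PDF_SIGNATURE.every((b, i) => bytes[i] === b);
}

// Rejects non-PDF input locally, then uploads and returns the doc_id.
async function submitPdf(
  client: SubmitClient,
  bytes: Uint8Array,
  fileName: string,
  folderId?: string,
): Promise<string> {
  if (!isPdf(bytes)) {
    throw new Error(`${fileName} does not look like a PDF`);
  }
  const options = folderId ? { folderId } : undefined;
  const result = await client.api.submitDocument(bytes, fileName, options);
  return result.doc_id;
}
```

The signature check only catches obviously wrong input such as a renamed archive; the service remains the authority on whether a file can be processed.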
Example Request
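import { readFileSync } from "fs";
// `client` is assumed to be an already-initialized PageIndex SDK client.
const file = readFileSync("./2023-annual-report.pdf");
const result = await client.api.submitDocument(file, "2023-annual-report.pdf");
const doc_id = result.doc_id; // keep this ID for the calls below
Example Response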
{
  "doc_id": "pi-abc123def456"
}
You can organize documents into Folders for better workspace management (Max plan).
Get Processing Status & Tree Structure
Check processing status and (when complete) get the PageIndex tree for a submitted document.
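Since processing is asynchronous, a caller typically polls until the status changes. The following is a minimal sketch of such a loop, not part of the SDK: `waitForTree`, `TreeClient`, the 2-second interval, and the attempt limit are all hypothetical choices, and treating "failed" as a terminal status is an assumption borrowed from the document metadata status values.

```typescript
// Hypothetical minimal shapes, inferred from the example responses in this section.
interface TreeResponse {
  doc_id: string;
  status: string; // "processing" and "completed" appear in the examples
  result?: unknown[];
}

interface TreeClient {
  api: { getTree(docId: string): Promise<TreeResponse> };
}

const delay = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls getTree until the status is "completed", giving up after maxAttempts.
async function waitForTree(
  client: TreeClient,
  docId: string,
  intervalMs = 2000,
  maxAttempts = 30,
): Promise<TreeResponse> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const tree = await client.api.getTree(docId);
    if (tree.status === "completed") return tree;
    if (tree.status === "failed") {
      // Assumed terminal status; stop instead of polling forever.
      throw new Error(`Processing failed for ${docId}`);
    }
    if (attempt < maxAttempts) await delay(intervalMs);
  }
  throw new Error(`Timed out waiting for ${docId}`);
}
```

Bounding the number of attempts keeps a stuck document from blocking the caller indefinitely; the interval and limit should be tuned to typical document sizes.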
Parameters
| Name | Type | Required | Description | Default |
|---|---|---|---|---|
| docId | string | yes | Document ID | - |
| options.nodeSummary | boolean | no | Include a summary for each node in the response | false |
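Once the status is "completed", `result` holds the top-level nodes, and each node may nest children under `nodes`. A minimal sketch of walking that structure into a flat outline follows. The `TreeNode` fields mirror the completed example response in this section, `text` is treated as optional as an assumption, and `flattenTree` and `renderOutline` are hypothetical helper names rather than SDK functions.

```typescript
// Node shape taken from the completed example response in this section.
interface TreeNode {
  title: string;
  node_id: string;
  page_index: number;
  text?: string;
  nodes?: TreeNode[];
}

interface OutlineEntry {
  nodeId: string;
  title: string;
  page: number;
  depth: number;
}

// Depth-first walk that records each node together with its nesting depth.
function flattenTree(nodes: TreeNode[], depth = 0): OutlineEntry[] {
  const entries: OutlineEntry[] = [];
  for (const node of nodes) {
    entries.push({
      nodeId: node.node_id,
      title: node.title,
      page: node.page_index,
      depth,
    });
    entries.push(...flattenTree(node.nodes ?? [], depth + 1));
  }
  return entries;
}

// Renders the outline as indented lines, for example for logging.
function renderOutline(nodes: TreeNode[]): string {
  return flattenTree(nodes)
    .map((e) => `${"  ".repeat(e.depth)}${e.title} (p. ${e.page})`)
    .join("\n");
}
```

A flat list with explicit depth is often easier to filter or pass to a prompt than the nested form, while `node_id` still lets you map an entry back to its node.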
Example Request
const tree = await client.api.getTree(doc_id);
if (tree.status === "completed") {
  console.log("PageIndex Tree Structure:", tree.result);
}
Example Response (Processing):
{
  "doc_id": "pi-abc123def456",
  "status": "processing"
}
Example Response (Completed):
{
  "doc_id": "pi-abc123def456",
  "status": "completed",
  "result": [
    {
      "title": "Financial Stability",
      "node_id": "0006",
      "page_index": 21,
      "text": "The Federal Reserve maintains financial stability through comprehensive monitoring and regulatory oversight...",
      "nodes": [
        {
          "title": "Monitoring Financial Vulnerabilities",
          "node_id": "0007",
          "page_index": 22,
          "text": "The Federal Reserve's monitoring focuses on identifying and assessing potential risks..."
        },
        {
          "title": "Domestic and International Cooperation and Coordination",
          "node_id": "0008",
          "page_index": 28,
          "text": "In 2023, the Federal Reserve collaborated internationally with central banks and regulatory authorities..."
        }
      ]
    }
  ]
}
Get Document Metadata
Retrieve document information including processing status, page count, and creation time.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| docId | string | yes | Document ID |
Example Request
const doc = await client.api.getDocument(doc_id);
console.log(`Document: ${doc.name}`);
console.log(`Status: ${doc.status}`);
console.log(`Pages: ${doc.pageNum}`);
Example Response
{
  "id": "pi-abc123def456",
  "name": "2023-annual-report.pdf",
  "description": "Annual Report 2023",
  "status": "completed",
  "createdAt": "2024-01-15T10:30:00.000Z",
  "pageNum": 42,
  "folderId": "folder-123"
}
| Field | Type | Description |
|---|---|---|
| id | string | Document ID |
| name | string | Document filename |
| description | string | Document description |
| status | string | "queued" \| "processing" \| "completed" \| "failed" |
| createdAt | string | ISO 8601 timestamp |
| pageNum | number | Number of pages |
| folderId | string | Folder ID (if assigned) |
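The field table above maps naturally onto a TypeScript type. The sketch below is one way to model it; `DocStatus`, `DocumentMetadata`, `isFinished`, and `describeDocument` are hypothetical names, and marking `description` and `folderId` as optional is an assumption (the table only says the folder ID is present "if assigned").

```typescript
// Status values as listed in the field table above.
type DocStatus = "queued" | "processing" | "completed" | "failed";

// Field names and types follow the field table above.
interface DocumentMetadata {
  id: string;
  name: string;
  description?: string; // assumed optional
  status: DocStatus;
  createdAt: string; // ISO 8601 timestamp
  pageNum: number;
  folderId?: string; // present only if the document is in a folder
}

// True once processing has stopped, whether it succeeded or not.
function isFinished(doc: DocumentMetadata): boolean {
  return doc.status === "completed" || doc.status === "failed";
}

// Builds a one-line summary using the UTC date portion of createdAt.
function describeDocument(doc: DocumentMetadata): string {
  const created = new Date(doc.createdAt).toISOString().slice(0, 10);
  return `${doc.name}: ${doc.status}, ${doc.pageNum} pages, created ${created}`;
}
```

Modeling `status` as a union lets the compiler flag comparisons against values that the table does not list.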
List Documents
Retrieve a paginated list of all documents, ordered by creation date (newest first).
Parameters
| Name | Type | Required | Description | Default |
|---|---|---|---|---|
| options.limit | number | no | Maximum documents to return (1–100) | 50 |
| options.offset | number | no | Number of documents to skip | 0 |
| options.folderId | string | no | Filter by folder ID | - |
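To read more than one page, advance `offset` until `total` documents have been collected. The following is a minimal sketch of that loop under the response shape shown in this section; `listAllDocuments`, `ListClient`, and the trimmed `DocSummary` type are hypothetical names for illustration.

```typescript
// Hypothetical minimal shapes, inferred from the example response in this section.
interface DocSummary {
  id: string;
  name: string;
  status: string;
}

interface ListResponse {
  documents: DocSummary[];
  total: number;
  limit: number;
  offset: number;
}

interface ListClient {
  api: {
    listDocuments(options?: {
      limit?: number;
      offset?: number;
      folderId?: string;
    }): Promise<ListResponse>;
  };
}

// Fetches every page by advancing offset until `total` documents are collected.
async function listAllDocuments(
  client: ListClient,
  pageSize = 100, // the documented maximum for options.limit
): Promise<DocSummary[]> {
  const all: DocSummary[] = [];
  let offset = 0;
  while (true) {
    const page = await client.api.listDocuments({ limit: pageSize, offset });
    all.push(...page.documents);
    offset += page.documents.length;
    // Stop when everything is collected, or on an empty page to avoid looping forever.
    if (page.documents.length === 0 || offset >= page.total) break;
  }
  return all;
}
```

Advancing by the number of documents actually returned, rather than by `pageSize`, keeps the loop correct even if a page comes back shorter than requested.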
Example
const result = await client.api.listDocuments({ limit: 10 });
console.log(`Total: ${result.total}`);
for (const doc of result.documents) {
  console.log(`${doc.name} (${doc.status})`);
}
// Next page
const page2 = await client.api.listDocuments({ limit: 10, offset: 10 });
Response
{
  "documents": [
    {
      "id": "pi-abc123def456",
      "name": "2023-annual-report.pdf",
      "status": "completed",
      "createdAt": "2024-01-15T10:30:00.000Z",
      "pageNum": 42
    }
  ],
  "total": 25,
  "limit": 10,
  "offset": 0
}
Delete a Document
Permanently delete a PageIndex document and all its associated data.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| docId | string | yes | Document ID |
Example Request
await client.api.deleteDocument(doc_id);
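A common cleanup task is removing documents whose processing failed. The sketch below combines `listDocuments` and `deleteDocument` for that purpose; `deleteFailedDocuments` and `CleanupClient` are hypothetical names, and the return type of `deleteDocument` is not documented here, so it is typed as `unknown`.

```typescript
// Hypothetical minimal client shape, inferred from the examples in this document.
interface CleanupClient {
  api: {
    listDocuments(options?: { limit?: number; offset?: number }): Promise<{
      documents: { id: string; status: string }[];
      total: number;
    }>;
    deleteDocument(docId: string): Promise<unknown>;
  };
}

// Collects the IDs of all failed documents first, then deletes them.
// Deleting while paging would shift later offsets and could skip documents.
async function deleteFailedDocuments(client: CleanupClient): Promise<string[]> {
  const failedIds: string[] = [];
  let offset = 0;
  while (true) {
    const page = await client.api.listDocuments({ limit: 100, offset });
    for (const doc of page.documents) {
      if (doc.status === "failed") failedIds.push(doc.id);
    }
    offset += page.documents.length;
    if (page.documents.length === 0 || offset >= page.total) break;
  }
  for (const id of failedIds) {
    await client.api.deleteDocument(id); // deletion is permanent
  }
  return failedIds;
}
```

Because deletion cannot be undone, returning the list of deleted IDs gives the caller something to log or audit.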