
🔧 MCP Tools

The client.tools layer wraps PageIndex MCP capabilities as typed JavaScript methods. It is designed for building custom AI agent tool integrations where you need fine-grained control over tool invocation, parameter handling, and response processing.

💡 If your framework supports MCP natively, you can connect to PageIndex MCP directly, with no SDK wrapper needed. See the MCP page for connection configs with Vercel AI SDK, Claude Agent SDK, OpenAI Agents SDK, LangChain, and more.

When to Use client.tools

  • You need to customize tool behavior — add validation, transform inputs/outputs, filter results, or compose tools into multi-step workflows.
  • You want to mix PageIndex tools with other tools in a single tool registry for your agent.
  • You need programmatic access to document reading, search, or folder management outside of a chat/agent context.
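The first case above, customizing tool behavior, can be sketched as a small higher-order function. This is only an illustration and not part of the SDK: `withHooks`, `validate`, and `transform` are hypothetical names, the `"processing"` status value is assumed, and the stub stands in for a real call such as client.tools.browseDocuments.

```typescript
// Hypothetical helper (not part of @pageindex/sdk): wraps any async tool
// function with an input check and an output transform.
type ToolFn<P, R> = (params: P) => Promise<R>;

interface Hooks<P, R, T> {
  // Return an error message to reject the call, or null to allow it.
  validate?: (params: P) => string | null;
  // Reshape or filter the raw result before the agent sees it.
  transform: (result: R) => T;
}

function withHooks<P, R, T>(fn: ToolFn<P, R>, hooks: Hooks<P, R, T>): ToolFn<P, T> {
  return async (params: P): Promise<T> => {
    const problem = hooks.validate ? hooks.validate(params) : null;
    if (problem !== null) {
      throw new Error(`Invalid tool input: ${problem}`);
    }
    return hooks.transform(await fn(params));
  };
}

// Stub standing in for client.tools.browseDocuments.
// The "processing" status is an assumption; this page only shows "completed".
interface DocSummary { name: string; status: string }
const browseStub: ToolFn<{ limit?: number }, { documents: DocSummary[] }> = async () => ({
  documents: [
    { name: "a.pdf", status: "completed" },
    { name: "b.pdf", status: "processing" },
  ],
});

// Expose only the names of fully processed documents to the agent.
const browseCompleted = withHooks(browseStub, {
  validate: (p) => (p.limit !== undefined && p.limit > 50 ? "limit must be at most 50" : null),
  transform: (r) => r.documents.filter((d) => d.status === "completed").map((d) => d.name),
});
```

The wrapped function keeps the same call shape as the original, so it can be used as the `execute` body of a tool definition in the same registry as non-PageIndex tools.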

Discovery Workflow

The typical exploration order for an agent traversing a PageIndex library:

  1. getFolderStructure — orientation: get the folder tree to understand the library layout.
  2. browseDocuments — primary discovery: list documents by time or relevance, optionally scoped to a folder.
  3. searchDocuments — escalation: keyword search with LLM re-rank when browseDocuments misses a document you expect to exist.
  4. getDocument — confirm processing status before reading.
  5. getDocumentStructure + getPageContent — drill into the document.
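Steps 2 to 4 of this order can be sketched as one helper. This is a minimal sketch under stated assumptions: `DiscoveryTools` and `locateReadyDocument` are hypothetical names, the interface is inferred from the examples on this page rather than taken from the SDK's type definitions, and the orientation step (getFolderStructure) and the final drill-down are left to the caller.

```typescript
// Minimal structural types, inferred from the response examples on this page.
interface DocHit { name: string; status: string }
interface DiscoveryTools {
  browseDocuments(p: { sort: "relevance"; query: string; limit?: number }): Promise<{ documents: DocHit[] }>;
  searchDocuments(p: { query: string; limit?: number }): Promise<{ documents: DocHit[] }>;
  getDocument(p: { docName: string }): Promise<{ status: string }>;
}

// Hypothetical helper: find one document and confirm it is ready to read.
// Returns the document name, or null if nothing usable was found.
async function locateReadyDocument(
  tools: DiscoveryTools,
  question: string, // natural-language query for browseDocuments
  keywords: string, // keyword query for the searchDocuments fallback
): Promise<string | null> {
  // Step 2: primary discovery by semantic relevance.
  let hits = (await tools.browseDocuments({ sort: "relevance", query: question, limit: 5 })).documents;

  // Step 3: escalate to keyword search only when browsing found nothing.
  if (hits.length === 0) {
    hits = (await tools.searchDocuments({ query: keywords, limit: 5 })).documents;
  }

  const first = hits[0];
  if (!first) return null;

  // Step 4: confirm processing status before reading.
  const doc = await tools.getDocument({ docName: first.name });
  return doc.status === "completed" ? first.name : null;
}
```

Passing the tools object as a parameter keeps the helper testable with stubs. Whether client.tools satisfies this interface as written depends on the SDK's actual signatures.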

Agentic Integration Example

The primary use case for client.tools is wrapping PageIndex capabilities as tools for AI agent frameworks. Here’s a complete example using the Vercel AI SDK:

    import { tool } from "ai";
    import { z } from "zod";
    import { PageIndexClient } from "@pageindex/sdk";

    const client = new PageIndexClient({
      apiKey: "YOUR_API_KEY",
      folderScope: "optional-folder-id",
    });

    const pageIndexTools = {
      pageindex_get_folder_structure: tool({
        description:
          "Orientation step: returns the folder hierarchy as a tree. Call FIRST when exploring a library.",
        parameters: z.object({
          folderId: z.string().optional(),
          depth: z.number().int().min(1).max(10).optional(),
        }),
        execute: async (params) => client.tools.getFolderStructure(params),
      }),

      pageindex_browse_documents: tool({
        description:
          'Primary document discovery. sort="time" (newest first) or sort="relevance" + query.',
        parameters: z.object({
          folderId: z.string().optional(),
          recursive: z.boolean().optional(),
          sort: z.enum(["time", "relevance"]).optional(),
          query: z.string().optional(),
          offset: z.number().int().min(0).optional(),
          // limit is a document count, so restrict it to integers
          limit: z.number().int().min(1).max(50).optional(),
        }),
        execute: async (params) => client.tools.browseDocuments(params),
      }),

      pageindex_search_documents: tool({
        description:
          "Escalation tool: keyword search with LLM re-rank. Use after browse_documents missed a document you expect to exist.",
        parameters: z.object({
          query: z.string().min(1),
          folderId: z.string().optional(),
          recursive: z.boolean().optional(),
          limit: z.number().int().min(1).max(50).optional(),
        }),
        execute: async (params) => client.tools.searchDocuments(params),
      }),

      pageindex_get_document_structure: tool({
        description: "Get the hierarchical outline for a document.",
        parameters: z.object({
          docName: z.string(),
          part: z.number().int().min(1).optional(),
        }),
        execute: async (params) => client.tools.getDocumentStructure(params),
      }),

      pageindex_get_page_content: tool({
        description: "Read text content from specific pages of a document.",
        parameters: z.object({
          docName: z.string(),
          pages: z.string().describe('Page spec: "5", "3,7,10", or "5-10"'),
        }),
        execute: async (params) => client.tools.getPageContent(params),
      }),
    };

Then pass these tools to your agentic framework:

    import { streamText } from "ai";

    const result = streamText({
      model,
      messages,
      tools: pageIndexTools,
    });

The AI model will autonomously decide when to explore folders, search documents, read structure, and fetch page content based on the user’s question.


Available Tools

Browse Documents

Unified document discovery — supports both time-ordered listing and semantic relevance ranking, with optional folder drill-down.

Parameters

| Name | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| folderId | string | no | Folder scope. Omit or use "root" for the library root; pass a folder ID to drill into a sub-folder | - |
| recursive | boolean | no | When false, returns direct contents plus sub-folders. When true, flattens all descendants into one document list | false |
| sort | "time" or "relevance" | no | "time" sorts by upload date; "relevance" ranks by semantic relevance to query | "time" |
| query | string | no | Required when sort="relevance"; must be omitted otherwise. Natural-language query is OK | - |
| offset | number | no | Pagination offset from a previous next_offset | 0 |
| limit | number | no | Number of documents to return (1–50) | 10 |
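The coupling between sort and query is easy to get wrong when an agent fills in parameters, so a pre-flight check can help. The sketch below is hypothetical (`BrowseParams` and `checkBrowseParams` are not SDK names) and assumes that limit and offset must be integers, which the table implies but does not state.

```typescript
// Parameter shape taken from the table above.
interface BrowseParams {
  folderId?: string;
  recursive?: boolean;
  sort?: "time" | "relevance";
  query?: string;
  offset?: number;
  limit?: number;
}

// Hypothetical pre-flight check: returns a list of problems (empty when valid).
function checkBrowseParams(p: BrowseParams): string[] {
  const problems: string[] = [];
  const sort = p.sort ?? "time"; // documented default

  if (sort === "relevance" && !p.query) {
    problems.push('query is required when sort="relevance"');
  }
  if (sort === "time" && p.query !== undefined) {
    problems.push('query must be omitted unless sort="relevance"');
  }
  // Integer checks are an assumption; the table only gives the 1-50 range.
  if (p.limit !== undefined && (!Number.isInteger(p.limit) || p.limit < 1 || p.limit > 50)) {
    problems.push("limit must be an integer from 1 to 50");
  }
  if (p.offset !== undefined && (!Number.isInteger(p.offset) || p.offset < 0)) {
    problems.push("offset must be a non-negative integer");
  }
  return problems;
}
```

Returning a list instead of throwing lets an agent wrapper feed all problems back to the model in one message.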

Example: Latest documents

    const result = await client.tools.browseDocuments({ limit: 10 });

    for (const doc of result.documents) {
      console.log(`${doc.name} — ${doc.status}`);
    }

    // Fetch the next page if there is one
    if (result.has_more) {
      const next = await client.tools.browseDocuments({
        limit: 10,
        offset: result.next_offset ?? 0,
      });
    }

Example: Semantic relevance

    const result = await client.tools.browseDocuments({
      sort: "relevance",
      query: "quarterly revenue trends",
      limit: 5,
    });

Example: Drill into a folder

    const result = await client.tools.browseDocuments({
      folderId: "folder-abc123",
      recursive: true,
    });

Response

    {
      "folders": [
        {
          "id": "folder-abc123",
          "name": "Research Papers",
          "path": "/Research Papers",
          "description": "2024 research collection",
          "children_count": 2,
          "file_count": 5,
          "created_at": "2024-06-15T10:30:00.000Z"
        }
      ],
      "documents": [
        {
          "name": "2023-annual-report.pdf",
          "description": "Annual Report 2023",
          "status": "completed",
          "created_at": "2024-01-15T10:30:00.000Z",
          "folder_id": "folder-abc123",
          "path": "/Research Papers/2023-annual-report.pdf"
        }
      ],
      "sort": "time",
      "next_offset": 10,
      "has_more": true,
      "next_steps": { "summary": "...", "options": ["..."] }
    }
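The has_more and next_offset fields support a simple pagination loop. The following is a sketch, not an SDK feature: `collectDocuments` and `Page` are hypothetical names, and the guard against a missing or non-advancing next_offset is a defensive assumption about edge cases this page does not describe.

```typescript
// Only the fields the loop needs, named as in the response above.
interface Page<T> {
  documents: T[];
  has_more: boolean;
  next_offset?: number | null;
}

// Hypothetical helper: follow next_offset until has_more is false or
// maxDocs documents have been gathered.
async function collectDocuments<T>(
  fetchPage: (offset: number) => Promise<Page<T>>,
  maxDocs = 200,
): Promise<T[]> {
  const all: T[] = [];
  let offset = 0;
  while (all.length < maxDocs) {
    const page = await fetchPage(offset);
    all.push(...page.documents);
    // Stop when there are no more pages or the offset would not advance.
    if (!page.has_more || page.next_offset == null || page.next_offset <= offset) break;
    offset = page.next_offset;
  }
  return all.slice(0, maxDocs);
}

// Intended use, assuming the client from the examples on this page:
//   const docs = await collectDocuments((offset) =>
//     client.tools.browseDocuments({ limit: 50, offset }));
```

The maxDocs cap keeps an agent from pulling an entire large library into its context by accident.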

Get Folder Structure

Retrieve the folder hierarchy as a tree. Use this for orientation when exploring a library: folder names reveal content domains, and the returned folder IDs can be passed to browseDocuments() or searchDocuments().

Parameters

| Name | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| folderId | string | no | Subtree root. Omit or use "root" for the entire library; pass a folder ID to scope to a subtree | - |
| path | string | no | Alternative subtree root specified by path | - |
| depth | number | no | Maximum depth to traverse (1–10) | 10 |
| includeCounts | boolean | no | Include file_count and children_count on each node | - |

Example

    const result = await client.tools.getFolderStructure({ depth: 3 });

    console.log(`Total folders: ${result.total_folders}`);

    // Print the tree recursively, indenting each level
    function printTree(node, indent = 0) {
      const pad = " ".repeat(indent * 2);
      console.log(`${pad}${node.name} (${node.file_count} files)`);
      for (const child of node.children) {
        printTree(child, indent + 1);
      }
    }

    printTree(result.tree);

Response

    {
      "tree": {
        "name": "root",
        "file_count": 12,
        "children_count": 3,
        "children": [
          {
            "id": "folder-abc123",
            "name": "Research Papers",
            "path": "/Research Papers",
            "file_count": 5,
            "children_count": 2,
            "children": [
              {
                "id": "folder-def456",
                "name": "Q1 Reports",
                "path": "/Research Papers/Q1 Reports",
                "file_count": 3,
                "children_count": 0,
                "children": []
              }
            ]
          }
        ]
      },
      "total_folders": 4,
      "depth": 3,
      "truncated": false,
      "next_steps": { "summary": "...", "options": ["..."] }
    }
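Since later calls take a folder ID rather than a path, it can be convenient to flatten the tree into a path-to-ID lookup. This is a minimal sketch: `FolderNode` and `indexFoldersByPath` are hypothetical names, and the node shape is inferred from the example response above, where the root node carries no id or path.

```typescript
// Node shape inferred from the example response; id and path are optional
// because the root node in that example has neither.
interface FolderNode {
  id?: string;
  name: string;
  path?: string;
  children: FolderNode[];
}

// Hypothetical helper: walk the tree iteratively and map each folder path to its ID.
function indexFoldersByPath(root: FolderNode): Map<string, string> {
  const index = new Map<string, string>();
  const stack: FolderNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node.id !== undefined && node.path !== undefined) {
      index.set(node.path, node.id);
    }
    stack.push(...node.children);
  }
  return index;
}

// Sample input copied from the response above (counts omitted).
const sampleTree: FolderNode = {
  name: "root",
  children: [
    {
      id: "folder-abc123",
      name: "Research Papers",
      path: "/Research Papers",
      children: [
        { id: "folder-def456", name: "Q1 Reports", path: "/Research Papers/Q1 Reports", children: [] },
      ],
    },
  ],
};
const folderIndex = indexFoldersByPath(sampleTree);
```

A lookup such as `folderIndex.get("/Research Papers/Q1 Reports")` then yields the value to pass as folderId. If the response has truncated set to true, the index will be incomplete.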

Search Documents

Keyword search with LLM re-rank — use as an escalation path when browseDocuments({ sort: "relevance", query }) misses a document you strongly believe exists.

Parameters

| Name | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| query | string | yes | Keyword query (multiple keywords are AND-matched against file name and description). Not a natural-language sentence | - |
| folderId | string | no | Folder scope | - |
| recursive | boolean | no | Include descendant folders. Set true to widen scope when a folder-scoped search returns no hits | false |
| limit | number | no | Number of documents to return (1–50) | 10 |

Example

    const result = await client.tools.searchDocuments({
      query: "annual report 2023",
      limit: 5,
    });

    for (const doc of result.documents) {
      console.log(`${doc.name} — score: ${doc.score}`);
    }

Response

    {
      "documents": [
        {
          "name": "2023-annual-report.pdf",
          "description": "Annual Report 2023",
          "status": "completed",
          "created_at": "2024-01-15T10:30:00.000Z",
          "folder_id": "folder-abc123",
          "path": "/Research Papers/2023-annual-report.pdf",
          "score": 9
        }
      ],
      "next_steps": { "summary": "...", "options": ["..."] }
    }

score is an LLM relevance score from 6 to 10 — higher is more relevant.


Get Document Structure

Retrieve the hierarchical table of contents / outline for a processed document.

Parameters

| Name | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| docName | string | yes | Document name | - |
| part | number | no | Part number for large documents (when structure is split across multiple parts) | - |
| waitForCompletion | boolean | no | Wait until processing completes before returning | false |
| folderId | string | no | Folder scope override | - |

Example

    const result = await client.tools.getDocumentStructure({
      docName: "2023-annual-report.pdf",
      waitForCompletion: true,
    });

    console.log(result.structure);

    // Large documents split their structure across parts; fetch the next one
    if (result.total_parts && result.total_parts > 1) {
      const part2 = await client.tools.getDocumentStructure({
        docName: "2023-annual-report.pdf",
        part: 2,
      });
    }

Response

    {
      "doc_name": "2023-annual-report.pdf",
      "structure": "1. Executive Summary (p.1-3)\n 1.1 Key Findings (p.1)\n 1.2 Recommendations (p.2-3)\n2. Financial Overview (p.4-15)\n ...",
      "total_parts": 1
    }
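The page ranges in the outline can be turned directly into a pages argument for getPageContent(). The sketch below is hypothetical (`pagesForSection` is not an SDK function) and assumes every outline line ends in "(p.N)" or "(p.N-M)" as in the example above. The real structure format may vary, so treat a null result as "fall back to reading the outline manually".

```typescript
// Hypothetical helper: return the page spec of the first outline line whose
// heading contains `title` (case-insensitive), or null if none matches.
// Assumes each line ends with "(p.N)" or "(p.N-M)", as in the example response.
function pagesForSection(structure: string, title: string): string | null {
  const needle = title.toLowerCase();
  for (const line of structure.split("\n")) {
    const match = /^(.*)\(p\.(\d+)(?:-(\d+))?\)\s*$/.exec(line);
    if (!match) continue; // skips lines without a page suffix, e.g. " ..."
    const heading = match[1] ?? "";
    const start = match[2] ?? "";
    const end = match[3]; // undefined for single-page sections
    if (heading.toLowerCase().includes(needle)) {
      return end !== undefined ? `${start}-${end}` : start;
    }
  }
  return null;
}

// Sample outline copied from the response above.
const sampleStructure =
  "1. Executive Summary (p.1-3)\n 1.1 Key Findings (p.1)\n 1.2 Recommendations (p.2-3)\n2. Financial Overview (p.4-15)\n ...";
const financialPages = pagesForSection(sampleStructure, "financial overview");
```

The returned string uses the same single-page and range forms that the pages parameter of getPageContent() documents.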

Get Page Content

Read text content and image annotations for specific pages.

Parameters

| Name | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| docName | string | yes | Document name | - |
| pages | string | yes | Page specification: single ("5"), comma-separated ("3,7,10"), or range ("5-10") | - |
| waitForCompletion | boolean | no | Wait until processing completes | false |
| folderId | string | no | Folder scope override | - |
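Because pages is a free-form string, a model can easily produce a malformed value. A local parser for the three documented forms might look like the sketch below. `parsePageSpec` is a hypothetical helper. It assumes pages are numbered from 1, and it rejects mixed forms such as "1-3,7" because this page does not say whether they are supported.

```typescript
// Hypothetical helper: expand a page spec into page numbers, or throw if the
// spec is not one of the three documented forms.
function parsePageSpec(spec: string): number[] {
  const trimmed = spec.trim();

  // Range form: "5-10"
  const range = /^(\d+)-(\d+)$/.exec(trimmed);
  if (range) {
    const start = Number(range[1]);
    const end = Number(range[2]);
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${spec}"`);
    }
    return Array.from({ length: end - start + 1 }, (_, i) => start + i);
  }

  // Single ("5") or comma-separated ("3,7,10") form
  if (/^\d+(,\d+)*$/.test(trimmed)) {
    const pages = trimmed.split(",").map(Number);
    // 1-based numbering is an assumption drawn from the examples on this page.
    if (pages.some((p) => p < 1)) {
      throw new Error(`Page numbers start at 1: "${spec}"`);
    }
    return pages;
  }

  throw new Error(`Unsupported page spec: "${spec}"`);
}
```

Validating locally gives the agent a precise error message to correct itself with, instead of a failed remote call.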

Example

    const result = await client.tools.getPageContent({
      docName: "2023-annual-report.pdf",
      pages: "1-5",
    });

    for (const page of result.content) {
      console.log(`--- Page ${page.page} ---`);
      console.log(page.text);
      if (page.image_count && page.image_count > 0) {
        console.log(page.image_annotations);
      }
    }

Response

    {
      "doc_name": "2023-annual-report.pdf",
      "total_pages": 42,
      "requested_pages": 5,
      "returned_pages": 5,
      "content": [
        {
          "page": 1,
          "text": "Executive Summary\n\nThis annual report presents...",
          "image_count": 1,
          "image_annotations": [
            "Figure 1: Revenue growth chart — 2023-annual-report.pdf/images/page1_fig1.png"
          ]
        },
        {
          "page": 2,
          "text": "Key Findings\n\n1. Revenue increased by 15%..."
        }
      ]
    }

Get Document Image

Retrieve an embedded image as base64-encoded data. Image paths come from getPageContent() results.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| imagePath | string | yes | Image path in format "<docName>/<imagePath>", from getPageContent() image annotations |

Example

    import { writeFileSync } from "node:fs";

    const pages = await client.tools.getPageContent({
      docName: "2023-annual-report.pdf",
      pages: "1",
    });

    // Annotations look like "<caption> — <imagePath>"; take the path part
    const annotation = pages.content[0].image_annotations?.[0];
    const imagePath = annotation?.split(" — ")[1];

    if (imagePath) {
      const image = await client.tools.getDocumentImage({ imagePath });
      console.log(image.mimeType); // "image/png"

      // Decode the base64 payload and save it to disk
      const buffer = Buffer.from(image.data, "base64");
      writeFileSync("chart.png", buffer);
    }
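Splitting on the first separator breaks if a caption itself contains " — ". A slightly more defensive variant splits on the last occurrence instead. This is a sketch: `extractImagePath` is a hypothetical helper, and the "caption — path" layout is inferred from the single annotation shown in the Get Page Content response, not from a documented format.

```typescript
// Hypothetical helper: pull the image path out of an annotation string.
// Assumes the "<caption> — <imagePath>" layout seen in the example response.
function extractImagePath(annotation: string): string | null {
  const separator = " \u2014 "; // space, em dash (U+2014), space
  // Use the last separator so captions containing an em dash still parse.
  const at = annotation.lastIndexOf(separator);
  if (at === -1) return null;
  const path = annotation.slice(at + separator.length).trim();
  return path.length > 0 ? path : null;
}

// Sample annotation copied from the Get Page Content response.
const samplePath = extractImagePath(
  "Figure 1: Revenue growth chart \u2014 2023-annual-report.pdf/images/page1_fig1.png",
);
```

A null result signals that the annotation did not match the assumed layout, so the caller can skip the getDocumentImage() call rather than send a malformed path.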

Get Document (by Name)

Look up a document by name, with optional wait for processing completion.

Parameters

| Name | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| docName | string | yes | Document name | - |
| waitForCompletion | boolean | no | Wait until processing completes | false |
| folderId | string | no | Folder scope override; disambiguates when two documents share the same name | - |

Example

    const doc = await client.tools.getDocument({
      docName: "2023-annual-report.pdf",
      waitForCompletion: true,
    });

    console.log(doc.status); // "completed"

Remove Documents (Batch)

Delete multiple documents by name in a single operation.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| docNames | string[] | yes | Array of document names to delete |
| folderId | string | no | Folder scope override |

Example

    const result = await client.tools.removeDocument({
      docNames: ["old-report.pdf", "draft-v1.pdf", "missing.pdf"],
    });

    for (const item of result.results) {
      console.log(`${item.doc_name}: ${item.status}`);
    }

Response

    {
      "results": [
        { "doc_name": "old-report.pdf", "status": "deleted" },
        { "doc_name": "draft-v1.pdf", "status": "deleted" },
        { "doc_name": "missing.pdf", "status": "not_found" }
      ],
      "next_steps": { "summary": "...", "options": ["..."] }
    }

status is one of "deleted", "not_found", or "failed". Failed items include an additional error field.
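Since a batch can partly succeed, callers usually want the results grouped by outcome. The sketch below shows one way to do that. `RemovalItem` and `summarizeRemoval` are hypothetical names, and the item shape is taken from the response and status description above.

```typescript
// Item shape from the response above; error is present only on failed items.
type RemovalStatus = "deleted" | "not_found" | "failed";
interface RemovalItem {
  doc_name: string;
  status: RemovalStatus;
  error?: string;
}

// Hypothetical helper: group document names by removal outcome.
function summarizeRemoval(results: RemovalItem[]): Record<RemovalStatus, string[]> {
  const summary: Record<RemovalStatus, string[]> = { deleted: [], not_found: [], failed: [] };
  for (const item of results) {
    // Attach the error message to failed items so it is not lost.
    const label =
      item.status === "failed" && item.error ? `${item.doc_name} (${item.error})` : item.doc_name;
    summary[item.status].push(label);
  }
  return summary;
}

// Sample input: the first three items mirror the response above; the failed
// item and its error text are invented for illustration.
const removalSummary = summarizeRemoval([
  { doc_name: "old-report.pdf", status: "deleted" },
  { doc_name: "draft-v1.pdf", status: "deleted" },
  { doc_name: "missing.pdf", status: "not_found" },
  { doc_name: "locked.pdf", status: "failed", error: "example error" },
]);
```

An agent wrapper can return this summary instead of the raw list, so the model sees at a glance which deletions need a retry.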

