🧭 Debug: Test Intent Routing
See which V4 route your query would be sent to before running a full search.
Classify
🎯 Try Demo Features
Explore hybrid retrieval, metadata filtering, and multimodal results with one click
🖼️ Multimodal — "dashboard navigation"
🔍 Filtered Search — source_type = image
💳 OCR Query — "Access Bank transaction"
🔎 Find Similar Images
Upload an image to find visually similar ones already in your knowledge base.
Upload Query Image
Top results
3
5
10
Find Similar
📦 Batch Ingest
Ingest up to 100 documents in one request. Each document needs an id and content.
Documents JSON Array
Concurrency
3
5
10
📦 Batch Ingest
Example
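A minimal documents array for the form above might look like the sketch below. Only the id and content fields are named by this panel; any other fields you add are assumptions to verify against the API.

```json
[
  { "id": "doc-001", "content": "Refund policy: customers may return items within 30 days." },
  { "id": "doc-002", "content": "Shipping: orders over $50 ship free." }
]
```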
🗑️ Delete Document
Permanently removes a document and all its chunks from the index and database.
Document ID
Delete
📊 Index Stats
Refresh
Live on Cloudflare Edge • Metadata filtering enabled
🔎 Query Analytics
—
Recent queries (last 10)
No queries yet — run a search to see analytics.
💰 Cost Calculator
Queries per Day
Calculate
Monthly Cost Projection
Click “Calculate” to see projection
🔑 Authentication
API Key (required for protected endpoints)
Test
Tenant:
ADMIN
🤖 AI Models
Loading active model...
Embedding Models (EMBEDDING_MODEL)
Key        | Model                         | Dims | Note
qwen3-0.6b | @cf/qwen/qwen3-embedding-0.6b | 1024 | ★ Default 2026. Best retrieval quality.
bge-m3     | @cf/baai/bge-m3               | 1024 | Multilingual. Needs 1024d index.
bge-small  | @cf/baai/bge-small-en-v1.5    |  384 | Legacy. Set EMBEDDING_MODEL="bge-small" to keep an existing 384d index.
Reflection & Synthesis Models (REFLECTION_MODEL)
Key          | Model                          | Note
kimi-k2.5    | @cf/moonshot/kimi-k2.5         | ★ Default. Best multi-doc reasoning.
llama-3.2-3b | @cf/meta/llama-3.2-3b-instruct | Lower cost, lower quality.
Vision & OCR: always uses @cf/meta/llama-4-scout-17b-16e-instruct, the best native multimodal model on Workers AI.
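The EMBEDDING_MODEL and REFLECTION_MODEL keys above are set as worker configuration. A sketch of doing so via wrangler.toml [vars] follows; that these are plain vars (rather than secrets or another mechanism) is an assumption — check the repository's wrangler.toml.

```toml
# Assumed: model keys are exposed to the worker as plain vars.
[vars]
EMBEDDING_MODEL = "qwen3-0.6b"
REFLECTION_MODEL = "kimi-k2.5"
```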
📋 License Management
Validate
Create (admin)
List (admin)
Revoke (admin)
✅ Setup Status
Run Check
?
API Key
Click “Run Check” to verify
?
Vectorize Index
Click “Run Check” to verify
?
D1 Database
Click “Run Check” to verify
?
Workers AI Binding
Click “Run Check” to verify
🧠 Knowledge Reflection
Sample un-reflected documents and synthesise insights with the configured reflection model. Run periodically to build up cross-document knowledge.
Run Reflection
🔧 Setup Your Own Instance
1
Clone the repository
git clone https://github.com/dannwaneri/vectorize-mcp-worker.git
cd vectorize-mcp-worker
npm install
2
Create Cloudflare resources
# Create Vectorize index (1024d for the default qwen3-0.6b model;
# use --dimensions=384 only if you set EMBEDDING_MODEL="bge-small")
wrangler vectorize create mcp-knowledge-base --dimensions=1024 --metric=cosine
# Create D1 database
wrangler d1 create mcp-knowledge-db
Copy the database_id from the D1 output and paste it into your wrangler.toml.
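The resulting wrangler.toml entries might look like the sketch below. The binding names (AI, VECTORIZE, DB) are assumptions; match them to whatever env keys the worker code in the repository actually uses.

```toml
name = "vectorize-mcp-worker"
main = "src/index.ts"

# Workers AI binding (binding name is an assumption)
[ai]
binding = "AI"

# Vectorize index created in step 2
[[vectorize]]
binding = "VECTORIZE"
index_name = "mcp-knowledge-base"

# D1 database created in step 2; paste the database_id from the create output
[[d1_databases]]
binding = "DB"
database_name = "mcp-knowledge-db"
database_id = "<database_id-from-step-2>"
```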
3
Run database migrations
wrangler d1 execute mcp-knowledge-db --file=./schema.sql
4
Set your API key secret
wrangler secret put API_KEY
# Enter a strong random key when prompted
# Generate one: node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
5
Deploy
wrangler deploy
Your worker will be live at https://<worker-name>.<subdomain>.workers.dev
6
First ingest & search
1. Go to Ingest → paste any text into the Document form → click Ingest Document
2. Go to Search → ask a question about the text → results appear below
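The same first ingest and search can be done from the command line. The endpoint paths (/ingest, /search) and payload shapes below are assumptions inferred from this dashboard's forms, not confirmed API docs; substitute your worker URL and the API key from step 4.

```shell
# Ingest one document (path and JSON shape are assumptions)
curl -X POST "https://<your-worker>.workers.dev/ingest" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"id": "doc-001", "content": "Refunds are accepted within 30 days."}'

# Search it back (path and JSON shape are assumptions)
curl -X POST "https://<your-worker>.workers.dev/search" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the refund window?"}'
```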
👥 Connect via MCP (Claude Desktop / Cursor)
Add this to your Claude Desktop or Cursor MCP config to use this worker as a knowledge tool.
{
"mcpServers": {
"vectorize": {
"command": "npx",
"args": [
"mcp-remote",
"https://<your-worker>.workers.dev/mcp",
"--header",
"Authorization: Bearer YOUR_API_KEY"
]
}
}
}
Replace <your-worker> with your actual worker subdomain and YOUR_API_KEY with the secret you set in step 4.