Datablocks turn long documents into reusable KV caches. Train once, query millions of times. Up to 85% cost savings with OpenAI-compatible APIs.
import requests

API_KEY = "YOUR_API_KEY"                # your Datablocks API key
document = open("contract.txt").read()  # the long document you want to reuse (example path)

# Train a datablock on your document
response = requests.post(
    "https://api.datablocks.ai/v1/datablocks/train",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "qwen",
        "documents": [{"id": "doc1", "text": document}],
        "datablock_name": "my-datablock"
    }
)
datablock_id = response.json()["id"]  # ID of the new datablock (exact response field may differ)

# Use it for fast, cost-effective inference
response = requests.post(
    "https://api.datablocks.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "qwen",
        "messages": [{"role": "user", "content": "Summarize this."}],
        "datablocks": [{"id": datablock_id, "source": "wandb"}]
    }
)
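Because the chat endpoint is OpenAI-compatible, the reply can be read like any chat completion. A minimal sketch, assuming the standard choices/message response shape:

answer = response.json()["choices"][0]["message"]["content"]  # OpenAI-style schema assumed
print(answer)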
Skip document reprocessing. Datablocks load pre-computed KV caches directly into your model for instant context.
Pay only for datablock loading instead of processing input tokens on every request. Massive savings at scale; a back-of-envelope sketch follows below.
Train once, use millions of times. Perfect for documents you query repeatedly, such as legal contracts or medical records.
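Where do the savings come from? A minimal sketch of the arithmetic, with purely illustrative placeholder prices (not Datablocks' actual pricing): the document is processed once at training time, and each subsequent request pays only for loading the cached tokens plus the short question.

# Back-of-envelope cost comparison; every price below is a hypothetical placeholder
DOC_TOKENS = 100_000    # size of the reused document
QUESTION_TOKENS = 50    # typical question size
REQUESTS = 1_000_000    # how many times the document is queried
INPUT_PRICE = 1e-6      # hypothetical $ per processed input token
CACHED_PRICE = 1.5e-7   # hypothetical $ per token loaded from a datablock

# Without datablocks: the full document is reprocessed on every request
baseline = REQUESTS * (DOC_TOKENS + QUESTION_TOKENS) * INPUT_PRICE

# With datablocks: one training pass, then cache loading plus the question per request
with_datablocks = (
    DOC_TOKENS * INPUT_PRICE
    + REQUESTS * (DOC_TOKENS * CACHED_PRICE + QUESTION_TOKENS * INPUT_PRICE)
)

print(f"baseline:        ${baseline:,.0f}")
print(f"with datablocks: ${with_datablocks:,.0f}")
print(f"savings:         {1 - with_datablocks / baseline:.0%}")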
Process contracts, case law, and regulatory documents. Query the same legal corpus millions of times without reprocessing.
Analyze patient histories, research papers, and clinical trials. HIPAA-compliant infrastructure with enterprise SLAs.
Query earnings reports, SEC filings, and market research. Real-time analysis at a fraction of the cost.
Build AI coding assistants with full repository context. Support 100K+ token codebases efficiently.
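As a rough sketch of the repository use case, each file can be sent as its own entry in the documents array and the resulting datablock reused across many queries. The paths, questions, and response fields below are hypothetical; the request shape mirrors the calls above, assuming the documents array accepts multiple entries:

import pathlib
import requests

API_KEY = "YOUR_API_KEY"  # your Datablocks API key

# Index a repository as one datablock (example path)
repo_files = list(pathlib.Path("my-repo/src").rglob("*.py"))
documents = [{"id": str(p), "text": p.read_text()} for p in repo_files]

train = requests.post(
    "https://api.datablocks.ai/v1/datablocks/train",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "qwen", "documents": documents, "datablock_name": "my-repo"}
)
datablock_id = train.json()["id"]  # exact response field may differ

# Reuse the same datablock for every question about the codebase
for question in ["Where is authentication handled?", "List the public API endpoints."]:
    chat = requests.post(
        "https://api.datablocks.ai/v1/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "qwen",
            "messages": [{"role": "user", "content": question}],
            "datablocks": [{"id": datablock_id, "source": "wandb"}]
        }
    )
    print(chat.json()["choices"][0]["message"]["content"])  # OpenAI-style schema assumed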
Get 1M free tokens to try datablocks. No credit card required.
Get Started Free