Quick Start
Get started with datablocks in under 5 minutes. Train a datablock and start making efficient inference requests.
1. Get Your API Key
Sign up for a datablocks account and generate an API key from your dashboard.
New accounts start with a free trial that includes 1M free tokens.
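Rather than pasting the key into source files, one common approach is to read it from an environment variable. The sketch below assumes a variable named `DATABLOCKS_API_KEY`, which is a hypothetical name chosen for illustration and not something defined by datablocks. It builds the same `Authorization` header used in the requests later in this guide.

```python
import os


def auth_headers(env_var="DATABLOCKS_API_KEY"):
    """Build the Bearer auth header from an environment variable.

    The default variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} to your datablocks API key")
    return {"Authorization": f"Bearer {api_key}"}
```

With this helper in place, `headers=auth_headers()` can stand in for the inline header dictionaries in the examples below, which keeps the key out of version control.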
2. Install the Client Library
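Install the datablocks Python client. The examples in this guide call the HTTP API directly with the requests library, so install that as well:
pip install datablocks-client requests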
3. Train Your First Datablock
Train a datablock on your document to create a reusable KV cache:
import requests

YOUR_API_KEY = "your-api-key-here"

# Your document content
document = """
[Your long document content here - can be up to 100K tokens]
This could be research papers, technical documentation, legal contracts,
medical records, or any other long-form content you need to query repeatedly.
"""

# Train the datablock
response = requests.post(
    # requests needs an absolute URL: prefix this path with your API host
    "/api/v1/datablocks/train",
    headers={"Authorization": f"Bearer {YOUR_API_KEY}"},
    json={
        "model": "qwen",
        "documents": [{"id": "doc1", "text": document}],
        "datablock_name": "my-first-datablock",
        "parameters": {
            "num_learned_tokens": 1024  # KV cache size
        }
    }
)
response.raise_for_status()  # surface HTTP errors before reading the body

datablock_id = response.json()["datablock_id"]
print(f"Training started! Datablock ID: {datablock_id}")

Training typically completes in 5-15 minutes. The datablock ID returned by this request is what you pass to inference requests.
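Because training takes several minutes, a script usually needs to wait before it runs inference. The following is a minimal sketch of a polling helper with a timeout. It assumes the status values `"completed"` and `"failed"` used in the complete example at the end of this page and treats any other value as still in progress. The function name, the `get_status` callable, and the default timeout are illustrative choices, not part of the datablocks API.

```python
import time


def wait_for_datablock(get_status, timeout_s=1800, poll_interval_s=30,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until training finishes, fails, or times out.

    get_status is any zero-argument callable returning the status string,
    for example a small wrapper around a GET to the status endpoint.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "completed":
            return status
        if status == "failed":
            raise RuntimeError("Training failed")
        if clock() >= deadline:
            raise TimeoutError(f"Training not finished after {timeout_s} seconds")
        sleep(poll_interval_s)
```

In practice `get_status` could be a lambda that calls `requests.get` on the `/datablocks/{datablock_id}/status` path and returns `response.json()["status"]`. Passing the status lookup in as a callable keeps the waiting logic separate from the HTTP details.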
4. Run Inference with Your Datablock
Once training completes, use your datablock for fast, cost-effective inference:
import requests

# Reuses YOUR_API_KEY and datablock_id from the training step
response = requests.post(
    # requests needs an absolute URL: prefix this path with your API host
    "/api/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {YOUR_API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "model": "qwen",
        "messages": [
            {
                "role": "user",
                "content": "What are the main findings in this document?"
            }
        ],
        "datablocks": [
            {
                "id": datablock_id,
                "source": "wandb"
            }
        ]
    }
)
response.raise_for_status()  # surface HTTP errors before reading the body

answer = response.json()["choices"][0]["message"]["content"]
print(answer)

What You Just Achieved
Next Steps
Training Guide
Learn advanced training parameters and best practices for optimal compression.
Inference Guide
Explore advanced usage patterns like batch processing and multi-datablock queries.
Examples & Use Cases
See real-world applications across legal, medical, financial, and coding domains.
Try the Playground
Test the API interactively with our web-based playground tool.
Complete Example
Here's a complete end-to-end example combining training and inference:
import requests
import time

API_KEY = "your-api-key-here"
# requests needs an absolute URL: prefix this path with your API host
BASE_URL = "/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Step 1: Train datablock
train_response = requests.post(
    f"{BASE_URL}/datablocks/train",
    headers=HEADERS,
    json={
        "model": "qwen",
        "documents": [{"id": "doc1", "text": "Your long document..."}],
        "datablock_name": "example-datablock"
    }
)
train_response.raise_for_status()
datablock_id = train_response.json()["datablock_id"]

# Step 2: Wait for training (poll status)
while True:
    status_response = requests.get(
        f"{BASE_URL}/datablocks/{datablock_id}/status",
        headers=HEADERS
    )
    status_response.raise_for_status()
    status = status_response.json()["status"]
    if status == "completed":
        break
    elif status == "failed":
        raise RuntimeError("Training failed")
    time.sleep(30)  # Check every 30 seconds

# Step 3: Run inference
inference_response = requests.post(
    f"{BASE_URL}/chat/completions",
    headers=HEADERS,
    json={
        "model": "qwen",
        "messages": [{"role": "user", "content": "Summarize the key points"}],
        "datablocks": [{"id": datablock_id, "source": "wandb"}]
    }
)
inference_response.raise_for_status()
print(inference_response.json()["choices"][0]["message"]["content"])
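When the same datablock is queried repeatedly, it can help to factor the request body and the response parsing into small functions. The sketch below is one possible arrangement, not an official client interface: `build_chat_payload` and `extract_answer` are hypothetical helper names, and the field names mirror the request and response shapes shown in the examples above. Accepting several datablock IDs assumes the `datablocks` list may hold more than one entry, which this page does not confirm.

```python
def build_chat_payload(question, datablock_ids, model="qwen", source="wandb"):
    """Build the JSON body for a chat completion that references datablocks."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "datablocks": [{"id": db_id, "source": source} for db_id in datablock_ids],
    }


def extract_answer(response_body):
    """Return the first choice's message text, or raise a descriptive error."""
    try:
        return response_body["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"Unexpected response shape: {response_body!r}") from exc
```

The payload can then be sent with `requests.post(..., json=build_chat_payload(question, [datablock_id]))` and the result read with `extract_answer(response.json())`, so a malformed or error response produces a clear message instead of a bare `KeyError`.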