Introduction
Calculate the token count of Claude messages to estimate costs before sending requests. This endpoint does not consume quota; it only performs token counting and never invokes the model.
Authentication
Pass your API key as a Bearer token in the Authorization header, e.g. Authorization: Bearer sk-xxxxxxxxxx
Request Parameters
Claude model identifier. Supported models include:
- claude-opus-4-5-20251101 (recommended replacement for claude-3-opus)
- claude-haiku-4-5-20251001
- claude-sonnet-4-5-20250929
- claude-sonnet-4-20250514
- Other Claude series models
List of conversation messages, each containing a role (user or assistant) and content. content can be a string or an array of content blocks. Supported content types:
- Plain text messages
- Multimodal messages (including images)
- Tool call results
System prompt (optional); can be a string or an array of content blocks. Used to set the model’s behavior and role.
Tool definitions list (optional), used to calculate tokens related to tool calls.
Response Parameters
Total token count of the input, including:
- System prompt tokens
- Tokens for all messages
- Tool definition tokens (if any)
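For example, a successful response is a single JSON object with one field (the number shown is illustrative):

```json
{
  "input_tokens": 2095
}
```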
Basic Examples
Simple Text Message
curl -X POST "https://llm.ai-nebula.com/v1/messages/count_tokens" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ]
  }'
With System Prompt
curl -X POST "https://llm.ai-nebula.com/v1/messages/count_tokens" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "system": "You are a helpful AI assistant.",
    "messages": [
      {
        "role": "user",
        "content": "What is artificial intelligence?"
      }
    ]
  }'
Multi-turn Conversation
curl -X POST "https://llm.ai-nebula.com/v1/messages/count_tokens" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
      {
        "role": "user",
        "content": "Hello"
      },
      {
        "role": "assistant",
        "content": "Hi! How can I help you today?"
      },
      {
        "role": "user",
        "content": "Tell me about the history of AI"
      }
    ]
  }'
Python Example
from anthropic import Anthropic

client = Anthropic(
    api_key="sk-xxxxxxxxxx",
    base_url="https://llm.ai-nebula.com"
)

# Count tokens
response = client.messages.count_tokens(
    model="claude-sonnet-4-5-20250929",
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "Hello, Claude!"}
    ]
)

print(f"Input tokens: {response.input_tokens}")
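If you are not using the official SDK, the same request can be made over plain HTTP with the standard library. A minimal sketch (the URL and headers follow the curl examples above; build_payload and count_tokens are illustrative helper names, not part of any SDK):

```python
import json
import urllib.request

API_URL = "https://llm.ai-nebula.com/v1/messages/count_tokens"

def build_payload(model, messages, system=None, tools=None):
    """Assemble the request body described under Request Parameters."""
    payload = {"model": model, "messages": messages}
    if system is not None:
        payload["system"] = system
    if tools is not None:
        payload["tools"] = tools
    return payload

def count_tokens(api_key, payload):
    """POST the payload and return the input_tokens field of the response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["input_tokens"]
```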
Advanced Use Cases
Tool Definitions
curl -X POST "https://llm.ai-nebula.com/v1/messages/count_tokens" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
      {
        "role": "user",
        "content": "What is the weather in San Francisco?"
      }
    ],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    ]
  }'
Multimodal Content
curl -X POST "https://llm.ai-nebula.com/v1/messages/count_tokens" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is in this image?"
          },
          {
            "type": "image",
            "source": {
              "type": "url",
              "url": "https://example.com/image.jpg"
            }
          }
        ]
      }
    ]
  }'
Use Cases
1. Cost Estimation
Calculate token counts before sending bulk requests to estimate costs:
# Batch cost calculation
messages_batch = [...]  # Batch messages
total_tokens = 0

for messages in messages_batch:
    response = client.messages.count_tokens(
        model="claude-sonnet-4-5-20250929",
        messages=messages
    )
    total_tokens += response.input_tokens

# Calculate total cost based on pricing
cost = total_tokens * price_per_token
print(f"Estimated cost: ${cost:.4f}")
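The value of price_per_token depends on your plan. For instance, with a hypothetical input price of $3 per million tokens (check your provider's actual rates):

```python
# Hypothetical pricing; substitute your provider's actual input-token rate.
PRICE_PER_MILLION_INPUT_TOKENS = 3.00

def estimate_cost(total_tokens: int) -> float:
    """Convert an input token count into an estimated dollar cost."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS
```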
2. Context Window Management
Check if messages exceed the model’s context window limit:
MAX_CONTEXT_WINDOW = 200000  # Claude Sonnet 4.5's context window

response = client.messages.count_tokens(
    model="claude-sonnet-4-5-20250929",
    messages=long_conversation
)

if response.input_tokens > MAX_CONTEXT_WINDOW:
    print(f"Warning: Message tokens ({response.input_tokens}) exceed context window limit")
    # Perform message truncation or summarization
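One simple truncation strategy is to drop the oldest messages until the conversation fits. A sketch, where count_tokens_fn is a stand-in for a wrapper around client.messages.count_tokens:

```python
def truncate_to_fit(messages, count_tokens_fn, max_tokens):
    """Drop the oldest messages until the conversation fits in max_tokens.

    count_tokens_fn takes a list of messages and returns its token count
    (e.g. a function that calls client.messages.count_tokens).
    """
    trimmed = list(messages)
    while trimmed and count_tokens_fn(trimmed) > max_tokens:
        trimmed.pop(0)  # discard the oldest message first
    return trimmed
```

Note that this sketch may leave the conversation starting with an assistant turn; production code should also preserve user/assistant pairing.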
3. Prompt Optimization
Compare token consumption of different prompts:
prompts = [
    "Concise prompt...",
    "Detailed prompt...",
    "Very detailed prompt..."
]

for prompt in prompts:
    response = client.messages.count_tokens(
        model="claude-sonnet-4-5-20250929",
        system=prompt,
        messages=[{"role": "user", "content": "test"}]
    )
    print(f"{len(prompt)} chars -> {response.input_tokens} tokens")
Important Notes
- Image tokens use a fixed estimate (approximately 1,000 tokens); the actual count may vary with resolution
- Output-related parameters such as max_tokens are ignored; only input tokens are counted
- This endpoint does not make an actual model request and does not consume quota
Error Handling
Missing Required Parameters
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Key: 'ClaudeCountTokensRequest.Model' Error:Field validation for 'Model' failed on the 'required' tag"
  }
}
Invalid API Key
{
  "error": {
    "message": "Invalid token",
    "type": "invalid_request_error"
  }
}
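Note that the two error shapes above differ slightly (the validation error adds a top-level "type": "error" wrapper, while both nest details under "error"). A small helper can normalize them; a sketch:

```python
def describe_error(body: dict) -> str:
    """Extract a readable message from either error shape shown above."""
    err = body.get("error", {})
    return f"{err.get('type', 'unknown_error')}: {err.get('message', '')}"
```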