Pricing & Costs
Batch Processing
Submitting requests asynchronously in bulk for a 50% price discount.
Batch APIs (offered by OpenAI and Anthropic) accept large files of requests and process them with higher latency — typically within 24 hours — in exchange for a 50% discount on both input and output prices. Batch processing is ideal for offline tasks like dataset annotation, document classification, and nightly report generation.
Related Terms
Input Price
The per-million-token cost charged for tokens in your prompt.
Output Price
The per-million-token cost charged for tokens the model generates.
Throughput
The number of tokens or requests a model can process per second.
Latency
The time between sending a request and receiving the first token of a response.