Key Concepts

Inference

The process by which a trained AI model generates output for new inputs.

Inference is the execution step that runs each time you send a request through the API. It does not update the model's weights, and it is billed per token consumed.
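As a rough illustration of per-token billing, the sketch below estimates the cost of a single request. The function name `estimate_cost`, the split into separate input and output prices, and the price values are all assumptions made for this example; they are placeholders, not actual rates.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Estimate the cost of one request.

    Prices are expressed per million tokens (an assumption for this sketch);
    the result is in the same currency as the prices.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    )
    return total / 1_000_000


# Placeholder prices, chosen only to make the arithmetic easy to follow.
cost = estimate_cost(
    input_tokens=2_000,
    output_tokens=500,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
)
print(f"Estimated cost: {cost:.4f}")
```

Because no weights are updated, the cost in this sketch depends only on the tokens that pass through the request.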

Related Terms

Throughput

The number of tokens or requests a model can process per second.

Latency

The time between sending a request and receiving the first token of a response.
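The two measurements can be sketched for a single response as follows. This is a minimal example, assuming the response arrives as an iterable of tokens; `measure_stream` and `fake_stream` are hypothetical names, and the throughput computed here covers one stream only, not the capacity of a whole serving system.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float, int]:
    """Consume a token stream and return
    (seconds_to_first_token, tokens_per_second, token_count)."""
    start = clock()
    first_token_at = None
    count = 0
    for _ in tokens:
        if first_token_at is None:
            first_token_at = clock()  # latency is measured at the first token
        count += 1
    end = clock()
    if first_token_at is None:
        raise ValueError("the stream produced no tokens")
    elapsed = end - start
    throughput = count / elapsed if elapsed > 0 else float("inf")
    return first_token_at - start, throughput, count


def fake_stream(n: int, delay_s: float = 0.005) -> Iterator[str]:
    """Hypothetical stand-in for a model response: n tokens with a small delay."""
    for i in range(n):
        time.sleep(delay_s)
        yield f"tok{i}"


latency, tps, n = measure_stream(fake_stream(10))
print(f"first token after {latency:.3f}s, {tps:.1f} tokens/s over {n} tokens")
```

The `clock` parameter is injectable so the function can be checked with fixed timestamps instead of real time.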

Streaming

Delivering model output token-by-token as it is generated rather than waiting for the full response.
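The difference between streamed and buffered delivery can be illustrated with a generator. This is a sketch only: `generate_tokens` is a hypothetical stand-in for a model and does not call any real API.

```python
from typing import Iterator


def generate_tokens(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields one token at a time."""
    for word in f"Echo: {prompt}".split(" "):
        yield word + " "


def buffered(prompt: str) -> str:
    """Non-streaming: collect every token, then return the full text."""
    return "".join(generate_tokens(prompt)).rstrip()


def streamed(prompt: str) -> Iterator[str]:
    """Streaming: pass each token to the caller as soon as it is produced."""
    yield from generate_tokens(prompt)


# The caller can display partial output while generation is still running.
for chunk in streamed("hello world"):
    print(chunk, end="", flush=True)
print()
```

Both paths produce the same text; streaming changes only when the caller sees it.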

Batch Processing

Submitting requests asynchronously in bulk for a 50% price discount.
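The discount arithmetic can be sketched as below. The 50% figure comes from the definition above; the function name `batch_cost` and the example amounts are assumptions for illustration.

```python
BATCH_DISCOUNT = 0.50  # 50% discount, as stated in the definition above


def batch_cost(standard_cost: float, discount: float = BATCH_DISCOUNT) -> float:
    """Return the cost of a workload when submitted as a batch."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return standard_cost * (1.0 - discount)


# Placeholder amount: a workload that would cost 200.0 at the standard rate.
standard = 200.0
print(f"standard: {standard:.2f}, batch: {batch_cost(standard):.2f}")
```

The trade-off is that batch requests are processed asynchronously, so results are not returned immediately.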