Grundkonzepte
Tokenizer
The algorithm that converts raw text into a sequence of tokens for a language model.
A tokenizer splits input text into tokens using algorithms like Byte-Pair Encoding (BPE) or SentencePiece. Each model has its own vocabulary and tokenizer — the same text will produce different token counts across providers. This matters for pricing: more efficient tokenization means fewer tokens per character, reducing your API bill. OpenAI's tiktoken and Anthropic's tokenizer can produce meaningfully different counts.