Architecture

Mixture of Experts (MoE)

An architecture where only a subset of model parameters is activated per token.

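In a Mixture of Experts model, some layers (typically the feed-forward blocks) are replaced by a set of parallel 'expert' sub-networks. A learned router scores the experts for each token and sends the token to only the top few, so only a fraction of the total parameters takes part in computing any one token. This lets MoE models have very large total parameter counts while keeping compute per token close to that of a much smaller dense model, although all experts must still be held in memory. Mixtral is a publicly documented sparse MoE model, and GPT-4 is widely reported, though not officially confirmed, to use MoE.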
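The routing step described above can be illustrated with a minimal sketch in plain Python. It assumes top-k routing with a softmax over the selected experts' scores, which is one common variant and not necessarily what any particular model uses. The names `make_expert` and `moe_forward`, the sizes, and the random weights are all hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    # Hypothetical expert: a single random linear map (dim -> dim).
    # Real experts are usually small feed-forward networks.
    w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

    return expert


def moe_forward(x, router_w, experts, top_k=2):
    # Router: one score (logit) per expert for this token.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_w]
    # Keep only the top_k highest-scoring experts.
    ranked = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Assumption: gate weights are a softmax over the chosen logits only.
    gates = softmax([logits[i] for i in chosen])
    # Only the chosen experts run; the rest are skipped for this token.
    out = [0.0] * len(x)
    for gate, i in zip(gates, chosen):
        y = experts[i](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen, gates


rng = random.Random(0)
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2
experts = [make_expert(DIM, rng) for _ in range(NUM_EXPERTS)]
router_w = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen, gates = moe_forward(token, router_w, experts, TOP_K)
print("experts used:", chosen)
print("gate weights:", gates)
```

With 8 experts and `top_k=2`, each token touches the weights of only 2 experts, which is where the compute saving comes from. Production systems typically add pieces this sketch omits, such as load-balancing losses and batched dispatch across devices.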

Related Terms