Layers / Decoder / Mlp

Classes

| Class | Description |
| --- | --- |
| `HybridMoEMLPConfig` | Configuration for a MoE layer combining an always-active dense MLP with top-K routed experts. |
| `MLPConfig` | Configuration for a dense feedforward (MLP) layer with optional gating and activation recomputation. |
| `MoEMLPConfig` | Configuration for a Mixture-of-Experts (MoE) feedforward layer with top-k token routing. |
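As a rough illustration of how these three configurations relate, the sketch below models them as plain dataclasses. All field names here are hypothetical, inferred only from the one-line descriptions above; the library's actual attributes and defaults may differ.

```python
from dataclasses import dataclass

# Hypothetical sketches of the three config classes.
# Field names are illustrative, NOT the library's real API.

@dataclass
class MLPConfig:
    """Dense feedforward (MLP) layer config."""
    hidden_size: int
    ffn_hidden_size: int
    gated: bool = False                  # optional gating (e.g. a gated activation)
    recompute_activations: bool = False  # recompute activations in backward to save memory

@dataclass
class MoEMLPConfig:
    """Mixture-of-Experts feedforward layer config."""
    hidden_size: int
    ffn_hidden_size: int
    num_experts: int
    top_k: int = 2                       # each token is routed to its top-k experts

@dataclass
class HybridMoEMLPConfig:
    """MoE layer with an always-active dense branch plus routed experts."""
    dense: MLPConfig                     # always-active dense MLP branch
    moe: MoEMLPConfig                    # top-K routed expert branch
```

In this reading, `HybridMoEMLPConfig` simply composes the other two: every token passes through the dense branch, while the MoE branch adds capacity via sparse top-K routing.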