HybridMoEMLPConfig¶
Module: fast_llm.layers.decoder.mlp.config
Variant of: MLPBaseConfig — select with type: hybrid_moe
Inherits from: MLPBaseConfig, BlockWithBiasConfig, BlockConfig
Fields¶
dense (architecture)
Type: MLPConfig | Default: (sub-fields optional)
Configuration for the always-active dense MLP.
post_norm (architecture)
Type: NormalizationConfig or None | Default: None
Optional normalization applied to the MLP output.
pre_norm (architecture)
Type: NormalizationConfig or None | Default: None
Optional normalization applied to the MLP input.
routed (architecture)
Type: MoEMLPConfig | Default: (sub-fields optional)
Configuration for the top-K routed expert MLP.
lr_scale (feature)
Type: float or None | Default: None
Scaling factor for the layer learning rate. Combines multiplicatively with the scales set by the parent and child layers, if applicable.
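To illustrate how these fields fit together, here is a hedged configuration sketch selecting this variant with `type: hybrid_moe`. It uses only the fields documented on this page; the surrounding `mlp:` key and the exact nesting are assumptions about how the block config is embedded, and the sub-fields of `dense` and `routed` are left at their (optional) defaults rather than guessed:

```yaml
# Hypothetical sketch, not a confirmed Fast-LLM config file.
mlp:
  type: hybrid_moe   # selects HybridMoEMLPConfig among MLPBaseConfig variants
  dense: {}          # always-active dense MLP (MLPConfig, defaults)
  routed: {}         # top-K routed expert MLP (MoEMLPConfig, defaults)
  pre_norm: null     # no normalization on the MLP input
  post_norm: null    # no normalization on the MLP output
  lr_scale: 0.5      # halves this layer's LR; multiplies with parent/child scales
```

Because `lr_scale` combines multiplicatively, a parent scale of 0.5 and this layer's 0.5 would yield an effective factor of 0.25 on the base learning rate.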