MLPConfig¶
Module: fast_llm.layers.decoder.mlp.config
Variant of: MLPBaseConfig — select with type: mlp
Inherits from: MLPBaseConfig, BlockWithBiasConfig, BlockConfig
Fields¶
activation — core
Type: ActivationType
Default: None
The MLP intermediate activation type. Default: SiLU for gated MLP, GeLU otherwise.
add_linear_biases — architecture
Type: bool
Default: True
Add biases to linear layers. May be overridden for individual layers.
gated — architecture
Type: bool
Default: False
Enable gated MLP.
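The `gated`, `activation`, and `intermediate_size` fields interact: a gated MLP applies two parallel first-layer projections and combines them multiplicatively before the second layer, which is why the activation default switches to SiLU. A minimal NumPy sketch of the two variants, for illustration only (the function and weight names are assumptions, not Fast-LLM's actual implementation, and the non-gated path uses the common tanh approximation of GeLU):

```python
import numpy as np

def silu(x):
    # SiLU (a.k.a. swish): x * sigmoid(x), the default for gated MLPs.
    return x * (1.0 / (1.0 + np.exp(-x)))

def gelu_tanh(x):
    # GeLU, tanh approximation, the default for non-gated MLPs.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp(x, w_1, w_2, gated=False, w_gate=None):
    """Illustrative MLP forward pass (biases omitted for brevity).

    Shapes, with hidden size H and `intermediate_size` I:
      w_1: (H, I), w_2: (I, H), w_gate: (H, I) when gated.
    """
    if gated:
        # Gated variant: activation of one projection gates the other.
        h = silu(x @ w_1) * (x @ w_gate)
    else:
        # Plain variant: single projection through GeLU.
        h = gelu_tanh(x @ w_1)
    return h @ w_2
```

Note that the gated variant carries an extra `(H, I)` weight, so for a fixed parameter budget a gated MLP is typically configured with a smaller `intermediate_size`.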
intermediate_size — architecture
Type: int
Default: 4096
Hidden dimension of the MLP intermediate state.
layer_1 — architecture
Type: AffineLinearConfig
Default: (sub-fields optional)
Configuration for the first MLP layer.
layer_2 — architecture
Type: AffineLinearConfig
Default: (sub-fields optional)
Configuration for the second MLP layer.
lr_scale — feature
Type: float or None
Default: None
Scaling factor for the layer learning rate. Combines multiplicatively with the scales set by the parent and child layers, if applicable.
recompute_level — performance
Type: MLPRecomputeLevel
Default: "none"
Set which of the MLP intermediate activations are recomputed during the backward pass. This provides a trade-off between memory and speed.
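Taken together, these fields might appear in a YAML configuration roughly as follows. This is a hedged sketch: the field names and defaults come from this page, but the surrounding key layout, the `silu` spelling of the activation value, and the sub-field names under `layer_1` are assumptions to be checked against the Fast-LLM configuration reference:

```yaml
# Illustrative MLP block configuration (layout is an assumption).
mlp:
  type: mlp                  # selects the MLPConfig variant of MLPBaseConfig
  intermediate_size: 4096    # default shown explicitly
  gated: true
  activation: silu           # optional; would default to SiLU since gated is true
  add_linear_biases: false   # overrides the default of true
  lr_scale: 1.0
  recompute_level: "none"    # documented default; other MLPRecomputeLevel values trade memory for speed
```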