MLPConfig

Module: fast_llm.layers.decoder.mlp.config

Variant of: MLPBaseConfig — select with type: mlp

Inherits from: MLPBaseConfig, BlockWithBiasConfig, BlockConfig
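As a rough illustration, the fields on this page could be gathered into a configuration mapping like the one below. This is a sketch only: it is written as a plain Python dictionary, the name `mlp_config` is hypothetical, and where this block nests inside a full model configuration is not shown on this page. Every value is the documented default except `gated`.

```python
# Hypothetical sketch: an MLP configuration expressed as a plain Python dict.
# Keys and defaults come from the field list on this page; the surrounding
# model configuration is intentionally omitted.
mlp_config = {
    "type": "mlp",              # selects the MLPConfig variant of MLPBaseConfig
    "gated": True,              # documented default is False
    "intermediate_size": 4096,  # documented default
    "add_linear_biases": True,  # documented default
    "recompute_level": "none",  # documented default
    "lr_scale": None,           # documented default
    # "activation" is left unset, so it falls back to SiLU for a gated MLP.
}
```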

Fields

activation (core)

Type: ActivationType    Default: None

The MLP intermediate activation type. When left unset (None), it defaults to SiLU for a gated MLP and GeLU otherwise.
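The fallback rule for an unset activation can be sketched as a small helper. The function name `resolve_activation` and the string labels `"silu"` and `"gelu"` are illustrative assumptions, not necessarily the identifiers used by `ActivationType`.

```python
from typing import Optional


def resolve_activation(activation: Optional[str], gated: bool) -> str:
    """Return the explicit activation, or the documented fallback when unset."""
    if activation is not None:
        return activation
    # Documented fallback: SiLU for a gated MLP, GeLU otherwise.
    # The labels below are placeholders, not confirmed enum names.
    return "silu" if gated else "gelu"
```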

add_linear_biases (architecture)

Type: bool    Default: True

Add biases to linear layers. May be overridden for individual layers.

gated (architecture)

Type: bool    Default: False

Enable gated MLP.
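For readers unfamiliar with the term, a gated MLP in its common formulation multiplies an activated "gate" projection elementwise with a second "up" projection before projecting back down. The pure-Python sketch below illustrates that general idea only. The names `gated_mlp`, `w_gate`, `w_up`, and `w_down` are hypothetical, biases are omitted, and this page does not specify how the projections map onto `layer_1` and `layer_2`.

```python
import math


def silu(x: float) -> float:
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def gated_mlp(x, w_gate, w_up, w_down):
    """Common gated MLP form: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]  # elementwise gating
    return matvec(w_down, hidden)
```

In this sketch the number of rows in `w_gate` and `w_up` plays the role of the intermediate size.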

intermediate_size (architecture)

Type: int    Default: 4096

Hidden dimension of the MLP intermediate state.

layer_1 (architecture)

Type: AffineLinearConfig    Default: (sub-fields optional)

Configuration for the first MLP layer.

layer_2 (architecture)

Type: AffineLinearConfig    Default: (sub-fields optional)

Configuration for the second MLP layer.

lr_scale (feature)

Type: float or None    Default: None

Scaling factor for the layer learning rate. Combines multiplicatively with the scale set by the parent and child layers, if applicable.
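The multiplicative combination can be sketched as follows. The helper `combine_lr_scales` is hypothetical, and it assumes that a value of None means "no scaling at this level" and is simply skipped; the page does not state how unset values are treated.

```python
from typing import Optional


def combine_lr_scales(*scales: Optional[float]) -> Optional[float]:
    """Multiply the scales that are set; return None when none are set."""
    result = None
    for scale in scales:
        if scale is None:
            # Assumption: an unset level contributes no scaling.
            continue
        result = scale if result is None else result * scale
    return result
```

For example, a parent scale of 0.5 combined with a layer scale of 0.5 would give an effective factor of 0.25 under this assumption.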

recompute_level (performance)

Type: MLPRecomputeLevel    Default: "none"

Selects which MLP intermediate activations are recomputed during the backward pass. Recomputing more activations reduces memory use at the cost of extra computation.