DecoderBlockConfig

Module: fast_llm.layers.decoder.config

Variant of: BlockConfig — select with type: decoder

Inherits from: BlockConfig, ModuleConfig

Fields

mixer (architecture)

Type: MixerConfig    Default: (sub-fields optional)

Configuration for the attention/mixer layer.

mlp (architecture)

Type: MLPBaseConfig    Default: (sub-fields optional)

Configuration for the feedforward (MLP) layer.

normalization (architecture)

Type: NormalizationConfig    Default: (sub-fields optional)

Configuration for the block normalization layers.
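As a rough illustration of how the three architecture fields fit together, the sketch below writes a block configuration as a plain Python dict. Only the field names and the `type: decoder` selector come from this page; the dict form itself and the variable name `decoder_block` are assumptions made for illustration, and the sub-fields are left empty because they are optional.

```python
# Hypothetical block configuration written as a plain dict (illustration only).
# Field names are taken from this page; sub-fields are omitted because they
# are optional and are documented on their own config pages.
decoder_block = {
    "type": "decoder",      # selects the DecoderBlockConfig variant of BlockConfig
    "mixer": {},            # MixerConfig sub-fields
    "mlp": {},              # MLPBaseConfig sub-fields
    "normalization": {},    # NormalizationConfig sub-fields
}
```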

distillation_loss_weight (feature)

Type: float    Default: 1.0

Weight used to scale the activation distillation loss.

distillation_model (feature)

Type: str or None    Default: None

Name of the reference model to use for activation-level distillation.
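The two distillation fields can be pictured with a small sketch. It assumes the activation distillation loss is a mean squared error between the block's activations and those of the reference model, scaled by `distillation_loss_weight`; the actual loss used by the library may differ, and the function name `activation_distillation_loss` is hypothetical.

```python
def activation_distillation_loss(student, teacher, weight=1.0):
    """Weighted mean squared error between two equal-length activation vectors.

    Assumption: MSE stands in for whatever distance the library really uses.
    `weight` plays the role of distillation_loss_weight.
    """
    if len(student) != len(teacher):
        raise ValueError("student and teacher activations must have the same length")
    if not student:
        raise ValueError("activations must not be empty")
    mse = sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)
    return weight * mse
```

With the default weight of 1.0 the loss is passed through unscaled; a smaller weight reduces its contribution relative to other training losses.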

dropout (feature)

Type: float    Default: 0.0

Dropout applied to the residual connections.
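One common reading of residual dropout is that the sub-layer output is randomly zeroed (and rescaled) before being added back to the block input. The sketch below follows that reading in pure Python; it is an assumption about the behaviour, not a copy of the library's implementation, and `residual_with_dropout` is a hypothetical helper.

```python
import random


def residual_with_dropout(x, sublayer_out, p=0.0, training=True, rng=random):
    """Return x + dropout(sublayer_out) element-wise, using inverted dropout.

    Assumption: dropout acts on the sub-layer output before the residual add.
    `p` plays the role of the `dropout` field.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if len(x) != len(sublayer_out):
        raise ValueError("inputs must have the same length")
    if not training or p == 0.0:
        # With the default of 0.0 this is a plain residual connection.
        return [a + b for a, b in zip(x, sublayer_out)]
    keep = 1.0 - p
    # Keep each element with probability `keep` and rescale the survivors.
    dropped = [b / keep if rng.random() < keep else 0.0 for b in sublayer_out]
    return [a + b for a, b in zip(x, dropped)]
```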

lr_scale (feature)

Type: float or None    Default: None

Scaling factor for the layer learning rate. Combines multiplicatively with the scale set by the parent and child layers, if applicable.
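The multiplicative combination can be sketched as follows. The helper `effective_lr_scale` is hypothetical, and treating `None` as "no scaling" (a factor of 1.0) is an assumption based on the default value shown above.

```python
def effective_lr_scale(*scales):
    """Multiply the lr_scale values set along a parent/child chain.

    Assumption: a value of None means the layer sets no scale and is skipped.
    """
    result = 1.0
    for scale in scales:
        if scale is not None:
            result *= scale
    return result
```

For example, a parent scale of 0.5, an unset block scale, and a child scale of 0.5 would combine to 0.25 under these assumptions.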