NormalizationConfig
Abstract: this class cannot be instantiated directly. Use one of the variants listed below.
Module: fast_llm.layers.common.normalization.config
Inherits from: ModuleConfig
Fields
lr_scale (feature)
Type: float or None
Default: None
Scaling factor for the layer learning rate. Combines multiplicatively with the scale set by the parent layer and individual parameters, if applicable.
Variants
Select a variant by setting the type field to one of the following values; a configuration sketch follows the table.
| type value | Class | Description |
|---|---|---|
| fixed_rms_norm | FixedRMSNormConfig | RMS normalization without a learnable weight (fixed unit scale). Used for value norms in Gemma-family models. |
| gated_rms_norm | GatedRMSNormalizationConfig | Gated RMS normalization, which applies a learned activation gate alongside the norm weight. |
| layer_norm | LayerNormalizationConfig | Layer normalization. |
| none | NoNormalizationConfig | No normalization applied. |
| rms_norm | RMSNormalizationConfig | RMS normalization with a learnable weight. |
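As an illustration, the following is a minimal YAML sketch (not a complete configuration) of a normalization section that selects the rms_norm variant and sets the lr_scale field described above. The enclosing normalization key is one of the parent fields listed under "Used in" below; the exact nesting depends on where the section is embedded in the full model configuration.

```yaml
# Minimal sketch: a normalization section selecting the rms_norm variant
# and halving its learning-rate scale. The enclosing "normalization" key
# is one of the parent fields listed under "Used in"; all surrounding
# keys of the full model configuration are omitted.
normalization:
  type: rms_norm
  lr_scale: 0.5
```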
Used in
- key_norm in AttentionConfig
- query_norm in AttentionConfig
- value_norm in AttentionConfig
- normalization in DecoderBlockConfig
- post_mixer_normalization in DecoderBlockConfig
- post_mlp_normalization in DecoderBlockConfig
- pre_mixer_normalization in DecoderBlockConfig
- pre_mlp_normalization in DecoderBlockConfig
- post_norm in HybridMoEMLPConfig
- pre_norm in HybridMoEMLPConfig
- normalization in LanguageModelHeadConfig
- post_norm in MLPBaseConfig
- pre_norm in MLPBaseConfig
- post_norm in MLPConfig
- pre_norm in MLPConfig
- post_norm in MoEMLPConfig
- pre_norm in MoEMLPConfig
- router_normalization in MoEMLPConfig
- normalization in PatchEmbeddingsConfig
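For instance, the three AttentionConfig fields above could each select a different variant. The sketch below assumes only the field names from the list and the type values from the variants table; the enclosing keys of the attention (mixer) section are omitted.

```yaml
# Hypothetical attention (mixer) section: per-projection normalization
# fields taken from the "Used in" list above; enclosing keys omitted.
query_norm:
  type: rms_norm
key_norm:
  type: rms_norm
value_norm:
  type: fixed_rms_norm  # fixed unit scale, as used for Gemma-family value norms
```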