NormalizationConfig
Abstract: this class cannot be instantiated directly. Use one of the variants listed below.
Module: fast_llm.layers.common.normalization.config
Inherits from: ModuleConfig
Fields
lr_scale (feature)
Type: float or None
Default: None
Scaling factor for the layer learning rate. Combines multiplicatively with the scale set by the parent layer and individual parameters, if applicable.
Variants
Select a variant by setting the type field to one of the following values; a configuration sketch follows the table.
| type value | Class | Description |
|---|---|---|
| fixed_rms_norm | FixedRMSNormConfig | RMS normalization without a learnable weight (fixed unit scale). Used for value norms in Gemma-family models. |
| gated_rms_norm | GatedRMSNormalizationConfig | Gated RMS normalization, which applies a learned activation gate alongside the norm weight. |
| layer_norm | LayerNormalizationConfig | Layer normalization. |
| none | NoNormalizationConfig | No normalization applied. |
| rms_norm | RMSNormalizationConfig | RMS normalization with a learnable weight. |
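As an illustration, the following is a minimal YAML sketch (not a complete configuration) of a normalization section that selects the rms_norm variant and sets the lr_scale field described above. The enclosing normalization key is one of the parent fields listed under "Used in" below; the exact nesting depends on where the section is embedded in the full model configuration.

```yaml
# Minimal sketch: a normalization section selecting the rms_norm variant
# and halving its learning-rate scale. The enclosing "normalization" key
# is one of the parent fields listed under "Used in"; all surrounding
# keys of the full model configuration are omitted.
normalization:
  type: rms_norm
  lr_scale: 0.5
```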
Used in
- key_norm in AttentionConfig
- query_norm in AttentionConfig
- value_norm in AttentionConfig
- normalization in DecoderBlockConfig
- post_mixer_normalization in DecoderBlockConfig
- post_mlp_normalization in DecoderBlockConfig
- pre_mixer_normalization in DecoderBlockConfig
- pre_mlp_normalization in DecoderBlockConfig
- post_norm in HybridMoEMLPConfig
- pre_norm in HybridMoEMLPConfig
- normalization in LanguageModelHeadConfig
- post_norm in MLPBaseConfig
- pre_norm in MLPBaseConfig
- post_norm in MLPConfig
- pre_norm in MLPConfig
- post_norm in MoEMLPConfig
- pre_norm in MoEMLPConfig
- router_normalization in MoEMLPConfig
- normalization in PatchEmbeddingsConfig
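For instance, the three AttentionConfig fields above could each select a different variant. The sketch below assumes only the field names from the list and the type values from the variants table; the enclosing keys of the attention (mixer) section are omitted.

```yaml
# Hypothetical attention (mixer) section: per-projection normalization
# fields taken from the "Used in" list above; enclosing keys omitted.
query_norm:
  type: rms_norm
key_norm:
  type: rms_norm
value_norm:
  type: fixed_rms_norm  # fixed unit scale, as used for Gemma-family value norms
```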