MixerConfig

Abstract

This class cannot be instantiated directly. Use one of the variants listed below.

Module: fast_llm.layers.decoder.config

Inherits from: BlockWithBiasConfig, BlockConfig, ModuleConfig

Fields

lr_scale (feature)

Type: float or None    Default: None

Scaling factor for the layer learning rate. Combines multiplicatively with the scale set by the parent and child layers, if applicable.
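As a rough illustration of the multiplicative combination described above, the sketch below multiplies whichever scales are set. The helper `combine_lr_scales` is hypothetical and not part of Fast-LLM, and treating `None` as "no scaling" is an assumption made for this example.

```python
from typing import Optional


def combine_lr_scales(*scales: Optional[float]) -> Optional[float]:
    """Multiply all scales that are set; return None if none are set.

    Hypothetical helper, not part of Fast-LLM. Treating None as
    "no scaling" is an assumption made for this sketch.
    """
    result: Optional[float] = None
    for scale in scales:
        if scale is None:
            # Unset scales are skipped rather than treated as zero.
            continue
        result = scale if result is None else result * scale
    return result


# Parent scale 0.5, mixer scale 0.25, child layer leaves lr_scale unset.
effective = combine_lr_scales(0.5, 0.25, None)
print(effective)
```

Under these assumptions, a layer that leaves `lr_scale` unset simply inherits the product of the scales set around it.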

Variants

Select a variant by setting type: to one of the following values.

type value    Class                    Description
attention     AttentionConfig          Configuration for multi-head and grouped-query attention with optional rotary embeddings
stochastic    StochasticMixerConfig    Stochastic mixer that uniformly samples from multiple mixer options during training
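
A minimal sketch of selecting a variant through the `type` key, written as a Python dict rather than a config file. The `VARIANTS` mapping and the `resolve_variant` function are hypothetical and only mirror the variants listed above; they are not Fast-LLM's actual dispatch mechanism.

```python
# Hypothetical lookup table mirroring the variants listed above.
VARIANTS = {
    "attention": "AttentionConfig",
    "stochastic": "StochasticMixerConfig",
}


def resolve_variant(config: dict) -> str:
    """Return the class name selected by the config's `type` value."""
    type_value = config.get("type")
    if type_value not in VARIANTS:
        raise ValueError(
            f"Unknown mixer type {type_value!r}; expected one of {sorted(VARIANTS)}"
        )
    return VARIANTS[type_value]


# lr_scale is the field documented on this page; its value is illustrative.
mixer_config = {"type": "attention", "lr_scale": 0.5}
print(resolve_variant(mixer_config))
```

Because `MixerConfig` itself is abstract, a config with a missing or unrecognized `type` has no class to fall back on in this sketch, so it raises an error.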

Used in