MLPBaseConfig

Abstract

This class cannot be instantiated directly. Use one of the variants listed below.

Module: fast_llm.layers.decoder.config

Inherits from: BlockWithBiasConfig, BlockConfig, ModuleConfig

Fields

post_norm (architecture)

Type: NormalizationConfig or None    Default: None

Optional normalization applied to the MLP output.

pre_norm (architecture)

Type: NormalizationConfig or None    Default: None

Optional normalization applied to the MLP input.

lr_scale (feature)

Type: float or None    Default: None

Scaling factor for the layer learning rate. Combines multiplicatively with the scale set by the parent and child layers, if applicable.
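The multiplicative combination of lr_scale values can be illustrated with a short sketch (the helper function and the example scale values are hypothetical, not part of the library API; None is treated as "unset", i.e. a factor of 1.0):

```python
def effective_lr_scale(*scales):
    # Combine lr_scale values from parent, current, and child layers
    # multiplicatively; None means the scale is unset at that level.
    result = 1.0
    for scale in scales:
        if scale is not None:
            result *= scale
    return result

# Example: parent scale 0.5, this layer 0.2, child unset.
print(effective_lr_scale(0.5, 0.2, None))
```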

Variants

Select a variant by setting type: to one of the following values.

| type value | Class | Description |
| --- | --- | --- |
| hybrid_moe | HybridMoEMLPConfig | Configuration for a MoE layer combining an always-active dense MLP with top-k routed experts |
| mlp | MLPConfig | Configuration for a dense feedforward (MLP) layer with optional gating and activation recomputation |
| moe | MoEMLPConfig | Configuration for a Mixture-of-Experts (MoE) feedforward layer with top-k token routing |
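As a rough sketch, selecting a variant in a config file might look like the following YAML fragment (the surrounding mlp: key and the scale value are illustrative; only type and lr_scale are field names documented on this page):

```yaml
# Hypothetical config fragment: pick the dense MLP variant
# and scale its learning rate relative to the rest of the model.
mlp:
  type: mlp        # one of: hybrid_moe, mlp, moe
  lr_scale: 0.5    # combines multiplicatively with parent/child scales
```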

Used in