OptimizerConfig

Module: fast_llm.engine.optimizer.config

Fields

learning_rate (core)

Type: LearningRateScheduleConfig    Default: (sub-fields optional)

A schedule for the learning rate.

weight_decay (core)

Type: float    Default: 0.01

Weight decay coefficient (AdamW-style, decoupled from the gradient update).

beta_1 (optional)

Type: float    Default: 0.9

First Adam momentum coefficient (exponential moving average of the gradient).

beta_2 (optional)

Type: float    Default: 0.999

Second Adam momentum coefficient (exponential moving average of the squared gradient).

default_learning_rate_scale (feature)

Type: float    Default: 1.0

Default multiplier to apply to the learning rate schedule, for parameters that do not define a scale.

epsilon (optional)

Type: float    Default: 1e-08

Numerical stability term for Adam, added to the denominator of the update to prevent division by zero.
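To illustrate how beta_1, beta_2, epsilon, and weight_decay interact, here is a minimal standalone sketch of a single AdamW step. This is a generic re-implementation for illustration only, not Fast-LLM's actual code; the defaults mirror the field defaults documented on this page.

```python
def adam_step(param, grad, m, v, step, lr=1e-3,
              beta_1=0.9, beta_2=0.999, epsilon=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter (illustrative sketch)."""
    # Decoupled weight decay (AdamW): shrink the parameter directly,
    # independently of the gradient-based update.
    param = param * (1.0 - lr * weight_decay)
    # Exponential moving averages of the gradient and its square.
    m = beta_1 * m + (1.0 - beta_1) * grad
    v = beta_2 * v + (1.0 - beta_2) * grad * grad
    # Bias correction for the zero-initialized moving averages.
    m_hat = m / (1.0 - beta_1 ** step)
    v_hat = v / (1.0 - beta_2 ** step)
    # epsilon keeps the denominator away from zero.
    param = param - lr * m_hat / (v_hat ** 0.5 + epsilon)
    return param, m, v
```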

gradient_norm_clipping (feature)

Type: float    Default: 1.0

Clip the gradient norm to this value.
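Gradient-norm clipping rescales all gradients by a common factor whenever their global L2 norm exceeds the threshold. A minimal sketch of the idea (a generic re-implementation, not Fast-LLM's internals):

```python
import math

def clip_grad_norm(grads, max_norm=1.0):
    """Scale gradients so their global L2 norm does not exceed max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        # All gradients share one scale factor, preserving their direction.
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm
```

Because every gradient is multiplied by the same factor, clipping limits the step size without changing the update direction.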

gradient_scaler (feature)

Type: GradientScalerConfig    Default: (sub-fields optional)

Configuration for the fixed or dynamic gradient scaling.
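Putting the fields above together, a configuration for this section might look like the following hypothetical YAML fragment. The field names come from this page, but the `optimizer:` nesting and the `base` sub-field of the learning-rate schedule are assumptions, not confirmed by the source.

```yaml
optimizer:
  learning_rate:
    base: 1.0e-4        # sub-field name assumed
  weight_decay: 0.01
  beta_1: 0.9
  beta_2: 0.999
  epsilon: 1.0e-8
  gradient_norm_clipping: 1.0
```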
