OptimizerConfig¶
Module: fast_llm.engine.optimizer.config
Fields¶
learning_rate (core)
Type: LearningRateScheduleConfig Default: (sub-fields optional)
A schedule for the learning rate.
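For orientation, a training YAML might configure the schedule as in the sketch below. This is a minimal sketch: the nesting under an `optimizer` key and the sub-field names (`base`, `warmup_iterations`, `decay_iterations`) are assumptions about LearningRateScheduleConfig, not documented on this page.

```yaml
optimizer:
  learning_rate:
    base: 3.0e-4             # assumed sub-field: peak learning rate
    warmup_iterations: 1000  # assumed sub-field: linear warmup steps
    decay_iterations: 100000 # assumed sub-field: decay horizon
```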
weight_decay (core)
Type: float Default: 0.01
Weight decay (AdamW).
beta_1 (optional)
Type: float Default: 0.9
First Adam momentum.
beta_2 (optional)
Type: float Default: 0.999
Second Adam momentum.
default_learning_rate_scale (feature)
Type: float Default: 1.0
Default multiplier to apply to the learning rate schedule for parameters that do not define a scale.
epsilon (optional)
Type: float Default: 1e-08
Regularizer for Adam: the small ε added to the denominator of the update for numerical stability.
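Taken together, beta_1, beta_2, epsilon, and weight_decay correspond to the standard AdamW update below (the textbook formulation with decoupled weight decay, shown for reference rather than transcribed from Fast-LLM's implementation):

$$
\begin{aligned}
m_t &= \beta_1\, m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2\, v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat{m}_t &= \frac{m_t}{1 - \beta_1^{\,t}} \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^{\,t}} \\
\theta_t &= \theta_{t-1} - \eta_t \left( \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} + \lambda\, \theta_{t-1} \right)
\end{aligned}
$$

where $g_t$ is the gradient, $\eta_t$ is the scheduled learning rate, $\lambda$ is weight_decay, and $\epsilon$ is epsilon.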
gradient_norm_clipping (feature)
Type: float Default: 1.0
Clip the gradient norm to this value.
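Assuming the usual global-norm formulation, clipping rescales the full gradient vector $g$ whenever its L2 norm exceeds the threshold $c$ set by gradient_norm_clipping:

$$
g \;\leftarrow\; g \cdot \min\!\left(1,\; \frac{c}{\lVert g \rVert_2}\right)
$$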
gradient_scaler (feature)
Type: GradientScalerConfig Default: (sub-fields optional)
Configuration for fixed or dynamic gradient scaling.
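Putting the scalar fields together, a complete optimizer section might look like the sketch below. The field names and defaults come from this page; the top-level `optimizer` key and the YAML layout are assumptions about how Fast-LLM training configs are structured.

```yaml
optimizer:
  weight_decay: 0.01
  beta_1: 0.9
  beta_2: 0.999
  epsilon: 1.0e-8
  default_learning_rate_scale: 1.0
  gradient_norm_clipping: 1.0
  learning_rate: {}    # sub-fields of LearningRateScheduleConfig, all optional
  gradient_scaler: {}  # sub-fields of GradientScalerConfig, all optional
```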