OptimizerConfig¶
Module: fast_llm.engine.optimizer.config
Fields¶
learning_rate (core)
Type: LearningRateScheduleConfig Default: (sub-fields optional)
A schedule for the learning rate.
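For orientation, a training YAML might configure the schedule as in the sketch below. This is a minimal sketch: the nesting under an `optimizer` key and the sub-field names (`base`, `warmup_iterations`, `decay_iterations`) are assumptions about LearningRateScheduleConfig, not documented on this page.

```yaml
optimizer:
  learning_rate:
    base: 3.0e-4             # assumed sub-field: peak learning rate
    warmup_iterations: 1000  # assumed sub-field: linear warmup steps
    decay_iterations: 100000 # assumed sub-field: decay horizon
```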
weight_decay (core)
Type: float Default: 0.01
Weight decay (AdamW).
beta_1 (optional)
Type: float Default: 0.9
First Adam momentum.
beta_2 (optional)
Type: float Default: 0.999
Second Adam momentum.
default_learning_rate_scale (feature)
Type: float Default: 1.0
Default multiplier to apply to the learning rate schedule for parameters that do not define a scale.
epsilon (optional)
Type: float Default: 1e-08
Regularizer for Adam: the small ε added to the denominator of the update for numerical stability.
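Taken together, beta_1, beta_2, epsilon, and weight_decay correspond to the standard AdamW update below (the textbook formulation with decoupled weight decay, shown for reference rather than transcribed from Fast-LLM's implementation):

$$
\begin{aligned}
m_t &= \beta_1\, m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2\, v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat{m}_t &= \frac{m_t}{1 - \beta_1^{\,t}} \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^{\,t}} \\
\theta_t &= \theta_{t-1} - \eta_t \left( \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} + \lambda\, \theta_{t-1} \right)
\end{aligned}
$$

where $g_t$ is the gradient, $\eta_t$ is the scheduled learning rate, $\lambda$ is weight_decay, and $\epsilon$ is epsilon.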
gradient_norm_clipping (feature)
Type: float Default: 1.0
Clip the gradient norm to this value.
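Assuming the usual global-norm formulation, clipping rescales the full gradient vector $g$ whenever its L2 norm exceeds the threshold $c$ set by gradient_norm_clipping:

$$
g \;\leftarrow\; g \cdot \min\!\left(1,\; \frac{c}{\lVert g \rVert_2}\right)
$$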
gradient_scaler (feature)
Type: GradientScalerConfig Default: (sub-fields optional)
Configuration for fixed or dynamic gradient scaling.
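Putting the scalar fields together, a complete optimizer section might look like the sketch below. The field names and defaults come from this page; the top-level `optimizer` key and the YAML layout are assumptions about how Fast-LLM training configs are structured.

```yaml
optimizer:
  weight_decay: 0.01
  beta_1: 0.9
  beta_2: 0.999
  epsilon: 1.0e-8
  default_learning_rate_scale: 1.0
  gradient_norm_clipping: 1.0
  learning_rate: {}    # sub-fields of LearningRateScheduleConfig, all optional
  gradient_scaler: {}  # sub-fields of GradientScalerConfig, all optional
```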