
StageConfig

Module: fast_llm.engine.multi_stage.config

Fields

full_precision_gradients (optional)

Type: bool    Default: True

Reduce and accumulate gradients in fp32 to improve numerical stability.

store_frozen_weights_in_optimization_precision (optional)

Type: bool    Default: True

Store frozen weights in full precision even if not strictly needed. This preserves their precision in saved checkpoints, at the cost of extra memory and compute (copy) overhead.
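The two precision fields above could be set together in a configuration file. The `stage:` nesting below is an assumption for illustration; only the field names and defaults come from this reference:

```yaml
# Hypothetical nesting; field names and defaults are taken from this reference.
stage:
  # Reduce and accumulate gradients in fp32 for numerical stability (the default).
  full_precision_gradients: true
  # Set to false to save memory when frozen weights do not need to
  # retain full precision in saved checkpoints.
  store_frozen_weights_in_optimization_precision: false
```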

debug_activation_memory (logging)

Type: bool    Default: False

Log memory usage after each layer.

debug_all_param_gradients (logging)

Type: int    Default: 0

Log each parameter gradient after reduction.

debug_global_tensors (logging)

Type: bool    Default: True

Reconstruct global tensors for debug logs (slow, uses a lot of memory, and does not concatenate sequential micro-batches).

debug_layer_gradients (logging)

Type: int    Default: 0

Log the (input) gradients of each layer.

debug_layer_outputs (logging)

Type: int    Default: 0

Log the output of each layer.

debug_param_gradients (logging)

Type: int    Default: 0

Log the gradient shard after reduction.

debug_param_init (logging)

Type: int    Default: 0

Log the parameters after initialization.

debug_param_update (logging)

Type: int    Default: 0

Log the parameters after update.

debug_tensor_parallel (logging)

Type: bool    Default: False

Check for tensor-parallel desyncs and log an error if a desync is found. High overhead.
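For debugging, the logging fields above might be enabled together in a config like the following sketch. The `stage:` nesting is an assumption; the field names, types, and defaults are from this reference. The int-typed fields default to 0 (disabled), so a non-zero value enables the corresponding logging:

```yaml
# Hypothetical nesting; field names are taken from this reference.
stage:
  # Log memory usage after each layer.
  debug_activation_memory: true
  # Non-zero values enable per-layer logging (int-typed fields default to 0).
  debug_layer_outputs: 1
  debug_layer_gradients: 1
  debug_param_gradients: 1
  # Reconstruct global tensors for readable logs (slow and memory-heavy).
  debug_global_tensors: true
  # Check for tensor-parallel desyncs (high overhead).
  debug_tensor_parallel: true
```

Enabling several of these at once is useful for a short diagnostic run, but the global-tensor reconstruction and tensor-parallel desync checks add enough overhead that they are best left off in production training.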