StageConfig¶
Module: fast_llm.engine.multi_stage.config
Fields¶
full_precision_gradients — optional
Type: bool
Default: True
Reduce and accumulate gradients in fp32 to improve numerical stability.
store_frozen_weights_in_optimization_precision — optional
Type: bool
Default: True
Store frozen weights in full precision even when not strictly needed. This preserves the precision of saved checkpoints, at the cost of extra memory and compute (copy) overhead.
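For illustration, a config fragment setting the two precision-related fields might look like the sketch below. The nesting under `model.multi_stage` is an assumption for illustration only; the field names match this page, but check your Fast-LLM version for the exact config path.

```yaml
# Hypothetical nesting; field names are as documented above.
model:
  multi_stage:
    full_precision_gradients: true  # reduce and accumulate gradients in fp32
    # Disable to save memory when checkpoint precision for frozen weights is not a concern:
    store_frozen_weights_in_optimization_precision: false
```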
debug_activation_memory — logging
Type: bool
Default: False
Log memory usage after each layer.
debug_all_param_gradients — logging
Type: int
Default: 0
Log each parameter gradient after reduction.
debug_global_tensors — logging
Type: bool
Default: True
Reconstruct global tensors for debug logs. Slow, uses a lot of memory, and does not concatenate sequential micro-batches.
debug_layer_gradients — logging
Type: int
Default: 0
Log the (input) gradients of each layer.
debug_layer_outputs — logging
Type: int
Default: 0
Log the output of each layer.
debug_param_gradients — logging
Type: int
Default: 0
Log the gradient shard after reduction.
debug_param_init — logging
Type: int
Default: 0
Log the parameters after initialization.
debug_param_update — logging
Type: int
Default: 0
Log the parameters after each update.
debug_tensor_parallel — logging
Type: bool
Default: False
Check for tensor-parallel desyncs and log an error if one is found. High overhead.
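Putting the logging fields together, a debug-oriented config sketch could look like the following. The nesting is assumed for illustration; the integer-typed fields default to 0, which plausibly means "disabled", with nonzero values enabling the corresponding logs (an assumption inferred from the defaults, not stated on this page).

```yaml
# Hypothetical nesting; field names are as documented above.
model:
  multi_stage:
    debug_layer_outputs: 1       # log the output of each layer
    debug_layer_gradients: 1     # log the (input) gradients of each layer
    debug_activation_memory: true
    debug_tensor_parallel: true  # check for tensor-parallel desyncs (high overhead)
    debug_global_tensors: false  # skip global-tensor reconstruction to save memory
```

Because several of these options are slow or memory-hungry, they are best enabled selectively for short debugging runs rather than in production training configs.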