DistributedConfig
Module: fast_llm.engine.distributed.config
Fields
compute_dtype (core)
  Type: DataType
  Default: "float32"
  The data type used for the forward and backward passes.
dp_seed_shift (optional)
  Type: int
  Default: 317_767_863_445_754_100_075_399_033_823
  Seed shift for extra randomness.
inference_seed_shift (optional)
  Type: int
  Default: 220_111_337_975_202_516_901_860_145_957
  Seed shift for extra randomness.
pp_gen_init_seed_shift (optional)
  Type: int
  Default: 631_112_027_069_964_424_381_239_824_623
  Seed shift for extra randomness.
pp_gen_seed_shift (optional)
  Type: int
  Default: 500_460_795_349_110_443_334_056_239_993
  Seed shift for extra randomness.
pp_seed_shift (optional)
  Type: int
  Default: 811_026_271_858_507_938_190_098_775_099
  Seed shift for extra randomness.
sample_seed_shift (optional)
  Type: int
  Default: 751_127_116_949_963_770_353_413_160_199
  Seed shift for extra randomness.
seed (optional)
  Type: int
  Default: 1234
  A seed for training.
timeout (optional)
  Type: float
  Default: 60
  Timeout for distributed operations.
tp_gen_init_seed_shift (optional)
  Type: int
  Default: 894_750_739_684_993_243_926_471_979_237
  Seed shift for extra randomness.
tp_gen_seed_shift (optional)
  Type: int
  Default: 278_779_420_836_085_904_093_221_202_933
  Seed shift for extra randomness.
tp_seed_shift (optional)
  Type: int
  Default: 705_275_193_289_568_515_128_435_800_471
  Seed shift for extra randomness.
train_seed_shift (optional)
  Type: int
  Default: 938_219_878_163_699_459_065_752_841_447
  Seed shift for extra randomness.
valid_seed_shift (optional)
  Type: int
  Default: 683_552_447_587_140_661_489_672_773_353
  Seed shift for extra randomness.
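The seed-shift fields are documented only as adding extra randomness. One plausible reading, shown here as a minimal sketch and not as Fast-LLM's actual derivation, is that a shift is multiplied by an index (for example a rank) and added to the base seed, so that each rank draws from a distinct random stream. The helper `shifted_seed` and the reduction modulo 2**64 are assumptions made for illustration.

```python
# Hypothetical illustration; not taken from Fast-LLM's source.
SEED = 1234  # documented default of `seed`
DP_SEED_SHIFT = 317_767_863_445_754_100_075_399_033_823  # documented default of `dp_seed_shift`


def shifted_seed(base_seed: int, shift: int, index: int, modulus: int = 2**64) -> int:
    """Offset base_seed by index * shift and fold the result into [0, modulus)."""
    return (base_seed + index * shift) % modulus


# Seeds for four hypothetical data-parallel ranks (assumed usage).
seeds = [shifted_seed(SEED, DP_SEED_SHIFT, rank) for rank in range(4)]
```

Under this reading, index 0 keeps the base seed unchanged, and the very large odd defaults make collisions between ranks unlikely.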
pipeline_parallel (performance)
  Type: int
  Default: 1
  Pipeline parallelism group size.
sequence_data_parallel (performance)
  Type: int
  Default: 1
  Sequence data parallelism group size.
sequence_tensor_parallel (performance)
  Type: bool
  Default: False
  Enable sequence tensor parallelism.
tensor_parallel (performance)
  Type: int
  Default: 1
  Tensor parallelism group size.
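The group sizes above have to fit together with the total number of processes. The sketch below assumes the common convention that the data-parallel size is whatever remains after dividing the world size by the tensor-parallel and pipeline-parallel sizes, and that `sequence_data_parallel` must divide that remainder. Both the convention and the helper name `data_parallel_size` are assumptions, not something this page states.

```python
# Hypothetical consistency check; the divisibility rules are assumed, not documented here.
def data_parallel_size(
    world_size: int,
    tensor_parallel: int = 1,
    pipeline_parallel: int = 1,
    sequence_data_parallel: int = 1,
) -> int:
    """Return the implied data-parallel group size, or raise if the sizes do not fit."""
    model_parallel = tensor_parallel * pipeline_parallel
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel * pipeline_parallel = {model_parallel}"
        )
    data_parallel = world_size // model_parallel
    if data_parallel % sequence_data_parallel != 0:
        raise ValueError(
            f"data-parallel size {data_parallel} is not divisible by "
            f"sequence_data_parallel={sequence_data_parallel}"
        )
    return data_parallel


# 16 processes split into 2-way tensor and 2-way pipeline parallelism.
implied = data_parallel_size(16, tensor_parallel=2, pipeline_parallel=2)
```

With the defaults of 1 for every group size, the whole world is treated as a single data-parallel group.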
backend (expert)
  Type: DistributedBackend
  Default: "nccl"
  The distributed backend to use.
force_cpu_initialization (expert)
  Type: bool
  Default: False
  Initialize on CPU even if CUDA is enabled. Useful for matching CPU and CUDA runs.
local_world_size (expert)
  Type: int
  Default: None
  Number of GPUs in each node. Typically provided by torchrun or equivalent through the LOCAL_WORLD_SIZE environment variable.
optimization_dtype (expert)
  Type: DataType
  Default: "float32"
  The data type used for the optimizer.
pipeline_first (expert)
  Type: bool
  Default: False
  Prioritize the pipeline groups for placement of nearby ranks over data groups.
rank (expert)
  Type: int
  Default: None
  Rank of the local process. Typically provided by torchrun or equivalent through the RANK environment variable.
use_cuda (expert)
  Type: bool
  Default: True
  Enable CUDA device.
world_size (expert)
  Type: int
  Default: None
  Size of the world group, i.e., the total number of GPUs. Typically provided by torchrun or equivalent through the WORLD_SIZE environment variable.
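Since rank, world_size and local_world_size default to None and are typically supplied by the launcher, a sketch of how such values could be read from the environment may be useful. The function name `read_launcher_env` and the single-process fallbacks (0, 1, 1) are assumptions for illustration; the page does not say what happens when the variables are unset.

```python
import os
from typing import Mapping, Optional


def read_launcher_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the torchrun-style variables, with assumed single-process fallbacks."""
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),  # fallback 0 is an assumption
        "world_size": int(env.get("WORLD_SIZE", 1)),  # fallback 1 is an assumption
        "local_world_size": int(env.get("LOCAL_WORLD_SIZE", 1)),  # fallback 1 is an assumption
    }


# Example with an explicit mapping instead of the real process environment.
settings = read_launcher_env({"RANK": "3", "WORLD_SIZE": "8", "LOCAL_WORLD_SIZE": "4"})
```

Passing a mapping explicitly keeps the sketch testable without a launcher. Calling `read_launcher_env()` with no argument reads the real process environment.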