DistributedConfig

Module: fast_llm.engine.distributed.config

Fields

compute_dtype (core)

Type: DataType    Default: "float32"

The data type used for the forward and backward passes.

dp_seed_shift (optional)

Type: int    Default: 317_767_863_445_754_100_075_399_033_823

Seed shift for extra randomness.

inference_seed_shift (optional)

Type: int    Default: 220_111_337_975_202_516_901_860_145_957

Seed shift for extra randomness.

pp_gen_init_seed_shift (optional)

Type: int    Default: 631_112_027_069_964_424_381_239_824_623

Seed shift for extra randomness.

pp_gen_seed_shift (optional)

Type: int    Default: 500_460_795_349_110_443_334_056_239_993

Seed shift for extra randomness.

pp_seed_shift (optional)

Type: int    Default: 811_026_271_858_507_938_190_098_775_099

Seed shift for extra randomness.

sample_seed_shift (optional)

Type: int    Default: 751_127_116_949_963_770_353_413_160_199

Seed shift for extra randomness.

seed (optional)

Type: int    Default: 1234

A seed for training.

timeout (optional)

Type: float    Default: 60

Timeout for distributed operations.

tp_gen_init_seed_shift (optional)

Type: int    Default: 894_750_739_684_993_243_926_471_979_237

Seed shift for extra randomness.

tp_gen_seed_shift (optional)

Type: int    Default: 278_779_420_836_085_904_093_221_202_933

Seed shift for extra randomness.

tp_seed_shift (optional)

Type: int    Default: 705_275_193_289_568_515_128_435_800_471

Seed shift for extra randomness.

train_seed_shift (optional)

Type: int    Default: 938_219_878_163_699_459_065_752_841_447

Seed shift for extra randomness.

valid_seed_shift (optional)

Type: int    Default: 683_552_447_587_140_661_489_672_773_353

Seed shift for extra randomness.
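The seed-shift fields above pair the base seed with a large per-purpose constant so that different random streams (training, validation, inference, pipeline generation, etc.) stay decorrelated. The exact combination rule is internal to Fast-LLM; the sketch below only illustrates the idea, assuming a simple additive combination reduced to a 64-bit seed (`shifted_seed` is a hypothetical helper, not part of the library):

```python
# Hypothetical illustration of combining a base seed with a per-purpose
# seed shift into a 64-bit generator seed. Fast-LLM's actual derivation
# is internal; this only shows why the shifts are large, distinct constants.
def shifted_seed(seed: int, shift: int) -> int:
    # Reduce modulo 2**64 so the result fits a 64-bit RNG seed.
    return (seed + shift) % 2**64

TRAIN_SEED_SHIFT = 938_219_878_163_699_459_065_752_841_447  # train_seed_shift default
VALID_SEED_SHIFT = 683_552_447_587_140_661_489_672_773_353  # valid_seed_shift default

# The same base seed yields unrelated streams for different purposes.
train_seed = shifted_seed(1234, TRAIN_SEED_SHIFT)
valid_seed = shifted_seed(1234, VALID_SEED_SHIFT)
```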

pipeline_parallel (performance)

Type: int    Default: 1

Pipeline parallelism group size.

sequence_data_parallel (performance)

Type: int    Default: 1

Sequence data parallelism group size.

sequence_tensor_parallel (performance)

Type: bool    Default: False

Enable sequence tensor parallelism.

tensor_parallel (performance)

Type: int    Default: 1

Tensor parallelism group size.

backend (expert)

Type: DistributedBackend    Default: "nccl"

The distributed backend to use.

force_cpu_initialization (expert)

Type: bool    Default: False

Initialize on CPU even if CUDA is enabled. Useful for matching CPU and CUDA runs.

local_world_size (expert)

Type: int    Default: None

Number of GPUs in each node. Typically provided by torchrun or equivalent through the LOCAL_WORLD_SIZE environment variable.

optimization_dtype (expert)

Type: DataType    Default: "float32"

The data type used for the optimizer.

pipeline_first (expert)

Type: bool    Default: False

Prioritize pipeline groups over data groups when placing nearby ranks.

rank (expert)

Type: int    Default: None

Rank of the local process. Typically provided by torchrun or equivalent through the RANK environment variable.

use_cuda (expert)

Type: bool    Default: True

Enable CUDA device.

world_size (expert)

Type: int    Default: None

Size of the world group, i.e., the total number of GPUs. Typically provided by torchrun or equivalent through the WORLD_SIZE environment variable.
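As noted for rank, local_world_size, and world_size, these values normally come from the launcher rather than explicit configuration. The snippet below shows the standard environment variables that torchrun exports (the single-process fallbacks are assumptions for illustration):

```python
import os

# torchrun (and compatible launchers) export these standard variables.
# The defaults here are only fallbacks for a single-process run.
rank = int(os.environ.get("RANK", 0))
local_world_size = int(os.environ.get("LOCAL_WORLD_SIZE", 1))
world_size = int(os.environ.get("WORLD_SIZE", 1))
```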