DatasetPreparatorDistributedConfig

Module: fast_llm.data.preparation.gpt_memmap.config

Fields

backend — optional

Type: str Default: "gloo"

Distributed backend to use for torch distributed operations, for example "gloo" or "nccl".

timeout — optional

Type: int Default: 3600

Timeout, in seconds, for torch distributed operations.
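As a rough illustration of how an integer timeout in seconds is typically consumed: `torch.distributed.init_process_group` takes its `timeout` argument as a `datetime.timedelta`, so a value like this one would need converting first. The helper name `timeout_to_timedelta` below is hypothetical and not part of Fast-LLM. The sketch uses only the standard library.

```python
from datetime import timedelta


def timeout_to_timedelta(timeout_seconds: int = 3600) -> timedelta:
    """Hypothetical helper: convert a timeout in seconds to a timedelta."""
    if timeout_seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return timedelta(seconds=timeout_seconds)


# The documented default of 3600 seconds corresponds to one hour.
default_timeout = timeout_to_timedelta()
print(default_timeout)  # 1:00:00
```

The resulting `timedelta` could then be passed as the `timeout` keyword of `torch.distributed.init_process_group`, together with the configured backend.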

rank — expert

Type: int Default: None

Rank of the local process. Typically provided by torchrun or equivalent through the RANK environment variable.

world_size — expert

Type: int Default: None

Size of the world group. Typically provided by torchrun or equivalent through the WORLD_SIZE environment variable.
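The following minimal sketch shows one way `rank` and `world_size` could be resolved when they are left as `None`: explicit values win, otherwise the `RANK` and `WORLD_SIZE` environment variables set by torchrun are read. The function name `resolve_rank_and_world_size` is hypothetical. The fallback to rank 0 and world size 1 for a single-process run is an assumption of this sketch, not confirmed behavior of the class.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_rank_and_world_size(
    rank: Optional[int] = None,
    world_size: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[int, int]:
    """Hypothetical helper: fill in rank and world_size from the environment."""
    env = os.environ if env is None else env
    # Assumption: fall back to a single-process setup when the variables are unset.
    if rank is None:
        rank = int(env.get("RANK", 0))
    if world_size is None:
        world_size = int(env.get("WORLD_SIZE", 1))
    # A valid rank always lies in the range [0, world_size).
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is out of range for world_size {world_size}")
    return rank, world_size


# Simulate the environment torchrun would provide to the third of four processes.
print(resolve_rank_and_world_size(env={"RANK": "2", "WORLD_SIZE": "4"}))  # (2, 4)
```

Passing the environment as an argument keeps the helper easy to exercise without launching torchrun. In a real run, omitting `env` reads from `os.environ`.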