DatasetPreparatorDistributedConfig

Module: fast_llm.data.preparation.gpt_memmap.config

Fields

backend (optional)

Type: str    Default: "gloo"

Backend to use for torch.distributed communication (for example, "gloo").

timeout (optional)

Type: int    Default: 3600

Timeout, in seconds, for torch.distributed operations.
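
As a rough illustration of how these two fields are typically consumed, the sketch below builds keyword arguments for `torch.distributed.init_process_group`, which accepts a `backend` string and a `timeout` given as a `datetime.timedelta`. The helper name `build_process_group_kwargs` is hypothetical and not part of Fast-LLM, and whether Fast-LLM passes the values in exactly this way is an assumption.

```python
from datetime import timedelta


def build_process_group_kwargs(backend: str = "gloo", timeout: int = 3600) -> dict:
    """Hypothetical helper: turn the two config fields into keyword arguments."""
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    # init_process_group takes its timeout as a timedelta, not as an int.
    return {"backend": backend, "timeout": timedelta(seconds=timeout)}


kwargs = build_process_group_kwargs()
# With PyTorch installed, these could be forwarded, together with rank and
# world_size, to torch.distributed.init_process_group(**kwargs).
```

Keeping the field as an integer number of seconds makes it easy to set from a config file; the conversion to `timedelta` happens only at the point of use.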

rank (expert)

Type: int    Default: None

Rank of the local process. Typically provided by torchrun or equivalent through the RANK environment variable.

world_size (expert)

Type: int    Default: None

Size of the world group. Typically provided by torchrun or equivalent through the WORLD_SIZE environment variable.
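
The following is a minimal sketch of how `rank` and `world_size` left at `None` could be filled in from the `RANK` and `WORLD_SIZE` environment variables that torchrun sets. The `DistributedSettings` dataclass and the `resolve` function are hypothetical stand-ins that mirror the documented fields; they are not Fast-LLM's classes, and the single-process fallback (rank 0, world size 1) is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class DistributedSettings:
    """Hypothetical stand-in mirroring the documented fields."""

    backend: str = "gloo"
    timeout: int = 3600
    rank: Optional[int] = None
    world_size: Optional[int] = None


def resolve(
    settings: DistributedSettings, env: Mapping[str, str] = os.environ
) -> DistributedSettings:
    """Return a copy of the settings with rank and world_size filled in."""
    # Explicit values win; otherwise read what the launcher exported.
    # Assumed fallback when nothing is set: a single-process run.
    rank = settings.rank if settings.rank is not None else int(env.get("RANK", 0))
    world_size = (
        settings.world_size
        if settings.world_size is not None
        else int(env.get("WORLD_SIZE", 1))
    )
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside the range [0, {world_size})")
    return DistributedSettings(settings.backend, settings.timeout, rank, world_size)


# Simulate the environment of the third process in a four-process launch.
resolved = resolve(DistributedSettings(), env={"RANK": "2", "WORLD_SIZE": "4"})
```

Under a launcher such as `torchrun`, calling `resolve(DistributedSettings())` with no `env` argument would read the real process environment, which is why these fields rarely need to be set by hand.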

Used in