LanguageModelEmbeddingsConfig

Module: fast_llm.layers.language_model.config

Inherits from: BlockConfig, ModuleConfig

Fields

num_position_embeddings (architecture)

Type: int    Default: 2048

Number of absolute position embeddings, if applicable.

position_embeddings (architecture)

Type: OptionalParameterConfig    Default: (sub-fields optional)

Configuration for the position embedding (weight), if enabled.

vocab_size (architecture)

Type: int    Default: 49152

Size of the vocabulary, i.e., number of vocabulary embeddings and logits.
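As a rough illustration (not Fast-LLM code), the defaults above imply the following embedding parameter counts; the hidden size here is an assumption, since it is set by the parent block config rather than by this one:

```python
# Hedged sketch: parameter counts implied by the documented defaults.
vocab_size = 49152              # default from this config
num_position_embeddings = 2048  # default from this config
hidden_size = 4096              # assumption: comes from the parent block config

# word_embeddings weight: one hidden-size vector per vocabulary entry
word_embedding_params = vocab_size * hidden_size

# position_embeddings weight: one hidden-size vector per position, if enabled
position_embedding_params = num_position_embeddings * hidden_size

print(word_embedding_params)      # 201326592
print(position_embedding_params)  # 8388608
```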

word_embeddings (architecture)

Type: ParameterConfig    Default: (sub-fields optional)

Configuration for the word embedding (weight).

dropout (feature)

Type: float    Default: 0.0

Dropout applied to the embedding layer.

lr_scale (feature)

Type: float or None    Default: None

Scaling factor for this layer's learning rate. Combines multiplicatively with the scales set by the parent and child layers, if applicable.

full_precision_residual (stability)

Type: bool    Default: False

Store the residuals for the model in full precision (optimization_dtype).

vocab_parallel (performance)

Type: bool    Default: True

Allow for tensor-parallel vocabulary embeddings and output weights. Disable to allow for sequence-tensor-parallel input tokens, logits and cross-entropy computation. The sequence-tensor-parallel version typically runs faster, but may incur a small memory cost. Affects RNG for initialization and dropout.
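Taken together, a hedged sketch of how these fields might appear in a YAML configuration file; the field names and defaults come from this page, but the surrounding nesting is an assumption and may differ in an actual Fast-LLM config:

```yaml
# Hypothetical embeddings section (nesting is an assumption)
embeddings:
  vocab_size: 49152
  num_position_embeddings: 2048
  dropout: 0.0
  lr_scale: null
  full_precision_residual: false
  vocab_parallel: true
```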

Used in