LanguageModelDPOLossConfig

Module: fast_llm.layers.language_model.loss.config

Variant of: LanguageModelLossConfig — select with type: dpo

Inherits from: LanguageModelLossConfig

Fields

beta (core)

Type: float    Default: 1.0

Beta parameter for DPO loss (controls strength of preference optimization).
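
To illustrate what `beta` controls, here is a minimal sketch of the standard per-example DPO objective in plain Python. It is not taken from the Fast-LLM implementation; the function and argument names are hypothetical, and the inputs are assumed to be summed log-probabilities of the chosen and rejected responses under the trained policy and the reference model.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 1.0,
) -> float:
    # How much more the policy prefers the chosen response than the reference does.
    policy_margin = policy_chosen_logp - policy_rejected_logp
    reference_margin = ref_chosen_logp - ref_rejected_logp
    # beta scales the preference signal before the log-sigmoid.
    logits = beta * (policy_margin - reference_margin)
    # -log(sigmoid(logits)) is the same as softplus(-logits).
    return softplus(-logits)
```

When the policy and the reference agree, the scaled margin is zero and the loss equals log 2 regardless of `beta`. A larger `beta` makes the loss react more sharply to any difference between the two margins.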

weight (core)

Type: float    Default: 1.0

Weight for this loss in the total loss computation.
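
As a rough illustration, and assuming the total loss is a plain weighted sum of the configured losses (this page does not confirm the exact combination rule), `weight` would act as follows. The helper name `total_loss` is hypothetical.

```python
def total_loss(weighted_losses):
    # weighted_losses: iterable of (weight, loss_value) pairs.
    # Assumption: each loss contributes weight * value to the total.
    return sum(weight * value for weight, value in weighted_losses)
```

Under that assumption, a weight of 0.5 would halve the contribution of the DPO term relative to a loss with weight 1.0.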

reference_model (feature)

Type: str    Default: (required)

Name of the reference model to use for DPO.
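
The fields above can be summarized as a single mapping, written here as a Python dict for illustration. The value `"my_reference"` is a placeholder, not a name defined by Fast-LLM. How this mapping nests inside the wider configuration, and where reference models are declared, is not shown on this page.

```python
# Hypothetical DPO loss entry using only the fields documented on this page.
dpo_loss_config = {
    "type": "dpo",                      # selects LanguageModelDPOLossConfig
    "beta": 1.0,                        # default
    "weight": 1.0,                      # default
    "reference_model": "my_reference",  # required; placeholder name
}
```

Only `reference_model` has no default, so it is the one field that must always be supplied.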