LanguageModelGRPOLossConfig
Module: fast_llm.layers.language_model.loss.config
Variant of: LanguageModelLossConfig — select with type: grpo
Inherits from: LanguageModelLossConfig
Fields
weight (core)
Type: float
Default: 1.0
Weight for this loss in the total loss computation.
epsilon_high
Type: float
Default: 0.2
Upper clip parameter for the probability ratio (the exponential of the difference in log probs).
epsilon_low
Type: float
Default: 0.2
Lower clip parameter for the probability ratio (the exponential of the difference in log probs).
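To make the role of the two clip parameters concrete, here is a minimal sketch of the standard PPO/GRPO-style clipped surrogate for a single token. It assumes the ratio is clipped to the interval [1 - epsilon_low, 1 + epsilon_high], which is the usual convention but is not confirmed by this page as the library's exact computation. The function name and signature are hypothetical and are not part of fast_llm.

```python
import math


def grpo_clipped_objective(new_logprob, old_logprob, advantage,
                           epsilon_low=0.2, epsilon_high=0.2):
    """Per-token clipped surrogate objective (hypothetical helper).

    Assumes the usual convention: the probability ratio is clipped to
    [1 - epsilon_low, 1 + epsilon_high]. Negate the result to get a loss.
    """
    # Probability ratio between the current and the old policy.
    ratio = math.exp(new_logprob - old_logprob)
    # Clip the ratio from below and above.
    clipped = min(max(ratio, 1.0 - epsilon_low), 1.0 + epsilon_high)
    # Pessimistic choice between the unclipped and clipped terms.
    return min(ratio * advantage, clipped * advantage)
```

With separate lower and upper bounds, the clip range can be made asymmetric, for example by allowing larger upward moves of the ratio than downward ones.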
use_triton (expert)
Type: bool or None
Default: None
Enable triton implementation. Default: use if available.
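As a quick summary, the fields and defaults listed above can be written as a plain dictionary. This is only an illustration: the variable name is hypothetical, and where this block nests inside a full configuration is not covered by this page.

```python
# Hypothetical illustration: a plain dict mirroring the documented fields.
# The "type" key selects this variant; all other values are the defaults.
grpo_loss_config = {
    "type": "grpo",
    "weight": 1.0,        # weight of this loss in the total loss
    "epsilon_high": 0.2,  # upper clip parameter
    "epsilon_low": 0.2,   # lower clip parameter
    "use_triton": None,   # None: use triton if available
}
```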