LanguageModelGRPOLossConfig
Module: fast_llm.layers.language_model.loss.config
Variant of: LanguageModelLossConfig — select with type: grpo
Inherits from: LanguageModelLossConfig
Fields
weight (core)
Type: float
Default: 1.0
Weight for this loss in the total loss computation.
metrics (feature)
Type: GRPOMetricsLevel
Default: "none"
Additional GRPO metrics to log.
basic: per-token ratio, KL, and advantage statistics.
with_entropy: also log per-token entropy. Not supported with pipeline_parallel > 1.
epsilon_high
Type: float
Default: 0.2
Upper clip parameter for the ratio of log probs.
epsilon_low
Type: float
Default: 0.2
Lower clip parameter for the ratio of log probs.
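To illustrate how epsilon_low and epsilon_high might interact, here is a minimal sketch of a PPO/GRPO-style clipped objective for a single token. It assumes the usual formulation, in which the probability ratio is clipped to the interval [1 - epsilon_low, 1 + epsilon_high]. This page does not confirm that formula, and the function name clipped_surrogate is hypothetical, not part of Fast-LLM.

```python
import math


def clipped_surrogate(new_logprob, old_logprob, advantage,
                      epsilon_low=0.2, epsilon_high=0.2):
    """Clipped objective for one token (a quantity to maximize).

    Assumption: the standard PPO-style clipping rule; the actual
    Fast-LLM implementation may differ.
    """
    # Probability ratio, computed from the difference of log probs.
    ratio = math.exp(new_logprob - old_logprob)
    # Restrict the ratio to [1 - epsilon_low, 1 + epsilon_high].
    clipped_ratio = min(max(ratio, 1.0 - epsilon_low), 1.0 + epsilon_high)
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


# A ratio of 1.5 with a positive advantage is capped at 1 + epsilon_high.
high = clipped_surrogate(math.log(1.5), 0.0, advantage=1.0)
# A ratio of 0.5 with a negative advantage is capped at 1 - epsilon_low.
low = clipped_surrogate(math.log(0.5), 0.0, advantage=-1.0)
```

Under this assumption, setting epsilon_high larger than epsilon_low allows larger upward moves in token probability than downward ones.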
use_triton (expert)
Type: bool or None
Default: None
Enable the Triton implementation. When None (the default), Triton is used if available.
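As a rough summary, the fields above can be collected into a plain mapping with their documented defaults. The sketch below uses a Python dict for illustration only. Where this block sits inside a full Fast-LLM configuration is not stated on this page, and the helper check_metrics_level is hypothetical. The set of metrics levels is inferred from the default and the two levels described above.

```python
# Field names and defaults are copied from this page; "type": "grpo" selects
# this variant. The enclosing config structure is intentionally not shown.
grpo_loss_config = {
    "type": "grpo",
    "weight": 1.0,
    "metrics": "none",
    "epsilon_low": 0.2,
    "epsilon_high": 0.2,
    "use_triton": None,  # None: use Triton if available
}

# Assumption: these are the only GRPOMetricsLevel values.
KNOWN_METRICS_LEVELS = ("none", "basic", "with_entropy")


def check_metrics_level(config):
    """Hypothetical helper: return True if 'metrics' is a level listed on this page."""
    return config.get("metrics", "none") in KNOWN_METRICS_LEVELS
```

For example, check_metrics_level(grpo_loss_config) returns True for the defaults shown, while a misspelled level such as "entropy" would be rejected.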