LanguageModelDistillationLossConfig
Module: fast_llm.layers.language_model.loss.config
Variant of: LanguageModelLossConfig — select with type: distillation
Inherits from: LanguageModelLossConfig
Fields
loss_type (core)
Type: EntropyLossType
Default: "cross_entropy"
Type of loss to use.
weight (core)
Type: float
Default: 1.0
Weight for this loss in the total loss computation.
reference_model (feature)
Type: str
Default: "teacher"
Name of the reference model for knowledge distillation.
temperature (optional)
Type: float
Default: 1.0
Temperature for teacher softmax.
use_triton (expert)
Type: bool or None
Default: None
Enable triton implementation. Default: use if available.
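
To make the roles of temperature and weight concrete, here is a minimal pure-Python sketch of a distillation cross-entropy loss. It is an illustration under stated assumptions, not Fast-LLM's implementation: the function names (softmax, log_softmax, distillation_loss) are hypothetical, the temperature is applied only to the teacher logits (as the field description suggests; whether the student logits are also scaled is not stated here), and weight is assumed to act as a plain multiplier on this loss before it is added to the total.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits):
    """Numerically stable log-softmax."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_z for x in logits]


def distillation_loss(student_logits, teacher_logits, temperature=1.0, weight=1.0):
    """Weighted cross-entropy between teacher probabilities and student log-probabilities.

    Assumption: temperature scales only the teacher logits, and weight
    multiplies the resulting loss.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_log_probs = log_softmax(student_logits)
    cross_entropy = -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))
    return weight * cross_entropy


# Hypothetical logits over a three-token vocabulary.
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]

sharp = softmax(teacher, temperature=1.0)  # default temperature
soft = softmax(teacher, temperature=4.0)   # higher temperature flattens the distribution
loss = distillation_loss(student, teacher, temperature=1.0, weight=1.0)
```

A temperature above 1.0 spreads the teacher's probability mass more evenly across tokens, and a temperature below 1.0 concentrates it on the top token. With the defaults (temperature 1.0, weight 1.0), the sketch reduces to an unscaled cross-entropy against the teacher distribution.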