LanguageModelDistillationLossConfig

Module: fast_llm.layers.language_model.loss.config

Variant of: LanguageModelLossConfig — select with type: distillation

Inherits from: LanguageModelLossConfig

Fields

loss_type (core)

Type: EntropyLossType    Default: "cross_entropy"

Type of loss to use.

weight (core)

Type: float    Default: 1.0

Weight for this loss in the total loss computation.
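To illustrate what a per-loss weight typically does, here is a minimal sketch that assumes the total loss is a plain weighted sum of the individual loss values. The helper `total_loss` and the example numbers are hypothetical and are not part of Fast-LLM.

```python
def total_loss(losses):
    """Combine (weight, value) pairs into one scalar.

    Assumption: the total is a simple weighted sum; the reference
    text above does not spell out the exact combination rule.
    """
    return sum(weight * value for weight, value in losses)


# Hypothetical values: a language-model loss with weight 1.0 and a
# distillation loss with weight 0.5.
combined = total_loss([(1.0, 2.0), (0.5, 1.5)])
```

Under this assumption, lowering `weight` reduces how much the distillation term contributes relative to the other losses.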

reference_model (feature)

Type: str    Default: "teacher"

Name of the reference model for knowledge distillation.

temperature (optional)

Type: float    Default: 1.0

Temperature for teacher softmax.
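The effect of a softmax temperature can be sketched in plain Python. This is an illustration of the general technique, not Fast-LLM's implementation: the function names are hypothetical, and the sketch assumes the temperature is applied to the teacher logits only, as the description suggests. Many distillation formulations also scale the student logits or multiply the loss by the squared temperature; whether Fast-LLM does so is not stated here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits):
    """Numerically stable log-softmax (temperature 1)."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


def distillation_cross_entropy(student_logits, teacher_logits, temperature=1.0):
    """Cross-entropy of the student against the teacher's soft targets.

    Assumption: temperature is applied to the teacher softmax only.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_log_probs = log_softmax(student_logits)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


teacher = [1.0, 2.0, 3.0]
sharp = softmax(teacher, temperature=1.0)
flat = softmax(teacher, temperature=4.0)  # higher temperature flattens the targets
```

A temperature above 1.0 spreads probability mass over more tokens, so the student also receives signal from the teacher's lower-ranked predictions; a temperature of 1.0 leaves the teacher distribution unchanged.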

use_triton (expert)

Type: bool or None    Default: None

Enable the Triton implementation. If unset (None), Triton is used when available.
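For reference, the fields above can be collected into a single loss entry. The sketch below writes it as a Python mapping that mirrors the documented names and defaults; the variable name `distillation_loss` is hypothetical, and where this entry is nested inside a full Fast-LLM configuration is not shown here.

```python
# Hypothetical loss entry; keys and default values are taken from the
# field list above. Its position in the full config is an assumption
# left open on purpose.
distillation_loss = {
    "type": "distillation",          # selects LanguageModelDistillationLossConfig
    "loss_type": "cross_entropy",    # default EntropyLossType
    "weight": 1.0,                   # weight in the total loss
    "reference_model": "teacher",    # name of the reference (teacher) model
    "temperature": 1.0,              # teacher softmax temperature
    "use_triton": None,              # None: use Triton if available
}
```

Only `type` is needed to select this variant; the remaining keys are listed with their defaults and can be omitted if those defaults are acceptable.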