Dataset Warnings Config

🔵 Default value: DatasetWarningsOptions()

The thresholds below determine when each dataset warning is triggered, so adjusting them changes how many warnings appear in the Dataset Warnings.

from pydantic import BaseModel

class DatasetWarningsOptions(BaseModel):
    min_num_per_class: int = 20 # (1)
    max_delta_class_imbalance: float = 0.5 # (2)
    max_delta_representation: float = 0.05 # (3)
    max_delta_mean_words: float = 3.0 # (4)
    max_delta_std_words: float = 3.0 # (5)
  1. Threshold for the first set of warnings (missing samples).
  2. Threshold for the second set of warnings (class imbalance).
  3. Threshold for the third set of warnings (representation mismatch).
  4. Threshold for the fourth set of warnings (length mismatch), applied to the mean number of words.
  5. Threshold for the fourth set of warnings (length mismatch), applied to the standard deviation of the number of words.
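
To make the first threshold concrete, here is a minimal sketch of how a check driven by min_num_per_class could work. It is an illustration only, not the library's implementation: the helper classes_below_minimum and the sample labels are hypothetical, and it assumes the warning fires when a class has fewer samples than the threshold.

```python
from collections import Counter
from typing import Dict, List


def classes_below_minimum(labels: List[str], min_num_per_class: int = 20) -> Dict[str, int]:
    """Return each class whose sample count is below the threshold (hypothetical helper)."""
    counts = Counter(labels)
    return {name: n for name, n in counts.items() if n < min_num_per_class}


# Hypothetical label column: 25 "positive" samples and 5 "negative" samples.
labels = ["positive"] * 25 + ["negative"] * 5
flagged = classes_below_minimum(labels)
print(flagged)  # {'negative': 5}
```

With the threshold raised to 40, both classes in this toy data would be flagged, which is why a higher value yields more warnings.
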
For example, to set min_num_per_class to 40 in the config:

{
  "dataset_warnings": {
    "min_num_per_class": 40
  }
}
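
The snippet below sketches how a partial override such as the one above combines with the defaults. It assumes the dataset_warnings section is passed as keyword arguments to the model; how the application actually loads its config file is not shown here.

```python
from pydantic import BaseModel


class DatasetWarningsOptions(BaseModel):
    min_num_per_class: int = 20
    max_delta_class_imbalance: float = 0.5
    max_delta_representation: float = 0.05
    max_delta_mean_words: float = 3.0
    max_delta_std_words: float = 3.0


# Assumption: the config has already been parsed into a dict.
config = {"dataset_warnings": {"min_num_per_class": 40}}

# Only the overridden field changes; every other field keeps its default.
options = DatasetWarningsOptions(**config["dataset_warnings"])
print(options.min_num_per_class)          # 40
print(options.max_delta_class_imbalance)  # 0.5
```

Because every field has a default, the config only needs to list the thresholds that differ from DatasetWarningsOptions().
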