Similarity Analysis Config

Default value: SimilarityOptions()

Environment Variable: SIMILARITY

The Similarity Analysis page in Key Concepts explains how the different configuration attributes affect the analysis results. Note that language-related defaults are dynamically selected based on the language specified in the Language Config (default is English).

If your machine has limited computing power, similarity can be set to null to disable the analysis. It can be enabled later in the application.

Class Definition | Config Example | Disabling Similarity Analysis | English defaults | French defaults

from pydantic import BaseModel

class SimilarityOptions(BaseModel):
    faiss_encoder: str = "" # Language-based default value # (1)
    conflicting_neighbors_threshold: float = 0.9 # (2)
    no_close_threshold: float = 0.5 # (3)

(1) Language model used to compute utterance embeddings for similarity analysis. The name of your encoder must be supported by sentence-transformers.
(2) Threshold on the ratio of neighboring utterances that should belong to another class for the smart tags conflicting_neighbors_train/conflicting_neighbors_eval.
(3) Cosine similarity threshold for the smart tags no_close_train/no_close_eval.
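One way to read the two thresholds is sketched below. This is only an illustration, not the tool's actual implementation: the helper `similarity_tags`, its arguments, and the comparison directions (`>=` for the class ratio, `<` for the closest cosine similarity) are all assumptions.

```python
from typing import Dict, List


def similarity_tags(
    label: str,
    neighbor_labels: List[str],
    neighbor_similarities: List[float],
    conflicting_neighbors_threshold: float = 0.9,
    no_close_threshold: float = 0.5,
) -> Dict[str, bool]:
    """Hypothetical helper: derive the two kinds of smart tags for one utterance."""
    if not neighbor_labels or len(neighbor_labels) != len(neighbor_similarities):
        raise ValueError("expected at least one neighbor and one similarity per neighbor")

    # Ratio of neighbors whose label differs from the utterance's own label.
    other_class = sum(1 for n in neighbor_labels if n != label)
    conflicting_ratio = other_class / len(neighbor_labels)

    # Assumption: the tag applies once the ratio reaches the threshold.
    conflicting = conflicting_ratio >= conflicting_neighbors_threshold
    # Assumption: the tag applies when even the closest neighbor is below the threshold.
    no_close = max(neighbor_similarities) < no_close_threshold

    return {"conflicting_neighbors": conflicting, "no_close": no_close}


tags = similarity_tags(
    label="greeting",
    neighbor_labels=["farewell"] * 10,
    neighbor_similarities=[0.42, 0.40, 0.38, 0.35, 0.33, 0.31, 0.30, 0.28, 0.25, 0.20],
)
print(tags)
```

Under these assumptions, lowering conflicting_neighbors_threshold tags more utterances as conflicting, and raising no_close_threshold tags more utterances as having no close neighbor.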

For example, to change the encoder used for utterance embeddings:

{
  "similarity": {
    "faiss_encoder": "your_encoder"
  }
}

To disable similarity analysis, set it to null:

{
  "similarity": null
}
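The two config fragments above could be handled along the following lines. This is a minimal sketch: it uses a standard-library dataclass as a stand-in for the pydantic model so that it runs without extra dependencies, and `load_similarity` is a hypothetical helper, not part of the application.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimilarityOptions:
    # Stand-in for the pydantic model, with the same fields and defaults.
    faiss_encoder: str = ""
    conflicting_neighbors_threshold: float = 0.9
    no_close_threshold: float = 0.5


def load_similarity(config_text: str) -> Optional[SimilarityOptions]:
    """Hypothetical loader: return None when similarity analysis is disabled."""
    config = json.loads(config_text)
    # A missing key falls back to the default value, SimilarityOptions().
    section = config.get("similarity", {})
    if section is None:  # JSON null is parsed as Python None
        return None
    # Only the attributes present in the file override the defaults.
    return SimilarityOptions(**section)


custom = load_similarity('{"similarity": {"faiss_encoder": "your_encoder"}}')
disabled = load_similarity('{"similarity": null}')
print(custom)
print(disabled)
```

Overriding a single attribute leaves the other two at their defaults, while null disables the whole section rather than resetting it.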

# Sentence encoder (English default)
faiss_encoder = "all-MiniLM-L12-v2"

# Sentence encoder (French default)
faiss_encoder = "distiluse-base-multilingual-cased-v1"
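The language-based selection of the encoder might look like the sketch below. The mapping `DEFAULT_ENCODERS`, the helper `resolve_encoder`, and the language codes "en" and "fr" are assumptions made for illustration; only the two encoder names come from the defaults listed above.

```python
# Assumed language codes mapped to the documented default encoders.
DEFAULT_ENCODERS = {
    "en": "all-MiniLM-L12-v2",
    "fr": "distiluse-base-multilingual-cased-v1",
}


def resolve_encoder(faiss_encoder: str = "", language: str = "en") -> str:
    """Hypothetical helper: an explicit encoder wins, else use the language default."""
    if faiss_encoder:
        return faiss_encoder
    if language not in DEFAULT_ENCODERS:
        raise ValueError(f"no default encoder for language {language!r}")
    return DEFAULT_ENCODERS[language]


print(resolve_encoder())                 # English default
print(resolve_encoder(language="fr"))    # French default
print(resolve_encoder("your_encoder"))   # explicit value is kept as-is
```

Keeping the empty string as the "unset" value mirrors the class definition, where faiss_encoder defaults to "" and is filled in from the language.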