Embeddings
The Embeddings section consolidates settings that control how the KB Engine prepares text for training and search. Here you configure which article fields are used for predictions, how text is normalized before embedding, and equivalent terms for keyword search (Synonym Sets).
These settings apply primarily to unstructured data sources (web, File Repository, SharePoint, Shared Drive, and custom data sources).
Provider and Model
The LLM provider and the specific model used to generate vector embeddings for the Knowledge Base.
- Provider: Select Druid or one of the LLM providers currently activated on your tenant.
- Model: Select the machine learning model to be used for data vectorization.
Trainable Elements
Trainable Elements define which specific parts of an article—such as the title, question, short description, or answer—the Knowledge Base Engine uses for predictions from unstructured data sources.
Each element can be enabled or disabled individually using its respective toggle switch, allowing for single or multi-element configurations. These preferences are set at the Knowledge Base and data source level, with granular overrides available at the node and leaf level.
| Parameter | Description |
|---|---|
| Question | Uses the question and title for predictions. |
| Answer |
Uses the short description and answer for predictions. New AI Agents use both Question and Answer by default. |
| Train generated summary |
Uses LLM-generated summaries for predictions when the LLM content chunker creates them during document processing. Enabling this can increase chunking time but may improve match quality at prediction time. NOTE: Requires Generate summary set to be selected in section Content Chunkers, otherwise summaries are not generated.
|
| Train generated questions |
Uses LLM-generated questions for training when the LLM content chunker creates them. The number in the UI (for example, [0]) reflects how many questions are configured for training in the content chunker settings. For more information, see Content Chunkers. |
Generator Strategies
Generator Strategies control how text is formatted before embedding during KB training. Select the strategy that fits your content and search needs.
| Parameter | Description |
|---|---|
| Identity |
Keeps the original text unchanged throughout training. This is the typical default when no transformation is required. |
| Lower case | Converts all text to lowercase for uniform case formatting. |
| Split | Splits paragraphs into individual sentences and applies sentence-level embeddings to improve comprehension and search accuracy. |
| Upper case | Converts all text to uppercase. Use when uppercase formatting is required for your content or downstream processing. |
Synonym Sets
Synonym Sets let you group equivalent terms so the search can match user queries even when wording differs from your KB content. Each set lists related words or phrases (for example, in stock, available, ready to ship).
Matches on the exact query term score at 100%; matches on a synonym score at 80%, so synonyms broaden results without outweighing the user’s original words.
For more information, see Synonyms Dictionary.
