Knowledge Base
The Knowledge Base is a powerful tool that enhances your virtual assistant’s ability to provide accurate and relevant responses by serving as a centralized repository of information.
With the Knowledge Base, you can compile and manage a collection of text-based articles, enabling your chatbot to deliver the best possible answers when a user’s intent isn’t covered in predefined conversation flows.
To create a comprehensive Knowledge Base, you can integrate multiple data sources, including structured Excel and PDF files, file repositories, websites, SharePoint documents, and network shared drives. DRUID processes these sources to extract relevant content, organizing it into Q&A pairs or articles that your chatbot can use to improve user interactions.
In this guide, you’ll learn how to access the Knowledge Base, how to add and manage data sources, and how to effectively integrate your chatbot with the Knowledge Base to improve its ability to handle user queries.
Accessing the Knowledge Base
To access the Knowledge Base, select the desired bot and solution, then click Knowledge Base in the NLU menu.
When you access the Knowledge Base for the first time, the page is empty. To create your bot's knowledge base, add as many data sources as you need, extract the data, and train the KB.
Adding data sources
DRUID extracts text articles / paragraphs from the following data sources:
- Structured data sources (structured Excel and PDF files)
- File repository (Word, Excel and PDF, both structured and unstructured)
- Paragraphs from websites
- Documents from SharePoint libraries (Word, Excel and PDF files)
- Network shared drives.
For instructions on adding different types of data sources, refer to the relevant topic.
Use the Knowledge Base on the bot
By default, if the bot's NLP model does not match the user input with any existing flow during the conversation, the bot executes the Intent not recognized flow set on your bot (if any).
To let your chatbot search the Knowledge Base when the user input does not match any existing flow, go to the bot Details page and, in the Dialogue management section, turn on Use Knowledge Base.
By default, only the answer corresponding to the article with the highest probability is shown to the user.
DRUID offers a comprehensive set of Knowledge Base solution templates to address various scenarios, including the use of generative AI. For more information, explore the Solution Library.
Return top 5 articles matching the user intent
To show the top 5 articles matching the user’s question, import the Knowledge Base Starter solution from the Solutions Library. This solution template contains the flow Knowledge-Base-response-flow, which displays responses to users when the Flow Engine predicts against the Knowledge Base.
Go to the bot Details page and in the Dialogue management section, from the Knowledge Base response flow drop-down, select Knowledge-Base-response-flow.
When a user asks a question and no flow is matched in the bot model, the Knowledge Base is searched for the question. The answer corresponding to the question with the highest probability is shown to the user, along with Related topics, a card with repeater buttons that lists the 5 topics with the highest probability.
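The selection described above can be sketched in a few lines of Python. This is an illustration only, not DRUID's implementation: the `Article` type, its field names, and the `pick_answer_and_related` helper are hypothetical, and whether the best answer also appears among the related topics is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical shape of a Knowledge Base match; not a DRUID type."""
    title: str
    answer: str
    score: float  # match probability between 0.0 and 1.0


def pick_answer_and_related(matches, related_count=5):
    """Return the best-scoring article and the top `related_count` articles."""
    ranked = sorted(matches, key=lambda a: a.score, reverse=True)
    if not ranked:
        return None, []
    # Assumption: the related topics list includes the best match itself.
    return ranked[0], ranked[:related_count]
```

With seven candidate matches, the helper returns the single highest-scoring article as the answer and the five highest-scoring articles, in descending order, as related topics.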
Rephrase user question to provide incremental, contextual KB search
Improve your Knowledge Base search by rephrasing user questions to provide incremental, contextual results. From the Solutions Library, import the solution named "Knowledgebase with GPT V 2_0 - Azure". This solution combines the DRUID Knowledge Base and GPT from Azure to deliver a highly intelligent, human-like conversation experience. It includes two flows dedicated to rephrasing user intent and responses using Azure OpenAI when the Flow Engine predicts against the Knowledge Base.
Go to the bot's Details page and, in the Dialogue management section:
- From the Knowledge Base response flow field, select Knowledge-Base-response-flow-refine-question-azure.com.
- From the Intent rephrase flow field, select Intent rephrase flow.
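The Intent rephrase flow runs only when the user input arrives while the conversation is idle; once the conversation is in a flow, the user input is not altered. A minimal sketch of that routing rule follows. The function name, its parameters, and the `rephrase` callable (a stand-in for whatever external service the flow calls) are hypothetical and not part of DRUID.

```python
def text_for_prediction(user_text, conversation_idle, rephrase):
    """Return the text that would be sent to the bot model for prediction."""
    if conversation_idle:
        # Idle mode: rephrase first, then predict on the rephrased text.
        return rephrase(user_text)
    # Inside a flow: the user input is left unaltered.
    return user_text
```

For example, passing `rephrase=lambda text: text.strip().lower()` normalizes the input only when `conversation_idle` is true.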
Knowledge Base Basic Settings
To access the Knowledge Base settings, on the Knowledge Base page, click the Settings button.
The KB General Settings appear.
The table below provides the description of the Knowledge Base general settings.
Setting | Description |
---|---|
Embeddings Provider |
There are three providers available:
|
Embeddings Model |
An embeddings model is a machine learning model that transforms data, such as text or images, into a vector of numbers (an embedding). This vector representation captures the semantic meaning or relationships within the data, allowing for more efficient comparisons, searches, and analysis. Note: This parameter is available in DRUID 8.3 and higher.
The following embedding models are available in DRUID:
Note: The HigherEducation.v1 (technology preview) and MultiAspect embedding models are available in DRUID version 8.13 and later.
Hint: The Paraphrase embeddings model processes up to 125 tokens per paragraph and is ideal for short sentences, while other models support up to 512 tokens per paragraph.
|
Set results threshold |
Note: This feature is available in DRUID version 7.14 onwards.
The Results Threshold settings determine how matching utterances are evaluated against the Knowledge Base (KB). These settings vary depending on whether the bot is new or existing. For new bots, Use Bot NLU Thresholds is enabled by default. The KB uses the NLU thresholds configured on the bot (NLU menu > Configurations > Intents tab > Thresholds and Parameters section). For existing bots, the behavior varies based on the NLU thresholds:
To control how the KB evaluates and matches user input, disable Use Bot NLU Thresholds and adjust the 'Min Match Score' and 'Target Match Score' using the slider so that they align with your desired performance thresholds. Note: If you enable Use Bot NLU Thresholds, the threshold values set on the slider will be lost.
|
Search balance |
By default, the search within the Knowledge Base combines two algorithms: the keyword (Text) search algorithm and the semantic (Vector) search algorithm. Additionally, you can use the reranker to perform further analysis and improve result quality. If you don't use the reranker, the recommended search balance is Vector 80%, Text 20%: the search relies 80% on the semantic search algorithm, which returns more accurate results, and 20% on the keyword (text) search algorithm, which might return a lot of noise. Move the slider to set the search balance based on your needs. Hint: In DRUID version 7.14 and later, the values you set for the Search balance slider and the Score Calculator strategy in Advanced Settings are synchronized. Any changes made to one will be reflected in the other.
|
Search inside answers |
Turn this option on to match what the user says against both the questions and the answers available in structured data sources. If this option is off, what the user says is matched only against the questions available in the structured data sources. |
Use Knowledge Base |
Turn this option on to let your chatbot search the Knowledge Base when the user input does not match any existing flow. From the Knowledge Base response flow drop-down, select Knowledge-Base-response-flow. |
Intent rephrase flow |
If you want an external service (e.g., GPT) to rephrase the user intent into one that is optimal for your bot model, select the flow you designed for rephrasing user intent. After the user sends the intent, the bot first executes the Intent rephrase flow, which rephrases the utterance, and then uses the result (the rephrased intent stored in [[Intent]].Text) to predict in the model. Important! The Intent rephrase flow is executed only when the user input is sent while the conversation is in idle mode. Once the conversation is in a flow, the Intent rephrase flow no longer executes (the user intent is not altered).
Hint: This is particularly useful for Knowledge Base with ChatGPT.
|
Save the settings.
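To build intuition for the Search balance setting in the table above, the sketch below blends a semantic (vector) score and a keyword (text) score with the recommended 80/20 split. It assumes a simple linear weighting, which this page does not confirm as DRUID's actual formula; the function and parameter names are hypothetical.

```python
def blended_score(vector_score, text_score, vector_weight=0.8):
    """Blend a vector score and a text score using a linear weight.

    Assumption: a plain weighted average; the real scoring may differ.
    """
    if not 0.0 <= vector_weight <= 1.0:
        raise ValueError("vector_weight must be between 0 and 1")
    text_weight = 1.0 - vector_weight
    return vector_weight * vector_score + text_weight * text_score
```

Under this assumption, an article with a vector score of 0.9 and a text score of 0.4 gets a blended score of about 0.8 at the default balance, while moving the balance fully to text (`vector_weight=0.0`) yields 0.4.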
Testing the Knowledge Base performance
Testing your Knowledge Base confirms that it delivers accurate and relevant responses and helps you identify and fix issues that affect its performance.
To test the performance of your knowledge base, on the Knowledge Base page, enter a question in the User Says area and select the language. The model will search across all the data sources in the Knowledge Base and list the articles with a matching score higher than 0.5, along with the data source where each article was found. If you have changed the threshold ([[Intent]].KBQnAItems[0].Score) in the solution configuration, only the articles meeting that threshold will be listed.
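The listing rule above (only articles scoring higher than 0.5, or higher than a threshold you changed) can be sketched as a simple filter. The dictionary keys and the `list_matches` helper are hypothetical, and treating the threshold as a strict "greater than" comparison is an assumption.

```python
DEFAULT_MIN_SCORE = 0.5  # default cut-off mentioned in the text above


def list_matches(results, min_score=DEFAULT_MIN_SCORE):
    """Keep results scoring above `min_score`, highest score first."""
    kept = [r for r in results if r["score"] > min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)
```

Each result in this sketch is a dictionary with `article`, `data_source`, and `score` keys, so the data source of every listed article stays attached to it.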
If you selected a score calculator strategy from the KB settings, for each result, you will see the total matching score and the weights of the algorithms used. In DRUID version 7.14 and higher, you can view the graphical representation of these weights by clicking the Info icon.
Search within the KB
You can perform exact match searches across the entire Knowledge Base or within a data source. The search results will return all data source elements (node, leaf) and articles that exactly match your specified keywords.
When searching for specific keywords at the KB level, a maximum of 30 matching records (if available) will be displayed under the corresponding data source name.
When searching at the data source level, up to 30 results will be shown if they exist.
Starting with release 8.10, you can refine your Knowledge Base search results using filters for data source type, document type, and the option to exclude specific data sources or elements. These filtering capabilities help you perform more precise and efficient searches, giving you greater control over the results.
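As a rough illustration of a KB-level keyword search capped at 30 records per data source, consider the sketch below. It is not DRUID's search implementation: the `search_kb` helper and the input shape are hypothetical, and "exact match" is assumed here to mean a case-insensitive substring match.

```python
MAX_RESULTS_PER_SOURCE = 30  # cap described in the text above


def search_kb(sources, keyword, limit=MAX_RESULTS_PER_SOURCE):
    """Group keyword matches by data source, keeping at most `limit` each.

    `sources` maps a data source name to a list of article texts.
    """
    needle = keyword.lower()
    hits = {}
    for source_name, articles in sources.items():
        # Assumption: a match is a case-insensitive substring hit.
        matched = [text for text in articles if needle in text.lower()]
        if matched:
            hits[source_name] = matched[:limit]
    return hits
```

Data sources without any match are omitted from the result, which mirrors results being displayed under the name of the data source where they were found.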