Network Shared Drive Data Sources
A Network Shared Drive data source enables you to enrich your KB with internal documents, policies, or FAQs, improving the accuracy and comprehensiveness of your bot's responses. By adding Network Shared Drive data sources, you can extract paragraphs from Excel, Word, and PDF files stored on your network.
DRUID supports adding and extracting data from the following storage types:
- Local File: Extract data from files located on the same machine as the KB Agent.
- Local Share: Access and process information from files stored on a shared drive within your network, accessible to the KB Agent.
- FTP: Integrate data from an FTP server using TLS implicit encryption (for cloud deployments only).
- SFTP: Integrate data from an SFTP server over SSH (for cloud deployments only).
Adding a Network Shared Drive data source
This section will guide you through the process of adding a shared drive data source:
Step 1: Create the data source
Follow these instructions to create a data source based on your storage type.

To create data sources from FTP storage via TLS implicit encryption, follow these steps:
- Click the Add New button. The Add New Data Source page opens.
- In the Name field, provide a name for the data source. This helps you identify and search for the data source easily.
- From the Language drop-down, select the language of the data you upload. It must be one of the bot languages.
- From the Type drop-down, select Shared drive.
- Select FTP as Storage Type.
- In the Uri field, enter the relative path to the folder (on the FTP server) you want to crawl.
- In the Host field, enter the host name of the FTP server.
- Enter the FTP login ID (User name) and the FTP login password (Password).
- If the FTP server uses a self-signed certificate or one not issued by a recognized Certificate Authority, select Disable Certificate Validation. Failure to do so will result in unsuccessful data crawling and extraction.
- Enter the FTP Port for data transfers.
- Optionally, set the Min score threshold and the Target match score for the data source. If not set, the thresholds from the Knowledge Base will apply.
- To verify the FTP credentials, click the Test button. If the check fails, check and review the FTP credentials to ensure they are correct. You can also verify the FTP credentials later by going to the Details tab of the data source and clicking the Test button at the bottom of the page.
- Click Create. The new data source appears on the Knowledge base page.
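DRUID's Test button performs the credential check for you; for context, a standalone check over implicit FTPS could be sketched in Python with the standard library. Note that `ftplib.FTP_TLS` implements explicit TLS (`AUTH TLS`), so implicit TLS needs a small subclass that wraps the socket in TLS on connect. The class and function names below are illustrative helpers, not part of DRUID.

```python
import ftplib
import ssl


class ImplicitFTPTLS(ftplib.FTP_TLS):
    """ftplib.FTP_TLS speaks explicit TLS; this subclass wraps the
    socket in TLS as soon as it is created, i.e. implicit TLS."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._sock = None

    @property
    def sock(self):
        return self._sock

    @sock.setter
    def sock(self, value):
        # Wrap the plain socket in TLS the moment ftplib assigns it.
        if value is not None and not isinstance(value, ssl.SSLSocket):
            value = self.context.wrap_socket(value)
        self._sock = value


def check_ftp_credentials(host, port, user, password, disable_cert_validation=False):
    """Return True if a login over implicit FTPS succeeds (a rough
    stand-in for what the Test button verifies)."""
    context = ssl.create_default_context()
    if disable_cert_validation:
        # Rough equivalent of the "Disable Certificate Validation" option
        # for self-signed or untrusted certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    ftps = ImplicitFTPTLS(context=context)
    try:
        ftps.connect(host, port, timeout=10)
        ftps.login(user, password)
        ftps.prot_p()  # encrypt the data channel as well
        return True
    except ftplib.all_errors:
        return False
    finally:
        try:
            ftps.close()
        except OSError:
            pass
```

Implicit FTPS conventionally listens on port 990, though your server may use a different FTP Port value.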

To create data sources from SFTP storage, follow these steps:
- Click the Add New button. The Add New Data Source page opens.
- In the Name field, provide a name for the data source. This helps you identify and search for the data source easily.
- From the Language drop-down, select the language of the data you upload. It must be one of the bot languages.
- From the Type drop-down, select Shared drive.
- Select SFTP as Storage Type.
- In the Uri field, enter the relative path to the folder (on the SFTP server) you want to crawl.
- In the Host field, enter the host name of the SFTP server.
- Enter the SFTP login ID (User name) and the SFTP login password (Password).
- Enter the SFTP Port for data transfers.
- Optionally, set the Min score threshold and the Target match score for the data source. If not set, the thresholds from the Knowledge Base will apply.
- To verify the SFTP credentials, click the Test button. If the check fails, check and review the SFTP credentials to ensure they are correct. You can also verify the SFTP credentials later by going to the Details tab of the data source and clicking the Test button at the bottom of the page.
- Click Create. The new data source appears on the Knowledge base page.
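An SFTP server speaks the SSH protocol, so a quick way to confirm the Host and Port are reachable before clicking Create is to read the server's identification banner, which always begins with `SSH-`. The sketch below is purely illustrative: `read_ssh_banner` is a hypothetical helper, and the demo runs against a throwaway local server standing in for a real SFTP host.

```python
import socket
import socketserver
import threading


def read_ssh_banner(host, port, timeout=10):
    """Open a TCP connection and return the server's identification line.
    An SFTP server runs over SSH, so the banner should start with 'SSH-'."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return conn.recv(255).decode("ascii", errors="replace").strip()


# --- demo against a throwaway local stand-in for a real SFTP host ---
class _FakeSSHHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Real SSH servers send their identification string first.
        self.request.sendall(b"SSH-2.0-DemoServer\r\n")


server = socketserver.TCPServer(("127.0.0.1", 0), _FakeSSHHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

banner = read_ssh_banner("127.0.0.1", server.server_address[1])
server.shutdown()
server.server_close()
print(banner)  # SSH-2.0-DemoServer
```

A banner that does not start with `SSH-` usually means the Host or Port points at something other than an SSH/SFTP endpoint.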

To create data sources from the local machine where the KB Agent is installed, follow these steps:
- Click the Add New button. The Add New Data Source page opens.
- In the Name field, provide a name for the data source. This helps you identify and search for the data source easily.
- From the Language drop-down, select the language of the data you upload. It must be one of the bot languages.
- From the Type drop-down, select Shared drive.
- Select Local File as Storage Type.
- In the Uri field, enter the path to the local folder you want to crawl. To get the path, navigate to the desired folder in Windows Explorer and copy the folder path.
- Optionally, set the Min score threshold and the Target match score for the data source. If not set, the thresholds from the Knowledge Base will apply.
- Click Create. The new data source appears on the Knowledge base page.
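To preview which files a Local File data source would pick up, you could enumerate the folder yourself before creating the data source. The sketch below is illustrative only: the extension set assumes the common Excel, Word, and PDF extensions, and `crawlable_files` is a hypothetical helper, not a DRUID API.

```python
from pathlib import Path
import tempfile

# Assumed extensions for the Excel, Word, and PDF files the crawler extracts.
SUPPORTED = {".xlsx", ".docx", ".pdf"}


def crawlable_files(root):
    """Return the relative paths of supported files under the Uri folder."""
    root = Path(root)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )


# Demo with a throwaway folder standing in for the Uri you would configure.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "policies").mkdir()
    (root / "policies" / "leave.docx").write_text("...")
    (root / "faq.pdf").write_text("...")
    (root / "notes.txt").write_text("ignored: not a supported type")
    files = crawlable_files(root)

print(files)  # e.g. ['faq.pdf', 'policies/leave.docx']
```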

To crawl and extract data from files stored on the shared drive within your network, accessible to the KB Agent, follow these steps:
- Click the Add New button. The Add New Data Source page opens.
- In the Name field, provide a name for the data source. This helps you identify and search for the data source easily.
- From the Language drop-down, select the language of the data you upload. It must be one of the bot languages.
- From the Type drop-down, select Shared drive.
- Select Local Share as Storage Type.
- In the Uri field, enter the shared drive file path the KB Agent can access.
- Optionally, set the Min score threshold and the Target match score for the data source. If not set, the thresholds from the Knowledge Base will apply.
- Click Create. The new data source appears on the Knowledge base page.
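Local Share Uris are typically UNC paths of the form \\server\share\folder. A quick sanity check on the Uri format, assuming UNC form, can be sketched with the standard library; `is_unc_share_path` is a hypothetical helper for illustration.

```python
from pathlib import PureWindowsPath


def is_unc_share_path(uri):
    """True if uri looks like a UNC path (\\\\server\\share\\...), the usual
    form for a Local Share Uri the KB Agent can reach."""
    # For UNC paths, PureWindowsPath reports the \\server\share part as the drive.
    return PureWindowsPath(uri).drive.startswith("\\\\")


print(is_unc_share_path(r"\\fileserver\hr-policies\2024"))  # True
print(is_unc_share_path(r"C:\local\folder"))                # False
```

A drive-letter path such as C:\local\folder belongs to the Local File storage type instead.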
Step 2: Crawl the data source
On the Knowledge base page, click the edit icon to edit the data source. The data source configuration page opens on the Extracted Paragraphs tab by default. There you will notice that the content of the root reflects the file structure at the Uri you provided when creating the data source. By default, all folders and files are excluded from scraping. To include files or folders for scraping, click the three dots displayed at the right side of the item and click Include.
Click the Start crawling button. The Start Crawling Parameters page appears.
Define the crawling policy by setting the parameters described in the table below.
Parameter | Description |
---|---|
URL | Automatically populated with the Uri (or the Host for FTP storage) you specified when adding the data source. |
Depth | The number of directory levels the crawler will explore below the URL. Note: To improve crawling efficiency, crawl each node individually instead of the entire root, especially if the storage has a deep structure; set the depth to 0 to achieve this. |
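The Depth parameter can be illustrated with a small sketch. This is an interpretation for illustration only: here depth 0 keeps just the files directly inside the start folder (matching the note about crawling each node individually), and `files_within_depth` is a hypothetical helper, not the crawler itself.

```python
from pathlib import Path
import tempfile


def files_within_depth(root, depth):
    """List files no more than `depth` directory levels below root.
    depth=0 returns only files directly inside root (illustrative semantics)."""
    root = Path(root)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        # parts counts path components; subtract 1 for the file name itself.
        if p.is_file() and len(p.relative_to(root).parts) - 1 <= depth
    )


# Demo tree: a.pdf at the root, b.docx one level down, c.xlsx two levels down.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.pdf").write_text("")
    (root / "hr").mkdir()
    (root / "hr" / "b.docx").write_text("")
    (root / "hr" / "2024").mkdir()
    (root / "hr" / "2024" / "c.xlsx").write_text("")
    depth0 = files_within_depth(root, 0)
    depth1 = files_within_depth(root, 1)

print(depth0)  # only a.pdf
print(depth1)  # a.pdf plus hr/b.docx, but not hr/2024/c.xlsx
```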
After you define the crawling policy, click Start.
By default, all nodes are excluded from scraping. To crawl specific nodes, click the dots next to the desired node in the file repository explorer and select Crawl Path.
When the crawling completes, the extracted articles display under the Extracted Paragraphs tab.
Step 3: Train the data source
To ensure the KB Engine searches through the data source paragraphs, train your data source by clicking the Train button at the top-left corner of the data source page.
Testing the data source performance
Testing the performance of a data source is important because it ensures that the extracted paragraphs are relevant. This process helps identify and rectify any issues, improving the overall quality and effectiveness of your bot's responses. By validating the data source performance, you can enhance user satisfaction.
To test the performance of the data source, on the Extracted Paragraphs page, in the User Says area, enter a question and select the language. All matched paragraphs will be displayed along with their scores.
You can improve the performance of the data source by reviewing and editing the paragraphs based on your needs.
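How the Min score threshold and Target match score might interact can be illustrated with hypothetical scores. This is not the KB Engine's actual algorithm: `select_matches` and its semantics (drop matches below the minimum, return only the best match once it reaches the target) are assumptions for illustration.

```python
def select_matches(scored_paragraphs, min_score, target_score):
    """Hypothetical illustration: drop matches below min_score; if any match
    reaches target_score, return only the best one as a confident answer."""
    candidates = [(p, s) for p, s in scored_paragraphs if s >= min_score]
    candidates.sort(key=lambda ps: ps[1], reverse=True)
    if candidates and candidates[0][1] >= target_score:
        return candidates[:1]
    return candidates


matches = [("Leave policy intro", 0.92), ("Travel FAQ", 0.41), ("Leave FAQ", 0.77)]
print(select_matches(matches, min_score=0.5, target_score=0.9))
# [('Leave policy intro', 0.92)]
```

Under these assumed semantics, raising the target score makes the bot return several candidate paragraphs instead of a single confident answer.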
Editing paragraphs
To keep your Knowledge base quality high, we recommend reviewing the extracted paragraphs and taking action to improve them: open the URL from which the crawler extracted the paragraph and compare the content, then edit or delete the paragraph as needed. Refine your paragraphs by transforming unstructured data into a question-and-answer format.
To edit a paragraph, click the Action icon displayed inline with the paragraph and click Edit. Edit the paragraph Title and/or Content and save the changes.
Fine-tuning Predictions
You can configure Advanced Settings at both the data source and node/leaf levels to achieve more precise predictions. This approach offers granular control, allowing you to adjust the extractors and trainable elements, resulting in better accuracy and performance. Unlike KB-level settings, which apply changes broadly, this targeted method adapts configurations to the unique needs of each data source or element, streamlining your authoring process.
Fine-tuning at the data source level
- Navigate to the desired data source.
- Select the Advanced Settings tab.
- Modify advanced parameters as needed and save the settings.
Fine-tuning at the node or leaf level
- In the tree explorer, select the desired node or leaf.
- On the right side, select the Advanced Settings tab.
- Modify advanced parameters as needed and save the settings.
Reset advanced settings
To reset advanced configurations at the data source and node/leaf levels to match the KB Advanced settings, go to Knowledge Base > Advanced Settings and click the Save to All button. This action streamlines your settings management by applying consistent KB Advanced settings across your entire configuration with just one click.
Enhance KB prediction
Refine your articles by transforming unstructured data into a question-and-answer format. Edit articles and add a question, title, or short description.
Access the Knowledge Base Advanced Settings, set the "trainableColumns" parameter to "Question,Answer", then train the Knowledge Base. The KB Engine will leverage both questions and answers from unstructured data sources during the prediction process, ultimately leading to improved prediction accuracy.
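Training on both columns presupposes that each paragraph has been reshaped into a Question and an Answer. A minimal illustrative reshaping is sketched below; `to_qa_row` and its title-to-question heuristic are hypothetical, not a DRUID feature.

```python
def to_qa_row(paragraph_title, paragraph_content):
    """Reshape an extracted paragraph into the Question / Answer columns
    that setting "trainableColumns" to "Question,Answer" trains on."""
    # If the title is already a question, keep it; otherwise derive one.
    if paragraph_title.endswith("?"):
        question = paragraph_title
    else:
        question = f"What is {paragraph_title.lower()}?"
    return {"Question": question, "Answer": paragraph_content}


row = to_qa_row("Annual leave entitlement",
                "Employees receive 25 days of paid leave per year.")
print(row["Question"])  # What is annual leave entitlement?
```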