Voice AudioCodes
The Voice channel through VoiceAI Connect from AudioCodes enables you to deliver a seamless voice experience to the users talking to your DRUID virtual assistants.
VoiceAI Connect acts as a hub, connecting different telephony systems (a telephony channel, public telephony provider, contact center, enterprise communication platform, or any platform communicating via WebRTC) to the DRUID bot framework and voice AI cognitive services.
In a typical bot deployment, VoiceAI Connect receives a phone call and connects it to your bot.
Prerequisites
- For DRUID on-premise deployments, make sure that you allow inbound access to the following messaging endpoint: DRUID.BotApp.
Activate the Voice AudioCodes Channel
To activate the channel, follow these steps:
- In the DRUID Portal, go to your bot settings. Click the Channels tab, then click Voice, AudioCodes – VoiceAi Connect. The channel info section expands.
- Generate a token by clicking the Generate button.
- By default, the communication between DRUID and VoiceAI Connect is done via the WebSocket protocol. For special deployments where network restrictions may deny communication via WebSocket, disable the Web Socket checkbox.
- In the Reply timeout in seconds field, enter the maximum time the bot has to respond before the call is automatically disconnected. For more information, see Handle Conversation Disconnect.
- In the Language map JSON field, provide a one-to-one mapping between the language codes used by the Speech-to-Text (STT) service provider (the key on the left) and DRUID-specific language codes, that is, ISO 639-1 (the value on the right), together with the Text-to-Speech voice DRUID will use. For reference, consult the locales and voices supported for Text-to-Speech by Azure Cognitive Services.
- Send the token and the DRUID URL to your DRUID representative. The DRUID team, in partnership with AudioCodes Professional Services, will set up the connection between your virtual assistant and VoiceAI Connect.
Use the following format for the language codes mapping:
"<STT Provider language code/locale>": "<DRUID-specific language code>|<Text-to-speech voice>"
For example:
{
  "ro-RO": "ro|ro-RO-AlinaNeural",
  "en-US": "en-US|en-US-AshleyNeural",
  "th-TH": "th-TH|th-TH-AcharaNeural"
}
After the channel’s activation, the following fields are available in DRUID:
- [[ChatUser]].ChannelId = “audiocodes” – Identifies the channel.
- [[ChatUser]].Phone – Stores the user’s phone number.
- [[ChatUser]].CalleePhoneNumber – Stores the bot’s phone number.
Author Flows for Voice
DRUID provides authors with a simple way to configure flows for the Voice channel by providing the SpeakMessage field in the Voice section on flow steps.
Best Practices
To make the bot’s voice sound more natural, follow these best practices:
- For multi-channel bots, do not customize voice messages per channel. Instead, check the spelling and punctuation of the step messages and correct any mistakes. Also, use diacritics and accented characters properly.
- Provide short messages in the SpeakMessage field on steps.
- Use punctuation marks properly. This makes it easier for the bot to use pauses and voice pitch to deliver the message clearly.
- For hero, thumbnail and choice steps, if you want the bot to speak what’s in the cards, buttons, etc., provide the desired voice message in the SpeakMessage field on these steps.
Sending events to VoiceAI Connect
You can send events (playUrl, transfer, hangup, etc.) from DRUID to VoiceAI Connect by using DRUID Backchannel Flow Steps to generate any supported VoiceAI Connect event.
In Input mapping on the backchannel step, provide the entity that stores the activity parameters of that event.
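For example, here is a minimal sketch of the entity JSON you might map for a playUrl backchannel step, which makes the bot play an audio file to the user. The playUrlUrl and playUrlMediaFormat parameter names are taken from the VoiceAI Connect playUrl event documentation; verify them, and the URL itself, against your deployment:
{
  "playUrlUrl": "https://example.com/prompts/welcome.wav",
  "playUrlMediaFormat": "wav/lpcm16"
}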
Set session parameters using the "config" event
To set parameters for the entire call (session parameters), at the beginning of the call, on the welcome flow, add a backchannel step named config and in Input mapping provide the entity that stores the timeouts and actions you want to set.
On the flow step, click the Metadata section, click Advanced Editing, and in the JSON field add the "sessionParams" object and set the desired parameters.
For information on the config event and its general parameters, see the VoiceAI Connect documentation, sections General bots parameters and Changing call settings.
If you later decide to update some session parameters, use a backchannel step called SetVoiceSessionParams and in Input mapping provide the entity that stores the timeouts and actions you want to address; in the Metadata section, add the object specific to the action you want to perform and provide the parameters.
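As an illustration, the Metadata JSON on the config step might look like the following sketch. The userNoInputTimeoutMS (how long VoiceAI Connect waits for the user to speak) and bargeIn parameters appear in the VoiceAI Connect documentation; treat the names and values here as assumptions to verify against the VoiceAI Connect version you use:
{
  "sessionParams": {
    "userNoInputTimeoutMS": 20000,
    "bargeIn": true
  }
}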
Modifying call parameters
Any backchannel-type flow step is sent as an event to VoiceAI Connect (transfer, hangup, etc.). In the activityParams property of that specific VoiceAI Connect event, DRUID sends the JSON object of the entity specified in Input mapping on the backchannel step.
If you want to modify activity parameters without sending a specific event to AudioCodes’ VoiceAI Connect (for example, to detect a language change on a conversation activity), use a backchannel step called SetVoiceActivityParams. In the activityParams property, DRUID sends the JSON object of the entity specified in Input mapping on the backchannel step.
If you want to modify session parameters (for example, to handle bot delay or detect a language change at session level), use a backchannel step called SetVoiceSessionParams. In the sessionParams property, DRUID sends the JSON object of the entity specified in Input mapping on the backchannel step.
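For example, to change the speech recognition language for subsequent activities, the entity you provide in Input mapping on a SetVoiceActivityParams step might contain a single field. The language parameter name comes from the VoiceAI Connect documentation; this is a sketch, not a definitive configuration:
{
  "language": "en-US"
}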
Capture a Collection of Dual Tone Multi Frequency (DTMF) Digits
You can configure flow steps to capture a collection of digits the user presses on the phone’s keypad: either within a specific time frame between digits, up to a maximum number of digits, or the digits pressed before a specific submit digit set on the flow step.
To capture digits, on the flow step, click the Metadata section, click Advanced Editing, and in the JSON field add the "audioCodesDTMF" object and set the parameters described in the table below.
| Parameter | Type | Description | Mandatory |
|---|---|---|---|
| sendDTMF | Boolean | To capture a collection of digits, set this parameter to false; otherwise, the bot captures only the first digit pressed by the user on the phone’s keypad. | Yes |
| bargeInOnDTMF | Boolean | When set to true, allows the user to interrupt the bot by pressing a DTMF digit, which terminates the bot response. Note: To prevent users from interrupting the bot, we strongly recommend setting this parameter to false. | No |
| dtmfCollect | Boolean | Set this parameter to true to capture all the DTMF digits entered by the user. The default value is false; that is, the bot captures only the first digit pressed by the user. | Yes |
| dtmfCollectInterDigitTimeoutMS | Number | The timeout in milliseconds the bot waits for the user to press another digit before it captures the digits. The timeout starts after the user enters the first DTMF digit and is reset after each digit. The default value is 2000 ms. Note: The parameter is applicable only when the dtmfCollect parameter is configured to true. | Yes* |
| dtmfCollectMaxDigits | Number | The maximum number of DTMF digits the user is expected to press on the phone’s keypad. The default is 5. Note: The parameter is applicable only when the dtmfCollect parameter is configured to true. | Yes* |
| dtmfCollectSubmitDigit | String | Defines a special DTMF "submit" digit; when received from the user, the bot captures the digits pressed before it, without waiting for the timeout to expire or for the maximum number of expected digits. The valid value is any symbol on a phone keypad. The default is # (pound key). Note: The parameter is applicable only when the dtmfCollect parameter is configured to true. | Yes* |
*The parameter controls how the bot captures the digits. You can use these parameters in any combination, but at least one is mandatory.

This section describes how to configure a prompt step to capture the CNP (personal numeric code) provided by users before they press # (pound key) on their phone keypad.
Enter [[Account]].ClientCNP in Input mapping on the step.
On the prompt step Metadata section, click Advanced editing and in the JSON editor add the following code:
"audioCodesDTMF":{
"sendDTMF":false,
"bargeInOnDTMF":false,
"dtmfCollect":true,
"dtmfCollectSubmitDigit":"#"
}
Handling Bot Delay
Handling bot delays is particularly useful when the bot executes an integration which might take longer to complete.
By setting timeouts, you can configure the following actions to address situations when the bot takes time to respond to a message sent to it:
- Play a textual prompt to the user
- Play an audio file to the user
- Disconnect the call
- Resume speech recognition (so the call will not remain hanging).
To handle bot delay, add a backchannel step called SetVoiceSessionParams. In Input mapping, provide the entity that stores the timeouts and actions you want to address.
Make sure that the entity you provide in Input mapping contains fields named exactly as the parameters expected by VoiceAI Connect. For the complete list of parameters, see VoiceAI Connect documentation.
In the SetVariables section of the backchannel step, configure the timeouts and define the actions based on your needs.
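For example, the entity you map might end up holding values like the following sketch, which plays a textual prompt if the bot takes longer than 10 seconds to respond. The botNoInput* parameter names follow the VoiceAI Connect bot-delay documentation; confirm them, and the supported values, for your VoiceAI Connect version:
{
  "botNoInputTimeoutMS": 10000,
  "botNoInputSpeech": "Please hold on while I look that up.",
  "botNoInputRetries": 2
}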
Handle Conversation Disconnect
A call disconnects if the bot does not respond within the Reply timeout in seconds threshold set on the channel or if the user says nothing for 120 seconds.
You can configure what happens on conversation disconnect. Go to the bot details, click the Dialogue management section header and from the Voice call terminate flow field, select the flow to be triggered on disconnect. If no such flow is set, the call disconnects.
When the call disconnects, the following data is logged in the conversation context:
- [[ChatUser]].VoiceConversationTerminatedReason – The reason for which the call was disconnected, e.g., “Client Side”. The disconnect reason can be one of the following:
- SocketInterrupted – The connection was interrupted.
- UserBecameSilent – The user said nothing for 120 seconds.
- CallTerminatedByCaller – The user terminated the call.
- [[ChatUser]].VoiceConversationTerminatedReasonCode – The code (text) associated with the disconnect reason, e.g., “client-disconnected”.
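For example, in the flow you select as Voice call terminate flow, you can branch on these fields using a condition such as the sketch below (the exact condition syntax depends on how you author flow conditions in DRUID):
[[ChatUser]].VoiceConversationTerminatedReason == "UserBecameSilent"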
Transferring the call
There are cases when the bot cannot handle the call by itself, so it needs to escalate the call to a call center live agent. By default, once VoiceAI Connect performs the transfer, it immediately disconnects the call with the bot, regardless of whether the transfer succeeded or not.
For the bot to escalate the call to a contact center live agent, add a backchannel step named transfer. Configure the backchannel step so that the bot provides the transferTarget, that is, the URI to which the call should be transferred. Typically, the URI is a "tel" or "sip" URI. You can also configure the backchannel step so that the bot provides additional SIP headers upon transfer.
In Input mapping on the transfer backchannel flow step, provide the entity that stores the desired values. For example, you can create an entity, [[VoiceParams]], that stores the values of the parameters associated with the transfer event.
Make sure that the entity you provide in Input mapping contains fields named exactly as the parameters expected by VoiceAI Connect for the transfer event. For the complete list of transfer event parameters, see VoiceAI Connect documentation.
In the SetVariables section of the transfer backchannel step, set the transferTarget and any additional transfer parameters and SIP headers you want to send.
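For example, the [[VoiceParams]] entity might end up holding values like the following sketch. transferTarget is the parameter described above; the transferSipHeaders name and structure are an assumption based on the VoiceAI Connect transfer event documentation, and X-Escalation-Reason is a hypothetical custom header, so verify both before use:
{
  "transferTarget": "tel:+40215551234",
  "transferSipHeaders": [
    { "name": "X-Escalation-Reason", "value": "agent-requested" }
  ]
}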
Disconnecting the call
At any stage of the conversation, the bot can disconnect the conversation. For the bot to disconnect the call, add a backchannel step named hangup.
You can configure the backchannel step so that upon disconnect the bot provides a textual reason that will be passed to the peer on the SIP Reason header and will appear in the CDR of the call. In addition, you can also configure the backchannel step to add SIP headers and their values, which will be included in the SIP BYE message.
To add the disconnect reason and additional SIP headers, on the hangup backchannel step, in Input mapping provide the entity which stores the desired values.
Make sure that the entity you provide in Input mapping contains fields named exactly as the parameters expected by VoiceAI Connect. For the complete list of hangup event parameters, see VoiceAI Connect documentation.
In the SetVariables section of the hangup backchannel step, set the disconnect reason and/or additional SIP headers.
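For example, the entity mapped on the hangup step might hold a value like the following sketch. hangupReason is the parameter named in the VoiceAI Connect hangup event documentation; any additional SIP header fields must use the exact names VoiceAI Connect expects, so check the documentation before adding them:
{
  "hangupReason": "conversation completed"
}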
Conversation History
All voice conversations begin with “[Voice start event]”. This is particularly useful for debugging purposes, for example, to measure the time from the moment the call was initiated (the bot picks up the call) until the bot says its first message.
For this channel, DRUID also logs in the Conversation History the Speech-to-Text Confidence Score, that is, the value representing the confidence level of the recognition received from the speech-to-text provider.
When the call disconnects due to an error, the disconnect reason logged in the Conversation History is “PlatformIntegrationError” (the message status is Platform Integration error).
Store call initiation metadata in [[QueryParams]] for future usage
By default, VoiceAI Connect sends an initial event to the bot when the call is initiated together with specific SIP headers. By default, [[ChatUser]].Phone stores the user’s phone number and [[ChatUser]].CalleePhoneNumber stores the bot’s phone number.
Storing incoming custom SIP headers (metadata sent by the contact center solution together with the call initiation event) can be particularly useful for companies running outbound dialing campaigns, where the dialer automatically initiates calls with potential customers.
You can store additional incoming custom SIP headers in dedicated fields in the [[QueryParams]] system entity. For that, create dedicated fields that have the same names as the incoming SIP header keys.
For more information about sending SIP headers to the bot, see the VoiceAI Connect documentation.
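For example (the header name is hypothetical), suppose the dialer attaches a custom SIP header to the initiating call:
X-Campaign-Id: spring-renewals
Creating a [[QueryParams]] field named X-Campaign-Id then makes the value available in the conversation as [[QueryParams]].X-Campaign-Id.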
Storing AudioCodes Helpdesk Conversations in Conversation History with Agent Assist
To log conversations between AudioCodes helpdesk agents and users in the Conversation History, you must set up Agent Assist. Doing so provides valuable insights into user-agent interactions, helping you refine bot responses, improve escalation flows, and enhance overall bot performance.
To set up Agent Assist follow these steps:
- In the DRUID Portal, go to your bot settings. Click the Channels tab, then click Voice, AudioCodes – VoiceAi Connect. The channel info section expands.
- Select Allow assist bot and copy the Assist bot Druid url and the Token as you will need them in the subsequent steps.
- Log into AudioCodes LiveHub and select Bots from the left menu.
- On the Bots page, click the Add new assist bot button. The Connect your bot wizard appears.
- Select Druid as bot framework, then click Next.
- Enter the bot details. In the Bot URL field, paste the Assist bot Druid url you copied from DRUID. In the Token field, paste the token you copied from DRUID.
- Click the Validate bot configuration button. If the validation fails, check that you entered the correct Bot URL and token. If the validation passes, click Next.
- Set the assist bot settings and click Create.
- Click Routing in the left menu. On the Routing Rules tab, search for the main bot you created for the DRUID bot integration (not the assist bot you created) and click the Edit button. The Edit routing rule page appears.
- Click Assist bot, select the assist bot you previously created from the drop-down, then click Update.
Agent Assist is now successfully set up, enabling the logging of AudioCodes agent-user interactions in the Conversation History.
Recommending Responses to Helpdesk Agents with Agent Assist
Agent Assist analyzes real-time voice interactions and suggests AI-powered responses to help AudioCodes helpdesk agents provide faster and more accurate support. It uses large language models (LLMs) to interpret client messages and recommend the most relevant replies from the Knowledge Base.
How It Works
When a client-bot conversation transfers to a live agent in AudioCodes, Agent Assist activates automatically. It processes each message in real time by:
- Updating ConversationInfo.AgentAssistMessages[i] to maintain a complete transcript of client-agent interactions, storing all client and agent messages.
- Triggering the Agent Assist special flow on the first client message to search the Knowledge Base for a relevant response.
- Sending the suggested response to the helpdesk agent through the configured third-party tool (where agents handle client calls).
- Preventing interruptions by ensuring that when multiple messages arrive, only the latest client message triggers a response after the previous Agent Assist flow execution finishes.
Set up Agent Assist for response recommendations
- Open the Solution Library, search for the Helpdesk agent assist solution, and import it.
- Go to Bot Details > General details > Dialogue Management.
- Select Use Knowledge Base.
- From the Knowledge Base response flow field, select Agent assist flow.
- Go to Apps and configure the connection strings for the GPT-azure.com app.
- Go to Flows, search for 'Agent assist flow', and click the Add Suggestion step. By default, the solution comes with a Druid Data Service integration that saves the suggested responses to the Agent Assist workspace included with the solution.
You can remove this integration and configure an integration with the third-party tool your helpdesk agents use to handle client calls.
After setup, Agent Assist will automatically suggest responses, helping agents provide faster and more accurate support.