Skip to main content
Getting great results from your AI assistant comes down to picking the right engine settings. Use this guide when configuring a new assistant or troubleshooting an existing one.

1. Pick a mode

ModeWhy choose itRecommended model
Dualplex (Beta)Fast turn-taking + premium or cloned voicesGemini Flash 2.0/2.5 or GPT-5 Realtime
Speech-to-speechFastest turn-taking and most natural flowGPT-5 Realtime
PipelineMaximum control over voice and long-form repliesGPT-5 Mini
Experiment with all three modes. Record the same scenario in each and compare response time and caller satisfaction before committing to one.
See assistant modes for a full comparison.

2. Choose a transcriber (Pipeline only)

TranscriberAccuracyLatencyBest for
Azure⭐⭐⭐⭐SlowerHighest transcription fidelity
Gladia⭐⭐⭐FasterGood all-rounder for most languages
Deepgram⭐⭐⭐FasterSolid choice — test against Gladia for your language
Different languages, accents, and background noise affect each transcriber differently. Run a quick A/B test and keep the better performer.

3. Select an LLM model

ModelStrengthsTrade-offs
GPT-5 MiniBalanced reasoning with low latencyMay be slower than realtime models for rapid turn-taking
GPT-5 RealtimeDesigned for ultra-low-latency voice turnsBest for speech-to-speech and Dualplex
GPT-4oStrong reasoning and multimodal understandingHigher latency
Gemini Flash 2.0/2.5Ultra-fast for voice turnsExcellent for Dualplex and multimodal
If speed is critical, use GPT-5 Realtime (great for speech-to-speech) or Gemini Flash 2.0/2.5 (great with Dualplex). For richer reasoning, use GPT-4o or GPT-5 Mini and offset the higher latency with filler audio.

4. Noise cancellation

  • Turn ON when callers are on speakerphone or in noisy environments for cleaner transcription.
  • Turn OFF if words are being clipped or the assistant is missing parts of what callers say.
If your assistant isn’t hearing callers clearly, try disabling noise cancellation first.

5. Conversation timers

ParameterRecommendedWhy
Re-engagement interval~30 sGives callers enough time to think. Lower values can feel pushy.
Max silence duration~60 sPrevents premature hang-ups while still ending truly silent calls.
Test with real calls — too low can interrupt callers, too high leaves awkward gaps.

6. Initial message

ModeHow it’s usedBest practice
PipelineRead exactly as written (TTS conversion)Write the greeting verbatim: “Hello, this is Alex from…”
DualplexRead exactly as written (ElevenLabs TTS)Write verbatim, then select your cloned voice
Speech-to-speechInterpreted as a prompt by the modelInclude instructions like “Greet the customer and say…” or prepend say exactly: to ensure literal output

7. Ambient sound

Ambient sound adds subtle background noise under the assistant’s voice to mask processing delays and create a more natural audio experience. It is enabled by default.
If your assistant isn’t hearing callers well, try turning off ambient sound or lowering its volume.

8. Endpointing sensitivity

Control when your assistant starts talking after a caller finishes speaking.
SettingEffectUse when
Lower sensitivityAssistant responds faster after caller stopsYou want snappy, quick-turn conversations
Higher sensitivityAssistant waits longer before respondingCallers give longer, more detailed replies
If your assistant cuts off callers mid-sentence, increase sensitivity. If responses feel sluggish, decrease it.

9. Debug with the call transcript

If something isn’t working as expected:
1

Go to Call history

Navigate to the Call history page in your dashboard.
2

Open the last test call

Click on the most recent call you tested.
3

Review the transcript

The call transcript includes all function calls and their parameters, making it easy to spot where the assistant went off-track.
For a complete list of every toggle and slider, see the general settings reference.