Coding an AI Voice Bot from Scratch: Real-Time Conversation with Python

Ғылым және технология

🔑 Get your AssemblyAI API key here: www.assemblyai.com/?...
Learn how to build a real-time AI voice assistant using Python that can handle incoming calls, transcribe speech, generate intelligent responses, and provide a human-like conversational experience. Perfect for call centers, customer support, and virtual receptionist applications.
In this coding tutorial, you'll integrate multiple cutting-edge technologies, including:
1. Assemblyai Speech-to-Text API for accurate real-time transcription.
2. OpenAI's powerful language models for natural language processing (NLP) and response generation.
3. ElevenLabs' AI voice synthesis to convert text responses into natural-sounding audio.
Step-by-step, you'll create a Python application that seamlessly combines these APIs, enabling your AI assistant to listen to incoming audio, comprehend the speech, formulate contextual responses, and communicate back with synthesized voice in real-time.
Github code: github.com/smithakolan/Assemb...
Timestamps:
00:00 - Intro & Demo of application
01:10 - Outline of application
01:58 - Step 1: download python libraries
06:21 - Step 1: Streaming Speech-to-Text with AssemblyAI
12:11 - Step 3: OpenAI Chat completion
15:32 - Step 4: Generate Human-like audio with Elevenlabs
18:48 - Running our AI Call Assistant
#AIVoiceAssistant #RealTimeSpeechRecognition #NaturalLanguageProcessing #AIVoiceSynthesis #PythonTutorial #CallCenterAutomation #VoiceBot #StreamingSpeechtoText
▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬▬
🖥️ Website: www.assemblyai.com
🐦 Twitter: / assemblyai
🦾 Discord: / discord
▶️ Subscribe: kzread.info?...
🔥 We're hiring! Check our open roles: www.assemblyai.com/careers
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#MachineLearning #DeepLearning

Пікірлер: 42

@euginekholmogorov5196Ай бұрын
amazing lady and also an engineer omg)) thank you a million, I'll just add this to my stack
@NatGreenOnlineАй бұрын
Using Groq / Mistral AI instead of OpenAI will greatly reduce the latency issue you have in your demo.
@logannon
Ай бұрын
can you fine tune groq?
@AssemblyAI
Ай бұрын
Great suggestion, we will explore this in the next tutorial. This one was meant to be as accessible as possible so that people could build quickly.
@user-vm8yn4hb4w
14 күн бұрын
@@logannon no its impossible to fine tune groq. thats the problem. you have to use rag instead of fine tuning. but if you wanna make chatbot for specific domain you should try other service
@thebackpainmiracle6 күн бұрын
Exactly what I was intending on making. Thanks!
@JokerJarvis-cy2sw2 ай бұрын
Please a tutorial on llava vision model to analyze video live with cv2 And I am unable to get my API token from assembly AI website please fix it
@sarap.sadegh4691Ай бұрын
hi thanks for your video . i want Api real time conversation with python for Farsi language . the LLM support Farsi language?
@theghostyced5 күн бұрын
how would you handle interruptions while the ai is talking?
@simonsandeep497720 күн бұрын
The programming is not responding after the first introduction ,as shown in the video ;though even after using the github code. Any alternative with step by step instruction video ?
@urekmazino13274 күн бұрын
any way to make one with adam voice like the one in elevenlabs?😊
@yuchengpeng7706Ай бұрын
This video is so great! I'm following your video but now I ran into this problem, I can install the package in Pycharm with Windows system, but I got this error: OSError: Cannot find mpv-1.dll, mpv-2.dll or libmpv-2.dll in your system %PATH%. I'm a researcher in the art field with only a debutant python knowledge, could you help me solve this problem? Thanks a lot!
@JeffreyJohnson-vy1zmАй бұрын
Two questions: How can we improve the latency between the patient's response and the AI voice reply? and What can be done for the AI Voice to account for patient input if the patient speaks while the AI voice is speaking?
@AssemblyAI
Ай бұрын
Hi Jeffrey, two very good questions! These deserve a video on their own, to be honest. To improve latency one thing you could try is running the LLM locally so you can get a faster inference over calling openai's API. As for handling overlapping speech, I've written the program to stop listening when the AI voice is responding back. But what you could do, is run another thread that is still listening while the AI voice is speaking.
@EvertvanBrussel
Ай бұрын
As for the latency, I was assuming the majority of the latency was actually coming from ElevenLabs? And likely also from whatever functions might be needed to actually check the availability of the dentist and then also to schedule the actual appointment in the end. Am I wrong? So yeah I think running the LLM locally will surely help, or using Groq, but I'm not convinced yet that that is the biggest bottleneck.
@TheBestgokuАй бұрын
why not chunk text and output instead of output after all text is generated?
@uttamdwivedi7709Ай бұрын
I followed this tutorial then in the end I realized .. assemblyAI doesn't provide the support for the Japanese language in the live Reltimetranscriber. Which sucks .. lol can't use it. Any help? @assemblyAI
@iainhmunro24 күн бұрын
Hi There - I was just looking at the code. Where is the appointment setting details / info coming from ?
@AssemblyAI
21 күн бұрын
All that is coming from the LLM we are using, so it's not hard-coded.
@mehdismaeili37432 күн бұрын
Excellent .
@vishalsaichindepalli2798Ай бұрын
For some reason, the microphone isn't picking up my voice. I enabled all permissions on my mac and am still having trouble. Is there any way to fix this?
@michaelnumnum
Ай бұрын
I think you need to pay for the real-time transcription for this at AssemblyAI
@Vrilogs
9 күн бұрын
streaming from assembly ai is a paid service. So, first you need add balance into your account. If you have not done that yet. Hope that helps :)
@mrunexpected102 ай бұрын
can u make just a chat bot word to voice
@sillystuff62472 ай бұрын
super cool
@Alex-qo5jeАй бұрын
How can i conect to my phone number and google calendar?🙏🏼
@AssemblyAI
21 күн бұрын
You can make use of the Google API for google calendar and something like Twilio's API for making phone calls.
@PalashDandge6 күн бұрын
i am getting error "Cannot find reference 'generate' in '__init__.py' " on from elevenlabs import generate, stream line can you please help me to resolve this issue
@user-po9ru7dl9j
4 күн бұрын
yes same error, did you find a solution to it mate?
@viditsharma6990Ай бұрын
i am facing the mpv value error on windows i already installed it many times how can i fix that
@sethuraman9884
24 күн бұрын
just use vlc instead mpv bro
@user-vm8yn4hb4w
14 күн бұрын
@@sethuraman9884 thank you guys
@user-vm8yn4hb4w
14 күн бұрын
or check environment path of mpv. when you command mpv --version on cmd. you have to see its running
@nithishreddy768423 күн бұрын
An error occured: Could not connect to the real-time service: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997) what to do with this error?
@islamicinterestofficial
22 күн бұрын
same error. You found the solution?
@chittisai47
16 күн бұрын
most likely your microphone is switched off pls check
@daeralbraАй бұрын
The only downside is the fact it takes a while to respond with voice.
@user-qp1jq3eh3e2 күн бұрын
I am very api to have found this
@drmarioschannel2 ай бұрын
after watching your video, i think i prefer interacting with humans
@urekmazino13274 күн бұрын
why are you saying fro. scratch if you're only using api