How to Install & Use Whisper AI Voice to Text
Science & Technology
In this step-by-step tutorial, learn how to transcribe speech into text using OpenAI's Whisper AI. Whisper AI is an AI speech recognition system that can transcribe and translate audio files in approximately 100 different languages.
📚 RESOURCES
- Install Python: www.python.org/
- Install PyTorch: pytorch.org/get-started/locally/
- Install Chocolatey: chocolatey.org/
⌚ TIMESTAMPS
00:00 Introduction
00:40 Install overview
01:00 Install Python
02:31 Install PyTorch
03:55 Install Chocolatey package manager
04:53 Install ffmpeg
05:28 Install Whisper AI
05:59 Transcribe one file
07:18 Output files
07:58 Transcribe multiple files
08:39 Available models
09:51 Transcribe in other languages
10:31 Translate to English
11:06 Help
11:40 Quality
12:04 Uninstall
12:14 Wrap up
📺 RELATED VIDEOS
- Run Whisper AI in the cloud for free using Google Colab: • Best FREE Speech to Te...
😢 Uninstall instructions:
- Uninstall Whisper AI
In Command Prompt, enter:
pip uninstall openai-whisper
- Uninstall ffmpeg
In Command Prompt, enter:
choco uninstall ffmpeg
- Uninstall Chocolatey
In File Explorer, delete the folder:
"C:\ProgramData\chocolatey"
- Uninstall PyTorch
In Command Prompt, enter:
pip3 uninstall torch torchvision torchaudio
- Uninstall Python
Go to Installed Apps in Windows Settings, search for Python and Python Launcher, click the three dots, and then uninstall.
🔽 CONNECT WITH ME
- Official web site: www.kevinstratvert.com
Comments: 679
Run Whisper AI in the cloud using Google Colab (requires no install and is also free): kzread.info/dash/bejne/aoeFuI97aJbagLg.html
@amlaaaa479
1 year ago
Didn't work for me. I just get error reports.
@daedalusjones4228
1 year ago
Works great for me using Colab, or on my hard drive. Both work great. But here's something: I have multiple Gmail accounts, and I have a number of tools, add-ons, and extensions for Google Drive/Docs/Sheets, including Colab, Apps Script, etc. I initially set them all up on one Google account. But when I try to set up those same tools in my other Google Drive accounts, I get an error message and can't do it. It seems that I can't have stuff in Colab, for example, in more than one Google account.
@francescooliva5951
1 year ago
Is there a way to use Whisper OFFLINE with the Windows installation?
@KevinStratvert
1 year ago
@francescooliva5951 Once you install it, you can use it offline.
@francescooliva5951
1 year ago
@KevinStratvert So the only time I go online is to download the pre-trained model for the first time (tiny/medium/large, according to my choice)? I have an AMD Radeon 530 GPU, but Whisper doesn't seem to detect it. In fact, I see 99% CPU usage in Task Manager. What is the average time to transcribe a medium-sized file?
Gosh, Kevin, this is the first video I've seen of yours and I am mightily impressed! I've been in IT for over 30 years and I can tell you that your presentation is one of the leanest and meanest I've ever seen. What a great contribution this is to the community. Thank you very much!
This is probably my favorite video on YouTube ever. It is amazing. It takes a process that I found complicated and turns it into easy-to-follow steps. It actually takes what could be stress-inducing and makes it relaxing with some unintentional ASMR presentation. Very well done.
Thank you for doing a complete walkthrough, unlike so many other YouTubers who act like they're being thorough but, as you later find out, skip small but essential steps as if we already knew them!
Amazing walkthrough. Thank you. You've made something that would have been overwhelming for me and taken me hours (if I could do it at all) seem so easy and I was done in under half an hour!!
For those having issues with "file doesn't exist": make sure you add the file extension at the end, even if the file name doesn't show it. For example, if your file is named "file" and it's an MP3, then you must type whisper file.mp3. Hope this helps, because this was not specified.
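The tip above can be turned into a small check. This is only a sketch: the function name `resolve_audio_path` and the list of extensions are made up for illustration and are not part of Whisper. It looks in a folder for the name as typed and, failing that, tries a few common audio extensions so you can see which full file name to pass to the whisper command.

```python
from pathlib import Path

# Extensions to try. This list is an assumption for illustration,
# not an official list of formats that Whisper accepts.
AUDIO_EXTENSIONS = (".mp3", ".m4a", ".wav", ".mp4", ".mkv")


def resolve_audio_path(folder, name):
    """Return the existing file matching `name`, or None if nothing matches.

    The name is tried as typed first, then with each known extension appended.
    """
    candidate = Path(folder) / name
    if candidate.is_file():
        return candidate
    for ext in AUDIO_EXTENSIONS:
        with_ext = Path(folder) / (name + ext)
        if with_ext.is_file():
            return with_ext
    return None


if __name__ == "__main__":
    found = resolve_audio_path(".", "file")
    if found is None:
        print("No matching audio file in this folder.")
    else:
        print("Use this name on the command line:", found.name)
```

Windows hides known file extensions by default, which is probably why a file can look like it is just called "file" in File Explorer.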
@lauram14
4 months ago
I need help: "FP16 is not supported on CPU; using FP32 instead". What does this mean?
@federicobartolozzi680
3 months ago
@lauram14 Nothing to worry about. It just means more RAM is used and it runs slower.
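As far as I know, the whisper command accepts an --fp16 option, and passing False asks for FP32 up front, so the warning quoted above should not appear on a CPU-only machine. The sketch below only builds and prints such a command line; `build_whisper_command` is a hypothetical helper name, and nothing is executed.

```python
import shlex


def build_whisper_command(audio_file, model, fp16=None):
    """Build (but do not run) a whisper command line as a list of arguments."""
    command = ["whisper", audio_file, "--model", model]
    if fp16 is not None:
        # Assumption: --fp16 False requests FP32 explicitly, which is what
        # Whisper falls back to on a CPU anyway.
        command += ["--fp16", str(fp16)]
    return command


print(shlex.join(build_whisper_command("file.mp3", "medium", fp16=False)))
# prints: whisper file.mp3 --model medium --fp16 False
```

The transcript itself should be the same either way; the option only makes the choice explicit.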
@iphoneapple1892
3 months ago
Thank you, I was stuck for two weeks. Now it works.
@hiteshbiswas776
3 months ago
I'm still facing the issue with an m4a file... Is it possible that only certain file types are accepted?
@lambdaboy9999
2 months ago
wait why are you here?
Thank you, Kevin, for sharing your knowledge and teaching skills in this and your other YouTube contributions. I followed this YouTube video to the letter and was able, with only a few hiccups (of my making), to transcribe very important audio files my wife recorded on her iPad. My Win 10 PC did the job flawlessly to my wife's stringent specifications. Happy wife, happy home. I first tried your excellent video, "Audio to Text", which was satisfactory for very small audio files due to the limited capacity available through Microsoft. The AI system worked very well on a 6 MB audio file (four pages of text in an MS Word file). I haven't yet tried a larger file size but believe it would work fine for larger files. Again, thank you for all you do, for sharing your selfless talents and wonderful passion for what you present.
Hi Kevin. Been watching you for a while and just want to say thanks for all the explanations. Concise and interesting. You've helped me a lot and, again, I thank you. Keep it rolling!
Amazing job Kevin. My first attempt at installing Whisper was bad, but your video had me running in no time.
You know, this is one of those videos that you wish you could like 100 times. Much appreciated, man. Amazing video. Thank you so much. Subscribed
I had previous success with your Stable Diffusion video for a local install. It was the only one I found that was clear and perfectly detailed! This video also was excellent, I just followed your step by step instructions and everything is working great!
It worked after some serious debugging but couldn't have done it without this video. Thank you a ton!!
YOUR VIDEO IS AMAZING!!! It helped me so much with learning languages. I used this Whisper program to convert speech to text, and then I used ChatGPT as a super translator. IT IS ABSOLUTELY AMAZING. Thanks to this video I did 4 days' worth of work in 1 day. The quality of Whisper is absolutely amazing. Kevin Stratvert is the BEST, thank you!
It worked on Python 3.11.4 and the latest PyTorch! I used a CPU and a 1 minute speech took 4 minutes to be transcribed using the small model, 10 minutes using the medium. The installation in the cloud (00:35) is much faster, with the result in under 1 minute. The medium model can recognize technical words. Thanks for showing this tool.
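The timings in the comment above can be expressed as a processing-time ratio (minutes of processing per minute of audio). The sketch below does that arithmetic; the function names are made up, and the estimate assumes time grows linearly with audio length on the same machine and model, which is a simplification.

```python
def real_time_factor(processing_minutes, audio_minutes):
    """Minutes of processing needed per minute of audio."""
    return processing_minutes / audio_minutes


def estimate_minutes(audio_minutes, factor):
    """Rough processing time, assuming it scales linearly with audio length."""
    return audio_minutes * factor


# Figures reported in the comment: 1 minute of speech on a CPU.
small = real_time_factor(4, 1)    # small model: 4.0
medium = real_time_factor(10, 1)  # medium model: 10.0

# Hypothetical 30-minute recording on that same machine.
print(estimate_minutes(30, small))   # 120.0
print(estimate_minutes(30, medium))  # 300.0
```

Other commenters report very different ratios on other hardware, so treat any such estimate as specific to one machine.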
Wow, many thanks Kevin. I had my own videos that I was planning to do Voiceover and found it very difficult to listen to and translate the video, this way I was able to generate Arabic text and it is pretty good and even the translate feature to English is excellent. This video solved a lot for me, and I have tested it, and very promising. Many thanks again.
Another incredibly useful video and so very easy to follow as well! It works perfectly for my large assembly recordings. Thanks so much Kevin. You're such a great teacher, I just love your stuff!
I was recently thinking how great it would be to have Whisper local, instead of online only. And, voila, here's Kevin! Readin' minds, and don't even know it; well, you do now. Thanks!
This was indeed a helpful video, even if I wish you skipped package managers for ffmpeg installation. I got Whisper installed and working, testing transcription on a recording of a 70 minute meeting. With a fairly muscular PC, I tried with small, medium and large models. Surprisingly I got more accurate results with small, in addition to quicker results. Great tool, wonderful intro.
This exercise gave me some solid experience troubleshooting errors. I had to pull teeth to get Homebrew (using a Mac) to install properly, and then had an SSL certificate error, but Google & Stack Overflow came to the rescue, and Whisper is working like a charm. Thanks for the great video! By the way, if anyone gets an SSL certificate error using Python3 (which apparently is common), just enter the following in terminal, exactly as written (but check your version*):
/Applications/Python\ 3.11/Install\ Certificates.command
* Just adjust the version number to match your release; in the example above, I updated it to 3.11.
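To avoid typing the wrong version number, the path in the comment above can be built from the running interpreter's own version. This is a sketch that assumes the same folder layout as the commenter's machine (Python installed under /Applications on a Mac); `install_certificates_path` is a hypothetical helper name.

```python
import shlex
import sys


def install_certificates_path(major, minor):
    """Path of the certificate script for a given Python version.

    Assumes the folder layout shown in the comment above.
    """
    return f"/Applications/Python {major}.{minor}/Install Certificates.command"


path = install_certificates_path(sys.version_info.major, sys.version_info.minor)
# shlex.quote adds the quoting the shell needs because the path contains spaces.
print(shlex.quote(path))
```

Paste the printed line into Terminal; it is equivalent to the backslash-escaped form in the comment.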
@mehmetbakideniz
25 days ago
people like you further motivate me to share my knowledge with the internet. Thank you so much! you have saved me a ton of time.
Many thanks for another excellent video. Some of the versions from this video have been updated but I was able to find the ones you mentioned so everything is working as expected. I teach English (and digital literacy) and sometimes wanted a transcript for an interesting podcast. This is great as it is free, has no time limit and offers other languages which I am keen to test soon. I also like your video which shows how to use the online version of Word. Btw, I use some of your tutorials in the classroom for my MS app classes and the students love your videos too. The only adjustment I do is slow down the playback as it is sometimes a little too fast for my learners :). Many thanks again and please keep up the good work!
Thanks again, Kevin, for a very useful video. Nice to see Python at work. It reminds me of old-time programming - at least a little. I am 71 and wrote my first program using punch cards... :)
@noreenstxs9605
1 year ago
I dropped my whole stack of punch cards once :)
@JohnDoe-rx3vn
1 year ago
I was just telling my buddy about that. I think AI is going to be as big a jump as punch cards/numbered lines to named variables was
@ALifetimeofFitness
1 year ago
@noreenstxs9605 I did that back around 1970!
@hubertmallard7254
1 year ago
I'm 72... Cards.. IBM 1130 Fortran Apple 2 Pascal 😅
@PoeLemic
1 year ago
@@hubertmallard7254 Yeah, same here. Programmed in Fortran, Cobol, Pascal, etc. What about the TRS-80? Remember those?
Thank you very much Kevin. Your channel helps even laymen like myself appear like tech nerds when I share these solutions with friends. And I always recommend your channel to them.
Outstanding tutorial as always, Kevin. Thank you. I used this to transcribe my recording of a 45-minute webinar so I could read along and highlight as I listened to the replay. It took just 11 minutes on my high-end gaming computer with a GeForce RTX 3060 Ti graphics card. Very useful tool!
@generalgeert
1 year ago
Sounds great. Which model did you use? The default small model or a larger one?
@jimdarley
1 year ago
@generalgeert I used whisper --model medium
@huy3148
6 months ago
Which CPU did you have for that transcription? Thank you
This is one of the best step-by-step instructions I've ever seen. Thank you!
This is the first time I got a video playing straight after its release!
UPDATE: This is truly the holy grail for technical writers, journalists, paralegals, and people who do tons of interviews that need accurate transcription. This is a game-changer. I had used the one via Colab before, per Kevin's instructions, but you are limited to three transcripts a day or something. With this on my HARD DRIVE, I can transcribe multiple files. I assume there's no cap, no limit. Getting the transcript in all those multiple formats Kevin shows in the video? Almost too good to believe.
I don't have a dedicated graphics card, so I chose "CPU." (Hence the slowness, I reckon.) I DO have an i7 processor, but it's a laptop with only 8GB of RAM and no ability to add more. I want a desktop so that I can upgrade RAM, get a dedicated graphics card, upgrade processors, etc., for more of this kind of thing. Automation. Some heavy lifting.
------------------
Okay. Seems to be running. Slowly, but running. I now have at least two different versions of Python installed on my PC. I installed 3.10.10 just for Whisper and already had 3.11 to run globally. I always make choosing an installation location more complicated than it has to be. But I don't want to run into compatibility problems with the various versions of Python. Plus, I don't know what the implications are as far as Environment Variables go, given that the various versions all have to call ffmpeg, Chocolatey, or Selenium, or whatever. I installed 3.11 in the default location for Program Files. I installed 3.10.10 in a folder directly on the C drive that I created for it, called python-3-10-10.
I think that part of the key to success here is following Kevin's protocol of going to the folder where the audio files are and typing CMD directly into the address bar FROM THERE. (I've seen one or two other vids about Python. No one mentioned this good tip.)
Anyway, with my limited knowledge, I think it's like this: I've installed the following globally: Chocolatey, ffmpeg, Python 3.11, and PyTorch. Then I've installed 3.10 locally, in a folder on the C drive. I bring my audio files into that 3.10 folder, enter CMD into the address bar there, and all is well. I'm running 3.10 and still, I guess, accessing all those global resources that I need to.
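For the "set it and forget it" workflow described above, a single whisper command can, as far as I know, take several audio files at once. The sketch below only builds that command for every audio file in a folder; the helper name, the extension list, and the default model are assumptions for illustration, and nothing is executed.

```python
from pathlib import Path

# Extensions treated as audio here; an assumption for this sketch.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav"}


def build_batch_command(folder, model="medium"):
    """Build one whisper command covering every audio file in `folder`.

    Returns None when the folder contains no matching files.
    """
    files = sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    if not files:
        return None
    return ["whisper", *files, "--model", model]
```

Calling `build_batch_command(".")` from the folder that holds the recordings returns the argument list to run there, which matches the habit of opening Command Prompt directly in that folder.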
@antipupsz2411
1 year ago
I have a similar hardware setup (no CUDA, only CPU), and I've been wondering how long it takes to transcribe a 1-hour video file using the large model (--model large). What's been your experience?
@daedalusjones4228
1 year ago
@antipupsz2411 Yup, I believe it was faster using Colab. The advantage of using it on your hard drive, though, is transcribing multiple files. Set it and forget it, go outside.
@etnisu
1 year ago
@antipupsz2411 Hey, how do you manage to transcribe 1 hour? I tried a 1-hour .mkv file, but every time it only transcribes 1 minute :(
@Voldemorts.Nipple
1 year ago
@etnisu You have to wait a long time for it to keep transcribing
@Voldemorts.Nipple
1 year ago
@antipupsz2411 Hey, I'm transcribing one hour as well, and it's been like 4 hours and it's only now halfway. I'm using the medium model with my 6GB GPU and this is very slow. How long did it take for you?
As an educator, I really like your style of explaining. Thanks
It's a great help to sort and summarize important info from a vidz 😊. Thank you mr. Kevs!
Thank you Kevin for what you do. I followed the instructions. I added the following in case some newbies wanted this. I installed Python version 3.11.5 in Windows 11 and it works fine. In Windows Explorer, I created a folder under the C: Drive called Whisper. I then copied my mp3 audio file (from data drive) to C:\Whisper, typed in cmd in the address field to bring up the Command Prompt, and then typed whisper filename.mp3 --model medium [and then Enter]. A 36-minute conversation (50mb) took a little over 39 minutes to run. I then cut all the files from C:\Whisper and pasted them into a folder on my data drive. Then I copied the text version into a version of Word that I don’t pay a monthly fee for and saved it. 😊 Hope this helps someone.
@struppifrohlich2008
8 months ago
I tried Python 3.11.5 too, but every time I go to my C:\Whisper folder, open CMD there, and type whisper test.wav, it says: FileNotFoundError: [WinError 2] The system cannot find the specified file. Do you know a solution?
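One possible cause of the error above, offered as a guess rather than a diagnosis, is that ffmpeg is not on the PATH: Whisper launches ffmpeg as a separate program to read the audio, and Windows reports a missing program with this same "cannot find the specified file" message. The sketch below checks whether both commands can be found; `missing_tools` is a hypothetical helper name.

```python
import shutil


def missing_tools(tools=("whisper", "ffmpeg")):
    """Return the names of the tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("whisper and ffmpeg were both found.")
```

If ffmpeg is reported missing, reinstalling it and opening a new Command Prompt window is a reasonable first thing to try.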
Love these videos, Kevin Keep them coming man!
Thank you! This is exactly what we needed to transcribe our tiny DnD podcast!
This is why I love internet! To execute a neural network you just have to follow simple guidelines! There are issues and stuff to figure out yourself, but this is such a great jumpstart!
So useful and clearly presented, never stop making videos
My brother you have saved me literally over a thousand hours of work. This made a life-changing improvement on my productivity
Thank you so much. Great instructions with exactly the right level of detail. Got whisper running on first try.
bless your soul, my assignment would've never been submitted on time if it weren't for this video 🙏
It is really amazing how good it is at transcribing songs! Using that for my home build arranger/karaoke keyboard :)
Excellent how-to, easy to follow and descriptive. Thanks!
Thank You! I was able to transcribe my mp3 file. Excellent technology for next week's online course.
Great instructional video. Clear and informative. Thank you.
amazing tutorial. Thank you for this super high quality well thought out tutorial. went super smooth.
Crystal clear tutorial. Worked the first time trying. Thanx buddy! 😁
@KevinStratvert
5 months ago
Great to hear!
Thanks so, so much for this great programme. Right now I am running an English school. COVID-19 has really hit my business badly. I will share this useful app to help my students. Again, thanks so much.
Incredibly helpful. Thank you. Whenever I want to use some (free and very useful) open-source tool, I'm always baffled by how difficult and unintuitive it is to get it running by yourself
Thank you, Kevin, for sharing your walkthrough. I've been looking at paid platforms for transcription. So easy when you know how
Very useful thanks. As always very clear succinct videos
Kevin, you are my tech genius! That came in the right time. Thanks heaps for your amazing video:)
Incredible!! Thank you so much, I understood everything, it was super clear. Bravo, keep it up.
Appreciate your teaching Kevin, love and respect from Singapore :)
Awesome tutorial. Thanks Kevin. Whisper AI is an amazing tool.
Wow! Really impressed how quick and easy this was. Would love a follow up video on how to incorporate something like pyannote to this so that we can also have speaker diarization!
Bro I dunno what to say but this is the thing that I have been looking for. Thank you a lot.
Very cool my dude, thank you for helping with this. I would have never gotten this on my own
Awesome! Thank you so much. You helped me actually get this to work (after watching several other videos!).
Amazing! Thanks for such a helpful video, dude!
Awesome work Kevin. Subscribed
You are a legend. What an amazingly helpful and easy to listen to tutorial on this.
Kelvin, you're such a sweetheart... just when I needed transcription software... voilà!!! Here you are with the solution. Kelvin, are you reading my mind? Answer me. Nice one bruh... you make everything seem easy, and working SMART. Muah!!! Kelvin Kelvin!!!!! Thank you ❤
This was very helpful. Thanks a lot!
Well, I got it to work so I'm good. Your instructions are excellent!
Thank you for great video, Kevin!
THANK YOU - this tutorial is fantastic.
Excellent tutorial. Great job. Thank you
Running flawlessly for me. What a fantastic guide. I had to download latest version of pip to work but no hitches installing anything for me.
Brooo! The CMD trick is so good!
extremely good tutorial! Thank you!
Thanks, Kevin. Super helpful!
I was able to transcribe and translate audio with Whisper!! Thank you so much!! >M
Worked for me. Thanks... good content.
God bless you! Thank you for explaining the process in a simple and easy to follow way.
great explanation and all straightforward
It's working !!! Thank you for help ))
Thanks for the clear instructions to use the tool. It works on python 3.11.1 also albeit with a few errors that can be ignored
@pbartkus
8 months ago
I'm running it on python 3.11.5 error free.
Amazing video dude, thanks!
just 2 words for you.. you are incredibly awesome.
thank you for that guide, simple and to the point, but full of info, like.
@the_kvadronikus
11 months ago
And yes, I've installed and used Whisper, and it works. In places it drops the correct word endings or picks the wrong letter, but the transcription quality is insanely good even for Russian with the regular base model.
Holy... I've been following you for quite some time now and I have to say, you lost me on this one. I'm sure there's another way I can accomplish this, not to say you are wrong, or giving bad advice or whatever, in fact just the opposite, you explained it perfectly and of course I have no doubt it's doable. In fact I'm writing this to get you more comments on the video. Great job, it's just one I'll pass on.
Thank you very much, great walkthrough and thanks for the uninstall informations too
Super helpful, thank you so much
Thanks! Worked perfectly ;)
Worked for me, thank you.
Good tutorial! Easy to follow
Hi Kevin, thank you for the training video.
Works perfect! Thanks!
Just awesome!!! Thanks a TON :buddy )
Wow, it's really useful, and explained in such detail!! You would never find a tutorial this thorough in China.
Well, it is a wonderful video and useful too, but it's taking longer time to load the transcript. Thanks to you Kevin!!
Thank you for the details. I like that your tutorial is logical and explains things from the ground up. I am curious about how the text is split into clips. What is the split based on? If the audio is a two-person conversation, will each clip correspond to one person? I am stuck on speaker identification using Whisper.
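On the question above: as far as I know, the clips correspond to the timed "segments" in Whisper's JSON output file, and each segment carries a start time, an end time, and text but no speaker label, so the splits follow timing rather than who is talking. The sketch below reads that structure; the sample data is made up and trimmed to the fields used here.

```python
import json

# Made-up sample in the shape of Whisper's JSON output, trimmed to the
# fields used below. Real files contain more fields per segment.
SAMPLE_JSON = """
{"text": " Hello there. How are you?",
 "segments": [
   {"id": 0, "start": 0.0, "end": 2.5, "text": " Hello there."},
   {"id": 1, "start": 2.5, "end": 4.0, "text": " How are you?"}
 ]}
"""


def list_segments(json_text):
    """Return (start, end, text) for each segment in the JSON text."""
    data = json.loads(json_text)
    return [
        (seg["start"], seg["end"], seg["text"].strip())
        for seg in data["segments"]
    ]


for start, end, text in list_segments(SAMPLE_JSON):
    print(f"{start:6.1f} -> {end:6.1f}  {text}")
```

Labeling who is speaking needs a separate diarization tool; another commenter mentions pyannote for that purpose.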
Great explanation thank you very much!!!
omggggg!!!!!!!! this was so smooth .... Thanks!!
Your content is a jewel, ty!
great illustration and I have successfully installed it on my computer. Thank you @kevin
Fantastic video. I'm going to grab the transcript and start installing on another i7 laptop and see what happens. Thank you sir!
Yes, I was! Thank you very much!😃
Godlike tutorial. Thanks!
Thank you ! This was useful
thank you! really appreciate this
Thank you so much, this is awesome!
Nice video. I was able to install Whisper on Ubuntu 22.04 LTS and transcribe without a hitch.😄 Well done.
I would have never managed without this video. Thanks man. Also, 2.7 GB for Torch. Wow!!