Updated AI Voice Cloning with RVC Inference - Tortoise with RVC Local Installation

Science & Technology

Links referenced in the video:
AI Voice Cloning Repo - github.com/JarodMica/ai-voice...
How to get RVC Voice Models - • How to Get AI Voice Mo...
How to Train a Tortoise Voice - • Local AI Voice Cloning...
RVC/Voice Changer Playlist - • AI Voice Changer
Hardware for my PC:
Graphics Card - amzn.to/3pcREux
CPU - amzn.to/43O66Ir
Cooler - amzn.to/3p98TwX
RAM - amzn.to/3NBAsIq
SSD Storage - amzn.to/42NgMFR
Power Supply (PSU) - amzn.to/430bIhy
PC Case - amzn.to/447499T
Motherboard - amzn.to/3CziMXI
Alternative prebuilds to my PC:
Corsair Vengeance i7400 - amzn.to/3p64r22
MSI MPG Velox - amzn.to/42MnJHl
Cheapest recommended PC:
Cyberpower 3060 - amzn.to/3XjtZoP
Come join The Learning Journey!
Discord - / discord
Github - github.com/JarodMica
TikTok - / jarodsjourney
If you found anything helpful, please consider supporting me and the content I am trying to produce!
www.buymeacoffee.com/jarodsjo...

Comments: 276

@Artholos 6 months ago
PARTY TIME! Jarod you’re the best! The hero we needed 🎉
@Jarods_Journey
6 months ago
Thank you thank you Artholos :)!
@Gwynbleiddsanity 4 months ago
Just when I was starting to miss the Google Colab RVC, now I finally have it downloaded on my PC. Thank you so much!
@Mowgi 6 months ago
Awesome! I've taken a pause on AI for a bit to focus on some other things, but I'm excited for all the neat things waiting for me when I come back to it. Love your content!
@Jarods_Journey
6 months ago
All good, gotta do what you gotta do! AI will be here, bigger and better whenever you return to it. Appreciate it!
@spetheofmusic 6 months ago
I used to make FL Studio tutorials on my main channel like this, straight to the point and super helpful and effective. Well played
@popi3789 2 months ago
Oh my god this is so impressive. I've been playing around with MRQ repo for a year now, and I've got some pretty good models out of it but this, I think, is going to take it to a whole other level. That was a great idea to link the two technologies! I can't wait for a Linux version too :)
@어슬렁어슬렁호시탐탐 6 months ago
The tts program that becomes rvc gives me great happiness.
@libertyprime2013 1 month ago
I was a bit impatient waiting for the command lol. Thanks! You're awesome!
@nicolaykoriagin 6 months ago
Oh JEEZ. This is simply incredible! I can't stress how impressed I am. You did an amazing job by combining these two technologies together! I mean... damn, god bless IT guys like you. This is outstanding. I have one question though... is it possible to make a proper sound-based text-to-speech in languages other than English? Like, I don't know, if it would be possible to select or type the language locale in a new input field before clicking generate, so the system will recognize it's not the basic English. Just wondering what some of the favorite characters would sound like in translation, with their voices. :)
@yashkhd1100 6 months ago
Thanks mate! You are one of the YouTubers who knows AI voice inside out. I'm a pro dev but not in the AI space. I find all this stuff exciting. I want to explore all the voice-related stuff but have time constraints. It would be great if you could make a Udemy course covering all aspects of text-to-speech. I would definitely purchase it.
@Jarods_Journey
6 months ago
Appreciate it! I've got something in the works so will probably announce it on my channel whenever I get things sorted!
@Tranquilized_ 5 months ago
Thank you, I learned a valuable lesson from this video. Don't do something yourself when someone else can do it better.
@KodaPaul
4 months ago
I feel you... Been struggling with this deepspeed cuda shit trying to make tortoise work lol
@Ecliptic-P 3 months ago
tysm for this istg ive been wanting to update my model and make new ones for a while- - a very happy vocaloid user
@ScrakSFMs 5 months ago
This is perfect for me as I am a Filmmaker that is working on a new project that will use some characters from a game that has a Wiki where I can download the voice lines and turn into a model and use this RVC to make them say the story without spending hours in editing myself. Thanks!
@ScrakSFMs
5 months ago
If this didn't exist, my idea was to record myself doing the character's lines and use RVC to make it sound like the characters, but that'd be very time-consuming and worse. So once again, a huge time saver. And with the RVC I used to clone my own voice, the training stopped working and I couldn't add any more voices, which sucked. So I hope this doesn't have any of the issues I've run into so far.
@BenjaminTemplar
3 months ago
That is a great idea guys. I’m a filmmaker too. Great vibes!
@ScrakSFMs
3 months ago
@@BenjaminTemplar No problem, hope the best for you!
@benfrombc
1 month ago
What are the requirements to achieve this? Can this be done with a 2018 Mac mini?
@williamreid74 5 months ago
Nice work! Great detailed info!
@vmfox2152 5 months ago
Great work. Thank you. But I have another question: What would have to be changed to be able to use other voice languages?
@DeDoodles 6 months ago
Thanks for the amazing work you've put in. I'm loving the results of your RVC update. Is there a way to turn off RVC in your audiobook maker and use the output from the new Tortoise RVC instead of the audiobook maker's built-in RVC?
@Jarods_Journey
6 months ago
I appreciate it and I'm glad you're finding these things useful. Unfortunately, it does not. I will have to update the audiobook maker to take that into account, as several things have changed since the initial release, in case you want to disable/enable RVC. Thanks for the support 🙏!
@kaziahmed 3 months ago
Hey Jarods, amazing video! I have a question: is the training for RVC and TTS the same? Or do I have to train a model separately for RVC?
@joshuadelacruz3907 6 months ago
I will try to replace the old one I got with this. I haven't had time to work with it, so it's okay. But I want to ask: how many seconds or minutes does it take your computer to produce a 500-word AI voice recording? And are a Ryzen 5 5600X and an RTX 3070 good enough for this use case?
@SKYGGEMUSIC 4 months ago
Great! Can we upload a singing vocal recording and render it with a .pth file (RVC model), or is it for speech only?
@RobertJene 6 months ago
2:07 yes I need all those models thank you. It's like when I purchase an audio fx plugin and if it doesn't come with presets... I'm mad. I need some presets to help with my workflow. Voice models means I have choices for some of my videos where I don't have to train a model if I don't want to.
@BlueMoonJason 3 months ago
Hi! I'm trying to clone Zoom's voice from The Flash TV series. I used ElevenLabs as it was recommended to me as being a top AI voice cloning tool. I used Instant Voice Cloning but, despite sufficient and clear samples of his voice, the AI voice didn't sound that great. Would this be a suitable program for it or is there a best-in-class option I'm missing?
@scetchmonkey007 4 months ago
I noticed that the audio references are all very small WAV files. Is this the best way to do it, or is it just what you have? Would a single long file also be a suitable reference, and does it have to be in WAV format?
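For anyone who wants to try short reference clips like the ones asked about above, here is a minimal sketch of splitting one long WAV file into shorter clips using Python's standard `wave` module. The function name `split_wav`, the 10-second default, and the output file naming are assumptions for illustration only; the thread does not say what clip length the tool expects.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, clip_seconds=10.0):
    """Split a WAV file into consecutive clips of at most clip_seconds each.

    Returns the list of clip paths written. The clip length is an
    illustrative default, not a value recommended by the repo.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_clip = int(params.framerate * clip_seconds)
        index = 0
        while True:
            frames = reader.readframes(frames_per_clip)
            if not frames:
                break  # end of the source file
            clip_path = out_dir / f"{Path(src).stem}_{index:03d}.wav"
            with wave.open(str(clip_path), "wb") as writer:
                # Copy channels, sample width and rate; the frame count in
                # the header is corrected when the clip is closed.
                writer.setparams(params)
                writer.writeframes(frames)
            written.append(clip_path)
            index += 1
    return written
```

This cuts at fixed intervals, so a clip may end mid-word; trimming on silences would need extra logic that is not shown here.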
@ricoletta 6 months ago
Hey Jarod! This is awesome, thanks for sharing. I'm working on cloning some singers' voices, and I was wondering if it's possible to clone the style of singing (i.e. vibrato) as well? On RVC, it seems to only layer the voice quality itself as opposed to the style of singing so was wondering if that is a viable option.
@Jarods_Journey
6 months ago
Tortoise doesn't transfer singing features so not possible there. As for RVC, that's where the index file comes in. It should help to reintroduce aspects of the original training files back into the output
@ricoletta
6 months ago
@@Jarods_Journey Got it, thanks!
@LinkinV 3 months ago
i just made a voice but no .index file comes out, im trying to use RVC GUI for ai covers. any fixes?
@akorneev777 2 months ago
Thank you for your hard work! This program is truly amazing. Literally the best of the locally running TTS setups. The quality of the result is outstanding. I just wonder if one could have some control over the reading speed. I didn't find this in the interface. Do we have some built-in text tags like , or something like this?
@Okuraio 4 months ago
I'm having this problem: "Possible latent mismatch: click the "(Re)Compute Voice Latents" button and then try again. Error: torch.cat(): expected a non-empty list of Tensors" What could it be?
@carolineito9312 3 months ago
Thanks for the video! Where can I find documentation about this tool so that I learn what each setting is intended to do? I'm having a hard time trying to control the emotions.
@DM-dy6vn 3 months ago
So, you will always need a chunk of the original voice for the model trained on this very data to work. Is that correct?
@user-td9fx6nq9t 4 months ago
Hey, thanks for offering your repo. How much VRAM is required to run this? I'm running an NVIDIA GTX 1650 card, and it's only got 4 GB. Is that enough, and if so, will it run slowly?
@user-iu7in7oo9t 5 months ago
Can we do a singing model with this? The old RVC web UI seems broken; it won't train anymore. I spent 2 days trying to make it work, but it won't proceed.
@elijahpavich1095 6 months ago
What if I already installed a previous Tortoise model from your other tutorial? Is there a way to update or download the needed RVC extension myself?
@Jarods_Journey
6 months ago
You need to redownload this package to get the RVC inference functionality
@XenkoVence 4 months ago
I've got a bit of a unique issue. Training doesn't seem to do anything. It'll run for a bit then seemingly pause. I gave it 12 hours to see if it was just running in the background and still nothing. The graphs don't even show up, so it's kinda hard to tell what's going on. Nothing in the terminal says there's an error, so I figured I'd bring it up here. Am I missing dependencies or something? Is it cause I installed it on an external hard drive instead of the D or C drive? There's not a lot to go on, but any help would be nice.
@antonvideo2390 5 months ago
Thank You Jarod very smooth.
@cleverestx 3 months ago
How are you getting the vocal samples though, for fictional characters? This part is the major chore from what I've noticed. What am I missing to make this easier?
@northwestrepair 4 months ago
How do I train it to speak like Morgan Freeman?
@ImagoPictures 5 months ago
Is there a way to prevent the output from having those "I'm really" before the generated sentences?
@gregorikeller692825 days ago
Hello Jarod, thank you for sharing. I need a tool like this with the ability to make API requests so I can do this in bulk. Does this tool allow it?
@EliteSparklz 5 months ago
I'm at a loss on how to create my own RVC model. Not sure if I can do it in-app or what program to download.
@313matze 4 months ago
Great video, thanks a bunch!
@spetheofmusic 6 months ago
this is awesome bro!
@Zencat42026 days ago
This is awesome, excellent work! Sorry for the dumb question, but how can I access this from other computers on my network? ip address:7860 is refusing a connection, and I can't seem to figure out why. Disabling the firewall does not fix the problem and I'm a bit stumped. Many thanks in advance!
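A refused connection from another machine often means the server is listening only on the loopback interface; Gradio binds to 127.0.0.1 by default unless it is launched with a different `server_name`. Whether this packaged build exposes a setting for that is not stated in the thread. As a first diagnostic, a small sketch like the following (the helper name `port_is_open` is hypothetical) can confirm from each machine whether anything accepts connections on the port:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

If `port_is_open("127.0.0.1", 7860)` is true on the host but the same check against the host's LAN address fails from another computer, the server is most likely bound to loopback only rather than blocked by the firewall.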
@AbdullahNiazi7177 6 months ago
ModuleNotFoundError: No module named 'bark'
ModuleNotFoundError: No module named 'vall_e'
RVC options are not showing in the UI. Some guidance, please?
@9ALiTY 4 months ago
7zip won't extract it, says unsupported compression type.
@GraveUypo 6 months ago
do you have a suggestion for language model that runs fast? I use amethyst-13b-mistral.Q8_0 and it is by far the best local model i've tested, by a landslide, completely different dimension. it is actually comparable to gpt4. but it takes like 90 seconds to generate each reply. it's like a person typing at ~70 wpm. maybe there's a model that's 10 times faster and 50% as good?
@Jarods_Journey
6 months ago
You might want to look at 7B parameter models and then look at what model loader you're using. If memory serves, exllama2 was the fastest in my testing.
@tea6310 3 months ago
Hey, when I train my voice it keeps saying "ai-voice-cloning>pause" What do I do?
@godofdream9112 6 months ago
YOU ARE DOING GODS WORK...
@jenilsoni8013 2 months ago
There are no files in the weights folder in my device, what can i do?
@michaelroberts1120 5 months ago
I have a question. Instead of bundling all the models and creating a file tens of gigabytes in size, why didn't you simply allow users to select and download the models they want after installation?
@thedeliverus 3 months ago
I'm a bit confused on why we need to add a voice audio sample if we are doing text to speech use case. Can i just keep it at random and use the RVC? Im new to this thanks!!
@Jarods_Journey
3 months ago
Yes, you could keep it random, but since it's random, it may produce male or female pitches, which you won't be able to correct with RVC, so it'll sound different on each generation.
@BroadwayCJ97 1 month ago
'runtime\python.exe' is not recognized as an internal or external command, operable program or batch file.
Press any key to continue . . .
I downloaded the zip file, extracted it, clicked start and this popped up. If I press a key it closes. What am I doing wrong?
@D-5000m
1 month ago
Same issue here. Did you download the ver 3?
@rickparsley4248 4 months ago
hey, can i use this along with audiobook maker from your past project ? i really like that
@jonogrimmer6013 6 months ago
Having issues with the voice sounding american using an existing trained UK voice via RVC?
@Aakash983 4 months ago
When I try to convert the voice, it throws an error in the output box. How do I fix it?
@danielkuperstein1835 5 months ago
is it possible to edit the text that is being said by the model?
@benfrombc 1 month ago
What are the requirements to achieve this? Can this be done with a 2018 Mac mini?
@jamesdolphin4158 5 months ago
I am getting this error whenever I attempt to run training. "Error no kernel image is available for execution on the device at line 167 in file D:\ai\tool\bitsandbytes\csrc\ops.cu" I have a GTX 1060, and have attempted to search online for any solutions but haven't found anything that has helped me despite my best efforts, installing different pytorch versions and such too. Any advice/help/solutions would greatly be appreciated.
@masterpav 6 months ago
There are many RVC variants, like Mangio-RVC, Applio, and more, for cloning a voice model, but which one is the best for cloning? I do have an NVIDIA GPU.
@ricoletta
6 months ago
No one size fits all. It's mostly context dependent based on my experience
@TurboJor 5 months ago
Cool! What about emotion custom models? Where can you find those?
@user-re4lt3dj1n 1 month ago
I installed rvc W-Okada and my voice doesn’t change, it’s the same. How can I fix that?
@Aks15314 6 months ago
The RVC model is not showing in the Colab notebook. Is there any solution?
@ValentineMJ 5 months ago
I can't get it to use the audio file as a prompt.
@mercy4001 5 months ago
Bro, I have been trying to figure out what's wrong with my RVC. It's unable to detect the voice samples that I put in a folder; I copied the folder path and pasted it into RVC to train my voice, but it just shows that it's unable to detect the file. What can I do to fix it? And if you can help, can I DM you on any of your social media accounts to show my desktop and the problem I'm having?
@Artificialintelligenceo 3 months ago
Really nice tool, really helpful video. Thanks for everything.
@Soljarag5 6 months ago
Awesome work man! The only thing that I don't understand is at 7:26. Is that a separate model you trained using TTS? What's the difference in that and the wav file you selected on the main tab?
@Jarods_Journey
6 months ago
If you've trained a voice model in tortoise before, this is for selecting that voice/autoregressive model. If you haven't, then you could disregard this section
@sag.ja.zur.botschaft 5 months ago
how can i pause the training (needs 2 days) and continue it later on? :O:O:O:O
@grahamulax 5 months ago
whoa........WHOA! This is hot. I was looking for elevenlabs alternatives and I think this is it. I love local training and having a 4090 (we have the same one! It rules!) helps a ton and I just dont want to use services! This rules! Instant SUB! Cant wait to try this later!
@alessiolanzillotta7929 5 months ago
Possible latent mismatch: click the "(Re)Compute Voice Latents" button and then try again. Error: Workspace can't be allocated, no enough memory. I have a GTX 960M. Are there any settings I can change to try this locally?
@Jarods_Journey
5 months ago
You're running out of vram, so unfortunately I think it may not be possible. You can try "checking" the option in the settings tab that enables low vram
@frosti7 4 months ago
So this repo can do TTV and not Voice-to-voice?
@kushaljain6711 2 months ago
How to create and get .pth file for the training purpose?
@DreamFilmVFX 6 months ago
Hi Jarods, awesome tutorial. I have a question for you. I have an RTX 3050 with just 4 GB of VRAM (I know that's pretty poor), but I've noticed that RVC inference using RVC-Vits is pretty fast; even when I put in 3 minutes of singing, the inference takes about 20 seconds. But when I use this TTS AI, it takes forever just to say "Hi i'm fabio and this is my voice" (using the ultrafast preset). Why? And is there a way to make it as fast as RVC-Vits? Thank you.
@clmcwilliams
5 months ago
When the guy named his TTS model "Tortoise" he was poking fun at how slow it was x.x
@DreamFilmVFX
5 months ago
🤣🤣🤣 yeah, I would have called it "snail" on my machine at least @@clmcwilliams
@TheSynan 3 months ago
Can these voice models be used with W-Okada VC?
@danielg9174 2 months ago
cmd: "KeyboardInterrupt ^CTerminate batch job (Y/N)?" Help?
@myte1why 6 months ago
@Jarods_Journey just a question: is this compatible with the book generator? And this is really good, man.
@Jarods_Journey
6 months ago
Yes, it'll work. Just make sure to turn off RVC in tortoise as RVC is already built into the audiobook maker
@myte1why
6 months ago
@@Jarods_Journey OK, thanks a lot. By the way, some quick feedback about the book maker: it works really well. I just feel that when it finishes full generation or regeneration of sentences, it would be cool to play a sound to indicate it's finished. But thanks a lot for this kind of tool 😁
@TheDog2M 4 months ago
This looks great, quick question: does it work with English speech only?
@PGRjoystick 4 months ago
any chance for colab pro version ? currently i don't have an expensive gpu for training
@cheesiangleow4782 5 months ago
I’m trying to clone my own voice too. Can this work on linux too? Because it seems like your main machine is windows
@33rdframe 2 months ago
you are a gem!
@joseijosei 4 months ago
Does it work for Spanish?
@NoahMine1 1 month ago
when i click start nothing happens on the cmd
@Dreem200 2 months ago
C:\Windows\System32>runtime\python.exe infer-web.py --pycmd runtime\python.exe --port 7897
The system cannot find the path specified.
Please, how do I fix this?
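The command above uses the relative path `runtime\python.exe`, which Windows resolves against the current directory, here `C:\Windows\System32`, so the file is looked up in the wrong place. A short sketch of that lookup, using a hypothetical helper name `resolve_from`:

```python
from pathlib import Path


def resolve_from(base_dir, relative_path):
    """Return the resolved path if relative_path exists under base_dir, else None."""
    candidate = Path(base_dir) / relative_path
    return candidate.resolve() if candidate.exists() else None
```

Calling `resolve_from(Path.cwd(), "runtime/python.exe")` from the folder where the package was extracted should return a path, while the same call from `C:\Windows\System32` returns `None`; changing into the extracted folder before running the command is the usual remedy.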
@mnitant 5 months ago
Can I use this Tortoise-RVC chain through the Gradio API?
@dougmaisner 3 months ago
great stuff!
@user-dr8se8tu1x 3 months ago
Hi, how do I use it for German or Hindi? I trained my voice for Hindi, but when generating text-to-speech I get a config error. Thanks in advance for your support.
@anti-dreamstansunited3391 5 months ago
I got error extracting, I've downloaded 7zip. Any idea why?
@panosjr_greece
5 months ago
Me too... Use winrar to extract it.
@Mark-vv8by 20 days ago
is there a way to do a batch TTS with this?
@BareillyFRS21 days ago
Why is a voice sample needed if you have already loaded a model of your voice?
@chad5893 1 month ago
Hey, great tuto. I'm getting a CUDA out of memory error after I go through everything and hit generate. Ive got a 4080. Any suggestions? EDIT: I've got 16GB of dedicated GPU mem, running start.bat allocates 4.8GB and then running Generate blows past 15GB then throws up the error. 2ND EDIT: Reduced the samples to 8 and now it works...lol...no more edits I promise.
@Molandria 3 months ago
Can RVC models be used in multiple voice changers?
@1ajayc14 days ago
where is the AIHUB discord to download models?
@soraygoularssm8669 6 months ago
I wanted to host the inference as an API on the cloud, I want it to be really fast similar to elevenlabs, what should I do? What GPU should I choose? What if I want to run 100 concurrent generations?
@Jarods_Journey
6 months ago
Repo isn't natively supported for the cloud. You'd need to adjust the repo to fit your needs here
@matthewthomasvallejos2525 6 months ago
Jarod you are awesome
@ahmetab06 6 months ago
Guys, is there a video where a famous voice actor tries this for a long time? I couldn't find a long video like elevenlabs where I could hear a few examples.
@Canna_Science_and_Technology 5 months ago
Looks great.
@muzaffarurokov3186 5 months ago
rvc_pipe not found module. Why?
@fekra-ads 6 months ago
File "ctypes\__init__.py", line 374, in __init__
FileNotFoundError: Could not find module 'C:\Users\ahmad\OneDrive\سطح المكتب\ai-voice-cloning\runtime\Lib\site-packages\torchaudio\lib\libtorchaudio.pyd' (or one of its dependencies). Try using the full path with constructor syntax.
How can I fix it?
@FlorianKMusic
5 months ago
have the same error
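One thing that may be worth ruling out here, purely as an assumption and not something confirmed in this thread, is the install location: the path in the error contains a non-ASCII folder name inside OneDrive, and native libraries on Windows sometimes fail to load from such paths. A minimal sketch for spotting those components, with the hypothetical helper name `non_ascii_parts`:

```python
from pathlib import PureWindowsPath


def non_ascii_parts(path):
    """Return the components of a Windows path that contain non-ASCII characters."""
    # PureWindowsPath parses backslash-separated paths on any operating system.
    return [part for part in PureWindowsPath(path).parts if not part.isascii()]
```

If the list is not empty, extracting the package to a short ASCII-only folder outside OneDrive is a cheap experiment before trying anything more involved.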
@WaifuRecaps 3 months ago
I'm currently testing workflow with openai generated TTS to RVC. openai seems to create superior base TTS to tortoise
@stevecato 6 months ago
GUI is great for testing, etc - but can it also be used from code via API? Thanks.
@Jarods_Journey
6 months ago
Yes, but ATM, you need to set up the RVC stuff inside of the AI voice cloning repo. Then you can call generate as you normally would from the Gradio interface.
@bread.banana. 3 months ago
Are my settings wrong? 1 sentence can take 5 minutes to generate... I am running on a $2,000 ASUS gaming laptop that I purchased last year. I followed all your steps on the previous "Local AI Voice Cloning with Tortoise TTS - 2024 Installation (Check LATEST update in description)" video. However, even after loading the trained finetune model, it takes forever just to generate 1 sentence. Am I doing something wrong?
@Jarods_Journey
3 months ago
Bring samples down to 2 and see if that helps. If your GPU isn't being detected, that could also be the cause, in which case you'd need to reinstall it.
@bread.banana.
3 months ago
@@Jarods_Journey THANK YOU! I brought it down to 2, but still no difference. I will keep playing around with the settings, reinstalling, etc. I know you're busy and don't want to pester you with too many obvious questions.
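For the GPU-detection possibility raised in the reply above, a quick check along these lines can show whether PyTorch sees a CUDA device at all. This is a minimal sketch; the function name `gpu_status` is hypothetical, and it only reports on the Python environment it is run in, so it would need to be run with the same interpreter the tool uses (other comments on this page mention a bundled `runtime\python.exe`).

```python
def gpu_status():
    """Describe whether PyTorch can see a CUDA GPU in this environment."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this Python environment"
    if torch.cuda.is_available():
        return f"CUDA GPU detected: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed but no CUDA GPU was detected"


print(gpu_status())
```

If no CUDA GPU is reported, generation is probably falling back to the CPU, which would be consistent with a single sentence taking minutes.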
@utoob22 1 month ago
Which software name is the RVC ? Where can we get it
@ArtorioVideojogos 4 months ago
For some reason I can't extract the whole file, it says "unsupported compression method" in 7-Zip.
@Dan-Levi
4 months ago
Try upgrading your 7-Zip installation
@koreangoku 4 months ago
With all these voice changers....does it make a difference using a cheap mic vs a super expensive mic?