Updated AI Voice Cloning with RVC Inference - Tortoise with RVC Local Installation
Science and Technology
Links referenced in the video:
AI Voice Cloning Repo - github.com/JarodMica/ai-voice...
How to get RVC Voice Models - • How to Get AI Voice Mo...
How to Train a Tortoise Voice - • Local AI Voice Cloning...
RVC/Voice Changer Playlist - • AI Voice Changer
Hardware for my PC:
Graphics Card - amzn.to/3pcREux
CPU - amzn.to/43O66Ir
Cooler - amzn.to/3p98TwX
RAM - amzn.to/3NBAsIq
SSD Storage - amzn.to/42NgMFR
Power Supply (PSU) - amzn.to/430bIhy
PC Case - amzn.to/447499T
Motherboard - amzn.to/3CziMXI
Alternative prebuilds to my PC:
Corsair Vengeance i7400 - amzn.to/3p64r22
MSI MPG Velox - amzn.to/42MnJHl
Cheapest recommended PC:
Cyberpower 3060 - amzn.to/3XjtZoP
Come join The Learning Journey!
Discord - / discord
Github - github.com/JarodMica
TikTok - / jarodsjourney
If you found anything helpful, please consider supporting me and the content I am trying to produce!
www.buymeacoffee.com/jarodsjo...
Comments: 276
PARTY TIME! Jarod you’re the best! The hero we needed 🎉
@Jarods_Journey
6 months ago
Thank you thank you Artholos :)!
just when i was starting to miss the gcollab rvc now i finally have it downloaded on my pc thank you so much
Awesome! I've taken a pause on AI for a bit to focus on some other things, but I'm excited for all the neat things waiting for me when I come back to it. Love your content!
@Jarods_Journey
6 months ago
All good, gotta do what you gotta do! AI will be here, bigger and better whenever you return to it. Appreciate it!
I used to make FL Studio tutorials on my main channel like this, straight to the point and super helpful and effective. Well played
Oh my god this is so impressive. I've been playing around with MRQ repo for a year now, and I've got some pretty good models out of it but this, I think, is going to take it to a whole other level. That was a great idea to link the two technologies! I can't wait for a Linux version too :)
The tts program that becomes rvc gives me great happiness.
I was a bit impatient waiting for the command lol. Thanks! You're awesome!
Oh JEEZ. This is simply incredible! I can't stress how impressed I am. You did an amazing job combining these two technologies! I mean... damn, god bless IT guys like you. This is outstanding. I have one question though... is it possible to get proper sound-based text-to-speech in languages other than English? Like, I don't know, would it be possible to select or type the language locale in a new input field before clicking generate, so the system recognizes it's not basic English? Just wondering what some favorite characters would sound like in translation, with their voices. :)
Thanks mate..!! You are one of the YouTubers who knows AI voice inside out. I'm a pro dev but not in the AI space. I find all this stuff exciting. I want to explore all the voice-related stuff but have time constraints. It would be great if you could make a Udemy course covering all aspects of text to speech. I would definitely purchase it.
@Jarods_Journey
6 months ago
Appreciate it! I've got something in the works so will probably announce it on my channel whenever I get things sorted!
Thank you, I learned a valuable lesson from this video. Don't do something yourself when someone else can do it better.
@KodaPaul
4 months ago
I feel you... Been struggling with this deepspeed cuda shit trying to make tortoise work lol
tysm for this istg ive been wanting to update my model and make new ones for a while- - a very happy vocaloid user
This is perfect for me as I am a Filmmaker that is working on a new project that will use some characters from a game that has a Wiki where I can download the voice lines and turn into a model and use this RVC to make them say the story without spending hours in editing myself. Thanks!
@ScrakSFMs
5 months ago
If this didn't exist, my idea was to record myself doing the character's lines and use RVC to make it sound like the characters, but that would be very time consuming and worse. So once again, a huge time saver. And in the RVC I used to clone my own voice, training has stopped working and I couldn't add any more voices, which sucked. So I hope this doesn't have any of the issues I've encountered so far.
@BenjaminTemplar
3 months ago
That is a great idea guys. I’m a filmmaker too. Great vibes!
@ScrakSFMs
3 months ago
@@BenjaminTemplar No problem, hope the best for you!
@benfrombc
1 month ago
What are the requirements to achieve this? Can this be done with a 2018 Mac mini?
Nice work! Great detailed info!
Great work. Thank you. But I have another question: What would have to be changed to be able to use other voice languages?
Thanks for the amazing work you've put in. I'm loving the results of your RVC update. Is there a way to turn off RVC in your audiobook maker and use the output from the new Tortoise RVC instead of the audiobook maker's built-in RVC?
@Jarods_Journey
6 months ago
I appreciate it and I'm glad you're finding these things useful. Unfortunately, it does not. I will have to update the audiobook maker to take that into account, as several things have changed since the initial release, just in case you want to disable/enable RVC. Thanks for the support 🙏!
Hey Jarods, amazing video! I have a question: is the training for RVC and TTS the same? Or do I have to train a model separately for RVC?
I will try to replace the old one I got with this. I haven't had time to work with it, so it's okay. But I want to ask, how many seconds or minutes does it take your computer to produce a 500-word AI voice recording? And are a Ryzen 5 5600X and an RTX 3070 good enough for this use case?
Great ! Can we upload a singing vocal recording and render it with a .pth file (RVC model), or is it for speech only ?
2:07 yes I need all those models thank you. It's like when I purchase an audio fx plugin and if it doesn't come with presets... I'm mad. I need some presets to help with my workflow. Voice models means I have choices for some of my videos where I don't have to train a model if I don't want to.
Hi! I'm trying to clone Zoom's voice from The Flash TV series. I used ElevenLabs as it was recommended to me as being a top AI voice cloning tool. I used Instant Voice Cloning but, despite sufficient and clear samples of his voice, the AI voice didn't sound that great. Would this be a suitable program for it or is there a best-in-class option I'm missing?
I noticed that the audio references are all very small wave files, is this the best way to do it? or is it just what you have? Would a single long file also be suitable reference and does it have to be in Wave format?
Hey Jarod! This is awesome, thanks for sharing. I'm working on cloning some singers' voices, and I was wondering if it's possible to clone the style of singing (i.e. vibrato) as well? On RVC, it seems to only layer the voice quality itself as opposed to the style of singing so was wondering if that is a viable option.
@Jarods_Journey
6 months ago
Tortoise doesn't transfer singing features so not possible there. As for RVC, that's where the index file comes in. It should help to reintroduce aspects of the original training files back into the output
@ricoletta
6 months ago
@@Jarods_Journey Got it, thanks!
i just made a voice but no .index file comes out, im trying to use RVC GUI for ai covers. any fixes?
Thank you for your hard work! This program is truly amazing. Literally the best of the locally running TTS setups. The quality of the result is outstanding. I just wonder if one could have some control over the reading speed. I didn't find this in the interface. Do we have some built-in text tags like , or something like this?
I'm having this problem: "Possible latent mismatch: click the "(Re)Compute Voice Latents" button and then try again. Error: torch.cat(): expected a non-empty list of Tensors" What could it be?
Thanks for the video! Where can I find documentation about this tool so that I learn what each setting is intended to do? I'm having a hard time trying to control the emotions.
So, you will always need a chunk of the original voice for the model trained on this very data to work. Is that correct?
Hey, thanks for offering your repo. How much VRAM is required to run this? I'm running an NVIDIA GTX 1650 card, and it's only got 4 GB. Is that enough, and if so, will it run slowly?
Can we do a singing model for this? the old rvc webui seem broken, it wont train no more i spent 2 days trying to make it work, but it wont proceed.
What if I already installed a previous tortoise model from your other tutorial. Is there a way to update or download the needed rvc extension myself
@Jarods_Journey
6 months ago
You need to redownload this package to get the RVC inference functionality
I've got a bit of a unique issue. Training doesn't seem to do anything. It'll run for a bit then seemingly pause. I gave it 12 hours to see if it was just running in the background and still nothing. The graphs don't even show up, so it's kinda hard to tell what's going on. Nothing in the terminal says there's an error, so I figured I'd bring it up here. Am I missing dependencies or something? Is it cause I installed it on an external hard drive instead of the D or C drive? There's not a lot to go on, but any help would be nice.
Thank You Jarod very smooth.
How are you getting the vocal samples though, for fictional characters? This part is the major chore from what I've noticed. What am I missing to make this easier?
how to train it to speak like morgan freeman ?
Is there a way to prevent the output from having those "I'm really" before the generated sentences?
Hello Jarod, thank you for sharing. I need a tool like this with the possibility of making API requests to do this in bulk. Does this tool allow it?
I'm at a loss of creating my own RVC model. Not sure if I can do it in-app or what program to download
Great video, thanks a bunch!
this is awesome bro!
This is awesome, excellent work! Sorry for the dumb question, but how can I access this from other computers on my network? ip address:7860 is refusing a connection, and I can't seem to figure out why. Disabling the firewall does not fix the problem and I'm a bit stumped. Many thanks in advance!
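A note on the question above: a Gradio app binds to 127.0.0.1 by default, so it answers on the host machine but refuses connections from other computers unless it is launched to listen on all interfaces. How this repo exposes that setting is not covered in the thread. The sketch below is only a way to confirm the symptom from Python; `is_port_open` is a hypothetical helper name, and the LAN address in the usage comment is a placeholder.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Usage on the host machine (7860 is the port from the comment above;
# 192.168.1.50 is a placeholder for the host's LAN address):
#   is_port_open("127.0.0.1", 7860)
#   is_port_open("192.168.1.50", 7860)
```

If the loopback check succeeds while the LAN check fails on the same machine, the server is most likely bound to loopback only, and the firewall is not the cause.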
ModuleNotFoundError: No module named 'bark' ModuleNotFoundError: No module named 'vall_e' RVC options are not showing in the UI ..?? Some guidance please.
7zip won't extract it, says unsupported compression type.
Do you have a suggestion for a language model that runs fast? I use amethyst-13b-mistral.Q8_0 and it is by far the best local model I've tested, by a landslide, a completely different dimension. It is actually comparable to GPT-4. But it takes about 90 seconds to generate each reply. It's like a person typing at ~70 wpm. Maybe there's a model that's 10 times faster and 50% as good?
@Jarods_Journey
6 months ago
You might want to look at 7B parameter models and then look at what model loader you're using. If my memory serves me right, exllama2 was the fastest in my testing, I think
Hey, when I train my voice it keeps saying "ai-voice-cloning>pause" What do I do?
YOU ARE DOING GODS WORK...
There are no files in the weights folder in my device, what can i do?
I have a question. Instead of bundling all the models and creating a file tens of gigabytes in size, why didn't you simply allow the user to select and download what models you want after installation?
I'm a bit confused on why we need to add a voice audio sample if we are doing text to speech use case. Can i just keep it at random and use the RVC? Im new to this thanks!!
@Jarods_Journey
3 months ago
Yes, you could keep it random, but since it's random, it may produce male or female pitches, all things you won't be able to adjust correctly with RVC, so it'll sound different on each generation
'runtime\python.exe' is not recognized as an internal or external command, operable program or batch file. Press any key to continue . . . I downloaded the zip file, extracted it, clicked start and this popped up. If I press a key it closes. What am I doing wrong?
@D-5000m
1 month ago
Same issue here. Did you download ver 3?
hey, can i use this along with audiobook maker from your past project ? i really like that
Having issues with the voice sounding american using an existing trained UK voice via RVC?
When I try to convert the voice it throw me an error on the output box. How to fix it?
is it possible to edit the text that is being said by the model?
What are the requirements to achieve this? Can this be done with a 2018 Mac mini?
I am getting this error whenever I attempt to run training. "Error no kernel image is available for execution on the device at line 167 in file D:\ai\tool\bitsandbytes\csrc\ops.cu" I have a GTX 1060, and have attempted to search online for any solutions but haven't found anything that has helped me despite my best efforts, installing different pytorch versions and such too. Any advice/help/solutions would greatly be appreciated.
There are many RVC tools, like mangio-rvc, applio and more, for cloning a voice model, but which one is the best for cloning? I do have an NVIDIA GPU.
@ricoletta
6 months ago
No one size fits all. It's mostly context dependent based on my experience
Cool! What about emotion custom models? Where can you find those?
I installed rvc W-Okada and my voice doesn’t change, it’s the same. How can I fix that?
The rvc model is not showing in the colab notebook is there any solution
I can't get it to use the audio file as a prompt.
Bro, I have been trying to figure out what's wrong with my RVC. It's unable to detect the voice samples that I packed in a folder. I copied the folder as a path and pasted it into RVC to train my voice, but it just says it's unable to detect the file. What can I do to fix it? And if you can help, can I DM you personally on any of your social media accounts to show my desktop and the problem I'm having?
Really nice tool, really helpful video. Thanks for everything.
Awesome work man! The only thing that I don't understand is at 7:26. Is that a separate model you trained using TTS? What's the difference in that and the wav file you selected on the main tab?
@Jarods_Journey
6 months ago
If you've trained a voice model in tortoise before, this is for selecting that voice/autoregressive model. If you haven't, then you could disregard this section
how can i pause the training (needs 2 days) and continue it later on? :O:O:O:O
whoa........WHOA! This is hot. I was looking for elevenlabs alternatives and I think this is it. I love local training and having a 4090 (we have the same one! It rules!) helps a ton and I just dont want to use services! This rules! Instant SUB! Cant wait to try this later!
Possible latent mismatch: click the "(Re)Compute Voice Latents" button and then try again. Error: Workspace can't be allocated, no enough memory. I have a GTX 960M. Are there some settings I can change to try this locally?
@Jarods_Journey
5 months ago
You're running out of VRAM, so unfortunately I think it may not be possible. You can try checking the option in the settings tab that enables low VRAM
So this repo can do TTV and not Voice-to-voice?
How to create and get .pth file for the training purpose?
Hi Jarods, awesome tutorial. I have a question for you. I have an RTX 3050 with just 4 GB of VRAM (I know that's pretty poor), but I've noticed that RVC inference using RVC-Vits is pretty fast: even when I feed it 3 minutes of singing, the inference takes about 20 seconds. But when I use this TTS AI, it takes forever just to say "Hi i'm fabio and this is my voice" (using the ultrafast preset). Why? And is there a way to make it as fast as RVC-Vits? Thank you.
@clmcwilliams
5 months ago
When the guy named his TTS model "Tortoise" he was poking fun at how slow it was x.x
@DreamFilmVFX
5 months ago
🤣🤣🤣 yeah, I would have called it "snail" on my machine at least @@clmcwilliams
Can these voice models be used with W-Okada VC?
cmd: "KeyboardInterrupt ^CTerminate batch job (Y/N)?" Help?
@Jarods_Journey just a question: is this compatible with the book generator? And this is really good, man.
@Jarods_Journey
6 months ago
Yes, it'll work. Just make sure to turn off RVC in tortoise as RVC is already built into the audiobook maker
@myte1why
6 months ago
@@Jarods_Journey OK, thanks a lot. By the way, some quick feedback on the audiobook maker: it works really well. I just feel that when it finishes a full generation or regeneration of sentences, it would be cool to give a sound indication that it's finished. But thanks a lot for this kind of tool 😁
This looks great, quick question: does it work with English speech only?
any chance for colab pro version ? currently i don't have an expensive gpu for training
I’m trying to clone my own voice too. Can this work on linux too? Because it seems like your main machine is windows
you are a gem!
Does it work for Spanish?
when i click start nothing happens on the cmd
C:\Windows\System32>runtime\python.exe infer-web.py --pycmd runtime\python.exe --port 7897 The system cannot find the path specified. Pls how to fix this
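A note on the error above: `runtime\python.exe` is a relative path, so it is resolved against the current working directory. The prompt shows `C:\Windows\System32`, which contains no `runtime` folder, and that alone is enough to produce "The system cannot find the path specified". The sketch below is a minimal check, assuming the extracted package keeps its interpreter at `runtime\python.exe` as the command implies; `find_bundled_python` is a hypothetical helper name, not part of the repo.

```python
from pathlib import Path


def find_bundled_python(install_dir):
    """Return the bundled interpreter path under install_dir, or None if missing."""
    # Assumed layout, taken from the command in the comment: <install_dir>/runtime/python.exe
    candidate = Path(install_dir) / "runtime" / "python.exe"
    return candidate if candidate.is_file() else None


# Usage: pass the folder the archive was extracted to, not the folder
# the command prompt happens to be in.
#   find_bundled_python(r"C:\path\to\extracted\folder")
```

A `None` result for the extraction folder would suggest an incomplete extraction. A valid path there would suggest the command was simply started from the wrong directory.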
can use this tortoise - rvc chain using gradio api??
great stuff!
Hi, how to use it for German or hindi Language? Trained my voice for hindi. But by generating Text to speech I am receiving Config Error. Thanks in advance for your support.
I got error extracting, I've downloaded 7zip. Any idea why?
@panosjr_greece
5 months ago
Me too... Use winrar to extract it.
is there a way to do a batch TTS with this?
Why is there a need for your voice sample if you already loaded a model of your voice?
Hey, great tuto. I'm getting a CUDA out of memory error after I go through everything and hit generate. Ive got a 4080. Any suggestions? EDIT: I've got 16GB of dedicated GPU mem, running start.bat allocates 4.8GB and then running Generate blows past 15GB then throws up the error. 2ND EDIT: Reduced the samples to 8 and now it works...lol...no more edits I promise.
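The fix described in the comment above (lowering the sample count until generation fits in GPU memory) can be sketched as a retry loop. This is an illustration only: `generate_with_fewer_samples` and the `generate` callable are hypothetical stand-ins, not functions from the repo, and `MemoryError` stands in for whatever out-of-memory exception the real backend raises (a PyTorch program raises a different exception type for CUDA out-of-memory).

```python
def generate_with_fewer_samples(generate, samples, min_samples=1):
    """Call generate(samples), halving samples after each out-of-memory failure.

    Returns (samples_used, result). Re-raises once min_samples also fails.
    """
    while True:
        try:
            return samples, generate(samples)
        except MemoryError:  # stand-in for the backend's out-of-memory error
            if samples <= min_samples:
                raise
            # Halve the sample count and try again, never going below the floor.
            samples = max(min_samples, samples // 2)
```

Halving is an arbitrary choice made for brevity. Fewer samples lower peak memory use, and any effect on output quality is not discussed in the thread.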
Can RVC models be used in multiple voice changers?
where is the AIHUB discord to download models?
I wanted to host the inference as an API on the cloud, I want it to be really fast similar to elevenlabs, what should I do? What GPU should I choose? What if I want to run 100 concurrent generations?
@Jarods_Journey
6 months ago
Repo isn't natively supported for the cloud. You'd need to adjust the repo to fit your needs here
Jarod you are awesome
Guys, is there a video where a famous voice actor tries this for a long time? I couldn't find a long video like elevenlabs where I could hear a few examples.
Looks great.
rvc_pipe not found module. Why?
File "ctypes\__init__.py", line 374, in __init__ FileNotFoundError: Could not find module 'C:\Users\ahmad\OneDrive\سطح المكتب\ai-voice-cloning\runtime\Lib\site-packages\torchaudio\lib\libtorchaudio.pyd' (or one of its dependencies). Try using the full path with constructor syntax. How can I fix it?
@FlorianKMusic
5 months ago
have the same error
I'm currently testing workflow with openai generated TTS to RVC. openai seems to create superior base TTS to tortoise
GUI is great for testing, etc - but can it also be used from code via API? Thanks.
@Jarods_Journey
6 months ago
Yes, but ATM, you need to set up the RVC stuff inside of the AI voice cloning repo. Then you can call generate as you normally would from the Gradio interface
Are my settings wrong? 1 sentence can take 5 minutes to generate... I am running on a $2,000 ASUS gaming laptop that I purchased last year. I followed all your steps on the previous "Local AI Voice Cloning with Tortoise TTS - 2024 Installation (Check LATEST update in description)" video. However, even after loading the trained finetune model, it takes forever just to generate 1 sentence. Am I doing something wrong?
@Jarods_Journey
3 months ago
Bring samples down to 2 and see if that helps. If your GPU isn't being detected, that could also be the cause, in which case you'd need to reinstall it
@bread.banana.
3 months ago
@@Jarods_Journey THANK YOU! I brought it down to 2, but still no difference. I will keep playing around with the settings, reinstalling, etc. I know you're busy and don't want to pester you with too many obvious questions.
What is the name of the RVC software? Where can we get it?
For some reason I can't extract the whole file, it says "unsupported compression method" in 7-Zip.
@Dan-Levi
4 months ago
Try upgrading your 7-Zip installation
With all these voice changers....does it make a difference using a cheap mic vs a super expensive mic?