p3tro
Day ago
153,103
1

So-Vits-SVC: Local Training Tutorial (How to make your own model)

Music

my socials:
email: nicholasmpetro@gmail.com
instagram: / itsp3tro
soundcloud: / nickp3tro
spotify: open.spotify.com/artist/5Kr7b...
try asking Chat-GPT for help! No joke
Have someone train a custom SVC model for you: www.customaivoices.com/
my original tutorial for running So-Vits-SVC locally: • So-Vits-SVC A.i. Vocal...
Github for 34j fork: github.com/34j/so-vits-svc-fork
A.I. World discord: / discord
There should be a Google Colab tutorial linked there as well!
audio slicer gui: github.com/flutydeer/audio-sl...
commands:
svc pre-resample
svc pre-config
svc pre-hubert
svc train -t
for pyutil: conda install pyutil
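The four commands above are run in the order listed. As a rough sketch only, that order can be captured in a small Python helper; `SVC_STEPS` and `run_pipeline` are hypothetical names made up for this example, and it assumes `svc` is already on your PATH and that you start it from the folder that contains `dataset_raw`.

```python
import shutil
import subprocess

# The four commands from the description, in the order they are listed.
SVC_STEPS = [
    ["svc", "pre-resample"],
    ["svc", "pre-config"],
    ["svc", "pre-hubert"],
    ["svc", "train", "-t"],
]


def run_pipeline(workdir=".", dry_run=True):
    """Run each step in order inside workdir and stop at the first failure.

    With dry_run=True (the default) nothing is executed; the planned
    command lines are only returned so the order can be checked first.
    """
    planned = [" ".join(step) for step in SVC_STEPS]
    if dry_run:
        return planned
    if shutil.which("svc") is None:
        raise RuntimeError("'svc' was not found on PATH")
    for step in SVC_STEPS:
        # check=True raises CalledProcessError if a step exits non-zero.
        subprocess.run(step, cwd=workdir, check=True)
    return planned


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

The video edits the config file after `svc pre-config`, so in practice you would stop after that step, change the settings, and only then continue with `svc pre-hubert`.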
chapters:
00:00 - Intro
00:20 - Easier Alternatives (AI World)
00:32 - Troubleshooting tips
1:33 - Overview
1:55 - Gathering your Data-set
2:10 - Ultimate Vocal Remover
2:25 - Audio Slicer G.U.I
3:34 - Folder Structure
3:55 - Setting Conda Location
4:20 - SVC pre-resample
5:05 - How to know it worked
5:21 - SVC pre-config
5:40 - Editing config file + settings
6:16 - Explaining steps/epochs/batch size
7:44 - Simplified settings explanation
8:13 - SVC pre-hubert
8:20 - Common errors
9:10 - SVC train -t
9:25 - ChatGPT troubleshooting example
10:00 - Using the Model
10:15 - Alternatives + Outro

Comments: 349

@GilaCees A year ago
I watched for research purposes and haven't implemented but the explanation was in depth and I could tell you have great understanding of the process. Subscribed and will keep an eye out, while I work on my understanding of local training. Thanks for the great content
@ItsJustAdrean A year ago
By far the most comprehensive, easy to follow guide on this confusing application. Great work. I'm a computer dunce after years of not flexing those brain muscles, and it's working!
@joon4470
A year ago
how long does it take
@embatocadosjaguanlimares409
A year ago
@@joon4470 to me on a rtx3090 = 3 hours
@ItsJustAdrean
A year ago
@@joon4470 it took me 2 days to get a good model on a 1060 card. Not bad
@AbelVelasco18 months ago
This is by far the best explanation I've seen on this topic. Thank you! Keep up the great work!
@ZeroBudgetLife A year ago
People who make these kinds of tutorials should be as clear and informative as you. Thank you. Hopefully my model trains successfully.
@goshochernii
A year ago
sorry for stupid question but, what gpu are you using?
@ZeroBudgetLife
A year ago
@@goshochernii That's a good question. I actually found a few open-source models which I use in Colab to train my model, and then I have so-vits on my PC. I have a very crappy 1050 Ti 4 GB, so I train in Google Colab (12 GB or 16 GB), then I take the .pth file and generate the mp3 to wav on my PC locally. Not the best workaround, but it works for me. I do want to upgrade to a better GPU.
@ekuly3592 A year ago
This really helps a lot! I was stuck at the 'dataset_raw' step and your video solved my problem, many thanks~
@watsonwrote A year ago
Thanks! Great explanation and I didn't even run into any errors during training.
@Sirbibi1308 A year ago
This was so helpful! Thank you so much! Very nice results after 8000 epochs... can't wait for 9999 to be finished!
@ferdyrp A year ago
Thank you so much! This video is by far the best and most well-explained one on this topic.
@TheMasterfulcreator A year ago
For the issue you're fixing @4:42 The github clearly explains to just use dataset_raw as the top folder without it being inside a dataset folder so there was no reason to have ever created a dataset folder with dataset_raw inside of it and that's why you had to move dataset_raw outside of that folder.
@suave5394 A year ago
Best training tutorial out rn brotha. Like actually. Spent 2 hours looking everywhere on how to figure this out. Thanks g.
@SlayBellsMusic7 months ago
I had Stable Diffusion installed prior to watching this. My first model is training right now. Thanks so much!
@VanAkita A year ago
A more practical explanation of steps, epochs, batch size, etc., based on my experience with the Diff-SVC code:
Steps = samples / batch size = how fast an epoch is generated.
Eval interval = the number of epochs needed to save your model's training progress.
If you let the model train indefinitely, the epochs will theoretically keep generating forever; you either close the command prompt or execute code to stop the process. If you plan to train the model to full quality in one sitting that runs for hours, set the eval number high. If you plan to do it in multiple sittings and in parts, pick a number in the hundreds or thousands, depending on your samples and GPU power. When that number of epochs is done, you get a G_ file, from which you can continue training later (for example, with an eval interval of 1000 you get a saved model at 1000, 2000, 3000, etc.). Generally, the higher the epoch count, the better the quality of the result.
Hope that makes it a bit clearer; it is actually not that complicated. Thank you for the video, I will finally start training my own models 😉☺️
@BRAZEN_Muse A year ago
this was amazing!!! I got it to work! the output was not perfect but I only used 73 samples. Still, this tech is very impressive. thank you so much for the tutorial bro!
@jmas679
10 months ago
bro how did u do it, i can't even get past the pre-resample step...
@marshalleq9 months ago
I'm 49 now, been in IT for about 30 years. This is the first time, I've ever seen a younger guy doing tech and thinking dang dude love your work. I can recall when I was younger being in a similar position where you're leading your field - my advice, never stop, never change to be a manager, just stick with what you're naturally good at and work will never be work. It is a rare thing for a tech person to be able to nail simple instructions in plain language. You got it. Well done.
@Molandria
3 months ago
Usually people find themselves being promoted until they hate their job and hate their life, and aren't good at their job anymore.
@user-jt5vm3mi1w
3 months ago
ageist
@Molandria
3 months ago
@user-jt5vm3mi1w What a silly comment. Recognizing patterns amongst groups is not any "ist"; rather, it's human, and recognizing patterns of behavior amongst groups can help us, as a culture, to recognize where problems are and help those who need it. Example: One group commits half the crime. You: "___ist!" Result: more victims, more loss of life. Us: Let's look into this. Oh look, around 6% of them are actually the perps! And their own group are their main victims! Let's help them! Result: fewer victims, less loss of life. But we are the bad ones, lol.
@truthmob1 A year ago
This is for the person who was having troubleshooting errors in the first like minute of the video. Open the start menu and search for "Environment Variables". Click on the "Advanced" tab and then the button that says "Environment Variables". Move your mouse down to the bottom window and scroll until you find the "Path" variable. Click on it and click edit. Click "New" and add the file location of script, in which the default looks like the following, "C:\Users\*user*\AppData\Roaming\so-vits-svc-fork\venv\Scripts". After you have added this just click okay until you are out of the Environment Variables window, restart your command prompt and it should work.
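To confirm the fix above took effect, you can check whether the Scripts folder really ended up in the Path variable. The sketch below is one way to do that in Python; `is_on_path` is a hypothetical helper written for this example, and the folder is the default location quoted in the comment, with `*user*` left as the placeholder it is there.

```python
import os


def is_on_path(directory, path_value=None, sep=os.pathsep):
    """Return True if directory is one of the entries of a PATH-style string.

    path_value defaults to the PATH of the current process, so run this
    from a freshly opened command prompt after changing the variable.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(entry):
        # Ignore surrounding spaces/quotes, a trailing slash and letter case.
        return entry.strip().strip('"').rstrip("\\/").lower()

    target = norm(directory)
    return any(norm(entry) == target for entry in path_value.split(sep) if entry.strip())


if __name__ == "__main__":
    # Default location from the comment; replace *user* with your user name.
    scripts = r"C:\Users\*user*\AppData\Roaming\so-vits-svc-fork\venv\Scripts"
    print("on PATH:", is_on_path(scripts))
```

If it prints `False` after you edited the variable, the command prompt was probably opened before the change and needs a restart.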
@theis37trials41 A year ago
this was great, exactly the issue i was having, thank you!
@mediumgentium A year ago
You are a legend, thank you very much, I fixed all the problems myself
@Elutai A year ago
W theres hardly been any good videos for this finally i can try it out >:)
@AI-Consultant A year ago
one day you will look back at this video and say, wow was that a lot of work, now we just drag and drop one or two maybe three vocal track acapella and boom it's trained
@cwdoby
A year ago
That's essentially what you do with eleven labs, just drag the vocal tracks and get audio. So yeah we are already there! It's just for these free options it's going to take a little bit more to get there.
@AI-Consultant
A year ago
@@cwdoby i don't think eleven labs does singing?
@KABZProductions
A year ago
@@AI-Consultant it does not.
@VenetinOfficial
A year ago
@@AI-Consultant can probably fake it with newtone/melodyne and a slow speaking speed.
@chopov11 A year ago
you are awesome, thank you, the so-vits AI community is so cool
@LoneRanger.801 A year ago
What an excellent video. You explain really well. Subscribed.
@jabrielmadeit A year ago
I finished training, but now I want to add more samples to see if I can get even better results, how do I do that?
@SantoValentino A year ago
Do you know how to make your second model? I made my first, thanks to you. I don’t know how to share it to the discord and I don’t know how to make a new voice model. For instance, I trained the voice “Rakim”. He’s in his own datasets folders. When I make a new folder with a new name, how do I avoid training Rakim again?
@karinkurusu A year ago
Heya, great vid! It was really helpful and easy to follow. Just one more question tho lol, how would I go about training a new model? I tried removing all traces of the previous dataset and stuff in my directory and repeated the whole process for a new dataset, but my epoch counter in the terminal still continued from where it left off. Does that mean anything? (I went with 472 epochs on the first model, then 2000 on the new one, when I started training, it started at 472/1999 epochs)
@Sunfox37
A year ago
Having the same issue, have you found out how to fix this?
@helmeteer A year ago
thank you for the tutorial, you explained everything pretty well. just one thing i'm not sure about; in your notes you have 110 samples / 7 batch size = 15 steps, but since 15*7=105, would that leave 5 samples unused? i have a batch number like that that doesn't really divide well and i'm not sure if i should go under or over with my step amount
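The 110 / 7 case in this question comes down to how the leftover samples are handled. A short sketch of the arithmetic follows; the function names are hypothetical. PyTorch's `DataLoader` keeps a short final batch by default (`drop_last=False`), but whether so-vits-svc-fork changes that setting is not confirmed anywhere in this thread, so both cases are shown.

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches (steps) needed to go through the dataset once."""
    if drop_last:
        # The incomplete final batch is skipped for this epoch.
        return num_samples // batch_size
    # The incomplete final batch is kept as one extra, smaller step.
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples, batch_size):
    """Size of the final batch when the short batch is kept."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size


if __name__ == "__main__":
    print(steps_per_epoch(110, 7))                  # 16 steps, last one is short
    print(steps_per_epoch(110, 7, drop_last=True))  # 15 steps, 5 samples skipped
    print(last_batch_size(110, 7))                  # 5
```

Either way the difference is a single step per epoch, so rounding up or down should not matter much for planning.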
@JoelDickinsonMusic A year ago
Could you make a short video (or tell me) how to resume training if you want to come back to it later after closing the session? I would be so grateful!
@pertyuk A year ago
Thanks for the video p3tro! Super helpful while installing this program. I just had one question: What would the process for making another dataset look like? Would I still use all 4 commands in the description? Thanks!
@bajza6046
A year ago
For any dataset you create, you would have to run those 4 commands, since the preprocessing is necessary for the actual program to work, but if you already ran those 4 commands on the dataset there's no need to run them again
@iankinzel A year ago
What happens if your vocal dataset includes vocals from a couple different vocalists? Like, say you don't want to create a clone of one specific guy, but you want to create a blend from a couple singers whose voices share a similar timbre.
@Robin6000 A year ago
These epochs still confuse the hell out of me. If I have 35 samples and a batch size of 35, that would mean that it would take 1 step to go through everything. Using your equation it would mean that 25000 steps would take 25000 epochs, right? So let's say I would x10 my sample size to 350 and keep the batch size at 35. That would mean that it would now take 10 steps to complete 1 epoch. Putting that into the equation, reaching 25000 steps now requires 2500 epochs instead of 25000. How does increasing my sample size decrease the amount of epochs? Shouldn't it increase it? Don't we want the epoch number to be fairly low so that it doesn't take so long to train? Am I dumb?
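The numbers in this question can be checked directly. The sketch below uses hypothetical helper names and the 25000-step target from the comment; it suggests the total work stays the same, and only the way it is divided into epochs changes.

```python
import math


def epochs_for_target_steps(target_steps, num_samples, batch_size):
    """Epochs needed to reach target_steps, keeping a short final batch."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return math.ceil(target_steps / steps_per_epoch)


def samples_processed(target_steps, batch_size):
    """Upper bound on samples seen after target_steps full batches."""
    return target_steps * batch_size


if __name__ == "__main__":
    print(epochs_for_target_steps(25000, 35, 35))   # 25000 epochs of 1 step each
    print(epochs_for_target_steps(25000, 350, 35))  # 2500 epochs of 10 steps each
    print(samples_processed(25000, 35))             # 875000 in both cases
```

With ten times the data, each epoch is ten times longer, so fewer epochs are needed for the same number of steps; the training time for 25000 steps at batch size 35 is roughly unchanged.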
@KamuiVox A year ago
how do i train another model, after i already trained a model, while still keeping the old model?
@paultrustmusic A year ago
What do you think the timeline will be for an easier-to-use interface? How far away are we from that?
@nenebotete9 months ago
hey p3tro! thanks for sharing your knowledge on using local training. I've been training my own voice ever since! btw, do you have information on how to adjust the following under so-vits fork v4.1.11?
- cluster infer ratio
- noise scale
- pad seconds
- chunk seconds
- max chunk seconds
thanks!!
@theis37trials41 A year ago
This video saved me! One question: when training new models, do I need to move or clear previous data or config files from other models to new locations?
@TheNickelodius10 months ago
I did everything according to the manual, and it was a success! Now, after all the lines I entered, there is a message 'epoch 6 (7,8)/9999'. Do I need to wait? Thanks
@Darkil A year ago
Hello, has anyone had a failure where you use the model and the output is a file without sound, although the console says that it completed successfully?
@Anya-ef3sb A year ago
Dude your videos helped so much! Do you know if this software could be used for audiobooks?
@premiumpatchstudio5 months ago
Guys, hasn't any of you had the same problem? At the beginning I have, for example, 2000 audio files. When I run pre-hubert they become about 1000. And when I start training, only 100 remain. Is that okay? What's the problem?
@SantoValentino A year ago
Do you have any idea why I wouldn't be able to edit and save the json file? I change the settings (like epochs) and it saves correctly, but when I train the process shows Epoch 1/9999. when i open the json file again it reverts back to default numbers.
@SantoValentino
A year ago
I figured it out. I have to edit the config file AFTER creating it with svc pre-config 🎉
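Following on from that fix, the edit itself can also be scripted so it is easy to repeat after every `svc pre-config` run. This is only a sketch: the helper names are hypothetical, the path `configs/44k/config.json` is an assumption about where the file is written, and the keys `epochs`, `batch_size` and `eval_interval` are taken from the "train" section described elsewhere in these comments.

```python
import copy
import json


def with_train_overrides(config, **overrides):
    """Return a copy of config with keys in its "train" section replaced."""
    updated = copy.deepcopy(config)
    train = updated.setdefault("train", {})
    for key, value in overrides.items():
        train[key] = value
    return updated


def edit_config_file(path, **overrides):
    """Load the JSON config at path, apply the overrides and save it back."""
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    updated = with_train_overrides(config, **overrides)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(updated, f, indent=2)
    return updated


if __name__ == "__main__":
    # Assumed location of the generated config; adjust to your own setup.
    edit_config_file("configs/44k/config.json", epochs=2000, batch_size=7)
```

Run it after `svc pre-config` and before `svc train -t`; if `svc pre-config` is run again, the values may be reset and need to be applied again.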
@arthurkirkland64878 months ago
Thank you so much for this tutorial! How do I start training a new voice, and also how do I add more samples to an existing voice?
@manoknight3780 A year ago
But what if i want to have same voice as combine, for example. Will that work properly?
@banduharisch A year ago
Great! Can you please explain what settings should be used in UVR to extract vocals from songs? UVR has several settings to choose from and I am unable to understand them. Thanks! Harischandra
@dcmb1967 A year ago
is it normal for the epoch count to start back from zero when continuing to train? So my friend wanted to try out my dataset so I zipped it while it was training and anaconda came back with a "permission denied" since the files were in use. After it finished zipping I just redid the training command, but instead of picking back where it left off, it started over? I looked on the github and they just said to use the same command, but not sure why its starting completely over
@carolh.7134 A year ago
I already began training my model, but there are some wav files that I wanna delete because they don't sound as good as the others. Will there be a problem if I delete these files? Or should I start training from zero?
@embatocadosjaguanlimares409 A year ago
AWESOME tutorial my friend. Can you tell me what I have to do to train a second model? Do I have to back up the config.json and the .pth file in a new folder, erase them from the svc folder, and only then train again?
@defmastadee A year ago
Thx for tutorial... I have a question.. if we have only one song of the artist, how can we have 100
@xevaken8 months ago
I got everything working: installation, model training and running the UI. But when I run the UI, my mouse, keyboard and even audio stop working. Nothing works except my trackpad for some reason. I'm on Windows 10 and there's no error message of any kind. Please help!
@straykidsurr1793 A year ago
I've been training a vocal for hours and nothing has been updated in the logs folder. after 6 hours reached 500 epoch
@fressvm2245 A year ago
tysm for the video bro
@Liza.Wharton A year ago
i'm only getting two results in the folder when i use the slicer.. what am i doing wrong?
@raazzzin A year ago
I'm getting the following error when clicking Infer. I am running on a Mac. I did train the model on my Mac as well. Any ideas, thanks in advance. RuntimeError: don't know how to restore data location of torch.storage.UntypedStorage (tagged with mps:0)
@catcherblockdigital A year ago
Is there a way to pause training and test?
@hmmokidk11 months ago
Any idea why my models sound really silly? Like this particular one only appears to hit 1 note. I selected samples from 3 different songs but it doesn't sound right at all.
@powerstationw A year ago
if i made a cloned voice with elevenlabs, can i use it with so vits svc? or do i have to train a separate model?
@nintendho3775 A year ago
Anyone know how to do this properly ? I’ll pay for a custom model I can’t seem to figure it out :(
@nicholaspierson2005 A year ago
Is there a way change already existing RVC voice model files into SVC voice model files?
@senomichaelsantosa4500 A year ago
Hi, I want to ask something. I'm running svc on Google Colab and I wanted to train it to 9999 epochs, but it got disconnected halfway. Does anyone know how to resume the training?
@chillingFriend A year ago
thank you very much for this! i got it to work on my pc, but i cannot get the training done, as my gpu is too weak... could you make a tutorial on how to train on paperspace? i cannot get it running there... would be really amazing! Thanks for your cool content and cheers!
@leo373Osakana A year ago
When I run svc train, I get a UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 102: character maps to <undefined>. Please help, I'm an informatics noob
@Aiovo A year ago
when i put the wav files into the audio slicer the results is still longer than 10 seconds, I am not sure why?
@p3tro
A year ago
Sorry if I didn't cover this well enough! Basically, it goes by the time between silences; I think it's the last setting. The lower you go, the shorter the clips. To get under 10 seconds I set mine to 25 ms, but maybe 10% of the dataset still came out longer, around 25 seconds, so I lowered it to 15 and ran the longer ones through that way.
@imakecontent22069 months ago
Is there any way to use RVC on Mac?
@shailendrarathore445 A year ago
How to use Google colab where to put reference audio file and Target audio file..
@Paparder4 months ago
I have a question. So far I've learned how to correctly implement data for the program to analyze. My question is: What happens after that? How does the program deliver a song? And how is this song structured? Is it random or can you tell the program to sing certain words in a certain key? Looking forward to your response.
@volu9913 A year ago
How do I use the svc commands without Anaconda? I've installed my so-vits-svc-fork directly with install.bat, not with Anaconda Powershell in an env.
@waleyqiao A year ago
What makes a good dataset? Does the vocal have to cover a wide range of phonetics, pitches, cadences?
@sludgebucket3042
A year ago
I would think so, I noticed some of the models I was testing today sounded more artificial at certain octaves and some sounded more natural at those same octaves, probably because the training data didn't have many examples of that singer singing those particular notes
@cwdoby
A year ago
You want your training to be good at what it does. Like if you're training a singer who can sing soft but also sing really hard, like Michael Jackson or Kurt Cobain, you are going to want different training for each style even if they use both in the same song. Then just using editor to put both in later. This will make your song sound the best.
@Antonsetiady A year ago
Can i do this on laptop with gpu rtx3080 vram16gb and amd ryzen9 5900hx Or better Intel core i9 137900x nvidia gtx4060 vram 12gb
@oldluk A year ago
can we use clean vocals from other sources from training? like interviews, podcasts, etc
@TheMasterfulcreator
A year ago
i've seen some good videos of obama singing. i'm about to try to make a lara croft one from dialogue. it's easy to try and see!
@vevo_ai A year ago
Much love, bro
@akhileshkannan842111 months ago
what would be the system requirements for this software? I have a very basic Laptop Computer which is pretty old too. Has a 6th gen i3 with onboard graphics. I want to make some AI voice overs of popular songs just for entertainment purposes.
@SantoValentino A year ago
FYI, I couldn't install pyutil with "conda install pyutil", so I had to google which command to use, and then it worked. Training my first samples now. Thanks petro
@SantoValentino
A year ago
I believe I used "pip install -U pyutil"
@AdelChamsdine A year ago
I can install this on mac m1?
@candyman3537 A year ago
When I run svc hubert -fm crepe, I will get the error message: RuntimeError: don't know how to restore data location of torch.storage.UntypedStorage (tagged with mps:0). I suspect it's because I'm running in Mac without GPU. My question is how to disable GPU?
@sharpcircle687511 months ago
Hi! I've been trying to train a model, but whenever there's an error or I close the window and reopen it to continue the training, it seems to start over each time (0/X epochs and 0/X steps). I also checked the log files and it seems to create a new one at each "svc train -t" run, so I'm not sure if it's possible to just pause and resume training lol
@sharpcircle6875
11 months ago
Well, turns out it might be due to the log eval number... There are no checkpoints at all if you set this number to the max x) I'm more comfortable with a number that saves a checkpoint every 50 or 100 epochs :v
@Blitztreeshorts A year ago
I need help plz wtf does he put the dataset folder and how does he setup the code on conda I’m confused
@firstland_fr11 months ago
How to setup many models ? In isolated env ?
@TuanNguyen-xw2qy2 months ago
hi, command: svc pre-resample, so what and where is svc?
@williamsjessem A year ago
So I had ChatGPT figure out what the numbers mean in the config file, and this is what it said:
"train": This section contains the training configuration.
"log_interval": The frequency (in number of steps) at which training progress is logged.
"eval_interval": The frequency (in number of steps) at which the model is evaluated on the validation set.
"seed": The random seed for reproducibility.
"epochs": The total number of training epochs.
"learning_rate": The initial learning rate for the optimizer.
"betas": Coefficients used for computing running averages of gradient and its square for the Adam optimizer.
"eps": Term added to improve numerical stability in Adam optimizer.
"batch_size": The number of samples per batch.
"fp16_run": Whether to use mixed-precision training (half-precision).
"bf16_run": Whether to use bfloat16 precision training.
"lr_decay": The learning rate decay factor.
"segment_size": Likely refers to the size of segments into which the data is divided for processing.
"init_lr_ratio": The initial learning rate ratio.
"warmup_epochs": The number of epochs for the learning rate warmup phase.
"c_mel": The weight of the mel spectrogram loss component.
"c_kl": The weight of the Kullback-Leibler (KL) divergence loss component.
"use_sr": Possibly stands for "use scheduled sampling rate". Scheduled sampling is a technique to make the model more robust.
"max_speclen": The maximum length of the spectrogram.
"port": The port number for logging or communication purposes.
"keep_ckpts": The number of model checkpoints to keep.
"num_workers": The number of worker processes for data loading.
"log_version": The version of the logging utility.
"ckpt_name_by_step": If true, checkpoints are named by the training step. If false, likely named by epoch.
"accumulate_grad_batches": The number of batches for which the gradient is accumulated before performing a backward/update pass.
"data": This section contains the data configuration.
"training_files": The location of the training data file.
"validation_files": The location of the validation data file.
"max_wav_value": The maximum possible amplitude of the waveform.
"sampling_rate": The sampling rate of the audio.
"filter_length": The length of the filter in the Fourier Transform applied to the audio signal.
"hop_length": The number of samples to step between frames in the Fourier Transform.
"win_length": The window length in the Fourier Transform.
"n_mel_channels": The number of Mel-frequency bands.
"mel_fmin": The minimum frequency in Mel-frequency calculation.
"mel_fmax": The maximum frequency in Mel-frequency calculation.
"contentvec_final_proj": Possibly a boolean indicating whether to use a final projection layer for the content vector.
"model": This section contains the model configuration.
"inter_channels", "hidden_channels", "filter_channels", "n_heads
@Highaith A year ago
Can someone please tell me if this works on a MacBook Air?
@pikasfed A year ago
I'm working on Google Colab, but I think my problem does not depend on that. I think I'm splitting the files correctly, they're all under 10 seconds, but when I run svc pre-resample it does find the folder, but it doesn't process any files, as if they were all too long or invalid. I've tried uploading them at different sample rates (44.1k, 48k, 44k), but I can't get it to work. They're all wavs in mono, that don't contain any weird characters in their names, just underscores and spaces, like yours. If you need more details please let me know, what could be causing this?
@hoshirou_to64p
A year ago
same as me :(
@pikasfed
A year ago
@@hoshirou_to64p I solved it. I was putting my audio files directly inside dataset_raw/, but apparently I needed to create another folder first, so dataset_raw/folder/files. Which is weird, because I remember trying that already yesterday and it didn't work, but whatever, it worked now and that solved it for me
@hoshirou_to64p
A year ago
@@pikasfed useful info! thanks for sharing bro!
@Nuns341
A year ago
@@pikasfed u have google lab pro? how long did training take?
@pikasfed
A year ago
@@Nuns341 no colab free. training time really depends on the dataset you have. for my first model it took 3-4 hours with RVC, but I spent days on the dataset
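The fix described in this thread (audio goes in `dataset_raw/folder/files`, not directly in `dataset_raw/`) is easy to check before running `svc pre-resample`. The sketch below is a hypothetical helper written for this example, not part of so-vits-svc-fork; it reports .wav files lying loose in the top folder and counts the .wav files inside each subfolder.

```python
from pathlib import Path


def check_dataset_layout(root="dataset_raw"):
    """Return (loose_wavs, wavs_per_subfolder) for a dataset_raw-style folder.

    loose_wavs lists .wav files placed directly in root, which is the
    mistake reported in the thread. wavs_per_subfolder maps each subfolder
    name to the number of .wav files found directly inside it.
    """
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} is not a folder")
    loose = sorted(p.name for p in root.glob("*.wav"))
    per_folder = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        per_folder[sub.name] = len(list(sub.glob("*.wav")))
    return loose, per_folder


if __name__ == "__main__":
    loose, per_folder = check_dataset_layout("dataset_raw")
    if loose:
        print("Move these into a subfolder:", loose)
    for name, count in per_folder.items():
        print(f"{name}: {count} wav files")
```

An empty list of loose files and at least one subfolder with a non-zero count matches the layout that worked for the commenter.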
@laykit3513 A year ago
i'd like to get some help for mac installation for audiosplicer cos it is a bit complex
@Musikforalla1 A year ago
"No audio data was found" in TensorBoard. How can I solve this?
@zetazetazetaOG A year ago
I need help plssss :( this svc : The term 'svc' is not recognized as the name of a cmdlet, function, script file, or executable program. Check if you typed the name correctly, or if you included a path, check that the path is correct and Try again.
@MySoundsYourEars
A year ago
Reinstall so-vits with the install command
@bruhmoment23123 A year ago
[W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [DESKTOP-N1DGT8B]:55495 (system error: 10049 - The requested address is not valid in its context.). C:\Users\user\anaconda3\envs\so-vits-fork\python.exe: Error while finding module specification for '__main__' (ValueError: __main__.__spec__ is None) i tried looking at ChatGPT and looking online, but I could not fix this error. i tried turning off firewalls. i tried removing things from the host file. i have tried a VPN. i have tried different network. does anyone know how I can fix it.
@4stacks67long A year ago
Do i only have to make one voice recording to work?
@christianmontero8312 A year ago
so if I have 100 samples can you tell me how many epochs, batch size etc? I tried using your example but it wouldn't work :( I actually left it alone and skipped your step initially and it started to load. I just can't get past svc pre-hubert properly.
@darkfield1952 A year ago
Is it possible to use an existing model and train it further? Or do you have to start over every time?
@InquireWithin
A year ago
Im wondering the same thing, this would be great to know how to do
@KenDoStudios A year ago
so i got so vits fork running but some how there are no files in the folder i installed it in. (the d usb drive) do i need to upload the files and put them in the correct folder? also i checked the files are not in my c drive either.
@p3tro
A year ago
They might be located in a temp folder on your main C drive! I used an Anaconda environment for that reason, because it can be hard to find the proper path of a regular Python installation. If you used Anaconda, look for a folder called "anaconda"; inside it will be "env", and in there will be the so-vits folder
@klaurcschwackerberg1880 A year ago
But you did not say this is only for MAC ! Why is there no RVC GUI version for PC users ?
@sludgebucket3042 A year ago
Will this work with 4gb vram? I have a GTX 960 4gb... or am I better off using google colab? And if so is there a tut to do this on google colab?
@philomade
A year ago
You want at least 12gb vram ideally
@Remzenhaide A year ago
please teach how to train model on colab
@brooklynrebels754 A year ago
8:07 How long does it take for you to train a model with those numbers? I followed the same settings shown here, I have 137 samples of my voice, and it's been 16 hours since I left my laptop running to train the model and my epoch is only at 71/9999. Hope you could help me out on this
@suave5394
A year ago
depends on your gpu. i have a 4090 and it took 6 hours to finish with similar numbers to his.
@VENOMPresents
A year ago
@@suave5394 training a model and my process doesn't show the goal... i mean he says the epoch number that is processing on the moment (which is the 99º) but doesn't say how much left to finish :(
@joooosl
A year ago
it's really heavy on your gpu. i have 315 samples and i'm at 62/9999 after 1 hour (rx 6800xt). i guess the solution is to turn down the epoch count, but i also don't want to waste all of the training already done. he said that if it stops mid-session you can restart it again. has anybody tried pausing it, changing the config to a lower epoch count and then continuing?
@matbeedotcom
A year ago
lol doing ML work on a laptop is hilarious
@Darkknight39
A year ago
@@suave5394 me too, thanks for this, now i know how long. does it matter what they are named?
@youtubegenixtv9736 A year ago
Is it possible to use this system for talking dialogs only? Is that possible?
@-kuler882
A year ago
Yes
@Horaceingrammusic11 months ago
How can I do this on a mac
@bombasticbagman A year ago
Can I do this to myself to make myself sound good in songs?
@BenPotts A year ago
i have a 2070 Super but for some reason only my CPU is being utilized?
@p3tro
A year ago
Possibly an error during the PyTorch installation? Not sure I’ve seen that before -!
@cwdoby A year ago
how to slice up rap? its going so fast the 10ms doesn't work.
@jabkadnia5293 A year ago
As I understand no so-vits-svc on my rx 580?
@WaltzeVT3 months ago
I tried going into the discord but the #ai-bot channel is not there anymore. Is there an alternative?
@bellekitty706711 months ago
What gpu do you need for this?
@youtubegenixtv9736 A year ago
can u make this with cpu?
@timshrum4064 A year ago
Why are you pointing it to the G_0.pth model vs. the D_0.pth model? What's the difference between the two?
@p3tro
A year ago
It was arbitrary why I did it in the video; I should have specified that the D path is generally the better of the 2 models, though!