The Secrets Behind Voice Cloning & AI Covers

Science and Technology

To try everything Brilliant has to offer, free, for a full 30 days, visit brilliant.org/bycloud. The first 200 of you will get 20% off Brilliant’s annual premium subscription!
Have you ever wondered how AI covers are made? How are presidents playing Overwatch together? In this video you'll find out all the details about AI-generated voices, AI voice cloning, and voice deepfakes, which are literally everywhere on the internet right now. From memes to AI covers, AI voice synthesis has taken the spotlight without most people knowing what it is or how it is done. In this video, I'll cover the basics of how AI voice works and how people are using this technology to make the things you have seen.
Special thanks:
- Synthetic Voices
- JustinJohn
- and my editor Askejm
Online Services
[Uberduck] uberduck.ai/
[Fakeyou] fakeyou.com/
[ElevenLabs] elevenlabs.io/
Local UIs
[Tacotron2] github.com/BenAAndrew/Voice-C...
[Tacotron2 Tutorial] • Voice Cloning App
[Ultimate Voice Remover 5] github.com/Anjok07/ultimatevo...
[TorToiSe] git.ecker.tech/mrq/ai-voice-c...
[TorToiSe Tutorial] • Local Voice Cloning fo...
[so-vits-svc 4.0] github.com/voicepaw/so-vits-s...
[so-vits-svc 4.0 Tutorial] • Super Fast Voice To Vo...
[so-vits-svc 5.0 (NEW)] github.com/PlayVoice/so-vits-...
[RVC] github.com/RVC-Project/Retrie...
[RVC Tutorial] • AI Voice Cloning for S...
This video is supported by the kind Patrons & KZread Members:
🙏Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä
[Discord] / discord
[Twitter] / bycloudai
[Patreon] / bycloud
[Music] massobeats - lotus
[Profile & Banner Art] / pygm7
[Video Editor] @askejm
0:00 Intro
2:06 Text-to-Speech AI backbones
3:38 Vocoder AI backbones
4:54 Voice2Voice AI backbones
7:51 TalkNET
8:10 Online services
10:42 Local UIs
11:46 Ultimate Combo?!
13:14 TorToiSe + RVC vs ElevenLabs Pro voice
15:29 Sponsor & Outro

Comments: 171

  • @bycloudAI
    @bycloudAI 11 months ago

    To plug the sponsor: to try everything Brilliant has to offer, free, for a full 30 days, visit brilliant.org/bycloud. The first 200 of you will get 20% off Brilliant’s annual premium subscription! P.S. Nothing in this video is voiced by a real person. All the voices are fake (except for 12:32 lol). The first 1 min (0:00~0:58) is generated using voice2voice with my real voice as the reference. 0:58~12:47 is generated with the combo which I mentioned at 11:46. From 11:46 till the end is all ElevenLabs Pro Voice Cloning.

  • @bycloudAI

    @bycloudAI

    11 months ago

    @@thelegendguyofficial dw the music and the content is not HAHAHA and will probably not be anytime soon here's the music yt link kzread.info/dash/bejne/YoatyaRmncTWo84.html this person makes banger lofi, go support them

  • @NevelWong

    @NevelWong

    11 months ago

    @@bycloudAI So.... if it's AI generated, it cannot be copyrighted, right? So if I use this copyright-free voice to train a model, and I then use that model to narrate my own videos, that would be legal, right? I am equal parts concerned and titillated.

  • @jamessharpe2630

    @jamessharpe2630

    11 months ago

    @@NevelWong voices in general can't be copyrighted. If it was a slogan (arrangement of sounds) or roar/yell then yeah, copyrightable.

  • @Mark_Rober

    @Mark_Rober

    11 months ago

    I was thinking to myself every so often 'his voice sounds a bit fake' but I swear it was just because this video was about cloning AI voices and if you had done anything else, like make a minecraft video for example, I wouldn't even have imagined it being AI.

  • @Deagan

    @Deagan

    11 months ago

    based.

  • @__aceofspades
    @__aceofspades 11 months ago

    I didnt realize this was AI narrated until you said it was... I just assumed the scuff in the audio was due to using a worse mic like from a laptop or some screw up when editing, it sounded off but not AI off. As much as I believe AI is the future, we are clearly going to be in for a very very rough ride from here on out. You'll basically only be able to trust that something was real if you saw it in person, no audio, no pictures, and no video will be trustworthy.

  • @gh0stpyram1d

    @gh0stpyram1d

    11 months ago

    fr i had a whole mental picture of how this admin looked and i realize that was a mental picture of a robot lmaoooo

  • @asdfssdfghgdfy5940

    @asdfssdfghgdfy5940

    11 months ago

    Nah there are relatively simple ways of digitally signing things to prove you said them or filmed them etc. It will become a problem for the masses for sure especially if people keep believing whatever they see on Facebook. It will be easy enough for the more tech savvy peeps, or people who are required to vet things (e.g. Reporters) to work out if they are real or not. Or at least if they have been signed or not.

  • @quazar-omega

    @quazar-omega

    11 months ago

    Then the Matrix credits roll in inside your eyes

  • @kamranheer203
    @kamranheer203 11 months ago

    WTF I thought that was your voice. I guess generative AI these days is something else.

  • @albertsitoe7340

    @albertsitoe7340

    11 months ago

    I struggle to understand how society will even function in the next 50 to 100 years

  • @David.Alberg

    @David.Alberg

    11 months ago

    @@albertsitoe7340 Bro all the experts struggle if the society will function in 3-5 years 😂

  • @Kynatosh

    @Kynatosh

    11 months ago

    I heard artifacts so I had doubts

  • @trollenz
    @trollenz 11 months ago

    Everything you always wanted to know about speech synthesis* (*but you've never found). Thanks mate for this masterclass ! ❤

  • @rotors_taker_0h
    @rotors_taker_0h 11 months ago

    Nice information dump, good job on collecting all this info. To be honest, this tech is good enough that I wouldn't be surprised if any of your previous videos were voiced by AI too. As a random youtube viewer I have no idea if cartoon cloud's voice is a real person or totally generated anyway.

  • @juanjesusligero391
    @juanjesusligero391 11 months ago

    Your videos are the best, seriously! Not only do you keep us in the loop about all the cool AI stuff, but you also manage to make it super entertaining. Big thumbs up, man! :D

  • @phizc
    @phizc 11 months ago

    First time viewer here. When this video showed up in my feed, that click-baity title almost made me skip it, but this is definitely the best video about different options for TTS and voice cloning I've seen yet. Well done. I'll definitely stick around and see what other videos you've made.

  • @krystiankrysti1396
    @krystiankrysti1396 11 months ago

    "most boring" bit you mention is actually the most useful info in this video, links to websites and what theyre for

  • @zjihf
    @zjihf 11 months ago

    Thank you I ve been searching for this so long

  • @wuy4
    @wuy4 11 months ago

    That Asmongold cameo lol

  • @zyxwvutsrqponmlkh
    @zyxwvutsrqponmlkh 11 months ago

    You knocked this one out of the park. A+ video.

  • @gameb30232
    @gameb30232 11 months ago

    this is so cool i wanted to do this for so long! thank you!

  • @icedude_907
    @icedude_907 11 months ago

    Thanks so much for this - this is a great place to start for AI voice generation on local machines. I'm eager to experiment on mine

  • @iambinarymind
    @iambinarymind 11 months ago

    Fantastic overview. Much thanks, bycloud

  • @andreya.l.1270
    @andreya.l.1270 11 months ago

    I missed your videos man, good work, keep it up

  • @netoeli
    @netoeli 11 months ago

    The fact that you have to let us know that was not an actual real discord call with asmongold, as if the intelligence in the choice of words did not give it away already

  • @Askejm

    @Askejm

    11 months ago

    TRUE

  • @shadowrealms2676

    @shadowrealms2676

    10 months ago

    @@Askejm BIG W!

  • @Siacourage
    @Siacourage 4 months ago

    Best video about AI voice cloning I've found so far on the internet. I'm saving it to revisit later when I have more powerful hardware to run the Tortoise and RVC combo. In the meantime I think Eleven Labs will suit my needs. Thanks for all the great info. Subscribed.

  • @sneedtube
    @sneedtube 11 months ago

    Wew lad, one of the best vids that I watched in months. God-tier quality!

  • @USBEN.
    @USBEN. 11 months ago

    BRUH made whole video with this, EPIC!

  • @Shiina_Mashiro
    @Shiina_Mashiro 11 months ago

    6:47 nope. It was sovits. They used my weeknd model. Sovits is pretty good at raw studio quality vocals assuming the dataset is good. Which my weeknd model isnt it lol

  • @jurandfantom
    @jurandfantom 6 months ago

    At last I managed that! Thank You ByCloud !

  • @absence9443
    @absence9443 11 months ago

    Beautiful video! Really helpful :)

  • @4.0.4
    @4.0.4 11 months ago

    Might be good to mention you can run Whisper locally to transcribe audio. The large-v2 model is better than whatever KZread uses, even if slow.

  • @Askejm

    @Askejm

    11 months ago

    Well its included by default in MRQs tortoise ui and i think RVC uses it too

  • @jurandfantom
    @jurandfantom 11 months ago

    So if I get it right:
    1) record voice
    2) use Whisper to get a transcription (+ some fixes of the text)
    3) use a text-to-voice model that is similar to our voice
    4) use voice-to-voice (that model needs to be trained on our own voice)
    ---
    - Training of the voice happens once.
    - We are doing all of that to make our dialog more smooth, but we still make a voice-over to the video for correct speed and length of the video (not the case when the video is created after voice creation).
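A rough sketch of how the four steps in the comment above chain together. Everything here is a hypothetical placeholder: `Clip`, `run_pipeline`, and the `stub_*` functions are made up for illustration and are not part of Whisper, TorToiSe, or RVC. Real tools would replace the stubs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Clip:
    """Placeholder for audio: only the spoken text and the speaker are tracked."""
    text: str
    speaker: str


def run_pipeline(
    recording: Clip,
    transcribe: Callable[[Clip], str],
    fix_text: Callable[[str], str],
    text_to_speech: Callable[[str], Clip],
    voice_to_voice: Callable[[Clip], Clip],
) -> Clip:
    """Chain the stages in the order the comment lists them."""
    transcript = fix_text(transcribe(recording))  # step 2: transcript plus manual fixes
    draft = text_to_speech(transcript)            # step 3: a similar-sounding TTS voice
    return voice_to_voice(draft)                  # step 4: convert to the trained target voice


# Hypothetical stubs so the sketch runs without any model installed.
def stub_transcribe(clip: Clip) -> str:
    return clip.text


def stub_fix_text(text: str) -> str:
    return text.strip().capitalize()


def stub_tts(text: str) -> Clip:
    return Clip(text=text, speaker="stock-voice")


def stub_v2v(clip: Clip) -> Clip:
    return Clip(text=clip.text, speaker="my-voice")


result = run_pipeline(
    Clip(text="  hello there ", speaker="me-raw"),  # step 1: the raw recording
    stub_transcribe, stub_fix_text, stub_tts, stub_v2v,
)
print(result)
```

Passing the stages in as functions keeps the order explicit and lets any one stage be swapped (for example, a different voice-to-voice model) without touching the rest.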

  • @l.halawani
    @l.halawani 4 months ago

    super interesting, as an AI Product Owner i find your videos invaluable to quickly catching up with all tech at once.

  • @flowerpt
    @flowerpt 11 months ago

    Wow, that was dense - awesome!

  • @ceticx
    @ceticx 11 months ago

    amazing video, if this isn't a 1/10 confetti video just know it deserves to be

  • @bycloudAI

    @bycloudAI

    11 months ago

    its a 10/10 bottom feeder lol rip

  • @pikaa-si9ie

    @pikaa-si9ie

    11 months ago

    @@bycloudAI I'll give you a like to try to push the algorithm 👍😁😁

  • @zenu903
    @zenu903 11 months ago

    I was actually fooled too and didn't realize it wasn't his voice until he pointed it out. Any imperfection you hear could be confused with his accent anyway and his monotone voice also helps so it makes it extra hard to spot

  • @dudedude-su7pt

    @dudedude-su7pt

    11 months ago

    There thousands of channels like this lol. Most people don't know what voice is robotic or real

  • @krishp1104
    @krishp1104 11 months ago

    wtf this is the first time AI actually fooled me

  • @ojsef39

    @ojsef39

    11 months ago

    i was eating while watching and only noticed it because of the muffle and the red line in my peripheral vision hahaha

  • @ojsef39

    @ojsef39

    11 months ago

    oh damn, i wasn’t at the part where he revealed it yet. im shocked hahah

  • @handle__

    @handle__

    11 months ago

    @@ojsef39 same. When I first saw the comments when I haven't yet reached that part I thought people meant the red line parts, but then mind blown🤯😮

  • @wham7125

    @wham7125

    8 months ago

    Definitely not the first time, but you wouldn't know that of course.

  • @LinkRammer
    @LinkRammer 11 months ago

    I had absolutely no idea that your voice was completely ai generated... WHAT?!?!?!

  • @quinnherden

    @quinnherden

    4 months ago

    Definitely not. Just that one section :)

  • @ShepoPL
    @ShepoPL 11 months ago

    At 0:09 I realized that was AI model of your voice. It's hilarious to listen to AI talking about how great voice deepfake is 😂

  • @Askejm

    @Askejm

    11 months ago

    well thats funny because the first minute is his real voice

  • @ShepoPL

    @ShepoPL

    11 months ago

    @@Askejm You're wrong my guy. Listen carefully when he talks with high pitch and compare it with his other videos where he talks this way. You will hear the slight difference

  • @Askejm

    @Askejm

    11 months ago

    @@ShepoPL no, he did narrate it normally. the artifacts are probably because we added V2V for it to be consistent with the rest of the video. as this was done with RVC v1, it led to some artifacting despite a ground truth input

  • @quinnherden

    @quinnherden

    4 months ago

    @@Askejm He mentions at the end that this is AI

  • @gh0stpyram1d
    @gh0stpyram1d 11 months ago

    Goated Ai channel

  • @akshatgarg6635
    @akshatgarg6635 6 months ago

    Can you please tell how did you train TorToise TTS in your voice. I saw the repo but it is not mentioned how to fine-tune it on your voice

  • @beowulf2772
    @beowulf2772 11 months ago

    Hey! your videos are very professional and well edited! You deserve this like and comment.

  • @marian3248
    @marian3248 11 months ago

    I was watching this video at 2x speed and got giga fooled by your ai voice, I really couldn't tell this wasn't you.

  • @lucas_zampar
    @lucas_zampar 10 months ago

    Great video!

  • @bodyswapai
    @bodyswapai 11 months ago

    Love your videos!

  • @jan-Juta
    @jan-Juta 11 months ago

    Just waiting for Live V2V to become viable in the open source space. Would be insane for tabletop RPGs and VA for solo projects. Live RVC is kinda working, but not very well.

  • @4.0.4

    @4.0.4

    11 months ago

    VA for solo projects doesn't need to be live, why trade quality for speed in that case?

  • @Kisai_Yuki

    @Kisai_Yuki

    11 months ago

    It already is. You can use the RVC software to create an ONNX and then take the ONNX to MMVCServerSIO. It will work with very little tweaking. The problem is that RVC is more of an auto-tune. It will not change someone's gender, accent or age. It can only create a voice filter. And what is being passed off as "AI singing cover" is really just laundering someone elses singing through this pitch tuning. So taking one singer and using it to sing a different singer, tuned ON that singer, isn't actually a cover, at least not by what the term "cover" means. But it is useful for creating a character voice. So if one were so inclined, a D&D campaign could be made very interesting by using the RVC to train voices (eg a deeper voice for barbarian troll, and a higher pitch voice for a dwarf or halfling) and the GM could create unique NPC's for characters without having to strain their voice.

  • @sujimatsubackupaccount194
    @sujimatsubackupaccount194 11 months ago

    RVC retains the core trained voice while sounding smooth. so-vits-svc removes most of the trained voice's personality, bases the result more on the voice in the source audio and, weirdly enough, makes the voice sound flat. Even for talking, RVC has the better strengths, though it suffers on sharp note transitions like C2 to C5, which can cause issues.

  • @stephantual

    @stephantual

    7 months ago

    Exactly. And don't get me started about accents ;) My 'charming' french accent is the bane of these tools.

  • @naeemulhoque1777
    @naeemulhoque1777 10 months ago

    this video is gold

  • @PriyanshuGupta007
    @PriyanshuGupta007 11 months ago

    Bro You Are Amazing.

  • @fnytnqsladcgqlefzcqxlzlcgj9220
    @fnytnqsladcgqlefzcqxlzlcgj9220 11 months ago

    WOAH i didnt notice it was AI and I work with audio constantly. trippy!

  • @lauraalvarezgonzalez6184
    @lauraalvarezgonzalez6184 11 months ago

    Thanks!!

  • @Beyondarmonia
    @Beyondarmonia 11 months ago

    That "listening to right now" hit me like a freight train. Came to the comments and happy to see everyone else is having a similar reaction.

  • @sharptrickster
    @sharptrickster 11 months ago

    Do we currently have any TTS pipeline with good enough quality for non-english languages?

  • @Askejm

    @Askejm

    11 months ago

    your best bet is probably 11labs multilingual, which still only supports a handful of languages

  • @YoIomaster
    @YoIomaster 11 months ago

    another great video. keep it up brother! QUESTION: I want to wait until fall before I buy a new gfx card, because AMD is gonna enable shader conversion (basically allowing high-end consumer cards to use CUDA-coded AI tools). I really struggle learning new things with my 6gb 1660 Super, but I also don't want to support Nvidia's incredible greed and market manipulation. Would you recommend me to wait and support AMD, or what would be the route you would go? I want to go full audio synth setup and I'm already using Stable Diffusion 1.5

  • @_Everything_is_Fine_
    @_Everything_is_Fine_ 11 months ago

    are we only limited to voice cloning? is there any voice generator that generates a new voice, like by changing parameters, or combining two voices to give one new voice?

  • @yuyiko
    @yuyiko 11 months ago

    great video. really love all of this AI content (keywords for youtube ;P )

  • @BHBalast
    @BHBalast 11 months ago

    Lol, on my smartphone i cant even tell a difference between your Real voice and fake ones!

  • @krishp1104

    @krishp1104

    11 months ago

    At the end he says ALL audio in this video is AI generated

  • @BHBalast

    @BHBalast

    11 months ago

    @@krishp1104 NOT all, there was a Little fragment. :)

  • @krishp1104

    @krishp1104

    11 months ago

    @@BHBalast no literally all audio in the video is AI generated

  • @BHBalast

    @BHBalast

    11 months ago

    @@krishp1104 I Dont get it, in his comment he says one fragment is not.

  • @_Sepherial
    @_Sepherial 6 months ago

    How do I use a cloned voice to read aloud a pdf file?

  • @slime-smp
    @slime-smp 11 months ago

    Can you please make a tutorial on how to do this its very confusing

  • @shApYT
    @shApYT 11 months ago

    Watching at 2x completely smooths out any bumps that rvc has. The cadence sounds off after pointing out that it is AI.

  • @stephantual
    @stephantual 5 months ago

    Unironically still the best primer on the topic, 5 months on! (which is prehistory in AI) - 🤠

  • @lll-yq4hu
    @lll-yq4hu 11 months ago

    Great vid

  • @liam10000888
    @liam10000888 11 months ago

    I really like this type of video from you! The ai news was great, but as a layman it was too scattered

  • @mineralbunny8736
    @mineralbunny8736 10 months ago

    Ah that “crappy” free KZread course Harvard let us have 😂 I actually took the Java CS50 class there and it was very good… I like that they record them so you can watch later!

  • @GavrikCat
    @GavrikCat 11 months ago

    What about BARK? But I guess it's not so good. Also, what option would be the best in terms of inference speed?

  • @mastermohit
    @mastermohit 11 months ago

    I can't wait for asmin to react to this

  • @OxidoPEZON
    @OxidoPEZON 11 months ago

    Can you have this narrate your weekly AI news videos? I loved that series, and I really would watch them all the same with this voice, I didn't notice until you exposed yourself.

  • @steve_jabz
    @steve_jabz 11 months ago

    Is RVC still better now that so-vits-svc 5.0 is out?

  • @JohnDoe-nn5pj
    @JohnDoe-nn5pj 11 months ago

    the biggest problem with TTS is that you need to make a transcription file for all your audio files. So Tacotron needs 1-3 hrs of transcribed audio, and that can take a very long time to do. RVC and SVC don't need transcripts, so it's much easier to make training data.

  • @Askejm

    @Askejm

    11 months ago

    just use whisper
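To make the "transcription file" in this thread concrete, here is a minimal sketch that turns a set of transcripts into a training file list. It assumes a pipe-delimited `filename|text` layout, which is common for Tacotron-style datasets; the exact format a given tool expects is an assumption, and `build_filelist` and the clip names are hypothetical. The transcripts themselves could come from a speech-to-text tool such as Whisper.

```python
def build_filelist(transcripts: dict[str, str]) -> str:
    """Return one `filename|text` line per clip, sorted by filename."""
    lines = []
    for name in sorted(transcripts):
        # Collapse newlines and repeated spaces so each clip stays on one line.
        text = " ".join(transcripts[name].split())
        if "|" in text:
            # The delimiter inside a transcript would corrupt the file.
            raise ValueError(f"transcript for {name} contains the delimiter '|'")
        lines.append(f"{name}|{text}")
    return "".join(line + "\n" for line in lines)


# Hypothetical clips; real transcripts would be reviewed by hand before training.
example = {
    "clip_0002.wav": "and this is the\nsecond clip.",
    "clip_0001.wav": "This is the first clip.",
}
print(build_filelist(example), end="")
```

Sorting by filename keeps the output stable between runs, which makes it easier to diff the file after correcting individual transcripts.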

  • @FenrirRobu
    @FenrirRobu 8 months ago

    What's up with skipping like a dozen webuis for audio. Not just for this video but many others on the audio AI also just end up showing some barebones default UI and completely miss the projects that are specifically improving the UI and UX.

  • @quinnherden

    @quinnherden

    4 months ago

    Can you suggest some? :)

  • @FenrirRobu

    @FenrirRobu

    4 months ago

    @@quinnherden I have forgotten a few but there's bark infinity, audio webui, tts webui, then for music there's also audiocraft-webui, Audiocraft plus. RVC has some specific additional UIs, there's also the tortoise RVC pipeline but I'm not sure if it's an UI. I watched the video again and I will say that it's well researched but it focuses on teaching about the technology, rather than showing the best ways to use it. If you want to hardcore go on tortoise, mrq might still be the best (although I think already during this video mrq was migrated to mrq's audio tools or something), RVC's original UI has the most buttons and unexplained options. I'm glad he didn't mention coqui because, at least 6 months ago it was just a closed source tortoise clone.

  • @DrW1ne
    @DrW1ne 11 months ago

    12:32 My mind blew up.

  • @VaibhavShewale
    @VaibhavShewale 11 months ago

    i need this tts cause i need to make videos that are usually long, and i have to keep moving, so that means background noise. earlier i used to record the room and then start recording, but it used to take me over 2 weeks just to create a 5 min audio, and that is too damn long, period. i think i need to do research into all these tools, cause i dont have that much money to invest in what any of the companies are offering

  • @AngryApple
    @AngryApple 11 months ago

    Bark is also very interesting

  • @Crazybark
    @Crazybark 10 months ago

    The tacotron one sounded better than the tortoise one

  • @L_tlu
    @L_tlu 2 months ago

    9:13 they made it so you can make your own

  • @nils900
    @nils900 11 months ago

    How well does the TorToiSe + RVC combo work with other languages?

  • @Toliman.

    @Toliman.

    11 months ago

    It would be reliant on the RVC training of phoneme and language salience of the native recording. Accents are naturally difficult. Ie accents and pronunciation is usually not neutral, so if you use a TTS to generate the non-english version, RVC will interpolate the accent and pronunciation based on the native accent it was generated with. So, if you generate an Austrian voice first, then pass it to a Japanese RVC, it will struggle to find matching properties. But, if you use a Japanese speaker to create English phonemes, and the RVC has examples of these equivalent phonemes, it will substitute. The effect is weird, which is why accents are difficult to emulate.

  • @nunuarthas8680
    @nunuarthas8680 10 months ago

    we're witnessing bycloud turning himself to an ai then he's gonna upload himself to a cloud and live forever

  • @GoharioFTW
    @GoharioFTW 9 months ago

    15:12 Is nobody absolutely terrified of this? We could get to the point that someone could grab a minute of you talking and be able to use it accurately anywhere for anything.

  • @KW-jj9uy
    @KW-jj9uy 11 months ago

    I played with a lot of these free tools, and the most difficult part (as usual) is installing them, lol

  • @memegazer
    @memegazer 11 months ago

    Some of those songs that sound good have a lot of work put into them as well. A lot of post processing as well with other audio tools

  • @alibahrami6810
    @alibahrami6810 11 months ago

    What I get from this video is EEC, VTC, CCT, VTC, and HIGHGAN. 😂

  • @max_s557
    @max_s557 4 months ago

    This is the best video ive seen on this topic many thanks brother! I sent you a message on twitter but i couldnt DM because im not verified but i would like you to help me create a pipeline.

  • @samriddhlakhmani284
    @samriddhlakhmani284 6 days ago

    Thank god, I skipped sleep, to click on this video. Awesome survey

  • @mlcat
    @mlcat 11 months ago

    for just tts VITS is one of the best options

  • @mikk0706
    @mikk0706 11 months ago

    Gothic-Bot ❤

  • @CassBOTRR
    @CassBOTRR 11 months ago

    Weird, I've always done Eleven Labs + RVC, not Tortoise

  • @Askejm

    @Askejm

    11 months ago

    well imo 11labs is already good enough quality, its resemblance that it lacks. tortoise solves that, and RVC makes up for the subpar quality

  • @CassBOTRR

    @CassBOTRR

    11 months ago

    @@Askejm i mean for RVC i just set the index rate up super high and it sounds good enough to be the actual person lol

  • @Askejm

    @Askejm

    11 months ago

    @@CassBOTRR well one should be a little cautious with just jamming the index rate up. the rvc v2 is a lot more intrusive tho in my experience while also sounding better, but i feel like the resemblance you can get is just lackluster since youre limited to only 1 minute

  • @simonstrandgaard5503
    @simonstrandgaard5503 11 months ago

    Wow

  • @SongStudios
    @SongStudios 11 months ago

    Neat

  • @homeyworkey
    @homeyworkey 11 months ago

    does the voice-to-voice follow the inflections in the original voice? ie if i a scream, the generated voice would scream too. even if this is a 10/10 video, its still good. ive been wanting to know how to clone the voice of a younger version of me, and now i know exactly what my options are (i tried researching before, to no avail). thank you ! :DD

  • @Askejm

    @Askejm

    11 months ago

    yeah. i made bycloud do the crazy frog with V2V and it worked totally fine. the ding worked but surprisingly all the verbal sound effects were cloned too and it genuinely sounded like him

  • @homeyworkey

    @homeyworkey

    11 months ago

    @@Askejm oh 100%, this video is extremely convincing. i do have a suspicion though that it is easy as most of his videos his voice is pretty flat (not an insult btw, its calming and i like it) if he had more variance in what voices he made, ie whispering, yelling, singing, speaking fast, speaking slow, even when you speak louder or quieter, its not as a simple as 'lowering the volume', the actual voice changes with it. this would require alot of data and to identify the 'tone' of the original voice recording, so you can interpret it for the fake voice generation. ik its complex but im just wondering where we are at with that sort of stuff.

  • @Askejm

    @Askejm

    11 months ago

    @@homeyworkey well i found it to work pretty well. also, rvc uses a pretrained model

  • @alkeryn1700
    @alkeryn1700 11 months ago

    is it just me that thought that tacotron sounded a lot better than tortoise?

  • @Askejm

    @Askejm

    11 months ago

    I think tortoise sounds better but by far the most noticeable thing is how tacotron has very poor resemblance

  • @0LexuZ0
    @0LexuZ0 11 months ago

    Is this just me or this vid had a different thumbnail?

  • @Askejm

    @Askejm

    11 months ago

    he switches it a lot after release, as he and other youtubers often do

  • @CrashDeluxe
    @CrashDeluxe 10 months ago

    I'm stupid, I just heard lots of words jumbled together; RVC, VITS, VCS, JBC, RVC, BC?!?

  • @mrrespected5948
    @mrrespected5948 11 months ago

    Nice

  • @thedementiapodcast
    @thedementiapodcast 7 months ago

    Bar none the best video on the topic. If your mother's tongue is American english, the FLOSS path is the best (use a cloud GPU for speed). But accents are unique to the person (im native french and my english is hit and miss on certain words, which currently no ai can learn, no matter how much data i give it). Even in the best case scenario, it's far from 'perfect' and the affect is overall very flat, as we can hear in this video. But it will get better over time, i'm sure.

  • @knoopx
    @knoopx 11 months ago

    us techno bros are not into karaoke xD

  • @monstercameron
    @monstercameron 11 months ago

    what about BARK?

  • @madcatlady

    @madcatlady

    11 months ago

    I have the Bark Webui on my PC and it's a crazy lucky dip what you get, some sing and none sound the same as the previous one

  • @zap0p3rr0tr3inta
    @zap0p3rr0tr3inta 11 months ago

    11:50

  • @Bazilisk_AU
    @Bazilisk_AU 10 months ago

    Okay... I zoned out playing Genshin with this playing on my second monitor and I hear Asmongold and go "Wait wtf !?" and I went back and rewatched the whole thing for context and HOLY CRAP I DID NOT DOUBT THAT IT WAS YOUR VOICE THE WHOLE TIME ! Man... what a time to be alive. A tad too early to pilot mechs in space... but just in time for AI Waifus and have food delivered to your door while you watch anime, explore the stars with hyper-realistic games and argue with strangers on the other side of the world about made up problems.

  • @l.halawani
    @l.halawani 4 months ago

    [solved] this is a channel by fireship, but completely run by ai

  • @MnlBnt
    @MnlBnt 11 months ago

    my god this is too many tools

  • @rootatnite

    @rootatnite

    11 months ago

    too little*

  • @renanleao5553
    @renanleao5553 7 months ago

  • @angloland4539
    @angloland4539 11 months ago

  • @minicup
    @minicup 11 months ago

    After about 30 seconds I realised it was AI

  • @benkrararara2185
    @benkrararara2185 11 months ago

    gj

  • @fueledbylofi7078
    @fueledbylofi7078 11 months ago

    Bycloud will soon be THE AI news source as this stuff gets more complicated and controversial and eventually will be completely self sufficient and ran by its own AI models trained on bycloud AI news videos 😶

  • @wintdkyo
    @wintdkyo 11 months ago

    Golf clap.

  • @kw4093-v3p
    @kw4093-v3p 11 months ago

    wtf I was actually fooled. I thought this was your real voice

  • @MidvightMirage
    @MidvightMirage 11 months ago

    the sponsor is not your real voice no way it is
