😕LoRA vs Dreambooth vs Textual Inversion vs Hypernetworks

Science & Technology

There are 5 methods for teaching specific concepts, objects or styles to your Stable Diffusion model: Textual Inversion, Dreambooth, Hypernetworks, LoRA and Aesthetic Gradients. The question is: which one should you use?
In this video we review 3 key research papers, look at the underlying mathematical mechanics behind each method, and analyze data from Civitai to arrive at an informed and final conclusion.
Discord: / discord
Live Stream in 8 hours: • 😕LoRA vs Dreambooth vs...
======= Links =======
Spreadsheet: docs.google.com/spreadsheets/...
LoRA paper: arxiv.org/abs/2106.09685
Dreambooth Paper: arxiv.org/abs/2208.12242
Textual Inversion Paper: arxiv.org/abs/2208.01618
Dreaming Tulpa: / dreamingtulpa
Driving a machine insane with Dreambooth: • I drove a Machine Insane
Good Tutorials:
Dreambooth tutorial by OlivioSarikas: • DreamBooth for Automat...
Hypernetworks tutorial by Aitrepreneur: • HYPERNETWORK: Train St...
Textual Inversion tutorial by Aitrepreneur: • ULTIMATE FREE TEXTUAL ...
Textual Inversion Paper Walkthrough by me: • Textual Inversion with...
LoRA tutorial by me: • 7GB RAM Dreambooth wit...
LoRA tutorial by Nerdy Rodent: • LORA for Stable Diffus...
Aesthetic Embeddings tutorial: • How to use Aesthetic G...
======= Music =======
From YouTube Audio Library:
Escapism Yung Logos
Music from freetousemusic.com
‘Late Morning’ by ‘LuKremBo’: • (no copyright music) c...
‘Marshmallow’ by ‘LuKremBo’: • lukrembo - marshmallow...
‘Rose’ by ‘LuKremBo’: • lukrembo - rose (royal...
‘Snow’ by LuKremBo: • lukrembo - snow (royal...
‘Sunset’ by ‘LuKremBo’: • (no copyright music) j...
‘Travel’ by ‘LuKremBo’: • lukrembo - travel (roy...
‘Branch’ by ‘LuKremBo’: • (no copyright music) c...
#stablediffusion #aiart #ai #machinelearning #dreambooth #textual-inversion #hypernetworks #lora #aesthetic-gradients #tutorials #resarch #aesthetic-embeddings

Comments: 345

@infocyde2024 1 year ago
The thing about textual inversions is that they create embeddings that are cross-compatible with the base models. A textual inversion trained with SD 1.5 will work with all 1.5-based models, and here is the kicker: you can combine them without having to do any model merging. That is HUGE.
@lewingtonn
1 year ago
yeah, the flexibility of textual inversion is a big factor, also it's really cool conceptually!!
@zyin
1 year ago
The video really should have mentioned this, it's an incredible advantage for embeddings that was just left out.
@neilslater8223
1 year ago
Yes, combining two, three or more Dreambooth models is possible, but it takes time and generates yet another 2GB+ model that you need to save somewhere. Textual inversions, by contrast, can be used flexibly within the prompts in any combination, including weighting them and using them as negative prompts, all on the fly with no extra file management. However, textual inversion cannot learn to output things that the base model is not able to do at all. So depending on the base model, it may not be possible to train a textual inversion for a specific concept.
@infocyde2024
1 year ago
@expodemita I do not think they are compatible between 1.4/1.5 and 2.0/2.1. 2.0 and 2.1 should be compatible.
@alexandrmalafeev7182
1 year ago
@infocyde2024 2.0 and 2.1 are for sure
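To make the thread above concrete, here is a minimal Python sketch of why textual-inversion embeddings can be mixed freely at prompt time. Everything in it is hypothetical and chosen for illustration: the toy vocabulary, the `<my-cat>` and `<my-style>` tokens, the `encode` helper and the 3-dimensional vectors. Real Stable Diffusion text encoders use a proper tokenizer and much larger vectors. The only point is that the base model's own table is never modified, so any number of learned vectors can be dropped in, weighted, or left out per prompt.

```python
from typing import Dict, List, Optional

Vector = List[float]

# Frozen token embeddings of the (toy) base model. Never modified below.
base_vocab: Dict[str, Vector] = {
    "a": [0.1, 0.0, 0.2],
    "photo": [0.4, 0.3, 0.1],
    "of": [0.0, 0.1, 0.0],
}

# Each textual inversion is just an extra vector keyed by a new token.
learned_embeddings: Dict[str, Vector] = {
    "<my-cat>": [0.9, 0.2, 0.5],
    "<my-style>": [0.3, 0.8, 0.1],
}


def encode(
    prompt: str,
    extra: Dict[str, Vector],
    weights: Optional[Dict[str, float]] = None,
) -> List[Vector]:
    """Map each whitespace-separated token to a vector.

    Learned tokens are looked up in `extra` first, then in the frozen
    base vocabulary. An optional per-token weight scales the vector.
    """
    weights = weights or {}
    encoded: List[Vector] = []
    for token in prompt.split():
        vector = extra[token] if token in extra else base_vocab[token]
        scale = weights.get(token, 1.0)
        encoded.append([scale * x for x in vector])
    return encoded


# Two learned concepts in one prompt, one of them down-weighted.
combined = encode(
    "a photo of <my-cat> <my-style>",
    learned_embeddings,
    weights={"<my-style>": 0.5},
)
print(len(combined), "token vectors; base vocab size still", len(base_vocab))
```

Because the learned vectors live beside the model rather than inside it, combining them needs no checkpoint merge and no extra multi-gigabyte file.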
@simonbronson 1 year ago
Much appreciated, having someone clever distil all of this dense information down and explain it succinctly and with so much enthusiasm is so refreshing!
@Animes4ever1 1 year ago
Awesome comparison mate, great addition with the statistics, thanks a lot
@tomm5765 1 year ago
Thanks for your hard work putting this together, very helpful to evolve my understanding of the different approaches. Much appreciated!
@KalebWyman 1 year ago
Thanks for explaining these so well, your visual diagrams are great!
@ParanoidAmerican 1 year ago
This video is exactly what I needed, and you went about it in the best way possible. Thanks for this
@m3dia_offline 1 year ago
I love it, love your promises on what we are going to get from your video at the very starting few seconds of the video itself, keep it going man, love your channel and your energy.
@fun7704 1 year ago
This was a very informative video in fact, thank you! And I like your very dramatic delivery of the content! :)
@TheTruthIsGonnaHurt 1 year ago
Liked and Subscribed, Thank you for all the hard work!
@Unstable_Stories 1 year ago
I greatly appreciate this video sir! It is really helpful for me to have context of how things actually work behind the scenes to make mental connections and improve how I interact with the external program.
@takeuchi5760 1 year ago
Thanks so much for this. Very underrated channel, literally was thinking something like this would be really helpful.
@fredingham1855 1 year ago
Outstanding job explaining these concepts! Well done!
@AleOnYouTube 1 year ago
you deserve more subscribers, only channel I found that actually delivers what you need to know
@anthonyaddo 1 year ago
Such an EXCELLENT video. Very very well researched and perfectly presented. Thanks for sharing all your findings and appreciate the time it took.
@ytchen6748 1 year ago
What a great video! Thanks for your academic sharing and empirical results❤
@NukerOfFace 1 year ago
Superb video. I don't think I've ever seen a tutorial/explanation for anything that is this good.
@moneyjuice 1 year ago
I love your videos, always on point !
@Philip8888888 1 year ago
Wow. Thanks for this video, esp. the first part which gave just enough detail to understand the trade-offs and underlying approaches.
@metamon2704 1 year ago
You explained that amazingly, very easy to understand - also things move fast because it seems like LoRA is now the most popular.
@swannschilling474 1 year ago
Thanks for the input, good research!!
@jondargy 1 year ago
Very nice summary- thank you 🙏
@jackzhang891 9 months ago
Hey Koiboi. Great video. When you made this video, as you said yourself, LoRA was still very new and the stats are probably not accurate. Now that a good amount of time has passed, I would love to watch an updated analysis video on the effectiveness of LoRA compared to Dreambooth and Textual Inversion. Either way, this is the most informative video I've watched so far comparing these fine-tuning models. Liked and subbed 👍.
@dv8silencermobile 1 year ago
You are really good at explaining this stuff. Thanks!
@lionroot_tv 1 year ago
This is great. Thank you for sharing your knowledge, and about Excalidraw.
@Apothis1 1 year ago
Really appreciate this, so many videos showing how to do this stuff, but not how it works, and especially not how it works dumbed down to a level I can understand. Very cool, thank you
@AB-wf8ek 1 year ago
Thanks a ton for this breakdown, I've been struggling with this same question for a few weeks now. I had already come to a similar conclusion myself, but this was very validating. Dreambooth is preferred, but the model sizes make it so cumbersome and challenging to test different versions. With textual inversion, the file sizes are insignificant, and you can stack them on top of each other, making them very flexible. I haven't actually evaluated embeddings (textual inversion) yet for quality because the animation notebook I use doesn't support them, but the developer just made it compatible, so I'm looking forward to testing it out more.
@nolanzor 1 year ago
Thank you so much for this video! Amazing work
@yo252yo 1 year ago
this is the best video about the topic ive ever seen, thanks so much
@TheAnna1101 1 year ago
Thanks for making such a great and informative video. Keep up the good work
@daffertube 1 year ago
Great video. Big thanks
1 year ago
Thanks a lot. This has been a really good explanation that I felt was missing.
@toastypanda2963 1 year ago
Great explanation! I've learned more about how AI art works from this video alone than all my previous watched videos combined. Everyone tends to say how to configure things without explaining how it works.
@VitaNova83 1 year ago
Absolutely incredible video, thank you!
@LuisPereira-bn8jq 1 year ago
That was a really helpful video that definitely saved me a bunch of time trying to understand these differences by myself :P
@lewingtonn
1 year ago
saving people time makes me super happy, thanks!
@rickguzman9463 1 year ago
THANK YOU THANK YOU THANK YOU!! Great video. Great insight.
@ksottam 1 year ago
Loved this breakdown. You need more followers!
@takocain 1 year ago
That was an insanely good explanation. Thank you!
@kazimozden4010 1 year ago
Thank you for an informative and engaging video!
@danielaston6560 9 months ago
This video is dope. Super clear and informative. Thank you!!!
@CameronRule 1 year ago
One interesting piece of data is LoRA has quite a high faves per download rating while only being out for a short period of time
@lewingtonn
1 year ago
yeah, I saw that too.... good sign!
@suryaprasathramalingam2421 4 months ago
thanks for the short explanation. Loved it!
@martinchen9667 1 year ago
brilliant video, thank you for all the efforts!
@jeronimogauna7508 5 months ago
Best video I've ever seen. Best vibes! Thanks so much
@user-be5rk2hy3c 1 year ago
Incredible explanation! Thanks a lot.
@friendofai 1 year ago
Really great video, thanks for sharing all your research!
@lewingtonn
1 year ago
glad it helped!
@TurboSkibidiFun 1 year ago
This is so well taught man thank you so much
@takif8756 1 year ago
Great tutorial mate, thank you!
@GayanZmith-vy1ql 1 year ago
i'm a total beginner to AI, and i suck at math, but you somehow managed to clear a shit ton of confusion. I was hooked on Dreambooth tutorials and trust me, you don't want that. I literally thought i was not going to be able to get started simply because of the massive resources it required. Trust me, you are really good at explaining things :) Really appreciate the help
@glasco_
1 year ago
I’ve been trying to install Dreambooth for 3 days now. No success. Ready to walk in front of a bus
@kyosukefukumoto9382 1 year ago
This video is AMAZING! Thank you SO MUCH.
@jasonhemphill6980 1 year ago
That's so much work! Thank you man
@mlcat 1 year ago
Very clear explanation, thank you!
@maggiezhuang3842 8 months ago
This is awesome! thank you!
@kulusic1 1 year ago
Textual inversion is far better on 2.1 than 1.5, and i think that's why they don't get the same love Dreambooth receives. You can also speed up textual inversion training if you spend a few minutes getting the initializing text right so the vectors start in relatively close proximity to their final resting place. The best part imo, is you can combine many embeddings together, something which Dreambooth doesn't really allow.
@leonardom862
1 year ago
How can you get the initializing text right before the training?
@alefratat4018
1 year ago
@leonardom862 By running image-to-text, I suppose?
@nathanbollman
1 year ago
Ironically I haven't been able to run Dreambooth yet. I switched to Linux for AI... something broken with PyTorch 2.0 and CUDA 11.7, only thing affected is Dreambooth training. Turn on gradient checkpointing and it can't train, turn it off and I can't make it to the first epoch without running out of 24GB of VRAM? I hope this gets fixed soon.
@sub-jec-tiv
1 year ago
Totally agree. Suuper crucial to be able to call multiple embeddings in a prompt!
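The point in this thread about choosing good initializing text can be illustrated with a toy optimization. The sketch below is not textual-inversion training: it runs plain gradient descent on a simple squared-distance loss, with made-up 2-dimensional vectors, a made-up learning rate, and a hypothetical `steps_to_converge` helper. It only shows the general effect that a starting vector close to the final one needs fewer update steps than a distant one.

```python
from typing import List


def steps_to_converge(
    start: List[float],
    target: List[float],
    lr: float = 0.1,
    tol: float = 1e-3,
    max_steps: int = 10_000,
) -> int:
    """Count gradient-descent steps until `start` is within `tol` of `target`.

    The loss is 0.5 * ||v - target||^2, so the gradient is (v - target).
    """
    v = list(start)
    for step in range(max_steps):
        distance = sum((x - t) ** 2 for x, t in zip(v, target)) ** 0.5
        if distance < tol:
            return step
        # One gradient step: move a fraction `lr` of the way to the target.
        v = [x - lr * (x - t) for x, t in zip(v, target)]
    return max_steps


final_embedding = [1.0, 0.0]   # stand-in for the vector training ends at
good_init = [0.9, 0.1]         # initializer text close to the concept
poor_init = [-5.0, 5.0]        # unrelated initializer text

near_steps = steps_to_converge(good_init, final_embedding)
far_steps = steps_to_converge(poor_init, final_embedding)
print("good init:", near_steps, "steps; poor init:", far_steps, "steps")
```

In real training the loss surface is far more complicated, so treat this as intuition for the commenter's claim and not as a measurement.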
@thedevo01 1 year ago
Thank you so much for this video! 🙏
@tenghuili3711 1 year ago
Very great job! Thank you!🥰
@mariokotlar303 1 year ago
Awesome explanation, thank you!
1 year ago
thanks for making those complex concepts easy to understand!
@BlancheNuit 1 year ago
That is the type of quality content that I'm digging for. I want to understand Stable Diffusion and everything related. But my attention span/knowledge about programming is not enough that I can just read papers about it. So I need videos, with visuals, and easy explanations. And your video was Perfect. Liked + Subscribed :)
@jichenzhang4385 1 year ago
Very nice introduction! Thank you!
@thanksfernuthin 1 year ago
Great info! And coincides with what I learned on Computerphile's channel. Slowly but surely my mind is able to wrap around what we're dealing with.
@errrorproduction 1 year ago
really great video! finally understand the differences. just the conclusion is already out of date, since we're moving so incredibly fast. LoRA is the most popular format on civitai now. understandable, since training is the quickest, even though TI's end-result is much smaller.
@austinliu9218 1 year ago
clearly explained, much appreciated!
@keiralx 7 months ago
Great video, really helped me understand this
@ticosanjr 1 year ago
Great Video! Thank you very much!
@darmok072 1 year ago
thank you for the great explanation!
@ronenbecker1873 9 months ago
You're an absolute legend. Great video
@dreamingtulpa 1 year ago
Why am I only now seeing this? Great video and thanks for the feature ❤
@kallamamran 1 year ago
Fantastic video!
@404S1mon 1 year ago
wow that was great, thank you so much!
@doingtime20 1 year ago
Amazing work thank you. New sub.
@wecharg 11 months ago
Great work!
@zweihenderman5285 1 year ago
Great video!!
@cinematic_monkey 1 year ago
What I was looking for in that video was the comparison of usability in different scenarios. Which model is good for faces which one for style transfer etc. I'm missing that, other than that quite comprehensive comparison. Good job!
@timalk2097 1 year ago
amazing content, insta subbed !
@KnightLenny 1 year ago
Amazing educational video!
@jitgo 1 year ago
All different now! LoRA is by far the best all round method now and hugely gaining popularity... Great video by the way, excellent explanations!
@paulofalca0 1 year ago
Great video! Thanks!
@barryjones6479 1 year ago
Great video and explanation! I really want TI to be the future but I agree, the quality of dreambooth training is usually better.
@lewingtonn
1 year ago
thanks for the data point!
@kirollosmalek1365 1 year ago
man you're a hero
@huyked 1 year ago
Thank you, sir, for this explanation!
@parasite34 1 year ago
insane work and attention here
@Funzelwicht 10 months ago
Awesome explanation for everyone!
@joaquinramos1181 9 months ago
Very good video! Thanks
@Slider93 1 year ago
Amazing, thank you
@RemitheDreamfox 1 year ago
You explained this so well. My smooth brain couldn't understand these different methods for the longest time \uwu/
@maxschaeffer 1 year ago
thanks a lot for your effort, great job
@wendellkwang3724 1 year ago
what a great list of checkpoints you have, a man of culture 🤣
@xhinker 1 year ago
Nice video, even though I watched it 6 months later, lots of things happened, your video is still extremely helpful (except the LoRA part 😊)
@Copyshinobi 1 year ago
Much appreciated! Having these nodes of wisdom to operate with AI models is a huge contribution to society! Props to you.
@Xiripyu 1 year ago
Thank you for the really nice explanation
@mattecrystal6403 1 year ago
I've been messing with LoRAs and they seem to work really well. You can also do a good amount of mix and matching with LoRAs whereas a full model checkpoint only allows you to use that one model at a time. if I had a fruits LoRA and a vegetables LoRA, then I could just turn them both on to get fruits and veggies in my random prompt that doesn't ask for fruits or veggies. If I later just want fruit then I could just remove the veggies LoRA. I think LoRAs are going to be big going forward, most people just don't know about them yet.
@treyslider6954
1 year ago
I get the feeling that Textual Inversion is the go-to for when you have a new idea you want to teach the model (like a specific character or subject), and Lora is great for when you have a concept you don't want to stop and explain to the model, or may have difficulty doing so. They're very similar things, but not quite the same. For example; loras are great for mimicking a specific art style, because instead of having to describe "I want a painted animation style like this specific style, but with eyes drawn just so", you can train a lora and then just say "" at the end of your prompt, and since it isn't actually part of the prompt, this clears up tokens for describing the actual thing you want depicted in that style.
@ArbJunkAgeG
1 year ago
This is exactly how i feel about LoRA. It’s disappointing that people don’t seem to grasp the same values of how beneficial LoRAs can be.
@tbuk8350
1 year ago
@treyslider6954 And also, as described in the Automatic1111 docs, Textual Inversion can't teach COMPLETELY new concepts. The example they gave is that if you trained a model that only knew how to make apples on images of bananas, it wouldn't learn what a banana is, it would just make long yellow apples (in the best-case scenario). Because it's not actually changing model weights, it's better for teaching a style than a new subject, because unless the subject is very similar to something it's seen, it can't learn it. LoRAs can teach a model something it's never seen before, because they are directly inserting weights into the model, meaning it's actually modifying the model and not the input going into it. Basically, Textual Inversion for simple styles, LoRA for anything complicated.
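The difference described in this thread (LoRA changes weights, and several LoRAs can be switched on together) can be sketched with the low-rank update from the LoRA paper linked in the description, W' = W + scale * (B A). The code below is a pure-Python toy: the 4x4 identity "layer", the rank-1 adapters labelled fruits and veggies, and the helper names `matmul`, `apply_loras` and `param_count` are all assumptions made for illustration, not any library's API.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def param_count(m: Matrix) -> int:
    return len(m) * len(m[0])


def apply_loras(w: Matrix, adapters: List[Tuple[Matrix, Matrix, float]]) -> Matrix:
    """Return w plus the sum of scale * (B @ A) over all adapters.

    The original w is left untouched, so adapters can be added or removed.
    """
    merged = [row[:] for row in w]
    for b, a, scale in adapters:
        delta = matmul(b, a)
        for i in range(len(merged)):
            for j in range(len(merged[0])):
                merged[i][j] += scale * delta[i][j]
    return merged


# Frozen base layer: a 4x4 identity matrix standing in for one weight matrix.
W: Matrix = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Two hypothetical rank-1 adapters. B is 4x1 and A is 1x4.
fruits_B: Matrix = [[1.0], [0.0], [0.0], [0.0]]
fruits_A: Matrix = [[0.0, 2.0, 0.0, 0.0]]
veggies_B: Matrix = [[0.0], [0.0], [0.0], [1.0]]
veggies_A: Matrix = [[3.0, 0.0, 0.0, 0.0]]

both = apply_loras(W, [(fruits_B, fruits_A, 1.0), (veggies_B, veggies_A, 0.5)])
fruits_only = apply_loras(W, [(fruits_B, fruits_A, 1.0)])

full_params = param_count(W)
lora_params = param_count(fruits_B) + param_count(fruits_A)
print("full layer:", full_params, "params; one rank-1 adapter:", lora_params)
```

At this toy size the saving is only 16 versus 8 numbers, but the adapter grows linearly with the layer width while the full matrix grows quadratically, which is why LoRA files stay small next to a full Dreambooth checkpoint.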
@ThePixelkd 1 year ago
Thanks koiboi! I absolutely love these in depth videos. Any plans to give ControlNet this sort of treatment?
@gamebro6337 1 year ago
Sir, thanks for your effort and detailed explanation 🫡🫡🫡learned so much🙇‍♂🙇‍♂🙇‍♂
@bigpapanacho4033 1 year ago
Great video thanks
@DataScienceGuy 2 months ago
Sooo useful video, thx!
@bobsmithy3103 1 year ago
Amazing explanation
@SolracNaujMauriiDS 1 year ago
Very good class. I learned a lot.
@Ben_CY123 1 year ago
Bro, this video is really helpful!
@tljstewart 1 year ago
ok you had me @00:27 , would be cool to see a video on civitai
@tbuk8350 1 year ago
This video is incredibly helpful. I'm probably going to use either LoRA or Dreambooth, as Textual Inversion can't teach brand new subjects as well as you can by directly inserting or modifying weights in the model.