DINO: Emerging Properties in Self-Supervised Vision Transformers | Paper Explained!
❤️ Become The AI Epiphany Patreon ❤️ ► / theaiepiphany
In this video I cover DINO (self DIstillation with NO labels) introduced in the "Emerging Properties in Self-Supervised Vision Transformers" paper by Facebook AI.
The idea is to test whether supervised learning was preventing transformers from achieving the same kind of results in CV as they demonstrated in the NLP world (where we use self-supervised objectives such as (masked) language modeling).
It turns out some nice properties emerge such as:
* DINO-ViT learns to predict segmentation masks
* the features are of especially high quality for k-NN classification
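The core of DINO is a self-distillation loss: the student's softmax output over different crops is trained to match the teacher's centered and sharpened output, and the teacher's weights are an exponential moving average (EMA) of the student's. A minimal PyTorch sketch of those two pieces (function names, default temperatures, and the scalar `center` are simplifications for illustration; the paper uses a batch-statistics-based center and temperature schedules):

```python
import torch
import torch.nn.functional as F

def dino_loss(student_out, teacher_out, tau_s=0.1, tau_t=0.04, center=0.0):
    """Cross-entropy between the teacher's centered, sharpened distribution
    and the student's distribution - no labels involved."""
    # Teacher: center, sharpen with a low temperature, and stop gradients.
    t = F.softmax((teacher_out - center) / tau_t, dim=-1).detach()
    log_s = F.log_softmax(student_out / tau_s, dim=-1)
    return -(t * log_s).sum(dim=-1).mean()

@torch.no_grad()
def ema_update(teacher, student, momentum=0.996):
    """Teacher parameters track an exponential moving average of the student's."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(momentum).add_(ps, alpha=1 - momentum)
```

In the full method the student sees both global and local crops while the teacher sees only global crops, which is what forces local-to-global correspondence (covered at 13:55 in the video).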
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
✅ Paper: arxiv.org/abs/2104.14294
✅ Code: github.com/facebookresearch/dino
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
⌚️ Timetable:
00:00 DINO main ideas, attention maps explained
05:05 DINO explained in depth
10:30 Pseudocode walk-through
13:55 Multi-crop and local-to-global correspondence
15:15 More details on the teacher network
19:00 Results
25:00 Ablations
27:40 Collapse analysis
30:40 Features visualized and outro
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
💰 BECOME A PATREON OF THE AI EPIPHANY ❤️
If these videos, GitHub projects, and blogs help you,
consider helping me out by supporting me on Patreon!
The AI Epiphany ► / theaiepiphany
One-time donation:
www.paypal.com/paypalme/theai...
Much love! ❤️
Huge thank you to these AI Epiphany patreons:
Eli Mahler
Petar Veličković
Zvonimir Sabljic
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
💡 The AI Epiphany is a channel dedicated to simplifying the field of AI using creative visualizations and, in general, a stronger focus on geometric and visual intuition rather than algebraic and numerical "intuition".
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
👋 CONNECT WITH ME ON SOCIAL
LinkedIn ► / aleksagordic
Twitter ► / gordic_aleksa
Instagram ► / aiepiphany
Facebook ► / aiepiphany
👨👩👧👦 JOIN OUR DISCORD COMMUNITY:
Discord ► / discord
📢 SUBSCRIBE TO MY MONTHLY AI NEWSLETTER:
Substack ► aiepiphany.substack.com/
💻 FOLLOW ME ON GITHUB FOR COOL PROJECTS:
GitHub ► github.com/gordicaleksa
📚 FOLLOW ME ON MEDIUM:
Medium ► / gordicaleksa
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#dino #facebook #selfsupervised
Comments: 21
Thanks man, this helped me so much to understand a lot of details! Great content, deserves a lot more views
@TheAIEpiphany
2 years ago
Thank you, glad to hear it man!
You explained every single thing I needed to understand in this paper! Can't thank you enough!!!!!!
@TheAIEpiphany
2 years ago
Thank you for letting me know - I appreciate it! 🙏
Excellent work, bro! Your video makes this work much easier to understand.
You explained all the bits and pieces of DINO. Thanks a ton.
@TheAIEpiphany
2 years ago
🥳
Excellent tutorial
Very nice, bro, your explanations are crystal clear
@TheAIEpiphany
2 years ago
Thanks!!
Discovered this channel after the Medium post about getting into DeepMind. The quality of your content is so much higher than other channels I've seen out there - I like how you get into the nitty-gritty details of the paper :)
@TheAIEpiphany
2 years ago
Thank you man! Haha glad that blog is making some impact. 😄
Nice work man!
@TheAIEpiphany
2 years ago
Thanks! 🔥
Thank you for your work!
@TheAIEpiphany
2 years ago
You're welcome!
Thank you for your explanation. Early in the video, you mentioned that transformers no longer require a vast amount of data because of some new technique. What is that technique/trick called?
At 0:42, could you please explain what is meant by "richer training signal"? Thanks
Please provide a link to the video explaining the strategy that lets ViT train without needing too much data
Hello, I have a paid project on DINO, iBOT, and DINOv2. Will you help?
Explanations are too fast and unclear. You are skipping all the important points as if they are trivial, and don't even write down the formulas.