My name is Grant Sanderson. Videos here cover a variety of topics in math, or adjacent fields like physics and CS, all with an emphasis on visualizing the core ideas. The goal is to use animation to help elucidate and motivate otherwise tricky topics, and for difficult problems to be made simple with changes in perspective.
For more information, other projects, FAQs, and inquiries see the website: www.3blue1brown.com
Comments
Highly recommend for any classical mechanic enthusiasts. Great video.
7130th comment
Grant is the Satoshi of AI, but not quite... he's present.
Sorry if I'm being ignorant, but what exactly are the "charges"? I have a learning disability, so sometimes I miss things even after watching several times.
1, 2, 4, 8, 16 is a number sequence typically used by IQ tests. I wonder what is the correct extrapolated next number. It can actually be anything.
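If this refers to the circle-division sequence from the video (n points on a circle, all pairs joined by chords), the next term is famously 31, not 32: the maximum number of regions is C(n,4) + C(n,2) + 1. A quick sketch in Python (the `regions` helper is my own name, not from the video):

```python
from math import comb

def regions(n: int) -> int:
    # Maximum number of regions a disc is cut into by chords joining
    # n points on its boundary (Moser's circle problem):
    # regions(n) = C(n, 4) + C(n, 2) + 1
    return comb(n, 4) + comb(n, 2) + 1

print([regions(n) for n in range(1, 7)])  # [1, 2, 4, 8, 16, 31] — not 32
```

So the "obvious" doubling pattern breaks exactly at the sixth point.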
Hey 3b1b, what sort of math interactive software do you use to create these amazing animations? They're awesome! You've been one of my favorite YT channels for years now and I've always wondered how it's done because I can't imagine you or someone else is doing them all by hand in the adobe suite... thx.
In base 2, π ≈ 11.001. The 16 kg weight is 10000, and there are 1100 (twelve) bounces; the 64 kg weight is 1000000, and there are 11001 (twenty-five) bounces.
In base 4, π ≈ 3.02. The 16 kg weight is 100, and there are 30 bounces.
In base 8, π ≈ 3.11. The 64 kg weight is 100, and there are 31 bounces.
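The bounce counts in the comment above can be checked against Galperin's colliding-blocks result: a mass ratio of N produces ⌊π / arctan(1/√N)⌋ collisions (when that quotient is not an exact integer). A small sketch, with helper names of my own choosing:

```python
from math import atan, floor, pi, sqrt

def collisions(mass_ratio: float) -> int:
    # Galperin's result for the two-block collision puzzle: with a heavy
    # block mass_ratio times the light one, the collision count is
    # floor(pi / theta), theta = arctan(sqrt(m/M)).
    theta = atan(sqrt(1.0 / mass_ratio))
    return floor(pi / theta)

def to_base(n: int, b: int) -> str:
    # Write a non-negative integer in base b as a digit string.
    digits = []
    while n:
        n, r = divmod(n, b)
        digits.append(str(r))
    return "".join(reversed(digits)) or "0"

print(collisions(16), to_base(collisions(16), 2))   # 12 1100
print(collisions(64), to_base(collisions(64), 8))   # 25 31
print(collisions(100))                               # 31 — the "3.1" of pi
```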
I got nothing done at work today as I spent it all day watching your videos.
Huge long-term fan, but this series is my favourite.
This is absolutely brilliant. Thank you so much
e and π have a cameo almost everywhere
Thank you for explaining this better than anyone else has been able to. I think I finally get it. I really appreciate your content 🙌🏻
I don't usually comment on videos, but this is one of the best videos I've seen on transformers, extremely detailed but very easy to understand!
This is actually similar to how some IQ tests work... Just trying to see how used you are to creating association patterns out of data they put out... like Finger is to hand, what leaf is to … Twig Tree Forest
That's... just insane
I REMEMBER THIS
If discrete means probability and continuous means probability density, what are we to say about the possibility of a probability density being Gaussian?
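One way to see the distinction the question is circling: a Gaussian density is not itself a probability (it can exceed 1 at a point), and probabilities only come from integrating it over an interval. A minimal sketch, with function names of my own:

```python
from math import exp, pi, sqrt

def gauss_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    # A probability *density*, not a probability: for small sigma the
    # peak value exceeds 1, which is perfectly legal for a density.
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

print(gauss_pdf(0.0, sigma=0.1))  # ~3.99, greater than 1

# A probability comes from integrating the density; a crude Riemann sum
# of the standard normal over [-5, 5] should land very close to 1.
dx = 0.001
total = sum(gauss_pdf(-5 + i * dx) * dx for i in range(10000))
print(total)
```

The density's value at a point is a rate, not a chance; only the area under it is a probability.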
I need you to softmax my logits, baby.
ahaa now for sure people found the full video link button
I think it still holds true even when three lines meet at a point with the fact that the area formed by them is 0.
“you, the 3-d lander” makes me feel like he's a 4D entity teaching me his version of toddler math
I still don’t get it 🤷♂️
Love this channel but the flash bang at the end blew my pupils out
I was mentally and physically abused by my father as a child, I'm currently living in a financially decrepit state and this will probably continue for the foreseeable future, and I still can't believe that out of everything that has ever happened to me, this is what I bet my life on and lost.
🎯 Key Takeaways for quick navigation:

00:00 *🔍 Understanding the Attention Mechanism in Transformers*
- Introduction to the attention mechanism and its significance in large language models.
- Overview of the goal of transformer models to predict the next word in a piece of text.
- Explanation of breaking text into tokens, associating tokens with vectors, and the use of high-dimensional embeddings to encode semantic meaning.

02:11 *🧠 Contextual Meaning Refinement in Transformers*
- Illustration of how attention mechanisms refine embeddings to encode rich contextual meaning.
- Examples showcasing the updating of word embeddings based on context.
- Importance of attention blocks in enriching word embeddings with contextual information.

05:37 *⚙️ Matrix Operations and Weighted Sums in Attention*
- Explanation of matrix-vector products and tunable weights in matrix operations.
- Introduction to the concept of masked attention for preventing later tokens from influencing earlier ones.
- Overview of attention patterns, softmax computations, and relevance weighting in attention mechanisms.

21:31 *🧠 Multi-Headed Attention Mechanism in Transformers*
- Explanation of how each attention head has distinct value matrices for producing value vectors.
- Introduction to the process of summing proposed changes from different heads to refine the embedding at each position.
- Importance of running multiple heads in parallel to capture diverse contextual meanings efficiently.

22:34 *🛠️ Technical Details in Implementing Value Matrices*
- Description of the implementation difference in treating the value matrices as a single output matrix.
- Clarification of technical nuances in how value matrices are structured in practice.
- Noting the distinction between "value down" and "value up" matrices commonly seen in papers and implementations.

24:03 *💡 Embedding Nuances and Capacity for Higher-Level Encoding*
- Discussion of how embeddings become more nuanced as data flows through multiple attention blocks and layers.
- Exploration of the capacity of transformers to encode complex concepts beyond surface-level descriptors.
- Overview of the network parameters associated with attention heads and the total parameters devoted to the entire transformer model.

Made with HARPA AI
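The steps in the takeaways above (query/key dot products, masking, softmax, weighted sums of value vectors) can be sketched as a toy single-head attention pass. All sizes and weights here are made up for illustration, nothing like GPT-scale:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 4              # toy dimensions

X = rng.normal(size=(seq_len, d_model))         # token embeddings
W_q = rng.normal(size=(d_model, d_head))        # tunable query weights
W_k = rng.normal(size=(d_model, d_head))        # tunable key weights
W_v = rng.normal(size=(d_model, d_head))        # tunable value weights

Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Attention pattern: scaled query-key dot products, causally masked so
# later tokens cannot influence earlier ones.
scores = Q @ K.T / np.sqrt(d_head)
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores[mask] = -np.inf

# Softmax over each row turns scores into relevance weights.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

out = weights @ V                                # weighted sum of value vectors
print(weights.round(2))                          # rows sum to 1; upper triangle is 0
```

A multi-headed layer just runs several such passes in parallel with different weight matrices and sums the proposed changes.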
It’s insane. After watching a video with Numberphile, I actually did the exact same thing: proved that it doesn’t work for 3, moved on to 4, and stopped at the coloring, because I’m dumb and couldn’t figure out the coloring.
Why does it hurt my balls when it does the "🦆" sound
I’m amazed that a man in the 1800s understood this and was able to explain this all before computers and quantum mechanics. I also get a kick out of the naysayers like Kelvin. 😊
<*_> This is my seal. I have watched the entire video, understood it, and I can explain it in my own words, thus I have gained knowledge. This is my seal. <_*>
Ever calculated against eternity? You know what a circle is in other means and what a perfect circle defines? And you know what Pi was made to calculate? Now you know what number of collisions you will get the more you increase the mass of the right object to near eternity. 😘 And after you know it, eat more apples 😘
The last point on the perimeter needs to divide the segment not at the halfway point, but offset, say 1/3 and 2/3, creating a small figure at the centre of the circle - voila 32...
The Fouriel transformer is different from normal transformers: instead of inline and adjacent cores and coils, it is a 4-dimensional transformer consisting of 4 cores/coils at 390 degrees to each other. This is much more economical than standard transformers, as there is a lot less waste heat because the electrical and magnetic waves don't interfere with each other while still inducing into each other's cores. They are mainly used in locomotive traction motors, where producing less heat reduces back EMF; this was not a problem with weak-fielding equipment and DC motors. With modern high-voltage AC motors the heat factor is important, so as much power as possible can be driven for maximum speed.
Linguistic thermodynamics??
Fine
If the triangle was isosceles D and P would have coincided
I have a fear of looking at fractals being zoomed in and I still study them, but I can't watch them
@57:18 well I am great at using compasses but not great at math. I guess ya win some ya lose some lol.
This is terrifying
what is this supposed to be? edgy or something? are you having a stroke?
I love this series. I did a lot of malicious prompt trial and error, but by learning more about the mathematics behind it, I get to understand how some things might work.
9:36 - 3Violet 1Brown !!
I was just stuck on this topic, and not even my college professors were explaining it nicely; then I found you. I owe you a salute, sir 🙇🙇
The fourth level is explaining it to someone else, as that's always one of the hardest things to do and the best indication you truly know the concept.
You're a GOD
eeeeeeeee..
Puts it on its side 😊
So pi just pops up in the most unlikely of places......sensational
Because of parallel processing and GPUs, you can convolve, say, a 1k image with a 3×3 kernel by offsetting the image one pixel to the NW and multiplying it by the first element of the kernel, then offsetting to the N and multiplying the image by the second element of the kernel, and so on in all 8 directions plus the center. Then you just add all 9 images. That is also very fast, and it works because of parallel processing.
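The shift-and-scale scheme described above can be sketched with NumPy. Here `np.roll` stands in for the per-direction offsets, so borders wrap around (real pipelines usually zero-pad instead), and the result is the cross-correlation form commonly called "convolution" in deep learning. The function name is my own:

```python
import numpy as np

def conv3x3_shifted(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    # One shifted, scaled copy of the image per kernel entry, then a sum —
    # the "offset in 8 directions + the center" scheme from the comment.
    out = np.zeros_like(img, dtype=float)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            # np.roll wraps at the borders (circular boundary handling).
            shifted = np.roll(img, shift=(-di, -dj), axis=(0, 1))
            out += kernel[di + 1, dj + 1] * shifted
    return out
```

Each of the nine multiply-adds touches every pixel independently, which is exactly the kind of work a GPU parallelizes well.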