Machine Learning & Simulation
Explaining topics of 🤖 Machine Learning & 🌊 Simulation with intuition, visualization and code.
------
Hey,
welcome to my channel of explanatory videos for Machine Learning & Simulation. I cover topics from Probabilistic Machine Learning, High-Performance Computing, Continuum Mechanics, Numerical Analysis, Computational Fluid Dynamics, Automatic Differentiation and Adjoint Methods. Many videos include hands-on coding parts in Python, Julia, or C++. The videos also showcase the application of the topics in modern libraries like JAX, TensorFlow Probability, NumPy, SciPy, FEniCS, PETSc and many more.
All material is also available on the GitHub Repo of the channel: github.com/Ceyron/machine-learning-and-simulation
Enjoy :) And please leave feedback.
If you liked the videos, feel free to support the channel on Patreon: www.patreon.com/MLsim
If you want to make a one-time donation, you can do so via PayPal: paypal.me/FelixMKoehler
Comments
Hello, teacher. Your videos benefit me a lot! And I have a question that has been bothering me for a long time: how can something be called "adjoint"?
awesome!
"let's make X", imports library and implements the most basic use case. Seriously, "let's use X to make X" would be just as good and truthful. Reported as spam.
"fastest" "python"
This is great! Working on a package that uses the adjoint method for implicit differentiation and this video explained things really well :D
oh god, thank you, that was really the best
Many thanks for the great video, I was looking for something along these lines. I have two questions. The first is regarding how you just chose that the term multiplying the Jacobian vanishes. Where do you get the freedom to impose that without this hurting the optimization procedure? From my recollection on Lagrange multipliers, we usually solve for them once we consider the variation of the Lagrangian to be zero, but we don't get much freedom - in fact, whatever they end up being may even have physical meaning, e.g. being normal forces on mechanical systems which are constrained to be on a surface. The second question regards why using Lagrange multipliers should work in the first place. I understand that, if we wanted to find saddle points for the loss then indeed this is the path; but how do we justify using a Lagrange multiplier we found during the minimization process to compute the Jacobian of solution w.r.t to parameters in general?
I was looking at the reference code that you mentioned in the Jupyter notebook and found something for 2D that I can't understand:

out_ft[:, :, :self.mode1, :self.mode2] = self.compl_mul2d(x_ft[:, :, :self.mode1, :self.mode2], self.weights1)
out_ft[:, :, -self.mode1:, :self.mode2] = self.compl_mul2d(x_ft[:, :, -self.mode1:, :self.mode2], self.weights2)

I don't understand why there are two weight tensors (weights1, weights2) and why they also take the upper mode1 frequencies. Can you explain this? Thanks for your video.
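(For anyone else wondering: this comes from how the real FFT lays out frequencies. A small NumPy sketch, independent of the FNO reference code, shows that `rfft2` halves only the last axis, so the first frequency axis still carries negative frequencies at its upper end; the two weight tensors cover the two retained corners.)

```python
import numpy as np

# A real 2D signal: after rfft2, the LAST axis keeps only non-negative
# frequencies, but the FIRST axis still holds both positive (at the start)
# and negative (at the end) frequencies. That is why the reference code
# needs two weight tensors: one for the [:mode1] block and one for the
# [-mode1:] block along the first frequency axis.
h, w = 8, 8
x = np.cos(2 * np.pi * 2 * np.arange(h) / h)[:, None] * np.ones((1, w))

x_ft = np.fft.rfft2(x)
print(x_ft.shape)                # (8, 5): last axis halved, first axis is not

# The frequency-2 content shows up at index 2 (positive frequency) AND at
# index -2 (negative frequency) along the first axis: both "corners" carry
# information, and dropping one of them would lose half the signal.
print(abs(x_ft[2, 0]) > 1.0)     # True
print(abs(x_ft[-2, 0]) > 1.0)    # True
print(abs(x_ft[4, 0]) < 1e-9)    # True (no energy at other frequencies)
```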
🤓
Thank you so much for this explanation! I had no problem with taking the derivatives of the functional, and getting ln(p) as a function of lambdas. The real novel ideas (at least for me) are writing ln(p) in terms of complete squares, carefully choosing which order you want to substitute it in the three constraints (first in the mean constraint, then the normalisation constraint, then the variance constraint), and also making simple substitutions like y = x - mu. They seem trivial but I was banging my head before watching this video about how to do these substitutions. Your video was really useful in introducing these small and useful tricks in deriving the gaussian from the maximum entropy principle
I guess a typo at 19.52 that original VI target is argmin( KL(q(z) || p(z|D ))) but it was written p(z,D). Actually p(z,D) is the one we end-up using in ELBO. This can be used to summarize the approach here "ELBO: Well we dont have p(z|D) so instead lets use something we have which is p(z,D) but... Lets show that this is reasonable thing to do"
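For readers following along, the standard identity behind this reasoning, written out for reference, is:

```latex
\mathrm{KL}\left(q(z)\,\|\,p(z|D)\right)
  = \mathbb{E}_{q}\left[\ln q(z) - \ln p(z|D)\right]
  = \mathbb{E}_{q}\left[\ln q(z) - \ln p(z,D)\right] + \ln p(D)
  = -\mathrm{ELBO}(q) + \ln p(D),
\qquad
\mathrm{ELBO}(q) = \mathbb{E}_{q}\left[\ln p(z,D) - \ln q(z)\right]
```

Since ln p(D) does not depend on q, minimizing the KL to the unavailable posterior p(z|D) is equivalent to maximizing the ELBO, which only needs the joint p(z,D).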
I think requires_grad is true by default in torch. Never knew jax syntax was so simple. Good one!
Really useful. Thank you!
TGV is a 3D problem though... our c++ code runs this problem for a 64x64x64 grid in 2 seconds
So basically write a c extension got it
Well if you think about it Python is just a C wrapper.... The level of abstraction crazy tho all that complex math with just like 5 functions
You really didn't make anything, though. You just used a library...
Can you say why du/d\theta is difficult to compute? I'm happy to look at references or other videos if that's easier! Thanks for the terrific content.
First of all, thanks for the kind comment and nice feedback 😊 I think the easiest argument is that this Jacobian is of shape len(u) by len(theta), and each column has to be computed separately. This means solving a bunch of additional ODEs. If you have a lot of compute, you could do that in parallel; yet for large theta (say 10'000- or 100'000-dimensional parameter spaces, which is reasonable for deep learning) this is infeasible and you have to resort to sequential processing. With the adjoint method, one only solves one additional ODE.
There is a similar video to this which approaches it more from the autodiff perspective rather than the optimal control perspective. I think this can also be helpful: kzread.info/dash/bejne/p2yCrph8p7bVgso.html
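To make the column-counting argument concrete, here is a minimal toy sketch (not the video's code; the ODE, parameters, and step counts are made up for illustration): the discrete adjoint of an explicit-Euler solve recovers the full gradient in one backward sweep, while finite differences need one extra solve per parameter.

```python
import numpy as np

# Toy problem: u' = f(u, theta) with f = -theta[0]*u + theta[1],
# loss J = 0.5 * (u(T) - target)^2.
theta = np.array([0.7, 0.3])
dt, n_steps, u0, target = 0.01, 200, 1.0, 0.5

def solve_forward(theta):
    u = np.empty(n_steps + 1)
    u[0] = u0
    for k in range(n_steps):                      # explicit Euler
        u[k + 1] = u[k] + dt * (-theta[0] * u[k] + theta[1])
    return u

def loss(theta):
    return 0.5 * (solve_forward(theta)[-1] - target) ** 2

# Adjoint: ONE backward sweep gives the gradient w.r.t. ALL parameters.
u = solve_forward(theta)
lam = u[-1] - target                              # dJ/du_N
grad = np.zeros(2)
for k in range(n_steps - 1, -1, -1):
    grad += lam * dt * np.array([-u[k], 1.0])     # lam * df/dtheta at u_k
    lam *= 1.0 - dt * theta[0]                    # lam * du_{k+1}/du_k

# Finite differences: one extra solve PER parameter (2 here, but easily
# 10'000+ in deep learning, which is the infeasible case from the reply).
eps = 1e-6
fd = np.array([(loss(theta + eps * e) - loss(theta - eps * e)) / (2 * eps)
               for e in np.eye(2)])
print(np.allclose(grad, fd, atol=1e-6))           # True
```

The discrete adjoint is exact for the discretized system, so it matches finite differences up to floating-point noise while touching the parameter dimension only once.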
JAX is for math & stats folks: it works just like how you write math on paper. PyTorch is for IT folks: the backward function on each variable is very helpful in reducing the number of variables & functions to keep in mind.
I guess JAX is superior at calculating higher-order derivatives
Aaaaand how do we know the joint distribution p(X,Z)? As said, X can be an image from our data set and Z can be some feature like "roundness of chin" or "intensity of smiling". It is a bit strange to be able to know p(Image, feature) jointly but not be able to know p(Image) because of multi-dimensional integrals
That was a common question I received, check out the follow-up video I created: kzread.info/dash/bejne/mYplsLmGmcyndaw.html Hope that helps 😊
you lost me at “fastest…in python”
😉
Thanks! I'm learning Bayesian Statistics at uni and didn't fully understand why we need precision.
Glad it was helpful! 😊
Thanks for your lecture. The pronunciation of "inviscid" is not [INVISKID]; it is simply [INVISID].
*How to do anything in Python:* Import a library that does it for you
I mean that's actually really cool though
Yes, bro, it is like that. Libraries pack all the tools for specific problems. If you want to know the raw algorithm and the math behind the problem/solution, feel free to study the source code of the library: for example, the specific functions used in this tutorial, to understand how the fluid simulation happens.
Could you explain how you came up with the equation following the sentence "Then advance the state in time by..."? Is there any prerequisite to understand the derivation?
ETD (Exponential Time Differencing) is a numerical method to solve stiff ODEs, you can have a look at indico.ictp.it/event/a08165/session/7/contribution/4/material/0/0.pdf to understand the idea behind this equation (slides 19 to 22).
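As a minimal illustration of the idea (a sketch only, not the exact scheme from the video; the coefficients here are made up): first-order ETD integrates the stiff linear part exactly and freezes the nonlinear part over the step.

```python
import numpy as np

# First-order ETD for a stiff scalar problem u' = L*u + N(u):
#   u_{n+1} = exp(L*dt) * u_n + (exp(L*dt) - 1) / L * N(u_n)
# The linear part is solved exactly; N is held constant over the step.
L_coef = -50.0            # stiff linear coefficient
c = 2.0                   # constant "nonlinearity" N(u) = c
dt, n_steps, u0 = 0.1, 20, 1.0

def etd1_step(u):
    e = np.exp(L_coef * dt)
    return e * u + (e - 1.0) / L_coef * c

u = u0
for _ in range(n_steps):
    u = etd1_step(u)

# For constant N the scheme is exact; compare with the analytic solution
# u(t) = exp(L*t)*u0 + c/L*(exp(L*t) - 1).
t = n_steps * dt
u_exact = np.exp(L_coef * t) * u0 + c / L_coef * (np.exp(L_coef * t) - 1.0)
print(np.isclose(u, u_exact))     # True

# Explicit Euler would need |1 + L*dt| <= 1, i.e. dt <= 0.04 here; it blows
# up at dt = 0.1, while ETD remains stable for any step size.
```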
Hi, Thanks for the question 😊 Can you give me a time stamp for when I said that in the video? It's been some time since I uploaded it.
Great video! =) Can somebody please explain why we have the joint distribution and why we don't have the posterior? I understand that we have some dataset D (images) and maybe we even have their ground truth data Z (like categories: cat, dog, etc.). Does this automatically mean that we have the joint distribution?
Great point! This was a common question, so I created a follow-up video. Check it out here: kzread.info/dash/bejne/mYplsLmGmcyndaw.html
Python tutorials: import tutorial tutorial.run()
LMAO, how accurate
😅 kzread.info/dash/bejne/dIWA2LCFl7C6gag.html
Super cool as always. Some feedback to enhance clarity - when writing modules (SpectralConv1d, FNOBlock1d, FNO1d), overlaying the flowchart on the right hand side to show the block to which the code corresponds would be really helpful. I felt a bit lost in these parts.
Thank you so much for this series! It has helped me tremendously to understand how automatic differentiation works under the hood. I was wondering if you plan to continue the series, as there are still operations you haven't covered. In particular, I am interested in how the pullback rules can be derived for the "not so mathematical" operations such as permutations, padding, and shrinking of tensors.
I love how you just code this up from scratch while explaining it so well 😅
Thanks for the compliment 😊 Of course, quite a bit of time also went into preparing the video :D
Just one observation: the unconstrained functional optimization gives you an unnormalized function, and you said it is necessary to normalize it afterwards. So what guarantees that this density, once normalized, will be the optimal solution? If you normalize it, it will no longer be a solution of the functional derivative equation. Also, the Euler-Lagrange equation gives you a critical point; how does one know whether it is a local minimum/maximum or a saddle point?
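On the first point: in the usual maximum-entropy derivation the normalization is itself one of the constraints, so its multiplier absorbs the normalization constant. Sketching it (with λ₀, λ₁, λ₂ as the multipliers for normalization, mean, and variance):

```latex
\mathcal{L}[p] = -\int p \ln p \, dx
  + \lambda_0\!\left(\int p \, dx - 1\right)
  + \lambda_1\!\left(\int x\, p \, dx - \mu\right)
  + \lambda_2\!\left(\int (x-\mu)^2 p \, dx - \sigma^2\right)
```

Setting the functional derivative to zero:

```latex
\frac{\delta \mathcal{L}}{\delta p}
  = -\ln p - 1 + \lambda_0 + \lambda_1 x + \lambda_2 (x-\mu)^2 = 0
\quad\Rightarrow\quad
p(x) = \exp\!\left(\lambda_0 - 1 + \lambda_1 x + \lambda_2 (x-\mu)^2\right)
```

λ₀ is then fixed by requiring ∫p dx = 1, so the normalized density is still a stationary point of the constrained problem; normalization is enforced by a constraint, not bolted on afterwards.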
Excellent work! Thanks a lot for sharing.
Thank you! Cheers! 😊
Thank you. This is super helpful as an intro to fenics for fluid.
Thanks a lot for the kind feedback 😊
Thank you for the video. Very informative, and it helps a lot. I have a small question: why is there no density in the equations? It exists in the NS equations. And one bigger question: is it possible to include heat transfer in this code with the same scheme as the pressure update, or does it need a different approach?
What if the velocity is not constant and changes with the density (q)?
Exam in 20 minutes, thanks haha
Best of luck! 😉
For full accuracy, at 0:15, the distribution of X is actually the distribution of X given Z=z, right?
I love you
I'm flattered 😅 Glad, the video was helpful
Thank you so much for all you great video. What IDE do you use?
You're welcome 🤗 That's visual studio code.
Hello, I'm an undergraduate student from South Korea. I really appreciate your videos. They help me a lot in understanding JAX and programming. Can I ask whether this 1D turbulent flow has a name?
You're very welcome 🤗 Thanks for the kind feedback. This dynamic is associated with the Kuramoto-Sivashinsky equation. I'm not too certain we can classify it as turbulent (that depends on the definition), but it definitely is chaotic (in the sense of high sensitivity with respect to the initial condition)
Thank you for this wonderful video. Could you please also do another one for a vertical flow with two phases, liquid and gas?
You're very welcome 🤗 Thanks for the suggestion. It's more of a niche topic, though. For now, I want to keep the videos rather general to address a larger audience
This dude is always on point ... keep it coming!
Thanks ❤️ More good stuff to come.
how do we know the joint dist?
That refers to us having access to a routine that evaluates the DAG. Check out my follow-up video. This should answer your question: kzread.info/dash/bejne/mYplsLmGmcyndaw.html
Very nice video, truly showing the potential of Julia for SciML! I'm curious: have you compared this Julia algorithm with JAX? It seems much faster than training in JAX. However, I'm also wondering what happens if I need to construct an MLP rather than a one-layer net, which is the most common situation in ML. What about high-dimensional data rather than 1D data? Does that also increase the complexity of using Julia?
When you say Z is an exponential distribution and X is a normal distribution, how do you know this? Is this an assumption?
Yes, that's all a modeling assumption. Here, they are chosen because they allow for a closed-form solution.
Thank you very much!
You're welcome! 🤗
Great stuff
Thanks 🙏
Hi this was the most epic explanation I've ever seen, thank you! My question is that at ~14:25, you swap the numerator and denominator in the first term -- why did you do this swap?
Very cool video! The walkthrough write-up of this alternate program of 1D FNO is super useful for newcomers like myself :)
Great to hear! 😊 Thanks for the kind feedback ❤️
Hi man, awesome videos! A problem, if I may: let's say I have a box or cylinder partially filled with fluid (1/2 or 3/4 full). I need to simulate acceleration on it: input (accelerationXYZ, dt), return the inertial feedback forces on XYZ over time. Simple level of detail, 300-400 fps. Maybe bake the center-of-mass behaviour into ML or something. Any ideas? Thanks!
Not sure if I'm mistaken or not, but in your explanation the formula should actually use "ln" rather than "log". For others looking at the math part of the explanation: np.log actually computes ln (log base e), which got me confused for a while.
Good catch. 👍 Indeed, it has to be "ln" for the correct Box-Muller transform. CS people tend to always just write log 😅
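A minimal sketch (illustration only) confirming that `np.log` is the natural logarithm and showing the Box-Muller transform written with it:

```python
import numpy as np

# np.log IS the natural logarithm (base e), which is exactly what the
# Box-Muller transform requires; base 10 lives in np.log10.
print(np.isclose(np.log(np.e), 1.0))        # True
print(np.isclose(np.log10(100.0), 2.0))     # True

# Box-Muller: two independent U(0,1] samples become two independent
# standard normal samples.
rng = np.random.default_rng(0)
u1 = rng.uniform(1e-12, 1.0, 100_000)       # avoid log(0)
u2 = rng.uniform(0.0, 1.0, 100_000)
r = np.sqrt(-2.0 * np.log(u1))              # ln, NOT log10
z1 = r * np.cos(2.0 * np.pi * u2)
z2 = r * np.sin(2.0 * np.pi * u2)

# The pooled samples should look standard normal.
z = np.concatenate([z1, z2])
print(abs(z.mean()) < 0.05, abs(z.std() - 1.0) < 0.05)
```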