Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)

Science and technology

Get your $5 coupon for Gradient: gradient.1stcollab.com/umarja...
In this video we explore the entire Retrieval Augmented Generation pipeline. I will start by reviewing language models, their training and inference, and then explore the main ingredient of a RAG pipeline: embedding vectors. We will see what embedding vectors are, how they are computed, and how we can compute embedding vectors for entire sentences. We will also explore what a vector database is, along with the popular HNSW (Hierarchical Navigable Small Worlds) algorithm that vector databases use to find embedding vectors given a query. A minimal code sketch of the retrieval part of this pipeline follows the links below.
Download the PDF slides: github.com/hkproj/retrieval-a...
Sentence BERT paper: arxiv.org/pdf/1908.10084.pdf
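
To make the pipeline concrete, here is a minimal sketch of its retrieval half, assuming the sentence-transformers library (the library released by the Sentence BERT authors); the model name, documents, and prompt format are illustrative assumptions, not the ones used in the video.

```python
# Minimal sketch: embed documents, then retrieve the top-k most similar to a query.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # a small Sentence BERT-style encoder

documents = [
    "The capital of China is Beijing.",
    "HNSW is a graph-based approximate nearest-neighbor algorithm.",
    "Embedding vectors map text to points in a high-dimensional space.",
]
doc_vecs = model.encode(documents)    # shape: (n_docs, dim)

query = "Which city is China's capital?"
q_vec = model.encode([query])[0]

# Cosine similarity between the query and every stored document vector.
sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
top_k = np.argsort(-sims)[:2]         # indices of the 2 closest documents

# The retrieved context is prepended to the question and sent to the LLM.
context = "\n".join(documents[i] for i in top_k)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```
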
Chapters
00:00 - Introduction
02:22 - Language Models
04:33 - Fine-Tuning
06:04 - Prompt Engineering (Few-Shot)
07:24 - Prompt Engineering (QA)
10:15 - RAG pipeline (introduction)
13:38 - Embedding Vectors
19:41 - Sentence Embedding
23:17 - Sentence BERT
28:10 - RAG pipeline (review)
29:50 - RAG with Gradient
31:38 - Vector Database
33:11 - K-NN (Naive)
35:16 - Hierarchical Navigable Small Worlds (Introduction)
35:54 - Six Degrees of Separation
39:35 - Navigable Small Worlds
43:08 - Skip-List
45:23 - Hierarchical Navigable Small Worlds
47:27 - RAG pipeline (review)
48:22 - Closing

Comments: 102

  • @ramsivarmakrishnan1399 · 12 days ago

    You are the best teacher of ML that I have experienced. Thanks for sharing the knowledge.

  • @nawarajbhujel8266 · 2 months ago

    This is what a teacher with a deep knowledge on what is teaching can do. Thank you very much.

  • @DeepakTopwal-sl6bw · 2 months ago

    And learning becomes more interesting and fun when you have a teacher like Umar, who explains everything related to the topic so well that everyone feels like they know the complete algorithm. A big fan of your teaching methods, Umar. Thanks for making all the informative videos.

  • @sarimhashmi9753 · 25 days ago

    Wow, thanks a lot. This is the best explanation of RAG I've found on KZread.

  • @tryit-wv8ui · 5 months ago

    Wow! I finally understood everything. I am a student in ML. I have watched already half of your videos. Thank you so much for sharing. Greetings from Jerusalem

  • @wilsvenleong96 · 5 months ago

    Man, your content is awesome. Please do not stop making these videos as well as code walkthroughs.

  • @Rockermiriam · 1 month ago

    Amazing teacher! 50 minutes flew by :)

  • @yuliantonaserudin7630 · 3 months ago

    The best explanation of RAG

  • @christopheprotat · 5 months ago

    Waited for such content for a while. You made my day. I think I got almost everything. So educational. Thank you Umar

  • @suman14san · 3 months ago

    What an exceptional explanation of HNSW algo ❤

  • @jeremyregamey495 · 6 months ago

    Just love your videos. So much detail, but extremely well put together.

  • @kiranshenvi2626 · 4 months ago

    Awesome content, sir; it was the best explanation I've found so far!

  • @JRB463 · 3 months ago

    This was fantastic (as usual). Thanks for putting it together. It has helped my understanding no end.

  • @user-yp2bg2bv2t · 6 months ago

    One of the best channels to learn and grow

  • @alexsguha · 3 months ago

    Impressively intuitive, something most explanations are not. Great video!

  • @alexandredamiao1365 · 3 months ago

    This was fantastic and I have learned a lot from this! Thanks a lot for putting this lesson together!

  • @NeoMekhar · 5 months ago

    This video is really good, subscribed! You explained the topic super well. Thanks!

  • @goelnikhils · 5 months ago

    Amazing content and what a clear explanation. Please make more videos. Keep going; this channel will grow like anything.

  • @1tahirrauf · 6 months ago

    Thanks Umar. I look forward to your videos, as you explain the topics in an easy-to-understand way. I would request you to make a "BERT implementation from scratch" video.

  • @mturja · 2 months ago

    The explanation of HNSW is excellent!

  • @melikanobakhtian6018 · 5 months ago

    Wow! You explained everything great! Please make more videos like this

  • @oliva8282 · 11 days ago

    Best video ever!

  • @myfolder4561 · 2 months ago

    Thank you so much - this is a great video. Great balance of details and explanation. I have learned a ton and have saved it down for future reference

  • @meetvardoriya2550 · 5 months ago

    Really amazing content!! Looking forward to more such content, Umar :)

  • @sethcoast · 1 month ago

    This was such a great explanation. Thank you!

  • @trungquang1581 · 2 months ago

    Thank you so much for sharing. Looking forward to more content about NLP and LLMs.

  • @bhanujinaidu · 1 month ago

    Good explanation, thanks

  • @Zayed.R · 2 months ago

    Very informative, thanks

  • @akramsalim9706 · 6 months ago

    Awesome paper. Please keep posting more videos like this.

  • @mdmusaddique_cse7458 · 3 months ago

    Amazing explanation!

  • @ShreyasSreedhar2 · 3 months ago

    This was super insightful, thank you very much!

  • @Jc-jv3wj · 24 days ago

    Thank you very much for a detailed explanation of RAG with a vector database. I have one question: can you please explain how we design the skip list with embeddings? Basically, how do we decide which embedding goes to which level?
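
For what it's worth, the HNSW paper (Malkov & Yashunin) answers this with randomness rather than with any property of the embedding itself: each inserted vector draws its maximum level from an exponentially decaying distribution, so every level up holds exponentially fewer nodes. A minimal sketch of that assignment rule:

```python
# HNSW level assignment: each inserted element draws a random top level, with
# exponentially decaying probability, independent of the vector's contents.
import math
import random

def random_level(m_l: float) -> int:
    # m_l is a normalization constant; the paper suggests m_l = 1 / ln(M),
    # where M is the maximum number of neighbors per node.
    u = 1.0 - random.random()          # uniform in (0, 1], avoids log(0)
    return math.floor(-math.log(u) * m_l)

m_l = 1 / math.log(16)                 # e.g., M = 16
levels = [random_level(m_l) for _ in range(100_000)]
for lvl in range(4):
    print(lvl, levels.count(lvl))      # roughly a 16x drop per level
```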

  • @FailingProject185 · 6 months ago

    Glad I've subscribed to your channel. Please do more of these.

  • @maximbobrin7074 · 6 months ago

    Man, keep it up! Love your content

  • @mustafacaml8833 · 3 months ago

    Great explanation! Thank you so much

  • @vasoyarutvik2897 · 4 months ago

    Hello sir, I just want to say thanks for creating very good content for us. Love from India :)

  • @user-cs6vt4ei9v · 3 months ago

    Amazing work, very clear explanation, thank you!

  • @ashishgoyal4958 · 6 months ago

    Thanks for making these videos🎉

  • @hientq3824 · 6 months ago

    awesome as usual! ty

  • @manyams5207 · 4 months ago

    Wow, wonderful explanation, thanks.

  • @_seeker423 · 3 months ago

    Excellent content!

  • @nancyyou7548 · 4 months ago

    Thank you for the excellent content!

  • @ahmedoumar3741 · 4 months ago

    Nice lecture, Thank you!

  • @fernandofariajunior · 3 months ago

    Thanks for making this video!

  • @LiuCarl · 3 months ago

    simply impressive

  • @amblessedcoding · 6 months ago

    Wooo you are the best I have ever seen

  • @user-wy1xm4gl1c · 6 months ago

    Thank you, awesome video!

  • @SanthoshKumar-dk8vs · 4 months ago

    Thanks for sharing, really great content 👏

  • @DanielJimenez-yy8xk · 1 month ago

    awesome content

  • @dantedt3931 · 4 months ago

    One of the best videos

  • @satviknaren9681 · 24 days ago

    Please bring some more content!

  • @emptygirl296 · 6 months ago

    Hola, coming back with great content as usual.

  • @umarjamilai · 6 months ago

    Thanks 🤓😺

  • @SureshKumarMaddala · 6 months ago

    Excellent video! 👏👏👏

  • @parapadirapa · 4 months ago

    Amazing presentation! I have a couple of questions though... What size of chunks should be used when using Ada-002? Is that dependent on the embedding model? Or is it to optimize the granularity of 'queryable' embedded vectors? And another thing: am I correct to assume that, in order to capture the most context possible, I should embed a 'tree structure' object (like a complex object in C#, with multiple nested object properties of other types) sectioned from more granular all the way up to the full object (as in, first the children, then the parents, then the grandparents)?

  • @rajyadav2330 · 5 months ago

    Great content, keep doing it.

  • @oliz1148 · 3 months ago

    so helpful! thx for sharing

  • @sounishnath513 · 6 months ago

    I am so glad I am subscribed to you!

  • @amazing-graceolutomilayo5041 · 3 months ago

    This was a wonderful explanation! I understood everything, and I didn't have to watch the Transformers or BERT videos (I actually know nothing about them, but I have dabbled with vector DBs). I have subbed and I will definitely watch the Transformer and BERT videos. Thank you!❤❤ Made a little donation too. This is my first time ever saying $Thanks$ on KZread haha

  • @Tiger-Tippu · 6 months ago

    Hi Umar, does RAG also have the context window limitation, like the prompt engineering technique?

  • @qicao7769 · 3 months ago

    Cool video about RAG! You could also upload it to Bilibili; as you live in China, you should know it. :D

  • @soyedafaria4672 · 3 months ago

    Thank you so much. Such a nice explanation. 😀

  • @UncleDavid · 5 months ago

    Salam Mr. Jamil, I was wondering if it would be possible to use the BERT model provided by Apple in Core ML for sentiment analysis when talking to Siri, then have a small GPT-2 model fine-tuned on conversational intelligence generate a response that Siri reads out.

  • @Vignesh-ho2dn · 1 month ago

    How would you find the number 3 at 44:01? You said the algorithm will go to 5, and then, since 5 is greater than 3, it won't go further. Am I right?
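
For reference, standard skip-list search never commits to a node larger than the target: at each level it moves right only while the next key is still at most the target, and otherwise drops down a level, so the walk reaches 3 without ever landing on 5. A minimal sketch, assuming each level is stored as a sorted Python list (real skip lists use linked nodes, and the smallest key here doubles as the head):

```python
# Skip-list search sketch: level 0 holds every key in sorted order; higher
# levels are sparser "express lanes". The head key (1) appears at every level.
levels = [
    [1, 3, 5, 9],   # level 0
    [1, 5],         # level 1
    [1],            # level 2 (top)
]

def search(levels, target):
    cur = levels[-1][0]                 # start at the head of the top level
    for row in reversed(levels):        # top level first, then down
        i = row.index(cur)
        # move right while the next key does not overshoot the target
        while i + 1 < len(row) and row[i + 1] <= target:
            i += 1
        cur = row[i]
        if cur == target:
            return True
    return False

print(search(levels, 3))   # True: descend to level 0, then step from 1 to 3
print(search(levels, 4))   # False: the walk stops at 3 on level 0
```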

  • @ChashiMahiulIslam-qh6ks · 3 months ago

    You are the BEST!

  • @user-hc3nr9re4j · 6 months ago

    Thank you so much, man.

  • @rvons2 · 6 months ago

    Are we storing the sentence embeddings together with the original sentences they were created from? If not, how do we map them back (from the top-k most similar stored vectors) to the text they originated from, given that the sentence embedding lost some information when pooling was done?

  • @umarjamilai · 6 months ago

    Yes, the vector database stores the embedding and the original text. Sometimes, they do not store the original text but a reference to it (for example instead of storing the text of a tweet, you may store the ID of the tweet) and then retrieve the original content using the reference.
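
A minimal sketch of that layout, assuming a plain in-memory store and a hypothetical embed() stand-in for the sentence-embedding model (a real vector database would put an ANN index such as HNSW on top of this):

```python
# Each index entry pairs an embedding with a reference (a document ID), and
# the original text is fetched from a separate table after the search.
import numpy as np

docs = {
    101: "Beijing is the capital of China.",
    102: "The Great Wall is over 21,000 km long.",
}

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in: a real system would call a sentence-embedding
    # model here; random projections only demonstrate the data layout.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(8)
    return v / np.linalg.norm(v)

index = [(doc_id, embed(text)) for doc_id, text in docs.items()]

def top_k(query: str, k: int = 1):
    q = embed(query)
    ranked = sorted(index, key=lambda entry: -float(entry[1] @ q))
    # map the winning vectors back to the source text via the stored reference
    return [(doc_id, docs[doc_id]) for doc_id, _ in ranked[:k]]

print(top_k("capital of China"))  # meaningful only with a real embedding model
```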

  • @mohamed_akram1 · 3 months ago

    Thanks

  • @chhabiacharya307 · 4 months ago

    Thank YOU :)

  • @tomargentin5198 · 2 months ago

    Hey, big thanks for this awesome and super informative video! I'm really intrigued by the Siamese architecture and its connection to RAG. Could someone explain that a bit more? Am I right in saying it's used for top-K retrieval? Meaning, we create the database with the output embeddings, and then use a trained Siamese architecture to find the top-K most relevant chunks by computing similarities? Is it necessary to use this approach in every framework, or can just computing similarity through the embeddings sometimes work effectively?

  • @adatalearner8683 · 1 month ago

    Why is the context window size limited? Is it because these models are based on transformers, and for a given transformer architecture, long-distance semantic relationship detection is bounded by the number of words / the context length?

  • @songsam1373 · 2 months ago

    thanks

  • @hassanjaved4730 · 2 months ago

    Awesome, I completely understand RAG just because of you. Now I have a question: I am using the Llama 2 model, and my main concern is that I give it a PDF for context and then the user can ask questions about it, but this approach takes time during inference. After watching your video, what I understand is that with a RAG pipeline it's possible to store the uploaded PDF in a vector DB and then use it like that. Am I thinking right, and is it possible? Thanks.

  • @user-hd7xp1qg3j · 6 months ago

    You are a legend.

  • @h3xl4 · 23 days ago

    Thanks!

  • @12.851 · 4 months ago

    Great video!! Shouldn't 5 come after 3 in the skip list?

  • @adatalearner8683 · 1 month ago

    How do we get the target cosine similarity in the first place?

  • @user-bt1jl1ou7j · 6 months ago

    Wow, I saw the Chinese knotting on your wall ~

  • @amblessedcoding · 6 months ago

    Thanks bro

  • @user-kg9zs1xh3u · 5 months ago

    keep it up!

  • @jrgenolsen3290 · 3 months ago

    💪👍 Good introduction

  • @ltbd78 · 4 months ago

    Legend

  • @christopherhornle4513 · 6 months ago

    Great video, keep up the good work! :) Around 19:25 you're saying that the embedding for "capital" is updated during backprop. Isn't that wrong for the shown example / training run where "capital" is masked? I always thought only the embeddings associated with non-masked tokens could be updated.

  • @umarjamilai · 6 months ago

    You're right! First of all, ALL embedding vectors of the 14 tokens are updated (including the embedding associated with the MASK token). What actually happens is that the model updates the embeddings of all the surrounding words in such a way that it can rebuild the missing word next time. Plus, the model is forced to use (mostly) the embeddings of the context words to predict the masked token, since any word may be masked, so there's not much useful information in the embedding of the MASK token itself. It's easy to get confused when you make long videos like mine 😬😬 Thanks for pointing it out!

  • @christopherhornle4513 · 6 months ago

    I see, didn't know that the mask token is also updated! Thank you for the quick response. You really are a remarkable person. Keep going!
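
A toy way to see the point made above, assuming PyTorch, with mean pooling standing in for self-attention (so this is not BERT, just the gradient-flow argument): after one masked-prediction step, every input token's embedding row has a nonzero gradient, the MASK token's included.

```python
# Toy masked-LM step: embeddings -> mean pool -> linear head. After backward(),
# the embedding rows of ALL input tokens (including MASK) receive gradients.
import torch

vocab_size, dim = 10, 4
emb = torch.nn.Embedding(vocab_size, dim)
head = torch.nn.Linear(dim, vocab_size)

MASK = 0
tokens = torch.tensor([3, 7, MASK, 5])            # third position is masked
target = torch.tensor([2])                        # true id of the hidden word

pooled = emb(tokens).mean(dim=0, keepdim=True)    # every embedding contributes
loss = torch.nn.functional.cross_entropy(head(pooled), target)
loss.backward()

for t in tokens.tolist():
    grad = emb.weight.grad[t].abs().sum().item()
    print(f"token {t}: gradient magnitude {grad:.4f}")  # nonzero for all four
```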

  • @tempdeltavalue · 4 months ago

    So how does an LLM convert a vector to text?

  • @rkbshiva · 5 months ago

    Umar, great content! Around 25:00, you say that we have a target cosine similarity. How is that target cosine similarity calculated? Because there is no mathematical way to calculate the cosine similarity between two sentences; all we can do is take a subjective guess. Can you please explain in detail how this works?

  • @umarjamilai · 5 months ago

    When you train the model, you have a dataset that maps two sentences to a score (chosen by a human being, based on a scale from 1 to 10, for example). This score can be used as the target for the cosine similarity. If you look at papers in this field, you'll see there are many sophisticated methods, but the training data is always labeled by a human being.

  • @rkbshiva · 5 months ago

    @@umarjamilai Understood! Thanks very much for the prompt response. It would be great if we could identify a bias-free way to do this, as the scoring between 1 and 10, especially when done by multiple people and at scale, could get biased.
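
For reference, the regression objective from the Sentence BERT paper does exactly what the reply above describes: the cosine similarity of the two pooled sentence embeddings is regressed against the human-labeled score with an MSE loss. A minimal sketch, assuming PyTorch (the [0, 1] label scale is an illustrative choice; STS-style datasets label pairs on 0 to 5 and rescale):

```python
# Sentence BERT regression objective: cosine(u, v) trained with MSE against a
# human-annotated similarity label for the sentence pair.
import torch

def regression_loss(u, v, label):
    # u, v: pooled sentence embeddings from the same (Siamese) encoder
    cos = torch.nn.functional.cosine_similarity(u, v, dim=-1)
    return torch.nn.functional.mse_loss(cos, label)

u = torch.randn(2, 384, requires_grad=True)   # stand-ins for encoder outputs
v = torch.randn(2, 384, requires_grad=True)
label = torch.tensor([0.9, 0.1])              # human-labeled pair similarities
loss = regression_loss(u, v, label)
loss.backward()                               # gradients reach the encoder
print(loss.item())
```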

  • @koiRitwikHai · 4 months ago

    At 44:00, the order of the linked list is incorrect, isn't it? Because it should be 1, 3, 5, 9.

  • @moviesnight248 · 4 months ago

    I have the same doubt. It should have been sorted, as per the definition.


  • @faiqkhan7545 · 6 months ago

    Let's say I want to create an online semantic search tool that uses a vector DB and RAG, just like the Bing tool. Will it follow the same procedure, and what new things would I add to integrate it with the Internet? Also, nicely put video, Umar. Can you do a coding session for this one like you do for all the others, like making something with real-time output with RAG? Or anything; it will be a pleasure to watch.

  • @anapaunovic8405 · 5 months ago

    Do you plan to record coding Sentence BERT from scratch?

  • @utkarshjain3814 · 2 months ago

  • @NicolasPorta31 · 4 months ago

    Thanks!

  • @KumR · 4 months ago

    Half

  • @DiegoSilva-dv9uf · 5 months ago

    Thanks!

  • @amitshukla1495 · 6 months ago

    You are a legend.

  • @amazing-graceolutomilayo5041 · 3 months ago

    Thanks

  • @MihailLivitski · 6 months ago

    Thanks!
