Word2Vec - Skipgram and CBOW

#Word2Vec #SkipGram #CBOW #DeepLearning
Word2Vec is a very popular algorithm for generating word embeddings. It preserves semantic relationships between words and is used in many Deep Learning applications.
In this video we learn how Word2Vec and word embeddings work. We also learn about Skip-gram and Continuous Bag of Words (CBOW), the two architectures used to generate word2vec embeddings; a short sketch of the training pairs each one uses follows the chapter list below.
Word2Vec coupled with RNNs and CNNs is also used in building chatbots, among many other use cases.
Introduction: (0:00)
Why use word embeddings?: (0:14)
What is Word2Vec?: (0:42)
How Word2Vec works: (1:58)
CBOW and Skip-gram: (2:48)
How CBOW works: (3:36)
How Skip-gram works: (5:32)
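
How the two flavors differ (a sketch for illustration, not code from the video; it assumes the video's example sentence "hope can set you free" and a context window of 1 on each side):

    # Build (context, target) pairs for CBOW and (target, context)
    # pairs for Skip-gram by sliding a window over the sentence.
    sentence = ["hope", "can", "set", "you", "free"]
    window = 1

    cbow_pairs, skipgram_pairs = [], []
    for i, target in enumerate(sentence):
        context = [sentence[j]
                   for j in range(max(0, i - window),
                                  min(len(sentence), i + window + 1))
                   if j != i]
        cbow_pairs.append((context, target))                  # CBOW: context -> word
        skipgram_pairs.extend((target, c) for c in context)   # Skip-gram: word -> context

    print(cbow_pairs[:2])      # [(['can'], 'hope'), (['hope', 'set'], 'can')]
    print(skipgram_pairs[:3])  # [('hope', 'can'), ('can', 'hope'), ('can', 'set')]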

Comments: 131

  • @rma1563 (4 months ago)

    By far the best explanation of this topic. It's crazy that you only took 7 minutes to explain what most people spend far longer on and still can't deliver. Thanks ❤

  • @nax2kim2 (3 years ago)

    Indexing for me: 2:40 Word2Vec example, 3:06 CBOW, 3:20 Skip-gram, 5:30 CBOW working, 5:50 Skip-gram working, 6:30 getting word embeddings. Thanks for this video :)

  • @iindifferent (4 years ago)

    Thank you. I was having a hard time understanding the concept from my uni classes. After watching your video I went back and reread, and everything started to make more sense. Then I came back and watched this a second time, and I think I have the hang of it now.

  • @TheSemiColon (4 years ago)

    Glad it helped!

  • @user-fy5go3rh8p (3 years ago)

    This is the best explanation I've encountered so far. Thank you!

  • @fabricesimodefo8113 (4 years ago)

    Exactly what I was searching for! So clear. Sometimes you just need the neural network structure in detail, in a graph or visually. Why don't more people do that? It's the simplest way to understand what really happens in the code afterwards.

  • @TheSemiColon (4 years ago)

    This is what I needed when I was creating it, but did not find it anywhere :)

  • @sheshagirigh (5 years ago)

    Thanks a ton. By far the best I could find after a lot of searching... even better than a few of the Stanford lectures!

  • @jiexiong8522 (3 months ago)

    Other word2vec videos are still intimidating even after a lot of graphs and simplification. Your video is so friendly and helped me understand this key algorithm. Thanks!

  • @tylerlozano152 (4 years ago)

    Thank you for the thorough, simple explanation.

  • @GunturBudiHerwanto (3 years ago)

    Thank you sir! I always come back to this video whenever I forget the concept.

  • @Amf313 (2 years ago)

    Best explanation I've seen on the Internet of how Word2Vec works. The paper was a little hard to read; Andrew Ng's explanation was somewhat incomplete, or at least ambiguous to me, but your video made it clear. Thank you🙏

  • @subhamprasad6808 (3 years ago)

    Finally, I understood the concept of Word2Vec after watching this video. Thank you.

  • @chihiroa1045 (a year ago)

    Thank you so much! This is the clearest and most organized tutorial I have found on Word2Vec!

  • @maqboolurrahimkhan (2 years ago)

    The best and easiest explanation of word2vec on the internet. Keep up the good work. Thanks a ton!

  • @jusjosef (3 years ago)

    Very simple, to the point explanation. Beautiful!

  • @carlrobinson2926 (5 years ago)

    Very nice explanation, not too long, straight to the point. Thanks.

  • @rainoorosmansaputratampubo2213 (3 years ago)

    Thank you so much. With this explanation I can understand it more easily than by reading books.

  • @bloodzitup (4 years ago)

    Thanks, my lecturer had this video in his references for learning word2vec

  • @skipintro9988 (3 years ago)

    Thanks, bro - this is the easiest, simplest, and quickest explanation of word2vec.

  • @johncompassion9054 (2 months ago)

    4:50 "5x3 input matrix is shared by the context words" - what do you mean by input matrix? Do you mean the weight matrix between the hidden layer (embedding) and the output layer? 5:18 "You take the weight matrix and it becomes the set of vectors" - we have two weight matrices, so which one? Also, I guess our vector embedding is the middle layer's output values, not the weights. Correct me if I am wrong. Thank you.

  • @satyarajadasara9000 (3 years ago)

    Very nice video where everything was to the point! Keep posting such wonderful content!

  • @MrStudent1978 (4 years ago)

    Absolutely beautiful explanation!! Very precise and very informative... Thanks for your kindness. Sharing one's learning is the best thing a person can do to contribute to society. Lots of respect from Punjab, India.

  • @TheSemiColon (4 years ago)

    Glad it was helpful!

  • @pushkarmandot4426 (4 years ago)

    The best video. Explained the whole concept in a very short amount of time

  • @ajinkyajoshi2308 (a year ago)

    Very well done!! Precise and to the point explanation!!

  • @jamesmina7258 (13 days ago)

    Thank you. I learned a lot from your video.

  • @theunknown2090 (5 years ago)

    Hey, in the CBOW and skip-gram methods there are 3 weight matrices. Which matrix is selected as the embedding matrix, and why?

  • @ankursri21 (4 years ago)

    Thank you... very well explained in a short time.

  • @MehdiMirzapour (5 years ago)

    Thanks. It is really a brilliant explanation!

  • @absoluteanagha (3 years ago)

    Love this! Such a great explanation!

  • @sunjitrana374 (5 years ago)

    Nice explanation, thanks for that!!! One question: how do you decide the optimal size of the hidden layer? Here in the example it's 3, and you said in general it's around 300.

  • @OorakanaGleb (4 years ago)

    Awesome explanation. Thanks!

  • @varunjindal1520 (3 years ago)

    This is indeed a very good video. To the point, and covers what I needed to know. Thank you.

  • @TheSemiColon (3 years ago)

    Glad you found it useful, do share the word 🙂

  • @anujlahoty8022 (a year ago)

    Simple and eloquent explanation.

  • @juanpablo87t (2 years ago)

    Great video, thank you! It is very clear how to extract the word embeddings in skip-gram by multiplying the W matrix with the one-hot vector of the corresponding word, but I can't figure out how to extract them from the CBOW model, as there are multiple W matrices. Could you give me a hint or maybe a resource where this is explained?

  • @bryancamilo5139 (4 months ago)

    Thank you, your explanation is great. Now I have understood the concept 😁

  • @befesa1 (a month ago)

    Thank you! Really good explanation :)

  • @anindyavedant801 (5 years ago)

    I had a doubt: shouldn't the first weight matrix, by which the input is multiplied, have dimensions 5x3? All the connections need to be mapped to the hidden layer, and we have 5 inputs and 3 nodes in the hidden layer, so the weights would be 5x3, and the second matrix would be the reverse, i.e. 3x5.

  • @FTLC (a year ago)

    Thank you so much. I was so confused before watching this video; now it's clear to me.

  • @hs_harsh (5 years ago)

    Sir, can you provide a link to the slides used? That would be helpful. I'm a student at IIT Delhi and I have to deliver a similar lecture presentation. Thank you!

  • @nithin5238 (4 years ago)

    Very clear explanation, man... you deserve slow claps.

  • @coolbowties394 (4 years ago)

    Thanks so much for this thorough explanation!

  • @TheSemiColon (4 years ago)

    Glad it was helpful!

  • @prathimads2876 (5 years ago)

    Thank you so much Sir...

  • @MARTIN-101 (2 years ago)

    This was such an informative lecture, thank you.

  • @ogsconnect1312 (4 years ago)

    I cannot say anything but excellent. Thank you

  • @mohajeramir (3 years ago)

    This is the best explanation I have found. Thank you.

  • @TheSemiColon (3 years ago)

    Glad you found it useful, do share the word 🙂

  • @ashwinrameshbabu2418 (3 years ago)

    At 5:28, in CBOW, "hope" gives a 1x3 output and "set" gives a 1x3 output. How are they combined into one 1x3 vector before being sent to the final layer?

  • @aravindaraman8667 (3 years ago)

    Amazing explanation! Thanks a lot

  • @romanm7530 (2 years ago)

    The narrator is simply on fire!

  • @HY-nt8nk (3 years ago)

    Good work! Nicely explained.

  • @AdityaPatilR (3 years ago)

    If "hope" can set us free, hope can set you free as well!! Thank you for the explanation and for following what you preach ;)

  • @tumul1474 (5 years ago)

    Awesome!!

  • @alialsaffar6090 (5 years ago)

    This was enlightening. Thank you!

  • @gauharahmad2643 (4 years ago)

    Sir, what do we mean by the size of each vector at 4:37?

  • @gouripeddivenkataasrithbha5148 (4 years ago)

    Truly the best resource on word2vec by far. I have only one doubt: what do you mean by the size of a vector being three? Other than this, I was able to understand everything.

  • @TheSemiColon (4 years ago)

    The size of the final vector for each word is the size of the word vector, i.e. the embedding dimension.

  • @hashinitheldeniya1347 (3 years ago)

    Can we cluster word phrases into groups using this word2vec technique?

  • @MultiAkshay009 (5 years ago)

    Great work! 😍 I am really thankful to you. But I still have a doubt about the implementation part: 1) How do you train the models on new datasets? 2) How do you use the two approaches, CBOW and Skip-gram, separately to train models? I badly need help with this. :(

  • @TheSemiColon (5 years ago)

    Thanks a lot. If you are implementing it from scratch, you have to encode each word of your corpus as a one-hot vector, train the network using either algorithm (skip-gram or CBOW), and then pull out its weights. Then multiply the weights with the one-hot vector. The official TensorFlow blog has a very nice example of this. You can also use libraries like gensim to do it for you.
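
    A minimal sketch of the gensim route mentioned above (an editor's illustration, not the video's code; assumes gensim 4.x, where the embedding size parameter is vector_size and sg switches between the two architectures):

      # Train Word2Vec on a toy corpus; each sentence is a list of tokens.
      from gensim.models import Word2Vec

      sentences = [["hope", "can", "set", "you", "free"],
                   ["hope", "set", "me", "free"]]

      # sg=1 trains Skip-gram; sg=0 (the default) trains CBOW.
      model = Word2Vec(sentences, vector_size=3, window=2, min_count=1, sg=1)

      print(model.wv["hope"])               # the learned 3-dimensional word vector
      print(model.wv.most_similar("hope"))  # nearest words by cosine similarity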

  • @impracticaldev (a year ago)

    You earned a subscription. Good luck!

  • @parthpatel3900 (5 years ago)

    Wonderful video

  • @md.prantohasan9630 (4 years ago)

    Excellent explanation in a very short time. Take

  • @sadeenmahbubmobin7102 (4 years ago)

    Now explain the reading material to me :3

  • @pranabsarkar (4 years ago)

    Thanks a lot!

  • @Zinghere (2 years ago)

    Great explanation!

  • @mohitagarwal437 (3 years ago)

    Awesome, bro! Have you covered all of data science?

  • @mohajeramir (4 years ago)

    This was excellent. Thank you.

  • @TheSemiColon (4 years ago)

    Glad it was helpful!

  • @haorao2464 (3 years ago)

    Thanks so much!

  • @hadrianarodriguez6666 (4 years ago)

    Thanks for the explanation! If I want to work with terms of two tokens, how can I do it?

  • @TheSemiColon (4 years ago)

    You may want to append them, maybe?

  • @muhammedhassen4354 (5 years ago)

    Easy explanation, gr8!

  • @keno2055 (2 years ago)

    Why does the hidden layer at 4:59 have 3 nodes if we only care about the 2 adjacent nodes?

  • @hardikajmani5088 (4 years ago)

    Very well explained

  • @ms10596 (5 years ago)

    So helpful

  • @aliqais4896 (4 years ago)

    Thank you very much!

  • @Hellow_._ (10 months ago)

    How can we give all the input vectors in one go to train the model?

  • @fahdciwan8709 (3 years ago)

    What is the purpose of multiplying the 3x5 weight matrix with the one-hot vector of the word? How does it improve the embeddings?

  • @SameerKhan-ht4mx (2 years ago)

    Basically, the weight matrix is the word embedding: multiplying it by a word's one-hot vector just selects that word's row, i.e. its embedding.
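
    A tiny numpy sketch of that point (an illustration, not from the video): multiplying a one-hot vector by the input weight matrix just reads out one row, which is the word's embedding.

      import numpy as np

      V, d = 5, 3                  # vocabulary size, embedding size
      W = np.random.rand(V, d)     # 5x3 input weight matrix learned in training

      one_hot = np.zeros(V)
      one_hot[2] = 1.0             # one-hot vector for the 3rd vocabulary word

      embedding = one_hot @ W      # shape (3,)
      assert np.allclose(embedding, W[2])  # identical to simply reading row 2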

  • @himanshusrihsk4302 (4 years ago)

    Really very useful

  • @prajitvaghmaria3669 (5 years ago)

    Any idea how to create a deep learning chatbot from scratch with Keras and TensorFlow for the WhatsApp platform, using Python?

  • @iliasp4275 (3 years ago)

    Thank you, The Semicolon.

  • @Simply-Charm (4 years ago)

    Thank you

  • @renessadesouza5601 (3 years ago)

    Thank you so much

  • @shikharkesarwani9051 (4 years ago)

    The weight matrix should be 5x3 (input to hidden) and 3x5 (hidden to output) @The Semicolon

  • @Agrover112 (4 years ago)

    It's Wx+b.

  • @nazrulhassan6310 (3 years ago)

    Fabulous explanation, but I need to do some more digging.

  • @057ahmadhilmand6 (8 months ago)

    I still don't get it. Is the word vector for each word a matrix?

  • @naveenkinnal5413 (4 years ago)

    Just one question: is the final word vector size the same as the sliding window size?

  • @TheSemiColon (4 years ago)

    No, the sliding window can be of any size.

  • @imanbio (3 years ago)

    Please fix the matrix sizes (3x5 should be 5x3 and vice versa) - nice presentation.

  • @dhruvagarwal4477 (4 years ago)

    What is the meaning of vector size?

  • @BrunoCPunto (3 years ago)

    Awesome

  • @arnav3674 (2 months ago)

    Good!

  • @vionagetricahyo1268 (5 years ago)

    Hey, can you share this code?

  • @TheEducationWorldUS (4 years ago)

    Nice explanation.

  • @qingyangluo7085 (4 years ago)

    How do I get the word embedding vector using CBOW? What neighbour words do I plug in?

  • @TheSemiColon (4 years ago)

    You have to iterate over a corpus. Popular ones are Wikipedia, Google News, etc.

  • @qingyangluo7085 (4 years ago)

    @TheSemiColon Say I want to get the embedding vector of the word "love"; this vector depends on what context/neighbour words I plug in.

  • @josephselwan1652 (2 years ago)

    It took me 10 tries to understand it, but I finally did, lol. The things we do to get a job, haha.

  • @qaisgafer3562 (4 years ago)

    Great

  • @KARIVENKATARAMPHD (5 years ago)

    Nice.

  • @tobiascornille (3 years ago)

    Which matrix is the embedding matrix in CBOW? W or W'?

  • @TheSemiColon (3 years ago)

    It's W.

  • @randomforrest9251 (3 years ago)

    Nice slides!

  • @theacid1 (3 years ago)

    Thank you. My prof is unable to explain it.

  • @Mr.AIFella (5 months ago)

    The matrix multiplication isn't correct. I think it should be 5x1 times 1x3 to equal 5x3, which is then multiplied by 3x1 to equal 5x1. Right?

  • @AryanKhandal7399 (5 years ago)

    Sir, awesome!

  • @jatinsharma782 (5 years ago)

    Very Helpful 👍

  • @DangNguyen-xx3zi (3 years ago)

    Appreciate the work put into this video, thank you!

  • @TheSemiColon (3 years ago)

    Glad it was helpful!

  • @_skeptik (a year ago)

    I didn't fully catch the difference between CBOW and skip-gram in this explanation.

  • @saikiran-mi3jc (3 years ago)

    Not much content on the channel to subscribe to (I mean, no playlists on NLP or CV). I came here with a lot of hope. The content in the video is good.

  • @fabricesimodefo8113 (4 years ago)

    Typo at 5:25: the input words should change to "set" and "free".