$0 Embeddings (OpenAI vs. free & open source)

Science and Technology

What is the cheapest way to generate text embeddings? And how do they compare to OpenAI?
To try everything Brilliant has to offer, free for a full 30 days, visit brilliant.org/RabbitHoleSyndrome. The first 200 of you will get 20% off Brilliant’s annual premium subscription.
This video was sponsored by Brilliant.
This video explores the world of open source embedding models, how to use them, and how they compare to OpenAI's text-embedding-ada-002.
Source code: github.com/rabbit-hole-syndro...
Sentence Embeddings (SBERT): sbert.net/
MTEB Leaderboard: huggingface.co/spaces/mteb/le...
00:00 Intro
02:02 Project setup
03:29 Embeddings 101
05:07 Server-side Embeddings
06:14 SBERT & Hugging Face
10:22 Sentence Transformers Models
17:18 MTEB
28:35 Inference API
55:37 Transformers.js
1:10:38 Embeddings in the Browser
1:20:19 The Future of Embeddings
1:24:23 Thanks for watching!

Comments: 272

  • @phobosmoon4643
    @phobosmoon4643 10 months ago

    45 seconds in and you have already asked all of the correct questions. you have my attention, sir.

  • @kop-lg7lo

    @kop-lg7lo

    8 months ago

    And my like

  • @ssathessa

    @ssathessa

    7 months ago

    16:00 in and you have completely lost my attention, along with my hopes of being able to do this as a 1st timer😂

  • @TC-Loom
    @TC-Loom 9 months ago

    After watching dozens of hours of similar videos this year, this is the best one. Thank you

  • @bakistas20
    @bakistas20 9 months ago

    I love how detailed your tutorials are! Keep on

  • @justinyoung1762
    @justinyoung1762 6 months ago

    This is the best ML for JS tutorial I've seen. SUPER helpful that you started with foundational topics. Thanks for making this.

  • @jimg8296
    @jimg8296 3 months ago

    Wow that was comprehensive and informative. Never realized there was so much involved in selecting models for creating embeddings. Thank you so much!

  • @fille.imgnry
    @fille.imgnry 5 months ago

    This is sooo good. The way you're thinking, explaining, digging deeper, wanting to understand why things happen the way they do. Thank you!

  • @vinitjha_
    @vinitjha_ 7 months ago

    Great explanation. You simplified every bit of detail👏

  • @monterourena
    @monterourena 7 months ago

    Amazing job! Please continue doing these videos 🙌🏻

  • @M110ification
    @M110ification 4 months ago

    This is the tutorial I've been trying to find. I like how you describe the different components of the system and didn't just start with a walkthrough of how to set up a dev environment and other mundane pieces.

  • @papa4614
    @papa4614 10 months ago

    Yes, very nice to see you're still doing vids ❤❤❤ There is so much you're teaching us, thanks for that. I hope you still do part 3 of the video series on the Electron screensaver project 🤩 Stay safe

  • @jakubwerner11
    @jakubwerner11 3 months ago

    So well explained, thank you!

  • @JohnnysaidWhat
    @JohnnysaidWhat 7 months ago

    I used CLIP to embed some fields of study and then fed it user input, which would "fuzzy map" to my embeddings. Worked so, so well and saved me a ton of heartache. Didn't realize this model could do text2image and vice versa. Will have to try that out as well!
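
A minimal sketch of the kind of "fuzzy mapping" described in this comment, assuming each field of study and the user input have already been embedded by some model. The vectors below are made-up toy values, and the names `Labeled`, `cosineSimilarity`, and `closestMatch` are hypothetical helpers, not part of CLIP or any library:

```typescript
// Hypothetical helpers for illustration; the embeddings are toy values,
// not real model output.
type Labeled = { label: string; embedding: number[] };

function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  if (denom === 0) throw new Error("Cannot compare a zero vector");
  return dot(a, b) / denom;
}

// Return the candidate whose embedding is most similar to the query embedding.
function closestMatch(query: number[], candidates: Labeled[]): { label: string; score: number } {
  if (candidates.length === 0) throw new Error("No candidates to match against");
  let best = { label: candidates[0].label, score: cosineSimilarity(query, candidates[0].embedding) };
  for (const candidate of candidates.slice(1)) {
    const score = cosineSimilarity(query, candidate.embedding);
    if (score > best.score) best = { label: candidate.label, score };
  }
  return best;
}

// Pre-computed (toy) embeddings for the known fields of study.
const fieldsOfStudy: Labeled[] = [
  { label: "Computer Science", embedding: [0.9, 0.1, 0.0] },
  { label: "Biology", embedding: [0.1, 0.9, 0.1] },
  { label: "History", embedding: [0.0, 0.2, 0.9] },
];

// Toy embedding standing in for free-form user input such as "comp sci".
const userInputEmbedding = [0.8, 0.2, 0.1];
const match = closestMatch(userInputEmbedding, fieldsOfStudy);
console.log(match.label, match.score.toFixed(3));
```

In practice the query and the candidates would need to come from the same embedding model, since vectors from different models are not comparable.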

  • @eyemazed
    @eyemazed 10 months ago

    damn i love when i dive head into NLP content, get instantly overwhelmed and then a video like this gets randomly recommended and instantly clears away about 120 questions that i had. thanks man

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Awesome!

  • @Richardziani

    @Richardziani

    10 months ago

    Hey, I totally feel you! 😍 It's amazing how stumbling upon the perfect video can make all those overwhelming questions disappear in an instant! 🙌 Thank you so much for sharing your experience, it's truly inspiring! 🌟 Keep diving headfirst into NLP content, my friend! 💪

  • @Mullheimer

    @Mullheimer

    10 months ago

    It's amazing how Google just reads your mind. At this point it is just getting retarded. Every day exactly the right yt vid comes up. I've been querying large pdfs today via chatgpt...

  • @borknagarchile
    @borknagarchile 4 months ago

    Your video is amazing, you pretty much cover all the possible use cases for when you are trying to implement something like this. Thanks 🎉

  • @shuaiber
    @shuaiber 6 months ago

    Fantastic content. Loved the contextual information in the beginning which is relevant to understand all the different components in the ecosystem.

  • @SantamChakraborty
    @SantamChakraborty 5 months ago

    Thank you ! That was amazingly lucid and well put together. Keeping an eye out for your future videos.

  • @lofikiwii
    @lofikiwii 10 months ago

    My man!!!! Huge! (It's Tony.) Glad to see a new video! And this one was a monster video! It's jam-packed with so much information, man. Keep it going, my man!

  • @roguesherlock
    @roguesherlock 7 months ago

    man this was so helpful. Thank you! Also, I loved all the btws, fyis, fun facts and side quests haha

  • @stephensinclair276
    @stephensinclair276 10 months ago

    That was one of the best videos I have seen on this topic. Going to look for more!

  • @ahmedali-hy9ew
    @ahmedali-hy9ew 9 months ago

    Keep it up with the great content, really love the way you explain stuff, we want more ai tutorials please.

  • @SamirLohiya-sm7ze
    @SamirLohiya-sm7ze 2 months ago

    This is a great video, and it makes me feel like there is no need to refer to any other video. What a great piece of content!

  • @eliaweiss1
    @eliaweiss1 8 months ago

    wow!!! amazing stuff, thanks for all this info!

  • @mytechnotalent
    @mytechnotalent 10 months ago

    Great job and very detailed I like the JS internals showing the steps.

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Awesome!

  • @eru6ite
    @eru6ite 7 months ago

    Dang! I'm not even halfway and I've already learned so much. Amazing content, mate! 🍻

  • @mortocks
    @mortocks 10 months ago

    First tutorial that actually makes sense and explains the ecosystem without hand waving. Thanks!!!!!!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    You bet! Thanks for watching 😃

  • @AngelEduardoLopezZambrano
    @AngelEduardoLopezZambrano 10 months ago

    This video got me into the rabbit hole of your channel! You just got a new subscriber. Excited to see what you'll come up with next. In particular, I'd like to see how to fine-tune text generation models on modest hardware using JavaScript. Thanks!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Will keep that in mind - thanks for the sub!

  • @comptvlee
    @comptvlee 10 months ago

    Insanely helpful content. You've been one of the content leaders carrying the burden of disseminating all these complex topics into my head. Started with the supabase docs all the way here and I'm grateful!!! This is game changing for my career

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    I appreciate the kind words. Glad they have been helpful!

  • @danielbusquets3282
    @danielbusquets3282 1 month ago

    Very good video. I just implemented a semantic search engine in my app and it works like magic

  • @rickgeyer9685
    @rickgeyer9685 11 days ago

    This is absolutely the best explanation of embeddings I have ever seen. Thanks so much for this excellent video!

  • @stephenk8632
    @stephenk8632 21 days ago

    Amazing video! Thanks. It was uncanny how questions would be popping in my head and you'd answer them in the next sentence lol. Thanks again

  • @yuri.caetano
    @yuri.caetano 10 months ago

    Bro, your video is really well edited and the content is good. Congratulations!

  • @erikmejiass
    @erikmejiass 9 months ago

    I don’t usually post comments, but in this case I'll make an exception to thank you for the easy-to-understand, to-the-point explanation. It’s rarely found these days. Hats off to you 💪

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    9 months ago

    Thanks! Glad it was helpful!

  • @sevindis
    @sevindis 7 months ago

    Great content, thank you so much!

  • @RishuKumar-zo2hc
    @RishuKumar-zo2hc 10 months ago

    Very informative. Please release more videos. Love your content.

  • @TheTaygan
    @TheTaygan 23 days ago

    Great video with awesome explanations... well done

  • @scign
    @scign 10 months ago

    I love how my head comes up with a question and you go down that exact rabbit hole! You just won a new subscriber here.

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Nice! Thanks for the sub!

  • @moy2010
    @moy2010 10 months ago

    Probably the best content I have watched on the subject. Kudos!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Happy to hear! Thanks for watching 😃

  • @crometica
    @crometica 6 months ago

    It's incredibly valuable, and you've gained a fan ❤. Keep up the fantastic work!

  • @gunerdown
    @gunerdown 6 months ago

    You got my sub, great content!

  • @SmallTownDev
    @SmallTownDev 8 months ago

    By far the most informative resource 🚀

  • @mohamedamine_47
    @mohamedamine_47 2 months ago

    While I didn't originally come to watch the TypeScript implementation, I found myself fully immersed in your explanations. It wasn't until I'd watched the first 30 minutes that I realized the video was over an hour long. Thank you for sharing this!

  • @Blocky007
    @Blocky007 7 months ago

    amazing and extremely helpful

  • @abagatelle
    @abagatelle 8 months ago

    I'm.....overwhelmed! Learned that it's a very long road. Very pleased I found your excellent channel.

  • @N4LNba777
    @N4LNba777 1 month ago

    Amazing video, thanks for saving me a ton of time!

  • @richardyim8914
    @richardyim8914 1 month ago

    This was crazyyy useful. Not a software or web developer, but trying to use embeddings for academic research. Really great walkthrough!

  • @lakinmohapatra
    @lakinmohapatra 8 months ago

    Very nice explanation. Thanks a lot

  • @mbrochh82
    @mbrochh82 3 months ago

    super good tutorial!!

  • @khangvutien2538
    @khangvutien2538 8 months ago

    With this video, you got a new subscriber. Thanks.

  • @handlez411
    @handlez411 4 months ago

    Awesome video! Thank you!

  • @PatrickSteil
    @PatrickSteil 4 months ago

    Wonderful video... helping to teach us how all this stuff is working under the covers... new subscriber!

  • @elierh442
    @elierh442 3 months ago

    Love your content, keep it up! 👌

  • @adrienazie
    @adrienazie 6 months ago

    Thanks for the time and effort you have put into creating this video. Great content and explanation.

  • @amanshrivastava7495
    @amanshrivastava7495 5 months ago

    Thanks man!!

  • @martinp3839
    @martinp3839 9 months ago

    Excellent! Especially the way you build up the episode from explaining core concepts to a layman..glad I stumbled upon your channel..

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    9 months ago

    Glad it was helpful!

  • @martinp3839

    @martinp3839

    9 months ago

    I am new to JavaScript and TypeScript. When I cloned the repo and tried executing "npm run dev" in "HuggingFace/apps/embeddings-huggingface/src", I got this error: "TypeError [ERR_UNKNOWN_FILE_EXTENSION]: Unknown file extension ".ts"". I've hit a roadblock. Kindly advise how I can fix this.

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    9 months ago

    Sounds like you may be running the code as JavaScript (TypeScript has a compile step). The easiest way to get off the ground is probably to install ts-node and run your TS files through that.

  • @pranjalagnihotri6072
    @pranjalagnihotri6072 10 months ago

    Thank you so, so much for this banger of a video. A few days ago I was working on a side project (inspired by a Wesbos talk) to find the similarity between a tweet's replies and group them together. I first used bag-of-words to generate word vectors and then used OpenAI embeddings, only to find out I needed to set up a paid account, so I had to pause my side project. Right after this I am going to complete my project using Hugging Face models. Thank you so much. I explored HF but was not able to gather the details you explained in this video❤

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Awesome! Glad it was helpful 😃

  • @archiee1337
    @archiee1337 8 months ago

    Great video, thank you😊

  • @Muennighoff
    @Muennighoff 10 months ago

    Amazing video! Will work on adding a ranking for the individual MTEB task tabs. Also we're working on adding more tasks (e.g. code embeddings) & other models to the leaderboard! 🤗

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    That’s amazing Niklas, thanks for the update!

  • @cklim78

    @cklim78

    9 months ago

    Great! Look forward to the updates especially on the code embeddings task, may I know any targeted date when the updates will be released?

  • @thenextension9160
    @thenextension9160 7 months ago

    Fantastic thanks

  • @browland601
    @browland601 9 months ago

    Love the level of detail here! I'm working on an idea to do text classification using sentence embeddings as input features and then doing logistic regression. If the labels (binary classifier for now) correspond well with the embeddings generated from training examples, then my hope is the logistic regression model will work with inputs not seen before but close semantically. One of the considerations is which embedding model to use, so lots of food for thought here!

  • @theepicosityofpizza

    @theepicosityofpizza

    7 months ago

    This is how they tested classification performance in the benchmark mentioned, so you're on the right track

  • @omijmangukiya8984

    @omijmangukiya8984

    6 months ago

    I suggest using SetFit
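
The classification idea raised in the thread above (sentence embeddings as input features to a binary logistic regression) can be sketched roughly as follows. This is an illustrative toy, not the benchmark's or SetFit's implementation: the 2-dimensional "embeddings" are made-up values, and `Example`, `trainLogisticRegression`, and `predictProbability` are hypothetical names. Real sentence embeddings would have hundreds of dimensions and come from an actual model.

```typescript
// Toy sketch: binary logistic regression trained with batch gradient descent
// on made-up 2-dimensional "embeddings". All names here are hypothetical.
type Example = { embedding: number[]; label: 0 | 1 };

function sigmoid(z: number): number {
  return 1 / (1 + Math.exp(-z));
}

// Probability that the input belongs to class 1.
function predictProbability(weights: number[], bias: number, x: number[]): number {
  let z = bias;
  for (let i = 0; i < x.length; i++) z += weights[i] * x[i];
  return sigmoid(z);
}

function trainLogisticRegression(
  data: Example[],
  epochs = 2000,
  learningRate = 0.5
): { weights: number[]; bias: number } {
  const dim = data[0].embedding.length;
  const weights = new Array<number>(dim).fill(0);
  let bias = 0;
  for (let epoch = 0; epoch < epochs; epoch++) {
    // Accumulate the gradient of the log loss over the whole training set.
    const gradW = new Array<number>(dim).fill(0);
    let gradB = 0;
    for (const { embedding, label } of data) {
      const error = predictProbability(weights, bias, embedding) - label;
      for (let i = 0; i < dim; i++) gradW[i] += error * embedding[i];
      gradB += error;
    }
    // Step against the averaged gradient.
    for (let i = 0; i < dim; i++) weights[i] -= (learningRate * gradW[i]) / data.length;
    bias -= (learningRate * gradB) / data.length;
  }
  return { weights, bias };
}

// Class 1 examples cluster near [1, 0]; class 0 examples cluster near [0, 1].
const trainingData: Example[] = [
  { embedding: [0.9, 0.1], label: 1 },
  { embedding: [0.8, 0.2], label: 1 },
  { embedding: [0.1, 0.9], label: 0 },
  { embedding: [0.2, 0.8], label: 0 },
];

const model = trainLogisticRegression(trainingData);

// Inputs not seen during training, but close to one of the clusters.
console.log(predictProbability(model.weights, model.bias, [0.85, 0.15]));
console.log(predictProbability(model.weights, model.bias, [0.15, 0.85]));
```

Whether this generalizes to unseen text depends on how well the chosen embedding model separates the two classes, which is the consideration the original comment raises.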

  • @johngrant7197
    @johngrant7197 4 months ago

    I love that you rock Typescript in domains usually reserved for Python.

  • @0f897
    @0f897 8 months ago

    Outstanding content 😀

  • @rikvermeer1325
    @rikvermeer1325 10 months ago

    Hmmz... finally a good overview on embeddings that is in depth AND understandable. I will watch this multiple times

  • @raphauy
    @raphauy 3 months ago

    Thank you very much!

  • @jvandenaardweg
    @jvandenaardweg 8 months ago

    Excellent video! ❤

  • @jdkh89
    @jdkh89 10 months ago

    Wow I learned so much from this. Thank you so much!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Glad it was helpful!

  • @aksh1618
    @aksh1618 10 months ago

    Another great one! I'm really hooked on the high quality content you're producing! On a related note, which one of these open source embedding models would you consider using for something like ClippyGPT, and are you actually planning to switch to one of these?

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Thanks for watching! Great question - we will definitely consider it. Will have to do a deeper dive, but ideally a model that performs well with retrieval. Also need to consider practicalities around deployment and operation, so usually smaller model = better.

  • @grant_vine
    @grant_vine 7 months ago

    Best hour and 24 mins spent this week 😊

  • @shinchima
    @shinchima 5 months ago

    just the ticket - cheers for uploading

  • @luis96xd
    @luis96xd 10 months ago

    Fantastic video, everything was well explained! Thanks. Wow, it was an excellent solution to get an embeddings array of length 1 by forking the model and adding the Sentence Transformers label. You really research a lot before making a video, great job! 👏💯 I learned a lot about programming, types, model parameters, and outputs

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Glad it helped!

  • @KelvinWKiger
    @KelvinWKiger 7 months ago

    Olala! Thank you, thank you... and thank you very much. Take care 🍀

  • @ninjagaskin
    @ninjagaskin 10 months ago

    This beats watching Netflix. Thank you for such a well-put-together tutorial/lecture on embeddings 101. I would really value your advice (and this community too) on a question I have for a personal project: I am attempting to build a pdf analyzer for financial reports - 10ks, Annual Reports, Financial Statements, etc... What is the best metric / category you recommend I use for my embedding model and LLM too? the goal is to allow the upload of any pdf, embed the pdf, store in a vector db, and then query some question answering along with memory capabilities. - new subscriber

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Great use case! I would focus on a retrieval based model which is best suited for question-answer style matching. In terms of which one - I think you will just need to try some of them and test to see if your queries are matching with the content you expect. Good luck (thanks for the sub)!

  • @shtrumj
    @shtrumj 2 months ago

    please more content , more depth, loved it, subscribed.

  • @JunYamog
    @JunYamog 5 months ago

    Thanks, I made it to the end of the rabbit hole. Interesting to learn that you can do embeddings in a browser.

  • @artem-yw8km
    @artem-yw8km 7 months ago

    Cool, really improved inference results

  • @Swanidhi
    @Swanidhi 9 months ago

    Love the rabbit hole content! Subscribed!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    9 months ago

    Thanks for the sub!

  • @nelzonmusic
    @nelzonmusic 3 months ago

    really great video!

  • @aliandiazperez7602
    @aliandiazperez7602 9 months ago

    Excellent video tutorial and teaching skills!!!!!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    9 months ago

    Thank you! Cheers!

  • @afterlife320
    @afterlife320 6 months ago

    Amazing video 👌🏻🙏🏻

  • @daryladhityahenry
    @daryladhityahenry 8 months ago

    Hi! This is crazy good content. I learned so much! Thank you so much. I have a question though: I'm at minute 39:00, where you talk about embedding dimension size. Is a lower dimension size more likely to leave some words without a vector? I mean, maybe a word like "eggyolk" doesn't get registered because the embedding IDs are already full, so it can't get a vector value for that word? If yes, that is really a trade-off we need to choose carefully, right? Or, if we just use normal vocabulary, should a small dimension be okay? Thanks!

  • @vitoralves5934
    @vitoralves5934 6 months ago

    Man ur just a natural teacher, thank you!

  • @JustCode39
    @JustCode39 4 months ago

    this video is amazingly good

  • @manishbabbar9969
    @manishbabbar9969 10 months ago

    Really liked your content. Thanks for sharing such engaging information.

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Thanks, glad it helped!

  • @gabrielmalek7575
    @gabrielmalek7575 8 months ago

    that was awesome

  • @zeelthumar
    @zeelthumar 7 months ago

    I have never watched a full tutorial that contains JavaScript because I'm a Python guy, but for the first time I have gone through the whole tutorial, and obviously I subscribed. You are just awesome. Keep it up!

  • @LWC
    @LWC 10 months ago

    Really great video. Thank you for it!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    You bet!

  • @danjsy
    @danjsy 9 months ago

    Great video, thank you. An excellent channel. For corporate use cases where they are predominantly Microsoft houses, for retrieval use cases can you see past the new OpenAI implementations? Thanks!

  • @user-vl4sg2vj8b
    @user-vl4sg2vj8b 10 months ago

    amazing content man. hats off👏👏

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    🙏 Thanks for watching!

  • @snsa_kscc
    @snsa_kscc 10 months ago

    Another banger, bro! You are on 🔥. I love the fact that the ML ecosystem is morphing with TS/JS web dev. Massive thanks, this was a joy to watch. Have a nice one.

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Appreciate it & thanks for watching! Very excited to see the new possibilities in the TS/JS ecosystem.

  • @cesarsanchez5576
    @cesarsanchez5576 10 months ago

    I've watched just two of your videos and I feel like I've learnt 6 months of ML Engineering. Many many thanks

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    Glad to help!

  • @sheikhakbar2067
    @sheikhakbar2067 3 months ago

    It's almost 3:00 in the morning, I am going to sleep and watch this excellent video tomorrow!

  • @JoEl-jx7dm

    @JoEl-jx7dm

    2 months ago

    Same but different day

  • @natural2
    @natural2 7 months ago

    Wow thanks I learned a lot!!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    7 months ago

    Glad to hear it!

  • @lovol2
    @lovol2 8 months ago

    Thanks

  • @lucamiamiflorida5065
    @lucamiamiflorida5065 9 months ago

    best video on the topic, thanks!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    9 months ago

    Glad it was helpful!

  • @RolandAyala
    @RolandAyala 4 months ago

    Top tier content! Super informative & practical, presented in an approachable format. Thanks so much! Sub'd.

  • @alvarobyrne
    @alvarobyrne 10 months ago

    well done, and thanks for elaborating on various topics, even though the a ha moment has to wait a bit... ooooohm attitude

  • @huydh
    @huydh 4 months ago

    I appreciate so much that you made a video about this as I’m learning. But more than 90 mins was really long and I had to skip over some detailed explanations to get to see the demo 😂 So if anything, I’d hope you did it in 2 parts, one video for just the demo of these new technologies (what they can do for me) and another going into details behind everything. That’s my 2 cents… Loved the content 🎉😊

  • @neoblackcyptron
    @neoblackcyptron 7 months ago

    I'm glad I found this channel. It's very rare to come across super knowledgeable people like the owner of this YouTube channel in the AI field who really know their stuff. I am planning to start an AI consulting business, and cutting costs, especially token costs, is a big deal for me, so this really helped. By the way, I heard AI consultants who know their stuff (ML, genetic algorithms, robotics, computer vision, and LLMs with some RPA thrown in) can charge $200/hour. Is that a fair rate, or am I underselling my skills? I don't want to ruin it for the rest of my peers by quoting too low.

  • @1242elena
    @1242elena 10 months ago

    amazing thank you!

  • @RabbitHoleSyndrome

    @RabbitHoleSyndrome

    10 months ago

    You bet!

  • @IgnoreMyChan
    @IgnoreMyChan 9 months ago

    I love your 2010-like editing with all the micro-cuts making it very exhausting to watch and listen to. 😂

  • @arthurperini
    @arthurperini 8 months ago

    Thank you! Great and didactic information here, sir! Instead of using the similarity of two phrases, I am going to change this example to retrieve information from a PDF and inject it into gpt-3.5-turbo. Is that possible? Thank you
