Cohere

Welcome to NLP, now.

We’re kickstarting a new chapter in machine learning by giving developers and businesses access to NLP powered by the latest generation of large language models.

Our platform can be used to generate or analyze text to do things like write copy, moderate content, classify data and extract information, all at a massive scale.

Fireside chat: Sebastian Ruder

Comments

  • @yacinehmito6954 · 4 days ago

    Great webinar! I found that a good audience for it is people who already know how to build a RAG-powered tool but want some deeper insights: challenges to look out for & what works vs. what doesn't. Definitely useful, as this kind of info is sparse out there.

  • @Jo_404 · 3 days ago

    100%

  • @mehdipashazadeh9071 · 4 days ago

    This is not an informative video; do not waste your time watching it.

  • @yacinehmito6954 · 4 days ago

    That's odd. I found it very informative.

  • @user-gc9sp7bx5z · 5 days ago

    this was nice.

  • @thisisjohnny · 13 days ago

    I've never heard VAE described with such depth and clarity. Not going to pretend I understood all of the maths, but I nodded along anyhow.

  • @firebolt9701 · 13 days ago

    Thank you, Dr. Saquib, for the great talk.

  • @wdonno · 16 days ago

    Hi, thank you for the presentation. Can this approach be adapted to cluster sentences or paragraphs/short ‘documents’? The notebook examples were of word embeddings.

  • @saquibsarfraz · 12 days ago

    Sure, you can cluster or visualize the sentence embeddings, or any vectorized representation for that matter.

  • @sumanthgajula4065 · 19 days ago

    Great job, Surya Krishna and team! I am happy to see the contributions you are making to the Telugu community in terms of high-end LLMs, a space largely dominated by English datasets. Proud of you!! 💐🙏👏

  • @nguyenvinh2298 · 19 days ago

    I understand word embedding as passing the numerical form of a word through the embedding network to get the word embedding. But is a sentence embedding a combination of word embeddings?
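
    One common answer, for illustration: a sentence embedding can be built by pooling (e.g. averaging) the word embeddings, though dedicated sentence encoders learn a single representation directly. A minimal sketch with made-up toy vectors:

        import numpy as np

        # Hypothetical 4-dimensional word embeddings for "cats like milk",
        # one row per token (values invented purely for illustration).
        token_embeddings = np.array([
            [0.2, 0.8, 0.1, 0.4],  # "cats"
            [0.5, 0.1, 0.9, 0.3],  # "like"
            [0.7, 0.6, 0.2, 0.1],  # "milk"
        ])

        # Mean pooling: average the word vectors into one fixed-size
        # sentence vector, regardless of sentence length.
        sentence_embedding = token_embeddings.mean(axis=0)
        print(sentence_embedding.shape)  # (4,)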

  • @FoxSmith-sn4jf · 20 days ago

    How do I sign up? I get errors when I try to sign up, like 'Failed to fetch'.

  • @steveroy217 · 21 days ago

    Promo_SM 🤗

  • @user-eg8mt4im1i · 21 days ago

    Happy to watch a video that recalls that a vector is an element of a vector space, love maths :)

  • @varunpendharkar6305 · 24 days ago

    Kite is a type of bird ;)

  • @MalamIbnMalam · 26 days ago

    How does one sign up for LLM University?

  • @sharanbabu2001 · a month ago

    nice talk, thanks!

  • @user-er4tg9ve1m · a month ago

    This complex topic is explained beautifully. I recently read an article on this that intrigued me; this is the link to the article: ask.wiki/2024/01/09/semantic-search-an-intelligent-way-for-browsing/

  • @sean_thw · a month ago

    I feel honored to have been a part of this project. It made me believe in myself more. Thank you :)

  • @AllinOne-sc5cq · a month ago

    MaashaaAllah MaashaaAllah

  • @luka7626 · a month ago

    Great video!

  • @rollingstone1784 · a month ago

    @cohere, @jpvalois: Excellent video; however, there are some inaccuracies:
    6:00 - instead of "series of transformer blocks", should it be "series of transformers" only (or: attention block and feedforward block)? The video description says "three transformer layers", and it also says "attention blocks".
    9:00 - the attention and feedforward blocks should run left to right, with the arrows also left to right; this reflects the flow of the data better.
    9:20 - should it be "feedforward layer" instead of only "layer"?
    9:40 - is the first layer not an attention layer? And could an attention layer and a feedforward layer be combined into a transformer layer? (see "transformer blocks" at 6:00)
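
    On the terminology: a transformer block, as usually defined, pairs an attention sub-layer with a feedforward sub-layer. A minimal structural sketch (PyTorch; layer sizes are illustrative, not the ones from the video):

        import torch
        import torch.nn as nn

        class TransformerBlock(nn.Module):
            # One "transformer block" = attention sub-layer + feedforward
            # sub-layer, each wrapped in a residual connection and layer norm.
            def __init__(self, d_model=512, n_heads=8, d_ff=2048):
                super().__init__()
                self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
                self.ff = nn.Sequential(
                    nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
                )
                self.norm1 = nn.LayerNorm(d_model)
                self.norm2 = nn.LayerNorm(d_model)

            def forward(self, x):
                attn_out, _ = self.attn(x, x, x)
                x = self.norm1(x + attn_out)       # residual + norm after attention
                return self.norm2(x + self.ff(x))  # residual + norm after feedforward

        # A "series of transformer blocks" is then just a stack of these:
        model = nn.Sequential(*[TransformerBlock() for _ in range(3)])
        out = model(torch.randn(1, 10, 512))  # (batch, tokens, d_model)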

  • @Andromeda26_ · a month ago

    Good Job Marzieh! Keep up the great work!

  • @PolymetricMonogon · a month ago

    Where can I learn all of this BERTopic material as a mathematical procedure rather than a computational one?

  • @rodneypantony3551 · a month ago

    No idea if you're interested in an AI Climate Change Foundation Model, but here goes... Action required for an AI Climate Change Foundation Model: include ALL DATA from the attached wildfires paper, one of thousands of relevant papers. Need to interrogate an unbiased database of all relevant data going back at least 10,000 years. Need to know where the water has been and where it's going. Need to know if 65% of Russia is permafrost, the mathematics of the melt rate (IT'S EXPONENTIAL, NOT LINEAR, CUZ IT'S BIOLOGICAL, AND GOING FROM FLAT TO VERTICAL, and we're next), and where the Russian permafrost meltwater is going. Separate email on Canada's emergency lack of preparedness and palpable incompetence.

    Nota bene: the hardware indicated for an AI Climate Change Foundation Model is global in scale, everything from undersea fiber optics to remote weather stations to satellites. Exempli gratia: the permafrost microbiome is biological and multiplies like rabbits... a 2 billion, 4, 8, 16, 32 kind of graph. Gives rise to sudden, catastrophic (biblical) events. 😩 Pairs well with books on bison, papers on ash layers, and near-extinction events in North America. Sadly, our experts lack the requisite expertise and skills; the AI community is indicated. The energy cost of the foundation model and AI hardware is attached. So, to keep costs down, restrict it to 10,000 years of relevant climate change data. The Reuters CEO has billions for an AI project, so that's another source of funding. Nota bene: of hundreds of trillions in investments, much is cash-equivalent and can and does shift in milliseconds away from endangered investments. Kindly verify or rebut everything in this

  • @anirudhnarayan7104 · a month ago

    Good to see you here Swabha! Long time :)

  • @raziehfadaei4801 · a month ago

    Does BERTopic need preprocessing like lemmatization, tokenization, and removing stopwords?
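
    As I understand BERTopic's design, heavy preprocessing is usually unnecessary before embedding, since the underlying transformer works on raw text; stopwords are more commonly handled at the topic-representation step instead. A sketch under that assumption (parameter choices illustrative):

        from bertopic import BERTopic
        from sklearn.datasets import fetch_20newsgroups
        from sklearn.feature_extraction.text import CountVectorizer

        # docs: raw, unpreprocessed strings (a sizeable corpus).
        docs = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))["data"]

        # Remove English stopwords only when building the topic keywords,
        # not from the documents that get embedded.
        vectorizer_model = CountVectorizer(stop_words="english")
        topic_model = BERTopic(vectorizer_model=vectorizer_model)
        topics, probs = topic_model.fit_transform(docs)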

  • @shafrazbuhary · a month ago

    Let's say two movies have values for Action and Comedy of 0 and 1; the dot product will be 1. Let's say another two movies both have values of 0 and 2; they will get a dot product of 4, and we would conclude the second pair is more similar than the first. But that is not the case in reality. Could you please explain this?
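
    Illustrating the arithmetic in that question, with cosine similarity shown as the usual way to remove the magnitude effect (the feature values are the made-up ones above):

        import numpy as np

        a1 = np.array([0.0, 1.0])  # first pair of movies: Action=0, Comedy=1
        a2 = np.array([0.0, 1.0])
        b1 = np.array([0.0, 2.0])  # second pair: Action=0, Comedy=2
        b2 = np.array([0.0, 2.0])

        # Raw dot products grow with vector magnitude, not just similarity.
        print(np.dot(a1, a2))  # 1.0
        print(np.dot(b1, b2))  # 4.0

        def cosine(u, v):
            # Dividing by the vector lengths keeps only the angle between
            # them, so scale no longer inflates the score.
            return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

        print(cosine(a1, a2))  # 1.0 -- both pairs equally similar by angle
        print(cosine(b1, b2))  # 1.0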

  • @nafez · 2 months ago

    Great stuff as always from Jay

  • @rajivparikh4643 · 2 months ago

    Multilingual AI LLMs trained responsibly enable greater creativity, learning, and collaboration. Scientists can research across borders and cultures, enabling new discoveries. Isolated communities can gain empathy for previously distant cultures. Kudos to the Cohere AI team and all the worldwide collaborators on this human interest project.

  • @sean_thw · 2 months ago

    this is the best thing that came out of 2023!

  • @DarwinSantos · 2 months ago

    First, congratulations on the launch! Second, how well does it perform in translation tasks between Spanish and English, for example?

  • @jpvalois · 2 months ago

    Thanks Luis! Your explanation hits just the right notes for me: no fluff, not too complex, well structured, logical, good rhythm. Excellent overall. I'll be checking out your other material. Merci beaucoup!

  • @sivalokesh3997 · 2 months ago

    Thank you very much.

  • @ChrisSMurphy1 · 3 months ago

    Thanks for sending me here, Jay.

  • @muhammadal-qurishi252 · 3 months ago

    Very wonderful and informative interview. Thanks, Jay and Omar!

  • @mobime6682 · 3 months ago

    you're such a great educator

  • @irshviralvideo · 3 months ago

    Scale AI is way better than this

  • @BizzInnovate · 3 months ago

    Excellent Video

  • @BR-hi6yt · 3 months ago

    Thanks - great talk. The problem with MI is its complexity. When each token has a thousand-dimensional vector, with linear transforms on all of them, it becomes too complex for our normal human brains.

    I would like to use a cut-down language, say just subject, verb, object, in present tense, with no adverbs, conjunctions, or even periods. Then limit the embeddings to, say, 5 dimensions per token instead of hundreds or thousands. I bet the human brain only uses 4 or 5 for each word. Then train the toy model on very, very simple statements to "understand", with as few layers as possible. Then collect neuron data whilst changing the activation function, the softmax function, and more. It's better when the model can only "say" ten or twenty things where it understands the syntactic logic rather than just memorising stuff. THEN observe what the neurons are doing - smaller and less complex. These LLMs are ridiculously complex for observing neurons.

    I am trying to make a very, very small language syntax but need some tips. There is probably a simple way of doing it but I haven't found it yet. Maybe just simple additions, or what else is possible?

  • @cat-asd · 3 months ago

    Thank you!

  • @vrddhicapital2176 · 3 months ago

    There's a limit to overacting, Arjun. And bro, it won't get you anywhere, no matter how much you wave your hands around.

  • @sandeepdey89 · 3 months ago

    Silly question here, Maarten. Can we use BERTopic in R? Any workaround or emulation would be most welcome. TIA

  • @germank7924 · 4 months ago

    Host needs to fix the mic at least, if not the whole "vlog" setup.

  • @jasonswift7468 · 4 months ago

    Thanks to Rosanne Liu. Your story inspires me a lot.

  • @syednabi1778 · 4 months ago

    Proud of you, our daughter. May the Almighty bless you in every step.

  • @rashmitrathod6873 · 4 months ago

    Very precise and nicely demonstrated with easy examples. Thanks for such a wonderful explanation!

  • @cyrilgorrieri · 4 months ago

    Hi Jay, thanks for the video. I have been doing something similar, but I have faced a few issues. The first is that KMeans doesn't seem to work so well for this use case, as it requires a predefined number of clusters, and I was struggling to find the optimal number. So I used HDBSCAN, which can have a dynamic number of clusters, but it doesn't do well with high-dimensional vectors. I ended up first doing the dimensionality reduction via UMAP and then running HDBSCAN. It gives somewhat good results, but I also have to play with the hyperparameters to find the most optimal result. Since that video, do you have any learnings around using other clustering algorithms for embeddings?

  • @cyrilgorrieri · 4 months ago

    I wrote that before listening to the Q&A part of the video 😅. The third option you mentioned works quite well: use UMAP to reduce to 15-20 dimensions, run HDBSCAN for the clustering, and finally re-run UMAP to plot. That really unblocked me and led to better results.
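
    A minimal sketch of that pipeline, assuming `embeddings` stands in for a real (n_docs, n_dims) array of document embeddings (parameter values are illustrative, not tuned):

        import numpy as np
        import umap
        import hdbscan

        # Stand-in for real document embeddings.
        embeddings = np.random.rand(500, 384)

        # Step 1: reduce to ~15 dimensions for clustering, since
        # HDBSCAN struggles in very high-dimensional spaces.
        reduced = umap.UMAP(n_components=15, metric="cosine").fit_transform(embeddings)

        # Step 2: cluster with HDBSCAN, which chooses the number of
        # clusters itself (unlike KMeans) and marks noise as -1.
        labels = hdbscan.HDBSCAN(min_cluster_size=10).fit_predict(reduced)

        # Step 3: re-run UMAP down to 2 dimensions purely for plotting.
        coords_2d = umap.UMAP(n_components=2, metric="cosine").fit_transform(embeddings)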

  • @NahomGetachew-wp9bj · 4 months ago

    I admire Vered Shwartz's work, and she is an inspiration for my pursuit and exploration of this field. Thank you!

  • @concaption · 4 months ago

    Hi Arjun and Sonam.

  • @xvaruunx · 4 months ago

    ✨✨✨✨

  • @andrew-does-marketing · 4 months ago

    Does Cohere have developers that business owners can hire or work with to implement an embedding database?