Superfast RAG with Llama 3 and Groq

Science & Technology

The Groq API provides access to Language Processing Units (LPUs) that enable incredibly fast LLM inference. The service offers several LLMs, including Meta's Llama 3. In this video, we'll implement a RAG pipeline using Llama 3 70B via Groq, an open-source e5 encoder, and the Pinecone vector database.
📌 Code:
github.com/pinecone-io/exampl...
🌲 Subscribe for Latest Articles and Videos:
www.pinecone.io/newsletter-si...
👋🏼 AI Consulting:
aurelio.ai
👾 Discord:
/ discord
Twitter: / jamescalam
LinkedIn: / jamescalam
#artificialintelligence #llama3 #groq
00:00 Groq and Llama 3 for RAG
00:37 Llama 3 in Python
04:25 Initializing e5 for Embeddings
05:56 Using Pinecone for RAG
07:24 Why We Concatenate Title and Content
10:15 Testing RAG Retrieval Performance
11:28 Initialize connection to Groq API
12:24 Generating RAG Answers with Llama 3 70B
14:37 Final Points on Why Groq Matters
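The pipeline described above (e5 for embeddings, Pinecone for retrieval, Llama 3 70B on Groq for generation) can be sketched roughly as below. The index name, prompt template, and model IDs (`intfloat/e5-base-v2`, `llama3-70b-8192`) are illustrative assumptions, not taken from the linked notebook, and the `answer` function needs real API keys to run:

```python
def build_prompt(query: str, docs: list[str]) -> str:
    """Pack retrieved chunks into a single context block for the LLM."""
    context = "\n---\n".join(docs)
    return (
        "Answer the question using the context below.\n\n"
        f"CONTEXT:\n{context}\n\n"
        f"QUESTION: {query}"
    )

def answer(query: str, index_name: str = "llama-3-rag") -> str:
    """Retrieve with e5 + Pinecone, then generate with Llama 3 70B on Groq.
    Not executed here: requires real API keys and network access."""
    # Third-party imports kept inside the function so the sketch is
    # importable without these packages installed.
    from groq import Groq
    from pinecone import Pinecone
    from sentence_transformers import SentenceTransformer

    # e5 models expect a "query: " prefix on search queries
    encoder = SentenceTransformer("intfloat/e5-base-v2")
    xq = encoder.encode(f"query: {query}").tolist()

    pc = Pinecone(api_key="PINECONE_API_KEY")
    index = pc.Index(index_name)  # assumed index name
    res = index.query(vector=xq, top_k=3, include_metadata=True)
    docs = [m["metadata"]["text"] for m in res["matches"]]

    client = Groq(api_key="GROQ_API_KEY")
    chat = client.chat.completions.create(
        model="llama3-70b-8192",
        messages=[{"role": "user", "content": build_prompt(query, docs)}],
    )
    return chat.choices[0].message.content
```

The retrieval and generation steps are deliberately kept in one function here for readability; in practice you would initialize the encoder and clients once and reuse them across queries.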

Comments: 17

  • @awakenwithoutcoffee · 9 days ago

    Hi James, Microsoft just open-sourced their GraphRAG technology stack - might be cool to take a look and see how we can leverage/combine the two.

  • @alexjensen990 · 3 days ago

    Nice walkthrough, and I agree that Groq is amazing... Just wish they had other models.

  • @PanduPandu-fh5tk · 2 days ago

    Good work, helped me a lot!

  • @tiagoc9754 · 12 days ago

    A nice thing is that you can use Groq with LangChain as well

  • @jamesbriggs · 12 days ago

    Yes very true

  • @gilbertb99 · 12 days ago

    What are your thoughts on adding a short summary description of the document or paper to each chunk, alongside the title?

  • @jamesbriggs · 11 days ago

    It's a good idea - I haven't tried it before, but it seems sensible. You'd need to find a balance between too much summary, which might "overpower" the meaning of the chunk itself, and getting enough summary in there to be useful - but if you get that balance right, it feels like a great idea imo

  • @content_ai_ · 12 days ago

    You're in Bali, nice! I'm looking for an online job, mate. I'm pretty desperate at this point

  • @jamesbriggs · 12 days ago

    You can tell? But yes, here for a while - just work on AI stuff, get yourself out there a bit, it does take time though

  • @tiagoc9754 · 12 days ago

    Groq is insanely fast

  • @jamesbriggs · 12 days ago

    Yeah it’s wild

  • @tiagoc9754 · 12 days ago

    Is there any OSS embedding model you'd recommend over e5 for real/prod use cases? I've only used OpenAI so far

  • @juanpablomesalopez · 12 days ago

    gte-base and bge-base do well in benchmarks, but you really have to test them on your own use case. You should also fine-tune the embeddings on your use-case data.

  • @jamesbriggs · 12 days ago

    E5 has been good, I like Jina's embedding model, and I've heard good things about BAAI's bge-m3 for hybrid search too

  • @byczong · 11 days ago

    @jamesbriggs maybe in some future video you could cover bge-m3 :)) this model sounds pretty cool (especially the dense/multi-vector/sparse retrieval)

  • @Davorge · 12 days ago

    Is this reusable in such a way that we can switch from calling Groq to calling OpenAI GPT-4o or other models?

  • @jamesbriggs · 12 days ago

    Yeah, it's pretty simple to swap them out - they use a similar (maybe even the same) API
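The swap this reply describes is easy in practice because Groq serves an OpenAI-compatible endpoint, so only the base URL and model name change. A minimal sketch, assuming the provider names, model IDs, and env-var names shown (the config keys are illustrative, not from the video):

```python
def chat_config(provider: str) -> dict:
    """Connection settings for an OpenAI-compatible chat client.
    Groq exposes an OpenAI-compatible API, so swapping providers is
    just a different base_url and model name."""
    if provider == "groq":
        return {"base_url": "https://api.groq.com/openai/v1",
                "model": "llama3-70b-8192",
                "api_key_env": "GROQ_API_KEY"}
    if provider == "openai":
        return {"base_url": "https://api.openai.com/v1",
                "model": "gpt-4o",
                "api_key_env": "OPENAI_API_KEY"}
    raise ValueError(f"unknown provider: {provider}")

# Usage sketch (not executed here - needs a real API key):
# from openai import OpenAI
# cfg = chat_config("groq")
# client = OpenAI(base_url=cfg["base_url"], api_key=os.environ[cfg["api_key_env"]])
# client.chat.completions.create(model=cfg["model"], messages=[...])
```

Because both providers speak the same wire protocol, the rest of the RAG code does not change when you swap providers.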
