RAGAS - Evaluate your LangChain RAG Pipelines

Creating good RAG systems is hard. RAGAS lets you change individual parts of your system and then run an automated evaluation to see whether your RAG performance improved.
Code: github.com/Coding-Crashkurse/...
Timestamps:
0:00 Introduction
0:30 RAGAS
9:46 RAGAS with LangFuse

Comments: 26

  • @seallyolme
    3 months ago

    This is awesome! Great and clear video :)

  • @mosheragomaa5544
    1 month ago

    So simple, helpful and clear! Very interesting. Thanks for the video

  • @M10n8
    4 months ago

    Excellent timing ;-) Thanks for the video

  • @Challseus
    4 months ago

    Another banger! :)

  • @andreypetrunin5702
    4 months ago

    Thank you so much for the video!!

  • @user-we3qo9kj4q
    4 months ago

    Bro is on fire this month!

  • @codingcrashcourses8533
    4 months ago

    You guys give me so many requests on topics 😀

  • @user-we3qo9kj4q
    4 months ago

    @codingcrashcourses8533 I was, I am, and I will support you till the end. Your videos helped me so much.

  • @mohammed333suliman
    1 month ago

    Great video, thank you

  • @codingcrashcourses8533
    1 month ago

    Thank you for your comment :)

  • @maxlgemeinderat9202
    4 months ago

    Nice one! I'm also a big fan of RAGAS, but it still comes with many bugs, especially when trying to evaluate with local LLMs.

  • @codingcrashcourses8533
    4 months ago

    Yes, it's still far from perfect, but it's good that frameworks like this are being developed.

  • @nguyenquynghia9755
    1 month ago

    I switched to using RecursiveCharacterTextSplitter, but my context relevance is still low. Do you know why?

  • @GenerativeAI-Guru
    4 months ago

    I was waiting for this, thank you so much. Is it possible to add how to evaluate accuracy using F1 scoring or other methods?

  • @codingcrashcourses8533
    4 months ago

    Not out of the box, but F1 scores can easily be calculated with pandas (to_pandas) like this: F1 = 2*precision*recall/(precision+recall)
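
    The formula in the reply above can be sketched in a few lines. This is a minimal illustration, not the author's code: the rows are made-up numbers standing in for what `to_pandas()` might return, and the column names `context_precision` and `context_recall` are assumptions about which metrics were evaluated.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-question scores (assumed column names, invented values).
rows = [
    {"context_precision": 0.8, "context_recall": 0.5},
    {"context_precision": 1.0, "context_recall": 1.0},
    {"context_precision": 0.0, "context_recall": 0.0},
]

# One F1 value per evaluated question.
f1_per_row = [
    f1_score(r["context_precision"], r["context_recall"]) for r in rows
]
```

    On a real DataFrame the same expression can be applied column-wise; note that plain column arithmetic would yield NaN rather than 0.0 for rows where precision and recall are both zero.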

  • @GenerativeAI-Guru
    4 months ago

    @codingcrashcourses8533 Thanks

  • @maxlgemeinderat9202
    4 months ago

    You could also calculate the RAGAS score, which is the mean across all metrics.
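
    A combined score like the one mentioned above could be computed as follows. This is only a sketch with invented numbers; the metric names are assumptions. Whether "mean" should be the arithmetic or the harmonic mean is also an assumption here (the harmonic mean punishes a single weak metric more strongly), so both are shown.

```python
from statistics import fmean, harmonic_mean

# Hypothetical aggregate scores per metric (assumed names, invented values).
scores = {
    "faithfulness": 0.9,
    "answer_relevancy": 0.8,
    "context_precision": 0.6,
    "context_recall": 0.72,
}

values = list(scores.values())

# Plain average: every metric contributes equally.
arithmetic = fmean(values)

# Harmonic mean: pulled down more by the weakest metric.
harmonic = harmonic_mean(values)
```

    With these numbers the harmonic mean is lower than the arithmetic mean, which is the usual reason to prefer it when one poor metric should not be averaged away.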

  • @robertputneydrake
    4 months ago

    Nice, master! Will you cover code RAG at some point, possibly with knowledge graphs?

  • @codingcrashcourses8533
    4 months ago

    Currently no plans to work with knowledge graphs, since I don't have experience with them. But maybe in the future :)

  • @alexandershevchenko4167
    4 months ago

    Thank you for the video! Yeah, it would be really interesting to know how to run RAGAS in a CI/CD pipeline. Can you record a video on that, please? It would be really helpful.

  • @codingcrashcourses8533
    4 months ago

    Maybe in a few weeks

  • @fire17102
    4 months ago

    Is there an AI pipeline to auto-optimize the RAG quality? Seems like the obvious next step... Great video 🙏👍

  • @codingcrashcourses8533
    4 months ago

    You would probably have to build something like that on your own, since there are so many ways a pipeline could look. You could also work on your prompt and so on.

  • @fire17102
    4 months ago

    @codingcrashcourses8533 I'd always want to manually make the changes I think are best, but I'd still like to see a full matrix of hyperparameters to remove a lot of the guesswork, chunk size for example. Moreover, I'd like to benchmark everything and add scoring functions, for example a score for fact checking (see Lucidate's last video). Also see IndyDevDan's last video, a battle royale of models; I suggested combining it with something like what you do with the RAG params, and with what I suggest for a full pipeline benchmark with AI-suggested optimization.
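
    The "full matrix of hyperparameters" idea from the comment above could look roughly like this. It is a hedged sketch, not an existing tool: `grid_search` and `toy_score` are hypothetical names, and `toy_score` is a placeholder for the real work of rebuilding the index with the given parameters and running an evaluation on it.

```python
from itertools import product


def grid_search(score_fn, grid):
    """Score every parameter combination and return (score, params) pairs, best first."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((score_fn(params), params))
    # Sort on the score only, so the parameter dicts are never compared.
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def toy_score(params):
    # Placeholder scoring function (invented): a real one would rebuild the
    # RAG pipeline with these parameters and return an evaluation score.
    return (
        1.0
        - abs(params["chunk_size"] - 500) / 1000
        - params["chunk_overlap"] / 1000
    )


# Hypothetical search space for the text splitter.
grid = {"chunk_size": [250, 500, 1000], "chunk_overlap": [0, 50]}

ranking = grid_search(toy_score, grid)
best_score, best_params = ranking[0]
```

    Every combination costs a full evaluation run, so in practice the grid has to stay small or the scoring function has to be cheap.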