Talk to your CSV & Excel with LangChain

Science & Technology

Colab: drp.li/nfMZY
In this video, we look at how to use LangChain Agents to query CSV and Excel files. This allows you to have all the searching power of a tool like Pandas but done through natural language using an LLM to help.
My Links:
Twitter - / sam_witteveen
Linkedin - / samwitteveen
Github:
github.com/samwit/langchain-t...
github.com/samwit/llm-tutorials
#LangChain #BuildingAppswithLLMs

Comments: 177

  • @bseddonmusic1
    @bseddonmusic1 1 year ago

    You are producing great content that's showing me how to exploit GPT. Thanks.

  • @mahenderp2017
    @mahenderp2017 11 months ago

    Good article with a workable example. Great work.

  • @kennethleung4487
    @kennethleung4487 1 year ago

    Great stuff Sam. Looks like those legacy Excel spreadsheets with macros and multiple indexes still require plenty of cleaning and preprocessing before we can use any agent on them

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    Yes treating the doc as a spreadsheet/table and not a csv file is actually quite different. The spreadsheet way is being baked into Google Sheets and Excel so I wonder how much of a market there is for an open source system. Would love to hear your opinion.

  • @rickeras
    @rickeras 1 year ago

    A good idea for a new video might be LangFlow, a GUI-based tool for LangChain.

  • @ambresh009
    @ambresh009 20 days ago

    The videos are great. Very helpful. I have a question. After loading the CSV file using CSVLoader, which custom chain/agent can I use? Can you share some insights on that? Please share any reference/notebook if possible.

  • @TienPham-rx6gk
    @TienPham-rx6gk 9 months ago

    Hi Sam, thank you for this great tutorial. If possible, can you also show us how to use Hugging Face models for the CSV agent? Also, do you have any recommendation on which LLMs from Hugging Face are good for this kind of task? Look forward to hearing from you soon.

  • @ibrahim-sf9od

    @ibrahim-sf9od

    6 months ago

    Hi @TienPham-rx6gk, did you find any solution? I am also looking for an open-source pre-trained model that can do this task. Did you find any on Hugging Face?

  • @joelwilson9079
    @joelwilson9079 9 months ago

    Great stuff Sam. Quick question - How do we improve the model if it answers a question incorrectly? Is there a "training" mechanism or reward function to let them know it was incorrect?

  • @samwitteveenai

    @samwitteveenai

    2 months ago

    (Just seeing this now.) Not really. You can fine-tune the LLM for this task, but that isn't a guarantee.

  • @RS-vu5um
    @RS-vu5um 1 year ago

    Great Video. Your sessions are super

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    Thanks, I appreciate that!

  • @kakaraparthiphani9983
    @kakaraparthiphani9983 3 months ago

    Good video. I have a question: you used a dataset where all the columns are integers. What if the columns contain strings or characters?

  • @Aidev7876
    @Aidev7876 5 months ago

    This is exactly what I needed, but can I use something more secure than LangChain? For example, Voiceflow on top of ChatGPT? My customer is very sensitive about data protection. Thanks a lot for answering.

  • @adriangabriel3219
    @adriangabriel3219 1 year ago

    could you make a video on how to correctly use a csv_agent in langchain with alpaca? I have tried the approach you showed with Alpaca and it doesn't seem to produce good results at all, so I would be curious to see how you go about it

  • @jacksheen2574
    @jacksheen2574 1 year ago

    Great video Sam … I had one question - Could you please tell me how to change the agent.agent.llm_chain.prompt.template ? I will be very grateful to you if you can help me out as I am just starting to learn LangChain

  • @stlo0309
    @stlo0309 10 months ago

    Hi Sam. brilliant tutorial for doing exactly what the video title says. I do have a question, what actual LLM does the agent call when we simply say OpenAI(temperature=0) without specifying any model parameter?

  • @sankalpyadav373

    @sankalpyadav373

    10 months ago

    Has the ChatGPT API become paid? It is showing that the limit has been reached. Do you face the same problem?

  • @samwitteveenai

    @samwitteveenai

    9 months ago

    When I recorded that video (a few months ago) I think it was text-davinci-003. It is probably the same, with ChatOpenAI being used for the other OpenAI models.

  • @kevinehsani3358
    @kevinehsani3358 1 year ago

    Thanks for the great video. I think you already have done pandasAI video. Would you recommend using that in place of an agent from langchain?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    Good question. I think the Pandas AI is more if you are using it for personal use but LangChain if making an app for others etc. Can check the prompts from both and see what works best for you and use those as well.

  • @violasong6592
    @violasong6592 1 year ago

    Very nice tutorial! Thanks! I have a question tho, how do we ask questions to multiple csv files? or even multiple csv files + some txt/pdf documents?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    you can have multiple indexes and query each of them.

  • @guilhermeveiga9345
    @guilhermeveiga9345 1 year ago

    Thanksss man, great vid

  • @abbuu_
    @abbuu_ 9 months ago

    hey Sam, great video and content in general, just a quick question, how would you go about adding short term memory to a chain with Dataframe/CSV? The dataframe or csv agents have no parameter for MemoryBuffer. There are ways to read the csv or dataframe using a separate loader, but how do you incorporate it into a chain with an llm, prompt and most importantly, a memory buffer? I am trying to make it remember the questions I asked before (memory in the same chat instance, not historically - e.g. when you correct a question the llm does not understand, "I meant X") Thanks much

  • @generic-youtube-user

    @generic-youtube-user

    9 months ago

    Hey, I am also looking for similar functionality. Did you find anything for it? Apparently we can use the Conversational Memory Buffer, but it seems it doesn't integrate well with this csv_agent.

  • @jayrn4596

    @jayrn4596

    6 months ago

    Hello guys. I am also working on a similar use case. Any solution you guys found?

  • @alperenyuksel7184

    @alperenyuksel7184

    2 months ago

    Hello guys. I am also working on a similar use case. Any solution you guys found?

  • @ibrahim-sf9od
    @ibrahim-sf9od 5 months ago

    Hey hi sam, I have one main question. Is there any open-source model where I can do the same thing ? or is there any open-source even close to doing what you have done here ? maybe I can fine tune and use that.

  • @madhu1987ful
    @madhu1987ful 3 months ago

    Great video. BTW, I could not extract the prompt from the agent using the code specified in this video. It was throwing an error.

  • @RedCloudServices
    @RedCloudServices 1 year ago

    Sam can you make a video showing how to get a reply as a Plotly chart? or a PyVis with networkx graph?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    One of my previous vids showed getting replies as triples, which you can use in NetworkX. Might look at making something more advanced like that.

  • @harryfinn8460
    @harryfinn8460 1 year ago

    Excellent video Sam. I too have a question: let's say I wanted to add to the csv_agent prompt, i.e. tell it how it should handle date periods like "last week" (use today as the end of the period and ignore all future dates). Is there any way to extend the csv_agent, or do you have to write a custom agent?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    You could probably do this just by overwriting the Prompt to add it in there. See how I get the prompt to show what it is and then just assign it to that variable.

  • @harryfinn8460

    @harryfinn8460

    1 year ago

    @samwitteveenai thanks Sam, that's exactly what I did! Appreciate you commenting back mate.

  • @benebento9572

    @benebento9572

    1 year ago

    Me too
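
    The prompt override discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from the video: `BASE_TEMPLATE` is a hypothetical stand-in for whatever template the agent actually ships with, and `add_date_rules` is a helper name invented here. Only the attribute path in the final comment comes from the comments on this page.

```python
from datetime import date

# Hypothetical stand-in for the agent's prompt template; the real text
# depends on the installed LangChain version.
BASE_TEMPLATE = (
    "You are working with a pandas dataframe in Python named `df`.\n"
    "Question: {input}\n"
    "{agent_scratchpad}"
)


def add_date_rules(template: str, today: date) -> str:
    """Prepend date-handling instructions to an existing prompt template."""
    rules = (
        f"Today's date is {today.isoformat()}.\n"
        "When a question mentions a relative period such as 'last week', "
        "use today as the end of the period and ignore rows dated in the future.\n\n"
    )
    # The original placeholders ({input}, {agent_scratchpad}) are left untouched.
    return rules + template


new_template = add_date_rules(BASE_TEMPLATE, date(2023, 5, 1))
print(new_template)

# In the LangChain version used in the video, the result would then be
# assigned back to the attribute mentioned in the comments:
#   agent.agent.llm_chain.prompt.template = new_template
# A later comment reports that newer releases raise AttributeError here.
```

    Prepending the rules rather than rewriting the template keeps the agent's own instructions and placeholders intact, which is probably the least risky way to extend it.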

  • @maanyajain6105
    @maanyajain6105 11 months ago

    Hi Sam, is there any open source LLM that we could use for the same??

  • @joseluisbeltramone599
    @joseluisbeltramone599 9 months ago

    Thanks for the great video, Sam. I was doing analytics on a pandas DF using the LangChain agent and came across the model’s tokens limit. Is there any way to overcome it?

  • @samwitteveenai

    @samwitteveenai

    9 months ago

    you can use the 16k context model for 3.5-turbo which is 4x longer than the normal 3.5 model

  • @joseluisbeltramone599

    @joseluisbeltramone599

    9 months ago

    @samwitteveenai I'll try. Thank you again Sam!

  • @rafaelprudencioleite7291
    @rafaelprudencioleite7291 1 year ago

    Great video! Is there a notebook that shows how to use Alpaca/LLaMA to talk to CSV or any other data file like JSON?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    I made one and it didn't work well out of the box, so I need to finetune an Alpaca to do it. Will try to do that this weekend.

  • @rafaelprudencioleite7291

    @rafaelprudencioleite7291

    1 year ago

    @samwitteveenai thanks so much!

  • @oscarsotelo898
    @oscarsotelo898 1 year ago

    Great work. I had a question, What could be the problem that it only counts 5 records when I have 200?

  • @samwitteveenai

    @samwitteveenai

    11 months ago

    It might be limited to only sending that many back to the LLM, not sure about this as I did it quite a while ago.

  • @The31COD31
    @The31COD31 2 months ago

    In a database of cars would LangChain be able to compare cars with everything about them (brand, series, model, HP, option list, etc) to another to give me a good comparison car for example a Mercedes A-Class to an Audi A3 or something like that. Series and model would an input from myself for which car could compare to what and some it should solve itself by comparing body types etc, but option list is not normalised for different car producers. Would vector embedding be needed for that? Or is a different model a better solution? For example BERT? Would be grateful about a response, thank you.

  • @AkhRamy
    @AkhRamy 8 months ago

    How scalable is this to large data sets, or to databases with multiple tables?

  • @matthew_berman
    @matthew_berman 1 year ago

    Fantastic video, Sam. I’m going to try this but use a pdf instead.

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    I have some chat your docs vids coming, but they keep getting delayed by LLMs getting released every day :D

  • @matthew_berman

    @matthew_berman

    1 year ago

    @samwitteveenai are you just using pure LangChain for it?

  • @damianogarofoli165
    @damianogarofoli165 1 year ago

    Nice video! I have a question though, is it possible replicate the code or the idea using a different LLM like Bloom, OPT or GPTNeoX?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    Yes, but it won't work with the standard version of those models because they don't do well with these tasks. I did one No OpenAI vid and I plan another later this week, looking at what models can do what etc.

  • @vikkasgoel2465
    @vikkasgoel2465 1 year ago

    Hi Sam, great and very helpful video, thanks. I have a question. My CSV has many columns, and there is another CSV that contains the definition of each column. How do I handle such a case and still be able to ask questions on the CSV? Vikkas

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    You could try feeding that info in via the prompt. Just try to keep it concise.

  • @surajkhan5834
    @surajkhan5834 1 year ago

    Can you please tell me how we can use Pinecone with this to store memory?

  • @BerwinSingh
    @BerwinSingh 4 months ago

    Hey sam, Great video! Can i achieve the same using Mistral or Llama 2?

  • @samwitteveenai

    @samwitteveenai

    4 months ago

    with some of the finetunes of Mistral you should be able to get some ok results.

  • @BerwinSingh

    @BerwinSingh

    4 months ago

    @samwitteveenai thanks. Will try it out

  • @FFF0007
    @FFF0007 1 year ago

    Awesome content! Simple and effective. Congrats :) Small question: is it possible to use an alternative to OpenAI for this task? Some LLM providers such as SelfHostedPipeline or SelfHostedHuggingFaceLLM? Thanks in advance.

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    Yes you can, but often models like Alpaca etc. weren't trained on instructions that allow this to work, so it would need finetuning.

  • @FFF0007

    @FFF0007

    1 year ago

    @samwitteveenai great to know, thanks. I am going to watch your finetuning video first :)

  • @taminem6509

    @taminem6509

    1 year ago

    Hi @samwitteveenai, do you have any tips/links on how to build an instruction dataset from CSV tables to finetune LLMs like Alpaca?

  • @1MinuteFlipDoc
    @1MinuteFlipDoc 1 year ago

    i was not aware of this -- cool! Welcome to LangChain LangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an API, but will also:

  • @Player-oz2nk
    @Player-oz2nk 4 months ago

    🎯 Key Takeaways for quick navigation:
    00:00 🗂️ Introduction to LangChain for querying CSV and Excel files - Overview of using LangChain with OpenAI models to extract data from CSV and Excel files.
    01:25 🔒 Security considerations for CSV agent - The CSV agent runs a Python agent under the hood, caution advised for prompt injection attacks.
    02:22 🛠️ Setting up the CSV agent with OpenAI language model - How to create a CSV agent and configure it to minimize hallucination by setting the temperature to zero.
    03:48 📊 Understanding the CSV agent's prompt and scratch pad - Explanation of the CSV agent's prompt structure and the use of a scratch pad for iterative language model calls.
    05:14 🤔 Asking the CSV agent simple and complex questions - Demonstrating the CSV agent's ability to answer simple queries like row counts and more complex ones involving data filtering.
    07:32 🔄 Using LangChain with Excel files and custom agents - Converting Excel files to CSV for use with LangChain and the possibility of creating custom agents for specific tasks.
    09:22 🎓 Conclusion and practical applications of LangChain - Summarizing the capabilities of LangChain for non-technical users to query data and the invitation for feedback and subscription.
    Made with HARPA AI

  • @benebento9572
    @benebento9572 1 year ago

    Hello Sam, when will you make a video about reading csv, pdf or txt data using free LLMs? It would be interesting to learn using alternatives to chatgpt/openai.

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    They need to be fine tuned or find prompts that can get them to stay consistent. most will not work for tools etc.

  • @Freedomwithfinance-cha
    @Freedomwithfinance-cha 9 months ago

    Hi @Sam - One more question: Can i refine the prompt of the agent?

  • @samwitteveenai

    @samwitteveenai

    9 months ago

    Yes all the prompts you can change and should tune depending on the model you are using.

  • @angelo3108
    @angelo3108 1 year ago

    This is wonderful. How long would creating this app take? you made it look easy!

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    writing the backend is not that complicated if you look at the Colab code I provided.

  • @angelo3108

    @angelo3108

    1 year ago

    @samwitteveenai thank you so much. Will check it out and get back to you. Again, thanks for sharing your knowledge.

  • @angelo3108

    @angelo3108

    1 year ago

    Hi Sam.. Are you available for consultation?

  • @shuntianli9651
    @shuntianli9651 9 months ago

    what is the strategy for handling large amount of csv file, for example: over 800K

  • @rickmoni4598
    @rickmoni4598 10 months ago

    Is it possible to use Matplotlib or Seaborn to display data visualizations as additional output after we query the data? Do you think this would work?

  • @samwitteveenai

    @samwitteveenai

    10 months ago

    Yeah possibly better to try doing it as a custom tool with an OpenAI Function

  • @user-ie3by1dv6n
    @user-ie3by1dv6n 1 year ago

    Thank you for your informative video. I have a question for you. I followed your method to conduct queries and responses for the product information in my online store's csv file. However, it consumed too many tokens for just a few questions, as shown below: text-davinci, 17 requests - 42,525 prompt + 2,142 completion = 44,667 tokens. I'm wondering if converting the csv file into embedded vector values could reduce the number of tokens used in queries. I'd like to know your opinion on what can be done when the tokens used for queries and responses are excessively high.

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    Interesting, what types of queries were you doing? If it was things like listing all the products etc. and that was more than 4k tokens, then yes, you will have an issue; if it was just getting Pandas queries, it shouldn't have that kind of issue. You are right, you could use a vector store and do it that way. I have a few videos showing things like that coming out soon.

  • @adolforangel1045

    @adolforangel1045

    1 year ago

    Hey 구본천, great questions. What have you found to be best for an optimal token consumption? I started using embeddings for questions but then got to know agents and started using them. Using this agent method and asking 5 questions on a 15,000 rows table, the consumption was $0.14 USD; not that optimal. Appreciate your feedback! And thanks Sam Witteveen for such great content!

  • @sarveswarnaidu717

    @sarveswarnaidu717

    1 year ago

    Looking for this solution @samwitteveenai any documentation to achieve this?

  • @PaulBenthamcom

    @PaulBenthamcom

    1 year ago

    @samwitteveenai Sam, could you point me in the direction of your videos using a vector store with the pandas agent? Or indicate when you might have some videos out on it? I'm currently comfortable with the Pandas agent and adjusting the prompt but it gets expensive!

  • @bourbe
    @bourbe 7 months ago

    Hello, I am wondering about something: when we use a CSV agent, we don't need to use embeddings, a vector database, or memory? I am currently confused.

  • @micbab-vg2mu
    @micbab-vg2mu 1 year ago

    Thank you :)

  • @MeanGeneHacks
    @MeanGeneHacks 1 year ago

    What would be cool would be if we could visualize the data using matplotlib

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    this is an interesting direction a few people have mentioned and since I suck at writing Matplotlib code I probably will look into it :D

  • @Arocksum
    @Arocksum 1 year ago

    What is the name of the OpenAI model you used inside this video ?

  • @user-sg9zq9mm3d
    @user-sg9zq9mm3d 1 year ago

    Sam any idea how to have this on multiple csv files

  • @KatharinaBuhrke
    @KatharinaBuhrke 1 month ago

    Is there a code that can put the text back into the same excel file? I mean so that the excel loader from langchain does not forget about the formatting but can put it into the same excel format with even additional, AI generated content Thanks in advance for your tips!!!

  • @kunalmundada8754
    @kunalmundada8754 1 year ago

    I approached this slightly differently by converting CSV/Excel files into SQL tables (named after the CSV), then using the SQL agent instead of the CSV agent, as GPT is well trained for SQL queries. There is one downside: the SQL tables do not have the correct schema for the columns. Do you see any other issues arising out of it?

  • @clray123

    @clray123

    1 year ago

    What do you mean it does not have the correct schema? All SQL columns have names and data types.

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    I think the key thing with all of these is to experiment and see what works best for your own situation. I may make a video of the SQL Agent as well, it is also very cool.

  • @JesseDahirKanehl

    @JesseDahirKanehl

    1 year ago

    I would love to do this as well since I'm well versed in SQL and all our data is in SQL server. It would be nice to use Wolfram alpha or JavaScript libraries to generate charts or nice looking tables if the user of our chat bot wants it

  • @njokedestay7704

    @njokedestay7704

    1 year ago

    @samwitteveenai I'll be waiting for that 👍👍👍

  • @angelo3108

    @angelo3108

    1 year ago

    This is wonderful idea. How long would creating this take?
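
    The CSV-to-SQL approach described at the top of this thread can be sketched with the Python standard library alone. This is only an illustration under assumptions: the sample data, the table name `products`, and the helper `load_csv_into_sqlite` are made up here, and no LangChain SQL agent is involved. It also shows the schema downside mentioned above, since every value ends up stored as text.

```python
import csv
import io
import sqlite3

# Made-up sample data standing in for a real CSV file.
SAMPLE_CSV = """name,category,price
apple,food,1.5
bread,food,2.5
pen,office,1.0
"""


def load_csv_into_sqlite(conn, table, csv_text):
    """Create `table` from CSV text, using the header row as column names."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    # No column types are declared, so SQLite stores every value as text.
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)


conn = sqlite3.connect(":memory:")
load_csv_into_sqlite(conn, "products", SAMPLE_CSV)

# The kind of query an LLM might be asked to write; CAST is needed because
# the prices were loaded as text.
rows = conn.execute(
    'SELECT category, AVG(CAST(price AS REAL)) FROM "products" '
    "GROUP BY category ORDER BY category"
).fetchall()
print(rows)
```

    Declaring real column types when creating the table (or inferring them first) would avoid the `CAST`, at the cost of a little more setup per file.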

  • @clray123
    @clray123 1 year ago

    Would it be capable of doing (complex) joins between SQL tables to answer arbitrary predicate logic questions using a database?

  • @kyoungd

    @kyoungd

    1 year ago

    Probably not, but give it a few years. Scary.

  • @thebirdhasbeencharged

    @thebirdhasbeencharged

    1 year ago

    To some extent if you make it aware of the tables, I've had more luck with text2sql

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    you can use the SQL Agent for that so you get SQL queries and not pandas etc. I might make a vid of that soon.

  • @boopalanm5206
    @boopalanm5206 7 months ago

    Hi Sam, I had one question: how can we chat with an .xlsx or .xls file?

  • @supriyodey2461
    @supriyodey2461 6 months ago

    How do we add past conversations as memory to the agent?

  • @NileshKumarPandey-vr7pw
    @NileshKumarPandey-vr7pw 22 days ago

    how to persist that csv in vector db and get similar kind of response ? please help.

  • @sanakmukherjee3929
    @sanakmukherjee3929 1 year ago

    Nice explanation. Can you help me add this to a custom csv dataset.

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    custom csv should work just fine.

  • @sanakmukherjee3929

    @sanakmukherjee3929

    1 year ago

    @samwitteveenai yes, I found that, but how do I access ConversationBufferMemory with it?

  • @kartikeychouhan1738
    @kartikeychouhan1738 11 months ago

    Can we use other language models like LLaMA or Alpaca for reading CSV like this?

  • @samwitteveenai

    @samwitteveenai

    11 months ago

    most don't have enough reasoning for doing that.

  • @rohiniayyalraj7532
    @rohiniayyalraj7532 1 month ago

    What if the Excel file has multiple sheets? Will it work?

  • @user-yg6fr6jy3d
    @user-yg6fr6jy3d 11 months ago

    Hey I ran into an issue which I found quite weird. create_csv_agent worked for me as in the video, but then suddenly I started getting an error while running the same code as before on the same file. The error was a token limit error. its only a 157 row csv file and again, it worked before on the same file, but suddenly even upon restarting kernel and reloading everything, it will not query because of this error. Anyone ran into this weird issue?

  • @westonbeck9436

    @westonbeck9436

    10 months ago

    I have this issue as well but I have not been able to resolve it. Did you ever find a solution?

  • @souviksen7286
    @souviksen7286 2 months ago

    Sam, really great demonstration on langchain CSV agents but I am getting the error OutputParserException while running the code in notebook in vs code to chat with my csv file not containing huge data only 1 sheet of 22 rows using langchain create_csv_agent, AzureOpenAI. How can I solve this error, Sam could you or anyone out there please give me the solution for this issue with detailed explanation? Please revert to me for more details on this. Thanks.

  • @samwitteveenai

    @samwitteveenai

    2 months ago

    they have updated LangChain so the code on this is about 1 year old unfortunately. I will try to make a new version of the video soon.

  • @Passe1811
    @Passe1811 1 year ago

    Could it be that the CSV agent always summarizes the text? I have this "Comment" field in my CSV, and when I asked for the value of that field in one of the rows, it returned a summary of that comment, not the comment itself 🤔. The original comment: The products arrived in good condition, but the delivery was delayed more than expected and the customer service did not provide me with a clear solution regarding the matter. The value returned by the agent: The products arrived in good condition, but the delivery was very slow.

  • @p.v.chaitanya655
    @p.v.chaitanya655 10 months ago

    which LLM are you using? gpt3.5 or gpt4

  • @samwitteveenai

    @samwitteveenai

    10 months ago

    from memory that was davinci 3 or 3.5. the code should show it.

  • @techieinside1277
    @techieinside1277 1 year ago

    could you make a video on using langchain and llama to connect llama to the internet? maybe using alpaca13b or alpaca7b?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    I am looking into this, the challenge is to do it well LLaMa needs to be trained on a unique dataset. Still working on it

  • @techieinside1277

    @techieinside1277

    1 year ago

    @samwitteveenai I see. What about something like Vicuna?

  • @kaustubhshingana
    @kaustubhshingana 10 months ago

    How can we load multiple files ?

  • @I_Lemaire
    @I_Lemaire 1 year ago

    This can affect the jobs of many data workers and analysts. How can they best protect themselves?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    I think, like in many areas, the need for people with a surface amount of knowledge may decline, but there will still be a need for people with deep knowledge.

  • @dasman9187

    @dasman9187

    1 year ago

    @samwitteveenai How deep though? Didn't GPT-4 just pass a medical licensing exam with flying colors? I think you could potentially pivot into areas that have to do with AI, because undoubtedly many new jobs will be created from this. Many people will be left behind though.

  • @ranausman143
    @ranausman143 1 year ago

    Does it send/upload your CSV data somewhere? I specifically want to know about data privacy.

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    Not your full file, but if you use OpenAI like this then some of the data will be included in the prompt.

  • @fiellin
    @fiellin 1 year ago

    any idea to process multiple csv/excel data on it?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    you could run it multiple times and then merge the outputs to a summary chain. This would require making a custom agent etc.

  • @xjp
    @xjp 1 year ago

    Possible for the agent to query data from 2 csv files instead?

  • @samwitteveenai

    @samwitteveenai

    11 months ago

    yes but will need to change some of the internal code

  • @glansingColt
    @glansingColt 1 year ago

    how can i only print out the final answer?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    set verbose = False

  • @theh1ve
    @theh1ve 1 year ago

    Could you do this but not using chatGPT? I would need to use a local LLM is that at all possible?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    yes but you would probably need to finetune the local model for this task.

  • @MrPsycic007
    @MrPsycic007 1 year ago

    Can we try to do something similar with Opensource LLMs alpacalora , gpt4all ?

  • @knoopx

    @knoopx

    1 year ago

    been playing with this, no success so far but surely coming very soon.

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    good question I did try this on Alpaca and was hoping to show that as a follow up video but it wasn't good enough out of the box. That said it should be doable by finetuning the model first. I will have another go at it when I get some time.

  • @pwned1111

    @pwned1111

    1 year ago

    @samwitteveenai fine-tune it on various pandas queries?

  • @madakuse

    @madakuse

    1 year ago

    Waiting for this. Will be fantastic.

  • @knoopx

    @knoopx

    1 year ago

    Got tools and data QA working but the context size (2048) limits significantly the amount of text you can feed. And it's slow, even on 4bit. We need a non Llama based one for this to be useful.

  • @JiandiDong
    @JiandiDong 1 year ago

    is there a limit for the size of the csv file?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    Possibly, but the way it works, as long as the CSV can be loaded into memory, pandas queries can be run on it.
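
    For files that do not fit in memory, one alternative (not shown in the video, and not what the CSV agent does) is to stream the rows instead of loading them all at once. A minimal sketch using only the standard library, with made-up data and a hypothetical helper `count_rows_where`:

```python
import csv
import io


def count_rows_where(lines, column, value):
    """Stream CSV rows and count those where `column` equals `value`."""
    reader = csv.DictReader(lines)
    # Rows are consumed one at a time, so memory use stays flat.
    return sum(1 for row in reader if row[column] == value)


# Made-up data; with a real file, pass an open file handle instead,
# e.g. open("big.csv", newline=""), so rows are read lazily from disk.
sample = io.StringIO("city,state\nAustin,TX\nDallas,TX\nBoise,ID\n")
texas_rows = count_rows_where(sample, "state", "TX")
print(texas_rows)
```

    This only covers simple filters and counts; anything needing joins or grouping over a very large file is probably better served by loading it into a database first.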

  • @MaxKamrani
    @MaxKamrani 9 months ago

    what about large csv ?

  • @ashishkr.229
    @ashishkr.229 3 months ago

    Can i give you my csv assignment?... I've to submit by tomorrow and I don't know how to do😢

  • @leslietientcheu4025
    @leslietientcheu4025 7 days ago

    What about csv file without using csv agent please help

  • @GMCvancouver
    @GMCvancouver 11 months ago

    I have private documents (Excel &CSV)I can't share it with openai , is there anyway to do it as private GPT ?

  • @samwitteveenai

    @samwitteveenai

    11 months ago

    Yes you can try some of the open source models. I am going to revisit this in some more vids soon.

  • @GMCvancouver

    @GMCvancouver

    11 months ago

    @samwitteveenai Many thanks Sam, that would change my life. I have plenty of CSV & Excel files, and existing LLMs like groovy and snoozy from gpt4all are unable to read CSV & Excel correctly. It would be great to have a tutorial video ☺️

  • @rohitchan007
    @rohitchan007 1 year ago

    🔥🔥🔥🔥

  • @chinmaybhat9636
    @chinmaybhat9636 11 months ago

    HI @Sam Witteveen I am getting Rate Limit Error Can you guide me how to do that ?

  • @samwitteveenai

    @samwitteveenai

    11 months ago

    That sounds like an OpenAI issue, leave it and try a bit later sometimes their API has issues

  • @AshishKumarRajak-xg7il
    @AshishKumarRajak-xg7il 3 months ago

    do i have to add open ai api key myself?

  • @samwitteveenai

    @samwitteveenai

    3 months ago

    yes you will need to

  • @chrisweeks8789
    @chrisweeks8789 1 year ago

    Is it possible with alpaca models?

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    not with the straight Alpaca model. I have tried it and didn't get good results. But I am working on a finetuned version of Alpaca to do it.

  • @chrisweeks8789

    @chrisweeks8789

    1 year ago

    @samwitteveenai I shall sub and eagerly wait for its arrival

  • @MrSaixxx
    @MrSaixxx 6 months ago

    I need a JSON response

  • @PokemonParadise2010
    @PokemonParadise2010 1 month ago

    Can this do graphing too?

  • @samwitteveenai

    @samwitteveenai

    1 month ago

    Graphing in what sense? Plots? You could make an LLM write the code for a plot. If you are talking about knowledge graphs, then yes, but in a different way.

  • @PokemonParadise2010

    @PokemonParadise2010

    1 month ago

    @samwitteveenai So if I ask "Make me a line plot showing the trend of xyz from 2005 - 2010 using the Plotly library" (assuming I have that data ofc!), I would want it to make me a line graph using Plotly

  • @satishkumar-ir9wy
    @satishkumar-ir9wy 1 year ago

    Nice content, I have few queries: 1. if i use OpenAI API key, is it like my organization's data will get exposed? 2. Can you make video on how to develop a model to extract question answers from my Organization's data (available in CSV in Excel format only). In my case i want to create the similar question answering bot or web app with my organization's data. Anyone has any idea about that.

  • @samwitteveenai

    @samwitteveenai

    1 year ago

    Anything you pass in the prompt will be data OpenAI has access to. So be careful.

  • @satishkumar-ir9wy

    @satishkumar-ir9wy

    1 year ago

    @samwitteveenai thanks for the quick response. Can you guide me to create a ChatGPT-like chatbot to answer queries based on my Excel data?

  • @tusharbokade8378

    @tusharbokade8378

    8 months ago

    @satishkumar-ir9wy Hey, I am also looking to create a similar chatbot. Were you able to create one?

  • @MrShivrajansingh
    @MrShivrajansingh 2 months ago

    I tried this but the result is not satisfactory

  • @samwitteveenai

    @samwitteveenai

    2 months ago

    I need to revisit tabular data with these again soon, there are lots of new ways to approach it. I think this vid is close to a year old now

  • @MrShivrajansingh

    @MrShivrajansingh

    2 months ago

    @samwitteveenai Thank you so much for your reply, I really appreciate it. The issue seems to stem from LangChain's processing, where it embeds document data and searches for the closest matching data before reconverting it into text. This can lead to errors, particularly with logical answers. For instance, calculating the average expenses for specific categories like food is problematic. This is because the process requires access to the entire CSV dataset, and LangChain struggles to retrieve specific data if the corresponding keyword is missing from the CSV.

  • @samwitteveenai

    @samwitteveenai

    2 months ago

    @MrShivrajansingh Often for this kind of thing it is better to just treat the CSV as a SQL DB and use the LLM to just write SQL queries.

  • @priyashn5715
    @priyashn5715 9 months ago

    How can I add custom template/prompts?

  • @patrickmihalcea6480
    @patrickmihalcea6480 8 months ago

    Lol now try doing it in typescript

  • @rajatkumarsinha2159
    @rajatkumarsinha2159 8 months ago

    Awesome video!! Can you/anyone guide me how to load CSV file for question answering using Dolly2.0 with langchain??

  • @samwitteveenai

    @samwitteveenai

    8 months ago

    I wouldn't use Dolly as it is very out of date now and the LLaMA 2 models are much better.

  • @hummingbird8125
    @hummingbird8125 1 year ago

    Please change the davinci model to the ChatGPT model (gpt-3.5-turbo) for this tutorial, as it is better and 10x cheaper.

  • @user-ie3by1dv6n

    @user-ie3by1dv6n

    1 year ago

    How do I change the davinci model to gpt-3.5-turbo? When I pass the model_name='gpt-3.5-turbo' parameter to the create_csv_agent function, I get an error. Could you teach me?

  • @ranati2000
    @ranati2000 1 month ago

    agent.agent.llm_chain.prompt.template AttributeError: 'RunnableAgent' object has no attribute 'llm_chain'

  • @samwitteveenai

    @samwitteveenai

    1 month ago

    this is over a year old they have updated since then. I will make an update at some point.
