Talk to Your Documents, Powered by Llama-Index

Science and Technology

In this video, we will build a chat-with-your-documents system using Llama-Index. I will explain the core Llama-Index concepts, with a focus on understanding the Vector Store Index.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬
☕ Buy me a Coffee: ko-fi.com/promptengineering
|🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord: / discord
▶️️ Subscribe: www.youtube.com/@engineerprom...
📧 Business Contact: engineerprompt@gmail.com
💼Consulting: calendly.com/engineerprompt/c...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
LINKS:
Google Colab: tinyurl.com/hms9c4jv
Llama-Index GitHub: github.com/jerryjliu/llama_index
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Timestamps:
[00:00] What is Llama-index
[01:10] System Architecture
[02:54] Llama-index Setup
[04:54] Loading Documents
[05:42] Creating the Vector Store Index
[06:16] Creating Query Engine
[07:06] Q&A Over Documents
[09:00] How to Persist the Index
[10:20] What is inside the Index?
[11:38] How to change the default LLM
[13:25] Change the Chunk Size
[14:26] Use Open Source LLM with Llama Index
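
The flow in the timestamps above (load documents, build a vector index, query it) can be sketched in plain Python. This is only a toy illustration, not Llama-Index's API: the `embed`, `build_index`, and `retrieve` names are hypothetical, and the word-count "embedding" is an assumed stand-in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Store each chunk next to its embedding."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


chunks = [
    "The index stores one embedding per chunk of text.",
    "Persisting the index avoids recomputing embeddings.",
    "The query engine sends retrieved chunks to the LLM.",
]
index = build_index(chunks)
top = retrieve(index, "How do I avoid recomputing embeddings?", top_k=1)
print(top[0])
```

In a real system, the retrieved chunks would be placed into the LLM prompt together with the question; that step is left out here.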

Comments: 107

@engineerprompt 10 months ago
Want to connect? 💼Consulting: calendly.com/engineerprompt/consulting-call 🦾 Discord: discord.com/invite/t4eYQRUcXB ☕ Buy me a Coffee: ko-fi.com/promptengineering |🔴 Join Patreon: Patreon.com/PromptEngineering ▶ Subscribe: www.youtube.com/@engineerprompt?sub_confirmation=1
@mlg4035 10 months ago
The SINGLE MOST valuable YT video I have come across on this topic. BRAVO!! And thank you!
@engineerprompt
10 months ago
Glad it was helpful!
@rehberim360 10 months ago
Again a great video. While I was trying to figure out how to learn this technology and where I could find reliable sources, it was lucky for me to find such up-to-date information.
@ChronozOdP 10 months ago
Another excellent video. Easy to follow and up to date. Thank you and keep it up!
@s.moneebahnoman 7 months ago
Amazing! I haven't seen enough videos talking about persisting the index, especially in beginner-level tutorials. I think it's such a crucial concept, one that I found out about much later. Love the flow for this and it's perfectly explained! Liked and subbed!
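
The idea behind persisting an index can be sketched as saving the chunk texts and their embeddings to disk once, then reloading them instead of recomputing. This is a minimal sketch, assuming the index is a plain dictionary stored as JSON; the `persist` and `load` helpers, the `index.json` file name, and the embedding values are hypothetical and are not Llama-Index's own functions or storage layout.

```python
import json
import os
import tempfile


def persist(index: dict, path: str) -> None:
    """Write the index (chunk texts and embeddings) to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def load(path: str) -> dict:
    """Read a previously persisted index back into memory."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Made-up embeddings; a real index would hold model-generated vectors.
index = {
    "chunk-0": {"text": "Llama-Index builds a vector store index.", "embedding": [0.1, 0.3, 0.5]},
    "chunk-1": {"text": "The index can be saved and reloaded.", "embedding": [0.2, 0.1, 0.4]},
}

with tempfile.TemporaryDirectory() as storage_dir:
    path = os.path.join(storage_dir, "index.json")
    persist(index, path)
    restored = load(path)

print(restored == index)
```

The saving comes from skipping the embedding calls on reload, which is usually the slow and paid part of building the index.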
@rajmankad2949 10 months ago
The explanation is so clear! Thank you.
@MikewasG 10 months ago
Thanks for your sharing! It's very helpful.
@MarshallMelnychuk 10 months ago
Great explanation and comparison. Very useful, thank you.
@gregorykarsten7350 10 months ago
Excellent. I was wondering what the difference between LangChain and Llama-Index was. I also thought Llama-Index is very powerful with its indexing functionality. This can bridge the gap between semantic and index search.
@aseemasthana4121 8 months ago
Perfect pace and level of knowledge. Loved the video.
@engineerprompt
8 months ago
Glad you liked it!
@dario2728 days ago
Finally a good tutorial on the subject! Thanks so much!
@TeamDman 10 months ago
Thanks for sharing!
@ghostwhowalks2324 10 months ago
thanks for simplifying this
@user-dp9lj1ew6k 10 months ago
This is awesome. I'm going to try
@kayasaz 1 month ago
Great explanation
@adamduncan6579 10 months ago
Excellent videos! Really helping out with my work. Curious what tool you are using to draw the system architecture? I really like the way it renders the architectures.
@xt3708 9 months ago
learn so much from you!
@engineerprompt
9 months ago
Thank you
@KinoInsight 9 months ago
I liked your explanation. You are a good storyteller. You explained the details in a simple, easy-to-implement manner. Thank you. I look forward to your next video. But how do we ensure the latest data is fed to the LLM in real time? In this case, we need to provide the data to the Llama. And the response is limited to the data provided.
@henkhbit5748 9 months ago
Nice intro about llama-index👍. I think for a small number of documents the default llama-index embedding storage in JSON is sufficient. I suppose you can also use chromadb or weaviate or other vector stores. Would be nice to see it with a non-default vector store...
@xt3708
9 months ago
yeah, maybe a video comparing diff stores, when to use which, strengths/weaknesses etc
@hernandocastroarana6206 10 months ago
Excellent video. Do you know what is the best option to start the code in an interface? I passed the code to Vs Code and then started it in Streamlit but it gives me some problems. I appreciate your help
@DixitNitish 10 months ago
Great video, how can I compare 100s of documents using llamaindex, and will it know which chunk belongs to which document when answering the question? Also, how do you make sure all the pieces that should be in 1 chunk stay together, for instance if there is a table that goes across 2 pages then that should still be in 1 chunk?
@nishkarve 7 months ago
Excellent. Is there a video you are planning to make on a multi modal RAG? I have a PDF which is an instruction manual. It has text and images. When a user asks a question, for example, "How to connect the TV to external speakers?", it should show the steps and the images associated with those steps. Everywhere I see are examples of image "generation". I don't want to generate images. I just want to show what's in the PDF based on the question.
@Gingeey23 10 months ago
Great video - thanks for sharing. Do you reckon they will implement the ability to use local LLMs with these embeddings, and if so, is there any plan to update LocalGPT to include the option of using Llama Index over LangChain Vector Stores? Cheers!
@AlignmentLabAI
10 months ago
They should be compatible with local models out of the gate, in fact I believe llama index is local models first, hence 'llama'
@y2knoproblem 10 months ago
This is very helpful. Where can we find the system architecture diagram?
@jamesvictor2182 6 months ago
Great video. The notebook fails at the first hurdle for me: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. llmx 0.0.15a0 requires cohere, which is not installed. tensorflow-probability 0.22.0 requires typing-extensions
@CaesarEduBiz-lz2cg 9 months ago
Is it better to use Llama Index or RAG (Retrieval Augmented Generation)?
@HarmeetSingh-ry6fm 27 days ago
Hi Prompt, you mentioned in this video that this is a system prompt for StableLM. Is there a way I can find the prompt format for different LLMs, for example Mixtral 8x7B/22B or Llama 3?
@Rahul-zq8ep 8 months ago
Is this production-ready code? What important points should we keep in mind to make a similar app for a production environment?
@abdullahiahmad3244 7 months ago
To be honest, this is the best tutorial I have seen in 2023
@engineerprompt
7 months ago
Thank you 😊
@wangbei9 10 months ago
llama-index, previously known as gpt index, has been there since March. It is built on top of langchain. Would like to compare with your local gpt to check the performance. Also, for document-related applications, OCR is often needed to grab the text first.
@engineerprompt
10 months ago
That's a good idea. Will do a quick implementation of localgpt with llama-index. I agree with OCR and have been experimenting with unstructured package for dealing with pdf files.
@mmdls602
10 months ago
Localgpt with llama index? What do you mean? Bit confusing with all the models and use of the same words in the ecosystem. Would be nice if you can explain the ecosystem and where these things fit in. Thank you, great videos
@dimitripetrenko438 6 months ago
Hi bro cool video! May I ask if there is a way to store quantized model with LlamaIndex? It's very painful to quantize it every single time I try to run it
@angloland4539 10 months ago
❤
@vladimirgorea8714 9 months ago
what's the difference in using Llamaindex and just using openai embeddings?
@Elyes-sj8kb
5 months ago
Llamaindex offers a toolset to implement RAG and uses an embedder; an embedder converts text into a vector
@med-lek 10 months ago
I have a question: when I ask the same question, why do I get a different answer?
@ismaelnoble 8 months ago
how does llama index compare with the localgpt method?
@anuvratshukla7061 9 months ago
If I'm using weaviate, how do I load it then?
@udithweerasinghe6402 5 months ago
Do you have code to do the same without OpenAI, using some model from Hugging Face?
@jersainpasaran1931 10 months ago
Information always magic!!! would it be possible to integrate chatgpt 3.5 turbo where user is requested, does it allow to search for documents in different languages or is it necessary to load a specific library for each language? Thanks always
@engineerprompt
10 months ago
Thank you 🙏 If you are using gpt-3.5, then the content can be in different languages. You will have to instruct gpt to auto-detect the language and process it.
@nazihfattal974 9 months ago
I tried this in my colab pro account and the session crashed when I ran the vectorstore. Out of GPU memory. colab allocated 16GB of VRAM. Would you please add option for using huggingface hosted LLMs through their free inference API (applies to select models)? Thanks for a great video.
@bertobertoberto3 10 months ago
Nice tutorial! How would you associate document chunks with the actual document the chunks came from? Let’s say I have a 500 page pdf. Now I split it into 500 documents, one per page, then I apply this llamaindex to chunk each page. How do I know that chunk 46kjkjh belongs to page 5?
@engineerprompt
10 months ago
You can add metadata to each chunk, which will carry that information.
@bertobertoberto3
10 months ago
@@engineerprompt oh thank you. Is this in the documentation somewhere? An example would be invaluable
@bongimusprime7981
10 months ago
@@bertobertoberto3 each source node in the response will have the text used, as well as a reference to the source doc id + metadata, e.g.:
response.source_nodes[0].node.text
response.source_nodes[0].node.ref_doc_id
response.source_nodes[0].node.metadata
@bertobertoberto3
10 months ago
@@bongimusprime7981 awesome
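
The metadata idea from this thread can be sketched in plain Python: each chunk carries a small dictionary recording the document and page it came from. This is a minimal sketch under stated assumptions; the `Chunk` class, the `chunk_pages` helper, the `file_name` and `page` keys, and `manual.pdf` are hypothetical names for illustration, not Llama-Index's own node classes.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_pages(pages: list[str], doc_name: str, chunk_size: int = 40) -> list[Chunk]:
    """Split each page into fixed-size character chunks, tagging each with its page."""
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        for start in range(0, len(page_text), chunk_size):
            chunks.append(
                Chunk(
                    text=page_text[start:start + chunk_size],
                    metadata={"file_name": doc_name, "page": page_number},
                )
            )
    return chunks


pages = ["First page text. " * 5, "Second page text. " * 5]
chunks = chunk_pages(pages, doc_name="manual.pdf")

for chunk in chunks:
    print(chunk.metadata["page"], repr(chunk.text))
```

Because the page number travels with the chunk, any chunk returned at query time can be traced back to its page without a separate lookup table.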
@nickwoolley733 10 months ago
Will you be testing the new Mistral-7B-v0.1 and Mistral-7B-Instruct-v0.1 LLMs? They claim to outperform Llama 2.😊
@engineerprompt
10 months ago
Yes
@Shogun-C 10 months ago
When attempting to run the index line of code, I get an AuthenticationError saying my API key is incorrect even though I've copied it straight across from my OpenAI account as a newly generated key. Any idea as to where I'm going wrong?
@draganamilosheska3702
7 months ago
did you fix that? I have the same problem
@gitinit3416 10 months ago
Awesome ❤ A little off-topic question... would you be so kind as to share the app you are using for making diagrams? It's sick... I've been looking for something like that for quite a while now, but with no luck... 🥺
@engineerprompt
10 months ago
Here is the one I use: excalidraw.com/
@sidhaarthsredharan3318 4 months ago
I'm doing the same, but indexing every node created. There are around 5000 nodes, and it's taking a long time. Is there some progress bar (like tqdm) code I can add to see how long the indexing process would take?
@engineerprompt
4 months ago
I don't think there is a progress bar by default. You might be able to add a callback though.
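
The callback idea can be sketched generically: process the nodes in batches and report progress after each batch. This is a minimal sketch, not Llama-Index's callback system; `index_in_batches`, `on_progress`, and `fake_embed_batch` are hypothetical names, and the fake embedder is an assumed stand-in for a real embedding call.

```python
from typing import Callable


def index_in_batches(
    nodes: list[str],
    embed_batch: Callable[[list[str]], list[list[float]]],
    batch_size: int = 100,
    on_progress: Callable[[int, int], None] = lambda done, total: None,
) -> list[list[float]]:
    """Embed nodes batch by batch, calling on_progress(done, total) after each batch."""
    embeddings: list[list[float]] = []
    total = len(nodes)
    for start in range(0, total, batch_size):
        batch = nodes[start:start + batch_size]
        embeddings.extend(embed_batch(batch))
        on_progress(min(start + batch_size, total), total)
    return embeddings


def fake_embed_batch(batch: list[str]) -> list[list[float]]:
    """Stand-in embedder: one-dimensional vector holding the text length."""
    return [[float(len(text))] for text in batch]


progress_log: list[tuple[int, int]] = []
nodes = [f"node {i}" for i in range(250)]
vectors = index_in_batches(
    nodes,
    fake_embed_batch,
    batch_size=100,
    on_progress=lambda done, total: progress_log.append((done, total)),
)
print(progress_log)
```

Swapping the lambda for a function that prints or updates a tqdm bar would give a visible progress indicator.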
@trobinsun9851 10 months ago
What about the security of our data? What if it's a confidential document? Thanks for your excellent videos
@engineerprompt
10 months ago
Look at the localgpt project, you can run everything locally. Nothing leaves your system.
@rizwanat7496 5 months ago
I am in the VectorStoreIndex.from_documents cell. It's been running for like 24 hrs now. How do I know when it will end? I am running it locally on my laptop. The output shows completion of batches, almost 2400+ batches, but it doesn't show how many are left. Can somebody help? My data consists of 850+ JSON files, overall 70MB of data.
@42svb58 10 months ago
3:18 "replace the openai llm with open-source..." 😁😁😁😁😁
@user-jp5uy5ok7g 4 months ago
Can I build a multilingual chatbot using llama index?
@AI_Expert_007 3 months ago
Thanks for the clear explanation. Could you please share the name of the tool you used to create the workflow diagram?
@engineerprompt
3 months ago
It's called excalidraw
@AI_Expert_007
2 months ago
@@engineerprompt Thanks a lot !
@sachinkalsi9516 10 months ago
Thanks for the video. I’m facing lots of latency issues (15+ min) while reading the stored index. How can I improve this? There are 100k+ vectors. Going ahead with a numpy array takes only a few minutes!
@engineerprompt
10 months ago
You probably want to explore another vector store
@niranandkhedkar3681
9 months ago
Hey @engineerprompt , Can you please make a video on this. Facing the same challenge to reduce the response time of llama index. @sachinkalsi9516 Found any solution for this? Any help is appreciated from everyone thanks.
@test12382 6 months ago
Can this help llm parsing html?
@matten_zero 10 months ago
Much more intuitive than LangChain
@user-gq6ol1di3t 10 months ago
how to chat with pdf document that contain mathematical equations and some derivations
@engineerprompt
10 months ago
You will have to use something like Nougat from Meta to convert those into Markdown, and then you can use llama-index as shown here.
@vostfrguys 10 months ago
If, let's say, I put in the transcript of a TV show, for example all Stargate TV shows, would it be able to generate an episode with x,y,z theme?
@hamtsammich 9 months ago
could you do a tutorial about how to do this locally? I'm very interested in llama index, but I'm wary of using things that aren't on my local hardware
@engineerprompt
9 months ago
Sure, will do
@hamtsammich
9 months ago
@@engineerprompt SWEET Yeah, I've been trying to figure out how to try this locally and use my pdfs. I've got a 3090, and I've been excited about llms, but haven't managed yet
@huyvo9105 7 months ago
Hi, can i get your pipeline draw link please
@am1rsafavi-naini356 9 months ago
I don't have any credit card, but I will buy a coffee for you some day (maybe in person, who knows :)
@engineerprompt
9 months ago
Thanks 🙏
@Macventure 10 months ago
How can this be modified to run locally on GPU, without OpenAI?
@AlignmentLabAI
10 months ago
As a baseline, you can serve the model with vllm using the OpenAI API spec, then set the API base URL and API key for the openai variables used by the scripts through your environment variables
@dineshbhatotia8783 6 months ago
ValueError: The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format. Getting this error
@vitalis 10 months ago
how does it compare to quivr as an AI second brain?
@Udayanverma 7 months ago
Do u have any example of a model on personal desktop/server. I don't wish to publish my content to chatgpt or any internet service.
@engineerprompt
7 months ago
Check out my localgpt project
@anilpgonade8503 10 months ago
If I upload a doc of 50,000 words how much will it cost
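
A rough way to reason about this question is to convert words to tokens and multiply by a per-token price. This is a back-of-the-envelope sketch only: the 4/3 tokens-per-word ratio is a common rule of thumb for English text, and the price per 1,000 tokens is a placeholder assumption that should be replaced with the provider's current pricing. It covers embedding the document once and ignores the LLM cost of each later question.

```python
def estimate_embedding_cost(
    word_count: int,
    tokens_per_word: float = 4 / 3,     # assumption: rule of thumb for English text
    usd_per_1k_tokens: float = 0.0001,  # assumption: placeholder price, check current pricing
) -> float:
    """Estimate the one-time cost in USD of embedding a document."""
    tokens = word_count * tokens_per_word
    return tokens / 1000 * usd_per_1k_tokens


words = 50_000
cost = estimate_embedding_cost(words)
print(f"{words} words is roughly {words * 4 / 3:,.0f} tokens, about ${cost:.4f} to embed")
```

Under these assumed numbers the indexing step costs well under a cent; the recurring cost comes from the LLM calls made for each query.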
@hassentangier3891 10 months ago
is there a way to run it without requiring openai key
@jaysonp9426
10 months ago
You would still need an API call to a hosted llm and have an embeddings model to do the embeddings.
@zearcher4633 7 months ago
can you make a video on how to create a website chatbot out of all of this? say, we used this video and made a chatbot to talk with our data, how do we use it in our website?
@regonzalezayala16 days ago
🎯 Key points for quick navigation:
00:00 *💡 Introduction to Llama-Index*
- Introduction to the task: building a document Q&A system using Llama-Index,
- Comparison with LangChain and brief overview of functionalities,
- Emphasis on fine-tuning embedding models for better performance.
01:19 *📑 Document Processing and Embedding*
- Explanation of converting documents into chunks and computing embeddings,
- Process of creating a semantic index and storing embeddings,
- Introduction to querying the document by computing embeddings for user questions.
02:57 *🛠️ Initial Setup and Code Implementation*
- Installing necessary packages: Llama-Index, OpenAI, Transformers, Accelerate,
- Setting up the environment and loading the document using Simple Directory Reader,
- Overview of creating vector stores and relevant indexing.
05:17 *🧩 Implementing Query Engine and Basic Queries*
- Description of building a query engine,
- Implementation of querying the documents with sample questions,
- Obtaining and displaying responses from the model.
08:42 *🛠️ Customizing Configuration and Parameters*
- Explanation of customizing chunk sizes, LLM models, and other parameters,
- Process of persisting vector stores for future use,
- Detailed look at embedding and document storage components.
11:43 *🔧 Advanced Customization and LLM Usage*
- Methods for changing the LLM model, including GPT-3.5 Turbo and Google Palm,
- Instructions on setting chunk sizes and overlaps,
- Using open-source LLMs from Hugging Face and configuring corresponding parameters.
16:37 *🚀 Conclusion and Future Prospects*
- Summary of using Llama-Index for document Q&A systems,
- Mention of advanced features and future tutorial plans,
- Encouragement to check out additional resources and support via Patreon.
Made with HARPA AI
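
The chunk size and overlap settings mentioned in the summary above can be illustrated with a small splitter. This is a simplified sketch that counts words, whereas real splitters typically count tokens; `chunk_words` and its parameters are hypothetical names, not Llama-Index's splitter API.

```python
def chunk_words(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into chunks of chunk_size words; neighbours share chunk_overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


text = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(text, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Smaller chunks give more precise retrieval but less context per chunk, and the overlap reduces the chance that a sentence is cut in half at a chunk boundary.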
@devikasimlai4767 1 month ago
14:25
@toannguyenngoc8209 10 months ago
The OpenAI API has limits and you must pay to use it. Can you give an example of building a chat without the API?
@engineerprompt
10 months ago
Yes, coming soon
@toannguyenngoc8209
10 months ago
@@engineerprompt It's great, I'll look forward to it.
@Nihilvs 10 months ago
hmmmm...
@elizonfrankcarcaustomamani4999 4 months ago
Hello, help with this please. When I execute the line 'index = VectorStoreIndex.from_documents(documents)', after 1 min I get an error 429 (insufficient_quota). I checked whether the OPENAI_API_KEY variable was registered with '!export -p', and it is. Thanks
@googleyoutubechannel8554 6 months ago
This looked very promising, but your colab (like 99% of all notebooks) is broken right off the bat, it doesn't even install dependencies. FYI - 'best practice', aka absolutely required, on any notebook is to specify the specific versions of all dependencies, or you just have junk that won't run in days to weeks. ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. llmx 0.0.15a0 requires cohere, which is not installed. tensorflow-probability 0.22.0 requires typing-extensions
@googleyoutubechannel8554
6 months ago
Also, the second code block has a critical error that means the code has never even run once? It will throw an error:
os["OPENAI_API_KEY"] =
The correct format is:
os.environ['OPENAI_API_KEY'] =
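
The fix described in this comment can be shown in a few lines. Indexing the `os` module directly raises a `TypeError` because modules do not support item assignment; environment variables live in `os.environ`. The key value below is a placeholder, and `api_key_is_set` is a hypothetical helper added only for the check.

```python
import os

# Placeholder value for illustration only; never hard-code a real key in a notebook.
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"


def api_key_is_set() -> bool:
    """Return True if OPENAI_API_KEY is present and non-empty in this process."""
    return bool(os.environ.get("OPENAI_API_KEY", "").strip())


print(api_key_is_set())
```

Checking the variable from Python, rather than from a shell command, confirms what the running notebook process actually sees.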