Talk to Your Documents, Powered by Llama-Index
Ғылым және технология
In this video, we will build a Chat with your document system using Llama-Index. I will explain concepts related to llama index with a focus on understanding Vector Store Index.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬
☕ Buy me a Coffee: ko-fi.com/promptengineering
|🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord: / discord
▶️️ Subscribe: www.youtube.com/@engineerprom...
📧 Business Contact: engineerprompt@gmail.com
💼Consulting: calendly.com/engineerprompt/c...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
LINKS:
Google Colab: tinyurl.com/hms9c4jv
llama-Index Github: github.com/jerryjliu/llama_index
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Timestamps:
[00:00] What is Llama-index
[01:10] System Architecture
[02:54] Llama-index Setup
[04:54] Loading Documents
[05:42] Creating the Vector Store Index
[06:16] Creating Query Engine
[07:06] Q&A Over Documents
[09:00] How to Persist the Index
[10:20] What is inside the Index?
[11:38] How to change the default LLM
[13:25] Change the Chunk Size
[14:26] Use Open Source LLM with Llama Index
Пікірлер: 107
Want to connect? 💼Consulting: calendly.com/engineerprompt/consulting-call 🦾 Discord: discord.com/invite/t4eYQRUcXB ☕ Buy me a Coffee: ko-fi.com/promptengineering |🔴 Join Patreon: Patreon.com/PromptEngineering ▶ Subscribe: www.youtube.com/@engineerprompt?sub_confirmation=1
The SINGLE MOST valuable YT video I have come across on this topic. BRAVO!! And thank you!
@engineerprompt
10 ай бұрын
Glad it was helpful!
Again a great video. While I was trying to figure out how to learn this technology and where I could find reliable sources, it was lucky for me to find such up-to-date information.
Another excellent video. Easy to follow and up to date. Thank you and keep it up!
Amazing! I haven't seen enough videos talking about persisting the index especially in beginner level tutorials. I think its such a crucial concept that I found out much later. Love the flow for this and its perfectly explained! Liked and subbed!
The explanation is so clear! Thank you.
Thanks for your sharing! It's very helpful.
Great explanation and comparison very useful thank you
Excellent. I was wondering what the difference between langchain and lama index was. I also thought lama index is very powerful with its indexing functionality. This can bridge the gap between semantic and index search
Perfect pace and level of knowledge. Loved the video.
@engineerprompt
8 ай бұрын
Glad you liked it!
Finally a good tutorial on the subject! Thanks so much!
Thanks for sharing!
thanks for simplifying this
This is awesome. I'm going to try
Great explanation
Excellent videos! Really helping out with my work. Curious what tool you are using to draw the system architecture? I really like the way it renders the architectures.
learn so much from you!
@engineerprompt
9 ай бұрын
Thank you
I liked your explanation. You are a good story teller. You explained the details in a simple way and yet easy to implement manner. Thank you. I look forward to your next video. But how do we ensure the latest data is fed to the LLM in real time? In this case, we need to provide the data to the Llama. And the response is limited to the data provided.
Nice intro about llma-index👍. I think for small amount of documents the default llma-index embedding in json is sufficient. I suppose u can also use chromadb or weaviate or other vectorstores. Would be nice to see with the non default vector store...
@xt3708
9 ай бұрын
yeah, maybe a video comparing diff stores, when to use which, strenghts/weaknesses etc
Excellent video. Do you know what is the best option to start the code in an interface? I passed the code to Vs Code and then started it in Streamlit but it gives me some problems. I appreciate your help
Great video, how can I have the ability to compare 100s of document using llamaindex and will it know which chunk belongs to which document when answering the question? Also, how do you make sure all the pieces that should be in 1 chunk stays together, for instance if there is a table that goes across 2 pages then that should still be in 1 chunk?
Excellent. Is there a video you are planning to make on a multi modal RAG? I have a PDF which is an instruction manual. It has text and images. When a user asks a question, for example, "How to connect the TV to external speakers?", it should show the steps and the images associated with those steps. Everywhere I see are examples of image "generation". I don't want to generate images. I just want to show what's in the PDF based on the question.
Great video - thanks for sharing. Do you reckon they will implement the ability to use local LLMs with these embeddings, and if so, is there any plan to update LocalGPT to include the option of using Llama Index over LangChain Vector Stores? Cheers!
@AlignmentLabAI
10 ай бұрын
They should be compatible with local models out of the gate, in fact I believe llama index is local models first, hence 'llama'
This is very helpful. Where can we find the system architecture diagram?
Great video. The notebook fails at the first hurdle for me: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. llmx 0.0.15a0 requires cohere, which is not installed. tensorflow-probability 0.22.0 requires typing-extensions
Is it better to use Llama Index or RAG (Retrieval Augmented Generation) ?
Hi prompt as you mentioned in this video that this is a system prompt for StableLM, I want to know is there a way I can find prompt format for different LLM for example mixtral 8x7b/22b or llama 3
is it a Production ready code ? What important points we should keep in mind to make a similar app for Production environment ?
to be honest this is the best tutorial i see in 2023
@engineerprompt
7 ай бұрын
Thank you 😊
llama-index previously known as gpt index has been there since March. It is built on top of langchain. Would like to compare with your local gpt to check the performance. Also, for documents related application, often OCR is needed to grab the text at first.
@engineerprompt
10 ай бұрын
That's a good idea. Will do a quick implementation of localgpt with llama-index. I agree with OCR and have been experimenting with unstructured package for dealing with pdf files.
@mmdls602
10 ай бұрын
Localgpt with llama index? What do you mean? Bit confusing with all the models and use of same words in the eco system. Would be nice if you can explain the ecosystem and where these things for in. Thank you, great videos
Hi bro cool video! May I ask if there is a way to store quantized model with LlamaIndex? It's very painful to quantize it every single time I try to run it
❤
what's the difference in using Llamaindex and just using openai embeddings?
@Elyes-sj8kb
5 ай бұрын
Llamaindex offers a tool set to implement rag and uses an embedder, an embedder converts a word into a vector
I have a question: When I ask same question - I got different answer?
how does llama index compare with the localgpt method?
If I'm uusing weaviate, how to load then?
Do you have codes to do the same without openAI. Using some model in huggingface?
Information always magic!!! would it be possible to integrate chatgpt 3.5 turbo where user is requested, does it allow to search for documents in different languages or is it necessary to load a specific library for each language? Thanks always
@engineerprompt
10 ай бұрын
Thank you 🙏 if you are using got-3.5 then the content can in different languages. You will have to instruct gpt to auto detect the language and process it.
I tried this in my colab pro account and the session crashed when I ran the vectorstore. Out of GPU memory. colab allocated 16GB of VRAM. Would you please add option for using huggingface hosted LLMs through their free inference API (applies to select models)? Thanks for a great video.
Nice tutorial! How would you associate document chunks with the actual document the chunks came from? Let’s say I have a 500 page pdf. Now I split it into 500 documents, one per page, then I apply this llamaindex to chunk each page. How do I know that chunk 46kjkjh belongs to page 5?
@engineerprompt
10 ай бұрын
You can add meta data to each chunk which will add that information.
@bertobertoberto3
10 ай бұрын
@@engineerprompt oh thank you. Is this in the documentation somewhere? An example would be invaluable
@bongimusprime7981
10 ай бұрын
@@bertobertoberto3 each source node in the response will have the text used,as well as a reference to the source doc id + metadata, eg: response.source_nodes[0].node.text response.source_nodes[0].node.ref_doc_id response.source_nodes[0].node.metadata
@bertobertoberto3
10 ай бұрын
@@bongimusprime7981 awesome
Will you be testing the new Mistral-7B-v0.1 and Mistral-7B-Instruct-v0.1 LLMs? They claim to outperform Llama 2.😊
@engineerprompt
10 ай бұрын
Yes
When attempting to run the index line of code, I get an AuthenticationError saying my API key is incorrect even though I've copied it straight across from my OpenAI account as a newly generated key. Any idea as to where I'm going wrong?
@draganamilosheska3702
7 ай бұрын
did you fix that? i have the same problem
Awesome ❤ a little off topic question......would be so kind as to share the app you are using for making diagrams. it's sick....I've been looking for something that that since quite a while now, but with no luck.... 🥺
@engineerprompt
10 ай бұрын
Here is the one I use: excalidraw.com/
im doing the same, but indexing every node created, there are around 5000 nodes, and its taking a long time. is there some progress bar (like tqdm) code i can add to see how long the indexing process would take?
@engineerprompt
4 ай бұрын
I don't think there is a progress bar by default. You might be able to add a callback though.
what about security of our data ? what if it's confidential document ? thanks for your excellent videos
@engineerprompt
10 ай бұрын
Look at the localgpt project, you can run everything locally. Nothing leaves your system.
I am in the vectorstoreindex.from documents cell. it been running for like 24 hrs now. How do I know when it will end. I am running it locally in my laptop. output shows completion of batches. almost 2400+ batches. but it doesn;t showing how many are left. can somebody help?. my data consist of 850+ json. over all 70MB data.
3:18 "replace the openai llm with open-source..." 😁😁😁😁😁
Can I build a multilingual chatbot using llama index?
Thanks for the clear explanation. Could you please share the name of the tool you used to create the workflow diagram?
@engineerprompt
3 ай бұрын
It's called excalidraw
@AI_Expert_007
2 ай бұрын
@@engineerprompt Thanks a lot !
Thanks for the video. I’m facing lots of latency issues (15+ min) while reading the stores the index. How I can improve ? There are 100k+ vectors . Going ahead with numpy array takes few minutes only !
@engineerprompt
10 ай бұрын
You probably want to explore another vector store
@niranandkhedkar3681
9 ай бұрын
Hey @engineerprompt , Can you please make a video on this. Facing the same challenge to reduce the response time of llama index. @sachinkalsi9516 Found any solution for this? Any help is appreciated from everyone thanks.
Can this help llm parsing html?
Much more intuitive than LangChain
how to chat with pdf document that contain mathematical equations and some derivations
@engineerprompt
10 ай бұрын
You will have to use something like Nougat from meta to convert those into markdowns and then you can use llama-index as shown here.
If let's say I put the transcript of a tv show for exemple all stargate tv shows would it be able to generate an episode with x,y,z theme ?
could you do a tutorial about how to do this locally? I'm very interested in llama index, but I'm wary of using things that aren't on my local hardware
@engineerprompt
9 ай бұрын
Sure, will do
@hamtsammich
9 ай бұрын
@@engineerprompt SWEET Yeah, I've been trying to figure out how to try this locally and use my pdfs. I've got a 3090, and I've been excited about llms, but haven't managed yet
Hi, can i get your pipeline draw link please
I don't have any credit card, but I will buy an coffee for you some day (maybe in person, who knows :)
@engineerprompt
9 ай бұрын
Thanks 🙏
How can this be modified to run locally on GPU, without OpenAI?
@AlignmentLabAI
10 ай бұрын
As a baseline you can use vllm with the openai API spec on the server and drop the API base URL and API key to the openai variables in the scrips in your environment variables
ValueError: The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format. Getting this error
how does it compare to quivr as an AI second brain?
Do u have any example of a model on personal desktop/server. I dont wish to publish my content to chatgpt or any internet service.
@engineerprompt
7 ай бұрын
Checkout my localgpt project
If I upload a doc of 50,000 words how much will it cost
is there a way to run it without requiring openai key
@jaysonp9426
10 ай бұрын
You would still need an API call to a hosted llm and have and embeddings model to do the embeddings.
can you make a video on how to create a website chatbot out of all of this? say, we used this video and made a chatbot to talk with our data, how do we use it in our website?
🎯 Key points for quick navigation: 00:00 *💡 Introduction to Llama-Index* - Introduction to the task: building a document Q&A system using Llama-Index, - Comparison with LangChain and brief overview of functionalities, - Emphasis on fine-tuning embedding models for better performance. 01:19 *📑 Document Processing and Embedding* - Explanation of converting documents into chunks and computing embeddings, - Process of creating a semantic index and storing embeddings, - Introduction to querying the document by computing embeddings for user questions. 02:57 *🛠️ Initial Setup and Code Implementation* - Installing necessary packages: Llama-Index, OpenAI, Transformers, Accelerate, - Setting up the environment and loading the document using Simple Directory Reader, - Overview of creating vector stores and relevant indexing. 05:17 *🧩 Implementing Query Engine and Basic Queries* - Description of building a query engine, - Implementation of querying the documents with sample questions, - Obtaining and displaying responses from the model. 08:42 *🛠️ Customizing Configuration and Parameters* - Explanation of customizing chunk sizes, LLM models, and other parameters, - Process of persisting vector stores for future use, - Detailed look at embedding and document storage components. 11:43 *🔧 Advanced Customization and LLM Usage* - Methods for changing the LLM model, including GPT-3.5 Turbo and Google Palm, - Instructions on setting chunk sizes and overlaps, - Using open-source LLMs from Hugging Face and configuring corresponding parameters. 16:37 *🚀 Conclusion and Future Prospects* - Summary of using Llama-Index for document Q&A systems, - Mention of advanced features and future tutorial plans, - Encouragement to check out additional resources and support via Patreon. Made with HARPA AI
14:25
OpenAI API limit and you must pay money if you use it. Can you give a example build chat without API
@engineerprompt
10 ай бұрын
Yes, coming soon
@toannguyenngoc8209
10 ай бұрын
@@engineerprompt It's great, I'll look forward to it.
hmmmm...
Hello, help with that please. When I execute the line 'index = VectorStoreIndex.from_documents(documents)' after 1 min I get an error 429 (insufficient_quota). Check if the OPENAI_API_KEY variable was registered with '!export -p', and if it is. Thanks
This looked very promising, but your colab (like 99% of all notebooks) is broken right off the bat, it doesn't even install dependencies. FYI - 'best practice' aka absolutely required, on any notebook is to specify the _specific versions of all dependencies _ , or you just have junk that won't run in days to weeks ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. llmx 0.0.15a0 requires cohere, which is not installed. tensorflow-probability 0.22.0 requires typing-extensions
@googleyoutubechannel8554
6 ай бұрын
Also, the second code block has a critical error that means the code has never even run _once_ ? Will throw an error: os["OPENAI_API_KEY"] = Colab format is: os.environ['OPENAI_API_KEY'] =