Is Falcon LLM the OpenAI Alternative? An Experimental Setup with LangChain

Science & Technology

👉🏻 Kick-start your freelance career in data: www.datalumina.io/data-freelancer
The Technology Innovation Institute (TII) in Abu Dhabi has launched Falcon, a new line of language models available under the Apache 2.0 license. The standout model, Falcon-40B, is the first open-source model to compete with existing closed-source models. This launch is great news for language model enthusiasts, industry experts, and businesses, as it opens up many new use cases. In this video, we compare the new Falcon-7B model against OpenAI's text-davinci-003 model to see whether open source can take on paid models.
🔗 Links
huggingface.co/blog/falcon
github.com/daveebbelaar/langchain-experiments
huggingface.co/tiiuae/falcon-7b-instruct
Introduction to LangChain
kzread.info/dash/bejne/gI2HudBqmdPIl8o.html
Copy my VS Code Setup
kzread.info/dash/bejne/rKmgqa-Sl5PcZrg.html
👋🏻 About Me
Hey there, my name is @daveebbelaar and I work as a freelance data scientist and run a company called Datalumina. You've stumbled upon my KZread channel, where I give away all my secrets when it comes to working with data. I'm not here to sell you any data course - everything you need is right here on KZread. Making videos is my passion, and I've been doing it for 18 years.
While I don't sell any data courses, I do offer a coaching program for data professionals looking to start their own freelance business. If that sounds like you, head over to www.datalumina.io/ to learn more about working with me and kick-starting your freelance career.

Comments: 44

@daveebbelaar 1 year ago
👋🏻I'm launching a free community for those serious about learning Data & AI soon, and you can be the first to get updates on this by subscribing here: www.datalumina.io/newsletter
@user-wr4yl7tx3w 1 year ago
Can we try fine-tuning Falcon in a future video?
@ingluissantana 1 year ago
As always GREAT video!!!! Thanks!!!!
@PeterDrewSEO 1 year ago
Great video mate, thank you!
@user-jt1zw1ux8z 10 months ago
Thanks man! Finally some code that was working for me 👍
@oshodikolapo2159 1 year ago
Just what I was searching for. Thanks for this. Bravo!
@daveebbelaar
1 year ago
Thanks!
@sree_9699 1 year ago
Interesting! I was exploring the same thing just an hour ago on HF and ran into this video as soon as I opened KZread. Good content.
@daveebbelaar
1 year ago
Thanks! 🙏🏻
@VaibhavPatil-rx7pc 11 months ago
Excellent detailed information
@marcova10 1 year ago
Thanks, Dave. After some trials, it seems this version of Falcon works for short questions. In some cases the LLM spits out several repeated sentences, so the output may need some tweaking to clean it up. Great alternative for certain uses.
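One way the cleanup mentioned above could look: a minimal sketch that drops sentences the model has already produced. The function name and the naive sentence-splitting regex are assumptions for illustration, not something shown in the video.

```python
import re
from typing import List


def drop_repeated_sentences(text: str) -> str:
    """Keep only the first occurrence of each sentence (case-insensitive)."""
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    seen = set()
    kept: List[str] = []
    for sentence in sentences:
        key = sentence.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence.strip())
    return " ".join(kept)


if __name__ == "__main__":
    raw = "Falcon is open. Falcon is open. It is small. Falcon is open."
    print(drop_repeated_sentences(raw))
```

This only post-processes the text; it does not change how the model generates, so it treats the symptom and not the cause.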
@aditunoe 1 year ago
In the special_tokens_map.json file of the HF repo there are some special tokens defined that differ a little from what OpenAI or others use. Integrating those into the prompt template of the chains seemed to improve the results for me (I also wrote an example in the HF comments). Three interesting ones in particular: >>QUESTION<<, >>SUMMARY<<, >>ANSWER<<
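A minimal sketch of the idea in the comment above, assuming the token strings are `>>QUESTION<<`, `>>SUMMARY<<`, and `>>ANSWER<<` as listed in the repo's special_tokens_map.json. The helper names and the exact prompt layout are hypothetical, and any improvement in results is the commenter's observation, not something verified here.

```python
# Token strings assumed from the Falcon repo's special_tokens_map.json.
QUESTION = ">>QUESTION<<"
SUMMARY = ">>SUMMARY<<"
ANSWER = ">>ANSWER<<"


def build_qa_prompt(question: str) -> str:
    """Hypothetical layout: tag the question, then cue the model with the answer token."""
    return f"{QUESTION}{question.strip()}\n{ANSWER}"


def build_summary_prompt(text: str) -> str:
    """Hypothetical layout: the text to summarize, then the summary token as the cue."""
    return f"{text.strip()}\n{SUMMARY}"


if __name__ == "__main__":
    print(build_qa_prompt("What is LangChain?"))
```

The resulting string can be used as the template text wherever the prompt is assembled; a `{question}` placeholder could take the place of the function argument if a templating library is used.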
@datanash8200 1 year ago
Perfect timing, need to implement some LLM for a work project 🙌
@daveebbelaar
1 year ago
👌🏻
@user-lf2tj6bz3m 1 year ago
Thanks for your video... do you know how to implement this with Node.js?
@Jake_McAllister 11 months ago
Hey Dave, love the video! How did you create your website, looks amazing bro 👌
@Esehe 11 months ago
@17:30 interesting how my OpenAI output/summary is different from yours: "This article explains how to use Flowwise AI, an open source visual UI Builder, to quickly build large language models apps and conversational AI. It covers setting up Flowwise, connecting it to data, and building a conversational AI, as well as how to embed the agent in a Python file and run queries. It also shows how to use the agent to ask questions and get accurate results."
@xXWillyxWonkaXx 1 year ago
Hey man, love your videos. Two questions:
Q1. At 11:50, are you talking about embeddings?
Q2. From your experience/deduction/observation of the LLMs on Hugging Face, can you take a model like MosaicML MPT-7B, throw QLoRA into the mix, and train it to be like GPT-4 or even slightly better in terms of understanding/alignment? Could using tree of thought mitigate or solve a small percentage of that?
@daveebbelaar
1 year ago
A1 - No, I don't use embeddings in this example. Just plain text sent to the APIs.
A2 - Not sure about that.
@esakkisundar 8 months ago
How do I run the Falcon model locally? Does providing a key run the model on Hugging Face's servers?
@user-wr4yl7tx3w 1 year ago
Do you have a video on pre-training an LLM?
@shakeebanwar4403 8 months ago
Can I run this 7B model without a GPU? My system RAM is 32 GB.
@KatarzynaHewelt 1 year ago
Thanks Dave for another great video! Do you know if I can perhaps download Falcon locally and then use it privately, without the HF API?
@daveebbelaar
1 year ago
Thanks Katarzyna! I am not sure about that.
@bentouss3445 1 year ago
Really great video on this hot topic of open LLMs vs. closed ones... It would be really interesting to see how to self-host an open LLM so you don't have to go through any external inference API.
@daveebbelaar
1 year ago
Thanks! Yes that is very interesting indeed!
@luis96xd 1 year ago
Great video. I have a question: what are the requirements to run Falcon-7B-Instruct locally? Can I use a CPU?
@fullcrum2089
1 year ago
15 GB GPU memory
@luis96xd
1 year ago
@fullcrum2089 Thank you so much! That's a lot 😱
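A rough back-of-the-envelope check on the 15 GB figure above, assuming roughly 7 billion parameters and counting only the weights (no activations, KV cache, or framework overhead). The function name is hypothetical and the numbers are estimates, not measurements.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    n_params = 7e9  # assumption: about 7 billion parameters
    for dtype, size in [("float32", 4), ("float16/bfloat16", 2), ("int8", 1)]:
        print(f"{dtype}: ~{weight_memory_gb(n_params, size):.0f} GB")
```

At 16-bit precision the weights alone come to about 14 GB, which is in line with the roughly 15 GB reported once some overhead is added. The same arithmetic suggests a CPU-only machine would need at least that much free RAM, and more at 32-bit precision.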
@felixbrand7971 11 months ago
I'm sure this will be a basic question, but where is the inference running here? Is it local, or is it on Hugging Face's resources?
@vuktodorovic4768
10 months ago
That is what I wanted to ask. I loaded this model into the Google Colab free tier and it took 15 GB of RAM and 14 GB of GPU memory. I can't imagine what hardware you would need to run something like this locally. Also, I can't imagine that Hugging Face would give you their resources just like that. His setup seems very strange.
@GyroO7 1 year ago
I feel like using a chunk size of 1000 with an overlap of 200 would improve the results
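To make the suggestion above concrete, here is a minimal plain-Python sketch of splitting text into 1000-character chunks where each chunk repeats the last 200 characters of the previous one. It is a stand-in for illustration, not the splitter used in the video; the function name is hypothetical, and whether these values improve the summaries is untested here.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(2600))
    parts = split_with_overlap(sample)
    print(len(parts), [len(p) for p in parts])
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, at the cost of sending some text to the model twice.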
@ko-Daegu 1 year ago
How are you running a .py file as a Jupyter notebook on the side like that? How are you sending each line or block to the interactive window on the side? This setup looks neat.
@daveebbelaar
1 year ago
Check out this video: kzread.info/dash/bejne/rKmgqa-Sl5PcZrg.html
@ko-Daegu
1 year ago
@@daveebbelaar Merci
@Eloii_Xia 1 year ago
Imagine combining it with Obsidian, Notion, or other similar software
@fdarko1 1 year ago
I am new to data science and want to learn more about it to become a pro. Please mentor me.
@daveebbelaar
1 year ago
Subscribe and check out the other videos on my channel ;)
@mayorc 1 year ago
But don't you get a free allowance of tokens with OpenAI that renews every month? So unless you go over that amount, you shouldn't get charged.
@pragyanbhatt6200 1 year ago
Nice tutorial Dave, but isn't it unfair to compare two models with different parameter counts? Falcon-7B has 7 billion parameters, whereas text-davinci-003 has almost 175 billion.
@daveebbelaar
1 year ago
It's definitely unfair, but that's why it's interesting to see the performance of a much smaller, free to use model.
@noelwos1071 10 months ago
UNFAIR ADVANTAGE. What do you think: as a European citizen, should you sue the EU, which hinders the progress offered by artificial intelligence and thus causes enormous damage by leaving Europe lagging behind the rest of the world? Isn't the EU a responsible institution?
@deliciouspops 1 year ago
i like how degraded our society is