How to Fine-tune T5 and Flan-T5 LLMs: What Is the Difference?
Science & Technology
An introduction to fine-tuning T5 and Flan-T5 models (LLMs, Large Language Models), followed by detailed step-by-step videos on coding the fine-tuning of T5 and Flan-T5 in real time, plus the theory behind fine-tuning T5 LLMs.
The next video covers coding examples (JupyterLab, Colab).
Literature (all rights & credits are with those authors):
huggingface.co/docs/transform...
huggingface.co/docs/transform...
huggingface.co/docs/transform...
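The classical fine-tuning workflow the videos walk through can be sketched roughly as follows with Hugging Face Transformers. The checkpoint name, task prefix, dataset, and hyperparameters below are illustrative assumptions, not values taken from the video:

```python
# Minimal sketch of classical seq2seq fine-tuning for T5 / Flan-T5.
# Checkpoint, task prefix, dataset, and hyperparameters are assumptions
# chosen for illustration only.

def to_t5_example(source, target, prefix="summarize: "):
    """T5 is a text-to-text model: every task is cast as a prefixed
    input string mapped to an output string."""
    return {"input_text": prefix + source, "target_text": target}

def main():
    from datasets import load_dataset
    from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                              DataCollatorForSeq2Seq, Seq2SeqTrainer,
                              Seq2SeqTrainingArguments)

    checkpoint = "google/flan-t5-small"          # assumed checkpoint
    tok = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    ds = load_dataset("xsum", split="train[:1000]")  # assumed dataset

    def tokenize(batch):
        ex = [to_t5_example(d, s)
              for d, s in zip(batch["document"], batch["summary"])]
        # text_target tokenizes the labels alongside the inputs
        return tok([e["input_text"] for e in ex],
                   text_target=[e["target_text"] for e in ex],
                   max_length=512, truncation=True)

    tokenized = ds.map(tokenize, batched=True,
                       remove_columns=ds.column_names)

    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(output_dir="t5-finetuned",
                                      learning_rate=3e-4,
                                      num_train_epochs=1),
        train_dataset=tokenized,
        data_collator=DataCollatorForSeq2Seq(tok, model=model),
    )
    trainer.train()

# Call main() to launch a training run (downloads model and data).
```

The same skeleton works for Flan-T5 by swapping the checkpoint; Flan-T5 has already been instruction-tuned, so natural-language task prompts often work better than bare prefixes.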
Comments: 25
I like how you explain and give us time to understand what you are talking about (especially for non-native English speakers), so I'm happy to have discovered your channel. Thank you so much.
Awesome tutorial video, keep up the good work.
Nice video!! Looking forward to the upcoming ones! Great job.
@code4AI
A year ago
More to come!
You explained this so well and made it easy to understand. Going to go watch the next videos!
@code4AI
A year ago
Awesome, thank you!
You are a legend!
@code4AI
A year ago
Thank you!
This is a great video compiling all the necessary information about Google's family of open-source LLMs. I'm not sure which fine-tuning technique you'll use in your upcoming videos, but I'd like to see how to fine-tune using PEFT from Hugging Face.
@code4AI
A year ago
Thanks for your feedback. The next step is classical fine-tuning of T5 and Flan-T5. Then I'll dive into Parameter-Efficient Fine-Tuning, since you'll read everywhere that recent state-of-the-art PEFT techniques achieve performance comparable to full fine-tuning. So why not have a quantitative and qualitative comparison between those two fine-tuning techniques?
@theunknown2090
A year ago
@@code4AI 100%
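To see why PEFT can be so much cheaper than full fine-tuning, here is a back-of-the-envelope sketch in plain Python (the dimensions and rank are illustrative assumptions) comparing trainable parameter counts for one weight matrix under full fine-tuning versus LoRA, which learns a rank-r update B @ A instead of updating W itself:

```python
# LoRA replaces the update to a d_out x d_in weight matrix W with a
# low-rank product B @ A, where B is d_out x r and A is r x d_in.
# Only A and B are trained; W stays frozen.

def full_params(d_in, d_out):
    """Trainable weights when fine-tuning the full matrix W."""
    return d_in * d_out

def lora_params(d_in, d_out, r):
    """Trainable weights for the LoRA factors A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)

# Illustrative dimensions, roughly the size of a T5-base attention
# projection (d_model = 768); rank r = 8 is a common LoRA choice.
d = 768
full = full_params(d, d)       # 589,824 trainable weights
lora = lora_params(d, d, r=8)  # 12,288 trainable weights
print(f"LoRA trains {lora / full:.1%} of the parameters of one layer")
```

Around 2% of the per-layer parameters in this toy setup, which is why the quantitative comparison between the two techniques is interesting: the quality gap, if any, comes at a fraction of the training cost.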
Can you please make a video on finetuning the Flan T5 with PEFT/Lora ? And possibly show some results ?
Hello, thank you for your valuable content. I currently use GPT-4 for my tasks. I have a system message of about 3k tokens (a scoring criterion), a user input that is also about 3k tokens (it contains the info to be scored), and an assistant output of about 1k tokens. I know that Flan-T5 has a maximum token length of about 4096 tokens. Can I do instruction fine-tuning using the user input and assistant output only, without the system message?
Do we need to do some text cleaning when using T5?
How do I structure the inputs and outputs in my datasets when using the Hugging Face trainer?
I have a very important question. How should I approach a dataset containing only question and answer features? For example: if the user inputs a question, the model must generate the answer.
Is T5 better than RoBERTa for classification?
A question, please: how can we fine-tune with custom data without formatting it as question-answer pairs?
@code4AI
A year ago
If all of your data has a coherent format (all examples share the same structure), it doesn't matter what you call the fields; the neural network will learn the pattern. You just have to be consistent throughout your computation. Hint: check the pre-training dataset of your transformer to find similar structural patterns and adapt accordingly.
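The advice above (pick one structure and apply it consistently) can be sketched like this; the template wording and field names are hypothetical, purely for illustration:

```python
# One fixed template turns arbitrary custom records into consistent
# (input, target) text pairs for a text-to-text model like T5.
# The template and field names are illustrative assumptions.

TEMPLATE = "review: {text} sentiment:"  # the same template for every record

def format_record(record):
    """Map a raw record onto the fixed template; the key point is that
    every example follows exactly the same structure."""
    return {
        "input_text": TEMPLATE.format(text=record["text"]),
        "target_text": record["label"],
    }

records = [
    {"text": "the movie was wonderful", "label": "positive"},
    {"text": "two hours I will never get back", "label": "negative"},
]

pairs = [format_record(r) for r in records]
print(pairs[0]["input_text"])  # review: the movie was wonderful sentiment:
```

Nothing here is question-answer shaped; any consistent input/target pairing works, which is the point of the reply above.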
Sir, what is the difference when training a model on English vs. non-English data? Thank you.
@code4AI
A year ago
You need an LLM that has been pre-trained on different languages, not just English, because your tokenizer has to cover your (multiple) language(s). If you have a multilingual LLM like BLOOM, which has been pre-trained on 50+ languages, you can fine-tune it like any other language model.
@vinsmokearifka
A year ago
@@code4AI OK, Sir. Thank you.
HELP
@code4AI
A year ago
There is a whole community ...