T5: Exploring the Limits of Transfer Learning with a Text-to-Text Transformer (Research Paper Walkthrough)

#transferlearning #t5 #google
This paper from Google introduces the T5 model (Text-to-Text Transfer Transformer) and releases the large-scale C4 corpus (~750 GB). T5 is a large neural network trained in a pre-train-then-fine-tune paradigm. In this framework, all NLP tasks are reframed into a unified text-to-text format where the input and output are always text strings. The model was fine-tuned on various tasks such as SQuAD question answering, WMT translation, CNN/Daily Mail abstractive summarisation, sentiment analysis, etc.
⏩ Abstract: Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
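The unified text-to-text format described above can be sketched in a few lines. This is a minimal illustration, not code from the paper or its repository: the helper function below is hypothetical, though the task-prefix strings follow the convention the paper describes (every task becomes "prefix: input text" mapped to a target string).

```python
# Minimal sketch of T5's text-to-text framing: one model, many tasks,
# distinguished only by a natural-language prefix on the input string.
def to_t5_input(task_prefix: str, text: str) -> str:
    """Prepend a task prefix so a single model can handle many tasks."""
    return f"{task_prefix}: {text}"

# Translation: both input and target are plain text strings.
print(to_t5_input("translate English to German", "That is good."))
# Classification is also cast as text generation -- for sentiment analysis
# the model literally generates a string like "positive" or "negative".
print(to_t5_input("sst2 sentence", "a gripping and well-acted drama"))
```

The point is that no task-specific output layers are needed: summarisation, translation, and classification all share the same string-in, string-out interface.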
Please feel free to share out the content and subscribe to my channel :)
⏩ Subscribe - / @techvizthedatascienceguy
⏩ OUTLINE:
0:00 - Background and Overview
1:22 - Abstract
2:13 - Input and Output formatting for T5 model
4:20 - The Colossal Clean Crawled Corpus (C4 Corpus)
6:23 - Downstream tasks
7:07 - Diagram view of the Text-to-Text Framework
9:56 - Attention Mask Patterns
11:06 - Flow chart of exploration of unsupervised objectives
⏩ Paper Title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
⏩ Paper: arxiv.org/abs/1910.10683
⏩ Code: github.com/google-research/te...
⏩ Authors: Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
⏩ Organisation: Google
⏩ IMPORTANT LINKS
My T5 Blog - prakhartechviz.blogspot.com/2...
Google T5 Blog - ai.googleblog.com/2020/02/exp...
LongT5 Paper Summary: • LongT5: Efficient Text...
SpanBERT - • SpanBERT: Improving Pr...
PEGASUS - • PEGASUS: Pre-training ...
Research Paper Walkthroughs - • Simple Unsupervised Ke...
Data Augmentation in NLP - • Data Augmentation usin...
*********************************************
If you want to support me financially, which is totally optional and voluntary :) ❤️
You can consider buying me chai (because I don't drink coffee :)) at www.buymeacoffee.com/TechvizC...
*********************************************
⏩ KZread - / techvizthedatascienceguy
⏩ Blog - prakhartechviz.blogspot.com
⏩ LinkedIn - / prakhar21
⏩ Medium - / prakhar.mishra
⏩ GitHub - github.com/prakhar21
⏩ Twitter - / rattller
*********************************************
Tools I use for making videos :)
⏩ iPad - tinyurl.com/y39p6pwc
⏩ Apple Pencil - tinyurl.com/y5rk8txn
⏩ GoodNotes - tinyurl.com/y627cfsa
#techviz #datascienceguy #nlproc #researchpaper #naturallanguageprocessing #transformer #deeplearning

Comments: 40

  • @pranavdange5695 · 2 years ago

    Absolute beauty. I really liked your crisp and clear approach. Nice job

  • @TechVizTheDataScienceGuy · 2 years ago

    Thank you Pranav.

  • @SatrioWPutra · 3 years ago

    I have no difficulty following your explanation, great video!

  • @TechVizTheDataScienceGuy · 3 years ago

    Awesome, thank you!

  • @shivamkaushik6637 · 1 year ago

    Thanks for the explanation. Beautifully summarised.

  • @TechVizTheDataScienceGuy · 1 year ago

    Thank you :)

  • @aunkitchaki9943 · 2 years ago

    Amazing explanation. Please keep up your good work! ❤

  • @TechVizTheDataScienceGuy · 2 years ago

    Thank you so much Aunkit! Will keep doing so :)

  • @ashita1130 · 1 year ago

    Very useful to watch esp during strenuous cardio on treadmill, I gain knowledge and stay distracted 🙏💫

  • @TechVizTheDataScienceGuy · 1 year ago

    😄 thanks!

  • @ashishchoudhary7732 · 5 months ago

    Crisp and clear explanation. Thanks!

  • @TechVizTheDataScienceGuy · 5 months ago

    Thank you 😊

  • @gloriaabuka5644 · 2 years ago

    Thank you! Excellent explanation.

  • @TechVizTheDataScienceGuy · 2 years ago

    Thank you very much :)

  • @nischalgandigesuresha2367 · 7 months ago

    Thanks for the video. I wish you had explained a little more about the model architecture.

  • @shivprasadsagare6693 · 2 years ago

    Amazing content. Keep it up.

  • @TechVizTheDataScienceGuy · 2 years ago

    Thank you very much, Shivprasad.

  • @akshayak435 · 2 years ago

    Amazing!

  • @TechVizTheDataScienceGuy · 2 years ago

    Thank you!

  • @pulkitmehta1795 · 1 year ago

    Very well explained.

  • @TechVizTheDataScienceGuy · 1 year ago

    Thank you Pulkit 🙏

  • @LiloXiaoJieJie · 2 years ago

    great video!

  • @TechVizTheDataScienceGuy · 2 years ago

    Thanks!

  • @aqibfayyaz1619 · 2 years ago

    Good one

  • @TechVizTheDataScienceGuy · 2 years ago

    Thank you :)

  • @samyunyap7738 · 2 years ago

    Hi, I found that there are many different types of T5 models and was wondering what the differences are. Mainly, what is the difference between T5Model and T5ForConditionalGeneration?

  • @TechVizTheDataScienceGuy · 2 years ago

    Hey, the latter is just the base model with a language-modeling head on top of the decoder, whereas the former has no task-specific head and just returns the hidden states.
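    To make that distinction concrete, here is a toy NumPy sketch (not the actual Hugging Face classes; the shapes assume t5-small's d_model = 512 and T5's vocabulary size of 32128):

    ```python
    import numpy as np

    d_model, vocab_size = 512, 32128          # t5-small dimensions
    rng = np.random.default_rng(0)

    # What T5Model returns: decoder hidden states, shape (batch, seq_len, d_model).
    hidden_states = rng.standard_normal((1, 7, d_model))

    # What T5ForConditionalGeneration adds: a linear lm_head projecting the
    # hidden states to per-token vocabulary logits, shape (batch, seq_len, vocab).
    lm_head = rng.standard_normal((d_model, vocab_size))
    logits = hidden_states @ lm_head

    print(hidden_states.shape)  # (1, 7, 512)
    print(logits.shape)         # (1, 7, 32128)
    ```

    So you would use the base model as a feature extractor, and the generation variant whenever you want it to produce text.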

  • @samyunyap8077 · 2 years ago

    @@TechVizTheDataScienceGuy I see, thanks! Great video, btw!

  • @ratikagarg1494 · 3 years ago

    1.1k subscribers🔥

  • @TechVizTheDataScienceGuy · 3 years ago

    🎉

  • @user-kz2es8sg3f · 7 months ago

    But isn't GPT also like that: input is text, output is also text?

  • @YashwanthReddy-zr9nk · 3 years ago

    One small doubt: can you please let me know the embedding vector size of the T5 transformer?

  • @TechVizTheDataScienceGuy · 3 years ago

    I think in the original paper they have written 768 (for the Base model), but the Hugging Face implementation has 512 as the default. Kindly correlate. Thanks.
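    One quick way to check this yourself (assuming the Hugging Face `transformers` package is installed; building a config object downloads no model weights):

    ```python
    from transformers import T5Config

    # The library's default T5Config matches t5-small.
    config = T5Config()
    print(config.d_model)  # 512

    # Larger released checkpoints scale this up, e.g. t5-base uses d_model = 768.
    ```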

  • @YashwanthReddy-zr9nk · 3 years ago

    @@TechVizTheDataScienceGuy Thank you. You are doing a great job. Good Luck to you.

  • @TechVizTheDataScienceGuy · 3 years ago

    You’re welcome!

  • @rohanalytics · 3 years ago

    Amazing!

  • @TechVizTheDataScienceGuy · 3 years ago

    Thanks, Rohan!