LLMs for Everything and Everyone! - Sebastian Raschka - Lightning AI

Science and technology

In this conference talk, titled "LLMs for Everything and Everyone!", Sebastian Raschka, Lead AI Educator at Lightning AI, gives an approachable tour of the world of large language models (LLMs). LLMs such as GPT-3.5 have revolutionized many areas of artificial intelligence, and Sebastian highlights their versatility and widespread applicability while also addressing the small-versus-big-model debate.
You will gain insight into the practical uses of LLMs in diverse scenarios, discovering how these models can be harnessed to tackle complex problems and streamline workflows. Raschka's experience as an AI educator shines through as he breaks down complex concepts into understandable insights, making the talk accessible to novices and experts alike.
✅ Connect with Sebastian: / sebastianraschka
✅ Connect with Southern Data Science Conference on LinkedIn - / southerndatasciencecon...
✅ Connect with Southern Data Science Conference on Twitter: / southerndsc

Comments: 6

  • @wisconsuper
    10 months ago

    🎯 Key Takeaways for quick navigation:
    00:29 🤖 Sebastian Raschka is the lead AI educator at Lightning AI and a former professor of statistics.
    00:43 🤔 LLMs (large language models) promise to make us more productive by speeding up tasks like writing emails and coding.
    01:52 🦾 Sebastian discusses the motivation for caring about LLMs, including keeping up with the news and the potential for increased productivity.
    04:09 🎛️ LLMs like ChatGPT go through three main stages: pre-training, supervised fine-tuning, and alignment.
    07:37 💡 Both the pre-training and fine-tuning stages use a next-word-prediction task, but the format of the datasets and instructions differs.
    10:09 🤖 LLMs go through a three-stage process: pre-training, fine-tuning, and alignment.
    11:31 🦾 Reinforcement learning from human feedback (RLHF) involves sampling prompts, having humans rank the responses, training a reward model, and refining the model with proximal policy optimization.
    13:21 💡 There are five main ways to use LLMs: everyday tasks, pre-training and prompting for specific domains, fine-tuning on specific data, fine-tuning and prompting for code-related tasks, and training custom models.
    16:36 👥 Bloomberg pre-trained its own LLM on general data plus finance-specific data to generate a proprietary database format and finance news headlines.
    18:13 🐍 Meta AI (formerly Facebook) built Code Llama by pre-training a model, further training it on general code data, fine-tuning it on Python code, and iterating through several fine-tuning stages.
    19:07 👥 Open-source LLMs expose log probabilities, which makes it possible to analyze the model's confidence and rank responses.
    21:24 🤖 Open-source LLMs provide privacy and control: the data never has to leave your computer and the model can be modified as desired.
    23:19 👍 Open-source LLMs are fully customizable and allow experimentation and modifications to the model.
    24:30 💡 Open-source LLMs don't change unless you want them to, which can be an advantage or a disadvantage depending on whether you need updates.
    25:40 🔌 Running open-source LLMs requires access to hardware and some extra work, such as cloning repositories and downloading weights.
    30:02 👥 Open-source models like Free Willy, Falcon, Vicuna, PhysioStable, Code Llama, and Lit-GPT offer freely available weights and a hackable, customizable codebase.
    31:27 🏢 Fine-tuning smaller models can achieve better performance on specific tasks than using larger general models like GPT-3 or GPT-4.
    32:49 💡 Fine-tuned models are task-specific and useful for solving specific business problems, while prompted models are more general-purpose.
    34:40 ⏱️ Parameter-efficient fine-tuning techniques like adapters and low-rank adaptation (LoRA) can save a significant amount of time compared to full fine-tuning (see the sketch after this comment).
    37:12 🌟 Future trends for LLMs include mixture-of-experts approaches, multi-modal models, and LLMs for specific domains such as protein-related tasks.
    39:33 🎯 Non-transformer large language models like RWKV, Hyena Hierarchy, and Retentive Network offer alternative approaches to language modeling that are worth keeping an eye on.
    39:49 🔄 There are alternatives to reinforcement-learning-based fine-tuning, such as relabeling data in hindsight and direct preference optimization, which show promising results and may simplify the process.
    41:09 📚 Sebastian Raschka stays up to date with new research papers and frequently discusses them on his blog. He is also involved in developing the open-source repository Lit-GPT for loading and customizing LLMs.
    41:22 🌐 The performance of fine-tuned models can vary with the specificity of the domain, and they can outperform larger pre-trained models in smaller, more specific domains.
    41:40 🎙️ There is no size limit for a domain when it comes to outperforming pre-trained models with fine-tuned models; performance varies with the specific task and the domain covered.
    Made with HARPA AI
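    To make the parameter-efficient fine-tuning point at 34:40 concrete, here is a minimal LoRA-style sketch in plain NumPy. It is illustrative only: the layer sizes, rank, and function names are invented for this example and are not taken from the talk. The idea is that instead of updating a full pretrained weight matrix, you train two small low-rank matrices whose product is added to the frozen weight.

```python
import numpy as np

# Minimal, illustrative LoRA-style sketch (hypothetical sizes, not from the talk).
d_in, d_out, rank = 768, 768, 8

rng = np.random.default_rng(0)
W = rng.normal(size=(d_in, d_out))          # frozen pretrained weight matrix
A = rng.normal(size=(d_in, rank)) * 0.01    # trainable low-rank factor, d_in x r
B = np.zeros((rank, d_out))                 # trainable low-rank factor, r x d_out (zero init)

def lora_forward(x):
    # Original frozen path plus the low-rank correction x @ A @ B.
    return x @ W + x @ A @ B

x = rng.normal(size=(1, d_in))
print(lora_forward(x).shape)                # (1, 768)

# Why this saves work: trainable parameters drop from d_in*d_out to d_in*r + r*d_out.
full_params = d_in * d_out
lora_params = d_in * rank + rank * d_out
print(f"full fine-tuning: {full_params:,} trainable params, LoRA: {lora_params:,}")
```

    The parameter counts printed at the end are the whole point of the technique: only the two small factors are updated during fine-tuning, which is where the time and memory savings over full fine-tuning come from.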

  • @zakiasalod891
    9 months ago

    Absolutely amazing! Thank you so much for this.

  • @Saitama-ur3lq
    8 months ago

    What exactly is a parameter in an LLM? What do you mean when you say this model has 1 billion parameters?

  • @kuterv
    8 months ago

    Parameters are numerical values, learned during training, that determine the model's behaviour and performance. They are what the model uses to capture patterns and relationships in the data: more parameters means higher capacity and the ability to perform more complex tasks. As a rough estimate, for every 1 billion data points you can expect approximately 10 billion parameters, where a data point is one unit of information used to train the model. For example, to train a model to understand positive and negative reviews, one data point would be one review ("tasty pizza and great service") plus its sentiment label (positive). Weights are the classic example of parameters: an LLM is a neural network with a huge number of connections, and the weights describe how strongly one neuron influences another. He also mentioned "temperature", a setting that controls the randomness of the generative process; it is usually applied to the whole model, though in some cases multiple temperature values are used. You can also adjust biases to reduce (probably only partially, depending on the dataset) offensive or harmful model responses.
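    To make the ideas of "parameters" and "temperature" from the reply above more concrete, here is a small illustrative sketch in NumPy. The layer sizes and logit values are invented for the example and are not from the talk or the comment.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny 2-layer network: every weight and bias entry is one learnable parameter.
W1, b1 = rng.normal(size=(16, 32)), np.zeros(32)   # 16*32 + 32 parameters
W2, b2 = rng.normal(size=(32, 4)),  np.zeros(4)    # 32*4  + 4  parameters
n_params = W1.size + b1.size + W2.size + b2.size
print("parameter count:", n_params)                # 676 for this toy model

# Temperature-scaled softmax over the model's output scores (logits):
# low temperature -> peaked, near-deterministic; high temperature -> flat, more random.
def softmax_with_temperature(logits, temperature):
    scaled = logits / temperature
    exp = np.exp(scaled - scaled.max())             # subtract max for numerical stability
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.5, -1.0])            # made-up scores for 4 candidate tokens
for t in (0.5, 1.0, 2.0):
    print(f"T={t}:", np.round(softmax_with_temperature(logits, t), 3))
```

    A billion-parameter LLM is the same idea at scale: the parameter count is just the total number of entries in all of its weight and bias matrices.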

  • @Saitama-ur3lq
    8 months ago

    @kuterv Thank you for the insanely detailed explanation. Let's say you wanted to train a model on reviews, since you mentioned it: how many parameters do you need, or does this somehow get set by the system during training?

  • @rodi4850
    9 months ago

    Nothing new, just recycled existing information on this topic. Very disappointing...
