Nemotron-4 340B - Need to Make an LLM Dataset?

Science & Technology

In this video, I talk about the new Nemotron model from NVIDIA and how it goes beyond just one model to be a whole family of models that lets you generate endless amounts of free synthetic data to train your own language models.
Blog: blogs.nvidia.com/blog/nemotro...
Tech Report: research.nvidia.com/publicati...
Testing the model: chat.lmsys.org/
🕵️ Interested in building LLM Agents? Fill out the form below
Building LLM Agents Form: drp.li/dIMes
👨‍💻Github:
github.com/samwit/langchain-t... (updated)
github.com/samwit/llm-tutorials
⏱️Time Stamps:
00:00 Intro
00:45 Benchmarks
01:11 NVIDIA Blog
01:26 NVIDIA Hugging Face
01:36 Nemotron-4 340B-Instruct
03:12 Nemotron-4 340B Technical Paper
04:11 HelpSteer2
06:25 RewardBench
07:37 Nemotron-4 340B Demo on LMSYS Chatbot Arena

Comments: 36

  • @WillJohnston-wg9ew · 9 days ago

    so glad you simplified the presentation format. Much better without the stock video rolls.

  • @micbab-vg2mu · 10 days ago

    thank you for the update:)

  • @anonymousguy8748 · 10 days ago

    Informative video

  • @user-en4ek6xt6w · 10 days ago

    Interesting model

  • @unclecode · 9 days ago

    Great overview, as usual 😎. I agree, the most interesting part is releasing the reward model. Although there is one thing that engaged my mind: did you notice its HumanEval score is low? It's around 73.2, compared to models like Llama 3 at 81 or Qwen 2 at 86. I asked on X (Twitter), and they said the Qwen score is 5-shot while theirs is 0-shot. I checked the Qwen tech report, and it's also 0-shot. I followed up with them about this but got no reply. This is crucial because HumanEval involves coding datasets, and if the instruct model has issues, the reward model might too.

  • @figs3284 · 10 days ago

    Did they also release software for the pipeline of synthetic dataset generation? I saw that this will be a part of their NIM, with an explanation in their tech report, but I wasn't sure if they were going to release anything besides that.

  • @novantha1 · 10 days ago

    Welp. I was debating whether to get a server or a consumer platform given the upcoming wave of hardware, but this tips me pretty heavily toward a server platform, lol. I've always found it easier to work with models locally.

  • @PestOnYT · 10 days ago

    Sam, your videos are great! To the point, easy to listen to, and no nonsense. I've noticed that most LLMs start their reply with a repetition of the question, or words like "sure, I can answer that...". Is there a way to make them suppress this output? I use llama3 (via Ollama) in my home automation to generate some text for TTS, and it bothers me that it repeats instructions or puts those phrases before the actual text. Any help (also from the chat) is appreciated. :)

  • @samwitteveenai · 10 days ago

    Generally this can be achieved via the alignment etc. With the bigger models you can do it via prompting and in-context examples. The challenge is that many orgs are actually doing instruction tuning to get the models to do exactly what you don't want. What model are you using?

  • @PestOnYT · 10 days ago

    @@samwitteveenai I use llama3:8b-instruct-q8_0. My current prompt is "You are a helpful assistant. Please, be brief and concise, the user doesn't like chatty LLMs. Try to be as precise as possible. Always use a step-by-step approach and ponder about the result before replying. No repetition or references to these instructions or apologies either, just the plain reply, please. If you do not have an answer, say so, don't make up wrong answers. Don't use any formatting, you may use UTF-8 smilies when appropriate."

  • @PestOnYT · 10 days ago

    I need to add that sometimes it works as expected, other times it doesn't.

  • @ringpolitiet · 10 days ago

    @@PestOnYT My current prompt for Llama 3 is: "You are an AI assistant. You fulfill the user's requests in a neutral and informative manner. Do not be a sycophant to the user. Do not compliment the user. Do not thank the user." I find it gets rid of most of the things I personally do not care for.

  • @PestOnYT · 10 days ago

    @@ringpolitiet Nice. It worked fine for the couple of tests I just did. Thank you!
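The system-prompt approach discussed in this thread can be sketched against Ollama's REST API. This is a minimal sketch, assuming a local Ollama server on the default port 11434; the model name and prompt wording are taken from the comments above:

```python
import json
import urllib.request

# System prompt suggested in the thread to suppress chatty prefixes.
SYSTEM_PROMPT = (
    "You are an AI assistant. You fulfill the user's requests in a neutral "
    "and informative manner. Do not be a sycophant to the user. "
    "Do not compliment the user. Do not thank the user."
)

def build_chat_request(user_message: str) -> dict:
    """Build an Ollama /api/chat payload with the system prompt prepended."""
    return {
        "model": "llama3:8b-instruct-q8_0",
        "stream": False,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    }

def ask(user_message: str, host: str = "http://localhost:11434") -> str:
    """POST the chat request to a local Ollama server and return the reply text."""
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(build_chat_request(user_message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

Putting the constraints in the `system` message rather than the user turn tends to make them stick across a multi-turn conversation, which matters for a home-automation TTS pipeline like the one described above.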

  • @quebono100 · 10 days ago

    Are we back to the -tron naming?

  • @rakeshkumarrout2629 · 10 days ago

    Hey, can you make a tutorial on how to integrate a vector store into SalesGPT for e-commerce purposes?

  • @hqcart1 · 10 days ago

    I have always wondered why a company would go through all the hassle of creating a new model just for the sake of creating it, losing money and time, only for the model to turn out to suck and be way behind similar-sized ones. Then it struck me like lightning!! This is NVIDIA, they don't give a damn, they want you to buy/rent GPUs to try their models!

  • @jondo7680 · 10 days ago

    This is Nvidia. They gave you tools to make new and better models. People who want to run the models you make will also need graphics cards (then there are people like me who run 8B models on their smartphones lol).

  • @samwitteveenai · 10 days ago

    Nvidia can certainly make money from people needing lots of GPUs for this. I think they also made it as a way to sell DGX machines, which it is made to fit on conveniently. People who are buying a DGX usually want a model that they can run and fine-tune locally for privacy reasons etc.

  • @mshonle · 10 days ago

    Releasing open models also makes perfect business sense when it comes to attracting and retaining the top research talent. As Meta knows, top researchers don’t just want a lot of money (which they get) they also want to see their results be used! (That results aren’t always strong is just the nature of research.)

  • @toadlguy · 10 days ago

    @@samwitteveenai Another reason Nvidia is releasing this, I suspect, is that getting new accurate training data, either for fine-tuning or creating a new model, is becoming more difficult. The large AI companies are creating their own synthetic data, but open-source users are not licensed to use most models to create synthetic data. To keep Nvidia's gravy train running, there needs to be more data. An H100 costs as much as a new car, and a DGX with 8xH100, well, that's as much as a small train 🤣

  • @toadlguy · 10 days ago

    @@jondo7680 The Llama 3 70B model, which is comparable to this model, will actually run on a well-specced MacBook Pro (you can't use it to make synthetic data, though). Maybe Nvidia is a little worried.

  • @user-hk9jy9qh3r · 10 days ago

    Sam, I like your content. How can I contact you about a project related to Vision?

  • @samwitteveenai · 10 days ago

    Hey, thanks. Just ping me on LinkedIn. Easier to chat there.

  • @jondo7680 · 10 days ago

    Why do they have Sonnet and not Opus in the comparison? Does Opus beat it?

  • @samwitteveenai · 10 days ago

    Probably. Though I did notice after recording the video that Nvidia was doing some zero-shot comparisons against few-shot numbers from other companies. I do think it's a decent model, and it's great to have the ability to generate datasets without any legal issues.

  • @phil-jc8hp · 10 days ago

    In my opinion, Opus beats it, especially for math and logic questions. However, it performs really well for summarization and RAG, which makes sense since Nvidia is focusing a lot on internal local RAG deployments for companies and has even previously published some Llama fine-tunes for that.

  • @DaeOh · 10 days ago

    Asinine. To make a good instruction-following dataset you'd need so much more: combined instructions, chain-of-thought, step-back questions, alphabetized lists and god knows what other tricks, breaking conversations into more turns, handling subject changes, and on and on.

  • @user-ru1qz1bo2q · 7 days ago

    Looks like it's aimed more at creating pre-training data or domain-knowledge fine-tuning datasets.

  • @3a146 · 10 days ago

    3 A100 for int8 inference

  • @vio1ator223 · 10 days ago

    Free LLM to generate synthetic data? It's like a picks and shovels seller giving out free maps so that prospectors will do more mining and buy more picks and shovels.

  • @samwitteveenai · 9 days ago

    😀 Well said.

  • @nufh · 10 days ago

    AI is advancing so fast. Is it open for everyone? What are the hardware requirements for this?

  • @clray123 · 9 days ago

    2 nodes with 8xH100 or 8xA100 80 GB. Or 1 node with 8x H200.
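Those node counts line up with a rough weights-only memory estimate. A minimal sketch, assuming 340B parameters and ignoring KV cache and activation overhead (so real deployments need more headroom):

```python
import math

def min_gpus_for_weights(params_b: float, bytes_per_param: float, gpu_gb: int) -> int:
    """Rough GPU count needed just to hold the model weights."""
    weight_gb = params_b * bytes_per_param  # billions of params * bytes ≈ GB
    return math.ceil(weight_gb / gpu_gb)

# bf16 weights: 340B * 2 bytes ≈ 680 GB, more than one 8x80GB node (640 GB)
# can hold, which is why two H100/A100 nodes are quoted; a single node of
# 8x H200 (141 GB each, ~1128 GB total) clears the bar on its own.
print(min_gpus_for_weights(340, 2, 80))  # 9 GPUs minimum at bf16
```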

  • @Siqum · 6 days ago

    Prompt it with this, it's so bad: ``` $0.0075 / 1k characters how much is it per 1million characters ```
