Hi, my name is Sam Witteveen. I have worked with Deep Learning for 9 years and with Transformers and LLMs for 5+ years. I was appointed a Google Developer Expert for Machine Learning in 2017, and I currently work on LLMs and, since early 2023, on Autonomous Agents.
Comments
Can you make a video about LangChain v0.2?
Does this work well with threaded bg process?
We can't use this outside of Google Colab, like public links from Gradio
You can self-host it using the GitHub repo
@@RuturajZadbuke That is the problem: I want to use it from Google Colab
I'm totally new to LangChain and I'm getting error 429: {'message': 'You exceeded your current quota, please check your plan and billing details', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}. Please help me resolve this error.
Nice video. Streamlit and Gradio already have a lot of components. Curious about the decision to use Flask and not FastAPI?
I agree, this is something I thought too. FastAPI would give you API endpoints and Swagger for free, etc.
Streamlit also offers free hosting... would be good to point that out
Can I do this on Colab?
Why are people sleeping on Chainlit?
ChainLit is great, but my sense is that these are different tools for different use cases.
Compared to Gradio, is it simpler? From your video it looks similar...
Is there any way to host it for free and make a small app available to the public at a small scale?
Gradio has that. Basically the app runs locally, but Gradio provides tunnelling to a publicly accessible URL. You need to pass the share=True parameter when launching the app.
Any thoughts on how to use prompting to generate my UI on the fly for my user? I want to have a dynamic UI that is driven by the prompts.
Can you explain a bit more? What do you want to change or update, etc.?
Hi Sam, can you also do file upload, or does it keep stored memory of uploaded files in the backend that the FE can just query? Thanks
I am not sure if they have an upload feature; I will look into it
Awesome!
Creator of Mesop here. Thanks for creating this video! Big fan of your KZread channel so it was awesome to see this 😊
I know a lot of people have wondered why we made another Python UI framework. One of the reasons I didn't mention in the blog post is that it's very difficult to use most open-source projects, especially FE ones, within Google, due to requirements around web security and build integration
@@WillMesop 🔥🛠💎
I don't know if you or Sam are humans or bots. But you two are a treasure. Please keep going. Filthy casual here, just trying to level up and/or avoid obsolescence.
@WillMesop Awesome!! As soon as I saw it I knew I wanted to make a video to help it get some more attention. Very cool to see you chime in here. Thanks for the great work.
Nice bro
Thanks for this, good to see, but there was nothing here really to take me away from Streamlit, despite its occasional frustrations.
Thank you :)
“Often when you are making it for yourself, when you start to use it, perhaps some of the assumptions were totally wrong.” I laughed out loud so hard 😂. That hit home.
So grateful for these carefully crafted walk-throughs, the accompanying notebook, the detailed but concise narratives. Simply fantastic, Sam!
2:26 just got trippy real quick
Great overview, as usual 😎. I agree, the most interesting part is releasing the reward model. Although there is one thing that has been on my mind: did you notice its HumanEval score is low? It's around 73.2, compared to models like Llama 3 at 81 or Qwen 2 at 86. I asked on X (Twitter), and they said the Qwen score is 5-shot while theirs is 0-shot. I checked the Qwen tech report, and it's also 0-shot. I followed up with them about this but got no reply. This is crucial because HumanEval involves coding tasks, and if the instruct model has issues, the reward model might too.
so glad you simplified the presentation format. Much better without the stock video rolls.
I'm getting an error stating "TypeError: Chat.load_models() got an unexpected keyword argument 'compile'". Help
Free LLM to generate synthetic data? It's like a picks and shovels seller giving out free maps so that prospectors will do more mining and buy more picks and shovels.
😀 Well said.
Asinine. To make a good instruction-following dataset you'd need so much more: combined instructions, chain-of-thought, step-back questions, alphabetized lists and god knows what other tricks, breaking conversations into more turns, handling subject changes, and on and on
Welp. I was debating whether or not to get a server or a consumer platform given the upcoming wave of hardware but this tips me pretty heavily to a server platform, lol. I've always found it easier to work with models locally.
Can these models be adapted to a specific dataset for, say, the financial domain? Is there a possibility that they train their models on domain data?
Hey, can you make a tutorial on how to integrate a vectorstore into SalesGPT for e-commerce purposes?
3 A100 for int8 inference
The AI is advancing so fast. Is it open for everyone? What is the hardware requirement for this?
2 nodes with 8xH100 or 8xA100 80 GB. Or 1 node with 8x H200.
Did they also release software for the synthetic dataset generation pipeline? I saw that this will be a part of their NIM, with an explanation in their tech report, but I wasn't sure if they were going to release anything besides that.
Are we back to the -tron naming?
I'm coming from text-generation-webui; how can I use that model folder with Ollama?
Sam, your videos are great! To the point, easy to listen to, and no nonsense. I've noticed that most LLMs start their reply with a repetition of the question, or with words like "Sure, I can answer that...". Is there a way to make them suppress this output? I use Llama 3 (via Ollama) in my home automation, and I generate some text for TTS; it bothers me that it repeats instructions or puts those phrases before the actual text. Any help (also from the chat) is appreciated. :)
Generally this can be achieved via the alignment etc. With the bigger models you can do it via prompting and in-context examples. The challenge is that many orgs are actually doing instruction tuning to get the models to do exactly what you don't want. What model are you using?
@@samwitteveenai I use llama3:8b-instruct-q8_0. My current prompt is "You are a helpful assistant. Please, be brief and concise, the user doesn't like chatty LLMs. Try to be as precise as possible. Always use a step-by-step approach and ponder about the result before replying. No repetition or references to these instructions or apologies either, just the plain reply, please. If you do not have an answer, say so, don't make up wrong answers. Don't use any formatting, you may use UTF-8 smilies when appropriate."
I need to add that sometimes it works as expected, other times it doesn't.
@@PestOnYT My current prompt for Llama 3 is: "You are an AI assistant. You fulfill the user's requests in a neutral and informative manner. Do not be a sycophant to the user. Do not compliment the user. Do not thank the user." I find it gets rid of most of the things I personally do not care for.
@@ringpolitiet Nice. It worked fine for the couple of tests I just did. Thank you!
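For anyone wiring prompts like these into Ollama programmatically, here is a minimal sketch of a call to Ollama's /api/chat endpoint using only the standard library. The model name and prompt wording below are just examples, not the exact ones from the thread:

```python
import json
import urllib.request

# Example system prompt in the spirit of the ones above; adjust to taste.
SYSTEM = (
    "You are an AI assistant. Answer directly, without repeating the "
    "question, without preambles like 'Sure, I can help with that', "
    "and without thanking or complimenting the user."
)

def build_chat_payload(model: str, user_msg: str) -> dict:
    # Request body shape for Ollama's /api/chat endpoint.
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": user_msg},
        ],
        "stream": False,
    }

def chat(user_msg: str, model: str = "llama3:8b-instruct-q8_0") -> str:
    # Assumes a local Ollama server on the default port.
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(build_chat_payload(model, user_msg)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]
```

Sending the instructions as a system message (rather than prepending them to every user turn) tends to make smaller instruct models follow them more consistently, though as noted above it is still not 100% reliable.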
Why do they have Sonnet and not Opus in the comparison? Does Opus beat it?
Probably. Though I did notice after recording the video that Nvidia were doing some zero-shot comparisons against few-shot numbers from other companies. I do think it's a decent model, and it's great to have the ability to generate datasets without any legal issues.
In my opinion, Opus beats it, especially for math and logic questions. However, it performs really well for summarization and RAG, which makes sense since Nvidia is focusing a lot on company internal local RAG deployments and has even previously published some Llama finetunes for that.
I have always wondered why a company would go through all the hassle of creating a new model just for the sake of creating it, ending up losing money and time, only for the model to turn out to suck and be way behind similar-sized ones. Then it struck me like lightning: this is NVIDIA, they don't give a damn, they want you to buy/rent GPUs to try their models!
This is Nvidia. They gave you tools to make new and better models. People who want to run the models you make will also need graphics cards (then there are people like me who run 8B models on their smartphones, lol).
Nvidia can certainly make money from people needing lots of GPUs for this. I think they also made it as a way to sell DGX machines which it is made to conveniently fit on. People who are buying a DGX usually want a model that they can run and FT locally for privacy reasons etc.
Releasing open models also makes perfect business sense when it comes to attracting and retaining the top research talent. As Meta knows, top researchers don’t just want a lot of money (which they get) they also want to see their results be used! (That results aren’t always strong is just the nature of research.)
@@samwitteveenai Another reason Nvidia is releasing this, I suspect, is that getting new, accurate training data, whether for fine-tuning or for creating a new model, is becoming more difficult. The large AI companies are creating their own synthetic data, but open-source users are not licensed to use most models to create synthetic data. To keep Nvidia's gravy train running there needs to be more data. An H100 costs as much as a new car, and a DGX with 8x H100, well, that's as much as a small train 🤣
@@jondo7680 The Llama 3 70B model, which is comparable to this model, will actually run on a well-specced MacBook Pro (you can't use it to make synthetic data, though). Maybe Nvidia is a little worried.
Sam, I like your content. How can I contact you about a project related to Vision?
Hey, thanks. Just ping me on LinkedIn. Easier to chat there.
thank you for the update:)
Interesting model
Informative video
So what's actually better about this compared to Whisper?
different kind of model, Whisper is for Speech to Text and this is Text to Speech (TTS)
@@samwitteveenai Oh, I often confuse the abbreviations. So it makes voice sounds.
Yes exactly.
This is really important stuff
It sounds terrible. And clearly they are lying about "10 million hours". Just two Chinese guys trying to rip you off.
It doesn't. Compared to what? For what price?
OMG your point is right on the spot! That's exactly the problem I had to deal with in my project
What version of Chroma DB were you using back then?
Not sure, I think 1 or 2. That was about a year ago.
Thank you, this was very instructive. Can you recommend the best libraries for: 1) sectioning a document based on topic changes, 2) summarizing each section while maintaining contextual continuity and coherence, and 3) combining the summaries into a cohesive final summary? I'm thinking something like transformers (Hugging Face), spaCy, Gensim, pandas?
I have a request. Can you please explain the Customer Support Bot that is an example in the LangGraph documentation? Or could you simplify some of the stuff from that tutorial so LangChain beginners who know agents and tools can follow it? I find the official LangGraph tutorial video on YT extremely lacking.
let me take a look into it.
@@samwitteveenai Thank you so much!
The fact that it allows you to get a random speaker sample and then hold it for later use is very intriguing. It's something I wished for the first time I encountered ElevenLabs and similar platforms like Suno. Additionally, training the model with those extra tokens is another interesting feature that's challenging to achieve in ElevenLabs. Now, I'm curious about the format of the data when you get that random speaker. Is it a tensor? If so, can you perform arithmetic on it? For example, if you have a sample representing a happy speaker, could you add or subtract it to/from other sounds? This could lead to some fascinating applications, similar to how you can manipulate word embeddings (e.g., "king" - "male" + "female" = "queen"). I'll definitely take a closer look at this. Thanks for sharing!
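On the arithmetic question: whether ChatTTS speaker embeddings are as linearly well-behaved as word embeddings is untested here, but if the sampled speaker does come back as a plain vector or tensor, the operations themselves are just element-wise vector maths. A plain-Python sketch (function names hypothetical, not part of any ChatTTS API):

```python
def blend_speakers(spk_a, spk_b, alpha=0.5):
    """Linearly interpolate two speaker-embedding vectors.

    alpha=0.0 returns spk_a, alpha=1.0 returns spk_b; values in
    between mix the two voices, if the embedding space is that linear.
    """
    return [(1 - alpha) * a + alpha * b for a, b in zip(spk_a, spk_b)]

def add_direction(spk, direction, scale=1.0):
    # Push an embedding along a "style" direction, analogous to the
    # king - male + female = queen trick in word-embedding space.
    return [s + scale * d for s, d in zip(spk, direction)]
```

A "happy direction" could in principle be estimated as the mean difference between embeddings of happy-sounding and neutral samples, but that is speculation about the embedding space, not something the model documents.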
I'm using MBROLA for my TTS in my home automation. Though it is outdated, the quality is still the best of the tools I've seen so far. This ChatTTS looks very promising. Which version did you use? Currently it is at 0.0.5, and that doesn't work the way you described it, not even with the code sample shown on HF. The keyword "compile" is not in chat.load_models, and chat.sample_random_speaker doesn't exist either. I've used it with Python 3.11. BTW: would be nice if it could understand Speech Synthesis Markup Language (SSML). If anybody knows a similar TTS that does, drop me a hint please. :)
Any recommendations for Arabic embeddings?
Check out the Cohere multilingual ones, or BGE-M3, which is an open-source multilingual embedding model
Thank you! 😃
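For anyone who wants to try BGE-M3 quickly: the model ID "BAAI/bge-m3" is the one on Hugging Face, and it loads via sentence-transformers; treat the commented usage as an untested sketch. The cosine-similarity helper is plain Python:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Untested usage sketch (requires sentence-transformers and a model download):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("BAAI/bge-m3")
# embs = model.encode(["مرحبا بالعالم", "Hello world"])
# print(cosine(embs[0], embs[1]))  # Arabic/English pairs should score high
```

Because BGE-M3 is trained multilingually, semantically similar Arabic and English sentences should land close together, which is what makes it usable for Arabic retrieval without a dedicated Arabic-only model.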