Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full paper reads. You gain academic insights in a time-efficient, digestible format.
Comments
This will be bad, and it's no surprise that it's Google leading the charge . . . . off a cliff.
How was this video made? I mean, which tool was used?
github.com/imelnyk/ArxivPapers
A benchmark for reasoning in LLMs
Cool format! It would be really cool to have subtitles, though.
I think this can help humans to study better too, but that's just me.
It could potentially be used to build an ML builder-and-optimizer agent using an MLLM.
Does a benchmark for knowledge reasoning in LLMs exist?
Why didn't it use ROUGE for comparison against real data? What metric is used to evaluate whether the model hallucinates?
I don't understand why this doesn't get more likes than most other videos.
I don't like the lack of transferability. N-shot is not robust and has issues with example order, balance, and selection. Zero-shot instruction-learning techniques feel more robust.
I wonder how Western scientists can create such a compact, complex, yet useful paper like this. I wish my lecturers did this too.
Terrific!!! Any torch code around?
Crazy how fast we went from scaling laws v1, to theoretical performance indicators allowing for early stopping, to continual training methods without specified training steps or learning rates.
Is it related to scrapegraphai?
Is it related to the recent Kaggle competition about enhancing LLMs for Math Olympiad problems?
I have a question: why are there no new papers on the Mamba architecture?
Mamba sucks at ICL, so it's basically unusable for LLMs. There are a few papers on this channel about it.
@@robsim What did you mean by ICL?
@@kingki1953 LMAO!!! I was just about to say it's because Mamba sucks... BTW, ICL means In-Context Learning.
This is a paper, created by AI, read by AI, analyzed by AI, for AI...
I think we can use this for text summarization.
Can it improve RAG?
Great format also
Great format
I'm no expert, but this is leaning towards neuroscience. It might be good to combine methods at some point.
I agree, but for a different type of analysis. In terms of reasoning and how it should be interpreted, it should be viewed more as multi-hop graph traversal, leaning more towards graph theory when it comes to the mechanistic traits of reasoning in a model's latent space. Just my two cents.
The "Florida" in the ai voiceover really makes this video great.
Imagine if there were also philosophical reasoning in AI. It would suggest that AI could also do abstract thinking. I hope someone creates this paper.
Hey, I like the content very much. I have a question, though: how do you make these? Summarization with GPT plus text-to-speech models? It's way better than I would expect from such a combo.
yes, you can take a look at the process here: github.com/imelnyk/ArxivPapers
❤ Scaling Laws 2.0 as a means of optimizing pre-training, categorizing the relative potential performance distributions of architectures, etc. In other words, the paper introduces an energy function as an analytical tool to model and understand the memorization and retrieval process of Transformer-based language models, providing insights into their performance and convergence dynamics during training.
There is one thing I don't understand: "we present..." but no code. They provide so much detail but no code. Why? What does that say in itself? A trick played on the competition? Why, after releasing such extensive details, do you not release any code or model at all? "Chameleon outperforms..." Where? Because you say so? That "detokenization" of the image representation sounds a lot like nonsense.
Considering that in the last year some very compelling tokenizers have been released, like morphpiece for instance, this is pretty huge. Phenomenal research.
Good stuff... this one is more "online training" than the other trash paper released recently by GPT-4.
You shouldn't let GPT choose your paper title. "Online Training"... I wonder whether you really know what that means. You can't call it "online training", which basically means in-line continuous training, just because your dataset comes from the internet. Online training is about the duality between inference and training, not about the `py.requests.get(url)` data source.
LOL... what a shame. Obviously a GPT-made paper that exhibits no knowledge of JEPA, so it presents joint embedding as if that were the novelty here. Get a Pythia-70M to perform like a Pythia-410M with your "technique", so we can see whether "architecture" is the key to performance and not common sense about what the size of a neural network can do. "Platonic representation"... and what is joint embedding, a flower-power representation?
Thank you. Would you tell me how this video was made? Is it also made with an AI tool (like a paper-summarization tool)?
Instructions unclear, reversed a curse now very lucky
The curse of diversity - bahahahahahahaha!
Good
Thank you for the reading, from Brazil.
3:34 it's a b.s. paper :)
He He? Ha Ha?
Excellent!
Here's a summary of the key points: Chatbots like GPT-4 and ChatGPT are now widely used by millions, but there is a lack of public datasets showing how these tools are actually used by a diverse population of users in practice. To address this gap, researchers offered free access to ChatGPT in exchange for users' consent to anonymously collect their chat transcripts and request headers. This resulted in WildChat, a corpus of 1 million user-ChatGPT conversations with over 2.5 million interaction turns. WildChat contains more diverse user prompts, languages, and potentially toxic use cases compared to other chatbot interaction datasets. In addition to chat transcripts, WildChat includes demographic data like location and hashed IP addresses to enable analysis across regions and times. The diverse use cases captured make WildChat potentially useful for fine-tuning instruction-following models. WildChat is released publicly under AI2 ImpACT Licenses at the provided URL.
This video should be receiving millions of views
sounds sus
Summary by @erinkhoo

1. Apple has released OpenELM, a family of open-source large language models (LLMs) designed to run efficiently on Apple devices. There are 4 pre-trained models of different sizes: 270M, 450M, 1.1B, and 3B parameters.
2. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each transformer layer. This leads to enhanced accuracy compared to other open LLMs like OLMo, while requiring 2x fewer pre-training tokens[1][6].
3. Unlike typical releases that only provide model weights and inference code, Apple has open-sourced the complete training and evaluation framework. This includes training logs, multiple checkpoints, pre-training configurations on public datasets, and code to convert models for on-device inference using Apple's MLX library[1][6].
4. In benchmarks, the 1.1B-parameter OpenELM model outperforms the 1.2B OLMo model by 2.36% accuracy on the OpenLLM leaderboard tasks, while using half the pre-training data[1][4]. However, OpenELM is slower than OLMo in inference due to a naive implementation of the RMSNorm layer[2].
5. Apple has released the models on the Hugging Face hub, making them easily accessible for developers to experiment with and integrate into applications[10][12].

Implications for Apple:

1. Enabling on-device AI: OpenELM models are compact and optimized to run on iPhones, iPads, and Macs rather than relying on the cloud. This enables fast, local AI processing for features like composing emails, while preserving privacy[4][10].
2. Catching up in the AI race: By open-sourcing capable LLMs, Apple is making strides to match competitors like Google and Microsoft in the generative AI space. OpenELM could power upcoming AI features rumored for iOS 16[10][12].
3. Empowering developers and researchers: Releasing the full training stack allows the community to investigate the models for potential biases and risks. Developers can easily leverage the models as-is or fine-tune them for specific applications[1][6].
4. Paving the way for more open research: OpenELM represents a shift in Apple's typically closed approach. It may signal more open collaboration with the research community to accelerate AI progress in a responsible manner[1][8].
5. Balancing cloud and on-device AI: While OpenELM focuses on on-device inference, Apple is also exploring integrating large cloud models from Google and OpenAI into its products. The strategy seems to be a mix of efficient on-device models for fast, private processing and cloud models for more advanced capabilities[4][10].

In summary, OpenELM marks a significant step for Apple in the AI domain. By open-sourcing efficient models and empowering the research community, Apple aims to responsibly democratize and accelerate AI development, while still preserving its core values around privacy and on-device processing. However, more work is needed to optimize model performance to match the efficiency of other open LLMs.
Citations:
[1] OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework (PDF) ppl-ai-file-upload.s3.amazonaws.com/web/direct-files/12393781/3f2bba30-d8b4-4856-84ac-a33b3af97b59/OpenELM- An Efficient Language Model Family with Open-source Training and Inference Framework.pdf
[2] Apple releases OpenELM, a slightly more accurate LLM - Theregister www.theregister.com/2024/04/24/apple_openelm_ai/
[3] Apple Releases Open Source AI Models That Run On-Device - Reddit www.reddit.com/r/apple/comments/1ccca6l/apple_releases_open_source_ai_models_that_run/
[4] Apple's New AI Model: OpenELM and What It Means for iPhone Capabilities www.stork.ai/blog/apples-new-ai-model-openelm-and-what-it-means-for-iphone-capabilities
[5] Apple releases OpenELM: small, open source AI for devices | VentureBeat venturebeat.com/ai/apple-releases-openelm-small-open-source-ai-models-designed-to-run-on-device/
[6] [2404.14619] OpenELM: An Efficient Language Model Family with Open ... arxiv.org/abs/2404.14619
[7] OpenELM: An Efficient Language Model Family with Open-source ... huggingface.co/papers/2404.14619
[8] OpenELM: An Efficient Language ... - Apple Machine Learning Research machinelearning.apple.com/research/openelm
[9] [PDF] The OpenELM Library: Leveraging Progress in Language Models for ... openreview.net/pdf?id=C0SGtHjr4wK
[10] Apple releases OpenELM family of AI models for small on-device tasks tech.hindustantimes.com/tech/news/apple-releases-openelm-family-of-ai-models-for-small-on-device-tasks-all-you-need-to-know-71714021035566.html
[11] CarperAI/OpenELM: Evolution Through Large Models - GitHub github.com/CarperAI/OpenELM
[12] Apple Releases 4 New Open-Source AI Models that Run ... beebom.com/apples-releases-open-source-ai-models-run-on-device/
[13] OpenELM: An Efficient Language Model Family with Open ... - KZread kzread.info/dash/bejne/k4Nlutiwd7eel9I.html
[14] Apple Machine Learning Research machinelearning.apple.com/research/?year=2024
[15] OpenELM: Apple's play for democratizing AI development www.absolutegeeks.com/article/tech-news/openelm-apples-play-for-democratizing-ai-development/
[16] What is the implication of it being a reference design? They had two ... www.threads.net/%40daniel_rubino/post/C6J36UwOnV6
The audio engine reads 3.5 as "3... 5". Maybe substituting "." with "point" in certain cases using a regex could improve the audio generation.
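A minimal sketch of that suggestion, assuming the TTS input is plain text and that reading the dot as "point" is the desired fix (the function name and choice of replacement word are mine, not the channel's actual pipeline):

```python
import re

def spell_out_decimals(text: str) -> str:
    """Replace the dot inside decimal numbers (e.g. "3.5") with the
    word "point" so a TTS engine reads it naturally instead of
    treating it as a sentence-ending pause. Sentence-final periods
    are left untouched because the lookarounds require a digit on
    both sides of the dot."""
    return re.sub(r"(?<=\d)\.(?=\d)", " point ", text)

print(spell_out_decimals("GPT-3.5 scored 87.2 on the benchmark."))
# → GPT-3 point 5 scored 87 point 2 on the benchmark.
```

Version strings like "3.5" and decimals like "87.2" are both caught; a real pipeline might also special-case version numbers with multiple dots (e.g. "2.5.1").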
Why not use ElevenLabs or OpenAI's TTS?
AI needs to learn Blender to replicate 2D images as synthetic data. Using Blender's algorithms, etc., is key to understanding the real world.
Good to hear this method exists.
This one doesn't have audio.
Do not stop doing what you love, you are very good at it!
does anybody realize how HUGE this is???