Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full paper reads. You gain academic insights in a time-efficient, digestible format.
Comments
This will be bad, and it's no surprise that it's Google leading the charge . . . . off a cliff.
How was this video made? I mean, which tool was used?
github.com/imelnyk/ArxivPapers
A benchmark for reasoning in LLMs
Cool format! It would be really cool to have subtitles, though.
I think this can help humans to study better too, but that's just me.
It could potentially be used to build an ML builder-and-optimizer agent using an MLLM.
Does a benchmark for knowledge reasoning in LLMs exist?
Why didn't it use ROUGE for comparison against real data? What metric is used to evaluate whether the model hallucinates?
I don't understand why this doesn't get more likes than most other videos.
I don't like the lack of transferability. N-shot is not robust and has issues with example order, balance, and selection. Zero-shot instruction-learning techniques feel more robust.
I wonder how Western scientists can create such a compact, complex, yet useful paper like this. I wish my lecturers did this too.
Terrific!!! Any torch code around?
Crazy how fast we went from scaling laws v1, to theoretical performance indicators allowing for early stopping, to continual training methods without specified training steps or learning rates.
Is it related to scrapegraphai?
Is it related to the recent Kaggle competition about enhancing LLMs for Math Olympiad problems?
I have a question: why are there no new papers on the Mamba architecture?
Mamba sucks at ICL, so it's basically unusable for LLMs. There are a few papers on this channel about it.
@@robsim What did you mean by ICL?
@@kingki1953 LMAO!!! I was just about to say it's because Mamba sucks... BTW, ICL means In-Context Learning.
This is a paper, created by AI, read by AI, analyzed by AI, for AI...
I think we can use this for text summarization.
Can it improve RAG?
Great format also
Great format
I'm no expert, but this is leaning towards neuroscience. It might be good to combine methods at some point.
I agree, but for a different type of analysis. In terms of reasoning and how it should be interpreted, it should be viewed more as multi-hop graph traversal, leaning more towards graph theory when it comes to the mechanistic traits of reasoning in a model's latent space. Just my two cents.
The "Florida" in the ai voiceover really makes this video great.
Imagine if there were also philosophical reasoning in AI. It would suggest that AI could also do abstract thinking. I hope someone creates this paper.
Hey, I like the content very much. I have a question, though: how do you make these? Summarization with GPT plus text-to-speech models? It's way better than I would expect from such a combo.
yes, you can take a look at the process here: github.com/imelnyk/ArxivPapers
❤ Scaling Laws 2.0 as a means of optimizing pre-training, categorizing the relative potential performance distributions of architectures, etc. In other words, the paper introduces an energy function as an analytical tool to model and understand the memorization and retrieval process of Transformer-based language models, providing insights into their performance and convergence dynamics during training.
There is one thing I don't understand: "we present..." but no code. They provide so much detail but no code. Why? What does that say in itself? A trick played on the competition? Why, after releasing such extensive details, do you not release any code or model at all? "Chameleon outperforms..." Where? Because you say so? That "detokenization" of the image representation sounds a lot like nonsense.
Considering that in the last year some very compelling tokenizers have been released, like morphpiece for instance, this is pretty huge. Phenomenal research.
Good stuff... this one is more "online training" than the other trash paper released recently by GPT-4.
You shouldn't let GPT choose your paper title. "Online Training"... I wonder whether you really know what that means. You can't call it "online training", which basically means in-line continuous training, just because your dataset comes from the internet. Online training is about the duality between inference and training, not about the `py.requests.get(url)` data source.
LOL... what a shame. Obviously a GPT-made paper that exhibits no knowledge of JEPA, so it presents joint embedding as if that were the novelty here. Get a Pythia-70M to perform like a Pythia-410M with your "technique", so we can see whether "architecture" is the key to performance and not common sense about what the size of a neural network can do. "Platonic representation"... and what is joint embedding, a flower-power representation?
Thank you. Would you tell me how this video was made? Is it also made with an AI tool (like a paper-summarization tool)?
Instructions unclear, reversed a curse now very lucky
The curse of diversity - bahahahahahahaha!
Good
Thank you for the reading, from Brazil.
3:34 it's a b.s. paper :)
He He? Ha Ha?
Excellent!
Here's a summary of the key points: Chatbots like GPT-4 and ChatGPT are now widely used by millions, but there is a lack of public datasets showing how these tools are actually used by a diverse population of users in practice. To address this gap, researchers offered free access to ChatGPT in exchange for users' consent to anonymously collect their chat transcripts and request headers. This resulted in WildChat, a corpus of 1 million user-ChatGPT conversations with over 2.5 million interaction turns. WildChat contains more diverse user prompts, languages, and potentially toxic use cases compared to other chatbot interaction datasets. In addition to chat transcripts, WildChat includes demographic data like location and hashed IP addresses to enable analysis across regions and times. The diverse use cases captured make WildChat potentially useful for fine-tuning instruction-following models. WildChat is released publicly under AI2 ImpACT Licenses at the provided URL.
This video should be receiving millions of views
sounds sus
Summary by @erinkhoo

1. Apple has released OpenELM, a family of open-source large language models (LLMs) designed to run efficiently on Apple devices. There are 4 pre-trained models of different sizes: 270M, 450M, 1.1B, and 3B parameters.
2. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each transformer layer. This leads to enhanced accuracy compared to other open LLMs like OLMo, while requiring 2x fewer pre-training tokens[1][6].
3. Unlike typical releases that only provide model weights and inference code, Apple has open-sourced the complete training and evaluation framework. This includes training logs, multiple checkpoints, pre-training configurations on public datasets, and code to convert models for on-device inference using Apple's MLX library[1][6].
4. In benchmarks, the 1.1B-parameter OpenELM model outperforms the 1.2B OLMo model by 2.36% accuracy on the OpenLLM leaderboard tasks, while using half the pre-training data[1][4]. However, OpenELM is slower than OLMo in inference due to a naive implementation of the RMSNorm layer[2].
5. Apple has released the models on the Hugging Face hub, making them easily accessible for developers to experiment with and integrate into applications[10][12].

Implications for Apple:

1. Enabling on-device AI: OpenELM models are compact and optimized to run on iPhones, iPads, and Macs rather than relying on the cloud. This enables fast, local AI processing for features like composing emails, while preserving privacy[4][10].
2. Catching up in the AI race: By open-sourcing capable LLMs, Apple is making strides to match competitors like Google and Microsoft in the generative AI space. OpenELM could power upcoming AI features rumored for iOS 16[10][12].
3. Empowering developers and researchers: Releasing the full training stack allows the community to investigate the models for potential biases and risks. Developers can easily leverage the models as-is or fine-tune them for specific applications[1][6].
4. Paving the way for more open research: OpenELM represents a shift in Apple's typically closed approach. It may signal more open collaboration with the research community to accelerate AI progress in a responsible manner[1][8].
5. Balancing cloud and on-device AI: While OpenELM focuses on on-device inference, Apple is also exploring integrating large cloud models from Google and OpenAI into its products. The strategy seems to be a mix of efficient on-device models for fast, private processing and cloud models for more advanced capabilities[4][10].

In summary, OpenELM marks a significant step for Apple in the AI domain. By open-sourcing efficient models and empowering the research community, Apple aims to responsibly democratize and accelerate AI development, while still preserving its core values around privacy and on-device processing. However, more work is needed to optimize model performance to match the efficiency of other open LLMs.
Citations:
[1] OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework (PDF) ppl-ai-file-upload.s3.amazonaws.com/web/direct-files/12393781/3f2bba30-d8b4-4856-84ac-a33b3af97b59/OpenELM- An Efficient Language Model Family with Open-source Training and Inference Framework.pdf
[2] Apple releases OpenELM, a slightly more accurate LLM - Theregister www.theregister.com/2024/04/24/apple_openelm_ai/
[3] Apple Releases Open Source AI Models That Run On-Device - Reddit www.reddit.com/r/apple/comments/1ccca6l/apple_releases_open_source_ai_models_that_run/
[4] Apple's New AI Model: OpenELM and What It Means for iPhone Capabilities www.stork.ai/blog/apples-new-ai-model-openelm-and-what-it-means-for-iphone-capabilities
[5] Apple releases OpenELM: small, open source AI for devices | VentureBeat venturebeat.com/ai/apple-releases-openelm-small-open-source-ai-models-designed-to-run-on-device/
[6] [2404.14619] OpenELM: An Efficient Language Model Family with Open ... arxiv.org/abs/2404.14619
[7] OpenELM: An Efficient Language Model Family with Open-source ... huggingface.co/papers/2404.14619
[8] OpenELM: An Efficient Language ... - Apple Machine Learning Research machinelearning.apple.com/research/openelm
[9] [PDF] The OpenELM Library: Leveraging Progress in Language Models for ... openreview.net/pdf?id=C0SGtHjr4wK
[10] Apple releases OpenELM family of AI models for small on-device tasks tech.hindustantimes.com/tech/news/apple-releases-openelm-family-of-ai-models-for-small-on-device-tasks-all-you-need-to-know-71714021035566.html
[11] CarperAI/OpenELM: Evolution Through Large Models - GitHub github.com/CarperAI/OpenELM
[12] Apple Releases 4 New Open-Source AI Models that Run ... beebom.com/apples-releases-open-source-ai-models-run-on-device/
[13] OpenELM: An Efficient Language Model Family with Open ... - KZread kzread.info/dash/bejne/k4Nlutiwd7eel9I.html
[14] Apple Machine Learning Research machinelearning.apple.com/research/?year=2024
[15] OpenELM: Apple's play for democratizing AI development www.absolutegeeks.com/article/tech-news/openelm-apples-play-for-democratizing-ai-development/
[16] What is the implication of it being a reference design? They had two ... www.threads.net/%40daniel_rubino/post/C6J36UwOnV6
The audio engine reads 3.5 as "3... 5". Maybe substituting "." with "point" in certain cases using a regex could improve the audio generation.
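A minimal sketch of that suggestion, assuming the TTS input is plain text and that reading the dot as "point" is the desired fix (the function name and choice of replacement word are mine, not the channel's actual pipeline):

```python
import re

def spell_out_decimals(text: str) -> str:
    """Replace the dot inside decimal numbers (e.g. "3.5") with the
    word "point" so a TTS engine reads it naturally instead of
    treating it as a sentence-ending pause. Sentence-final periods
    are left untouched because the lookarounds require a digit on
    both sides of the dot."""
    return re.sub(r"(?<=\d)\.(?=\d)", " point ", text)

print(spell_out_decimals("GPT-3.5 scored 87.2 on the benchmark."))
# → GPT-3 point 5 scored 87 point 2 on the benchmark.
```

Version strings like "3.5" and decimals like "87.2" are both caught; a real pipeline might also special-case version numbers with multiple dots (e.g. "2.5.1").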
Why not use ElevenLabs or OpenAI's TTS?
AI needs to learn Blender to replicate 2D images as synthetic data. Using Blender's algorithms, etc., is key to understanding the real world.
Good to hear this method exists.
This one doesn't have audio.
Do not stop doing what you love, you are very good at it!
does anybody realize how HUGE this is???