Lunar Lake Running Local Small Language Model with RAG | Talking Tech | Intel Technology

Science & Technology

Talking Tech is making the rounds at the Intel Tech Tour in Taiwan, and today we’re showcasing Microsoft’s Phi-3 small language model, or SLM, running locally on a Lunar Lake-powered system, without the need for an internet connection or access to the cloud. In addition to showcasing Phi-3's efficiency and speed on Lunar Lake, the demo shows how retrieval-augmented generation (RAG) allows users to supplement the model’s knowledge with their own data to enable hyper-specific answers from trusted sources.
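
The demo's exact code isn't published, but the RAG flow it shows can be sketched in a few lines of Python with common open-source libraries (sentence-transformers and Hugging Face transformers here; the documents, model ID, and prompt format are illustrative assumptions, not Intel's implementation):

from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# User-supplied documents that supplement the model's built-in knowledge.
docs = [
    "Lunar Lake is an Intel client SoC with a CPU, GPU, and NPU on one package.",
    "XeSS is Intel's AI-based upscaling technology for games.",
]

# Embed the documents once; retrieval is then a cosine-similarity lookup.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, convert_to_tensor=True)

# Load the SLM locally; nothing leaves the machine after the initial download.
generator = pipeline("text-generation", model="microsoft/Phi-3-mini-4k-instruct")

def answer(question: str) -> str:
    # Retrieval step: pick the document most similar to the question.
    q_vec = embedder.encode(question, convert_to_tensor=True)
    context = docs[int(util.cos_sim(q_vec, doc_vecs).argmax())]
    # Generation step: the model answers grounded in the retrieved context.
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    return generator(prompt, max_new_tokens=100)[0]["generated_text"]

print(answer("What is Lunar Lake?"))
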
For more details on Lunar Lake disclosures at ITT, visit the Computex 2024 page on intel.com/performanceindex.
Learn more about Lunar Lake in our Talking Tech overview: • Lunar Lake Overview: I...
See AI upscaling in action with F1 24 running on a Lunar Lake system with XeSS: • Lunar Lake Gaming Demo...
About Intel Technology: Intel has always been at the forefront of developing exciting new technology for business and consumers, including emerging technologies, data center servers, business transformation, memory and storage, security, and graphics. The Intel Technology YouTube channel is a place to learn tips and tricks, get the latest news, and watch product demos from both Intel and our many partners across multiple fields.
Connect with Intel Technology:
Visit Intel Technologies WEBSITE: intel.ly/IntelTechnologies
Follow Intel Technology on TWITTER: intel.ly/3MmhqET

Comments: 10

  • @jdelkins2 · a month ago

    great job guys!

  • @jeanchindeko5477 · a month ago

    3:42 a human hallucination right here! RAG (which, btw, stands for Retrieval-Augmented Generation, so no need to call it "RAG retrieval") is not a feature of the SLM or LLM or L3M; it's an architecture that requires several components besides the model itself!

  • @Elegant-Capybara · 28 days ago

    Locally, sure. But what software were they using for the front end and back end? That doesn't look like any text generation UI I'm familiar with.

  • @rylo1111 · 27 days ago

    It's a Gradio UI
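
    For reference, a minimal Gradio chat front end looks roughly like this; the reply function is a hypothetical stand-in for the local Phi-3 + RAG backend, not the demo's actual code:

    import gradio as gr

    def reply(message, history):
        # Hypothetical stand-in: a real app would call the local SLM here.
        return f"(local model would answer: {message!r})"

    # ChatInterface provides the chat-style web UI seen in the demo.
    gr.ChatInterface(reply).launch()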

  • @XYang2023 · a month ago

    I tried to run the same Phi model on the NPU of my Meteor Lake laptop. It ran, but I think running it on the GPU is a better fit, as you pointed out. I ran it via a Jupyter notebook. What quantization did you use? Could you open-source this tech demo (the front end, i.e. the web UI, and the back end)? Thanks.
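
    The video doesn't say which quantization or runtime the demo uses, but one common way to run Phi-3 with int4 weights on Intel GPUs is OpenVINO via optimum-intel; a sketch under that assumption (the model ID and device choice are guesses, not confirmed details):

    from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig
    from transformers import AutoTokenizer

    model_id = "microsoft/Phi-3-mini-4k-instruct"
    # Export to OpenVINO IR with int4 weight compression, targeting the iGPU.
    model = OVModelForCausalLM.from_pretrained(
        model_id,
        export=True,
        quantization_config=OVWeightQuantizationConfig(bits=4),
        device="GPU",
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    inputs = tokenizer("What is retrieval-augmented generation?", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=80)
    print(tokenizer.decode(output[0], skip_special_tokens=True))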

  • @mrhassell · a month ago

    Besides RAG, there are other ways to enhance GenAI applications: constructing the vector DB (creating an efficient database for retrieval), data selection (choosing relevant data for retrieval), the embedding model (optimizing how data is represented), and the index type (determining the indexing method for retrieval). In summary, RAG is only one strategy for boosting LLM performance; it's really about combining generation with retrieval, which ensures reliable and accurate responses from AI models.
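
    To make those components concrete, here is a minimal sketch of vector-DB construction with FAISS and a sentence-transformers embedding model (the libraries, index type, and documents are illustrative choices, not taken from the video):

    import faiss
    from sentence_transformers import SentenceTransformer

    docs = [
        "Lunar Lake pairs a CPU, GPU, and NPU on one SoC.",
        "RAG grounds a model's answers in retrieved, trusted documents.",
    ]

    # Embedding model: controls how the data is represented as vectors.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    vecs = embedder.encode(docs, normalize_embeddings=True).astype("float32")

    # Index type: a flat inner-product index; large corpora might use IVF or HNSW.
    index = faiss.IndexFlatIP(vecs.shape[1])
    index.add(vecs)

    # Retrieval: the nearest stored vector to the query embedding.
    query = embedder.encode(["What keeps the model's answers grounded?"],
                            normalize_embeddings=True).astype("float32")
    _, ids = index.search(query, 1)
    print(docs[ids[0][0]])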

  • @minefacex · a month ago

    I wonder what the battery life is like when using stuff like this regularly.

  • @jeanchindeko5477 · a month ago

    No mention of that for now! We might need to wait for devices to get into people's hands to know whether the battery life is great or not.

  • @Veptis · a month ago

    So a 3B model is now a "small language model"? Do these demos just run int4 again? Phi-1, 1.5, and 2 are the worst models I've ever tested on my coding benchmark...
