Copilot+ PCs - Do you need an NPU? Microsoft Says "Yes", I Say "No"

Science & Technology

Microsoft's new vision for an AI-infused Windows computer is known as the Copilot+ PC, and it relies on the PC having a Neural Processing Unit, or NPU for short. What is an NPU? Why does a Copilot+ PC need an NPU? Is it optional? Let's find out.
---
Your GeForce PC can run Copilot Plus features, but here's why it doesn't: www.androidauthority.com/nvid...
Twitter: / garyexplains
Instagram: / garyexplains
#garyexplains

Comments: 308

  • @POVwithRC — 22 days ago

    Microsoft is in bed with hardware manufacturers who want to sell more hardware. What do you think the arbitrary hardware cut-offs in Windows 11 were meant to do? Sell more Intel and AMD products.

  • @neovirtuality — 16 days ago

    Yep. First the TPU and now the NPU

  • @jordancave6987 — 22 days ago

    Msft gatekeeping AI features to encourage the masses to purchase new PCs, thereby helping out vendors and Win11 adoption. It's hidden in plain sight.

  • @POVwithRC — 22 days ago

    Bingo

  • @csteelecrs — 22 days ago

    Business 101

  • @runed0s86 — 22 days ago

    These new PCs will also be constantly screenshotting your screen and uploading data to Microsoft... to sell to the highest bidder!

  • @ernestuz — 22 days ago

    The core of current AI algorithms is the 'multiply and accumulate' instruction, which even microcontrollers as cheap as the STM32F4 have. Many things can run AI nowadays; the major limitation is the amount of data that has to go through those multiply-accumulates. Originally desktop CPUs used double precision for floating-point calculations (let's call the formats FP80 and FP64, where the number is their size in bits). That was too much for graphics, so graphics cards use FP32 and FP16. But neural networks are very resilient to precision loss, so people started to 'quantize' their models to 8 bits and lower; at 4 bits the models I have tried are still very strong (quantized from an original FP16 model). The state of the art today is models that use a single bit per weight, which don't even need the multiply step. Algorithmic refinements alone are driving computing needs down. The fact is that you can run quantized models on a CPU at acceptable speeds; just don't choose an 80-billion-parameter one. A 3B-parameter LLM should run acceptably fast, producing tokens much faster than a person can type. Somebody wants the AI always turned on in your computer to learn everything from you; otherwise I can't see the reason.
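The quantization the comment describes can be sketched in a few lines. This is an illustrative symmetric int8 round-trip, not any particular library's scheme, and the weight values are made up for the example:

```python
# Minimal sketch of symmetric per-tensor int8 quantization:
# map FP32 weights onto signed 8-bit integers via one scale factor,
# then dequantize and check the round-trip error.

def quantize_int8(weights):
    """Return (int8 values, scale) for a list of float weights."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the integers back to approximate float weights."""
    return [x * scale for x in q]

weights = [0.42, -1.3, 0.07, 0.9, -0.55]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The round-trip error is bounded by half a quantization step.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
assert max_err <= scale / 2 + 1e-9
```

Each weight now costs 1 byte instead of 4, which is why quantized models fit (and stream) so much more easily on modest hardware.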

  • @skyak4493 — 22 days ago

    You nailed it! Microsoft's real motivation is to make people train their AI replacements!

  • @philippeferreiradesousa4524 — 21 days ago

    Language models on device are memory bandwidth limited. The AI moniker is really earned by moving to non-upgradeable on-package RAM that’s 136GB/s bandwidth instead of 50GB/s.

  • @LordHog — 22 days ago

    The question is: do we need Microsoft's Copilot PC? No, absolutely not.

  • @gaiustacitus4242 — 22 days ago

    Software developers need one just to build software for the people who purchase them. Otherwise, your answer is correct.

  • @gaiustacitus4242 — 22 days ago

    @MadafakinRio The majority of people who purchase a Copilot+ PC will never realize any advantage over just using Copilot on a Windows 11 PC with an Intel or AMD CPU. The small LLMs that Copilot+ PCs can run are too limited to generate high-quality output, which means the majority of the workload is offloaded to a hosted AI (i.e., regular Copilot). Copilot+ requires an NPU to perform local AI processing, mainly to better support the new Recall feature in Windows 11, and Recall will prove to be more hindrance than help.

  • @mrangles3402 — 22 days ago

    Exactly

  • @azeemuddinkhan923 — 22 days ago

    @gaiustacitus4242 Really? I find it amusing that you assume people won't find massive battery improvements useful.

  • @sinom — 22 days ago

    @azeemuddinkhan923 There haven't been any independent tests yet to find out if there are any battery improvements. Additionally, Copilot+ PC ≠ ARM PC. Most of the supposed battery life improvement is ARM vs x64, not "no Copilot+ vs Copilot+"; adding a simple NPU simply cannot magically improve battery life by a significant factor. Both AMD (with their CPUs in general) and Intel (with their E-cores) are also making significant improvements in battery life under normal use, so how long this (still not actually confirmed) advantage of ARM chips lasts is debatable. Especially since for most apps, ARM CPUs will require Prism, a translation layer, which may (still nobody has independently tested it) or may not be less efficient than running those apps natively.

  • @faisalrahman9236 — 22 days ago

    Do you guys want the professor to resume the Speedtest-G benchmark? Would you donate so he can arrange more devices for tests? If you agree, then like and say "yes".

  • @anb4351 — 22 days ago

    I want the professor to start doing the SoC showdowns he used to do long ago on the Android Authority YouTube channel.

  • @owlmostdead9492 — 22 days ago

    It's a good day for Linux

  • @od1sseas663 — 22 days ago

    lol

  • @shanehebert396 — 22 days ago

    Let me guess... "Year of Linux on the Desktop" yet again, like every one of the past 30 or so years?

  • @owlmostdead9492 — 22 days ago

    @shanehebert396 Last time I checked, "Recall" wasn't a thing 30, 20, 10, or 5 years ago.

  • @od1sseas663 — 22 days ago

    @@shanehebert396 2024 is SURELY the year of Linux THIS time!!!!! 😂😂😂😂😂

  • @toby9999 — 22 days ago

    Why?

  • @SuperFredAZ — 22 days ago

    Gary, you always explain everything so well, thanks!

  • @nyonkavincenttafeli7002 — 22 days ago

    I'm waiting for the video that will really show me (us, as I'm sure I'm not alone) the day-to-day usefulness of that Copilot thing.

  • @KeyYUV — 22 days ago

    I wouldn't use AI baked into the OS anyway. It's good to know MS won't secretly enable AI via Windows Update on my desktop. If I want to use AI, I'll get third-party apps.

  • @johnsimon8457 — 22 days ago

    As desktop users, we use the GPU for just about everything, from drawing UI elements to rendering fonts; the CPU is used far less for web browser tasks than 20 years ago. I see NPUs as another kind of GPU: specialized hardware. But while the GPU is used constantly, I only see occasional uses for NPU hardware today: face detection, text prediction, etc. An office worker isn't going to make constant use of an NPU. Maybe I don't have any imagination, but it feels like a lot of marginal uses and few major ones. The Apple Photos app can detect flowers in a field and separate background from subject. It's a cool trick, and insane to see on a phone, but it's not like I use that functionality on a daily or even weekly basis.

  • @martineyles — 22 days ago

    I think the Minecraft demo is a situation where you might not want your GPU to do the AI work, as that might affect gameplay performance. Offloading to the NPU is perhaps helpful here, unless of course the demo does the AI work in the cloud. Recall was, after all, the only feature they said was definitely local.

  • @deth3021 — 22 days ago

    TTS could be a big use case. Speech-to-text as well.

  • @univera1111 — 22 days ago

    Hold up. To me an NPU is like a GPU within a GPU, or like an OS in a GPU using KVM. So I believe it will be faster and more efficient.

  • @deth3021 — 22 days ago

    @univera1111 more like a mini TPU.

  • @gabrielgon3408 — 22 days ago

    I make work calls with that "follow me" front-facing camera feature on my Tab S8 Ultra. It consumes noticeably more battery. Maybe with a higher-performing NPU it would drain less.

  • @aleksandardjurovic9203 — 21 days ago

    I totally agree. Thank you for the video!

  • @carloslegrutier — 22 days ago

    Correct me if I'm wrong, but before the low-level DX12 stuff, you could run most games on a CPU. Well, perhaps not "run", more like a slideshow.

  • @andyH_England — 21 days ago

    I look forward to local LLMs on devices focused on specific tasks, like an encyclopedia with different volumes, such as a history or maths LLM. This way, we can load them individually into local RAM without needing the cloud. This could be monetised, where you buy/rent an LLM based on your current requirements. They can be updated. I assume focused LLMs would be inherently more accurate, and the package will be more suitable for non-cloud requirements.

  • @alpha007org — 17 days ago

    I'm running models locally. I had some success with Llama 8B q8 (I'm trying Qwen2 now) for RAG. Speed is insane, but it still hallucinates too much. If I can get ~4-10GB always in memory, and an NPU with very low power usage... well, I hope we get this in the future.

  • @nathanaelsmith3553 — 22 days ago

    I've been using the OpenAI API at work and am a bit underwhelmed. If I ask it the same question 10 times, I get the correct answer about 8 times, and completely daft answers for the remainder. This is the best I can get after optimizing my prompts. Not very reliable. I personally don't consider LLMs to be AI because they don't actually understand anything; they just regurgitate correlated data.

  • @paulbarnett227 — 22 days ago

    "They just regurgitate correlated data." - I didn't think of it like that but yes, you're right.

  • @andybarnard4575 — 22 days ago

    But isn't that what my brain does?

  • @martineyles — 22 days ago

    Which things is the NPU used for? Is it deciphering your request and responding, or is it analysing the screenshots for searchable content (demos show OCR functionality, but also being able to search for things like a brown leather bag in an image)? Perhaps some of these tasks require more computational power than others. If the recognition of objects in photographs were unbundled from the other features, perhaps it wouldn't need the NPU.

  • @alpha007org — 17 days ago

    Imagine Grammarly or DeepL Translate, but running locally. That is a good use case for NPUs. But we need huge algorithmic developments to reduce hallucinations to 0.01%.

  • @HydrasHead — 22 days ago

    They should at least give us the option to run those features on whatever hardware we want.

  • @paulbarnett227 — 22 days ago

    Thanks Gary. I've been wondering about this: can I run Copilot+ on my 4080? Apparently not. It's a real shame, and obviously gatekeeping by Microsoft to push the new NPU-equipped laptops. It would seem they already have a CUDA build but will not release it to the masses.

  • @thedevincrutcher — 22 days ago

    The usefulness of the NPU for the average Windows user is highly unknown. We don't know how people will feel about these new features, but we know automatic screenshots creep people out in ways that log files didn't. Other considerations:
    1. Most laptops only have 8 GB of VRAM and the models are larger, so it makes more sense to have the NPU share main memory, especially on non-gaming laptops.
    2. x86 PCs have mostly not kept up with Apple Silicon in terms of battery life.
    3. The NPU allows for AI use cases requiring low latency, like background removal or noise isolation. These are mostly useful for video streaming, but difficult to justify for asynchronous processing.
    4. Latency is different from speed, and most users probably won't grasp that nuance. Potentially a difficult sell.
    5. Most laptop users don't care where the AI processing happens as long as they have good battery life. The privacy aspects are likely lost on most users.
    Remember, the people designing and marketing these systems are deeply technical. They often cannot grasp the end user's perspective or adapt to it quickly enough.

  • @TimothyChapman — 20 days ago

    This is definitely cause for suspicion. There is no reason to lock the user out of these NPUs. The good news is that the NPU itself probably just does all of the neural processing. The bad news is what the OS running on the CPU probably does with the NPU's output...

  • @DrB934 — 22 days ago

    I would rather build my own LLM setup and put a big-arse GPU or two in the system.

  • @chengong388 — 22 days ago

    The reason is pretty simple really: the desktop gaming PC market is tiny and inconsequential. Microsoft doesn't care, and they know the gamers don't care either. 90% of normal people will be running a laptop, or something that doesn't have a 4090 in it, so nobody cares that your 4090 could technically run Copilot+ but isn't allowed to. They're trying to make this work for the 90%, where if this thing just ran on a laptop 4060 or something, it would kill the battery life for 90% of users.

  • @GaryExplains — 21 days ago

    The desktop gaming PC market is tiny and inconsequential? 🤦‍♂️

  • @RAVANAZAR — 22 days ago

    Nope. Don't want Cortana or AI anywhere near my PC.

  • @lekejoshua4402 — 22 days ago

    Donkey 😂😂

  • @michaelharings9913 — 8 days ago

    SUSE, a European open-source company (think SUSE Linux), is making a new vendor- and LLM-agnostic generative AI platform called SUSE AI Solutions. That should help break out actual comparisons between CPU, GPU, and NPU performance on different tasks.

  • @GaryExplains — 8 days ago

    All LLMs are platform agnostic.

  • @fuseteam — 22 days ago

    I'd like to interject: what you refer to as "PC" is actually Copilot+ PC, or as I've recently taken to calling it, "Copilot". You see, everything makes use of Copilot; the PC is just the vehicle with which Copilot is delivered to you. Copilot is delivered to you through many means: the Edge browser, your cloud storage, your office tools, the Bing website, GitHub, and more. The PC is just one means out of many. It's all just Copilot, just c-co-copilot, just Copilot.

  • @Winnetou17 — 21 days ago

    I don't want to imagine the seizure Richard Stallman will have when he finds out about these requirements. I hope he doesn't. But if I were in his place, I certainly could.

  • @fuseteam — 20 days ago

    @@Winnetou17 ikr lmao

  • @dansanger5340 — 22 days ago

    The way to look at it is that NPUs are efficiency cores for the GPU. Since Copilot+ is so far laptop-only, it makes sense to have efficient NPUs. It also makes sense to push an industry standard for consumer AI hardware if we ever want to do AI on something other than an Nvidia GPU. Right now, on Windows or Linux you have to jump through hoops to run AI on anything other than Nvidia. I'm glad Microsoft is doing something to address the de facto Nvidia monopoly on AI hardware.

  • @Winnetou17 — 21 days ago

    But it's insane that they're pushing it at the OS level, which is the same OS used on desktops. Since when does "PC" mean "laptop only"? Oh, wait, we're talking about Microsoft here. I rest my case.

  • @isiahfriedlander5559 — 22 days ago

    Lunar Lake, state of the art X86, the architecture of real work.

  • @electrodacus — 22 days ago

    In the new Lunar Lake SoC, the CPU can do 5 TOPS (int8) while the NPU can do 48 TOPS, likely at lower power than the CPU and GPU. The GPU can also do 67 TOPS (int8). These are theoretical numbers, but they show the NPU has an order of magnitude better performance than the CPU, at lower power. The new AMD Strix Point APU claims an NPU with 50 TOPS that seems able to do block FP16 at the same performance as int8, which is even more impressive. Their iGPU can do around 23 TOPS of FP16; I'm not sure if it can do int8 any faster, and it will surely use more power doing it than the NPU.
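Taking the comment's theoretical Lunar Lake figures at face value (these are claimed peak numbers, not measured benchmarks), the "order of magnitude" gap it mentions is just a ratio:

```python
# Claimed theoretical int8 throughput for Lunar Lake, per the comment above.
lunar_lake_tops = {"CPU": 5, "NPU": 48, "GPU": 67}

# Relative throughput versus the CPU: the NPU's ~10x is the
# "order of magnitude" advantage, before even counting power draw.
for unit, tops in lunar_lake_tops.items():
    ratio = tops / lunar_lake_tops["CPU"]
    print(f"{unit}: {tops} TOPS ({ratio:.1f}x the CPU)")
```

Peak TOPS says nothing about sustained real-world performance or memory bandwidth, so treat these ratios as marketing-sheet arithmetic only.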

  • @andybarnard4575 — 22 days ago

    Let's use the NPU as a CPU then. 2025 will be the year of the 8-bit processor. Bring it on.

  • @electrodacus — 22 days ago

    @andybarnard4575 It may be that just 4-bit or even 1-bit is enough for effective AI. Due to their simplicity and extreme parallelism, they can do much more computation than traditional CPUs. So maybe 2025 is not the year the CPU becomes irrelevant, but it may not be many years from now.

  • @madorsey077 — 22 days ago

    That was my overwhelming question when seeing the MS announcements: why?

  • @soragranda — 22 days ago

    11:28 That is a shitty move to pull on a partner... though people will find a way to run something like Copilot using your GPU (I mean, maybe it doesn't make sense on laptops, but on desktops it does).

  • @shamim64 — 7 days ago

    If it needs to run the AI model at full load, 24/7, then it definitely needs the NPU. If you want to try running an LLM on your computer, try Ollama. It can use the CPU and GPU.

  • @jaybestemployee — 22 days ago

    Any development of specialized computing hardware (the NPU in this case) needs a large initial investment. The case for the NPU is power efficiency for "AI" applications. The benefit you gain from an NPU varies with how dependent you are on those applications, which for a lot of people is not very much. However, MS is pushing this new piece of hardware as a new standard so it can be sold at scale, possibly triggering a healthy upward cycle of NPU hardware investment and consumption. Whether the hardware is gatekept won't matter for long, because people are usually smart enough to hack their way to exploit what they've got. So the decision on whether to buy an NPU is whether you want to help the development of a more power-efficient piece of hardware than the GPU, letting each type of hardware (CPU, GPU, NPU) do what it does best in parallel. As NPU sales scale up, I guess NPU development may also cover local model training and such for private AI applications, again depending on demand. Diverting such workloads from the GPU would be a main goal of these NPU initiatives, because Nvidia has been controlling the GPU market for AI for some time, and the other big players are not very happy, I can tell.

  • @B.Ch3rry — 22 days ago

    All I want is for Windows (Microsoft) to be natively supported on Apple M-Series. BootCamp spoiled me with the ability to dual boot/choose based on my needs.

  • @John-cn8jv — 21 days ago

    I'll just keep my 4-year-old laptop going. Microsoft can kiss my ***. Microsoft thinks what's good for their bottom line is what users need, even if they have to jam it down our throats. I think a Raspberry Pi is my next desktop. Linux doesn't make MS's demands; I prefer to choose my own hardware.

  • @cmd_f5 — 19 days ago

    I remember thinking Windows 10 was too annoying to get into because of the changes. This latest chapter in the Win 11 saga just makes me wish ReactOS were actually ready for end users in any way. Guess it's time to mess with Linux for the tenth time and waste more energy.

  • @andikunar7183 — 8 days ago

    While I totally agree with you in general (CPU/GPU/NPU), you forgot to mention that during LLM/SLM inference, token generation (non-batched) is limited mainly by memory bandwidth. Pumping all those billions of parameters from RAM to the SoC caches for every token-generation step really matters. This is why the M2/M3 Max and M2 Ultra, with their large, wide, high-bandwidth RAM, don't do too badly versus the much, much more performant 4090 in pure token generation (see the llama.cpp performance comparisons). But that would have been too complicated for Microsoft to communicate... And Snapdragon X is supposed to do around 130 GB/s, between the M3 and M3 Pro.
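The bandwidth ceiling the comment describes can be sketched with a rough estimate: if every generated token requires streaming the whole set of weights once, then tokens/s is at most bandwidth divided by model size. The 130 GB/s figure is the Snapdragon X claim from the comment; the 8B model and its precisions are hypothetical examples:

```python
# Rough upper bound for non-batched token generation when memory
# bandwidth is the bottleneck: each token streams all weights once.
def max_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

params_b = 8  # illustrative 8-billion-parameter model
for bits, label in [(16, "FP16"), (8, "int8"), (4, "4-bit")]:
    model_gb = params_b * bits / 8  # weight bytes only, ignoring overhead
    rate = max_tokens_per_sec(130, model_gb)  # 130 GB/s claimed bandwidth
    print(f"{label}: ~{model_gb:.0f} GB weights, at most ~{rate:.1f} tokens/s")
```

This also shows why quantization matters so much on bandwidth-limited devices: halving the bits per weight roughly doubles the token-rate ceiling.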

  • @GaryExplains — 8 days ago

    True, but there is more to local AI stuff than LLMs.

  • @abbe9641 — 22 days ago

    As a Linux user I am so happy not to have to care about NPUs.

  • @dallasgrful — 22 days ago

    10:30 explains the requirements for NEW PCs only. Lots of people won't need these computers.

  • @MikeKasprzak — 22 days ago

    "Year of the Linux Desktop" jokes aside, I would genuinely love some spare, highly power-efficient matrix-math silicon in my laptop. In practice it may only support 4-bit or 8-bit data (at most 16-bit floating point), but that's still useful, just not as general-purpose as a CPU/GPU.

  • @skyak4493 — 22 days ago

    What else uses such low precision matrix math?

  • @Bareego — 21 days ago

    If you run a large language model all the time, your PC will waste 8GB of RAM all the time. RAM manufacturers are going to love this. The only hope for low-level access might be the upcoming RISC-V chips, although they still have a lot of catching up to do speed-wise. The whole Copilot implementation just seems like they had a solution matching the buzzword "AI" and looked for problems to throw it at.

  • @vengrinv — 22 days ago

    OK, let's talk about this more. Of course you can run ML anywhere; it's just a program. I think a good comparison is video decode/encode. Can it be brute-forced by the GPU/CPU? Yes. Will you have a good time? Probably not. The point is that specialized hardware is always going to be the better option for performance per watt than the CPU/GPU. With all that said, there is no way in hell I would want or use Copilot, and there are WAY more use cases for the NPU that I would rather use.

  • @GaryExplains — 22 days ago

    True, the conventional wisdom, as with video encode/decode, is that dedicated hardware is better. Yes. But the point is that GPUs are already pieces of dedicated hardware that are good at ML, and CPUs with the right extensions, like matrix-multiply instructions, are just as good. So between the CPU and the GPU, why is Microsoft mandating the use of the NPU when its benefits are potentially small?

  • @vengrinv — 22 days ago

    @GaryExplains Maybe to free up cycles on the other processing units? Or perhaps it's actually about the clock frequencies of the processing units.

  • @HydrasHead — 22 days ago

    They could at least leave the choice up to the user.

  • @vengrinv — 22 days ago

    @HydrasHead It's Microsoft we're talking about; there is no choice.

  • @paulbarnett227 — 22 days ago

    @@GaryExplains "why is Microsoft mandating the use of the NPU when potentially its benefits are small" - one word - Money. Sell more new PCs even though, as you point out, there's plenty of hardware out there already that could run this stuff.

  • @PragmaticTornado — 1 day ago

    It's funny that my desktop with an RTX 4090 would absolutely smoke all of these bespoke NPUs. But for greed and marketing reasons, that probably won't happen. I'm the kind of guy who uses GPEdit/RegEdit to disable as much telemetry as possible, though, so a feature that takes regular snapshots of your actual screen probably isn't for me. It's a new AI-powered world of telemetry and privacy concerns.

  • @victorc777 — 19 days ago

    Before finishing the video, I'll go ahead and say I'm not interested in Windows at all if they keep shoving ads into the OS and don't start taking our privacy more seriously. I'm a Mac user, but I have to use Windows for most of the applications I need for my work/business; when I need it, I spin up a VM on my Unraid server. I was Windows-only just 2 years ago, but Win11 has turned into an ad-streaming OS. A bit hyperbolic, but my opinion stands.

  • @stevemilchuck9241 — 21 days ago

    Speed and efficiency are relative to what you're trying to accomplish. When it comes to whoring out my data, I'm certain Microsoft is looking for more speed and efficiency in getting that information to the cloud so they can digest it in a larger model.

  • @mikldude9376 — 22 days ago

    Good video Gary. My guess is the one common denominator, as always, is money. My bet is that when Microsoft says you need an NPU to do it Microsoft's way, they probably mean you need a Microsoft-approved NPU. They have structured their AI like a walled garden where others cannot play... unless they pay $$$$$$$. Computers and technology may change, but human greed is always the same :).

  • @rustyclark2356 — 21 days ago

    Always-running AI? Sounds great for idle power efficiency, NPU or not.

  • @dumnthum — 22 days ago

    To test it, maybe someone can run the local model in LM Studio with an eGPU. Phi mini is available there.

  • @martineyles — 22 days ago

    The NPU could be like a floating-point coprocessor: something that was often separate from the CPU but eventually died out because all processors came to do enough of that type of processing internally. However, there are also things like SIMD, which as far as I understand is the bread and butter of GPUs and got CPU implementations (e.g. MMX), yet these did not replace GPUs.

  • @skyak4493 — 22 days ago

    But NPUs are 4-bit, maybe 8-bit. That is why they take less power than GPUs. All the matrix math I would want to do is high-precision simulation of physics.

  • @jerzyczajaszwajcer — 20 days ago

    But for desktops it is OK to run LLMs or SLMs on graphics. Also, you should be able to run AI on graphics in laptops when they are plugged into a wall outlet.

  • @GaryExplains — 20 days ago

    Exactly.

  • @vasudevmenon2496 — 22 days ago

    Jensen announced they will add a new API that adds the Copilot runtime layer to Nvidia GPUs, which will be visible as an NPU in Task Manager, and he says it's much faster than current SoCs, even on older RTX GPUs, via a driver update.

  • @TamasKiss-yk4st — 22 days ago

    But it still uses 200-450W for that, while the NPU does the same with 1-3W of power consumption. For example, the iPhone has an NPU with 35 TOPS, yet it doesn't drain the battery in 5 minutes; it remains usable for hours. A GPU can't work off a tiny battery.

  • @vasudevmenon2496 — 22 days ago

    @TamasKiss-yk4st I never talked about performance on battery. Even Nvidia doesn't. It's great to see SoC NPUs on par with PCs, especially their perf per watt with a small compute tile.

  • @paulbarnett227 — 22 days ago

    @TamasKiss-yk4st On a desktop system the battery is irrelevant. I hope Nvidia does what Jensen appears to be saying.

  • @LA-MJ — 22 days ago

    AMD has had NPU-enabled laptop SKUs for at least a year already.

  • @Big_Yin — 22 days ago

    I'd rather use the NPU to improve graphics or frame-rate performance. I literally have the same opinion of AI as of crypto and NFTs.

  • @robertlawrence9000 — 22 days ago

    So in other words, we don't need an NPU and it would be better to have that space used for more graphics processing. It makes sense to me.

  • @Big_Yin — 22 days ago

    @robertlawrence9000 No, we're currently using neural processing units for tacky, gimmicky applications instead of using them to enhance the CPU and GPU, which I would hope could eventually fix the problems with CrossFire/SLI by using NPUs to bridge that gap.

  • @abbe9641 — 22 days ago

    NPUs are in the CPU; there would be massive latency penalties from doing it this way with the GPU. The best place to put AI hardware acceleration is on the GPU die itself.

  • @GaryExplains — 22 days ago

    I think you are confusing the terms CPU/GPU etc. The NPU isn't in the CPU; it is in the same chip or processor, but not IN the CPU. They are separate. And there are no latency penalties using the GPU; if there were, we wouldn't have high-FPS games in 4K!!!

  • @martineyles — 22 days ago

    I gather the NPU does calculations similar to Tensor Cores, so it might be possible to have it do something like DLSS. Microsoft did announce an upscaling technology, so I wonder whether that runs on the GPU part of a mobile SoC or is farmed out to the NPU.

  • @davidbayliss3789 — 22 days ago

    I think Microsoft will extend this to GPUs/CPUs eventually. They want to start somewhere and get people to buy new stuff for Windows 11, lol. Overall I think they want to reduce the hardware they need to support, but after collecting a ton of data on NPU use with Copilot+, I can't see why they wouldn't add switches so you can run it on GPUs/CPUs too. I'm not too clued up on Nvidia's NIMs (might have got that name wrong) container thing yet; I don't know if it will apply, but I think of it as a potential new abstraction layer, at least for Nvidia GPU AI, and maybe it could interface with Copilot+ and de-risk GPU use for Microsoft a bit.

  • @commentarytalk1446 — 22 days ago

    Presumably Apple's NPU will be more about OS-level AI stuff (there's a list online), e.g. modify this photo, set my reminder, reschedule my calendar, all baked in locally? Whereas, as said, the productivity-app AIs still need the cloud for live information manipulation. E.g. Dame Sally Markham compositions: "How many pages, Ms. ChatGPT? Wake me up when you've finished."

  • @gr-os4gd — 22 days ago

    The M-series packages have NPUs already.

  • @ThePowerLover — 21 days ago

    @gr-os4gd They have had NPUs since the A11 Bionic.

  • @stevemilchuck9241 — 21 days ago

    The question is: do we even need Copilot? Other than a handful of times playing with it, I have absolutely no use for it.

  • @abhiramshibu — 21 days ago

    But what if you added an external NPU? There are lots of NPUs available on the market.

  • @unvergebeneid — 22 days ago

    Microsoft's artificial barriers aside, I'm super unclear on how things are supposed to work on PCs with a powerful NPU and a powerful GPU, both of which are great for inference. Since they use completely separate memory, I don't see how they can complement each other, so the NPU will probably just go to waste, right?

  • @C.M.C.B — 21 days ago

    All we need is a virtual NPU stack to use our GPU or CPU... someone is going to do it!

  • @wawaweewa9159 — 22 days ago

    Can I ask Copilot to gather a bunch of data from various sources and put it in paragraphs on a certain topic?

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    You can certainly do so, but you likely will not care for the generated output.

  • @wawaweewa9159

    @wawaweewa9159

    22 күн бұрын

    @@gaiustacitus4242 🤣🤣😢

  • @raygordon1822
    @raygordon182221 күн бұрын

    You can see where this is going: you will become part of the MS AI network, with your machine being used for their purposes!! (Oh, we are only using resources you are not using at the moment!)

  • @dave24-73
    @dave24-7322 күн бұрын

    Million dollar question: will Microsoft lock down the latest versions of Windows, saying you need an NPU or you can't install it? Looking at Windows 11, this is exactly what they did with TPM, and later with CPUs, now being minimum specs.

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    You can be certain that is precisely what Microsoft will do.

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    @@MadafakinRio It didn't take Microsoft 10 years to make obsolete more than 50% of PCs in use by its customer base when requiring a TPM to run Windows 11. Microsoft doesn't care one whit about its customers who aren't generating new revenue for the company. It must make older hardware obsolete in order to drive industry support for the company's product line. It's just the nature of business.

  • @dave24-73

    @dave24-73

    22 күн бұрын

    @@MadafakinRio I'm not so sure. ARM is coming, and AI is happening much faster; in 10 years the ship will have already sailed.

  • @El.Duder-ino
    @El.Duder-ino14 күн бұрын

    Thx Gary for summarizing the whole show👍so I don't have to watch it in full, which would definitely be more painful, as M$ shows are probably the least interesting and entertaining of all the BIG players. Seeing M$ now limiting their Windows OS to the NPU is not surprising at all; it's actually the expected "M$ standard", besides making Windows a more bloated, data-collecting "spyware" and a few notches slower and less responsive. All this AI buzz around some kind of assistant helping with our daily tasks will succeed only if it's well implemented and optimized, and I have a feeling Windows will not have the best solution in town, even though it's quite ahead in the AI game, thanks partly to close collaboration with OpenAI and early adoption into its Azure cloud infrastructure. Still, M$ can learn a hell of a lot from Copilot+; whoever implements the key features in the most logical, intuitive way and convinces users to use them actively over the traditional approach will be the real winner of the AI game.

  • @BriefNerdOriginal
    @BriefNerdOriginal22 күн бұрын

    Oh no, now I'm so sad to discover that I cannot run the fantastic Recall feature on my current laptop ...

  • @CTimmerman
    @CTimmerman22 күн бұрын

    Sounds like adding still more features to CISC. RISC should already be fine for matrix multiplications using its many cores.

  • @GaryExplains

    @GaryExplains

    22 күн бұрын

    What is the connection between RISC, matrix multiplication, and cores? I am confused?

  • @CTimmerman

    @CTimmerman

    22 күн бұрын

    RISC cores are simpler/smaller than CISC cores, so they use less power per cycle, in exchange for larger executables, which are not a problem with advances in storage. CUDA cores are probably even simpler/smaller, but AFAIK limited to expensive VRAM.

  • @GaryExplains

    @GaryExplains

    22 күн бұрын

    I think your understanding of RISC is stuck in the 1980s.... Things have moved on since then.

  • @CTimmerman

    @CTimmerman

    22 күн бұрын

    Turns out RISC processors like ARM's are taking over laptops, like Apple's, after mobile. And CUDA is RISC as well, but with extra features. Even CISC CPUs use microcode to feed RISC-like cores now.
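A side note on the thread above: the workload all of these accelerators (NPU, GPU tensor cores, CPU vector extensions like NEON, AVX, or SME) are built around is the multiply-accumulate loop inside matrix multiplication. A toy Python sketch, purely illustrative and not from the video:

```python
# Toy illustration of the "multiply-accumulate" (MAC) operation at the heart
# of neural-network inference. Dedicated hardware exists to do this inner
# loop many elements at a time; this naive version does it one at a time.

def matmul(a, b):
    """Naive matrix multiply: every inner-loop step is one multiply-accumulate."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += a[i][k] * b[k][j]  # the MAC: multiply, then accumulate
            out[i][j] = acc
    return out

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19.0, 22.0], [43.0, 50.0]]
```

Whether that loop runs on a CPU with matrix extensions, a GPU, or an NPU is exactly the question the thread is debating; the arithmetic itself is identical.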

  • @JohnWilliams-gy5yc
    @JohnWilliams-gy5yc22 күн бұрын

    If inference is so light that Apple chooses the ARM-extension accelerator path, I guess the "real" reason for the Copilot+ NPU requirement is that AMD doesn't want to license Intel AMX.

  • @churblefurbles
    @churblefurbles22 күн бұрын

    They really don't like to mention how the NPUs compare to modern GPUs.

  • @xperiafan5370
    @xperiafan537022 күн бұрын

    2:39 That's what it's all about. Efficiency.

  • @GaryExplains

    @GaryExplains

    22 күн бұрын

    14:34 But has that been proved?

  • @xperiafan5370

    @xperiafan5370

    22 күн бұрын

    @@GaryExplains It hasn't been disproven either, has it? All of these big CPU companies are pushing for them, which means you can't rule out that NPUs have got efficiency benefits over CPUs and GPUs in ML inferencing. And we will be getting answers to our questions in about 2 weeks' time. So there's no need to declare NPUs unnecessary before even getting to use them.

  • @GaryExplains

    @GaryExplains

    22 күн бұрын

    If the efficiency is proved to be an actual thing, are the gains sufficiently high that mandating their use (which means millions of PCs become obsolete and we all need to spend loads of money buying new PCs) is warranted? Also, as a side note, all of these big CPU companies are only pushing for them because Microsoft is insisting on it, and they have to do it so they don't get left behind in the race to add the word "AI" to every product.

  • @robertlawrence9000

    @robertlawrence9000

    22 күн бұрын

    I don't want something spying and logging everything on my PC. Efficiency or not, we really don't need an NPU.

  • @GaryExplains

    @GaryExplains

    22 күн бұрын

    @robertlawrence9000 But the point of my video (I thought) was to discuss that all these AI features can run without the use of an NPU. So I guess you mean you don't want a Copilot+ PC, not specifically an NPU.

  • @berrywin
    @berrywin22 күн бұрын

    I don't get the AI hype! A TV station asked an AI: how many legs has an elephant? And got the answer two (2) legs! I asked ChatGPT how long it takes to reach the nearest star beyond our Sun if you travel at 70,000 km/hour. It managed to get the star right, Proxima Centauri, but the answer was off by a factor of 1000!

  • @GaryExplains

    @GaryExplains

    22 күн бұрын

    I just asked Gemini about the elephant and it said, "An elephant has four legs. There might be some confusion due to optical illusions depicting elephants with more legs, but real elephants definitively have four legs." 🤷‍♂️

  • @longboardfella5306

    @longboardfella5306

    22 күн бұрын

    It's not an answering engine. You're using it wrong. Try dialoguing with it, using it to test hypotheses, or asking it to summarise a complex document. You will THEN find what an LLM can do.

  • @robertlawrence9000

    @robertlawrence9000

    22 күн бұрын

    They get things wrong all the time. I never trust them. It only takes one minor detail to ruin the credibility of the results.

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    @@longboardfella5306 Many of the LLMs have been trained to have a specific political viewpoint. When you ask it questions to which the correct answer is contrary to the political agenda, it responds with insults and recommends that you educate yourself. Only by continuing the "dialog" and backing the AI into a corner by using logic and irrefutable facts will the AI eventually concede that you are correct and provide a proper response.

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    @@robertlawrence9000 Yes, AI has a tendency to make up "facts" which are completely false or court cases which never occurred to support its generated output. This behavior by AI is referred to as hallucination.

  • @simonabunker
    @simonabunker18 күн бұрын

    Haven't most major releases of Windows enforced minimum hardware requirements? Mostly to drive sales of new hardware. If you are being very generous, you could say it is to future-proof your computer.

  • @andrew007s
    @andrew007s21 күн бұрын

    Excellent video. Microsoft is in the business of selling new installs of Windows. It's a win-win to sell folks whole new laptops. Because... why not? Shareholders stay happy and jobs increase. Haha

  • @iscariotproject
    @iscariotproject22 күн бұрын

    No, I don't need a Clippy on steroids annoying me... Microsoft, we know you had a psychotic trauma event with Microsoft Bob failing, and you have tried to bring it back over and over... LET IT GO

  • @brulsmurf

    @brulsmurf

    22 күн бұрын

    Looking at your screen makes me worried. Are you sure you're on the right track? Let me show you a better way to do things

  • @RafaCoringaProducoes

    @RafaCoringaProducoes

    22 күн бұрын

    microsoft bob, i can see you are a person of culture as well

  • @skyak4493

    @skyak4493

    22 күн бұрын

    Clippy II - Clippy's Revenge! Humanity is forced to train its AI replacement!

  • @hallkbrdz
    @hallkbrdz22 күн бұрын

    Microsoft is finally going to make desktop Linux take off next year as W10 expires. While my workstation's NVIDIA 4000 (for SolidWorks) meets the specs, the CPU doesn't have TPM 2.0, so I won't be upgrading. The next workstation will run Linux; I don't like being told what I have to have for their software.

  • @NoName-zf6nr
    @NoName-zf6nr21 күн бұрын

    To test ML on Nvidia GPUs, install the CUDA, cuDNN and(!) TensorRT libraries. Choosing a good combination of such libraries can alone be responsible for a speed factor of ~6, as I have experienced. --robert jasiek

  • @GaryExplains

    @GaryExplains

    21 күн бұрын

    Testing ML on Nvidia GPUs isn't the problem; as you say, technology like CUDA is well known and very mature. The problem is recreating those tests on the NPUs in a Copilot+ PC.

  • @youcantata
    @youcantata22 күн бұрын

    Someone will come up with an "NPU emulation layer" DLL for Windows that will enable Copilot+ PC capability on PCs with no NPU, using either an Nvidia GPU (much less efficient: more power drain, but faster) or a multi-core CPU (more power drain and slower).
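The emulation-layer idea above boils down to a preference-ordered fallback: use the NPU if present, else a GPU backend, else the CPU. A hypothetical Python sketch — the provider names are styled after ONNX Runtime's (DirectML's real provider is called `DmlExecutionProvider`), while `NpuExecutionProvider` is invented here for illustration; this is not real Windows API code:

```python
# Hypothetical backend-selection shim: pick the first available accelerator
# from a preference list and fall back to CPU. Provider names are illustrative
# (styled after ONNX Runtime's), not an actual Windows or DirectML API.

PREFERENCE = ["NpuExecutionProvider", "DmlExecutionProvider", "CPUExecutionProvider"]

def pick_backend(available):
    """Return the most-preferred backend present on this machine."""
    for provider in PREFERENCE:
        if provider in available:
            return provider
    raise RuntimeError("no usable execution provider")

# A machine with no NPU but a capable GPU would transparently get the GPU path:
print(pick_backend({"DmlExecutionProvider", "CPUExecutionProvider"}))  # DmlExecutionProvider
```

Nothing in such a shim is technically exotic; as the video argues, the barrier is policy, not engineering.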

  • @JynxedKoma
    @JynxedKoma7 күн бұрын

    NVIDIA and Microsoft are already working together to allow AI acceleration using the RTX 40 series, thus negating the "need" for an NPU-powered laptop. So no, you do not NEED an NPU. Just a 40-series RTX GPU. The 4080 alone has 1,000+ TOPS.

  • @GaryExplains

    @GaryExplains

    7 күн бұрын

    If you saw the official quote from Nvidia that I included in the video, you need an NPU. Is there an official announcement about using the GPU?

  • @JynxedKoma

    @JynxedKoma

    7 күн бұрын

    @@GaryExplains I believe so, as someone covering the AMD Computex livestream informed me there was such an announcement made between the two.

  • @GaryExplains

    @GaryExplains

    6 күн бұрын

    Yeah, I think there is some confusion about that announcement. The words Copilot and Copilot+ PC seem to get misheard, etc. Yes, there will be Copilot+ PCs with Nvidia GPUs, but that doesn't mean without an NPU. If someone knows the exact time index of that statement in the keynote, we could take a closer look.

  • @u263a3
    @u263a322 күн бұрын

    Yes you do

  • @redacted629
    @redacted62919 күн бұрын

    All Information... Asinine Intrusion... Annoying Interactions... for a problem nobody identified comes a solution no one needs.

  • @ps3301
    @ps330122 күн бұрын

    They want you to waste money buying a new laptop!! If you're an idiot, you'll just waste your money earlier.

  • @Pushing_Pixels
    @Pushing_Pixels19 күн бұрын

    As though Win11 needed any more spyware. Apart from basic chatbot stuff, I'm not interested in any AI programs I can't run locally.

  • @SuperFinGuy
    @SuperFinGuy19 күн бұрын

    Phi-3 is open source; you can run it even on your phone.

  • @GaryExplains

    @GaryExplains

    19 күн бұрын

    Of course. There are plenty of open source models. But the point I was trying to make was not about the models but about the hardware to run the models.

  • @DK-ox7ze
    @DK-ox7ze22 күн бұрын

    Microsoft doesn't gain anything by restricting Copilot+ PCs to only CPUs with NPUs, as they would want as much penetration as possible for the new OS. So maybe NPUs really are more efficient. Though one reason they might want to limit penetration is the availability of data centers (available GPUs) that can run LLMs for hundreds of millions of users. While LLMs run in the cloud and have nothing to do with client hardware, by restricting the number of users they can efficiently serve everyone.

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    Microsoft isn't introducing a new OS. The Copilot+ PCs run Windows 11. The local AI features will only run on a Copilot+ PC. You can already run Copilot on a Windows 11 PC, where all AI processing is done in the cloud. It is the ability to run part of the AI locally that earns the Copilot+ qualification, and this requires an NPU that is only supported on PCs based on Qualcomm's Oryon (i.e., Snapdragon X and Snapdragon X Elite).

  • @paulbarnett227

    @paulbarnett227

    22 күн бұрын

    They gain sales of the latest line of Surface products. It's about money.

  • @paulwoodward8265
    @paulwoodward826521 күн бұрын

    If the new silicon is highly optimised for this workload, this is kind of reasonable. We don't want these models run inefficiently by millions of devices; that's bad for the planet. Arguably LLMs are bad for the planet full stop. I'm sure someone will figure out a way to run representative workloads on these NPUs, on M4, and on GPUs, and then we'll know. But looking at how efficient hardware video encoders and decoders are compared to CPU and GPU, it's certainly plausible these NPUs are way more efficient than general-purpose silicon.

  • @GaryExplains

    @GaryExplains

    21 күн бұрын

    Isn't having to buy new PCs with the NPUs built-in, also bad for the planet?

  • @whothefoxcares
    @whothefoxcares14 күн бұрын

    Companies collaborate. Criminals conspire. Cui bono: consumers or cartels?

  • @morecarstuff
    @morecarstuffКүн бұрын

    I absolutely DON'T want any Copilot. I didn't even want that Cortana BS.

  • @user-qw1kv2jb8y
    @user-qw1kv2jb8y18 күн бұрын

    OK, do you really think there's anything to this whole "machine learning" business? I know nothing about it myself, but it can do what ChatGPT themselves call "hallucinate" (I think they probably coined that word themselves and "shipped it" out to journalists), i.e. IT CAN GET IT *WRONG* SOMETIMES! What's the point if it can get it wrong sometimes, with no inkling that it has done so??

  • @nfineon
    @nfineon22 күн бұрын

    CoPilot+ PC (stupid naming) is a HARD NO; I already switched to Ubuntu Linux and it works perfectly for 99% of everything I need. The next-generation CPUs are going too far into the AI gimmicks; in fact only a small fraction of us will use these features. On the latest Intel Lunar Lake, for example, the die space taken by the neural engine is larger than the entire E-core complex! FFS, I would rather have more compute or cache than dedicate 8 cores' worth of space just to AI neural processing, which is beyond niche at the moment.

  • @RUHappyATM
    @RUHappyATM22 күн бұрын

    All I want to know is whether it can help me write my thesis on Before the Big Bang, the 3 B's! Oh, and correct my grammar.

  • @skyak4493

    @skyak4493

    21 күн бұрын

    Sure! Hallucinating theory that sounds good is its specialty! As long as there is no possible way to check for truth, it's golden!

  • @Vincent_Koech
    @Vincent_Koech22 күн бұрын

    Just like the arbitrary Windows 11 upgrade requirements. It must have worked.

  • @retroheadstuff8554
    @retroheadstuff855422 күн бұрын

    Microsoft please stop making e-waste! 💻🖥💻🖥💻🖥💻

  • @JoeEbitDa
    @JoeEbitDa21 күн бұрын

    Nvidia has ChatRTX. Is this their response to Microsoft's gaslighting?

  • @mrnakomoto7241
    @mrnakomoto724122 күн бұрын

    I miss windows 7 and xp 😭😭

  • @mikldude9376

    @mikldude9376

    22 күн бұрын

    I miss 95 :) .

  • @KAZVorpal
    @KAZVorpal22 күн бұрын

    You can run a pretrained large language model just about as good as ChatGPT on your local computer today, without an NPU. The reason they want to force you to run it in the cloud is so they can control you. They want control of what you're allowed to ask and what information you're allowed to have. When you run an LLM on your own computer, you can choose one that is uncensored, that will give you accurate information.

  • @GaryExplains

    @GaryExplains

    22 күн бұрын

    Yes, you can run LLMs on your local computer; I have several videos doing exactly that. But I don't think you can say "just about as good as ChatGPT".

  • @KAZVorpal

    @KAZVorpal

    22 күн бұрын

    @@GaryExplains Llama 3 is just about as good as ChatGPT. Not quite, but just about.

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    The real reason they want to run the AI in the cloud is to gain access to your data. They've run into a wall with training AI based on information available on the Internet. Further advances need access to proprietary intellectual property, and they expect end users will be stupid enough to submit it - and users will. I'm 100% with you when it comes to running AI. It should be performed locally.

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    @@GaryExplains There are 70+ billion parameter LLMs you can run locally which generate output on par with that generated by ChatGPT. Of course, this assumes that you have the high-end hardware to run them...which most people do not.

  • @GaryExplains

    @GaryExplains

    22 күн бұрын

    Llama 3 in its full size is getting closer to ChatGPT, yes, but on a PC you run a quantized version which isn't as good by a long way.
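The quality gap in this exchange comes from quantization error. A toy sketch of symmetric 4-bit quantization (illustrative only; real schemes such as llama.cpp's GGUF quants use per-block scales and smarter rounding):

```python
# Toy symmetric 4-bit quantization: map floats to integers in [-7, 7] with one
# shared scale, then reconstruct. The reconstruction error is the "loss" that
# makes a heavily quantized local model weaker than the full-precision one.

def quantize_4bit(weights):
    scale = max(abs(w) for w in weights) / 7.0  # 4-bit signed range ~ [-7, 7]
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.91, -0.07, 0.33]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_err)  # error stays within half a quantization step (scale / 2)
```

At 4 bits each weight can only take 15 distinct values, which is why a quantized 8B model fits on a laptop but a full-precision 70B+ model, closer to ChatGPT quality, generally does not.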

  • @tonytins
    @tonytins22 күн бұрын

    It's nothing but one big scam.

  • @andybarnard4575
    @andybarnard457522 күн бұрын

    Dont forget the TPU.

  • @paulbarnett227

    @paulbarnett227

    22 күн бұрын

    It's just Google's name for an NPU.

  • @andybarnard4575

    @andybarnard4575

    22 күн бұрын

    Ah, I must have been misled by the AI answer to my question, which told me that TPUs are better for training but NPUs and TPUs are equally efficient at inference. But in my physics degree (in the last century) a tensor was a 4D vector, so presumably a CPU with good matrix operations can imitate either... which is related to the premise of this video.

  • @DarkPa1adin
    @DarkPa1adin22 күн бұрын

    What Windows needs isn't AI but apps optimized for ARM SoCs.

  • @Psychlist1972
    @Psychlist197222 күн бұрын

    Hi Gary. Technically, a CPU (without an embedded GPU) is capable of doing your graphics work as well, but it will be more efficient and faster on a GPU. Faster means you can do more, like higher resolution (think larger models, or run more often). NPUs are really no different. (Also, FWIW, Apple includes a "Neural Engine". It's an NPU. Not sure why the confusion there.) ChatGPT gets all the press, but AI is also about things like noise removal, camera tracking, OCR, high-fidelity background elimination, graphics upscaling, etc. Having AI in-line and being evaluated in real-time opens up a lot of capabilities. DirectML will be the low-level interface for NPUs, GPUs, CPUs, etc. like Direct3D is to graphics. The discussion around "What does it all mean" makes me think you missed those announcements at Build last month. ONNX Runtime, PyTorch etc. can run on top of that as the higher-level access. (Disclosure: I work in Windows at Microsoft, but not specifically on AI tech)

  • @GaryExplains

    @GaryExplains

    21 күн бұрын

    Yes of course Apple's processors have NPUs, that wasn't the point I was making.

  • @Psychlist1972

    @Psychlist1972

    21 күн бұрын

    @@GaryExplains Maybe I misunderstood then. I thought you were comparing and contrasting and saying Apple just integrated the acceleration into the CPU so an NPU wasn't needed. But that's all part of their AI Accelerator/NPU/Neural Engine from what I understand.

  • @GaryExplains

    @GaryExplains

    21 күн бұрын

    @@Psychlist1972 No, I was saying that with the M4, Apple added a second ML accelerator, a CPU-based one, that uses the Armv9 Arm Scalable Matrix Extension, showing that you can build a useful ML accelerator into the CPU. In Apple's case it complements the Neural Engine; I am suggesting that it could replace it. Apple wouldn't have added it if it were useless or undesirable.

  • @Psychlist1972

    @Psychlist1972

    21 күн бұрын

    ​@@GaryExplains Ahh, got it. Thanks. That's another matrix/vector instruction set like Arm NEON or Intel AVX/AMX (the Xeon version, not to be confused with Apple's AMX), and part of the updated Armv9 set. Usable for much more than AI, but usable for AI as well, of course, for apps compiled to use it. It would be nice if the X elite Oryon cores implemented SME, but there's always a bit of leap-frogging going on between competitors, and Apple does some great work. There's still benefit to having a dedicated NPU, vs using CPU instruction cycles, but whether or not that is "essential" would depend on how much CPU you're willing to part with when running those models. I suspect much of the AI work Apple does will still run on their Neural Engine where it's more efficient. Unfortunately, any depth I have on CPU instruction sets pretty much bottoms out here.

  • @donaldduck5731
    @donaldduck573122 күн бұрын

    Seems an awful lot of money to find five things to do in Paris and make a fake photo. Is there anything else AI can do? I'm quite happy for now with my Win10 HP ZBook, my Wacom pen tablet, using MATLAB, Python, SolidWorks, Sketchbook and my "real" intelligence to create my designs.

  • @Aizemiyo
    @Aizemiyo22 күн бұрын

    This reminds me of Folding@home and malicious botnets, except it only runs on the NPU so it doesn't bottleneck the system. Copilot sounds like a perfect legal "botnet" for Microsoft to run whatever calculations or research they want on users' computers and submit the data periodically from all around the world. Unless it can run totally offline with only a one-time resource download, I will not be convinced my processing power is not being used for their benefit. Now I wonder: what if a malicious botnet could utilise these NPUs? That wouldn't be good, for sure.

  • @skyak4493

    @skyak4493

    22 күн бұрын

    Microsoft is creating a botnet to spy on everyone. The NPU is needed so that everyone doesn’t kill the AI spying process to save battery. The end game is making everyone train their AI replacement.

  • @PatrickHSB
    @PatrickHSB22 күн бұрын

    Gary says what

  • @GaryExplains

    @GaryExplains

    22 күн бұрын

    What

  • @xspydazx
    @xspydazx22 күн бұрын

    Can models be run on just CPU and RAM, or are we forced by the tools, which are optimized to use GPU over CPU? Graphics cards like Nvidia's Tesla line have on the order of a thousand GPU threads inside, so with 10 graphics cards you have a 10,000-thread setup; this is Nvidia CUDA. In such setups the GPU power overpowers the CPU/RAM combo. People are being forced into the cloud by the paying market and by tools built for GPU use, so we are forced to invest in GPU power, and yet VRAM seems to be the limiting factor, not the GPU, so we are already going in the wrong direction again. Now with these new chips we will also be forced back to the cloud, because the tools will utilize these chips whether we have them or not. The machine locks up when the GPU is used intensely, and there are heat issues, hence the need for such a chip. But in truth we don't want to keep being forced into upgrades, or toolmakers pandering to these chip makers. My CPU/RAM combo is more powerful than my GPU/VRAM combo, so we should be able to choose how we allocate resources. I think that is the true future we should be seeking: the ability to transfer all power to shields and life support, i.e. full control over which device is used for calculation and power consumption. That is what we actually need: resource control, not force. (Also, Triton does not exist on Windows, a great downfall for home fine-tuners like me.)

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    AI models require far more processing power than can be provided by a system that does not have a GPU and/or an NPU. Theoretically, it could be done, but the performance would be so poor that no one would waste their time using it.

  • @xspydazx

    @xspydazx

    22 күн бұрын

    ​@@gaiustacitus4242 Hmm, I'm not sure. Right now it is possible to quantize the hell out of a model and load it on a GPU-less machine, on CPU, without CUDA or anything like that. Today these models can even run on tablets and modern Android systems, which have only minimal GPUs. I think more universal tools are all we need; after all, it's only a neural network, nothing special, just some heavy number crunching. This shows the decline of investment in CPU architecture, hence the rise of GPU construction and its surpassing of CPU technology. The lucrative nature of GPUs keeps the technologies separate; even Nvidia doesn't encroach on Intel or AMD's domain, and those companies basically monopolize the market, so they're under various government constraints on sharing technology. As you know, they need to stay far ahead of the commercial arena, hence the danger of open-source models and ChatGPT, which Karpathy revealed is actually far more powerful internally than the model being used publicly; the level they've already reached must be released in stages. In truth we are now held back only by computing power, not coding power. Despite the full implementations being hidden, the models have been released in the Hugging Face transformers library, so all the models are already open source. Making a 500B model is only a matter of combining five 100B models as a MoE! So we are only limited by the cloud we can afford or the machines we can build, hence these upgrades. They already know; they've implemented this tech at mass scale and are only bringing the public up to date (they already upgraded). At one point our computers surpassed mainframes while NASA was still using tape drives. By today's expenditure and the operational calculations required to plot trajectories etc., NASA should have been the biggest cloud provider in the world; instead it is a consumer! Upgrades are moving too fast; we are in James Bond... oops, Star Trek (nearly).

  • @gaiustacitus4242

    @gaiustacitus4242

    22 күн бұрын

    @@xspydazx Small LLMs can run on mobile devices or computers without an NPU or powerful GPU, but these models have accuracy rates I find unacceptable. As users come to realize the limits, they too will find the output unacceptable.

  • @xspydazx

    @xspydazx

    22 күн бұрын

    @@gaiustacitus4242 I think you will find that to make these models effective they need to be trained with quantization in mind, so that when the full-precision model is saved, the reduction process doesn't lose so much. I have quantized down to Q2 and it was effectively lossless. Somebody quantized one of my models to "idfex" but I could not even load it to check whether it was good or not. So it's about how you decide to create the model; for mobile phones it needs to be specifically trained for the device, and then it will be highly effective. Some new form of LoRA/QLoRA will be required in training before it becomes a high performer. I also adjusted my temperature during training: if the accuracy doesn't reach the desired rate, I can nudge the temperature down to gain a bit more accuracy on the latent samples. It's about techniques; hence open-source public sharing, and private models with special skills... wink wink!

  • @benyomovod6904
    @benyomovod690422 күн бұрын

    They can take the copilot and put it in their.......
