Northwood vs Prescott: A Clock For Clock Pentium 4 Battle

Science & Technology

In this video, we compare a Northwood Pentium 4 to a Prescott Pentium 4 running in an identical environment to see if Prescott is as bad as people say and to find out where Intel went wrong with Netburst.
Test System:
Pentium 4 @ 3.2 GHz (Northwood | SL6WG)
Pentium 4 @ 3.2 GHz (Prescott | SL7PN)
Abit AI7 (Intel i865 chipset)
1GB Dual Channel DDR @ 400 MHz CL3
Nvidia GeForce 6800 512MB
Adata 128GB SSD
Windows XP Professional SP2
Music List:
Level 2 - Descent (AWE 32 recording by Goomer • Descent OST - Level 2 ... )
Level 5 - Descent (AWE 32 recording by Goomer • Descent OST - Level 5 ... )
Stardust Speedway (Present) - Sonic CD (US)
Menu Theme - Half-Life PS2 Port (Extended & Enhanced by BlackLambda25 • Half-Life OST - Main M... )
Prunus in Guahua (Menu) - Gran Turismo 6
Lotus in dam (Menu) - Gran Turismo 6
New PSN Account - PlayStation Vita
Kei - Maken X
Chapters:
0:00 Introduction & Background
6:14 The Test Setup
7:38 Benchmarks
16:36 Thoughts & Conclusion
25:35 Outro

Comments: 297

  • @NTGTechnology (25 days ago)

    Yes, there's a fly.

    Edit: I forgot to mention that x86-64 was also introduced with Prescott, although it was mostly non-existent on Socket 478 (there are only a couple of SKUs that have it). It was better supported on LGA 775.

    Edit 2: I've seen people pointing out that Gallatin actually had L3 cache instead of a larger L2 cache. I do remember hearing this a long time ago. However, while doing research, Intel Ark listed the P4 EE 3.2 as having only L2 cache. I figured Ark would be enough since it's the primary source, but I guess Ark can't be fully trusted.

  • @CompatibilityMadness (24 days ago)

    A bit of a warning about the ASUS N6800/TD/512M/A - it uses GDDR2 memory on a 128-bit bus. Because of this, both the Radeon X1650 XT (Pro) and the 6600 GT should be faster than this 6800 series card. Out of curiosity: how many ROPs/pixel shaders does your card report in GPU-Z/AIDA64? PS: There is a Far Cry bench tool from HOC you could use instead of FRAPS.

  • @geofrancis2001 (23 days ago)

    @@CompatibilityMadness They also made a 128MB DDR1 card.

  • @professionalinsultant3206 (23 days ago)

    The Gallatin Extreme Edition has L3 cache.

  • @NTGTechnology (23 days ago)

    @@professionalinsultant3206 I pointed that out in the edit already...

  • @vardekpetrovic9716 (23 days ago)

    Great video. Prescott had 31 pipeline stages though.

  • @goo1.x (25 days ago)

    Well, I'm sure that Intel learned their lesson when it comes to making processors that only bring marginal performance uplift whilst being less efficient.

  • @BReal-10EC (24 days ago)

    Yeah... AMD unfortunately played the same game with Bulldozer, Piledriver, Steamroller, & Excavator CPUs. MORE FREQUENCY!

  • @aaaalex1994 (23 days ago)

    @@BReal-10EC _cries in FX9590_

  • @BReal-10EC (23 days ago)

    @@aaaalex1994 Ha. I also have one of those still somewhere- but an FX-8320 with two HD 6950s in Crossfire X. One of the few times Fire being in a computer product name made sense. Damn thing will heat a barn when gaming (back when games supported two GPUs). Those AMD CPUs did not make much sense on the high end due to much higher thermals and PSU requirements... but the FX 6000 series was a very good budget gaming CPU back then.

  • @mitch075fr (23 days ago)

    @@BReal-10EC Recent benchmarks of these chips show that, on modern software, they don't actually suck that much - and they did equip the Xbox One and PS4. So, Bulldozer & Co needed optimized software to shine, at a time when everybody optimized for Intel Core - in practice, they sucked on the PC, but in theory they were an actual improvement over AMD's previous K10 µarch.

  • @stephandolby (23 days ago)

    @@mitch075fr That was Jaguar, though.

  • @medallish (23 days ago)

    I worked at a computer store when Prescott came out, and if you've ever wondered why, in the early 2000s, there were cases with a fan duct at around CPU height, it was often advertised as the case being "Prescott compatible".

  • @karolwojtyla3047 (4 days ago)

    This started with the P4, and CPU air ducts have stayed with us to this day in high-performance PCs. ;)

  • @yukinagato1573 (23 days ago)

    Okay, there's a lot of stuff I wanted to talk about:

    1) The Prescott-2M P4 was actually built because of x86-64. As AMD was the first to bring it to market, Intel had to buy a license to use it. However, as Prescott's main registers and cache were designed to hold 32-bit words, they had to double pretty much everything to 64 bits in order to maintain the same level of performance as seen on 32-bit systems. As such, the increase from 1 MB to 2 MB of L2 cache was only done so that their CPUs wouldn't lag even further while executing 64-bit code. Of course some 32-bit programs could benefit from the cache increase, but at this size the L2 cache latencies were so high (due to the amount of memory) that the speedup wouldn't actually be that big, while being extra expensive to make. In fact, many 32-bit programs and games perform about the same on an original Prescott and on a Prescott-2M.

    1b) The better solution for this cache problem was in fact what Gallatin brought to the table: 512 KB of L2 cache combined with 2 MB of L3 cache. The latter was much slower than the former, but was gigantic, while the L2 cache at this size was much faster than L3. Intel actually kept 512 KB as the standard size for per-core L2 cache for quite a while (except when it's shared between cores). It gives you fairly low levels of latency, and increasing it further (without any architectural changes) won't give you any performance boost. The best approach is building a slower but much bigger level of shared cache, so you don't compromise the performance of tasks that aren't that cache-dependent. Excluding the Pentium D and Pentium Extreme Edition, Gallatin is considered to be the fastest implementation ever of NetBurst on a single core.

    2) The Conroe project, which is actually a dual-core 64-bit design based on the old P6 architecture (from the likes of the Pentium III), started as soon as 2001 (one year after Willamette launched), probably indicating that Intel knew they had screwed up with NetBurst as soon as Willamette got released. They probably knew its pipeline design would only get more and more complicated as time went on, and decided to hold Conroe as a backup plan. Remember: a 90nm Prescott chip measures about the same as a 180nm Willamette chip in area. I don't know to what extent Intel was aware of the future thermal issues Prescott would have (given that Northwood actually had pretty tame power consumption compared to the Athlon XP and even the 64), but they were definitely cooking up an entirely new architecture based on a dual-core processor to replace NetBurst in case it failed. And it did. Really badly.

    3) Prescott was literally built to scale up to insane clock speeds. Actually, NetBurst at its core was, with Intel expecting to reach 10 GHz in 2005 while running at less than 1 V. This never happened. Originally, they intended to replace Prescott with a chip called Tejas, which would increase the pipeline stage count to 50 stages on the same 90nm process. It was even expected to be called "Pentium V" at that time. Infamously, Intel cancelled Tejas when they discovered that a 2.80 GHz Tejas would put out 150W of TDP, higher than the highest-TDP Prescotts (115W), while running at much lower frequencies. Intel soon realized NetBurst was a dead end, and invested a ton in the enhanced P6 architecture (the Pentium M and Core Duo/Solo chips) to lead their notebook and low-power market, as well as Conroe (the Core 2 family) as their next-generation architecture for notebooks and desktops alike. We all know how the rest of the story goes.

    4) The Pentium D, as we all know, is really a stopgap product between NetBurst and Conroe. In 2005, AMD was threatening Intel with their dual-core processors, such as the Athlon 64 X2. At this point, Conroe wasn't ready yet (it was released in 2006), so Intel had to quickly come up with some kind of product to compete with AMD's offerings, regardless of performance or power consumption. Since NetBurst wasn't built to have two cores (there are some consequential design complications with a multi-core approach), they made a workaround and literally put two Pentium 4s in the same processor. Its design was DREADFUL, since AMD's solution was much more elegant and efficient. The Pentium D's design literally involved the two processors communicating, exchanging cache data and more, through the FSB, which is HELLISHLY slow compared to in-chip communication. Some people even called the Pentium D "not a dual core, but a dual processor solution". I mean, it worked, and the two cores could provide more parallel processing power, but it was slower than AMD's solution, and was completely smoked when Conroe (an architecture designed from the start to be dual core) was released. If anything, the Pentium D actually set another record for power consumption: while Prescott consumed at most 115W, Smithfield consumed a roaring 130W. Eventually, it was up to Presler to bring TDPs back to 115W, and eventually to 65W in later steppings, which helped a lot. But at this point, Conroe was just about to be released, so... Ehhh.

    4b) If you ever wondered why the Core 2 Quad processors had two separate chips in a pretty similar arrangement to what the Presler Pentium D offered, it's because Conroe was built to be a dual-core processor, not a quad-core one. Today, whenever Intel wants to make a processor with 4, 6 or 8 cores, etc. (including architectures with efficiency cores and performance cores as well), they just make the CPU with the largest number of cores and disable some whenever needed or when they have defects.

    Anyways. Excellent video!

  • @tourmaline07 (21 days ago)

    From 2024, the idea that Intel could release a 10 GHz chip at less than 1000mV is even more absurd now than it was back then: we have Intel overclocking their chips to 6 GHz and having them crash at VIDs in the region of 1400mV out of the box, while consuming 300W of power. Still, considering they had Tejas, a *single* core chip chewing up 150W back then, I suppose they've not learnt much 😂

  • @masejoer (18 days ago)

    Good response, and correct data from what I remember. It seemed to be the Pentium 4 that taught us what limits we'd see when clocking CPUs up, and those limits are why we were stuck in the 3-4GHz range for so long, and why the focus shifted more toward IPC within the clock limits. Intel's theory in the Pentium 4 architecture was for a future that wasn't possible. I'm just glad that Willamette and Tualatin weren't mentioned throughout the video. Living near Intel in Oregon, the bad pronunciation is difficult to listen to ;)

  • @BGTech1 (5 days ago)

    50 pipeline stages is absolutely insane! It would have been interesting if these chips had made it to market.

  • @GGigabiteM (23 days ago)

    Cedar Mill actually really did matter to end users. Dell had some horrific case designs at the time with virtually zero airflow that packed in 86-115W Pentium 4s, which caused the machines to run hotter than the blazes of hell. Cedar Mill, specifically the D0 stepping with sSpecs starting with SL9K_, were 65W parts and made the difference between the machine shutting down on thermal trip or not. The SFF Optiplex GX280 power supply, for example, would get so hot that it would literally burn with any Pentium 4 other than the 65W Cedar Mill. It could be made even worse if a discrete video card was installed; the PSU would trip on thermal overload. Having the Cedar Mill just barely made it work.

  • @pankoza2 (9 days ago)

    Also, Cedar Mill introduced CompareExchange128 and some other instructions required for the x64 version of Windows 8.1 and later.

  • @RetroPcCupboard (24 days ago)

    I completely skipped the Pentium 4 originally. The Athlon XP was much better value and ran cooler. I switched back to Intel when I got a gaming laptop with a 2.0 GHz Pentium M. My opinion on the P4 has changed now though. I think it is a great processor for Windows 98. Much better value than trying to get a 1.4 GHz Tualatin.

  • @runningbird501 (23 days ago)

    Win98SE on a Northwood P4 is my daily retro driver.

  • @thepcenthusiastchannel2300 (22 days ago)

    Pentium 4 HT 3.2GHz Northwood running Windows XP for me. Pentium !!!-S 1.4GHz running Windows 98 SE as well.

  • @RetroPcCupboard (22 days ago)

    @@thepcenthusiastchannel2300 What GPUs did you use for XP and 98?

  • @thepcenthusiastchannel2300 (22 days ago)

    @@RetroPcCupboard ATi Radeon 9800 Pro 256MB + Quantum Obsidian X24 24MB for WinXP and ATi Rage Fury MAXX + 3Dfx Voodoo2 1000 12MB for Windows 98 SE.

  • @yukinagato1573 (22 days ago)

    The main reason I agree with you is that P4 chips are really, really cheap nowadays. It seems like many people had one, as well as pretty much every single office, so you can find loads of these machines and processors today. The last popular P3 was the 1.0 GHz Coppermine, which you can find for a reasonable price when they show up. Now, as Tualatin was pretty much oriented to low-end and low-consumption servers, practically no one bought it. It also only runs on specific motherboards that can offer it the right voltage levels, as Intel made it incompatible with older PGA370 boards. These Tualatin boards are generally rare, expensive, have fewer features than older boards, support less memory and, in many cases (I've seen quite a few boards like that), have bulging capacitors or just don't work. Finding a good board is sometimes more difficult than finding the CPU itself. Tualatin is a very bad value. If people want a P3-class machine, they should go with Coppermine. If they want more performance, Northwood or Prescott are the way to go.

  • @thebayandurpoghosyanshow (23 days ago)

    As for the pipelines, here's a simplified version of it. All modern CPUs I know of are pipelined; that means they have a queue of instructions they run, sometimes one per clock cycle, sometimes more. And there's a mechanism called "branch prediction", which predicts which instructions are going to be required next, essentially filling the pipeline in advance. If the branch prediction mechanism makes a mistake and loads an instruction into the pipeline that is not required, every prediction after that is false, and the CPU needs to empty the pipeline and start with the required instruction. This causes a delay depending on the length of the pipeline, also known as a pipeline stall.

    The length of the pipeline is not the reason for low instructions per clock. Here's the issue with CISC CPUs: you can complete more complex instructions per clock, but those instructions require complex CPU mechanisms to do their thing at the same time. It's never exactly the same time, but it must be close enough to be within one clock cycle. The higher the clock speed, the more precise your mechanisms must be. But there's another way: break your complex instructions into simpler instructions, and they will be easier to perform within the scope of one clock cycle, though you may need more than one clock cycle to perform one complex instruction. This approach started, I think, with the Pentium Pro CPU. So Intel decided to have complex instructions translated into simpler instructions, then compensate for the low IPC with higher clock speed.

    And it seemed it would work; with Prescott they introduced Hyper-Threading, which was built on their earlier superscalar technology. An instruction uses only part of the CPU; HT allowed two instructions to be run at the same time on two separate sets of data, as long as they used different parts of the CPU, so the CPU was used more effectively. Not twice as effectively, of course. But when you use more of your CPU per clock cycle, the CPU inevitably uses more energy and releases more heat. Add higher clocks to this - I think you get the picture. But this is justified as long as the CPU does stuff faster, so the saved time compensates for it. In practice, this didn't work as expected: the long pipeline, the branch prediction mechanism that was inadequate for such a long pipeline, and the stalls meant the CPU was often doing useless work and being forced to redo it, which by itself reduced the CPU's efficiency. And Intel found out they had reached a ceiling for TDP and power draw. There was only so much power an ATX PSU could provide a CPU with, and there was only so much heat a heatsink could remove.

    They did overcome most of these issues with the Cedar Mill die shrink, but by then it was too late to save the Pentium 4. Intel had already started going back to the Pentium M's core, which was essentially a roided-up P3 Tualatin, added the memory interface and branch prediction and other good stuff from the P4, then, unlike the Pentium D, which was just two CPUs in one package, made it a multi-core CPU. It's like HT, but your CPU has two cores and each thread is performed independently on one core, so if one core stalls, the other one isn't affected. Then they added HT back with Nehalem, and each core now had two threads it could run.

    Now the funny thing is, the Pentium III had about 10 pipeline stages, and early Pentium 4s had about 20. Penryn scaled it down to 14, and Nehalem again increased it to 20-24, but by then multiple cores meant that one core stalling did not affect performance too much, shrinking the size of basic elements on the die allowed more performance per watt, and branch prediction technology got good enough to reliably feed pipelines that deep. Sorry for the long read.
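    A rough worked example of why pipeline length matters here, as a sketch only: the stage counts (roughly 20 for early Pentium 4s, 31 for Prescott) are the ones quoted in this thread, while the branch frequency and misprediction rate are made-up placeholders, not measured figures.

        # Effective cycles per instruction with a branch misprediction penalty:
        # CPI_eff = CPI_base + branch_freq * mispredict_rate * flush_penalty
        def effective_cpi(cpi_base, branch_freq, mispredict_rate, pipeline_stages):
            # Assume a misprediction costs roughly a full pipeline flush.
            return cpi_base + branch_freq * mispredict_rate * pipeline_stages

        northwood_like = effective_cpi(1.0, 0.20, 0.05, 20)   # ~1.20
        prescott_like  = effective_cpi(1.0, 0.20, 0.05, 31)   # ~1.31
        print(northwood_like, prescott_like)
        # The longer pipeline loses roughly 9% throughput in this toy model unless a
        # better branch predictor (lower mispredict_rate) or a higher clock buys it back.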

  • @phirenz (23 days ago)

    "break your complex instructions into simpler instructions, and they will be easier to perform in the scope of one clock cycle..... this approach started, I think, with Pentium Pro CPU." This approach started with the 486 (or maybe even the 386). 486-class CPUs are fully pipelined, and there is a comprehensive subset of x86 instructions that execute at the same one instruction per cycle as RISC CPUs. The more complex instructions (ALU instructions that operate on memory destinations, and string instructions) get broken down into smaller instructions, even back on the 486.

    "Intel decided to have complex instructions translated into simpler instructions, then compensate the low IPC with higher clock speed." The slightly lower IPC isn't really an issue. Those instructions that break down into 4 uops on x86 are the equivalent of 3 or 4 full-sized instructions on any competing RISC implementation, so those need higher IPC to compensate. It's not exactly true, but it can be useful to think of x86's complex instructions as a form of code compression that packs more work into a single instruction. And it's not the higher clock speed that Intel was relying on. The key feature, which was added with the Pentium Pro, was out-of-order execution, which allows it to execute multiple uops (and multiple full instructions) in a single cycle, out of order, whenever their dependencies are ready. The Pentium Pro can sustain nearly 3 IPC on well optimised code that hits L1 cache, and more modern cores can (and do) exceed 4 IPC on well optimised code.

    This approach works so well that modern high-performance ARM and RISC-V cores take the same approach, breaking down instructions into simpler uops (well, it's not needed so much for RISC-V) and then executing them with an extremely wide out-of-order scheduler.

  • @thebayandurpoghosyanshow (23 days ago)

    @@phirenz thanks for your corrections, friend!

  • @laurelsporter4569 (14 days ago)

    "Here's the issue with CISC CPUs: you can complete more complex instructions per clock, but those instructions require complex CPU mechanisms to do their thing at the same time. It's never exactly the same time, but it must be close enough to be within one clock cycle." They really don't require complex CPU mechanisms, just decoder overhead. They generally didn't worry about keeping instructions down to a single cycle, either, in the pre-RISC days. One instruction might take 5 cycles, during which very little else got done. But they could often encode several operations in just a byte or two. As well, if the computer was busy, the main bus would be busy too, leaving the CPU a few cycles here and there to process instructions with no ability to read or write (or being stuck reading or writing, and not executing anything useful).

    "But there's another way; break your complex instructions into simpler instructions, and they will be easier to perform in the scope of one clock cycle. But you may need more than one clock cycle to perform one complex instruction. This approach started, I think, with Pentium Pro CPU." www.righto.com/2023/01/inside-8086-processors-instruction.html www.righto.com/2022/11/how-8086-processors-microcode-engine.html Breaking instructions down into simpler ones inside the CPU, to run as multiple simpler instructions, has been part of x86 from the beginning, long before pipelining. It was also common practice on other CISC ISAs, going back to those built with discrete components. Back in the day, data buses and memory capacity were often the biggest restrictions, and compilers were generally pretty basic binary converters. So every CPU ISA tried to pack as much semantic info as it reasonably could into its individual instructions (you might think there'd be a caveat about complexity here, but IBM even went as far as implementing database-related CPU instructions, so...). While early RISC CPUs did many boneheaded things, pipelining everything, and ditching stack operations (easy to write directly into machine code from most programming languages) in favor of GPRs and better compilers, were huge wins. Ironically, the RISC-driven compiler improvements allowed special registers not currently being used to be treated as improvised GPRs, without hand-coding assembly, on x86, removing much of the early RISC performance advantage.

    "And it seemed it would work; with Prescott they introduced Hyper-Threading, which was a built on their earlier super-scalar technology" Intel introduced HT with a variant of Willamette (Foster) for servers, and then Northwood for desktop PCs.

  • @aaaalex1994 (24 days ago)

    If my father and I had known this back in January 2005, we would have bought an Athlon 64 instead of a 3.2 GHz Prescott P4... (we still have that PC btw)

  • @unruler (23 days ago)

    I too still have my old 3 GHz P4 Prescott, and at the time I thought it was a great CPU; it played HL2 and Doom 3 without problems. But then again, back then I wouldn't have understood why you would need two cores at a lower clock - no software supported it.

  • @IanRomanick (18 days ago)

    You are saying Willamette wrong. There's a local winery that sells t-shirts that say, "It's Willamette, damn it." Because they rhyme.

  • @masejoer (18 days ago)

    and Tualatin

  • @Adesterr (23 days ago)

    I think the desire for a longer pipeline / clock speed came from what the customers were looking for. Bigger number = better -> more pipeline steps = more clock speed -> more clock speed = bigger number = more sales

  • @pc-sound-legacy (23 days ago)

    Totally agree. Remember, this was the time when everything was about numbers: MB/GB on MP3 players, DVD vs. CD capacity, megapixels on digital cameras, etc.

  • @yancgc5098 (23 days ago)

    Also the fact that most things were single-threaded at the time and a higher clock speed increases the performance of any type of application, while a higher IPC doesn’t have the same benefit in all types of applications

  • @dom3827 (23 days ago)

    Higher clocks make sense. Less latency. They are more real-time for real-time applications. That is why Intel back then was insanely superior in real-time audio recording and the like.

  • @2BuckFridays (22 days ago)

    I was literally about to do this comparison to satisfy my curiosity, thanks for saving me the trouble! great video :)

  • @giserson2 (24 days ago)

    Umark doesn't work because it looks for a registry key containing a version number to determine which Unreal Tournament exe to run (2003, 2003 demo, 2004, 2004 demo). This is broken for the GOG version of UT 2004 and possibly other versions as well. To fix it, change the "version" value at HKEY_LOCAL_MACHINE\SOFTWARE\Unreal Technology\Installed Apps\UT2004 to a number equal to or greater than 3187. Note that on 64-bit versions of Windows it would fall under HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432NODE\Unreal Technology\Installed Apps\UT2004. This is according to some notes I wrote down after digging through the Umark code and getting it working almost 2 years ago.
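    For anyone who wants to script the fix described above, here is a minimal sketch using Python's winreg module. It assumes the registry path and the >= 3187 threshold quoted in the comment; the value name, string type, and the example number 3369 are assumptions, so check what your own UT2004 install wrote before relying on it.

        # Hypothetical helper: set the UT2004 "version" value that Umark looks for.
        # Run from an elevated (administrator) Python prompt on the test machine.
        import winreg

        # On 64-bit Windows the key lives under the WOW6432Node branch instead.
        KEY_PATH = r"SOFTWARE\Unreal Technology\Installed Apps\UT2004"

        with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                                winreg.KEY_SET_VALUE) as key:
            # Any number >= 3187 should satisfy Umark's version check.
            winreg.SetValueEx(key, "version", 0, winreg.REG_SZ, "3369")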

  • @NTGTechnology (23 days ago)

    Thanks. I had first tried the GOG version and it didn't work. Then I copied over the retail version, but it had already been installed so it wouldn't have brought the registry keys. This makes sense then.

  • @Dashzer0 (23 days ago)

    I don't need this but I really appreciate comments like this. They brighten my day.

  • @ningyuanwang600 (24 days ago)

    I remember reading a review saying that Prescott's performance scales better with frequency, especially beyond 3.0 GHz. That was probably one of Intel's design goals. If 90nm had been good enough to push Prescott comfortably beyond 4.0 GHz, that would have made a decent difference compared to Northwood.

  • @NTGTechnology (24 days ago)

    That would make sense. Intel had a 4.0 GHz model planned for release, but ended up canceling it due to power limitations.

  • @docnele (23 days ago)

    @@NTGTechnology I remember trying to overclock a neighbour's LGA 775 Prescott, just a bit, to see what happens. It started thermal throttling like crazy and the cooling fan sounded like a vacuum cleaner. It looked like it just sucked power and turned any additional MHz into heat.

  • @yukinagato1573 (22 days ago)

    @@docnele Prescott also couldn't scale as well as Intel hoped because of leakage. There are many ways of reducing a chip's power consumption. The main one Intel was pursuing was lowering the voltage. When you do that, you reduce the power dissipated by a CMOS logic gate when it transitions between being open and closed (this is also called the dynamic power of a CMOS logic gate). However, there's a limit to how much you can decrease the voltage: at some point, you are not giving enough voltage for the gate to be "completely" open or closed. This means it starts to dissipate more static power (which is the power a logic gate dissipates while sitting still in the same state - CMOS gates were designed to have extremely low levels of static power and mostly dissipate dynamic power, which is why they are so much more power efficient than older techniques like nMOS gates). I think the problem should be clear by now. Prescott dissipated a lot of heat due to leakage, and it would only get worse at higher clock speeds. It needed lower voltages in order to operate at more comfortable temperatures, but with the added static dissipation, every time you increased the clock speed, the leakage would only get bigger and bigger...
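    A rough worked illustration of the voltage/frequency relationship described above, using the standard CMOS switching-power approximation; all numbers are illustrative placeholders, not measured Prescott figures.

        # Dynamic (switching) power of CMOS logic: P ~ activity * C * V^2 * f
        def dynamic_power(c_eff, volts, freq_hz, activity=1.0):
            return activity * c_eff * volts * volts * freq_hz

        base = dynamic_power(c_eff=1.0, volts=1.4, freq_hz=3.2e9)

        # Raising frequency alone scales dynamic power linearly (~1.25x here)...
        print(dynamic_power(1.0, 1.4, 4.0e9) / base)

        # ...while dropping the voltage pays off quadratically (~0.73x here),
        # which is why lowering VID was attractive until static leakage took over.
        print(dynamic_power(1.0, 1.2, 3.2e9) / base)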

  • @yukinagato1573 (22 days ago)

    @@NTGTechnology Even with Cedar Mill, they ended up not releasing a 3.8 GHz variant. And even then, the later stepping of the 3.6 GHz model had only a 65W TDP. It's also pretty overclockable, and many people managed to get them running at over 4-4.2 GHz. Intel could have kept releasing faster Pentium 4s and Pentium Ds beyond 3.6 GHz if they wanted to, but with Core 2 coming a few months later, who cared at that point.

  • @shrekoc5570 (14 days ago)

    I remember reading at the time that Intel wanted to call Prescott the Pentium 5. The reworked architecture, longer pipeline and new process node were supposed to lead to much higher clock speeds. Once it was clear that clocks would not be scaling as they had hoped, Intel figured Prescott would be better received as the latest Pentium 4 than as a next-generation product.

  • @Ivan-pr7ku (23 days ago)

    The main reason Intel went forward with Prescott was the implementation of the x86-64 extensions. At the same time, the clock-rate agenda was still in full swing, so the CPU's pipeline had to be stretched even more, together with a totally different logic layout, and thus they ended up with a brand new architecture anyway. Otherwise, Northwood was a very well balanced NetBurst implementation that simply topped out on its capabilities - clock scaling and feature support. The increased cache latency in Prescott is definitely responsible for some of the performance loss against Northwood, despite the doubling of total size.

  • @NTGTechnology (23 days ago)

    Not sure how I forgot to mention the addition of x86-64. Though it's worth noting that only some models have 64-bit support.

  • @neongenesis2979 (24 days ago)

    There were a few Socket 478 boards with PCI Express. I have two Biostar boards, the P4M900-M4 and 945GC-M4, and both seem to handle memory resources terribly: with 4GB of DDR2-533 installed, only a bit less than 2GB is usable in Windows XP.

  • @hygri (23 days ago)

    Loved it, great video! Waves of nostalgia... moar frequency better frequency.

  • @RedStar-dz5tc (24 days ago)

    Nice video! I have the same Abit AI7 motherboard with the exact same Zalman CPU cooler, and it came with the Northwood Pentium 4 @ 3.2 GHz (SL6WG).

  • @pc-sound-legacy (23 days ago)

    Great comparison with many different benchmarks, power consumption figures, etc. Thanks! Your conclusion sounds reasonable to me. I also think the major problem might have been the new 90nm process, which did not deliver the expected results and therefore killed off the possibility of raising clock speeds to favour the architecture. Such a change in lithography can be quite challenging - narrow structures may interfere if not handled properly - just a thought - but the fact that the new process even increased power consumption made it obvious that something didn't go according to plan 🤔

  • @kenh6096 (19 days ago)

    Great comparison, thank you.

  • @matm9243 (24 days ago)

    Around 13 years ago I had both CPUs and a Dell Optiplex GX270 with 4GB of 400MHz RAM running Windows 7 Pro 32-bit. In my testing, in modern games at that time, the Prescott beat it. I played Skyrim, Black Ops, and Battlefield 3 at 800x600, lowest settings. With a program I believe is called SetFSB I was able to overclock the Dell, and it had a Radeon HD 3850.

  • @aaron96244 (5 days ago)

    Nice video. Happy it was recommended.

  • @powerpower-rg7bk (23 days ago)

    The pipeline stage count in the Pentium 4 doesn't necessarily include the stages used by the instruction decoder, as there was a micro-op cache. On a micro-op cache hit there was no need to use the decoder, saving a few early stages. Thus the pipeline stage counts often quoted for the Pentium 4 are for the optimal scenario where there is a micro-op cache hit.

    Geekbench and other results where Prescott pulled ahead could be explained by usage of SSE3 where possible. In particular, code that could use the LDDQU instruction saw improvements to memory performance due to how it could handle unaligned accesses better. The proliferation of SSE3 code wasn't there at Prescott's launch, which dampened its initial impressions.

    There were models of both Northwood/Gallatin and Prescott that had a 1066 MT FSB. These were usually the Extreme Edition versions, as the mainstream variants generally topped out at 800 MT FSB.

    20:00 The Pentium 4 Extreme Edition didn't increase L2 cache but rather added a large L3 cache on top of the L2. Latencies were different there, as was how the cache itself behaved. Since that core was designed for multi-socket systems, the L3 cache would often snoop for cache data in use in another socket to cut down on the time to resolve coherency matters. It still worked in the single-socket Extreme Edition desktops as another cache layer, but the benefits were never fully realized for what it was designed to do.

    Prescott on paper was supposed to hit 5 GHz, so even with the few areas where IPC regressed, there was supposed to be an overall performance increase. Take a look at old Intel roadmaps and yes, Intel was hoping to reach that area based upon the design. Transistor leakage is what made thermals run away, as at the time Intel had no means of addressing that issue. Of course, since then Intel has figured out ways around it at the transistor level.

    Intel did shrink the Prescott design down to 65 nm with the Cedar Mill core. These weren't that popular as they only came out a few months before the Core 2 Duo and were clearly a stopgap solution to appease OEMs. However, for overclockers hoping to reach the absolute highest clock speeds, Cedar Mill was the popular choice until the AMD FX (Piledriver) came along.

    There was one more chip in the NetBurst family, but it only made it to the prototype stage: Tejas. Intel had prototypes for testing, but the performance cost of lengthening the pipeline further and the runaway thermals were a clear indicator of the need to pivot to the Core 2 Duo. Tejas was to take the NetBurst design past 7 GHz. The channel Fully Buffered managed to get their hands on a prototype sample, for further reference.

  • @NTGTechnology (23 days ago)

    In regards to the cache on Gallatin, after hearing it mentioned by a few people I do now remember that it had L3 cache instead of just L2. I was going off of Intel Ark for my research and I figured that was enough since it's a primary source. I've attached an addendum to my pinned comment.

  • @yukinagato1573 (22 days ago)

    @@NTGTechnology There are two different versions of the P4 EE. There is the Gallatin-based one, which in fact featured 2 MB of L3 cache as well as 512 KB of L2 cache, and the Prescott-2M-based one, which has 2 MB of L2 cache. The latter variant is not that special; the only difference from other Prescott-2M chips is that it has a 1066 MT/s FSB. But overall they were pretty bad value back in the day. Gallatin is still the fastest P4.

  • @NTGTechnology (22 days ago)

    @@yukinagato1573 I was specifically referring to the Gallatin-based 3.2 GHz model for Socket 478 (SL7AA). According to Intel Ark, it only has L2 cache, hence why I said that in the video.

  • @TheGrunt76 (23 days ago)

    I still remember the era just fine, and back in the day the problem with Prescott wasn't that it couldn't match Northwood in general, but that in practice it couldn't actually move the NetBurst architecture forward at all. It practically exposed Intel's design flaw and showed that the architecture would never be capable of the 6, 7 or even 10 GHz that Intel envisioned at some point. And considering the competition, Prescott just couldn't offer enough. AMD's old K7 architecture gave all the earlier P4s a good run for the money and practically offered an unbeatable price/performance ratio against Willamette and Northwood, but compared to K8, Prescott suddenly looked outright pathetic. And Intel had actually hit the wall with the architecture, so Prescott was more or less the best it could offer.

    It is quite ironic that Intel seems to struggle with similar issues on their current Core platform. They try to squeeze out as high a clock speed as possible to compete with high-end AMD chips while at the same time hitting the power draw and thermal limits of the silicon, to the point that their CPUs have stability and degradation issues.

    By the way, outliers, especially in synthetic benchmarks, are to be expected. Different benchmarks are optimized for different workloads, and it is no wonder that NW can beat Prescott in some of those tests, also depending on when the benchmark was released and what platforms and features it was designed to benchmark. I'd say that at the time Prescott was in its prime, compared to the then-old NW, Prescott managed to beat NW 99% of the time in real application performance. Often it was just a negligible improvement, as some of your benchmarks show, and that was actually also the major problem of Prescott.

  • @Russell970 (24 days ago)

    I love my Pentium M 745; it's actually somehow faster than desktop Pentium 4s because of its 2MB cache!

  • @amdintelxsniperx (23 days ago)

    The Pentium M uses a more efficient pipeline; it's pretty much a modded P3.

  • @yukinagato1573 (22 days ago)

    @@amdintelxsniperx Crazy to think Intel eventually ditched what they once considered to be "the architecture of the future", and went back to the older one. Lol. (Of course, they heavily improved it, but still. Damn.)

  • @sparki_ (23 days ago)

    thanks a lot for this video!

  • @megra25 (22 days ago)

    Great review! You should do GPUs as well :)

  • @IronicTonic8 (23 days ago)

    I always felt Northwood was a very successful generation for the P4; it was able to beat the Athlon XP series by quite a margin. I had an Athlon XP 3200+ back in the day and I was a bit jealous of my roommate's P4 3.2GHz (Northwood). Northwood didn't run that hot, especially when compared to the Athlon XP of the time, and it scaled well during its run. I definitely agree that Prescott sealed the legacy of the P4 and soured it in many people's minds.

  • @alphadog6970 (24 days ago)

    Great video 👍 As I watch this I will write down some opinions. Prescotts run really hot even with a good cooler. In the BIOS, the stock setting is to thermal throttle them to 50% if they hit 70C, and that happens regularly in the 3D benchmarks since they are an extended load. You can see what happens if you run RTSS in the benchmarks and monitor total CPU utilisation as well as each core individually. If the total is hovering around 50-54%, it's getting hot. You can monitor GPU utilisation with HWiNFO, as it has an RTSS module that puts GPU core utilisation on the overlay, so you can be confident that everything is properly tested.

  • @magnum333 (23 days ago)

    Thank you for posting your benchmark results. The mobile P4 Prescott really shines - power consumption goes down by A LOT, and you can undervolt it by messing with the MSR registers. If you are bound to NetBurst, that's my pick; it's a clear winner.

  • @racinggameschannel (24 days ago)

    Great content! I've been considering a 3.4 GHz Northwood for a Dell Inspiron 9100 (Dell XPS Gen 1). Curious to see how either of them overclocks, and whether Northwood can compete even better against Prescott.

  • @phirenz (23 days ago)

    "However branch prediction isn't some magic bullet" Modern branch predictors basically are magic. It's just that the branch predictor in Prescott was nowhere near good enough to compensate for its longer pipeline. Branch predictors don't just work on repetitive workloads; they are absolutely essential for the performance of out-of-order pipelines on all workloads, because out-of-order pipelines can't even see branches until they are decoded 5-10 stages down the pipeline. Not just conditional branches - they can't see unconditional jumps either, or call instructions, or even return instructions. Modern branch predictors are so magic that they can predict the correct destination for indirect calls, and they track the last dozen branches taken, allowing them to (for example) change their prediction of a branch inside a function based on which outer function called it.

  • @yukinagato1573 (22 days ago)

    Yes. But I mean, Prescott's branch predictor WAS GOOD. EXTREMELY GOOD. One of the most advanced branch prediction circuits we'd seen at that time. However, not even it was able to save Prescott from its 31-stage pipeline. Damn.

  • @phirenz (22 days ago)

    @@yukinagato1573 Yeah, it was extremely good for its time. NetBurst (and especially Prescott) came at an interesting point in CPU design history. CPU designers had learned that the "pipeline must be as short as possible, to minimise branch delay" design philosophy that classic RISC pipelines optimised for was wrong, and that a good branch predictor would compensate for longer pipelines that allowed for both higher clock speeds and (more importantly) out-of-order execution. But CPU designers hadn't yet learned what the optimal pipeline length was.

    The Pentium Pro/2/3 had a conservatively short pipeline (and a really simple branch predictor, by today's standards), and NetBurst (especially Prescott) appears to have been an experiment in lengthening the pipeline. And I guess Prescott pushed things too far; modern x86 cores seem to have settled on a pipeline length of around 19 stages, which more or less lines up with Northwood. Though they make better use of their extra pipeline stages, putting them in the frontend, rather than NetBurst, which threw away too many pipeline stages trying to run the backend at higher clock speeds.

    Intel weren't the only people who made this mistake at this point in time. IBM also made the PowerPC core for the PS3 and Xbox 360 with a very long pipeline. It didn't even have out-of-order execution; they went with an extremely long pipeline simply because they were targeting clock speeds of over 6 GHz (physics bit them in the ass, and they could only clock it at 3.2 GHz, and only with massive cooling solutions).

  • @yukinagato1573 (22 days ago)

    @@phirenz Yeah, very fair points. Just some random thoughts: it isn't exactly optimal to build a super short pipeline for complex architectures, even if you want to reduce branch delay. Earlier x86 CPUs had an "in-concept" pipeline with 2 stages implemented (it wasn't called a pipeline per se, but the concept is the same). You had a fetch circuit (with memory control, registers and an instruction decoder) and an execution circuit (with an ALU and branching logic). I call this a pipeline because it can have one instruction at the fetch block and another at the execution block at the same time. But cramming all the logic into just two stages really limits your maximum clock speed. The 486 was the first Intel CPU to implement a 5-stage pipeline (prefetch, decode 1, decode 2, execution and writeback), and it allowed Intel to quadruple its clock speed, going from 25 MHz to 100 MHz in a span of almost 4 years. With such small pipelines, branch delay was barely a problem.

    Moreover, longer pipelines also become more important when designing superscalar CPUs, like the first Pentium. It could issue two instructions at once, which means the logic for identifying hazards (dependence conflicts between instructions that use the same registers in the pipeline), as well as branch prediction, got much more complicated. In fact, it gets exponentially more complex as you increase how many instructions the CPU issues. The Pentium Pro/II/III used a triple-issue design with out-of-order execution, which I don't think makes sense for a 5-stage pipeline, but does for one with 10 stages. The more stages you have, the more complex logic you can add and break up so it doesn't limit your maximum clock speed as much.

    From a design perspective, NetBurst was actually brilliant, and very ahead of its time. It had speculative execution, a complete instruction rescheduling circuit that captured instructions mistakenly sent down the pipeline and reissued them later, a micro-op-based L1 instruction cache that let cached instructions skip some decoding stages in the pipeline, and a heck of a lot more. It's an INSANELY complex architecture. The problem is, 90% of the things it introduced were done to minimize the 1000 problems you have with such massive pipelines. If in a way NetBurst felt like a, may I say, "natural" path for x86 to follow, it also strained itself because it was at the limit of what could possibly have been done on a single-core architecture. You have this gigantic, complex core doing whatever it can for its pipeline not to stall. And Prescott did even more in that regard (as it had an even more gargantuan pipeline), but as you said, it just went too far. In the end, it's all about tradeoffs. It makes sense to implement a longer pipeline if you have a bunch of stuff to put in a chip. But at the end of the day, even the most brilliant, complex and powerful circuits can't save you from a 31-stage pipeline.

    As for IBM, YES, but I think Apple felt it harder than the consoles. The PowerPC G5 (970 series) got insanely hot and power hungry, even with a dual-core architecture. In some Power Mac G5 towers, they even put water coolers inside. They would leak and damage all the boards over time. Also, there was never a G5 Mac laptop.

  • @SianaGearz (23 days ago)

    Love the effort on benching UT2k4!

  • @Vile-Flesh (22 days ago)

    This was very interesting. I didn't use Prescott much, but my daily computer for a number of years was a Northwood Pentium 4 2.26 GHz that someone at work gave me in 2006, and after getting it working again with a power supply swap it was a nice upgrade from the Pentium 3. I remember in 2010, after replacing the GeForce4 MX 440 with an FX 5200, I could finally watch 480p videos in full screen on YouTube.

  • @tkoutnchannel (24 days ago)

    Great video. Northwood being faster than Prescott is one of those myths that gets repeated over and over. It's an impressive feat that they are essentially equal clock for clock given the extensive changes between their pipelines. If only the clock speed gains had materialized. If they had, we would probably look back at Prescott, and NetBurst overall, differently.

  • @yukinagato1573 (23 days ago)

    Northwood is faster than Prescott in most non-optimized applications. Prescott brought a lot to the table to compensate for its longer pipeline and the resulting branch misprediction penalties. Besides doubling the L1 and L2 cache and shrinking to a 90nm node, it also introduced SSE3, an improved implementation of Hyper-Threading and more "in-software branch prediction hints" (basically, programmers could give hints to the P4 to indicate whether a branch should be taken or not taken). Many of these features require software optimization, better compilers or support for new instructions at the software level to improve performance. This is also what happened with Willamette (it introduced SSE2, among other things, and was fairly dependent on software optimization to compensate for its longer pipeline).

    When Prescott was first reviewed in 2004, everyone was baffled that Intel would put out a processor that seemed to be slower per clock than its predecessor, even with the promise of reaching unprecedented clock speeds. Today, what we see is that, in most cases, Prescott is marginally better than Northwood in most software. This is mainly because programs were either patched or actually designed to benefit from Prescott's features. Of course there are some outliers, but this is the main reason why Prescott is faster than Northwood, at least nowadays.

  • @tkoutnchannel (23 days ago)

    @@yukinagato1573 Maybe :P I re-skimmed Tom's Hardware's review from 2004; it has a number of tests with software of the day. It is certainly a mixed bag (I skipped most of the synthetics because, IIRC, Intel was fudging the numbers there, though it shouldn't really matter if we want to compare Intel vs Intel; just not a fan of them in general). It was certainly a disappointment, but I just wouldn't call it slower than Northwood. You might be right, and if we ran a suite with tests from the pre-NetBurst era, say 1998-2000, maybe we would see Prescott's architecture hurting the scores more. But those would certainly still run more than well enough for any practical purpose.

  • @Xaltar_ (23 days ago)

    The larger cache made all the difference in gaming and other applications that benefit from it. I had a Northwood overclocked from 2.4 GHz to 3.6 GHz, and my Prescott at the time couldn't go over 3.2 GHz; the Northwood was significantly faster in every benchmark with its 400 MHz advantage. My Northwood was a seriously good overclocker, however. I had the highest overclocked Northwood on air cooling on the tech forums I was on for years. Not as good as my Celeron 300A with its 733 MHz max and 600 MHz daily-use clocks, but still a great chip. IMO, the Northwood was the last decent overclocker before we got into more modern, pushed-to-the-limits (more rigorously binned) CPUs - on air and AIO cooling at least.

  • @jonjohnson2844 (24 days ago)

    Thanks, I was debating whether to save 20 cents on a Prescott CPU in my next build :D

  • @mattelder1971 (22 days ago)

    I remember a lot of press at the time saying that Intel basically hit a brick wall at 4 GHz, and this pretty much explains why. They were so focused on that magic number that they refused to believe they weren't going to be able to achieve it with the P4 designs.

  • @K31TH3R (18 days ago)

    I had a Northwood Pentium 4 2.6B (533MHz FSB) that was an absurd silicon lottery winner. It got me into the 4GHz club on air cooling when 4GHz often required sub-ambient cooling, and the chip was 24/7 stable at 3.93GHz @ 1.62V. I very rarely saw Prescott chips post FPS numbers as good as I was posting, and I was also proud that my much cheaper chip was capable of beating the $999 Gallatin-based 3.73GHz P4EE in games.

  • @spg3331 (22 days ago)

    Great video

  • @Sazabizc (14 days ago)

    Oh man... this brings back memories. I started to really get into PC gaming with a Gateway gaming PC back in '03 - the 700XL, I think, was its name. It had a 2.8 GHz Northwood chip. I later upgraded the motherboard and RAM so I could overclock, and got the Northwood to 3.8 GHz. Unfortunately... I found out the hard way that Northwood doesn't like to be overclocked... it can suddenly die or just degrade to the point where it can barely hold stock clocks. This happened to me 2 or 3 times. I'd had enough and switched to a 3.0 GHz Prescott, and man, what a chip. I got it up to 4.5 GHz, and yes, people complained it ran hot. I didn't care; I had a Koolance case with an integrated water cooling setup, so heat wasn't an issue for me. I used that system until something on the motherboard died and I had to upgrade. The early 2000s were a very nostalgic time for me... tech was really changing fast. The GeForce 6800 came out, I think, in '04; I got it for Christmas to replace my FX 5900 Ultra. Then PCIe came out... then SLI... then the new 775 socket... then dual cores, etc. Unfortunately, I was a young teen then and couldn't keep up with all the changes, so I had to buy used or new old stock back in the day.

  • @mickwolf1077 (24 days ago)

    I had, and still have, a 3.4 GHz Prescott, but had high input lag playing L4D. Put in a 3.2 Northwood and no issues. The watercooled Prescott was great at keeping your legs warm.

  • @robsyoutube (23 days ago)

    Something worth noting from testing I did years ago on Linux: if something was compiled with SSE3 instruction support, it was always slightly faster than on the Northwood, with the Northwood winning in SSE2. This may have just been GCC-specific, but it's worth noting anyway.

  • @matt5721 (4 days ago)

    I just have to say this might be the highest quality comment section in the history of KZread

  • @weelebaseknowles4410 (23 days ago)

    They said they would reach 10 GHz, but they didn't even hit 5.

  • @lilkuz2005 (8 days ago)

    I would be interested in seeing some thermal comparisons. One of my old P4 builds is running a 3 GHz Prescott and I would like to see if switching to a Northwood would be worth the trouble.

  • @Mannard74 (7 days ago)

    The best AGP video card was the 7950 GT. Units that died under warranty were offered a replacement (from XFX and EVGA at least) of a PCIe 8800 GT. The 7950 GTs had a tendency to burn themselves up. After that, the best AGP got was the 7800 GS, AFAIK.

  • @ccanaves (20 days ago)

    It would be interesting to test the difference in performance at different clock speeds. I remember at the time it was said that at lower frequencies the Northwood was faster than Prescott, but as frequency went up, the difference got smaller until a crossover point where Prescott started to be faster. The reviews of the time (Tom's and Anand's) also claim something similar.

  • @leetattitude6808 (21 days ago)

    At the time I had both a 2.8 GHz Northwood and a 3 GHz Prescott, and a Prometeia Mach 2. The Prescott overclocked 70-100 MHz higher but had a lower 3DMark01 score than the Northwood when they were both running around 4.5 GHz. My Northwood + 9800 Pro got 4th place overall in the Futuremark hall of fame.

  • @BReal-10EC (23 days ago)

    Weird how 20-21 years ago this new SMT or "Hyper-Threading" was a really big deal. Today Intel is saying SMT is dead... I guess because of their E and super E cores? And your hypothesizing on what happened does make sense. Considering the amount of money tied up in the designs, and how long it takes to even get an engineering sample to test... I bet industrial espionage is a real concern for these companies. AMD, Intel, Nvidia, up-and-coming tech brands from China... they all want to know what the others are working on. I'm sure GPUs take just as much time and money... if not more.

  • @krazownik3139 (23 days ago)

    I mean, you could even hypothesize yourself by trying to interpret how the competition behaves. When it comes to HT, I believe Intel is preparing something huge. Maybe not in the next gen, but in the gen after that. I doubt they'll ditch HT completely. In gaming rigs and average PCs it's no longer as important as it was in times with fewer cores, but when it comes to workstations, SMT is a must to keep up with AMD's offerings, and those share the same architecture. The biggest issue with HT in modern multi-core Intel CPU designs is core affinity and the role of the OS in it, which Intel doesn't have control over.

  • @yukinagato1573 (22 days ago)

    HT was done because they wanted to keep the P4 pipeline as busy as possible. By making it process another stream of data, either from a different program or from an application optimized to take advantage of it, you can keep the pipeline fed even when it stalls, as in the case of a branch misprediction, or instructions dependent on the same registers as other already-scheduled instructions. This is a major concern when making a CPU with such a long pipeline. However, HT was never meant to be power efficient (like basically every other major aspect of NetBurst). SMT processing is slower than multi-core processing and consumes more power, because it makes the pipeline more complicated. Some companies, like ARM, really resisted the idea of implementing SMT in their processors, as they are literally meant to be power efficient. I believe Intel thinks the mix of LP, E and P cores they are currently putting in their CPUs is enough to do everything HT does in a more efficient way, without complicating the architecture too much and potentially consuming more power.

  • @NUCLEARARMAMENT
    @NUCLEARARMAMENT17 күн бұрын

    Intel is ditching SMT in favor of rentable units. I'm not worried about it.

  • @BReal-10EC
    @BReal-10EC17 күн бұрын

    @@NUCLEARARMAMENT Rentable units sounds like just another marketing term. That being said, I can totally see some new CPU architecture designed to work better without HT. It's a design choice. Back when CPUs had just one, two or four cores and limited frequency, HT was a real benefit as Windows and programs started using more than one core, so having more cores, even virtual ones, helped performance. That's not really a concern now, as even Intel will sell you a reasonable 4-core CPU that can boost to almost 5 GHz... you have to go really low in the SKU stack to get very limited cores (in number and performance per core) now... Also, I wonder if SMT/HT is another security risk. I am sure Intel doesn't want another Spectre-level problem.

  • @NUCLEARARMAMENT
    @NUCLEARARMAMENT17 күн бұрын

    @@BReal-10EC It's not like you can entirely avoid SMT- or branch-prediction-style mitigations. It's always going to be exploitable.

  • @mad1316
    @mad131623 күн бұрын

    Would love to see how the Gallatin core compares clock-for-clock.

  • @kungfujesus06
    @kungfujesus0623 күн бұрын

    Video encoding is likely getting some wins from SSE3 instructions, but I'd have to look at the ffmpeg code to be sure.
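
For anyone curious what that kind of win might look like, here is a minimal illustrative sketch - not ffmpeg's actual code, and the function name and block shape are made up - of a sum-of-absolute-differences loop where SSE3's lddqu instruction (a faster unaligned 128-bit load on Prescott) is the relevant addition. Something like gcc -O2 -msse3 would be needed to build it.

```c
/* Hypothetical SAD routine for a 16x16 block, the kind of loop video
 * encoders run constantly. src/ref may be unaligned, which is where
 * SSE3's lddqu helps over SSE2's movdqu on Prescott. */
#include <pmmintrin.h>   /* SSE3 intrinsics (also pulls in SSE2) */
#include <stdint.h>

uint32_t sad_16x16(const uint8_t *src, const uint8_t *ref, int stride)
{
    __m128i acc = _mm_setzero_si128();
    for (int y = 0; y < 16; y++) {
        __m128i a = _mm_lddqu_si128((const __m128i *)(src + y * stride)); /* SSE3 unaligned load */
        __m128i b = _mm_lddqu_si128((const __m128i *)(ref + y * stride));
        acc = _mm_add_epi64(acc, _mm_sad_epu8(a, b)); /* psadbw: two partial sums per row */
    }
    /* Combine the low and high 64-bit partial sums (they fit easily in 32 bits). */
    return (uint32_t)(_mm_cvtsi128_si32(acc) +
                      _mm_cvtsi128_si32(_mm_srli_si128(acc, 8)));
}
```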

  • @shihanafridhi9517
    @shihanafridhi95172 күн бұрын

    The Prescott is the first Pentium 4 CPU that can run Windows 10.

  • @jrherita
    @jrherita23 күн бұрын

    Great info! The results make sense - Prescott has double the L1 and L2 caches of Northwood, which explains why the games are faster. OTOH, the caches have higher latency (23 vs 16 cycles for L2), and of course the pipeline stall for a branch miss is much worse on Prescott. Prescott does have a stronger branch predictor, though, to offset some of this. Prescott's party trick, of course, is that 64-bit apps like 7-Zip should prefer it (assuming there's enough memory bandwidth) over Northwood... but then there's the Northwood Extreme Edition... :)
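
A rough back-of-the-envelope way to see how a bigger-but-slower L2 can still come out ahead: the extra latency is paid on hits, but every additional hit avoids a far more expensive trip to DRAM. In the sketch below, only the 16 vs 23 cycle L2 latencies come from the comment above; the hit rates and the ~200-cycle DRAM penalty are assumed, purely illustrative numbers.

```c
/* Back-of-the-envelope average memory access time (AMAT) comparison.
 * L2 latencies (16 vs 23 cycles) are from the comment above; the hit
 * rates and DRAM penalty are illustrative assumptions, not measurements. */
#include <stdio.h>

static double amat(double l2_lat, double l2_hit, double mem_lat)
{
    return l2_lat + (1.0 - l2_hit) * mem_lat;
}

int main(void)
{
    double mem = 200.0;                  /* assumed DRAM penalty in cycles */
    double nw  = amat(16.0, 0.90, mem);  /* Northwood-like: 512 KB L2, assumed 90% hit rate */
    double pr  = amat(23.0, 0.94, mem);  /* Prescott-like: 1 MB L2, assumed 94% hit rate */
    printf("Northwood-like AMAT: %.1f cycles\n", nw);  /* 16 + 0.10*200 = 36.0 */
    printf("Prescott-like  AMAT: %.1f cycles\n", pr);  /* 23 + 0.06*200 = 35.0 */
    return 0;
}
```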

  • @gast128
    @gast12823 күн бұрын

    Old times. Deep pipelines hurt branchy code, which can only be partially mitigated by a branch predictor. The Netburst architecture was supposed to reach 10 GHz clock speeds by 2010. Since that wasn't reachable, they abandoned this road and basically went back to the Pentium 3 architecture with the Core chips, which had a shorter pipeline and later an integrated memory controller.
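
To make the branch-miss cost concrete, here is a minimal sketch (the names are illustrative, not from any real codebase) of the same reduction written with a hard-to-predict branch and then branchless. On a pipeline as deep as Prescott's, the branchy version pays roughly the full pipeline depth on every mispredict with random data, while the masked version has nothing to mispredict.

```c
/* Branchy vs branchless sum of elements above a threshold. */
#include <stddef.h>
#include <stdint.h>

/* Branchy: one data-dependent branch per element; ~50% taken on random
 * data means frequent pipeline flushes on a long pipeline. */
int64_t sum_over_threshold_branchy(const int32_t *v, size_t n, int32_t t)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > t)
            sum += v[i];
    }
    return sum;
}

/* Branchless: turn the condition into a mask, so there is nothing to mispredict. */
int64_t sum_over_threshold_branchless(const int32_t *v, size_t n, int32_t t)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t mask = -(int32_t)(v[i] > t);  /* 0 or 0xFFFFFFFF */
        sum += v[i] & mask;                   /* adds v[i] or 0 */
    }
    return sum;
}
```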

  • @justingoretoy1628
    @justingoretoy16285 күн бұрын

    My roommate does deinterlaced standard-definition video at 30 fps on a 55" LCD, with the black bars. At least it isn't stretched. Also, there's another reason they went with high clocks and a longer pipeline: the world was shocked by the multimedia capabilities of the Sony PS2, and they needed a way to accelerate video-decoding-type tasks rather than increase frame rates and complexity in games. Even twice the CPU performance in branchy tasks won't give you the magnitude of increase in DVD playback that a smart SIMD or vector-extension approach would. We can see this dilemma continue to plague video game development well into the seventh generation of console hardware. The PowerPC-based approaches in the PS3, Xbox 360, and Nintendo Wii were a consequence of the hangover from that era and that processing-power-versus-particular-workload dilemma. Nowadays our x86-64 CPUs have vector extensions so performant they could be a graphics card.

  • @andrebachmann1475
    @andrebachmann147524 күн бұрын

    I think you missed the fact that Prescott supported 64-bit while Northwood did not. So Intel had to go with Prescott, because the marketing side would have been difficult if AMD's Athlon 64 supported 64-bit while Intel's Pentium did not.

  • @aaaalex1994
    @aaaalex199424 күн бұрын

    Only the Socket 775 ones do. No released Socket 478 CPUs support Intel 64.

  • @tkoutnchannel
    @tkoutnchannel24 күн бұрын

    @@aaaalex1994 Correct for all intents and purposes, but IBM had Intel make two 64-bit-enabled SKUs for Socket 478 (SL7QB, SL7Q8). So they do technically exist.

  • @aaaalex1994
    @aaaalex199423 күн бұрын

    @@tkoutnchannel I know. That's why I said "no released" processors. Also, even if you can get one of those rare CPUs, I'm not sure the motherboard would support it...

  • @tkoutnchannel
    @tkoutnchannel23 күн бұрын

    @@aaaalex1994 Oh okay, I see. Maybe a clearer way to word it would be "released to retail". There are retail boards that support them fully, which is also interesting. This info is hard to find, as it's not a configuration anyone would have run. I actually found one of these CPUs in a scrap IBM 1U rack server.

  • @AgentLazarus
    @AgentLazarus23 күн бұрын

    @@aaaalex1994 The motherboard would support it, but with possible crashes or strange anomalies. It would work perfectly fine until it randomly doesn't.

  • @chazbotic
    @chazbotic23 күн бұрын

    From when I worked in industrial computer engineering: Prescott's introduction of SSE3 allowed much better performance in design and visualization software for modelling and CAM packages.

  • @bergePanzer581
    @bergePanzer58123 күн бұрын

    I built my first gaming rig with a 2.4A GHz chip on an ASUS P4S8X with its SiS chipset. I later replaced that with an Abit AI7 and a 3.0C GHz Northwood. It had Hyper-Threading, so two threads at a time. I ran it overclocked at 3.6 GHz using a divider, since the RAM couldn't keep up. God, did that thing fly with its GeForce 4 Ti 4200, and later an ATi Radeon 9800 Pro.

  • @Shane-Phillips
    @Shane-Phillips18 күн бұрын

    I DIY'd my first PC in the P4/Athlon era, and my recollection of the tech discussion of the day is that, outside of the most technical reviewers, the role of IPC in the performance of a CPU core was a lot less talked about and not particularly well understood. Back then it was all about winning the GHz wars, and I suspect that was what Intel wanted; quite similar, really, to how the core wars became a thing for a while after the release of the first Ryzen parts. Looking at the problems Intel have had with their current i7/i9 parts, I don't suspect their corporate ethos has changed a whole lot. Regarding Northwood vs Prescott, the testing Hardware Unboxed performed does suggest cache is really important to gaming performance, so that is likely why gaming was better on Prescott. Considering the importance of a bump from 512 KB to 1 MB, it's surprising it didn't win gaming even harder.

  • @r3n846
    @r3n84617 күн бұрын

    I wonder what the apparent speed difference between the two would be. Which is faster at opening apps and general tasks?

  • @sniglom
    @sniglom12 күн бұрын

    I'm curious about whether CL2 DDR would improve the situation for the Northwood. Less cache means it should be more sensitive to memory timings.

  • @evolucion888
    @evolucion88815 күн бұрын

    The Pentium 4 EE was based on salvaged Xeon CPUs following the Northwood philosophy, with 2 MB of L3 cache. 1) The improved branch predictor helped offset the performance deficit of going deeper with a 31-stage pipeline; that is why the performance of the two wasn't much different. 2) As frequency scaled up, performance gains were more significant on Prescott than on Northwood. 3) The last hurrah for single-core Netburst CPUs was Cedar Mill.

  • @m0l13xXx
    @m0l13xXx24 күн бұрын

    What about Cedar Mill? Is it a refreshed, die-shrunk version of Prescott? I still have mine running today at 3.2 GHz, paired with a Radeon X1900 XT for Windows XP-era games.

  • @NTGTechnology
    @NTGTechnology23 күн бұрын

    It's pretty much Prescott just on Intel's 65nm node.

  • @m0l13xXx
    @m0l13xXx23 күн бұрын

    @@NTGTechnology And is it the same with Gallatin for the EE?

  • @allesbelegt
    @allesbelegt23 күн бұрын

    @@m0l13xXx Gallatin is 130nm; it's the Xeon Gallatin adapted for Socket 478 and 775. The later EEs are just higher-clocked, higher-FSB variants of the Pentium 4 and Pentium D (and the Pentium D EE also has Hyper-Threading enabled).

  • @raphi154farel5
    @raphi154farel520 күн бұрын

    Got a prototype P4 1.5 GHz with massive Rambus memory in the first days of the P4. This PC was a beast performance-wise and stable (yes, we made Rambus stable), but it never shipped to market. Instead, poor users had to use SDRAM with a hot P4 that could not make full use of the Netburst architecture.

  • @mirotuh
    @mirotuh22 күн бұрын

    I had a Pentium 4 630 in my desktop back then and don't recall any overheating at all, although the fans were quite loud sometimes. I didn't mind much and thought it was normal. However, I also had an Acer notebook with a 3 GHz P4 (can't find more detailed specs anymore), and that CPU fried itself after two years at all-stock settings.

  • @shieldtablet942
    @shieldtablet94223 күн бұрын

    About RAM perf, it may not just be cache. Memory parallelism, data prefetch and branch prediction will affect memory tests. I imagine Prescott has these much updated, given the 50% increase in pipeline depth. Some chipsets do data prefetching, like the nForce2, but that won't be the case here. AMD's 65nm also had some issues: it required more voltage and clocked lower, though power consumption wasn't bad. They never managed to get the same clocks out of it that they could from the 90nm chips.

  • @jeffrydemeyer5433
    @jeffrydemeyer543323 күн бұрын

    It's hard to blame them for wanting higher clocks; multithreading was in its infancy, and there was talk of materials research that would push clocks up to 10-20 GHz "soon". I'm really curious about the timeline where multicore didn't happen but 20 GHz did.

  • @laur6405
    @laur640512 күн бұрын

    A few days ago I found and restored a trashed old IBM computer that has a 3.0 GHz, 800 MHz FSB Prescott with HT from the factory. Jesus Christ, that thing runs so hot and the fans ramp up so often it's crazy. I had to recap the board because the caps were blown, probably because of the heat coming from that CPU.

  • @CaptainSpicard
    @CaptainSpicard23 күн бұрын

    I had a Prescott 3.2 GHz in a laptop (HP ZD8000). You can imagine that battery life was nil, and heat output was insane.

  • @BReal-10EC
    @BReal-10EC24 күн бұрын

    I still have a Dell Dimension 2400 (in storage now) that I upgraded to the Northwood 3.06 GHz with HT (from 2.2). It was a good CPU back when those were still fairly new. Heck, the 3.06 P4 remained usable for a long time, iirc. It was just the lack of an AGP slot on that Dell that made me upgrade, for gaming. Intel graphics back then were abysmal, while some ATI onboard graphics were actually usable for light gaming.

  • @Igbf
    @Igbf22 күн бұрын

    I feel that your conclusion was totally on point. The same thing arguably happened with AMD Bulldozer on Socket AM3+: I feel that a die shrink of the existing architecture (Thuban) to 32nm, using the power savings to increase the core count from 6 to 8, would have been much better in every aspect than what ended up being called the FX-8150... but AMD could not afford to waste so much money on R&D, especially at that moment when they were so far behind both technologically and economically. Luckily for consumers they got back on track in the end. But it took years. Years of virtually no competition, allowing Intel to release mediocre quad cores and still stay ahead.

  • @KoleckOLP
    @KoleckOLP22 күн бұрын

    Our family PC used to have a Prescott Pentium 4 650 with HT, and it wasn't a bad chip, but we had a huge cooler on it compared to the i5-650 we got later; that thing even overclocked on the stock cooler :D

  • @Mr_Meowingtons
    @Mr_Meowingtons23 күн бұрын

    Back in the day, I was running a Socket 478 3.4 GHz HT Prescott with 3 GB of RAM and a 7800 GT, playing World of Warcraft. What a time to be alive 😂

  • @kingdbag
    @kingdbag9 күн бұрын

    Northwood was no joke... it was the reason I flipped from the AMD Athlon back over to Intel for a couple of years. I overclocked my 3.4 GHz CPU to 4.5 GHz, all with stock cooling.

  • @SteveChisnall
    @SteveChisnall18 күн бұрын

    Need to make a 3-way comparison with a Gallatin at the same clock-speed as the other two

  • @samohraje2433
    @samohraje243319 күн бұрын

    Back in 2007 I had a motherboard that was labelled "Prescott Ready" - PGA478, of course. It had two extra sets of coils and switching MOSFETs so it would be 100% able to power this heating element. A few months later I found a PC with a Pentium 4 Extreme Edition at a scrap yard. The motherboard was dead but the CPU survived. I swapped it in, tested it, and immediately threw it away because it ran so hot... A catastrophic Pentium 4 lineup. The LGA 775 ones were actually pretty good, but by that time the dual-core CPUs had taken a big lead and proceeded to kill everything behind them, even some AMD CPUs... The P4 on Socket 478 was a tragedy.

  • @wettuga2762
    @wettuga276222 күн бұрын

    I have a bunch of Pentium 4 computers stored away and I avoid them like the plague, but I might take a second look at the ones on a Socket 775 board. Some CPU models between 2.8 and 3.2 GHz were rated at "just" 84 W. Maybe I've got some of those, and the newer platform gives them some advantages over the Socket 478 ones.

  • @sharoyveduchi
    @sharoyveduchi12 күн бұрын

    This is really cool. I was reading up a few days ago on how Russia is making their own fabs and right now they've come up with machines that can do 350nm. They plan to get to 130nm in like 2 years. If they can get to 130nm, then they can make something like the Northwood Pentium 4 for themselves. They have plans to go beyond 130nm of course but I really hope one of them gets the idea to start cranking out low end desktop CPUs for the general market once they reach 130nm. Maybe we can smuggle some of them to the states.

  • @vojtechadame5860
    @vojtechadame586022 күн бұрын

    The Prescott misery really comes into its own with the Celeron D. Those things were as hot as regular P4s and as slow as the older Willamette P4s.

  • @laurelsporter4569
    @laurelsporter456914 күн бұрын

    Not only that, but having to go out to the chipset to talk to one another made them OK at running multiple separate things, but especially poor at a multithreaded application (same as the Pentium D, but worse), right at the time that was starting to become a thing. With decent-sized caches, the Pentium Ds at least weren't a terrible choice if you couldn't get an Athlon 64 X2. Whether the Celeron D or the original Covington Celerons were Intel's worst processors is difficult to determine...

  • @aliensounddigital8729
    @aliensounddigital872923 күн бұрын

    Prescott 2.8 GHz (E) (owned) = the only time I needed water cooling to keep the chip below 140 degrees. No heatsink I tried at the time worked - and this was before I knew what overclocking and underclocking were. The lower-tiered chips were better, though. I'm talking about the Prescott C variant of chips, which were lower clocked, at 2.4 GHz C (owned) to 2.8 GHz C or so. Anything over 2.8 GHz for this platform was super expensive.

  • @TheJuggtron
    @TheJuggtron24 күн бұрын

    I have a P4 EE and a Pentium EE 965 and they're fun to play around with - the 965 with DDR3 is stable as.

  • @madwolf-us4sc
    @madwolf-us4sc23 күн бұрын

    A longer pipeline can sustain high throughput if you can keep it filled; the problem is that on a branch mispredict the pipeline has to be flushed, which lowers IPC. And at that time branch prediction was still poor.

  • @Paxmax
    @Paxmax21 күн бұрын

    Oh yeah, those were the times! I laughed my arse off at Willamette, never bought it, dodged RAMBUS, stayed on P3 and even got some 1.2 GHz Tualatins, and experimented with slot-ket (slocket?) adapters. I jumped on the Northwood train from 1.8 GHz up to 3 GHz, then went with AMD until the nice Core 2 Duos started coming on strong - what a relief that felt like. Today I run a mixed setup, some Intel rigs, some AMD rigs, trying to pick the morsels out of the chaos.

  • @nathanahubbard1975
    @nathanahubbard197523 күн бұрын

    Did you mean to say that some people prefer "interlaced" video at 30fps? Because you were talking about de-interlacing it yourself.

  • @laurdy
    @laurdy23 күн бұрын

    The next generation after Prescott was supposed to have 40-50 pipeline stages and reach 5 GHz+.

  • @taznz1
    @taznz123 күн бұрын

    The P4 was FSB bandwidth bound, due to the RAM and system I/O sharing the same bus; this became more and more evident as Intel increased the CPU multiplier and thus clock speeds. Prescott's large cache was Intel's band-aid fix for the problem. If you want a Northwood to outperform a Prescott, simply get a P4C Northwood and DDR2-1000 RAM and overclock the FSB to 250 MHz (1000 MT/s). I had a Northwood P4 HT 3.0 GHz and technically downgraded it to the better-overclocking P4 HT 2.8C; then, with Corsair DDR2-1000 RAM, I overclocked the FSB to 250 MHz (1000 MT/s), which with the stock 14x multiplier brought the CPU to 3.5 GHz. My flatmate at the time had a 3.6 GHz Prescott P4, and it lost to my system in every benchmark we tested, despite having a 100 MHz advantage and double the cache. The Pentium 4 simply needed a faster FSB and faster RAM to feed it at higher clock speeds.
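
As a quick sanity check of the arithmetic in that comment (the 250 MHz base clock, the quad-pumped bus and the stock 14x multiplier are the figures it cites):

```c
/* The P4's FSB is quad-pumped: base clock x 4 gives transfers per second,
 * and base clock x CPU multiplier gives the core clock. */
#include <stdio.h>

int main(void)
{
    double fsb_mhz    = 250.0;  /* overclocked base clock */
    int    pump       = 4;      /* quad-pumped bus */
    int    multiplier = 14;     /* stock multiplier on the 2.8C */

    printf("Effective FSB: %.0f MT/s\n", fsb_mhz * pump);               /* 1000 MT/s */
    printf("Core clock   : %.1f GHz\n", fsb_mhz * multiplier / 1000.0); /* 3.5 GHz   */
    return 0;
}
```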

  • @complexacious
    @complexacious23 күн бұрын

    I used to have a Northwood P4 3 GHz, but I had RDRAM and not DDR. I remember constantly reading that DDR was "better", and I'm sure the Athlon was perfectly happy with DDR, but the P4 was not. The P4 was either built for, or was rescued by, the benefits of RDRAM, and I feel like any comparison of a P4 running DDR isn't really giving the architecture its best shot. I could easily be completely wrong, but I think the trick to RDRAM was that when the P4 hit a pipeline flush or cache miss, RDRAM could respond faster and get the CPU un-stalled quicker; DDR was better when the CPU was more in control, because it had the throughput but not the random-access performance. But that's just me surmising, based on the fact that DDR is low clock / high throughput, whereas RDRAM ran significantly higher clocks, so you'd think it could set up more transfers per second even if those transfers were smaller.

  • @laurelsporter4569
    @laurelsporter456914 күн бұрын

    The later dual-channel DDR chipsets were good, with fast DDR (like 400 MHz). Intel's deal with Rambus prevented them from offering a DDR chipset for a bit (I think it ended up being close to two years), and a lot of budget DDR boards were single-channel after that. RDRAM could handle more outstanding memory I/Os than SDRAM at the time, so it could get more efficient use out of its bus.

  • @Fahrenheit38
    @Fahrenheit3823 күн бұрын

    I use Socket 478 for most of my Win9x computing. I wonder how they would compare in Win98SE.

  • @myne00
    @myne0023 күн бұрын

    The reason Prescott was hotter was that they had problems with that node leaking too much. This is an old memory, but from what I recall they had issues with the insulator used at the time.

  • @myne00
    @myne0023 күн бұрын

    Your speculation on the clock speed goals is broadly correct. The era is nicknamed the clock speed wars. AMD beat Intel to 1 GHz, and that kind of lit a fire under their ass. They were determined to smash all the next barriers, and they did. But while they were zigging, AMD zagged, first dropping x64 and shortly following with dual core. Intel abandoned Netburst, rejigged the Pentium M into "Core", and retook the lead with the Core 2.

  • @sniglom
    @sniglom12 күн бұрын

    Sure, a Northwood with more cache would probably have been faster than Prescott, but remember where Intel was heading: they wanted to go for 10 GHz with Netburst. For that reason I think of Prescott as a step in that direction - Intel increased the pipeline depth without sacrificing performance. That's a step towards the 10 GHz goal, but perhaps not a meaningful improvement for the end consumer of a Prescott.

  • @e8root
    @e8root23 күн бұрын

    The Pentium D was NOT a failure, because it was cheap and ran on many older P4 mobos. I got a Pentium D 805 + a used Asus board with an Nvidia chipset for ~650 PLN (Polish currency), whereas the cheapest Athlon X2 3600+ cost more than that by itself, and I would have needed a really good mobo - around 500 PLN itself - to have any chance of overclocking. Then the AMD Athlon X2 had an issue with timers on Windows XP (which wasn't even patched when I was using the 805) that caused random glitches in games and forced users to use workarounds like the AMD Dual-Core Optimizer, which just forced games onto a single core. On Core 2 Duo no one ever had any issue with having two cores, and the same was true for the Pentium D. Oh, and the Pentium D 805 with its 533 MHz FSB meant you could rather easily hit 4 GHz on it and have an 800 MHz FSB. Well, I had a cheap tower cooler and ran 3.8 GHz on a 250 W PSU (an FSP - it looked the same as their 400 W). I mean, this PC was absolutely amazing for the price. Spending much more money, I could have had games run slightly better... it was better to put the difference towards a better GPU.

  • @TK199999
    @TK1999993 күн бұрын

    I had a friend who owned a Pentium 4 Prescott; it breathed fire and terrorized a nearby village.

  • @AIM9XSW
    @AIM9XSW23 күн бұрын

    I built two 2.8 GHz Prescott systems (Intel 865) for multiplayer retro gaming, each with a Win98SE/XP dual boot configuration. This provides excellent performance for demanding MS-DOS games while still being fast enough for DOSBox sessions in Windows XP for speed-sensitive games. When paired with Radeon X850-series GPUs, Prescott offered enough performance to run Far Cry, Half Life 2, Star Wars Battlefront II (classic), and UT2004 with no issues at 1280x1024. Also had a Northwood-based Dell Optiplex GX260 with the very stable Intel 845 chipset. The Northwood/845 combo also ate demanding mid-late ‘90s MS-DOS games for lunch, and provided superb compatibility for Windows 98SE. Both are good choices for retro gaming up to about 2006, depending on clock rates, memory, and graphics configurations. If the goal is just Windows 98/98SE, the 845 chipset is well-supported, with zero issues with the chipset driver. The Intel 865 has some issues in Win98 regarding its implementation of the USB 2.0/Enhanced Host Controller Interface, but is an otherwise great choice if ok with USB 1.x. This is an excellent video of the performance comparisons. Thank you for taking the time to film this!

  • @kodato92
    @kodato9218 күн бұрын

    Extra cache did the trick in some benchmarks.

  • @geeknproud321
    @geeknproud32122 күн бұрын

    I had an Athlon XP and a bunch of Athlon 64s and 2.4GHz was normally fairly easy to reach. At that speed they wiped the floor with the majority of Pentium 4s.

  • @reyvoi5413
    @reyvoi54132 күн бұрын

    I remember swapping that Pentium 4 for a Pentium D and ending up with a heater and a PC in one.

  • @matt5721
    @matt57214 күн бұрын

    I rocked a Prescott for years... It did the job. Don't come at my nostalgia like that. It ran WinMX and Snes9x just fine, and I remember it fondly.

  • @ronny332
    @ronny33222 күн бұрын

    The best "P4" were the first Core2Duo or Core2Quad, up to the Q9650, which I own until this day 🙂. They share the same (later) socket 775, but the performance and power consumption is much better. Nice video, well done 🙂 What I want to extend: "Most computers had a bad air cooling solution". Well of course they had, the P3 1.4 never came close to 30 Watts, the normal P3 were at 15-20 Watts. Intel climbed the latter of performance just be raising the power consumption. If you compare the last P3 CPUs with a P4, the performance to watt calculation is bad, very bad. Even for the later models. 100 Watts are more was normal. The need for noisy (thermal controlled) coolers started about in the era of the Athlon XP and P4. Before the computers were also loud, caused by the bad fans, but the power consumption of a CPU was at about 10 Watts.

  • @KrGsMrNKusinagi0
    @KrGsMrNKusinagi016 күн бұрын

    OMG, my Prescott with Hyper-Threading at 3.2 GHz... the front side bus was 800 MHz effective (200 MHz quad-pumped). That system lasted me 10 years and played everything I used it for. In fact, that PC is still running and I use it as my XP machine for older stuff.

  • @laurelsporter4569
    @laurelsporter456914 күн бұрын

    Rambus/RDRAM also did the P4 no favors in the market. It got unfairly destroyed by most DRAM companies because they hated the idea of paying royalties (running hot with high latency was a maturity problem, fixed later on). Then Intel couldn't enable DDR support in the SDRAM chipsets. So early P4s were either expensive and fast, or cheaper but slower than Athlons and Athlon XPs. Once they got DDR support, performance was pretty good across the board.
