Stanford Seminar - New Golden Age for Computer Architecture - John Hennessy

EE380: Computer Systems Colloquium Seminar
New Golden Age for Computer Architecture: Domain-Specific Hardware/Software Co-Design, Enhanced Security, Open Instruction Sets, and Agile Chip Development
Speaker: John Hennessy, 2017 Turing Award Recipient / Chairman, Alphabet
In the 1980s, Mead and Conway democratized chip design and high-level language programming surpassed assembly language programming, which made instruction set advances viable. Innovations like RISC, superscalar, multilevel caches, and speculation plus compiler advances (especially in register allocation) ushered in a Golden Age of computer architecture, when performance increased annually by 60%. In the later 1990s and 2000s, architectural innovation decreased, so performance came primarily from higher clock rates and larger caches. The ending of Dennard Scaling and Moore's Law also slowed this path; single core performance improved only 3% last year! In addition to poor performance gains of modern microprocessors, Spectre recently demonstrated timing attacks that leak information at high rates. We're on the cusp of another Golden Age that will significantly improve cost, performance, energy, and security.
These architecture challenges are even harder given that we've lost the exponentially increasing resources provided by Dennard scaling and Moore's law. We've identified areas that are critical to this new age.
Turing lecture presented at 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA) and published in the Proceedings.
About the Speaker:
John L. Hennessy, Professor of Electrical Engineering and Computer Science, served as President of Stanford University from September 2000 until August 2016. In 2017, he initiated the Knight-Hennessy Scholars Program, the largest fully endowed graduate-level scholarship program in the world, and he currently serves as Director of the program.
His honors include the 2012 IEEE Medal of Honor and the ACM Turing Award (jointly with David Patterson). He is an elected member of the National Academy of Engineering, the National Academy of Sciences, the American Academy of Arts and Sciences, the Royal Academy of Engineering, and the American Philosophical Society. Hennessy earned his bachelor's degree in electrical engineering from Villanova University and his master's and doctoral degrees in computer science from Stony Brook University.
For more information about this seminar and its speaker, you can visit ee380.stanford.edu/Abstracts/...
Learn more: bit.ly/WinYX5
0:00 Introduction
0:28 Outline
1:27 IBM Compatibility Problem in Early 1960s: by the early 1960s, IBM had four incompatible lines of computers!
3:33 Microprogramming in IBM 360 Model
4:47 IC Technology, Microcode, and CISC
10:32 Microprocessor Evolution • Rapid progress in 1970s, fueled by advances in MOS technology, imitated minicomputer and mainframe ISAs • Microprocessor vendors competed by adding instructions (easy for microcode), justified given assembly language programming • Intel iAPX 432: most ambitious 1970s micro, started in 1975
15:12 Analyzing Microcoded Machines 1980s
17:40 From CISC to RISC • Use RAM as an instruction cache of user-visible instructions
19:24 Berkeley & Stanford RISC Chips
20:06 "Iron Law" of Processor Performance: How RISC can win
22:59 CISC vs. RISC Today
25:19 From RISC to Intel/HP Itanium, EPIC IA-64
26:49 VLIW Issues and an "EPIC Failure"
29:18 Fundamental Changes in Technology
31:00 End of Growth of Single Program Speed?
33:28 Moore's Law Slowdown in Intel Processors
33:57 Technology & Power: Dennard Scaling
35:02 Sorry State of Security
36:35 Example of Current State of the Art: x86 • 40+ years of interfaces leading to attack vectors • e.g., Intel Management Engine (ME) processor • Runs firmware management system more privileged than system SW
40:33 What Opportunities Left?
41:59 What's the opportunity? Matrix Multiply: relative speedup to a Python version (18 core Intel)
43:14 Domain Specific Architectures (DSAs) • Achieve higher efficiency by tailoring the architecture to characteristics of the domain • Not one application, but a domain of applications
44:11 Why DSAs Can Win (no magic) Tailor the Architecture to the Domain • More effective parallelism for a specific domain
46:08 Domain Specific Languages
47:14 Deep learning is causing a machine learning revolution
48:27 Tensor Processing Unit v1
48:48 TPU: High-level Chip Architecture
49:55 Perf/Watt TPU vs CPU & GPU
50:34 Concluding Remarks
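The "Iron Law" of processor performance referenced at 20:06 is conventionally written as follows (this is the standard textbook formulation, not a quote from the slides):

```latex
\frac{\text{Time}}{\text{Program}}
  = \frac{\text{Instructions}}{\text{Program}}
    \times \frac{\text{Clock cycles}}{\text{Instruction}}
    \times \frac{\text{Time}}{\text{Clock cycle}}
```

The RISC argument is that a higher instruction count is acceptable if the simpler instructions reduce cycles per instruction and cycle time enough to lower total execution time.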
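The matrix-multiply comparison at 41:59 measures speedups relative to a plain Python baseline. A minimal sketch of what such a baseline looks like (illustrative only; the lecture's actual benchmark code is not reproduced here):

```python
# Naive triple-loop matrix multiply in pure Python: the kind of
# unoptimized baseline that domain-specific hardware and tuned
# libraries achieve orders-of-magnitude speedups over.
def matmul(a, b):
    n, k, m = len(a), len(b), len(b[0])
    assert len(a[0]) == k, "inner dimensions must match"
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += a[i][p] * b[p][j]
            c[i][j] = s
    return c
```

Every element access here goes through interpreted bytecode and boxed floats, which is why vectorized libraries and DSAs can be thousands of times faster on the same computation.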

Comments: 11

  • @crhu319
    3 years ago

    51:55 "re-verticalizing" back to the 1970s, expect to see very specialized stacks again, esp for power control (EV, smart grid) where security has to be ironclad, and standards have to hold across the entire planet, and every country will demand verifiability down to the transistor, and won't accept any assurance even from its allies.

  • @pasamkiranteja6654
    5 years ago

High-level synthesis will hopefully one day achieve its stated goals: matching the corresponding RTL design flow in output performance and efficiency, and enabling more insightful design-space exploration.

  • @kevin2706
    3 years ago

Can you expand on this?

  • @harshitgupta7740
    3 years ago

I think that's what he said at around 46:26: it has been tried and hasn't worked as a viable solution.

  • @farmerwang1973
    1 year ago

This is already true today. Units on NVIDIA GPUs are designed using HLS.

  • @dgillies5420
    5 years ago

I think, though, that the 5-year-old, when asked about the difference between a cat and a dog, would be generating the answer on the fly by diffing a cat and a dog in front of them. Without any pictures nearby, it's a much tougher question to answer. The most widely adopted ML models are still given away for free (think Google picture classification and translation, which is at the level of a 12-year-old with a good dictionary). Also, if you show a cat to a picture classifier you get an answer like 0.83 cat and 0.22 dog, which is a little sketchy. I am skeptical about how soon ML applications will go beyond experimental/free and reach profitability.

  • @ConorDoesItAll
    5 years ago

    Is this a Computer Systems course?

  • @numb20072007
    3 years ago

You may be a reason computer architecture advancements are at a halt. jk.

  • @crhu319
    3 years ago

8:30 Yup, I wrote 6502 assembly code without even an assembler, "POKE"ing it into RAM and then saving the memory to tape. Not small programs either: 2K long.

  • @crhu319
    3 years ago

    38:15 Spectre just showed what garbage x86 is. Anyone who relied on Intel "enclaves" should sue for every new bug they find every month.

  • @crhu319
    3 years ago

    43:09 "C like" scripting languages are the worst of all worlds. LISP derivatives with tail recursion will always mop the floor with them on any math. C++ or D should also, unless the coder is a dummy (many are) and can't use static binding. Stop teaching Java and Python. JavaScript is so much better...and the cloud languages (Haskell, Scala, etc) exist for good reason.