Stanford Seminar - Big Data is (at least) Four Different Problems
"Big Data is (at least) Four Different Problems" - Mike Stonebraker of MIT
Support for the Stanford Colloquium on Computer Systems Seminar Series is provided by the Stanford Computer Forum.
Speaker Abstract and Bio can be found here: ee380.stanford.edu/Abstracts/1...
The Colloquium on Computer Systems Seminar Series (EE380) presents current research in the design, implementation, analysis, and use of computer systems. Topics range from integrated circuits to operating systems and programming languages. It is free and open to the public, with new lectures each week.
Learn more: bit.ly/WinYX5
0:00 Introduction
1:08 The Meaning of Big Data - 3 V's
1:52 Big Volume - Little Analytics
8:34 The Big Disruption
9:59 Data Science Template
10:42 Complex Analytics on Array Data - An Accessible Example
12:42 Array Answer
13:27 (1st option)
14:33 Map-Reduce
18:38 The Future of Hadoop
20:29 (2nd option -- 2015)
24:20 (3rd option)
26:14 (4th option)
27:37 The Future of Complex Analytics, Spark, R, and ....
31:17 Big Velocity - 2nd Approach
33:44 In My Opinion....
35:14 Possible Storm Clouds
37:23 Big Variety
39:57 Traditional Solution -- ETL
46:09 And there is NO Global Data Model
47:20 Why Integrate Silos?
47:39 Why is Data Integration Hard?
48:49 Data Integration (Curation) AT SCALE is a VERY Big Deal
49:08 A Bunch of Startups With New Ideas
49:25 To Achieve Scalability....
50:03 Data Lakes
50:43 Take away
Comments: 5
The problem I find with Stonebraker talks generally is that it is never quite possible to separate the professorial objectivity from the vested interests that come with being CTO / Chief Scientist of data management companies with definite points of view.
NoSQL = "NOt yet SQL".. the best definition I heard :D
This is so inspiring.
A very good presentation
What technical experts don't account for is that people are too lazy to move their data from AWS S3