Data Science Lecture 21: Big data [part of the IDS course @RWTH]


This online lecture of the Introduction to Data Science (IDS 2021-2022) course was given by Prof.dr.ir. Wil van der Aalst (www.vdaalst.com, @wvdaalst) at RWTH Aachen University. #datascience
See Introduction to Data S... for the other lectures (listed below).
More about the course Introduction to Data Science given by PADS@RWTH:
Data science has emerged as a new and important discipline. Data science can be viewed as an amalgamation of classical disciplines, such as statistics, data mining, databases, and distributed systems. This combination helps to turn data into value for the benefit of individuals and society. In addition, new challenges are constantly emerging and make this field highly dynamic and appealing. These challenges concern not just size (“Big data”) but also the complexity of the questions to be answered. Data science provides numerous opportunities to develop exciting products and services. With technological evolution, the boundaries of what algorithms can perform will be pushed even further. This development raises significant questions that will be addressed in this course.
The course is mainly focused on data analysis and discusses a substantial range of analytical approaches and tools. All in all, the course aims to provide a comprehensive overview of data science using analytical tools applied to real-life and synthetic datasets.
The course discusses three main parts of data science:
(1) Data science infrastructure concerned with volume and velocity. The topics include instrumentation, big data infrastructures and distributed systems, databases and data management, and programming. The main challenge is to make things scalable and instant.
(2) Data science analysis concerned with extracting knowledge from data. The topics cover statistics, data and process mining, machine learning and artificial intelligence, operational research, algorithms, and data visualization. In this part, the main challenge is to provide answers to known and unknown unknowns.
(3) Data science effects concerned with people, organizations, and society. The topics discuss ethics and privacy, IT laws, human-technology interaction, operations management, business models, and entrepreneurship. Here, the main challenge is to implement data practices in a responsible manner.
#datascience #machinelearning #datamining #processmining #artificialintelligence #RWTHAachenUniversity #RWTH
Visit www.vdaalst.com and www.pads.rwth-aachen.de/ for more information. Also see the #processmining courses given by Prof.dr.ir. Wil van der Aalst.
Lecture 1: Introduction
Lecture 2: Basic data visualization/exploration
Lecture 3: Decision trees
Lecture 4: Regression
Lecture 5: Support vector machines
Lecture 6: Neural networks (1/2)
Lecture 7: Neural networks (2/2)
Lecture 8: Evaluation of supervised learning problems
Lecture 9: Clustering
Lecture 10: Frequent item sets
Lecture 11: Association rules
Lecture 12: Sequence mining
Lecture 13: Process mining (unsupervised)
Lecture 14: Process mining (supervised)
Lecture 15: Text mining (1/2)
Lecture 16: Text mining (2/2)
Lecture 17: Data preprocessing, data quality, binning, etc.
Lecture 18: Visual analytics & information visualization
Lecture 19: Responsible data science (1/2)
Lecture 20: Responsible data science (2/2)
Lecture 21: Big data
Lecture 22: Closing
