Stanford Seminar - Programming Tools for the Future of Data Science, Sarah Chasins

Sarah Chasins is an Assistant Professor at the University of California, Berkeley.
This talk was given on January 21, 2022.
In the future, anyone will be able to write programs that are currently the exclusive domain of advanced programmers. For now, there's still a big gap between the programming skills of occasional programmers - social scientists, journalists, data scientists - and the skills required to write the programs they want. However, the need is pressing; while there are about 20 million programmers in the world, there are now at least twice as many end users writing code to work with data. In this talk, I'll describe Helena, an ecosystem of programming languages and programming tools that I have used to study how we can support social scientists' programming needs. Non-programmers use Helena to collect datasets from the web and, more broadly, to develop custom web automation programs. It brings together the following key innovations: (i) The Helena programming environment uses Programming by Demonstration (PBD); it takes a single-shot learning approach, synthesizing scripts based on recording a single user demonstration. (ii) Helena's adaptive replayer makes scripts robust to webpage redesigns and obfuscation, which enables longitudinal experiments. (iii) With novel language constructs, non-coders can conduct programming tasks usually limited to expert programmers - e.g., failure recovery, parallelization.
Building Helena demanded novel insights into the web automation domain, but it also required a new design approach, a tightly coupled union of techniques from Programming Languages (PL) and Human-Computer Interaction (HCI). I'll connect this work to a discussion about how my lab is bringing together techniques from PL and HCI and why the PL-HCI combination is so powerful for democratizing computation.
Learn more about Stanford's Human-Computer Interaction Group: hci.stanford.edu
Learn about Stanford's Graduate Certificate in HCI: online.stanford.edu/programs/...
View the full playlist of Stanford Seminars here: • Stanford CS547 - Human...
#datascience
0:00 Introduction
1:17 Bridge the gap
2:15 My background
2:47 Agenda
3:19 Framing
8:28 Program Synthesis
9:03 Pop quiz
9:36 Pop quiz 2
10:42 How to get to a better position
11:42 What we will talk about
12:03 How many people have written a web scraper
13:09 Housing voucher programs
14:45 End user web automation
15:52 Web automation programming
16:43 Why is this so hard
22:01 Web automation demo
