AWS re:Invent 2018: Building Serverless Analytics Pipelines with AWS Glue (ANT308)

Organizations need to gain insight and knowledge from a growing number of IoT, API, clickstream, unstructured, and log data sources. However, they are often limited by legacy data warehouses and ETL processes that were designed for transactional data. In this session, we introduce key ETL features of AWS Glue and cover common use cases ranging from scheduled nightly data warehouse loads to near real-time, event-driven ETL pipelines for your data lake. We also discuss how to build scalable, efficient, and serverless ETL pipelines using AWS Glue. Please join us for a speaker meet-and-greet following this session at the Speaker Lounge (ARIA East, Level 1, Willow Lounge). The meet-and-greet starts 15 minutes after the session and runs for half an hour.

Comments: 6

  • @hxz116 · 4 years ago

    Mehul Shah gets it right. He explains Glue very well.

  • @varunsood8509 · 5 years ago

    How do you manage incremental loading when reading data from a relational database? Is there a way to apply a filter or parameter on a date column when reading from the source database? Thank you!

  • @ND-gn8tc · 3 years ago

    You can do that by choosing a bookmark column (a last-updated date, for example). It is a good idea to make sure that column is indexed in your source database.
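
    The bookmark-column idea can be sketched roughly as follows. This is not the AWS Glue API; it is a minimal plain-Python illustration using the standard-library sqlite3 module, and the table name `orders`, the column `last_updated`, and the helper `read_new_rows` are all hypothetical names chosen for the example. It assumes the bookmark column holds ISO-formatted dates, so string comparison matches date order.

```python
import sqlite3


def read_new_rows(conn, last_bookmark):
    """Return rows changed after last_bookmark, plus the updated bookmark."""
    cur = conn.execute(
        "SELECT id, last_updated FROM orders "
        "WHERE last_updated > ? ORDER BY last_updated",
        (last_bookmark,),
    )
    rows = cur.fetchall()
    # The newest value seen becomes the bookmark for the next run;
    # if nothing new arrived, the bookmark stays where it was.
    new_bookmark = rows[-1][1] if rows else last_bookmark
    return rows, new_bookmark


# Hypothetical source table standing in for the relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, last_updated TEXT)")
# Index the bookmark column so the range filter does not scan the whole table.
conn.execute("CREATE INDEX idx_orders_last_updated ON orders (last_updated)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2018-11-01"), (2, "2018-11-15"), (3, "2018-11-28")],
)

# First run picks up only rows newer than the stored bookmark.
first_batch, bookmark = read_new_rows(conn, "2018-11-10")
# A second run with the advanced bookmark finds nothing new.
second_batch, bookmark_after = read_new_rows(conn, bookmark)
```

    Each run reads only rows past the stored bookmark and then advances it, which is why an index on that column matters: the filter becomes a range lookup instead of a full table scan.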

  • @sundaraanga3916 · 4 years ago

    The explanation of the Glue execution model is not clear.

  • @bestentertainment2728 · 5 years ago

    Another pretty PowerPoint-slides webinar! No real how-to demo! Typical AWS.
