Airflow Data Pipeline with AWS and Snowflake for Beginners | Project

Build a data pipeline in Airflow and the Astro SDK that interacts with AWS and Snowflake.
You can find the text version of this video and the original DAG here:
astro-sdk-python.readthedocs....
Materials:
➡️ orders_data_header.csv
order_id,customer_id,purchase_date,amount
ORDER1,CUST1,1/1/2021,100
ORDER2,CUST2,2/2/2022,200
ORDER3,CUST3,3/3/2023,300
➡️ Env vars
AIRFLOW__CORE__ENABLE_XCOM_PICKLING=True
AIRFLOW__ASTRO_SDK__SQL_SCHEMA=ASTRO_SDK_SCHEMA
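
Both variables follow Airflow's AIRFLOW__{SECTION}__{KEY} naming convention for overriding configuration options. As a minimal sketch (the helper name parse_airflow_env_var is hypothetical, not part of Airflow), this shows which config section and key each variable targets:

```python
def parse_airflow_env_var(name):
    """Split an AIRFLOW__{SECTION}__{KEY} variable name into (section, key).

    Hypothetical helper for illustration only; Airflow does this mapping
    internally when it reads its configuration.
    """
    prefix = "AIRFLOW__"
    if not name.startswith(prefix):
        raise ValueError(f"{name!r} does not start with {prefix!r}")
    # The first double underscore after the prefix separates section from key.
    section, sep, key = name[len(prefix):].partition("__")
    if not sep or not section or not key:
        raise ValueError(f"{name!r} is not of the form AIRFLOW__SECTION__KEY")
    return section.lower(), key.lower()


if __name__ == "__main__":
    for var in ("AIRFLOW__CORE__ENABLE_XCOM_PICKLING",
                "AIRFLOW__ASTRO_SDK__SQL_SCHEMA"):
        print(var, "->", parse_airflow_env_var(var))
```

So the first variable sets enable_xcom_pickling in the core section, and the second sets sql_schema in the astro_sdk section.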
➡️ SQL requests
CREATE DATABASE ASTRO_SDK_DB;
CREATE WAREHOUSE ASTRO_SDK_DW;
CREATE SCHEMA ASTRO_SDK_SCHEMA;
CREATE OR REPLACE TABLE customers_table (customer_id CHAR(10), customer_name VARCHAR(100), type VARCHAR(10));
INSERT INTO customers_table (CUSTOMER_ID, CUSTOMER_NAME, TYPE) VALUES
('CUST1','NAME1','TYPE1'),
('CUST2','NAME2','TYPE1'),
('CUST3','NAME3','TYPE2');
CREATE OR REPLACE TABLE reporting_table (CUSTOMER_ID CHAR(30), CUSTOMER_NAME VARCHAR(100), ORDER_ID CHAR(10), PURCHASE_DATE DATE, AMOUNT FLOAT, TYPE CHAR(10));
INSERT INTO reporting_table (CUSTOMER_ID, CUSTOMER_NAME, ORDER_ID, PURCHASE_DATE, AMOUNT, TYPE) VALUES
('INCORRECT_CUSTOMER_ID','INCORRECT_CUSTOMER_NAME','ORDER2','2/2/2022',200,'TYPE1'),
('CUST3','NAME3','ORDER3','3/3/2023',300,'TYPE2'),
('CUST4','NAME4','ORDER4','4/4/2022',400,'TYPE2');
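
To sanity-check the sample materials locally before touching Snowflake, here is a rough sketch that loads the CSV rows and the customers rows into an in-memory SQLite database and joins them on customer_id. This is only an illustration: SQLite stands in for Snowflake, the join is an assumption about how the two datasets relate, and it is not the DAG from the video.

```python
import csv
import io
import sqlite3

# Sample rows copied from orders_data_header.csv above.
ORDERS_CSV = """order_id,customer_id,purchase_date,amount
ORDER1,CUST1,1/1/2021,100
ORDER2,CUST2,2/2/2022,200
ORDER3,CUST3,3/3/2023,300
"""

# Rows copied from the INSERT INTO customers_table statement above.
CUSTOMERS = [
    ("CUST1", "NAME1", "TYPE1"),
    ("CUST2", "NAME2", "TYPE1"),
    ("CUST3", "NAME3", "TYPE2"),
]


def join_orders_with_customers(orders_csv=ORDERS_CSV, customers=CUSTOMERS):
    """Join the sample orders with the sample customers on customer_id."""
    conn = sqlite3.connect(":memory:")  # SQLite as a local stand-in
    conn.execute(
        "CREATE TABLE customers_table "
        "(customer_id TEXT, customer_name TEXT, type TEXT)"
    )
    conn.executemany("INSERT INTO customers_table VALUES (?, ?, ?)", customers)
    conn.execute(
        "CREATE TABLE orders "
        "(order_id TEXT, customer_id TEXT, purchase_date TEXT, amount REAL)"
    )
    orders = [
        (r["order_id"], r["customer_id"], r["purchase_date"], float(r["amount"]))
        for r in csv.DictReader(io.StringIO(orders_csv))
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)
    # Column order mirrors reporting_table for easy comparison.
    rows = conn.execute(
        "SELECT c.customer_id, c.customer_name, o.order_id, "
        "o.purchase_date, o.amount, c.type "
        "FROM orders o JOIN customers_table c "
        "ON o.customer_id = c.customer_id "
        "ORDER BY o.order_id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in join_orders_with_customers():
        print(row)
```

Comparing the printed rows with the reporting_table inserts makes the deliberately wrong ORDER2 row easy to spot.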
Enjoy 🔥
Ready?
Let's go!

Comments: 47

  • @user-vb7im1jb1b
    9 months ago

    Thanks Marc! Great Tutorial!

  • @MarcLamberti (reply)
    9 months ago

    You’re welcome 🫶

  • @steffot8468
    10 months ago

    Thanks man, very much appreciated.

  • @MarcLamberti (reply)
    10 months ago

    You’re welcome

  • @user-rx5ry2ky6l
    1 year ago

    Learned the easiest way to build a dev env for an Airflow data pipeline. Great!!

  • @user-bl7dy8fg7t
    1 month ago

    Still works 😄. Really cool pipeline.

  • @MarcLamberti (reply)
    1 month ago

    Good to know 🥹

  • @MarcLamberti
    10 months ago

    For those who don't see the host field anymore: in the account field, enter youraccountnumber.yourregion.yourcloud, for example nb71231.eu-west-3.aws. Basically, take everything between https:// and .snowflakecomputing.com in your Snowflake URL, and leave the region field empty. Enjoy
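
As a small sketch of that rule (the helper name and the full URL below are illustrative assumptions built from the example above), the account value can be pulled out of a Snowflake URL like this:

```python
from urllib.parse import urlparse

SNOWFLAKE_SUFFIX = ".snowflakecomputing.com"


def account_from_snowflake_url(url):
    """Return the part of the hostname before .snowflakecomputing.com.

    Hypothetical helper: it only does string handling and never contacts
    Snowflake.
    """
    host = urlparse(url).hostname or ""
    if not host.endswith(SNOWFLAKE_SUFFIX):
        raise ValueError(f"{url!r} is not a snowflakecomputing.com URL")
    account = host[: -len(SNOWFLAKE_SUFFIX)]
    if not account:
        raise ValueError(f"{url!r} has no account part")
    return account


if __name__ == "__main__":
    # Example URL assembled from the account value quoted in the comment.
    print(account_from_snowflake_url(
        "https://nb71231.eu-west-3.aws.snowflakecomputing.com"
    ))
```

The returned string is what goes into the account field of the Airflow connection, with the region field left empty.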

  • @ornachshon1
    7 months ago

    What is the best way to pass a CSV between tasks? For example: one function parses JSON into a CSV, and a second function uploads the CSV to an S3 bucket.

  • @datalearningsihan
    1 year ago

    I was struggling with the Airflow installation, so I purchased your Udemy course. Hoping I will get some better support.

  • @MarcLamberti (reply)
    1 year ago

    Keep me posted ;)

  • @datalearningsihan (reply)
    1 year ago

    @MarcLamberti It did not really help. I asked Udemy for a refund. I had issues installing Airflow your way: my CPU was maxing out, and nothing really worked even after I managed to install Airflow the way you recommend. It was a bad first impression of the course, so I had to ask for a refund. Sorry.

  • @MarcLamberti (reply)
    1 year ago

    @datalearningsihan You don't have to be sorry. I believe your issue is more related to Docker than to Airflow or the course. Check that you have enough memory. Otherwise, you can still install Airflow manually with pip install.

  • @mellownun9220
    1 year ago

    Is there a benefit to using Airflow instead of Snowpipe for this purpose?

  • @alejandroflorian9574 (reply)
    4 months ago

    Imagine needing to consume and migrate not just a single table, but over 100. You'd have to create 100 pipes for inserting the data. Now, with Airflow, it's easier to customize and scale this process.

  • @YEM_
    3 months ago

    How do we manage connection credentials without the UI? I mean deploying them as code with a reference to a secrets manager.

  • @Aman-lv2ee
    2 months ago

    Thanks Marc. I am facing this error when connecting to Snowflake from Airflow. Airflow is running in Docker Compose (the file you provided in the Udemy course). ERROR - 250001: 250001: Could not connect to Snowflake backend after 2 attempt(s).Aborting. I checked all the parameters but I am still facing this issue (Airflow version: v2.8.1).

  • @kurihama4629
    1 year ago

    Anyone else having issues with the Snowflake connection? I followed everything but it doesn't seem to work. I'm not even sure how to find out what went wrong.

  • @aldoaguirre9864 (reply)
    1 year ago

    Yeah, same problem for me: 250001: 250001: Could not connect to Snowflake backend after 0 attempt(s).Aborting

  • @awallaustin
    10 months ago

    Can you check on creating the connection from Airflow to Snowflake? The interface has changed slightly and now I'm unable to create a connection. I've verified that all parameters are correct, and yet the test is still failing.

  • @isaachernandez3094 (reply)
    10 months ago

    Yes I have the same issue

  • @MarcLamberti (reply)
    9 months ago

    I’ve just released a new video that shows how to make that connection kzread.info/dash/bejne/i46IxauiZdKddqw.htmlsi=8-8-Q8LUasYfz2V0

  • @salilmarathponmadom7255
    1 year ago

    At the SQL requests step, I had to run the queries that create the warehouse and the schema separately, because I later hit a "No active warehouse selected in the current session" error when trying to insert values into the table. Also, in the Airflow UI, I don't have the Amazon S3 option under Connections!

  • @MarcLamberti (reply)
    1 year ago

    Use the AWS option for the connection instead. Thanks for sharing

  • @aminemaasri2622
    8 months ago

    Hi Marc, do I need to run astro dev start again when I create a new DAG in the dags folder?

  • @MarcLamberti (reply)
    8 months ago

    Nope

  • @ruchipandey9721
    1 year ago

    I'm unable to see Amazon S3 in Airflow on localhost. Can you please help me with that?

  • @MarcLamberti (reply)
    1 year ago

    Did you install the Amazon provider?

  • @Yonatanx3 (reply)
    1 year ago

    Hi Ruchi, I'm facing the same issue. Did you manage to solve this? Thanks

  • @MarcLamberti (reply)
    1 year ago

    @Yonatanx3 Use Amazon Web Services for the connection type ;)

  • @NardeepML
    9 months ago

    Hi, when creating connections in Airflow, the Test button is greyed out and says 'Testing connections is disabled in Airflow configuration. Contact your deployment admin to enable it'. Can you please help me enable testing? I can see in the config that it's set to disabled; I just need to know how to switch it. Thanks

  • @MarcLamberti (reply)
    9 months ago

    Yes. That was introduced in 2.7. Set the AIRFLOW__CORE__TEST_CONNECTION configuration setting to Enabled.

  • @kkampassi4820
    10 months ago

    For me there is no option to add the host URL when Snowflake is the connection type. Please suggest something.

  • @MarcLamberti (reply)
    10 months ago

    You need to install the apache-airflow-providers-snowflake==4.4.0 provider

  • @kkampassi4820 (reply)
    10 months ago

    @MarcLamberti I tried, but it is still not working. Could you please share the Git repo for the entire process? It would be a great help for us.

  • @MarcLamberti (reply)
    9 months ago

    @kkampassi4820 Look at the pinned comment :) I will release a video tomorrow that also uses Snowflake, with the updated way.

  • @sampyism
    5 months ago

    I couldn't find the "Amazon S3 connection" in the Airflow UI. What's going on?

  • @sampyism (reply)
    5 months ago

    Can someone explain how I can install the S3 provider package?

  • @alex45688
    3 months ago

    I can't see the Amazon S3 connection type in the Airflow web UI.

  • @MarcLamberti (reply)
    2 months ago

    It’s AWS now

  • @alex45688 (reply)
    2 months ago

    @MarcLamberti ok

  • @konnen4518
    1 year ago

    I just can't stand that accent

  • @MarcLamberti (reply)
    1 year ago

    Me too 🤢

  • @akj3344 (reply)
    1 year ago

    @MarcLamberti I love your accent. Don't listen to ungrateful morons.

  • @MarcLamberti (reply)
    1 year ago

    @akj3344 Thank you 🙏

  • @konnen4518 (reply)
    1 year ago

    @akj3344 eat deek