Running Airflow 2.0 with Docker in 5 mins
Airflow 2.0 is out! How does it work, what are the new features, and what can you do with your DAGs? To answer all those questions, you need to run Airflow 2.0.
What is the easiest and fastest way to do it?
By using Docker!
Let's discover how to run Apache Airflow 2.0 with the CeleryExecutor locally by using Docker!
The docker-compose file:
airflow.apache.org/docs/apach...
Comments: 203
Wow, it's amazing how far the Airflow team has come with this. Thanks Marc!
@MarcLamberti
3 years ago
Thanks Mike
arrgh .. spent about 3h trying to figure this out, basically all the online-instructions missed one small bit or another ... with your instructions, le voila, it works straight up. Thanks a lot!
I finally got it up and running! Thank you, Marc!
Awesome! Thanks for the tutorial, Marc!
This instruction really helps me thank you so much !
Thank you! very efficiently and clearly explained !
The only one of many tutorials that actually helped! Thanks
I struggled to get airflow running for a long time and this short video helped me SO MUCH, thank you!!
@MarcLamberti
1 year ago
Happy to help!
Thank you VERY MUCH!! Marc. This video is very useful.
Thanks marc to sort the installation part of airflow👏👏👏👏
Awesome Marc, thanks for sharing
super amazing job thank u!
this is so much fun and informative, thanks.
Thanks Marc, It helped me a lot!
I love this!!! thanks man!
excellent walkthrough,,, Thanks :)
This is an awesome tutorial ! Thanks a lot~
Thank you Marc. I was in a hurry to find out how to run Airflow and kept failing somehow. However, with your nice clear explanation, nothing is mysterious anymore~
thanks. you help me a lot.
Thanks for the video!
Thanks, simple and clear ;)
Thanks Marc ! awesome tutorial.
@MarcLamberti
3 years ago
Thank you 😁
VERY EASY TO UNDERSTAND
Thank you so much, Marc. Great content! TIL Using YAML aliases and
@MarcLamberti
3 years ago
Love it too 😁
Thanks Marc!!! Great Job!
@MarcLamberti
3 years ago
Thanks Alex
Awesome man!
Amazing! Thank you very much!
@MarcLamberti
3 years ago
Glad you like it!
Fabulous! Thanks Marc! I have installed Airflow 2.0 successfully. The webserver failed to start on my Mac, but after I increased the memory to 4GB, it works.
@afolakebaiyewu476
2 years ago
how did u increase your mac memory
Thank you so much!
Great job boy. keep it up.❤
THANK YOU
Hi Marc, thanks a lot for this!! :)
@MarcLamberti
3 years ago
pleasure :)
neat & clean! thanks!!
@MarcLamberti
6 months ago
Thank you ❤️
I always try to find a docker image to perform experiments. Thanks for providing a reference that I can refer anytime in future.
@MarcLamberti
3 years ago
Here it is 😁
Hi Marc, great video. Just wondering if you could show us how to install a triggerer into your airflow stack using docker compose? Thanks!
Great video Marc, I'm subscribing!
@MarcLamberti
2 years ago
🙌🙌🙌🙌
Marc, Big fan of your content. Can you make a video for deploying Airflow 2.0 (with Celery executor) on Azure Containers?
For anyone coming here in 2024 and beyond: on Linux, specifically Ubuntu, remember to use `docker compose up airflow-init` (`docker compose` with a space instead of `docker-compose`).
Thaaaaanks, i have been having issues with running airflow and now it worked!!! Ill now be able to automate tasks and be lazier lol,
@MarcLamberti
3 months ago
Let’s gooooo
This is super useful... Thank you. One question: Can I use it on an AWS instance. How should I configure the security group and firewalls.
Nice 😊
Thanks Marc for this video. Question: Do I need to run everytime I spin the containers?
Great video Marc. It's sort of crazy how easy that was (even on WSL2)... Thank you. Some things I'm still considering afterwards:
1) Is this enough for a production deployment of Airflow if the database was decoupled from the rest of the container? If the container crashed for whatever reason all of the connections would be lost, so separating is a good idea.
3) For local testing/debugging of an instance I'm going to try and mount DAGs that exist in another project folder instead of the one that we created.
3b) For local testing I might also try and store connection details in environment variables in the .env file rather than relying on the persistence of the database.
@senhajirhazihamza7718
2 years ago
Docker compose is not enough for production, but you can take the same components (containerized) and use kubernetes to go to production
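A minimal sketch of the idea in 3b) above, assuming a local test setup. Airflow can read a connection from an environment variable named `AIRFLOW_CONN_<CONN_ID>` holding a connection URI. The project directory `./airflow-local`, the connection id `my_postgres` and the URI are hypothetical placeholders, not values from the video.

```shell
# Hypothetical project directory; point this at the folder holding docker-compose.yaml.
project_dir="${AIRFLOW_PROJECT_DIR:-./airflow-local}"
env_file="$project_dir/.env"
mkdir -p "$project_dir"
touch "$env_file"

# Placeholder connection: id "my_postgres", dummy credentials and host.
conn_line='AIRFLOW_CONN_MY_POSTGRES=postgres://user:password@host:5432/dbname'

# Append the line only if it is not already there, so re-running is harmless.
grep -qxF -- "$conn_line" "$env_file" || echo "$conn_line" >> "$env_file"

cat "$env_file"
```

Note that Compose reads `.env` for variable substitution in the compose file. The containers only receive the variable if the service also lists it under `environment:` or loads the file with `env_file:`, so one of those would still need to be added to the Airflow services.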
Hello Marc thanks for the videos it's great, I have a question for you how can we version the dag in production ?
@MarcLamberti
3 years ago
Right now, the only way is to change the dag id with the version. For example, my_dag_v1.0.0, my_dag_v1.0.1 and so on. DAG versioning is coming soon but not yet available
Hi Marc, thanks for your sharing! I'd like to know how to install third-party modules. When I installed yfinance module, there was a dag import error : no module named yfinance.
Great! but where can i locate the requirements.txt to add for example the apache-airflow-providers-snowflake?
Great tutorial. Old but relevant. Thanks! Marc, I am using Visual Studio Code and every time I want to save my DAG file, I need to click a button "Retry as Sudo". Can you tell me what to do here? It is quite annoying! Regards!
Thanks for the wonderful tutorial. I understood that a DAG file stored in the DAG folder will be added to Airflow. But what happens if Airflow is running in a remote Docker host that I only have web access to? Can I upload a DAG from my local disk to the remote? Or is there another way to do it?
Kindly demonstrate on Teradata and keycloak containerisation
Hello Marc, Your videos are always great and helpful and with your video I get Airflow running quite well. The only trouble is that I need to run java within docker and I have not found any good description of how to get this working. I am starting a shell script that starts a java runtime within the terminal. Could you give me some help on how to get this running? Thanks, Armin
Hello sir, how can we launch every task of etl in a different container as we do via k8s pod operator to launch every task of dag in a different pod?
Hello Marc, I have recently installed Airflow 2 using the docker compose file as suggested in this video. But when I enhanced the DAG with multiple connections, i.e. Gdrive->S3, S3->Snowflake, Snowflake->S3 operations using pyspark and sql scripts, the webserver keeps restarting and at times shows unhealthy. Can you please suggest or advise what could have gone wrong, or should I consider increasing Docker memory?
Hi Marc, I have used docker compose to install airflow. However, the sample dags seems not to work for me and I found no logs.
It works on Windows and Mac, I tried both and it works, thanks (on Windows with some tricks)
@anjanashetty482
2 years ago
Please can you share the tricks on windows..I tried on windows its not working for me. Please do reply will be very helpful
@kikecastor
2 years ago
@@anjanashetty482 for windows use the wsl tool to run the commands described in the video
@anjanashetty482
2 years ago
@@kikecastor Thanks for your response Armonia. I was able to install airflow 2 with wsl but when I create a dag and try to debug in VS I am getting error : ModuleNotFoundError: No module named 'airflow'
@kikecastor
2 years ago
@@anjanashetty482 are you in the correct environment?
@anjanashetty482
2 years ago
@@kikecastor Yes I am, do I have to explicitly do pip install apache-airflow
Hi Marc, great tutorial. Airflow is running without problems. I tried to use VS Code with Airflow and found your new video "Configure VS Code to Develop Airflow DAGs with Docker at ease!" However, I don't understand where the Dockerfile comes into the picture. Can you please elaborate? ---> "Reopen in Container" looks totally different from your video. Thanks
Thanks a lot Marc. Can you also please make a video for deploying Airflow 2 using helm chart? and go over the options on values.yaml file? Thanks in advance
@MarcLamberti
3 years ago
Coming but there are some issues right now with the Helm chart 😬
@khjomaa
3 years ago
@@MarcLamberti Thanks!
It was great video thanks. How would i push my custom airflow python file into docker container?
Is there available docker-compose with mysql? Can u please share link if you have
First of all, thank you so much for this *awesome* video. It is really helpful. I followed this tutorial and was able to access AirFlow seamlessly. But I want to have apache-airflow-providers operators. So, I tried giving them in _PIP_ADDITIONAL_REQUIREMENTS and also building using Dockerfile. But nothing worked and I still see "error: command 'gcc' failed with exit status 1". I changed airflow image to 2.1.2-python3.7 as slim versions don't include extra libraries. But no luck. Could you help me resolve this issue?
For those that have a Mac and install Docker Desktop, you will not need to install Compose separately. It comes with Docker Desktop
Hi Marc. I bought your course, but I get an error when running the BashOperator that inserts data into the user table. I have tried the command alone in the console and it works, but when I use it inside my DAG in my BashOperator I get this error: bash command failed. The command returned a non-zero exit code. I have tried a lot but I still can't find a solution for this.
Any tips for this issue: AirflowException('Celery command failed on host:
Hi Marc I installed docker desktop at windows using Ubuntu wsl. I changed the dags directory path in .yaml file to my c:\ drive folder in windows. When I start web UI, it doesn't pick my dags.py file. what can be the issue.
how do you create a celery executor if you haven't specified a dedicated backend mysql or Postgres db in the yaml file?
Localhost:8080 aint opening for me. How to check the logs for any issues?
Is this usable in production? Could you create a production setup?
How do we install providers after installing airflow on docker
How can I install required packages to docker or how can I mentioned required packages in .yaml file
How can we add and setup airflow.cfg file inside project folder?
When I followed the steps, I got the error "manifest file not found" during installation. I have seen this error reported by others as well. I changed the image to 2.0.1 and it worked, but later I got an error about an old version being used.
Thank you for the awesome tutorial. I do have one question though: how can I install python packages with docker-compose when creating the containers? for example I would like to install Pymongo.
@ramsescoraspe
3 years ago
Hi, I would recommend use PythonVirtualenvOperator
@derzemel
3 years ago
@@ramsescoraspe yes, but how do I install a python library like PyMongo, or OpenCV in the container? PythonVirtualenvOperator allows for functions/methods to be created including the module imports they need and then they are destroyed, but I do not have those modules installed in the container. Until now, each time I installed Python modules in containers I did it with a help of a Dockerfile (e.g. inside the Dockerfile I enter "RUN pip install opencv-python") but it is not clear to me how to do the same using a docker-compose.yaml file. Edit: figured it out: I had to add a pointer to the Dockerfile in the docker-compose.yaml
3 years ago
@@derzemel hi, how do you do this ? (add a pointer)
@derzemel
3 years ago
@ in my case, the airflow webserver service is built like this (the Dockerfile is in the same dir as the compose):
airflow-webserver:
  build:
    context: .
    dockerfile: Dockerfile
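The approach described in this thread can be sketched roughly as follows, assuming the Dockerfile sits next to docker-compose.yaml. The project directory `./airflow-local`, the base image tag and the package choice are assumptions for illustration; the base image should match whatever tag your compose file uses.

```shell
# Hypothetical project directory; point this at the folder holding docker-compose.yaml.
project_dir="${AIRFLOW_PROJECT_DIR:-./airflow-local}"
mkdir -p "$project_dir"

# Write a small Dockerfile that extends the Airflow image with extra Python packages.
cat > "$project_dir/Dockerfile" <<'EOF'
# Assumed base tag; use the same tag as the image in your docker-compose.yaml.
FROM apache/airflow:2.0.1
# Example package from the question above; replace with what your DAGs import.
RUN pip install --no-cache-dir pymongo
EOF

cat "$project_dir/Dockerfile"

# The compose service then points at this Dockerfile instead of a prebuilt image:
#   build:
#     context: .
#     dockerfile: Dockerfile
# and the image is rebuilt with: docker-compose build
```

The comment only shows the webserver service. With the CeleryExecutor the tasks run on the workers, so the worker and scheduler services most likely need to be built from the same Dockerfile as well.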
@ramons.g5135
2 years ago
@@derzemel do you know how I can add Airflow dependencies inside the docker-compose.yaml file? Also, is there a way to provide access to my AWS resources, such as S3, either in the yaml file or in the Airflow UI?
I have many problems using PythonVirtualenvOperator or ExternalPythonOperator inside docker because you must include system site packages as True (it creates conflicts between venv and base python libraries) or otherwise you will get "ERROR: Can not perform a '--user' install. User site-packages are not visible in this virtualenv"
While running Airflow 2 via Docker Compose(Just like the above video), I am unable to successfully execute DockerOperator tasks. Can you enlighten with a video reference or doc reference about how to properly configure Airflow Docker compose file or Docker Operator to run tasks
When I got to the docker-compose up airflow-init, I get the following error: "Python-dotenv could not parse statement starting at line 1 Traceback (most recent call last): File "docker\api\client.py", line 214, in _retrieve_server_version File "docker\api\daemon.py", line 181, in version File "docker\utils\decorators.py", line 46, in inner " A few dozen more error lines afterwards, but I can't make it work so far
Hi Marc! Could I perform these steps without problems on a Raspberry Pi?
I got all of this running, but once I add a new py in dags it doesnt show on the airflow interface. Anyone had the same issue?
Question: what's the recommended way to increase the number of celery workers using docker compose? Say from 2 workers to 10 workers? Copy&paste worker keys in docker compose yaml files?
@MarcLamberti
3 years ago
No, use docker-compose up --scale airflow-worker=10 :)
@albertlee9592
3 years ago
@@MarcLamberti Thank you for your quick reply! TIL docker-compose up --scale!
I installed everything and could not open the localhost:8080. Tried many times and safari said that "safari cannot open the page. The server dropped the connection. This happens when the server is busy" why does that happen?
where do I need to write the command at 2:07?
Anyone facing problem in getting the logs displayed in the UI? Clicking a task --> Log --> gives me a blank log frame. However, it allows me to download it to my machine.
Can someone explain to me why we are running "docker-compose up airflow-init" and then "docker-compose up"?
For my Mac with M1 chip: I had to increase the amount of RAM available to Docker to 8GB, and swap to 2GB's.
I got this error "Error response from daemon: manifest for apache/airflow:2.6.0.dev0 not found: manifest unknown: manifest unknown" after I typed "docker-compose up airflow-init". Why?
I don't know why but it is giving me some python error when i am executing docker compose up airflow-init. Any suggestion ?
anyone know why I can't access the installed airflow docker-compose in ec2 instance via browser? I have installed airflow using docker-compose in ec2 instance, all containers running, I have set inbound rules security group TCP 8080 port to be accessible. But when I open ec2dnsaddress:8080 on the browser, it shows This site can't be reached. I have check it also in docker-compose logs airflow-webserver, it doesn't capture access from the outside and it only logs healthcheck
I followed the steps mentioned here, but getting no response from gunicorn master within 120 seconds and the webserver keeps getting restarted. Can anyone help with any lead here please?
The only issue is that I can't import anything into my DAG from other folders (outside the dags folder). I don't know why, but I get an ImportError.
getting error : Import "airflow" could not be resolved while importing 'from airflow import DAG'
Great tutorial! Short and sweet. I followed the exact same steps and checked the containers status, redis and postgres were healthy but airflow-scheduler, flower, worker, webserver and triggerer were unhealthy then I deleted all the containers and repeated all the steps and now I'm getting error as "database "airflow" does not exist". Redis and postgres containers are running without any problem. I would appreciate if you can help me understand the error. Thanks.
@ucheokeke4780
2 years ago
hi, did you ever resolve the problem? I am having the same issues. Thanks
@omeryasirkucuk4250
5 months ago
Hi guys, I got the same problem as you. I am on Windows 10. Instead of running the "echo -e....." command, I created a .env file in the same directory as the .yaml file with AIRFLOW_UID=50000 in it. Problem solved!
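The workaround in this reply can be sketched as below, assuming a bash-like shell (for example WSL on Windows). The value 50000 is the one quoted in the comment; the project directory `./airflow-local` is a hypothetical placeholder. On Linux, using the host user id from `id -u` instead is a common choice, which is an assumption here and not something stated in the thread.

```shell
# Hypothetical project directory; point this at the folder holding docker-compose.yaml.
project_dir="${AIRFLOW_PROJECT_DIR:-./airflow-local}"
env_file="$project_dir/.env"
mkdir -p "$project_dir"
touch "$env_file"

# 50000 is the value from the comment above; override by exporting AIRFLOW_UID first,
# e.g. AIRFLOW_UID="$(id -u)" on Linux.
airflow_uid="${AIRFLOW_UID:-50000}"

# Add the setting only if .env does not define it yet, leaving other lines untouched.
grep -q '^AIRFLOW_UID=' "$env_file" || echo "AIRFLOW_UID=$airflow_uid" >> "$env_file"

cat "$env_file"
```

After that, the usual `docker-compose up airflow-init` followed by `docker-compose up` would be run from the same directory.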
I had to increase the amount of RAM available to Docker to 6GB for this to work on my Mac. Also had to enable permissions to the folder i worked in with CHMOD.
@hamzafaheem8512
2 years ago
Hi, could you please let me know how you enabled permissions using CHMOD? I keep getting the following error: "OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: "version": executable file not found in $PATH: unknown"
Hello guys. Just a tip for everyone: do not try to create the airflow folder outside your user folder... You will run into permissions problems (I tried to create an Airflow folder in /opt/, but I strongly don't recommend that).
For me it does not work. Docker compose is creating path "./local" and I can not access it. Airflow can not read my DAGS. it is very frustrating. I have been installing airflow for 8th time and none of them worked...
When I try to run the official 2.0.1 docker-compose.yaml file at airflow.apache.org/docs/apache-airflow/2.0.1/docker-compose.yaml on my Ubuntu 18.04 LTS I get the following error:
ERROR: The Compose file './docker-compose.yaml' is invalid because:
Invalid top-level property "x-airflow-common". Valid top-level sections for this Compose file are: services, version, networks, volumes, and extensions starting with "x-".
You might be seeing this error because you're using the wrong Compose file version. Either specify a supported version (e.g "2.2" or "3.3") and place your service definitions under the `services` key, or omit the `version` key and place your service definitions at the root of the file to use version 1.
For more on the Compose file format versions, see docs.docker.com/compose/compose-file/
services.airflow-init.depends_on contains an invalid type, it should be an array
services.airflow-scheduler.depends_on contains an invalid type, it should be an array
services.airflow-webserver.depends_on contains an invalid type, it should be an array
services.airflow-worker.depends_on contains an invalid type, it should be an array
services.flower.depends_on contains an invalid type, it should be an array
Changing the version to 3.4 removes the first error but I still get docker complaining about depends_on. How can I fix it?
My docker-compose version is
docker-compose version 1.17.1, build unknown
docker-py version: 2.5.1
CPython version: 2.7.17
OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
while docker version is
Client:
 Version: 19.03.6
 API version: 1.40
 Go version: go1.12.17
 Git commit: 369ce74a3c
 Built: Fri Dec 18 12:21:44 2020
 OS/Arch: linux/amd64
 Experimental: false
Server:
 Engine:
  Version: 19.03.6
  API version: 1.40 (minimum version 1.12)
  Go version: go1.12.17
  Git commit: 369ce74a3c
  Built: Thu Dec 10 13:23:49 2020
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.3.3-0ubuntu1~18.04.2
  GitCommit:
 runc:
  Version: spec: 1.0.1-dev
  GitCommit:
 docker-init:
  Version: 0.18.0
  GitCommit:
Thanks in advance, Flavio
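Errors like the ones above are typical of a docker-compose release that predates the features the Airflow compose file relies on. A minimal sketch of a version check follows, assuming `sort -V` is available (GNU coreutils). The minimum of 1.27.0 is an assumption, not a figure from the video; the Airflow documentation for your release states the actual requirement.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum; check the Airflow docs for the real requirement.
required="1.27.0"

# Defaults to the version quoted in the comment above. With Compose installed,
# it could be read with: docker-compose version --short
installed="${COMPOSE_VERSION:-1.17.1}"

if version_ge "$installed" "$required"; then
  echo "docker-compose $installed looks recent enough (>= $required)"
else
  echo "docker-compose $installed is older than $required, consider upgrading"
fi
```

With the 1.17.1 reported in the comment, the check takes the upgrade branch, which matches the `depends_on` complaints in the error output.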
the only commands that worked for me was the mkdir one everything else gave me an error
6:32 [11077] Failed to execute script docker-compose
where to place my custom Python file into the docker container? Thanks BTW for good video
Hi, I'm facing the issue PermissionError: [Errno 13] Permission denied: '/opt/airflow/logs/scheduler/20201-03-16' when running "docker-compose up airflow-init" . Any idea? Thanks
@DaniloPako
3 years ago
you can run the command with sudo (sudo docker-compose up airflow-init) but i want to know how to do without the sudo
@leamon9024
3 years ago
@@DaniloPako Hi, I just solved it without using sudo. You just have to make sure that the owner and group ID of all the files you create (dags, logs, plugins) and of the folder you're currently in are the user you're logged in as.
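A rough sketch of that ownership check, assuming bash and the three folders named in the reply. The project directory `./airflow-local` is a hypothetical placeholder, and the suggested `chown` is only printed, not executed.

```shell
# Hypothetical project directory; point this at the folder holding docker-compose.yaml.
project_dir="${AIRFLOW_PROJECT_DIR:-./airflow-local}"

# Create the mounted folders as the current user, before the containers
# get a chance to create them with a different owner.
mkdir -p "$project_dir/dags" "$project_dir/logs" "$project_dir/plugins"

# `[ -O path ]` is true when the path is owned by the effective user id.
for d in dags logs plugins; do
  path="$project_dir/$d"
  if [ -O "$path" ]; then
    echo "ok: $path is owned by $(id -un)"
  else
    echo "fix needed: sudo chown -R $(id -u):$(id -g) $path"
  fi
done
```

If any folder is reported as needing a fix, running the printed `chown` once should let `docker-compose up airflow-init` write to it without sudo, assuming the permission error comes from folder ownership as described in the reply.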
can pleaseeeeeeeeeeeeeeeeeeeeeeeeeee post a video on how to install databricks connection type in airflow 2.0.1
The container for the webserver seems to be restarting continuously every minute or so. Any idea why this may happen?
@kikecastor
2 years ago
You must increase the RAM of docker and that is how it will work
Hi, I'm a noob. I'm using the same YAML file, but after running the command "docker-compose up airflow-init" on my Ubuntu machine I'm getting this error, please help.
ERROR: The Compose file './docker-compose.yaml' is invalid because:
Invalid top-level property "x-airflow-common". Valid top-level sections for this Compose file are: services, version, networks, volumes, and extensions starting with "x-".
You might be seeing this error because you're using the wrong Compose file version. Either specify a supported version (e.g "2.2" or "3.3") and place your service definitions under the `services` key, or omit the `version` key and place your service definitions at the root of the file to use version 1.
For more on the Compose file format versions, see docs.docker.com/compose/compose-file/
services.airflow-init.depends_on contains an invalid type, it should be an array
services.airflow-scheduler.depends_on contains an invalid type, it should be an array
services.airflow-webserver.depends_on contains an invalid type, it should be an array
services.airflow-worker.depends_on contains an invalid type, it should be an array
services.flower.depends_on contains an invalid type, it should be an array