Write & run your first DAG on Airflow
You know what Airflow is and what its main concepts are. Now make it do something. Write a real DAG with tasks, drop it into Airflow, and watch it run to completion.
What we're doing
You'll write a DAG file from scratch and trigger it from the UI. Watch the video first, then follow along here.
Step 1: Create the DAG file
Open the dags folder in VS Code
Click VS Code in the environment panel. Once it opens, you'll see the dags folder in the file explorer on the left. Right-click it and choose New File. Name it firstdag.py.
Step 2: Add the imports
This DAG file starts with three imports:
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime
- DAG: the object that defines your pipeline
- PythonOperator: the operator that wraps a Python function into a task
- datetime: used to set the start date of the DAG
Step 3: Define the tasks as Python functions
def extract(): print("Data fetched from the API.")
def transform(): print("Data cleaned and transformed.")
def load(): print("Data loaded into the database...")
These are plain Python functions. The print statements are placeholders so you can see something in the logs when the tasks run. In a real pipeline, these functions would call an API, run a SQL query, or write to a database.
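As a rough sketch of what non-placeholder versions might look like, the functions below pass a small dataset through the same three stages. The sample records, the users table, and the in-memory SQLite database are all assumptions made for illustration; they are not part of this lesson's DAG, and the functions are called directly here rather than run as Airflow tasks.

```python
import sqlite3

# Hypothetical sample data standing in for an API response.
RAW_RECORDS = [
    {"name": " Ada ", "age": "36"},
    {"name": "Grace", "age": "45"},
    {"name": "", "age": "n/a"},  # malformed row, dropped in transform
]


def extract():
    """Return raw records (a real task might call an API here)."""
    return list(RAW_RECORDS)


def transform(records):
    """Strip whitespace, convert ages to int, and drop unusable rows."""
    cleaned = []
    for record in records:
        name = record["name"].strip()
        if not name or not record["age"].isdigit():
            continue
        cleaned.append((name, int(record["age"])))
    return cleaned


def load(rows, connection):
    """Insert cleaned rows into a table and return the table's row count."""
    connection.execute("CREATE TABLE IF NOT EXISTS users (name TEXT, age INTEGER)")
    connection.executemany("INSERT INTO users VALUES (?, ?)", rows)
    connection.commit()
    return connection.execute("SELECT COUNT(*) FROM users").fetchone()[0]


connection = sqlite3.connect(":memory:")
row_count = load(transform(extract()), connection)
print(f"Loaded {row_count} rows.")
```

For the tutorial itself, stick with the print-only versions: they keep the focus on how Airflow runs and orders the tasks.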
Step 4: Define the DAG
with DAG(
    dag_id="first_dag",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
- dag_id: the name that appears in the Airflow UI
- start_date: the date from which this DAG is active
- schedule="@daily": run once a day automatically
- catchup=False: don't backfill the runs missed since the start date
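To get a feel for what catchup=False saves you from, the sketch below counts the whole days that have elapsed between a start date and a given moment. It is a simplified model for illustration, not Airflow's scheduler logic; the function name missed_daily_runs and the example "now" value are made up.

```python
from datetime import datetime


def missed_daily_runs(start_date, now):
    """Count whole days elapsed between start_date and now (simplified model)."""
    if now <= start_date:
        return 0
    return (now - start_date).days


start_date = datetime(2024, 1, 1)
now = datetime(2024, 3, 1, 9, 30)  # hypothetical moment the DAG is first enabled

backlog = missed_daily_runs(start_date, now)
print(f"catchup=True would have roughly {backlog} past runs to work through.")
```

With catchup=False, Airflow skips that backlog instead of creating a run for every past day.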
Step 5: Create the tasks and set the order
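    # Indented so these tasks sit inside the with DAG(...) block and belong to the DAG.
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    # extract runs first, then transform, then load
    t1 >> t2 >> t3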
- task_id: the name shown in the Graph view
- python_callable: the function the task runs
- t1 >> t2 >> t3: defines the order

Extract runs first, then transform, then load. >> means "runs before". This single line defines the entire pipeline order.
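If you're wondering how >> can express ordering at all: Python lets a class define what the operator does through the __rshift__ method. The toy Task class below is a made-up stand-in, not Airflow's operator class; it only records which task comes next so the chaining behaviour is visible.

```python
class Task:
    """Toy stand-in for an Airflow task; it only records downstream tasks."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # `self >> other` means: other runs after self.
        self.downstream.append(other)
        # Returning `other` lets `a >> b >> c` chain left to right.
        return other


t1, t2, t3 = Task("extract"), Task("transform"), Task("load")
t1 >> t2 >> t3

for task in (t1, t2, t3):
    print(task.task_id, "->", [t.task_id for t in task.downstream])
```

Because t1 >> t2 returns t2, the chain is evaluated as (t1 >> t2) >> t3, which is why a single line is enough to link all three tasks.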
Step 6: Save the file
Press Ctrl+S.
Step 7: Find it in the UI
Open the Airflow UI from the environment panel. Go to the DAGs page and look for first_dag in the list. It should appear within about 30 seconds of saving the file: Airflow scans the dags folder continuously and registers the DAG as soon as it finds a valid file.
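If the DAG doesn't show up, one possible cause is a syntax error in the file. A quick check, sketched below, is to compile the file with Python's standard py_compile module from the VS Code terminal. The helper name has_valid_syntax and the path dags/firstdag.py are assumptions; adjust the path to wherever your file lives. This catches syntax errors only, not problems such as a failing import.

```python
import py_compile


def has_valid_syntax(path):
    """Return True if the file at `path` compiles as Python source."""
    try:
        py_compile.compile(path, doraise=True)
        return True
    except py_compile.PyCompileError as error:
        print(f"Syntax problem: {error.msg}")
        return False
    except OSError as error:
        print(f"Could not read {path}: {error}")
        return False


if __name__ == "__main__":
    # Assumed location of the DAG file relative to the current directory.
    print(has_valid_syntax("dags/firstdag.py"))
```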
Step 8: Trigger it and watch it run
Click the play button to the right of first_dag to trigger it manually. Then click the DAG's name and open the Graph view. Watch the tasks change color:
- White — waiting
- Yellow — running
- Green — success
- Red — failed
Once all three are green, click any task → Logs to see the message printed by its function.
After hibernation
If the VM hibernates, reconnect and run in the VS Code terminal:
cd ~/airflow
docker compose up -d
What's next
Start Airflow