AIRFLOW

Write & run your first DAG on Airflow

You know what Airflow is and its main concepts. Now make it do something. Write a real DAG with tasks, drop it into Airflow, and watch it run to completion.

What we're doing

You'll write a DAG file from scratch and trigger it from the UI. Watch the video first, then follow along here.

Step 1: Create the DAG file

To open the dags folder, click VS Code in the environment panel. Once it opens, you'll see the dags folder in the file explorer on the left. Right-click it and choose New File. Name it firstdag.py.

Step 2: Add the imports

Every DAG file starts with three imports:

from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime
  • DAG — the object that defines your pipeline
  • PythonOperator — the operator that wraps a Python function into a task
  • datetime — used to set the start date of the DAG

Step 3: Define the tasks as Python functions

def extract():
    print("Data fetched from the API.")

def transform():
    print("Data cleaned and transformed.")

def load():
    print("Data loaded into the database...")

These are plain Python functions. The print statements are placeholders so you can see something in the logs when the tasks run. In a real pipeline, these functions would call an API, run a SQL query, or write to a database.
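
Because these are ordinary functions, you can sanity-check them outside Airflow before wiring them into a DAG. The following is a minimal sketch, assuming the three functions from this step. The run_locally helper is hypothetical and not part of Airflow:

```python
import io
from contextlib import redirect_stdout


def extract():
    print("Data fetched from the API.")


def transform():
    print("Data cleaned and transformed.")


def load():
    print("Data loaded into the database...")


def run_locally(*steps):
    """Call each function in order and return the lines they printed (hypothetical helper)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        for step in steps:
            step()
    return buffer.getvalue().splitlines()


if __name__ == "__main__":
    for line in run_locally(extract, transform, load):
        print(line)
```

Running the file directly prints the three placeholder messages in order, which is the same text you would later look for in the task logs.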

Step 4: Define the DAG

with DAG(
    dag_id="first_dag",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
  • dag_id — the name that appears in the Airflow UI
  • start_date — when this DAG became active
  • schedule="@daily" — run once a day automatically
  • catchup=False — won't backfill all the missed runs since the start date
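
To see why catchup=False matters, it helps to count how many daily intervals sit between the start date and the day you switch the DAG on. The sketch below is plain date arithmetic, not an Airflow API. missed_daily_runs is a hypothetical helper, the "now" value is an example date, and it assumes one run per completed day:

```python
from datetime import datetime


def missed_daily_runs(start_date, now):
    """Count completed one-day intervals between start_date and now (hypothetical helper)."""
    if now <= start_date:
        return 0
    return (now - start_date).days


# Start date from this step, with an example "now" of 1 March 2024.
backlog = missed_daily_runs(datetime(2024, 1, 1), datetime(2024, 3, 1))
print(backlog)  # 60
```

With catchup=True, Airflow would schedule roughly that many runs as soon as the DAG is switched on. With catchup=False, it skips the backlog and only schedules the most recent interval.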

Step 5: Create the tasks and set the order

Add these lines inside the with DAG(...) block from Step 4, indented one level:

    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)

    t1 >> t2 >> t3
  • task_id — the name shown in the graph view
  • python_callable — points to the function to run
  • t1 >> t2 >> t3 — defines the order: extract runs first, then transform, then load. >> means "runs before", so this single line defines the entire pipeline order.
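
The >> syntax is ordinary Python operator overloading. The toy class below is not Airflow's implementation. It only illustrates how a chained t1 >> t2 >> t3 can record dependencies, and ToyTask and its attributes are made up for this sketch:

```python
class ToyTask:
    """Toy stand-in for an Airflow task; it only tracks what runs after it."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # a >> b records b as downstream of a, then returns b so the chain continues.
        self.downstream.append(other)
        return other


t1, t2, t3 = ToyTask("extract"), ToyTask("transform"), ToyTask("load")
t1 >> t2 >> t3

print([t.task_id for t in t1.downstream])  # ['transform']
print([t.task_id for t in t2.downstream])  # ['load']
```

Because each >> returns its right-hand side, the chain is evaluated left to right as (t1 >> t2) >> t3, which is why one line is enough to link all three tasks.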

Step 6: Save the file

Press Ctrl+S to save the file.

Step 7: Find it in the UI

Open the Airflow UI from the environment panel. Go to the DAGs page and look for first_dag in the list. It appears within 30 seconds of saving the file. Airflow scans the dags folder continuously and registers the DAG as soon as it finds a valid Python file.
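
If first_dag does not show up, one thing worth ruling out is a syntax error in the file. The sketch below uses only the Python standard library to check that a file parses. is_valid_python is a hypothetical helper and the dags/firstdag.py path is an assumption about your environment. It checks syntax only, not whether Airflow can import the file:

```python
import ast
from pathlib import Path


def is_valid_python(source):
    """Return True if the source text parses as Python (hypothetical helper)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    dag_file = Path("dags/firstdag.py")  # assumed path; adjust to your environment
    if dag_file.exists():
        print(is_valid_python(dag_file.read_text()))
```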

Step 8: Trigger it and watch it run

Click the play button to the right of first_dag to trigger it manually. Then click the DAG name and open the Graph view. Watch the tasks change color:

  • White — waiting
  • Yellow — running
  • Green — success
  • Red — failed

Once all three are green, click any task → Logs to see its output. The log for extract, for example, includes the line "Data fetched from the API."

After hibernation

If the VM hibernates, reconnect and run in the VS Code terminal:

cd ~/airflow
docker compose up -d

What's next

Now try this out in a live environment: start Airflow and run the DAG above.
