One morning you find out your favorite Airflow DAG did not ran that night. Sad…

Six months later the task ran twice and now you understand: you scheduled your DAG timezone aware and the clock goes back and forth sometimes because of Daylight Saving Time.

For example, in Central European Time (CET) on Sunday 29 March 2020, 02:00, the clocks were turned from “local standard time” forward 1 hour to 03:00:00 “local daylight time”. And recently, on Sunday 25 October 2020, 03:00 the clocks were turned from “local daylight time” backward 1 hour to “local standard time” instead. That means that any activity between 02:00 and 03:00 will not exist, or exists twice.

Does Airflow REALLY behave like this?

Spoiler: luckily it does not :)

This is our simple timezone aware DAG with a DummyOperator that starts before and ends after March 29th 2020:

from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
import pendulum

local_tz = pendulum.timezone("Europe/Amsterdam")

default_args=dict(
start_date=datetime(2020, 3, 28, tzinfo=local_tz),
end_date=datetime(2020, 4, 1, tzinfo=local_tz),
owner='joost.dobken'
)

with DAG('check_daylight_savings', schedule_interval='42 2 * * *', default_args=default_args) as dag: