Apache Airflow 2.6.0: What’s New and How to Use It


Apache Airflow is a platform to programmatically author, schedule, and monitor workflows. Defining workflows as code makes them more maintainable, versionable, testable, and collaborative³. Apache Airflow 2.6.0 was released on April 30, 2023, bringing many minor features and improvements to the community¹. In this article, we highlight some of the notable new features in this release and walk through a simple example of using the taskflow decorator to define a DAG.

Some Notable New Features

  • Trigger logs can now be viewed in the webserver: Trigger logs are now included in task logs and appear right alongside the rest of your task’s logs. Adding this feature required changes across the entire Airflow logging stack, so be sure to update your providers if you use remote logging¹.
  • Grid view improvements: The grid view has received a number of minor improvements in this release. Most notably, there is now a graph tab in the grid view. This offers a more integrated graph representation of the DAG, where choosing a task in either the grid or graph will highlight the same task in both views. You can also filter upstream and downstream from a single task¹.
  • Trigger UI based on DAG level params: A user-friendly form is now shown to users triggering runs for DAGs with DAG level params. See the Params docs for more details¹.
  • Consolidation of handling stuck queued tasks: Airflow now has a single configuration option, [scheduler] task_queued_timeout, for handling tasks that stay in the queued state for too long. With a simpler implementation than the code it replaces, tasks stuck in queued will no longer slip through the cracks¹.
  • Cluster Policy hooks can come from plugins: Cluster policy hooks (e.g. dag_policy), can now come from Airflow plugins in addition to Airflow local settings. By allowing multiple hooks to be defined, it makes it easier for more than one team to run hooks in a single Airflow instance. See the cluster policy docs for more details¹.
  • Notification support added: The notifications framework allows you to send messages to external systems when a task instance or DAG run changes state. For example, you can easily post a message to Slack:

    with DAG(
        "slack_notifier_example",
        start_date=datetime(2023, 1, 1),
        on_success_callback=[
            send_slack_notification(
                text="The DAG {{ dag.dag_id }} succeeded",
                channel="#general",
                username="Airflow",
            )
        ],
    ):
        ...

    As of today, Slack is the only system supported out of the box¹.
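To illustrate the consolidated queued-task handling mentioned above, the timeout is a single setting in the [scheduler] section of airflow.cfg (it can also be supplied through the usual AIRFLOW__SCHEDULER__TASK_QUEUED_TIMEOUT environment variable). The value below is illustrative, not a recommendation:

```ini
[scheduler]
# Tasks that remain in the queued state longer than this many seconds
# are failed or retried by the scheduler (illustrative value).
task_queued_timeout = 600.0
```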

A Simple Example of Taskflow Decorator

The taskflow decorator is a way of defining DAGs using plain Python functions as tasks. It was introduced in Airflow 2.0 and has been improved since then. The taskflow decorator lets you define tasks with minimal boilerplate code and automatically handles XCom passing between them for you.

Here is a simple example of how to use the taskflow decorator to define a DAG with extract, transform, and load tasks:

from datetime import datetime, timedelta

from airflow.decorators import dag, task

@dag(
    dag_id="my_dag",
    description="A simple DAG using the taskflow decorator",
    start_date=datetime(2023, 1, 1),
    default_args={
        "owner": "airflow",
        "retries": 1,
        "retry_delay": timedelta(minutes=5),
    },
)
def my_dag():

    @task
    def extract():
        # Extract data from a file (placeholder data used here)
        return [1, 2, 3]

    @task
    def transform(data):
        # Transform the data
        return [x * 2 for x in data]

    @task
    def load(data):
        # Load the data into a database (printed here for brevity)
        print(data)

    # Calling one task with another's result defines the dependency
    # chain and passes the data between tasks as XComs
    load(transform(extract()))

my_dag()

As you can see, each function decorated with @task becomes a task in the DAG. The function arguments and return values are automatically passed as XComs between tasks, and calling one task with another’s result establishes the dependency extract → transform → load. You can also use any Python code inside the function body.

To run this DAG, save it as a Python file (e.g. simple_taskflow.py) and place it in your dags folder. You can then trigger it from the webserver UI or the CLI (e.g. airflow dags trigger my_dag).

Conclusion

Apache Airflow 2.6.0 is a minor release that brings many new features and improvements to the community. Notable additions include trigger logs in the webserver, grid view enhancements, a trigger UI based on DAG-level params, consolidated handling of stuck queued tasks, cluster policy hooks from plugins, and notification support. We also showed a simple example of using the taskflow decorator to define a DAG with Python functions as tasks.

If you want to learn more about Apache Airflow 2.6.0, you can check out the release notes, documentation, and blog.


Source:
(1) apache-airflow · PyPI. https://pypi.org/project/apache-airflow/.
(2) What’s New in Apache Airflow 2.6.0 | Apache Airflow. https://airflow.apache.org/blog/airflow-2.6.0/.
(3) Release Notes — Airflow Documentation – Apache Airflow. https://airflow.apache.org/docs/apache-airflow/stable/release_notes.html.
