Introducing the Tidal Automation Adapter for Apache Airflow

The Apache Airflow platform allows users to create, schedule, run, and monitor workflows by creating Python scripts that define DAGs (Directed Acyclic Graphs). A DAG can define a set of dependencies and a sequence of steps, or tasks, to perform at a scheduled date and time. The Tidal Automation (TA) Adapter for Apache Airflow works with the Airflow platform to enhance Apache Airflow usability and to integrate seamlessly with Enterprise Scheduling. While Apache Airflow supports only basic time-based schedules, the Tidal Adapter offers more robust scheduling abilities.

As shown in this illustration, TA can include complex Airflow DAGs as Airflow jobs within a full business workflow, where they can be scheduled based on other dependencies and resources.

Using the Adapter lets you:

  • Schedule Airflow jobs based on business activities in addition to time-based triggers.

  • Create schedule dependencies based on the completion status of other Airflow jobs.

  • Schedule Airflow jobs based on events external to a DAG.

In addition to scheduling enhancements, the adapter can also:

  • Verify the availability of external resources before starting an Airflow job.

  • Distribute Airflow jobs across multiple Airflow adapters for improved performance.

The Tidal Automation Adapter for Apache Airflow appears in the Adapters section of TA. You create Airflow jobs and schedule them to run on Airflow Servers that you define in the Airflow Connection Definition dialog.

Through authorized Airflow operator IDs, you can also invoke Airflow from the Scheduler to add and edit the process definitions and run controls for which you have privileges.

Airflow adapter components

The Tidal Automation Adapter for Apache Airflow comprises two major components:

  • Tidal Bridge

    The Tidal Bridge is an Airflow plugin that exposes the REST API that the adapter uses to connect to an Airflow instance. Tidal Bridge provides different plugins based on the version of the Airflow instance to which the adapter connects:

    The Airflow 1.0 plugin lets the adapter connect to an Airflow 1.0 instance. Airflow 1.0 does not provide a stable REST API, so the plugin supplies the necessary connection functionality. This plugin is required for adapters that connect to Airflow 1.0 instances.

    Airflow 2.0 provides its own stable REST API, which the adapter uses by default for connections (see the sketch after this list). The Airflow 2.0 plugin is therefore optional for Airflow 2.0 instances. If you want to use the plugin for connections instead, override the default behavior by setting the AIRFLOW_API parameter to PLUGIN in the Airflow 2.0 configuration (Connection Definition > Options > Parameter tab).

  • Tidal wrapper DAG

    The Tidal wrapper DAG provides additional support for running individual tasks in an Airflow DAG and for Google Cloud Composer environments. Tidal provides a separate wrapper DAG for each Airflow version.
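
For context, here is a minimal sketch of the kind of call the adapter makes against the Airflow 2.0 stable REST API. The host, credentials, and DAG ID are hypothetical, and the example assumes basic authentication is enabled on the Airflow webserver; the adapter issues equivalent requests on your behalf, so this is for illustration only.

    import requests

    # Hypothetical Airflow 2.0 webserver and credentials; assumes basic auth
    # is enabled (auth_backend = airflow.api.auth.backend.basic_auth).
    AIRFLOW_URL = "http://airflow.example.com:8080"

    # Trigger a DAG run through the stable REST API:
    # POST /api/v1/dags/{dag_id}/dagRuns
    response = requests.post(
        f"{AIRFLOW_URL}/api/v1/dags/example_etl/dagRuns",
        auth=("admin", "admin"),
        json={"conf": {}},
    )
    response.raise_for_status()
    print(response.json()["dag_run_id"])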

Terms to know

  • Directed Acyclic Graph (DAG) – In the Airflow environment, a DAG is a Python script that represents a collection of tasks to run, along with their relationships and dependencies. It describes how to carry out a workflow.

    Directed: Tasks are executed in a specified order.

    Acyclic: Loops (cycles) are not supported.

    Graph: Process flows can be displayed graphically.

  • Operator – An operator is a template for a single task in a workflow; it defines what the task does.

  • Task – A task is a unit of work within a DAG, created by instantiating an operator.
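
To make these terms concrete, here is a minimal Airflow 2.x DAG; the DAG ID, schedule, and commands are illustrative. BashOperator is the operator, each instantiation of it is a task, and the >> arrows define the directed, acyclic order in which the tasks run.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # A DAG is a Python script that declares tasks and their dependencies.
    with DAG(
        dag_id="example_etl",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        # BashOperator is the operator; each instantiation below is a task.
        extract = BashOperator(task_id="extract", bash_command="echo extract")
        transform = BashOperator(task_id="transform", bash_command="echo transform")
        load = BashOperator(task_id="load", bash_command="echo load")

        # Directed, acyclic ordering: extract, then transform, then load.
        extract >> transform >> load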

Software requirements

The Adapter requires the following software:

  • Apache Airflow 1.10.10 and higher, which requires installing the Tidal Bridge, as described in Installing Tidal Automation adapter for Apache Airflow.

  • Apache Airflow 2.0 and higher, which uses the Airflow API (Stable) to communicate with Airflow 2.0 servers.

  • Tidal Automation REST API Plugin 1.0.3 and higher, which is included in the Tidal Bridge installation package. See Installing Tidal Automation adapter for Apache Airflow.

  • Tidal Automation 6.5.5 and higher.