Integrations

OpenLineage connectors exist for major job schedulers and data platforms. Once a connector is enabled, it makes the appropriate OpenLineage API calls automatically each time your pipeline executes. The connectors capture information about datasets, jobs, and runs, allowing you to study lineage across multiple data sources.
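To make "information about datasets, jobs, and runs" concrete, the following is a minimal sketch of the kind of run event a connector might emit on your behalf. The field names follow the OpenLineage run event structure (eventType, eventTime, run, job, inputs, outputs, producer); the function name `build_run_event`, the producer URL, and the example namespaces are assumptions made for illustration. The sketch is simplified: the full specification defines further fields and facets that are left out here, and nothing is sent over the network.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(event_type, job_namespace, job_name, inputs, outputs, run_id=None):
    """Build a simplified OpenLineage-style run event as a plain dict.

    inputs and outputs are lists of (namespace, name) pairs identifying datasets.
    """
    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
        # Hypothetical producer URI; a real connector sets its own value.
        "producer": "https://example.com/hypothetical-producer",
    }


event = build_run_event(
    "COMPLETE",
    job_namespace="example_namespace",      # illustrative value
    job_name="daily_orders_load",           # illustrative value
    inputs=[("postgres://db.example.com", "public.orders")],
    outputs=[("snowflake://account.example", "analytics.orders")],
)
print(json.dumps(event, indent=2))
```

A connector produces events like this for each run, so you normally never build them by hand; the sketch only shows what is being captured.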

Platform | Version | Data Sources | Resources

Apache Airflow

Enabling OpenLineage in Apache Airflow automatically tracks metadata about jobs and datasets as DAGs execute.

Version: 1.10+

Data Sources: PostgreSQL, Snowflake, Amazon Redshift, Google BigQuery, Great Expectations
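Enabling the Airflow integration is mostly a matter of configuration. As a hedged sketch, the helper below assembles the environment variables the OpenLineage Airflow integration reads: `OPENLINEAGE_URL`, `OPENLINEAGE_NAMESPACE`, and optionally `OPENLINEAGE_API_KEY`. The helper name `airflow_openlineage_env` and the example values are assumptions for illustration. The `AIRFLOW__LINEAGE__BACKEND` setting applies to newer Airflow 2.x releases; on Airflow 1.10 the integration is instead typically enabled by importing `DAG` from `openlineage.airflow` in place of `airflow`. The exact mechanism depends on your Airflow and integration versions, so check the integration documentation for your setup.

```python
def airflow_openlineage_env(lineage_url, namespace, api_key=None, use_lineage_backend=False):
    """Return environment variables for an Airflow process with OpenLineage enabled."""
    env = {
        "OPENLINEAGE_URL": lineage_url,      # where lineage events are sent
        "OPENLINEAGE_NAMESPACE": namespace,  # namespace that groups the emitted jobs
    }
    if api_key:
        env["OPENLINEAGE_API_KEY"] = api_key
    if use_lineage_backend:
        # Lineage backend hook used on newer Airflow 2.x releases.
        env["AIRFLOW__LINEAGE__BACKEND"] = "openlineage.lineage_backend.OpenLineageBackend"
    return env


# Illustrative values only; substitute your own lineage endpoint and namespace.
env = airflow_openlineage_env("http://localhost:5000", "example_namespace")
for key, value in sorted(env.items()):
    print(f"export {key}={value}")
```

These variables need to be visible to the Airflow scheduler and workers, not only to the shell where you trigger DAGs.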

Apache Spark

OpenLineage can automatically track lineage of jobs and datasets across Spark jobs.

Version: 2.4+

Data Sources: JDBC, HDFS, Google Cloud Storage, Google BigQuery, Amazon S3
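For Spark, the integration is attached as a Spark listener through configuration properties. The sketch below builds a `spark-submit` argument list that registers the `io.openlineage.spark.agent.OpenLineageSparkListener` class via `spark.extraListeners`. The function name `spark_submit_args`, the application file name, and the package coordinate placeholder are assumptions for illustration. The `spark.openlineage.transport.*` property names reflect recent integration releases; older releases used different property names, so treat them as something to verify against the version you install.

```python
def spark_submit_args(app, lineage_url, namespace, package):
    """Build a spark-submit command line that attaches the OpenLineage listener.

    package is the Maven coordinate of the OpenLineage Spark integration,
    supplied by the caller because the artifact name and version vary.
    """
    conf = {
        "spark.extraListeners": "io.openlineage.spark.agent.OpenLineageSparkListener",
        "spark.openlineage.transport.type": "http",
        "spark.openlineage.transport.url": lineage_url,
        "spark.openlineage.namespace": namespace,
    }
    args = ["spark-submit", "--packages", package]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


# Illustrative values only; "<openlineage-spark-coordinate>" is a placeholder.
cmd = spark_submit_args(
    "etl_job.py",
    lineage_url="http://localhost:5000",
    namespace="example_namespace",
    package="<openlineage-spark-coordinate>",
)
print(" ".join(cmd))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting.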

dbt

Enabling OpenLineage in dbt captures lineage metadata for transformations running within your data warehouse.

Version: 0.20+

Data Sources: Snowflake, Google BigQuery
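With dbt, the OpenLineage integration ships a `dbt-ol` wrapper command that is run in place of `dbt` and emits lineage for the invocation, with the lineage endpoint again taken from `OPENLINEAGE_URL`. As a small sketch, the hypothetical helper `to_dbt_ol` below rewrites an existing dbt command line to use the wrapper. The helper name and the example arguments are assumptions; the sketch only builds the command and does not execute it.

```python
def to_dbt_ol(command):
    """Return a copy of a dbt command line with the executable swapped for dbt-ol."""
    if not command or command[0] != "dbt":
        raise ValueError("expected a command line starting with 'dbt'")
    return ["dbt-ol"] + list(command[1:])


# Illustrative invocation; the model name is made up.
original = ["dbt", "run", "--select", "orders"]
wrapped = to_dbt_ol(original)
print(" ".join(wrapped))
```

To actually run the wrapped command, you could pass it to `subprocess.run` with `OPENLINEAGE_URL` set in the environment.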