Ecosystem
OpenLineage defines the metadata produced by pipelines and consumed by observability tools. A configurable backend allows the user to select a protocol for receiving events. Services using the OpenLineage standard can consume metadata, produce it, or both.
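As a rough illustration of the metadata exchanged between producers and consumers, the sketch below assembles a minimal OpenLineage-style run event as a plain dictionary and serializes it to JSON. The helper `build_run_event`, the producer URI, and the namespace, job, and dataset names are all hypothetical. Facets and the `schemaURL` field that a complete event carries are omitted for brevity, so treat this as a simplified sketch rather than a spec-complete event.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(event_type, job_namespace, job_name, inputs, outputs, run_id=None):
    """Build a minimal OpenLineage-style run event (hypothetical helper)."""
    return {
        "eventType": event_type,  # e.g. "START", "COMPLETE", "FAIL"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/my-pipeline",  # assumed producer URI
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        # Datasets are identified by a (namespace, name) pair.
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
    }


# Hypothetical job that reads one table and writes another.
event = build_run_event(
    "COMPLETE",
    "example_namespace",
    "daily_orders_load",
    inputs=[("warehouse", "raw.orders")],
    outputs=[("warehouse", "analytics.orders_daily")],
)

# The JSON payload is what a producer would hand to its configured backend.
payload = json.dumps(event)
```

How `payload` is delivered (HTTP, a message queue, a file) depends on the backend chosen, which is why the sketch stops at serialization.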
Consumers
Amundsen's OpenLineageTableLineageExtractor extracts table lineage information from OpenLineage events.
Astronomer's Astro uses the openlineage-airflow library to extract lineage from Airflow tasks and stores that data in the Astro control plane. The Astronomer UI then renders a graph and list of all tasks and datasets that include OpenLineage data.
Atlan's OpenLineage integration uses job facets to catalog operational metadata from pipelines, enrich existing assets, and provide persona-based lineage information using OpenLineage SDKs.
Egeria's OpenLineage integration can capture OpenLineage events directly via HTTP or the proxy backend.
The Google Cloud Data Catalog supports importing OpenLineage events through the Data Lineage API to display in the Dataplex UI alongside lineage information from Google Cloud services including Dataproc.
Manta's OpenLineage Scanner uses job facets to ingest OpenLineage metadata and enrich overall enterprise data pipeline analysis.
Marquez is a metadata server offering an OpenLineage-compatible endpoint for real-time collection of information about running jobs and applications.
Metaphor's HTTP endpoint processes OpenLineage events and extracts lineage, data quality metadata, and job facets to enable data governance and data enablement across an organization.
The Azure Databricks to Purview Lineage Connector transfers OpenLineage events from Spark operations in Azure Databricks to Microsoft Purview, allowing one to see a table-level lineage graph of operations in Databricks notebooks and jobs.
Snowflake's OpenLineage Adapter creates an account-scoped view from ACCESS_HISTORY and QUERY_HISTORY that outputs each query that accesses tables, formatted according to the OpenLineage JSON Schema specification.
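Several of the consumers above derive table-level lineage from incoming events. A minimal sketch of that idea follows, assuming each event lists its datasets under `inputs` and `outputs` with `namespace` and `name` fields. The function `extract_table_lineage` and the sample event are hypothetical and do not reproduce any listed tool's actual implementation.

```python
def extract_table_lineage(event):
    """Return (upstream, downstream) dataset pairs implied by one event.

    Each dataset is represented as a (namespace, name) tuple. Every input
    is treated as upstream of every output of the same run, which is the
    coarsest reading of table-level lineage.
    """
    inputs = [(d["namespace"], d["name"]) for d in event.get("inputs", [])]
    outputs = [(d["namespace"], d["name"]) for d in event.get("outputs", [])]
    return [(src, dst) for src in inputs for dst in outputs]


# Hypothetical event with two input tables and one output table.
sample_event = {
    "eventType": "COMPLETE",
    "job": {"namespace": "example_namespace", "name": "daily_orders_load"},
    "inputs": [
        {"namespace": "warehouse", "name": "raw.orders"},
        {"namespace": "warehouse", "name": "raw.customers"},
    ],
    "outputs": [{"namespace": "warehouse", "name": "analytics.orders_daily"}],
}

edges = extract_table_lineage(sample_event)
for upstream, downstream in edges:
    print(f"{upstream[1]} -> {downstream[1]}")
```

A real consumer would typically also read facets (for example, schema or data quality facets) and merge edges across many events. This sketch handles a single event only.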
Producers
A library converts Dagster events to OpenLineage events and emits them to an OpenLineage backend.
A wrapper script uses the OpenLineage client for automatic collection of metadata from dbt.
Egeria's OpenLineage integration publishes events to lineage integration connectors with OpenLineage listeners registered in the same instance of the Lineage Integrator OMIS.
The OpenLineage Flink Agent uses JVM instrumentation to emit OpenLineage metadata.
The Google Cloud Data Catalog's Data Lineage API enables importing OpenLineage events using the ProcessOpenLineageRunEvent REST API method and mapping OpenLineage facets to Data Lineage API attributes.
The OpenLineageValidationAction collects dataset metadata from Great Expectations' ValidationAction.
Keboola's OpenLineage integration automatically pushes all job information to an OpenLineage-compatible API endpoint.
The OpenLineage Spark Agent uses JVM instrumentation to emit OpenLineage metadata.