OpenLineage defines the metadata produced by pipelines and consumed by observability tools. A configurable backend allows the user to select a protocol for receiving events. Services using the OpenLineage standard can either consume or produce metadata.

Consumers

Amundsen's OpenLineageTableLineageExtractor extracts table lineage information from OpenLineage events.
Egeria's OpenLineage integration can capture OpenLineage events directly via HTTP or the proxy backend.
Marquez is a metadata server offering an OpenLineage-compatible endpoint for real-time collection of information about running jobs and applications.
MANTA's OpenLineage Scanner uses job facets to ingest OpenLineage metadata and enrich overall enterprise data pipeline analysis.
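As a rough illustration of what a consumer does with an incoming event, the sketch below derives table-level lineage edges from one run event. It is not the implementation used by any of the tools listed above: the function name `extract_table_lineage` and the sample event values are hypothetical, and the event is assumed to carry the `inputs` and `outputs` dataset lists defined by the OpenLineage run event model.

```python
def extract_table_lineage(event):
    """Return (upstream, downstream) dataset pairs found in one run event.

    Hypothetical helper: each dataset is identified by "namespace/name".
    """
    def qualified(dataset):
        return f"{dataset['namespace']}/{dataset['name']}"

    inputs = [qualified(d) for d in event.get("inputs", [])]
    outputs = [qualified(d) for d in event.get("outputs", [])]
    # Every input of the job is treated as upstream of every output.
    return [(src, dst) for src in inputs for dst in outputs]


# Sample event with made-up values, trimmed to the fields used here.
SAMPLE_EVENT = {
    "eventType": "COMPLETE",
    "job": {"namespace": "example", "name": "daily_orders"},
    "inputs": [
        {"namespace": "warehouse", "name": "raw.orders"},
        {"namespace": "warehouse", "name": "raw.customers"},
    ],
    "outputs": [{"namespace": "warehouse", "name": "analytics.orders"}],
}

if __name__ == "__main__":
    for upstream, downstream in extract_table_lineage(SAMPLE_EVENT):
        print(f"{upstream} -> {downstream}")
```

Pairing every input with every output is the coarsest possible reading of an event; a real consumer may refine it with facets such as column-level lineage when they are present.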

Producers

The OpenLineageValidationAction collects dataset metadata from Great Expectations' ValidationAction.
The OpenLineage Spark Agent uses JVM instrumentation to emit OpenLineage metadata.
A wrapper script uses the OpenLineage client for automatic collection of metadata from dbt.
A library integrates with Airflow DAGs to collect metadata automatically.
The OpenLineage Flink Agent uses JVM instrumentation to emit OpenLineage metadata.
A library converts Dagster events to OpenLineage events and emits them to an OpenLineage backend.
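To make the producer side concrete, here is a minimal sketch of building a run event and preparing it for an HTTP backend, using only the Python standard library rather than any of the integrations above. Several values are assumptions for illustration: the producer URI, the `example` namespace, the backend URL (a Marquez-style `/api/v1/lineage` endpoint on localhost), and the spec version in the schema URL, which should be checked against the OpenLineage specification in use.

```python
import json
import urllib.request
import uuid
from datetime import datetime, timezone

# Hypothetical values, chosen only for this sketch.
PRODUCER_URI = "https://example.com/my-pipeline"
BACKEND_URL = "http://localhost:5000/api/v1/lineage"
# Spec version shown is illustrative; use the version your backend expects.
SCHEMA_URL = "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"


def build_run_event(event_type, job_name, run_id, inputs=(), outputs=(),
                    namespace="example"):
    """Build a minimal run event as a plain dict."""
    def dataset(name):
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [dataset(n) for n in inputs],
        "outputs": [dataset(n) for n in outputs],
        "producer": PRODUCER_URI,
        "schemaURL": SCHEMA_URL,
    }


def build_request(event, url=BACKEND_URL):
    """Wrap the event in a JSON POST request without sending it."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    run_id = str(uuid.uuid4())
    event = build_run_event("START", "daily_orders", run_id,
                            inputs=["raw.orders"], outputs=["analytics.orders"])
    print(json.dumps(event, indent=2))
    # To send it to a running backend:
    # urllib.request.urlopen(build_request(event))
```

A producer would typically emit a `START` event and later a `COMPLETE` (or `FAIL`) event with the same `runId`, so that a consumer can tie both to one run.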