OpenLineage connectors are available for major job schedulers and data platforms. Once a connector is enabled, it makes the appropriate API calls automatically each time your pipeline executes. The connectors capture information about datasets, jobs, and runs, allowing you to study lineage across multiple data sources.
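To make "information about datasets, jobs, and runs" more concrete, here is a minimal sketch of the kind of run event a connector might emit on your behalf. It uses only the Python standard library and builds plain dictionaries shaped like OpenLineage run events. The helper `build_run_event`, the producer URL, the namespaces, and the job and dataset names are all hypothetical, and the `schemaURL` version shown is an assumption. A real connector also handles transport and attaches richer metadata.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(event_type, run_id, job_namespace, job_name, inputs, outputs):
    """Assemble a dict shaped like an OpenLineage run event (illustrative only)."""
    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # Hypothetical producer URI identifying the emitting connector.
        "producer": "https://example.com/hypothetical-connector",
        # Spec version here is an assumption; use the one your tooling targets.
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent",
        "run": {"runId": run_id},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
    }


# One run of a hypothetical job: the same runId ties START and COMPLETE together.
run_id = str(uuid.uuid4())
inputs = [("example-warehouse", "public.raw_orders")]
outputs = [("example-warehouse", "public.daily_order_totals")]

start_event = build_run_event(
    "START", run_id, "example-scheduler", "orders.daily_totals", inputs, outputs
)
complete_event = build_run_event(
    "COMPLETE", run_id, "example-scheduler", "orders.daily_totals", inputs, outputs
)

print(json.dumps(start_event, indent=2))
```

Because both events share one `runId`, a lineage backend can treat them as the start and end of a single run of the job and link that run to the datasets it read and wrote.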
Enabling OpenLineage in Apache Airflow automatically tracks metadata about jobs and datasets as DAGs execute.
OpenLineage can automatically track lineage metadata for jobs and datasets as Spark applications run.
Google Cloud Storage
Enabling OpenLineage in dbt can capture lineage metadata for transformations running within your data warehouse.