0.30.1 - 2023-07-25
Added
- Flink: support Iceberg sinks #1960 @pawel-big-lebowski
  Detects output datasets when using an Iceberg table as a sink.
- Spark: column-level lineage for `merge into` on Delta tables #1958 @pawel-big-lebowski
  Makes column-level lineage support `merge into` on Delta tables. Also refactors column-level lineage to deal with multiple Spark versions.
- Spark: column-level lineage for `merge into` on Iceberg tables #1971 @pawel-big-lebowski
  Makes column-level lineage support `merge into` on Iceberg tables.
- Spark: add support for Iceberg REST catalog #1963 @juancappi
  Adds `rest` to the existing options of `hive` and `hadoop` in `IcebergHandler.getDatasetIdentifier()` to add support for Iceberg's `RestCatalog`.
- Airflow: add possibility to force direct-execution based on environment variable #1934 @mobuchowski
  Adds the option to use the direct-execution method on the Airflow listener when the existence of a non-SQLAlchemy-based Airflow event mechanism is confirmed. This happens when using Airflow 2.6 or when the `OPENLINEAGE_AIRFLOW_ENABLE_DIRECT_EXECUTION` environment variable exists.
- SQL: add support for Apple Silicon to `openlineage-sql-java` #1981 @davidjgoss
  Expands the OS/architecture checks when compiling to produce a specific file for Apple Silicon. Also expands the corresponding OS/architecture checks when loading the binary at runtime from Java code.
- Spec: add facet deletion #1975 @julienledem
  In order to add a mechanism for deleting job and dataset facets, adds a `{ _deleted: true }` object that can take the place of any job or dataset facet (but not run or input/output facets, which are valid only for a specific run).
- Client: add a file transport #1891 @Alexkuva
  Creates a `FileTransport` and its configuration classes supporting append mode or write-new-file mode, which is especially useful when an object store does not support append mode, e.g. in the case of Databricks DBFS FUSE.
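
To illustrate the facet deletion entry in this list, here is a minimal sketch of how a consumer might treat the `{ _deleted: true }` marker. The `apply_facet_update` helper and the sample facet contents are hypothetical and not part of OpenLineage; real facets also carry fields such as `_producer` and `_schemaURL`, which are omitted here for brevity.

```python
import json

# Marker object from the spec change: it can take the place of any job or dataset facet.
DELETED = {"_deleted": True}


def apply_facet_update(current, update):
    """Hypothetical consumer-side merge (an assumption, not an OpenLineage API).

    Facets marked with `_deleted: true` are dropped from the stored state;
    every other facet in the update overwrites the stored facet of the same name.
    """
    merged = dict(current)
    for name, facet in update.items():
        if isinstance(facet, dict) and facet.get("_deleted") is True:
            merged.pop(name, None)
        else:
            merged[name] = facet
    return merged


# Job facets as previously stored by a (hypothetical) backend.
stored = {
    "documentation": {"description": "Nightly orders load"},
    "ownership": {"owners": [{"name": "team:data"}]},
}

# A later event replaces the documentation facet with the deletion marker.
update = {"documentation": DELETED}
result = apply_facet_update(stored, update)

# The job fragment of the event as it would be serialized.
event_fragment = json.dumps({"job": {"facets": update}})
```

In this sketch the `ownership` facet survives because the update does not mention it; only the facet explicitly replaced by the marker is removed.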
Changed
- Airflow: do not run plugin if OpenLineage provider is installed #1999 @JDarDagran
  Sets `OPENLINEAGE_DISABLED` to `true` if the provider is installed.
- Python: rename `config` to `config_class` #1998 @mobuchowski
  Renames the `config` class variable to `config_class` to avoid potential conflict with the config instance.
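
As a rough illustration of the plugin change in this list, the sketch below sets `OPENLINEAGE_DISABLED` to `true` when a provider package can be found. The detection approach and the module name `airflow.providers.openlineage` are assumptions made for this example; the actual check in the plugin may differ.

```python
import importlib.util
import os


def provider_installed(module_name="airflow.providers.openlineage"):
    """Return True if the given module can be located (module name is an assumption)."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError (an ImportError).
        return False


def disable_plugin_if_provider(env, module_name="airflow.providers.openlineage"):
    """Hypothetical helper: flag the plugin as disabled when the provider is present."""
    if provider_installed(module_name):
        env["OPENLINEAGE_DISABLED"] = "true"
        return True
    return False


if __name__ == "__main__":
    # Applies the check to the real process environment.
    disable_plugin_if_provider(os.environ)
```

Passing the environment mapping as an argument keeps the helper easy to exercise without touching `os.environ`.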
Fixed
- Airflow: add workaround for airflow-sqlalchemy event mechanism bug #1959 @mobuchowski
  Due to known issues with the fork and thread model in the Airflow-SQLAlchemy-based event-delivery mechanism, a Kafka producer left alone does not emit a `COMPLETE` event. This creates a producer for each event when we detect that we're under Airflow 2.3 - 2.5.
- Spark: fix custom environment variables facet #1973 @pawel-big-lebowski
  Enables sending the Spark environment variables facet in a non-deterministic way.
- Spark: filter unwanted Delta events #1968 @pawel-big-lebowski
  Clears events generated by logical plans having a `Project` node as root.
- Python: allow modification of `openlineage.*` logging levels via environment variables #1974 @JDarDagran
  Adds `OPENLINEAGE_{CLIENT/AIRFLOW/DBT}_LOGGING` environment variables that can be set according to module logging levels and cleans up some logging calls in `openlineage-airflow`.
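
The logging entry in this list can be pictured with the following sketch, which maps each environment variable to a logger and applies the requested level. The logger names (`openlineage.client`, `openlineage.airflow`, `openlineage.dbt`) and the `configure_openlineage_logging` function are assumptions chosen for illustration, not the packages' actual implementation.

```python
import logging
import os

# Assumed mapping from environment variable to logger name.
ENV_TO_LOGGER = {
    "OPENLINEAGE_CLIENT_LOGGING": "openlineage.client",
    "OPENLINEAGE_AIRFLOW_LOGGING": "openlineage.airflow",
    "OPENLINEAGE_DBT_LOGGING": "openlineage.dbt",
}


def configure_openlineage_logging(env=None):
    """Apply logging levels named in the environment; return what was applied."""
    env = os.environ if env is None else env
    applied = {}
    for var, logger_name in ENV_TO_LOGGER.items():
        level_name = env.get(var)
        if not level_name:
            continue
        # getLevelName returns an int for known level names, a string otherwise.
        level = logging.getLevelName(level_name.upper())
        if not isinstance(level, int):
            continue  # ignore unknown level names instead of failing
        logging.getLogger(logger_name).setLevel(level)
        applied[logger_name] = level
    return applied
```

With this sketch, running a process with `OPENLINEAGE_CLIENT_LOGGING=DEBUG` and then calling `configure_openlineage_logging()` would raise the verbosity of the assumed `openlineage.client` logger only.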