0.30.1 - 2023-07-25
Added
- Flink: support Iceberg sinks #1960 @pawel-big-lebowski
  Detects output datasets when using an Iceberg table as a sink.
- Spark: column-level lineage for `merge into` on Delta tables #1958 @pawel-big-lebowski
  Makes column-level lineage support `merge into` on Delta tables. Also refactors column-level lineage to deal with multiple Spark versions.
- Spark: column-level lineage for `merge into` on Iceberg tables #1971 @pawel-big-lebowski
  Makes column-level lineage support `merge into` on Iceberg tables.
- Spark: add support for Iceberg REST catalog #1963 @juancappi
  Adds `rest` to the existing options of `hive` and `hadoop` in `IcebergHandler.getDatasetIdentifier()` to add support for Iceberg's `RestCatalog`.
- Airflow: add possibility to force direct-execution based on environment variable #1934 @mobuchowski
  Adds the option to use the direct-execution method on the Airflow listener when the existence of a non-SQLAlchemy-based Airflow event mechanism is confirmed. This happens when using Airflow 2.6 or when the `OPENLINEAGE_AIRFLOW_ENABLE_DIRECT_EXECUTION` environment variable exists.
- SQL: add support for Apple Silicon to `openlineage-sql-java` #1981 @davidjgoss
  Expands the OS/architecture checks when compiling to produce a specific file for Apple Silicon. Also expands the corresponding OS/architecture checks when loading the binary at runtime from Java code.
- Spec: add facet deletion #1975 @julienledem
  In order to add a mechanism for deleting job and dataset facets, adds a `{ _deleted: true }` object that can take the place of any job or dataset facet (but not run or input/output facets, which are valid only for a specific run).
- Client: add a file transport #1891 @Alexkuva
  Creates a `FileTransport` and its configuration classes supporting append mode or write-new-file mode, which is especially useful when an object store does not support append mode, e.g. in the case of Databricks DBFS FUSE.
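
As an illustration of the facet deletion entry above, here is a minimal sketch of how a consumer might apply `{ _deleted: true }` when merging incoming job or dataset facets into previously stored ones. The `merge_facets` helper and the sample facet contents are hypothetical and are not part of any OpenLineage library; only the `_deleted` marker comes from the entry itself.

```python
from typing import Any, Dict


def merge_facets(current: Dict[str, Any], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Apply incoming job/dataset facets on top of the stored ones.

    Hypothetical helper: a facet whose value is {"_deleted": True}
    removes the stored facet of the same name instead of replacing it.
    """
    merged = dict(current)
    for name, facet in incoming.items():
        if isinstance(facet, dict) and facet.get("_deleted") is True:
            merged.pop(name, None)  # deletion marker: drop the stored facet
        else:
            merged[name] = facet  # regular facet: add or overwrite
    return merged


# Illustrative facet contents (assumed for this example).
stored = {
    "documentation": {"description": "nightly load"},
    "ownership": {"owners": [{"name": "team-a"}]},
}
update = {
    "documentation": {"_deleted": True},
    "sql": {"query": "SELECT 1"},
}
result = merge_facets(stored, update)
```

After the merge, `result` keeps `ownership`, gains `sql`, and no longer contains `documentation`. Run and input/output facets are left out of the sketch because, per the entry, they are valid only for a specific run.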
Changed
- Airflow: do not run plugin if OpenLineage provider is installed #1999 @JDarDagran
  Sets `OPENLINEAGE_DISABLED` to `true` if the provider is installed.
- Python: rename `config` to `config_class` #1998 @mobuchowski
  Renames the `config` class variable to `config_class` to avoid potential conflict with the config instance.
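
The naming conflict behind the `config_class` rename can be sketched as follows. The classes below are hypothetical stand-ins, not the actual OpenLineage client classes; they only show how an instance attribute named `config` shadows a class variable of the same name.

```python
class ExampleConfig:
    """Hypothetical configuration object."""

    def __init__(self, url: str) -> None:
        self.url = url


class OldStyleTransport:
    # Class variable naming the expected config type.
    config = ExampleConfig

    def __init__(self, config: ExampleConfig) -> None:
        # The instance attribute shadows the class variable of the same name.
        self.config = config


class NewStyleTransport:
    # Renamed class variable: no clash with the instance attribute.
    config_class = ExampleConfig

    def __init__(self, config: ExampleConfig) -> None:
        self.config = config


old = OldStyleTransport(ExampleConfig("http://localhost:5000"))
new = NewStyleTransport(ExampleConfig("http://localhost:5000"))

# On the old-style instance, `config` is the config object, so the
# config type is reachable only through the class itself.
old_lookup_is_instance = isinstance(old.config, ExampleConfig)
# On the new-style instance, both names are unambiguous.
new_lookup_is_class = new.config_class is ExampleConfig
```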
Fixed
- Airflow: add workaround for airflow-sqlalchemy event mechanism bug #1959 @mobuchowski
  Due to known issues with the fork and thread model in the Airflow-SQLAlchemy-based event-delivery mechanism, a Kafka producer left alone does not emit a `COMPLETE` event. The workaround creates a producer for each event when Airflow 2.3 to 2.5 is detected.
- Spark: fix custom environment variables facet #1973 @pawel-big-lebowski
  Enables sending the Spark environment variables facet in a non-deterministic way.
- Spark: filter unwanted Delta events #1968 @pawel-big-lebowski
  Clears events generated by logical plans having a `Project` node as root.
- Python: allow modification of `openlineage.*` logging levels via environment variables #1974 @JDarDagran
  Adds `OPENLINEAGE_{CLIENT/AIRFLOW/DBT}_LOGGING` environment variables that can be set according to module logging levels and cleans up some logging calls in `openlineage-airflow`.
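
To illustrate the logging entry above, here is a minimal sketch of how environment variables of this shape could be mapped onto Python logger levels. The `apply_logging_levels` helper and the exact logger names in `ENV_TO_LOGGER` are assumptions made for this example and may differ from what the packages actually do; only the variable names come from the entry.

```python
import logging
import os
from typing import Mapping

# Assumed mapping from environment variable to logger name.
ENV_TO_LOGGER = {
    "OPENLINEAGE_CLIENT_LOGGING": "openlineage.client",
    "OPENLINEAGE_AIRFLOW_LOGGING": "openlineage.airflow",
    "OPENLINEAGE_DBT_LOGGING": "openlineage.dbt",
}


def apply_logging_levels(env: Mapping[str, str] = os.environ) -> None:
    """Set logger levels from environment variables (hypothetical helper)."""
    for var, logger_name in ENV_TO_LOGGER.items():
        level = env.get(var)
        if not level:
            continue  # variable unset or empty: leave the logger untouched
        try:
            # setLevel accepts standard level names such as "DEBUG" or "INFO".
            logging.getLogger(logger_name).setLevel(level.upper())
        except ValueError:
            # Unknown level name: ignore it rather than fail at import time.
            pass


# Example: pass an explicit mapping instead of mutating os.environ.
apply_logging_levels({"OPENLINEAGE_CLIENT_LOGGING": "debug"})
```

Taking the environment as a parameter keeps the helper easy to exercise without changing process-wide state.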