0.30.1 - 2023-07-25

Added

  • Flink: support Iceberg sinks #1960 @pawel-big-lebowski
    Detects output datasets when using an Iceberg table as a sink.
  • Spark: column-level lineage for merge into on Delta tables #1958 @pawel-big-lebowski
    Makes column-level lineage support merge into on Delta tables. Also refactors column-level lineage to deal with multiple Spark versions.
  • Spark: column-level lineage for merge into on Iceberg tables #1971 @pawel-big-lebowski
    Makes column-level lineage support merge into on Iceberg tables.
  • Spark: add support for Iceberg REST catalog #1963 @juancappi
    Adds rest to the existing options of hive and hadoop in IcebergHandler.getDatasetIdentifier() to add support for Iceberg's RestCatalog.
  • Airflow: add possibility to force direct-execution based on environment variable #1934 @mobuchowski
    Adds the option to use the direct-execution method in the Airflow listener when a non-SQLAlchemy-based Airflow event mechanism is known to be in use. This is the case when using Airflow 2.6 or when the OPENLINEAGE_AIRFLOW_ENABLE_DIRECT_EXECUTION environment variable is set.
  • SQL: add support for Apple Silicon to openlineage-sql-java #1981 @davidjgoss
    Expands the OS/architecture checks when compiling to produce a specific file for Apple Silicon. Also expands the corresponding OS/architecture checks when loading the binary at runtime from Java code.
  • Spec: add facet deletion #1975 @julienledem
    Adds a mechanism for deleting job and dataset facets: a { _deleted: true } object can take the place of any job or dataset facet (but not of run or input/output facets, which are valid only for a specific run).
  • Client: add a file transport #1891 @Alexkuva
    Creates a FileTransport and its configuration classes supporting append mode or write-new-file mode, which is especially useful when an object store does not support append mode, e.g. in the case of Databricks DBFS FUSE.
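
The difference between the two file transport modes can be sketched with the standard library alone. This is a minimal illustration, not the client's FileTransport API: the helper name `write_event`, its parameters, and the file-naming scheme are hypothetical.

```python
import json
import tempfile
import uuid
from pathlib import Path


def write_event(event: dict, base_path: Path, append: bool) -> Path:
    """Write one event as a JSON line and return the file that was written.

    Hypothetical helper for illustration only; not the client's FileTransport.
    """
    if append:
        # Append mode: every event goes to the same file, one JSON document per line.
        target = base_path
        mode = "a"
    else:
        # Write-new-file mode: each event gets its own uniquely named file,
        # so the underlying store never has to support appends.
        target = base_path.with_name(f"{base_path.name}-{uuid.uuid4().hex}.json")
        mode = "w"
    with target.open(mode, encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "events"
        # Two events in append mode end up in a single file.
        write_event({"eventType": "START"}, base, append=True)
        write_event({"eventType": "COMPLETE"}, base, append=True)
        # Two events in write-new-file mode end up in two separate files.
        first = write_event({"eventType": "START"}, base, append=False)
        second = write_event({"eventType": "COMPLETE"}, base, append=False)
        print(base.name, first.name != second.name)
```

Write-new-file mode trades a larger number of small files for compatibility with stores where appending to an existing object fails.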

Changed

  • Airflow: do not run plugin if OpenLineage provider is installed #1999 @JDarDagran
    Sets OPENLINEAGE_DISABLED to true if the provider is installed.
  • Python: rename config to config_class #1998 @mobuchowski
    Renames the config class variable to config_class to avoid potential conflict with the config instance.
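
The kind of conflict the rename avoids can be shown with a small sketch. The class names `ExampleConfig`, `BeforeRename`, and `AfterRename` and the placeholder URL are hypothetical and only illustrate the naming clash between a class-level attribute and an instance attribute; they are not taken from the client code.

```python
from dataclasses import dataclass


@dataclass
class ExampleConfig:
    # Hypothetical configuration type used only for this illustration.
    url: str


class BeforeRename:
    # Class-level attribute that points at the configuration *class*.
    config = ExampleConfig

    def __init__(self, config: ExampleConfig) -> None:
        # The instance attribute has the same name and shadows the class attribute.
        self.config = config


class AfterRename:
    # The class-level attribute has its own name, so nothing is shadowed.
    config_class = ExampleConfig

    def __init__(self, config: ExampleConfig) -> None:
        self.config = config


before = BeforeRename(ExampleConfig(url="http://example.invalid"))
after = AfterRename(ExampleConfig(url="http://example.invalid"))

# On the instance, `config` resolves to the ExampleConfig instance, not the class.
print(before.config is ExampleConfig)       # False
# With the rename, both the class and the instance stay reachable from the instance.
print(after.config_class is ExampleConfig)  # True
print(isinstance(after.config, after.config_class))  # True
```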

Fixed

  • Airflow: add workaround for airflow-sqlalchemy event mechanism bug #1959 @mobuchowski
    Due to known issues with the fork and thread model in the Airflow-SQLAlchemy-based event-delivery mechanism, a Kafka producer left alone does not emit a COMPLETE event. The workaround creates a new producer for each event when Airflow 2.3 - 2.5 is detected.
  • Spark: fix custom environment variables facet #1973 @pawel-big-lebowski
    Enables sending the Spark environment variables facet in a non-deterministic way.
  • Spark: filter unwanted Delta events #1968 @pawel-big-lebowski
    Clears events generated by logical plans having Project node as root.
  • Python: allow modification of openlineage.* logging levels via environment variables #1974 @JDarDagran
    Adds OPENLINEAGE_{CLIENT/AIRFLOW/DBT}_LOGGING environment variables that set the logging level of the corresponding module, and cleans up some logging calls in openlineage-airflow.
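
As a rough sketch of how environment variables can drive per-module logging levels, the snippet below maps each OPENLINEAGE_{CLIENT/AIRFLOW/DBT}_LOGGING variable onto a standard-library logger. The helper name `configure_openlineage_logging` is hypothetical, and the logger names `openlineage.client`, `openlineage.airflow`, and `openlineage.dbt` are assumed from the `openlineage.*` wording above; the actual integration code may differ.

```python
import logging
import os

# Assumed variable-suffix-to-logger mapping; the release note only names the
# OPENLINEAGE_{CLIENT/AIRFLOW/DBT}_LOGGING variables and "openlineage.*" loggers.
MODULES = ("client", "airflow", "dbt")


def configure_openlineage_logging(environ=None) -> dict:
    """Apply levels from the environment and return the levels that were set.

    Hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ
    applied = {}
    for module in MODULES:
        level_name = environ.get(f"OPENLINEAGE_{module.upper()}_LOGGING")
        if not level_name:
            continue  # variable not set: leave the logger untouched
        # getLevelName returns the numeric level for a registered level name
        # and a "Level ..." string for anything it does not recognise.
        level = logging.getLevelName(level_name.upper())
        if not isinstance(level, int):
            continue  # unknown level name: ignore it
        logger_name = f"openlineage.{module}"
        logging.getLogger(logger_name).setLevel(level)
        applied[logger_name] = level
    return applied


if __name__ == "__main__":
    print(configure_openlineage_logging({"OPENLINEAGE_CLIENT_LOGGING": "DEBUG"}))
```

Passing a mapping instead of reading `os.environ` directly keeps the sketch easy to exercise without touching the real process environment.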