0.21.1 - 2023-03-02

Added

  • Clients: add DEBUG logging of events to transports #1633 @mobuchowski
    Ensures that the DEBUG loglevel on properly configured loggers will always log events, regardless of the chosen transport.
  • Spark: add CustomEnvironmentFacetBuilder class #1545 New contributor @Anirudh181001
    Enables the capture of custom environment variables from Spark.
  • Spark: introduce the new output visitors AlterTableAddPartitionCommandVisitor and AlterTableSetLocationCommandVisitor #1629 New contributor @nataliezeller1
    Adds visitors for extracting table names from the Spark commands AlterTableAddPartitionCommand and AlterTableSetLocationCommand. The intended use case is a custom transport for the OpenMetadata lineage API.
  • Spark: add column lineage for JDBC relations #1636 @tnazarew
    Adds column lineage information to JDBC events, using data that the SQL parser extracts from the query.
  • SQL: add linux-aarch64 native library to Java SQL parser #1664 @mobuchowski
    Adds a Linux-ARM version of the native library. Previously, the Java SQL parser interface shipped only Linux-x64 and macOS universal binary variants.
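
For the DEBUG logging entry in the list above, the following is a minimal sketch of what a "properly configured" logger could look like on the Python side. It uses only the standard `logging` module. The logger name `openlineage` and the helper `enable_debug_logging` are assumptions made for illustration; check which logger names your installed client actually uses.

```python
import logging

# Assumption: the client's loggers live under the "openlineage" namespace.
LOGGER_NAME = "openlineage"


def enable_debug_logging(name: str = LOGGER_NAME) -> logging.Logger:
    """Set the named logger to DEBUG and attach a stream handler if it has none."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setLevel(logging.DEBUG)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = enable_debug_logging()
```

Child loggers such as `openlineage.client` (a hypothetical name) inherit the DEBUG level from the parent unless they set their own level.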

Changed

  • Airflow: get table database in Athena extractor #1631 New contributor @rinzool
    Changes the extractor to get a table's database from the table.schema field or the operator default if the field is None.
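
The fallback described above can be sketched as follows. This is an illustration of the rule only, not the extractor's actual code: the `Table` class, the `resolve_database` function, and the `operator_default` parameter are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Table:
    """Hypothetical stand-in for a parsed table reference."""
    name: str
    schema: Optional[str] = None


def resolve_database(table: Table, operator_default: str) -> str:
    """Prefer the table's own schema; fall back to the operator default if it is None."""
    return table.schema if table.schema is not None else operator_default


qualified = resolve_database(Table("orders", schema="sales"), "default_db")
unqualified = resolve_database(Table("orders"), "default_db")
```

Here `qualified` resolves to `sales` and `unqualified` falls back to `default_db`.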

Fixed

  • dbt: add dbt seed to the list of dbt-ol events #1649 New contributor @pohek321
    Ensures that dbt-ol test no longer fails when run against an event seed.
  • Spark: make column lineage extraction in Spark support caching #1634 @pawel-big-lebowski
    Collects column lineage from Spark logical plans that contain cached datasets.
  • Spark: add support for a deprecated config #1586 @tnazarew
    Maps the deprecated spark.openlineage.url to spark.openlineage.transport.url.
  • Spark: add error message in case of null in url #1590 @tnazarew
    Improves error logging in the case of undefined URLs.
  • Spark: collect complete event for really quick Spark jobs #1650 @pawel-big-lebowski
    Improves the collection of OpenLineage events on SQL completion for operations that finish very quickly.
  • Spark: fix input/outputs for one node LogicalRelation plans #1668 @pawel-big-lebowski
    Fixes input and output detection for simple queries such as select col1, col2 from my_db.my_table that do not write output. The Spark plan for such a query contained just a single node, which was wrongly treated as both an input and an output dataset.
  • SQL: fix file existence check in build script for openlineage-sql-java #1613 @sekikn
    Ensures that the build script works if the library is compiled solely for Linux.
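
The deprecated-config entry in the list above maps spark.openlineage.url to spark.openlineage.transport.url. The idea can be sketched as below, written in Python for brevity even though the Spark integration itself runs on the JVM. The function `normalize_conf` and the precedence rule (an explicitly set new key wins over the deprecated one) are assumptions for illustration, not the integration's confirmed behavior.

```python
from typing import Dict

# Mapping taken from the release note: deprecated key -> current key.
DEPRECATED_KEYS: Dict[str, str] = {
    "spark.openlineage.url": "spark.openlineage.transport.url",
}


def normalize_conf(conf: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of conf with deprecated keys renamed to their current names."""
    result = dict(conf)
    for old_key, new_key in DEPRECATED_KEYS.items():
        if old_key in result:
            value = result.pop(old_key)
            # Assumption: if the new key is already set, keep it and drop the old value.
            result.setdefault(new_key, value)
    return result


normalized = normalize_conf({"spark.openlineage.url": "http://localhost:5000"})
```

After this call, `normalized` contains only the spark.openlineage.transport.url key, carrying the value that was supplied under the deprecated name.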

Removed

  • Airflow: remove JobIdMapping and update macros to better support Airflow version 2+ #1645 @JDarDagran
    Updates macros to use OpenLineageAdapter's method to generate deterministic run UUIDs because using the JobIdMapping utility is incompatible with Airflow 2+.
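
To illustrate what "deterministic run UUIDs" means here, the sketch below derives a UUID from identifying fields of a task run, so that the same inputs always produce the same ID without a stored mapping. It is not the OpenLineageAdapter method: the function name, the choice of fields, and the use of `uuid.uuid5` with `uuid.NAMESPACE_URL` are all assumptions.

```python
import uuid


def deterministic_run_id(dag_id: str, task_id: str, execution_date: str, try_number: int) -> str:
    """Derive a stable UUID from the fields that identify one task run (hypothetical scheme)."""
    key = f"{dag_id}.{task_id}.{execution_date}.{try_number}"
    # uuid5 hashes the namespace and name, so equal inputs give equal UUIDs.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, key))


first = deterministic_run_id("etl", "load", "2023-03-02T00:00:00", 1)
second = deterministic_run_id("etl", "load", "2023-03-02T00:00:00", 1)
retry = deterministic_run_id("etl", "load", "2023-03-02T00:00:00", 2)
```

Because the ID is a pure function of its inputs, a macro and a listener running in different processes can compute the same run ID independently, which a stored mapping cannot guarantee.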