0.21.1 - 2023-03-02
Added
- Clients: add DEBUG logging of events to transports #1633 @mobuchowski
  Ensures that the DEBUG log level on properly configured loggers will always log events, regardless of the chosen transport.
- Spark: add CustomEnvironmentFacetBuilder class #1545 New contributor @Anirudh181001
  Enables the capture of custom environment variables from Spark.
- Spark: introduce the new output visitors AlterTableAddPartitionCommandVisitor and AlterTableSetLocationCommandVisitor #1629 New contributor @nataliezeller1
  Adds visitors for extracting table names from the Spark commands AlterTableAddPartitionCommand and AlterTableSetLocationCommand. The intended use case is a custom transport for the OpenMetadata lineage API.
- Spark: add column lineage for JDBC relations #1636 @tnazarew
  Adds column lineage information to JDBC events with data extracted from the query by the SQL parser.
- SQL: add linux-aarch64 native library to Java SQL parser #1664 @mobuchowski
  Adds a Linux-ARM version of the native library. Previously, the Java SQL parser interface had only Linux-x64 and macOS universal binary variants.
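The DEBUG logging entry above describes events being logged regardless of the chosen transport. A minimal Python sketch of that idea follows. It is not the client's actual implementation: the LoggingTransport and NoopTransport classes and the logger name example.lineage are hypothetical, and the only assumption about a transport is that it exposes an emit(event) method.

```python
import logging


class NoopTransport:
    """Hypothetical stand-in transport that simply drops events."""

    def emit(self, event):
        return None


class LoggingTransport:
    """Hypothetical wrapper: logs each event at DEBUG, then delegates."""

    def __init__(self, inner, logger_name="example.lineage"):
        # 'inner' is any object with an emit(event) method (assumed interface).
        self.inner = inner
        self.log = logging.getLogger(logger_name)

    def emit(self, event):
        # Logging happens before delegation, so the record is produced
        # no matter which transport ends up handling the event.
        self.log.debug("emitting event: %r", event)
        return self.inner.emit(event)
```

With the logger level set to DEBUG and a handler attached, every call to emit produces one log record, whichever transport is wrapped.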
Changed
- Airflow: get table database in Athena extractor #1631 New contributor @rinzool
  Changes the extractor to get a table's database from the table.schema field or the operator default if the field is None.
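The fallback described in this entry can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the extractor's code: ParsedTable and resolve_database are hypothetical names, and the operator default is passed in as a plain string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedTable:
    """Hypothetical stand-in for a table reference found in a query."""

    name: str
    schema: Optional[str] = None


def resolve_database(table: ParsedTable, operator_default: str) -> str:
    # Prefer the schema attached to the table; fall back to the
    # operator's default database only when the field is None.
    return table.schema if table.schema is not None else operator_default
```

For example, resolve_database(ParsedTable("orders"), "analytics") falls back to "analytics", while a table carrying its own schema keeps it.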
Fixed
- dbt: add dbt seed to the list of dbt-ol events #1649 New contributor @pohek321
  Ensures that dbt-ol test no longer fails when run against an event seed.
- Spark: make column lineage extraction in Spark support caching #1634 @pawel-big-lebowski
  Collects column lineage from Spark logical plans that contain cached datasets.
- Spark: add support for a deprecated config #1586 @tnazarew
  Maps the deprecated spark.openlineage.url to spark.openlineage.transport.url.
- Spark: add error message in case of null in url #1590 @tnazarew
  Improves error logging in the case of undefined URLs.
- Spark: collect complete event for really quick Spark jobs #1650 @pawel-big-lebowski
  Improves the collection of OpenLineage events on SQL complete for quick operations.
- Spark: fix input/outputs for one node LogicalRelation plans #1668 @pawel-big-lebowski
  For simple queries like "select col1, col2 from my_db.my_table" that do not write output, the Spark plan contained just a single node, which was wrongly treated as both an input and an output dataset.
- SQL: fix file existence check in build script for openlineage-sql-java #1613 @sekikn
  Ensures that the build script works if the library is compiled solely for Linux.
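The deprecated-config entry above maps spark.openlineage.url to spark.openlineage.transport.url. A rough Python sketch of such a key mapping is shown below. The normalize_conf helper and the rule that an explicitly set new key wins over the deprecated one are assumptions made for illustration; the integration itself may resolve conflicts differently.

```python
# Mapping taken from the changelog entry: deprecated key -> current key.
DEPRECATED_KEYS = {"spark.openlineage.url": "spark.openlineage.transport.url"}


def normalize_conf(conf: dict) -> dict:
    """Return a copy of conf with deprecated keys renamed (illustrative only)."""
    result = dict(conf)
    for old, new in DEPRECATED_KEYS.items():
        if old in result:
            value = result.pop(old)
            # Assumption: a value already set under the new key takes precedence.
            result.setdefault(new, value)
    return result
```

Calling normalize_conf on a dictionary that contains only the deprecated key returns the same value under the new key and leaves the input dictionary unchanged.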
Removed
- Airflow: remove JobIdMapping and update macros to better support Airflow version 2+ #1645 @JDarDagran
  Updates macros to use OpenLineageAdapter's method to generate deterministic run UUIDs because using the JobIdMapping utility is incompatible with Airflow 2+.
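A deterministic run UUID is one that can be recomputed from task identifiers instead of being stored in a mapping. The Python sketch below shows the general technique only. The deterministic_run_id helper, the fields it combines, and the use of uuid5 are assumptions for illustration and are not taken from OpenLineageAdapter.

```python
import uuid


def deterministic_run_id(dag_id: str, task_id: str, execution_date: str, try_number: int) -> str:
    # Hypothetical helper: the chosen fields and the uuid5 namespace are
    # assumptions. Identical inputs always produce the identical UUID, so
    # no lookup table from job to run ID is required.
    name = f"{dag_id}.{task_id}.{execution_date}.{try_number}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, name))
```

Because the ID is a pure function of its inputs, a macro and a listener that see the same task instance can each derive the same run ID independently.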