0.21.1 - 2023-03-02
Added
- Clients: add DEBUG logging of events to transports #1633 @mobuchowski
  Ensures that the DEBUG loglevel on properly configured loggers will always log events, regardless of the chosen transport.
- Spark: add CustomEnvironmentFacetBuilder class #1545 New contributor @Anirudh181001
  Enables the capture of custom environment variables from Spark.
- Spark: introduce the new output visitors AlterTableAddPartitionCommandVisitor and AlterTableSetLocationCommandVisitor #1629 New contributor @nataliezeller1
  Adds visitors for extracting table names from the Spark commands AlterTableAddPartitionCommand and AlterTableSetLocationCommand. The intended use case is a custom transport for the OpenMetadata lineage API.
- Spark: add column lineage for JDBC relations #1636 @tnazarew
  Adds column lineage information to JDBC events with data extracted from the query by the SQL parser.
- SQL: add linux-aarch64 native library to Java SQL parser #1664 @mobuchowski
  Adds a Linux-ARM version of the native library. Previously, the Java SQL parser interface had only Linux-x64 and macOS universal binary variants.
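
As a rough illustration of the DEBUG logging entry above, the sketch below sets a logger to the DEBUG level with Python's standard logging module and logs each event before it is handed to a transport. The logger name openlineage.client and the emit function are assumptions made for this example, not the client's confirmed API.

```python
import io
import logging

# Hypothetical logger name; the real client may use a different one.
logger = logging.getLogger("openlineage.client")


def configure_debug_logging(stream):
    """Attach a stream handler and set the DEBUG level on the logger."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return handler


def emit(event, transport_name):
    """Stand-in for a client emit call: log the event, whatever the transport."""
    logger.debug("emitting event via %s: %s", transport_name, event)


# Capture the log output in memory so it can be inspected.
stream = io.StringIO()
configure_debug_logging(stream)
emit({"eventType": "START"}, "http")
emit({"eventType": "COMPLETE"}, "console")
output = stream.getvalue()
```

Because the logging call sits in the shared emit path rather than in a specific transport, both events appear in the log regardless of which transport name is passed.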
Changed
- Airflow: get table database in Athena extractor #1631 New contributor @rinzool
  Changes the extractor to get a table's database from the table.schema field or the operator default if the field is None.
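
The fallback described in the Athena entry can be sketched as follows. The Table dataclass and the resolve_database function are hypothetical stand-ins for illustration; they are not the extractor's actual types or method names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Table:
    # Simplified, hypothetical stand-in for a parsed table reference.
    name: str
    schema: Optional[str] = None


def resolve_database(table: Table, operator_default: str) -> str:
    """Prefer the table's own schema; fall back to the operator default when it is None."""
    return table.schema if table.schema is not None else operator_default


qualified = resolve_database(Table("orders", schema="sales"), "default_db")
unqualified = resolve_database(Table("orders"), "default_db")
```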
Fixed
- dbt: add dbt seed to the list of dbt-ol events #1649 New contributor @pohek321
  Ensures that dbt-ol test no longer fails when run against an event seed.
- Spark: make column lineage extraction in Spark support caching #1634 @pawel-big-lebowski
  Collects column lineage from Spark logical plans that contain cached datasets.
- Spark: add support for a deprecated config #1586 @tnazarew
  Maps the deprecated spark.openlineage.url to spark.openlineage.transport.url.
- Spark: add error message in case of null in url #1590 @tnazarew
  Improves error logging in the case of undefined URLs.
- Spark: collect complete event for really quick Spark jobs #1650 @pawel-big-lebowski
  Improves the collection of OpenLineage events on SQL complete in the case of quick operations.
- Spark: fix input/outputs for one node LogicalRelation plans #1668 @pawel-big-lebowski
  For simple queries like select col1, col2 from my_db.my_table that do not write output, the Spark plan contained just a single node, which was wrongly treated as both an input and an output dataset.
- SQL: fix file existence check in build script for openlineage-sql-java #1613 @sekikn
  Ensures that the build script works if the library is compiled solely for Linux.
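
A minimal sketch of the deprecated-config mapping listed above, assuming a plain dictionary of Spark settings. The normalize_conf function, the example URL, and the rule that an explicitly set new key takes precedence are assumptions for illustration, not the integration's confirmed behavior.

```python
# Deprecated key -> replacement key, as named in the entry above.
DEPRECATED_KEYS = {"spark.openlineage.url": "spark.openlineage.transport.url"}


def normalize_conf(conf: dict) -> dict:
    """Return a copy of conf with deprecated keys renamed.

    Assumption: if the new key is already set, its value is kept.
    """
    result = dict(conf)
    for old_key, new_key in DEPRECATED_KEYS.items():
        if old_key in result:
            value = result.pop(old_key)
            result.setdefault(new_key, value)
    return result


# Hypothetical URLs used only for this example.
migrated = normalize_conf({"spark.openlineage.url": "http://localhost:5000"})
both_set = normalize_conf({
    "spark.openlineage.url": "http://old:5000",
    "spark.openlineage.transport.url": "http://new:5000",
})
```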
Removed
- Airflow: remove JobIdMapping and update macros to better support Airflow version 2+ #1645 @JDarDagran
  Updates macros to use OpenLineageAdapter's method to generate deterministic run UUIDs because using the JobIdMapping utility is incompatible with Airflow 2+.
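
To illustrate what "deterministic run UUIDs" means here, the sketch below derives a name-based UUID from a DAG ID and a DAG run ID using Python's uuid module. This is one possible approach under stated assumptions; the function name, the namespace prefix, and the choice of uuid3 are illustrative and may not match how OpenLineageAdapter computes its IDs.

```python
import uuid

# Hypothetical namespace prefix used only in this sketch.
NAMESPACE_PREFIX = "example-namespace"


def deterministic_run_id(dag_id: str, dag_run_id: str) -> str:
    """Derive a name-based (version 3) UUID, so equal inputs give equal IDs."""
    name = f"{NAMESPACE_PREFIX}.{dag_id}.{dag_run_id}"
    return str(uuid.uuid3(uuid.NAMESPACE_URL, name))


first = deterministic_run_id("etl_orders", "scheduled__2023-03-02")
second = deterministic_run_id("etl_orders", "scheduled__2023-03-02")
other = deterministic_run_id("etl_orders", "scheduled__2023-03-03")
```

Because the ID is a pure function of its inputs, a macro and a listener can compute the same run ID independently, without a stored lookup table such as a job-ID mapping.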