1.9.1 - 2024-02-26
Important: This version adds the capability to publish Scala 2.12 and 2.13 variants of Apache Spark, which necessitates a change in the artifact identifier for `io.openlineage:openlineage-spark`. From this version onwards, please use `io.openlineage:openlineage-spark_${SCALA_BINARY_VERSION}:${OPENLINEAGE_SPARK_VERSION}`.
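For example, a build targeting Scala 2.13 and this release would declare the dependency against the Scala-suffixed artifact. A minimal sbt sketch; the `2.13` suffix and `1.9.1` version are placeholders for your own Scala binary version and target OpenLineage version:

```scala
// build.sbt -- declare the Scala-suffixed artifact explicitly
libraryDependencies += "io.openlineage" % "openlineage-spark_2.13" % "1.9.1"

// or let sbt append the project's Scala binary version automatically
libraryDependencies += "io.openlineage" %% "openlineage-spark" % "1.9.1"
```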
Added
- Airflow: add support for `JobTypeJobFacet` properties #2412 @mattiabertorello
  Adds support for Job type properties within the Airflow Job facet.
- dbt: add support for `JobTypeJobFacet` properties #2411 @mattiabertorello
  Adds support for Job type properties within the dbt Job facet.
- Flink: support Flink Kafka dynamic source and sink #2417 @HuangZhenQiu
  Adds support for Flink Kafka Table Connector use cases for topic and schema extraction.
- Flink: support multi-topic Kafka Sink #2372 @pawel-big-lebowski
  Adds support for multi-topic Kafka sinks. Limitations: `recordSerializer` needs to implement `KafkaTopicsDescriptor`. Please refer to the limitations section in the documentation.
- Flink: support lineage for JDBC connector #2436 @HuangZhenQiu
  Adds support for use cases that employ this connector.
- Flink: add common config gradle plugin #2461 @HuangZhenQiu
  Adds a common config Gradle plugin to simplify the Gradle files of the Flink submodules.
- Java: extend circuit breaker loaded with `ServiceLoader` #2435 @pawel-big-lebowski
  Loads the circuit breaker builder with `ServiceLoader` as an addition to the list of implemented builders available within the existing package.
- Spark: integration now emits intermediate, application-level events wrapping the entire job execution #2371 @mobuchowski
  Previously, the Spark event model described only single actions, potentially linked only to some parent run. Closes #1672.
- Spark: support built-in lineage within `DataSourceV2Relation` #2394 @pawel-big-lebowski
  Enables built-in lineage extraction from `DataSourceV2Relation` lineage nodes.
- Spark: add support for `JobTypeJobFacet` properties #2410 @mattiabertorello
  Adds support for Job type properties within the Spark Job facet.
- Spark: stop sending the `spark.LogicalPlan` facet by default #2433 @pawel-big-lebowski
  `spark.LogicalPlan` has been added to the default value of `spark.openlineage.facets.disabled`; see the configuration sketch after this list.
- Spark/Flink/Java: circuit breaker #2407 @pawel-big-lebowski
  Introduces a circuit breaker mechanism to prevent the effects of over-instrumentation. Implemented within the Java client, it serves both the Flink and Spark integrations. Read the Java client README for more details.
- Spark: add the capability to publish Scala 2.12 and 2.13 variants of `openlineage-spark` #2446 @d-m-h
  Adds the capability to publish Scala 2.12 and 2.13 variants of `openlineage-spark`.
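To control logical-plan emission explicitly, the disabled-facets list can be overridden when the SparkSession is configured. A minimal Scala sketch, assuming the bracketed, semicolon-separated value format used by `spark.openlineage.facets.disabled`; verify the exact syntax against the Spark integration configuration reference:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("openlineage-facets-example")
  // Keeps the new default behaviour explicit: the spark.LogicalPlan facet stays disabled.
  // Removing it from this list (while keeping any other entries you rely on) re-enables it.
  .config("spark.openlineage.facets.disabled", "[spark.LogicalPlan]")
  .getOrCreate()
```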
Changed
- Spark: enable the `app` module to be compiled with Scala 2.12 and Scala 2.13 variants of Apache Spark #2432 @d-m-h
  The `spark.binary.version` and `spark.version` properties control which variant to build.
- Spark: enable Scala 2.13 support in the `app` module #2432 @d-m-h
  Enables the `app` module to be built using both Scala 2.12 and Scala 2.13 variants of various Apache Spark versions, and enables the CI/CD pipeline to build and test them.
- Spark: don't fail on exception of `UnknownEntryFacet` creation #2431 @mobuchowski
  Failure to generate `UnknownEntryFacet` was resulting in the event not being sent.
- Spark: move Snowflake code into the vendor projects folders #2405 @mattiabertorello
  Creates a `vendor` folder to isolate Snowflake-specific code from the main Spark integration, enhancing organization and flexibility.
Fixed
- Flink: resolve PMD rule violation warnings #2403 @HuangZhenQiu
  Resolves the PMD rule violation warnings in the Flink integration module.
- Flink: add the `isReleaseVersion` property back to the build, enabling the Flink integration to be released #2468 @d-m-h
  The `isReleaseVersion` property had been removed from the build, preventing the Flink integration from being released.
- Python: fix issue with file config creating an additional file #2447 @kacpermuda
  `FileConfig` was creating an additional file when not in append mode. Closes #2439.
- Python: fix issue with append option in file config #2441 @kacpermuda
  `FileConfig` was ignoring the append key in YAML config. Closes #2440.
- Spark: fix integration catalog symlink without warehouse #2379 @algorithmy1
  In the case of symlinked Glue Catalog Tables, the parsing method was producing dataset names identical to the namespace.
- Flink: fix `IcebergSourceWrapper` for Iceberg connector 1.17 #2409 @ensctom
  In Flink 1.17, the Iceberg `catalogloader` was loading the catalog in the open function, causing the `loadTable` method to throw a `NullPointerException`.
- Spark: migrate the `spark35`, `spark3`, and `shared` modules to produce Scala 2.12 and Scala 2.13 variants #2390 #2385 #2384 @d-m-h
  Migrates the three modules to use the refactored Gradle plugins. Also splits some tests into Scala 2.12- and Scala 2.13-specific versions.
- Spark: conform the `spark2` module to the new build process #2391 @d-m-h
  Due to a change in the Scala Collections API in Scala 2.13, `NoSuchMethodError`s were being thrown when running the `openlineage-spark` connector in an Apache Spark runtime compiled using Scala 2.13.