1.9.1 - 2024-02-26
Important
This version adds the capability to publish Scala 2.12 and 2.13 variants of the Apache Spark integration, which necessitates a change in the artifact identifier for io.openlineage:openlineage-spark.
From this version onwards, please use:
io.openlineage:openlineage-spark_${SCALA_BINARY_VERSION}:${OPENLINEAGE_SPARK_VERSION}
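As a minimal sketch of how the new identifier is assembled, the Python helper below fills in the two placeholders from the template above. The function name is hypothetical and not part of OpenLineage; the resulting string is the kind of Maven coordinate that could be passed to a setting such as Spark's spark.jars.packages.

```python
def openlineage_spark_coordinate(scala_binary_version: str, openlineage_version: str) -> str:
    """Build the Maven coordinate for the Scala-specific openlineage-spark artifact.

    Hypothetical helper for illustration only. It fills in the template
    io.openlineage:openlineage-spark_${SCALA_BINARY_VERSION}:${OPENLINEAGE_SPARK_VERSION}.
    """
    return f"io.openlineage:openlineage-spark_{scala_binary_version}:{openlineage_version}"


# Example: this release, built for Scala 2.12
print(openlineage_spark_coordinate("2.12", "1.9.1"))
# prints: io.openlineage:openlineage-spark_2.12:1.9.1
```

The Scala binary version should match the one your Apache Spark runtime was compiled with.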
Added
- Airflow: add support for JobTypeJobFacet properties #2412 @mattiabertorello
  Adds support for Job type properties within the Airflow Job facet.
- dbt: add support for JobTypeJobFacet properties #2411 @mattiabertorello
  Adds support for Job type properties within the dbt Job facet.
- Flink: support Flink Kafka dynamic source and sink #2417 @HuangZhenQiu
  Adds support for Flink Kafka Table Connector use cases for topic and schema extraction.
- Flink: support multi-topic Kafka Sink #2372 @pawel-big-lebowski
  Adds support for multi-topic Kafka sinks. Limitations: recordSerializer needs to implement KafkaTopicsDescriptor. Please refer to the limitations section in the documentation.
- Flink: support lineage for JDBC connector #2436 @HuangZhenQiu
  Adds support for use cases that employ this connector.
- Flink: add common config gradle plugin #2461 @HuangZhenQiu
  Adds a common config Gradle plugin to simplify the Gradle files of Flink submodules.
- Java: extend circuit breaker loaded with ServiceLoader #2435 @pawel-big-lebowski
  Loads the circuit breaker builder with ServiceLoader as an addition to a list of implemented builders available within the existing package.
- Spark: integration now emits intermediate, application level events wrapping entire job execution #2371 @mobuchowski
  Previously, the Spark event model described only single actions, potentially linked only to some parent run. Closes #1672.
- Spark: support built-in lineage within DataSourceV2Relation #2394 @pawel-big-lebowski
  Enables built-in lineage extraction from DataSourceV2Relation lineage nodes.
- Spark: add support for JobTypeJobFacet properties #2410 @mattiabertorello
  Adds support for Job type properties within the Spark Job facet.
- Spark: stop sending spark.LogicalPlan facet by default #2433 @pawel-big-lebowski
  spark.LogicalPlan has been added to the default value of spark.openlineage.facets.disabled.
- Spark/Flink/Java: circuit breaker #2407 @pawel-big-lebowski
  Introduces a circuit breaker mechanism to prevent effects of over-instrumentation. Implemented within the Java client, it serves both the Flink and Spark integrations. Read the Java client README for more details.
- Spark: add the capability to publish Scala 2.12 and 2.13 variants of openlineage-spark #2446 @d-m-h
  Adds the capability to publish Scala 2.12 and 2.13 variants of openlineage-spark.
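Because spark.LogicalPlan is now part of the default value of spark.openlineage.facets.disabled, anyone who still needs that facet has to override the setting explicitly. The sketch below formats such an override in Python. It assumes the value is a bracketed, semicolon-separated list of facet names and that spark_unknown is a valid facet name. Neither assumption is stated in these notes, so check the Spark integration documentation before relying on them. The helper name is hypothetical.

```python
from typing import Iterable


def facets_disabled_value(facets: Iterable[str]) -> str:
    """Format facet names for spark.openlineage.facets.disabled.

    Assumption: the setting takes a bracketed, semicolon-separated list,
    for example [facet_a;facet_b]. Hypothetical helper for illustration.
    """
    return "[" + ";".join(facets) + "]"


# Assumption: spark_unknown is a valid facet name. Listing only that facet
# leaves the logical plan facet out of the disabled list.
value = facets_disabled_value(["spark_unknown"])
print(f"--conf spark.openlineage.facets.disabled={value}")
# prints: --conf spark.openlineage.facets.disabled=[spark_unknown]
```

On a shell command line the bracketed value may need quoting.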
Changed
- Spark: enable the app module to be compiled with Scala 2.12 and Scala 2.13 variants of Apache Spark #2432 @d-m-h
  The spark.binary.version and spark.version properties control which variant to build.
- Spark: enable Scala 2.13 support in the app module #2432 @d-m-h
  Enables the app module to be built using both Scala 2.12 and Scala 2.13 variants of various Apache Spark versions, and enables the CI/CD pipeline to build and test them.
- Spark: don't fail on exception of UnknownEntryFacet creation #2431 @mobuchowski
  Failure to generate UnknownEntryFacet was resulting in the event not being sent.
- Spark: move Snowflake code into the vendor projects folders #2405 @mattiabertorello
  Creates a vendor folder to isolate Snowflake-specific code from the main Spark integration, enhancing organization and flexibility.
Fixed
- Flink: resolve PMD rule violation warnings #2403 @HuangZhenQiu
  Resolves the PMD rule violation warnings in the Flink integration module.
- Flink: add the 'isReleaseVersion' property back to the build, enabling the Flink integration to be released #2468 @d-m-h
  The 'isReleaseVersion' property was removed from the build, preventing the Flink integration from being released.
- Python: fix issue with file config creating additional file #2447 @kacpermuda
  FileConfig was creating an additional file when not in append mode. Closes #2439.
- Python: fix issue with append option in file config #2441 @kacpermuda
  FileConfig was ignoring the append key in YAML config. Closes #2440.
- Spark: fix integration catalog symlink without warehouse #2379 @algorithmy1
  In the case of symlinked Glue Catalog Tables, the parsing method was producing dataset names identical to the namespace.
- Flink: fix IcebergSourceWrapper for Iceberg connector 1.17 #2409 @ensctom
  In Flink 1.17, the Iceberg catalogloader was loading the catalog in the open function, causing the loadTable method to throw a NullPointerException error.
- Spark: migrate spark35, spark3, shared modules to produce Scala 2.12 and Scala 2.13 variants #2390 #2385 #2384 @d-m-h
  Migrates the three modules to use the refactored Gradle plugins. Also splits some tests into Scala 2.12- and Scala 2.13-specific versions.
- Spark: conform the spark2 module to the new build process #2391 @d-m-h
  Due to a change in the Scala Collections API in Scala 2.13, NoSuchMethodErrors were being thrown when running the openlineage-spark connector in an Apache Spark runtime compiled using Scala 2.13.