Skip to main content
Version: 1.24.2

1.9.1 - 2024-02-26

important

This version adds the capability to publish Scala 2.12 and 2.13 variants of Apache Spark, which necessitates a change in the artifact identifier for io.openlineage:openlineage-spark. From this version onwards, please use:
io.openlineage:openlineage-spark_${SCALA_BINARY_VERSION}:${OPENLINEAGE_SPARK_VERSION}.

Added

  • Airflow: add support for JobTypeJobFacet properties #2412 @mattiabertorello
    Adds support for Job type properties within the Airflow Job facet.
  • dbt: add support for JobTypeJobFacet properties #2411 @mattiabertorello
    Support Job type properties within the DBT Job facet.
  • Flink: support Flink Kafka dynamic source and sink #2417 @HuangZhenQiu
    Adds support for Flink Kafka Table Connector use cases for topic and schema extraction.
  • Flink: support multi-topic Kafka Sink #2372 @pawel-big-lebowski
    Adds support for multi-topic Kafka sinks. Limitations: recordSerializer needs to implement KafkaTopicsDescriptor. Please refer to the limitations sections in documentation.
  • Flink: support lineage for JDBC connector #2436 @HuangZhenQiu
    Adds support for use cases that employ this connector.
  • Flink: add common config gradle plugin #2461 @HuangZhenQiu
    Add common config gradle plugin to simplify gradle files of Flink submodules.
  • Java: extend circuit breaker loaded with ServiceLoader #2435 @pawel-big-lebowski
    Loads the circuit breaker builder with ServiceLoader as an addition to a list of implemented builders available within the existing package.
  • Spark: integration now emits intermediate, application level events wrapping entire job execution #2371 @mobuchowski
    Previously, the Spark event model described only single actions, potentially linked only to some parent run. Closes #1672.
  • Spark: support built-in lineage within DataSourceV2Relation #2394 @pawel-big-lebowski
    Enables built-in lineage extraction within from DataSourceV2Relation lineage nodes.
  • Spark: add support for JobTypeJobFacet properties #2410 @mattiabertorello
    Adds support for Job type properties within the Spark Job facet.
  • Spark: stop sending spark.LogicalPlan facet by default #2433 @pawel-big-lebowski
    spark.LogicalPlan has been added to default value of spark.openlineage.facets.disabled.
  • Spark/Flink/Java: circuit breaker #2407 @pawel-big-lebowski
    Introduces a circuit breaker mechanism to prevent effects of over-instrumentation. Implemented within Java client, it serves both the Flink and Spark integration. Read the Java client README for more details.
  • Spark: add the capability to publish Scala 2.12 and 2.13 variants of openlineage-spark #2446 @d-m-h
    Adds the capability to publish Scala 2.12 and 2.13 variants of openlineage-spark

Changed

  • Spark: enable the app module to be compiled with Scala 2.12 and Scala 2.13 variants of Apache Spark #2432 @d-m-h
    The spark.binary.version and spark.version properties control which variant to build.
  • Spark: enable Scala 2.13 support in the app module #2432 @d-m-h
    Enables the app module to be built using both Scala 2.12 and Scala 2.13 variants of various Apache Spark versions, and enables the CI/CD pipeline to build and test them.
  • Spark: don't fail on exception of UnknownEntryFacet creation #2431 @mobuchowski
    Failure to generate UnknownEntryFacet was resulting in the event not being sent.
  • Spark: move Snowflake code into the vendor projects folders #2405 @mattiabertorello
    Creates a vendor folder to isolate Snowflake-specific code from the main Spark integration, enhancing organization and flexibility.

Fixed

  • Flink: resolve PMD rule violation warnings #2403 @HuangZhenQiu
    Resolves the PMD rule violation warnings in the Flink integration module.
  • Flink: Added the 'isReleaseVersion' property back to the build, enabling the Flink integration to be release #2468 @d-m-h
    The 'isReleaseVersion' property was removed from the build, preventing the Flink integration from being released.
  • Python: fix issue with file config creating additional file #2447 @kacpermuda
    FileConfig was creating an additional file when not in append mode. Closes #2439.
  • Python: fix issue with append option in file config #2441 @kacpermuda
    FileConfig was ignoring the append key in YAML config. Closes #2440
  • Spark: fix integration catalog symlink without warehouse #2379 @algorithmy1
    In the case of symlinked Glue Catalog Tables, the parsing method was producing dataset names identical to the namespace.
  • Flink: fix IcebergSourceWrapper for Iceberg connector 1.17 #2409 @ensctom
    In Flink 1.17, the Iceberg catalogloader was loading the catalog in the open function, causing the loadTable method to throw a NullPointerException error.
  • Spark: migrate spark35, spark3, shared modules to produce Scala 2.12 and Scala 2.13 variants #2390 #2385#2384 @d-m-h
    Migrates the three modules to use the refactored Gradle plugins. Also splits some tests into Scala 2.12- and Scala 2.13-specific versions.
  • Spark: conform the spark2 module to the new build process #2391 @d-m-h
    Due to a change in the Scala Collections API in Scala 2.13, NoSuchMethodErrors were being thrown when running the openlineage-spack connector in an Apache Spark runtime compiled using Scala 2.13.