1.20.5 - 2024-08-23

Added

  • Python: add CompositeTransport #2925 @JDarDagran
    Adds a CompositeTransport that can accept other transport configs to instantiate transports and use them to emit events.
  • Spark: compile & test Spark integration on Java 17 #2828 @pawel-big-lebowski
    The Spark integration is now always compiled with Java 17, while tests run on both Java 8 and Java 17, depending on the configuration.
  • Spark: support preview release of Spark 4.0 #2854 @pawel-big-lebowski
    Includes the Spark 4.0 preview release in the integration tests.
  • Spark: add handling for Window #2901 @tnazarew
    Adds handling for Window-type nodes of a logical plan.
  • Spark: extract and send events with raw SQL from Spark #2913 @Imbruced
    Adds a parser that traverses QueryExecution with a BFS algorithm to extract the SQL query from the SQL field.
  • Spark: support Mongostream source #2887 @Imbruced
    Adds a Mongo streaming visitor and tests.
  • Spark: new mechanism for disabling facets #2912 @arturowczarek
    The mechanism makes FacetConfig accept a disabled flag for any individual facet instead of requiring the disabled facets to be passed as a list.
  • Spark: support Kinesis source #2906 @Imbruced
    Adds a Kinesis class handler in the streaming source builder.
  • Spark: extract DatasetIdentifier from extension LineageNode #2900 @ddebowczyk92
    Adds support for cases in which LogicalRelation has a grandChild node that implements the LineageRelation interface.
  • Spark: extract Dataset from underlying BaseRelation #2893 @ddebowczyk92
    DatasetIdentifier is now extracted from the underlying node of LogicalRelation.
  • Spark: add descriptions and Marquez UI to Docker Compose file #2889 @jonathanlbt1
    Adds the marquez-web service to docker-compose.yml.
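
The CompositeTransport entry above describes a transport that is built from other transport configs and forwards events to each of them. The following is a minimal sketch of that pattern only, not the OpenLineage client's actual API: the class names, the `TRANSPORT_TYPES` registry, and the `continue_on_failure` flag are all assumptions introduced for illustration.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


class MemoryTransport:
    """Hypothetical transport that keeps emitted events in a list."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def emit(self, event: Event) -> None:
        self.events.append(event)


class FailingTransport:
    """Hypothetical transport that always fails, to exercise error handling."""

    def emit(self, event: Event) -> None:
        raise RuntimeError("backend unavailable")


# Hypothetical registry mapping a config "type" to a transport class.
TRANSPORT_TYPES = {"memory": MemoryTransport, "failing": FailingTransport}


class CompositeSketch:
    """Builds child transports from configs and forwards each event to all of them."""

    def __init__(self, configs: List[Dict[str, Any]], continue_on_failure: bool = True) -> None:
        self.transports = [TRANSPORT_TYPES[c["type"]]() for c in configs]
        self.continue_on_failure = continue_on_failure

    def emit(self, event: Event) -> None:
        for transport in self.transports:
            try:
                transport.emit(event)
            except Exception:
                # Assumption: a failing child is skipped unless strict mode is requested.
                if not self.continue_on_failure:
                    raise


composite = CompositeSketch([{"type": "memory"}, {"type": "failing"}, {"type": "memory"}])
composite.emit({"eventType": "START"})
```

In this sketch both in-memory transports receive the event even though the transport between them raises. Passing `continue_on_failure=False` would instead propagate the first error to the caller.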

Fixed

  • Proxy: fix bug in error message descriptions #2880 @jonathanlbt1
    Improves error logging.
  • Proxy: update Docker image for Fluentd 1.17 #2877 @jonathanlbt1
    Upgrades the Fluentd version.
  • Spark: fix issue with Kafka source when saving with for each batch method #2868 @Imbruced
    Fixes an issue in which the Kafka input was missing from the event when Spark runs in streaming mode.
  • Spark: properly set ARN in namespace for Iceberg Glue symlinks #2943 @arturowczarek
    Makes IcebergHandler support Glue catalog tables and create the symlink using the code from PathUtils.
  • Spark: accept any provider for AWS Glue storage format #2917 @arturowczarek
    Makes the AWS Glue ARN generating method accept every format (including Parquet), not only Hive SerDe.
  • Spark: return valid JSON for failed logical plan serialization #2892 @arturowczarek
    The LogicalPlanSerializer now returns <failed-to-serialize-logical-plan> for failed serialization instead of an empty string.
  • Spark: extract legacy column lineage visitors loader #2883 @arturowczarek
    Refactors CustomCollectorsUtils for improved readability.
  • Spark: add Kafka input source when writing in foreach batch mode #2868 @Imbruced
    Fixes a bug keeping Kafka input sources from being produced.
  • Spark: extract DatasetIdentifier from SaveIntoDataSourceCommandVisitor options #2934 @ddebowczyk92
    Extracts DatasetIdentifier from the command's options instead of relying on p.createRelation(sqlContext, command.options()), which is a heavy operation for JdbcRelationProvider.
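
The LogicalPlanSerializer fix above replaces an empty string with a placeholder when serialization fails. The idea can be sketched in Python as follows. This is not the integration's Java code: `serialize_plan` is a hypothetical helper, and encoding the placeholder as a JSON string is an assumption made here so that the result always parses as JSON.

```python
import json
from typing import Any

# Placeholder text taken from the changelog entry.
FAILED_PLACEHOLDER = "<failed-to-serialize-logical-plan>"


def serialize_plan(plan: Any) -> str:
    """Return the plan as JSON, or a JSON-encoded placeholder on failure."""
    try:
        return json.dumps(plan)
    except (TypeError, ValueError):
        # Assumption: emit the placeholder as a JSON string rather than "",
        # since an empty string is not a valid JSON document.
        return json.dumps(FAILED_PLACEHOLDER)


ok = serialize_plan({"node": "Project"})
failed = serialize_plan({"node": object()})  # object() is not JSON-serializable
```

A consumer can call `json.loads` on either result without special-casing an empty payload, and can compare the parsed value against the placeholder to detect the failure.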