1.20.5 - 2024-08-23

Added

Python: add CompositeTransport #2925 @JDarDagran
Adds a CompositeTransport that can accept other transport configs to instantiate transports and use them to emit events.
Spark: compile & test Spark integration on Java 17 #2828 @pawel-big-lebowski
The Spark integration is now always compiled with Java 17, while tests run on Java 8 and Java 17, depending on the configuration.
Spark: support preview release of Spark 4.0 #2854 @pawel-big-lebowski
Includes the Spark 4.0 preview release in the integration tests.
Spark: add handling for Window #2901 @tnazarew
Adds handling for Window-type nodes of a logical plan.
Spark: extract and send events with raw SQL from Spark #2913 @Imbruced
Adds a parser that traverses QueryExecution with a breadth-first search (BFS) to extract the SQL query from the SQL field.
Spark: support Mongostream source #2887 @Imbruced
Adds a Mongo streaming visitor and tests.
Spark: new mechanism for disabling facets #2912 @arturowczarek
FacetConfig now accepts a disabled flag for any individual facet, instead of requiring disabled facets to be passed as a list.
Spark: support Kinesis source #2906 @Imbruced
Adds a Kinesis class handler in the streaming source builder.
Spark: extract DatasetIdentifier from extension LineageNode #2900 @ddebowczyk92
Adds support for cases in which LogicalRelation has a grandChild node that implements the LineageRelation interface.
Spark: extract Dataset from underlying BaseRelation #2893 @ddebowczyk92
DatasetIdentifier is now extracted from the underlying node of LogicalRelation.
Spark: add descriptions and Marquez UI to Docker Compose file #2889 @jonathanlbt1
Adds the marquez-web service to docker-compose.yml.
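
As a rough illustration of the CompositeTransport entry above, the following sketch shows the general composite pattern it describes: a transport built from several transport configs that forwards each event to every transport it created. All names here (ListTransport, SketchCompositeTransport, TRANSPORT_TYPES, and the "type" and "name" config keys) are hypothetical and are not taken from the openlineage-python API.

```python
from typing import Any, Callable, Dict, List


class ListTransport:
    """Hypothetical transport that keeps emitted events in memory."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.name = config.get("name", "list")
        self.events: List[Dict[str, Any]] = []

    def emit(self, event: Dict[str, Any]) -> None:
        self.events.append(event)


# Hypothetical registry mapping a config "type" value to a transport class.
TRANSPORT_TYPES: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "list": ListTransport,
}


class SketchCompositeTransport:
    """Builds one transport per config and emits each event to all of them."""

    def __init__(self, configs: List[Dict[str, Any]]) -> None:
        self.transports = [TRANSPORT_TYPES[c["type"]](c) for c in configs]

    def emit(self, event: Dict[str, Any]) -> None:
        for transport in self.transports:
            transport.emit(event)


composite = SketchCompositeTransport(
    [{"type": "list", "name": "first"}, {"type": "list", "name": "second"}]
)
composite.emit({"eventType": "START", "job": "example_job"})
```

In this sketch a single emit call reaches both underlying transports. How the real CompositeTransport handles errors raised by one of its transports is not covered here.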

Fixed

Proxy: fix bug in error message descriptions #2880 @jonathanlbt1
Improves error logging.
Proxy: update Docker image for Fluentd 1.17 #2877 @jonathanlbt1
Upgrades the Fluentd version.
Spark: fix issue with Kafka source when saving with the foreachBatch method #2868 @Imbruced
Fixes an issue where, with Spark in streaming mode, the Kafka input was missing from the event.
Spark: properly set ARN in namespace for Iceberg Glue symlinks #2943 @arturowczarek
Makes IcebergHandler support Glue catalog tables and create the symlink using the code from PathUtils.
Spark: accept any provider for AWS Glue storage format #2917 @arturowczarek
Makes the AWS Glue ARN generating method accept every format (including Parquet), not only Hive SerDe.
Spark: return valid JSON for failed logical plan serialization #2892 @arturowczarek
The LogicalPlanSerializer now returns <failed-to-serialize-logical-plan> for failed serialization instead of an empty string.
Spark: extract legacy column lineage visitors loader #2883 @arturowczarek
Refactors CustomCollectorsUtils for improved readability.
Spark: add Kafka input source when writing in foreachBatch mode #2868 @Imbruced
Fixes a bug that prevented Kafka input sources from being produced.
Spark: extract DatasetIdentifier from SaveIntoDataSourceCommandVisitor options #2934 @ddebowczyk92
Extracts DatasetIdentifier from the command's options instead of relying on p.createRelation(sqlContext, command.options()), which is a heavy operation for JdbcRelationProvider.
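
The LogicalPlanSerializer fix above (#2892) follows a common pattern: when serialization fails, return a placeholder that consumers can still parse, not an empty string. The sketch below illustrates that pattern in Python only. The serialize_plan function is hypothetical, and emitting the placeholder as a quoted JSON string is an assumption, not a description of the Java implementation.

```python
import json
from typing import Any

FAILED_PLACEHOLDER = "<failed-to-serialize-logical-plan>"


def serialize_plan(plan: Any) -> str:
    """Return the plan as JSON, or a JSON string placeholder on failure."""
    try:
        return json.dumps(plan)
    except (TypeError, ValueError):
        # Assumption: the placeholder is emitted as a quoted JSON string,
        # so downstream parsers still receive valid JSON.
        return json.dumps(FAILED_PLACEHOLDER)


ok = serialize_plan({"class": "Project", "num-children": 1})
# object() is not JSON serializable, so this call takes the fallback path.
failed = serialize_plan({"node": object()})
```

Both results parse with json.loads, whereas an empty string would raise a decoding error.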
