1.20.5 - 2024-08-23
Added
- Python: add `CompositeTransport` #2925 @JDarDagran
  Adds a `CompositeTransport` that can accept other transport configs to instantiate transports and use them to emit events (a config sketch follows this list).
- Spark: compile & test Spark integration on Java 17 #2828 @pawel-big-lebowski
  The Spark integration is always compiled with Java 17, while tests run on both Java 8 and Java 17 according to the configuration.
- Spark: support preview release of Spark 4.0 #2854 @pawel-big-lebowski
  Includes the Spark 4.0 preview release in the integration tests.
- Spark: add handling for `Window` #2901 @tnazarew
  Adds handling for `Window`-type nodes of a logical plan (an example query follows this list).
- Spark: extract and send events with raw SQL from Spark #2913 @Imbruced
  Adds a parser that traverses `QueryExecution` to get the SQL query used from the SQL field with a BFS algorithm (a sketch of the traversal follows this list).
- Spark: support Mongo streaming source #2887 @Imbruced
  Adds a Mongo streaming visitor and tests.
- Spark: new mechanism for disabling facets #2912 @arturowczarek
  The mechanism makes `FacetConfig` accept a disabled flag for any facet instead of passing them as a list (a config sketch follows this list).
- Spark: support Kinesis source #2906 @Imbruced
  Adds a Kinesis class handler in the streaming source builder.
- Spark: extract `DatasetIdentifier` from extension `LineageNode` #2900 @ddebowczyk92
  Adds support for cases in which `LogicalRelation` has a grandchild node that implements the `LineageRelation` interface.
- Spark: extract Dataset from underlying `BaseRelation` #2893 @ddebowczyk92
  `DatasetIdentifier` is now extracted from the underlying node of `LogicalRelation`.
- Spark: add descriptions and Marquez UI to Docker Compose file #2889 @jonathanlbt1
  Adds the `marquez-web` service to `docker-compose.yml`.
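
For the `CompositeTransport` entry above, a minimal sketch of wiring it up from Python. The `openlineage.client.transport.composite` module path and the `CompositeConfig.from_dict` call are assumptions based on how the client's other transports are laid out, so verify against the client docs before relying on them:

```python
from openlineage.client import OpenLineageClient

# Assumed module path, mirroring the layout of the client's other transports.
from openlineage.client.transport.composite import CompositeConfig, CompositeTransport

# One config dict per child transport; each is instantiated through the usual
# transport factory, and every event is emitted through all of them in turn.
config = CompositeConfig.from_dict(
    {
        "transports": [
            {"type": "http", "url": "http://localhost:5000"},  # placeholder URL
            {"type": "console"},
        ],
    }
)

client = OpenLineageClient(transport=CompositeTransport(config))
```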
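For the `Window` entry, an illustration of the kind of job affected: this made-up PySpark query places a `Window` node in the logical plan, which the plan traversal now handles instead of skipping:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("window-example").getOrCreate()

df = spark.createDataFrame([("a", 1), ("a", 2), ("b", 3)], ["key", "value"])

# rank() over a window inserts a Window node into the logical plan; lineage
# for the "rnk" column now traverses that node rather than stopping at it.
w = Window.partitionBy("key").orderBy("value")
df.withColumn("rnk", F.rank().over(w)).write.mode("overwrite").parquet("/tmp/ranked")
```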
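For the raw-SQL entry, the parser itself lives in the Java integration, but the traversal it describes is an ordinary breadth-first search. A hypothetical Python sketch of the idea, with `children` and `sql_text` standing in for the real `QueryExecution` node shape:

```python
from collections import deque

def find_sql_text(root):
    """Return the SQL text of the shallowest plan node that carries any.

    `root`, `.children`, and `.sql_text` are hypothetical stand-ins for the
    Java-side QueryExecution tree; this only illustrates the BFS order.
    """
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if getattr(node, "sql_text", None):
            return node.sql_text  # first match is the closest to the root
        queue.extend(getattr(node, "children", ()))
    return None
```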
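For the facet-disabling entry, the user-visible effect is per-facet flags instead of one semicolon-separated list. A hedged sketch of a Spark session configured this way, assuming property keys of the form `spark.openlineage.facets.<facet>.disabled` (check the Spark integration docs for the exact names):

```python
from pyspark.sql import SparkSession

# Assumed per-facet keys replacing the older single-list style
# spark.openlineage.facets.disabled=[spark_unknown;spark.logicalPlan].
spark = (
    SparkSession.builder.appName("facet-config-example")
    .config("spark.openlineage.facets.spark_unknown.disabled", "true")
    .config("spark.openlineage.facets.spark.logicalPlan.disabled", "true")
    .getOrCreate()
)
```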
Fixed
- Proxy: fix bug in error message descriptions #2880 @jonathanlbt1
  Improves error logging.
- Proxy: update Docker image for Fluentd 1.17 #2877 @jonathanlbt1
  Upgrades the Fluentd version.
- Spark: fix issue with Kafka source when saving with the `foreachBatch` method #2868 @Imbruced
  Fixes an issue where the Kafka input was missing from the event when Spark ran in streaming mode (an example job follows this list).
- Spark: properly set ARN in namespace for Iceberg Glue symlinks #2943 @arturowczarek
  Makes `IcebergHandler` support Glue catalog tables and create the symlink using the code from `PathUtils`.
- Spark: accept any provider for AWS Glue storage format #2917 @arturowczarek
  Makes the AWS Glue ARN-generating method accept every format (including Parquet), not only Hive SerDe.
- Spark: return valid JSON for failed logical plan serialization #2892 @arturowczarek
  The `LogicalPlanSerializer` now returns `<failed-to-serialize-logical-plan>` for failed serialization instead of an empty string.
- Spark: extract legacy column lineage visitors loader #2883 @arturowczarek
  Refactors `CustomCollectorsUtils` for improved readability.
- Spark: add Kafka input source when writing in `foreachBatch` mode #2868 @Imbruced
  Fixes a bug keeping Kafka input sources from being produced (see the example job after this list).
- Spark: extract `DatasetIdentifier` from `SaveIntoDataSourceCommandVisitor` options #2934 @ddebowczyk92
  Extracts `DatasetIdentifier` from the command's options instead of relying on `p.createRelation(sqlContext, command.options())`, which is a heavy operation for `JdbcRelationProvider` (an example write follows this list).
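
For the two `foreachBatch` entries, the shape of job affected looks like this placeholder PySpark stream; before the fix, runs of this pattern emitted events without the Kafka input dataset:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-foreachbatch").getOrCreate()

# Placeholder broker/topic; reading from Kafka and writing via foreachBatch
# is the combination whose input lineage was previously dropped.
source = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "events")
    .load()
)

def write_batch(batch_df, batch_id):
    # Each micro-batch is written as a normal batch job.
    batch_df.write.mode("append").parquet("/tmp/events-sink")

source.writeStream.foreachBatch(write_batch).start().awaitTermination()
```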
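For the `SaveIntoDataSourceCommandVisitor` entry, a made-up JDBC write showing where the options come from: the `url` and `dbtable` values below are the command options the `DatasetIdentifier` is now built from, without first materializing the relation:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-options-example").getOrCreate()
df = spark.createDataFrame([(1, "a")], ["id", "name"])

# A format("jdbc").save() goes through SaveIntoDataSourceCommand with
# JdbcRelationProvider; connection values here are placeholders.
(
    df.write.format("jdbc")
    .option("url", "jdbc:postgresql://localhost:5432/db")
    .option("dbtable", "public.events")
    .option("user", "user")
    .option("password", "secret")
    .mode("append")
    .save()
)
```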