1.20.5 - 2024-08-23
Added
- Python: add `CompositeTransport` #2925 @JDarDagran
  Adds a `CompositeTransport` that can accept other transport configs to instantiate transports and use them to emit events.
- Spark: compile & test Spark integration on Java 17 #2828 @pawel-big-lebowski
  The Spark integration is always compiled with Java 17, while tests run on both Java 8 and Java 17 according to the configuration.
- Spark: support preview release of Spark 4.0 #2854 @pawel-big-lebowski
  Includes the Spark 4.0 preview release in the integration tests.
- Spark: add handling for `Window` #2901 @tnazarew
  Adds handling for `Window`-type nodes of a logical plan.
- Spark: extract and send events with raw SQL from Spark #2913 @Imbruced
  Adds a parser that traverses `QueryExecution` to get the SQL query used from the SQL field with a BFS algorithm.
- Spark: support Mongo streaming source #2887 @Imbruced
  Adds a Mongo streaming visitor and tests.
- Spark: new mechanism for disabling facets #2912 @arturowczarek
  The mechanism makes `FacetConfig` accept the disabled flag for any facet instead of passing them as a list.
- Spark: support Kinesis source #2906 @Imbruced
  Adds a Kinesis class handler in the streaming source builder.
- Spark: extract `DatasetIdentifier` from extension `LineageNode` #2900 @ddebowczyk92
  Adds support for cases in which `LogicalRelation` has a grandchild node that implements the `LineageRelation` interface.
- Spark: extract dataset from underlying `BaseRelation` #2893 @ddebowczyk92
  `DatasetIdentifier` is now extracted from the underlying node of `LogicalRelation`.
- Spark: add descriptions and Marquez UI to Docker Compose file #2889 @jonathanlbt1
  Adds the `marquez-web` service to docker-compose.yml.
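
A `CompositeTransport` like the one added above might be configured in `openlineage.yml` roughly as follows. This is only a sketch: the child transport types and the URL are illustrative placeholders, not values taken from this release.

```yaml
# Sketch of a composite transport config: each child entry is an
# ordinary transport config that the composite transport instantiates
# and fans events out to. The URL and transport types are placeholders.
transport:
  type: composite
  transports:
    - type: http
      url: http://localhost:5050
    - type: console
```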
Fixed
- Proxy: fix error message descriptions #2880 @jonathanlbt1
  Improves error logging.
- Proxy: update Docker image for Fluentd 1.17 #2877 @jonathanlbt1
  Upgrades the Fluentd version.
- Spark: fix issue with Kafka source when saving with the `foreachBatch` method #2868 @Imbruced
  Fixes an issue in which, when Spark was in streaming mode, the Kafka input was not present in the event.
- Spark: properly set ARN in namespace for Iceberg Glue symlinks #2943 @arturowczarek
  Makes `IcebergHandler` support Glue catalog tables and create the symlink using the code from `PathUtils`.
- Spark: accept any provider for AWS Glue storage format #2917 @arturowczarek
  Makes the AWS Glue ARN-generating method accept every format (including Parquet), not only Hive SerDe.
- Spark: return valid JSON for failed logical plan serialization #2892 @arturowczarek
  The `LogicalPlanSerializer` now returns `<failed-to-serialize-logical-plan>` for failed serialization instead of an empty string.
- Spark: extract legacy column lineage visitors loader #2883 @arturowczarek
  Refactors `CustomCollectorsUtils` for improved readability.
- Spark: add Kafka input source when writing in `foreachBatch` mode #2868 @Imbruced
  Fixes a bug keeping Kafka input sources from being produced.
- Spark: extract `DatasetIdentifier` from `SaveIntoDataSourceCommandVisitor` options #2934 @ddebowczyk92
  Extracts `DatasetIdentifier` from the command's options instead of relying on `p.createRelation(sqlContext, command.options())`, which is a heavy operation for `JdbcRelationProvider`.
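
The facet-disabling mechanism introduced in the Added section replaces list-style configuration with a per-facet flag. As a rough sketch of Spark properties (the facet names here are examples; check the Spark integration documentation for the exact keys supported by this release):

```
# Hypothetical per-facet disable flags: each facet gets its own
# boolean property instead of appearing in a single disabled list.
spark.openlineage.facets.spark_unknown.disabled=true
spark.openlineage.facets.debug.disabled=false
```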