1.19.0 - 2024-07-22

Airflow: add log_url to AirflowRunFacet #2852 @dolfinus
Adds the task instance's log_url field to AirflowRunFacet.
Spark: add handling for Generate #2856 @tnazarew
Adds handling for Generate-type nodes of a logical plan (e.g., explode operations).
Java: add DerbyJdbcExtractor #2869 @dolfinus
Adds a JdbcExtractor implementation for the Derby database. Because Derby is a file-based DBMS, its Dataset namespace is file and its name is the absolute path to the database file.
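To illustrate the naming rule above, here is a minimal Python sketch that maps an embedded Derby JDBC URL to a (namespace, name) pair. It is not the DerbyJdbcExtractor code: the function name derby_dataset_identifier and the base_dir parameter are hypothetical, and the sketch assumes the simple jdbc:derby:<path>[;attributes] form only. How the real extractor treats subprotocols or relative paths is not described here.

```python
import posixpath
from typing import Tuple


def derby_dataset_identifier(jdbc_url: str, base_dir: str = "/") -> Tuple[str, str]:
    """Return (namespace, name) for an embedded Derby URL (illustrative only)."""
    prefix = "jdbc:derby:"
    if not jdbc_url.startswith(prefix):
        raise ValueError("not a Derby JDBC URL: " + jdbc_url)
    # Drop the prefix and any ";attribute=value" connection attributes.
    db_path = jdbc_url[len(prefix):].split(";", 1)[0]
    # Assumption: relative paths are resolved against base_dir.
    absolute_path = posixpath.normpath(posixpath.join(base_dir, db_path))
    # The namespace is the literal string "file"; the name is the absolute path.
    return "file", absolute_path


namespace, name = derby_dataset_identifier("jdbc:derby:/data/metastore_db;create=true")
print(namespace, name)
```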
Spark: verify bytecode version of the built jar #2859 @pawel-big-lebowski
Extends the JarVerifier plugin to ensure all compiled classes have a bytecode version of Java 8 or lower.
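The kind of check described above can be sketched in Python. This is not the JarVerifier plugin itself; the helper names are hypothetical. It relies only on the class file layout: a 4-byte magic number 0xCAFEBABE, then a 2-byte minor and a 2-byte major version, where major version 52 corresponds to Java 8.

```python
import struct

JAVA_8_MAJOR = 52  # class file major version emitted for Java 8


def class_major_version(class_bytes: bytes) -> int:
    """Read the major version from the first 8 bytes of a .class file."""
    if len(class_bytes) < 8:
        raise ValueError("class file header is truncated")
    # Big-endian: u4 magic, u2 minor_version, u2 major_version.
    magic, _minor, major = struct.unpack(">IHH", class_bytes[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major


def is_java8_compatible(class_bytes: bytes) -> bool:
    """True if the class targets Java 8 or lower."""
    return class_major_version(class_bytes) <= JAVA_8_MAJOR


# Synthetic headers stand in for real class files in this sketch.
java8_header = struct.pack(">IHH", 0xCAFEBABE, 0, 52)
java11_header = struct.pack(">IHH", 0xCAFEBABE, 0, 55)
print(is_java8_compatible(java8_header), is_java8_compatible(java11_header))
```

A real verifier would iterate over every .class entry in the jar (for example with the zipfile module) and fail the build on the first class that exceeds the limit.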
Spark: add Kafka streaming source support #2851 @d-m-h @imbruced
Adds support for Kafka streaming sources in addition to Kafka streaming sinks. Both inputs and outputs are now included in lineage events.

Airflow: replace datetime.now with airflow.utils.timezone.utcnow #2865 @kacpermuda
Fixes missing timezone information in task FAIL events.
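The difference behind this fix can be shown with the standard library alone. In the sketch below, datetime.now(timezone.utc) stands in for airflow.utils.timezone.utcnow, which likewise returns a timezone-aware UTC datetime; the function names are illustrative and not taken from the integration.

```python
from datetime import datetime, timezone


def naive_now() -> datetime:
    """Old behaviour: local wall-clock time with no tzinfo attached."""
    return datetime.now()


def aware_utc_now() -> datetime:
    """Stand-in for airflow.utils.timezone.utcnow: timezone-aware UTC time."""
    return datetime.now(timezone.utc)


naive = naive_now()
aware = aware_utc_now()

# The naive value serializes without an offset, so a consumer cannot tell
# which timezone the event time refers to.
print(naive.tzinfo, naive.isoformat())
# The aware value carries an explicit +00:00 offset in its ISO 8601 form.
print(aware.tzinfo, aware.isoformat())
```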
Spark: remove shaded dependency in ColumnLevelLineageBuilder #2850 @tnazarew
Removes the shaded Streams dependency in ColumnLevelLineageBuilder that caused a ClassNotFoundException.
Spark: make Delta dataset symlink consistent with non-Delta tables #2863 @dolfinus
Makes dataset symlinks for Delta and non-Delta tables consistent.
Spark: use Table's properties during column-level lineage construction #2855 @ddebowczyk92
Fixes PlanUtils3 so Dataset identifier information based on a Table's properties is also retrieved during the construction of column-level lineage.
Spark: extract job name creation to providers #2861 @arturowczarek
The integration now detects whether spark.app.name was autogenerated by Glue and, if so, uses the Glue job name instead. Each job name provisioning strategy is also extracted into a separate provider.
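The provider idea can be sketched as a chain of small functions tried in order. This is a simplified Python illustration, not the integration's Java code. The prefix used to recognize an autogenerated name (AUTOGEN_PREFIX) and the configuration key holding the Glue job name (GLUE_JOB_NAME_KEY) are hypothetical placeholders, since this changelog does not state the real ones.

```python
from typing import Callable, Dict, List, Optional

Conf = Dict[str, str]
Provider = Callable[[Conf], Optional[str]]

AUTOGEN_PREFIX = "autogen-"                  # hypothetical marker, not the real one
GLUE_JOB_NAME_KEY = "example.glue.job.name"  # hypothetical config key


def glue_provider(conf: Conf) -> Optional[str]:
    """Use the Glue job name only when the app name looks autogenerated."""
    app_name = conf.get("spark.app.name", "")
    if app_name.startswith(AUTOGEN_PREFIX):
        return conf.get(GLUE_JOB_NAME_KEY)
    return None


def app_name_provider(conf: Conf) -> Optional[str]:
    """Fallback strategy: use spark.app.name as-is."""
    return conf.get("spark.app.name")


def resolve_job_name(conf: Conf, providers: List[Provider]) -> Optional[str]:
    """Return the first non-empty name produced by the providers, in order."""
    for provider in providers:
        name = provider(conf)
        if name:
            return name
    return None


PROVIDERS: List[Provider] = [glue_provider, app_name_provider]

conf = {"spark.app.name": "autogen-123", GLUE_JOB_NAME_KEY: "nightly_etl"}
print(resolve_job_name(conf, PROVIDERS))
```

Keeping each strategy in its own provider means a new naming rule can be added to the list without touching the existing ones.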