
0.20.4 - 2023-02-07

Added

  • Airflow: add new extractor for GCSToGCSOperator #1495 @sekikn
    Adds a new extractor for this operator (a usage sketch follows this list).
  • Flink: resolve topic names from regex, support 1.16.0 #1522 @pawel-big-lebowski
    Adds support for Flink 1.16.0 and makes the integration resolve topic names from Kafka topic patterns.
  • Proxy: implement lineage event validator for client proxy #1469 @fm100
    Implements logic in the proxy (which is still in development) for validating and handling lineage events.
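For reference, a minimal sketch of the kind of task the new extractor covers, assuming Airflow's Google provider and placeholder bucket and object names:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_gcs import GCSToGCSOperator

# Placeholder bucket and object names; lineage for a copy task like this
# is what the new GCSToGCSOperator extractor is meant to capture.
with DAG(dag_id="gcs_copy_example", start_date=datetime(2023, 2, 1), schedule=None) as dag:
    copy_file = GCSToGCSOperator(
        task_id="copy_file",
        source_bucket="example-source-bucket",
        source_object="data/input.csv",
        destination_bucket="example-destination-bucket",
        destination_object="data/input.csv",
    )
```

With the extractor in place, lineage events for such a task are expected to report the source and destination objects as datasets.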

Changed

  • CI: use ruff instead of flake8, isort, etc., for linting and formatting #1526 @mobuchowski
    Adopts the ruff package, which combines several linters and formatters into one fast binary.

Fixed

  • Airflow: make the Trino catalog non-mandatory #1572 @JDarDagran
    Makes the Trino catalog optional in the Trino extractor.
  • Common: add explicit SQL dependency #1532 @mobuchowski
    Addresses a 0.19.2 breaking change to the GE integration by including the SQL dependency explicitly.
  • DBT: adjust tqdm logging in dbt-ol #1549 @JDarDagran
    Adjusts tqdm to show the correct number of iterations and adds START events for parent runs.
  • DBT: fix typo in log output #1493 @denimalpaca
    Fixes 'emittled' typo in log output.
  • Great Expectations/Airflow: follow Snowflake dataset naming rules #1527 @mobuchowski
    Normalizes Snowflake dataset and datasource naming rules across the DBT, Airflow, and GE integrations, and canonicalizes old Snowflake account paths by making them full-size, with account, region, and cloud names.
  • Java and Python Clients: Kafka does not initialize properties if they are empty; check and notify about Confluent-Kafka requirement #1556 @mobuchowski
    Fixes the failure to initialize KafkaTransport in the Java client and adds an exception if the required confluent-kafka module is missing from the Python client (see the sketch after this list).
  • Spark: add square brackets for list-based Spark configs #1507 @Varunvaruns9
    Adds a condition to treat configs with [] as lists. Note: [] will be required for list-based configs starting with 0.21.0 (an example follows this list).
  • Spark: fix several Spark/BigQuery-related issues #1557 @mobuchowski
    Fixes the assumption that a version is always a number; adds support for HadoopMapReduceWriteConfigUtil; makes the integration access BigQueryUtil and getTableId using reflection, which supports all BigQuery versions; makes the logs provide the full serialized LogicalPlan at the debug level.
  • SQL: only report partial failures #1479 @mobuchowski
    Changes the parser so it reports partial failures instead of failing the whole extraction.
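For the Java and Python client Kafka fix above, a minimal sketch of the guard pattern on the Python side; the helper name and error message are illustrative, not the client's exact code:

```python
def _require_confluent_kafka():
    """Fail fast with a clear message if the optional dependency is absent."""
    try:
        import confluent_kafka  # noqa: F401  # optional dependency of KafkaTransport
    except ImportError as error:
        # Illustrative wording; the real client raises its own error text.
        raise ModuleNotFoundError(
            "confluent-kafka is required to use KafkaTransport; "
            "install it with `pip install confluent-kafka`"
        ) from error


_require_confluent_kafka()
```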
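For the Spark list-based config entry above, a minimal PySpark sketch of the bracketed format; `spark.openlineage.facets.disabled` and its values are used here as an assumed example of a list-valued config, not an exhaustive reference:

```python
from pyspark.sql import SparkSession

# Sketch only: list-valued OpenLineage Spark configs are written with square
# brackets and semicolon separators; [] becomes required for such configs in 0.21.0.
spark = (
    SparkSession.builder
    .appName("openlineage_list_config_example")
    .config("spark.openlineage.facets.disabled", "[spark_unknown;spark.logicalPlan]")
    .getOrCreate()
)
```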