
0.20.4 - 2023-02-07

Added

  • Airflow: add new extractor for GCSToGCSOperator #1495 @sekikn
    Adds a new extractor for this operator (see the example DAG after this list).
  • Flink: resolve topic names from regex, support 1.16.0 #1522 @pawel-big-lebowski
    Adds support for Flink 1.16.0 and makes the integration resolve topic names from Kafka topic patterns.
  • Proxy: implement lineage event validator for client proxy #1469 @fm100
    Implements logic in the proxy (which is still in development) for validating and handling lineage events.
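
Below is a minimal sketch of an Airflow DAG using GCSToGCSOperator, the operator covered by the new extractor. It assumes the OpenLineage Airflow integration is installed and configured so the extractor can pick the task up; the DAG id, bucket, and object names are placeholders.

```python
# Illustrative sketch only: bucket and object names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_gcs import GCSToGCSOperator

with DAG(
    dag_id="gcs_to_gcs_example",
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # With the OpenLineage Airflow integration installed, lineage for this
    # task is captured by the new GCSToGCSOperator extractor.
    copy_objects = GCSToGCSOperator(
        task_id="copy_objects",
        source_bucket="example-source-bucket",             # placeholder
        source_object="data/2023-02-07/*.csv",             # placeholder
        destination_bucket="example-destination-bucket",   # placeholder
        destination_object="archive/",                     # placeholder
    )
```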

Changed

  • CI: use ruff instead of flake8, isort, etc., for linting and formatting #1526 @mobuchowski
    Adopts the ruff package, which combines several linters and formatters into one fast binary.

Fixed

  • Airflow: make the Trino catalog non-mandatory #1572 @JDarDagran
    Makes the Trino catalog optional in the Trino extractor.
  • Common: add explicit SQL dependency #1532 @mobuchowski
    Addresses a 0.19.2 breaking change to the GE integration by including the SQL dependency explicitly.
  • DBT: adjust tqdm logging in dbt-ol #1549 @JDarDagran
    Adjusts tqdm to show the correct number of iterations and adds START events for parent runs.
  • DBT: fix typo in log output #1493 @denimalpaca
    Fixes 'emittled' typo in log output.
  • Great Expectations/Airflow: follow Snowflake dataset naming rules #1527 @mobuchowski
    Normalizes Snowflake dataset and datasource naming rules among DBT/Airflow/GE; also canonicalizes old Snowflake account paths, making them all full-size with account, region, and cloud names (see the sketch after this list).
  • Java and Python Clients: Kafka does not initialize properties if they are empty; check and notify about Confluent-Kafka requirement #1556 @mobuchowski
    Fixes the failure to initialize KafkaTransport in the Java client and adds an exception if the required confluent-kafka module is missing from the Python client (see the sketch after this list).
  • Spark: add square brackets for list-based Spark configs #1507 @Varunvaruns9
    Adds a condition to treat configs with [] as lists. Note: [] will be required for list-based configs starting with 0.21.0 (see the example after this list).
  • Spark: fix several Spark/BigQuery-related issues #1557 @mobuchowski
    Fixes the assumption that a version is always a number; adds support for HadoopMapReduceWriteConfigUtil; makes the integration access BigQueryUtil and getTableId using reflection, which supports all BigQuery versions; makes logs provide the full serialized LogicalPlan on debug.
  • SQL: only report partial failures #1479 @mobuchowski
    Changes the parser to report partial failures instead of failing the whole extraction (see the sketch after this list).
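
To illustrate the Snowflake naming change, here is a hedged sketch of the "full-size" account form with account, region, and cloud names. The namespace and name patterns, and all values, are placeholders for illustration rather than the exact strings the integrations emit.

```python
# Hypothetical illustration of the canonical "full-size" Snowflake account
# path; the locator, region, cloud, and dataset parts are all placeholders.
account, region, cloud = "xy12345", "us-east-1", "aws"

# Short account forms are canonicalized so DBT, Airflow, and Great
# Expectations all emit the same dataset namespace.
namespace = f"snowflake://{account}.{region}.{cloud}"
dataset_name = "analytics.public.orders"  # {database}.{schema}.{table}, placeholder

print(namespace, dataset_name)
```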
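
For the Python client part of the Kafka fix, this is a minimal sketch of the kind of guard that was added: the optional confluent-kafka dependency is checked up front and a clear error is raised if it is missing. require_confluent_kafka is a hypothetical helper for illustration, not the client's actual API.

```python
# Sketch of the kind of check the Python client now performs: fail loudly if
# the optional confluent-kafka dependency is missing, instead of failing later
# with an unclear import error.
def require_confluent_kafka() -> None:
    # Hypothetical helper for illustration only.
    try:
        import confluent_kafka  # noqa: F401  (pip install confluent-kafka)
    except ImportError as error:
        raise RuntimeError(
            "The Kafka transport requires the 'confluent-kafka' package; "
            "install it with: pip install confluent-kafka"
        ) from error

require_confluent_kafka()
```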
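
For the Spark config change, here is a sketch of passing a list-valued OpenLineage setting wrapped in the square brackets that become required for list-based configs in 0.21.0. The spark.openlineage.facets.disabled key and its values are used as an assumed example of such a config; check the Spark integration docs for the settings you actually use.

```python
# Sketch: configuring a list-valued OpenLineage Spark setting with square
# brackets (assumed example key and values).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("openlineage-list-config-example")
    # Square brackets mark the value as a list; they become required for
    # list-based configs starting with 0.21.0.
    .config("spark.openlineage.facets.disabled", "[spark_unknown;spark.logicalPlan]")
    .getOrCreate()
)
```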
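
For the SQL parser change, a hedged sketch of what reporting partial failures means in practice: one unparsable statement no longer discards the lineage extracted from the rest of a batch. The parse call is from the openlineage-sql Python binding; the exact signature and result attributes may differ by version and are assumptions here.

```python
# Sketch, assuming the openlineage-sql Python binding (pip install openlineage-sql):
# a batch with one bad statement still yields lineage from the good statements,
# with the failure reported instead of the whole extraction failing.
from openlineage_sql import parse

statements = [
    "INSERT INTO analytics.orders SELECT * FROM staging.orders",  # parsable
    "THIS IS NOT VALID SQL",                                      # unparsable
]

meta = parse(statements)
# in_tables/out_tables hold lineage from the statements that did parse;
# errors (where exposed by your version) describe the ones that did not.
print(meta.in_tables, meta.out_tables)
```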