Version: 1.22.0


Tip: See current list of all supported transports.


The HTTP transport sends events to an HTTP endpoint using Apache HttpClient.


  • type - string, must be "http". Required.
  • url - string, base URL for HTTP requests. Required.
  • endpoint - string specifying the endpoint to which events are sent, appended to url. Optional, default: /api/v1/lineage.
  • urlParams - dictionary specifying query parameters sent in HTTP requests. Optional.
  • timeoutInMillis - integer specifying the timeout (in milliseconds) used while connecting to the server. Optional, default: 5000.
  • auth - dictionary specifying authentication options. Optional, by default no authorization is used. If set, requires the type property.
    • type - string, either "api_key" or the fully qualified class name of your TokenProvider. Required if auth is provided.
    • apiKey - string sent as the Bearer token in the Authorization HTTP header. Required if type is api_key.
  • headers - dictionary specifying HTTP request headers. Optional.
  • compression - string, name of algorithm used by HTTP client to compress request body. Optional, default value null, allowed values: gzip. Added in v1.13.0.


Events are serialized to JSON and then sent as an HTTP POST request with Content-Type: application/json.
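To make the parameter list concrete, the sketch below shows one way the documented options could combine into a single POST request. It is a language-neutral illustration in Python, not the client's implementation. The function name build_post, the slash handling when joining url and endpoint, and the Content-Encoding header for gzip are assumptions; the auth key name apiKey follows the parameter list above.

```python
import gzip
import json
from urllib.parse import urlencode


def build_post(config: dict, event: dict):
    """Return (url, headers, body) for one event without sending anything."""
    # Assumption: url and endpoint are joined with exactly one slash.
    endpoint = config.get("endpoint", "/api/v1/lineage")
    url = config["url"].rstrip("/") + "/" + endpoint.lstrip("/")
    if config.get("urlParams"):
        url += "?" + urlencode(config["urlParams"])

    # Start from the user-supplied headers, then add the documented ones.
    headers = dict(config.get("headers", {}))
    headers["Content-Type"] = "application/json"
    auth = config.get("auth")
    if auth and auth.get("type") == "api_key":
        headers["Authorization"] = "Bearer " + auth["apiKey"]

    body = json.dumps(event).encode("utf-8")
    if config.get("compression") == "gzip":
        # Assumption: a compressed body is announced via Content-Encoding.
        headers["Content-Encoding"] = "gzip"
        body = gzip.compress(body)
    return url, headers, body


# Hypothetical configuration and event, for illustration only.
config = {
    "url": "http://localhost:5000",
    "urlParams": {"param0": "value0"},
    "auth": {"type": "api_key", "apiKey": "example-key"},
    "compression": "gzip",
}
url, headers, body = build_post(config, {"eventType": "START"})
```

Returning the pieces instead of sending them keeps the sketch free of network access; any HTTP library could then perform the actual POST.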


Anonymous connection:

type: http
url: http://localhost:5000

With authorization:

type: http
url: http://localhost:5000
auth:
  type: api_key
  api_key: f38d2189-c603-4b46-bdea-e573a3b5a7d5

Full example:

type: http
url: http://localhost:5000
endpoint: /api/v1/lineage
urlParams:
  param0: value0
  param1: value1
timeoutInMillis: 5000
auth:
  type: api_key
  api_key: f38d2189-c603-4b46-bdea-e573a3b5a7d5
headers:
  X-Some-Extra-Header: abc
compression: gzip


If the transport type is set to kafka, the parameters below are read and used to build the KafkaProducer. This transport requires the artifact org.apache.kafka:kafka-clients:3.1.0 (or compatible) on your classpath.


  • type - string, must be "kafka". Required.

  • topicName - string specifying the topic to which events will be sent. Required.

  • properties - dictionary containing the Kafka producer configuration, as described in the Kafka producer config documentation. Required.

  • localServerId - deprecated, renamed to messageKey since v1.13.0.

  • messageKey - string, key for all Kafka messages produced by transport. Optional, default value described below. Added in v1.13.0.

    Default values for messageKey are:

    • run:{parentJob.namespace}/{} - for RunEvent with parent facet
    • run:{job.namespace}/{} - for RunEvent
    • job:{job.namespace}/{} - for JobEvent
    • dataset:{dataset.namespace}/{} - for DatasetEvent


Events are serialized to JSON, and then dispatched to the Kafka topic.


It is recommended to provide messageKey if a job hierarchy is used. It can be any string, but it should be the same for all jobs in the hierarchy, for example Airflow task -> Spark application -> Spark task runs.
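One reason a shared key helps: Kafka's default partitioner routes records with the same non-null key to the same partition, and ordering is only preserved within a partition. The sketch below illustrates that idea in Python. The function pick_partition and the job names are hypothetical, and CRC32 is a stand-in hash, not the one Kafka uses.

```python
import zlib


def pick_partition(message_key: str, num_partitions: int) -> int:
    """Map a message key to a partition index.

    Stand-in hash for illustration only: CRC32 is NOT the hash used by
    Kafka's default partitioner.
    """
    return zlib.crc32(message_key.encode("utf-8")) % num_partitions


# Hypothetical job hierarchy in which every job emits with one messageKey.
shared_key = "some-value"
jobs = ["airflow-task", "spark-application", "spark-task"]
partitions = {job: pick_partition(shared_key, 6) for job in jobs}

# All events of the hierarchy land on a single partition.
distinct = set(partitions.values())
```

With a different key per job, the events of one hierarchy could be spread over several partitions and be consumed out of order.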


type: kafka
properties:
  bootstrap.servers: localhost:9092,
  acks: all
  retries: 3
  key.serializer: org.apache.kafka.common.serialization.StringSerializer
  value.serializer: org.apache.kafka.common.serialization.StringSerializer
messageKey: some-value

If the transport type is set to kinesis, the parameters below are read and used to build the KinesisProducer. This transport requires the artifact com.amazonaws:amazon-kinesis-producer:0.14.0 (or compatible) on your classpath.


  • type - string, must be "kinesis". Required.
  • streamName - the name of the Kinesis stream. Required.
  • region - the region of the Kinesis stream. Required.
  • roleArn - the ARN of the role that is allowed to read from and write to the Kinesis stream. Optional.
  • properties - dictionary containing properties allowed by the Kinesis producer. Optional.


  • Events are serialized to JSON, and then dispatched to the Kinesis stream.
  • The partition key is generated as {jobNamespace}:{jobName}.
  • Two constructors are available: one accepting both KinesisProducer and KinesisConfig and another solely accepting KinesisConfig.
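The serialization and partition-key rules above can be sketched as follows. This is an illustration in Python rather than the transport's code. The function to_kinesis_record is hypothetical, and the returned dictionary is only a convenient container whose field names echo a Kinesis put-record call.

```python
import json


def to_kinesis_record(event: dict) -> dict:
    """Build the payload and partition key for one OpenLineage event."""
    job = event["job"]
    return {
        # Partition key follows the documented {jobNamespace}:{jobName} form.
        "PartitionKey": f'{job["namespace"]}:{job["name"]}',
        # The event itself is sent as UTF-8 encoded JSON.
        "Data": json.dumps(event).encode("utf-8"),
    }


# Hypothetical event with only the fields the sketch needs.
record = to_kinesis_record(
    {"eventType": "START", "job": {"namespace": "my-namespace", "name": "my-job"}}
)
```

Because the key depends only on the job namespace and name, all events of one job share a partition key.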


type: kinesis
streamName: your_kinesis_stream_name
region: your_aws_region
roleArn: arn:aws:iam::account-id:role/role-name
properties:
  VerifyCertificate: true
  ConnectTimeout: 6000


The console transport emits OpenLineage events directly to the console through a logger. No additional configuration is required.


Events are serialized to JSON, and each event is then logged at the INFO level to a logger named ConsoleTransport.
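As a rough analogue of this behavior, the Python sketch below logs one JSON-serialized event at INFO level through a logger named ConsoleTransport. It is not the Java implementation. The ListHandler class and emit_event function are hypothetical helpers that capture the output in memory so it can be inspected.

```python
import json
import logging

logger = logging.getLogger("ConsoleTransport")
logger.setLevel(logging.INFO)


class ListHandler(logging.Handler):
    """Collects formatted log lines in memory instead of printing them."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(
            f"{record.levelname} {record.name} {record.getMessage()}"
        )


handler = ListHandler()
logger.addHandler(handler)


def emit_event(event: dict) -> None:
    # One log call per event, with the event serialized to JSON.
    logger.info(json.dumps(event))


emit_event({"eventType": "START"})
```

Swapping ListHandler for logging.StreamHandler would print the same line to the console.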


Be cautious when using the DEBUG log level: OpenLineageClient also logs events at that level, so each event may be logged twice.


  • type - string, must be "console". Required.


type: console


Designed mainly for integration testing, the FileTransport emits OpenLineage events to a given file.


  • type - string, must be "file". Required.
  • location - string specifying the path of the file. Required.


  • If the target file is absent, it's created.
  • Events are serialized to JSON, and then appended to a file, separated by newlines.
  • Intrinsic newline characters within the event JSON are eliminated to ensure one-line events.
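The three rules above amount to writing newline-delimited JSON. A minimal Python sketch of that behavior, assuming the event arrives as an already serialized (possibly pretty-printed) JSON string; append_event is a hypothetical name, not part of the client:

```python
import json
import os
import tempfile


def append_event(location: str, event_json: str) -> None:
    """Append one event to the file as a single line."""
    # Strip raw newlines so each event occupies exactly one line.
    one_line = event_json.replace("\r", "").replace("\n", "")
    # Mode "a" creates the file if it is absent and appends otherwise.
    with open(location, "a", encoding="utf-8") as f:
        f.write(one_line + "\n")


# Write two pretty-printed events to a file in a temporary directory.
location = os.path.join(tempfile.mkdtemp(), "events.json")
append_event(location, json.dumps({"eventType": "START"}, indent=2))
append_event(location, json.dumps({"eventType": "COMPLETE"}, indent=2))

# Each line of the file parses back into one event.
with open(location, encoding="utf-8") as f:
    events = [json.loads(line) for line in f]
```

Removing raw newlines is safe for valid JSON because newlines inside string values are always escaped, so only formatting whitespace is dropped.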

Notes for Yarn/Kubernetes

This transport type is of little use for Spark/Flink applications deployed to a Yarn or Kubernetes cluster:

  • Each executor writes the file to the local filesystem of its Yarn container/K8s pod, so the resulting file is removed when that container/pod is destroyed.
  • Kubernetes persistent volumes are not destroyed after pod removal, but all executors then write to the same network disk in parallel, producing a broken file.


type: file
location: /path/to/your/file