Class OpenLineage.GcpDataprocRunFacet

java.lang.Object
io.openlineage.client.OpenLineage.GcpDataprocRunFacet
All Implemented Interfaces:
OpenLineage.RunFacet
Enclosing class:
OpenLineage

public static final class OpenLineage.GcpDataprocRunFacet extends Object implements OpenLineage.RunFacet
Model class for GcpDataprocRunFacet.
  • Method Details

    • get_producer

      public URI get_producer()
      Specified by:
      get_producer in interface OpenLineage.RunFacet
      Returns:
      URI identifying the producer of this metadata. For example, this could be a Git URL with a given tag or SHA.
    • get_schemaURL

      public URI get_schemaURL()
      Specified by:
      get_schemaURL in interface OpenLineage.RunFacet
      Returns:
      The JSON Pointer (https://tools.ietf.org/html/rfc6901) URL to the corresponding version of the schema definition for this facet
    • getAppId

      public String getAppId()
      Returns:
      Application ID set by the resource manager. For Spark jobs, it is set in the Spark configuration of the current context.
    • getAppName

      public String getAppName()
      Returns:
      Application name. It may be provided by the user; otherwise the resource manager supplies a default. For Spark jobs, it is set in the Spark configuration of the current context.
    • getBatchId

      public String getBatchId()
      Returns:
      Populated only for Dataproc serverless batches. The resource ID of the batch.
    • getBatchUuid

      public String getBatchUuid()
      Returns:
      Populated only for Dataproc serverless batches. A UUID generated by the service when it creates the batch.
    • getClusterName

      public String getClusterName()
      Returns:
      Populated only for Dataproc GCE workloads. The cluster name is unique within a GCP project.
    • getClusterUuid

      public String getClusterUuid()
      Returns:
      Populated only for Dataproc GCE workloads. A UUID generated by the service at the time of cluster creation.
    • getJobId

      public String getJobId()
      Returns:
      Populated only for Dataproc GCE workloads. If not specified by the user, the job ID will be provided by the service.
    • getJobUuid

      public String getJobUuid()
      Returns:
      Populated only for Dataproc GCE workloads. A UUID that uniquely identifies a job within the project over time.
    • getProjectId

      public String getProjectId()
      Returns:
      The GCP project ID that the resource belongs to.
    • getQueryNodeName

      public String getQueryNodeName()
      Returns:
      The name of the query node in the executed Spark Plan. Often used to describe the command being executed.
    • getJobType

      public String getJobType()
      Returns:
      Identifies whether the process is a job (on a Dataproc cluster), a batch, or a session.
    • getSessionId

      public String getSessionId()
      Returns:
      Populated only for Dataproc serverless interactive sessions. The resource ID of the session, used for URL generation.
    • getSessionUuid

      public String getSessionUuid()
      Returns:
      Populated only for Dataproc serverless interactive sessions. A UUID generated by the service when it creates the session.
    • getAdditionalProperties

      public Map<String,Object> getAdditionalProperties()
      Specified by:
      getAdditionalProperties in interface OpenLineage.RunFacet
      Returns:
      additional properties
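
Because several getters above are populated only for one kind of workload (serverless batch, serverless interactive session, or GCE cluster job), a consumer usually has to check which identifiers are present before using them. The following is a minimal sketch of that check, not part of the OpenLineage client. It assumes that unpopulated getters return null, and it uses a hypothetical DataprocIds interface and Ids class that mirror a subset of the getters so the example compiles without the client library on the classpath. The describeWorkload helper and its output format are likewise illustrative.

```java
public class DataprocFacetExample {

    /** Hypothetical stand-in mirroring a subset of the facet's getters. */
    interface DataprocIds {
        String getBatchId();
        String getSessionId();
        String getClusterName();
        String getJobId();
    }

    /** Simple immutable holder used only for this illustration. */
    static final class Ids implements DataprocIds {
        private final String batchId;
        private final String sessionId;
        private final String clusterName;
        private final String jobId;

        Ids(String batchId, String sessionId, String clusterName, String jobId) {
            this.batchId = batchId;
            this.sessionId = sessionId;
            this.clusterName = clusterName;
            this.jobId = jobId;
        }

        @Override public String getBatchId() { return batchId; }
        @Override public String getSessionId() { return sessionId; }
        @Override public String getClusterName() { return clusterName; }
        @Override public String getJobId() { return jobId; }
    }

    /**
     * Builds a short label from whichever identifiers are populated.
     * Assumption: getters that do not apply to the workload return null.
     */
    static String describeWorkload(DataprocIds facet) {
        if (facet.getBatchId() != null) {
            return "batch:" + facet.getBatchId();
        }
        if (facet.getSessionId() != null) {
            return "session:" + facet.getSessionId();
        }
        if (facet.getClusterName() != null) {
            String label = "cluster:" + facet.getClusterName();
            // The job ID is also specific to GCE workloads, so append it when present.
            if (facet.getJobId() != null) {
                label += "/job:" + facet.getJobId();
            }
            return label;
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // prints "batch:my-batch"
        System.out.println(describeWorkload(new Ids("my-batch", null, null, null)));
        // prints "cluster:my-cluster/job:job-1"
        System.out.println(describeWorkload(new Ids(null, null, "my-cluster", "job-1")));
    }
}
```

With the real client, the same null checks could be applied directly to an OpenLineage.GcpDataprocRunFacet instance. getJobType() is an alternative way to distinguish the three cases, but its concrete string values are not listed on this page, so the sketch does not rely on them.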