1 of 15

Monthly OpenLineage TSC meeting

September 8, 2021

2 of 15

Recording of calls

Reminder:

The meeting is recorded and archived on the wiki

https://wiki.lfaidata.foundation/display/OpenLineage/Monthly+TSC+meeting

3 of 15

Roll Call

TSC voting members:

Julien Le Dem

Mandy Chessell

Daniel Henneberger

Drew Banin

James Campbell

Ryan Blue

Willy Lulciuc

Zhamak Dehghani

Michael Collado

Maciej Obuchowski

4 of 15

Communication

5 of 15

Agenda

  • Update on OpenLineage latest release (0.2.1)
    • dbt integration demo
  • OpenLineage 0.3 scope discussion
    • Facet versioning mechanism (Issue #153)
    • OpenLineage Proxy Backend (Issue #152)
    • OpenLineage implementer test data and validation
    • Kafka client
  • Roadmap
    • Overview of Iceberg needs and facets
    • Beam
  • Open discussion

6 of 15

Update on latest release OpenLineage 0.2.1:

  • Marquez integrations imported in OpenLineage
  • Clients
    • python: https://pypi.org/project/openlineage-python/
    • java: https://search.maven.org/artifact/io.openlineage/openlineage-java/0.2.1/jar

7 of 15

dbt integration demo

1) pip3 install openlineage-dbt

2) define OPENLINEAGE_URL=...

3) replace dbt run with dbt-ol run
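
The three steps above can be sketched as a small Python wrapper. This is only an illustration: the helper names `openlineage_env` and `to_dbt_ol` are hypothetical and not part of the integration. Only the `OPENLINEAGE_URL` variable and the `dbt-ol` command come from the slide, and the backend URL is left for the caller to supply.

```python
import os
import shlex


def openlineage_env(url, base_env=None):
    """Return a copy of the environment with OPENLINEAGE_URL set (step 2)."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENLINEAGE_URL"] = url
    return env


def to_dbt_ol(command):
    """Swap a leading `dbt` for `dbt-ol`, keeping all other arguments (step 3)."""
    args = shlex.split(command)
    if args and args[0] == "dbt":
        args[0] = "dbt-ol"
    return args


# To run for real (after step 1, `pip3 install openlineage-dbt`), something like:
#   subprocess.run(to_dbt_ol("dbt run"), env=openlineage_env(url), check=True)
```

Keeping the command rewrite separate from the environment setup makes each step easy to check without dbt installed.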

8 of 15

OpenLineage 0.3 scope:

https://github.com/OpenLineage/OpenLineage/projects/4

  • Proxy backend progress
  • Spark improvements (Spark 3.x and datasource coverage)
  • Airflow 2.0 support
  • Great Expectations improvements

9 of 15

OL 0.3: Proxy Backend Proposal

10 of 15

Egeria Options (1)

Asynchronous integration means publishing continues even if the consumer (e.g., Egeria) is down.
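
A minimal sketch of why an asynchronous hand-off tolerates a consumer outage, assuming an in-memory queue stands in for the real transport between producer and consumer. `EventBuffer`, `publish`, and `drain` are hypothetical names chosen for illustration; they are not Egeria or OpenLineage APIs.

```python
from collections import deque


class EventBuffer:
    """In-memory stand-in for a durable log between producer and consumer."""

    def __init__(self):
        self._pending = deque()

    def publish(self, event):
        # Publishing never touches the consumer, so it succeeds even
        # while the consumer is down.
        self._pending.append(event)

    def drain(self, consumer):
        """Deliver buffered events in order; stop at the first failure."""
        delivered = 0
        while self._pending:
            try:
                consumer(self._pending[0])
            except ConnectionError:
                break  # consumer still down; keep the event for later
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```

Events published during the outage stay in the buffer and are delivered in order once `drain` is called with a working consumer.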

11 of 15

Egeria Options (2)

Direct API support eliminates the need for a proxy backend but requires Egeria to be running to capture lineage.
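
The trade-off of direct delivery can be sketched as follows. The event fields follow the core OpenLineage run event shape (`eventType`, `eventTime`, `run`, `job`, `inputs`, `outputs`, `producer`); the function names, the producer URI, and the injected `send` callable are assumptions for illustration and do not describe Egeria's actual API.

```python
import json
import uuid
from datetime import datetime, timezone


def make_run_event(event_type, namespace, job_name, run_id=None):
    """Build a minimal OpenLineage-style run event as a plain dict."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [],
        "outputs": [],
        # Hypothetical producer URI, for illustration only.
        "producer": "https://example.com/my-scheduler",
    }


def emit_direct(event, send):
    """Deliver synchronously; return False if the receiver is unreachable."""
    try:
        send(json.dumps(event))
        return True
    except ConnectionError:
        # No buffer sits in between, so this event is not captured.
        return False
```

Here `send` would wrap an HTTP POST to the receiving endpoint; passing it in as a parameter keeps the sketch independent of any particular URL or client library.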

12 of 15

Egeria ecosystem

Regardless of how OpenLineage events reach the integration daemon, multiple destinations and processing steps can be configured.
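
As a rough illustration of routing one received event to several destinations, the sketch below fans an event out to a set of handlers. The `fan_out` function and the destination names used with it are hypothetical; this is not Egeria's configuration model.

```python
def fan_out(event, destinations):
    """Send one event to every configured destination.

    Returns a dict mapping destination name to True (delivered) or False.
    """
    results = {}
    for name, handler in destinations.items():
        try:
            handler(event)
            results[name] = True
        except Exception:
            # One failing destination should not block the others.
            results[name] = False
    return results
```

Each handler is an ordinary callable, so a destination can store the event, forward it, or apply further processing.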

13 of 15

Roadmap Discussion:

  • Iceberg requirements
  • Apache Beam integration

14 of 15

Roadmap:

15 of 15

Open Discussion