octavia CLI

Manage Airbyte’s configurations with YAML!

Index

  1. Scope
  2. User expressed needs
  3. Say hello to octavia!
  4. Workflow
  5. Secret management
  6. Post-MVP envisaged features
  7. Technical details

Scope

This document summarizes an internal tech spec for a CLI that manages Airbyte’s configuration.

It is meant to be shared with the community to gather feedback and make sure the direction we are taking meets the community’s expectations.

We describe what we consider to be part of an MVP. The envisaged additional features and technical details are shared at the end of this document.

User expressed needs

I want to manage Airbyte configurations without UI interaction!

I want to version my Airbyte configurations with files!

I want to migrate my existing configuration to a new Airbyte instance!

I want to deploy Airbyte configurations from my CI/CD pipelines!

Say hello to octavia!

Features:

  • Scaffolding of a directory architecture that will host the YAML configs.
  • Auto-generation of YAML config files that match the resources' schemas.
  • Management of resources (CRUD) from YAML config files.
  • Safe resource updates through diff display and validation.
  • Simple secret management to avoid versioning credentials.

Workflow

  1. octavia init: Project directory creation.
  2. octavia definitions: List existing source / destination definitions.
  3. octavia create: Create a templated YAML configuration.
  4. Manually edit YAML files: Users edit the configuration according to their needs.
  5. octavia apply: Deploy the configuration on an Airbyte instance.
  6. octavia delete: Delete a configured resource.

$ octavia init

Scaffolds a local project directory in the current working directory with the following structure:

├── connections/
├── destinations/
└── sources/

The directories remain empty at this step, but this guarantees that the local project structure is valid for the octavia create command.
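The scaffolding step could be sketched as follows. This is a minimal illustration, not the actual octavia implementation: the function name `init_project` and the choice to skip directories that already exist are assumptions.

```python
from pathlib import Path

# Directory names taken from the project structure described in this document.
PROJECT_DIRECTORIES = ("connections", "destinations", "sources")


def init_project(project_root: Path) -> list[Path]:
    """Create the missing project directories and return the ones that were created.

    Hypothetical helper: existing directories are left untouched.
    """
    created = []
    for name in PROJECT_DIRECTORIES:
        directory = project_root / name
        if not directory.exists():
            directory.mkdir(parents=True)
            created.append(directory)
    return created
```

Calling `init_project(Path.cwd())` a second time creates nothing, which keeps the command safe to rerun in an existing project.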

$ octavia definitions

Display a list of existing sources and destinations that can then be created with octavia create commands:

$ octavia definitions sources

Available sources:
 - Amazon Ads (image:<connector-version>) - <connector-definition-uuid>
 - BigQuery (image:<connector-version>) - <connector-definition-uuid>
 - Cart.com (image:<connector-version>) - <connector-definition-uuid>
 - Drift (image:<connector-version>) - <connector-definition-uuid>
...

The same command variation will exist for destinations:

$ octavia definitions destinations

Custom connector definitions can be retrieved by providing the Airbyte instance URL:

$ octavia --airbyte-url http://localhost:8000 definitions sources

$ octavia create

Auto-generate a bootstrapped YAML config file with a valid structure for a future source / destination that will be created through the octavia apply command.

$ octavia create source postgres my_backend_db

  • octavia created ./sources/postgres/my_backend_db.yaml
  • octavia created ./sources/postgres/my_backend_db_secrets.yaml
  • octavia added the secret file to ./sources/postgres/.gitignore

Users are then responsible for editing the YAML files according to their needs.

The generated files have prefilled values and commented hints; these are extracted from the online Airbyte source / destination definitions.
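The three outputs listed for octavia create (config file, secret file, .gitignore entry) could be produced along these lines. This is a sketch under stated assumptions: `create_source_files` is a hypothetical name, and the file contents are placeholders, since the real templates would be derived from the connector definitions.

```python
from pathlib import Path


def create_source_files(project_root: Path, connector: str, name: str) -> tuple[Path, Path]:
    """Write placeholder config and secret files, then git-ignore the secret file."""
    directory = project_root / "sources" / connector
    directory.mkdir(parents=True, exist_ok=True)

    config_path = directory / f"{name}.yaml"
    secrets_path = directory / f"{name}_secrets.yaml"
    # Placeholder contents (assumption): real templates come from the connector definition.
    config_path.write_text("# non-sensitive configuration goes here\n")
    secrets_path.write_text("# credentials go here, this file is git-ignored\n")

    # Add the secret file to .gitignore only once, even if the command is rerun.
    gitignore_path = directory / ".gitignore"
    ignored = gitignore_path.read_text().splitlines() if gitignore_path.exists() else []
    if secrets_path.name not in ignored:
        ignored.append(secrets_path.name)
        gitignore_path.write_text("\n".join(ignored) + "\n")
    return config_path, secrets_path
```

With `connector="postgres"` and `name="my_backend_db"`, the paths match the ones shown in the command output above.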

Auto-generated YAML files:

  • ./sources/postgres/my_backend_db.yaml
  • ./sources/postgres/my_backend_db_secrets.yaml

$ octavia create connection

Connection bootstrapping will require existing valid YAML configs for the related source and destination:

$ octavia create connection --source ./sources/postgres/my_backend_db.yaml --destination ./destinations/bigquery/my_data_warehouse.yaml

  • octavia created ./connections/postgres_to_bigquery/my_backend_db_to_my_data_warehouse.yaml

This command validates the source and destination config files and creates a YAML file for the connection declaration.

Users will have to edit this file to define the connection’s schedule, catalog modifications and custom transformation declarations.
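The naming of the generated connection file can be inferred from the example: the connector directories and the file stems of the source and destination are joined with `_to_`. A sketch of that rule follows; the function name `connection_config_path` is hypothetical and the rule itself is an assumption drawn from this single example.

```python
from pathlib import Path


def connection_config_path(source_config: Path, destination_config: Path) -> Path:
    """Derive the connection YAML path from the source and destination config paths."""
    # The parent directory name is the connector type, e.g. "postgres" or "bigquery".
    source_type = source_config.parent.name
    destination_type = destination_config.parent.name
    directory = Path("connections") / f"{source_type}_to_{destination_type}"
    # The file stem is the user-chosen resource name, e.g. "my_backend_db".
    return directory / f"{source_config.stem}_to_{destination_config.stem}.yaml"
```

Deriving the path from the inputs keeps connection files discoverable without asking the user for an extra name.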

Auto-generated YAML file:

  • ./connections/postgres_to_bigquery/my_backend_db_to_my_data_warehouse.yaml

$ octavia apply

Trigger resource creation on Airbyte. The targeted Airbyte instance is declared through an AIRBYTE_URL env var or an option provided to the command.

The command will display a diff between what currently exists in Airbyte and what the YAML configuration would change, then prompt the user for validation. The validation prompt can be bypassed with the -Y flag.

Note that connection creation will work only if the related source and destination were successfully created.

$ octavia --airbyte-url http://localhost:8000 apply

$ octavia apply sources # Only apply sources

$ octavia apply destinations # Only apply destinations

$ octavia apply connections # Only apply connections

$ octavia apply -f <path_to_yaml_config> # Only apply a specific configuration
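The diff-then-validate behavior of octavia apply could look roughly like the sketch below, built on Python's standard `difflib`. The function names, the prompt wording and the choice to skip the prompt when there is no change are assumptions; `assume_yes` stands in for the -Y flag.

```python
import difflib


def render_diff(remote_config: str, local_config: str) -> str:
    """Return a unified diff between the remote state and the local YAML text."""
    diff = difflib.unified_diff(
        remote_config.splitlines(),
        local_config.splitlines(),
        fromfile="airbyte",
        tofile="local",
        lineterm="",
    )
    return "\n".join(diff)


def confirm_apply(diff_text: str, assume_yes: bool = False, ask=input) -> bool:
    """Show the diff and return True if the changes should be applied."""
    if not diff_text:
        return False  # Nothing differs, so there is nothing to apply.
    print(diff_text)
    if assume_yes:  # Equivalent of the -Y flag: skip the prompt.
        return True
    return ask("Apply these changes? [y/N] ").strip().lower() == "y"
```

The `ask` parameter defaults to `input` and can be replaced in tests or in CI, where no interactive terminal is available.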

$ octavia delete

The delete command mirrors the octavia apply command, but for resource deletion.

It will prompt the user for validation. Deleting a connected source or destination won’t be allowed while the related connection still exists.

$ octavia --airbyte-url http://localhost:8000 delete

$ octavia delete sources # Only delete sources

$ octavia delete destinations # Only delete destinations

$ octavia delete connections # Only delete connections

$ octavia delete -f <path_to_yaml_config> # Only delete a specific configuration
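The rule that a source or destination cannot be deleted while a connection still uses it could be checked as sketched here. The dictionary keys (`name`, `source_id`, `destination_id`) and both function names are hypothetical and do not reflect the actual Airbyte data model.

```python
def blocking_connections(resource_id: str, connections: list[dict]) -> list[str]:
    """Return the names of connections that still reference the given resource."""
    return [
        connection["name"]
        for connection in connections
        if resource_id in (connection["source_id"], connection["destination_id"])
    ]


def can_delete(resource_id: str, connections: list[dict]) -> bool:
    """A source or destination is deletable only when no connection references it."""
    return not blocking_connections(resource_id, connections)
```

Returning the blocking connection names, rather than a bare boolean, lets the CLI tell the user which connections to delete first.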

Secret management

Source and destination configurations contain sensitive details, such as credentials, that must never be committed.

To ensure this, octavia:

  • Generates a separate YAML config for secrets only on create.
  • Generates a .gitignore file that lists the previously generated secret file.
  • Merges the secret YAML config with the main YAML config on apply commands.

This approach lets users decide which information is considered sensitive and also raises awareness of secret management.
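The merge performed on apply could be a recursive dictionary merge in which secret values are layered on top of the main configuration. The sketch below works on already-parsed dictionaries, so YAML loading is left out; the function name `merge_secrets` and the rule that secrets win on conflicts are assumptions.

```python
def merge_secrets(config: dict, secrets: dict) -> dict:
    """Return a new dict where values from `secrets` are merged into `config`.

    Nested dicts are merged key by key; on a conflict the secret value wins.
    Neither input is modified.
    """
    merged = dict(config)
    for key, value in secrets.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_secrets(merged[key], value)
        else:
            merged[key] = value
    return merged
```

For example, merging `{"credentials": {"password": "s3cret"}}` into a config that already holds `{"credentials": {"username": "airbyte"}}` yields a `credentials` section containing both keys, while the versioned file never contains the password.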

Post MVP features

  • Import existing resources and convert them to YAML config.
  • Manage connector versions and custom additions.
  • Resource sanity check: a command could check the health of recent syncs, sources, destinations and connections.
  • Sync log access: read sync logs through a CLI command.

Technical details

  • Most of the CLI will rely on Airbyte’s existing API. We plan to provide autogenerated YAML configs from existing source / destination definitions. We’d like to rely as much as possible on OpenAPI schemas to generate these templates.
  • We plan to develop the octavia CLI in Python.
  • Updating source / destination definitions should not require an update of the CLI.

Feedback is welcome!