Configuration

ODAS is configured through a YAML file, which is applied to a cluster using the okctl CLI tool.

Configuration Example

Here is a sample configuration file:

ports:
  # Ports that must be exposed for clients connecting to ODAS. These ports need to
  # be accessible from where the client is connecting from.

  # This is the port for the ODAS REST API and Web UI. This needs to be accessible for
  # clients connecting from the browser.
  REST: 8083

  # The planner and worker API ports. These ports are required for all clients (e.g.
  # Spark or Python users) to access metadata and data.
  PLANNER_API: 12050
  WORKER_API: 13050

  # This is the port to access the Presto API endpoint for users connecting via JDBC.
  PRESTO_API: 14050

cluster:
  #
  # These are configurations for the kubernetes cluster. The CIDR blocks are
  # used exclusively within the kubernetes cluster for internal communication.
  # The CIDR blocks should *not* overlap with CIDR blocks currently being used,
  # including the VPC. For example, these blocks should *not* be within the
  # VPC range.
  #
  portRange: "1025-65535"
  podCidr: "172.23.0.0/16"
  serviceCidr: "172.34.0.0/16"

config:
  #
  # Configurations for deploying an ODAS cluster. Dummy values are set as
  # examples and should be replaced.
  # For configs marked [Optional], simply comment that section out if that
  # configuration is not used.
  #

  #
  # General system wide configs.
  #
  CLUSTER_NAME: Dev Cluster
  CLUSTER_LABEL: dev
  TZ: "America/New_York"
  UI_TIMEOUT_MS: 60000

  #
  # Logging and auditing directories. ODAS will need write access to these path prefixes.
  #
  WATCHER_AUDIT_LOG_DST_DIR: s3://company/okera/audit
  WATCHER_LOG_DST_DIR: s3://company/okera/logs
  WATCHER_S3_REGION: us-east-1
  WATCHER_S3_ENCRYPT: true

  #
  # Users and groups (comma-separated) that have admin privileges on the catalog
  #
  CATALOG_ADMINS: admin

  #
  # MySQL database url and connection credentials.
  #
  CATALOG_DB_ENGINE: mysql
  CATALOG_DB_URL: aurora.xyz.us-east-1.rds.amazonaws.com:3306
  CATALOG_DB_USER: dbusername
  CATALOG_DB_PASSWORD: password

  #
  # Names of databases within the database instance where ODAS stores metadata. ODAS
  # will need read and write access to these databases and they must all be unique.
  #
  # CATALOG_DB_HMS_DB can be set to the name of your existing Hive Metastore (HMS) database
  # (often this is called 'hive') to have the ODAS catalog share the existing HMS objects.
  #
  CATALOG_DB_HMS_DB: odas_hms
  CATALOG_DB_SENTRY_DB: odas_sentry
  CATALOG_DB_USERS_DB: odas_users

  #
  # [Optional] Configuration to enable JWT authentication
  #
  ENABLE_JWT: true
  JWT_ALGORITHM: RSA512
  JWT_PUBLIC_KEY: s3://company/okera/conf/id_rsa.512.pub
  SYSTEM_TOKEN: s3://company/okera/conf/okera.token

  #
  # [Optional] LDAP configuration
  #
  LDAP_HOST: ldap.company.com
  LDAP_PORT: 636
  LDAP_BIND_TEMPLATE: cn=%s,ou=users,dc=company,dc=com

Ports Configuration

This section of the file defines the public ports on which the cluster is accessible. This includes the UI/REST port, Planner API port, Worker API port and the Presto/JDBC API port. You can modify these and then run okctl update to update an existing ODAS cluster.

Cluster Creation Configuration

This section applies only when you use the ODAS Installer to deploy a Kubernetes cluster. It is ignored when deploying ODAS on an existing Kubernetes cluster (such as AKS or EKS).

Settings in this section rarely need to be modified, and they can only be changed before the cluster is created. The okctl prepare command uses these values to prepare the Kubernetes cluster for creation.
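
Because the sample file warns that the cluster CIDR blocks must not overlap with ranges already in use, it can be worth checking them before running okctl prepare. The following is a minimal Python sketch and not part of okctl: the helper name find_overlaps and the VPC range 10.0.0.0/16 are assumptions made for illustration, while podCidr and serviceCidr are copied from the sample file.

```python
import ipaddress
from itertools import combinations


def find_overlaps(named_cidrs):
    """Return sorted pairs of names whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return sorted(
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    )


# podCidr and serviceCidr come from the sample file; the VPC range is hypothetical.
cidrs = {
    "podCidr": "172.23.0.0/16",
    "serviceCidr": "172.34.0.0/16",
    "vpc": "10.0.0.0/16",
}
overlaps = find_overlaps(cidrs)
print(overlaps or "no overlapping CIDR blocks")
```

Substitute your own VPC range (and any peered ranges) for the hypothetical one. Any pair that is printed should be resolved before the cluster is created.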

ODAS Configuration

This section contains the ODAS configuration settings, and it is where you will make most of your changes. The example file shows a variety of configuration options, but others are available as well. You can modify these and then run okctl update to update an existing ODAS cluster.

Example Scenarios

For both of these examples, we will assume you have a file called odas.yaml that contains your existing configuration.

Modifying Ports

Say you want to change the UI/REST port from 8083 to 8000. To do this, edit the odas.yaml file and change the REST value (in the ports section) to 8000.

Once this is set, you can use okctl update to update the cluster:

$ ./okctl update --config odas.yaml

This will cause the cluster to restart and apply the updated port configuration.

Modifying Catalog DB

Say you want to change which database server backs your ODAS cluster. To do this, edit the odas.yaml file and put the following values in the config section:

CATALOG_DB_ENGINE: mysql
CATALOG_DB_URL: odasdb.cyn8yfvyuugz.us-west-2.rds.amazonaws.com
CATALOG_DB_USER: odas
CATALOG_DB_PASSWORD: odas12345!

Once this is set, you can use okctl update to update the cluster:

$ ./okctl update --config odas.yaml

This will cause the cluster to restart and connect to the newly configured database server.

Configuration Kubernetes Model

ODAS is a Kubernetes-native application, and uses the ConfigMap and Secret objects in Kubernetes to store its configuration and make it available to the running cluster. The configuration file discussed above is translated by okctl into these two objects.

It can be helpful to understand how this translation happens in cases where you want to update your configuration manually or use a different system (e.g. Helm) to set and update it.

Configuration is mounted into each running Pod as follows:

...
envFrom:
- configMapRef:
    name: default-odas-config
- configMapRef:
    name: odas-config
...
volumeMounts:
- mountPath: /etc/secrets
  name: secrets
  readOnly: true
...
volumes:
- name: secrets
  secret:
    defaultMode: 420
    secretName: secrets

In other words:

  1. The default-odas-config ConfigMap is exposed as environment variables in each pod. This ConfigMap object stores default values that are necessary for the cluster to be functional but can be overridden.
  2. The odas-config ConfigMap is exposed as environment variables in each pod. This ConfigMap object stores values set by the user.
  3. The secrets Secret is mounted as a set of files under the /etc/secrets folder. This Secret object stores more sensitive values.
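
To illustrate the layering, the sketch below models the two ConfigMaps as dictionaries and resolves settings the way a process inside a pod might see them. It relies on the Kubernetes behavior that, when the same key appears in more than one envFrom source, the later source takes precedence. The helper names, the sample values, and the rule that a value pointing into the secrets folder is read from the file are assumptions for illustration, not a description of ODAS internals. A temporary directory stands in for /etc/secrets.

```python
import os
import tempfile


def effective_env(*config_maps):
    """Merge ConfigMap data in envFrom order; later sources override earlier ones."""
    merged = {}
    for data in config_maps:
        merged.update(data)
    return merged


def read_setting(env, key, secrets_dir):
    """Return a setting, reading it from disk when it points into secrets_dir."""
    value = env[key]
    if value.startswith(secrets_dir + os.sep):
        with open(value) as handle:
            return handle.read().strip()
    return value


with tempfile.TemporaryDirectory() as secrets_dir:
    # Stand-in for a file mounted from the secrets Secret.
    token_path = os.path.join(secrets_dir, "SYSTEM_TOKEN_0")
    with open(token_path, "w") as handle:
        handle.write("example-token\n")

    defaults = {"UI_TIMEOUT_MS": "30000", "TZ": "UTC"}  # hypothetical default-odas-config
    user = {"UI_TIMEOUT_MS": "60000", "SYSTEM_TOKEN": token_path}  # hypothetical odas-config

    env = effective_env(defaults, user)
    print(env["UI_TIMEOUT_MS"])  # 60000 (the user value overrides the default)
    print(env["TZ"])  # UTC (only set in the defaults)
    print(read_setting(env, "SYSTEM_TOKEN", secrets_dir))  # example-token
```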

When okctl update is run, it does the following:

  1. For each value in the ports section, it will update the Service object with the updated port value.
  2. For each value in the config section, it will:

    1. If it is a non-sensitive value, put it in the odas-config ConfigMap.
    2. If it is a sensitive value, put it in the secrets Secret, and put a reference to that file in the odas-config ConfigMap. For example, if the configuration file has the following setting:

      SYSTEM_TOKEN: file:///path/to/system.token
      

      It will be put as follows in secrets:

      SYSTEM_TOKEN_0: <base64 contents of /path/to/system.token>
      

      And as follows in odas-config:

      SYSTEM_TOKEN: /etc/secrets/SYSTEM_TOKEN_0
      
  3. It will then cause each pod to restart by updating an annotation with the SHA256 of the contents of odas-config and secrets.
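
The steps above can be approximated in a few lines of Python. This is an illustrative sketch and not okctl's implementation: the set of sensitive keys, the function names, and the exact bytes fed into the SHA256 are all assumptions, and the _0 suffix simply mirrors the example shown above. File reading is passed in as a function so the sketch runs without a real token file.

```python
import base64
import hashlib
import json

SECRETS_DIR = "/etc/secrets"
# Assumption: this document does not list which keys okctl treats as sensitive.
SENSITIVE_KEYS = {"SYSTEM_TOKEN"}


def translate(config, read_file):
    """Split the config section into (ConfigMap data, Secret data)."""
    config_map, secret = {}, {}
    for key, value in config.items():
        if key in SENSITIVE_KEYS:
            secret_name = f"{key}_0"
            # Secret values are stored base64-encoded.
            secret[secret_name] = base64.b64encode(read_file(value)).decode("ascii")
            # The ConfigMap only holds the path of the mounted file.
            config_map[key] = f"{SECRETS_DIR}/{secret_name}"
        else:
            config_map[key] = str(value)
    return config_map, secret


def restart_checksum(config_map, secret):
    """SHA256 over a canonical JSON dump of both objects (encoding is an assumption)."""
    payload = json.dumps([config_map, secret], sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def fake_read(path):
    # Stand-in for reading file:///path/to/system.token from disk.
    return b"example-token"


config = {"CLUSTER_NAME": "Dev Cluster", "SYSTEM_TOKEN": "file:///path/to/system.token"}
config_map, secret = translate(config, fake_read)
print(config_map["SYSTEM_TOKEN"])  # /etc/secrets/SYSTEM_TOKEN_0
print(sorted(secret))  # ['SYSTEM_TOKEN_0']
print(restart_checksum(config_map, secret))
```

Because the checksum changes whenever either object changes, writing it into a pod template annotation is enough to make Kubernetes roll the pods.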

If you need to update the contents of these objects yourself, you can follow a similar pattern. After updating the Service port definitions, odas-config and/or secrets, you can restart all the pods by telling Kubernetes to delete the existing ones:

kubectl delete pods --all

Note

Run the delete command above in the Kubernetes namespace in which you installed ODAS (unless you chose otherwise, this is the default namespace).

Note

Any value updated in odas-config or secrets will be the same in all ODAS pods. To make a change for just a specific set of pods (e.g. only planner pods), you will need to edit that specific object type (e.g. Deployment or DaemonSet). This is not recommended and should only be done in consultation with Okera support.

Path Support

For any setting that is considered sensitive, you can supply one of the following types of values:

  1. A local fully qualified path to the file, e.g. file:///path/to/file.
  2. An S3 path to the file, e.g. s3://bucket/path/to/file.
  3. An ADLS Gen1 path to the file, e.g. adl://account.azuredatalakestorage.net/path/to/file.
  4. A base64-encoded version of the value, e.g. base64://<base64 contents>.
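
As a rough illustration of these four forms, the Python sketch below classifies a setting value by its prefix and builds a base64:// value from raw bytes. The helper names describe_value and inline_value are hypothetical, and the sketch only inspects the string; it does not reflect how okctl actually fetches or validates the referenced content.

```python
import base64

# Prefixes listed in this section, mapped to a short description.
KNOWN_SCHEMES = {
    "file": "local file",
    "s3": "S3 object",
    "adl": "ADLS Gen1 file",
    "base64": "inline base64 value",
}


def describe_value(value):
    """Return (kind, remainder) for a sensitive setting value."""
    scheme, separator, remainder = value.partition("://")
    if not separator or scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unsupported value: {value!r}")
    return KNOWN_SCHEMES[scheme], remainder


def inline_value(raw):
    """Build a base64:// setting from raw bytes."""
    return "base64://" + base64.b64encode(raw).decode("ascii")


print(describe_value("file:///path/to/file"))  # ('local file', '/path/to/file')
print(describe_value("s3://bucket/path/to/file"))  # ('S3 object', 'bucket/path/to/file')
print(inline_value(b"secret"))  # base64://c2VjcmV0
```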