Configuration¶
ODAS is configured through a YAML file, which is then applied to a cluster using the okctl
CLI tool.
Configuration Example¶
Here is a sample configuration file:
```yaml
ports:
  # Ports that must be exposed for clients connecting to ODAS. These ports need to
  # be accessible from where the client is connecting from.
  # This is the port for the ODAS REST API and Web UI. This needs to be accessible for
  # clients connecting from the browser.
  REST: 8083
  # The planner and worker API ports. These ports are required for all clients (e.g.
  # spark or python users) to access metadata and data.
  PLANNER_API: 12050
  WORKER_API: 13050
  # This is the port to access the presto API endpoint for users connecting via JDBC.
  PRESTO_API: 14050
cluster:
  #
  # These are configurations for the kubernetes cluster. The CIDR blocks are
  # used exclusively within the kubernetes cluster for internal communication.
  # The CIDR blocks should *not* overlap with CIDR blocks currently being used,
  # including the VPC. For example, these changes should *not* be within the
  # VPC range.
  #
  portRange: "1025-65535"
  podCidr: "172.23.0.0/16"
  serviceCidr: "172.34.0.0/16"
config:
  #
  # Configurations for deploying an ODAS cluster. Dummy values are set as
  # examples and should be replaced.
  # For configs marked [Optional], simply comment that section out if that
  # configuration is not used.
  #
  #
  # General system wide configs.
  #
  CLUSTER_NAME: Dev Cluster
  CLUSTER_LABEL: dev
  TZ: "America/New_York"
  UI_TIMEOUT_MS: 60000
  #
  # Logging and auditing directories. ODAS will need write access to this path prefix.
  #
  WATCHER_AUDIT_LOG_DST_DIR: s3://company/okera/logs
  WATCHER_LOG_DST_DIR: s3://company/okera/audit
  WATCHER_S3_REGION: us-east-1
  WATCHER_S3_ENCRYPT: true
  #
  # Users and groups (comma-separated) that have admin privileges on the catalog
  #
  CATALOG_ADMINS: admin
  #
  # MySQL database url and connection credentials.
  #
  CATALOG_DB_ENGINE: mysql
  CATALOG_DB_URL: aurora.xyz.us-east-1.rds.amazon.com:3306
  CATALOG_DB_USER: dbusername
  CATALOG_DB_PASSWORD: password
  #
  # Names of databases within the database instance where ODAS stores metadata. ODAS
  # will need read and write access to these databases and they must all be unique.
  #
  # CATALOG_DB_HMS_DB can be set to the name of your existing Hive Metastore(HMS) Database
  # (often this is called 'hive') to have the ODAS catalog share the existing HMS objects.
  #
  CATALOG_DB_HMS_DB: odas_hms
  CATALOG_DB_SENTRY_DB: odas_sentry
  CATALOG_DB_USERS_DB: odas_users
  #
  # [Optional] Configuration to enable JWT authentication
  #
  ENABLE_JWT: true
  JWT_ALGORITHM: RSA512
  JWT_PUBLIC_KEY: s3://company/okera/conf/id_rsa.512.pub
  SYSTEM_TOKEN: s3://company/okera/conf/okera.token
  #
  # [Optional] LDAP configuration
  #
  LDAP_HOST: ldap.company.com
  LDAP_PORT: 636
  LDAP_BIND_TEMPLATE: cn=%s,ou=users,dc=company,dc=com
```
Ports Configuration¶
This section of the file defines the public ports on which the cluster is accessible: the UI/REST port, the Planner API port, the Worker API port, and the Presto/JDBC API port.
You can modify these and then run `okctl update` to update an existing ODAS cluster.
Cluster Creation Configuration¶
This section is used only when the ODAS Installer deploys the Kubernetes cluster; it is not used when deploying ODAS on an existing Kubernetes cluster (such as AKS or EKS).
Settings in this section rarely need to be modified, and they can only be changed before the cluster is created.
The `okctl prepare` command uses these values to prepare the Kubernetes cluster for creation.
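The comments in the sample `cluster` section ask that `podCidr` and `serviceCidr` not overlap with any CIDR block already in use, including the VPC. A minimal sketch of such a pre-flight check with Python's standard `ipaddress` module follows. The helper name `find_overlaps` and the in-use range `10.0.0.0/16` are hypothetical; this is not something `okctl` is documented to provide.

```python
import ipaddress


def find_overlaps(cluster_cidrs, existing_cidrs):
    """Return (setting, cluster_cidr, existing_cidr) for every overlapping pair."""
    overlaps = []
    for setting, cidr in cluster_cidrs.items():
        network = ipaddress.ip_network(cidr)
        for existing in existing_cidrs:
            if network.overlaps(ipaddress.ip_network(existing)):
                overlaps.append((setting, cidr, existing))
    return overlaps


# Values taken from the sample configuration file.
cluster_cidrs = {"podCidr": "172.23.0.0/16", "serviceCidr": "172.34.0.0/16"}
# Hypothetical ranges already in use; replace with your own VPC and subnets.
in_use = ["10.0.0.0/16"]

for setting, cidr, existing in find_overlaps(cluster_cidrs, in_use):
    print(f"{setting} {cidr} overlaps with {existing}")
```

An empty result means none of the cluster ranges collide with the ranges you listed; it says nothing about ranges you did not list.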
ODAS Configuration¶
This section contains the ODAS configuration settings, and it is where you will make most of your modifications.
The example file shows a variety of configuration options, but there are others as well.
You can modify these and then run `okctl update` to update an existing ODAS cluster.
Example Scenarios¶
For both of these examples, we will assume you have a file called `odas.yaml` that contains your existing configuration.
Modifying Ports¶
Say you want to change the UI/REST port from `8083` to `8000`.
To do this, edit the `odas.yaml` file and change the `REST` value (in the `ports` section) to `8000`.
Once this is set, you can use `okctl update` to update the cluster:
$ ./okctl update --config odas.yaml
This will cause the cluster to restart and apply the updated port configuration.
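If you prefer to script the edit rather than change the file by hand, the following is a rough sketch that rewrites a single integer setting in the file's text. The helper `set_port` is hypothetical, and it assumes the setting appears exactly once on its own line with no trailing comment; it deliberately fails otherwise rather than guess.

```python
import re


def set_port(config_text, name, new_port):
    """Return config_text with the integer value of `name:` replaced by new_port."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(name)}:[ \t]*)\d+[ \t]*$", re.MULTILINE
    )
    updated, count = pattern.subn(lambda m: f"{m.group(1)}{new_port}", config_text)
    if count != 1:
        raise ValueError(f"expected exactly one '{name}:' entry, found {count}")
    return updated


# A fragment shaped like the `ports` section of odas.yaml.
sample = "ports:\n  REST: 8083\n  PLANNER_API: 12050\n"
print(set_port(sample, "REST", 8000))
```

Working on the text keeps the comments in the file intact, which a parse-and-dump round trip through a YAML library would typically lose.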
Modifying Catalog DB¶
Say you want to change which database server backs your ODAS cluster.
To do this, edit the `odas.yaml` file and put the following values in the `config` section:
CATALOG_DB_ENGINE: mysql
CATALOG_DB_URL: odasdb.cyn8yfvyuugz.us-west-2.rds.amazonaws.com
CATALOG_DB_USER: odas
CATALOG_DB_PASSWORD: odas12345!
Once this is set, you can use `okctl update` to update the cluster:
$ ./okctl update --config odas.yaml
This will cause the cluster to restart and start using the new database server.
Configuration Kubernetes Model¶
ODAS is a Kubernetes-native application and uses the `ConfigMap` and `Secret` objects in Kubernetes to store its configuration and make it available to the running cluster.
The configuration file discussed above is translated by `okctl` into these two object types.
It can be helpful to understand how this translation happens in case you want to update your configuration manually or use a different system (e.g. Helm) to set and update it.
Configuration is mounted into each running `Pod` as follows:
```yaml
...
envFrom:
- configMapRef:
    name: default-odas-config
- configMapRef:
    name: odas-config
...
volumeMounts:
- mountPath: /etc/secrets
  name: secrets
  readOnly: true
...
volumes:
- name: secrets
  secret:
    defaultMode: 420
    secretName: secrets
```
In other words:
- The `default-odas-config` `ConfigMap` is mounted as environment variables into each pod. This `ConfigMap` object stores default values that are necessary for the cluster to be functional but can be overridden.
- The `odas-config` `ConfigMap` is mounted as environment variables into each pod. This `ConfigMap` object stores values set by the user.
- The `secrets` `Secret` is mounted as a set of files under the `/etc/secrets` folder. This `Secret` object stores more sensitive values.
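From the point of view of a process inside a pod, this means ordinary settings arrive as environment variables and sensitive ones as files. The sketch below illustrates that under the mounts shown above. The helper names are hypothetical, and this is not a description of how the ODAS services themselves read their configuration.

```python
import os
from pathlib import Path

SECRETS_DIR = Path("/etc/secrets")  # mountPath from the pod spec above


def read_env_setting(name, env=None):
    """Read a value injected from a ConfigMap via envFrom (None if unset)."""
    env = os.environ if env is None else env
    return env.get(name)


def read_secret_file(key, secrets_dir=SECRETS_DIR):
    """Read one key of the mounted Secret.

    Kubernetes writes each key of a Secret volume as a file whose contents
    are the already base64-decoded bytes.
    """
    return (Path(secrets_dir) / key).read_bytes()


# Prints the value if the variable is set in this environment, else None.
print(read_env_setting("CLUSTER_NAME"))
```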
When `okctl update` is run, it does the following:
- For each value in the `ports` section, it will update the `Service` object with the updated port value.
- For each value in the `config` section, it will:
    - If it is a non-sensitive value, put it in the `odas-config` `ConfigMap`.
    - If it is a sensitive value, put it in the `secrets` `Secret`, and put a reference to that file in the `odas-config` `ConfigMap`. For example, if the configuration file has the setting `SYSTEM_TOKEN: file:///path/to/system.token`, it will be put in `secrets` as `SYSTEM_TOKEN_0: <base64 contents of /path/to/system.token>` and in `odas-config` as `SYSTEM_TOKEN: /etc/secrets/SYSTEM_TOKEN_0`.
- It will then cause each pod to restart by updating an annotation with the `SHA256` of the contents of `odas-config` and `secrets`.
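The translation described above can be sketched in a few lines of Python. This is an illustration of the pattern, not the `okctl` implementation: the set of keys treated as sensitive, the `load_bytes` callback, and the way the two objects are serialized before hashing are all assumptions made for the example. Only the `_0` suffix and the `/etc/secrets/` reference follow the `SYSTEM_TOKEN` example in the text.

```python
import base64
import hashlib
import json

# Assumption: okctl decides which keys are sensitive; this set is illustrative.
SENSITIVE_KEYS = {"SYSTEM_TOKEN"}


def split_config(config, load_bytes):
    """Split `config` into (config_map, secret) dictionaries.

    load_bytes(value) must return the raw bytes a sensitive value points to,
    e.g. the contents of file:///path/to/system.token.
    """
    config_map, secret = {}, {}
    for key, value in config.items():
        if key in SENSITIVE_KEYS:
            secret_key = f"{key}_0"
            secret[secret_key] = base64.b64encode(load_bytes(value)).decode("ascii")
            # The ConfigMap only holds a reference to the mounted file.
            config_map[key] = f"/etc/secrets/{secret_key}"
        else:
            config_map[key] = str(value)
    return config_map, secret


def restart_checksum(config_map, secret):
    """SHA256 over both objects (sorted-JSON serialization is an assumption)."""
    payload = json.dumps({"odas-config": config_map, "secrets": secret}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


config = {"CLUSTER_NAME": "Dev Cluster", "SYSTEM_TOKEN": "file:///path/to/system.token"}
# Stand-in loader so the sketch runs without the real token file.
config_map, secret = split_config(config, lambda value: b"example-token")
print(config_map["SYSTEM_TOKEN"])  # /etc/secrets/SYSTEM_TOKEN_0
print(restart_checksum(config_map, secret))
```

Because the checksum covers both objects, changing any value changes the annotation, which is what prompts Kubernetes to roll the pods.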
If you need to update the contents of these objects yourself, you can follow a similar pattern.
After updating the `Service` port definitions, `odas-config`, and/or `secrets`, you can restart all the pods by telling Kubernetes to delete the existing ones:
kubectl delete pods --all
Note
You should run the above `delete` command in the Kubernetes namespace you installed ODAS in (by default, this is the `default` namespace).
Note
Any value updated in `odas-config` or `secrets` will be the same in all ODAS pods.
To make a change for just a specific set of pods (e.g. only planner pods), you will need to edit that specific object type (e.g. `Deployment` or `DaemonSet`).
This is not recommended and should only be done in consultation with Okera support.
Path Support¶
For all settings that are considered sensitive, you can provide any of the following types of values:
- A fully qualified local path to the file, e.g. `file:///path/to/file`.
- An S3 path to the file, e.g. `s3://bucket/path/to/file`.
- An ADLS Gen1 path to the file, e.g. `adl://account.azuredatalakestorage.net/path/to/file`.
- A `base64`-encoded version of the value, e.g. `base64://<base64 contents>`.
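As a small illustration of these value types, the sketch below recognizes the four prefixes listed above and builds or decodes the `base64://` form. The helper names are hypothetical, and the use of the standard base64 alphabet is an assumption; check with Okera support if your values contain characters where the variant matters.

```python
import base64

# Prefixes from the list above, mapped to a short description.
SUPPORTED_PREFIXES = {
    "file://": "local file",
    "s3://": "S3 object",
    "adl://": "ADLS Gen1 file",
    "base64://": "inline base64 value",
}


def classify(value):
    """Return the description for value's prefix, or None if it is not listed."""
    for prefix, description in SUPPORTED_PREFIXES.items():
        if value.startswith(prefix):
            return description
    return None


def to_inline_base64(raw):
    """Encode raw bytes in the base64://<base64 contents> form."""
    return "base64://" + base64.b64encode(raw).decode("ascii")


def from_inline_base64(value):
    """Decode a base64:// value back to its raw bytes."""
    prefix = "base64://"
    if not value.startswith(prefix):
        raise ValueError("not a base64:// value")
    return base64.b64decode(value[len(prefix):])


print(classify("s3://bucket/path/to/file"))  # S3 object
print(to_inline_base64(b"secret"))           # base64://c2VjcmV0
```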