The Okera Configuration File

Okera reads its configuration from a YAML file (values.yaml). The settings in this file are applied to an Okera cluster through Helm charts, either when the cluster is deployed or when it is upgraded.

Configuration Example

Here is a sample configuration file, similar to what you receive when you first install the Okera Helm chart.

## Values.yaml for Okera helm charts

## Okera docker image settings
image:
  repository: quay.io/okera
  pullPolicy: IfNotPresent
  tag: "2.11.0"

env:
# values can be one of the following: aws/gcp/azure. Default set to "aws"
  cloud: "aws"

## Okera common configurations
common:
  configs:
    CLUSTER_NAME: "Demo Insecure Cluster"
    CLUSTER_LABEL: "Example Cluster"
    CATALOG_ADMINS: "admin"
#    TZ: "America/New_York"    
#    ENABLE_USERS_FILE: "true"
#    USERS_FILE_LDAP: "/etc/secrets/users"
#    SYSTEM_TOKEN: "/etc/secrets/system-token"
#    UI_TIMEOUT_MS: "60000"
#    OKERA_LEGACY_TOKEN_ESCAPE: "false"
#    ENABLE_PARAMETRIZED_URI_GRANTS: "false"
#    ENABLE_PARAMETRIZED_GRANTS: "false"
#    ENABLE_LEGACY_URI_CHECKS: "false"
#    AUTOTAGGER_CONFIGURATION: "/opt/okera/data/autotagger-config-wellknowns.json"
#    DISABLE_WORKSPACE_DOWNLOAD_BUTTON: "false"

#    OKERA_STAGING_DIR: "" # This can be an Amazon S3, ADLS, or GCS path
#    AUDIT_LOGS_SYNC_FREQUENCY_MINS: "30"
#    OKERA_SCRIPTS_DIR: "/opt/scripts"
#    USER_ATTRIBUTES_SCRIPT: /etc/secrets/USER_ATTRIBUTE_SCRIPT_1,/etc/secrets/USER_ATTRIBUTE_SCRIPT_2

## Set threshold for large queries, in bytes. Queries larger than this are rejected.
#    MAX_REQUEST_SIZE_BYTES: "52751601"

## Logging and auditing directories. Okera will need write access to this path prefix.
#    WATCHER_AUDIT_LOG_DST_DIR: "s3://company/okera/audit"
#    WATCHER_LOG_DST_DIR: "s3://company/okera/logs"
#    WATCHER_S3_REGION: "us-east-1"
#    WATCHER_S3_ENCRYPT: "true"
#    WATCHER_LOG_PARTITIONED_UPLOADS: "false"
#    REST_SERVER_LOG_LEVEL: "DEBUG"

## Proxy pushdown mode policy enforcement parameters
#    PRESTO_ENABLE_PROXY: "true"
#    PRESTO_ENABLE_QUERY_LOGGING: "false"
#    PRESTO_PROXY_JDBC_PUSHDOWN: "true"

#    PRESTO_PROXY_DEBUG_ENABLED: "true"

#    PRESTO_SHOULD_USE_RESOURCE_GROUPS: "false"

## Snowflake policy synchronization parameters
#    POLICY_SYNC_INTERVAL: "1800"

#    POLICY_SYNC_ROLE_PATTERN: "OKERA_%s"
#    POLICY_SYNC_SCHEDULER_ENABLED: "true"

## Enable OkeraEnsemble Amazon S3 Access Proxy
#    REST_SERVER_ENABLE_ACCESS_PROXY: "true"

## Configure the JWKS endpoint
#    JWT_JWKS_URL: ""

## [Optional] Set RS_ARGS
#    RS_ARGS: ""

## [Optional] OAUTH configuration
#    OAUTH_PROVIDER: "google"
#    OAUTH_SECRETS: "/etc/secrets/oauth-secrets.json"     #Refer to files in common.secret_files
#    OAUTH_SCOPES: "openid profile email api:///okera/okera_auth_scope"

## [Optional] GCP configuration
#    GROUP_RESOLUTION_GOOGLE_APPLICATION_CREDENTIALS: "/etc/secrets/google-credentials"   #Refer to files in common.secret_files
#    GSUITE_GROUP_ADMIN_EMAIL: ""

  secret_strings:
#    token.jwt: ""
  secret_files:
#    system-token: files/system.token
#    users: files/users.json
#    oauth-secrets: files/oauth-secrets.json
#    google-credentials: files/google-credentials.json
#    USER_ATTRIBUTE_SCRIPT_1: files/user_attr_script_1.sh
#    USER_ATTRIBUTE_SCRIPT_2: files/user_attr_script_2.sh

## Database configurations

db:
  enabled: false
  configs:
## Catalog DB engine types:  mysql / postgres
#    CATALOG_DB_ENGINE: "mysql"
#    CATALOG_DB_URL: "db.example.example:3306"
#    CATALOG_DB_USER: "okera"

## Names of databases within the database instance where Okera stores metadata. Okera
## will need read and write access to these databases and they must all be unique.
## CATALOG_DB_HMS_DB can be set to the name of your existing Hive Metastore (HMS) database
## (often this is called 'hive') to have the Okera catalog share the existing HMS objects.

#    CATALOG_DB_HMS_DB: odas_hms
#    CATALOG_DB_SENTRY_DB: odas_sentry
#    CATALOG_DB_USERS_DB: odas_users

## Enable Hive Metastore (HMS) 2 Schema
#    ENABLE_HMS_2_SCHEMA: "false"

  secret_strings:
#    CATALOG_DB_PASSWORD: ""
  secret_files:
#    CATALOG_DB_PASSWORD: files/catalog_db_password

## SSL configurations
ssl:
  enabled: false
  configs:
#    SSL_CERTIFICATE_FILE: "/etc/secrets/SSL_CERTIFICATE_FILE"
#    SSL_KEY_FILE: "/etc/secrets/SSL_KEY_FILE"
  secret_files:
#    SSL_CERTIFICATE_FILE: files/ssl.crt
#    SSL_KEY_FILE: files/ssl.key

## JWT configurations
jwt:
  enabled: false
  configs:
#    JWT_ALGORITHM: "RSA512"
#    JWT_PUBLIC_KEY: "/etc/secrets/JWT_PUBLIC_KEY"
#    JWT_PRIVATE_KEY: "/etc/secrets/JWT_PRIVATE_KEY"
  secret_files:
#    JWT_PUBLIC_KEY: files/jwt_pub_key.pem
#    JWT_PRIVATE_KEY: files/jwt_priv_key.pem

## LDAP configurations
ldap:
  enabled: false
  configs:
#    GROUP_RESOLVER_LDAP_HOST: "ldaps://"
#    GROUP_RESOLVER_LDAP_PORT: "636"
#    GROUP_RESOLVER_LDAP_USE_SSL: "true"
#    GROUP_RESOLVER_LDAP_BASE_DN: ""
#    GROUP_RESOLVER_LDAP_USER_SEARCH_FILTER: "(&(objectClass=person)(uid={0}@company.com))"
#    GROUP_RESOLVER_LDAP_GROUP_SEARCH_FILTER: "(objectClass=groupofUniqueNames)"
#    GROUP_RESOLVER_LDAP_MEMBER_FIELD_NAME: "uniqueMember"
#    GROUP_RESOLVER_LDAP_USER: ""
  secret_strings:
#    GROUP_RESOLVER_LDAP_PASSWORD: ""
#    LDAP_USER_QUERY_SERVICE_PASSWORD: ""
  secret_files:
#    GROUP_RESOLVER_LDAP_PASSWORD: files/group_resolver_ldap_password
#    LDAP_USER_QUERY_SERVICE_PASSWORD: files/ldap_user_query_service_password

## Kubernetes service configurations
## Values for Service type can be one of the following: LoadBalancer/NodePort/ClusterIP
service:
  type: NodePort
  annotations:
#    service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
#    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp

## Kubernetes service-account to use
serviceaccount:
  name: default

## Kubernetes node labels to use
nodeSelector:
#  application: "okera"

## Kubernetes tolerations to use in case the nodes are tainted
tolerations:
# - key: "application"
#   value: "okera"
#   operator: "Equal"
#   effect: "NoSchedule"


extraVolumeMounts:
# - name: extra-volume-1
#   mountPath: /var/lib/extravolumes/1

extraVolumes:
#  - name: extra-volume-1
#    emptyDir: {}

pod:
  annotations:
#    key: value

To enable a parameter in this sample values.yaml file, remove the leading # to uncomment it. You can also add your own parameters in the common.configs section of the file. For the parameters you can specify in this YAML configuration file, see Okera Configuration Parameter Reference.
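
As a minimal sketch of what uncommenting looks like, the fragment below enables two of the sample's commented parameters (TZ and UI_TIMEOUT_MS) and keeps the values the sample shows. Which parameters you enable, and with what values, depends on your deployment; these two were chosen only for illustration.

```yaml
common:
  configs:
    CLUSTER_NAME: "Demo Insecure Cluster"
    CLUSTER_LABEL: "Example Cluster"
    CATALOG_ADMINS: "admin"
    # Uncommented from the sample: the leading "#" is removed and the key
    # is indented to the same level as the other keys under configs.
    TZ: "America/New_York"
    UI_TIMEOUT_MS: "60000"
```

Note that an uncommented key must line up with its sibling keys; YAML treats indentation as structure.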

Ports Configuration

At this time, if you need to use port numbers other than the Okera defaults, contact Okera. Some customers use Kubernetes Ingress rules to achieve this.
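
For orientation only, here is a sketch of what a Kubernetes Ingress rule can look like. It is a standard networking.k8s.io/v1 Ingress manifest, not an Okera-provided one: the resource name, hostname, backend Service name, and backend port are all hypothetical placeholders, and Ingress resources route HTTP and HTTPS traffic only.

```yaml
# Hypothetical Ingress; every name, host, and port below is a placeholder.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: okera-ui-ingress              # hypothetical resource name
spec:
  rules:
    - host: okera.example.com         # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: okera-web-service   # assumed Service name; check your cluster
                port:
                  number: 8083            # assumed backend port; substitute your own
```

Clients then connect on the port exposed by your Ingress controller rather than on the backend port directly. Confirm the Service names and ports in your own cluster before adapting this.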

Release Configuration

Okera uses quay.io to build, analyze, and distribute its code. The image.tag setting specifies the version of Okera to deploy:

image:
  repository: quay.io/okera
  pullPolicy: IfNotPresent
  tag: "2.11.0"

Settings in this section rarely need to be modified, and they can be changed only during a cluster's initial deployment.

Environment Configuration

The env section of the file identifies the cloud environment in which Okera is installed: Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure. Valid values are aws, gcp, and azure; the default is aws.

env:
# values can be one of the following: aws/gcp/azure. Default set to "aws"
  cloud: "aws"
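
For a cluster running on a cloud other than AWS, only the cloud value changes. A sketch for Google Cloud Platform, using one of the valid values listed above:

```yaml
env:
  # One of aws, gcp, or azure; the sample defaults to "aws".
  cloud: "gcp"
```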

Okera General Configuration Parameters

The cluster configuration parameters are specified in the common.configs section:

## Okera common configurations
common:
  configs:
    CLUSTER_NAME: "Demo Insecure Cluster"
    CLUSTER_LABEL: "Example Cluster"
    CATALOG_ADMINS: "admin"
#    TZ: "America/New_York"    
#    ENABLE_USERS_FILE: "true"
#    USERS_FILE_LDAP: "/etc/secrets/users"
#    SYSTEM_TOKEN: "/etc/secrets/system-token"
...
<more configuration parameters>
...

  secret_strings:
#    token.jwt: ""
  secret_files:
#    system-token: files/system.token
#    users: files/users.json
#    oauth-secrets: files/oauth-secrets.json
#    google-credentials: files/google-credentials.json
#    USER_ATTRIBUTE_SCRIPT_1: files/user_attr_script_1.sh
#    USER_ATTRIBUTE_SCRIPT_2: files/user_attr_script_2.sh

The CLUSTER_NAME parameter specifies the name of the Okera cluster, and the CLUSTER_LABEL parameter specifies a brief label or description for the cluster. The CATALOG_ADMINS parameter specifies a list of Okera system administrators.

As in the full sample, remove the # to uncomment a parameter, or add your own parameters in the common.configs section. The available parameters are listed in Okera Configuration Parameter Reference.

The secret_files section of the common configuration identifies the location of files that your Okera installation references. In the example above, the keys in common.secret_files are referenced from common.configs by the SYSTEM_TOKEN, USERS_FILE_LDAP, OAUTH_SECRETS, GROUP_RESOLUTION_GOOGLE_APPLICATION_CREDENTIALS, and USER_ATTRIBUTES_SCRIPT configuration parameters. When Helm encounters one of these configuration parameters, it inserts the corresponding file.

For example, the SYSTEM_TOKEN parameter references the common.secret_files.system-token parameter, which identifies the location of the JWT token file that Okera uses for interservice communication. When Helm encounters the SYSTEM_TOKEN configuration parameter, it uses the token found in that file as the SYSTEM_TOKEN value.
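
The pairing between the two sections can be seen by uncommenting just the SYSTEM_TOKEN lines from the sample. This is a sketch that uses the sample's own values; note that the last segment of the path in configs matches the key name in secret_files.

```yaml
common:
  configs:
    # References the "system-token" key under secret_files.
    SYSTEM_TOKEN: "/etc/secrets/system-token"
  secret_files:
    # Location of the JWT token file, as given in the sample.
    system-token: files/system.token
```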

Likewise, the USER_ATTRIBUTES_SCRIPT parameter references the common.secret_files.USER_ATTRIBUTE_SCRIPT_1 and common.secret_files.USER_ATTRIBUTE_SCRIPT_2 parameters, which identify the locations of two custom scripts for script-sourced user attributes. When Helm encounters the USER_ATTRIBUTES_SCRIPT configuration parameter in the common.configs section, it inserts and uses the user attribute scripts referenced by these two parameters.
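
A sketch of the same pairing for two user attribute scripts follows. The comma-separated USER_ATTRIBUTES_SCRIPT value is taken from the sample; the second script's file name is an assumption made for illustration.

```yaml
common:
  configs:
    # Comma-separated list; each entry ends in a key defined under secret_files.
    USER_ATTRIBUTES_SCRIPT: /etc/secrets/USER_ATTRIBUTE_SCRIPT_1,/etc/secrets/USER_ATTRIBUTE_SCRIPT_2
  secret_files:
    USER_ATTRIBUTE_SCRIPT_1: files/user_attr_script_1.sh
    USER_ATTRIBUTE_SCRIPT_2: files/user_attr_script_2.sh   # assumed name of the second script
```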

SSL Configuration Parameters

SSL configuration parameters are specified in the ssl section:

## SSL configurations
ssl:
  enabled: false
  configs:
#    SSL_CERTIFICATE_FILE: "/etc/secrets/SSL_CERTIFICATE_FILE"
#    SSL_KEY_FILE: "/etc/secrets/SSL_KEY_FILE"
  secret_files:
#    SSL_CERTIFICATE_FILE: files/ssl.crt
#    SSL_KEY_FILE: files/ssl.key

Specify true for the enabled parameter to enable SSL. When you do this, be sure to uncomment the SSL_CERTIFICATE_FILE and SSL_KEY_FILE parameters.

In the secret_files section of the SSL configuration, the SSL_CERTIFICATE_FILE and SSL_KEY_FILE parameters identify the location of the certificate and key files that Okera uses for SSL authentication. Do not change the SSL_CERTIFICATE_FILE and SSL_KEY_FILE parameters in the configs section; they reference the parameters in the secret_files section.

In the secret_files section, the following configuration parameters should be specified.

Parameter	Description
`SSL_CERTIFICATE_FILE`	Specify the path to the SSL certificate file.
`SSL_KEY_FILE`	Specify the path to the SSL key file.

Okera has two requirements for the certificate file:

It must be in PEM format.
It must contain the full certificate chain. Omitting the full chain may cause some clients to reject the certificate.

Note: For Let's Encrypt certificates, the full chain of certificates is in the fullchain.pem file.
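
Putting the SSL settings together, an enabled ssl section might look like the sketch below. The configs entries are left exactly as in the sample. The two paths under secret_files are hypothetical and assume a Let's Encrypt certificate whose fullchain.pem and privkey.pem files were copied into the files directory.

```yaml
ssl:
  enabled: true
  configs:
    # Leave these as they are; they reference the secret_files keys below.
    SSL_CERTIFICATE_FILE: "/etc/secrets/SSL_CERTIFICATE_FILE"
    SSL_KEY_FILE: "/etc/secrets/SSL_KEY_FILE"
  secret_files:
    SSL_CERTIFICATE_FILE: files/fullchain.pem   # hypothetical path; PEM file with the full chain
    SSL_KEY_FILE: files/privkey.pem             # hypothetical path to the matching key
```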

JWT Configuration Parameters

JSON Web Token (JWT) parameters are specified in the jwt section.

## JWT configurations
jwt:
  enabled: false
  configs:
#    JWT_ALGORITHM: "RSA512"
#    JWT_PUBLIC_KEY: "/etc/secrets/JWT_PUBLIC_KEY"
#    JWT_PRIVATE_KEY: "/etc/secrets/JWT_PRIVATE_KEY"
  secret_files:
#    JWT_PUBLIC_KEY: files/jwt_pub_key.pem
#    JWT_PRIVATE_KEY: files/jwt_priv_key.pem

Specify true for the enabled parameter to enable use of JWTs. When you do this, be sure to uncomment and add other JWT parameters as appropriate.

In the secret_files section of the JWT configuration, the JWT_PUBLIC_KEY and JWT_PRIVATE_KEY parameters identify the location of the public and private key files that Okera uses for JWT authentication. Do not change the JWT_PUBLIC_KEY and JWT_PRIVATE_KEY parameters in the configs section; they reference the parameters in the secret_files section.

In the configs section:

Parameter	Description
`JWT_ALGORITHM`	Specify the algorithm used by JWT. Valid values are `RSA256` and `RSA512`. `RSA512` is the default. You can specify multiple, comma-separated values, just as with `JWT_PUBLIC_KEY`. The order of the values must match the order specified for the corresponding `JWT_PUBLIC_KEY`. See Public and Private Key Validation.
`JWT_PRIVATE_KEY`	Do not change this parameter. This references the `JWT_PRIVATE_KEY` setting in the `secret_files` section.
`JWT_PUBLIC_KEY`	Do not change this parameter. This references the `JWT_PUBLIC_KEY` setting in the `secret_files` section.

In the secret_files section, the following configuration parameters should be specified.

Parameter	Description
`JWT_PRIVATE_KEY`	Specify the path to the private key used to encode Okera-generated JWTs. Private keys should not be specified as multiple, comma-separated values. See Public and Private Key Validation.
`JWT_PUBLIC_KEY`	Specify the path to the public key used to decode JWTs. You can specify multiple, comma-separated JWT public keys. When used to decode an incoming token, they are attempted in the order specified. See Public and Private Key Validation.
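
As a sketch, an enabled jwt section with a single key pair is the sample with the comment markers removed and enabled set to true. All values below are the sample's own; replace the two file paths under secret_files with the locations of your key files.

```yaml
jwt:
  enabled: true
  configs:
    JWT_ALGORITHM: "RSA512"
    # Leave these two as they are; they reference the secret_files keys below.
    JWT_PUBLIC_KEY: "/etc/secrets/JWT_PUBLIC_KEY"
    JWT_PRIVATE_KEY: "/etc/secrets/JWT_PRIVATE_KEY"
  secret_files:
    JWT_PUBLIC_KEY: files/jwt_pub_key.pem
    JWT_PRIVATE_KEY: files/jwt_priv_key.pem
```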

To validate JWTs with a public key, set JWT_PUBLIC_KEY to the full path of the public key used to decode JWTs, and set JWT_PRIVATE_KEY to the full path of the private key used to encode Okera-generated JWTs.

Note: These keys must be in OpenSSL PKCS#8 format.

To configure the algorithm, set JWT_ALGORITHM to a string that names the algorithm used. The supported algorithms are currently RSA256 and RSA512.

For example, set the following settings in the configuration file:

JWT_PUBLIC_KEY: file:///etc/id_rsa.512.pub
JWT_PRIVATE_KEY: file:///etc/id_rsa.512
JWT_ALGORITHM: RSA512

Okera supports configuring multiple public keys and algorithms for validating JWTs. To do this, specify the public keys and algorithms in comma-delimited lists. When a token is passed, the public keys are tried in the order listed, and the token is considered valid as soon as one of the keys validates it.

Note: There must be the same number of algorithms specified as public keys and the algorithm order must correspond to the public key order.

For example:

JWT_PUBLIC_KEY: file:///etc/id_rsa.512.pub,file:///etc/external_vendor.256.pub
JWT_PRIVATE_KEY: file:///etc/id_rsa.512
JWT_ALGORITHM: RSA512,RSA256

LDAP Configuration Parameters

LDAP configuration parameters are specified in the ldap section.

## LDAP configurations
ldap:
  enabled: false
  configs:
#    GROUP_RESOLVER_LDAP_HOST: "ldaps://"
#    GROUP_RESOLVER_LDAP_PORT: "636"
#    GROUP_RESOLVER_LDAP_USE_SSL: "true"
#    GROUP_RESOLVER_LDAP_BASE_DN: ""
#    GROUP_RESOLVER_LDAP_USER_SEARCH_FILTER: "(&(objectClass=person)(uid={0}@company.com))"
#    GROUP_RESOLVER_LDAP_GROUP_SEARCH_FILTER: "(objectClass=groupofUniqueNames)"
#    GROUP_RESOLVER_LDAP_MEMBER_FIELD_NAME: "uniqueMember"
#    GROUP_RESOLVER_LDAP_USER: ""
  secret_strings:
#    GROUP_RESOLVER_LDAP_PASSWORD: ""
#    LDAP_USER_QUERY_SERVICE_PASSWORD: ""
  secret_files:
#    GROUP_RESOLVER_LDAP_PASSWORD: files/group_resolver_ldap_password
#    LDAP_USER_QUERY_SERVICE_PASSWORD: files/ldap_user_query_service_password

Specify true for the enabled parameter to enable use of LDAP. When you do this, be sure to uncomment and add other LDAP parameters as appropriate.

In the configs section, specify parameters as follows:

Parameter	Description
`GROUP_RESOLVER_LDAP_BASE_DN`	Specify the base distinguished name (DN) to use for LDAP authentication, if the username appears in the DN.
`GROUP_RESOLVER_LDAP_GROUP_BASE_DN`	Specify the base DN for groups during group resolution. The LDAP server uses this base DN in its group searches during authentication processing.
`GROUP_RESOLVER_LDAP_GROUP_SEARCH_FILTER`	Specify the group search filter used to limit LDAP queries to only return group-type objects during LDAP group resolution.
`GROUP_RESOLVER_LDAP_HOST`	Specify the URL for the LDAP server to use for LDAP authentication.
`GROUP_RESOLVER_LDAP_MEMBER_FIELD_NAME`	Specify the name of the LDAP attribute on group objects that lists group members (for example, `uniqueMember`). It is used to limit the search to only groups in which the user is a member.
`GROUP_RESOLVER_LDAP_PORT`	Specify the port number of the LDAP server. Port values are typically `389` (non-SSL connections) or `636` (SSL connections).
`GROUP_RESOLVER_LDAP_USE_SSL`	Enables or disables SSL for LDAP authentication. Valid values are `true` (enable SSL) and `false` (disable SSL).
`GROUP_RESOLVER_LDAP_USER_BASE_DN`	Specify the base DN for users during group resolution. The LDAP server uses this base DN in its user searches during authentication processing.
`GROUP_RESOLVER_LDAP_USER_SEARCH_FILTER`	Specify the user search filter used to limit LDAP queries to only return user-type objects during LDAP group resolution.
`GROUP_RESOLVER_LDAP_USER`	Specify the service account username used for LDAP group resolution.

In the secret_files section of the LDAP configuration, the GROUP_RESOLVER_LDAP_PASSWORD and LDAP_USER_QUERY_SERVICE_PASSWORD parameters identify the location of the service account and user query service passwords that Okera uses for LDAP authentication. Do not change the GROUP_RESOLVER_LDAP_PASSWORD and LDAP_USER_QUERY_SERVICE_PASSWORD parameters in the secret_strings section; they reference the parameters in the secret_files section.

Parameter	Description
`GROUP_RESOLVER_LDAP_PASSWORD`	Specify the service account password used for LDAP group resolution.
`LDAP_USER_QUERY_SERVICE_PASSWORD`	Specify the user query service password used by Okera for LDAP authentication.
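
The sketch below shows one way an enabled ldap section could be filled in. The host, base DN, and service account DN are hypothetical values for a fictional example.com directory. The search filters, member field name, port, and password file path are copied from the sample.

```yaml
ldap:
  enabled: true
  configs:
    GROUP_RESOLVER_LDAP_HOST: "ldaps://ldap.example.com"    # hypothetical server URL
    GROUP_RESOLVER_LDAP_PORT: "636"                         # typical port for SSL connections
    GROUP_RESOLVER_LDAP_USE_SSL: "true"
    GROUP_RESOLVER_LDAP_BASE_DN: "dc=example,dc=com"        # hypothetical base DN
    GROUP_RESOLVER_LDAP_USER_SEARCH_FILTER: "(&(objectClass=person)(uid={0}@company.com))"
    GROUP_RESOLVER_LDAP_GROUP_SEARCH_FILTER: "(objectClass=groupofUniqueNames)"
    GROUP_RESOLVER_LDAP_MEMBER_FIELD_NAME: "uniqueMember"
    GROUP_RESOLVER_LDAP_USER: "cn=okera-service,dc=example,dc=com"   # hypothetical service account
  secret_files:
    # File containing the service account password, as in the sample.
    GROUP_RESOLVER_LDAP_PASSWORD: files/group_resolver_ldap_password
```

Supplying the password through secret_files keeps it out of the values.yaml file itself.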

Database Configuration Parameters

Parameters affecting the Okera metadata store (the Okera database) are specified in the db section.

## Database configurations

db:
  enabled: false
  configs:
## Catalog DB engine types:  mysql / postgres
#    CATALOG_DB_ENGINE: "mysql"
#    CATALOG_DB_URL: "db.example.example:3306"
#    CATALOG_DB_USER: "okera"

## Names of databases within the database instance where Okera stores metadata. Okera
## will need read and write access to these databases and they must all be unique.
## CATALOG_DB_HMS_DB can be set to the name of your existing Hive Metastore (HMS) database
## (often this is called 'hive') to have the Okera catalog share the existing HMS objects.

#    CATALOG_DB_HMS_DB: odas_hms
#    CATALOG_DB_SENTRY_DB: odas_sentry
#    CATALOG_DB_USERS_DB: odas_users

## Enable Hive Metastore (HMS) 2 Schema
#    ENABLE_HMS_2_SCHEMA: "false"

  secret_strings:
#    CATALOG_DB_PASSWORD: ""
  secret_files:
#    CATALOG_DB_PASSWORD: files/catalog_db_password

Specify true for the enabled parameter to enable use of the database. When you do this, be sure to uncomment and add the following database parameters as appropriate for your environment.

Parameter	Description
`CATALOG_DB_ENGINE`	Specify the Hive metastore (HMS) database engine type. Valid values are `mysql` and `postgres`. For example, `CATALOG_DB_ENGINE: mysql` or `CATALOG_DB_ENGINE: postgres`.
`CATALOG_DB_URL`	Specify the URL to your Okera Hive metastore.
`CATALOG_DB_USER`	Specify the user name that should be used to access your Okera Hive metastore.
`CATALOG_DB_HMS_DB`	Specify the name of an existing Hive metastore (HMS) database for use with Okera. Okera will use the existing HMS objects in the HMS. See Database Configuration.
`CATALOG_DB_SENTRY_DB`	Specify the name of an existing Sentry database in the Okera Hive metastore where Okera stores metadata. Okera requires read and write access to this database. See Database Configuration.
`CATALOG_DB_USERS_DB`	Specify the name of an existing database in the Okera Hive metastore where Okera stores metadata. See Database Configuration.
`ENABLE_HMS_2_SCHEMA`	Enables or disables use of the HMS v2 schema. Valid values are `true` (use the HMS v2 schema) and `false` (use the HMS v1 schema). The default is `false`.

In the secret_files section of the database configuration, the CATALOG_DB_PASSWORD parameter identifies the location of the password, preferably using secrets, for the HMS database. Do not change the `CATALOG_DB_PASSWORD` parameter in the secret_strings section; it references the parameter in the secret_files section.

Parameter	Description
`CATALOG_DB_PASSWORD`	Defines the password, preferably using secrets, for the HMS database. Okera does not recommend specifying plain text passwords in configuration files.
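
As an illustration, an enabled db section for a PostgreSQL-backed metadata store might look like the sketch below. The host name is the sample's placeholder, and port 5432 is assumed because it is the usual PostgreSQL port. The database names, user name, and password file path are the sample's values.

```yaml
db:
  enabled: true
  configs:
    CATALOG_DB_ENGINE: "postgres"
    CATALOG_DB_URL: "db.example.example:5432"   # placeholder host; 5432 is an assumed port
    CATALOG_DB_USER: "okera"
    # The three database names must all be unique.
    CATALOG_DB_HMS_DB: odas_hms
    CATALOG_DB_SENTRY_DB: odas_sentry
    CATALOG_DB_USERS_DB: odas_users
  secret_files:
    # File containing the database password, rather than a plain-text value.
    CATALOG_DB_PASSWORD: files/catalog_db_password
```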

Kubernetes Configuration Parameters

Do not change the Kubernetes configuration parameters section of the values.yaml file except on the advice of an Okera representative.

## Kubernetes service configurations
## Values for Service type can be one of the following: LoadBalancer/NodePort/ClusterIP
service:
  type: NodePort
  annotations:
#    service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
#    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp

## Kubernetes service-account to use
serviceaccount:
  name: default

## Kubernetes node labels to use
nodeSelector:
#  application: "okera"

## Kubernetes tolerations to use in case the nodes are tainted
tolerations:
# - key: "application"
#   value: "okera"
#   operator: "Equal"
#   effect: "NoSchedule"


extraVolumeMounts:
# - name: extra-volume-1
#   mountPath: /var/lib/extravolumes/1

extraVolumes:
#  - name: extra-volume-1
#    emptyDir: {}

pod:
  annotations:
#    key: value
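
If an Okera representative does advise changes here, they typically amount to uncommenting entries from the sample. The sketch below is hypothetical: it switches the Service type to LoadBalancer (one of the three values listed in the sample) and enables the sample's node label and toleration, so that pods are scheduled only onto nodes labeled and tainted with application=okera.

```yaml
service:
  type: LoadBalancer
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp

# Schedule pods only on nodes that carry this label.
nodeSelector:
  application: "okera"

# Allow scheduling onto nodes tainted with application=okera:NoSchedule.
tolerations:
  - key: "application"
    value: "okera"
    operator: "Equal"
    effect: "NoSchedule"
```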