Okera Configuration Parameter Reference
The following table describes the configuration parameters you can use in Okera's configuration file.
Parameter |
Description |
---|---|
APPLY_INVERSE_MAP_ON_WRITE |
Indicates whether the cluster should invert the bucket names when a write is attempted on a follower cluster in an active/active environment. Valid values are true (bucket names are inverted) or false (bucket names are not inverted). Follower clusters should always set this to true; primary clusters should always set this to false. There is no default. |
AUDIT_LOGS_SYNC_FREQUENCY_MINS |
Defines the frequency at which Okera synchronizes audit logs. Valid values range from 1 minute to 180 minutes (3 hours). When not specified, the default is 30 minutes. If you specify a value larger than 180 minutes, Okera defaults to 180 minutes. |
AUTOTAGGER_CONFIGURATION |
Enables or disables autotagging. Valid values are true (enable autotagging) or false (disable autotagging). The default is true . |
BUCKET_TO_ROLE_MAP_FILE |
Specifies the path to the file that should be used for Amazon S3 assume secondary role support. |
CATALOG_ADMINS |
Defines a list of Okera system administrators. Specify the Okera user names in a comma-separated list; if there is more than one administrator, enclose the list in quotes. In this example, users jane and mike are defined as system administrators: CATALOG_ADMINS: "jane,mike". By default, user admin is always a system administrator. |
CATALOG_DB_CONN_PARAMS |
Specifies optional parameters for the Okera database URL specified in the CATALOG_DB_URL parameter. Valid Postgres parameters are described in the Postgres documentation. Valid MySQL parameters are described in the MySQL documentation. |
CATALOG_DB_ENGINE |
Defines the Hive metastore (HMS) database engine type. Valid values are mysql and postgres . For example, CATALOG_DB_ENGINE: mysql or CATALOG_DB_ENGINE: postgres . |
CATALOG_DB_HMS_DB |
Defines the name of an existing Hive metastore (HMS) database for use with Okera. Okera will use the existing HMS objects in the HMS. See Database Configuration. |
CATALOG_DB_OKERA_DB |
Defines the name of an existing database in the Okera Hive metastore where Okera stores metadata. See Database Configuration. |
CATALOG_DB_PASSWORD |
Defines the password, preferably using secrets, for the HMS database. Okera does not recommend specifying plain text passwords in configuration files. |
CATALOG_DB_SENTRY_DB |
Defines the name of an existing Sentry database in the Okera Hive metastore where Okera stores metadata. Okera requires read and write access to this database. See Database Configuration. |
CATALOG_DB_URL |
Defines the URL to your Okera Hive metastore. |
CATALOG_DB_USER |
Defines the user name that should be used to access your Okera Hive metastore. |
CATALOG_DB_USERS_DB |
Defines the name of an existing database in the Okera Hive metastore where Okera stores metadata. See Database Configuration. |
CLUSTER_LABEL |
Specifies a label name for the Okera cluster. For example, CLUSTER_LABEL: dev . |
CLUSTER_NAME |
Specifies a name for the Okera cluster. For example, CLUSTER_NAME: Dev Cluster . |
CUSTOM_GROUP_RESOLVERS |
For example, CUSTOM_GROUP_RESOLVERS: <java-path1>, <java-path2> . |
ENABLE_HMS_2_SCHEMA |
Enables or disables use of the HMS v2 schema. Valid values are true (use the HMS v2 schema) and false (use the HMS v1 schema). The default is false. |
ENABLE_JWT |
Enables or disables the use of JSON web tokens (JWT) for authentication. Valid values are true (use JWTs) and false (do not use JWTs). The default is false. |
ENABLE_LEGACY_URI_CHECKS |
Enables and disables the ALL grant requirement for the CREATE TABLE and ALTER TABLE SET LOCATION statements. Valid values are true (require an ALL grant) and false (do not require an ALL grant). The default is false . |
ENABLE_PARAMETRIZED_GRANTS |
Enables and disables the ability to reference dynamic parameters in URI, database, and table grants. Valid values are true (you can reference dynamic parameters) and false (you cannot reference dynamic parameters). The default is false . |
ENABLE_PARAMETRIZED_URI_GRANTS |
Enables and disables the ability to reference dynamic parameters in a URI when doing a permission grant. Valid values are true (you can reference dynamic parameters) and false (you cannot reference dynamic parameters). The default is false . |
ENABLE_TASK_ENCRYPTION |
Enables nScale task encryption. Valid values are true and false. When this is set to true and the TASK_ENCRYPTION_KEY configuration parameter is not specified, Okera attempts to use the JWT_PRIVATE_KEY configuration parameter setting, if it is specified. If neither is specified, an error occurs. See Configuration File Encryption Settings for nScale. |
EXTERNAL_OKERA_SECURE_POLICY_DB |
Specifies the database in which Okera's supplied user-defined functions (UDFs) are stored. If this parameter is not specified, the default okera_udfs is used. |
EXTERNAL_OKERA_SECURE_POLICY_SCHEMA |
Specifies the schema in which Okera's supplied user-defined functions (UDFs) are stored. If this parameter is not specified, the default public is used. |
GO_ACCESS_PROXY_CACHE_LOG_PERIOD |
Defines the period, in seconds, at which OkeraFS logs assumed role credential cache statistics. The default is 0 (zero) seconds, which disables logging. When this parameter is set to any value greater than zero, logging occurs at the time intervals specified by this parameter. |
GROUP_RESOLVER_SCRIPTS |
Specifies the paths to the group resolver scripts. Paths can be a local file, an S3 path, or an ADLS path. See Custom Script-Sourced Group Resolution. |
IS_FOLLOWER |
Indicates whether the cluster is a follower cluster in an active/active environment. Valid values are true (it is a follower cluster) and false (it is not a follower cluster). There is no default. |
JWT_ALGORITHM |
Specifies the algorithm used by JWT. Valid values are RSA256 and RSA512 . RSA512 is the default. You can specify multiple, comma-separated values, just as with JWT_PUBLIC_KEY . The order of the values must match the order specified for the corresponding JWT_PUBLIC_KEY . See Public and Private Key Validation. |
JWT_AUTHENTICATION_SERVER_URL |
Specifies the JWT URL for remote endpoint validation. See Remote Endpoint Validation. |
JWT_JWKS_URL |
Specifies the URL of your OAuth identity provider (for example Okta, Auth0, or AzureAD). Okera uses this to dynamically fetch the appropriate public key needed for OAuth authentication. |
JWT_PRIVATE_KEY |
Specifies the path to the private key used to encode Okera-generated JWTs. Private keys should not be specified as multiple, comma-separated values. See Public and Private Key Validation. |
JWT_PUBLIC_KEY |
Specifies the path to the public key used to decode JWTs. You can specify multiple, comma-separated JWT public keys. When used to decode an incoming token, they are attempted in the order specified. See Public and Private Key Validation. |
LDAP_BASE_DN |
Specifies the base distinguished name (DN) to use for LDAP authentication, if the username appears in the DN. |
LDAP_BIND_TEMPLATE |
Specifies the bind template that should be used for LDAP authentication. For example, cn=%s,ou=users,dc=company,dc=com . |
LDAP_DOMAIN |
Specifies the LDAP domain to use for LDAP authentication. Specifying LDAP_BIND_TEMPLATE and LDAP_BASE_DN is preferable to specifying LDAP_DOMAIN . |
LDAP_HOST |
The URL for the LDAP server to use for LDAP authentication. |
LDAP_PORT |
The port number of the LDAP server. Port values are typically 389 (non-SSL connections) or 636 (SSL connections). |
LDAP_USE_SSL |
Enables and disables SSL use for LDAP authentication. Valid values are true (enable SSL) and false (disable SSL). |
MAX_REQUEST_SIZE_BYTES |
Specifies the request size limit, in bytes, above which queries are rejected. The default value is 52751601 bytes, which is approximately 52.8 MB. |
OAUTH_PROVIDER |
Required if you are using Google OAuth as your authorization provider. This parameter is not required for any other authorization provider. The only valid value is google . For example, OAUTH_PROVIDER: google . |
OAUTH_SCOPES |
Specifies a list of scopes that determines which information the OAuth service provider will allow the user to access once the token is obtained. For example, OAUTH_SCOPES: openid profile email api://<xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx>/okera/okera_auth_scope . |
OAUTH_SECRETS |
Specifies the location of your OAuth secrets file. For example, OAUTH_SECRETS: file:///etc/okera/client_secrets.json. |
OKERA_ASSUME_ROLE_DURATION_SECONDS |
Determines the duration, in seconds, for which assumed role credentials are valid. For OkeraFS access proxy processing, the default is 3600 seconds. For regular AWS S3 processing, the default is 900 seconds. |
OKERA_CTE_REWRITE_ENABLED_ENGINES |
Identifies the Okera connection types that should use proxy pushdown processing. Use commas to separate values. Valid values are awsathena, bigquery, dremio:direct, postgresql, redshift, and snowflake. For example, OKERA_CTE_REWRITE_ENABLED_ENGINES: awsathena,bigquery,postgresql. |
OKERA_LEGACY_TOKEN_ESCAPE |
Enables or disables token escape processing in Okera when the token is included as a query argument in the URL. Valid values are true (enable token escape processing) and false (disable token escape processing). The default is false. |
OKERA_SCRIPTS_DIR |
Specifies the location of the Okera allowed scripts directory. By default, this is /opt/scripts . Okera will only run scripts from its allowed scripts directory. It automatically makes the scripts specified in the USER_ATTRIBUTES_SCRIPT and GROUP_RESOLVER_SCRIPTS configuration parameters available in this directory, with the right permissions. See Custom Script User Attributes and Custom Script-Sourced Group Resolution. |
OKERA_STAGING_DIR |
Specifies the path to the storage location for Okera audit logs. |
PATH_PREFIX_MAP_FILE |
Identifies the location of the mapping file for the cluster in an active/active environment. |
PLANNER_API |
Specifies the Planner API port. The default is 12050 . This port is required for all clients to access metadata and data. |
podCidr |
Used for the Kubernetes cluster, this parameter specifies the pod IP address block of the CIDR blocks used for internal communication in Okera. For example, podCidr: "172.23.0.0/16". The CIDR blocks should not overlap with CIDR blocks currently in use, including the VPC; for example, this block should not fall within the VPC range. |
POLICY_SYNC_INTERVAL |
Specifies how often Okera synchronizes Okera policies with Snowflake during process synchronization. Values are specified as a combination of a number and a one- or two-letter code that represents the unit. Valid unit codes are ns (nanoseconds), us (microseconds), ms (milliseconds), s (seconds), m (minutes), and h (hours). For example, 1h is one hour and 5000ms is 5000 milliseconds. The default is 30m (30 minutes). |
POLICY_SYNC_ROLE_PATTERN |
Specifies the Snowflake role pattern that Okera should use when syncing Okera policies. The default is OKERA_%s , where %s is replaced by the user name. |
POLICY_SYNC_SCHEDULER_ENABLED |
Enables and disables Okera policy synchronization with Snowflake. Valid values are true (enable policy synchronization) and false (disable policy synchronization). The default is true . |
POLICY_SYNC_USERS_ALLOWED_LIST |
Specifies the users for whom Okera policies should be synced. Valid values for this parameter are either a list of Snowflake users or a Snowflake tag and value (Snowflake users with the tag set to the value will be synced). These are the users for whom Okera manages the Snowflake connection. If no list or tag is specified, all Snowflake users are synced. |
portRange |
Used for the Kubernetes cluster, this parameter specifies the port range of the CIDR blocks used for internal communication in Okera. For example, portRange: "1025-65535". The CIDR blocks should not overlap with CIDR blocks currently in use, including the VPC; for example, they should not fall within the VPC range. |
PRESTO_API |
Specifies the port to access the Presto API endpoint for users connecting via JDBC. The default is 14050 . |
PRESTO_ENABLE_PROXY |
Enables and disables Okera proxy mode. Valid values are true (enable proxy mode) and false (disable proxy mode). The default is true . |
PRESTO_ENABLE_QUERY_LOGGING |
Enables or disables Presto query logging. Valid values are true (enable Presto query logging) and false (disable Presto query logging). The default is false. |
PRESTO_HTTP_CLIENT_MAX_CONNECTIONS_PER_SERVER |
Customizes the Presto configuration property (http-client.max-connections-per-server) that controls the maximum number of concurrent connections for a server. If this parameter is not specified, the value specified in your Presto environment is used (usually 20). |
PRESTO_HTTP_CLIENT_MAX_REQUESTS_QUEUED_PER_SERVER |
Customizes the Presto configuration property (http-client.max-requests-queued-per-destination) that controls the maximum number of requests queued per destination. If this parameter is not specified, the value specified in your Presto environment is used (usually 1024). |
PRESTO_PROXY_DEBUG_ENABLED |
Enables and disables proxy debugging. Valid values are true (enable debugging) and false (disable debugging). The default is true . |
PRESTO_PROXY_JDBC_PUSHDOWN |
Enables or disables pushdown processing for JDBC-based connections. Valid values are true (enable pushdown processing) and false (disable pushdown processing). The default is true. |
PRESTO_RESOURCE_GROUP_FILE_LOCATION |
Identifies the location of a JSON file that contains the Presto resource group definition. Information about the contents of this file can be found in Resource Groups. |
PRESTO_SHOULD_USE_RESOURCE_GROUPS |
Enables or disables the file-based configuration manager for Presto. Valid values are true (enable the file-based configuration manager) and false (disable the file-based configuration manager). The default is false . |
REST |
Specifies the Okera UI/REST port. The default is 8083 . |
REST_SERVER_ENABLE_ACCESS_PROXY |
Enables or disables the OkeraFS S3 access proxy. Valid values are true (enable the access proxy) and false (disable the access proxy). |
REST_SERVER_LOG_LEVEL |
Specifies the message logging level for the REST server log. Valid values are DEBUG , INFO , WARNING , ERROR , and CRITICAL . The default is DEBUG . |
RS_ARGS |
Specifies various configuration options. See RS_ARGS Options. |
serviceCidr |
Used for the Kubernetes cluster, this parameter specifies the IP address range for the service network CIDR blocks used for internal communication in Okera. For example, serviceCidr: "172.34.0.0/16". The CIDR blocks should not overlap with CIDR blocks currently in use, including the VPC; for example, this range should not fall within the VPC range. |
SIGNED_URL_EXPIRY_TIMEOUT_SECS |
Specifies the number of seconds for which presigned URLs in an nScale task are valid. If this time expires before the task completes, errors occur. The default is 7200 seconds (120 minutes). |
SYSTEM_DB_CONNECTION_URL_SUFFIX |
This configuration parameter is a global Aurora RDS setting that enables Aurora RDS to perform write forwarding. It must always be set on follower clusters in an active/active environment. The only valid value is sessionVariables=aurora_replica_read_consistency='session' . |
SYSTEM_TOKEN |
Specifies the path to the JWT system token used for interservice communication. The JWT system token has a sub of okera and groups of root . See System Token. |
TASK_ENCRYPTION_KEY |
Specifies the path of the file containing the encryption and decryption key for nScale. See Configuration File Encryption Settings for nScale. |
TRANSFORM_UDF_PRIORITIES |
Specifies the priority order of transformation functions in Okera. Values for this property are the transformation function names, specified in sequence, and separated by commas. See Prioritization of Transformations. |
TZ |
Specifies your time zone. For example, TZ: "America/New_York" . |
UI_TIMEOUT_MS |
Specifies the number of milliseconds the UI will wait before cancelling requests. The maximum value is 60000 milliseconds (or 60 seconds). The default is 30000 milliseconds (30 seconds). |
USER_ATTRIBUTES_SCRIPT |
Specifies the paths to the user attribute scripts. Paths can be a local file, an S3 path, or an ADLS path. See Custom Script User Attributes. |
WATCHER_AUDIT_LOG_DST_DIR |
Specifies the path or URL to the Amazon S3, ADLS, or Google Cloud storage location where Okera audit logs will be stored. For example, WATCHER_AUDIT_LOG_DST_DIR: s3://company/okera/logs/audit. |
WATCHER_LOG_DST_DIR |
Specifies the path or URL to the Amazon S3, ADLS, or Google Cloud storage location where Okera operational logs will be stored. For example, WATCHER_LOG_DST_DIR: s3://company/okera/logs. |
WATCHER_LOG_PARTITIONED_UPLOADS |
Enables and disables Okera partitioned operational log uploads. Valid values are true (enable partitioned operational log uploads) and false (disable partitioned operational log uploads). The default is false . |
WATCHER_S3_ENCRYPT |
Enables or disables Okera audit log encryption. Valid values are true (enable encryption) and false (disable encryption). The default is true. |
WATCHER_S3_REGION |
Specifies the geographical region for logging. For example, WATCHER_S3_REGION: us-east-1 . |
WORKER_API |
Specifies the Worker API port. The default is 13050 . This port is required for all clients to access metadata and data. |
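
The value rules given in the table for AUDIT_LOGS_SYNC_FREQUENCY_MINS (default of 30, cap of 180) and POLICY_SYNC_INTERVAL (a number followed by a unit code) can be checked before a configuration file is deployed. The following Python sketch is illustrative only and is not part of Okera: the function and constant names are hypothetical, rejecting values below 1 minute is an assumption (the table does not describe that case), and the interval parser assumes a single whole number followed by a single unit code.

```python
import re

# Limits quoted in the AUDIT_LOGS_SYNC_FREQUENCY_MINS description.
AUDIT_SYNC_MIN_MINS = 1
AUDIT_SYNC_MAX_MINS = 180
AUDIT_SYNC_DEFAULT_MINS = 30

# Unit codes listed for POLICY_SYNC_INTERVAL, expressed in nanoseconds
# so that every unit stays an exact integer.
UNIT_NANOSECONDS = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3_600 * 1_000_000_000,
}

# Assumption: a value is one whole number followed by one unit code.
INTERVAL_PATTERN = re.compile(r"(\d+)(ns|us|ms|s|m|h)")


def effective_audit_sync_mins(configured=None):
    """Return the sync frequency the table implies for a configured value."""
    if configured is None:
        return AUDIT_SYNC_DEFAULT_MINS  # not specified: default of 30 minutes
    minutes = int(configured)
    if minutes > AUDIT_SYNC_MAX_MINS:
        return AUDIT_SYNC_MAX_MINS  # values above 180 fall back to 180
    if minutes < AUDIT_SYNC_MIN_MINS:
        # Assumption: the table does not say what Okera does below 1 minute,
        # so this sketch simply rejects such values.
        raise ValueError(f"expected at least 1 minute, got {minutes}")
    return minutes


def policy_sync_interval_ns(text="30m"):
    """Convert a value such as '1h' or '5000ms' to nanoseconds."""
    match = INTERVAL_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized interval: {text!r}")
    number, unit = match.groups()
    return int(number) * UNIT_NANOSECONDS[unit]
```

For instance, effective_audit_sync_mins(500) returns 180, and policy_sync_interval_ns("5000ms") returns 5000000000. Working in integer nanoseconds avoids floating-point rounding for the smaller unit codes.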