Skip to content

Configure the Database

Configuring Okera to use a remote database server for Okera metadata storage is recommended for all Okera clusters. Okera metadata storage can be either MySQL or Postgres databases. The metadata stored is technical metadata about the data you want Okera to protect (the schema registry) as well as metadata for Okera policies and processing (the policy registry).

For information on using SSL or TLS to connect to the Okera metadata storage, see Configure SSL/TLS for Okera Metadata Storage.

Configuration Settings

The following configuration settings are used to configure a remote database server:

  • CATALOG_DB_ENGINE
  • CATALOG_DB_URL
  • CATALOG_DB_USER
  • CATALOG_DB_PASSWORD
  • CATALOG_DB_HMS_DB
  • CATALOG_DB_SENTRY_DB
  • CATALOG_DB_USERS_DB
  • CATALOG_DB_OKERA_DB

Configuring the Database

There are two parts to configuring the database:

  1. Configuring database server connectivity
  2. Configuring what databases to use on the server

Configuring Database Connectivity

To configure the connection to the database, you will need to know the host and port of the database server, and the username and password for a user that has sufficient privileges.

For example, if your MySQL database is at db.example.com:3306 with a username/password provided in secrets files, the configuration will look like this:

CATALOG_DB_ENGINE: mysql
CATALOG_DB_URL: db.example.example:3306
CATALOG_DB_USER: <secrets file>
CATALOG_DB_PASSWORD: <secrets file>

Important

Sensitive credentials should not be provided in plaintext, and instead must be provided in a secrets file either from local secret sources such as Kubernetes secrets, or from Cloud secrets managers services. See Provide Secure Credentials.

Configuring Database Names

On your MySQL database server, Okera will use four databases to store different types of values, such as the HMS catalog, policy information, user storage, etc. These databases have the following default values:

  • CATALOG_DB_HMS_DB: hms_db
  • CATALOG_DB_SENTRY_DB: sentry_db
  • CATALOG_DB_USERS_DB: cerebro_users
  • CATALOG_DB_OKERA_DB: okera_db

These values can all be changed. For example, to modify them to have the prefix of example_, you would have the following in your configuration file:

CATALOG_DB_HMS_DB: example_hms_db
CATALOG_DB_SENTRY_DB: example_sentry_db
CATALOG_DB_USERS_DB: example_cerebro_users
CATALOG_DB_OKERA_DB: example_okera_db

Note: You can edit only some of these settings. You are not required to set all of them.

Required Privileges

The user account that is used to connect to the database server must have sufficient privileges to fully administer the above four databases. This is typically achieved by providing superuser credentials that has full privileges for the entire databases server, but can also be done by creating a user that has admin privileges on only those specific databases.

For example, to grant a user called myuser access to the example_... databases above in the MySQL shell:

> GRANT ALL PRIVILEGES ON example_hms_db.* TO 'myuser'@'%'
> GRANT ALL PRIVILEGES ON example_sentry_db.* TO 'myuser'@'%'
> GRANT ALL PRIVILEGES ON example_cerebro_users.* TO 'myuser'@'%'
> GRANT ALL PRIVILEGES ON example_okera_db.* TO 'myuser'@'%'

Verifying DB Connectivity

To verify that the Okera pods can access the MySQL database server, you can launch a shell inside the Catalog pod and use the MySQL CLI to connect to the server.

  1. Run kubectl get pods to get the list of current pods and find the one starting with cerebro-catalog-...:

    $ kubectl get pods
    NAME                                 READY   STATUS    RESTARTS   AGE
    canary-fdtmn                         1/1     Running   0          5d16h
    canary-qz56w                         1/1     Running   0          5d16h
    canary-vlknv                         1/1     Running   0          5d16h
    cdas-rest-server-799dbd7ff4-tfr57    1/1     Running   0          5d16h
    cerebro-catalog-7f58569c4d-gv46c     1/1     Running   0          5d16h
    cerebro-planner-69c977496-wbzc4      1/1     Running   0          5d16h
    cerebro-worker-8xzbw                 1/1     Running   0          5d16h
    cerebro-worker-fzsps                 1/1     Running   0          5d16h
    cerebro-worker-jspr2                 1/1     Running   0          5d16h
    presto-coordinator-f698dfd7d-lfrhf   1/1     Running   0          2d23h
    presto-worker-glzv2                  1/1     Running   0          2d23h
    presto-worker-xbxv9                  1/1     Running   0          2d23h
    presto-worker-xdg58                  1/1     Running   0          2d23h
    
  2. Run kubectl exec -ti cerebro-catalog-... bash (replace the pod name with the actual value from the prior command) to launch a shell inside the pod:

    $ kubectl exec -ti cerebro-catalog-7f58569c4d-gv46c bash
    root@cerebro-catalog-7f58569c4d-gv46c:/#
    
  3. Run the mysql CLI to connect to your database server, e.g. if the MySQL database is at db.example.com:3306 with a username/password of okera/Okera123!:

    root@cerebro-catalog-7f58569c4d-gv46c:/# mysql -h db.example.com -P 3306 -u okera -p
    Enter password:
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 38422
    Server version: 5.6.39.0 MySQL Community Server (GPL)
    
    Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql>
    

If the connection hangs, it may mean that network access is blocked between your Okera nodes and your database server (typically a network configuration issue, e.g. a security group on AWS or Database Firewall on Azure).