Configure the Database¶
Configuring Okera to use a remote database server for Okera metadata storage is recommended for all Okera clusters. Okera metadata storage can be either MySQL or Postgres databases. The metadata stored is technical metadata about the data you want Okera to protect (the schema registry) as well as metadata for Okera policies and processing (the policy registry).
For information on using SSL or TLS to connect to the Okera metadata storage, see Configure SSL/TLS for Okera Metadata Storage.
Configuration Settings¶
The following configuration settings are used to configure a remote database server:
CATALOG_DB_ENGINE
CATALOG_DB_URL
CATALOG_DB_USER
CATALOG_DB_PASSWORD
CATALOG_DB_HMS_DB
CATALOG_DB_SENTRY_DB
CATALOG_DB_USERS_DB
CATALOG_DB_OKERA_DB
Configuring the Database¶
There are two parts to configuring the database:
- Configuring database server connectivity
- Configuring what databases to use on the server
Configuring Database Connectivity¶
To configure the connection to the database, you will need to know the host and port of the database server, and the username and password for a user that has sufficient privileges.
For example, if your MySQL database is at db.example.com:3306
with a username/password provided in secrets files, the configuration will look like this:
CATALOG_DB_ENGINE: mysql
CATALOG_DB_URL: db.example.example:3306
CATALOG_DB_USER: <secrets file>
CATALOG_DB_PASSWORD: <secrets file>
Important
Sensitive credentials should not be provided in plaintext, and instead must be provided in a secrets file either from local secret sources such as Kubernetes secrets, or from Cloud secrets managers services. See Provide Secure Credentials.
Configuring Database Names¶
On your MySQL database server, Okera will use four databases to store different types of values, such as the HMS catalog, policy information, user storage, etc. These databases have the following default values:
CATALOG_DB_HMS_DB
:hms_db
CATALOG_DB_SENTRY_DB
:sentry_db
CATALOG_DB_USERS_DB
:cerebro_users
CATALOG_DB_OKERA_DB
:okera_db
These values can all be changed.
For example, to modify them to have the prefix of example_
, you would have the following in your configuration file:
CATALOG_DB_HMS_DB: example_hms_db
CATALOG_DB_SENTRY_DB: example_sentry_db
CATALOG_DB_USERS_DB: example_cerebro_users
CATALOG_DB_OKERA_DB: example_okera_db
Note: You can edit only some of these settings. You are not required to set all of them.
Required Privileges¶
The user account that is used to connect to the database server must have sufficient privileges to fully administer the above four databases. This is typically achieved by providing superuser credentials that has full privileges for the entire databases server, but can also be done by creating a user that has admin privileges on only those specific databases.
For example, to grant a user called myuser
access to the example_...
databases above in the MySQL shell:
> GRANT ALL PRIVILEGES ON example_hms_db.* TO 'myuser'@'%'
> GRANT ALL PRIVILEGES ON example_sentry_db.* TO 'myuser'@'%'
> GRANT ALL PRIVILEGES ON example_cerebro_users.* TO 'myuser'@'%'
> GRANT ALL PRIVILEGES ON example_okera_db.* TO 'myuser'@'%'
Verifying DB Connectivity¶
To verify that the Okera pods can access the MySQL database server, you can launch a shell inside the Catalog pod and use the MySQL CLI to connect to the server.
-
Run
kubectl get pods
to get the list of current pods and find the one starting withcerebro-catalog-...
:$ kubectl get pods NAME READY STATUS RESTARTS AGE canary-fdtmn 1/1 Running 0 5d16h canary-qz56w 1/1 Running 0 5d16h canary-vlknv 1/1 Running 0 5d16h cdas-rest-server-799dbd7ff4-tfr57 1/1 Running 0 5d16h cerebro-catalog-7f58569c4d-gv46c 1/1 Running 0 5d16h cerebro-planner-69c977496-wbzc4 1/1 Running 0 5d16h cerebro-worker-8xzbw 1/1 Running 0 5d16h cerebro-worker-fzsps 1/1 Running 0 5d16h cerebro-worker-jspr2 1/1 Running 0 5d16h presto-coordinator-f698dfd7d-lfrhf 1/1 Running 0 2d23h presto-worker-glzv2 1/1 Running 0 2d23h presto-worker-xbxv9 1/1 Running 0 2d23h presto-worker-xdg58 1/1 Running 0 2d23h
-
Run
kubectl exec -ti cerebro-catalog-... bash
(replace the pod name with the actual value from the prior command) to launch a shell inside the pod:$ kubectl exec -ti cerebro-catalog-7f58569c4d-gv46c bash root@cerebro-catalog-7f58569c4d-gv46c:/#
-
Run the
mysql
CLI to connect to your database server, e.g. if the MySQL database is atdb.example.com:3306
with a username/password ofokera/Okera123!
:root@cerebro-catalog-7f58569c4d-gv46c:/# mysql -h db.example.com -P 3306 -u okera -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 38422 Server version: 5.6.39.0 MySQL Community Server (GPL) Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql>
If the connection hangs, it may mean that network access is blocked between your Okera nodes and your database server (typically a network configuration issue, e.g. a security group on AWS or Database Firewall on Azure).