Skip to content

Connecting to Data Sources

Okera supports connecting to these data sources out of the box:

You can also connect to a custom JDBC data source by providing your own JDBC driver.

Getting Started

There are two main steps to get started with JDBC data sources:

  1. Create a connection to the data source
  2. Create a crawler to automatically register data from a connection

Who Can Manage Connections?

The following table shows the access levels required to manage data connections. If a user has ALL access, they do not need additional access.

Action Catalog (all connections) DATACONNECTION DATABASE
Ability to see connections page in UI ALL, CREATE_DATACONNECTION_AS_OWNER Any access level
Create connnections ALL, CREATE_DATACONNECTION_AS_OWNER
Full administrative actions on specfied connection ALL ALL
Use specified connection for data registration ALL ALL, USE Need ALL or CREATE_AS_OWNER on DATABASE or CATALOG to actually register tables inside a database.
Edit connections ALL ALL
Test connections ALL, CREATE_DATACONNECTION_AS_OWNER ALL, USE
List connection ALL Any access level
View connection details ALL ALL, USE

Create a Connection

  1. Select on the Connections page in the UI.

    Create connection

  2. Specify a connection a name and select your Connection Type.

    Selection connection type

    Note: If you are using a SaaS tenant, only the data source types supported for your tenant are shown.

  3. Specify the relevant connection properties for your data source. For examples of the specific properties for each data source please click on the docs link for your specific data source. The example below is with Snowflake.

    Snowflake connection example

  4. Provide secure credentials. Sensitive credentials cannot be provided in plain text, and instead must be provided in a secrets file either from local secret sources such as Kubernetes secrets, or from Cloud secrets managers services. It is recommended for auditability that you create a new system user for Okera in your underlying database, and use those credentials in your Okera connection. Note that this system user will need to have read access on your data. Supported secure credential stores include:

  5. Test your connection to verify the properties are correct.

    Test connection

  6. Select to finish creating your connection.

    Connection Success Modal

  7. Register the data from your connection. See Register Data From a Connection.

Register Data From a Connection

After you've created a connection, crawl datasets from that connection and register them inside the Okera catalog. See Registering Data With Crawlers.

Manage Connections

List Connections

To list the connections defined for your Okera cluster, select Connections on the Okera side menu. The Connections page appears.

Create connection

Filter and Search for a Connection

You can filter and search the list of connections by both the name of the connection as well as the underlying type (e.g., Snowflake, AWS Redshift, etc.).

Two search boxes (initially empty) are provided at the top of the Connections page:

  • To filter the list by connection name, type the name or part of the name in the left search box. The page immediately filters all the connections for connection names that include the character string you specify.

  • To filter the list by connection type, enter the connection type in the right search box or select a type from the drop-down menu provided with the right search box.

You can filter by both connection name and type at the same time. For example, you can search for all Snowflake connections with names that include the letters test.

To clear all filters, select Clear all filters to the right of the search boxes.

View Connection Details

You can see connection details by clicking into a connection on the connections page.

Connection details

Edit Your Connection

Warning

Altering a connection will impact access to all tables that have been registered via that connection, and should be done carefully.

You can edit a connection's properties, by clicking the Edit Connection button on the connections list. You should test your connection to ensure the new properties still work.

Editing a connection

Note: In a SaaS environment, you can only edit connections for data sources supported by your SaaS tenant.

List the Registered Datasets for a Connection

You can list the datasets that have been registered from a particular connection by selecting the Registered Datasets tab for a particular connection.

Delete connection error

Note: You can only see datasets to which you have access in the list.

Delete Your Connection

Warning

Dropping a connection will impact the ability to access any tables or crawlers registered from that connection. Therefore connections can only be dropped if there are no associated tables or crawlers.

You can delete a connection, by selecting Delete Connection on the connection details page.

Note: In a SaaS environment, you can only delete connections for data sources supported by your SaaS tenant.

Delete connection

If you see the error below, but there are no registered datasets in that connection, ensure you have dropped all associated crawlers from that connection.

Delete connection error

You can search for your connection name on the Registration page to quickly find all associated crawlers.

Search crawlers for connection name

Programmatic Registration

To see the programmatic DDL commands for registration see Programmatic Registration.

Query Data From Relational Data Sources

Data from JDBC data sources can be queried regularly like all data sources. When querying via SQL tools through the Okera Gateway Access Pattern, policy enforcement happens through the Pushdown enforcement pattern to ensure optimum performance. For advanced concepts such as in-line SQL views read here.

Predicate Pushdown

Okera pushes down predicates for JDBC-backed data sources by default. To disable predicate pushdown for a particular JDBC-backed database or table, you can specify 'jdbc.predicates.pushdown.enabled' = 'false' in the DBPROPERTIES or TBLPROPERTIES.

Enable Proxy Pushdown Mode

One Okera policy enforcement mechanism uses proxy pushdown mode. To enable proxy pushdown mode for a connection, use the following Okera configuration parameters.

  • PRESTO_ENABLE_PROXY: Use this configuration parameter to enable proxy mode. Valid values are true and false. To enable proxy mode, set this configuration parameter to true (the default).
  • PRESTO_PROXY_JDBC_PUSHDOWN: Use this configuration parameter to enable pushdown processing for JDBC-based connections. Valid values are true and false. To enable pushdown processing, set this configuration parameter to true (the default).
  • OKERA_CTE_REWRITE_ENABLED_ENGINES: Use this configuration parameter to indicate which Okera connection types should use proxy pushdown processing. Use commas to separate values. Valid values are awsathena, bigquery, dremio:direct, postgresql, redshift, and snowflake. For example:

    OKERA_CTE_REWRITE_ENABLED_ENGINES: awsathena,bigquery,postresql
    

    There are no defaults.

  • PRESTO_PROXY_DEBUG_ENABLED: Use this configuration parameter to enable proxy debugging. Valid values are true and false. The default is true.