OkeraEnsemble Deployment on Databricks (Preview Feature)

Okera provides an authorization layer that intercepts Databricks data requests to Amazon S3 and determines whether the user has access to the files and data in the request. If the user has access, the request is passed to Amazon S3 for processing. If the user does not have access, the request is rejected.


OkeraEnsemble on Databricks is file-format dependent. At this time, only the Parquet, Delta, and Hive table file formats are supported.

The default port used by OkeraEnsemble is 5010.

To enable Okera file access control (OkeraEnsemble) for Databricks in Amazon S3 environments, you must enable the Okera file system driver and enforce path signing by using two environment variables, OKERA_ENABLE_OKERA_FS and OKERA_FS_REQUIRE_SIGNED_PATHS. In the Databricks Environment Variables section under Clusters -> Advanced Options -> Spark, set both of these variables to true, and set OKERA_DBX_PATH_SIGN_KEY to the location of your sign key.


The OKERA_ENABLE_OKERA_FS environment variable installs the Okera file system driver.

The OKERA_FS_REQUIRE_SIGNED_PATHS environment variable enforces paths to be signed for authorization purposes.
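As a quick sanity check, the two boolean settings above can be verified from a notebook or init script. The following is a minimal sketch only: the helper name find_misconfigured is hypothetical, and the case-insensitive comparison against "true" is an assumption for illustration, not documented Okera behavior.

```python
import os

# The two variables that the text above says must be set to true.
REQUIRED_TRUE_VARS = ("OKERA_ENABLE_OKERA_FS", "OKERA_FS_REQUIRE_SIGNED_PATHS")


def find_misconfigured(env):
    """Return the names of required variables that are missing or not "true".

    `env` is any mapping of variable names to string values, such as
    os.environ. The case-insensitive match is an assumption of this sketch.
    """
    return [
        name
        for name in REQUIRED_TRUE_VARS
        if env.get(name, "").strip().lower() != "true"
    ]


if __name__ == "__main__":
    problems = find_misconfigured(os.environ)
    if problems:
        print("Not set to true:", ", ".join(problems))
    else:
        print("Both OkeraEnsemble variables are set to true.")
```

Passing the environment in as a mapping keeps the check easy to exercise with a plain dictionary before running it on a cluster.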

The OKERA_DBX_PATH_SIGN_KEY environment variable identifies the location of the Databricks secret sign key used to sign the URLs shared between Databricks and Okera. To set up your Databricks secret sign key, see Databricks Secrets. After you have defined your secret sign key, specify the path to it in the OKERA_DBX_PATH_SIGN_KEY environment variable. The <path> is usually specified within double braces. For example: