OkeraEnsemble Deployment on Amazon S3 (Preview Feature)

This document describes the deployment and use of the OkeraEnsemble access proxy service in Amazon S3 environments. The OkeraEnsemble access proxy service can be deployed in one of two modes:

  • Default mode. This is the classic method of deploying OkeraEnsemble in Amazon S3 environments. The OkeraEnsemble access proxy service is deployed in the Okera cluster using the Okera cluster system token specified in the SYSTEM_TOKEN configuration parameter. This token enables the proxy service to authenticate to the Okera cluster and authorize its use of the object store. See OkeraEnsemble Default Mode Deployment.
  • nScale mode. This mode is applicable only in Amazon EMR environments. The OkeraEnsemble access proxy service is deployed in each Amazon EMR node, so its workload is distributed across your cluster nodes and scales up and down with your clusters. To do this, the OkeraEnsemble access proxy needs IAM credentials to sign requests bound for Amazon S3 from Okera. See OkeraEnsemble nScale Mode Deployment in Amazon EMR Environments.

To understand how OkeraEnsemble access permissions map to Amazon S3 actions, AWS CLI commands, and Spark actions, see Map Okera Access Permissions.

System Requirements

The following system requirements must be met.

  • For default mode deployment, Okera 2.9 or later must be installed. For nScale mode deployment, Okera 2.11 or later must be installed.

  • When running on Amazon EMR, EMR versions 5.2 and 6.1 are supported.

  • The AWS CLI V1 must be installed, using either Python 2.7 or Python 3.4+.

  • OkeraEnsemble supports both RSA256 and RSA512 JWT algorithms. The algorithm type used in your environment should be set using the JWT_ALGORITHM configuration parameter.

Note: The default port used by OkeraEnsemble is 5010.
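
The version requirements above can be checked before deployment with a short script. The following is a minimal sketch, not part of OkeraEnsemble: the helper names (parse_version, okera_version_ok, python_version_ok) and the mode keys "default" and "nscale" are hypothetical, and it assumes Okera versions are plain dotted numbers such as 2.11.0. Versions are compared as integer tuples because a plain string comparison would rank "2.11" below "2.9".

```python
import sys

# Minimum Okera versions per deployment mode, taken from the requirements above.
# The mode keys are illustrative names, not OkeraEnsemble configuration values.
MIN_OKERA_VERSION = {"default": (2, 9), "nscale": (2, 11)}


def parse_version(text):
    """Turn a dotted version string such as '2.11.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def okera_version_ok(mode, okera_version):
    """Return True if okera_version meets the minimum for the given mode."""
    return parse_version(okera_version) >= MIN_OKERA_VERSION[mode]


def python_version_ok(version_info=None):
    """Return True for Python 2.7 or Python 3.4+, as required for AWS CLI V1."""
    major, minor = (version_info or sys.version_info)[:2]
    return (major, minor) == (2, 7) or (major == 3 and minor >= 4)
```

For example, okera_version_ok("nscale", "2.10.1") returns False, while okera_version_ok("default", "2.9") returns True. Calling python_version_ok() with no argument checks the interpreter that runs the script.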

Amazon S3 Bucket Role Mapping Support

OkeraEnsemble running in default mode can assume secondary roles to read Amazon S3 data, with different roles mapped to different buckets. For more information, see Amazon S3 Bucket Role Mapping Support.

Note: This is not supported in nScale mode at this time.

Determine the OkeraEnsemble Amazon S3 Plugin Version

You can determine the version of the OkeraEnsemble Amazon S3 plugin by importing the plugin module in a Python interpreter and reading its version attribute.

>>> import okera_fs_aws
>>> okera_fs_aws.__version__
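
The same check can be scripted for hosts where an interactive interpreter is inconvenient. The following is a minimal sketch: get_module_version is a hypothetical helper, not part of the plugin, and it assumes only that the okera_fs_aws module exposes __version__ as shown above. It returns None instead of raising an error when the plugin is not installed.

```python
import importlib


def get_module_version(module_name="okera_fs_aws"):
    """Return the module's __version__ value, or None if it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed in this Python environment.
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    version = get_module_version()
    if version is None:
        print("okera_fs_aws is not installed or exposes no __version__")
    else:
        print("OkeraEnsemble Amazon S3 plugin version: {}".format(version))
```

Run the script with the same Python installation that the AWS CLI uses, because the plugin may be installed in only one environment.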