
Dynamic Bypass Querying on Amazon S3 (Preview Feature)

Important

This feature impacts the production of file-level access audit logs, because it bypasses the planner when authorizing access to individual files. Table-level and all non-bypass access events are unaffected.

This document introduces dynamic bypass querying (DBQ), a new query-execution mode for OkeraEnsemble nScale clusters. DBQ selectively bypasses in-data-plane operations when the requesting user has full table metadata access; bypassing has been shown to produce faster query times in that case. OkeraEnsemble can now scale alongside your EMR cluster with minimal impact on ODAS.

Once enabled, bypass dynamically routes incoming queries according to the user's privileges, with no additional configuration. As noted above, bypass applies only when the requesting user has full table metadata access; otherwise, in-data-plane processing is used. Privileged queries are routed through the access proxy instead of directly to the planner.
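As an illustration, the routing decision reduces to a simple predicate. The sketch below uses hypothetical names; the actual decision is made inside ODAS from the user's resolved privileges, not a boolean flag:

```python
def route_query(user_has_full_table_access: bool, bypass_enabled: bool) -> str:
    """Pick an execution path for an incoming query (illustrative sketch).

    Bypass is applied only when it is enabled AND the requesting user
    has full table metadata access; every other case falls back to
    in-data-plane processing via the planner.
    """
    if bypass_enabled and user_has_full_table_access:
        # Privileged queries are routed through the access proxy.
        return "access-proxy (bypass)"
    # Otherwise the query goes directly to the planner as before.
    return "planner (in-data-plane)"
```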

To adapt to this new traffic pattern, several read-through caches have been added to the Access Proxy, the most notable being the user permission cache. When running in bypass mode, the Access Proxy temporarily caches file authorization results to avoid repeated authorization work. More information can be found in the caching deep dive section below.

Configuring Bypass in AWS EMR

To preview the dynamic bypass querying functionality, you need to make some slight modifications to the existing OkeraEnsemble nScale deployment configuration: grant an additional permission to the roles that will be used, and add a few parameters to the Spark configurations.

Role Permission Configuration

Currently, table-level permissions cannot be directly translated into URI permissions for DBQ queries. This means that to leverage the bypass feature, both the table and its associated bucket URI must be granted to the requesting user's role. The need for this redundant grant will be removed as part of the official rollout of the feature in future versions. Here's an example of how to set both of those permissions:

GRANT ALL ON TABLE nytaxi.parquet TO ROLE role;
GRANT ALL ON URI 's3://bucket/' TO ROLE role;

The following sections review the various combinations of permissions and the expected execution mode they yield. In this context, Full indicates that a user has full table metadata access, and Restricted indicates there are conditions on the user's access to the resource.

Single Resource Grants

This section assumes no other permissions have been granted to the requesting user besides the one indicated.

Level      Granted      Bypass Disabled   Bypass Enabled
Database   Full         Collocated        X
Database   Restricted   Collocated        Collocated
Table      Full         Collocated        X
Table      Restricted   Collocated        Collocated
URI        Full         X                 X
URI        Restricted   X                 X
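For reference, the single-resource table above can be encoded as a small lookup. This is only a sketch; the mode strings, including `X`, are reproduced verbatim from the table:

```python
# Execution mode for a single granted resource, keyed by
# (level, access, bypass_enabled). Rows mirror the table above.
SINGLE_GRANT_MODES = {
    ("database", "full", False): "Collocated",
    ("database", "full", True): "X",
    ("database", "restricted", False): "Collocated",
    ("database", "restricted", True): "Collocated",
    ("table", "full", False): "Collocated",
    ("table", "full", True): "X",
    ("table", "restricted", False): "Collocated",
    ("table", "restricted", True): "Collocated",
    ("uri", "full", False): "X",
    ("uri", "full", True): "X",
    ("uri", "restricted", False): "X",
    ("uri", "restricted", True): "X",
}

def execution_mode(level: str, access: str, bypass_enabled: bool) -> str:
    """Return the expected execution mode for a single granted resource."""
    return SINGLE_GRANT_MODES[(level, access, bypass_enabled)]
```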


Database Level Grants

This assumes the user has been granted at least some level of database access in conjunction with the listed permission.

DB Grant     URI Path   Access Level   Bypass Disabled   Bypass Enabled
Full         None       -              Collocated        X
Full         Bucket     ALL            Collocated        Bypass
Full         Bucket     SELECT         Collocated        X
Full         Table      ALL            Collocated        X
Full         Table      SELECT         Collocated        X
Restricted   None       -              Collocated        Collocated
Restricted   Bucket     ALL            Collocated        Bypass
Restricted   Bucket     SELECT         Collocated        X
Restricted   Table      ALL            Collocated        X
Restricted   Table      SELECT         Collocated        X


Table Level Grants

This assumes the user has been granted at least some level of table access in conjunction with the listed permission. Note that the URI permission granted to the requesting user must be on the bucket the table is stored in, not the table location itself.

Table Grant   URI Path   Access Level   Bypass Disabled   Bypass Enabled
Full          None       -              Collocated        X
Full          Bucket     ALL            Collocated        Bypass
Full          Bucket     SELECT         Collocated        X
Full          Table      ALL            Collocated        X
Full          Table      SELECT         Collocated        X
Restricted    None       -              Collocated        Collocated
Restricted    Bucket     ALL            Collocated        Bypass
Restricted    Bucket     SELECT         Collocated        X
Restricted    Table      ALL            Collocated        X
Restricted    Table      SELECT         Collocated        X


EMR Cluster Configuration

First, be sure to have familiarized yourself with deploying OkeraEnsemble nScale clusters. Enabling DBQ only requires a few additional parameters to be passed when creating your EMR cluster.

Configuring Spark

Add these additional parameters while creating your Spark configurations in step 2:

  • add "okera.hms.enable-dynamic-bypass":"true" to spark-hive-site.xml and hive-site.xml

  • add "okera.hive.allow-original-metadata-on-all-tables":"true" to hive-site.xml

Here's an example configuration of an nScale OkeraEnsemble cluster with bypass enabled:

[
    {
      "Classification": "hive-site",
      "Properties": {
        "okera.hms.enable-dynamic-bypass": "true",
        "okera.hive.allow-original-metadata-on-all-tables": "true",
        "cerebro.hms.log-to-stdout": "true",
        "hive.fetch.task.conversion": "minimal",
        "hive.metastore.rawstore.impl": "com.cerebro.hive.metastore.CerebroObjectStore",
        "okera.hms.log-detailed-metadata": "true",
        "recordservice.planner.hostports": "10.0.0.1:12050",
        "recordservice.workers.local-port": "13050"
      }
    }, {
      "Classification": "core-site",
      "Properties": {
        "fs.s3a.aws.credentials.provider": "com.okera.recordservice.hadoop.OkeraCredentialsProvider",
        "fs.s3a.connection.ssl.enabled": "false",
        "fs.s3a.endpoint": "localhost:5010",
        "fs.s3a.path.style.access": "true",
        "fs.s3a.s3.client.factory.impl": "com.okera.recordservice.hadoop.OkeraS3ClientFactory",
        "okerafs.default.region": "us-west-2",
        "recordservice.token-provisioner": "http://10.0.0.1:8083",
        "recordservice.workers.local-port": "13050"
      }
    }, {
      "Classification": "spark-defaults",
      "Properties": {
        "spark.extraListeners": "com.okera.recordservice.spark.OkeraSparkListener",
        "spark.recordservice.planner.hostports": "10.0.0.1:12050",
        "spark.recordservice.workers.local-port": "13050"
      }
    }, {
      "Classification": "spark-hive-site",
      "Properties": {
        "okera.hms.enable-dynamic-bypass": "true",
        "cerebro.hms.log-to-stdout": "true",
        "okera.hms.log-detailed-metadata": "true",
        "recordservice.planner.hostports": "10.0.0.1:12050",
        "recordservice.workers.local-port": "13050"
      }
    }
  ]

Configuring Bootstrap Script

As part of step 4, add the following DBQ-specific flags to the Okera bootstrap script's arguments:

--enable-bypass
--ap-user-permission-ttl <ttl in seconds, optional - defaults to 1 hour>

The --enable-bypass flag enables bypass on the access proxy instances running inside the external workers. The optional --ap-user-permission-ttl flag sets the TTL, in seconds, on the access proxy's user permission cache.

Access Proxy Caching Deep Dive

There are three main stages of file authorization in the Access Proxy:
1. Verify the request
2. Authorize the request
3. Generate a signed URL with proper credentials

As part of the bypass effort, in-memory caches have been added to each of these stages. As these caches are in-memory, restarting/recreating these pods will clear the saved state.
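Conceptually, each of these stage caches behaves like a small in-memory read-through cache with a TTL. The sketch below is illustrative only (hypothetical class and parameter names, not the actual implementation), and shows why restarting a pod clears the state: the entries live in a plain in-process dictionary.

```python
import time

class ReadThroughCache:
    """Minimal in-memory read-through cache with a per-entry TTL.

    Illustrative sketch. Because the store is a plain dict, restarting
    the process discards all cached state.
    """

    def __init__(self, loader, ttl_seconds):
        self._loader = loader     # called on a cache miss
        self._ttl = ttl_seconds
        self._store = {}          # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]       # fresh hit: no backend call
        value = self._loader(key) # miss or expired: read through
        self._store[key] = (value, time.monotonic() + self._ttl)
        return value
```

With this shape, the second identical request within the TTL is served from memory and the backing service (e.g. the Planner) is consulted only once.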

[Figure: Access Proxy caches — the Access Proxy viewed in isolation with respect to the ODAS cluster. When running in nScale mode, the Access Proxy is deployed inside the external workers.]


Request Verification Cache

This cache is responsible for verifying the credentials of the requesting user. The cache's TTL is based on the expiry defined at the creation of the user's credentials. To change the expiry of the credentials being generated, set the environment variable GO_REST_AWS_TOKEN_EXPIRY to the desired TTL when deploying the Rest Server.

Request Authorization (User Permission) Cache

This cache stores recently seen file authorization results, keyed by user, action, and the storage location provided by the associated URI permission. When a file operation request comes into the Access Proxy, it first looks for the widest scope of permissions granted to the user. If the search exhausts the file path without a match, it makes an authorize-file request to the Planner and then populates the cache only if the user has authorized access.
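The widest-scope-first search can be sketched as walking the storage path from the bucket down to the full file path. The helper and cache shape below are hypothetical; the real cache is keyed on user, action, and the storage location of the URI permission:

```python
def cached_authorization(cache: dict, user: str, action: str, path: str):
    """Look up a cached file-authorization result, widest scope first.

    Illustrative sketch: `cache` maps (user, action, prefix) -> True for
    authorized prefixes. Returns None when no prefix matches, in which
    case the Access Proxy would ask the Planner and, on success,
    populate the cache.
    """
    assert path.startswith("s3://")
    # Build prefixes from widest to narrowest, e.g. for
    # "s3://nytaxi/parquet/file.pq":
    #   s3://nytaxi/ -> s3://nytaxi/parquet/ -> s3://nytaxi/parquet/file.pq
    bucket, _, rest = path[len("s3://"):].partition("/")
    prefix = f"s3://{bucket}/"
    candidates = [prefix]
    parts = [p for p in rest.split("/") if p]
    for i, part in enumerate(parts):
        prefix += part + ("/" if i < len(parts) - 1 else "")
        candidates.append(prefix)
    for candidate in candidates:          # widest scope first
        if cache.get((user, action, candidate)):
            return True
    return None                           # exhausted: fall through to the Planner
```

Note how a single bucket-level entry answers every file request underneath it, which is why the cache warms up after only a few operations.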

A common concern is how this cache performs when cold or when used in ephemeral clusters. Remember that the cache is not indexed on a specific file or query; it uses the storage path of the table, so the file-level operations performed by Spark immediately populate it.

[Figure: Execution of a query.] The cache here would only remain cold for the first few file operations, until the in-memory cache records that the requesting user has access to the URI `s3://nytaxi/parquet/`. That propagation usually takes less than ~500 ms.


Assumed Role Cache

This cache manages the credentials of the assumed roles used when re-signing the proxied request with valid AWS credentials. These credentials can be bucket-specific and can be defined as part of Okera's assume secondary role feature. They are generated by assuming the associated role and come with a session expiration, which is used as the cache's TTL. These values can be fine-tuned as described in that feature's documentation.
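The idea can be sketched as follows. All names here are hypothetical; `assume_role` stands in for an STS AssumeRole call returning credentials plus an expiration timestamp, which is used directly as the entry's TTL:

```python
import time

class AssumedRoleCache:
    """Cache assumed-role credentials keyed by role (illustrative sketch).

    `assume_role` is a stand-in for the real credential provider; each
    entry expires at the session expiration returned with the
    credentials, so no separate TTL configuration is needed.
    """

    def __init__(self, assume_role):
        self._assume_role = assume_role
        self._store = {}  # role identifier -> (credentials, expiration)

    def credentials_for(self, role_arn):
        entry = self._store.get(role_arn)
        if entry is not None and entry[1] > time.time():
            return entry[0]                       # still valid: reuse
        creds, expiration = self._assume_role(role_arn)
        self._store[role_arn] = (creds, expiration)
        return creds
```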