In the role-based access control model supported by the Okera Platform, roles are granted access to resources by means of privileges. These vary in their functionality, based on the SQL operation they pertain to. The Okera Policy Engine supports the following privileges:
||Catalog, Database, Table||Grants full and unrestricted access to an object and any descendants. The
||Catalog, Database, Table, Column||Grants read access to an object and any descendants.|
||Catalog, Database, Table||Grants read access to object metadata only. Cannot be granted at a column level.|
||Catalog, Database, Table, Column||Grants write access to the object. Does not include read access.|
||Catalog, Database||Grants ability to create an object within another object, for instance,
||Catalog||Grants ability to create a database and automatically receive ALL privileges on that database e.g.
||Catalog, Database, Table||Grants ability to edit metadata for specified object, such as alter the table/view definition, add/drop/rename columns, change datatypes, partitions, storage locations, table properties.|
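As an illustration of how a privilege combines with an object, the statements below grant write and read access on a single table. This is a minimal sketch: the privilege keywords `INSERT` and `SELECT`, the `GRANT ... ON ... TO ROLE` form (as commonly found in Sentry-style SQL), and the role name `etl_role` are assumptions made for this example.

```sql
-- Hypothetical role; it must exist before privileges can be granted to it.
CREATE ROLE etl_role;

-- Write access does not include read access, so a role that needs to
-- both load and query the table requires two separate grants.
GRANT INSERT ON TABLE sales.transactions_schemaed TO ROLE etl_role;
GRANT SELECT ON TABLE sales.transactions_schemaed TO ROLE etl_role;
```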
Object Types and Scope
Permissions are usually scoped at a specific object level.
Here are the object types supported by Okera:
|Catalog||Global scope for all objects in the Catalog.|
|Database||Scope on a single database and all included objects.|
|Table||Permissions for a specific table or view, with all its columns.|
|Column||As for tables, but for a subset of columns only.|
|URI||Specific to a file-based resource.|
* If you see SERVER, it refers to the old syntax for
You can think of most of these like a hierarchy, with the exception of URIs. The latter is a separate object, while all the others subsume each other from left to right, as shown in the diagram.
Scope is the selection of an object at some point in the hierarchy, with the understanding that all child objects are included. For example, if you allow read access to a specific database for a certain role, all user groups that are associated with that role are able to read all datasets (that is, all tables and views) in the database.
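The database-level example can be sketched as follows. The role name `analyst_role` and the group name `analysts` are hypothetical, and the `SELECT` keyword and the statement forms are assumed to follow common Sentry-style syntax.

```sql
-- Hypothetical role and group names, used for illustration only.
CREATE ROLE analyst_role;

-- Read access at the database scope covers every table and view in sales.
GRANT SELECT ON DATABASE sales TO ROLE analyst_role;

-- Every user in the analysts group inherits the role's privileges.
GRANT ROLE analyst_role TO GROUP analysts;
```

After these statements, no further grant is needed for the role to read any dataset in `sales`, including datasets created later.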
Note: You cannot revoke permissions on a scope that were granted on a higher scope. If you grant access for a database to a specific role, you cannot revoke access to some of the datasets included in that database.
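One way to work with this restriction is to grant at the narrowest scope that is needed, rather than granting broadly and trying to carve out exceptions afterwards. The sketch below assumes a hypothetical role `analyst_role` that already exists, and assumes the column-list form `SELECT(col, ...)` for column-level grants.

```sql
-- Grant read access to one dataset only, instead of the whole database.
GRANT SELECT ON TABLE sales.transactions TO ROLE analyst_role;

-- Grant read access to a subset of columns of another dataset.
GRANT SELECT(txnid, sku, price) ON TABLE sales.transactions_schemaed TO ROLE analyst_role;
```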
Permissions for creating child objects are commonly assigned at the next higher scope. For instance, you grant the create permission to a role at the database level, which allows the users associated with that role to create new tables and views inside that database.
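A minimal sketch of such a grant is shown here. The role name `steward_role` is hypothetical and assumed to exist, and the privilege keyword `CREATE` is an assumption for this example.

```sql
-- The grant is issued on the parent object (the database), not on the
-- tables or views that will be created inside it.
GRANT CREATE ON DATABASE sales TO ROLE steward_role;
```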
Views are handled using the same scope as tables. In other words, when addressing views as part of the authorization commands, refer to the table documentation.
URIs are a special kind of object, registering paths or specific resource files (such as Java JARs) that are accessible to non-administrative users. A handful of action types require file system permissions:
- Creating databases
- Creating external tables
- Creating functions
- Altering an external table’s location
- Altering a table’s set of partitions
In general, any action that requires the ROW FORMAT SERDE keywords is checked for file system permissions before it is allowed by the platform.
Some of the above operations are only allowed at the catalog scope, which implicitly makes anyone permitted at that level a global administrator.
Because global administrators are unrestricted, they are assumed to have unrestricted access to the underlying file systems.
Note: The file system checks for global administrators fall back to the Okera Data Access Service (ODAS) having access to the file resources. In practice, every ODAS setup runs with an authenticated technical user account, which needs access to all resources referenced by any of the location-dependent SQL statements.
The following table shows each affected action with the scope and privilege it requires, and what that means for the file system checks:
|Action Type||Scope||Privilege||File System Check|
|Create Database||Catalog||All||Not needed|
|Create Function||Catalog||All||Not needed|
|Create External Table||Database||All||Yes (1)|
|Alter External Table Location||Database||All||Yes (1)|
|Alter Table Partitions||Database||All||Yes (1)|
Legend: (1) Applies to non-administrative users only
Note: URIs are supported only in combination with the All privilege, as shown in the table.
For non-specific URIs, that is, those that do not reference a specific resource file (see Extending ODAS for an example), access is checked for the given file system path. Any file or directory inside that location is automatically included. That allows, for example, an administrator to permit access to a specific root path for a given role. Any user that is associated with that role is allowed the same level of access inside that root path.
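A root-path grant of this kind might look like the following sketch. The role name `udf_developer_role` is hypothetical and assumed to exist, and the `ON URI` form is assumed to follow common Sentry-style syntax. The bucket path is the one used in the examples further down.

```sql
-- Permit access to a root path. Every file and directory below it,
-- such as udfs/mask-udf.jar, is included automatically.
GRANT ALL ON URI 's3://acme-udfs-public/udfs' TO ROLE udf_developer_role;
```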
Finally, any SQL statement that uses one of those resources implicitly, like a SELECT statement using a UDF, does not require a file system permission check again.
This makes sense because a table or function must be created before it can be used.
In other words, an administrator or elevated user (with “All” privileges on the catalog or database level respectively) creates the object or function using the explicit location URI.
Any other user with, for example, a read-only role is allowed to access the object or function without requiring explicit access to the underlying resources.
Here are examples showing the difference:
Example: SQL statements that require explicit access to the specified resources
    CREATE EXTERNAL TABLE transactions_schemaed(
      txnid BIGINT,
      dt_time STRING,
      sku STRING,
      userid INT,
      price FLOAT,
      creditcard STRING,
      ip STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION 's3://acme-sales-data/transactions';

    ALTER TABLE sales.transactions_schemaed RECOVER PARTITIONS;

    CREATE FUNCTION sales.mask(STRING) RETURNS STRING
    LOCATION 's3://acme-udfs-public/udfs/mask-udf.jar'
    SYMBOL='com.acme.hiveudf.MaskUDF';
Example: SQL statements that do not require explicit access to the underlying resources
    SELECT count(txnid) FROM sales.transactions_schemaed;

    CREATE VIEW sales.transactions AS
    SELECT
      txnid,
      dt_time,
      sku,
      if (has_access('sales.transactions_schemaed'), userid, tokenize(userid)) as userid,
      price,
      if (has_access('sales.transactions_schemaed'), creditcard, mask_ccn(creditcard)) as creditcard,
      if (has_access('sales.transactions_schemaed'), ip, cast(tokenize(ip) as STRING)) as ip
    FROM sales.transactions_schemaed;
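To round out the picture, a read-only role can then be given access to the view without any URI grant. This is a sketch under assumptions: `analyst_role` is a hypothetical role that already exists, and the `SELECT` keyword and statement form are assumed. Because views are handled using the same scope as tables, the grant addresses the view with `ON TABLE`.

```sql
-- Read access on the view only; no grant on 's3://acme-sales-data/...'
-- is involved, because querying does not repeat the file system check.
GRANT SELECT ON TABLE sales.transactions TO ROLE analyst_role;
```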
How this is used in practice is explained in Best Practices.