Permission Examples¶
This document is for administrators and data stewards learning to set up different types of attribute-based access permissions using Okera. Some other useful reference docs are:
Example 1: Column-Level Access Permission Using Attributes¶
In this example, we create permissions that restrict access to columns that meet certain conditions. We give access to all datasets inside the sales database to any user with the analyst_role
, except for any columns tagged as security.pii
.
Here's the permission definition for the sales
database in the permission builder:
Here's the permission definition in SQL syntax:
GRANT SELECT ON database `sales`
HAVING ATTRIBUTE NOT security.pii
TO ROLE analyst_role;
analyst_role
see only columns that are not tagged security.pii
– in this case the gender
column is restricted (as shown on the right in the image below).
User | Has Access? | What data do they see? |
---|---|---|
Administrators | ✅ | Original data |
Users with analyst_role |
✅ | Original data without the gender column (which was tagged as security.pii ) |
Users without analyst_role |
❌ | N/A |
In this example, only columns that do not have the security.pii
tag on them can be viewed.
To only give access to columns with a specific tag (for example, the security.public
tag) use the following SQL syntax:
GRANT SELECT ON database `sales`
HAVING ATTRIBUTE security.public
TO ROLE analyst_role;
Example 2: Masking¶
In this example, we use masking to anonymize or redact sensitive data values. We give access to all datasets inside the sales
database to any user with the analyst_role
, but only show the last four digits of the credit card numbers. To do this, we use the mask_ccn()
privacy function.
Here's the permission definition for the sales
database in the permission builder. We have added a transform condition applying the mask_ccn()
function to any data tagged pii.credit_card
.
Here's the permission definition in SQL syntax:
GRANT SELECT ON DATABASE `sales`
TRANSFORM pii.credit_card WITH mask_ccn()
TO ROLE analyst_role;
An admin sees the original data with all data columns (as shown on the left in the image below), but users with the analyst_role
see only the last four digits of columns tagged pii.credit_card
(as shown on the right in the image below).
User | Has Access? | What data do they see? |
---|---|---|
Administrators | ✅ | Original data |
Users with analyst_role |
✅ | Can see any table inside the sales database, except any column tagged pii.credit_card will be masked except for the last four digits. In the example above, the ccn column has the pii.credit_card tag. |
Users without analyst_role |
❌ | N/A |
Example 3: Tokenization¶
Tokenization is a pseudonymization technique that substitutes a non-sensitive token for a sensitive value. Okera offers different tokenization functions depending on whether you want preserve the format or maintain referential integrity. Referential integrity means the resulting token is always the same for a given value across tables, so that analytical correlations (such as joins) are still possible. See Privacy and Security Functions for more information.
This example replaces values with a format-preserving token that maintains referential integrity. Access is given to all objects in the sales database for any user with the analyst_role
, but any columns tagged security: restricted
are tokenized. Tokens for each value are unique for each user, but still allow users to join across datasets for analytical correlations.
The tokenize()
privacy function is used for this example.
Here's what a regular tokenization permission definition for the sales
database would look like in the permission builder:
Here's the permission definition in SQL syntax:
GRANT SELECT ON database `sales`
TRANSFORM security.restricted WITH tokenize()
TO ROLE analyst_role;
signed_user()
argument for the tokenization function to true
. If we don't set this to true
, all users will see the same tokenized values for a given input.
Here's the updated permission definition in the permission builder.
Here's the updated permission definition in SQL syntax:
GRANT SELECT ON database `sales`
TRANSFORM security.restricted WITH tokenize(__COLUMN__, signed_user(TRUE))
TO ROLE
Note: Within Okera's permission syntax,
__COLUMN__
is simply a variable for a column that is dynamically referenced by its tags.
Tip
To see other arguments that can be used to configure the behavior of the tokenization functions see Tokenization.
An admin sees the original data with all data columns and no tokenization (as shown on the left in the image below), but users with the analyst_role
see the values of any columns tagged with security:restricted
replaced with a format preserving token (as shown on the right in the image below).
User | Has Access? | What data do they see? |
---|---|---|
Admininistrators | ✅ | Original data |
Users with analyst_role |
✅ | Can see any dataset inside the sales database, however any columns tagged with security.restricted have their sensitive values replaced with a format-preserving token that maintains referential integrity. This token is unique per user, but still allows users to join across datasets. |
Users without analyst_role |
❌ | N/A |
Suppose we had two tables inside the sales
database – sales.customers
and sales.orders
, both containing the uid
column tagged as security:restricted
. Here's sample data from sales.customers
:
uid | dob | gender | ccn |
---|---|---|---|
0001BDD9-EABF-4D0D-81BD-D9EABFCD0D7D | 8-Apr-84 | F | 3771-2680-8616-9487 |
00071AA7-86D2-4EB9-871A-A786D27EB9BA | 7-Feb-88 | F | 4539-9934-1924-5730 |
0007967E-F188-4598-9C7C-E64390482CFB | 1-Jun-66 | M | 6011-0440-7310-9221 |
Here's sample data from sales.orders
:
orderid | uid | orderdate |
---|---|---|
3453 | 0001BDD9-EABF-4D0D-81BD-D9EABFCD0D7D | 2020-05-04 |
3422 | 00071AA7-86D2-4EB9-871A-A786D27EB9BA | 2020-05-04 |
2342 | 0007967E-F188-4598-9C7C-E64390482CFB | 2020-05-04 |
Note that the uid
column is common between datasets.
Using the permissions definition in this example, analysts can successfully run the following query that joins across both tables, but the uid
field will always appear tokenized.
SELECT
orders.uid,
customers.gender,
orders.orderdate
FROM sales.orders
INNER JOIN sales.customers
ON orders.uid=customers.uid
WHERE orderdate='2020-05-04'
Here's the result of this query as the analyst would see it:
uid | gender | orderdate |
---|---|---|
46221E2B-83TS-4D55-075G-3T89HGWW833U | F | 2020-05-04 |
274O90F6-7993-2903-0O47-ZNC87101F738 | F | 2020-05-04 |
811744H9-3NH0-310B-2J4A-55996663WI7H | M | 2020-05-04 |
5556J4GV-8412-581I-N60G-5U3371Q1QAI3 | U | 2020-05-04 |
Example 4: Row-Level Location-Based Permissions¶
Row-level filtering can be useful when data needs to be filtered by region or country. This might be required to comply with regulations such as GDPR or CCPA, or to further simplify organizationl access control across global data.
Column-Based Row-Level Location Filtering¶
In this example, we grant access to sales transaction for users with the analyst_role
, but only show data from the US.
Here's what the permission definition looks like in the permission builder.
In this example, we give access to the sales.transactions
table, using the value in the column country
as a row filter.
Here's the permission definition in SQL syntax:
GRANT SELECT ON TABLE sales.transactions
WHERE country = 'US'
TO ROLE analyst_role;
Note: Row-level filter permissions can be applied at both the dataset or database level. However, if they are at the database-level, tables that do not contain the referenced column are longer accessible.
An admin sees the original data with all data columns and no row filter (as shown on the left in the image below), but users with the analyst_role
see only US data (as shown on the right in the image below).
User | Has Access? | What data do they see? |
---|---|---|
Admininistrators | ✅ | Original data |
Users with analyst_role |
✅ | Can see the sales.transactions table, except rows will be filtered for country='US'. |
Users without analyst_role |
❌ | N/A |
User-Attribute-Based Row-Level Location Filtering¶
In this example, we grant access to sales transactions for all users with the analyst_role
when the transaction territory is the same as users.
Here's what the permission definition looks like in the permission builder. We give access to the sales.transactions
table an, using the column value territory
as a row filter.
Here is the permission definition in SQL syntax:
GRANT SELECT ON TABLE
WHERE sets_intersect(user_attribute('territory'), territory)
TO ROLE analyst_role;
An admin sees the original data with all data columns and no row filter (as shown on the left in the image below), but users with the analyst_role
see only data corresponding to their assigned territory. In the example below (on the right), only EMEA data is shown because the analyst is assigned to EMEA.
User | Has Access? | What data do they see? |
---|---|---|
Admininistrators | ✅ | Original data |
Users with analyst_role |
✅ | Can see the sales.transactions table, except rows are filtered for column territory=user attribute('territory') . |
Users without analyst_role |
❌ | N/A |
Example 5: Conditional Transformation or Cell-Level Permissions¶
You can create permission definitions that transform a column depending on its value. In this example, we give access to any datasets in the human_resources
database for users with the analyst_role
, but we mask income information if it is greater than 25.00
.
Here's what the permission definition looks like in the permission builder. We leverage the SQL if
statement to place a conditional check to determin if the value is greater than 25.00
.
Here's the permission definition in SQL syntax:
GRANT SELECT ON DATABASE `human_resources`
TRANSFORM hr.income WITH if(__COLUMN__ > '25.00', mask(__COLUMN__), __COLUMN__)
TO ROLE analyst_role;
An admin sees the original data with all data columns and no transformation (as shown on the left in the image below), but users with the analyst_role
cannot see income values if they are over 25.00
.
User | Has Access? | What data do they see? |
---|---|---|
Admininistrators | ✅ | Original data |
Users with analyst_role |
✅ | Can see any table inside the human_resources database, except any column tagged hr.income has values masked if they are over 25.00 . |
Users without analyst_role |
❌ | N/A |