Policy Examples Guide

This document is for administrators and data stewards who are looking for a guide on how to setup different types of attribute-based access policies using Okera. Some other useful reference docs are:

Column level access control using attributes

Create policies that restrict access to only columns that meet certain attribute conditions.

Policy example: For any user with the analyst_role, give access to all datasets inside the sales database, except any columns tagged security.piis.

Here's what our policy on the sales database looks like in the policy builder:

Okera restrict columns using ABAC example in policy builder

Example policy in SQL syntax:

GRANT SELECT ON database `sales`
HAVING ATTRIBUTE NOT security.pii
TO ROLE analyst_role;

Policy result

The admin sees the original data with all the columns, whilst users with the analyst_role see only columns that are not tagged security.pii – in this case the gender column is restricted.

Result of column restriction policy

User Has Access? What data do they see?
Admininistrators Original data
Users with analyst_role Original data without the gender column (which was tagged as security.pii)
Users without analyst_role N/A

In the above example we are only allowing columns that do not have the security.pii tag on them. If we wanted to only give access to columns that have a certain tag we can use the below syntax. In the below example we are only giving access to columns tagged security.public.

GRANT SELECT ON database `sales`
HAVING ATTRIBUTE security.public
TO ROLE analyst_role;

Masking

Anonymize or redact sensitive data values using attributes.

Policy example: For any user with the analyst_role, give access to all datasets inside the sales database, but only show the last 4 digits of credit card number.

Privacy function used: mask_ccn(). See reference.

Here's what our policy on the sales database looks like in the policy builder. We are adding a transform condition applying the mask_ccn() function to any data tagged pii.credit_card.

Okera masking example in policy builder

Example policy in SQL syntax:

GRANT SELECT ON DATABASE `sales`
TRANSFORM pii.credit_card WITH mask_ccn()
TO ROLE analyst_role;

Policy result

Here's what the data looks like to different users. The admin sees the original data in full fidelity, whilst users with the analyst_role only see the last 4 digits of columns tagged pii.credit_card.

Result of masking policy

User Has Access? What data do they see?
Admininistrators Original data
Users with analyst_role Can see any table inside the sales database, except any column tagged pii.credit_card will be masked except for the last 4 digits. In the above example the ccn column has the pii.credit_card tag.
Users without analyst_role N/A

Tokenization

Tokenization is pseudonymization technique that substitutes a value that may be sensitive, with a non-sensitive token. Okera offers different tokenization functions depending on if you want format-preservation, or to maintain referential integrity. Referential integrity means the resulting token is always the same for a given value across tables, so that analytical correlations i.e joins are still possible. See the Privacy and Security Functions docs for more information.

Format preserving tokenization

Replaces values with a format preserving token that maintains referential integrity.

Policy example: For any user with the analyst_role, give access to all objects in the sales database and tokenize any columns tagged security: restricted. Tokens for each value should be unique per user, but still allow each user to join across datasets for analytical correlations.

Privacy function used: tokenize(). See reference.

Here's what a regular tokenization policy for the sales database would look like in the policy builder:

Okera default tokenization example in policy builder

Our policy in SQL syntax:

GRANT SELECT ON database `sales`
TRANSFORM security.restricted WITH tokenize()
TO ROLE analyst_role;

However in our example we want tokens to be unique per user, so we must set the signed_user() argument for the tokenization function to true. If we didn't set this to true, all users would see the same tokenized values for a given input.

Okera tokenization example in policy builder with signed user argument

New policy in SQL syntax:

GRANT SELECT ON database `sales`
TRANSFORM security.restricted WITH tokenize(__COLUMN__, signed_user(TRUE))
TO ROLE

Note

Within Okera's policy syntax, __COLUMN__ is simply a variable for a column that is dynamically referenced by its tags.

Tip

To see other arguments that can be used to configure the behavior of the tokenization functions see here.

Policy result

Here's what the data looks like to different users. The admin sees the original data in full fidelity with no tokenization, whilst the analyst user sees the values of any columns tagged with security:restricted replaced with a format preserving token. In the below example the uid and the ccn columns have been tagged security:restricted.

Okera format preserving tokenization result

User Has Access? What data do they see?
Admininistrators Original data
Users with analyst_role Can see any dataset inside the sales database, however any columns tagged with security.restricted will have their sensitive values replaced with a format-preserving token that maintains referential integrity. This token will be unique per user, but still allow that user to join across datasets.
Users without analyst_role N/A

Imagine we had two tables inside the sales database – sales.customers and sales.orders, both containing the uid column tagged as security:restricted.

Here's a sample from sales.customers:

uid dob gender ccn
0001BDD9-EABF-4D0D-81BD-D9EABFCD0D7D 8-Apr-84 F 3771-2680-8616-9487
00071AA7-86D2-4EB9-871A-A786D27EB9BA 7-Feb-88 F 4539-9934-1924-5730
0007967E-F188-4598-9C7C-E64390482CFB 1-Jun-66 M 6011-0440-7310-9221

Here's a sample from sales.orders:

orderid uid orderdate
3453 0001BDD9-EABF-4D0D-81BD-D9EABFCD0D7D 2020-05-04
3422 00071AA7-86D2-4EB9-871A-A786D27EB9BA 2020-05-04
2342 0007967E-F188-4598-9C7C-E64390482CFB 2020-05-04

Notice that the uid column is common between datasets.

With the above policy, the analyst could still successfully run the query below that joins across both tables, however the uid field will always appear tokenized.

SELECT
orders.uid,
customers.gender,
orders.orderdate
FROM sales.orders
INNER JOIN sales.customers
ON orders.uid=customers.uid
WHERE orderdate='2020-05-04'

Here's the result of the query above as the analyst would see it:

uid gender orderdate
46221E2B-83TS-4D55-075G-3T89HGWW833U F 2020-05-04
274O90F6-7993-2903-0O47-ZNC87101F738 F 2020-05-04
811744H9-3NH0-310B-2J4A-55996663WI7H M 2020-05-04
5556J4GV-8412-581I-N60G-5U3371Q1QAI3 U 2020-05-04

Row filtering policies

Location based policies

Leveraging row based filtering can be useful when data needs to be filtered by region or country. This could be in order to comply with regulations such as GDPR or CCPA, or to further simplify organizational access control across global data.

Policy example: For users with the analyst_role, grant access to sales transactions but only show data from the US.

Here's what our policy looks like in the policy builder. In this example we are giving access to the sales.transactions table, and using the value in the column country as a row filter.

Okera row filter policy builder

Our policy in SQL syntax:

GRANT SELECT ON TABLE sales.transactions
WHERE country = 'US'
TO ROLE analyst_role;

Note

Row filter policies can be applied at both the dataset or database scope, however if applied at the database level then tables that do not contain the referenced column will no longer be accessible.

Policy result

Here's what the data looks like to different users. The admin sees the original data in full fidelty with no row filter, whilst the analyst user can only see US data.

Okera Row filter policy result

User Has Access? What data do they see?
Admininistrators Original data
Users with analyst_role Can see the sales.transactions table, except rows will be filtered to where country='US'.
Users without analyst_role N/A

Conditional transformation or cell level policies

Policies that transform a column depending on its value.

Policy example: For users with the role analyst_role, give access to any datasets inside the human_resources database, however mask any income information if greater than 25.00.

Here's what our policy looks like in the policy builder. As you can see we are leveraging a SQL if statement to place the conditional check if the value is above '25.00'.

Okera conditional masking policy builder

New policy in SQL syntax:

GRANT SELECT ON DATABASE `human_resources`
TRANSFORM hr.income WITH if(__COLUMN__ > '25.00', mask(__COLUMN__), __COLUMN__)
TO ROLE analyst_role;

Policy result

Here's what the data looks like to different users. The admin sees the original data in full fidelity with no transformation, whilst the analyst user cannot see income values if they are over '25.00'.

Okera tokenization policy result

User Has Access? What data do they see?
Admininistrators Original data
Users with analyst_role Can see any table inside the human_resources database, except any column tagged hr.income will have its value masked if it's over 25.00.
Users without analyst_role N/A