Managing Tags

The Tags page allows users with access to view existing tags, create new tags associated with a namespace, and delete tags. A tag is a form of "attribute" within Okera, and the two terms are used interchangeably in this document. You can assign attributes on objects based on the data they contain, e.g., you may want to tag a dataset containing sales data as sales or a column with sensitive data as pii. This enables the creation of attribute-based access policies based on tags.

Controlling access to tags

Access control for tag management

Access to attribute namespaces can be controlled like other objects in the system. ATTRIBUTE NAMESPACE is an object level that sits under catalog. You can grant CREATE, ADD_ATTRIBUTE or ALL permission levels on attribute namespaces.

  • CREATE - Users will only be able to create attributes inside the specified attribute namespace.
  • ADD_ATTRIBUTE - Users will only be able to assign attributes inside the specified attribute namespace, on data they have permission to assign on.
  • ALL - Users will be able to create, drop and assign attributes inside the specified attribute namespace.

For example, to let a role create, drop and assign attributes from a particular attribute namespace, you would use the following:

GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward;
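The three permission levels above can be summarized as a small lookup. This is only an illustrative model of the rules as described in this section, not Okera code. The names `NAMESPACE_PERMISSIONS`, `allowed_actions` and `can_perform` are hypothetical, and the sketch assumes that separately granted levels combine additively.

```python
# Illustrative model only: actions each permission level allows on an
# attribute namespace, as described in the list above.
NAMESPACE_PERMISSIONS = {
    "CREATE": {"create"},
    "ADD_ATTRIBUTE": {"assign"},
    "ALL": {"create", "drop", "assign"},
}


def allowed_actions(granted_levels):
    """Return the union of actions permitted by the granted levels.

    Assumption: separately granted levels combine additively.
    """
    actions = set()
    for level in granted_levels:
        actions |= NAMESPACE_PERMISSIONS[level.upper()]
    return actions


def can_perform(action, granted_levels):
    """True if any granted level permits the action."""
    return action in allowed_actions(granted_levels)
```

In this model, dropping an attribute is only possible with ALL, since neither CREATE nor ADD_ATTRIBUTE is described as including it.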

Granting access to tags page in the UI

In addition, to grant access to the Tags page in the UI so that a user can create and manage tags there, grant okera_tags_role to that user's group. Note that in order to assign attributes on data, the user still needs the correct privileges on the data they are trying to assign on.

GRANT ROLE okera_tags_role to group marketing_steward_group;

Controlling who can assign tags on objects

In order to actually assign attributes on an object (i.e., a database, dataset or column), the user needs to have:

  1. either ALL, or both ADD_ATTRIBUTE and REMOVE_ATTRIBUTE, on the data they wish to assign on. Granting only one of ADD_ATTRIBUTE or REMOVE_ATTRIBUTE is not enough: a role needs both add and remove permissions to assign a tag on an object.
  2. either ALL or ADD_ATTRIBUTE on the attribute namespace they wish to assign tags from.

If you only grant #1 without #2, then the user will not have any tags available to assign, since they have not been granted access to any attribute namespaces.
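The two requirements can be expressed as a simple check. The sketch below models the rule as stated in this section. The function names are hypothetical, and privileges are represented as plain strings for illustration.

```python
def has_object_privilege(object_privs):
    """Requirement 1: ALL, or both ADD_ATTRIBUTE and REMOVE_ATTRIBUTE, on the data."""
    privs = {p.upper() for p in object_privs}
    return "ALL" in privs or {"ADD_ATTRIBUTE", "REMOVE_ATTRIBUTE"} <= privs


def has_namespace_privilege(namespace_privs):
    """Requirement 2: ALL or ADD_ATTRIBUTE on the attribute namespace."""
    privs = {p.upper() for p in namespace_privs}
    return "ALL" in privs or "ADD_ATTRIBUTE" in privs


def can_assign_tag(object_privs, namespace_privs):
    """A user can assign a tag only when both requirements hold."""
    return has_object_privilege(object_privs) and has_namespace_privilege(namespace_privs)
```

For instance, `can_assign_tag(["ALL"], [])` is false in this model, which corresponds to the case where the user has no tags available to assign.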

As an example, here's how you could set up a data steward who can only assign tags from the marketing attribute namespace, on data inside the marketing database:

GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward;
GRANT ALL on DATABASE marketing TO ROLE marketing_steward;

Here's a more granular way of achieving the same use case. The only difference is that the marketing steward does not have ALL access on the data, so they must be given permissions to add and remove attributes explicitly (in this example, on the okera_sample.sample table).

GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward;
GRANT ADD_ATTRIBUTE ON TABLE okera_sample.sample TO ROLE marketing_steward;
GRANT REMOVE_ATTRIBUTE ON TABLE okera_sample.sample TO ROLE marketing_steward;

Example of tag permissions

Below is an example of granting permissions for managing tags to a marketing data steward.

CREATE ROLE marketing_steward_role;
GRANT ROLE marketing_steward_role to group marketing_steward_group;

-- Marketing steward will be able to assign tags on the marketing database 
-- since they have ALL access on it
GRANT ALL ON DATABASE marketing TO ROLE marketing_steward_role;

-- Marketing steward will be able to create, drop and assign tags
-- only from the marketing attribute namespace 
GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward_role;

-- Grant access to the tags page in the UI
GRANT ROLE okera_tags_role to group marketing_steward_group;
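If you set up several stewards in the same way, it may help to generate these statements from a template. The following is a minimal sketch that only builds the statement strings shown above. `steward_grants` is a hypothetical helper, and it assumes keywords are case-insensitive, as the mixed casing in the examples suggests.

```python
def steward_grants(role, group, database, namespace, tags_role="okera_tags_role"):
    """Build the statements from the example above for one data steward.

    Returns a list of DDL strings; nothing is executed here.
    """
    return [
        f"CREATE ROLE {role};",
        f"GRANT ROLE {role} TO GROUP {group};",
        # ALL on the database lets the steward assign tags on its data.
        f"GRANT ALL ON DATABASE {database} TO ROLE {role};",
        # ALL on the namespace lets the steward create, drop and assign its tags.
        f"GRANT ALL ON ATTRIBUTE NAMESPACE {namespace} TO ROLE {role};",
        # Access to the Tags page in the UI.
        f"GRANT ROLE {tags_role} TO GROUP {group};",
    ]


statements = steward_grants(
    role="marketing_steward_role",
    group="marketing_steward_group",
    database="marketing",
    namespace="marketing",
)
```

The resulting strings can then be run through whichever client you normally use to execute DDL.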

Creating a tag

Using the UI

To create a new tag, select an existing namespace or create a new one. A namespace acts as a category for grouping similar tags. For example, tags associated with security might be grouped under a namespace called 'Security'.

Once you have specified a namespace, write the name of your new tag in the Tag field and click 'Add'. The new tag will appear in your list of existing tags under the namespace provided.

Using DDL

CREATE ATTRIBUTE [IF NOT EXISTS] <namespace.attribute_name>;

Deleting a tag

Using the UI

To delete a tag, select the tag you wish to delete and click the trash can icon that appears in the top right of the tag details pane.

Select 'delete tag' if you would like to permanently delete a tag.

Deleting a tag will permanently delete all instances of this tag from your data and will void any policies on this tag.

Note

Deleting tags can affect data security and discoverability.

Using DDL

DROP ATTRIBUTE [IF EXISTS] <namespace.attribute>;
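Both CREATE ATTRIBUTE and DROP ATTRIBUTE take a namespace-qualified attribute name. As a rough sketch, the helpers below build those statements as strings. The function names are hypothetical, and the identifier check (letters, digits and underscores) is an assumption made for illustration, not a rule taken from Okera.

```python
import re

# Assumption for illustration: plain identifiers only. Okera's actual
# naming rules may differ.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def qualified_attribute(namespace, name):
    """Return 'namespace.name' after a basic sanity check on both parts."""
    for part in (namespace, name):
        if not _IDENTIFIER.match(part):
            raise ValueError(f"unexpected characters in {part!r}")
    return f"{namespace}.{name}"


def create_attribute_ddl(namespace, name, if_not_exists=True):
    """Build a CREATE ATTRIBUTE statement, optionally with IF NOT EXISTS."""
    clause = "IF NOT EXISTS " if if_not_exists else ""
    return f"CREATE ATTRIBUTE {clause}{qualified_attribute(namespace, name)};"


def drop_attribute_ddl(namespace, name):
    """Build a plain DROP ATTRIBUTE statement (no optional clause)."""
    return f"DROP ATTRIBUTE {qualified_attribute(namespace, name)};"
```

For example, `create_attribute_ddl("marketing", "pii")` yields `CREATE ATTRIBUTE IF NOT EXISTS marketing.pii;`.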

Assigning a tag on an object

In order to assign tags on objects you need to have:

  • ALL or ADD_ATTRIBUTE and REMOVE_ATTRIBUTE privilege on the data you want to assign on
  • ALL or ADD_ATTRIBUTE privilege on the ATTRIBUTE NAMESPACE you wish to assign from

In the UI

To assign a tag from the UI go to the Datasets page and find a dataset or column to tag. To assign tags to a dataset, click on the edit icon next to 'Tags' in the Dataset Details view.

A modal will display a list of tags. Select the checkbox next to a tag to assign it. You may also uncheck a checkbox to remove that tag from the dataset. Click 'Save' when you have finished assigning tags. Tags assigned to a dataset will display on the Dataset Summary Card, as well as in the Dataset Details view.

To assign a tag to a column, open the Dataset Details view and go to the Schema. Under the Tags column, click 'Click to Add' or the edit icon.

Note

Tags cannot be assigned to partitioned columns or to nested fields of complex types.

A modal will display a list of tags. Select the checkbox next to a tag to assign it. You may also uncheck a checkbox to remove that tag from the column. Click 'Save' when you have finished assigning tags. Tags assigned to columns will display in the Dataset Details view, but not on the Dataset Summary Card.

Using DDL

Assign an attribute on a database

ALTER DATABASE <db> ADD|DROP ATTRIBUTE <namespace.attribute>;

Assign an attribute on a table or view

ALTER TABLE|VIEW <dataset_name> ADD|DROP ATTRIBUTE <namespace.attribute>;

Assign an attribute on a column

ALTER TABLE|VIEW <dataset_name> ADD|DROP COLUMN ATTRIBUTE <column_name> <namespace.attribute>;

Admins can view assigned attributes as part of the dataset metadata by running describe database <database_name> or describe formatted <dataset_name>.
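The three ALTER forms above differ only in the object type and in whether a column is named. The sketch below builds any of them from its parts. `alter_attribute_ddl` is a hypothetical helper that follows the syntax shown in this section, and the column name in the usage note is a placeholder.

```python
def alter_attribute_ddl(object_type, object_name, action, attribute, column=None):
    """Build an ALTER ... ADD|DROP [COLUMN] ATTRIBUTE statement as a string.

    object_type: 'DATABASE', 'TABLE' or 'VIEW'
    action:      'ADD' or 'DROP'
    attribute:   namespace-qualified attribute, e.g. 'marketing.pii'
    column:      optional column name (tables and views only)
    """
    object_type = object_type.upper()
    action = action.upper()
    if object_type not in {"DATABASE", "TABLE", "VIEW"}:
        raise ValueError(f"unsupported object type: {object_type}")
    if action not in {"ADD", "DROP"}:
        raise ValueError(f"unsupported action: {action}")
    if column is not None:
        if object_type == "DATABASE":
            raise ValueError("column attributes apply to tables and views only")
        return f"ALTER {object_type} {object_name} {action} COLUMN ATTRIBUTE {column} {attribute};"
    return f"ALTER {object_type} {object_name} {action} ATTRIBUTE {attribute};"
```

For example, `alter_attribute_ddl("table", "okera_sample.sample", "add", "marketing.pii", column="record")` produces a column-level statement, where `record` stands in for whichever column you want to tag.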

Tag inheritance

By default, any tag assignment or removal operation on a parent dataset will cascade down to its descendant views. This includes assigning tags at the dataset level, as well as at the column level. When a new view is created, it inherits the tags present on the parent dataset at that point in time.

There are some cases where tags will not cascade:

  • If the view transforms the parent in any way (such as aggregations, column data manipulation, etc), none of its tags will cascade
  • For views with joins:
    • Okera will cascade tags only on the directly referenced columns. Take for example:
CREATE VIEW sales.orders_by_customer AS 
SELECT transactions.userid, customers.customername, transactions.orderdate
FROM sales.transactions 
INNER JOIN sales.customers 
ON transactions.userid=customers.userid;

Only tags on columns transactions.userid, customers.customername and transactions.orderdate will cascade to the new view. Any tags on customers.userid will not cascade.

  • For views with joins, tags applied at the table level will not cascade, e.g., if one dataset is tagged status:approved and the other is not, the resulting joined view will not have the status:approved tag on it.
  • If the view lineage is not present. See the section below.
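To make the join rule concrete, the following sketch models column-level cascading for the sales.orders_by_customer view above. It illustrates the described behavior and is not Okera's implementation. The function name and the tag names are hypothetical.

```python
def cascaded_column_tags(selected_columns, parent_column_tags):
    """Return the tags each view column would inherit under the join rule.

    selected_columns:   'table.column' names directly referenced by the view
    parent_column_tags: mapping of 'table.column' -> set of tags on the parents
    Only tags on directly referenced columns are carried over.
    """
    result = {}
    for qualified in selected_columns:
        column = qualified.split(".")[-1]
        result[column] = set(parent_column_tags.get(qualified, set()))
    return result


# Columns selected by sales.orders_by_customer in the example above.
selected = [
    "transactions.userid",
    "customers.customername",
    "transactions.orderdate",
]

# Hypothetical tags on the parent columns.
parent_tags = {
    "transactions.userid": {"marketing.user_id"},
    "customers.userid": {"marketing.customer_key"},
    "customers.customername": {"marketing.pii"},
}

view_tags = cascaded_column_tags(selected, parent_tags)
```

Here the tag on customers.userid does not appear on the view's userid column, because the view selects transactions.userid and customers.userid is only used in the join condition.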

View lineage

In order to cascade tags, Okera maintains the view lineage as part of a dataset's metadata to understand where tags have been inherited from. This can be viewed in the UI.

Tags will only cascade if the view lineage is present. View lineage is only available for views created through Okera in version 2.1 or later. If Okera is connected to an external metastore, and views were created directly inside that metastore, bypassing Okera, they will not show view lineage. Similarly, any views created through Okera prior to the 2.1 release will not have view lineage. It may be necessary to recreate these views if you wish tags to cascade on them.
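The conditions above can be read as a checklist for whether a view is eligible for tag cascading. The sketch below encodes that checklist. `ViewInfo` and its fields are hypothetical and exist only to illustrate the rules. They do not correspond to Okera metadata fields.

```python
from dataclasses import dataclass


@dataclass
class ViewInfo:
    """Hypothetical description of a view, for illustration only."""
    created_through_okera: bool       # False if created directly in an external metastore
    okera_version_at_creation: tuple  # e.g. (2, 1) for the 2.1 release
    transforms_parent: bool           # True for aggregations, column manipulation, etc.


def has_view_lineage(view):
    """Lineage exists only for views created through Okera at 2.1 or later."""
    return view.created_through_okera and view.okera_version_at_creation >= (2, 1)


def tags_can_cascade(view):
    """Tags cascade only when lineage is present and the view does not transform its parent."""
    return has_view_lineage(view) and not view.transforms_parent
```

A view for which `has_view_lineage` is false in this model is a candidate for recreation through Okera if you want tags to cascade to it.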

Warning

Dropping a dataset (table or view), and recreating it, will remove its view lineage.

Creating policies using tags

To learn more about leveraging tags in attribute-based access control policies, see ABAC.

Configuring Auto-tagging

ODAS includes an auto-tagging capability that can help stewards identify common, formatted data, such as phone and credit card numbers, to simplify the effort of classifying new data and writing access policies for it. See the auto-tagging documentation to learn more.

FAQs

How do I delete a namespace?

You cannot delete a namespace that has tags inside it. A namespace will automatically be removed once there are no associated tags.

Can I move a tag from one namespace to another?

This is not supported yet. For now, you will need to delete the tag from the original namespace and recreate it in the new namespace. Note that all assignments of the original tag will be removed, and you will need to reassign the new tag on any objects that had it previously.
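As a rough sketch of this workaround, the helper below generates the statements needed to recreate a tag in a new namespace and reassign it. `move_tag_ddl` is a hypothetical helper. The list of tagged objects is something you would need to collect yourself before dropping the original tag, since its assignments are removed on deletion. The sketch creates and assigns the new tag before dropping the old one so that objects are not left untagged in between. That ordering is a design choice in this example, not a requirement stated by Okera.

```python
def move_tag_ddl(old_attribute, new_attribute, tagged_objects):
    """Build DDL strings to recreate a tag under a new name and reassign it.

    tagged_objects: list of (object_type, object_name, column) tuples,
                    with column set to None for database/table/view-level tags.
    """
    statements = [f"CREATE ATTRIBUTE {new_attribute};"]
    for object_type, object_name, column in tagged_objects:
        if column is None:
            statements.append(
                f"ALTER {object_type} {object_name} ADD ATTRIBUTE {new_attribute};"
            )
        else:
            statements.append(
                f"ALTER {object_type} {object_name} ADD COLUMN ATTRIBUTE {column} {new_attribute};"
            )
    # Dropping the old tag removes its remaining assignments.
    statements.append(f"DROP ATTRIBUTE {old_attribute};")
    return statements
```

Any policies written against the original tag would still need to be recreated against the new one, since deleting a tag voids policies on it.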

Can I update the name of a tag?

This is not supported yet.

Do tags only cascade if assigned through the UI?

No, the tag cascade behavior is consistent regardless of what tool you use to assign or remove tags, whether that's through the UI, or by running DDL, for instance, through PyOkera.

My tag did not cascade down to the view

There are certain scenarios where tags will not cascade, such as if the view contains aggregations or if the view lineage is not present. These are outlined in the Tag inheritance section.