The Tags page allows users with access to view existing tags, create new tags associated with a namespace and delete tags.
Note that a tag is a form of "attribute" within Okera, and the two terms are used interchangeably.
You can assign attributes to objects based on the data they contain. For example, you may want to tag a dataset containing sales data as sales, or tag a column that contains sensitive data accordingly.
This enables the creation of attribute-based access policies based on tags.
Who has access to manage tags?¶
Access to attribute namespaces can be controlled like other objects in the system.
ATTRIBUTE NAMESPACE is an object level that sits under catalog. You can grant the CREATE, ADD_ATTRIBUTE and ALL permission levels on attribute namespaces:
- CREATE: Users can only create attributes inside the specified attribute namespace.
- ADD_ATTRIBUTE: Users can only assign attributes from the specified attribute namespace, on data they have permission to assign on.
- ALL: Users can create, drop and assign attributes inside the specified attribute namespace.
For example, to let a role create, drop and assign attributes from a particular attribute namespace, use the following:
GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward;
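The narrower permission levels can be granted separately when a role should not have full control of a namespace. The following is a minimal sketch, assuming that CREATE and ADD_ATTRIBUTE use the same GRANT form as the ALL example above; the role names tag_author and tag_assigner are hypothetical.

```sql
-- Assumption: CREATE and ADD_ATTRIBUTE take the same GRANT form as ALL.
-- tag_author and tag_assigner are hypothetical role names.

-- Can create tags in the marketing namespace only.
GRANT CREATE ON ATTRIBUTE NAMESPACE marketing TO ROLE tag_author;

-- Can assign existing marketing tags (on data where the role also holds
-- ADD_ATTRIBUTE and REMOVE_ATTRIBUTE), but cannot create or drop tags.
GRANT ADD_ATTRIBUTE ON ATTRIBUTE NAMESPACE marketing TO ROLE tag_assigner;
```

Splitting the two levels this way keeps tag authoring and tag assignment with different roles.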
Controlling who can assign tags on objects¶
In order to actually assign attributes on an object (i.e., a database, dataset or column), the user needs to have:
1. ADD_ATTRIBUTE and REMOVE_ATTRIBUTE permissions on the data they wish to assign on. In order to grant a role the ability to assign a tag on an object, you must grant both add and remove permissions.
2. ADD_ATTRIBUTE on the attribute namespace they wish to assign tags from.
If you only grant #1 without #2, then the user will not have any tags available to assign, since they have not been granted access to any attribute namespaces.
As an example, here's how you could set up a data steward who can only assign tags from the marketing attribute namespace, on data inside the marketing database:
GRANT ALL ON ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward;
GRANT ALL ON DATABASE marketing TO ROLE marketing_steward;
Here's a more granular way of achieving the same use case. The only difference is that the marketing steward does not have ALL access to the database here, so we have to give them explicit permissions to add and remove attributes on the data:
GRANT ALL ON ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward;
GRANT ADD_ATTRIBUTE ON TABLE okera_sample.sample TO ROLE marketing_steward;
GRANT REMOVE_ATTRIBUTE ON TABLE okera_sample.sample TO ROLE marketing_steward;
Example of tag permissions¶
Below is an example of granting permissions for managing tags to a marketing data steward.
CREATE ROLE marketing_steward_role;
GRANT ROLE marketing_steward_role TO GROUP marketing_steward_group;
-- Marketing steward will be able to assign tags on the marketing database
-- since they have ALL access on it
GRANT ALL ON DATABASE marketing TO ROLE marketing_steward_role;
-- Marketing steward will be able to create, drop and assign tags
-- only from the marketing attribute namespace
GRANT ALL ON ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward_role;
Creating a tag¶
Using the UI¶
To create a new tag, select an existing namespace or create a new one. A namespace acts as a category for grouping similar tags. For example, tags associated with security might be grouped under a namespace called 'Security'.
Once you have specified a namespace, write the name of your new tag in the Tag field and click 'Add'. The new tag will appear in your list of existing tags under the namespace provided.
CREATE ATTRIBUTE [IF NOT EXISTS] <namespace.attribute_name>;
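As a concrete instance of the syntax above, the statement below creates one tag. The namespace marketing and the tag name campaign are hypothetical, chosen only for illustration.

```sql
-- Hypothetical names: namespace "marketing", tag "campaign".
-- With IF NOT EXISTS, the statement is presumably a no-op when the tag
-- already exists, which makes setup scripts easier to re-run.
CREATE ATTRIBUTE IF NOT EXISTS marketing.campaign;
```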
Deleting a tag¶
Using the UI¶
To delete a tag, select the tag you wish to delete and click the trash can icon that appears in the top right of the tag details pane.
Select 'delete tag' if you would like to permanently delete a tag.
Deleting a tag will permanently delete all instances of this tag from your data and will void any policies on this tag.
Deleting tags can affect data security and discoverability.
DROP ATTRIBUTE [IF NOT EXISTS] <namespace.attribute>;
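A concrete instance of the statement might look like the following, where marketing.campaign is a hypothetical tag used only for illustration. The optional clause is left out of this sketch.

```sql
-- Hypothetical tag: "campaign" in the "marketing" namespace.
-- As warned above, this permanently removes every assignment of the tag
-- and voids any policies that reference it.
DROP ATTRIBUTE marketing.campaign;
```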
Assigning a tag on an object¶
In order to assign tags on objects you need to have:
- ADD_ATTRIBUTE and REMOVE_ATTRIBUTE privileges on the data you want to assign on
- ADD_ATTRIBUTE privilege on the ATTRIBUTE NAMESPACE you wish to assign from
In the UI¶
To assign a tag from the UI go to the Datasets page and find a dataset or column to tag. To assign tags to a dataset, click on the edit icon next to 'Tags' in the Dataset Details view.
A modal will display a list of tags. Select the checkbox next to a tag to assign it. You may also uncheck a checkbox to remove that tag from the dataset. Click 'Save' when you have finished assigning tags. Tags assigned to a dataset will display on the Dataset Summary Card, as well as in the Dataset Details view.
To assign a tag to a column, open the Dataset Details view and go to the Schema. Under the Tags column, click 'Click to Add' or the edit icon.
Tags cannot be assigned to partitioned columns or to nested fields of complex types.
A modal will display a list of tags. Select the checkbox next to a tag to assign it. You may also uncheck a checkbox to remove that tag from the column. Click 'Save' when you have finished assigning tags. Tags assigned to columns will display in the Dataset Details view, but not on the Dataset Summary Card.
Assign an attribute on a database
ALTER DATABASE <db> ADD|DROP ATTRIBUTE <namespace.attribute>;
Assign an attribute on a table or view
ALTER TABLE|VIEW <dataset_name> ADD|DROP ATTRIBUTE <namespace.attribute>;
Assign an attribute on a column
ALTER TABLE|VIEW <dataset_name> ADD|DROP COLUMN ATTRIBUTE <column_name> <namespace.attribute>;
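The three statement forms above can be filled in as follows. This is a sketch with hypothetical names throughout: a database marketing, a table marketing.leads with a column email, and the tags marketing.campaign and marketing.contact_info.

```sql
-- All object and tag names here are hypothetical.

-- Tag a whole database.
ALTER DATABASE marketing ADD ATTRIBUTE marketing.campaign;

-- Tag a single table.
ALTER TABLE marketing.leads ADD ATTRIBUTE marketing.campaign;

-- Tag one column of that table.
ALTER TABLE marketing.leads ADD COLUMN ATTRIBUTE email marketing.contact_info;

-- Remove the table-level tag again.
ALTER TABLE marketing.leads DROP ATTRIBUTE marketing.campaign;
```

Each statement requires the privileges listed at the start of this section on both the data and the attribute namespace.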
Admins can view assigned attributes as part of the dataset metadata by running describe database <database_name> or describe formatted <dataset_name>.
Tag inheritance¶
By default, any tag assignment or removal operation on a parent dataset will cascade down to its descendant views. This includes assigning tags at the dataset level, as well as at the column level. When a new view is created, it inherits the tags that are on the parent dataset at that point in time.
There are some cases where tags will not cascade:
- If the view transforms the parent in any way (such as aggregations or column data manipulation), none of its tags will cascade.
- For views with joins:
  - Okera will cascade tags only on the directly referenced columns. Take for example:
    CREATE VIEW sales.orders_by_customer AS
    SELECT transactions.userid, customers.customername, transactions.orderdate
    FROM sales.transactions
    INNER JOIN sales.customers ON transactions.userid = customers.userid;
    Only tags on the columns transactions.userid, customers.customername and transactions.orderdate will cascade to the new view. Any tags on customers.userid will not cascade.
  - Tags applied at the table level will not cascade. For example, if one dataset is tagged status:approved and the other is not, the resulting joined view will not have the status:approved tag on it.
- If the view lineage is not present. See the section below.
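To see the join rule in practice, the sketch below tags two parent columns of the sales.orders_by_customer view from the example above and then inspects the view. The tag sales.order_info is hypothetical. The expected outcome described in the comments follows from the rules listed here and is not captured output.

```sql
-- Hypothetical tag "sales.order_info"; the tables and view come from the
-- join example above.

-- Tag a column that the view selects directly.
ALTER TABLE sales.transactions ADD COLUMN ATTRIBUTE orderdate sales.order_info;

-- Tag a column that the view uses only in its join condition.
ALTER TABLE sales.customers ADD COLUMN ATTRIBUTE userid sales.order_info;

-- Inspect the view. Per the rules above, orderdate should carry the tag,
-- while nothing is inherited from customers.userid.
DESCRIBE FORMATTED sales.orders_by_customer;
```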
In order to cascade tags, Okera maintains the view lineage as part of a dataset's metadata to understand where tags have been inherited from. This can be viewed in the UI.
Tags will only cascade if the view lineage is present. View lineage is only available for views created through Okera in release 2.1 or later. If Okera is connected to an external metastore and views were created directly inside that metastore, bypassing Okera, they will not show view lineage. Similarly, any views created through Okera prior to the 2.1 release will not have view lineage. It may be necessary to recreate these views if you want tags to cascade to them.
Dropping a dataset (table or view), and recreating it, will remove its view lineage.
Creating policies using tags¶
To learn more about leveraging tags in attribute-based access control policies, see ABAC.
ODAS includes an auto-tagging capability that can help stewards identify common, formatted data, such as phone and credit card numbers, to simplify the effort of classifying new data and writing access policies for it.
How do I delete a namespace?¶
You cannot delete a namespace that has tags inside it. A namespace will automatically be removed once there are no associated tags.
Can I move a tag from one namespace to another?¶
This is not supported yet. For now, you will need to delete the tag from the original namespace and recreate it in the new namespace. Note that all assignments of the original tag will be removed, and you will need to reassign the new tag on any objects that had it previously.
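The workaround can be sketched as the following sequence. It assumes a hypothetical tag campaign that moves from the marketing namespace to a sales namespace and was assigned to a single hypothetical table, marketing.leads.

```sql
-- All names here are hypothetical.

-- 1. Delete the tag from its original namespace (this removes its assignments).
DROP ATTRIBUTE marketing.campaign;

-- 2. Recreate it in the new namespace.
CREATE ATTRIBUTE sales.campaign;

-- 3. Reassign it on every object that carried the original tag.
ALTER TABLE marketing.leads ADD ATTRIBUTE sales.campaign;
```

Because deleting a tag also voids policies on it, any policies that referenced the original tag would need to be recreated against the new one.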
Can I update the name of a tag?¶
This is not supported yet.
Do tags only cascade if assigned through the UI?¶
No, the tag cascade behavior is consistent regardless of what tool you use to assign or remove tags, whether that's through the UI, or by running DDL, for instance, through PyOkera.
My tag did not cascade down to the view¶
There are certain scenarios where tags will not cascade, such as if the view contains aggregations or if the view lineage is not present. These are outlined in the Tag inheritance section.