Managing Tags¶
The Tags page allows users with access to view existing tags, create new tags associated with a namespace and delete tags.
Note that a tag is a form of "attribute" within Okera, and the two terms may be used interchangeably.
You can assign attributes to objects based on the data they contain, e.g. you may want to tag a dataset containing sales data as sales or a column with sensitive data as pii.
This enables the creation of attribute-based access policies based on tags.
Controlling access to tags¶
Access control for tag management¶
Access to attribute namespaces can be controlled like other objects in the system. ATTRIBUTE NAMESPACE is an object level that sits under catalog. You can grant CREATE, ADD_ATTRIBUTE or ALL permission levels on attribute namespaces.
- CREATE: Users will only be able to create attributes inside the specified attribute namespace.
- ADD_ATTRIBUTE: Users will only be able to assign attributes inside the specified attribute namespace, on data they have permission to assign on.
- ALL: Users will be able to create, drop and assign attributes inside the specified attribute namespace.
For example, if you wanted to give a role access to create, drop and assign attributes from a particular attribute namespace, you would use the following:
GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward;
Granting access to tags page in the UI¶
In addition to this, if you wish to grant access to the tags page in the UI, so that a user can create and manage tags there, grant okera_tags_role to that user's group.
Note, in order to assign attributes on data the user will still need to have the correct privileges on the data they are trying to assign on.
GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward_role;
Controlling who can assign tags on objects¶
In order to actually assign attributes on an object (i.e. a database, dataset or column), the user needs to have:
1. Either ALL, or both ADD_ATTRIBUTE and REMOVE_ATTRIBUTE, permissions on the data they wish to assign on. In order to grant a role the ability to assign a tag on an object, you must grant both add and remove permissions.
2. Either ALL or ADD_ATTRIBUTE on the attribute namespace they wish to assign tags from.
If you only grant #1 without #2, then the user will not have any tags available to assign, since they have not been granted access to any attribute namespaces.
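The two conditions can be read as a simple check. The sketch below is only an illustration of that rule in Python; the function name and the representation of privileges as sets of strings are assumptions made for this example and are not part of any Okera API.

```python
# Hypothetical model of the two conditions above; this is not an Okera API.

def can_assign_tag(data_privileges, namespace_privileges):
    """Return True when both conditions for assigning a tag are satisfied."""
    data = set(data_privileges)
    namespace = set(namespace_privileges)
    # Condition 1: ALL, or both ADD_ATTRIBUTE and REMOVE_ATTRIBUTE, on the data.
    on_data = "ALL" in data or {"ADD_ATTRIBUTE", "REMOVE_ATTRIBUTE"} <= data
    # Condition 2: ALL or ADD_ATTRIBUTE on the attribute namespace.
    on_namespace = bool({"ALL", "ADD_ATTRIBUTE"} & namespace)
    return on_data and on_namespace


# Data privileges alone are not enough: without a namespace grant there are no tags to assign.
print(can_assign_tag({"ALL"}, set()))  # False
# ADD_ATTRIBUTE without REMOVE_ATTRIBUTE on the data is also insufficient.
print(can_assign_tag({"ADD_ATTRIBUTE"}, {"ALL"}))  # False
print(can_assign_tag({"ADD_ATTRIBUTE", "REMOVE_ATTRIBUTE"}, {"ADD_ATTRIBUTE"}))  # True
```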
As an example, here's how you could set up a data steward who can only assign tags from the marketing attribute namespace, on data inside the marketing database:
GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward;
GRANT ALL on DATABASE marketing TO ROLE marketing_steward;
Here's a more granular way of achieving a similar setup. The only difference is that the marketing steward does not have ALL access to the data here (in this example, the okera_sample.sample table), so we have to give them permissions to add and remove attributes on the data.
GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward;
GRANT ADD_ATTRIBUTE ON TABLE okera_sample.sample TO ROLE marketing_steward;
GRANT REMOVE_ATTRIBUTE ON TABLE okera_sample.sample TO ROLE marketing_steward;
Example of tag permissions¶
Below is an example of granting permissions for managing tags to a marketing data steward.
CREATE ROLE marketing_steward_role;
GRANT ROLE marketing_steward_role to group marketing_steward_group;
-- Marketing steward will be able to assign tags on the marketing database
-- since they have ALL access on it
GRANT ALL ON DATABASE marketing TO ROLE marketing_steward_role;
-- Marketing steward will be able to create, drop and assign tags
-- only from the marketing attribute namespace
GRANT ALL on ATTRIBUTE NAMESPACE marketing TO ROLE marketing_steward_role;
-- Grant access to the tags page in the UI
GRANT ROLE okera_tags_role to group marketing_steward_group;
Creating a tag¶
Using the UI¶
To create a new tag, select an existing namespace or create a new one. A namespace acts as a category for grouping similar tags. For example, tags associated with security might be grouped under a namespace called 'Security'.
Once you have specified a namespace, write the name of your new tag in the Tag field and click 'Add'. The new tag will appear in your list of existing tags under the namespace provided.
Using DDL¶
CREATE ATTRIBUTE [IF NOT EXISTS] <namespace.attribute_name>;
Deleting a tag¶
Using the UI¶
To delete a tag, select the tag you wish to delete and click the trash can icon that appears in the top right of the tag details pane.
Select 'delete tag' if you would like to permanently delete a tag.
Deleting a tag will permanently delete all instances of this tag from your data and will void any policies on this tag.
Note
Deleting tags can affect data security and discoverability.
Using DDL¶
DROP ATTRIBUTE [IF NOT EXISTS] <namespace.attribute>;
Assigning a tag on an object¶
In order to assign tags on objects you need to have:
* ALL, or both ADD_ATTRIBUTE and REMOVE_ATTRIBUTE, privileges on the data you want to assign on
* ALL or ADD_ATTRIBUTE privilege on the ATTRIBUTE NAMESPACE you wish to assign from
In the UI¶
To assign a tag from the UI go to the Datasets page and find a dataset or column to tag. To assign tags to a dataset, click on the edit icon next to 'Tags' in the Dataset Details view.
A modal will display a list of tags. Select the checkbox next to a tag to assign it. You may also uncheck a checkbox to remove that tag from the dataset. Click 'Save' when you have finished assigning tags. Tags assigned to a dataset will display on the Dataset Summary Card, as well as in the Dataset Details view.
To assign a tag to a column, open the Dataset Details view and go to the Schema. Under the Tags column, click 'Click to Add' or the edit icon.
Note
Tags cannot be assigned to partitioned columns or to nested fields of complex types.
A modal will display a list of tags. Select the checkbox next to a tag to assign it. You may also uncheck a checkbox to remove that tag from the column. Click 'Save' when you have finished assigning tags. Tags assigned to columns will display in the Dataset Details view, but not on the Dataset Summary Card.
Using DDL¶
Assign an attribute on a database
ALTER DATABASE <db> ADD|DROP ATTRIBUTE <namespace.attribute>;
Assign an attribute on a table or view
ALTER TABLE|VIEW <dataset_name> ADD|DROP ATTRIBUTE <namespace.attribute>;
Assign an attribute on a column
ALTER TABLE|VIEW <dataset_name> ADD|DROP COLUMN ATTRIBUTE <column_name> <namespace.attribute>;
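To make the templates above concrete, here is a minimal Python sketch that fills them in with example values. The helper names are hypothetical, as are the marketing.sales and marketing.pii attributes and the record column; the helpers only build strings and do not connect to Okera.

```python
# Hypothetical helpers that fill in the DDL templates above.
# They do no quoting or escaping, so pass trusted identifiers only.

def _check(value, allowed):
    """Reject anything other than the keywords the templates allow."""
    if value not in allowed:
        raise ValueError(f"expected one of {allowed}, got {value!r}")
    return value


def database_attribute_ddl(action, database, attribute):
    action = _check(action, ("ADD", "DROP"))
    return f"ALTER DATABASE {database} {action} ATTRIBUTE {attribute};"


def dataset_attribute_ddl(action, dataset, attribute, kind="TABLE"):
    action = _check(action, ("ADD", "DROP"))
    kind = _check(kind, ("TABLE", "VIEW"))
    return f"ALTER {kind} {dataset} {action} ATTRIBUTE {attribute};"


def column_attribute_ddl(action, dataset, column, attribute, kind="TABLE"):
    action = _check(action, ("ADD", "DROP"))
    kind = _check(kind, ("TABLE", "VIEW"))
    return f"ALTER {kind} {dataset} {action} COLUMN ATTRIBUTE {column} {attribute};"


print(database_attribute_ddl("ADD", "marketing", "marketing.sales"))
# ALTER DATABASE marketing ADD ATTRIBUTE marketing.sales;
print(dataset_attribute_ddl("DROP", "okera_sample.sample", "marketing.sales"))
# ALTER TABLE okera_sample.sample DROP ATTRIBUTE marketing.sales;
print(column_attribute_ddl("ADD", "okera_sample.sample", "record", "marketing.pii"))
# ALTER TABLE okera_sample.sample ADD COLUMN ATTRIBUTE record marketing.pii;
```

The resulting strings can then be run with whichever tool you normally use to execute DDL.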
Admins can view assigned attributes as part of the dataset metadata by running DESCRIBE DATABASE <database_name> or DESCRIBE FORMATTED <dataset_name>.
Tag inheritance¶
By default any tag assignment or removal operations on a parent dataset will cascade down to its descendant views. This includes assigning tags at the dataset level, as well as at the column level. On creating new views, tags on the parent dataset will be inherited by the newly created view at that point in time.
There are some cases where tags will not cascade:
- If the view transforms the parent in any way (such as aggregations, column data manipulation, etc.), none of its tags will cascade.
- For views with joins:
  - Okera will cascade tags only on the directly referenced columns. Take for example:
    CREATE VIEW sales.orders_by_customer AS
    SELECT transactions.userid, customers.customername, transactions.orderdate
    FROM sales.transactions
    INNER JOIN sales.customers
    ON transactions.userid=customers.userid;
    Only tags on columns transactions.userid, customers.customername and transactions.orderdate will cascade to the new view. Any tags on customers.userid will not cascade.
  - Tags applied at the table level will not cascade, e.g. if one dataset was tagged status:approved and the other was not, the resulting joined view will not have the status:approved tag on it.
- If the view lineage is not present. See the section below.
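The column rule for join views can be illustrated with a small Python sketch. It assumes that "directly referenced" means the column appears in the view's select list, as in the example above; the function name and the tag names are hypothetical, and this is a model of the described behavior rather than Okera's implementation.

```python
# Hypothetical model of column-level tag cascading for a join view.
# Assumption: only columns in the view's select list count as directly referenced.

def cascaded_column_tags(selected_columns, source_column_tags):
    """Map each view column to the tags inherited from its source column."""
    result = {}
    for qualified in selected_columns:
        view_column = qualified.split(".")[-1]  # e.g. "transactions.userid" -> "userid"
        result[view_column] = set(source_column_tags.get(qualified, ()))
    return result


# Tags on the source columns (tag names are made up for this example).
source_tags = {
    "transactions.userid": {"marketing.user_id"},
    "customers.userid": {"marketing.pii"},  # used only in the join condition
    "customers.customername": {"marketing.pii"},
}
selected = ["transactions.userid", "customers.customername", "transactions.orderdate"]

view_tags = cascaded_column_tags(selected, source_tags)
# The view's userid column inherits only the tag from transactions.userid;
# the tag on customers.userid does not cascade.
print(view_tags["userid"])  # {'marketing.user_id'}
```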
View lineage¶
In order to cascade tags, Okera maintains the view lineage as part of a dataset's metadata to understand where tags have been inherited from. This can be viewed in the UI.
Tags will only cascade if the view lineage is present. View lineage is only available for views created through Okera in release 2.1 or later. If Okera is connected to an external metastore, and views were created directly inside that metastore, bypassing Okera, they will not show view lineage. Similarly, any views created through Okera prior to the 2.1 release will not have view lineage. It may be necessary to recreate these views if you wish tags to cascade on them.
Warning
Dropping a dataset (table or view), and recreating it, will remove its view lineage.
Creating policies using tags¶
To learn more about leveraging tags in attribute-based access control policies, see ABAC.
Configuring Auto-tagging¶
ODAS includes an auto-tagging capability that can help stewards identify common, formatted data, such as phone and credit card numbers, to simplify the effort of classifying new data and writing access policies for it. Learn more in the auto-tagging documentation.
FAQs¶
How do I delete a namespace?¶
You cannot delete a namespace that has tags inside it. A namespace will automatically be removed once there are no associated tags.
Can I move a tag from one namespace to another?¶
This is not supported yet. For now, you will need to delete the tag from the original namespace and recreate it in the new namespace. Note that all assignments of the original tag will be removed, and you will need to reassign the new tag on any objects that had it previously.
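As a rough sketch of this workaround, the Python helper below generates the DDL statements in the order described: drop the old tag, create the new one, then reassign it. The function name is hypothetical, and it assumes you have already recorded which tables carried the old tag, since those assignments are lost when the tag is dropped. It covers table-level assignments only; databases, views and columns would need the corresponding ALTER forms.

```python
# Hypothetical helper: builds the statements for "moving" a tag between namespaces.
# It only produces strings; identifiers are not quoted or escaped.

def move_tag_ddl(old_tag, new_tag, tagged_tables):
    """Return the DDL statements, in order, for recreating a tag in a new namespace."""
    statements = [
        f"DROP ATTRIBUTE {old_tag};",  # also removes every assignment of old_tag
        f"CREATE ATTRIBUTE IF NOT EXISTS {new_tag};",
    ]
    # Reassign the new tag on each table that previously carried the old one.
    statements += [f"ALTER TABLE {table} ADD ATTRIBUTE {new_tag};" for table in tagged_tables]
    return statements


for statement in move_tag_ddl("marketing.pii", "security.pii", ["okera_sample.sample"]):
    print(statement)
# DROP ATTRIBUTE marketing.pii;
# CREATE ATTRIBUTE IF NOT EXISTS security.pii;
# ALTER TABLE okera_sample.sample ADD ATTRIBUTE security.pii;
```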
Can I update the name of a tag?¶
This is not supported yet.
Do tags only cascade if assigned through the UI?¶
No, the tag cascade behavior is consistent regardless of what tool you use to assign or remove tags, whether that's through the UI, or by running DDL, for instance, through PyOkera.
My tag did not cascade down to the view¶
There are certain scenarios where tags will not cascade, such as if the view contains aggregations or if the view lineage is not present. These are outlined in the Tag inheritance section.