Explore the Catalog¶
Okera supports a wide range of data sources, including cloud object storage (such as Amazon S3, ADLS, or GS), data warehouses (such as Snowflake or Amazon Redshift), and relational databases (such as Postgres). You can also integrate an existing Hive metastore or connect to AWS Glue.
In this section of the Okera test drive, we explore the data provided with the test drive and suggest how you might want to protect it.
Okera's catalog is the collection of technical metadata for the data you wish to protect. Each user who views the catalog should only see the data to which they have been granted access via their Okera permissions.
Terminology¶
Here are some key Okera terms you need to understand. These terms do not necessarily match similar terms used by your data stores.
Okera Term | Description |
---|---|
Database | A logical or virtual collection of datasets inside the Okera catalog. A database is analogous to a schema in some relational databases. When integrating with Snowflake, the Snowflake schemas become databases in Okera. |
Dataset | A table or view of data registered in Okera’s catalog. When integrating with Snowflake, the Snowflake tables become datasets in Okera. |
Schema | A column definition for an Okera dataset. |
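
As a rough illustration of how these three terms nest, the following Python sketch models a database as a collection of datasets, and a dataset's schema as its list of column definitions. The class names and the `qualified_names` helper are hypothetical and are not part of any Okera API; only the database, dataset, and column names come from this test drive, and the columns shown are just the ones mentioned on this page, not the full schemas.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A table or view registered in the catalog (hypothetical model)."""
    name: str
    # The "schema" is the dataset's column definitions; only names are modeled here.
    schema: list = field(default_factory=list)


@dataclass
class Database:
    """A logical collection of datasets (hypothetical model)."""
    name: str
    datasets: dict = field(default_factory=dict)

    def add(self, dataset: Dataset) -> None:
        self.datasets[dataset.name] = dataset

    def qualified_names(self) -> list:
        # Datasets are referenced as <database>.<dataset>, e.g. sales.transactions.
        return sorted(f"{self.name}.{name}" for name in self.datasets)


sales = Database("sales")
# Only the columns mentioned in this section are listed; the real schemas are larger.
sales.add(Dataset("transactions", schema=["region", "country"]))
sales.add(Dataset("customer_contact_details",
                  schema=["contactname", "email", "phone", "address"]))

print(sales.qualified_names())
```

The qualified names printed here match the `sales.transactions` and `sales.customer_contact_details` identifiers used in the steps that follow.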
Catalog Exploration Steps¶
- Log in to your Okera test drive environment as the `admin_<tenant>` user. The password is provided in your test drive email.
- Select Explore my data catalog on the home page or select Data in the left-hand menu to see the catalog of the sample databases supplied with this Okera environment.
- Select the `sales` database. Two datasets (tables) included in the `sales` database are listed:
  - The `transactions` dataset records global sales transactions.
  - The `customer_contact_details` dataset contains sensitive information for each customer contact.
- Select the eye icon at the end of a dataset row to preview the data inside a table. Select the name of the dataset to see the table schema, permissions, and other details. Select `sales.customer_contact_details` and `sales.transactions` in the image below to see the schema for each dataset.

  Figures: Datasets; Schema for the customer_contact_details Dataset; Schema for the transactions Dataset.

Using Okera, we'd like to restrict the data that can be viewed in these ways:
- Sales directors should only see data pertaining to the territories they manage. The `region` field in the `transactions` dataset is used in this test drive to set up permissions that restrict the territory data that sales directors see.
- Sales analysts should only see data pertaining to the geographical region relevant to them. The `country` field in the `transactions` dataset is used in this test drive to set up permissions that restrict the country data that sales analysts see.
- Sales analysts should not see data that is classified as sensitive or data that contains personally identifying information (PII). Okera tags and permissions are used to mask the data contained in the `contactname`, `email`, `phone`, and `address` fields in the `customer_contact_details` dataset, which contain PII.
- Additional restrictions are applied to the Snowflake data in the Snowflake unit of this test drive. See Use Snowflake Policy Synchronization.
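
The first three restrictions above combine row-level filtering (by `region` or `country`) with column-level masking (of the PII fields). The plain-Python sketch below shows the intended outcome only; it is not how Okera enforces permissions and it uses no Okera API. The helper names, the sample rows, the mask string, and the `txn_id`, `amount`, and `customer_id` columns are all hypothetical; only `region`, `country`, and the four PII field names come from this page.

```python
# PII fields in customer_contact_details, as named in this section.
PII_FIELDS = {"contactname", "email", "phone", "address"}


def filter_rows(rows, field_name, allowed_values):
    """Row-level restriction: keep only rows whose field is in the allowed set."""
    return [row for row in rows if row[field_name] in allowed_values]


def mask_pii(row, pii_fields=PII_FIELDS, mask="XXXX"):
    """Column-level restriction: replace PII values with a fixed mask (hypothetical)."""
    return {key: (mask if key in pii_fields else value) for key, value in row.items()}


# Hypothetical sample data; the real test drive tables have different contents.
transactions = [
    {"txn_id": 1, "region": "EMEA", "country": "France", "amount": 120.0},
    {"txn_id": 2, "region": "EMEA", "country": "Germany", "amount": 75.5},
    {"txn_id": 3, "region": "APAC", "country": "Japan", "amount": 310.0},
]
contact = {
    "customer_id": 7,
    "contactname": "Jane Doe",
    "email": "jane@example.com",
    "phone": "555-0100",
    "address": "1 Main St",
}

# A sales director who manages EMEA sees only EMEA rows.
director_view = filter_rows(transactions, "region", {"EMEA"})

# A sales analyst assigned to France sees only French rows, and never raw PII.
analyst_view = filter_rows(transactions, "country", {"France"})
analyst_contact = mask_pii(contact)

print([row["txn_id"] for row in director_view])
print([row["txn_id"] for row in analyst_view])
print(analyst_contact)
```

In the test drive itself, these outcomes are configured through Okera tags and permissions rather than application code, so consumers of the data do not need to apply any filtering themselves.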
Next Step¶
-