Security Functions

The following security related functions are available in ODAS. These can also be found in Privacy and Security Functions

Function Description
autotag() Apply the autotagging rules to the given string
diff_privacy() Enables differential privacy for numeric values
fnv_hash() Hashes the given value using the FNV algorithm
fp_noref_tokenize Replaces value with a format preserving token that does not maintain referential integrity.
fp_ref_tokenize Replaces value with a format preserving token that maintains referential integrity for all types.
fpt() Same as fp_noref_tokenize
fpt_ref() Same as fp_ref_tokenize
get_groups() Returns the groups of the given user/group
get_roles() Returns the roles of the given user/group
get_tags() Returns the tags associated with a schema registry object
has_access() Returns true if the user has access to a specified resource, else returns false
has_roles() Returns true if the user has the specified roles, else returns false
mask() Masks a value, crossing out alpha-numerical characters
mask_ccn() Masks a credit card number, crossing out the leading digits
nfp_ref_tokenize Replaces value with a non-format preserving token that maintains referential integrity.
nfp_noref_tokenize Replaces value with a non-format preserving token that does not maintain referential integrity.
random_ccn() Returns a random credit card number
tokenize() Replaces value with a format preserving token that maintains referential integrity for STRING & BIGINT types only.

The following explains some of the functions and statements in more detail.

  • <numeric_type> diff_privacy(<numeric_type> <value>, INT <noise>)

    Allows to add a statistical noise (or jitter) to a numeric value, while retaining the same properties when aggregated. For example, it is possible to add +/- 10 years to the individual values of a column containing birthdays, but when calculating the average age across all records you still get close to the same result.

    Careful attention has been taken to ensure that this is not susceptible to repeated queries. For example a query like SELECT salary WHERE id = 1234 will always return the same value, not a normally distributed random value around the true one. Otherwise, a malicious user could simply rerun queries and average results. This is implemented by making sure the "same query" always returns the same noise,that is, you get no additional information from repeated runs.

    Example: Using the diff_privacy() function

    > SELECT diff_privacy(50, 10);
    54
    
  • STRING autotag(STRING)

    Allows to apply the auto-tagging rules to a given string.

    Added in ODAS 1.6.0.

    > SELECT autotag("125.1.10.34")
    pii.ipv4
    
  • BIGINT fnv_hash(<primitive_type>)

    Applies a FNV based hash algorithm to the input value. Similar to MD5 but non-cryptographic and faster. The resulting value is not reversible, but for a specific input the resulting value is always the same.

    Example: Using the fnv_hash() function

    > SELECT fnv_hash("125.1.10.34")
    -7425713518659373000
    
  • STRING fp_random(STRING)

    Similar to tokenize() and fpt_ref(), but with random values for every time the function is invoked, no matter if the original string stays the same or not.

    Example: Using the fp_random() function

    > SELECT fp_random("125.1.10.34")
    077.2.58.52
    > SELECT fp_random("125.1.10.34")
    568.8.64.21
    
  • STRING get_groups(STRING)

    Returns the groups a given user is part of.

    Added in ODAS 1.6.0.

    > SELECT get_groups("analyst")
    analyst,mktg_analyst
    
  • STRING get_roles(STRING)

    Returns the roles a given user has access to.

    Added in ODAS 1.6.0.

    > SELECT get_roles("analyst")
    mktg_analyst_role,okera_public_role,okera_workspace_role
    
  • STRING get_tags(STRING)

    Returns the tags a given resources (such as a database or dataset) is associated with.

    Added in ODAS 1.6.0.

    > SELECT get_tags("customer.account_address_created");
    dog.labradoodle,feline.lion
    
  • BOOLEAN has_access(STRING) and BOOLEAN has_roles(STRING)

    See Authorization Built-ins for more details.

    The has_access() function allows performing conditional checks. It returns true if the current user has access to the given catalog object, otherwise returns false.

    Example: Using the has_access() function

    > SELECT has_access('okera_sample.users');
    true
    

    The has_roles() function returns true when the current user is granted all of the listed roles, specified as a comma-separated list.

    Example: Using the has_roles() function

    > SELECT has_roles('dev_role');
    false -- dev_role is _not_ granted to the current user
    
    > SELECT has_roles('sales_role');
    true -- sales_role is granted to the current user
    
    > SELECT has_roles('sales,role,dev_role');
    false -- since dev_role is _not_ granted to the current user
    
  • STRING mask(STRING, INT <start>, INT <length>)

    Masks the characters in a given string value starting at position <start> (1-based) and ending after <length> characters.

    Example: Using the mask() function

    > SELECT mask("5000 4000 3000 1000", 6, 4);
    5000 XXXX 3000 1000
    
    > SELECT mask("5000 4000 3000 1000", 6, 9);
    5000 XXXXXXXXX 1000
    
  • STRING mask_ccn(STRING)

    Masks all but that last four digits of a given credit card number.

    Example: Using the mask_ccn() function

    > SELECT mask_ccn("5000 4000 3000 1000")
    XXXXXXXXXXXXXXX1000
    
    > SELECT mask_ccn("5000400030001000")
    XXXXXXXXXXXX1000
    
    > SELECT mask_ccn("50004000300010001")
    XXXXXXXXXXXXX0001
    
  • STRING tokenize(STRING) and STRING fpt_ref(STRING)

    Applies a format-preserving transformation to the given text. This means, all alpha-numerical characters are replaced with random ones. All non-alpha-numerical characters are left as-is. The random generator used to determine the new characters is seeded with the original value, meaning the resulting tokenized value is always the same for a specific original value.

    Example: Using the tokenize() and fpt_ref() function

    > SELECT tokenize("125.1.10.34")
    847.9.84.92
    
    > SELECT fpt_ref("125.1.10.34")
    847.9.84.92
    
  • STRING random_ccn()

    Returns a random credit card number, divided by dashes.

    Example: Using the random_ccn() function

    > SELECT random_ccn();
    5580-5293-0763-8227
    
    > SELECT random_ccn();
    3484-8051-1437-2629