Okera Version 2.14 Release Notes

This topic provides Release Notes for all 2.14 versions of Okera.

2.14.3 (1/20/2023)

Security Vulnerabilities (CVEs/CWEs) Addressed

Okera uses Snyk and GitHub Advanced Security for security vulnerability scanning.

Bug Fixes and Improvements

  • Optimized the performance of Okera's getPartitions() API endpoint, resulting in lower latency and load on the catalog database.

  • Improved the performance of SHOW CREATE TABLE statements.

  • Fixed a bug that caused null pointer exceptions after an upgrade from Okera 2.11.x and caused problems for non-admin users logging in to the UI.

  • Fixed page errors that occurred when there were conflicts creating permissions.

2.14.2 (12/19/2022)

OkeraEnsemble Updates

Okera has updated how you should deploy OkeraEnsemble nScale mode with Amazon EMR 5 and Amazon EMR 6. Differences between the two Amazon EMR versions require different deployment settings.

When deploying OkeraEnsemble nScale in an Amazon EMR 5 environment, set the core-site.xml property fs.s3a.s3.client.factory.impl to org.apache.hadoop.fs.s3a.OkeraS3ClientFactory. When deploying OkeraEnsemble nScale in an Amazon EMR 6 environment, set the same property to com.okera.recordservice.hadoop.OkeraS3ClientFactory.
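
For illustration, the minimal sketch below launches an Amazon EMR 6 cluster with boto3 and passes the factory class through a core-site configuration classification. The cluster name, sizing, region, and IAM roles are placeholders, not Okera requirements; for Amazon EMR 5, substitute org.apache.hadoop.fs.s3a.OkeraS3ClientFactory.

    # Hypothetical sketch: launch an Amazon EMR 6 cluster with the OkeraEnsemble
    # nScale S3 client factory set via a "core-site" configuration classification.
    # Names, sizing, region, and IAM roles are placeholders.
    import boto3

    emr = boto3.client("emr", region_name="us-east-1")  # region is an assumption

    core_site = {
        "Classification": "core-site",
        "Properties": {
            # EMR 6: com.okera.recordservice.hadoop.OkeraS3ClientFactory
            # EMR 5: org.apache.hadoop.fs.s3a.OkeraS3ClientFactory
            "fs.s3a.s3.client.factory.impl":
                "com.okera.recordservice.hadoop.OkeraS3ClientFactory",
        },
    }

    emr.run_job_flow(
        Name="okera-nscale-emr6",                 # placeholder cluster name
        ReleaseLabel="emr-6.5.0",
        Applications=[{"Name": "Spark"}, {"Name": "Hive"}],
        Configurations=[core_site],
        Instances={
            "MasterInstanceType": "m5.xlarge",    # placeholder sizing
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        JobFlowRole="EMR_EC2_DefaultRole",        # default EMR roles assumed
        ServiceRole="EMR_DefaultRole",
    )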

For more information, see OkeraEnsemble nScale Mode Deployment in Amazon EMR Environments.

2.14.1 (12/10/2022)

Amazon EMR 6.5.0 and Spark 3.1.2 Support

With this release, Okera supports Amazon EMR 6.5 and Spark 3.1.2 environments, with one current limitation: you cannot perform an insert operation on a non-TEXT-format partitioned table (for example, ORC, Parquet, or Avro) when the Hive recordservice.spark.client-bypass configuration setting is set to true (currently a requirement when writing to a SQL table using spark.sql on an Okera-integrated Amazon EMR cluster).
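
For context, the sketch below shows the pattern affected by this limitation, assuming an Okera-integrated Amazon EMR 6.5 cluster with recordservice.spark.client-bypass set to true. The database, table, and column names are illustrative only.

    # Illustrative PySpark sketch of the limitation above; table and column
    # names are placeholders, not part of the Okera product.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("okera-emr65-insert").getOrCreate()

    # Inserting into a TEXT-format partitioned table works with client-bypass=true.
    spark.sql("""
        INSERT INTO demo_db.events_text PARTITION (dt='2022-12-10')
        SELECT event_id, event_name FROM demo_db.events_staging
    """)

    # Inserting into an ORC, Parquet, or Avro partitioned table is not supported
    # in this configuration and is expected to fail at this time.
    # spark.sql("""
    #     INSERT INTO demo_db.events_parquet PARTITION (dt='2022-12-10')
    #     SELECT event_id, event_name FROM demo_db.events_staging
    # """)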

OkeraEnsemble Updates

OkeraEnsemble now supports RSA256 as a JWT algorithm. In past releases, only RSA512 was supported, although Okera itself has always supported both RSA256 and RSA512. The algorithm type used in your environment should be set using the JWT_ALGORITHM configuration parameter.
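
If you are unsure which value to use, one approach is to inspect the algorithm header of a token already issued in your environment. The sketch below uses PyJWT; mapping the standard RS256/RS512 header values to Okera's RSA256/RSA512 settings is an assumption, not documented behavior.

    # Sketch: inspect a JWT's signing algorithm to choose the JWT_ALGORITHM value.
    # The RS256 -> RSA256 / RS512 -> RSA512 mapping is an assumption.
    import jwt  # PyJWT

    token = "<paste an existing JWT issued in your environment>"  # placeholder
    alg = jwt.get_unverified_header(token)["alg"]   # e.g. "RS256" or "RS512"
    print(f"Token is signed with {alg}; set JWT_ALGORITHM accordingly.")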

BigQuery Updates

The following updates have been made for BigQuery connections in this release:

  • You can now inject the Okera connection query ID into BigQuery history. Because the same ID appears in the Okera audit logs, it can be used to correlate the BigQuery project history with Okera audit log entries.

    To support this functionality, a new connection configuration parameter, inject.query-id, has been added. Valid values are true (enable Okera ID injection) and false (disable Okera ID injection). When enabled for a connection, the ID is injected as a comment in the Okera-generated SQL sent to the connection and appears in BigQuery history. For most connections, the default for inject.query-id is false, but for BigQuery connections, the default is true. See Inject the Okera Connection Query ID Into BigQuery History. A correlation sketch follows this list.

  • You can now register cross-project BigQuery tables from the same Okera connection. For example, using a single connection that references one BigQuery project, you can create a second Okera crawler to crawl the same connection using a second BigQuery project. This means you no longer need to define multiple BigQuery connections in Okera, which allows Dataproc cross-project join queries to complete successfully. It also enables cross-project joins using Presto pushdown, which moves the compute to the BigQuery engine and away from the Okera Enforcement Fleet (workers). Finally, it reduces your BigQuery chargeback complexity because all queries are consolidated into a single Okera connection.
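
As a rough illustration of the query ID correlation described in the first bullet, the sketch below scans recent BigQuery job history for a given Okera connection query ID, assuming the ID appears as a comment in the Okera-generated SQL. The google-cloud-bigquery client, project name, and ID value are placeholders.

    # Sketch: find BigQuery jobs whose SQL contains a given Okera connection
    # query ID (assumed to be injected as a comment when inject.query-id=true).
    from google.cloud import bigquery

    okera_query_id = "<okera-connection-query-id>"      # placeholder value
    client = bigquery.Client(project="my-bq-project")   # placeholder project

    for job in client.list_jobs(max_results=200, all_users=True):
        sql = getattr(job, "query", None)                # only query jobs carry SQL
        if sql and okera_query_id in sql:
            print(job.job_id, job.created, job.user_email)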

Security Vulnerabilities (CVEs/CWEs) Addressed

Okera uses Snyk and GitHub Advanced Security for security vulnerability scanning.

2.14.0 (11/30/2022)

OkeraEnsemble (OkeraFS) General Updates

The following updates were made to OkeraEnsemble in this release.

OkeraEnsemble UI and ABAC Support (Preview Feature)

OkeraEnsemble extends Okera's pre-existing fine-grained access controls to unstructured data (URIs). Unstructured data is data that cannot be mapped to a tabular structure, such as a library of images (for example, medical X-rays) or an individual video or sound file. With this release, you can register and apply permissions to your unstructured data using OkeraEnsemble and the Okera UI. Your unstructured data can be tagged, and access to it can be controlled using Okera's attribute-based access control (ABAC). For more information, see Register Unstructured Data URIs.

OkeraEnsemble System Token Generation Changes in Default Mode

With this release, the system token required by OkeraEnsemble in default (non-nScale) mode is no longer automatically generated from a private key.

When deployed in nScale mode, the OkeraEnsemble access proxy requires a JWT token to authenticate to the Okera cluster. This token can be provided using either the JWT_PRIVATE_KEY or SYSTEM_TOKEN configuration parameter. When the JWT_PRIVATE_KEY configuration parameter is specified, the OkeraEnsemble access proxy automatically generates its own JWT token with the provided private key. When the SYSTEM_TOKEN configuration parameter is specified, it defines the location of the system JWT token file. If both are specified, JWT_PRIVATE_KEY takes precedence and is used, by default, to generate the required JWT token.

However, when OkeraEnsemble is deployed in default (non-nScale) mode on the Okera cluster, the access proxy now defaults to using the token defined by the SYSTEM_TOKEN configuration parameter. It no longer generates the token from a private key.

Apache Ranger Migration Script (Preview Feature)

With this release, Okera provides a Python script you can use to extract Hive, Hadoop Distributed File System (HDFS), and Starburst Enterprise (Trino) policies from an Apache Ranger policy server. The script connects to Ranger, queries for the policies, and generates equivalent Okera DDL in a JSON file. The script can also automatically run the resulting Okera DDL against a running Okera cluster. For more information, see Apache Ranger Migration.
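
This is not the Okera-provided script, but purely as an illustration of the kind of extraction it performs, the sketch below lists Hive policies from Apache Ranger's public REST API. The endpoint path, basic-authentication credentials, and policy field names are assumptions and may vary by Ranger version.

    # Illustration only (not Okera's migration script): list Hive policies from
    # an Apache Ranger server via its public v2 REST API. The endpoint path,
    # credentials, and field names below are assumptions that may vary by version.
    import requests

    RANGER_URL = "https://ranger.example.com:6182"      # placeholder URL
    AUTH = ("admin", "<password>")                       # placeholder credentials

    resp = requests.get(f"{RANGER_URL}/service/public/v2/api/policy", auth=AUTH)
    resp.raise_for_status()

    for policy in resp.json():
        if policy.get("serviceType") == "hive":          # field name is an assumption
            print(policy.get("name"), policy.get("resources"))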

Tag Updates in the UI

This release introduces a new search bar on the Tags page in the UI. Use this search bar to locate a tag listed on the tags page. For more information, see Search for Tags.

UI Changes

  • To support OkeraEnsemble's use in the UI, a new Files tab has been added to the Databases page. From this tab you can register unstructured data (URIs or files), tag it, and apply access permissions to it. See Register Unstructured Data (URIs and Files).

  • The title of the Databases page changed to Data because both structured data (tables) and unstructured data (files and URIs) are now supported.

  • The names of the following buttons or options changed:

    • The Create new connection button changed to Create connection.
    • One of the options shown when creating autotags was renamed.
    • The Save button on the Create auto tagging rule dialog changed to Create.
    • The Add button on the Create tag dialog changed to Create.
  • The following dialog titles have changed:

    • The Create new connection dialog title changed to Create connection.
    • The New automatic tagging rule dialog title changed to Create auto tagging rule.
    • The Editing automatic tagging rule dialog title changed to Editing auto tagging rule.

Casing Changes for Table Name Lookups

With this release, Okera performs table name lookups in a case-insensitive manner. For example, if you query user_Table_1 when the Okera crawler found user_TABLE_1, Okera now generates a query against user_TABLE_1 rather than returning a "no object" response.

Warning

If you have multiple objects with the same name that are distinguished only by casing, Okera provides no means to differentiate between them. For example, if TABLE_1 and table_1 are defined in the same database, a query against table_1, tABLE_1, TABLE_1, or any other casing permutation maps to either TABLE_1 or table_1; which table it maps to is determined by the original database scan and cannot be controlled. Further note that Okera may generate queries with quoted identifiers, so an input query that does not specify casing may generate a more case-specific query. For example, if Okera has table_1 defined (unquoted) and is presented with a query that reads select id from TabLe_1, Okera may generate and issue the more specific query: select "id" from "table_1".

Athena Connection Changes

When creating or editing an Athena connection, the default source schema field is now optional.

API Updates

  • A new API endpoint, /api/v2/tags/{name}/tagging-rules, with four methods (GET, POST, PUT, and DELETE), was introduced in this release. Use this endpoint to list, create, update, and delete tagging rules.

  • A new API endpoint, /api/v2/uri/, with GET and POST methods, has been added in this release. Use this endpoint to list, fetch, and register unstructured data URIs.

For information about any Okera API endpoint, see the Okera API documentation, available after you log in to the web UI by appending /api/v2-docs/api/ after the web UI port number (8083). For example: https://my.okera.installation:8083/api/v2-docs/api/.
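
For example, the following sketch calls the two new endpoints with Python's requests library. The host, port, bearer-token authentication, and tag name are assumptions about your deployment, not documented requirements.

    # Sketch: call the new endpoints with the requests library. Host, port,
    # token, and tag name are placeholders; bearer-token auth is an assumption.
    import requests

    BASE = "https://my.okera.installation:8083/api/v2"
    HEADERS = {"Authorization": "Bearer <your-JWT-token>"}   # placeholder token

    # List tagging rules for a tag (GET /api/v2/tags/{name}/tagging-rules).
    rules = requests.get(f"{BASE}/tags/sensitive.pii/tagging-rules", headers=HEADERS)
    print(rules.status_code, rules.json())

    # List registered unstructured data URIs (GET /api/v2/uri/).
    uris = requests.get(f"{BASE}/uri/", headers=HEADERS)
    print(uris.status_code, uris.json())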

Security Vulnerabilities (CVEs/CWEs) Addressed

Okera uses Snyk and GitHub Advanced Security for security vulnerability scanning.

Bug Fixes and Improvements

  • Fixed an issue where some user attributes were missing in the Okera UI after they were added using the DDL.
  • Fixed the pagination of the list of roles on the Roles page.
  • Fixed a bug that occurred during local worker startup in a Google Cloud Platform Dataproc environment running in nScale mode.

  • Fixed a bug in Okera's phi_zip3 transformations so that string zip codes are correctly deidentified to their first three digits. In addition, numeric zip codes that fall outside the range of 0 through 99999 now return a null value rather than the original value. In other words, phi_zip3 transformations are only supported for five-digit numeric zip codes (if the zip codes are in string format, this limitation does not apply). See the sketch after this list.

  • Fixed an issue where users were required to have write privileges to view metadata.
  • Fixed an issue where the property okera.external.view in Databricks environments did not always match the value of the cerebro.external.view property.
  • Fixed an issue in which a crawler failed even when abort on error was set to false.
  • Fixed an out-of-resources error that occurred with the OkeraEnsemble access proxy.
  • Fixed the no module named ez_setup errors you might have encountered when installing the OkeraEnsemble plugin.
  • Corrected a bug in error handling during dataset renames.
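
For reference, the following is a plain-Python model of the phi_zip3 behavior described in the bullet above, not Okera's implementation; it is only meant to make the string-versus-numeric distinction concrete, and the zero-padding of numeric values to five digits is an assumption.

    # Plain-Python model of the described phi_zip3 behavior (not Okera's code):
    # numeric zips outside 0..99999 return null; in-range numeric zips keep the
    # first three digits; string zips are truncated to their first three characters.
    from typing import Optional, Union

    def phi_zip3_model(zip_code: Union[int, str, None]) -> Optional[str]:
        if zip_code is None:
            return None
        if isinstance(zip_code, int):
            if zip_code < 0 or zip_code > 99999:
                return None                   # out-of-range numeric zips return null
            return f"{zip_code:05d}"[:3]      # zero-padding to five digits is an assumption
        return str(zip_code)[:3]              # string zips: no range restriction

    print(phi_zip3_model(2134))     # -> "021" (with the zero-padding assumption)
    print(phi_zip3_model(123456))   # -> None
    print(phi_zip3_model("02134"))  # -> "021"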