Okera Version 2.11 Release Notes¶

This topic provides Release Notes for all 2.11 versions of Okera.

2.11.8 (1/18/2023)¶

Bug Fixes and Improvements¶

Optimized the performance of Okera's getPartitions() API endpoint, resulting in lower latency and load on the catalog database.

Fixed page errors that occurred when there were conflicts creating permissions.

2.11.7 (1/9/2023)¶

Security Vulnerabilities (CVEs/CWEs) Addressed¶

CVE-2022-41946 Information Exposure
CVE-2022-42898 Integer Overflow or Wraparound
CVE-2022-45061 Resource Exhaustion

Okera uses Snyk and GitHub Advanced Security for security vulnerability scanning.

Bug Fixes and Improvements¶

Fixed a bug where a crawler ignored the default schema specified for an Athena connection.

2.11.6 (12/9/2022)¶

Athena Connection Changes¶

When creating or editing an Athena connection, the default source schema field is now optional.

Security Vulnerabilities (CVEs/CWEs) Addressed¶

Alpine-13661 Alpine314: Alpine-13661
CVE-2018-25032 Alpine314: Out-of-bounds Write
CVE-2021-46828 Alpine314: Allocation of Resources Without Limits or Throttling
CVE-2022-0778 Alpine314: Loop with Unreachable Exit Condition ('Infinite Loop')
CVE-2022-1097 Alpine314: OpenJDK
CVE-2022-1271 Alpine314: Improper Input Validation
CVE-2022-2309 Alpine314: NULL Pointer Dereference
CVE-2022-3970 Numeric Errors
CVE-2022-21540 Alpine315: OpenJDK
CVE-2022-21541 Alpine315: OpenJDK
CVE-2022-21549 Alpine315: OpenJDK
CVE-2022-21619 Alpine315: OpenJDK
CVE-2022-21624 Alpine315: OpenJDK
CVE-2022-21626 Alpine315: OpenJDK
CVE-2022-21628 Alpine315: OpenJDK
CVE-2022-27404 Alpine314: Out-of-bounds Write
CVE-2022-27405 Alpine314: Out-of-bounds Read
CVE-2022-27406 Alpine314: Out-of-bounds Read
CVE-2022-27774 Alpine314: Insufficiently Protected Credentials
CVE-2022-27776 Alpine314: Insufficiently Protected Credentials
CVE-2022-28391 Alpine314: BusyBox
CVE-2022-29824 Alpine314: Integer Overflow or Wraparound
CVE-2022-35252 Alpine314: Curl
CVE-2022-39399 Alpine315: OpenJDK
CVE-2022-40303 Alpine314: Integer Overflow or Wraparound
CVE-2022-40304 Alpine314: XML External Entity (XXE) Injection
CVE-2022-40674 Alpine314: Use After Free
CVE-2022-42898 Alpine315: KRB5
CVE-2022-43680 Alpine314: Use After Free

Okera uses Snyk and GitHub Advanced Security for security vulnerability scanning.

Bug Fixes and Improvements¶

Fixed an issue where the property okera.external.view in Databricks environments did not always match the value of the cerebro.external.view property.

2.11.5 (11/15/2022)¶

Security Vulnerabilities (CVEs/CWEs) Addressed¶

CVE-2020-16156 Ubuntu 18.04 - Perl (Improper Verification of Cryptographic Signature)
CVE-2021-43618 Ubuntu 18.04 - gmp (Integer Overflow or Wraparound)
CVE-2021-46848 Alpine Curl - Out-of-bounds Read
CVE-2022-21589 Ubuntu 18.04 - MySQL Server Vulnerability
CVE-2022-21592 Ubuntu 18.04 - MySQL Server Vulnerability
CVE-2022-21608 Ubuntu 18.04 - MySQL Server Vulnerability
CVE-2022-21617 Ubuntu 18.04 - MySQL Server Vulnerability
CVE-2022-32221 Ubuntu 18.04 - Curl
CVE-2022-39253 Ubuntu 18.04 - git (Link Following)
CVE-2022-39260 Ubuntu 18.04 - git (Out-of-bounds Write)
CVE-2022-42915 Alpine Curl - Double Free
CVE-2022-42916 Alpine Curl - Cleartext Transmission of Sensitive Information

Okera uses Snyk and GitHub Advanced Security for security vulnerability scanning.

Bug Fixes and Improvements¶

Fixed an issue where users were required to have write privileges to view metadata.

2.11.4 (10/23/2022)¶

Bug Fixes and Improvements¶

Updated the Okera Spark3 connector to address an incompatibility resulting from a Databricks library change in Okera-supported Databricks runtime versions 9.1 LTS and later.

2.11.3 (10/20/2022)¶

Support for Referencing Amazon S3 Objects in Okera Configuration Parameters in nScale Amazon EMR Deployments¶

With this release, you can now reference objects stored in Amazon S3 as Okera configuration parameters for odas-emr-bootstrap. Okera pulls the objects referenced in the configuration parameters and mounts them in the nScale container, making the Amazon S3 paths available to Okera for processing. For example, this is helpful when configuring the SSL certificate and key required to start the OkeraEnsemble Amazon EMR access proxy in TLS/SSL mode:

--external-objects-to-container SSL_CERTIFICATE_FILE=s3://bucket/certificate-object, SSL_KEY_FILE=s3://bucket/key-object

If SSL_CERTIFICATE_FILE specifies the path to the SSL certificates file in Amazon S3 and SSL_KEY_FILE specifies the path to the SSL key file in Amazon S3, these paths can be used by the OkeraEnsemble access proxy for any necessary TLS/SSL processing.
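As a rough illustration of how the option value above is put together, the following Python sketch joins NAME=s3://... pairs into one string and rejects anything that is not an Amazon S3 URI. The helper name build_external_objects_arg is hypothetical and not part of Okera, and the ", " separator simply mirrors the example above; check the odas-emr-bootstrap documentation for the exact quoting your shell needs.

```python
from typing import Mapping
from urllib.parse import urlparse


def build_external_objects_arg(objects: Mapping[str, str], separator: str = ", ") -> str:
    """Join NAME=s3://bucket/key pairs into a single option value (hypothetical helper)."""
    pairs = []
    for name, uri in objects.items():
        parsed = urlparse(uri)
        # Accept only Amazon S3 URIs that name both a bucket and an object key.
        if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
            raise ValueError(f"{name}: expected an s3://bucket/key URI, got {uri!r}")
        pairs.append(f"{name}={uri}")
    return separator.join(pairs)


value = build_external_objects_arg({
    "SSL_CERTIFICATE_FILE": "s3://bucket/certificate-object",
    "SSL_KEY_FILE": "s3://bucket/key-object",
})
print(f"--external-objects-to-container {value}")
```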

Security Vulnerabilities (CVEs/CWEs) Addressed¶

CVE-2021-4209 Null Pointer Dereference
CVE-2022-2509 Double Free
CVE-2022-2526 Use After Free
CVE-2022-36944 Remote Code Execution (RCE)
CVE-2022-37434 Out-of-Bounds Write
CVE-2022-40664 Improper Authentication
CVE-2022-41828 Use of Function with Inconsistent Implementations
CVE-2022-42003 Deserialization of Untrusted Data
CVE-2022-42004 Deserialization of Untrusted Data

Okera uses Snyk and GitHub Advanced Security for security vulnerability scanning.

Bug Fixes¶

The following bugs were fixed in this release:

Fixed an issue where some sensitive environment variables were not always redacted in logs.

The default for the Databricks environment variable OKERA_PREFER_ACTIVE_SESSION_FOR_ID is now true, so the workaround described in Problem 2: Partitioned Temporary Tables Generated Cannot Be Accessed is no longer necessary.

You no longer need to specify the transport protocol (http:// or https://) in the bootstrap script option --rest-server-hostports of odas-emr-bootstrap.

2.11.2 (9/23/2022)¶

Blocking Access to the Okera UI for Tablets¶

This release introduces the ability to block access to the Okera UI on tablets. The configuration parameter, BLOCK_WEB_UI_FOR_MOBILE_CLIENTS, controls this behavior. Valid values for this parameter are true (the UI is blocked on mobile devices and tablets) and false (the UI is not blocked on mobile devices and tablets). The default is false.

Updated Input to LDAP Filtering¶

This release updates Okera's input for LDAP filtering. Two new configuration parameters let you specify separate base DNs (distinguished names) for users and groups, which Okera uses for LDAP server searches during authentication processing.

Use GROUP_RESOLVER_LDAP_USER_BASE_DN to specify the base DN for users.
Use GROUP_RESOLVER_LDAP_GROUP_BASE_DN to specify the base DN for groups.

See Okera Configuration Parameter Reference for a complete list of the configuration parameters available to you for Okera configurations.

OkeraEnsemble nScale Amazon S3 Bucket Access Enhancements¶

This release enhances OkeraEnsemble nScale mode deployment in Amazon EMR Spark environments by supporting access control both for Amazon S3 buckets defined for Amazon's assume secondary role feature and, optionally, for buckets to which the Okera cluster is granted access through its IAM permissions.

Because nScale deploys the OkeraEnsemble access proxy with least-privilege access to Amazon EMR, the proxy has no IAM permissions of its own and retrieves the credentials it uses to sign Amazon S3 requests from the Okera Policy Engine (Planner). Consequently, when you deploy OkeraEnsemble in nScale mode, you must provide access to the Amazon S3 buckets using either of two methods:

Using Amazon S3's assume secondary role feature. For Amazon S3 buckets that use assume secondary roles (bucket role map), the OkeraEnsemble access proxy retrieves the AWS Security Token Service (STS) credentials associated with the Amazon Resource Name (ARN) for the Amazon S3 bucket.
By setting a new OKERA_SYSTEM_IAM_ROLE_ARN configuration parameter in the Okera configuration file to the IAM Amazon Resource Name (ARN) associated with the Okera cluster. When this is activated, Okera can grant OkeraEnsemble nScale users access to buckets to which the Okera cluster has access by permission through its IAM role.

For more information about OkeraEnsemble nScale mode deployment in Amazon EMR environments, see OkeraEnsemble nScale Mode Deployment in Amazon EMR Environments.

OkeraEnsemble nScale System Token Duration Controls¶

You can now specify the duration, in minutes, of the JWT system token for OkeraEnsemble nScale processing. A new environment variable, SYSTEM_TOKEN_DURATION_MIN, can be set on the nScale container using the Okera Amazon EMR odas-emr-bootstrap script to configure the duration of the Okera system token. Valid values are positive integers. The default value is equivalent to one day (1440 minutes). For example, passing the following argument to the odas-emr-bootstrap.sh script sets the system token duration to 300 minutes:

--local-worker-env-vars "-e SYSTEM_TOKEN_DURATION_MIN=300"

This configuration setting only works when the nScale proxy is configured using JWT_PRIVATE_KEY and not with SYSTEM_TOKEN. When configured using JWT_PRIVATE_KEY, the nScale access proxy generates its own token and the SYSTEM_TOKEN_DURATION_MIN setting determines how long that token is valid. When configured with SYSTEM_TOKEN, the SYSTEM_TOKEN_DURATION_MIN setting has no effect because the JWT identified by the SYSTEM_TOKEN path includes an embedded expiration time that cannot be governed by the SYSTEM_TOKEN_DURATION_MIN setting. If both JWT_PRIVATE_KEY and SYSTEM_TOKEN are specified, the JWT_PRIVATE_KEY is used and the SYSTEM_TOKEN is ignored.
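The precedence rules above can be summarized in a small Python sketch. It only models the behavior described in these notes; the function name effective_system_token is hypothetical, the paths are placeholders, and raising an error when neither setting is present is an assumption made for the example.

```python
from typing import Mapping, Optional, Tuple

DEFAULT_DURATION_MIN = 1440  # one day, the default stated in these notes


def effective_system_token(env: Mapping[str, str]) -> Tuple[str, Optional[int]]:
    """Return (token source, duration in minutes, or None when the setting is ignored)."""
    if env.get("JWT_PRIVATE_KEY"):
        # The proxy generates its own token, so the duration setting applies.
        # JWT_PRIVATE_KEY takes precedence even when SYSTEM_TOKEN is also set.
        minutes = int(env.get("SYSTEM_TOKEN_DURATION_MIN", DEFAULT_DURATION_MIN))
        if minutes <= 0:
            raise ValueError("SYSTEM_TOKEN_DURATION_MIN must be a positive integer")
        return "JWT_PRIVATE_KEY", minutes
    if env.get("SYSTEM_TOKEN"):
        # A pre-issued token carries its own expiration; the setting has no effect.
        return "SYSTEM_TOKEN", None
    # Assumption for this sketch: treat a missing token configuration as an error.
    raise ValueError("neither JWT_PRIVATE_KEY nor SYSTEM_TOKEN is configured")


print(effective_system_token({
    "JWT_PRIVATE_KEY": "/path/to/key",    # placeholder path
    "SYSTEM_TOKEN": "/path/to/token",     # placeholder path
    "SYSTEM_TOKEN_DURATION_MIN": "300",
}))
```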

Security Vulnerabilities (CVEs) Addressed¶

CVE-2017-7525 Incomplete Blacklist
CVE-2017-15095 Deserialization of Untrusted Data
CVE-2017-17485 Deserialization of Untrusted Data
CVE-2018-5968 Deserialization of Untrusted Data
CVE-2018-7489 Incomplete Blacklist
CVE-2018-11307 Deserialization of Untrusted Data
CVE-2018-12022 Deserialization of Untrusted Data
CVE-2018-12023 Deserialization of Untrusted Data
CVE-2018-14718 Deserialization of Untrusted Data
CVE-2018-14719 Deserialization of Untrusted Data
CVE-2018-14720 XML External Entity (XXE) Injection
CVE-2018-14721 Server-Side Request Forgery (SSRF)
CVE-2018-19360 Deserialization of Untrusted Data
CVE-2018-19361 Deserialization of Untrusted Data
CVE-2018-19362 Deserialization of Untrusted Data
CVE-2019-12086 Deserialization of Untrusted Data
CVE-2019-12384 Deserialization of Untrusted Data
CVE-2019-12814 Deserialization of Untrusted Data
CVE-2019-14379 Improperly Controlled Modification of Dynamically-Determined Object Attributes
CVE-2019-14439 Deserialization of Untrusted Data
CVE-2019-14540 Deserialization of Untrusted Data
CVE-2019-14892 Deserialization of Untrusted Data
CVE-2019-14893 Deserialization of Untrusted Data
CVE-2019-16335 Deserialization of Untrusted Data
CVE-2019-16942 Deserialization of Untrusted Data
CVE-2019-16943 Deserialization of Untrusted Data
CVE-2019-17267 Deserialization of Untrusted Data
CVE-2019-17531 Deserialization of Untrusted Data
CVE-2019-20330 Deserialization of Untrusted Data
CVE-2020-8840 Deserialization of Untrusted Data
CVE-2020-9546 Deserialization of Untrusted Data
CVE-2020-9547 Deserialization of Untrusted Data
CVE-2020-9548 Deserialization of Untrusted Data
CVE-2020-10650 Deserialization of Untrusted Data
CVE-2020-10672 Deserialization of Untrusted Data
CVE-2020-10673 Deserialization of Untrusted Data
CVE-2020-10968 Deserialization of Untrusted Data
CVE-2020-10969 Deserialization of Untrusted Data
CVE-2020-11111 Deserialization of Untrusted Data
CVE-2020-11112 Deserialization of Untrusted Data
CVE-2020-11113 Deserialization of Untrusted Data
CVE-2020-11619 Deserialization of Untrusted Data
CVE-2020-11620 Deserialization of Untrusted Data
CVE-2020-14060 Deserialization of Untrusted Data
CVE-2020-14061 Deserialization of Untrusted Data
CVE-2020-14062 Deserialization of Untrusted Data
CVE-2020-14195 Deserialization of Untrusted Data
CVE-2020-17523 Authentication Bypass
CVE-2020-24616 Deserialization of Untrusted Data
CVE-2020-24750 Deserialization of Untrusted Data
CVE-2020-25649 XML External Entity (XXE) Injection
CVE-2020-35490 Deserialization of Untrusted Data
CVE-2020-35491 Deserialization of Untrusted Data
CVE-2020-35728 Deserialization of Untrusted Data
CVE-2020-36179 Deserialization of Untrusted Data
CVE-2020-36180 Deserialization of Untrusted Data
CVE-2020-36181 Deserialization of Untrusted Data
CVE-2020-36182 Deserialization of Untrusted Data
CVE-2020-36183 Deserialization of Untrusted Data
CVE-2020-36184 Deserialization of Untrusted Data
CVE-2020-36185 Deserialization of Untrusted Data
CVE-2020-36186 Deserialization of Untrusted Data
CVE-2020-36187 Deserialization of Untrusted Data
CVE-2020-36188 Deserialization of Untrusted Data
CVE-2020-36189 Deserialization of Untrusted Data
CVE-2020-36518 Out-of-bounds Write
CVE-2021-20190 Deserialization of Untrusted Data
CVE-2022-21434 Oracle Java SE Vulnerability
CVE-2022-34169 Incorrect Conversion between Numeric Types
CVE-2022-37434 Out-of-bounds Write

Okera uses Snyk and GitHub Advanced Security for security vulnerability scanning.

Bug Fixes¶

The following bugs were fixed in this release:

Privacy functions encrypt and aes_encrypt can now only be used by Okera admins.

The fix in Okera version 2.11.0 that returned string format instead of tabular format in the output of show create table has been reverted.

Upgraded Okera's version of Jackson to 2.13.3.
Upgraded Okera's base Alpine version to 3.15.6.
The policy synchronization enforcement mechanism used for Snowflake connections is no longer enabled by default. You must enable it manually using the POLICY_SYNC_SCHEDULER_ENABLED configuration parameter or the okera.policy_sync.enabled advanced parameter in your Snowflake connection.

Upgraded Okera's version of Apache Shiro to 1.9.1.
Upgraded Okera's version of OpenJDK to 8.345.01-r0.
Fixed an error that occurred when querying Athena tables using JDBC pushdown processing. The error received was: An error has been thrown from the AWS Athena client. 1 validation error detected: Value '' at 'queryExecutionContext.catalog' failed to satisfy constraint: Member must have length greater than or equal to 1 [Execution ID not available].

2.11.1¶

Okera Version 2.11.1 was never distributed. Its updates were rolled into Okera 2.11.2.

2.11.0 (8/11/2022)¶

Snowflake Policy Synchronization Changes¶

This release introduces multiple changes to Okera's Snowflake policy synchronization enforcement.

All the access levels that are supported in both Okera and Snowflake are now supported. In past releases, only SELECT access was supported. This release extends Okera support for ALL, INSERT, DELETE, and UPDATE access as well. For more information about Snowflake policy synchronization, see Policy Synchronization Enforcement Overview.

The Snowflake connection dialog in the UI has been updated in this release. Users are now required to choose one of the following user options for policy synchronization when they set up a Snowflake connection in the UI:
1. They can select a checkbox indicating that synchronization should occur for all users.
2. They can specify a comma-separated list of users or a tag in a provided entry box.
You should no longer specify the okera.policy_sync.user_allowed_list advanced connection property in the Advanced properties box in the UI dialog. The list is now managed by the new checkbox and entry box. However, you can continue to use the property when setting up a Snowflake connection using the API.
The connection details for Snowflake connections (Connection Details tab for a connection) now more closely match the details provided for other connections.
Instructions and a sample script are now provided for creating a tag in Snowflake for Okera policy synchronization and applying it to your Snowflake user definitions. See Tag Users in Snowflake.

For complete information about Snowflake policy synchronization, see Policy Synchronization Enforcement Overview. For information about setting up a Snowflake connection, see Create a Snowflake Connection.

Tag Restrictions¶

This release introduces restrictions for tagging.

Users who do not have permissions to create tags can no longer see the button on the Tags page in the UI.
Users who do not have permissions to create tag namespaces can no longer create them on the Create new tag dialog.
Users who do not have permissions to remove tags can no longer see the option on the Tags page in the UI.
When creating or removing a tag, users can only select the namespaces for which they have privileges.

See Managing Tags for more information about tags.

Deleting Databases From the UI¶

This release introduces the ability to delete an Okera database in the UI. For more information, see Delete a Database.

Workspace and Preview Changes in the UI¶

This release introduces the following changes to the Workspace page and to the dataset preview pages available for datasets registered to a database and for the dataset details of a crawler on the Registration page.

The Workspace page, when accessed from a dataset details page, now defaults to using the Presto API.
The dataset previews now default to using the Presto API for the preview queries.

In past releases, the Okera API was used.

Blocking Access to the Okera UI for Mobile Devices¶

This release introduces the ability to block access to the Okera UI on mobile devices. A new configuration parameter, BLOCK_WEB_UI_FOR_MOBILE_CLIENTS, has been introduced to control this behavior. Valid values for this parameter are true (the UI is blocked on mobile devices) and false (the UI is not blocked on mobile devices). The default is false.

OkeraEnsemble Amazon EMR nScale Mode Deployment ¶

With this release, you can elect to deploy the OkeraEnsemble access proxy in nScale mode in Amazon EMR environments, so the OkeraEnsemble workload is distributed across your cluster nodes and scales up and down with your clusters. To do this, the OkeraEnsemble access proxy retrieves AWS credentials from the Okera Policy Engine (planner). To communicate with the Okera cluster, the access proxy generates its own system token if it is configured with the JWT private key used by the Okera cluster (via the JWT_PRIVATE_KEY configuration property). This is done for you if you use the odas-emr-bootstrap.sh script with the --install-jwt-key argument (specifying the Amazon S3 path to the key).

For more information, see OkeraEnsemble nScale Mode Deployment in Amazon EMR Environments.

Databricks 10 and 11 Support ¶

This release introduces support for Databricks 10.0 through 10.5 and 11.0. In past versions, Okera only supported versions 8.3, 8.4, 9.0 and 9.1. For more information, see Databricks Integration Steps.

Note: With this release, Okera drops support for Databricks 7.3.

Sample GCP Group Resolution Script¶

This release introduces a sample script to resolve groups in Okera when using Google Cloud Platform (GCP). The script requires that the following configuration parameters be specified in the Okera configuration file.

Parameter GROUP_RESOLUTION_GOOGLE_APPLICATION_CREDENTIALS must provide the fully qualified path to a credentials JSON file for a GCP service account with appropriate admin privileges. The path can be a container path (the JSON file is mounted to the container by Kubernetes), an Amazon S3 (s3:) path, or an ADLS (adl:) path.
Parameter GSUITE_GROUP_ADMIN_EMAIL must provide the email of a GCP user with appropriate administrative privileges.
Parameter GROUP_RESOLVER_SCRIPTS must specify the fully qualified path /opt/scripts/resolve_groups_gcp_example.py.

When all configuration parameters are specified correctly, GCP group resolution is performed for Okera. See Sample GCP Group Resolution Script.

Configuring Parquet File Resolution Types¶

The table property parquet.resolve-by.type can now be used to configure how a Parquet data file is resolved. Valid values are ordinal (positional resolution) and name (name resolution). In past releases, resolution was configured globally and, by default, resolved by name.

For example:

  ALTER TABLE nation SET TBLPROPERTIES('parquet.resolve-by.type'='name')
  ALTER TABLE nation SET TBLPROPERTIES('parquet.resolve-by.type'='ordinal')
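To make the difference between the two modes concrete, here is a minimal Python sketch of the general idea, not Okera's implementation. It maps each table column to a column index in a Parquet file either by name or by position. The function name resolve_columns and the column names are hypothetical, and returning None for an unmatched column is an assumption made for the example.

```python
from typing import Dict, List, Optional


def resolve_columns(table_columns: List[str], file_columns: List[str],
                    mode: str = "name") -> Dict[str, Optional[int]]:
    """Map each table column to an index in the file's column list (None if unmatched)."""
    if mode == "name":
        # Name resolution: look each table column up by its name in the file schema.
        positions = {name: index for index, name in enumerate(file_columns)}
        return {column: positions.get(column) for column in table_columns}
    if mode == "ordinal":
        # Ordinal resolution: the i-th table column reads the i-th file column.
        return {column: (index if index < len(file_columns) else None)
                for index, column in enumerate(table_columns)}
    raise ValueError(f"unknown resolution mode: {mode!r}")


table = ["n_nationkey", "n_name"]
parquet_file = ["n_name", "n_nationkey"]  # same columns, written in a different order

print(resolve_columns(table, parquet_file, mode="name"))
print(resolve_columns(table, parquet_file, mode="ordinal"))
```

When the file's column order differs from the table's, as in this example, the two modes read different file columns for the same table column, which is why the choice matters per table.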

AWS Athena Upgrade and Performance Improvements¶

This release upgrades Okera to use the AWS Athena 2.0.30 JDBC driver. With this upgrade, the Athena JDBC JAR file is no longer provided by Okera in the Maven repository, so you must download it from https://docs.aws.amazon.com/athena/latest/ug/connect-with-jdbc.html. Okera does not require a JDBC driver with the AWS SDK, so download the one without the AWS SDK. In addition, Okera connections to Athena now require the path to the JDBC JAR file and the driver class name, specified in the driver.jar.path and driver.class.name connection properties. If you are creating an Athena connection in the Okera UI, specify these properties in the Driver file path and Driver class name fields. See Athena Data Source Connections.

Starting with Athena 2.0.5, the Athena JDBC connector uses the result set streaming API to improve its performance when fetching query results. To use this new Athena feature:

Include and allow the athena:GetQueryResultsStream action in your IAM policy statement. For details on managing Athena IAM policies, see https://docs.aws.amazon.com/athena/latest/ug/security-iam-athena.html.
If you are connecting to Athena through a proxy server, make sure that the proxy server does not block port 444. The result set streaming API uses port 444 on the Athena server for outbound communications.
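For the first requirement, a policy statement along the following lines could be generated and merged into your existing IAM policy. This Python sketch only builds the JSON; the helper name athena_streaming_statement is hypothetical, and the "*" resource is a placeholder assumption that you would normally narrow to your own Athena resources.

```python
import json


def athena_streaming_statement(resource: str = "*") -> dict:
    """Build a minimal IAM policy document that allows Athena result set streaming."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The action named in this release note.
                "Action": ["athena:GetQueryResultsStream"],
                # Placeholder: scope this to your own resources where possible.
                "Resource": resource,
            }
        ],
    }


policy = athena_streaming_statement()
print(json.dumps(policy, indent=2))
```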

Glue Enhancements¶

The following changes were made in this release to Okera's integration with AWS Glue:

A new configuration parameter, OKERA_GLUE_SILENCE_TBL_PAGINATOR_500, has been introduced. This parameter enables and disables Okera's silencing of unknown Glue errors that can affect the dataset counts on the Data page in the Okera UI. Valid values are true (silence the errors) and false (don't silence the errors). The default is false. You only need to set this configuration parameter if you receive an InternalServiceException 500 from Glue while trying to open the Data page in the Okera UI.
Additional log messages have been added to improve any debugging that might be needed in Glue environments.

For more information about Okera's integration with AWS Glue, see Using Glue as a Third-Party Metadata Catalog.

SSL/TLS Enabled for the Okera Catalog¶

This release introduces configurable SSL and TLS support for Okera MySQL catalog databases and enhanced SSL support for Okera Postgres catalog databases. In past releases, Okera only provided very basic, non-configurable SSL support for Postgres catalogs and did not support TLS for either MySQL or Postgres catalogs. You can enable this enhanced configurable encryption by setting the CATALOG_DB_SSL parameter to "true" (the default is "false") in the Okera configuration file as well as setting the following new Okera parameters (encoded in base64) as described below:

CATALOG_DB_SERVER_CERT: Specifies the SSL/TLS certificate for the MySQL or Postgres catalog database server.
CATALOG_DB_CLIENT_CERT: Specifies the TLS certificate for the MySQL catalog database client. This parameter is only needed for TLS support.
CATALOG_DB_CLIENT_CERT_KEY: Specifies the private key for the MySQL catalog client TLS certificate. This parameter is only needed for TLS support.

Okera can determine which protocol (SSL or TLS) to use based on the certificates provided.
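Because the certificate parameters are supplied in base64, a small helper can prepare the values. The Python sketch below assumes that the full PEM text of each file is base64-encoded as a single value, which these notes do not spell out; the helper name encode_pem_for_config and the placeholder certificate text are hypothetical. See Configure SSL/TLS for Okera Metadata Storage for the authoritative format.

```python
import base64


def encode_pem_for_config(pem_text: str) -> str:
    """Base64-encode PEM text so it can be placed in a configuration value."""
    return base64.b64encode(pem_text.encode("utf-8")).decode("ascii")


# Placeholder content only; in practice, read the server certificate from its PEM file.
PLACEHOLDER_PEM = (
    "-----BEGIN CERTIFICATE-----\n"
    "placeholder-not-a-real-certificate\n"
    "-----END CERTIFICATE-----\n"
)

settings = {
    "CATALOG_DB_SSL": "true",
    "CATALOG_DB_SERVER_CERT": encode_pem_for_config(PLACEHOLDER_PEM),
}

for name, value in settings.items():
    print(f"{name}: {value}")
```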

Notes: This change only impacts Okera connections to its MySQL or Postgres catalog and does not establish SSL/TLS configurable support throughout the Okera cluster.

Okera only supports TLS for MySQL catalogs at this time. It does not support Cloud SQL Auth proxy functionality.

For more information, see Configure SSL/TLS for Okera Metadata Storage.

Active/Active In-Parallel Policy Loading¶

This release introduces the ability to load Okera policies in parallel when active/active environments start up. This speeds up service start time, particularly for slower RDBMS environments or environments in which many roles must be loaded. Okera uses two thread pools to perform active/active in-parallel policy loading, one for the initial load and one that occurs in the background. The default number of roles loaded in parallel for an initial load is 12; the default number of roles loaded in the background is 2. To control these settings, two new configuration parameters have been introduced:

SENTRY_INITIAL_LOAD_THREADS can be used to override the initial in-parallel load default of 12 roles. Specify the number of roles that should be loaded in parallel when an active/active environment is initially started.
SENTRY_BACKGROUND_LOAD_THREADS can be used to override the background in-parallel load default of 2 roles. Specify the number of roles that should be loaded in parallel in the background of an active/active environment.
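The effect of a bounded thread pool on role loading can be illustrated with a short Python sketch. This is a generic illustration of the technique, not Okera's code: load_role is a hypothetical stand-in for fetching one role's policies from the catalog database, and the constants simply mirror the defaults described above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, Tuple

INITIAL_LOAD_THREADS = 12     # default for the initial load, per these notes
BACKGROUND_LOAD_THREADS = 2   # default for background loading, per these notes


def load_role(role_name: str) -> Tuple[str, str]:
    """Hypothetical stand-in for loading one role's policies from the catalog database."""
    return role_name, f"policies-for-{role_name}"


def load_roles_in_parallel(role_names: Iterable[str], threads: int) -> Dict[str, str]:
    """Load roles with at most `threads` loads running at the same time."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return dict(pool.map(load_role, role_names))


roles = [f"role_{i}" for i in range(30)]
loaded = load_roles_in_parallel(roles, INITIAL_LOAD_THREADS)
print(f"loaded {len(loaded)} roles")
```

A larger pool shortens startup when each load mostly waits on the database, at the cost of more concurrent connections to it.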

For more information about active/active environments, see Active/Active Deployment in Aurora RDS. For more information about Okera configuration parameters, see Configuration and Okera Configuration Parameter Reference.

Restricting Use of Privacy Functions¶

This release introduces the ability to restrict use of Okera's privacy functions and user-defined functions (UDFs) to Okera administrators only. To activate this feature, add the RESTRICTED_UDFS configuration parameter to your Okera configuration file. The value of this parameter is a comma-separated list of function names. Use of any function listed in the parameter requires administrator privileges. In the following example, the aes_decrypt and nfp_ref_tokenize privacy functions can only be used by administrators.

RESTRICTED_UDFS: aes_decrypt,nfp_ref_tokenize
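The check this setting implies can be sketched in a few lines of Python. The sketch only models the rule described above; the function names are hypothetical, and treating function names as case-insensitive is an assumption made for the example.

```python
from typing import Set


def parse_restricted_udfs(value: str) -> Set[str]:
    """Split a comma-separated RESTRICTED_UDFS value into a set of function names."""
    return {name.strip().lower() for name in value.split(",") if name.strip()}


def may_call(function_name: str, is_admin: bool, restricted: Set[str]) -> bool:
    """Administrators may call anything; other users may not call restricted functions."""
    return is_admin or function_name.lower() not in restricted


restricted = parse_restricted_udfs("aes_decrypt,nfp_ref_tokenize")
print(may_call("aes_decrypt", is_admin=False, restricted=restricted))
print(may_call("aes_decrypt", is_admin=True, restricted=restricted))
print(may_call("mask", is_admin=False, restricted=restricted))
```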

For complete information about the privacy functions supported by Okera, see Privacy and Security Functions.

Dropping Attributes From Nested Fields¶

This release introduces the ability to drop attributes from nested fields. See Nested Field Tags.

Transformation Priority Defaults¶

This release introduces defaults for transformation priorities when more than one transformation is applied to a single column. The default priority order, from lowest to highest, is:

null
zero
sha2
hash
fnv_hash
aes_decrypt
aes_encrypt
tokenize
fp_ref_tokenize
nfp_ref_tokenize
mask
mask_ccn
diff_privacy
phi_age
phi_date
phi_dob
phi_zip3
fp_random
nfp_random
random_ccn

Transformations later in this list have higher priority and override earlier, lower-priority transformations (for example, mask_ccn overrides zero). However, you can specify your own prioritization. See Prioritization of Transformations.
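The default ordering can be read as a simple ranking. The following Python sketch picks the transformation that takes effect when several apply to the same column, using only the default list above. The function name winning_transformation is hypothetical, and custom prioritization is not modeled.

```python
from typing import Iterable

# Default priority order from these notes, lowest priority first.
DEFAULT_PRIORITY = [
    "null", "zero", "sha2", "hash", "fnv_hash", "aes_decrypt", "aes_encrypt",
    "tokenize", "fp_ref_tokenize", "nfp_ref_tokenize", "mask", "mask_ccn",
    "diff_privacy", "phi_age", "phi_date", "phi_dob", "phi_zip3",
    "fp_random", "nfp_random", "random_ccn",
]


def winning_transformation(applied: Iterable[str]) -> str:
    """Return the applied transformation that appears latest in the default list."""
    rank = {name: index for index, name in enumerate(DEFAULT_PRIORITY)}
    # A name that is not in the default list raises KeyError in this sketch.
    return max(applied, key=lambda name: rank[name])


print(winning_transformation(["zero", "mask_ccn"]))
print(winning_transformation(["tokenize", "sha2", "null"]))
```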

Native Delta Lake Table Support (Preview Feature)¶

This release introduces manifest-less native support for files in Delta Lake tables. This is introduced as an Okera preview feature. Previously, Okera only read Delta Lake tables if a manifest was explicitly created. With native support, this is no longer needed. Okera recommends switching to (manifest-less) native support, if you currently use the manifest method.

Native support is disabled by default, but can be enabled for individual Delta Lake tables or databases by specifying okera.delta.native-support=true as a table or database property. You can also enable it for the entire Okera cluster using the new DELTA_TABLE_NATIVE_SUPPORT configuration parameter in the Okera configuration file. Valid values for these properties are true (use native support, not manifest support) and false (use manifest support, not native support). The default is currently false for the cluster, but will be changed to true in a future release.

Note: Okera currently only supports querying the latest snapshot of a Delta Lake table.

For more information about Delta Lake file support, see Databricks Delta Lake Table Support.

Notable Changes¶

This release drops Okera's support for Kubernetes v1beta1. Okera now only supports Kubernetes v1. The v1beta1 version is deprecated and should no longer be used. This change affects the okctl and Helm charts used by your Okera clusters. The change was necessary because without it, Okera cannot support newer Kubernetes clusters. However, because of this change, Okera cannot support clusters older than 2017. If your cluster uses Kubernetes v1beta1 or is older than 2017, please upgrade your Okera environment or contact Okera for assistance.

The authorize-query REST server API now requires that authorize_for be set to your user name when you submit an Okera query unless you are an Okera admin. Okera admins do not need to specify authorize_for. See Okera Policy Engine Integration.
This release drops support for Databricks 7.3.
Cloudera CDH is no longer supported in Okera.
Amazon Web Services EMR versions lower than version 5.24 are no longer supported in Okera.

Security Vulnerabilities (CVEs) Addressed¶

CVE-2020-28483 HTTP Response Splitting
CVE-2022-2097 Inadequate Encryption Strength
CVE-2022-22576 Improper Authentication
CVE-2022-22747 NSS Issue
CVE-2022-23437 XML Injection
CVE-2022-24765 Uncontrolled Search Path Element
CVE-2022-25647 Deserialization of Untrusted Data
CVE-2022-27775 Curl
CVE-2022-27781 Loop with Unreachable Exit Condition ('Infinite Loop')
CVE-2022-27782 Improper Certificate Validation
CVE-2022-29155 Ubuntu USN-5424-1 OpenLDAP Vulnerability
CVE-2022-29187 Uncontrolled Search Path Element
CVE-2022-29361 HTTP Request Smuggling
CVE-2022-29458 Out-of-Bounds Read
CVE-2022-31197 SQL Injection
CVE-2022-32205 Allocation of Resources Without Limits or Throttling
CVE-2022-32206 Allocation of Resources Without Limits or Throttling
CVE-2022-32207 Incorrect Default Permissions
CVE-2022-32208 Out-of-Bounds Write
CVE-2022-34480 NSS Issue
CVE-2022-34903 Arbitrary Code Injection

Okera uses Snyk and GitHub Advanced Security for security vulnerability scanning.

Bug Fixes¶

Okera's base Ubuntu image has been upgraded to bionic-20220801.
Audit log entries are now added for all CREATE_AS_OWNER implied operations.

Upgraded the base Alpine image used by Okera to 3.15.5.

Fixed an issue in which Okera failed to start when using an Azure database for Postgres.

Tooltips in the UI now have an updated, more legible, look.

Resolved a problem in which the dataset previews for a BigQuery table failed because the row limit was not applied on queries, and consequently produced very large result sets.
Fixed the CSP headers for the REST API documentation.

Cloned tables with applied policies in Snowflake no longer break policy synchronization when the original table used for the clone is removed.

Upgraded Scala in recordservice-spark-2.0.jar to version 2.11.12.

Fixed an issue in which database passwords with special characters were not properly encoded when establishing a connection.

Corrected a problem with string data when using Avro complex data types for certain unnesting queries.

Users who do not have permissions to create Okera databases now receive an authorization error if they attempt to create a database that already exists.

Fixed an issue in which the Presto configuration was written with mismatched closing tags.

The Snowflake connection synchronization details tab now provides a dropdown link you can select to show the last error message stored during connection synchronization.

Fixed a bug in which users without the correct privileges (for example, granted only SELECT privileges) could add groups to a role using the REST API. This is no longer possible without the correct privileges.
Snowflake connection details have been reordered and headings have changed in the Web UI.

Fixed a bug in which users without the correct permissions were able to update dataset and dataset column descriptions directly using the REST API.

Snowflake pushdown processing now supports queries using column tags (tag-based row filtering).

A Content-Security-Policy header is now applied to the REST server resources used by the Okera UI.

Fixed a bug in which a crawler's details could not be viewed the first time after deleting another crawler.
Upgraded to the latest version (2.1.0.7) of the Redshift JDBC driver.
Improved the performance of Okera metadata queries when calling Databricks with JDBC.

Improved Okera performance when evaluating ABAC policies.
The Google Cloud CLI (gcloud) is now removed from the Okera core image.

Improved performance when authorizing datasets from Databricks.
Improved Okera performance when authorizing table access from Databricks.

Fixed a layout issue with a dataset's Yes this dataset dialog, where the copyable text extended beyond its containing dialog.

Improved performance when using policies with transform clauses.
Improved Okera performance when loading table metadata.
Improved the performance of AuthorizeQuery RPC calls.
