Athena Data Source Connections (Preview Feature)

Okera does not provide the Athena JDBC JAR file in its Maven repository, so you must download it from https://docs.aws.amazon.com/athena/latest/ug/connect-with-jdbc.html. Okera does not require the AWS SDK to be bundled with the JDBC driver, so download the version without the AWS SDK. Store the driver in any location that is accessible to Okera. Okera recommends an S3 location.

In addition, Okera connections to Athena require you to specify the path to the JDBC JAR file and the driver class name. If you are using the Okera DDL to create the Athena connection, specify this information in the driver.jar.path and driver.class.name properties of the connection. If you are creating the Athena connection in the Okera UI, specify these properties in the Driver file path and Driver class name fields.

Here is a programmatic example for connecting to Athena.

CREATE DATACONNECTION athena_connection CXNPROPERTIES
(
  'connection_type'='JDBC',
  'jdbc_driver'='awsathena',
  'host'='athena.<region>.amazonaws.com',
  'port'='<port, typically 443>',
  'user_key'='awssm://<my-username>',
  'password_key'='awssm://<my-password>',
  'jdbc.schema.name'='<Athena schema name to connect to>',
  'jdbc.db.name'='<Athena database name to connect to>',
  'connection_properties'='{"AwsRegion":"<region>", "driver.jar.path":"s3://okera-jdbc-test/drivers/AthenaJDBC42.jar", "driver.class.name":"com.simba.athena.jdbc.Driver", "S3OutputLocation":"<The s3 default output path, can be found in Athena settings>"}'
);
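
The connection_properties value is a JSON object embedded in a single-quoted SQL string, so quoting and comma mistakes are easy to make when the statement is written by hand. The following is a minimal sketch, assuming you generate the DDL from a script. The helper name build_athena_ddl and its parameters are hypothetical and not part of Okera; the property names and placeholder values are taken from the example above.

```python
import json


def build_athena_ddl(name, region, schema, database, jar_path, s3_output,
                     class_name="com.simba.athena.jdbc.Driver"):
    """Return a CREATE DATACONNECTION statement (hypothetical helper)."""
    # json.dumps emits double-quoted JSON, which can sit inside the
    # single-quoted SQL string as long as no value contains a single quote.
    connection_properties = json.dumps({
        "AwsRegion": region,
        "driver.jar.path": jar_path,
        "driver.class.name": class_name,
        "S3OutputLocation": s3_output,
    })
    properties = {
        "connection_type": "JDBC",
        "jdbc_driver": "awsathena",
        "host": f"athena.{region}.amazonaws.com",
        "port": "443",
        "user_key": "awssm://<my-username>",
        "password_key": "awssm://<my-password>",
        "jdbc.schema.name": schema,
        "jdbc.db.name": database,
        "connection_properties": connection_properties,
    }
    # Reject values that would terminate the SQL string literal early.
    for key, value in properties.items():
        if "'" in value:
            raise ValueError(f"single quote not allowed in {key!r}")
    # Joining with ",\n" guarantees a comma between every pair of properties.
    body = ",\n".join(f"  '{k}'='{v}'" for k, v in properties.items())
    return f"CREATE DATACONNECTION {name} CXNPROPERTIES\n(\n{body}\n);"


if __name__ == "__main__":
    print(build_athena_ddl(
        name="athena_connection",
        region="<region>",
        schema="<Athena schema name to connect to>",
        database="<Athena database name to connect to>",
        jar_path="s3://okera-jdbc-test/drivers/AthenaJDBC42.jar",
        s3_output="<S3 output path from Athena settings>",
    ))
```

Generating the property list from a dictionary keeps the separators consistent, and serializing connection_properties with json.dumps avoids hand-escaping the nested JSON.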

Use Athena's Result Set Streaming API With Okera

Starting with version 2.0.5, the Athena JDBC driver uses the result set streaming API to improve performance when fetching query results. To use this Athena feature:

  1. Include and allow the athena:GetQueryResultsStream action in your IAM policy statement. For details on managing Athena IAM policies, see https://docs.aws.amazon.com/athena/latest/ug/security-iam-athena.html.

  2. If you are connecting to Athena through a proxy server, make sure that the proxy server does not block port 444. The result set streaming API uses port 444 on the Athena server for outbound communications.
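
As an illustration of step 1, the following sketch builds an IAM policy document that allows the athena:GetQueryResultsStream action and prints it as JSON. The helper name streaming_policy is hypothetical, and the "*" resource is an assumption used only for brevity; scope the resource to suit your environment.

```python
import json


def streaming_policy(resource="*"):
    """Build an IAM policy document that allows Athena result set streaming."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["athena:GetQueryResultsStream"],
                # Assumption: "*" is a placeholder; restrict it as needed.
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be pasted into an IAM policy editor.
    print(json.dumps(streaming_policy(), indent=2))
```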

Enable Query Pushdown

To enable transparent query pushdown, add awsathena to the list of enabled engines in the OKERA_CTE_REWRITE_ENABLED_ENGINES configuration setting. Multiple engines can be enabled (in addition to the ones enabled by default) by providing them as comma-separated values in this configuration setting.

For example:

OKERA_CTE_REWRITE_ENABLED_ENGINES: dremio:direct,redshift,awsathena
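
If the setting is managed by a script, the value can be extended without duplicating an engine that is already listed. This is a minimal sketch; the helper name add_engine is hypothetical, and it assumes only that the value is a plain comma-separated string as described above.

```python
def add_engine(current, engine):
    """Append engine to a comma-separated list unless it is already present."""
    # Split on commas and drop empty entries and surrounding whitespace.
    engines = [e.strip() for e in current.split(",") if e.strip()]
    if engine not in engines:
        engines.append(engine)
    return ",".join(engines)


if __name__ == "__main__":
    value = add_engine("dremio:direct,redshift", "awsathena")
    print(f"OKERA_CTE_REWRITE_ENABLED_ENGINES: {value}")
```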