Quick Start Guide: AWS CloudTrail Integration

Introduction

This document describes how to integrate an Okera cluster with Amazon’s CloudTrail service.

What is CloudTrail?

CloudTrail is Amazon’s audit logging service for certain AWS APIs. It can automatically record all S3 reads, writes and policy changes for S3 content owned by an AWS account. When CloudTrail is configured, AWS writes a CloudTrail log file to a predetermined S3 bucket at roughly 15-minute intervals. Each file contains a ledger of the API operations that occurred within the file’s time range. Okera can be notified when new CloudTrail log files are delivered, consume those files, and use them to optimize certain maintenance operations, such as discovering dataset files and partitions, without having to poll S3.
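To make the "ledger of API operations" concrete, the following is a minimal sketch of reading one CloudTrail log file in Python. It relies on the documented log file layout (a gzip-compressed JSON object with a top-level "Records" array). The function name, the sample record, and the set of event names treated as writes are assumptions made for illustration; this is not Okera's implementation.

```python
import gzip
import json

# Illustrative subset of S3 object-level write events; not an exhaustive list.
S3_WRITE_EVENTS = {"PutObject", "DeleteObject", "CopyObject", "CompleteMultipartUpload"}


def s3_write_events(log_bytes):
    """Return (eventName, bucket, key) tuples for the S3 write events in one log file."""
    records = json.loads(gzip.decompress(log_bytes)).get("Records", [])
    found = []
    for record in records:
        if record.get("eventSource") != "s3.amazonaws.com":
            continue
        if record.get("eventName") not in S3_WRITE_EVENTS:
            continue
        # requestParameters may be null for some events, so fall back to {}.
        params = record.get("requestParameters") or {}
        found.append((record["eventName"], params.get("bucketName"), params.get("key")))
    return found


# A tiny in-memory log file with made-up records, used only to exercise the function.
sample_log = gzip.compress(json.dumps({
    "Records": [
        {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
         "requestParameters": {"bucketName": "example-bucket", "key": "data/part-0.parquet"}},
        {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
         "requestParameters": {"bucketName": "example-bucket", "key": "data/part-0.parquet"}},
        {"eventSource": "iam.amazonaws.com", "eventName": "ListRoles",
         "requestParameters": None},
    ]
}).encode("utf-8"))

print(s3_write_events(sample_log))
```

Only the PutObject record survives the filter, which is the kind of signal that lets a consumer learn about new data files without listing the bucket.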

For more specific details on CloudTrail, please refer to the AWS CloudTrail User Guide.

Configuring CloudTrail for Okera

Configuring CloudTrail for Okera involves three AWS service configurations:

  1. An SQS Queue per Okera cluster, from which the cluster receives log file delivery notifications
  2. An SNS Topic for pub/sub handling of CloudTrail log file delivery notifications
  3. A CloudTrail Trail that captures the necessary logs, delivers them to an S3 bucket, and posts notifications to your SNS Topic

Once these AWS services have been configured, follow the steps in the Enabling the Okera CloudTrail Service section to configure your cluster.

SQS Queue Configuration

To create an SQS Queue, follow these steps:

  1. Navigate to the AWS SQS Console
  2. Create a Queue using the “Quick-Create Queue” button
  3. Select the newly created Queue in the console and choose “Add a Permission” in the “Queue Actions” dropdown
  4. In the “Add a permission” dialog, do the following:
    - Select “Allow” for the Effect
    - Specify the Principal for the Okera cluster
    - Select “All SQS Actions”
  5. Copy the ARN URI for the SQS Queue and proceed to the next section
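The permission added in step 4 corresponds to a statement in the queue's access policy. As a rough sketch, the Python below builds a policy document of that shape. Both ARNs and the function name are hypothetical placeholders, and the exact policy the console generates may differ (for example, it may add statement IDs).

```python
import json


def queue_policy(queue_arn, principal_arn):
    """Build a policy document granting one principal all SQS actions on one queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "SQS:*",
                "Resource": queue_arn,
            }
        ],
    }


# Hypothetical ARNs; substitute your queue and the principal used by your Okera cluster.
policy = queue_policy(
    "arn:aws:sqs:us-west-2:123456789012:okera-cloudtrail",
    "arn:aws:iam::123456789012:role/okera-cluster",
)
print(json.dumps(policy, indent=2))
```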

SNS Topic Configuration

Create an SNS Topic that is linked to your SQS Queue defined in the previous section. To create an SNS Topic, follow these steps:

  1. Navigate to the AWS SNS Console
  2. Create a Topic with the default settings
  3. Add the Queue’s ARN URI as a subscriber for your Topic
  4. Copy the ARN URI for the Topic and proceed to the next section

CloudTrail Trail Configuration

Create a new CloudTrail Trail via the AWS CloudTrail console. Ideally, the Trail should capture S3 Write operations for all S3 buckets in your account across all regions. At minimum, it must cover the S3 paths that contain the data files backing Okera datasets. The Trail should be configured to send an SNS notification for every log file delivery, using the SNS Topic created in the SNS Topic Configuration section. To create a CloudTrail Trail, follow these steps:

  1. Navigate to the AWS CloudTrail Console
  2. Create a new Trail and specify the following settings:
    - Specify a name for the Trail and select “Yes” for the “Apply trail to all regions” option
    - Specify “None” for the “Read/Write events” value under “Management events”
    - Under the “Data Events” section, select the “S3” tab and add the S3 bucket(s) you want Okera to watch and uncheck the “Read” checkbox. Alternatively, you can choose “Select all S3 buckets in your account”
    - Specify an S3 path for the “Storage location” of the CloudTrail logs. This needs to be an S3 path that is accessible by your Okera Cluster.
    - Under “Advanced”, choose “Yes” for “Send SNS notification for every log file delivery” and paste the ARN URI for your SNS Topic from the previous section
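Once the Trail is posting to the SNS Topic, each message that reaches the SQS Queue names the bucket and the log file keys that were delivered. The Python sketch below shows how such a message body could be unpacked. It assumes the SNS default of raw message delivery being disabled, so the CloudTrail payload arrives as a JSON string in the envelope's "Message" field. The function name and the sample values are illustrative, and this is not a description of how Okera itself reads the queue.

```python
import json


def log_files_from_sqs_body(body):
    """Return (bucket, key) pairs named in one CloudTrail delivery notification."""
    envelope = json.loads(body)
    # With raw message delivery disabled, the payload is a JSON string under "Message";
    # otherwise the body is assumed to be the payload itself.
    payload = json.loads(envelope["Message"]) if "Message" in envelope else envelope
    bucket = payload["s3Bucket"]
    return [(bucket, key) for key in payload.get("s3ObjectKey", [])]


# Made-up notification resembling what SNS forwards to SQS.
sample_body = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "s3Bucket": "bucket_name",
        "s3ObjectKey": ["AWSLogs/123456789012/CloudTrail/us-west-2/2018/11/05/example.json.gz"],
    }),
})

print(log_files_from_sqs_body(sample_body))
```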

Enabling the Okera CloudTrail Service

To enable the CloudTrail log processing in your Okera cluster, perform the following steps:

  1. Create a JSON parsable configuration file using the following template:
    {
        "s3Region": "<S3 region>",
        "sqsRegion": "<SQS region>",
        "sqsUrl": "<SQS URL>"
    }
    
  2. Save this file to an S3 path that is accessible by your Okera Cluster
  3. Add the following variable to your Okera Cluster’s env.sh:
    export OKERA_CLOUDTRAIL_SERVICE_CONFIGURATION=s3://bucket/path/to/config.json

    Alternatively, for Okera clusters that are already running, you can change the cerebro-planner Kubernetes deployment to specify the same variable.
  4. Boot your cluster normally
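Because a malformed or incomplete config file prevents the service from starting, it can be worth checking the file before uploading it. The following Python sketch validates a file against the three keys in the template above. The function name and the specific checks are assumptions for illustration; Okera may accept or require things this sketch does not know about.

```python
import json

# The three keys shown in the configuration template.
REQUIRED_KEYS = ("s3Region", "sqsRegion", "sqsUrl")


def config_problems(text):
    """Return a list of human-readable problems; an empty list means none were found."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not well-formed JSON: {err}"]
    if not isinstance(config, dict):
        return ["top-level JSON value must be an object"]
    problems = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} is missing or empty")
        elif value.startswith("<") and value.endswith(">"):
            problems.append(f"{key} still contains the template placeholder")
    return problems


# Hypothetical values; use your own regions and queue URL.
good = '{"s3Region": "us-west-2", "sqsRegion": "us-west-2", "sqsUrl": "https://sqs.us-west-2.amazonaws.com/123456789012/queue_name"}'
print(config_problems(good))
print(config_problems('{"s3Region": "<S3 region>", "sqsRegion": "us-west-2"}'))
```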

Troubleshooting CloudTrail

In some situations, Okera fails to enable the CloudTrail Service internally because of a bad AWS configuration, a missing config file, or a config file that is not well-formed. Take the following steps to verify that Okera is processing your CloudTrail log files correctly:

Navigate to the AWS SQS Console and verify that the Messages Available field for your queue is close to zero. Reload the page every 5 minutes to see whether the Messages Available count is increasing or decreasing. If it is consistently increasing, the CloudTrail Service in Okera is likely not enabled properly.
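The same check can be applied to a series of counts written down by hand or collected by a script. The sketch below only encodes the "consistently increasing" test; the function name and sample counts are made up. If you collect the counts programmatically, the SQS queue attribute ApproximateNumberOfMessages is the likely counterpart of the console's Messages Available field, but treat that mapping as an assumption to confirm against the SQS documentation.

```python
def backlog_is_growing(samples):
    """Return True if every sample is larger than the one before it."""
    return len(samples) >= 2 and all(later > earlier
                                     for earlier, later in zip(samples, samples[1:]))


# Made-up Messages Available readings taken 5 minutes apart.
print(backlog_is_growing([3, 9, 14, 22]))  # steadily climbing backlog
print(backlog_is_growing([3, 0, 2, 0]))    # queue is being drained
```

A single rising pair is not conclusive, since new log files arrive in bursts. Several consecutive increases are the stronger signal.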

If you suspect that Okera is not properly processing CloudTrail logs, then scan the Planner container logs for the following error strings:

  1. Periodically in the Planner logs, the following appears:
    CloudTrailService failed to poll SQS queue; waiting 120s before re-attempting
    

    This message is followed by a Java exception. It indicates that either sqsRegion or sqsUrl is incorrect or inaccessible, or that Okera does not have the necessary permissions to pull messages from the SQS Queue.

  2. The Planner fails to start and logs show:
    Failed to load CloudTrailService configuration file stored at s3://bucket/path/to/config.json
    

    This indicates that Planner is reading the OKERA_CLOUDTRAIL_SERVICE_CONFIGURATION environment variable during service startup but cannot download or parse the config file. Possible causes are that the config.json file was not found, that a policy on the S3 bucket storing config.json blocks access, or that config.json is not well-formed JSON.
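Scanning for the two error strings above can be scripted. The following Python sketch searches Planner log lines for both messages and reports the line number together with a hint drawn from the explanations above. The function name, the hint wording, and the sample log lines are illustrative. How you obtain the Planner container logs depends on your deployment and is not covered here.

```python
# Error strings from the list above, each mapped to a short hint.
FAILURE_MARKERS = {
    "CloudTrailService failed to poll SQS queue":
        "check sqsRegion, sqsUrl and the SQS permissions",
    "Failed to load CloudTrailService configuration file":
        "check the config file path, the S3 bucket policy and the JSON syntax",
}


def diagnose(log_lines):
    """Return (line_number, hint) pairs for every known failure message found."""
    findings = []
    for number, line in enumerate(log_lines, start=1):
        for marker, hint in FAILURE_MARKERS.items():
            if marker in line:
                findings.append((number, hint))
    return findings


# Made-up log excerpt containing one of the two messages.
sample_lines = [
    "Planner starting",
    "CloudTrailService failed to poll SQS queue; waiting 120s before re-attempting",
]
for number, hint in diagnose(sample_lines):
    print(f"line {number}: {hint}")
```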

If the CloudTrail Service is correctly configured and is successfully processing CloudTrail log files, you should periodically see the following in your Planner logs:

    Polled 1 sqs messages from https://sqs.us-west-2.amazonaws.com/9874597439573/queue_name

This should be followed by:

    Downloaded log file AWSLogs/9874597439573/CloudTrail/us-west-2/2018/11/05/9874597439573_CloudTrail_us-west-2_20181105T1725Z_lNRiZh3vsiRV0bjt.json.gz from bucket_name
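The log file key in the download message encodes the account, the region, and the delivery timestamp, which makes it easy to tell how recent the last processed file is. The sketch below extracts those fields in Python. The regular expression is inferred from the key shown above and from the standard CloudTrail naming scheme; the function name is an assumption, and keys written under a custom prefix or an organization trail may need a different pattern.

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the example key; adjust it if your keys differ.
KEY_PATTERN = re.compile(
    r"AWSLogs/(?P<account>\d+)/CloudTrail/(?P<region>[a-z0-9-]+)/"
    r"\d{4}/\d{2}/\d{2}/"
    r"\d+_CloudTrail_[a-z0-9-]+_(?P<stamp>\d{8}T\d{4}Z)_\w+\.json\.gz"
)


def parse_log_key(text):
    """Return (account, region, UTC datetime) from a key or a log line, or None."""
    match = KEY_PATTERN.search(text)
    if match is None:
        return None
    stamp = datetime.strptime(match["stamp"], "%Y%m%dT%H%MZ").replace(tzinfo=timezone.utc)
    return match["account"], match["region"], stamp


line = ("Downloaded log file AWSLogs/9874597439573/CloudTrail/us-west-2/2018/11/05/"
        "9874597439573_CloudTrail_us-west-2_20181105T1725Z_lNRiZh3vsiRV0bjt.json.gz "
        "from bucket_name")
print(parse_log_key(line))
```

If the most recent timestamp found this way lags far behind the current time, the service may have stopped processing even though no error string appears.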