The Okera Portal provides a Reports page where users who have access to it can view usage metrics about events occurring in the system.
Users can select a time range over which to view reports and filter them by database, dataset, and tags involved.
Reports are generated using Presto to query the reporting audit table (
which is derived from the Okera audit logs (for more on the structure of the audit logs, see Auditing).
Enabling the Reports page¶
See here for how to enable access to the reports page.
Note that even though a user has access to the Reports page, the audit views are set up to only show audit logs pertaining to the datasets a user has
ALL privileges on.
Filters at the top of the page can be used to filter all of the reports by time range, databases, datasets, and tags. When applied, these filters apply to all of the reports, not only the ones related to that particular entity type. For example, filtering by a dataset or set of datasets will not only filter the data in the "Top Accessed Datasets" report, but also all other reports.
Data is filtered as an
AND relationship between the filters.
For example, when a database and tag are selected, only events pertaining to the database and the tag are counted.
Note that you must click "Apply" in order for your selected filters to take effect and fetch updated data for the reports.
Database and dataset filters¶
By default, reports will show aggregated metrics for all of the datasets on which the current user has the
Users can change which database, dataset or set of databases and datasets the reports show using the multi-select dropdowns in the top of the page. Only databases and datasets the current user has
ALL privileges on will appear in the dropdowns. The datasets dropdown only shows datasets in the currently selected databases.
The tags filter allows a user to filter reports to only events related to data with the currently selected set of tags.
The default time range on the Reports page is "Last 24 hours". This means that every report on the page will be showing data only for queries that occurred in the 24 hours prior to the page visit. "Time range" changes the time range used to generate the reports to one of the following options:
- Last 24 hours (default): reports will show data from the 24 hours prior to visiting the page.
- Last 7 days: reports will show data for seven days prior to visiting the page.
- Custom range: when this option is selected, two date pickers become available, one for the start time of the reporting time frame, and one for the end time. When a custom time range is selected, the reports will show data for events that occurred between the beginning of the start date and the end of the end date. The available dates in the custom time range picker go back 90 days from the current date.
For large deployments, longer time ranges will take some time to load. The page will not time out, but will wait for Presto to return results. Up to 90 days can be selected in the "Custom range" selectors, but more than a week or two may take tens of minutes to return.
Each report consists of a heading, a chart, table or value, and some interaction icons. For example, here is the report for "Top Accessed Datasets":
- In the top left, the heading, which consists of a title, subheader, and info icon . The subheader describes the time frame shown, as well as filters that are applied to the data. Mousing over the info icon will show a short explanation of the report.
- In the center, the report visualization. A graph, a table or a value.
- In the top right, interaction icons.
Which interactions are available depends on the report.
In the "Top Accessed Datasets" example we see all three options:
- View all - Some reports only show the top 10 results by count. Clicking this icon will display a table of all of the results found.
- View SQL - Clicking this icon will display the SQL used to generate the report. For users with access to Workspace, there will also be a link to "Open in Workspace," where the SQL can be run to see the raw results.
- Download as CSV - Clicking this icon will download the report data as a csv file.
On most reports, a user can drill down further by clicking on any data point in the report. The drilldown displays a sample of up to 100 queries run for the selected data point. This allows users to explore the underlying data powering the reports.
For users with Workspace access, there will also be a link to view the full details in Workspace. This link takes the user to Workspace with the drilldown query pre-populated. Each report has a slightly different drilldown query, but they all follow a similar pattern of a query that selects the particular fields of interest from the reports view, followed by filters to restrict the results by time, other filters, and the constraints that match the datapoint drilled-down on. For example, the drilldown for "Top Accessed Datasets" looks like this:
SELECT start_time_utc, user, statement_type, statement, status, ae_table, ae_attribute, client_application, start_unix_time FROM okera_system.reporting_audit_logs WHERE ( ymd BETWEEN '2020-05-23' AND '2020-05-24' AND start_time_utc >= from_iso8601_timestamp('2020-05-23T00:46:41') AND start_time_utc <= from_iso8601_timestamp('2020-05-24T00:46:41') ) AND ( cardinality( array_intersect( split(ae_table, ','), ARRAY ['okera_system.role_names'] ) ) > 0 ) ORDER BY start_time_utc DESC LIMIT 100
There are a few things to note about this query:
- We are selecting the necessary fields from
okera_system.reporting_audit_logs, this is a special system view that the exposes data relevant to reporting from the audit logs.
- We are restricting by
start_unix_time. This seems redundant, but it helps performance by first choosing the right partition, then restricting data further based on timestamps. Note that dates and times here are in UTC time so may not match the local times and dates selected for the time range. The selected local time has been converted to the corresponding UTC time.
- Finally, the query adds a filter to only show events on the dataset
okera_system.role_names, which was the data point clicked on to produce this drilldown query.
The "go to Workspace" link in the drilldown preview to open the query in Workspace. There, you can modify the query to find other insights. If you need to see more results, you can use the same query in an EMR cluster or whatever other client application you use to query data through Okera.
Understanding the reports¶
There are two tabs, with a total of eight reports on the Reports page. On the first tab, "Analytics," there are six reports which show summaries of activity on various entities in the Okera deployment.
- Activity Overview
- Active Users Over Time
- Top Active Users
- Top Applications
- Top Accessed Datasets
- Top Accessed Tags
The second tab, "All Queries" contains two reports on queries in the system.
This report gives a summary of activity in the Okera deployment in the form of two counts. The first count is the number of unique active users of the system in the selected time range. These are users that ran at least one query in that time.
The second count is the number of datasets queried at least once in the selected time range.
This summary report does not include any interactions.
Active Users Over Time¶
This report shows the number of active users over time within the selected time range. If the selected time range is a day or less, it shows the number of active users per hour. If the selected time range is more than one day, it shows the number of active users per day. Each data point represents the count of users during the time between the time displayed at that data point, and the next data point. For example, the point labeled 11:00 am shows the count of users in the 11 o'clock hour.
Clicking on a point will show a drilldown of events contributing to the count of users during that time.
The "View SQL" button shows the Presto-SQL used to generate the report. For example (comments added):
SELECT startTime, count(distinct(user)) as count FROM ( SELECT -- startTime is computed by converting the stored utc time to local time, -- then only using the date portion to group events based on date date_format( start_time_utc AT TIME ZONE 'America/Los_Angeles', '%Y-%m-%d' ) as startTime, user FROM okera_system.reporting_audit_logs WHERE -- ymd is a partitioning column. The right partitions are first selected, -- then the time is further restricted to the necessary hours. -- These parameters have been converted from local time to UTC time ( ymd BETWEEN '2020-05-01' AND '2020-05-17' AND start_time_utc >= from_iso8601_timestamp('2020-05-01T07:00:00') AND start_time_utc <= from_iso8601_timestamp('2020-05-17T06:59:59') ) AND user > '' ) GROUP BY startTime ORDER BY startTime
startTimeand adds a distinct count of users. Note again that the filter used to restrict the time range of data selected is in UTC time. Because the logs are stored in UTC time in the database, the time selected by the user is converted to UTC time. This means dates may not match the local dates. The aggregation is then done in local time.
Top Active Users¶
As the name implies, this report shows a list of users in the system, ranked by the number queries they ran in the selected time range. Click on any row to see a drilldown of some of the queries run by that particular user. This report shows only the top ten users. To see a complete list of users, click the "View all" button. Clicking "View SQL" will show the SQL used to generate the report.
This report shows the most popular client applications ranked by number of queries run in the given time range. The top ten client applications are represented as bars in a bar chart, and more applications can be shown by clicking on the "View all" button. Mouseover on the bars in the chart will show the number of queries on the application.
Top Accessed Datasets¶
This report is very similar to "Top Applications", showing the most queried datasets in the selected time range. The top ten datasets are represented as bars in the bar chart, and the rest can be viewed by clicking "View all". Mouseover on the bars in the chart will show the number of queries on that dataset and clicking a bar will show the events used to calculate the count. In the case of joins or other operations that involve more than one dataset, each dataset is counted independently.
Top Accessed Tags¶
This report is very similar to the other reports for counts. It shows the most accessed tags in the selected time range, meaning this is a count of queries that include that tag. The top ten tags are represented as bars in the bar chart, and the rest can be viewed by clicking "View all". Mouseover on the bars in the chart will show the number of queries that include that tag.
Queries Over Time¶
This report shows the number of queries that occurred during the selected time range as bar chart with each bar representing the sum of queries in a particular time bucket. For time ranges of a day or less, each bar represents an hour of data. For time ranges of more than a day, each bar represents a day.
"Queries Over Time" supports drilling down to see a list of some of the queries run in a particular time bucket. To drill down, pick a bucket and click on it. You will see a modal that displays a sample of queries run in this time frame.
This report shows 100 recent queries in the time range as a table. To view more, use must use the SQL shown by clicking "View all" to run a query using Presto in your SQL environment of choice.
Notes on implementation¶
The times shown in the reports are all in client-local time. Times stored on the server (and therefore part of drilldowns) are stored as UTC time in milliseconds and represent when each event happened on the server. Each event is counted at its start time, not its end time, although end times are included in the logs.
When is data available?¶
Report data is updated every five minutes. Queries run in the last five minutes are being logged, but may not be reflected in the reports. Every five minutes a watcher task processes the audit logs, adding them to the corresponding query-able table.
What is being counted?¶
Queries are not counted 1 to 1 in the audit logs; Okera logs more than one event per user query.
For each query, Okera logs a
PLAN event, plus some number of
SCAN events depending on the scale of the data being scanned.
For very large scans, the number of events logged can be very high.
All of these events are shown in queries on the audit logs, but in reports most of them are rolled up into a small number of events.
The reports reflect a low constant number of events per query.
The exact number depends on the client and query type, but not on scale of data scanned.
So, for similar queries from the same source, the numbers in the reports are consistent.