Cloud9QL is an optional SQL-like syntax to transform data in powerful ways when needed.


  • Transform, clean and post-process data.

  • Complement the existing APIs/query syntax of datastores.

    For example, if your data is in MongoDB, Cassandra or any other datasource that we support, Cloud9QL can be used as a post-processor on top of the native query syntax.

  • Generate alternate views from the same data within Knowi.

  • Query file based data or datastores with no/limited query support.


Paste your own data below and write your own queries to experiment (or use the default data and queries):

Select everything:

select *

The keyword 'select' is optional.

Select specific fields:

select Sent, Date

Field Aliases:

select Sent as Sent Messages, Date


Filters:

select * where opened > 100000
select * where campaign_name like artists
select * where Message Type = Transactional and sent > 200000

Supported Operators: =, >, >=, <, <=, !=, like, not like

Logical Operators: and, or


Sorting:

select * where opened > 100000 order by opened desc

Supported order types: asc, desc


Limit results:

select * where opened > 100000 order by opened desc limit 1

Unique records:

 select distinct *
 select distinct customer


Aggregation functions summarize data, optionally across groups/dimensions.

Without GROUP BY

 select sum(sent)
 select sum(sent), avg(sent), count(*), median(sent), max(sent), min(sent)

Supported: sum, count, avg, median, max, min


Group By

Enables aggregations based on one or more groups/dimensions.

  select sum(sent) as Total Sent, Customer group by Customer

Array

Combines values from multiple rows of a given field into an array, based on the group by field(s).

ARRAY(<field>, <remove-duplicates-flag>)
ARRAY(<field>, <remove-duplicates-flag>, <filter-out-null-flag>)

select Stock, array(Price) as Trends group by Stock
select Stock, array(Price, true, true) as Trends group by Stock
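The grouping semantics of ARRAY can be sketched in plain Python (an illustration of the behavior described above, not Knowi's implementation; the sample rows are hypothetical):

```python
def array_agg(rows, group_key, field, dedupe=False, drop_null=False):
    """Collect `field` values into a list per `group_key`, mimicking ARRAY()."""
    out = {}
    for row in rows:
        vals = out.setdefault(row[group_key], [])
        v = row.get(field)
        if drop_null and v is None:
            continue
        if dedupe and v in vals:
            continue
        vals.append(v)
    return out

rows = [
    {"Stock": "AAPL", "Price": 100},
    {"Stock": "AAPL", "Price": 101},
    {"Stock": "AAPL", "Price": 100},
    {"Stock": "MSFT", "Price": None},
    {"Stock": "MSFT", "Price": 55},
]

print(array_agg(rows, "Stock", "Price"))
# {'AAPL': [100, 101, 100], 'MSFT': [None, 55]}
print(array_agg(rows, "Stock", "Price", dedupe=True, drop_null=True))
# {'AAPL': [100, 101], 'MSFT': [55]}
```

With both flags set to true, duplicates and nulls are dropped, matching the array(Price, true, true) example above.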



Arithmetic Operations

Arithmetic operations can be used within the query

select (opened/sent)*100 as Open Rate, Customer

Supported operators: +, -, *, /

Standard Deviation

Useful for determining the variability of a set of values:

select sd(opened) as Std Deviation, Customer group by customer

Lag

Useful for accessing data from a previous row, with an optional row offset:

select LAG(<field>[, <offset>[, <default>]])
select LAG(customer, 3)
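The row-offset behavior of LAG over a pre-sorted column can be sketched in Python (an illustration of the semantics, not Knowi's implementation):

```python
def lag(values, offset=1, default=None):
    # Each output element is the value `offset` rows earlier,
    # or `default` when there is no earlier row, mimicking LAG().
    return [values[i - offset] if i >= offset else default
            for i in range(len(values))]

customers = ["A", "B", "C", "D", "E"]
print(lag(customers))          # [None, 'A', 'B', 'C', 'D']
print(lag(customers, 3, "-"))  # ['-', '-', '-', 'A', 'B']
```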

Date Operations

Cloud9QL will automatically attempt to parse various date formats.

Use str_to_date(<date>,<format>) for unsupported formats.


Converts an Epoch number of seconds to a readable date format.

select epoch_to_date(date) as datetime

Truncates a date to midnight. When used within group by, it performs aggregation by date.

select date(date), sent
select date(date) as Sent Date, sum(sent) as Total Sent group by date(date)

Day name of the week (Sunday, Monday etc)

select day_of_week(date), sum(sent) as Total Sent group by day_of_week(date)

Day of the month (1, 2, 3 etc)

select day_of_month(date), sum(sent) as Total Sent group by day_of_month(date)

Length of the month in days (28, 29, 30, 31 etc)

select days_in_month(date), sum(sent) as Total Sent group by days_in_month(date)

Truncates a date to the beginning of the week (Sunday). When used within group by, it performs aggregation by week.

select week(date) as Sent Week, sum(sent) as Total Sent group by week(date)

By default, WEEK(<date-field>) returns the Sunday date at midnight of a given week but with WEEK(<date-field>, offset), you can offset the days and alter the day of the week returned.

Offset Reference:

1 - Monday

2 - Tuesday

3 - Wednesday

4 - Thursday

5 - Friday

6 - Saturday

Example: WEEK(<date-field>, 1): returns the Monday date at midnight of a given week.


Week Number integer for the input date

select week_of_year(date) as Sent Week, sum(sent) as Total Sent group by week_of_year(date)

  1. A week is defined from Monday to Sunday, regardless of month or year.

  2. All weeks are 7 days long

  3. Weeks are not dependent on the month, but it is possible for a month to span five weeks depending on how the days align. For example, if the 1st of a month falls on a Saturday, that day belongs to the week starting on the Monday that falls in the previous month.

  4. The first week of a year always follows the last week in December of the previous year.


Truncates to the 1st of the month. Aggregates data on a monthly basis when used within group by.

select month(date) as Sent Month, sum(sent) as Total Sent group by month(date)

Month of the year (1, 2, 7, 12 etc)

select month_of_year(date), sum(sent) as Total Sent group by month_of_year(date)

Truncates to the beginning of the quarter. Aggregates data on a quarterly basis when used within group by.

select quarter(date) as Sent Quarter, sum(sent) as Total Sent group by quarter(date)

Truncates to the 1st of the year. Aggregates data on a yearly basis when used within group by.

select year(date) as Sent Year, sum(sent) as Total Sent group by year(date)

Truncates to the hour for dates with timestamps.

select HOUR(date) as Sent Hour, sum(sent) as Total Sent group by hour(date)

Truncates/Groups to the minute for dates with timestamps

select MINUTE(timestamp) as Sent Minute, sum(sent) as Total Sent group by MINUTE(timestamp)

Current date/time

select now()

Converts a date into another format


select date_format(date,dd-MMM) as Display Format


y Year
M Month
w Week of Year
W Week in month
D Day in Year
d Day in Month
F Day of Week in Month
E Day name in week. Example: Tuesday, Tue
a Am/PM marker
H Hour in day (0-23)
h Hour in am/pm (1-12)
m Minute in hour
s Second in minute
S Millisecond
z Time zone (general, e.g. PST)
Z Time zone (RFC 822, e.g. -0800)

Add a datetime amount to a date


select date_add(date,+1y) as Date

Date conversion from a String

Converts a string into a date using a provided format.


select str_to_date(date,dd-MMM-yy HH:mm) as Converted Date
Date Tokens

The following reserved tokens enable date queries based on current date/time:

$c9_now Current Time
$c9_thishour 00:00 of the Current hour
$c9_today Midnight of the current date
$c9_yesterday Midnight, yesterday
$c9_thisweek Start of the current week (Sunday midnight)
$c9_lastweek Start of last week (Sunday midnight)
$c9_thismonth Midnight of the 1st of the current month
$c9_lastmonth Midnight of the 1st of the last month
$c9_thisquarter Midnight of the 1st of the current quarter (Jan, April, July, Oct)
$c9_lastquarter Midnight of the 1st of the last quarter (Jan, April, July, Oct)
$c9_thisyear Midnight, Jan 1, of the current year
$c9_lastyear Midnight, Jan 1, of the last year

select * where date > $c9_thisyear

In addition, these can be further manipulated with +/- operators along with time unit identifiers. For example:

select * where date > $c9_thisyear+2m

Gets data from March onwards

select * where date > $c9_yesterday+2h

Data from 2:00 AM yesterday

Time Units

The following are the list of supported time units:

min Minutes
h Hours
d Days
w Weeks
m Months
q Quarters
y Years


Default timezone is US/Pacific for date display within Knowi. On-premise agents inherit the server timezone.

Custom Timezones can be set in the query using:

set time_zone=US/Eastern;

Full list of Timezones here.


Date Deltas

This calculates the amount of time between two date-time objects in a given date/time unit. The result is a positive whole number, even if the end is before the start.

For example, number of minutes between two date-times:

MINUTES_DELTA(<date field>,<date field>)

select minutes_delta('02/28/2015 22:25:34', '01/28/2015 16:28:34') as minutes_delta;
select minutes_delta(now(), date) as minutes_delta;

Number of hours:

HOURS_DELTA(<date field>,<date field>)

select hours_delta('02/28/2015 22:25:34', '01/28/2015 16:28:34') as hours_delta;
select hours_delta(now(), date) as hours_delta;


DAYS_DELTA(<date field>,<date field>)

select days_delta('02/28/2015 22:25:34', '01/28/2015 16:28:34') as days_delta;
select days_delta(now(), date) as days_delta;


MONTHS_DELTA(<date field>,<date field>)

select months_delta('02/28/2015 22:25:34', '01/28/2015 16:28:34') as months_delta;
select months_delta(now(), date) as months_delta;
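The truncate-to-whole-units, always-positive behavior can be sketched in Python for the minute, hour, and day variants (months are omitted because calendar-month arithmetic is more involved; the dates mirror the examples above):

```python
from datetime import datetime

def _delta(a, b, unit_seconds):
    # Absolute difference, truncated to a whole number of units.
    return int(abs((a - b).total_seconds()) // unit_seconds)

def minutes_delta(a, b): return _delta(a, b, 60)
def hours_delta(a, b):   return _delta(a, b, 3600)
def days_delta(a, b):    return _delta(a, b, 86400)

start = datetime(2015, 1, 28, 16, 28, 34)
end   = datetime(2015, 2, 28, 22, 25, 34)
print(days_delta(end, start))   # 31
print(hours_delta(end, start))  # 749
print(days_delta(start, end))   # 31 (positive even with arguments reversed)
```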

Rolling & Cumulative

Cloud9QL provides a set of rolling and cumulative operations: accumulate, growth, delta, simple moving average, cumulative moving average, and time moving average.

The standard use case is to compute an <operation> of a <value field> across a set of <dimension field(s)>, optionally grouping by a set of <grouping field(s)>.

<operation>(<value field>[, <grouping field(s)>]);

For example, compute the DELTA of Sent across Week grouping by Customer. In this example:

<operation>: DELTA
<value field>: Sent
<dimension field(s)>: Week
<grouping field(s)>: Customer


select Customer, delta(Sent, Customer) as SentDelta

There is one important restriction when using these Cloud9QL functions: the input data needs to be ordered by the <grouping field(s)> and <dimension field(s)>, in that order.


Accumulate

Creates cumulative totals for a field between records, given a sorted dataset.

accumulate(<value field>[, <grouping field(s)>]);

select accumulate(sent), date

The above example returns a cumulative sum of sent count for a pre-sorted date order.
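The running-total semantics can be sketched in Python (illustrative only; the sent values are made up):

```python
def accumulate(values):
    # Running total over a pre-sorted list, mimicking ACCUMULATE(<field>).
    total, out = 0, []
    for v in values:
        total += v
        out.append(total)
    return out

print(accumulate([10, 20, 5]))  # [10, 30, 35]
```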


Growth

Calculates a growth percentage for a field between records, for a sorted dataset.

growth(<value field>[, <grouping field(s)>]);

 select growth(sent), date

Delta

Calculates a difference for a field between records, for a sorted dataset.

delta(<value field>[, <grouping field(s)>]);

 select delta(sent), date
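The row-to-row semantics of growth and delta can be sketched in Python (an illustration under the same sorted-data assumption; the first row has no predecessor and yields no value here, though Cloud9QL's handling of the first row may differ):

```python
def delta(values):
    # Difference from the previous row; no predecessor for the first row.
    return [None if i == 0 else values[i] - values[i - 1]
            for i in range(len(values))]

def growth(values):
    # Percentage growth from the previous row.
    return [None if i == 0 else (values[i] - values[i - 1]) / values[i - 1] * 100
            for i in range(len(values))]

sent = [100, 150, 120]
print(delta(sent))   # [None, 50, -30]
print(growth(sent))  # [None, 50.0, -20.0]
```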

Simple Moving Average

Simple moving average based on a field and a window size. Assumes a sorted dataset.

SMA(<value field>, <window size>[, <grouping field(s)>]);

select sma(sent, 10)
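A trailing-window sketch of the simple moving average in Python (illustrative; for the first window-1 rows this version averages a shorter partial window, which may differ from Cloud9QL's treatment):

```python
def sma(values, window):
    # Simple moving average over a trailing window on pre-sorted data.
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

print(sma([1, 2, 3, 4, 5], 3))  # [1.0, 1.5, 2.0, 3.0, 4.0]
```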


Cumulative Moving Average

Cumulative moving average: returns the moving average of all data up to the current data point.

CMA(<value field>[, <grouping field(s)>]);

select cma(sent)
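Unlike the windowed SMA, the cumulative moving average incorporates every row seen so far; a Python sketch of the semantics (not Knowi's implementation):

```python
def cma(values):
    # Cumulative moving average: mean of everything up to the current row.
    total, out = 0, []
    for i, v in enumerate(values, start=1):
        total += v
        out.append(total / i)
    return out

print(cma([1, 2, 3, 4]))  # [1.0, 1.5, 2.0, 2.5]
```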


Time Moving Average

Time moving average based on a field, a date field, and a window time unit size. See Time Units for all available time units. Assumes a sorted dataset.

TMA(<value field>, <date field>, <time unit window>[, <grouping field(s)>]);

select tma(sent, date, 1w)

For more details on moving average definitions, see

String & Number Operators


Round

Specifies the number of decimal points to display.

ROUND(<field>, <decimal points>)

select round(sent,1)

Substring of a field value, given a starting index and a length.

SUBSTRING(<field to check against>, <start index>, <length>)

 select substring(Message Type,0,10)

Substring before the first occurrence of a delimiter for a field value.

SUBSTRING_BEFORE(<field to check against>, <delimiter>)

 select substring_before(Message Type,someDelim)

Substring after the first occurrence of a delimiter for a field value.

SUBSTRING_AFTER(<field to check against>, <delimiter>)

  select substring_after(Message Type,someDelim)

Concatenates multiple columns together. When a field name does not exist in the current dataset, a fixed string is used.

CONCAT(<field name>, <another field>, <yet another field>, ...)

 select concat(Customer, for Week of, Week)

Splits a string of elements separated by a separator into an array. If the separator is not specified, a comma is used.

SPLIT(<field name>, <separator>)

 select split(Customer, ",")

Joins the elements of an array value together, separated by a separator. When a field name does not exist in the current dataset, a fixed string is used.

ARRAY_TO_STRING(<field name>, <separator>)

 select array_to_string(Customer, ", ")

Upper cases a string

UPPER(<field name>)

 select upper(Customer)

Lower cases a string

LOWER(<field name>)

 select lower(Customer)

Removes leading and trailing spaces from a string

TRIM(<field name>)

 select trim(address)

Returns the length of a string.

LENGTH(<field name>)

 select length(address)

Formats a number to a locale-specific currency format. Defaults to the US currency format (en_US) if the locale is not specified.

CURRENCY_FORMAT(<field name>, <locale>)

CURRENCY_FORMAT(<field name>, <decimal points>)

CURRENCY_FORMAT(<field name>, <locale>, <decimal points>)

  select currency_format(revenue)

Example with Locale:

  select currency_format(revenue,en-GB)

Number Format

This function allows you to control the display of leading and trailing zeros, prefixes and suffixes, grouping (thousands) separators, and the decimal separator.

NUMBER_FORMAT(<field name>, <format pattern>)

select number_format(clicks,##,###.00) as Number of clicks

The pattern syntax is described in the following table. The value is the number (a double) to be formatted; the pattern is the string that specifies the formatting properties; the output, a string, represents the formatted number:

value pattern output explanation
123456.789 ###,###.### 123,456.789 The pound sign (#) denotes a digit, the comma is a placeholder for the grouping separator, and the period is a placeholder for the decimal separator.
123456.789 ###.## 123456.79 The value has three digits to the right of the decimal point, but the pattern has only two. The format method handles this by rounding up.
123.78 000000.000 000123.780 The pattern specifies leading and trailing zeros, because the 0 character is used instead of the pound sign (#).
12345.67 $###,###.### $12,345.67 The first character in the pattern is the dollar sign ($). Note that it immediately precedes the leftmost digit in the formatted output.
12345.67 \u00A5###,###.### ¥12,345.67 The pattern specifies the currency sign for Japanese yen (¥) with the Unicode value 00A5.

Replaces each substring of the field value that matches the given regular expression with the given replacement.

If the replacement parameter is not provided, an empty string "" is used as the default replacement.

REGEX_REPLACE(<field name>, <regex>)
REGEX_REPLACE(<field name>, <regex>, <replacement>)

For example, to replace all occurrences of white space in a string:

  select regex_replace('Morris Park Bake Shop', '\s') as regex_replaced; 
  ==> MorrisParkBakeShop

Extracts and returns all (non-overlapping) matches of the regular expression from the given input string field.

If there is no match, NULL is returned.

REGEX_EXTRACT(<field name>, <regex>, [<extract groups>])

For example, to extract all string occurrences between (and including) '|' characters:

  select regex_extract("|Morris Park| |Bake Shop|", "\|([^|]*)\|");
  ==> ["|Morris Park|","|Bake Shop|"]
  select regex_extract("|Morris Park| |Bake Shop|", "\|(([^|]*))\|", true); 
  ==> [ ["|Morris Park|","Morris Park"],["|Bake Shop|","Bake Shop"] ]
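The same two examples can be reproduced with Python's re module, which follows the same non-overlapping match semantics:

```python
import re

# REGEX_REPLACE with the default empty replacement:
print(re.sub(r"\s", "", "Morris Park Bake Shop"))
# MorrisParkBakeShop

# REGEX_EXTRACT: all non-overlapping matches of the pattern:
print(re.findall(r"\|[^|]*\|", "|Morris Park| |Bake Shop|"))
# ['|Morris Park|', '|Bake Shop|']

# With the extract-groups flag: full match plus each capture group:
print([[m.group(0), m.group(1)]
       for m in re.finditer(r"\|([^|]*)\|", "|Morris Park| |Bake Shop|")])
# [['|Morris Park|', 'Morris Park'], ['|Bake Shop|', 'Bake Shop']]
```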

Returns an alternate value when the specified field does not exist or its value is NULL.

IFNULL(<field name>, <alternate value>)

  select IFNULL(CustomerName, "N/A")

You can also specify an alternate column in place of an alternate value

IFNULL(<field name>, <another field name>)

  select IFNULL(CustomerName, CustomerId)



CASE WHEN statements provide great flexibility when dealing with buckets of results or when you need to filter out certain results. Another way to think of it: it is conditional logic similar to IF-THEN statements in other programming languages.

When using a CASE WHEN statement, you need the CASE keyword, a condition, a result for when that condition is met, and an END clause. A simple example is below:

CASE
    WHEN condition
        THEN result
    ELSE other_result
END

For example,

CASE
      WHEN country = 'USA'
        THEN 'North America'
      WHEN country = 'Mexico'
        THEN 'North America'
      ELSE country
END AS 'West Region'
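The CASE WHEN example above maps directly onto if/elif/else logic, as this Python sketch shows:

```python
def west_region(country):
    # Same branching as the CASE WHEN ... END AS 'West Region' example.
    if country == 'USA':
        return 'North America'
    elif country == 'Mexico':
        return 'North America'
    else:
        return country

print(west_region('USA'))     # North America
print(west_region('Canada'))  # Canada
```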

Percentile

Returns the value of the field for the specified percentile rank.

PERCENTILE(<field>, <percentile>)

select percentile(sent,75)
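Percentile definitions vary; a common nearest-rank version can be sketched in Python (Cloud9QL's exact interpolation rule may differ):

```python
import math

def percentile(values, pct):
    # Nearest-rank percentile: the smallest value with at least
    # pct% of the data at or below it.
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

print(percentile([15, 20, 35, 40, 50], 75))  # 40
```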

Transpose

Pivots row values for a field into columns.

TRANSPOSE(<field to transpose>, <current column name>)

select transpose(Message Type,Sent)

To collapse based on a key field, use:

TRANSPOSE(<field to transpose>, <current column name>, <key column name>)

select transpose(Message Type, Sent, Customer)
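The pivoting behavior with a key column can be sketched in Python (hypothetical sample rows; not Knowi's implementation):

```python
def transpose(rows, pivot_field, value_field, key_field):
    # Pivot `pivot_field` values into columns, one output row per key.
    out = {}
    for row in rows:
        rec = out.setdefault(row[key_field], {key_field: row[key_field]})
        rec[row[pivot_field]] = row[value_field]
    return list(out.values())

rows = [
    {"Customer": "A", "Message Type": "Transactional", "Sent": 10},
    {"Customer": "A", "Message Type": "Marketing", "Sent": 4},
    {"Customer": "B", "Message Type": "Transactional", "Sent": 7},
]
print(transpose(rows, "Message Type", "Sent", "Customer"))
# [{'Customer': 'A', 'Transactional': 10, 'Marketing': 4},
#  {'Customer': 'B', 'Transactional': 7}]
```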


Reverse Transpose

Opposite of Transpose: folds columns into rows.

REVERSE_TRANSPOSE(<New ID column>, <New Value column>, <Value column 1>, ..., <Value column N>)


For example, given initial data with five columns, executing select reverse_transpose(NEW_ID, NEW_V, V1, V2, V3) folds the value columns into new rows, where NEW_ID holds the old value column name (V1, V2, or V3, one per new row) and NEW_V contains the corresponding value.

When you have multiple columns to fold in, specify the columns to pin at the front, followed by the new ID and value columns, followed by a *.


REVERSE_TRANSPOSE(Customer, Campaign, State, <New ID column>, <New Value column>, *)

This folds in all the columns except those listed at the front.
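The fold/unpivot behavior can be sketched in Python (using the hypothetical NEW_ID/NEW_V column names from the example above):

```python
def reverse_transpose(rows, pin_cols, id_col, value_col):
    # Fold every non-pinned column into (id, value) rows: an unpivot/melt.
    out = []
    for row in rows:
        for col, val in row.items():
            if col in pin_cols:
                continue
            rec = {c: row[c] for c in pin_cols}
            rec[id_col] = col
            rec[value_col] = val
            out.append(rec)
    return out

rows = [{"Customer": "A", "V1": 1, "V2": 2, "V3": 3}]
print(reverse_transpose(rows, ["Customer"], "NEW_ID", "NEW_V"))
# [{'Customer': 'A', 'NEW_ID': 'V1', 'NEW_V': 1},
#  {'Customer': 'A', 'NEW_ID': 'V2', 'NEW_V': 2},
#  {'Customer': 'A', 'NEW_ID': 'V3', 'NEW_V': 3}]
```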


Inject

Injects last-value records into a date range when values are not present for those dates.

For example, if a sensor emits a data point of 100 on 01/01/2016 and the next change of value is 200 ten days later, you can use the inject function to fill 100 into all the dates in between.

INJECT(<Date Field>, <Start Date for Injecting>, <End Date for Injecting>, <Injection Frequency> [, <Select Fields>])
[group by <Dimension 1>[, ..., <Dimension N>]]

The optional <Select Fields> can either be * (for all fields) or a comma-separated list of selected fields from the input data.

select inject(date, start_range_date, end_range_date, 1d, Name, Division, Score)
group by Name, Division
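The gap-filling behavior for a daily frequency can be sketched in Python (illustrative only; the sensor readings are the hypothetical values from the example above):

```python
from datetime import date, timedelta

def inject(readings, start, end):
    # Carry the last seen value forward for each day in [start, end],
    # mimicking INJECT(date, start, end, 1d).
    out, last = [], None
    d = start
    while d <= end:
        if d in readings:
            last = readings[d]
        out.append((d, last))
        d += timedelta(days=1)
    return out

readings = {date(2016, 1, 1): 100, date(2016, 1, 11): 200}
filled = inject(readings, date(2016, 1, 1), date(2016, 1, 11))
print(filled[5])   # (datetime.date(2016, 1, 6), 100)
print(filled[-1])  # (datetime.date(2016, 1, 11), 200)
```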

Append

Combines the results of two queries into the same dataset.

(select sum(sent) as Facebook where customer=Facebook) append (select sum(sent) as LinkedIn Sent where customer=Linkedin)


Nested Objects

Use dot notation to query nested elements and array notation to select items in an array. The example below takes a nested JSON string and uses Cloud9QL to parse it.

To unwind/expand an array, use the expand syntax.


select customer, nestedObj.secondLevel.y as Nested;
select expand(Nested);

Note that expand must be specified on its own, without any other elements within the select.

Chaining Statements

Multiple statements can be chained one after another using a semi-colon delimiter, where the results of the first statement are passed into the second, and so on.


select sum(sent) as Sent, sum(opened) as opened, customer, month(date) as Month group by customer, month(date);
select (opened/sent)*100 as Open Rate, Opened, Sent, Customer, Month;
select round(Open Rate,1) as Open Rate, Opened, Sent, Customer, Month where Month > $c9_thisyear+2m and Open Rate > 20 order by Open Rate asc


The chained query above:

a) gets the Total Sent and Opened counts on a monthly basis for each customer,

b) then calculates the open rate based on the data from the previous step, since March,

c) then adds additional criteria, along with rounding the open rate.

IP to Geo-Location

The IP to Geo function enables city-level geolocation from IP addresses.

Note: This uses GeoLite2 data created by MaxMind.


select ip_to_geo(IP), Some Field


  • Queries the MaxMind database to determine location fields.
  • The fields are added as separate columns to the result.

Geocoding - Lat/Long from Address

Retrieves lat/longs from addresses using

Note: This requires your own API key from


select geocode(<fullAddress>,<apiKey>)


  • Issues a batch geocoding request to
  • The fields are added as separate columns to the result.


Prediction

Cloud9QL can forecast/predict future data points for any dataset, based on its previous data history.

To predict 1 point in the future:

select PREDICT(<field to predict>, <date field>, <start date for prediction>[, <prediction model>])

To predict multiple data points in the future:

select PREDICT(<Field to Predict>, <Date Field>, <Start Date for Prediction>,
    <Prediction Frequency>, <Prediction Data Points>[, <prediction model>])

To predict multiple data points in the future based on secondary dimension(s) (i.e., grouping(s)):

select PREDICT(<Field to Predict>, <Date Field>, <Start Date for Prediction>,
    <Prediction Frequency>, <Prediction Data Points>[, <prediction model>])
group by <Dimension 1>, ..., <Dimension N>


  • Loads the data points from input data.
  • Determines the optimum prediction model.
  • Predicts future data point(s).
  • <prediction frequency> is in the format <Number><TimeUnit>; for example, 1d means daily and 2m means every 2 months.
  • <prediction model> is optional; you can choose a specific model from one of the following supported models (case-sensitive):
    • Regression
    • PolynomialRegression
    • MultipleLinearRegression



Cohorts

Cohorts are useful for determining a grouping of data across a date dimension. For example, how many users remain active n months after initial signup.

We currently support 2 types of input data.

select COHORT(<Date Column>, <Cohort Date Definition>, <Cohort Period Definition>), <Cohort Operation>
group by <Cohort Date>, <Cohort Period>


  1. Input data needs to be sorted by Date in ascending order.

  2. Cohort Period returns a number (ie: the period) or a date. Example:

    a. 1m: Cohort Period as number

    b. (1m): Cohort Period as Date

Example 1: If we already have the cohort date populated

Example 2: If we only have transactional events like the following example:

Example 3: Cohorts can be used in combination with transpose to flatten the result based on date

Example 4: A common cohort is retention in percentage format which can be computed as follows:


Expand

Unwinds nested field values: this function expands an array, map, or array-of-maps data structure into rows.

select EXPAND(<Column with Nested Value>)

Example: here, the Name field's value is an array of maps containing Last Name and First Name.

Note that this function must be used in isolation, i.e., it cannot be combined with other select elements. Use query chaining to manipulate the results:

select expand(Name);
select * where First Name like John


Parse

If your data is a JSON string, the PARSE function can be used to convert it into an object that can be further manipulated and processed.

select PARSE(<JSON String Column>)

Example: here, the Name field's value is an array of maps containing Last Name and First Name.


Encrypt/Decrypt

If your data needs to be encrypted/obfuscated for storage, you can use the ENCRYPT and DECRYPT functions.

select ENCRYPT(<Column>)
select DECRYPT(<Column>)