delika SQL Reference
The delika SQL is a SQL-like language to process data on delika platform. The main differences with the standard SQL are:
- only SELECT statement is supporeted
- table name has special format like
[account/dataset/data.extension]and is only available in FROM clauses with alias name
- duration literal in ISO 8601 format (see below)
The duration literal can be written as a string in the ISO-8601 format with the d prefix. Due to technical limitations, it is not possible to mix the precision of year and month with that of day, hour, minute and second. Only a precision of seconds is allowed for decimals, and all other precision is for integers only.
d'P1Y2M' (1 year and 2 months),
d'P1DT2H3M4.5S' (1 day, 2 hours, 3 minutes and 4.5 seconds),
d'PT10M' (10 minutes)
The delika SQL allows users to query against data at specific version or time. If users do not specify the version or time, it indicates users specify the latest version of data.
Snapshot by Version
Users can specify the version of data in the format of
Snapshot by Time
Users can specify the time of data in the format of
<YYYY>-<MM>-<DD>T<hh>:<mm>:<ss>±<hh>:<mm> (offset can be
Z for UTC).
Users cannot refer to individual files in a dataset. Users can refer to one or more files in a partition in a format like
[account/dataset/partition.extension]. For example, if a dataset is partitioned by year, queries look like
SELECT * FROM [account/dataset/2021.csv].
Users can also query multiple tables using wildcard like
[account/dataset/202101*.csv]. If users would like to specify partitions in detail, they should use partition column in their queries.
SELECT this_year.date AS this_date, last_year.date AS last_date, this_year.value AS this_value, last_year.value AS last_value FROM [account/collection/2021.csv] this_year FULL OUTER JOIN [account/collection/2020.csv] last_year ON last_year.date + d'P1Y' = this_year.date
Single line comment (
--) and multiline comment (
/* */) are supported. Multiline comment can be nested.
Modules and Functions
A module is a collection of functions.
A function takes the following form:
A window funtion takes the following form:
PARTITION BY and
ORDER BY can be omitted if unnecessary.
Some functions can be called without their module name. Such functions are called global functions.
See List of Modules and Functions in delika SQL to look up functions.