Dataset
A dataset is a collection of data.
Dataset Types
- "Data File" is a dataset type that can contain files with different schemas.
- "Data Collection" is a dataset type whose files share a single common schema, so it can be handled like a single file.
The table name syntax in delika SQL depends on the dataset type. See delika SQL for details.
Permission Levels
The following table shows which actions are available at each permission level.
Action | Owner | Editor | Reader | Logged-in User | Anonymous |
---|---|---|---|---|---|
read public dataset | ✔ | ✔ | ✔ | ✔ | ✔ |
read private dataset | ✔ | ✔ | ✔ | | |
update dataset description | ✔ | ✔ | | | |
create data in dataset | ✔ | ✔ | | | |
update data in dataset | ✔ | ✔ | | | |
delete data in dataset | ✔ | ✔ | | | |
delete dataset | ✔ | | | | |
set dataset permissions for users | ✔ | | | | |
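The table above can be read as a lookup from action to the set of roles allowed to perform it. The following is a minimal sketch of that reading, not delika's API: the `PERMISSIONS` mapping, its snake_case action keys, and the `is_allowed` helper are all hypothetical names introduced for illustration.

```python
# Hypothetical in-memory copy of the permission table; not part of delika.
# Keys are illustrative action names, values are the roles marked with a check.
PERMISSIONS = {
    "read_public": {"owner", "editor", "reader", "logged-in user", "anonymous"},
    "read_private": {"owner", "editor", "reader"},
    "update_description": {"owner", "editor"},
    "create_data": {"owner", "editor"},
    "update_data": {"owner", "editor"},
    "delete_data": {"owner", "editor"},
    "delete_dataset": {"owner"},
    "set_permissions": {"owner"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the table grants `action` to `role`.

    Unknown actions are treated as not allowed (an assumption of this sketch).
    """
    return role.lower() in PERMISSIONS.get(action, set())
```

For example, `is_allowed("Editor", "update_data")` is true, while `is_allowed("Reader", "delete_dataset")` is false.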
License
The data provider may attach a license to the data they provide.
The license text can contain the following YAML front matter:
---
License: <license_name>
Author: <author_name>
---
If <license_name> is one of the following, it is converted into the appropriate tags:
- CC BY 4.0
- CC BY-NC 4.0
- CC BY-SA 4.0
- CC BY-NC-SA 4.0
- CC0 1.0
Example:
---
License: CC BY 4.0
Author: Example Company
---
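To illustrate how such front matter might be read, here is a minimal sketch that extracts the `License` and `Author` fields and checks the license name against the list above. It uses only the standard library and handles just simple `Key: Value` lines; `parse_front_matter` and `RECOGNIZED_LICENSES` are hypothetical names, and how delika itself parses the text or generates tags is not described here.

```python
# License names listed in this document as being converted into tags.
RECOGNIZED_LICENSES = {
    "CC BY 4.0",
    "CC BY-NC 4.0",
    "CC BY-SA 4.0",
    "CC BY-NC-SA 4.0",
    "CC0 1.0",
}


def parse_front_matter(text: str) -> dict[str, str]:
    """Return the `Key: Value` pairs found between the leading `---` lines.

    Returns an empty dict when the text does not start with `---` or the
    closing `---` is missing (an assumption of this sketch).
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}


license_text = """---
License: CC BY 4.0
Author: Example Company
---
Free to reuse with attribution.
"""

fields = parse_front_matter(license_text)
is_recognized = fields.get("License") in RECOGNIZED_LICENSES
```

A full YAML parser would be needed for anything beyond flat key-value pairs; this sketch deliberately avoids that to stay dependency-free.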
Data Collection
Partition
File names in a data collection must follow the format for the collection's partition unit.
Partition Unit | File Name Format |
---|---|
year | yyyy__<suffix>.extension |
month | yyyyMM__<suffix>.extension |
day | yyyyMMdd__<suffix>.extension |
hour | yyyyMMddHH__<suffix>.extension |
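The formats above can be checked before upload. The sketch below is one way to do that under stated assumptions: the suffix and the extension are both non-empty, the extension contains no dot, and the leading digits must form a real date. The helper name `is_valid_file_name` and the `_UNITS` table are hypothetical, and delika's own validation rules may differ.

```python
import re
from datetime import datetime

# Partition unit -> (number of leading digits, strptime format for them).
_UNITS = {
    "year": (4, "%Y"),
    "month": (6, "%Y%m"),
    "day": (8, "%Y%m%d"),
    "hour": (10, "%Y%m%d%H"),
}


def is_valid_file_name(name: str, unit: str) -> bool:
    """Check `name` against `<timestamp>__<suffix>.<extension>` for `unit`."""
    digits, fmt = _UNITS[unit]
    # Exactly `digits` digits, a double underscore, a suffix, a dot, an extension.
    match = re.fullmatch(rf"(\d{{{digits}}})__(.+)\.([^.]+)", name)
    if match is None:
        return False
    try:
        # Reject impossible values such as month 13 or hour 25.
        datetime.strptime(match.group(1), fmt)
    except ValueError:
        return False
    return True
```

For example, `is_valid_file_name("202401__sales.csv", "month")` is true, while `"202413__sales.csv"` fails for the same unit because 13 is not a valid month.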
Restriction
Files in a data collection cannot be overwritten. If you would like to replace a file, you must delete it first and then upload a new file.