Warning
Documentation website under active development. This is not a stable release.
Data Structures#
Files used in plans are related to the following two primitive data structures:
A Table can store a frame of data in rows and columns in a single file. A Raster can store a map in a matrix/grid of numbers in a single file.
Input files must be formatted in the standard way, otherwise the tool will not work. The standards are meant to be simple and user-friendly.
Using open-source applications
Open-source applications like LibreOffice and QGIS are very convenient for fitting data into plans standards.
Table#
A Table in plans is a frame of data defined by rows and columns. Usually, the first row stores the names of the fields and the subsequent rows store the data itself.
Structure rules
[required] file extension: .csv
[required] column separator: semicolon (;)
[required] first row stores field names
[required] decimal separator for numbers: period (.)
[required] no-data convention: empty cell
Example
In the following table, id is an integer number (int) field, ndvi_mean is a real number (float) field and the remaining are text (str) fields.
id; name; alias; color; ndvi_mean
1; Water; W; blue; -0.9
2; Forest; F; green; 0.87
3; Crops; C; magenta; 0.42
4; Pasture; P; orange; 0.76
5; Urban; U; #9AA7A3; 0.24
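The structure rules above can be exercised with a few lines of Python. The following is a minimal sketch using only the standard library; the function name read_table and the sample text are illustrative assumptions, not part of plans.

```python
import csv
import io

# Sample text following the Table rules: ";" separator, "." decimals,
# first row with field names, empty cell for no data.
TABLE_TEXT = """id; name; alias; color; ndvi_mean
1; Water; W; blue; -0.9
2; Forest; F; green; 0.87
3; Crops; C; magenta;
"""


def read_table(text):
    """Parse semicolon-separated text into a list of row dictionaries (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text), delimiter=";", skipinitialspace=True)
    rows = [[cell.strip() for cell in row] for row in reader if row]
    header, records = rows[0], rows[1:]
    # Empty cells follow the no-data convention and become None.
    return [
        {name: (cell if cell != "" else None) for name, cell in zip(header, record)}
        for record in records
    ]


table = read_table(TABLE_TEXT)
print(table[0]["name"])       # Water
print(table[2]["ndvi_mean"])  # None
```

All values are kept as text here; converting each field to int or float is left out to keep the sketch short.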
plans is case-sensitive
Upper case and lower case matter. Name is different from name.
Column name standards
Field/column names may also follow standards.
Data types
See Data Types section for more references on field formatting.
Information Table#
An Information Table is a special kind of Table that stores field information in a listed format.
Structure rules
[required] file name signature: {filename}_info
[required] file extension: .csv
[required] column separator: semicolon (;)
[required] first row stores field names
[required] decimal separator for numbers: period (.)
[required] no-data convention: empty cell
Basic required fields
Name | Description | Data Type | Units |
---|---|---|---|
field | Name of field | | unitless |
value | Value set for field | | n.a. |
Extra required fields
Extra required fields may also be needed, depending on each input file.
Required horizontal fields
A required horizontal field in an Information Table is an expected row with set values for field and value.
Example
field; value
name; Mill Creek Model
alias; MCM-M001
color; blue
source; Ipo
description; Rainfall-Runoff model
file_parameters;
folder_data; ./data/inputs
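Because each row of an Information Table pairs a field with a value, the file maps naturally onto a dictionary. The sketch below assumes this reading; read_info and missing_fields are hypothetical helper names, and the list of required rows passed to missing_fields is only an illustration.

```python
import csv
import io

INFO_TEXT = """field; value
name; Mill Creek Model
alias; MCM-M001
file_parameters;
folder_data; ./data/inputs
"""


def read_info(text):
    """Return a {field: value} dictionary from Information Table text."""
    reader = csv.reader(io.StringIO(text), delimiter=";", skipinitialspace=True)
    rows = [[cell.strip() for cell in row] for row in reader if row]
    if rows[0] != ["field", "value"]:
        raise ValueError(f"unexpected header: {rows[0]}")
    # Empty values follow the no-data convention and become None.
    return {field: (value or None) for field, value in rows[1:]}


def missing_fields(info, required):
    """List the required horizontal fields that are absent from the table."""
    return [name for name in required if name not in info]


info = read_info(INFO_TEXT)
print(info["alias"])                              # MCM-M001
print(info["file_parameters"])                    # None
print(missing_fields(info, ["name", "source"]))   # ['source']
```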
Attribute Table#
An Attribute Table is a special kind of Table that stores extra information about Raster maps. Each column represents a field that must be homogeneous. This means that each field stores the same data type.
Structure rules
[required] file name signature: {filename}_attributes
[required] file extension: .csv
[required] column separator: semicolon (;)
[required] first row stores field names
[required] decimal separator for numbers: period (.)
[required] no-data convention: empty cell
[required] homogeneous data type in each column
Basic required fields
Name | Description | Data Type | Units |
---|---|---|---|
id | Unique numeric code | int | index |
name | Unique name | str | n.a. |
alias | Unique short nickname or label | str | n.a. |
color | Color HEX code or name available in Matplotlib | str | n.a. |
Extra required fields
Extra required fields may also be needed, depending on each input file.
Example
id; name; alias; color; ndvi_mean
1; Water; W; blue; -0.9
2; Forest; F; green; 0.87
3; Crops; C; magenta; 0.42
4; Pasture; P; orange; 0.76
5; Urban; U; #9AA7A3; 0.24
Add non-required fields
Any fields (columns) other than the required ones will be ignored, so you can add convenient and useful non-required fields. For instance, here a description text field was added to hold more information about each land use class:
id; name; alias; color; ndvi_mean; description
1; Water; W; blue; -0.9; Lakes, rivers and ocean
2; Forest; F; green; 0.87; Forests (natural and cultivated)
3; Crops; C; magenta; 0.42; Conventional annual crops
4; Pasture; P; orange; 0.76; Conventional pasture and grasslands
5; Urban; U; #9AA7A3; 0.24; Developed areas
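A quick pre-flight check can confirm that an Attribute Table carries the basic required fields and that no column mixes text with numbers. This is a minimal sketch under the assumption that the basic required fields are id, name, alias and color, as in the example; check_attributes and kind are hypothetical helpers, not plans functions.

```python
REQUIRED_FIELDS = ("id", "name", "alias", "color")


def kind(cell):
    """Classify a non-empty text cell as 'int', 'float' or 'str'."""
    for label, cast in (("int", int), ("float", float)):
        try:
            cast(cell)
            return label
        except ValueError:
            continue
    return "str"


def check_attributes(header, records):
    """Return (missing required fields, columns mixing text with numbers)."""
    missing = [name for name in REQUIRED_FIELDS if name not in header]
    mixed = []
    for index, name in enumerate(header):
        # Empty cells are no-data and do not count toward the column type.
        kinds = {kind(row[index]) for row in records if row[index] != ""}
        # int and float may share a numeric column; text mixed with numbers is flagged.
        if "str" in kinds and len(kinds) > 1:
            mixed.append(name)
    return missing, mixed


header = ["id", "name", "alias", "color", "ndvi_mean", "description"]
records = [
    ["1", "Water", "W", "blue", "-0.9", "Lakes, rivers and ocean"],
    ["2", "Forest", "F", "green", "0.87", "Forests (natural and cultivated)"],
]
print(check_attributes(header, records))  # ([], [])
```

The extra description column passes the check untouched, which mirrors the rule that non-required fields are simply ignored.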
Time Series#
A Time Series in plans is a special kind of Table file that must have a datetime text field (preferably in the first column).
Structure rules
[required] file name signature: {filename}_series[_optional-suffix]
[required] file extension: .csv
[required] column separator: semicolon (;)
[required] first row stores field names
[required] decimal separator for numbers: period (.)
[required] no-data convention: empty cell
[required] homogeneous data type in each column
[required] datetime text field (preferably in the first column)
[recommended] datetime formatted in ISO 8601: yyyy-mm-dd HH:MM:SS.S
[recommended] homogeneous datetime frequency
[recommended] no gaps or voids in data
Basic required fields
Name | Description | Data Type | Units |
---|---|---|---|
datetime | Date and time in ISO 8601: yyyy-mm-dd HH:MM:SS.S | str | datetime |
Extra required fields
Extra required fields may also be needed, depending on each input file.
Variable fields
Fields other than datetime generally store the state of variables like precipitation (ppt) and surface air temperature (tas).
Datetime frequency
A Time Series should also have a homogeneous datetime frequency. Recommended frequencies:
15 minutes
20 minutes
30 minutes
Hourly
Daily
Shorter and longer frequencies
Frequencies shorter than 15 minutes are not recommended because of processing performance. Frequencies longer than 1 day are not recommended because hydrological processes are no longer represented effectively.
Example
Time Series files tend to have a large number of rows. The first 10 rows of a daily Time Series file look like this:
datetime; ppt; tas
2020-01-01 00:00:00.000; 0.0; 20.1
2020-01-02 00:00:00.000; 5.1; 24.3
2020-01-03 00:00:00.000; 0.0; 25.8
2020-01-04 00:00:00.000; 12.9; 21.4
2020-01-05 00:00:00.000; 0.0; 21.5
2020-01-06 00:00:00.000; 0.0; 23.6
2020-01-07 00:00:00.000; 8.6; 20.6
2020-01-08 00:00:00.000; 4.7; 28.3
2020-01-09 00:00:00.000; 0.0; 27.1
Automatic fill of time information
During processing, plans will fill in the time information (hours, minutes and seconds) if only the date (year, month and day) is passed, as in the example above.
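To illustrate the idea of filling in the time and checking for a homogeneous frequency, here is a minimal sketch based on Python's datetime module. It shows one possible approach and does not reproduce how plans itself parses dates; parse_datetime and frequency are hypothetical names.

```python
from datetime import datetime


def parse_datetime(text):
    """Parse ISO 8601 text; a date-only value gets the time 00:00:00."""
    return datetime.fromisoformat(text.strip())


def frequency(stamps):
    """Return the single time step of the series, or None if the steps differ."""
    steps = {later - earlier for earlier, later in zip(stamps, stamps[1:])}
    return steps.pop() if len(steps) == 1 else None


texts = ("2020-01-01", "2020-01-02", "2020-01-03 00:00:00.000")
stamps = [parse_datetime(t) for t in texts]
print(stamps[0])          # 2020-01-01 00:00:00
print(frequency(stamps))  # 1 day, 0:00:00
```

A result of None from frequency signals gaps or an irregular time step, which is worth fixing before the file is used.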
Small gaps and voids in Time Series
plans will try to fill or interpolate small gaps and voids in a given Time Series. However, be aware that this may cause unnoticed impacts on model outputs. A best practice is to interpolate and fill voids prior to processing so users can understand what is going on.
For instance, consider the following Time Series that has a gap (missing Jan/3 and Jan/4 dates) and a void for ppt in Jan/8:
datetime; ppt; tas
2020-01-01 00:00:00.000; 0.0; 20.1
2020-01-02 00:00:00.000; 5.1; 24.3
2020-01-05 00:00:00.000; 0.0; 21.5
2020-01-06 00:00:00.000; 0.0; 23.6
2020-01-07 00:00:00.000; 8.6; 20.6
2020-01-08 00:00:00.000; ; 28.3
2020-01-09 00:00:00.000; 0.0; 27.1
In this case, plans would interpolate the temperature tas and fill the precipitation ppt with 0:
datetime; ppt; tas
2020-01-01 00:00:00.000; 0.0; 20.1
2020-01-02 00:00:00.000; 5.1; 24.3
2020-01-03 00:00:00.000; 0.0; 23.3
2020-01-04 00:00:00.000; 0.0; 22.4
2020-01-05 00:00:00.000; 0.0; 21.5
2020-01-06 00:00:00.000; 0.0; 23.6
2020-01-07 00:00:00.000; 8.6; 20.6
2020-01-08 00:00:00.000; 0.0; 28.3
2020-01-09 00:00:00.000; 0.0; 27.1
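Users who prefer to fill gaps themselves before processing could follow a similar strategy. The sketch below assumes a daily series, zero-filling for ppt and plain linear interpolation for tas; the method actually used by plans is not stated here, so interpolated values need not match the example above digit for digit. The function fill_daily is hypothetical.

```python
from datetime import datetime, timedelta


def fill_daily(rows):
    """Fill gaps and voids in a daily series.

    rows: list of (datetime, ppt, tas) sorted by date; None marks a void.
    Assumes the first and last rows have a known tas value.
    """
    known = {stamp: (ppt, tas) for stamp, ppt, tas in rows}
    days = (rows[-1][0] - rows[0][0]).days
    stamps = [rows[0][0] + timedelta(days=i) for i in range(days + 1)]
    tas_points = [(s, known[s][1]) for s in stamps if s in known and known[s][1] is not None]
    filled = []
    for stamp in stamps:
        ppt, tas = known.get(stamp, (None, None))
        if ppt is None:
            ppt = 0.0  # assumption: missing precipitation is treated as zero
        if tas is None:
            # Linear interpolation between the nearest known neighbours.
            before = max(p for p in tas_points if p[0] < stamp)
            after = min(p for p in tas_points if p[0] > stamp)
            weight = (stamp - before[0]) / (after[0] - before[0])
            tas = before[1] + weight * (after[1] - before[1])
        filled.append((stamp, ppt, round(tas, 1)))
    return filled


rows = [
    (datetime(2020, 1, 1), 0.0, 20.1),
    (datetime(2020, 1, 2), 5.1, 24.3),
    (datetime(2020, 1, 5), 0.0, 21.5),   # Jan/3 and Jan/4 are missing (gap)
    (datetime(2020, 1, 6), None, 23.6),  # ppt void
]
filled = fill_daily(rows)
for stamp, ppt, tas in filled:
    print(stamp.date(), ppt, tas)
```

Running the fill explicitly, and inspecting its output, makes the effect on the data visible instead of leaving it to automatic processing.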
Raster#
A Raster in plans is a map of data defined by a matrix or grid of cells storing numbers (int or float) and encoded in a way that it can be georeferenced in a given Coordinate Reference System (CRS).
Structure rules
Rule set for a single file:
[required] GeoTIFF file with .tif extension
[recommended] projected CRS so all cells are measured in meters
Rule set for multiple files:
[required] files are aligned to the same spatial extent
[required] files are aligned to the same spatial resolution
Raster grid shape must be the same
The rule set for multiple files implies that all Raster files in a given project must share the same grid shape (number of rows and columns).
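The alignment rules can be expressed as a simple comparison of grid descriptions. In the sketch below, Grid is a hypothetical summary of a raster and not a plans or Rasterio class; its values would have to be read from each GeoTIFF's metadata beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    """Hypothetical description of a raster grid."""
    rows: int
    cols: int
    cell_size: float  # spatial resolution, in CRS units
    x_min: float      # left edge of the extent
    y_max: float      # top edge of the extent


def same_grid(grids, tolerance=1e-6):
    """True if all grids share shape, resolution and upper-left corner."""
    first = grids[0]
    return all(
        g.rows == first.rows
        and g.cols == first.cols
        and abs(g.cell_size - first.cell_size) <= tolerance
        and abs(g.x_min - first.x_min) <= tolerance
        and abs(g.y_max - first.y_max) <= tolerance
        for g in grids[1:]
    )


dem = Grid(rows=100, cols=200, cell_size=30.0, x_min=500000.0, y_max=7000000.0)
lulc = Grid(rows=100, cols=200, cell_size=30.0, x_min=500000.0, y_max=7000000.0)
soils = Grid(rows=100, cols=201, cell_size=30.0, x_min=500000.0, y_max=7000000.0)
print(same_grid([dem, lulc]))         # True
print(same_grid([dem, lulc, soils]))  # False
```

Matching shape, resolution and upper-left corner together imply the same spatial extent, so a single check covers both rules for multiple files.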
GeoTIFF file#
The GeoTIFF file is the standard Raster file in plans. This is a well-known raster file format distributed by most dataset providers. The advantage of GeoTIFF is that it stores data and metadata together in the same file. plans parses GeoTIFF files using the Rasterio library.
GDAL reference
More details about the GeoTIFF file are given in the GDAL documentation.
Time Raster#
A Time Raster in plans is a special kind of Raster file in which the data refers to a snapshot of the timeline.
Structure rules
Rule set for a single file:
[required] GeoTIFF file with .tif extension
[required] file name signature: {filename}_{date}
[recommended] projected CRS so all cells are measured in meters
Rule set for multiple files:
[required] files are aligned to the same spatial extent
[required] files are aligned to the same spatial resolution
Example
For instance, Land Use Land Cover is a kind of spatial data that may require many Time Raster files:
{folder}/
├── lulc_2020-01-01.tif # Raster - Land Use in 2020
├── lulc_2021-01-01.tif # Raster - Land Use in 2021
└── lulc_2022-01-01.tif # Raster - Land Use in 2022
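The {filename}_{date} signature makes it straightforward to recover the date of each map from its file name. The sketch below assumes the date is written as yyyy-mm-dd, as in the listing above; split_time_raster_name is a hypothetical helper.

```python
from datetime import date
from pathlib import Path


def split_time_raster_name(path):
    """Split '{filename}_{date}.tif' into (filename, date)."""
    stem = Path(path).stem                 # e.g. 'lulc_2020-01-01'
    name, _, stamp = stem.rpartition("_")  # the date is the last '_' token
    if not name:
        raise ValueError(f"no date suffix in {path!r}")
    return name, date.fromisoformat(stamp)  # raises ValueError on a bad date


files = ["lulc_2022-01-01.tif", "lulc_2020-01-01.tif", "lulc_2021-01-01.tif"]
timeline = sorted(split_time_raster_name(f) for f in files)
for name, day in timeline:
    print(name, day)
```

Sorting the parsed pairs puts the maps in chronological order regardless of how the folder is listed.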
Quali Raster#
A Quali Raster in plans is a special kind of Raster file in which data is qualitative (classes or ids), and an auxiliary Attribute Table must be provided.
Structure rules
Rule set for a single file:
[required] GeoTIFF file with .tif extension
[required] an auxiliary Attribute Table with the same name as the GeoTIFF
[recommended] projected CRS so all cells are measured in meters
Rule set for multiple files:
[required] files are aligned to the same spatial extent
[required] files are aligned to the same spatial resolution
Example
For instance, a Quali Raster for Land Use Land Cover only stores the id code for each land use class. More information and parameters must be stored in the auxiliary Attribute Table.
{folder}/
├── lulc_2020-01-01.tif # Raster - Land Use in 2020
└── lulc_attributes.csv # <-- Attribute Table
One Attribute Table can feed multiple maps
The same Attribute Table file can supply the information required by multiple Raster maps.
For instance, consider a set of 3 Land Use Land Cover maps for different years.
They can all use the same Attribute Table file:
{folder}/
├── lulc_2020-01-01.tif # <-- multiple Rasters
├── lulc_2021-01-01.tif
├── lulc_2022-01-01.tif
└── lulc_attributes.csv # <-- single Attribute Table
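Following the folder layout above, the Attribute Table for a map can be located from the raster file name alone. This sketch assumes that a trailing yyyy-mm-dd date is dropped before the _attributes suffix is added, which matches the listing but is not stated as a general rule; attribute_table_for is a hypothetical helper.

```python
from pathlib import Path


def attribute_table_for(raster_path):
    """Guess the Attribute Table path for a Quali Raster (naming is an assumption)."""
    path = Path(raster_path)
    stem = path.stem
    head, _, tail = stem.rpartition("_")
    # Drop a trailing date token such as '2020-01-01' when present.
    looks_like_date = len(tail) == 10 and tail.replace("-", "").isdigit()
    base = head if head and looks_like_date else stem
    return path.with_name(f"{base}_attributes.csv")


for raster in ("lulc_2020-01-01.tif", "lulc_2021-01-01.tif", "soils.tif"):
    print(raster, "->", attribute_table_for(raster).name)
```

Both dated lulc maps resolve to the same lulc_attributes.csv file, which is the sharing behavior described above.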
Time Quali Raster#
A Time Quali Raster in plans is a special kind of Raster file that arises when the map is both a Time Raster and a Quali Raster. Land Use maps are the classical example, as shown above. Rules overlap.
Data Types#
A Data Type is the encoding of data at the hardware level. Beginners may understand data types through this primitive classification:
str
text string: common text characters
int
integer numbers: 2, 0, 1000
float
real numbers: 1.2, -3.44
misc
miscellaneous, undefined data type
Detailed data types
The data types listed above are very primitive. For instance, int can be int8 or int64, which yields a big difference in memory usage.
See below for a comprehensive reference.
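To make the memory difference concrete, the sketch below estimates the storage needed for a hypothetical raster of 1000 x 1000 cells under different data types. It uses the standard library array module only to look up the size of one value; the raster size is an illustrative assumption.

```python
from array import array

cells = 1000 * 1000  # cells in a hypothetical 1000 x 1000 raster

# Type codes of the array module: b = 8-bit int, q = 64-bit int,
# f = 32-bit float, d = 64-bit float.
sizes = {}
for label, code in (("int8", "b"), ("int64", "q"), ("float32", "f"), ("float64", "d")):
    item_bytes = array(code).itemsize  # bytes used by one value
    sizes[label] = cells * item_bytes
    print(f"{label}: {sizes[label] / 1e6:.0f} MB")
```

On common platforms the same map takes eight times more memory as int64 than as int8.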
No-data value convention#
A nodata value is a convention for which value in the data means that there is actually no data (a data void). For tables, this is usually set as empty cells or some text like “N.A.” (not applicable, etc.). For raster maps, the GeoTIFF format has a built-in metadata field that stores a nodata value.
Enforcement of nodata
Users are not required to set nodata values, but the incoming values may be overwritten to follow the plans standard convention.
Data Types Reference#
Name | Label | GDAL Id | Lower | Upper | Decimals | Structure |
---|---|---|---|---|---|---|
text string | str | | | | | text characters |
8-bits integer unsigned | uint8 | 1 | 0 | 255 | 0 | integer number |
8-bits integer | int8 | 14 | -128 | 127 | 0 | integer number |
16-bits integer unsigned | uint16 | 2 | 0 | 65535 | 0 | integer number |
16-bits integer | int16 | 3 | -32768 | 32767 | 0 | integer number |
32-bits integer unsigned | uint32 | 4 | 0 | 4e9 | 0 | integer number |
32-bits integer | int32 | 5 | -2e9 | 2e9 | 0 | integer number |
64-bits integer unsigned | uint64 | 12 | 0 | 18e18 | 0 | integer number |
64-bits integer | int64 | 13 | -9e18 | 9e18 | 0 | integer number |
16-bits float | float16 | 15 | -65504 | 65504 | 3 | real number |
32-bits float | float32 | 6 | -3.4e38 | 3.4e38 | 6 | real number |
64-bits float | float64 | 7 | -1.8e308 | 1.8e308 | 15 | real number |
Note
High-order values in the above table are approximations. For example, the exact upper value of uint32 is 4,294,967,295.
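The exact limits of any integer type follow directly from its number of bits, so the approximations can be replaced by exact values when needed. A minimal sketch (integer_range is a hypothetical helper):

```python
def integer_range(bits, signed):
    """Return the exact (lower, upper) limits of an integer type."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for bits in (8, 16, 32, 64):
    print(f"uint{bits}", integer_range(bits, signed=False))
    print(f"int{bits}", integer_range(bits, signed=True))
```

For 32 bits this gives an upper value of 4,294,967,295 for the unsigned type and 2,147,483,647 for the signed type.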
NumPy Data Types
Check out the NumPy Data Types documentation page for more details on data types in Python arrays.
GDAL Data Types
Check out the GDAL Data Types documentation page for more details on data types in raster maps.