Raster data - the basics#

This tutorial focuses on working with basic Raster data management and analysis using plans.

Notebook setup#

For users running this tutorial as a Jupyter Notebook, this cell must be executed first:

import sys
from pathlib import Path
import pprint
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Install `plans` in `google.colab`.
# Use `pip install plans` for other environments.

if "google.colab" in sys.modules:
    import os
    os.system(f"{sys.executable} -m pip install -q plans")

# This avoids warnings related to uninstalled fonts
import logging
# Set the matplotlib font manager logger to only show errors (hides warnings)
logging.getLogger('matplotlib.font_manager').setLevel(logging.ERROR)

# define output folder
OUTPUT_DIR = Path("outputs/raster")
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
print(f"Outputs will be saved to: ./{OUTPUT_DIR}")
Outputs will be saved to: ./outputs/raster

The Raster object#

The Raster object is a primitive class that lives under plans.datasets module. The Raster object stores all core methods for working with single-banded, gridded spatial data (maps).

Import Raster object:

from plans.datasets import Raster

Create an instance of the Raster:

rs = Raster(name="Testing", alias="tst")

Check out the rs variable type:

print(type(rs))
<class 'plans.datasets.core.Raster'>

Edit and inspect attributes:

rs.units = "cm"
rs.description = "Just a tutorial"
print(rs)
Object: <class 'plans.datasets.core.Raster'>
Metadata:
	name: Testing
	alias: tst
	size: None
	color: blue
	source: None
	description: Just a tutorial
	file_data: None
	var_name: Unknown variable
	var_alias: UnVar
	units: cm
	datetime: None
	resolution: None
	rows: None
	columns: None
	xll: None
	yll: None
	nodata_value: None

Working with data#

Raster data files are expected to be in the standard tif (GeoTiff) format.

For the Raster core object, data type is parsed as float32, that is, the data is treated as real numbers by default (instead of integers, for example).

Setup a built-in example project#

Access simple example raster files via the TOY_PROJECT feature:

from plans.config import TOY_PROJECT
from plans.project import load_project
pj = load_project(TOY_PROJECT)

Load data from tif file#

Let’s now get the slope map from the topo folder:

raster_file = Path(pj.folder_topo) / "slope.tif"

Then use load_data()

rs.load_data(file_data=raster_file)

Inspect and view data#

Inspect raster metadata:

pprint.pp(rs.raster_metadata)
{'ncols': 92,
 'nrows': 98,
 'xllcorner': 5680993.86496,
 'yllcorner': 8230146.277,
 'cellsize': 30.0,
 'NODATA_value': -99999.0,
 'crs': CRS.from_wkt('PROJCS["SIRGAS 2000 / Brazil Polyconic",GEOGCS["SIRGAS 2000",DATUM["Sistema_de_Referencia_Geocentrico_para_las_AmericaS_2000",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6674"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4674"]],PROJECTION["Polyconic"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-54],PARAMETER["false_easting",5000000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","5880"]]'),
 'transform': Affine(30.0, 0.0, 5680993.86496,
       0.0, -30.0, 8233086.277)}

Inspect data (numpy array):

print(rs.data)
[[10.044629   9.663136   8.774215  ...  1.9429301  1.838419   1.6902953]
 [11.436489  11.172483  10.271279  ...  1.97898    1.774743   1.588655 ]
 [12.415273  11.988841  11.464977  ...  1.5335212  1.5402688  1.566922 ]
 ...
 [ 4.5116916  6.271887   7.699945  ...  2.0929537  2.1747503  2.143842 ]
 [ 3.694886   5.549275   7.57203   ...  2.032168   2.0935419  2.0754018]
 [ 3.1743133  5.325974   7.503374  ...  2.3002913  2.2270842  2.2148573]]

Get basic statistics in a dict:

stats_dict = rs.get_stats()
pprint.pp(stats_dict)
{'count': 9016.0,
 'sum': 67032.2890625,
 'mean': 7.434814453125,
 'sd': 6.424027919769287,
 'min': 0.03058564104139805,
 'p01': 0.1082386001944542,
 'p05': 0.3892689347267151,
 'p10': 0.8747305870056152,
 'p20': 1.5102627277374268,
 'p25': 1.8075969219207764,
 'p40': 3.5482723712921143,
 'p50': 5.570770263671875,
 'p60': 7.9957275390625,
 'p75': 12.0378999710083,
 'p80': 13.618303298950195,
 'p90': 17.239530563354492,
 'p95': 19.720626831054688,
 'p99': 23.914939880371094,
 'max': 29.092971801757812}

View map:

rs.view()
../_images/9e739cb2d699298d90e0bf80de0f3b9c80d60beb1fa1c26ce90a30d546b6c10d.png

Convert and get an Univar instance:

uv = rs.get_univar()
print(type(uv))
<class 'plans.analyst.Univar'>

View as Univar:

uv.view()
../_images/91fa9c1a3559bdbb1c43938aa226326b900beed61723f7bf5587ccff47129527.png

Export to other file#

Tif format is exported by default.

fo = rs.export(folder=OUTPUT_DIR, filename="slope_example")
print(fo)
print(f"File was created: {Path(fo).is_file()}")
outputs/raster/slope_example.tif
File was created: True

Copy raster structure#

Sometimes it is useful to process raster data and then save it back to the same raster grid.

Example: making a brand new data grid with the same size using numpy.random

new_data = np.random.normal(loc=100, scale=10, size=np.shape(rs.data))
print(type(new_data))
<class 'numpy.ndarray'>

Define new raster, copy structure from other and set data:

# define a new raster
rs_new = Raster()
# copy structure from rs object
rs_new.copy_structure(raster_ref=rs)
# set data
rs_new.set_data(grid=new_data)

View and/or save

rs_new.view()
../_images/bda924e4a7ae987acb17d3f7f7e960b74416c4d2af4791ab34d25cafd6fb0495.png
rs_new.export(folder=OUTPUT_DIR, filename="new_data")
'outputs/raster/new_data.tif'

Types of Rasters#

The Raster class is extented into several families or types.

SciRaster family#

This is a large family of rasters that are numerical values. Behaviour is nearly the same from Raster base class, but NODATA_value and dtype is enforced to the incoming data.

Members of this family are all conventional quantitative spatial variables, like terrain elevation or soil water content.

from plans.datasets import SciRaster
sciraster = SciRaster()
sciraster.load_data(file_data=raster_file)
print(f"Data type: {sciraster.dtype}")
print(f"No-data value: {sciraster.raster_metadata['NODATA_value']}")
Data type: float32
No-data value: -99999

View layout is the same from Raster:

sciraster.view()
../_images/4ecac6dddec18a5afb88ec4e87bd826de95fe1a227855acb582e0055814364ab.png

QualiRaster family#

This is a large family of rasters that supports categorical maps, eg., land use maps. Data is always accompanied by an auxiliar Attribute Table with id, name, alias and color fields:

from plans.datasets import QualiRaster

Define new files (raster and table):

quali_raster_file = Path(pj.folder_lulc) / "observed/lulc_2020-01-01.tif"
quali_raster_table = Path(pj.folder_lulc) / "observed/lulc_attributes.csv"
qualiraster = QualiRaster()
qualiraster.load_data(
    file_data=quali_raster_file,
    file_table=quali_raster_table
)
print(f"Data type: {qualiraster.dtype}")
print(f"No-data value: {qualiraster.raster_metadata['NODATA_value']}")
Data type: uint8
No-data value: 255

Access the Attribute Table:

qualiraster.table.head()
id alias name name_pt color w_ck w_ca w_sk w_soa w_soc w_sua w_suc w_sux w_dea
0 3 Fr Forest Floresta #1f8d49 10 1 1 1 1 1 1 1 1
1 4 Sv Savanna Savana #7dc975 5 1 1 1 1 1 1 1 1
2 5 Mg Mangrove Mangue #04381d 1 1 1 1 1 1 1 1 1
3 6 Fl Floodable Forest Floresta inundável #026975 10 1 1 1 1 1 1 1 1
4 9 Fp Forest Plantation Silvicultura #7a5900 1 1 1 1 1 1 1 1 1

Get areas table:

areas_df = qualiraster.get_areas()
areas_df
id name alias color count area_m2 area_ha area_km2 area_f area_%
0 3 Forest Fr #1f8d49 843 758700.0 75.87 0.7587 0.093500 9.350044
1 4 Savanna Sv #7dc975 4644 4179600.0 417.96 4.1796 0.515084 51.508429
2 5 Mangrove Mg #04381d 0 0.0 0.00 0.0000 0.000000 0.000000
3 6 Floodable Forest Fl #026975 0 0.0 0.00 0.0000 0.000000 0.000000
4 9 Forest Plantation Fp #7a5900 43 38700.0 3.87 0.0387 0.004769 0.476930
5 11 Wetland We #519799 6 5400.0 0.54 0.0054 0.000665 0.066548
6 12 Grassland Gr #d6bc74 1183 1064700.0 106.47 1.0647 0.131211 13.121118
7 13 Not-Forest Nf #d89f5c 0 0.0 0.00 0.0000 0.000000 0.000000
8 15 Pasture Ps #edde8e 1117 1005300.0 100.53 1.0053 0.123891 12.389086
9 20 Sugar cane Sc #db7093 0 0.0 0.00 0.0000 0.000000 0.000000
10 21 Mosaic of Uses Mu #ffefc3 505 454500.0 45.45 0.4545 0.056012 5.601154
11 23 Beach, Dune and Sand Spot Sd #ffa07a 0 0.0 0.00 0.0000 0.000000 0.000000
12 24 Urban Area Ur #d4271e 0 0.0 0.00 0.0000 0.000000 0.000000
13 25 Non-Vegetated Nv #db4d4f 137 123300.0 12.33 0.1233 0.015195 1.519521
14 29 Rocky Outcrop Ro #ffaa5f 0 0.0 0.00 0.0000 0.000000 0.000000
15 30 Mining Mi #9c0027 0 0.0 0.00 0.0000 0.000000 0.000000
16 31 Aquaculture Aq #091077 0 0.0 0.00 0.0000 0.000000 0.000000
17 32 Hypersaline Tidal Flat Sf #fc8114 0 0.0 0.00 0.0000 0.000000 0.000000
18 33 River, Lake and Ocean W #2532e4 0 0.0 0.00 0.0000 0.000000 0.000000
19 35 Palm Oil Po #9065d0 0 0.0 0.00 0.0000 0.000000 0.000000
20 39 Soybean Sy #f5b3c8 538 484200.0 48.42 0.4842 0.059672 5.967169
21 40 Rice Rc #c71585 0 0.0 0.00 0.0000 0.000000 0.000000
22 41 Temporary Crops Tc #f54ca9 0 0.0 0.00 0.0000 0.000000 0.000000
23 46 Coffee Cf #d68fe2 0 0.0 0.00 0.0000 0.000000 0.000000
24 47 Citrus Ci #9932cc 0 0.0 0.00 0.0000 0.000000 0.000000
25 48 Other Perennial Crops Pc #e6ccff 0 0.0 0.00 0.0000 0.000000 0.000000
26 49 Wooded Sandbank Vegetation Ws #02d659 0 0.0 0.00 0.0000 0.000000 0.000000
27 50 Herbaceous Sandbank Vegetation Hs #ad5100 0 0.0 0.00 0.0000 0.000000 0.000000
28 62 Cotton Ct #ff69b4 0 0.0 0.00 0.0000 0.000000 0.000000
29 101 Dirty Road Rd #D2CFC0 0 0.0 0.00 0.0000 0.000000 0.000000
30 102 Paved Road Rs #D4AD9B 0 0.0 0.00 0.0000 0.000000 0.000000
31 103 Railway Rr #B5A295 0 0.0 0.00 0.0000 0.000000 0.000000
32 255 Not Observed No #ffffff 0 0.0 0.00 0.0000 0.000000 0.000000

View scheme provides a horizontal bar for aereal ratios:

qualiraster.view()
../_images/6b5332f1f65a3cb98344959eed6ed457282538fced236a3c6141c1b67f07b0fe.png

Dedicated extensions#

plans.datasets.spatial offers a series of dedicated Rasters for spatial variables. This helps to shortcut attributes and visual setup. Some examples are provided below.

Elevation raster#

from plans.datasets.spatial import Elevation
dem_file = Path(pj.folder_topo) / "dem.tif"
dem_raster = Elevation()
dem_raster.load_data(file_data=dem_file)
dem_raster.view_specs["range"] = [800, 1200]
dem_raster.view()
../_images/f3b924a9ce02b06f64271c37be446e949e72b84607e9bec20577d894fc745372.png

Slope raster#

from plans.datasets.spatial import Slope
slope_file = Path(pj.folder_topo) / "slope.tif"
slope_raster = Slope()
slope_raster.load_data(file_data=slope_file)
slope_raster.view()
../_images/ba8fe57c765d3055895a73ee3a8b0d877918671d9af9747e535da3cf3d998eab.png

HAND raster#

from plans.datasets.spatial import HAND
hand_file = Path(pj.folder_topo) / "hand.tif"
hand_raster = HAND()
hand_raster.load_data(file_data=hand_file)
hand_raster.view()
../_images/9f203c327b1f31567f5e16bd9effc982e80d1882f518aec1a303b1ca84683043.png

TWI raster#

from plans.datasets.spatial import TWI
twi_file = Path(pj.folder_topo) / "twi.tif"
twi_raster = TWI()
twi_raster.load_data(file_data=twi_file)
twi_raster.view()
../_images/24a88ac20aadefe36796be93cc12ac0ab56d9ee6ffa43b38f530a8c14c3757ad.png

LDD raster#

LDD (Local Drain Direction) is a core map for hydrological analysis. It depends on a convention of integer numbers.

from plans.datasets.spatial import LDD

ldd_file = Path(pj.folder_topo) / "ldd.tif"
ldd_raster = LDD()
ldd_raster.load_data(file_data=ldd_file)
ldd_raster.view()
../_images/48a5ae9088943b8174c5cc8b3bbbf5113af28b6feaf9a692e825abd56ce210b3.png

Soils raster#

Soils may reflect soils classes and lithology. Requires the Attribute Table.

from plans.datasets.spatial import Soils
# define file
soils_file = Path(pj.folder_soils) / "soils.tif"
# define table
soils_table = Path(pj.folder_soils) / "soils_attributes.csv"
# define raster
soils_raster = Soils()
soils_raster.load_data(file_data=soils_file, file_table=soils_table)
soils_raster.view()
../_images/71e102b42d177afc3cc404c2a49dc86f436ac2408c5bd128c0a149b22231cc26.png

LULC raster#

LULC (Land Use and Land Cover) requires the Attribute Table and date to be provided.

from plans.datasets.spatial import LULC
# define file
lulc_file = Path(pj.folder_lulc) / "observed/lulc_2020-01-01.tif"
# define table
lulc_table = Path(pj.folder_lulc) / "observed/lulc_attributes.csv"
# define name and datetime
lulc_raster = LULC(name="LULC", datetime="2020-01-01")
lulc_raster.load_data(file_data=lulc_file, file_table=lulc_table)
lulc_raster.view_specs["title"] = "Land Use in 2020"
lulc_raster.view()
../_images/d81248e410288a3689947788ba192afd7a2d476154b32a9a0fbebdc862b46782.png

AOI Raster#

The AOI Raster (Area-Of-Interest) is a member of the QualiRaster family.

It is a boolean grid used mostly for masking operations. In this case, there is no need for an Attribute Table to be provided.

from plans.datasets import AOI
aoi = AOI()
# define basin raster for the AOI
aoi_file = Path(pj.folder_basins) / "main/basin.tif"
aoi.load_data(file_data=aoi_file)
aoi.view()
../_images/7c5c5547a0845d2db149749e3cb7e9b5dffedca12dac02e381a122bca5f0b958.png

Masking Operations#

Masking is a very basic operation when working with Rasters, since in most cases we need to filter the data to a given AOI, like a catchment basin.

Say we want to know the average slope of a basin:

# use the data from the aoi map
slope_raster.apply_aoi_mask(grid_aoi=aoi.data)
slope_raster.view_specs["title"] = "Slope map for the basin"
slope_raster.view()
../_images/2d70c62c496ec1b248df0c28fcd8f590392f6275797a1e74585132a07ed0ff89.png

The AOI mask can be realeased:

slope_raster.release_aoi_mask()
slope_raster.view_specs["title"] = "Slope map for the full extension"
slope_raster.view()
../_images/15266585da6141829f45951cacbbd170e6e3d23fb99faad0be0febbc72bed7c2.png

The same can be done with qualitative maps:

lulc_raster.apply_aoi_mask(grid_aoi=aoi.data)
lulc_raster.view_specs["title"] = "Land Use of the basin in 2020"
lulc_raster.view()
../_images/5a70b4189f846fa5b341818d76fd38631a370fd76af38f7c8b3991e2b997fb70.png