Open Data Science Europe Workshop 2021

Tom Hengl

Technical director at OpenGeoHub Foundation


Sessions

09-10
10:00
20min
Spatiotemporal modeling of environmental dynamics at global scale: building open multiscale data cubes
Tom Hengl, Leandro Parente

We describe a global compilation of monthly and annual image time-series (a data cube) for the periods 1982-2018 and 2000-2020. The time-series prepared for 1982-2018 (global at 5-km resolution) comprise: TerraClimate (Abatzoglou et al., 2018), monthly vegetation NDVI 90% percentiles for 1982-2018 derived as a merge of the AVHRR daily and MODIS NDVI products, the Vegetation Continuous Fields (VCF5KYR) Version 1 dataset (Song et al., 2018), and the HYDE v3.2 annual land use time-series (Klein Goldewijk et al., 2017). For the period 2000-2020 (global at 1-km resolution), MODIS land products (NDVI, LST, snow cover) are used in combination with MODIS atmospheric products (water vapour, cloud fraction), global relief (MERIT DEM) and climate layers (CHELSA). All layers have been resampled and gap-filled so that they can be imported as an analysis-ready spatiotemporal array. For each pixel we also provide geometric temperatures (derived from latitude, day of the year and elevation) and, for many layers, uncertainty measures. These data stacks are available for research and development via our OpenLandMap.org data portal and a Cloud-Optimized GeoTIFF S3 file service. Overlaying Earth System Science point datasets (https://gitlab.com/openlandmap/compiled-ess-point-data-sets), such as the global compilation of soil organic carbon, demonstrates that the global data cubes can be used to build complex spatiotemporal models (2D+T as well as 3D+T) and to produce predictions of important variables representing our dynamic environment. Running machine learning on spatiotemporal data has two important advantages: (1) the possibility to explain complex causal relationships between the environmental dynamics of plants, ecosystem communities and soil variables, and dynamic climate and human influence; and (2) the possibility to predict states beyond the time-span covered by the training data, e.g. to predict future states (as in scenario testing) and past states for which there are no training points.
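
As a rough illustration of the space-time overlay step mentioned in the abstract, the sketch below matches point observations to the monthly layer of a toy data cube. It is not the authors' implementation: the names `CUBE`, `overlay`, `X_MIN`, `Y_MAX` and `PIXEL` are hypothetical, the grid values are invented, and a real workflow would read Cloud-Optimized GeoTIFF layers rather than in-memory lists.

```python
from datetime import date

# Toy data cube: one small 2D grid (rows x cols) per (year, month) key.
# Values are invented for illustration only.
CUBE = {
    (2000, 1): [[1.0, 2.0], [3.0, 4.0]],
    (2000, 2): [[5.0, 6.0], [7.0, 8.0]],
}

# Hypothetical grid definition: upper-left corner and pixel size in degrees.
X_MIN, Y_MAX, PIXEL = 0.0, 2.0, 1.0


def overlay(lon, lat, when, cube=CUBE):
    """Return the cube value at (lon, lat) for the month containing `when`.

    Returns None when the point falls outside the grid or when no layer
    exists for that month.
    """
    layer = cube.get((when.year, when.month))
    if layer is None:
        return None
    col = int((lon - X_MIN) // PIXEL)
    row = int((Y_MAX - lat) // PIXEL)
    if 0 <= row < len(layer) and 0 <= col < len(layer[0]):
        return layer[row][col]
    return None


# Each point carries a location and an observation date (2D+T).
points = [
    (0.5, 1.5, date(2000, 1, 15)),  # upper-left pixel, January layer
    (1.5, 0.5, date(2000, 2, 3)),   # lower-right pixel, February layer
    (1.5, 0.5, date(2001, 6, 1)),   # no layer available for this month
]
training = [overlay(lon, lat, d) for lon, lat, d in points]
```

The resulting list pairs each observation with the covariate value valid at its own location and month, which is the kind of table a spatiotemporal model is then trained on.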

General
HUGOTech
09-08
09:00
25min
Opening plenary
Tom Hengl

Opening plenary.

HUGOTech
09-10
14:40
45min
Awards and closing plenary
Tom Hengl

Awards and closing plenary

HUGOTech
09-06
13:30
90min
Spatiotemporal Ensemble ML in R
Tom Hengl

Software requirements: opengeohub/r-geo docker image (R, rgdal, terra, mlr3), QGIS, Google Earth Pro
Content:
Introduction to Ensemble Machine Learning: the mlr3 framework,
Selecting learners, fine-tuning, feature selection and model stacking,
Using Machine Learning with spatial and spatiotemporal data:
Using ML for spatial interpolation: landmap package (vs geoR and similar geostatistical software),
Adding geographical distances and features to spatial interpolation,
Fitting and using EML for predicting eumap land cover data (Witjes et al., 2021).
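
One way to read "adding geographical distances" in the list above is to give a learner the distance from every location to each training point as extra feature columns. The sketch below shows only that feature-construction step. It is written in Python rather than R, it assumes projected coordinates and Euclidean distance, and `distance_features` is a hypothetical helper, not a function of the landmap package.

```python
import math


def distance_features(locations, reference_points):
    """For each (x, y) location, return its distances to every reference point.

    The result has one row per location and one column per reference point,
    so it can be appended to any other covariates before model fitting.
    """
    return [
        [math.hypot(x - rx, y - ry) for rx, ry in reference_points]
        for x, y in locations
    ]


# Invented coordinates in an assumed projected system (e.g. metres).
train_xy = [(0.0, 0.0), (3.0, 4.0)]
new_xy = [(0.0, 0.0), (3.0, 0.0)]

features = distance_features(new_xy, train_xy)
```

The same function is applied to training and prediction locations, so both tables end up with identical distance columns.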

workshop
HUGOTech
09-06
15:30
90min
Computing with Cloud-Optimized GeoTIFFs in R
Tom Hengl

Software requirements: opengeohub/r-geo docker image (R, rgdal, terra, mlr3), QGIS, Google Earth Pro
Content:
Introduction to the Cloud-Optimized GeoTIFFs: scalable spatial databases,
Accessing COG files using rgdal and terra packages in R,
Making functions that run on COG files,
Spatial overlay with COG files (in parallel),
Running spatial analysis and plotting results

workshop
HUGOTech
09-07
09:00
90min
Data visualization: from R to Google Earth and QGIS
Tom Hengl

Software requirements: opengeohub/r-geo docker image (R, rgdal, terra, mlr3), QGIS, Google Earth Pro
Content:
Accessing COG files using QGIS,
Using gdaltiler to produce plots in Google Earth (local copy),
Using plotKML package to visualize data in Google Earth

workshop
HUGOTech
09-06
11:00
90min
Modeling with spatial and spatiotemporal data in R
Tom Hengl

Software requirements: opengeohub/r-geo docker image (R, rgdal, terra, mlr3), QGIS, Google Earth Pro
Content:
Introduction to RStudio, starting packages and loading data,
Introduction to spatiotemporal datasets,
eumap spatiotemporal datasets example: land cover 2000-2020 training dataset (Witjes et al., 2021),
Visualizing spatiotemporal data and data summaries.

workshop
HUGOTech