Tom Hengl
Technical director at OpenGeoHub Foundation
Sessions
A global compilation of monthly and annual image time-series (a data cube) is described for the periods 1982-2018 and 2000-2020. The time-series prepared for 1982-2018 (global, 5-km resolution) comprise: TerraClimate (Abatzoglou et al., 2018); monthly vegetation NDVI 90% percentiles for 1982-2018, produced by merging the AVHRR daily and MODIS NDVI products; the Vegetation Continuous Fields (VCF5KYR) Version 1 dataset (Song et al., 2018); and the Hyde v3.2 annual land use time-series (Klein Goldewijk et al., 2017). For the period 2000-2020 (global, 1-km resolution), MODIS land products (NDVI, LST, snow cover) are used in combination with MODIS atmospheric products (water vapour, cloud fraction), global relief (MERIT DEM) and climate layers (CHELSA). All layers have been resampled and gap-filled so that they can be imported as an analysis-ready spatiotemporal array. For each pixel we also provide geometric temperatures (derived from latitude, day of the year and elevation) and, for many layers, uncertainty measures. These data stacks are available for research and development via our OpenLandMap.org data portal and a Cloud-Optimized GeoTIFF S3 file service. Overlaying Earth System Science point datasets (https://gitlab.com/openlandmap/compiled-ess-point-data-sets), such as the global compilation of soil organic carbon, demonstrates that the global data cubes can be used to build complex spatiotemporal 2D+T models, including 3D+T models, and to produce predictions of important variables representing our dynamic environment. Running machine learning on spatiotemporal data has two important advantages: (1) the possibility to explain complex causal relationships between the dynamics of plants, ecosystem communities and soil variables on the one hand, and dynamic climate and human influence on the other; and (2) the possibility to predict states beyond the time span covered by the training data, e.g. to predict future states (as in scenario testing) and past states for which there are no training points.
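The abstract mentions monthly 90% percentile NDVI composites and gap-filled layers without giving the procedure. The following is only a minimal sketch of how such steps could look for a single pixel, assuming linear interpolation both for the percentile and for the temporal gap-filling. The function names (`percentile`, `monthly_p90`, `gap_fill`) are hypothetical and are not taken from the data cube production code.

```python
from collections import defaultdict
from datetime import date


def percentile(values, q):
    """Percentile q (0-100), linearly interpolated between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def monthly_p90(daily):
    """Aggregate {date: value or None} into {(year, month): 90th percentile}."""
    groups = defaultdict(list)
    for day, value in daily.items():
        if value is not None:  # skip missing (e.g. cloud-masked) observations
            groups[(day.year, day.month)].append(value)
    return {key: percentile(vals, 90) for key, vals in groups.items()}


def gap_fill(series):
    """Fill None entries by linear interpolation in time.

    Assumption: gaps at the start or end take the nearest valid value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no valid values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        before = [k for k in known if k < i]
        after = [k for k in known if k > i]
        if before and after:
            a, b = before[-1], after[0]
            w = (i - a) / (b - a)
            filled[i] = series[a] + w * (series[b] - series[a])
        else:
            filled[i] = series[before[-1]] if before else series[after[0]]
    return filled
```

In a real workflow the same logic would be applied to every pixel of the array, typically with vectorized array operations rather than Python loops.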
Opening plenary.
Awards and closing plenary
The first 30 minutes will be dedicated to software and library preparation and user support
Software requirements: Python, Jupyter, QGIS, GRASS GIS, R
Content:
General concepts and main advantages of docker containers
What is Docker image and where to find it?
Starting with the docker image opengeohub/py-geo
Which tag/version should I use?
Install new OS and python packages inside the container
Share files between the host machine and the container
OSGeo live ready to use in the VirtualBox
Support time to help with software and library preparation
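As a rough illustration of the container topics listed above (choosing a tag, sharing files between the host machine and the container), the sketch below assembles a `docker run` command line in Python. The flags `--rm`, `-it`, `-v` and `-p` are standard Docker CLI options. The `latest` tag, the `/data` mount point, the host path and port 8787 are assumptions made for this example, not settings confirmed by the session material.

```python
import shlex


def docker_run_command(image, tag="latest", host_dir=None,
                       container_dir="/data", port=None):
    """Return the argv list for an interactive `docker run` call."""
    argv = ["docker", "run", "--rm", "-it"]
    if host_dir is not None:
        # Bind-mount a host folder so files are shared with the container.
        argv += ["-v", f"{host_dir}:{container_dir}"]
    if port is not None:
        # Publish a container port on the same host port.
        argv += ["-p", f"{port}:{port}"]
    argv.append(f"{image}:{tag}")
    return argv


# Hypothetical host path and port; adjust to the local setup.
cmd = docker_run_command("opengeohub/r-geo", host_dir="/home/user/data", port=8787)
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or the list can be passed to `subprocess.run` on a machine where Docker is installed.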
The next 60 minutes will be dedicated to the introduction to spatial and spatiotemporal data in R
Software requirements: opengeohub/r-geo docker image [https://hub.docker.com/r/opengeohub/r-geo] (R, rgdal, terra, mlr3), QGIS, Google Earth Pro
Content:
Introduction to RStudio, starting packages and loading data,
Introduction to spatiotemporal datasets,
eumap spatiotemporal datasets example: landcover 2000-2020 training dataset (Witjes et al., 2021),
Visualizing spatiotemporal data and data summaries,
Software requirements: opengeohub/r-geo docker image (R, rgdal, terra, mlr3), QGIS, Google Earth Pro
Content:
Introduction to Ensemble Machine Learning: the mlr3 framework,
Selecting learners, fine-tuning, feature selection and model stacking,
Using Machine Learning with spatial and spatiotemporal data:
Using ML for spatial interpolation: landmap package (vs geoR and similar geostatistical software),
Adding geographical distances and features to spatial interpolation,
Fitting and using EML for predicting eumap land cover data (Witjes et al., 2021),
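One way to read "adding geographical distances and features to spatial interpolation" is to give a machine learning model extra covariates that encode where each location lies relative to a set of reference points. The sketch below is written in Python rather than R and does not reproduce the landmap package. It assumes projected (planar) coordinates and Euclidean distance, and the function names are hypothetical.

```python
import math


def distance_features(targets, reference_points):
    """For each target (x, y), return its distances to all reference points."""
    return [
        [math.hypot(x - rx, y - ry) for rx, ry in reference_points]
        for x, y in targets
    ]


def add_distance_features(covariates, coords, reference_points):
    """Append the distance columns to existing covariate rows."""
    dists = distance_features(coords, reference_points)
    return [list(row) + d for row, d in zip(covariates, dists)]


# Two training locations with one covariate each (illustrative values only).
coords = [(3.0, 4.0), (0.0, 0.0)]
covariates = [[12.5], [7.1]]
reference_points = [(0.0, 0.0), (3.0, 0.0)]
features = add_distance_features(covariates, coords, reference_points)
```

The resulting rows can be passed to any learner alongside the usual covariates, which lets a non-spatial model pick up spatial structure in the target variable.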
Software requirements: opengeohub/r-geo docker image (R, rgdal, terra, mlr3), QGIS, Google Earth Pro
Content:
Introduction to the Cloud-Optimized GeoTIFFs: scalable spatial databases,
Accessing COG files using rgdal and terra packages in R,
Making functions that run on COG files,
Spatial overlay with COG files (in parallel),
Running spatial analysis and plotting results
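The core of a spatial overlay is converting point coordinates to raster row and column indices and reading the values found there. The sketch below illustrates that step on small in-memory grids, with one parallel task per layer. It assumes a north-up raster described by a GDAL-style six-element geotransform and does not read actual COG files, which in practice would be done with rgdal, terra or a comparable library. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def world_to_pixel(x, y, gt):
    """Map coordinates to (row, col) for a north-up GDAL-style geotransform."""
    origin_x, pixel_w, _, origin_y, _, pixel_h = gt  # pixel_h is negative
    col = int((x - origin_x) // pixel_w)
    row = int((y - origin_y) // pixel_h)
    return row, col


def sample(grid, gt, point):
    """Return the grid value under a point, or None if it falls outside."""
    row, col = world_to_pixel(point[0], point[1], gt)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


def overlay(layers, gt, points, workers=4):
    """Overlay points on {name: grid} layers, one parallel task per layer."""
    def run(name):
        return name, [sample(layers[name], gt, p) for p in points]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run, layers))


# 2 x 2 grids with the upper-left corner at (0, 2) and 1-unit pixels.
gt = (0.0, 1.0, 0.0, 2.0, 0.0, -1.0)
layers = {"ndvi": [[1, 2], [3, 4]], "lst": [[10, 20], [30, 40]]}
points = [(0.5, 1.5), (1.5, 0.5), (5.0, 5.0)]
result = overlay(layers, gt, points)
```

Threads suit this pattern when each task waits on remote reads. For CPU-bound work on local arrays, a process pool may be the better choice.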
Software requirements: opengeohub/r-geo docker image (R, rgdal, terra, mlr3), QGIS, Google Earth Pro
Content:
Accessing COG files using QGIS,
Using gdaltiler to produce plots in Google Earth (local copy),
Using plotKML package to visualize data in Google Earth
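Google Earth reads KML, an XML format, so point data can be visualized by writing placemarks to a `.kml` file. The sketch below builds a small KML document with Python's standard library. It does not reproduce what the plotKML package does; the function name and the sample point are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def points_to_kml(points):
    """Build a KML document string from (name, lon, lat) tuples."""
    ET.register_namespace("", KML_NS)  # serialize without a namespace prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in points:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML stores coordinates as longitude,latitude (WGS84).
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Illustrative point; write the string to a .kml file to open it in Google Earth.
kml_text = points_to_kml([("Sample site", 5.6653, 51.9692)])
```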