Open Data Science Europe Workshop 2021

Land cover time-series data stack for Europe 2000–2019 based on LUCAS, GLAD Landsat and Spatiotemporal Ensemble Machine Learning
2021-09-09, 09:40–10:00, HUGOTech

We classified 33 land use / land cover (LULC) classes between 2000 and 2019 using a single spatiotemporal ensemble machine learning model in a fully automated, free and open-source workflow. This workflow includes harmonization and preprocessing of several high-resolution, publicly available covariate datasets and over five million training samples, spatial K-fold cross-validation, hyperparameter optimization, and multiple methods for LULC change analysis. We show how the per-class probability predictions (1) facilitate useful prediction uncertainty metrics, (2) inform use case-tailored post-processing strategies, and (3) enable a novel way to quantify LULC change dynamics without relying on hard-class predictions. We show that, for this purpose, spatial models trained on data from a single year are consistently outperformed by a single spatiotemporal model trained on data from all years, especially when generalizing to input data from years that are not included in the training dataset. We present a final land cover dataset with per-class probability and uncertainty metrics, as well as hard-class classifications with 62% cross-validation (CV) accuracy for 33 Corine Land Cover (CLC) level 3 classes, 70% accuracy for 14 level 2 CLC classes, and 87% accuracy for the 5 level 1 classes. Our results suggest that our method enables land cover classification for subsequent years without waiting for new training data, while facilitating improved training data collection through analysis of variable importance, per-class performance, and uncertainty metrics.
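As a rough illustration of points (1) and (3) above, the following sketch shows one way per-class probabilities could be turned into an uncertainty score and a change signal for a single pixel. The abstract does not specify which metrics the workflow uses, so the choices here (normalized Shannon entropy as the uncertainty measure, an ordinary least-squares slope as the change measure) and all function names and sample values are assumptions made for illustration only.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of one pixel's class probabilities, scaled to [0, 1].

    Assumed uncertainty metric: 0 means one class has all the probability,
    1 means all classes are equally likely.
    """
    if len(probs) < 2:
        raise ValueError("need at least two classes")
    total = sum(probs)
    p = [v / total for v in probs]  # renormalize in case of rounding
    entropy = -sum(v * math.log(v) for v in p if v > 0)
    return entropy / math.log(len(p))


def hard_class(probs):
    """Index of the most probable class (the hard-class prediction)."""
    return max(range(len(probs)), key=probs.__getitem__)


def probability_trend(years, class_probs):
    """Least-squares slope of one class's probability against year.

    Assumed change metric: probability units gained or lost per year.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(class_probs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, class_probs))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Hypothetical pixel with three classes (forest, cropland, urban), made-up values.
years = [2000, 2001, 2002, 2003]
pixel = [
    [0.80, 0.10, 0.10],
    [0.70, 0.10, 0.20],
    [0.60, 0.10, 0.30],
    [0.50, 0.10, 0.40],
]

hard = [hard_class(p) for p in pixel]                # hard class for each year
urban_trend = probability_trend(years, [p[2] for p in pixel])
uncertainty = [normalized_entropy(p) for p in pixel]
```

In this made-up pixel the hard class is forest in every year, so a comparison of hard-class maps would report no change, while the slope of the urban probability still registers a steady shift. That is the kind of dynamic a probability-based change analysis can capture.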

We propose that the future of land use and land cover mapping and change detection will likely be driven by developments in the following fields: (1) multisource data harmonization, such as combining Sentinel and Landsat data, (2) leveraging the spatial context of remote sensing data by applying pattern recognition and object-based image analysis on spectral features, and (3) combining spatiotemporal ML with process-based techniques such as urban sprawl and vegetation growth modeling.



Martijn Witjes (1)
Leandro Parente (1)
Chris van Diemen (1)
Tomislav Hengl (1)
Martin Landa (2)
Lukas Brodsky (2)
Lena Halounova (2)
Josip Krizan (3)
Luka Antonic (3)
Codrina Maria Ilie (4)
Vasile Craciunescu (4)
Milan Kilibarda (5)
Ognjen Antonijevic (5)
Luka Glusica (6)

1: OpenGeoHub, Wageningen, the Netherlands
2: Department of Geomatics, Faculty of Civil Engineering, CTU in Prague, Czech Republic
3: MultiOne, Zagreb, Croatia
4: Terrasigna, Romania
5: Department of Geodesy and Geoinformatics, Faculty of Civil Engineering, University of Belgrade, Belgrade, Serbia
6: GiLAB, Belgrade, Serbia