2022-06-16, 11:45–12:05, Conference room - C202
The talk will describe a data-driven framework based on spatio-temporal ensemble machine learning to produce distribution maps for 16 tree species at high spatial resolution (30 m). A total of 3 million tree occurrence points was used to train different machine learning (ML) algorithms: random forest, gradient-boosted trees, generalized linear models, k-nearest neighbors, CART and an artificial neural network. A stack of 585 coarse- and high-resolution covariates representing spectral reflectance, different biophysical conditions and biotic competition was used as predictors for realized distributions, while potential distribution was modelled with environmental predictors only. AUC, log loss and computing time were used to select the three best algorithms, which were then used to train, for each species, a stacked ensemble model with a logistic regressor as the meta-learner. Probability and model uncertainty maps were produced for each species using a time window of 4 years, for a total of 6 distribution maps per species. The ensemble model outperformed or performed as well as the best individual model for all potential species distributions, while for ten species it performed worse than the best individual model in modelling realized distributions. The framework shows how combining continuous and consistent Earth Observation time series data with state-of-the-art ML can be used to derive dynamic distribution maps.
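The stacking step described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the three base-model probability columns are synthetic stand-ins for the predictions of the three selected algorithms, and the function names, learning rate and epoch count are hypothetical choices made for illustration. A logistic regressor is fitted on the base-model probabilities by plain gradient descent and scored with log loss.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(y_true, p_pred, eps=1e-12):
    # Mean negative log-likelihood of binary labels.
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def fit_meta_learner(base_probs, y, lr=0.5, epochs=500):
    # base_probs: one row per sample, one probability per base model.
    # Fits weights w and intercept b of a logistic regressor by
    # batch gradient descent on the mean log loss.
    n_models = len(base_probs[0])
    w = [0.0] * n_models
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, target in zip(base_probs, y):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) - target
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_meta(base_probs, w, b):
    # Ensemble occurrence probability for each sample.
    return [sigmoid(b + sum(wi * xi for wi, xi in zip(w, row)))
            for row in base_probs]


# Synthetic presence/absence labels and base-model probabilities
# (assumption: in practice these would be out-of-fold predictions).
random.seed(0)
y = [random.randint(0, 1) for _ in range(200)]


def noisy(label, noise):
    # Hypothetical base model: right on average, with uniform noise.
    p = 0.8 if label else 0.2
    return min(max(p + random.uniform(-noise, noise), 0.01), 0.99)


base_probs = [[noisy(t, 0.3), noisy(t, 0.4), noisy(t, 0.5)] for t in y]

w, b = fit_meta_learner(base_probs, y)
ens_probs = predict_meta(base_probs, w, b)
ens_loss = log_loss(y, ens_probs)
print("meta-learner weights:", w, "intercept:", b)
print("ensemble log loss:", ens_loss)
```

The sketch scores the ensemble on the same data it was fitted on, which is only acceptable for a toy example; a real comparison against the individual models would need held-out data, ideally with spatial or spatio-temporal cross-validation.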
Carmelo has an MSc in forest systems sciences and technologies, with a specialization in forest resources monitoring and management through geospatial data science applications and time series analysis.
Carmelo is a PhD candidate at Wageningen University and Research (WUR) in the Geo-information Science and Remote Sensing program and works as a research assistant at the OpenGeoHub Foundation.