Spatial-temporal analysis of small-scale determinants of the COVID-19 pandemic


This research project aimed to provide detailed insights into area-specific key drivers of the COVID-19 pandemic and indicate to what extent educational level, household structure or average living space had an impact on COVID-19 incidence.

We combined a detailed and comprehensive set of small-scale spatial information on socio-economic, demographic, urban, and environmental characteristics with spatially and temporally aggregated data on COVID-19 incidences in the city of Bremen to create a spatial-temporal analysis dataset. In particular, the use of mobile network data allowed us to measure changes in local mobility flows of the population throughout the pandemic and to investigate their effect on COVID-19 incidences at the neighbourhood level.

Weekly COVID-19 incidences were assessed in 260 statistical neighbourhoods and 70 districts in Bremen over 117 weeks, from 02/03/2020 until 23/05/2022, and grouped into five waves over the course of the pandemic. Spatial autocorrelation analysis revealed strong clustering of COVID-19 incidences from wave 2 to wave 4, with weaker clustering in wave 5.
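
As an illustration of how spatial clustering of this kind can be quantified, the following minimal sketch computes global Moran's I, a common spatial autocorrelation statistic. The summary does not state which statistic was used in the project, so the choice of Moran's I, the binary adjacency weights, and the toy values are assumptions made purely for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a spatial weights matrix.

    values  -- one incidence value per area (hypothetical data)
    weights -- n x n nested list; weights[i][j] > 0 if areas i and j
               are neighbours, 0 otherwise (assumed binary adjacency)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all spatial weights
    s0 = sum(sum(row) for row in weights)
    # Cross-products of deviations between neighbouring areas
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    # Total squared deviation
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


# Four hypothetical areas arranged in a line: 0 - 1 - 2 - 3
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1, 1, 5, 5], chain)    # similar values sit together
alternating = morans_i([1, 5, 1, 5], chain)  # neighbours always differ
print(clustered, alternating)
```

Positive values indicate that neighbouring areas tend to have similar incidences (clustering), while values near zero or below indicate no clustering or a checkerboard-like pattern.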

Descriptive analysis of key factors assumed to be relevant for COVID-19 incidences revealed differences between neighbourhoods. From wave 2 to wave 4, neighbourhoods with the highest number of COVID-19 infections showed lower income, less residential space, higher population density, and a higher number of children needing language support in preschool than neighbourhoods with the lowest number of COVID-19 infections, indicating problematic living conditions, deprivation, and language barriers as possible factors increasing COVID-19 incidences. In wave 5, differences in these factors were small, showing a similar spread of COVID-19 across neighbourhoods in the Omicron wave.

Thorough investigation of socio-demographic and environmental factors using Poisson mixed regression models accounting for variation in space and time showed that mainly seasonal variables, such as average temperature and sunshine hours, were negatively associated with COVID-19 incidences. Within specific waves, socio-demographic factors were also significantly associated with COVID-19 incidences. For example, the number of people in the household showed a positive effect on COVID-19 incidences in waves 2 and 4, while median income was negatively associated with COVID-19 incidences in waves 3 and 5.
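
For orientation, a Poisson mixed regression of this general kind can be written as follows. This is only a generic sketch: the population offset $\log N_i$, the random intercepts $u_i$ for areas and $v_t$ for weeks, and all symbols are assumptions for illustration, and the summary does not specify the exact model structure used in the project.

```latex
\begin{align}
y_{it} &\sim \mathrm{Poisson}(\mu_{it}), \\
\log \mu_{it} &= \log N_i + \beta_0 + \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + u_i + v_t, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad v_t \sim \mathcal{N}(0, \sigma_v^2).
\end{align}
```

Here $y_{it}$ denotes the number of cases in area $i$ and week $t$, and $\mathbf{x}_{it}$ collects covariates such as temperature or median income. In such a model, $\exp(\beta_k)$ is the incidence rate ratio for a one-unit increase in covariate $k$, so a negative coefficient corresponds to a lower expected incidence.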

Latent profile analyses including selected socio-demographic variables identified three main profiles for the 70 sub-districts: highly deprived, working class, and least deprived areas. COVID-19 incidence throughout the pandemic, stratified by these three profiles, revealed that highly deprived sub-districts had higher COVID-19 incidences, which also increased earlier in almost all waves, followed by working class sub-districts, while least deprived sub-districts had the lowest COVID-19 incidences from wave 2 to wave 4.

To investigate machine learning methods for the prediction of COVID-19 incidences, we considered tree-based and deep-learning approaches. Temporal validation of the prediction models, using 12 weeks of data for training to predict the upcoming four weeks of COVID-19 incidences, revealed better performance of the random forest compared to the fully connected neural network (FCNN). However, the models showed a shifted prediction, since the random forest could not catch up with the ground truth.
Variable importance for the random forest models and the FCNN indicated that both time-variant seasonal variables and socio-demographic variables are important factors to include in the model architecture. To allow for more complex prediction modelling that includes the time series data, further research will focus on recurrent neural networks (RNN) to improve the prediction stepwise during the course of the pandemic by successively adding time series data to the training of the network architecture.
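
The temporal validation scheme described above (12 weeks of training data, four weeks of prediction) can be sketched as a sliding-window split over week indices. The function name `rolling_splits`, the sliding (rather than expanding) window, and the step size are hypothetical choices for illustration; the summary does not state how the windows were advanced.

```python
def rolling_splits(n_weeks, train_len=12, horizon=4, step=1):
    """Return (train_weeks, test_weeks) index lists for temporal validation.

    Each split trains on `train_len` consecutive weeks and predicts the
    `horizon` weeks that immediately follow, so no future data leaks
    into training. The window is moved forward by `step` weeks.
    """
    splits = []
    start = 0
    while start + train_len + horizon <= n_weeks:
        train = list(range(start, start + train_len))
        test = list(range(start + train_len, start + train_len + horizon))
        splits.append((train, test))
        start += step
    return splits


# 117 weekly observations, window advanced four weeks at a time (assumed)
splits = rolling_splits(117, train_len=12, horizon=4, step=4)
first_train, first_test = splits[0]
print(len(splits), first_train[0], first_train[-1], first_test)
```

Because each test block lies strictly after its training block, a model that mainly reproduces recent values will appear as a prediction curve shifted behind the observed incidences, which is the kind of lag that such a scheme makes visible.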

Funding period

Begin:   October 2021
End:   December 2022


  • German Research Foundation


Dr. rer. nat. Christoph Buck