7. Leveraging complex health data

How can we develop intelligent methods for analyzing multidimensional data?

A growing variety of data types is available for our research, such as geospatial, sensor, or sequencing data. We obtain them from various sources, including registries, geo-databases, and research projects. Their combination and their often high-dimensional structure, in particular, pose new and complex challenges for statistical analysis. In response, very powerful machine learning methods have recently been developed to analyze such complex data; in addition, data protection-compliant linkage of different data sources for distributed data analysis is emerging as a separate field of research. To apply these approaches successfully in epidemiology, we will develop new methods that enable simultaneous processing of different data sources and improve the interpretability of machine learning methods. Our aim is to use these methods efficiently for the analysis of large data volumes in order to uncover structures, make predictions, and draw causal conclusions.