Statistical modeling of covariate-varying networks
- Statistical methods for network analysis play a key role in describing and understanding complex systems. In molecular and genetic epidemiology, for example, these methods are used to investigate interactions between molecular regulators and gene expression levels. Typically, the relevant dependence structures are not static, so recent statistical methods have been designed specifically for time-dynamic network analysis. So far, however, it has largely been neglected that the structure of complex systems often depends considerably on other, non-temporal factors. UV radiation, for instance, can damage the gene network responsible for DNA repair. In combination with other external covariates, this may lead to dysfunction and, ultimately, to an increased risk of skin cancer. Understanding whether and how networks change due to potentially modifiable covariates therefore appears essential, e.g., for identifying promising starting points for disease prevention.
This project will therefore develop a novel statistical approach enabling researchers to investigate how changing network structures depend on covariates. The new approach consists of a model class and corresponding innovative methods for model fitting. Based on the theory of graphical models and individual-level observational data, network changes are modeled as a function of covariates. The resulting new model class of "covariate-varying networks" (CVNs) will be sufficiently general for dependencies among several discrete and continuous covariates to be mapped jointly onto the network structure, thus accounting for individual heterogeneity with respect to these covariates. Moreover, the new methods will be based on conditional covariance matrices and will allow inference on CVNs even in high-dimensional settings. Smoothing methods will be combined with regularization approaches to identify the most important influences on the network structure. The smoothing aspect ensures that only relevant structural changes are detected; the regularization, in contrast, ensures that only covariates with substantial influence on the structure are selected. Model fitting will also allow available prior knowledge about a network (e.g., from online databases) to be incorporated. Fitting these models requires solving challenging optimization problems, for which suitable efficient algorithms will need to be developed. The theoretical asymptotic properties of the approach will be investigated, and its performance in terms of model selection consistency will be assessed in simulation studies. Feasibility and interpretability will be illustrated with real data applications.
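As a rough illustration of how smoothing and regularization might be combined in a graphical-model setting, one possible objective is a penalized Gaussian likelihood over the graphs at m covariate configurations. This form is an assumption made here for illustration and is not stated in the project description. All symbols are likewise assumptions: \Theta_i is the precision matrix at configuration i, \hat\Sigma_i the corresponding (conditional) covariance estimate, n_i the number of observations, w_{ij} a weight encoding how similar configurations i and j are, and \lambda_1, \lambda_2 \ge 0 tuning parameters.

```latex
\min_{\Theta_1,\dots,\Theta_m \succ 0}\;
  \sum_{i=1}^{m} n_i \left[ \operatorname{tr}\!\left(\hat\Sigma_i \Theta_i\right) - \log\det \Theta_i \right]
  + \lambda_1 \sum_{i=1}^{m} \sum_{k \neq l} \left| (\Theta_i)_{kl} \right|
  + \lambda_2 \sum_{i<j} w_{ij} \sum_{k \neq l} \left| (\Theta_i)_{kl} - (\Theta_j)_{kl} \right|
```

In a formulation of this kind, the \lambda_1 term favors sparse graphs, while the \lambda_2 term discourages differences between graphs at similar covariate values, so that an edge changes only where the data support it. The sketch does not cover the selection of influential covariates or the use of prior knowledge mentioned above.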
- Begin: September 2020
End: April 2024
- German Research Foundation