Statistical inference with machine learning methods

Description

The junior research group develops statistical inference methods for machine learning. We pay particular attention to problems that arise in epidemiology, such as confounding, high-dimensional data, and survival analysis. The project is methodological in nature but has a strong focus on applications. The methods we develop are publicly available as software packages that practitioners and applied researchers can use directly.

The research priorities of the junior research group are:

Interpretable machine learning
Statistical properties of machine learning methods
Survival analysis
Statistical software
Application to high-dimensional data

The group is funded by the Emmy Noether Programme of the Deutsche Forschungsgemeinschaft (DFG) and is led by Marvin N. Wright.

Funding period

Start: May 2020
End: December 2026

Funder

Deutsche Forschungsgemeinschaft (DFG)

Contact person

Prof. Dr. Marvin N. Wright

Selected publications from the project

Peer-reviewed journal articles

Blesch K, Watson DS, Wright MN. Conditional feature importance for mixed data. AStA Advances in Statistical Analysis. 2024;108(2):259-278.
https://doi.org/10.1007/s10182-023-00477-9
Askland KD, Strong D, Wright MN, Moore JH. The translational machine: A novel machine-learning approach to illuminate complex genetic architectures. Genetic Epidemiology. 2021;45(5):485-536.
https://dx.doi.org/10.1002/gepi.22383
Watson DS, Wright MN. Testing conditional independence in supervised learning algorithms. Machine Learning. 2021;110(8):2107-2129.
https://doi.org/10.1007/s10994-021-06030-6

Editorials

Boulesteix A-L, Wright MN. Special issue: Artificial intelligence in genomics. Human Genetics. 2022;141(9):1449-1450.
https://doi.org/10.1007/s00439-022-02472-7

Contributions to books and proceedings

Binder M, Pfisterer F, Becker M, Wright MN. Non-sequential pipelines and tuning. In: Bischl B, Sonabend R, Kotthoff L, Lang M, editors. Applied machine learning using mlr3 in R. Boca Raton: CRC Press. 2024. p. 174-195
https://mlr3book.mlr-org.com/chapters/chapter8/non-sequential_pipelines_and_tuning.html
Casalicchio G, Burk L. Evaluation and benchmarking. In: Bischl B, Sonabend R, Kotthoff L, Lang M, editors. Applied machine learning using mlr3 in R. Boca Raton: CRC Press. 2024. p. 53-82
https://mlr3book.mlr-org.com/chapters/chapter3/evaluation_and_benchmarking.html
Dandl S, Biecek P, Casalicchio G, Wright MN. Model interpretation. In: Bischl B, Sonabend R, Kotthoff L, Lang M, editors. Applied machine learning using mlr3 in R. Boca Raton: CRC Press. 2024. p. 259-282
https://mlr3book.mlr-org.com/chapters/chapter12/model_interpretation.html
Wright MN. Feature selection. In: Bischl B, Sonabend R, Kotthoff L, Lang M, editors. Applied machine learning using mlr3 in R. Boca Raton: CRC Press. 2024. p. 146-160
https://mlr3book.mlr-org.com/chapters/chapter6/feature_selection.html

Invited talks at scientific conferences

Blesch K, Watson DS, Wright MN. Conditional feature importance for mixed data. Seminar in Econometrics, 2 May 2023, Cologne.
Blesch K, Wright MN, Watson DS. Unfooling SHAP and SAGE: Knockoff imputation for Shapley values. 2nd TRR 318 Conference "Measuring Understanding," 6-7 November 2023, Paderborn.
Swenne A, Wright MN. Confounder adjustment with random forests based on local residuals in genetic association studies. 5th Conference of the Central European Network (CEN), 3-7 September 2023, Basel, Switzerland.
Wright MN. From explainable AI to generative modeling with tree-based machine learning. Statistics and Econometrics Seminar, Humboldt-Universität zu Berlin, 17 October 2023, Berlin.
Wright MN. Interpretable machine learning. Begegnungszone: Statistical Physics and Machine Learning, 18-21 September 2023, Leipzig.
Wright MN. Random forests on high-dimensional data: From classification and survival analysis to generative modelling. Seminar of Research Training Group (Graduiertenkolleg) 2624, Technische Universität Dortmund, 4 July 2022, Dortmund.
Wright MN. Random forests: Myths and facts. 52nd Workshop Statistical Computing, 24-27 July 2022, Günzburg.
Wright MN. Interpretable machine learning. Interpretable Machine Learning Workshop with the School of Statistics and Actuarial Science, University of the Witwatersrand, 19-20 September 2022, Johannesburg, South Africa.
Wright MN. Machine learning for survival data. Seminar of the Institut für Medizinische Biometrie, Epidemiologie und Informatik (IMBEI), Universitätsmedizin der Johannes Gutenberg-Universität Mainz, 10 June 2021, Mainz.
Wright MN. Model-agnostic interpretable machine learning. MOOD (MOnitoring Outbreaks for Disease surveillance in a data science context) Webinar, 30 June 2021, online presentation.
Wright MN. Interpretable machine learning in genetics. XXXIInd Conference of the Austro-Swiss Region (ROeS) of the International Biometric Society, 9 September 2021, Salzburg, Austria.
Wright MN. Random forests: Myths and facts. Colloquium "Statistische Methoden in der empirischen Forschung," 23 November 2021, online presentation.
Wright MN. Machine learning for time to event data. Expert talk at the workshop of the project "ARTEMIS - Künstliche Intelligenz bei muskuloskelettalen Erkrankungen," 24 September 2021, online presentation.
Wright MN. Genome-wide interpretable machine learning. Seminar Series of the Charles Bronfman Institute for Personalized Medicine at the Icahn School of Medicine at Mount Sinai, 8 December 2021, online presentation.

Talks at scientific conferences

Burk L, Zobolas J, Bischl B, Bender A, Lang M, Wright MN, Sonabend R. A large-scale neutral comparison study of survival models. 70th Biometric Colloquium, 28 February-1 March 2024, Lübeck.
Golchian P, Kapar J, Blesch K, Watson DS, Wright MN. Adversarial random forests for imputing missing values. 70th Biometric Colloquium, 28 February-1 March 2024, Lübeck.
Burk L, Bender A, Wright MN. High-dimensional variable selection for competing risks with cooperative penalized regression. 5th Conference of the Central European Network (CEN), 3-7 September 2023, Basel, Switzerland.
Koenen N, Wright MN. Interpreting neural networks: A biostatistical perspective. 5th Conference of the Central European Network (CEN), 3-7 September 2023, Basel, Switzerland.
Blesch K, Watson DS, Wright MN. Conditional variable importance for mixed data. 6th Conference of the Deutsche Arbeitsgemeinschaft Statistik (DAGStat), 28 March-1 April 2022, Hamburg.
Koenen N, Wright MN. Interpreting deep neural networks with the R package innsight. 6th Conference of the Deutsche Arbeitsgemeinschaft Statistik (DAGStat), 28 March-1 April 2022, Hamburg.
Koenen N, Wright MN. Interpreting deep neural networks with the R package innsight. The R User Conference "UseR!," 20-23 June 2022, online presentation.
Wright MN, Blesch K, Watson DS. Testing conditional independence in supervised learning algorithms with the cpi package. The R User Conference "UseR!," 20-23 June 2022, online presentation.
Wright MN. Genome-wide conditional independence testing with machine learning. 67th Biometric Colloquium of the German Region of the International Biometric Society (IBS-DR), 14-17 March 2021, online presentation.

Posters at scientific conferences

Watson DS, Blesch K, Kapar J, Wright MN. Adversarial random forests for density estimation and generative modeling. 26th International Conference on Artificial Intelligence and Statistics (AISTATS), 25-27 April 2023, Valencia, Spain.

Software

Blesch K, Wright MN. arfpy. (Version 0.1.1); 2023.
https://github.com/bips-hb/arfpy
Koenen N, Baudeu R. innsight: Get the insights of your neural network. (Version 0.2.0); 2023.
https://github.com/bips-hb/innsight
Wright MN, Wager S, Probst P. ranger: A fast implementation of random forests. (Version 0.16.0); 2023.
https://cran.r-project.org/package=ranger
Wright MN, Watson DS. arf: Adversarial random forests. (Version 0.1.3); 2023.
https://cran.r-project.org/package=arf
Koenen N, Baudeu R. innsight: Get the insights of your neural network. (Version 0.1.1); 2022.
https://cran.r-project.org/package=innsight
Wright MN, Watson DS. cpi: Conditional predictive impact. (Version 0.1.4); 2022.
https://cran.r-project.org/package=cpi
Wright MN, Watson DS. arf: Adversarial random forests. (Version 0.1.2); 2022.
https://cran.r-project.org/package=arf
Koenen N, Baudeu R. innsight: Get the insights of your neural network. (Version 0.1.0); 2021.
https://cran.r-project.org/package=innsight
