Statistical Inference with Machine Learning Methods

Description

The junior research group develops statistical inference methods for machine learning. We pay particular attention to epidemiological problems such as confounding, high-dimensional data, and survival analysis. The project is methodological in nature but strongly application-oriented. The methods we develop are publicly available as software packages that practitioners and applied researchers can use directly.

The group's main research areas are:
  • Interpretable machine learning
  • Statistical properties of machine learning methods
  • Survival analysis
  • Statistical software
  • Application to high-dimensional data

The group is funded by the Emmy Noether Programme of the Deutsche Forschungsgemeinschaft (DFG) and led by Marvin N. Wright.

Funding period

Start:   May 2020
End:   December 2026

Funder

  • Deutsche Forschungsgemeinschaft (DFG)

Contact person

Prof. Dr. Marvin N. Wright

Selected publications from the project

    Peer-reviewed journal articles

  • Blesch K, Watson DS, Wright MN. Conditional feature importance for mixed data. AStA Advances in Statistical Analysis. 2024; (Epub 2023 Apr 29).
    https://doi.org/10.1007/s10182-023-00477-9
  • Askland KD, Strong D, Wright MN, Moore JH. The translational machine: A novel machine-learning approach to illuminate complex genetic architectures. Genetic Epidemiology. 2021;45(5):485-536.
    https://dx.doi.org/10.1002/gepi.22383
  • Watson DS, Wright MN. Testing conditional independence in supervised learning algorithms. Machine Learning. 2021;110(8):2107-2129.
    https://doi.org/10.1007/s10994-021-06030-6

    Editorials

  • Boulesteix A-L, Wright MN. Special issue: Artificial intelligence in genomics. Human Genetics. 2022;141(9):1449-1450.
    https://doi.org/10.1007/s00439-022-02472-7

    Contributions to books and proceedings

  • Binder M, Pfisterer F, Becker M, Wright MN. Non-sequential pipelines and tuning. In: Bischl B, Sonabend R, Kotthoff L, Lang M, editors. Applied machine learning using mlr3 in R. Boca Raton: CRC Press. 2024. pp. 174-195
    https://mlr3book.mlr-org.com/chapters/chapter8/non-sequential_pipelines_and_tuning.html
  • Casalicchio G, Burk L. Evaluation and benchmarking. In: Bischl B, Sonabend R, Kotthoff L, Lang M, editors. Applied machine learning using mlr3 in R. Boca Raton: CRC Press. 2024. pp. 53-82
    https://mlr3book.mlr-org.com/chapters/chapter3/evaluation_and_benchmarking.html
  • Dandl S, Biecek P, Casalicchio G, Wright MN. Model interpretation. In: Bischl B, Sonabend R, Kotthoff L, Lang M, editors. Applied machine learning using mlr3 in R. Boca Raton: CRC Press. 2024. pp. 259-282
    https://mlr3book.mlr-org.com/chapters/chapter12/model_interpretation.html
  • Wright MN. Feature selection. In: Bischl B, Sonabend R, Kotthoff L, Lang M, editors. Applied machine learning using mlr3 in R. Boca Raton: CRC Press. 2024. pp. 146-160
    https://mlr3book.mlr-org.com/chapters/chapter6/feature_selection.html

    Talks at scientific conferences (invited)

  • Blesch K, Watson DS, Wright MN. Conditional feature importance for mixed data. Seminar in Econometrics, 2 May 2023, Cologne.
  • Blesch K, Wright MN, Watson DS. Unfooling SHAP and SAGE: Knockoff imputation for Shapley values. 2nd TRR 318 Conference "Measuring Understanding," 6-7 November 2023, Paderborn.
  • Swenne A, Wright MN. Confounder adjustment with random forests based on local residuals in genetic association studies. 5th Conference of the Central European Network (CEN), 3-7 September 2023, Basel, Switzerland.
  • Wright MN. From explainable AI to generative modeling with tree-based machine learning. Statistics and Econometrics Seminar, Humboldt-Universität zu Berlin, 17 October 2023, Berlin.
  • Wright MN. Interpretable machine learning. Begegnungszone: Statistical Physics and Machine Learning, 18-21 September 2023, Leipzig.
  • Wright MN. Random forests on high-dimensional data: From classification and survival analysis to generative modelling. Seminar of the Graduiertenkolleg 2624 at Technische Universität Dortmund, 4 July 2022, Dortmund.
  • Wright MN. Random forests: Myths and facts. 52nd Workshop Statistical Computing, 24-27 July 2022, Günzburg.
  • Wright MN. Interpretable machine learning. Interpretable Machine Learning Workshop with the School of Statistics and Actuarial Science, University of the Witwatersrand, 19-20 September 2022, Johannesburg, South Africa.
  • Wright MN. Machine learning for survival data. Seminar of the Institut für Medizinische Biometrie, Epidemiologie und Informatik (IMBEI), Universitätsmedizin der Johannes Gutenberg-Universität Mainz, 10 June 2021, Mainz.
  • Wright MN. Model-agnostic interpretable machine learning. MOOD (MOnitoring Outbreaks for Disease surveillance in a data science context) Webinar, 30 June 2021, online presentation.
  • Wright MN. Interpretable machine learning in genetics. XXXIInd Conference of the Austro-Swiss Region (ROeS) of the International Biometric Society, 9 September 2021, Salzburg, Austria.
  • Wright MN. Random forests: Myths and facts. Colloquium "Statistische Methoden in der empirischen Forschung," 23 November 2021, online presentation.
  • Wright MN. Machine learning for time to event data. Expert talk at the workshop of the project "ARTEMIS - Künstliche Intelligenz bei muskuloskelettalen Erkrankungen", 24 September 2021, online presentation.
  • Wright MN. Genome-wide interpretable machine learning. Seminar Series of the Charles Bronfman Institute for Personalized Medicine at the Icahn School of Medicine at Mount Sinai, 8 December 2021, online presentation.

    Talks at scientific conferences

  • Burk L, Zobolas J, Bischl B, Bender A, Lang M, Wright MN, Sonabend R. A large-scale neutral comparison study of survival models. 70th Biometric Colloquium, 28 February-1 March 2024, Lübeck.
  • Golchian P, Kapar J, Blesch K, Watson DS, Wright MN. Adversarial random forests for imputing missing values. 70th Biometric Colloquium, 28 February-1 March 2024, Lübeck.
  • Burk L, Bender A, Wright MN. High-dimensional variable selection for competing risks with cooperative penalized regression. 5th Conference of the Central European Network (CEN), 3-7 September 2023, Basel, Switzerland.
  • Koenen N, Wright MN. Interpreting neural networks: A biostatistical perspective. 5th Conference of the Central European Network (CEN), 3-7 September 2023, Basel, Switzerland.
  • Blesch K, Watson DS, Wright MN. Conditional variable importance for mixed data. 6th Conference of the Deutsche Arbeitsgemeinschaft Statistik (DAGStat), 28 March-1 April 2022, Hamburg.
  • Koenen N, Wright MN. Interpreting deep neural networks with the R package innsight. 6th Conference of the Deutsche Arbeitsgemeinschaft Statistik (DAGStat), 28 March-1 April 2022, Hamburg.
  • Koenen N, Wright MN. Interpreting deep neural networks with the R package innsight. The R User Conference "UseR!," 20-23 June 2022, online presentation.
  • Wright MN, Blesch K, Watson DS. Testing conditional independence in supervised learning algorithms with the cpi package. The R User Conference "UseR!," 20-23 June 2022, online presentation.
  • Wright MN. Genome-wide conditional independence testing with machine learning. 67th Biometric Colloquium of the German Region of the International Biometric Society (IBS-DR), 14-17 March 2021, online presentation.

    Posters at scientific conferences

  • Watson DS, Blesch K, Kapar J, Wright MN. Adversarial random forests for density estimation and generative modeling. 26th International Conference on Artificial Intelligence and Statistics (AISTATS), 25-27 April 2023, Valencia, Spain.

    Software

  • Blesch K, Wright MN. arfpy. (Version 0.1.1); 2023.
    https://github.com/bips-hb/arfpy
  • Koenen N, Baudeu R. innsight: Get the insights of your neural network. (Version 0.2.0); 2023.
    https://github.com/bips-hb/innsight
  • Wright MN, Wager S, Probst P. ranger: A fast implementation of random forests. (Version 0.16.0); 2023.
    https://cran.r-project.org/package=ranger
  • Wright MN, Watson DS. arf: Adversarial random forests. (Version 0.1.3); 2023.
    https://cran.r-project.org/package=arf
  • Koenen N, Baudeu R. innsight: Get the insights of your neural network. (Version 0.1.1); 2022.
    https://cran.r-project.org/package=innsight
  • Wright MN, Watson DS. cpi: Conditional predictive impact. (Version 0.1.4); 2022.
    https://cran.r-project.org/package=cpi
  • Wright MN, Watson DS. arf: Adversarial random forests. (Version 0.1.2); 2022.
    https://cran.r-project.org/package=arf
  • Koenen N, Baudeu R. innsight: Get the insights of your neural network. (Version 0.1.0); 2021.
    https://cran.r-project.org/package=innsight