Jonathan Crabbé

PhD Researcher

University of Cambridge


I am currently working towards my PhD in the Applied Mathematics Department at the University of Cambridge. In this stimulating environment, I am learning to become a well-rounded Machine Learning researcher.

We are gradually entering a phase in which humans will increasingly interact with generative and predictive AIs, forming human-AI teams. I see immense potential in these teams for tackling cutting-edge scientific and medical problems. My research focuses on making these teams more effective by improving the information flow between complex ML models and human users. This touches upon various subjects in the AI literature, including ML Interpretability, Robust ML and Data-Centric AI. In a sense, my goal is to build a microscope that allows human beings to look inside a machine learning model. Through the interface of this microscope, humans can rigorously validate ML models, extract knowledge from them and learn to use them more efficiently.

Download my résumé.

  • Interpretability
  • Robust ML
  • Generative AI
  • ML for Science and Healthcare
  • Representation Learning
  • Data-Centric AI
  • PhD in Applied Mathematics, 2020-2024

    University of Cambridge

  • MASt in Applied Mathematics, 2018-2019

    University of Cambridge

  • M1 in Physics, 2017-2018

    Ecole Normale Supérieure Paris

  • Bachelor in Engineering, 2014-2017

    Université Libre de Bruxelles



Experience at big-tech companies; implementation of state-of-the-art ML

Mathematical Modelling

Strong mathematical background; my publications have substantial theoretical components


Presentation of my research at many prestigious venues (NeurIPS, ICML, ICLR)

Teamwork

~50% of my publications are the result of collaborative work


~50% of my publications are the result of autonomous work


Supervision of several MPhil and PhD students, creation of pedagogical YouTube videos


Apple Research
Research Intern
May 2023 – Oct 2023 Cambridge, UK
Microsoft Research
Research Intern
Feb 2023 – May 2023 Cambridge, UK
  • Conducted research in the AI4Science team.
  • Contributed to the development of generative ML techniques for material discovery.
  • Presented findings to the AI4Science team; results will be integrated into a larger paper.
University of Cambridge
PhD Researcher
Oct 2020 – Present Cambridge, UK
  • Conducted research in various sub-fields of machine learning.
  • Published several papers in top-tier conferences (NeurIPS, ICML).
  • Supervised the research of several MPhil/PhD students.
Quantitative Research Intern
Jun 2022 – Sep 2022 London, UK
  • Conducted research in machine learning applied to quantitative finance.
  • Learned to turn raw financial data into predictive features.
  • Presented findings to quant managers.
Université Libre de Bruxelles
Research Intern
Oct 2019 – Oct 2020 Bruxelles, BE
  • Conducted research in black hole physics.
  • Created several pedagogical videos to help young students with maths and physics.
  • Responsible for physics example classes for first-year pharmacy students.
Imperial College London
Research Intern
Feb 2018 – Jul 2018 London, UK
  • Conducted research in quantum field theory and cosmology.
  • Implemented a numerical solver for simulating the evolution of a toy cosmological model.
  • Demonstrated the emergence of a new type of singularity called caustics.


PhD Fellowship
Full PhD funding (tuition and maintenance).
Research Assistant Fellowship
Funding for a year of research.
Jennings Prize
Awarded based on outstanding results in my MASt.
Awarded based on academic excellence.

Recent Publications

(2023). Robust multimodal models have outlier features and encode more concepts. In arXiv.

(2023). Explaining the Absorption Features of Deep Learning Hyperspectral Classification Models. In IGARSS 2023.

(2023). TRIAGE: Characterizing and auditing training data for improved regression. In NeurIPS 2023.

(2023). TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization. In ICLR 2023.

(2023). Joint Training of Deep Ensembles Fails Due to Learner Collusion. In NeurIPS 2023.

(2022). Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data. In NeurIPS 2022.

(2022). Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability. NeurIPS 2022 Datasets and Benchmarks Track.

(2022). Data-SUITE: Data-centric identification of in-distribution incongruous examples. In ICML 2022.

(2022). Latent Density Models for Uncertainty Categorization. In NeurIPS 2023.

(2021). Explaining Latent Representations with a Corpus of Examples. In NeurIPS 2021.

(2021). Explaining Time Series Predictions with Dynamic Masks. In ICML 2021.

(2020). Learning outside the black-box: the pursuit of interpretable models. In NeurIPS 2020.
