Ward Gauderis

Pleinlaan 9, 1050 Brussels, Belgium

Hello! I’m an FWO PhD Fellow in Brussels, supervised by Prof. Geraint Wiggins, studying compositionality as a mathematical foundation for deep learning. I use tensor networks and category theory to study neural representations as induced by the computations encoded in a model’s weight structure.

Rather than reading tea leaves in activation space, I treat the model’s weights as a formal compositional system, so emergent behaviour becomes a direct function of its algebraic wiring. This enables a divide-and-conquer approach that naturally links local mechanisms to global properties.

My work proposes tensor models to bridge neuro-symbolic and mechanistic analysis, aiming to understand how models compose concepts to generalise… or not.

Q-CHARM

How can compositional design improve compositional behaviour?

My FWO-funded PhD project, Q-CHARM, takes this question seriously by distinguishing between a model’s architecture (its compositional design) and the structure that emerges during learning (its compositional behaviour). The relationship between the two is less obvious than it looks: deep networks cannot efficiently learn compositional functions from data alone, so what you build in shapes what you can hope to get out.

The cure for black boxes is simply drawing better boxes.

I bridge two complementary paths: imposing explicit structure before training (neuro-symbolic design) and exposing implicit structure after training (mechanistic interpretability). Embedding domain structure is essential to guide learning toward representations that align with human understanding and generalise well.

To formalise this, I use string diagrams (life is too short for indices) to cast models as rigorous mathematical objects, cleanly separating high-level Syntax (symbolic rules and structure) from low-level Semantics (subsymbolic representations). As a devout Yoneda disciple, I believe this categorical grounding is the only way to formally reason about behaviour beyond the training set.

The practical blueprint for this vision lies in tensor models. They unify the expressivity of neural networks with the tractability of tensor networks. Because their weights possess a well-understood geometry, we can perform tractable analysis both pre- and post-training. While central to applied category theory and neuro-symbolic AI, they are rarely combined with interpretability; my work unites these perspectives.
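As a minimal sketch of what a tensor model can look like, assume a single bilinear layer. This is an illustrative choice, not necessarily the architecture used in this work. The names `bilinear_layer`, `as_tensor`, `W`, and `V` are hypothetical. The sketch checks that the layer, written the usual neural-network way, equals the contraction of one third-order weight tensor with two copies of the input.

```python
import numpy as np


def bilinear_layer(W, V, x):
    """Elementwise product of two linear maps: y_k = (W x)_k * (V x)_k."""
    return (W @ x) * (V @ x)


def as_tensor(W, V):
    """Third-order weight tensor with B[k, i, j] = W[k, i] * V[k, j]."""
    return np.einsum("ki,kj->kij", W, V)


rng = np.random.default_rng(0)
d_in, d_out = 4, 3  # assumed toy dimensions
W = rng.normal(size=(d_out, d_in))
V = rng.normal(size=(d_out, d_in))
x = rng.normal(size=d_in)

B = as_tensor(W, V)

# Neural-network view: two matrix products and an elementwise product.
y_layer = bilinear_layer(W, V, x)

# Tensor-network view: contract B with two copies of the input.
y_tensor = np.einsum("kij,i,j->k", B, x, x)

assert np.allclose(y_layer, y_tensor)
```

Because the whole layer is the single tensor `B`, questions about its behaviour can be asked of the weights themselves, without sampling inputs.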

Compositional Interpretability

Every good academic page needs a Venn diagram.

Current mechanistic interpretability lacks formal foundations, relying on post-hoc activation heuristics that often assume a layer-by-layer stratification. This makes it nearly impossible to tell if a feature is causally useful globally or just a local artifact.

The CompInterp framework shifts the focus from isolated features to their interactions as first-class citizens. To achieve measurable interpretability, we ground our analysis in formal decompositions rather than data-dependent heuristics:

Unified Algebra: By formulating weights, data, and subcircuit interactions via tensor contraction, standard matrix decompositions (SVD, ICA, etc.) naturally lift to complex architectures. Since the result remains a tensor model, we can trace discovered mechanisms back to the full architecture.
Weight-Based Analysis: Tensor networks capture higher-order relations between representation spaces. Analysing their polynomial coefficients directly in the weight geometry means our conclusions don’t depend on the training distribution unless we want them to.
Disentangling Interactions: To filter out spurious correlations, decompositions must balance complexity and faithfulness. Because these properties are themselves compositional, they propagate cleanly through the model.
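The weight-based point can be illustrated with a small sketch, again assuming a toy bilinear layer y = (W x) * (V x) rather than any specific model from this work. For a chosen output direction `u` (hypothetical, as are `interaction_matrix`, `W`, and `V`), the scalar u · y(x) is a quadratic form whose coefficients live entirely in the weights, so an ordinary eigendecomposition describes it for every input at once.

```python
import numpy as np


def interaction_matrix(W, V, u):
    """Symmetric matrix Q such that u . ((W x) * (V x)) == x^T Q x."""
    Q = np.einsum("k,ki,kj->ij", u, W, V)
    # Only the symmetric part of Q contributes to a quadratic form.
    return 0.5 * (Q + Q.T)


rng = np.random.default_rng(0)
d_in, d_out = 4, 3  # assumed toy dimensions
W = rng.normal(size=(d_out, d_in))
V = rng.normal(size=(d_out, d_in))
u = rng.normal(size=d_out)

# The decomposition uses the weights only; no data is involved.
Q = interaction_matrix(W, V, u)
eigvals, eigvecs = np.linalg.eigh(Q)

# Check on an arbitrary input that the spectrum reproduces the layer.
x = rng.normal(size=d_in)
direct = u @ ((W @ x) * (V @ x))
spectral = np.sum(eigvals * (eigvecs.T @ x) ** 2)

assert np.allclose(direct, spectral)
```

Each eigenvector is an input direction and its eigenvalue says how strongly, and with which sign, that direction drives the chosen output. Truncating small eigenvalues trades faithfulness for a simpler description.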

Research Interests

If you want my full attention, just mention any of these…

Compositionality in AI: Category theory, string diagrams, geometric deep learning
Mechanistic Interpretability: Weight space analysis, parameter decompositions
Neuro-symbolic Architectures: Probabilistic circuits, tensor logic
Quantum-ish Mathematics: Tensor networks, Hilbert spaces, information geometry
Effective Theories of DL: Renormalisation, algebraic geometry, stochastic complexity
Models of Cognition & Creativity: Active inference, conceptual spaces

Hobbies

When I’m not agonising about model structure, I’m probably skating through the city, singing and playing piano, or falling down a philomathematical rabbit hole. I also love building FOSS, playing chess or other (board) games, and conversations that stretch the brain a little.

News

May 12, 2026	Our work on finding manifolds got a plenary pitch at the Flanders AI Research Day!
Mar 29, 2026	I am mentoring for MARS V (Mentorship for Alignment Research Students)!
Dec 07, 2025	Our work on bilinear autoencoders got a spotlight at the Mechanistic Interpretability Workshop (NeurIPS 2025)!
Nov 01, 2025	I have been awarded an FWO PhD Fellowship for fundamental research to study the role of compositionality in deep learning!
Oct 15, 2025	Come check out our deep dive on compositional interpretability at the Flanders AI Research Day 2025!

Selected Publications

Arxiv
Bilinear Autoencoders Find Interpretable Manifolds

Thomas Dooms^*, Ward Gauderis^*, Geraint Wiggins, and Jose Oramas

May 2026

Sparse autoencoders have become a standard tool for uncovering interpretable latent representations in neural networks. Yet salient concepts often span manifolds that current linear methods cannot capture without post hoc analysis. This paper uses quadratic latents to close this gap: we implement these with bilinear autoencoders, which decompose activations into low-rank quadratic forms, compose linearly in weight space, and admit input-independent geometric analysis. This qualitative difference in what concepts quadratic latents can detect challenges the standard linear representation hypothesis. Our experiments and visualisations show that multi-dimensional geometries are highly prevalent and that composite latents capture them well, systematically improving reconstruction error in language models. Furthermore, we show that autoencoders with varying geometric priors recover the same input subspace despite their dictionary entries being distinct. Practically, these models serve as an unsupervised tool for manifold discovery, which we demonstrate through an interactive online visualizer for Qwen 3.5. This is a step toward nonlinear but mathematically tractable latent representations whose composition is expressive and interpretable by design.
@misc{doomsBilinearAutoencodersFind2026, title = {Bilinear Autoencoders Find Interpretable Manifolds}, author = {Dooms, Thomas and Gauderis, Ward and Wiggins, Geraint and Oramas, Jose}, year = {2026}, month = may, number = {arXiv:2605.08891}, primaryclass = {cs.LG}, publisher = {arXiv}, doi = {10.48550/arXiv.2605.08891}, urldate = {2026-05-16}, archiveprefix = {arXiv}, }
Arxiv
From Mechanistic to Compositional Interpretability

Ward Gauderis^*, Thomas Dooms^*, Steven T. Homer, Kola Ayonrinde, and Geraint A. Wiggins

May 2026

Mechanistic interpretability aims to explain neural model behaviour by reverse-engineering learned computational structure into human-understandable components. Without a formal framework, however, mechanistic explanations cannot be objectively verified, compared, or composed. We introduce compositional interpretability, a category-theoretic framework grounded in the principles of compositionality and minimum description length. Compositional interpretations are pairs of syntactic and semantic mappings that must commute to enforce consistency between a model’s decomposition and its observed behaviour. We deconstruct explanation quality into measures of faithfulness and complexity to cast interpretability as a constrained optimisation problem, and introduce compressive refinement to systematically restructure models into simpler parts without altering their function. Finally, we prove a parsimony criterion under which syntactic compression theoretically guarantees more concise, human-aligned explanations. Our framework situates prominent mechanistic methods as subclasses of refinement, and clarifies why their compressibility heuristics tend to align with human interpretability. Our work provides a measurable, optimisable foundation for automating the discovery and evaluation of mechanistic explanations.
@misc{gauderisMechanisticCompositionalInterpretability2026a, title = {From {{Mechanistic}} to {{Compositional Interpretability}}}, author = {Gauderis, Ward and Dooms, Thomas and Homer, Steven T. and Ayonrinde, Kola and Wiggins, Geraint A.}, year = {2026}, month = may, number = {arXiv:2605.08934}, primaryclass = {cs.LG}, publisher = {arXiv}, doi = {10.48550/arXiv.2605.08934}, urldate = {2026-05-16}, archiveprefix = {arXiv}, }
CoLoRAI @ AAAI
Compositionality Unlocks Deep Interpretable Models

Thomas Dooms^*, Ward Gauderis^*, Geraint Wiggins, and Jose Antonio Oramas Mogrovejo

In Connecting Low-Rank Representations in AI: At the 39th Annual AAAI Conference on Artificial Intelligence, Nov 2024

We propose χ-net, an intrinsically interpretable architecture combining the compositional multilinear structure of tensor networks with the expressivity and efficiency of deep neural networks. χ-nets match the accuracy of their baseline counterparts. Our novel, efficient diagonalisation algorithm, ODT, reveals linear low-rank structure in a multilayer SVHN model. We leverage this toward formal weight-based interpretability and model compression.
@inproceedings{doomsCompositionalityUnlocksDeep24, title = {Compositionality {Unlocks} {Deep} {Interpretable} {Models}}, url = {https://openreview.net/forum?id=bXAt5iZ69l}, urldate = {2025-02-17}, booktitle = {Connecting {Low}-{Rank} {Representations} in {AI}: {At} the 39th {Annual} {AAAI} {Conference} on {Artificial} {Intelligence}}, author = {Dooms, Thomas and Gauderis, Ward and Wiggins, Geraint and Mogrovejo, Jose Antonio Oramas}, month = nov, year = {2024}, }
MASTER
Quantum Theory in Knowledge Representation: A Novel Approach to Reasoning with a Quantum Model of Concepts

Ward Gauderis and Geraint Wiggins

Vrije Universiteit Brussel, Aug 2023

This thesis explores novel approaches to compositional reasoning in AI leveraging the mathematics of quantum theory as a general probabilistic theory. Starting from the quantum picturalism paradigm, offering a diagrammatic category-theoretic language, I show that quantum theory provides practical modelling and computational benefits for AI. A literature survey connects various applications, from quantum game theory and satisfiability to ML, NLP and cognition. How to formally represent and reason with concepts is a longstanding challenge in cognitive science and AI. My thesis studies the Quantum Model of Concepts (QMC), which provides conceptual space theory with quantum theoretical semantics. The diagrammatic language serves as a compositional framework for both, exposing common structures and facilitating insights between domains. I implement the model as a hybrid quantum-classical architecture on real quantum hardware to explore how QMCs can form practical intermediate, compositional representations for artificial agents combining symbolic and subsymbolic reasoning. Addressing the symbol grounding problem, I show that QMC representations can be learned from raw data in a (self-)supervised subsymbolic way, but that composite concepts can also be grounded in simpler ones to be interpretable and data-efficient. By transforming quantum concepts into probabilistic generative processes, the QMC can solve visual relational Blackbird puzzles involving abstraction and perceptual uncertainty, similar to Raven’s Progressive Matrices.
@phdthesis{gauderisQuantumTheoryKnowledge2023, title = {Quantum {{Theory}} in {{Knowledge Representation}}: {{A Novel Approach}} to {{Reasoning}} with a {{Quantum Model}} of {{Concepts}}}, shorttitle = {Quantum {{Theory}} in {{Knowledge Representation}}}, author = {Gauderis, Ward and Wiggins, Geraint}, year = {2023}, month = aug, address = {Brussels, Belgium}, school = {Vrije Universiteit Brussel}, }