Publications

Full list on Google Scholar.

EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law

I. Lichkovski, A. Müller, M. Ibrahim, T. Mhundwa

arXiv preprint, 2025

A benchmark for evaluating how well LLM agents comply with EU legislation. Investigates whether language model agents exhibit illegal behavior when deployed in realistic scenarios governed by European law.

Uncovering Internal Prediction Mechanisms of Transformer-Based Chemical Foundation Models

A. Müller, J. Cardenas-Cartagena, R. Pollice

ChemRxiv, 2025

Applying mechanistic interpretability techniques to chemical foundation models to investigate how transformers internally represent and predict molecular properties. Follow-up to my BSc thesis research.

From Steering Vectors to Conceptors: Compositional Affine Activation Steering for LLMs

S. Abreu, J. Postmus, A. Müller, J.L. Ferrao, I. Lichkovski, K.F. Michalak, et al.

2025

Extending steering vectors with conceptors for finer-grained, composable control of LLM behavior via activation steering.

Collective Deliberation for Safer CBRN Decisions: A Multi-Agent LLM Debate Pipeline

A. Müller, A. Golicins, G. Lesnic

AI Safety Initiative Groningen · Apart Research, 2025

Testing whether structured multi-agent LLM debate can improve reliability in high-stakes CBRN decision-making without model retraining. Collective debate significantly outperformed the best individual model, with resilience to adversarial persuasion.

Course Projects

Selected academic projects from my MSc coursework, ranging from multi-agent systems and epistemic logic to advanced machine learning and proof theory.

The Wisdom of the LLM Crowd

A. Todorov, A. Müller, M. Umaña, S. Ferguson

Design of Multi-Agent Systems · University of Groningen

Can collective LLM deliberation produce moral judgments more aligned with human preferences than individual models? We simulated deliberative groups of five LLMs on 1,000 Moral Machine dilemmas and found that deliberation actually worsened moral alignment. Debate-induced extremization collapsed nuanced priors into simplistic heuristics.

Multi-Agent Systems · AI Alignment · Moral Machine · LLM Debate
Read PDF →

Intervention Analysis on the Latent Space of Variational Autoencoders

A. Müller, A. Predescu, T. Ludwig, M. Umaña Lemus, A. Sultanji

Advanced Machine Learning · University of Groningen

How does training data diversity influence uncertainty and reconstruction blur in VAEs? We varied class and sample diversity on MNIST/FashionMNIST, then applied intervention analysis to identify which latent dimensions contribute most to reconstruction error. In our experiments, more diverse data led to richer, more interpretable latent spaces.

VAEs · Interpretability · Uncertainty

Bridging Social Choice and Dynamic Epistemic Logic by Modeling Strategic Voting

A. Müller, J. Wallinga, J. de Vries, L. Tanis

Logical Aspects of Multi-Agent Systems (LAMAS) · University of Groningen

Strategic voting in repeated elections, analyzed with dynamic epistemic logic. We model polls as public announcements in S5 epistemic logic and use the Kendall tau distance as a manipulation heuristic, showing how a strategic voter can manipulate outcomes while inadvertently creating common knowledge.

Epistemic Logic · Social Choice Theory · Game Theory
Read PDF →

Modal Logic and Proof Theory: Sequent Calculus Proofs

A. Müller

Modal Logic and Proof Theory · University of Groningen

Individual assignment on sequent calculus derivations for propositional and modal logic. Covers proofs of standard axioms, cut elimination, and derivations in modal systems including S4 and S5.

Modal Logic · Proof Theory · Sequent Calculus
Read PDF →