Publications
Full list on Google Scholar.
EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law
I. Lichkovski, A. Müller, M. Ibrahim, T. Mhundwa
arXiv preprint, 2025
A benchmark for evaluating how well LLM agents comply with EU legislation. Investigates whether language model agents exhibit illegal behavior when deployed in realistic scenarios governed by European law.
Uncovering Internal Prediction Mechanisms of Transformer-Based Chemical Foundation Models
A. Müller, J. Cardenas-Cartagena, R. Pollice
ChemRxiv, 2025
Applies mechanistic interpretability techniques to chemical foundation models to investigate how transformers internally represent and predict molecular properties. A follow-up to my BSc thesis research.
From Steering Vectors to Conceptors: Compositional Affine Activation Steering for LLMs
S. Abreu, J. Postmus, A. Müller, J.L. Ferrao, I. Lichkovski, K.F. Michalak, et al.
2025
Extends steering vectors with conceptors to give more fine-grained, composable control of LLM behavior through activation steering.
Collective Deliberation for Safer CBRN Decisions: A Multi-Agent LLM Debate Pipeline
A. Müller, A. Golicins, G. Lesnic
AI Safety Initiative Groningen · Apart Research, 2025
Tests whether structured multi-agent LLM debate can improve reliability in high-stakes CBRN decision-making without model retraining. Collective debate significantly outperformed the best individual model and remained resilient to adversarial persuasion.
Course Projects
Selected academic projects from my MSc coursework, ranging from multi-agent systems and epistemic logic to advanced machine learning and proof theory.
The Wisdom of the LLM Crowd
A. Todorov, A. Müller, M. Umaña Lemus, S. Ferguson
Design of Multi-Agent Systems · University of Groningen
Can collective LLM deliberation produce moral judgments more aligned with human preferences than individual models? We simulated deliberative groups of five LLMs on 1,000 Moral Machine dilemmas and found that deliberation actually worsened moral alignment. Debate-induced extremization collapsed nuanced priors into simplistic heuristics.
Intervention Analysis on the Latent Space of Variational Autoencoders
A. Müller, A. Predescu, T. Ludwig, M. Umaña Lemus, A. Sultanji
Advanced Machine Learning · University of Groningen
How does training data diversity influence uncertainty and reconstruction blur in VAEs? We varied class and sample diversity on MNIST and FashionMNIST, then applied intervention analysis to identify which latent dimensions contribute most to reconstruction error. More diverse data led to richer, more interpretable latent spaces.
Bridging Social Choice and Dynamic Epistemic Logic by Modeling Strategic Voting
A. Müller, J. Wallinga, J. de Vries, L. Tanis
Logical Aspects of Multi-Agent Systems (LAMAS) · University of Groningen
A study of strategic voting in repeated elections using dynamic epistemic logic. We model polls as public announcements in S5 epistemic logic and use the Kendall tau distance as a manipulation heuristic, showing how a strategic voter can successfully manipulate outcomes while inadvertently creating common knowledge.
Modal Logic and Proof Theory: Sequent Calculus Proofs
A. Müller
Modal Logic and Proof Theory · University of Groningen
Individual assignment on sequent calculus derivations for propositional and modal logic. Covers proofs of standard axioms, cut elimination, and derivations in systems including the modal logics S4 and S5.