Adrià Garriga-Alonso
AI Alignment Researcher
Technical Co-founder @ Dokimasia
I research how to make AI systems safe and interpretable. My work on Automated Circuit Discovery (ACDC) helped establish mechanistic interpretability as a field, with ~400 citations and adoption across major AI labs.
Currently building Dokimasia, helping humans realize their values when using computers. Previously led interpretability research at FAR AI, and before that worked at Redwood Research. PhD from the University of Cambridge on Bayesian neural networks.
Experience
2026–present
Technical Co-founder
Dokimasia
Building tools to help humans realize their values when using computers.
2023–2025
Research Scientist
FAR AI
Led interpretability research, managed a team of 3, and collaborated with 11 researchers. Built GPU infrastructure spanning 8–80 GPUs.
2022–2023
Member of Technical Staff
Redwood Research
Worked on correctness testing for optimizing compilers. Mentored 8 interns.
2017–2021
PhD in Machine Learning
University of Cambridge
Thesis: "Priors in finite and infinite Bayesian convolutional neural networks"