MISSION BRIEFING // SEC_LEVEL: PUBLIC

Researching methods to make AI systems safe and interpretable. Senior author of Automated Circuit Discovery (ACDC), a subfield-defining work in mechanistic interpretability with ~400 citations, now adopted by major AI laboratories.

Currently building DOKIMASIA: tools to help humans realize their values when using computers. Previously led interpretability research at FAR AI and built testing infrastructure at Redwood Research. PhD from Cambridge on Bayesian neural networks, supervised by Carl E. Rasmussen.

OPERATIONAL HISTORY // TIMELINE
2026 — PRESENT
Technical Co-founder
DOKIMASIA
Building tools for value-aligned computing: a shield against unwanted information.
2023 — 2025
Research Scientist
FAR AI
Led interpretability research. Managed a team of 3 and collaborated with 11 others. Built GPU infrastructure (8-80 GPUs).
2022 — 2023
Member of Technical Staff
Redwood Research
Built correctness-testing infrastructure for optimizing compilers. Mentored 8 interns across 4 projects.
2017 — 2021
PhD Machine Learning
University of Cambridge
Thesis: "Priors in finite and infinite Bayesian convolutional neural networks"
2016 — 2017
MSc Computer Science
University of Oxford // Distinction
2012 — 2016
BSc Computer Science
Pompeu Fabra University // 1st in class
RESEARCH OUTPUT // SELECTED
NeurIPS 2023 // Spotlight
Towards Automated Circuit Discovery for Mechanistic Interpretability
A. Conmy, A. Mavor-Parker, A. Lynch, S. Heimersheim, A. Garriga-Alonso
▲ ~400 CITATIONS
ICLR 2019
Deep Convolutional Networks as Shallow Gaussian Processes
A. Garriga-Alonso, L. Aitchison, C.E. Rasmussen
▲ ~330 CITATIONS
Alignment Forum
Causal Scrubbing: A Method for Rigorously Testing Interpretability Hypotheses
L. Chan, A. Garriga-Alonso, N. Goldowsky-Dill, R. Greenblatt, et al.
▲ ~90 CITATIONS
FIELD REPORTS // CONTINUOUSLY UPDATED