
Yoshihiro Izawa

Affiliation

The University of Tokyo, Graduate School of Information Science and Technology (IST), first-year master's student (M1)

Yamakata Laboratory

Research Field

Mechanistic Interpretability

About

My primary research area is Mechanistic Interpretability: understanding the internal mechanisms of Large Language Models (LLMs). I work on making models more interpretable and controllable through techniques such as Activation Steering.
