Yoshihiro Izawa

Affiliation

The University of Tokyo, Graduate School of Information Science and Technology (IST), first-year master's student (M1)

Yamakata Laboratory

Research Field

Mechanistic Interpretability

About

My primary research area is Mechanistic Interpretability: understanding the internal mechanisms of Large Language Models (LLMs). I work on improving model interpretability and controllability through techniques such as Activation Steering.

Publications

Steering at the Source: Style Modulation Heads for Robust Persona Control

Yoshihiro Izawa, Gouki Minegishi, Koshi Eguchi, Sosuke Hosokawa, Kenjiro Taura

First Author, ICLR 2026 Workshop

arXiv Code OpenReview


Awards

Dean's Award for Research, Faculty of Engineering, The University of Tokyo, 2026 (Research)


Education

The University of Tokyo, Graduate School of Information Science and Technology / Yamakata Laboratory, 2026.4 - Present (Current)

The University of Tokyo, Department of Electrical Engineering and Information Systems / Taura Laboratory, 2022.4 - 2026.3


Presentations

Coming Soon...
