Research Associate Professor of Computer Science at UT Austin
Contact:
wbradleyknox@gmail.com (potential students: read the instructions at the bottom before emailing)
Google Scholar

My most recent research on RLHF and value-aligned reward specification can be found here.

Here is a 5-minute talk I gave at the New Orleans Alignment Workshop on reframing the common RLHF fine-tuning algorithm. A longer version that covers much more is embedded below.

My research career has encompassed the control of robots and other agents, machine learning (reinforcement learning in particular), human-computer interaction, and computational models of human behavior for cognitive science research. I am particularly drawn to specifying problems: both specifying them myself for novel research topics and studying how to enable users to specify problems for learning agents so that the agents’ objectives are aligned with the users’ interests.

If you are interested in joining my group and are not a UT Austin student, please apply directly to UT Austin. Do not contact me! I do not control the admissions process: admission is based on grades, previous research experience, your research statement, and the quality of your reference letters. If you would like to work with me, first apply to UT Austin, then contact me once you are admitted and mention “B. F. Skinner” in the first few lines of your email (bonus if it fits the natural flow of your message). Otherwise, to preserve time for my research activities, I will probably not respond. I also do not plan to collaborate with high school students and likewise will probably not respond.