Research Associate Professor of Computer Science at UT Austin
Contact: wbradleyknox@gmail.com (potential students, read instructions at bottom before emailing)
Google Scholar
News:
- Our Harmful Traits of AI Companions pre-print is on arXiv. It’s written to be readable for both technical and non-technical audiences.
- Our paper on leveraging VLMs to help users avoid distraction on computing devices was accepted to CHI 2026.
- Congrats to first author Callie Muslimani: our RLC 2025 paper on an alignment score for reward functions won the Outstanding Paper Award on Emerging Topics in Reinforcement Learning!
My most recent research on RLHF and value-aligned reward specification can be found here.
Here is a 5-minute talk I gave at the New Orleans Alignment Workshop on reframing the common RLHF fine-tuning algorithm. A longer version that covers much more is embedded below.
My research career has encompassed the control of robots and other agents, machine learning (reinforcement learning in particular), human-computer interaction, and computational models of human behavior for cognitive science research. I am particularly drawn to specifying problems, both doing so myself for novel research topics and studying how to enable users to specify problems for learning agents such that agents’ objectives are aligned with the users’ interests.
If you are interested in joining my group and are not a UT Austin student, please apply directly to UT Austin. Do not contact me! I do not control the admissions process: admission is based on grades, previous research experience, your research statement, and the quality of your reference letters. If you would like to work with me, first apply to UT Austin, then contact me once you are admitted and mention “B. F. Skinner” in the first few lines of your email (bonus if it’s in the natural flow of your message). Otherwise, to reserve time for my research activities, I will probably not respond. I also do not plan to collaborate with high school students and likewise will probably not respond.