Research Associate Professor of Computer Science at UT Austin
Google Scholar

My most recent research on RLHF and value-aligned reward specification can be found here.

Here is a 5-minute talk I gave on reframing the common RLHF fine-tuning algorithm at the New Orleans Alignment Workshop. A longer version that covers much more is embedded below.

My research career has encompassed the control of robots and other agents, machine learning (reinforcement learning in particular), human-computer interaction, and computational models of human behavior for cognitive science research. I am particularly drawn to specifying problems, both doing so myself for novel research topics and studying how to enable users to specify problems for learning agents such that agents’ objectives are aligned with the users’ interests.
