Description

Jacob Steinhardt (Google Scholar) (Website) is an assistant professor at UC Berkeley. His main research interest is in designing machine learning systems that are reliable and aligned with human values. Some of his specific research directions include robustness, reward specification and reward hacking, and scalable alignment.

Highlights:

📜 “Test accuracy is a very limited metric.”

👨‍👩‍👧‍👦 “You might not be able to get lots of feedback on human values.”

📊 “I’m interested in measuring the progress in AI capabilities.”