What can we learn from recent empirical demonstrations of scheming in frontier models? Text version here: https://joecarlsmith.com/2024/12/18/takes-on-alignment-faking-in-large-language-models/