From Computerphile.
As Large Language Models improve, the tokens they predict produce ever more complex and nuanced outcomes. Rob Miles and Ryan Greenblatt discuss "Alignment Faking", a paper from Ryan's team exploring ideas Rob covered in a series of Computerphile videos back in 2017.
The Alignment Faking paper: https://tinyurl.com/C-Paper-AlignmentFaking
Ryan Greenblatt is chief scientist at Redwood Research (a nonprofit AI safety and security research organization): https://tinyurl.com/C-RedwoodResearch
Rob Miles makes videos on AI Safety: https://tinyurl.com/C-RobSKMiles
NB: if the video seems a bit 'smeary', that's an artefact of attempting to cancel out the flickering of the light in the background – something I missed while shooting and have done my best to correct in the edit. -Sean
Computerphile is supported by Jane Street. Learn more about them (and exciting career opportunities) at: https://jane-st.co/computerphile
This video was filmed and edited by Sean Riley.
Computerphile is a sister project to Brady Haran’s Numberphile. More at https://www.bradyharanblog.com