From Computerphile.
Continuing the theme of AI research and safety, Aric Floyd talks about how some Large Language Models might exhibit the all-too-human trait of sandbagging – "lying" about their true capabilities.
AI Sandbagging Paper: https://www.apolloresearch.ai/research/scheming-reasoning-evaluations
Computerphile is supported by Jane Street. Learn more about them (and exciting career opportunities) at: https://jane-st.co/computerphile
This video was filmed and edited by Sean Riley.
Computerphile is a sister project to Brady Haran’s Numberphile. More at https://www.bradyharanblog.com