Description

In this informal mini-episode, Josh Stepp covers two AI-related topics. First, he explores the "Vending-Bench" research paper, which tests the long-term coherence of LLM-based agents running a vending machine business. The paper reveals high variance in performance: top models like Claude 3.5 Sonnet and OpenAI's o3-mini outperform humans but occasionally spiral into chaotic behavior, such as spamming the FBI over minor issues. Then, Josh reacts to a Pen Test Partners blog post about exploiting SharePoint via Microsoft's Copilot, highlighting how attackers can bypass access controls and forensic tracking to mine sensitive data.

Call to Action:

* Subscribe to the podcast for more episodes on high-profile cyber intrusions.

* Visit our website at intrusionsindepth.com for additional stories and insights.

* Share your thoughts on social media using #IntrusionsInDepth.

Links and Resources:

* https://arxiv.org/pdf/2502.15840

* https://andonlabs.com/

* https://www.pentestpartners.com/security-blog/exploiting-copilot-ai-for-sharepoint/

* Host: Josh Stepp

* Produced by: Josh Stepp

Thank you for tuning in to IntrusionsInDepth. Stay informed, stay safe, and see you in the next episode!
