Showing episodes and shows of
Robin Ranjit Singh Chauhan
Shows
TalkRL: The Reinforcement Learning Podcast
Danijar Hafner on Dreamer v4
Danijar Hafner was a Research Scientist at Google DeepMind until recently. Featured References: Training Agents Inside of Scalable World Models [blog], Danijar Hafner, Wilson Yan, Timothy Lillicrap; One Step Diffusion via Shortcut Models, Kevin Frans, Danijar Hafner, Sergey Levine, Pieter Abbeel; Action and Perception as Divergence Minimization [blog], Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess. Additional References: Mastering Diverse Domains through World Models [blog] (DreamerV3), Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap; Mastering Atari with Discrete...
2025-11-10
1h 40
TalkRL: The Reinforcement Learning Podcast
David Abel on the Science of Agency @ RLDM 2025
David Abel is a Senior Research Scientist at DeepMind on the Agency team, and an Honorary Fellow at the University of Edinburgh. His research blends computer science and philosophy, exploring foundational questions about reinforcement learning, definitions, and the nature of agency. Featured References: Plasticity as the Mirror of Empowerment, David Abel, Michael Bowling, André Barreto, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh; A Definition of Continual RL, David Abel, André...
2025-09-08
59 min
TalkRL: The Reinforcement Learning Podcast
Jake Beck, Alex Goldie, & Cornelius Braun on Sutton's OaK, Metalearning, LLMs, Squirrels @ RLC 2025
Recorded at Reinforcement Learning Conference 2025 at University of Alberta, Edmonton, Alberta, Canada. Featured References: Lecture on the OaK Architecture, Rich Sutton; Alberta Plan, Rich Sutton with Mike Bowling and Patrick Pilarski. Additional References: Jacob Beck on Google Scholar; Alex Goldie on Google Scholar; Cornelius Braun on Google Scholar; Reinforcement Learning Conference.
2025-08-19
12 min
TalkRL: The Reinforcement Learning Podcast
Outstanding Paper Award Winners - 2/2 @ RLC 2025
We caught up with the RLC Outstanding Paper award winners for your listening pleasure. Recorded on location at Reinforcement Learning Conference 2025, at University of Alberta, in Edmonton, Alberta, Canada in August 2025. Featured References: Empirical Reinforcement Learning Research: Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions, Ayush Jain, Norio Kosaka, Xinhu Li, Kyung-Min Kim, Erdem Biyik, Joseph J Lim; Applications of Reinforcement Learning: WOFOSTGym: A Crop Simulator for Learning Annual and Perennial Crop Management Strategies, William Solow, Sandhya Saisubramanian, Alan Fern; Emerging Topics in Reinforcement...
2025-08-18
14 min
TalkRL: The Reinforcement Learning Podcast
Outstanding Paper Award Winners - 1/2 @ RLC 2025
We caught up with the RLC Outstanding Paper award winners for your listening pleasure. Recorded on location at Reinforcement Learning Conference 2025, at University of Alberta, in Edmonton, Alberta, Canada in August 2025. Featured References: Scientific Understanding in Reinforcement Learning: How Should We Meta-Learn Reinforcement Learning Algorithms?, Alexander David Goldie, Zilin Wang, Jakob Nicolaus Foerster, Shimon Whiteson; Tooling, Environments, and Evaluation for Reinforcement Learning: Syllabus: Portable Curricula for Reinforcement Learning Agents, Ryan Sullivan, Ryan Pégoud, Ameen Ur Rehman, Xinchen Yang, Junyun Huang, Aayush Verma, Nistha Mitra, John P Dickerso...
2025-08-15
06 min
TalkRL: The Reinforcement Learning Podcast
Thomas Akam on Model-based RL in the Brain
Prof Thomas Akam is a Neuroscientist at the Oxford University Department of Experimental Psychology. He is a Wellcome Career Development Fellow and Associate Professor at the University of Oxford, and leads the Cognitive Circuits research group. Featured References: Brain Architecture for Adaptive Behaviour, Thomas Akam, RLDM 2025 Tutorial. Additional References: Thomas Akam on Google Scholar; pyPhotometry: Open source, Python based, fiber photometry data acquisition; pyControl: Open source, Python based, behavioural experiment control; Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control, Nathaniel D Daw, Yael Niv, Peter Dayan, 2005; Further analysis of th...
2025-08-04
52 min
TalkRL: The Reinforcement Learning Podcast
Stefano Albrecht on Multi-Agent RL @ RLDM 2025
Stefano V. Albrecht was previously Associate Professor at the University of Edinburgh, and is currently serving as Director of AI at startup Deepflow. He is a Program Chair of RLDM 2025 and is co-author of the MIT Press textbook "Multi-Agent Reinforcement Learning: Foundations and Modern Approaches". Featured References: Multi-Agent Reinforcement Learning: Foundations and Modern Approaches, Stefano V. Albrecht, Filippos Christianos, Lukas Schäfer, MIT Press, 2024; RLDM 2025: Reinforcement Learning and Decision Making Conference, Dublin, Ireland; EPy...
2025-07-22
31 min
TalkRL: The Reinforcement Learning Podcast
Satinder Singh: The Origin Story of RLDM @ RLDM 2025
Professor Satinder Singh of Google DeepMind and U of Michigan is co-founder of RLDM. Here he narrates the origin story of the Reinforcement Learning and Decision Making meeting (not conference). Recorded on location at Trinity College Dublin, Ireland during RLDM 2025. Featured References: RLDM 2025: Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM), June 11-14, 2025 at Trinity College Dublin, Ireland; Satinder Singh on Google Scholar.
2025-06-25
05 min
TalkRL: The Reinforcement Learning Podcast
NeurIPS 2024 - Posters and Hallways 3
Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver, BC, Canada. Featuring Claire Bizon Monroc from Inria: WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control; Andrew Wagenmaker from UC Berkeley: Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL; Harley Wiltzer from MILA: Foundations of Multivariate Distributional Reinforcement Learning; Vinzenz Thoma from ETH AI Center: Contextual Bilevel Reinforcement Learning for Incentive Alignment; Haozhe (Tony) Chen & Ang (Leon) Li from Columbia: QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers.
2025-03-09
10 min
TalkRL: The Reinforcement Learning Podcast
NeurIPS 2024 - Posters and Hallways 2
Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver, BC, Canada. Featuring Jonathan Cook from University of Oxford: Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning; Yifei Zhou from Berkeley AI Research: DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning; Rory Young from University of Glasgow: Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach; Glen Berseth from MILA: Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn; Alexander Rutherford from University of Oxford: JaxMARL: Multi-Agent RL Environments and Algorithms in JAX.
2025-03-05
08 min
TalkRL: The Reinforcement Learning Podcast
NeurIPS 2024 - Posters and Hallways 1
Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver, BC, Canada. Featuring Jiaheng Hu of University of Texas: Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning; Skander Moalla of EPFL: No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO; Adil Zouitine of IRT Saint Exupery/Hugging Face: Time-Constrained Robust MDPs; Soumyendu Sarkar of HP Labs: SustainDC: Benchmarking for Sustainable Data Center Control; Matteo Bettini of Cambridge University: BenchMARL: Benchmarking Multi-Agent Reinforcement Learning; Michael Bowling of U Alberta: Beyond Optimism: Exploration With Partially Observable Rewards.
2025-03-03
09 min
TalkRL: The Reinforcement Learning Podcast
Abhishek Naik on Continuing RL & Average Reward
Abhishek Naik was a student at University of Alberta and Alberta Machine Intelligence Institute, and he just finished his PhD in reinforcement learning, working with Rich Sutton. Now he is a postdoc fellow at the National Research Council of Canada, where he does AI research on Space applications. Featured References: Reinforcement Learning for Continuing Problems Using Average Reward, Abhishek Naik, Ph.D. dissertation, 2024; Reward Centering, Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton, 2024; Learning and Planning in Average-Reward Markov Decision Processes, Yi Wan, Abhishek Naik, Richard S. Sutton...
2025-02-10
1h 21
TalkRL: The Reinforcement Learning Podcast
NeurIPS 2024 RL meetup Hot takes: What sucks about RL?
What do RL researchers complain about after hours at the bar? In this "Hot takes" episode, we find out! Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of NeurIPS 2024. Special thanks to "David Beckham" for the inspiration :)
2024-12-23
17 min
TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 5
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 David Radke of the Chicago Blackhawks NHL on RL for professional sports 0:56 Abhishek Naik from the National Research Council on Continuing RL and Average Reward 2:42 Daphne Cornelisse from NYU on Autonomous Driving and Multi-Agent RL 08:58 Shray Bansal from Georgia Tech on Cognitive Bias for Human AI Ad hoc Teamwork 10:21 Claas Voelcker from University of Toronto on Can we hop in general? 11:23 Brent Venable from The Institute for Human & Machine Cognition on Cooperative information dissemination
2024-09-20
13 min
TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 4
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 David Abel from DeepMind on 3 Dogmas of RL 0:55 Kevin Wang from Brown on learning variable depth search for MCTS 2:17 Ashwin Kumar from Washington University in St Louis on fairness in resource allocation 3:36 Prabhat Nagarajan from UAlberta on Value overestimation
2024-09-19
04 min
TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 3
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Kris De Asis from Openmind on Time Discretization 2:23 Anna Hakhverdyan from U of Alberta on Online Hyperparameters 3:59 Dilip Arumugam from Princeton on Information Theory and Exploration 5:04 Micah Carroll from UC Berkeley on Changing preferences and AI alignment
2024-09-18
06 min
TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 2
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Hector Kohler from Centre Inria de l'Université de Lille with "Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning" 2:29 Quentin Delfosse from TU Darmstadt on "Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents" 4:15 Sonja Johnson-Yu from Harvard on "Understanding biological active sensing behaviors by interpreting learned artificial agent policies" 6:42 Jannis Blüml from TU Darmstadt on "OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments" 8:20 Cameron Allen from UC Berkeley on "Resolving Partial Observability in Decision Processes via the Lambda Discrepancy" 9:48 James Staley fro...
2024-09-16
15 min
TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 1
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Ann Huang from Harvard on Learning Dynamics and the Geometry of Neural Dynamics in Recurrent Neural Controllers 1:37 Jannis Blüml from TU Darmstadt on HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning 3:13 Benjamin Fuhrer from NVIDIA on Gradient Boosting Reinforcement Learning 3:54 Paul Festor from Imperial College London on Evaluating the impact of explainable RL on physician decision-making in high-fidelity simulations: insights from eye-tracking metrics
2024-09-11
05 min
TalkRL: The Reinforcement Learning Podcast
Finale Doshi-Velez on RL for Healthcare @ RLC 2024
Finale Doshi-Velez is a Professor at the Harvard Paulson School of Engineering and Applied Sciences. This off-the-cuff interview was recorded at UMass Amherst during the workshop day of RL Conference on August 9th 2024. Host notes: I've been a fan of some of Prof Doshi-Velez' past work on clinical RL and hoped to feature her for some time now, so I jumped at the chance to get a few minutes of her thoughts -- even though you can tell I was not prepared and a bit flustered tbh. Thanks to Prof Doshi-Velez for taking a moment for...
2024-09-02
07 min
TalkRL: The Reinforcement Learning Podcast
David Silver 2 - Discussion after Keynote @ RLC 2024
Thanks to Professor Silver for permission to record this discussion after his RLC 2024 keynote lecture. Recorded at UMass Amherst during RLC 2024. Due to the live recording environment, audio quality varies. We publish this audio in its raw form to preserve the authenticity and immediacy of the discussion. References: AlphaProof announcement on DeepMind's blog; Discovering Reinforcement Learning Algorithms, Oh et al -- his keynote at RLC 2024 referred to a more recent update to this work, yet to be published; Reinforcement Learning Conference 2024; David Silver on Google Scholar.
2024-08-28
16 min
TalkRL: The Reinforcement Learning Podcast
David Silver @ RLC 2024
David Silver is a principal research scientist at DeepMind and a professor at University College London. This interview was recorded at UMass Amherst during RLC 2024. References: Discovering Reinforcement Learning Algorithms, Oh et al -- his keynote at RLC 2024 referred to a more recent update to this work, yet to be published; Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, Silver et al 2017 -- the AlphaZero algo was used in his recent work on AlphaProof; AlphaProof on the DeepMind blog; AlphaFold on the DeepMind blog; Reinforcement Learning Conference 2024; David Silver on Google Scholar.
2024-08-26
11 min
TalkRL: The Reinforcement Learning Podcast
Vincent Moens on TorchRL
Dr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict in PyTorch. Featured References: TorchRL: A data-driven decision-making library for PyTorch, Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens. Additional References: TorchRL on GitHub; TensorDict Documentation.
2024-04-08
40 min
TalkRL: The Reinforcement Learning Podcast
Arash Ahmadian on Rethinking RLHF
Arash Ahmadian is a Researcher at Cohere and Cohere For AI focussed on Preference Training of large language models. He’s also a researcher at the Vector Institute of AI. Featured Reference: Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs, Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker. Additional References: Self-Rewarding Language Models, Yuan et al 2024; Reinforcement Learning: An Introduction, Sutton and Barto 1992; Learning from Delayed Rewards, Chris Watkins 1989; Simple Statistical Gradient-Following Algor...
2024-03-25
33 min
TalkRL: The Reinforcement Learning Podcast
Glen Berseth on RL Conference
Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a Canada CIFAR AI chair, a member of l'Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL). Featured Links: Reinforcement Learning Conference; Closing the Gap between TD Learning and Supervised Learning--A Generalisation Point of View, Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach.
2024-03-11
21 min
TalkRL: The Reinforcement Learning Podcast
Ian Osband
Ian Osband is a Research Scientist at OpenAI (ex DeepMind, Stanford) working on decision making under uncertainty. We spoke about: information theory and RL; exploration, epistemic uncertainty and joint predictions; Epistemic Neural Networks and scaling to LLMs. Featured References: Reinforcement Learning, Bit by Bit, Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen; From Predictions to Decisions: The Importance of Joint Predictive Distributions, Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracher...
2024-03-07
1h 08