Showing episodes and shows of

Robin Ranjit Singh Chauhan

Shows

TalkRL: The Reinforcement Learning Podcast
NeurIPS 2024 - Posters and Hallways 3
Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada.
Featuring
Claire Bizon Monroc from Inria: WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control
Andrew Wagenmaker from UC Berkeley: Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL
Harley Wiltzer from MILA: Foundations of Multivariate Distributional Reinforcement Learning
Vinzenz Thoma from ETH AI Center: Contextual Bilevel Reinforcement Learning for Incentive Alignment
Haozhe (Tony) Chen & Ang (Leon) Li from Columbia: QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers
2025-03-09, 10 min

TalkRL: The Reinforcement Learning Podcast
NeurIPS 2024 - Posters and Hallways 2
Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada.
Featuring
Jonathan Cook from University of Oxford: Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning
Yifei Zhou from Berkeley AI Research: DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
Rory Young from University of Glasgow: Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
Glen Berseth from MILA: Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn
Alexander Rutherford from University of Oxford: JaxMARL: Multi-Agent RL Environments and Algorithms in JAX
2025-03-05, 08 min

TalkRL: The Reinforcement Learning Podcast
NeurIPS 2024 - Posters and Hallways 1
Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada.
Featuring
Jiaheng Hu of University of Texas: Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning
Skander Moalla of EPFL: No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO
Adil Zouitine of IRT Saint Exupery/Hugging Face: Time-Constrained Robust MDPs
Soumyendu Sarkar of HP Labs: SustainDC: Benchmarking for Sustainable Data Center Control
Matteo Bettini of Cambridge University: BenchMARL: Benchmarking Multi-Agent Reinforcement Learning
Michael Bowling of U Alberta: Beyond Optimism: Exploration With Partially Observable Rewards
2025-03-03, 09 min

TalkRL: The Reinforcement Learning Podcast
Abhishek Naik on Continuing RL & Average Reward
Abhishek Naik was a student at University of Alberta and Alberta Machine Intelligence Institute, and he just finished his PhD in reinforcement learning, working with Rich Sutton. Now he is a postdoc fellow at the National Research Council of Canada, where he does AI research on Space applications.
Featured References
Reinforcement Learning for Continuing Problems Using Average Reward
Abhishek Naik
Ph.D. dissertation 2024
Reward Centering
Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton
2024
Learning and Planning in Average-Reward Markov Decision Processes
Yi Wan, Abhishek Naik, Richard S. Sutton...
2025-02-10, 1h 21

TalkRL: The Reinforcement Learning Podcast
NeurIPS 2024 RL meetup Hot takes: What sucks about RL?
What do RL researchers complain about after hours at the bar? In this "Hot takes" episode, we find out! Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of NeurIPS 2024.
Special thanks to "David Beckham" for the inspiration :)
2024-12-23, 17 min

TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 5
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA.
Featuring:
0:01 David Radke of the Chicago Blackhawks NHL on RL for professional sports
0:56 Abhishek Naik from the National Research Council on Continuing RL and Average Reward
2:42 Daphne Cornelisse from NYU on Autonomous Driving and Multi-Agent RL
08:58 Shray Bansal from Georgia Tech on Cognitive Bias for Human AI Ad hoc Teamwork
10:21 Claas Voelcker from University of Toronto on Can we hop in general?
11:23 Brent Venable from The Institute for Human & Machine Cognition on Cooperative information dissemination
2024-09-20, 13 min

TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 4
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA.
Featuring:
0:01 David Abel from DeepMind on 3 Dogmas of RL
0:55 Kevin Wang from Brown on learning variable depth search for MCTS
2:17 Ashwin Kumar from Washington University in St Louis on fairness in resource allocation
3:36 Prabhat Nagarajan from UAlberta on Value overestimation
2024-09-19, 04 min

TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 3
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA.
Featuring:
0:01 Kris De Asis from Openmind on Time Discretization
2:23 Anna Hakhverdyan from U of Alberta on Online Hyperparameters
3:59 Dilip Arumugam from Princeton on Information Theory and Exploration
5:04 Micah Carroll from UC Berkeley on Changing preferences and AI alignment
2024-09-18, 06 min

TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 2
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA.
Featuring:
0:01 Hector Kohler from Centre Inria de l'Université de Lille with "Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning"
2:29 Quentin Delfosse from TU Darmstadt on "Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents"
4:15 Sonja Johnson-Yu from Harvard on "Understanding biological active sensing behaviors by interpreting learned artificial agent policies"
6:42 Jannis Blüml from TU Darmstadt on "OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments"
8:20 Cameron Allen from UC Berkeley on "Resolving Partial Observability in Decision Processes via the Lambda Discrepancy"
9:48 James Staley fro...
2024-09-16, 15 min

TalkRL: The Reinforcement Learning Podcast
RLC 2024 - Posters and Hallways 1
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA.
Featuring:
0:01 Ann Huang from Harvard on Learning Dynamics and the Geometry of Neural Dynamics in Recurrent Neural Controllers
1:37 Jannis Blüml from TU Darmstadt on HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning
3:13 Benjamin Fuhrer from NVIDIA on Gradient Boosting Reinforcement Learning
3:54 Paul Festor from Imperial College London on Evaluating the impact of explainable RL on physician decision-making in high-fidelity simulations: insights from eye-tracking metrics
2024-09-11, 05 min

TalkRL: The Reinforcement Learning Podcast
Finale Doshi-Velez on RL for Healthcare @ RLC 2024
Finale Doshi-Velez is a Professor at the Harvard Paulson School of Engineering and Applied Sciences. This off-the-cuff interview was recorded at UMass Amherst during the workshop day of RL Conference on August 9th 2024.
Host notes: I've been a fan of some of Prof Doshi-Velez' past work on clinical RL and hoped to feature her for some time now, so I jumped at the chance to get a few minutes of her thoughts -- even though you can tell I was not prepared and a bit flustered tbh. Thanks to Prof Doshi-Velez for taking a moment for...
2024-09-02, 07 min

TalkRL: The Reinforcement Learning Podcast
David Silver 2 - Discussion after Keynote @ RLC 2024
Thanks to Professor Silver for permission to record this discussion after his RLC 2024 keynote lecture. Recorded at UMass Amherst during RLC 2024. Due to the live recording environment, audio quality varies. We publish this audio in its raw form to preserve the authenticity and immediacy of the discussion.
References
AlphaProof announcement on DeepMind's blog
Discovering Reinforcement Learning Algorithms, Oh et al -- His keynote at RLC 2024 referred to a more recent update to this work, yet to be published
Reinforcement Learning Conference 2024
David Silver on Google Scholar
2024-08-28, 16 min

TalkRL: The Reinforcement Learning Podcast
David Silver @ RLC 2024
David Silver is a principal research scientist at DeepMind and a professor at University College London. This interview was recorded at UMass Amherst during RLC 2024.
References
Discovering Reinforcement Learning Algorithms, Oh et al -- His keynote at RLC 2024 referred to a more recent update to this work, yet to be published
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, Silver et al 2017 -- the AlphaZero algo was used in his recent work on AlphaProof
AlphaProof on the DeepMind blog
AlphaFold on the DeepMind blog
Reinforcement Learning Conference 2024
David Silver on Google Scholar
2024-08-26, 11 min

TalkRL: The Reinforcement Learning Podcast
Vincent Moens on TorchRL
Dr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict in PyTorch.
Featured References
TorchRL: A data-driven decision-making library for PyTorch
Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens
Additional References
TorchRL on GitHub
TensorDict Documentation
2024-04-08, 40 min

TalkRL: The Reinforcement Learning Podcast
Arash Ahmadian on Rethinking RLHF
Arash Ahmadian is a Researcher at Cohere and Cohere For AI focussed on Preference Training of large language models.
He’s also a researcher at the Vector Institute of AI.
Featured Reference
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker
Additional References
Self-Rewarding Language Models, Yuan et al 2024
Reinforcement Learning: An Introduction, Sutton and Barto 1992
Learning from Delayed Rewards, Chris Watkins 1989
Simple Statistical Gradient-Following Algor...
2024-03-25, 33 min

TalkRL: The Reinforcement Learning Podcast
Glen Berseth on RL Conference
Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a Canada CIFAR AI chair, a member of l'Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL).
Featured Links
Reinforcement Learning Conference
Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View
Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach
2024-03-11, 21 min

TalkRL: The Reinforcement Learning Podcast
Ian Osband
Ian Osband is a Research scientist at OpenAI (ex DeepMind, Stanford) working on decision making under uncertainty.
We spoke about:
- Information theory and RL
- Exploration, epistemic uncertainty and joint predictions
- Epistemic Neural Networks and scaling to LLMs
Featured References
Reinforcement Learning, Bit by Bit
Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen
From Predictions to Decisions: The Importance of Joint Predictive Distributions
Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracher...
2024-03-07, 1h 08

TalkRL: The Reinforcement Learning Podcast
Sharath Chandra Raparthy
Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!
Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila.
Featured Reference
Generalization to New Sequential Decision Making Tasks with In-Context Learning
Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu
Additional References
Sharath Chandra Raparthy Homepage
Human-Timescale Adaptation in an Open-Ended Task Space, Adaptive Agent Team 2023
Data Distributional Properties Drive Emergent In-Context Learning in Transformers, Chan et al 2022
Decision Transformer: Reinforcement Learning via Sequence Modeling, Chen et a...
2024-02-12, 40 min

TalkRL: The Reinforcement Learning Podcast
Pierluca D'Oro and Martin Klissarov
Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more!
Pierluca D'Oro is a PhD student at Mila and a visiting researcher at Meta. Martin Klissarov is a PhD student at Mila and McGill and a research scientist intern at Meta.
Featured References
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff
Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
Nate Rahn*, Pierluca D'Oro*, Harley...
2023-11-13, 57 min

TalkRL: The Reinforcement Learning Podcast
Martin Riedmiller
Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more!
Martin Riedmiller is a research scientist and team lead at DeepMind.
Featured References
Magnetic control of tokamak plasmas through deep reinforcement learning
Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de las Casas, Craig Donner, Leslie Fritz, Cristian Galperti, Andrea Huber, James Keeling, Maria Tsimpoukelli, Jackie Kay, An...
2023-08-22, 1h 13

TalkRL: The Reinforcement Learning Podcast
Max Schwarzer
Max Schwarzer is a PhD student at Mila, with Aaron Courville and Marc Bellemare, interested in RL scaling, representation learning for RL, and RL for science. Max spent the last 1.5 years at Google Brain/DeepMind, and is now at Apple Machine Learning Research.
Featured References
Bigger, Better, Faster: Human-level Atari with human-level efficiency
Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G Bellemare, Aaron Courville
The Primac...
2023-08-08, 1h 10

TalkRL: The Reinforcement Learning Podcast
Julian Togelius
Julian Togelius is an Associate Professor of Computer Science and Engineering at NYU, and Cofounder and research director at modl.ai
Featured References
Choose Your Weapon: Survival Strategies for Depressed AI Academics
Julian Togelius, Georgios N. Yannakakis
Learning Controllable 3D Level Generators
Zehua Jiang, Sam Earle, Michael Cerny Green, Julian Togelius
PCGRL: Procedural Content Generation via Reinforcement Learning
Ahmed Khalifa, Philip Bontrager, Sam Earle, Julian Togelius
Illuminating Generalization in Deep Reinforcement Learning thro...
2023-07-25, 40 min

TalkRL: The Reinforcement Learning Podcast
Jakob Foerster
Jakob Foerster on Multi-Agent learning, Cooperation vs Competition, Emergent Communication, Zero-shot coordination, Opponent Shaping, agents for Hanabi and Prisoner's Dilemma, and more.
Jakob Foerster is an Associate Professor at University of Oxford.
Featured References
Learning with Opponent-Learning Awareness
Jakob N. Foerster, Richard Y.
Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch
Model-Free Opponent Shaping
Chris Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster
Off-Belief Learning
Hengyuan Hu, Adam Lerer, Brandon Cui, David Wu, Luis Pineda, Noam Brown, Jakob Foerster
Learning to Communicate with Deep Multi...
2023-05-08, 1h 03

TalkRL: The Reinforcement Learning Podcast
Danijar Hafner 2
Danijar Hafner on the DreamerV3 agent and world models, the Director agent and hierarchical RL, realtime RL on robots with DayDreamer, and his framework for unsupervised agent design!
Danijar Hafner is a PhD candidate at the University of Toronto with Jimmy Ba, a visiting student at UC Berkeley with Pieter Abbeel, and an intern at DeepMind. He has been our guest before back on episode 11.
Featured References
Mastering Diverse Domains through World Models [ blog ] DreamerV3
Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap
DayDreamer: World Models for Physica...
2023-04-12, 45 min

TalkRL: The Reinforcement Learning Podcast
Jeff Clune
AI Generating Algos, Learning to play Minecraft with Video PreTraining (VPT), Go-Explore for hard exploration, POET and Open Endedness, AI-GAs and ChatGPT, AGI predictions, and lots more!
Professor Jeff Clune is Associate Professor of Computer Science at University of British Columbia, a Canada CIFAR AI Chair and Faculty Member at Vector Institute, and Senior Research Advisor at DeepMind.
Featured References
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos [ Blog Post ]
Bowen Baker, Ilge Akkaya, Peter Zhokhov, Joost Huizinga, Jie Tang, Adrien Ecoffet, Brandon Houghton, Raul Sampedro, Jeff Clune
2023-03-27, 1h 11

TalkRL: The Reinforcement Learning Podcast
Natasha Jaques 2
Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more!
Dr Natasha Jaques is a Senior Research Scientist at Google Brain.
Featured References
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard
Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control
Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turn...
2023-03-14, 46 min

TalkRL: The Reinforcement Learning Podcast
Jacob Beck and Risto Vuorio
Jacob Beck and Risto Vuorio on their recent Survey of Meta-Reinforcement Learning.
Jacob and Risto are Ph.D. students at Whiteson Research Lab at University of Oxford.
Featured Reference
A Survey of Meta-Reinforcement Learning
Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson
Additional References
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning, Luisa Zintgraf et al
Mastering Diverse Domains through World Models (DreamerV3), Hafner et al
Unsupervised Meta-Learning for Reinforcement Learning (MAML), Gupta et al
Decoupling Exploration a...
2023-03-07, 1h 07

TalkRL: The Reinforcement Learning Podcast
John Schulman
John Schulman is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI.
Featured References
WebGPT: Browser-assisted question-answering with human feedback
Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman
Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John...
2022-10-18, 44 min

TalkRL: The Reinforcement Learning Podcast
Sven Mika
Sven Mika is the Reinforcement Learning Team Lead at Anyscale, and lead committer of RLlib. He holds a PhD in biomathematics, bioinformatics, and computational biology from Witten/Herdecke University.
Featured References
RLlib Documentation: RLlib: Industry-Grade Reinforcement Learning
Ray: Documentation
RLlib: Abstractions for Distributed Reinforcement Learning
Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica
Episode sponsor: Anyscale
Ray Summit 2022 is coming to San Francisco on August 23-24.
Hear how teams at Dow, V...
2022-08-19, 34 min