Description

OpenAI has released a new ChatGPT update, custom instructions, that gives users more control over how it responds. Under pressure from the Biden administration, A.I. companies have agreed to voluntary safeguards to manage the risks of their technology. "Secrets of RLHF in Large Language Models Part I: PPO" examines reinforcement learning from human feedback (RLHF), a key alignment technique for large language models, with a focus on PPO. Finally, "SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-shot Neural Sparse Retrieval" covers the emerging paradigm of neural sparse retrieval and introduces SPRINT, a toolkit for evaluating and comparing models in a zero-shot setting.

Contact: sergi@earkind.com

Timestamps:

00:34 Introduction

01:48 Custom instructions for ChatGPT

03:20 Pressured by Biden, A.I. Companies Agree to Guardrails on New Tools

05:17 llama2.c Repository by Andrej Karpathy

06:31 Fake sponsor

08:27 Secrets of RLHF in Large Language Models Part I: PPO

10:19 Provably Faster Gradient Descent via Long Steps

11:42 SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-shot Neural Sparse Retrieval

13:49 Outro