podcast
details
.com
Print
Share
Look for any podcast host, guest or anyone
Search
Showing episodes and shows of
Enrico Shippole
Shows
Daily Paper Cast
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
🤗 Upvotes: 22 | cs.CL, cs.LG Authors: Nikhil Kandpal, Brian Lester, Colin Raffel, Sebastian Majstorovic, Stella Biderman, Baber Abbasi, Luca Soldaini, Enrico Shippole, A. Feder Cooper, Aviya Skowron, John Kirchenbauer, Shayne Longpre, Lintang Sutawika, Alon Albalak, Zhenlin Xu, Guilherme Penedo, Loubna Ben Allal, Elie Bakouch, John David Pressman, Honglu Fan, Dashiell Stander, Guangyu Song, Aaron Gokaslan, Tom Goldstein, Brian R. Bartoldson, Bhavya Kailkhura, Tyler Murray Title: The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Arxiv: http://arxiv.org/abs/2506.05209v1 Abstract: Large lan...
2025-06-07
18 min
Daily Paper Cast
Bridging the Data Provenance Gap Across Text, Speech and Video
🤗 Upvotes: 3 | cs.AI, cs.CL, cs.CY, cs.LG, cs.MM Authors: Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A. Alghamdi, Vu Minh Chien, Naana Obeng-Marnu, Da Yin, Kun Qian, Yizhi Li, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N. Lee, Campbell S. Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester JV Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Ste...
2024-12-26
25 min
ThursdAI - The top AI news from the past week
ThursdAI - Special Episode, interview with Nous Research and Enrico Shippole, fine-tuning LLaMa 2, extending it's context and more
Hey there, welcome to this special edition of ThursdAI. This episode is featuring an interview with Nous Research, a group of folks who fine-tune open source large language models to make them better. If you are interested to hear how finetuning an open source model works, dataset preparation, context scaling and more, tune in! You will hear from Karan, Teknium, LBJ from Nous Research and Enrico who worked along side them. To clarify, Enrico is going in depth into the method called Rope Scaling, which is a clever hack, that extends the context length of...
2023-07-23
37 min