The End of Data Leakage

Description

In this episode, we dive deep into the challenges facing time series AI model leaderboards, from hidden information leakage to the complexities of benchmarking foundation models. I sit down with Marcel Meyer to unpack why traditional approaches fall short and how our new TS Arena leaderboard is setting a new standard for fair, future-proof evaluation.

We explore the pitfalls that plague current benchmarks, the surprising ways data contamination can skew results, and the innovative pre-registration protocol we've developed to keep evaluations honest. If you've ever wondered what it takes to build a truly trustworthy AI leaderboard—or why it matters for industry and research alike—this conversation is packed with insights you won't want to miss.
