Showing episodes and shows of
Yannic Kilcher
Shows
Super Data Science: ML & AI Podcast with Jon Krohn
733: OpenAssistant: The Open-Source ChatGPT Alternative, with Dr. Yannic Kilcher
Yannic Kilcher, a leading ML YouTuber and DeepJudge CTO, teams up with Jon Krohn this week to delve into the open-source ML community, the technology powering Yannic's Swiss-based startup, and the significant implications of adversarial examples in ML. Tune in as they also unpack Yannic's approach to tracking ML research, future AI prospects and his startup challenges. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by CloudWolf, the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you wil...
2023-11-21
1h 40
The Geek In Review
Paulina Grnarova and Yannic Kilcher from DeepJudge.AI: Unlocking Institutional Knowledge: How AI is Transforming Legal Search (TGIR Ep. 224)
On this episode of The Geek in Review, hosts Marlene Gebauer and Greg Lambert explore innovations in legal search with Paulina Grnarova and Yannic Kilcher, co-founders of DeepJudge. This semantic search engine for legal documents leverages proprietary AI developed by experts with backgrounds from Google and academic AI research. As PhDs from ETH Zurich, Grnarova and Kilcher recognized lawyers needed better access to institutional knowledge rather than constantly reinventing the wheel. DeepJudge moves beyond traditional keyword searches to a deeper integration of search and generative AI models like GPT-3. Partnerships provide financial support and key insights – advisors in...
2023-10-10
35 min
THINK REACTOR
#028 - Open Source AI - with Yannic Kilcher (Open Assistant, DeepJudge)
The podcast all about artificial intelligence, by and with Roland Becker and Dr. Sirko Straube. Roland and Sirko talk with Yannic Kilcher about open-source AI, the Open Assistant project, and his startup DeepJudge. Yannic Kilcher is a Swiss computer scientist and YouTuber known for his videos on artificial intelligence and technology. With his YouTube channel "Yannic Kilcher" he reaches a broad audience. Yannic holds a degree in computer science and is the founder of DeepJudge, as well as co-founder of the OpenAssistant project. His expertise and passion for the latest developments make him a...
2023-06-01
2h 07
Yannic Kilcher Videos (Audio Only)
ChatGPT: This AI has a JAILBREAK?! (Unbelievable AI Progress)
#chatgpt #ai #openai ChatGPT, OpenAI's newest model is a GPT-3 variant that has been fine-tuned using Reinforcement Learning from Human Feedback, and it is taking the world by storm! Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:40 - Sponsor: Weights & Biases 3:20 - ChatGPT: How does it work? 5:20 - Reinforcement Learning from Human Feedback 7:10 - ChatGPT Origins: The GPT-3.5 Series 8:20 - OpenAI's strategy: Iterative Refinement 9:10 - ChatGPT's a...
2023-01-02
31 min
Yannic Kilcher Videos (Audio Only)
This is a game changer! (AlphaTensor by DeepMind explained)
#alphatensor #deepmind #ai Matrix multiplication is the most used mathematical operation in all of science and engineering. Speeding this up has massive consequences. Thus, over the years, this operation has become more and more optimized. A fascinating discovery was made when it was shown that one actually needs fewer than N^3 multiplication operations to multiply two NxN matrices. DeepMind goes a step further and creates AlphaTensor, a Deep Reinforcement Learning algorithm that plays a single-player game, TensorGame, in order to find even more optimized algorithms for matrix multiplication. And it turns out, there exists a...
2022-10-23
55 min
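The "fewer than N^3 multiplications" fact in the episode above is classically illustrated by Strassen's algorithm, which multiplies two 2x2 matrices with 7 scalar multiplications instead of 8; applied recursively to blocks, this gives an O(N^2.81) method. A minimal sketch (my illustration, not code from the episode):

```python
# Strassen's 2x2 multiplication: 7 scalar multiplications instead of 8.
# AlphaTensor searches for even better decompositions of this kind.

def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # Recombine the 7 products into the 4 entries of A @ B
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```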
Yannic Kilcher Videos (Audio Only)
More Is Different for AI - Scaling Up, Emergence, and Paperclip Maximizers (w/ Jacob Steinhardt)
#ai #interview #research Jacob Steinhardt believes that future AI systems will be qualitatively different than the ones we know currently. We talk about how emergence happens when scaling up, what implications that has on AI Safety, and why thought experiments like the Paperclip Maximizer might be more useful than most people think. OUTLINE: 0:00 Introduction 1:10 Start of Interview 2:10 Blog posts series 3:56 More Is Different for AI (Blog Post) 7:40 Do you think this emergence is mainly a property from the interaction o...
2022-09-15
1h 06
Yannic Kilcher Videos (Audio Only)
The hidden dangers of loading open-source AI models (ARBITRARY CODE EXPLOIT!)
#huggingface #pickle #exploit Did you know that something as simple as loading a model can execute arbitrary code on your machine? Try the model: https://huggingface.co/ykilcher/total... Get the code: https://github.com/yk/patch-torch-save Sponsor: Weights & Biases Go here: https://wandb.me/yannic OUTLINE: 0:00 - Introduction 1:10 - Sponsor: Weights & Biases 3:20 - How Hugging Face models are loaded 5:30 - From PyTorch to pickle 7:10...
2022-09-07
19 min
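The mechanism behind the exploit described in the episode above fits in a few lines of plain Python (a generic illustration of pickle's behavior, not the code from the patch-torch-save repo): any object whose `__reduce__` returns a callable gets that callable executed by `pickle.loads`.

```python
import pickle

# pickle calls the (callable, args) pair returned by __reduce__ during
# loading. Here the payload is a harmless eval, but it could just as
# well be os.system or any arbitrary Python -- which is why loading an
# untrusted model checkpoint can run code on your machine.
class Payload:
    def __reduce__(self):
        return (eval, ("6 * 7",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # merely *loading* runs eval("6 * 7")
print(result)  # 42
```

This is why loaders that unpickle arbitrary files should only ever be pointed at trusted checkpoints.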
Yannic Kilcher Videos (Audio Only)
Did Google's LaMDA chatbot just become sentient?
#lamda #google #ai Google engineer Blake Lemoine was put on leave after releasing proprietary information: An interview with the chatbot LaMDA that he believes demonstrates that this AI is, in fact, sentient. We analyze the claims and the interview in detail and trace how a statistical machine managed to convince at least one human that it is more than just an algorithm. OUTLINE: 0:00 - Whistleblower put on leave 4:30 - What is a language model? 6:40 - The prompt is the key 10:40 - W...
2022-06-20
22 min
Yannic Kilcher Videos (Audio Only)
[ML News] Meta's OPT 175B language model | DALL-E Mega is training | TorToiSe TTS fakes my voice
#mlnews #dalle #gpt3 An inside look of what's happening in the ML world! Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:20 - Sponsor: Weights & Biases 1:40 - Meta AI releases OPT-175B 4:55 - CoCa: New CLIP-Competitor 8:15 - DALL-E Mega is training 10:05 - TorToiSe TTS is amazing! 11:50 - Investigating Vision Transformers 12:50 - Hugging Face Deep RL class launched 13:40 - Helpful Things 17:00...
2022-05-12
19 min
Yannic Kilcher Videos (Audio Only)
This A.I. creates infinite NFTs
#nft #gan #ai Today we build our own AI that can create as many bored apes as we want! Fungibility for everyone! Try the model here: https://huggingface.co/spaces/ykilcher/apes or here: https://ykilcher.com/apes Files & Models here: https://huggingface.co/ykilcher/apes/tree/main Code here: https://github.com/yk/apes-public (for the "what's your ape" app, look for the file interface_projector.py) This video is sponsored by BrightData, use this link for...
2022-05-12
18 min
Yannic Kilcher Videos (Audio Only)
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances (SayCan - Paper Explained)
#saycan #robots #ai Large Language Models are excellent at generating plausible plans in response to real-world problems, but without interacting with the environment, they have no ability to estimate which of these plans are feasible or appropriate. SayCan combines the semantic capabilities of language models with a bank of low-level skills, which are available to the agent as individual policies to execute. SayCan automatically finds the best policy to execute by considering a trade-off between the policy's ability to progress towards the goal, given by the language model, and the policy's probability of executing...
2022-05-02
28 min
Yannic Kilcher Videos (Audio Only)
Transformer Memory as a Differentiable Search Index (Machine Learning Research Paper Explained)
#dsi #search #google Search engines work by building an index and then looking up things in it. Usually, that index is a separate data structure. In keyword search, we build and store reverse indices. In neural search, we build nearest-neighbor indices. This paper does something different: It directly trains a Transformer to return the ID of the most relevant document. No similarity search over embeddings or anything like this is performed, and no external data structure is needed, as the entire index is essentially captured by the model's weights. The paper experiments with various...
2022-04-21
51 min
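For context on the episode above: the "reverse index" that classical keyword search builds is an external data structure, sketched below (my illustration, not the paper's code). DSI's point is that a Transformer's weights can absorb this mapping entirely and emit the document ID directly.

```python
from collections import defaultdict

# A minimal inverted index: the external structure keyword search
# relies on, mapping each token to the set of documents containing it.
docs = {
    "d1": "neural search with transformers",
    "d2": "keyword search with inverted indices",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for token in text.split():
        index[token].add(doc_id)

def lookup(query):
    # Return IDs of documents containing every query token
    ids = [index[t] for t in query.split()]
    return set.intersection(*ids) if ids else set()

print(lookup("inverted indices"))  # {'d2'}
print(lookup("search"))            # {'d1', 'd2'}
```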
Ken's Nearest Neighbors
How He Breaks Down Complex Machine Learning Research for YouTube (Yannic Kilcher) - KNN Ep. 95
Today I had the pleasure of interviewing Yannic Kilcher. Yannic is a YouTuber covering state-of-the-art Machine Learning research topics. He has a PhD from ETH Zurich and is currently the CTO of DeepJudge, a LegalTech NLP startup. In this episode we learn about how Yannic decided on a PhD in AI, how he is able to make advanced research so digestible, and the reason why he wears sunglasses on camera. I hope you enjoy the episode; I know I enjoyed our conversation.
2022-04-20
56 min
Yannic Kilcher Videos (Audio Only)
[ML News] Google's 540B PaLM Language Model & OpenAI's DALL-E 2 Text-to-Image Revolution
#mlnews #palm #dalle2 Google releases PaLM and OpenAI releases DALL-E 2 (and more news). Sponsor: Weights & Biases Start here: https://wandb.me/yannic Thumbnail credit: DALL-E 2 via Sam Altman OUTLINE 0:00 - Street interview w/ random stranger 2:25 - Intro 2:50 - PaLM - Google's 540B Pathways Language Model 7:50 - Sponsor: Weights & Biases 9:10 - OpenAI releases DALL-E 2 12:05 - Open Source Datasets and Models 13:20 - Salesforce releases CodeGen
2022-04-12
14 min
Yannic Kilcher Videos (Audio Only)
[ML News] GPT-3 learns to edit | Google Pathways | Make-A-Scene | CLIP meets GamePhysics | DouBlind
#mlnews #gpt3 #pathways Your updates on the latest and greatest from the depths of Machine Learning! Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:15 - Weights & Biases Report about Reports 2:45 - GPT-3 learns to edit 6:30 - Make-A-Scene: Text-to-Image with Human Priors 8:00 - Pathways: Google's new High-Performance ML scheduler 10:45 - DouBlind: Open Peer-Review 12:45 - CLIP meets GamePhysics 14:40 - Residual Quantization pushes Image Generation SOTA
2022-04-06
18 min
Yannic Kilcher Videos (Audio Only)
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding&Generation
#blip #review #ai Cross-modal pre-training has been all the rage lately in deep learning, especially training vision and language models together. However, there are a number of issues, such as low-quality datasets that limit the performance of any model trained on them, and also the fact that pure contrastive pre-training cannot be easily fine-tuned for most downstream tasks. BLIP unifies different tasks and objectives in a single pre-training run and achieves a much more versatile model, which the paper immediately uses to create, filter, clean and thus bootstrap its own dataset to improve...
2022-03-25
46 min
Yannic Kilcher Videos (Audio Only)
[ML News] DeepMind controls fusion | Yann LeCun's JEPA architecture | US: AI can't copyright its art
Updates on what's going on in the ML world! Check out w&b's alerts feature: https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:20 - Sponsor: Weights & Biases 2:35 - DeepMind uses Reinforcement Learning to control nuclear fusion 4:35 - Google responds to carbon emission estimates 8:40 - Yann LeCun proposes new architecture for world models 11:05 - Fruit fly neurons may perform multiplication 12:00 - Emojisearch App 12:30 - Ar5iv officially in arXiv labs 12:55 - Language Model Consciousness & Media Hype
2022-03-08
28 min
The Debugged Podcast
Yannic Kilcher on AI, Law, Research and Music | Debugged Episode #8
In this episode of Debugged, Medha Gupta sits down with Yannic Kilcher, the CTO of DeepJudge, a law-and-technology startup in Zurich, Switzerland, and a YouTube personality. They discuss his experience founding DeepJudge and his reason for creating easy-to-digest videos on YouTube to help others understand the latest AI research. He also discusses his interests in artificial intelligence and its potential limits as it increases in importance and use throughout the STEM fields; he follows that up with a small discussion on the legal and ethical implications of AI. Furthermore, Kilcher gives advice about getting started in...
2022-03-01
32 min
Yannic Kilcher Videos (Audio Only)
[ML News] Uber: Deep Learning for ETA | MuZero Video Compression | Block-NeRF | EfficientNet-X
#mlnews #muzero #nerf Your regularly irregular updates on everything new in the ML world! Merch: store.ykilcher.com OUTLINE: 0:00 - Intro 0:15 - Sponsor: Weights & Biases 2:15 - Uber switches from XGBoost to Deep Learning for ETA prediction 5:45 - MuZero advances video compression 10:10 - Learned Soft Prompts can steer large language models 12:45 - Block-NeRF captures entire city blocks 14:15 - Neural Architecture Search considers underlying hardware 16:50 - Mega-Blog on Self-Organizing Agents 18:40...
2022-02-24
26 min
Yannic Kilcher Videos (Audio Only)
Listening to You! - Channel Update (Author Interviews)
#mlnews #kilcher #withtheauthors Many of you have given me feedback on what you did and didn't like about the recent "with the authors" videos. Here's the result of that feedback and an outlook into the future. Merch: store.ykilcher.com Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann...
2022-02-22
04 min
Yannic Kilcher Videos (Audio Only)
[ML News] DeepMind AlphaCode | OpenAI math prover | Meta battles harmful content with AI
#mlnews #alphacode #openai The latest and greatest from the world of Machine Learning! Merch: store.ykilcher.com Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:15 - Sponsor: Weights & Biases 3:15 - DeepMind's AlphaCode: AI competitive programmer 11:30 - OpenAI uses language models to prove math theorems 14:30 - StyleGAN XL: Scaling StyleGAN to diverse datasets 16:10 - ar5iv.org displays papers as HTML5 17:40 - Helpful Things
2022-02-16
26 min
Yannic Kilcher Videos (Audio Only)
OpenAI Embeddings (and Controversy?!)
#mlnews #openai #embeddings COMMENTS DIRECTLY FROM THE AUTHOR (thanks a lot for reaching out Arvind :) ): 1. The FIQA results you share also have code to reproduce the results in the paper using the API: https://twitter.com/arvind_io/status/... There's no discrepancy AFAIK. 2. We leave out 6 not 7 BEIR datasets. Results on msmarco, nq and triviaqa are in a separate table (Table 5 in the paper). NQ is part of BEIR too and we didn't want to repeat it. Finally, the 6 datasets we leave out are not readily available and it is...
2022-02-16
15 min
Yannic Kilcher Videos (Audio Only)
[ML News] DeepMind builds Gopher | Google builds GLaM | Suicide capsule uses AI to check access
#mlnews #gopher #glam Your updates on everything going on in the Machine Learning world. Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro & Overview 0:20 - Sponsor: Weights & Biases 3:05 - DeepMind releases 3 papers on large language models 11:45 - Hugging Face Blog: Training CodeParrot from scratch 14:25 - Paper: Pre-Training vision systems with noise 15:45 - DeepMind advances Quantum Mechanics 16:45 - GoogleAI trains GLaM: 1 Trillion Parameters Mixture of Experts Model
2022-01-05
25 min
Yannic Kilcher Videos (Audio Only)
[ML News] DeepMind tackles Math | Microsoft does more with less | Timnit Gebru launches DAIR
#mlnews #deepmind #ai The most trusted model in News! Get started with Weights & Biases here: https://wandb.me/yannic (it's free forever for personal use) OUTLINE: 0:00 - Intro 0:15 - Sponsor: Weights & Biases 3:10 - DeepMind tackles fundamental math 6:45 - Microsoft focuses on scaling effectively and efficiently 10:15 - NeurIPS Anthology Visualization 13:30 - Timnit Gebru launches research institute independent from big tech 16:50 - SageMaker Canvas for no-code ML 17:50...
2021-12-14
25 min
Towards Data Science
105. Yannic Kilcher - A 10,000-foot view of AI
There once was a time when AI researchers could expect to read every new paper published in the field on the arXiv, but today, that’s no longer the case. The recent explosion of research activity in AI has turned keeping up to date with new developments into a full-time job. Fortunately, people like YouTuber, ML PhD and sunglasses enthusiast Yannic Kilcher make it their business to distill ML news and papers into a digestible form for mortals like you and me to consume. I highly recommend his channel to any TDS podcast listeners who are interested in...
2021-12-01
1h 03
Yannic Kilcher Videos (Audio Only)
Peer Review is still BROKEN! The NeurIPS 2021 Review Experiment (results are in)
#neurips #peerreview #machinelearning A look at the results of the 2021 NeurIPS peer review experiment. https://arxiv.org/abs/2109.09774 https://www.reddit.com/r/MachineLearn... Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space...
2021-11-26
11 min
Yannic Kilcher Videos (Audio Only)
Parameter Prediction for Unseen Deep Architectures (w/ First Author Boris Knyazev)
#deeplearning #neuralarchitecturesearch #metalearning Deep Neural Networks are usually trained from a given parameter initialization using SGD until convergence at a local optimum. This paper goes a different route: Given a novel network architecture for a known dataset, can we predict the final network parameters without ever training them? The authors build a Graph-Hypernetwork and train on a novel dataset of various DNN-architectures to predict high-performing weights. The results show that not only can the GHN predict weights with non-trivial performance, but it can also generalize beyond the distribution of training architectures to predict weights...
2021-11-25
48 min
Yannic Kilcher Videos (Audio Only)
Learning Rate Grafting: Transferability of Optimizer Tuning (Machine Learning Research Paper Review)
#grafting #adam #sgd The last years in deep learning research have given rise to a plethora of different optimization algorithms, such as SGD, AdaGrad, Adam, LARS, LAMB, etc., which all claim to have their special peculiarities and advantages. In general, all algorithms modify two major things: the (implicit) learning rate schedule, and a correction to the gradient direction. This paper introduces grafting, which makes it possible to transfer the induced learning rate schedule of one optimizer to another. In doing so, the paper shows that much of the benefits of adaptive methods (e.g. Adam) are...
2021-11-22
39 min
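The grafting operation summarized above reduces to one step: keep the update *magnitude* proposed by one optimizer but the update *direction* of another. A minimal sketch (my paraphrase of the paper's core idea, with made-up update vectors):

```python
import math

def graft(step_m, step_d):
    """Grafted update: magnitude of optimizer M's step, direction of D's.

    step_m, step_d: per-parameter update vectors proposed by two
    optimizers for the same gradient. This transfers M's implicit
    learning-rate schedule onto D's direction.
    """
    norm_m = math.sqrt(sum(x * x for x in step_m))
    norm_d = math.sqrt(sum(x * x for x in step_d))
    if norm_d == 0.0:
        return [0.0] * len(step_d)
    return [norm_m * x / norm_d for x in step_d]

# e.g. SGD proposes a step of norm 0.5, Adam proposes a unit direction:
print(graft([0.3, 0.4], [1.0, 0.0]))  # [0.5, 0.0]
```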
Yannic Kilcher Videos (Audio Only)
[ML News] Cedille French Language Model | YOU Search Engine | AI Finds Profitable MEME TOKENS
#mlnews #cedille #wmt Only the greatest of news from the world of Machine Learning. OUTLINE: 0:00 - Sponsor: Weights & Biases 1:50 - Cedille - French Language Model 3:55 - Facebook AI Multilingual model wins WMT 5:50 - YOU private search engine 10:35 - DeepMind's Open-Source Arnheim 12:10 - Company sued for using AI to make website more accessible 18:05 - Alibaba DAMO Academy creates 10 Trillion M6 model 21:15 - AMD MI200 Family 22:30 - State of AI report 2021 24:15...
2021-11-22
36 min
Yannic Kilcher Videos (Audio Only)
Gradients are Not All You Need (Machine Learning Research Paper Explained)
#deeplearning #backpropagation #simulation More and more systems are made differentiable, which means that accurate gradients of these systems' dynamics can be computed exactly. While this development has led to a lot of advances, there are also distinct situations where backpropagation can be a very bad idea. This paper characterizes a few such systems in the domain of iterated dynamical systems, often including some source of stochasticity, resulting in chaotic behavior. In these systems, it is often better to use black-box estimators for gradients than computing them exactly. OUTLINE:
2021-11-22
48 min
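The chaos argument in the episode above can be seen in a toy system: differentiating through many iterations of the logistic map multiplies per-step derivatives, so the exact gradient grows exponentially even though the state stays bounded (my toy illustration, not the paper's code):

```python
# Iterate x_{t+1} = r * x_t * (1 - x_t) and accumulate dx_T/dx_0 by the
# chain rule: a product of per-step derivatives r * (1 - 2 x_t). In the
# chaotic regime (r = 4) this product blows up exponentially, which is
# why exact backprop through such systems can be worse than black-box
# gradient estimators.
def rollout_grad(x0, r=4.0, steps=50):
    x, dx = x0, 1.0
    for _ in range(steps):
        dx *= r * (1.0 - 2.0 * x)  # chain rule through one step
        x = r * x * (1.0 - x)
    return x, dx

x, g = rollout_grad(0.3)
print(0.0 <= x <= 1.0, abs(g) > 1e6)  # state bounded, gradient exploded
```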
Yannic Kilcher Videos (Audio Only)
[ML News] Microsoft combines Images & Text | Meta makes artificial skin | Russians replicate DALL-E
#mlnews #turing #reskin The latest and greatest from the Machine Learning world Sponsor: Weights & Biases https://wandb.com References: Microsoft Turing Bletchley: Universal Image Language Representation Model https://www.microsoft.com/en-us/resea... https://turing.microsoft.com/bletchley Meta AI Tactile Sensing https://ai.facebook.com/blog/teaching... https://ai.facebook.com/blog/reskin-a... https://twitter.com/AIatMeta/status/1... AnimeGANv2
2021-11-22
37 min
Yannic Kilcher Videos (Audio Only)
Autoregressive Diffusion Models (Machine Learning Research Paper Explained)
#machinelearning #ardm #generativemodels Diffusion models have made large advances in recent months as a new type of generative models. This paper introduces Autoregressive Diffusion Models (ARDMs), which are a mix between autoregressive generative models and diffusion models. ARDMs are trained to be agnostic to the order of autoregressive decoding and give the user a dynamic tradeoff between speed and performance at decoding time. This paper applies ARDMs to both text and image data, and as an extension, the models can also be used to perform lossless compression. OUTLINE:
2021-11-11
34 min
Yannic Kilcher Videos (Audio Only)
[ML News] Google introduces Pathways | OpenAI solves Math Problems | Meta goes First Person
#pathways #mlnews #ego4d Your irregular dose of Machine Learning News. OUTLINE: 0:00 - Intro 0:20 - Sponsor: Weights & Biases 2:10 - Google Introduces Pathways AI Architecture 6:30 - OpenAI trains Language Models to do High School Math 8:25 - Sam Altman says Neural Networks truly learn 9:35 - Google AI researchers frustrated with lawyers 12:10 - DeepMind RL Lecture Series 2021 12:40 - Fashion Store sells Adversarial Patches 13:15 - A viable method to remove the GIL from CPython
2021-11-11
36 min
Yannic Kilcher Videos (Audio Only)
EfficientZero: Mastering Atari Games with Limited Data (Machine Learning Research Paper Explained)
#efficientzero #muzero #atari Reinforcement Learning methods are notoriously data-hungry. Notably, MuZero learns a latent world model just from scalar feedback of reward- and policy-predictions, and therefore relies on scale to perform well. However, most RL algorithms fail when presented with very little data. EfficientZero makes several improvements over MuZero that allow it to learn from astonishingly small amounts of data and outperform other methods by a large margin in the low-sample setting. This could be a staple algorithm for future RL research. OUTLINE: 0:00 - Intro & Outline
2021-11-05
29 min
Yannic Kilcher Videos (Audio Only)
[YTalks] Siraj Raval - Stories about YouTube, Plagiarism, and the Dangers of Fame (Interview)
#ytalks #siraj #plagiarism A conversation with Siraj Raval about his journey on YouTube, and the perils of fame. OUTLINE: 0:00 - Intro 1:30 - Welcome 3:15 - Starting out: From Economics to YouTube 13:00 - More Views: Plagiarizing Video Content 23:30 - One Step Up: Copying A Research Paper 29:15 - Was there another way? 39:00 - Clickbait Course: Make Money with Machine Learning 50:30 - Rock Bottom and the Way Forward 1:01:30 - Advice for Future Generations
2021-11-01
1h 06
Yannic Kilcher Videos (Audio Only)
[ML News GERMAN] NVIDIA GTC'21 | DeepMind buys MuJoCo | Google learns spreadsheet formulas
#gtc21 #mlnews #mujoco Register for GTC'21 and win an RTX 3090: https://nvda.ws/2Y2B5ni OUTLINE: 0:00 - Intro 0:15 - Sponsor: NVIDIA GTC'21 6:10 - DeepMind buys & open-sources MuJoCo 9:05 - PyTorch 1.10 released 11:25 - Google learns spreadsheet formulas 14:15 - handtracking.io 15:25 - Cell instance segmentation challenge 16:15 - Helpful libraries 23:15 - Waymo cars all get lost in the same dead-end 24:50 - BlueRiver balances tractors References: DeepMind bu...
2021-11-01
26 min
Yannic Kilcher Videos (Audio Only)
[ML News] NVIDIA GTC'21 | DeepMind buys MuJoCo | Google predicts spreadsheet formulas
#gtc21 #mlnews #mujoco Register for GTC'21 and win an RTX 3090: https://nvda.ws/2Y2B5ni OUTLINE: 0:00 - Intro 0:15 - Sponsor: NVIDIA GTC'21 5:35 - DeepMind buys & Open-Sources MuJoCo 7:25 - PyTorch 1.10 Released 9:10 - Google Predicts Spreadsheet Formulas 11:25 - handtracking.io 12:25 - Cell Instance Segmentation Challenge 13:00 - Helpful Libraries 17:50 - Waymo cars keep turning into same dead-end 19:35 - BlueRiver balances tractors References: DeepMind buys...
2021-11-01
21 min
Yannic Kilcher Videos (Audio Only)
I went to an AI Art Festival in Geneva (AiiA Festival Trip Report)
#aiia #ai #art A trip report from the AiiA Festival in Geneva organized by the ImpactAI foundation. OUTLINE: 0:00 - Intro 1:50 - Laura Tocmacov: The Festival 4:10 - Timothy O'Hear: The Tech 6:50 - Jonathan O'Hear: The Robot 11:50 - Cléa Chopard: The Artist 17:45 - Final Words Website: https://aiiafestival.org/en/ Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c...
2021-10-29
18 min
Yannic Kilcher Videos (Audio Only)
Symbolic Knowledge Distillation: from General Language Models to Commonsense Models (Explained)
#gpt3 #knowledge #symbolic Symbolic knowledge models are usually trained on human-generated corpora that are cumbersome and expensive to create. Such corpora consist of structured triples of symbolic knowledge. This paper takes a different approach and attempts to generate such a corpus by prompting GPT-3. Results show that clever prompting, combined with targeted small critic models trained on human ratings can outperform both human-generated data, as well as the teacher model (GPT-3) itself. The results of this paper give a general recipe for automatically building corpora for various NLP tasks by extracting samples from large...
2021-10-25
45 min
Yannic Kilcher Videos (Audio Only)
I took a Swiss train and it was awesome! Train Seat Review - SBB InterCity 1 - Geneva to St. Gallen
#sbb #seatreview #travel A friendly parody of Travel Vloggers and Airplane Seat Reviews :) No, SBB did not pay me for this (but they should ;) ) Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn...
2021-10-25
04 min
Yannic Kilcher Videos (Audio Only)
[ML News] Microsoft trains 530B model | ConvMixer model fits into single tweet | DeepMind profitable
#mlnews #turingnlg #convmixer Your latest updates on what's happening in the Machine Learning world. OUTLINE: 0:00 - Intro 0:16 - Weights & Biases raises at a $1B valuation (sponsored) 2:30 - Microsoft trains 530 billion parameter model 5:15 - StyleGAN v3 released 6:45 - A few more examples may be worth billions of parameters 8:30 - ConvMixer fits into a tweet 9:45 - Improved VQGAN 11:25 - William Shatner AI chats about his life 12:35 - Google AI pushes material science 14:10...
2021-10-21
27 min
Yannic Kilcher Videos (Audio Only)
[ML News] DeepMind does Nowcasting | The Guardian's shady reporting | AI finishes Beethoven's 10th
#deepmind #nowcasting #machinelearning Your holy update on what's new in the Machine Learning world. OUTLINE: 0:00 - Intro 0:30 - DeepMind tackles Nowcasting 3:30 - The Guardian's shady reporting on TruthfulQA 6:15 - Stochastic training not necessary for generalization 7:35 - Google AI's efficient partitioning of road networks 9:15 - MiniHack Reinforcement Learning Environment 10:45 - Plato XL 11B dialog model 11:35 - AI finishes Beethoven's 10th Symphony 13:10 - AI casts doubt on painting authenticity 15:55 - ShadowDragon...
2021-10-11
27 min
Yannic Kilcher Videos (Audio Only)
Grokking: Generalization beyond Overfitting on small algorithmic datasets (Paper Explained)
#grokking #openai #deeplearning Grokking is a phenomenon in which a neural network abruptly learns a pattern in the dataset, jumping from chance-level generalization to perfect generalization. This paper demonstrates grokking on small algorithmic datasets where a network has to fill in binary tables. Interestingly, the learned latent spaces show an emergence of the underlying binary operations that the data were created with. OUTLINE: 0:00 - Intro & Overview 1:40 - The Grokking Phenomenon 3:50 - Related: Double Descent 7:50 - Binary Operations Datasets
2021-10-11
29 min
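The "binary tables" in the episode above are easy to construct: the datasets are full operation tables over a small modulus, with a random subset of entries held out for the network to fill in. A minimal sketch (split details are my assumption):

```python
import random

# Build the kind of small algorithmic dataset grokking was shown on:
# the complete table of a binary operation (here addition mod p),
# shuffled and split so the network must fill in held-out entries.
def modular_addition_table(p=7, train_frac=0.5, seed=0):
    pairs = [(a, b, (a + b) % p) for a in range(p) for b in range(p)]
    random.Random(seed).shuffle(pairs)
    cut = int(train_frac * len(pairs))
    return pairs[:cut], pairs[cut:]

train, val = modular_addition_table()
print(len(train), len(val))  # 24 25
```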
Yannic Kilcher Videos (Audio Only)
How far can we scale up? Deep Learning's Diminishing Returns (Article Review)
#deeplearning #co2 #cost Deep Learning has achieved impressive results in the last years, not least due to the massive increases in computational power and data that have gone into these models. Scaling up currently promises to be a reliable way to create more performant systems, but how far can we go? This article explores the limits of exponential scaling in AI, and what people are doing to get around this problem. OUTLINE: 0:00 - Intro & Overview 1:00 - Deep Learning at its limits 3:10 - The...
2021-10-04
20 min
Yannic Kilcher Videos (Audio Only)
[ML News] Plagiarism Case w/ Plot Twist | CLIP for video surveillance | OpenAI summarizes books
#plagiarism #surveillance #schmidhuber Your Mondaily updates of what's going on in the world of Machine Learning. OUTLINE: 0:00 - Intro 0:20 - New plagiarism case has plot twist 7:25 - CLIP for video surveillance 9:40 - DARPA SubTerranean Challenge 11:00 - Schmidhuber criticizing Turing Lecture 15:00 - OpenAI summarizes books 17:55 - UnBiasIt monitors employees' communications for bias 20:00 - iOS plans to detect depression 21:30 - UK 10 year plan to become AI superpower 23:30 - Helpful Libraries 29:00...
2021-10-01
30 min
Yannic Kilcher Videos (Audio Only)
Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment (Paper Explained)
#neurips #peerreview #nips The peer-review system at Machine Learning conferences has come under much criticism over the last years. One major driver was the infamous 2014 NeurIPS experiment, where a subset of papers were given to two different sets of reviewers. This experiment showed that only about half of all accepted papers were consistently accepted by both committees and demonstrated significant influence of subjectivity. This paper revisits the data from the 2014 experiment and traces the fate of accepted and rejected papers during the 7 years since, and analyzes how well reviewers can assess future impact, among...
2021-10-01
25 min
Yannic Kilcher Videos (Audio Only)
[ML News] New ImageNet SOTA | Uber's H3 hexagonal coordinate system | New text-image-pair dataset
#truthfulqa #efficientnet #laion400M Your regularly irregular updates on what's happening in the Machine Learning world. OUTLINE: 0:00 - Intro 0:20 - TruthfulQA benchmark shines new light on GPT-3 2:00 - LAION-400M image-text-pair dataset 4:10 - GoogleAI's EfficientNetV2 and CoAtNet 6:15 - Uber's H3: A hexagonal coordinate system 7:40 - AWS NeurIPS 2021 DeepRacer Challenge 8:15 - Helpful Libraries 9:20 - State of PyTorch in September 2021 10:05 - Physics-Based Deep Learning Book 10:35 - Music-conditioned 3D dance generation
2021-09-28
14 min
Yannic Kilcher Videos (Audio Only)
Does GPT-3 lie? - Misinformation and fear-mongering around the TruthfulQA dataset
#gpt-3 #truth #conspiracy A new benchmark paper has created quite an uproar in the community. TruthfulQA is a dataset of 817 questions probing for imitative falsehoods, where language models become less truthful the larger they get. This surprising, counter-intuitive finding validates many people's criticisms of large language models, but is it really the correct conclusion? OUTLINE: 0:00 - Intro 0:30 - Twitter Paper Announcement 4:10 - Large Language Models are to blame! 5:50 - How was the dataset constructed? 9:25 - The questions are adversarial
2021-09-24
13 min
Yannic Kilcher Videos (Audio Only)
Topographic VAEs learn Equivariant Capsules (Machine Learning Research Paper Explained)
#tvae #topographic #equivariant Variational Autoencoders model the latent space as a set of independent Gaussian random variables, which the decoder maps to a data distribution. However, this independence is not always desired: when dealing with video sequences, for example, we know that successive frames are heavily correlated. Thus, any latent space dealing with such data should reflect this in its structure. Topographic VAEs are a framework for defining correlation structures among the latent variables and induce equivariance within the resulting model. This paper shows how such correlation structures can be built by correctly arranging...
2021-09-21
32 min
Yannic Kilcher Videos (Audio Only)
[ML News] Roomba Avoids Poop | Textless NLP | TikTok Algorithm Secrets | New Schmidhuber Blog
#schmidhuber #tiktok #roomba Your regularly irregular update on what's happening in the world of Machine Learning. OUTLINE: 0:00 - Intro 0:15 - Sponsor: Weights & Biases 1:55 - ML YouTuber reaches 100k subscribers 2:40 - Facebook AI pushes Textless NLP 5:30 - Schmidhuber blog post: I invented everything 7:55 - TikTok algorithm rabbitholes users 10:45 - Roomba learns to avoid poop 11:50 - AI can spot art forgeries 14:55 - Deepmind's plans to separate from Google 16:15 - Cohere raises 40...
2021-09-16
25 min
Yannic Kilcher Videos (Audio Only)
Celebrating 100k Subscribers! (w/ Channel Statistics)
#yannickilcher #machinelearning #100k OUTLINE: 0:00 - 100k! 1:00 - Announcements & Thanks 3:55 - Channel Statistics Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... Minds: https://www.minds.com/ykilcher Parler: https://parler.com/profile/YannicKilcher LinkedIn: https://www.linkedin.com/in/yannic-ki... ...
2021-09-16
09 min
Yannic Kilcher Videos (Audio Only)
[ML News] AI predicts race from X-Ray | Google kills HealthStreams | Boosting Search with MuZero
#mlnews #schmidhuber #muzero Your regular updates on what's happening in the ML world! OUTLINE: 0:00 - Intro 0:15 - Sponsor: Weights & Biases 1:45 - Google shuts down health streams 4:25 - AI predicts race from blurry X-Rays 7:35 - Facebook labels black men as primates 11:05 - Distill papers on Graph Neural Networks 11:50 - Jürgen Schmidhuber to lead KAUST AI Initiative 12:35 - GitHub brief on DMCA notices for source code 14:55 - Helpful Reddit Threads 19:40...
2021-09-13
27 min
Yannic Kilcher Videos (Audio Only)
∞-former: Infinite Memory Transformer (aka Infty-Former / Infinity-Former, Research Paper Explained)
#inftyformer #infinityformer #transformer Vanilla Transformers are excellent sequence models, but suffer from very harsh constraints on the length of the sequences they can process. Several attempts have been made to extend the Transformer's sequence length, but few have successfully gone beyond a constant factor improvement. This paper presents a method, based on continuous attention mechanisms, to attend to an unbounded past sequence by representing the past as a continuous signal, rather than a sequence. This enables the Infty-Former to effectively enrich the current context with global information, which increases performance on long-range dependencies in...
2021-09-06
36 min
Yannic Kilcher Videos (Audio Only)
[ML News] Blind Chess AI Competition | Graph NNs for traffic | AI gift suggestions
#mlnews #chess #neurips OUTLINE: 0:00 - Intro 0:30 - Reconnaissance Blind Chess NeurIPS 2021 Competition 3:40 - Colab Pro no longer top priority for GPUs 4:45 - DeepMind uses Graph NNs to do traffic prediction 6:00 - Helpful Libraries: Isaac Gym, Differentiable Human, LVIS, BEHAVIOR 10:25 - Cerebras Wafer Scale Engine Cluster 12:15 - AI Voice Synthesis for Val Kilmer 14:20 - Can AI give thoughtful gifts? References: Reconnaissance Blind Chess NeurIPS 2021 Competition https://rbc.jhuapl.edu/
2021-09-05
17 min
Yannic Kilcher Videos (Audio Only)
ALiBi - Train Short, Test Long: Attention with linear biases enables input length extrapolation
#alibi #transformers #attention Transformers are essentially set models that need additional inputs to make sense of sequence data. The most widespread additional inputs are position encodings or position embeddings, which add sequence index information in various forms. However, this has put a limit on the resulting model, which cannot run inference on sequences longer than it has been trained on, as it would encounter unfamiliar position encodings. ALiBi solves this by proposing simple linear fixed biases as position information, adding negligible overhead in time and memory, but surprisingly, the resulting model is able to...
2021-09-05
31 min
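The linear-bias idea summarized in the ALiBi episode above is simple enough to sketch. This is my own illustrative NumPy version, not the paper's code; the slope schedule follows the geometric sequence the paper describes, but all names are mine:

```python
import numpy as np

def alibi_bias(seq_len, num_heads):
    # per-head slope: 2^(-8h/num_heads), a geometric sequence across heads
    slopes = np.array([2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])
    # distance of each key j from each query i; entry (i, j) = j - i
    dist = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
    dist = np.minimum(dist, 0)  # only past positions get a penalty; diagonal is 0
    # bias added to attention logits: more distant past -> larger penalty
    return slopes[:, None, None] * dist[None, :, :]  # shape (heads, seq, seq)

bias = alibi_bias(seq_len=6, num_heads=4)
assert bias.shape == (4, 6, 6)
assert np.all(np.diag(bias[0]) == 0)       # no penalty for attending to self
assert bias[0, 5, 0] < bias[0, 5, 4] <= 0  # older keys penalized more
```

Because the bias depends only on relative distance, the same function can be evaluated for any `seq_len` at inference time, which is what enables the length extrapolation the title refers to.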
Yannic Kilcher Videos (Audio Only)
[ML News] Stanford HAI coins Foundation Models & High-profile case of plagiarism uncovered
#plagiarism #foundationmodels #tesla The best place to keep up to date with the latest and greatest from the ML world! OUTLINE: 0:00 - Intro & Sponsor 3:15 - A high-profile case of plagiarism shocks the ML world 11:55 - Stanford AI releases paper on "Foundation Models" 19:45 - Updates on Apple's NeuralHash 20:45 - RL control for two-player sports 21:45 - Tesla's AI Day 23:55 - COMMA THREE announced 24:40 - Intel winding down RealSense cameras 25:20 - IBM unveils Telum...
2021-08-30
32 min
The Gradient: Perspectives on AI
Yannic Kilcher on Being an AI Researcher and Educator
In episode 8 of The Gradient Podcast, we interview Yannic Kilcher, an AI researcher and educator.Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSYannic graduated with his PhD from ETH Zurich’s data analytics lab and is now the Chief Technology Officer of DeepJudge, a company building the next-generation AI-powered context-sensitive legal document processing platform. He famously produces videos on his very popular Youtube channel, which cover machine learning research papers, programming, and issues of the AI community, and the broader impact of AI in society.Check out his Youtube ch...
2021-08-27
40 min
Yannic Kilcher Videos (Audio Only)
Fastformer: Additive Attention Can Be All You Need (Machine Learning Research Paper Explained)
#attention #transformer #fastformer Transformers have become the dominant model class in the last few years for large data, but their quadratic complexity in terms of sequence length has plagued them until now. Fastformer claims to be the fastest and most performant linear attention variant, able to consume long contexts at once. This is achieved by a combination of additive attention and elementwise products. While initial results look promising, I have my reservations... OUTLINE: 0:00 - Intro & Outline 2:15 - Fastformer description 5:20 - Baseline: Classic Attention
2021-08-27
35 min
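The additive-attention pooling step at the core of the Fastformer episode above can be sketched in a few lines. This is a hedged illustration of the "global summary vector" idea only (the full model also involves elementwise products with keys); weights and names are stand-ins of mine:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def additive_pool(X, w):
    # one learned vector w scores every token -> cost linear in sequence length,
    # unlike the quadratic pairwise scores of classic attention
    alpha = softmax(X @ w)   # (seq_len,) attention weights
    return alpha @ X         # (hidden,) global summary vector

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 8))           # 16 tokens, hidden size 8
g = additive_pool(X, rng.normal(size=8))
assert g.shape == (8,)
```

The summary vector `g` is then broadcast back against the per-token representations, which is where the elementwise products mentioned in the description come in.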
Yannic Kilcher Videos (Audio Only)
PonderNet: Learning to Ponder (Machine Learning Research Paper Explained)
#pondernet #deepmind #machinelearning Humans don't spend the same amount of mental effort on all problems equally. Instead, we respond quickly to easy tasks, and we take our time to deliberate hard tasks. DeepMind's PonderNet attempts to achieve the same by dynamically deciding how many computation steps to allocate to any single input sample. This is done via a recurrent architecture and a trainable function that computes a halting probability. The resulting model performs well in dynamic computation tasks and is surprisingly robust to different hyperparameter settings. OUTLINE: 0:00...
2021-08-23
44 min
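The halting mechanism described in the PonderNet episode above amounts to a geometric-like distribution over computation steps. A minimal sketch of that distribution, with my own illustrative names (the real model predicts each lambda from the hidden state):

```python
def halting_distribution(lambdas):
    # p_n = lambda_n * prod_{k<n}(1 - lambda_k): probability of halting
    # exactly at step n; the remaining mass is forced onto the last step
    probs, survive = [], 1.0
    for i, lam in enumerate(lambdas):
        if i == len(lambdas) - 1:
            probs.append(survive)          # must halt at the final step
        else:
            probs.append(survive * lam)
            survive *= 1.0 - lam
    return probs

p = halting_distribution([0.3, 0.5, 0.2, 0.9])
assert abs(sum(p) - 1.0) < 1e-12  # a proper distribution over steps
assert p[0] == 0.3                # halt immediately with probability lambda_1
```

The expected number of steps under this distribution is what the model trades off against task loss, so easy inputs learn large early lambdas and hard inputs learn to "ponder" longer.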
Yannic Kilcher Videos (Audio Only)
NeuralHash is BROKEN | How to evade Apple's detection and forge hash collisions (w/ Code)
#apple #icloud #neuralhash Send your Apple fanboy friends to prison with this one simple trick ;) We break Apple's NeuralHash algorithm used to detect CSAM for iCloud photos. I show how it's possible to craft arbitrary hash collisions from any source / target image pair using an adversarial example attack. This can be used for many purposes, such as evading detection, or forging false positives, triggering manual reviews. OUTLINE: 0:00 - Intro 1:30 - Forced Hash Collisions via Adversarial Attacks 2:30 - My Successful Attack 5:40...
2021-08-19
08 min
Yannic Kilcher Videos (Audio Only)
[ML News] Nvidia renders CEO | Jurassic-1 larger than GPT-3 | Tortured Phrases reveal Plagiarism
#mlnews #nvidia #openai An in-depth look over what's going on in the world of Machine Learning and Artificial intelligence. Subscribe now and make Monday the best day of the week! OUTLINE: 0:00 - Intro 0:20 - Sponsor: Weights & Biases 3:00 - Nvidia's CEO was rendered during Keynote 5:00 - AI21 Labs releases Jurassic-1 language model 7:00 - Tortured Phrases reveal plagiarism 10:05 - Cortical neurons are computationally complex 11:55 - OpenAI Codex Update & Challenge 13:30 - Automated drug abuse prevention gone...
2021-08-19
26 min
Yannic Kilcher Videos (Audio Only)
How Apple scans your phone (and how to evade it) - NeuralHash CSAM Detection Algorithm Explained
#apple #icloud #privacy Apple recently announced scanning all images uploaded to iCloud for CSAM (child abuse material), and that this scan would happen locally on users' phones. We take a look at the technical report and explore how the system works in detail, how it is designed to preserve user privacy, and what weak points it still has. OUTLINE: 0:00 - Introduction 3:05 - System Requirements 9:15 - System Overview 14:00 - NeuralHash 20:45 - Private Set Intersection 31:15 - Threshold Secret Sharing
2021-08-16
50 min
Yannic Kilcher Videos (Audio Only)
[ML NEWS] Apple scans your phone | Master Faces beat face recognition | WALL-E is real
#mlnews #apple #nolamarck Your update on the latest news in the AI and Machine Learning world. OUTLINE: 0:00 - Intro 0:15 - Sponsor: Weights & Biases 3:30 - Apple to scan iDevices for illegal content 14:10 - EU approves chatcontrol 15:20 - Machine Learning FAQ book 17:40 - TimeDial & Disfl-QA Conversation Datasets 20:30 - VoxPopuli Speech Dataset 21:00 - Google Tensor chip coming to Pixel 6 21:30 - Pentagon uses AI to predict events 23:10 - Sketch your own GAN
2021-08-16
30 min
Yannic Kilcher Videos (Audio Only)
[ML News] AI-generated patent approved | Germany gets an analog to OpenAI | ML cheats video games
#mlnews #dabus #alephalpha OUTLINE: 0:00 - Intro 0:20 - Sponsor: Weights & Biases 3:45 - AI legally recognized as patent inventor 8:35 - Aleph Alpha raises USD 27Mio to build European OpenAI 10:20 - AMP advances AI aided recycling 11:20 - DeepMind builds XLand RL environment 13:15 - Cognitive Behavioral Therapy as an app 16:15 - Wordcraft interactive AI text editor 17:05 - ML used to cheat in console games 18:10 - Google's OpenBuildings Dataset 20:00 - Most ML COVID tools are flawed
2021-08-09
27 min
Yannic Kilcher Videos (Audio Only)
[ML News] MMO Game destroys GPUs | OpenAI quits Robotics | Today w/ guest host Sanyam Bhutani
#chai #mlnews #nvidia Follow Sanyam here: YouTube: https://www.youtube.com/c/ChaiTimeDat... Twitter: https://twitter.com/bhutanisanyam1 Apple Podcasts: https://podcasts.apple.com/us/podcast... LinkedIn: https://www.linkedin.com/in/sanyambhu... Spotify: https://open.spotify.com/show/7IbEWJj... Anchor.fm RSS: https://anchor.fm/s/c19772c/podcast/rss Outline: 0:00 - Intro & Overview 1:30 - Amazon's MMO may destroy gaming GPUs 2:40 - OpenAI pivots away from Robotics 3:35...
2021-08-09
13 min
Yannic Kilcher Videos (Audio Only)
[ML News] Facebook AI adapting robots | Baidu autonomous excavators | Happy Birthday EleutherAI
A look into the happenings of the Machine Learning world. OUTLINE: 0:00 - Intro 0:25 - Facebook AI trains rapidly adapting robots 3:05 - Baidu presents autonomous excavator system 4:45 - EleutherAI turns 1 6:05 - Elon Musk says FSD harder than expected 8:10 - AI interview tools still fall short 11:10 - RunwayML AI-powered cloud video editor 11:55 - MineRL BASALT competition to learn from human feedback 13:15 - The Myth of the Expert Reviewer 15:55 - NVIDIA unveils Cambridge-1 supercomputer 17:10...
2021-07-18
23 min
Yannic Kilcher Videos (Audio Only)
[ML News] GitHub Copilot - Copyright, GPL, Patents & more | Brickit LEGO app | Distill goes on break
#copilot #copyright #gpl GitHub and OpenAI release Copilot, an AI-powered code autocomplete system that can generate entire functions, classes, and modules from mere definitions and docstrings. Copilot was trained on all public GitHub repositories, and this has a lot of people upset about questions on copyright, code licenses, social obligations, and how much you can profit from other people's work. I give my opinions on the issue in relation to copyright law, the GPL license, and terms of service. Further, we discuss the Brickit app to organize your LEGOs, Distill going on a break...
2021-07-13
27 min
Yannic Kilcher Videos (Audio Only)
Self-driving from VISION ONLY - Tesla's self-driving progress by Andrej Karpathy (Talk Analysis)
#tesla #selfdriving #karpathy Tesla is pushing the state-of-the-art in full self-driving, and interestingly, they explicitly switch from having multiple different sensors to a vision-only system. We discuss the highlights of Andrej Karpathy's talk about Tesla's FSD system, how to label petabytes of data, how to sample edge-cases, how to train a neural network that has to work in real-time, and why moving to having only cameras is superior to multi-sensor approaches. OUTLINE: 0:00 - Intro & Overview 1:55 - Current Auto-Braking system 3:20 - Full Self-Driving from...
2021-07-05
23 min
Yannic Kilcher Videos (Audio Only)
[ML News] CVPR bans social media paper promotion | AI restores Rembrandt | GPU prices down
#cvpr #socialmedia #machinelearning In this week's ML news we look at CVPR's controversial action to ban paper promotions on social media during the review phase, among other things! OUTLINE: 0:00 - Intro & Overview 0:25 - CVPR bans social media paper discussions 5:10 - WalMart uses AI to suggest substitutions 6:05 - NVIDIA releases Alias-Free GAN 7:30 - Confession Video in Myanmar possibly a DeepFake 8:50 - AI restores Rembrandt painting 10:40 - AI for healthcare not problem-free yet 11:50 - ML...
2021-07-05
18 min
Yannic Kilcher Videos (Audio Only)
The Dimpled Manifold Model of Adversarial Examples in Machine Learning (Research Paper Explained)
#adversarialexamples #dimpledmanifold #security Adversarial Examples have long been a fascinating topic for many Machine Learning researchers. How can a tiny perturbation cause the neural network to change its output by so much? While many explanations have been proposed over the years, they all appear to fall short. This paper attempts to comprehensively explain the existence of adversarial examples by proposing a view of the classification landscape, which they call the Dimpled Manifold Model, which says that any classifier will adjust its decision boundary to align with the low-dimensional data manifold, and only slightly bend...
2021-06-28
1h 14
Yannic Kilcher Videos (Audio Only)
[ML News] Hugging Face course | GAN Theft Auto | AI Programming Puzzles | PyTorch 1.9 Released
#mlnews #gta #weather In this week's ML News, we look at the latest developments in the Machine Learning and AI world with updates from research, industry, and society at large. OUTLINE: 0:00 - Intro 0:20 - Hugging Face launches free course 1:30 - Sentdex releases GAN Theft Auto 2:25 - Facebook uses AI to help moderators 4:10 - Weather with Antonio 5:10 - Autonomous ship aborts mission 7:25 - PyTorch Release 1.9 8:30 - McDonald's new AI drive thru 10:20...
2021-06-25
15 min
Yannic Kilcher Videos (Audio Only)
XCiT: Cross-Covariance Image Transformers (Facebook AI Machine Learning Research Paper Explained)
#xcit #transformer #attentionmechanism After dominating Natural Language Processing, Transformers have taken over Computer Vision recently with the advent of Vision Transformers. However, the attention mechanism's quadratic complexity in the number of tokens means that Transformers do not scale well to high-resolution images. XCiT is a new Transformer architecture, containing XCA, a transposed version of attention, reducing the complexity from quadratic to linear, and at least on image data, it appears to perform on par with other models. What does this mean for the field? Is this even a transformer? What really matters in deep...
2021-06-25
35 min
Yannic Kilcher Videos (Audio Only)
AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control (Paper Explained)
#reinforcementlearning #gan #imitationlearning Learning from demonstrations is a fascinating topic, but what if the demonstrations are not exactly the behaviors we want to learn? Can we adhere to a dataset of demonstrations and still achieve a specified goal? This paper uses GANs to combine goal-achieving reinforcement learning with imitation learning and learns to perform well at a given task while doing so in the style of a presented dataset. The resulting behaviors include many realistic-looking transitions between the demonstrated movements. OUTLINE: 0:00 - Intro & Overview 1:25...
2021-06-22
34 min
Yannic Kilcher Videos (Audio Only)
[ML News] De-Biasing GPT-3 | RL cracks chip design | NetHack challenge | Open-Source GPT-J
OUTLINE: 0:00 - Intro 0:30 - Google RL creates next-gen TPUs 2:15 - Facebook launches NetHack challenge 3:50 - OpenAI mitigates bias by fine-tuning 9:05 - Google AI releases browseable reconstruction of human cortex 9:50 - GPT-J 6B Transformer in JAX 12:00 - Tensorflow launches Forum 13:50 - Text style transfer from a single word 15:45 - ALiEn artificial life simulator My Video on Chip Placement: https://youtu.be/PDRtyrVskMU References: RL creates next-gen TPUs https...
2021-06-22
17 min
Yannic Kilcher Videos (Audio Only)
Efficient and Modular Implicit Differentiation (Machine Learning Research Paper Explained)
#implicitfunction #jax #autodiff Many problems in Machine Learning involve loops of inner and outer optimization. Finding update steps for the outer loop is usually difficult, because of the need to differentiate through the inner loop's procedure over multiple steps. Such loop unrolling is very limited and constrained to very few steps. Other papers have found solutions around unrolling in very specific, individual problems. This paper proposes a unified framework for implicit differentiation of inner optimization procedures without unrolling and provides implementations that integrate seamlessly into JAX. OUTLINE: 0:00...
2021-06-15
32 min
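The core trick behind the implicit-differentiation episode above, the implicit function theorem, can be demonstrated on a toy fixed-point problem without any unrolling. This is my own minimal sketch (the problem and all names are illustrative, not from the paper's JAX library): for a fixed point x = f(x, theta), the IFT gives dx/dtheta = (df/dtheta) / (1 - df/dx), evaluated only at the solution.

```python
import math

def solve_fixed_point(theta, iters=200):
    # inner loop: iterate x <- tanh(theta + 0.5 x), a contraction
    x = 0.0
    for _ in range(iters):
        x = math.tanh(theta + 0.5 * x)
    return x

def implicit_grad(theta):
    # differentiate the solution w.r.t. theta using only the final iterate
    x = solve_fixed_point(theta)
    sech2 = 1.0 - math.tanh(theta + 0.5 * x) ** 2  # df/dtheta
    return sech2 / (1.0 - 0.5 * sech2)             # IFT: f_theta / (1 - f_x)

# sanity check against finite differences through the whole inner loop
eps = 1e-6
fd = (solve_fixed_point(0.3 + eps) - solve_fixed_point(0.3 - eps)) / (2 * eps)
assert abs(implicit_grad(0.3) - fd) < 1e-4
```

Note that `implicit_grad` never backpropagates through the 200 inner iterations; it only needs the converged solution, which is exactly why the approach scales where unrolling does not.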
Yannic Kilcher Videos (Audio Only)
[ML News] EU regulates AI, China trains 1.75T model, Google's oopsie, Everybody cheers for fraud.
#mlnews #wudao #academicfraud OUTLINE: 0:00 - Intro 0:25 - EU seeks to regulate AI 2:45 - AI COVID detection systems are all flawed 5:05 - Chinese lab trains model 10x GPT-3 size 6:55 - Google error identifies "ugliest" language 9:45 - McDonald's learns about AI buzzwords 11:25 - AI predicts cryptocurrency prices 12:00 - Unreal Engine hack for CLIP 12:35 - Please commit more academic fraud References: https://www.lawfareblog.com/artificia... https://blogs.sciencemag.org...
2021-06-10
16 min
Yannic Kilcher Videos (Audio Only)
Decision Transformer: Reinforcement Learning via Sequence Modeling (Research Paper Explained)
#decisiontransformer #reinforcementlearning #transformer Proper credit assignment over long timespans is a fundamental problem in reinforcement learning. Even methods designed to combat this problem, such as TD-learning, quickly reach their limits when rewards are sparse or noisy. This paper reframes offline reinforcement learning as a pure sequence modeling problem, with the actions being sampled conditioned on the given history and desired future rewards. This allows the authors to use recent advances in sequence modeling using Transformers and achieve competitive results in Offline RL benchmarks. OUTLINE: 0:00 - Intro & Overview
2021-06-07
56 min
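The reframing in the Decision Transformer episode above hinges on one data-preparation step: turning offline trajectories into sequences of (return-to-go, state, action) tokens. A hedged sketch of that step with toy data of mine (the actual model then trains a Transformer on such sequences):

```python
def returns_to_go(rewards):
    # return-to-go at step t is the sum of rewards from t to the end
    rtg, total = [], 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return rtg[::-1]

def to_sequence(states, actions, rewards):
    # interleave (rtg_t, s_t, a_t) into one flat token sequence; at test time
    # the first rtg token is set to the return we *want* the policy to achieve
    seq = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        seq += [("rtg", g), ("state", s), ("action", a)]
    return seq

seq = to_sequence(states=[0, 1, 2], actions=["a", "b", "c"], rewards=[1.0, 0.0, 2.0])
assert seq[0] == ("rtg", 3.0)  # total return of the trajectory
assert seq[3] == ("rtg", 2.0)  # remaining return after the first step
```

Conditioning on the desired future reward is what sidesteps explicit credit assignment: the sequence model simply learns which actions tend to follow which return-to-go values.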
Yannic Kilcher Videos (Audio Only)
[ML News] Anthropic raises $124M, ML execs clueless, collusion rings, ELIZA source discovered & more
#mlnews #anthropic #eliza Anthropic raises $124M for steerable AI, peer review is threatened by collusion rings, and the original ELIZA source code was discovered. OUTLINE: 0:00 - Intro 0:40 - Anthropic raises $124M 3:25 - 65% of execs can't explain AI predictions 4:25 - DeepMind releases AndroidEnv 6:10 - Collusion rings in ML Conferences 7:30 - ELIZA's original source code discovered 10:45 - OpenAI raises $100M fund 11:25 - Outro References: https://techcrunch.com/2021/05/28/ant...
2021-06-07
11 min
Yannic Kilcher Videos (Audio Only)
Reward Is Enough (Machine Learning Research Paper Explained)
#reinforcementlearning #deepmind #agi What's the most promising path to creating Artificial General Intelligence (AGI)? This paper makes the bold claim that a learning agent maximizing its reward in a sufficiently complex environment will necessarily develop intelligence as a by-product, and that Reward Maximization is the best way to move the creation of AGI forward. The paper is a mix of philosophy, engineering, and futurism, and raises many points of discussion. OUTLINE: 0:00 - Intro & Outline 4:10 - Reward Maximization 10:10 - The Reward-is-Enough Hypothesis 13:15...
2021-06-02
35 min
Yannic Kilcher Videos (Audio Only)
Expire-Span: Not All Memories are Created Equal: Learning to Forget by Expiring (Paper Explained)
#expirespan #nlp #facebookai Facebook AI (FAIR) researchers present Expire-Span, a variant of Transformer XL that dynamically assigns expiration dates to previously encountered signals. Because of this, Expire-Span can handle sequences of many thousands of tokens, while keeping the memory and compute requirements at a manageable level. It matches or outperforms baseline systems while consuming far fewer resources. We discuss its architecture, advantages, and shortcomings. OUTLINE: 0:00 - Intro & Overview 2:30 - Remembering the past in sequence models 5:45 - Learning to expire past memories 8:30...
2021-05-26
41 min
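The expiration mechanism from the Expire-Span episode above can be illustrated with a hard mask (the real model uses a soft, differentiable ramp so the spans can be learned). This is my own toy sketch; spans and names are hypothetical:

```python
import numpy as np

def expire_mask(spans, t):
    # memory i was written at time i; at query time t its age is t - i.
    # it stays attendable only while its age is within its predicted span e_i
    ages = t - np.arange(len(spans))
    return ages <= np.asarray(spans)

spans = [5, 2, 10, 1]            # per-memory expiration spans (hypothetical)
mask = expire_mask(spans, t=4)
# memory 0 (age 4) survives its span of 5; memory 1 (age 3) exceeded its span of 2
assert mask.tolist() == [True, False, True, True]
```

Expired entries can be dropped from the cache entirely, which is where the memory savings over a plain Transformer XL cache come from.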
Yannic Kilcher Videos (Audio Only)
FNet: Mixing Tokens with Fourier Transforms (Machine Learning Research Paper Explained)
#fnet #attention #fourier Do we even need Attention? FNets completely drop the Attention mechanism in favor of a simple Fourier transform. They perform almost as well as Transformers, while drastically reducing parameter count, as well as compute and memory requirements. This highlights that a good token mixing heuristic could be as valuable as a learned attention matrix. OUTLINE: 0:00 - Intro & Overview 0:45 - Giving up on Attention 5:00 - FNet Architecture 9:00 - Going deeper into the Fourier Transform 11:20 - The Importance...
2021-05-24
34 min
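The token-mixing step from the FNet episode above is short enough to write out exactly: a Fourier transform over the sequence dimension, another over the hidden dimension, keeping only the real part. A minimal NumPy sketch (layer norms and feed-forward sublayers of the full model are omitted):

```python
import numpy as np

def fourier_mix(x):
    # x: (seq_len, hidden). No learned parameters, no attention matrix --
    # the 2D DFT mixes information across tokens and channels in one shot.
    return np.real(np.fft.fft(np.fft.fft(x, axis=-1), axis=-2))

x = np.ones((4, 2))
mixed = fourier_mix(x)
assert mixed.shape == (4, 2)
# the DC component collects the total mass of the constant input (4 * 2 = 8)
assert abs(mixed[0, 0] - 8.0) < 1e-9
```

Since the mixing is parameter-free, all learning happens in the surrounding feed-forward layers, which is exactly the point the episode makes about token-mixing heuristics.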
Yannic Kilcher Videos (Audio Only)
AI made this music video | What happens when OpenAI's CLIP meets BigGAN?
#artificialintelligence #musicvideo #clip I used OpenAI's CLIP model and BigGAN to create a music video that goes along with the lyrics of a song that I wrote. The song lyrics are made from ImageNet class labels, and the song itself is performed by me on a looper. OUTLINE: 0:00 - Intro 1:00 - AI-generated music video for "be my weasel" 3:50 - How it was made 7:30 - My looping gear 9:35 - AI-generated music video #2 12:45 - Outro & Credits
2021-05-21
13 min
Yannic Kilcher Videos (Audio Only)
DDPM - Diffusion Models Beat GANs on Image Synthesis (Machine Learning Research Paper Explained)
#ddpm #diffusionmodels #openai GANs have dominated the image generation space for the majority of the last decade. This paper shows for the first time how a non-GAN model, a DDPM, can be improved to overtake GANs at standard evaluation metrics for image generation. The produced samples look amazing, and unlike GANs, the new model has a formal probabilistic foundation. Is there a future for GANs, or are Diffusion Models going to overtake them for good? OUTLINE: 0:00 - Intro & Overview 4:10 - Denoising Diffusion Probabilistic Models
2021-05-15
54 min
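The forward (noising) half of the diffusion process discussed in the DDPM episode above has a convenient closed form: x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps, where abar_t is the cumulative product of (1 - beta). A hedged sketch with a toy beta schedule of my choosing, not the paper's exact configuration:

```python
import numpy as np

def q_sample(x0, t, betas, rng):
    # sample x_t directly from x_0 without iterating through all steps
    abar = np.cumprod(1.0 - betas)[t]   # cumulative signal-retention factor
    eps = rng.normal(size=x0.shape)     # standard Gaussian noise
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)   # a common linear schedule (assumption)
x0 = rng.normal(size=(8, 8))            # stand-in "image"
xt = q_sample(x0, t=999, betas=betas, rng=rng)
assert xt.shape == x0.shape
# by the final step, almost all of the original signal has been destroyed
assert np.cumprod(1.0 - betas)[-1] < 1e-4
```

The generative model is then trained to invert this process step by step, predicting the noise eps, which is the probabilistic foundation the description contrasts with GANs.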
Yannic Kilcher Videos (Audio Only)
Involution: Inverting the Inherence of Convolution for Visual Recognition (Research Paper Explained)
#involution #computervision #attention Convolutional Neural Networks (CNNs) have dominated computer vision for almost a decade by applying two fundamental principles: Spatial agnosticism and channel-specific computations. Involution aims to invert these principles and presents a spatial-specific computation, which is also channel-agnostic. The resulting Involution Operator and RedNet architecture are a compromise between classic Convolutions and the newer Local Self-Attention architectures and perform favorably in terms of computation accuracy tradeoff when compared to either. OUTLINE: 0:00 - Intro & Overview 3:00 - Principles of Convolution 10:50 - Towards spatial-specific computations...
2021-05-10
30 min
Yannic Kilcher Videos (Audio Only)
MLP-Mixer: An all-MLP Architecture for Vision (Machine Learning Research Paper Explained)
#mixer #google #imagenet Convolutional Neural Networks have dominated computer vision for nearly 10 years, and that might finally come to an end. First, Vision Transformers (ViT) have shown remarkable performance, and now even simple MLP-based models reach competitive accuracy, as long as sufficient data is used for pre-training. This paper presents MLP-Mixer, using MLPs in a particular weight-sharing arrangement to achieve a competitive, high-throughput model and it raises some interesting questions about the nature of learning and inductive biases and their interaction with scale for future research. OUTLINE: 0:00 - Intro...
2021-05-10
28 min
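The weight-sharing arrangement from the MLP-Mixer episode above boils down to two plain MLPs per layer: one applied across patches (token mixing) and one across channels. A rough sketch of a single layer with random weights and tanh standing in for GELU; shapes and names are illustrative choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
patches, channels, hidden = 16, 8, 32
X = rng.normal(size=(patches, channels))   # stand-in patch embeddings

W_tok1 = rng.normal(size=(patches, hidden))
W_tok2 = rng.normal(size=(hidden, patches))
W_ch1 = rng.normal(size=(channels, hidden))
W_ch2 = rng.normal(size=(hidden, channels))

# token mixing: transpose so the MLP runs along the patch axis,
# sharing the same weights for every channel
X = X + (np.tanh(X.T @ W_tok1) @ W_tok2).T
# channel mixing: an ordinary per-patch MLP along the channel axis
X = X + np.tanh(X @ W_ch1) @ W_ch2
assert X.shape == (patches, channels)
```

Note there is no attention and no convolution here; the only ways information moves are these two transposed matrix products, which is what makes the model's competitive accuracy at scale so surprising.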
The Engineered-Mind Podcast | Engineering, AI & Technology
Adversarial Examples, AI Bias & Memes - Yannic Kilcher | Podcast #49
Yannic Kilcher has a Master's in CS from ETH and now he is a PhD student and researcher at ETH in the Data Analytics Lab by day and an AI YouTuber by night. ————————————————————————————— 🧠 Free Science Community: community.sci-circle.com 👉 Science Academy: academy.jousefmurad.com 📥 Weekly free science insights newsletter: jousef.substack.com 🐤 Follow me on Twitter: @jousefm2 📷 Follow me on Instagram: @jousefmrd Feel free to support the podcast on Patreon: patreon.com/theengiineer
2021-05-08
42 min
Yannic Kilcher Videos (Audio Only)
Is Google Translate Sexist? Gender Stereotypes in Statistical Machine Translation
#genderbias #algorithmicfairness #debiasing A brief look into gender stereotypes in Google Translate. The origin is a Tweet containing a Hungarian text. Hungarian is a gender-neutral language, so translating gender pronouns is ambiguous. Turns out that Google Translate assigns very stereotypical pronouns. In this video, we'll have a look at the origins and possible solutions to this problem. OUTLINE: 0:00 - Intro 1:10 - Digging Deeper 2:30 - How does Machine Translation work? 3:50 - Training Data Problems 4:40 - Learning Algorithm Problems 5:45 - Argmax Output Pr...
2021-05-03
12 min
Yannic Kilcher Videos (Audio Only)
Perceiver: General Perception with Iterative Attention (Google DeepMind Research Paper Explained)
#perceiver #deepmind #transformer Inspired by the fact that biological creatures attend to multiple modalities at the same time, DeepMind releases its new Perceiver model. Based on the Transformer architecture, the Perceiver makes no assumptions on the modality of the input data and also solves the long-standing quadratic bottleneck problem. This is achieved by having a latent low-dimensional Transformer, where the input data is fed multiple times via cross-attention. The Perceiver's weights can also be shared across layers, making it very similar to an RNN. Perceivers achieve competitive performance on ImageNet and state-of-the-art on other modali...
2021-05-03
29 min
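The bottleneck that lets the Perceiver sidestep the quadratic cost, as described above, is cross-attention from a small latent array into an arbitrarily large input array. A minimal sketch with random stand-in weights (single head, no projections, names mine):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(latent, inputs):
    # queries come from the latents, keys/values from the inputs, so the
    # score matrix is (latents x inputs): linear, not quadratic, in input size
    scores = latent @ inputs.T / np.sqrt(latent.shape[-1])
    return softmax(scores) @ inputs

rng = np.random.default_rng(0)
inputs = rng.normal(size=(10_000, 64))  # e.g. flattened pixels of any modality
latent = rng.normal(size=(32, 64))      # small, fixed-size latent bottleneck
out = cross_attention(latent, inputs)
assert out.shape == (32, 64)            # independent of the 10k input length
```

The full model then runs ordinary self-attention only on the 32 latents and cross-attends to the input repeatedly, which is the RNN-like weight sharing the description mentions.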
Yannic Kilcher Videos (Audio Only)
Pretrained Transformers as Universal Computation Engines (Machine Learning Research Paper Explained)
#universalcomputation #pretrainedtransformers #finetuning Large-scale pre-training and subsequent fine-tuning is a common recipe for success with transformer models in machine learning. However, most such transfer learning is done when a model is pre-trained on the same or a very similar modality to the final task to be solved. This paper demonstrates that transformers can be fine-tuned to completely different modalities, such as from language to vision. Moreover, they demonstrate that this can be done by freezing all attention layers, tuning less than .1% of all parameters. The paper further claims that language modeling is a superior pre-tr...
2021-05-03
34 min
Yannic Kilcher Videos (Audio Only)
Yann LeCun - Self-Supervised Learning: The Dark Matter of Intelligence (FAIR Blog Post Explained)
#selfsupervisedlearning #yannlecun #facebookai Deep Learning systems can achieve remarkable, even super-human performance through supervised learning on large, labeled datasets. However, there are two problems: First, collecting ever more labeled data is expensive in both time and money. Second, these deep neural networks will be high performers on their task, but cannot easily generalize to other, related tasks, or they need large amounts of data to do so. In this blog post, Yann LeCun and Ishan Misra of Facebook AI Research (FAIR) describe the current state of Self-Supervised Learning (SSL) and argue that it is the ne...
2021-05-03
58 min
Yannic Kilcher Videos (Audio Only)
Multimodal Neurons in Artificial Neural Networks (w/ OpenAI Microscope, Research Paper Explained)
#openai #clip #microscope OpenAI does a huge investigation into the inner workings of their recent CLIP model via faceted feature visualization and finds amazing things: Some neurons in the last layer respond to distinct concepts across multiple modalities, meaning they fire for photographs, drawings, and signs depicting the same concept, even when the images are vastly distinct. Through manual examination, they identify and investigate neurons corresponding to persons, geographical regions, religions, emotions, and much more. In this video, I go through the publication and then I present my own findings from digging around in the Op...
2021-05-03
51 min
Yannic Kilcher Videos (Audio Only)
Machine Learning PhD Survival Guide 2021 | Advice on Topic Selection, Papers, Conferences & more!
#machinelearning #phd #howto This video is advice for new PhD students in the field of Machine Learning in 2021 and after. The field has shifted dramatically in the last few years and navigating grad school can be very hard, especially when you're as clueless as I was when I started. The video is a personal recount of my mistakes and what I've learned from them. If you already have several published papers and know what to do, this video is not for you. However, if you are not even sure where to start, how to select...
2021-05-03
16 min
Yannic Kilcher Videos (Audio Only)
PAIR AI Explorables | Is the problem in the data? Examples on Fairness, Diversity, and Bias.
In the recurring debate about bias in Machine Learning models, there is a growing argument saying that "the problem is not in the data", often citing the influence of various choices like loss functions or network architecture. In this video, we take a look at PAIR's AI Explorables through the lens of whether or not the bias problem is a data problem. OUTLINE: 0:00 - Intro & Overview 1:45 - Recap: Bias in ML 4:25 - AI Explorables 5:40 - Measuring Fairness Explorable 11:00 - Hidden Bias Explorable 16:10 - Measuring...
2021-05-03
23 min
Yannic Kilcher Videos (Audio Only)
DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning
#dreamcoder #programsynthesis #symbolicreasoning Classic Machine Learning struggles with few-shot generalization for tasks where humans can easily generalize from just a handful of examples, for example sorting a list of numbers. Humans do this by coming up with a short program, or algorithm, that explains the few data points in a compact way. DreamCoder emulates this by using neural guided search over a language of primitives, a library, that it builds up over time. By doing this, it can iteratively construct more and more complex programs by building on its own abstractions and therefore solve more a...
2021-05-03
48 min
Yannic Kilcher Videos (Audio Only)
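The DreamCoder description above centers on searching for a short program, built from a library of primitives, that explains a few input/output examples. A minimal toy sketch of that idea, using brute-force enumeration in place of DreamCoder's neural guided search and a hypothetical three-primitive library (not the paper's actual DSL):

```python
from itertools import product

# Toy primitive library (hypothetical names, not DreamCoder's actual DSL).
PRIMITIVES = {
    "reverse": lambda xs: xs[::-1],
    "sort": sorted,
    "drop_first": lambda xs: xs[1:],
}

def search(examples, max_depth=2):
    """Enumerate compositions of primitives, shortest first, and return
    the first one consistent with every (input, output) example.
    This stands in for DreamCoder's neural guided search."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            def run(xs, names=names):
                for name in names:
                    xs = PRIMITIVES[name](xs)
                return xs
            if all(run(inp) == out for inp, out in examples):
                return list(names)
    return None

# Two examples of "sort descending" suffice to recover a program.
prog = search([([3, 1, 2], [3, 2, 1]), ([5, 4], [5, 4])])
```

DreamCoder's key addition, not shown here, is growing the library itself: recurring sub-programs get abstracted into new primitives, so later searches start from richer building blocks.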
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (ML Research Paper Explained)
#nerf #neuralrendering #deeplearning View Synthesis is a tricky problem, especially when only given a sparse set of images as input. NeRF embeds an entire scene into the weights of a feedforward neural network, trained by backpropagation through a differentiable volume rendering procedure, and achieves state-of-the-art view synthesis. It includes directional dependence and is able to capture fine structural details, as well as reflection effects and transparency. OUTLINE: 0:00 - Intro & Overview 4:50 - View Synthesis Task Description 5:50 - The fundamental difference to classic Deep Learning...
2021-05-02
33 min
Yannic Kilcher Videos (Audio Only)
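Two pieces of the NeRF pipeline described above are compact enough to sketch directly: the Fourier positional encoding that lets the MLP represent fine detail, and the volume-rendering quadrature the network is trained through. A minimal numpy illustration (the frequency count and sample values are illustrative, not the paper's exact settings):

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """Map coordinates to sin/cos features of exponentially increasing
    frequency, as NeRF does so the MLP can fit high-frequency detail."""
    feats = [x]
    for i in range(num_freqs):
        feats.append(np.sin(2.0**i * np.pi * x))
        feats.append(np.cos(2.0**i * np.pi * x))
    return np.concatenate(feats, axis=-1)

def render_ray(densities, colors, deltas):
    """Alpha-composite per-sample colors along a ray, weighted by the
    accumulated transmittance -- the differentiable volume rendering
    step NeRF backpropagates through."""
    alphas = 1.0 - np.exp(-densities * deltas)          # opacity per sample
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas                            # contribution per sample
    return (weights[:, None] * colors).sum(axis=0)

# A 3D point with 4 frequencies yields 3 + 3*2*4 = 27 features.
encoded = positional_encoding(np.array([0.1, -0.2, 0.5]))
```

In the full model, the encoded position (plus an encoded view direction, for the directional effects mentioned above) feeds an MLP that predicts each sample's density and color before compositing.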
DINO: Emerging Properties in Self-Supervised Vision Transformers (Facebook AI Research Explained)
#dino #facebook #selfsupervised Self-Supervised Learning is the final frontier in Representation Learning: Getting useful features without any labels. Facebook AI's new system, DINO, combines advances in Self-Supervised Learning for Computer Vision with the new Vision Transformer (ViT) architecture and achieves impressive results without any labels. Attention maps can be directly interpreted as segmentation maps, and the obtained representations can be used for image retrieval and zero-shot k-nearest neighbor classifiers (KNNs). OUTLINE: 0:00 - Intro & Overview 6:20 - Vision Transformers 9:20 - Self-Supervised Learning for Images
2021-05-02
39 min
Yannic Kilcher Videos (Audio Only)
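The DINO entry above hinges on self-distillation: a student network matches the output distribution of a teacher whose weights are an exponential moving average of the student's, with centering and temperature sharpening to avoid collapse. A minimal numpy sketch of the loss and the EMA update (temperatures follow the paper's recipe; the input vectors are illustrative):

```python
import numpy as np

def softmax(z, temp):
    z = (z - z.max()) / temp        # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def dino_loss(student_out, teacher_out, center, t_s=0.1, t_t=0.04):
    """Cross-entropy between the centered, sharpened teacher distribution
    and the student distribution (teacher gets no gradient in DINO)."""
    p_t = softmax(teacher_out - center, t_t)   # center + sharpen teacher
    p_s = softmax(student_out, t_s)
    return -(p_t * np.log(p_s + 1e-12)).sum()

def ema_update(teacher_w, student_w, momentum=0.996):
    """Teacher weights track an exponential moving average of the student's."""
    return momentum * teacher_w + (1 - momentum) * student_w

loss = dino_loss(np.array([1.0, 2.0, 3.0]),
                 np.array([3.0, 1.0, 2.0]),
                 center=np.zeros(3))
```

The center itself is updated as a running mean of teacher outputs; together with the low teacher temperature, this is what keeps the label-free training from collapsing to a constant output.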
Why AI is Harder Than We Think (Machine Learning Research Paper Explained)
#aiwinter #agi #embodiedcognition The AI community has gone through regular cycles of AI Springs, where rapid progress gave rise to massive overconfidence, high funding, and overpromise, followed by those promises going unfulfilled and the field sliding into periods of disillusionment and underfunding, called AI Winters. This paper examines the reasons for the repeated periods of overconfidence and identifies four fallacies that people make when they see rapid progress in AI. OUTLINE: 0:00 - Intro & Overview 2:10 - AI Springs & AI Winters 5:40 - Is the current AI boom...
2021-05-02
36 min