Showing episodes and shows of Jay Alammar

Shows

Otostopçunun Yapay Zeka Rehberi
How Does GPT Work? Large Language Models, Tokens, Embeddings, Attention, and More
In this episode, we explore how GPT is pre-trained on massive text datasets, how it is built on the Transformer architecture, and how it generates text by predicting which word comes after another. Along the way, we salute our old friend Turing, glance at the history of Natural Language Processing (NLP), and check in on ELIZA. With the Beatles' cry of "All you need is love" ringing in our ears, we also look at attention mechanisms, tokenization, and embedding methods, and briefly trace how the algorithms evolved from RNNs to LSTMs. GPT...
2025-04-27 · 31 min
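A minimal sketch of the next-token loop this episode describes: the model scores every token in its vocabulary, and generation appends the most likely one. The GPT-2 checkpoint and the Hugging Face transformers library are assumptions chosen for illustration, not something the episode prescribes.

    # Illustrative only: GPT-2 via Hugging Face `transformers` is an assumed stand-in
    # for the "predict the next word" loop the episode explains.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    text = "All you need is"
    for _ in range(5):
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits          # (1, seq_len, vocab_size)
        next_id = int(torch.argmax(logits[0, -1]))   # greedy: most likely next token
        text += tokenizer.decode(next_id)            # append and repeat
    print(text)

Real systems usually sample from the predicted distribution rather than always taking the argmax; that is what settings like temperature and top-p control.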
AI Innovations Unleashed
The Human Guide to Artificial Intelligence - WTF is AI - Ep 1
Artificial Intelligence isn't just for data scientists. In Episode 1 of WTF is AI?, we make AI human, with dogs, toddlers, Netflix, and pizza analogies that actually make sense. 🎧 A must-listen for professionals navigating the AI revolution.
Reference List & Further Reading:
General AI concepts:
- Russell & Norvig, Artificial Intelligence: A Modern Approach
- Stanford's "AI Index Report" (https://aiindex.stanford.edu)
- Google's "AI for Anyone" guide: https://ai.google/education/
On Transformer models:
- "Attention Is All You Need" (Vaswani et al., 2017)
- OpenAI's GPT-3 and GPT-4 papers
- Illustrated Transformer Guide by Jay Alammar: https://j...
2025-04-08 · 31 min

The Hard Part with Evan McCann
The Friday Show - Connor Turland, Alexander Stratmoen, Owen Brakes, Rahul Gudise, Jianxiang (Jack) Xu, Adam Cohen, and Jay Alammar
The first episode of The Friday Show!
00:00 - Intro to The Friday Show Format
01:57 - X Content Rundown
08:38 - Connor Turland from Ceedar
18:57 - Alexander Stratmoen on Socratica Symposium
27:55 - Owen Brakes from Steinmetz
37:03 - Rahul Gudise from Gale
48:00 - Jack Xu from Proception
55:14 - Adam Cohen from Weave
57:00 - Jay Alammar from Cohere on Command A
2025-03-21 · 1h 06m

Data Neighbor Podcast
Ep18: Open-Source LLMs vs. ChatGPT: Which One Should You Use?
AI is evolving faster than ever, and open-source AI models are catching up to proprietary models at an incredible pace. In this episode of the Data Neighbor Podcast, we sit down with Maarten Grootendorst, co-author of Hands-On Large Language Models with Jay Alammar, DeepLearning.AI instructor, and creator of BERTopic and KeyBERT, to break down the real differences between open-source and closed-source AI models.
We discuss how LLMs (Large Language Models) evolved from bag-of-words and Word2Vec to modern transformer-based models like BERT, GPT-4, DeepSeek, LLaMA 2, and Mixtral. More importantly, we explore when open-source AI models might actually be b...
2025-03-16 · 1h 00m

IA Sob Controle - Inteligência Artificial
117: DeepSeek's Impact, Free OpenAI o3-mini, AI in SiSU
Friday is the day to recap the week's biggest news in the world of AI. Come see who took part in this conversation: Marcus Mendes, host; Fabrício Carraro, co-host, Program Manager at Alura, AI author, and host of the Dev Sem Fronteiras podcast.
🔗 Links:
- Fabrício Carraro: How DeepSeek-R1 was trained
- Jay Alammar: The Illustrated DeepSeek-R1
- Nvidia lost $600B in market value on Monday
- The US is investigating whether DeepSeek used improper chips
- Meta sets up four war rooms to analyze DeepSeek
- Italy bans DeepSeek
- Perplexity adds R1 to searches on the pl...
2025-01-31 · 1h 03m

Exploring the unknown, together.
Roads to Research: Scientific Communication
Uncover the secrets of successful scientific communication at our upcoming panel discussion. Join Marzieh Fadaee with Jay Alammar and Shayne Longpre as they share their insights on the art of translating scientific concepts.
2024-12-10 · 1h 01m

GenAI Level UP
Attention Is All You Need - Level 6
The Transformer: Revolutionizing Sequence Transduction with Self-Attention. This episode explores the groundbreaking Transformer, a novel neural network architecture that dispenses with recurrence and convolutions entirely, relying solely on attention mechanisms to capture global dependencies between input and output sequences. The result is superior performance on tasks like machine translation and significantly faster training. We break down the key components of the Transformer, including multi-head self-attention, positional encoding, and encoder-decoder stacks, explaining how they work...
2024-11-27 · 15 min
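As a companion to the entry above, a minimal NumPy sketch of the scaled dot-product attention the episode breaks down: softmax(QK^T / sqrt(d_k)) V, where every position mixes information from every other position. The shapes and random inputs are made up purely for illustration.

    # Illustrative sketch of scaled dot-product attention; inputs are synthetic.
    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                   # pairwise query-key similarity
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the keys
        return weights @ V                                # weighted mix of value vectors

    rng = np.random.default_rng(0)
    Q = rng.normal(size=(4, 8))   # 4 positions, 8-dim head (assumed sizes)
    K = rng.normal(size=(4, 8))
    V = rng.normal(size=(4, 8))
    print(scaled_dot_product_attention(Q, K, V).shape)    # -> (4, 8)

Multi-head attention, which the episode also covers, runs several such maps in parallel on learned projections of Q, K, and V and concatenates the results.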
Machine Learning Street Talk (MLST)
Jay Alammar on LLMs, RAG, and AI Engineering
Jay Alammar, renowned AI educator and researcher at Cohere, discusses the latest developments in large language models (LLMs) and their applications in industry. Jay shares his expertise on retrieval augmented generation (RAG), semantic search, and the future of AI architectures.
MLST is sponsored by Brave: The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmented generation. Try it now and get 2,000 free queries monthly at http://brave.com/api.
2024-08-11 · 57 min

Learning from Machine Learning
Lewis Tunstall: Hugging Face, SetFit and Reinforcement Learning | Learning from Machine Learning #6
This episode features Lewis Tunstall, machine learning engineer at Hugging Face and author of the best-selling book Natural Language Processing with Transformers. He currently focuses on one of the hottest topics in NLP right now: reinforcement learning from human feedback (RLHF). Lewis holds a PhD in quantum physics, and his research has taken him around the world and into some of the most impactful projects, including the Large Hadron Collider, the world's largest and most powerful particle accelerator. Lewis shares his unique story from quantum physicist to data scientist to machine learning engineer. Resources to learn...
2023-10-03 · 1h 18m

What's AI Podcast by Louis-François Bouchard
Building LLM Apps & the Challenges that come with it. The What's AI Podcast Episode 16: Jay Alammar
My interview with Jay Alammar, widely known in the AI and NLP field mainly through his great blog on transformers and attention.
►Watch on YouTube: https://youtu.be/TO0IV9e2MMQ
►LLM University: https://docs.cohere.com/docs/llmu
►Jay's blog: http://jalammar.github.io/illustrated-transformer/
►Twitter: https://twitter.com/JayAlammar, https://twitter.com/Whats_AI
►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/
►Support me on Patreon: https://www.patreon.com/whatsai
►Join Our AI Discord: https://discor...
2023-06-27 · 1h 12m

Getting Simple
#71: Alex O'Connor — Transformers, Generative AI, and the Deep Learning Revolution
Alex O'Connor, researcher and ML manager, on the latest trends in generative AI: language and image models, prompt engineering, the latent space, fine-tuning, tokenization, textual inversion, adversarial attacks, and more.
Alex O'Connor got his PhD in Computer Science from Trinity College, Dublin. He was a postdoctoral researcher and funded investigator for the ADAPT Centre for digital content, at both TCD and later DCU. In 2017, he joined Pivotus, a fintech startup, as Director of Research. Alex has been Sr. Manager for Data Science & Machine Learning at Autodesk for the past few years, leading a team t...
2023-04-26 · 1h 45m
{ между скобок }
Yulia Yakovleva, Konstantin Shibkov: ChatGPT for Developers
Yulia gave an incredibly interesting presentation showing how ChatGPT works under the hood, how this neural network is trained, and what the project's future looks like on the technical side. We then went through examples of using ChatGPT: generating ASCII diagrams, and analyzing and learning from SQL scripts. Kostya also shared in detail how he uses ChatGPT to solve LeetCode problems and to create educational content. In the end we concluded that ChatGPT cannot replace developers, since its technical skills are at the level of an intern.
Community chat: https://t.me/backend_megdu_skobkah
Announcements channel: https://t.me/megdu_skobok
YouTube: https://youtu.be/g2u21UsAS84
Boosty: https://boosty.to/megdu_skobok
Useful links:
📖 Yulia's GitHub: https://github.com/robolamp
📖 Yulia's superb presentation: https://docs.google.com/presentation/d/1BlXR51CmNxxUnDDX87jF2D4o74siGI-ZbEVXj77HY1Q/
📖 Jay Alammar's ML blog: https://jalammar.github.io
📖 Algorithms club: https://t.me/JavaKeyFrames
📖 Kostya's notes with a detailed breakdown of ChatGPT use cases: https://sendel.notion.site/Chat-GPT-b4d4722ace864875a0884cb30f4e6736
📖 ChatGPT for Developers: https://docs.gpt4devs.com/
📖 Fun with ChatGPT: deceive or be deceived: https://habr.com/ru/post/709636/
📖 650+ Best Prompts For ChatGPT: https://www.writingbeginner.com/best-prompts-for-chatgpt/
📖 ChatGPT Prompts for Developer Use Cases: https://www.tooltester.com/en/blog/best-chatgpt-prompts/#ChatGPT_Prompts_for_Developer_Use_Cases
0:00 Greetings
3:33 What ChatGPT is and how it works
31:50 Converting protobuf to Markdown
33:46 ChatGPT as...
2023-04-11 · 1h 31m

Misreading Chat
#111: Formal Algorithms for Transformers
Morita, uneasy about the threat to his employer, brushes up on the Transformer. Send your comments and feedback to our letter box or to Reddit; iTunes reviews and stars are welcome too. While recording this episode we hit a bug in Adobe Podcast (beta), so Mukai's and Morita's audio tracks drifted out of sync. Sorry about that. From next time we plan to record with solid, non-beta tools.
- [2207.09238] Formal Algorithms for Transformers
- #15 – Neural Machine Translation by Jointly Learning to Align and Translate
- #38 – Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
- #51 – Attention Is All You Need
- #53 – BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Jay Alammar – YouTube
- GitHub – openai/tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models.
- GitHub – karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs.
- Let's build GPT: from scratch, in code, spelled out. – YouTube
2023-04-04 · 38 min

Changelog Master Feed
Applied NLP solutions & AI education (Practical AI #212)
We're super excited to welcome Jay Alammar to the show. Jay is a well-known AI educator, applied NLP practitioner at co:here, and author of the popular blog "The Illustrated Transformer." In this episode, he shares his ideas on creating applied NLP solutions, working with large language models, and creating educational resources for state-of-the-art AI.
Discuss on Changelog News. Changelog++ members support our work, get closer to the metal, and make the ads disappear. Join today!
Sponsors: Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond yo...
2023-02-22 · 38 min
Practical AI
Applied NLP solutions & AI education
We're super excited to welcome Jay Alammar to the show. Jay is a well-known AI educator, applied NLP practitioner at co:here, and author of the popular blog "The Illustrated Transformer." In this episode, he shares his ideas on creating applied NLP solutions, working with large language models, and creating educational resources for state-of-the-art AI.
Join the discussion. Changelog++ members support our work, get closer to the metal, and make the ads disappear. Join today!
Sponsors: Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content...
2023-02-22 · 38 min

Machine Learning Street Talk (MLST)
#80 AIDAN GOMEZ [CEO Cohere] - Language as Software
We had a conversation with Aidan Gomez, the CEO of language-based AI platform Cohere. Cohere is a startup that uses artificial intelligence to help users build the next generation of language-based applications. It is headquartered in Toronto, and the company has raised $175 million in funding so far. Language may well become a key new substrate for software building, both in its representation and in how we build the software. It may democratise software building so that more people can build software, and we can build new types of software. Aidan and I discuss this in detail in this episode of...
2022-11-15 · 51 min

Techmeme Ride Home
Fri. 10/07 – What Is It With These Dang Bridges?
There's another huge crypto hack, and I'll give you two guesses as to what folks think the culprit is. The Twitter/Elon trial is officially paused. Meta can't get its own developers to use their metaverse products. Maybe my dream of Death Star-style anklebots is over. And, of course, the weekend longreads suggestions.
Sponsors: Wealthfront.com/techmeme
Links:
- Binance-linked blockchain hit by $570 million crypto hack (Reuters)
- The Elon Musk vs. Twitter trial is on hold until October 28th (The Verge)
- Meta's flagship metaverse app is too buggy and...
2022-10-07 · 17 min

The Real Python Podcast
Moving NLP Forward With Transformer Models and Attention
What's the big breakthrough for Natural Language Processing (NLP) that has dramatically advanced machine learning into deep learning? What makes these transformer models unique, and what defines "attention"? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, continues our talk about how machine learning (ML) models understand and generate text. This episode is a continuation of the conversation in episode #119. Jodie builds on the concepts of bag-of-words, word2vec, and simple embedding models. We talk about the breakthrough mechanism called "attention," which allows for parallelization in building models. We also discu...
2022-08-12 · 50 min

Hacia Afuera con Omar Espejel
Ep 14 - Haritz Puerto (UKP Lab) - Making AI Answer Questions
Haritz speaks in a personal capacity and does not represent any company or institution in any way. Everything described here is my interpretation and not necessarily what Haritz meant. Haritz is a research scientist at the Ubiquitous Knowledge Processing (UKP) Lab at TU Darmstadt. Founded in 2009, UKP is one of the most important natural language processing research centers in the world. Haritz's work has focused on question answering, that is, algorithms for answering questions, question generation, and graph neural networks. He did his...
2022-02-19 · 56 min

Post Mortem
#8 When the facts change, I change my model
"When the Facts Change, I Change My Mind. What Do You Do, Sir?" said JM Keynes. The economist was underlining the importance of readjusting one's priors and one's representation of the world when confronted with new evidence. It is the same when you train a machine learning model and deploy it. Does the data you encounter in production follow a distribution similar to the data the model was trained on? If not, how can you correct course?
2021-02-05 · 23 min
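A small illustrative check for the question this episode poses: does production data still follow the training distribution? A two-sample Kolmogorov-Smirnov test per feature is one common way to flag drift; the synthetic data, the choice of test, and the threshold are all assumptions for illustration, not a method the episode prescribes.

    # Illustrative drift check on one synthetic feature; the episode names no method.
    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(0)
    train = rng.normal(0.0, 1.0, size=1000)   # stand-in for a training-set feature
    prod = rng.normal(0.5, 1.0, size=1000)    # production data whose mean has shifted

    stat, p_value = ks_2samp(train, prod)     # two-sample Kolmogorov-Smirnov test
    if p_value < 0.01:                        # arbitrary alert threshold
        print(f"Distribution shift detected (KS={stat:.3f}, p={p_value:.1e})")

When such a shift is detected, the usual responses are exactly the ones the episode hints at: retrain or fine-tune the model on fresher data, or reweight the training set toward the new distribution.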
Misreading Chat
#53 – BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Mukai talks about BERT, which brought transfer learning to neural-network NLP. Send comments with the hashtag #misreading or to hello@misreading.chat.
https://misreading.chat/wp-content/uploads/2019/03/ep53.mp3
- [1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Improving Language Understanding by Generative Pre-Training
- GitHub – openai/gpt-2: Code for the paper "Language Models are Unsupervised Multitask Learners"
- [1901.11504] Multi-Task Deep Neural Networks for Natural Language Understanding
- Microsoft's New MT-DNN Outperforms Google BERT – SyncedReview – Medium
- Trained BERT with SentencePiece on Japanese Wikipedia and released the model – 原理的には可能 – a data-analysis blog, or rather notebook
- Follow-up: Jay Alammar جهاد العمار | LinkedIn
- STV
2019-03-21 · 33 min

Misreading Chat
#51 – Attention Is All You Need
Mukai introduces the Transformer, a new neural network for natural language processing whose results have been drawing attention since last year. Send comments with the hashtag #misreading or to hello@misreading.chat.
https://misreading.chat/wp-content/uploads/2019/02/ep51.mp3
- [1706.03762] Attention Is All You Need
- The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time
- The Annotated Transformer
- Episode 15 – Neural Machine Translation by Jointly Learning to Align and Translate – Misreading Chat
2019-03-01 · 22 min

Practical AI
OpenAI's new "dangerous" GPT-2 language model
This week we discuss GPT-2, a new transformer-based language model from OpenAI that has everyone talking. It's capable of generating incredibly realistic text, and the AI community has lots of concerns about potential malicious applications. We help you understand GPT-2, and we discuss ethical concerns, responsible release of AI research, and resources that we have found useful in learning about language models.
Join the discussion. Changelog++ members support our work, get closer to the metal, and make the ads disappear. Join today!
Sponsors: Linode – Our cloud server of choice. Depl...
2019-02-25 · 40 min