Showing episodes and shows of Jay Alammar
Shows
Otostopçunun Yapay Zeka Rehberi
How Does GPT Work? Large Language Models, Tokens, Embeddings, Attention, and More
In this episode, we explore how GPT is pre-trained on massive text datasets, how it is built on the Transformer architecture, and how it generates text by predicting which word comes after another. Along the way we say hello to our old friend Turing, glance at the history of Natural Language Processing (NLP), and check in on ELIZA. With the Beatles' "All you need is love" ringing in our ears, we also look at attention mechanisms, tokenization, and embedding methods, and take a brief look at how the algorithms evolved from RNNs to LSTMs. GPT...
2025-04-27
31 min
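The episode's core mechanism, generating text by predicting which word comes after another, is easy to sketch. Below is a minimal toy illustration in Python, assuming a bigram count model in place of real GPT weights; the corpus and names are invented for the example.

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count bigrams in a tiny corpus, then greedily
# extend a prompt by always picking the most frequent successor.
# A real GPT replaces these counts with a Transformer over subword tokens.
corpus = "all you need is love love is all you need".split()

successors = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    successors[prev][nxt] += 1

def generate(prompt, steps=5):
    tokens = prompt.split()
    for _ in range(steps):
        candidates = successors.get(tokens[-1])
        if not candidates:
            break
        tokens.append(candidates.most_common(1)[0][0])  # greedy decoding
    return " ".join(tokens)

print(generate("all"))  # -> "all you need is love love"
```

The same loop, with a neural network estimating the successor distribution and sampling rather than always taking the top choice, is essentially how GPT generates text.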
AI Innovations Unleashed
The Human Guide to Artificial Intelligence - WTF is AI - Ep 1
Artificial Intelligence isn’t just for data scientists. In Episode 1 of WTF is AI?, we make AI human, with dogs, toddlers, Netflix, and pizza analogies that actually make sense. 🎧 A must-listen for professionals navigating the AI revolution.
Reference List & Further Reading
General AI Concepts:
Russell & Norvig, Artificial Intelligence: A Modern Approach
Stanford’s “AI Index Report” (https://aiindex.stanford.edu)
Google’s “AI for Anyone” Guide: https://ai.google/education/
On Transformer Models:
“Attention Is All You Need” (Vaswani et al., 2017)
OpenAI’s GPT-3 and GPT-4 papers
Illustrated Transformer Guide by Jay Alammar: https://j...
2025-04-08
31 min
The Hard Part with Evan McCann
The Friday Show - Connor Turland, Alexander Stratmoen, Owen Brakes, Rahul Gudise, Jianxiang (Jack) Xu, Adam Cohen, and Jay Alammar
The first episode of The Friday Show!
00:00 - Intro to The Friday Show Format
01:57 - X Content Rundown
08:38 - Connor Turland from Ceedar
18:57 - Alexander Stratmoen on Socratica Symposium
27:55 - Owen Brakes from Steinmetz
37:03 - Rahul Gudise from Gale
48:00 - Jack Xu from Proception
55:14 - Adam Cohen from Weave
57:00 - Jay Alammar from Cohere on Command A
2025-03-21
1h 06
Data Neighbor Podcast
Ep18: Open-Source LLMs vs. ChatGPT: Which One Should You Use?
AI is evolving faster than ever—and open-source AI models are catching up to proprietary models at an incredible pace. In this episode of the Data Neighbor Podcast, we sit down with Maarten Grootendorst, co-author of Hands-On Large Language Models with Jay Alammar, DeepLearning.AI instructor, and creator of BERTopic and KeyBERT, to break down the real differences between open-source and closed-source AI models.We’ll discuss how LLMs (Large Language Models) evolved from bag-of-words and Word2Vec to modern transformer-based models like BERT, GPT-4, DeepSeek, LLaMA 2, and Mixtral. More importantly, we explore when open-source AI models might actually be b...
2025-03-16
1h 00
IA Sob Controle - Inteligência Artificial
117: The Impact of DeepSeek, Free OpenAI o3-mini, AI in SiSU
Friday is the day to recap the week's biggest news in the world of AI. Come see who joined the chat: Marcus Mendes, host sob controle; Fabrício Carraro, co-host sob controle, Program Manager at Alura, AI book author, and host of the Dev Sem Fronteiras podcast. 🔗 Links: Fabrício Carraro: How Deepseek-R1 was trained; Jay Alammar: The Illustrated DeepSeek-R1; Nvidia lost $600B in market value on Monday; The US is investigating whether DeepSeek used improper chips; Meta sets up four war rooms to analyze DeepSeek; Italy bans DeepSeek; Perplexity adds R1 to the searches of the pl...
2025-01-31
1h 03
Exploring the unknown, together.
Roads to Research: Scientific Communication
Uncover the secrets of successful scientific communication at our upcoming panel discussion. Join Marzieh Fadaee with Jay Alammar and Shayne Longpre as they share their insights on the art of translating scientific concepts.
2024-12-10
1h 01
GenAI Level UP
Attention Is All You Need - Level 6
The Transformer: Revolutionizing Sequence Transduction with Self-Attention
This episode explores the groundbreaking Transformer, a novel neural network architecture that has transformed the field of sequence transduction. The Transformer dispenses with recurrence and convolutions entirely, relying solely on attention mechanisms to capture global dependencies between input and output sequences. This results in superior performance on tasks like machine translation and significantly faster training times. We'll break down the key components of the Transformer, including multi-head self-attention, positional encoding, and encoder-decoder stacks, explaining how they work...
2024-11-27
15 min
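The self-attention the episode breaks down can be written compactly. A minimal single-head sketch in Python/NumPy, assuming random matrices in place of learned projection weights; a real Transformer runs several such heads in parallel (multi-head attention) and adds positional encodings beforehand.

```python
import numpy as np

def self_attention(X):
    """Single-head scaled dot-product self-attention (Vaswani et al., 2017).

    X is a (seq_len, d_model) matrix of token embeddings; the random
    projections below stand in for learned weights.
    """
    d_model = X.shape[1]
    rng = np.random.default_rng(0)
    W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(d_model)             # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                              # weighted mix of values

X = np.random.default_rng(1).normal(size=(4, 8))    # 4 tokens, d_model = 8
print(self_attention(X).shape)  # (4, 8): one contextualized vector per token
```

Every token attends to every other token in one matrix multiplication, which is why training parallelizes so much better than with recurrent models.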
Machine Learning Street Talk (MLST)
Jay Alammar on LLMs, RAG, and AI Engineering
Jay Alammar, renowned AI educator and researcher at Cohere, discusses the latest developments in large language models (LLMs) and their applications in industry. Jay shares his expertise on retrieval augmented generation (RAG), semantic search, and the future of AI architectures. MLST is sponsored by Brave: The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api.
2024-08-11
57 min
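Retrieval augmented generation, as discussed in the episode, reduces to: embed your documents, retrieve those nearest to the query, and prepend them to the prompt. A minimal retrieval sketch; embed() is a hypothetical stand-in for a real embedding model (Cohere's API, for instance), with hash-seeded vectors used only so the example runs offline.

```python
import numpy as np

# Hypothetical embedding function: a real system would call an embedding
# model. Hash-seeded random vectors carry no semantics; they only make
# the example self-contained and runnable without an API key.
def embed(text, dim=32):
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

docs = [
    "RAG grounds model answers in retrieved documents.",
    "Semantic search ranks passages by embedding similarity.",
    "Tokenizers split text into subword units.",
]
doc_vecs = np.stack([embed(d) for d in docs])

query = "how does retrieval augmented generation work?"
scores = doc_vecs @ embed(query)        # cosine similarity (unit vectors)
context = docs[int(np.argmax(scores))]  # best-matching document
prompt = f"Context: {context}\n\nQuestion: {query}"
print(prompt)                           # grounded prompt handed to the LLM
```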
Learning from Machine Learning
Lewis Tunstall: Hugging Face, SetFit and Reinforcement Learning | Learning from Machine Learning #6
This episode features Lewis Tunstall, machine learning engineer at Hugging Face and author of the best-selling book Natural Language Processing with Transformers. He currently focuses on one of the hottest topics in NLP right now: reinforcement learning from human feedback (RLHF). Lewis holds a PhD in quantum physics and his research has taken him around the world and into some of the most impactful projects, including the Large Hadron Collider, the world's largest and most powerful particle accelerator. Lewis shares his unique story from Quantum Physicist to Data Scientist to Machine Learning Engineer. Resources to learn...
2023-10-03
1h 18
What's AI Podcast by Louis-François Bouchard
Building LLM Apps & the Challenges that come with it. The What's AI Podcast Episode 16: Jay Alammar
My interview with Jay Alammar, widely known in the AI and NLP field mainly through his great blog on transformers and attention. ►Watch on YouTube: https://youtu.be/TO0IV9e2MMQ ►LLM University: https://docs.cohere.com/docs/llmu ►Jay's blog: http://jalammar.github.io/illustrated-transformer/ ►Twitter: https://twitter.com/JayAlammar, https://twitter.com/Whats_AI ►My Newsletter (A new AI application explained weekly to your emails!): https://www.louisbouchard.ai/newsletter/ ►Support me on Patreon: https://www.patreon.com/whatsai ►Join Our AI Discord: https://discor...
2023-06-27
1h 12
Getting Simple
#71: Alex O'Connor — Transformers, Generative AI, and the Deep Learning Revolution
Alex O’Connor—researcher and ML manager—on the latest trends in generative AI. Language and image models, prompt engineering, the latent space, fine-tuning, tokenization, textual inversion, adversarial attacks, and more. Alex O’Connor got his PhD in Computer Science from Trinity College, Dublin. He was a postdoctoral researcher and funded investigator at the ADAPT Centre for digital content, at both TCD and later DCU. In 2017, he joined Pivotus, a fintech startup, as Director of Research. Alex has been Sr Manager for Data Science & Machine Learning at Autodesk for the past few years, leading a team t...
2023-04-26
1h 45
{ между скобок }
Yulia Yakovleva, Konstantin Shibkov: ChatGPT for Developers
#chatgpt #openai #softwareengineer #developers
Yulia gave an incredibly interesting presentation showing how ChatGPT works under the hood, how this neural network is trained, and what the project's future looks like from the technical side. We then went through examples of using ChatGPT, such as generating ASCII diagrams and parsing and studying SQL scripts. Kostya also shared in detail how he uses ChatGPT to solve LeetCode problems and create educational content. In the end we concluded that ChatGPT cannot replace developers, since its technical skills are at the level of an intern.
Cozy chat: https://t.me/backend_megdu_skobkah
Announcements channel: https://t.me/megdu_skobok
YouTube: https://youtu.be/g2u21UsAS84
Boosty: https://boosty.to/megdu_skobok
Useful links:
📖 Yulia's GitHub: https://github.com/robolamp
📖 Yulia's super cool presentation: https://docs.google.com/presentation/d/1BlXR51CmNxxUnDDX87jF2D4o74siGI-ZbEVXj77HY1Q/
📖 Jay Alammar's ML blog: https://jalammar.github.io
📖 Algorithms club: https://t.me/JavaKeyFrames
📖 Kostya's notes with a detailed breakdown of ChatGPT use cases: https://sendel.notion.site/Chat-GPT-b4d4722ace864875a0884cb30f4e6736
📖 ChatGPT for Developers: https://docs.gpt4devs.com/
📖 Fun with ChatGPT: deceive or be deceived: https://habr.com/ru/post/709636/
📖 650+ Best Prompts For ChatGPT: https://www.writingbeginner.com/best-prompts-for-chatgpt/
📖 ChatGPT Prompts for Developer Use Cases: https://www.tooltester.com/en/blog/best-chatgpt-prompts/#ChatGPT_Prompts_for_Developer_Use_Cases
0:00 Greetings
3:33 What ChatGPT is and how it works
31:50 Converting protobuf to markdown
33:46 ChatGPT as...
2023-04-11
1h 31
Misreading Chat
#111: Formal Algorithms for Transformers
Morita, fearing the threat to his employer, brushes up on the Transformer. Send comments and feedback to our letter box or to Reddit. iTunes reviews and stars are welcome too. This time we hit a bug in Adobe Podcast (beta) while recording, and Mukai's and Morita's audio tracks drifted out of sync. Sorry about that. From the next episode we plan to record with solid non-beta tools...
[2207.09238] Formal Algorithms for Transformers
#15 – Neural Machine Translation by Jointly Learning to Align and Translate
#38 – Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
#51 – Attention Is All You Need
#53 – BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jay Alammar – YouTube
GitHub – openai/tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models.
GitHub – karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs.
Let's build GPT: from scratch, in code, spelled out. – YouTube
2023-04-04
38 min
Changelog Master Feed
Applied NLP solutions & AI education (Practical AI #212)
We’re super excited to welcome Jay Alammar to the show. Jay is a well-known AI educator, applied NLP practitioner at co:here, and author of the popular blog, “The Illustrated Transformer.” In this episode, he shares his ideas on creating applied NLP solutions, working with large language models, and creating educational resources for state-of-the-art AI. Discuss on Changelog News. Changelog++ members support our work, get closer to the metal, and make the ads disappear. Join today! Sponsors: Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond yo...
2023-02-22
38 min
Practical AI
Applied NLP solutions & AI education
We’re super excited to welcome Jay Alammar to the show. Jay is a well-known AI educator, applied NLP practitioner at co:here, and author of the popular blog, “The Illustrated Transformer.” In this episode, he shares his ideas on creating applied NLP solutions, working with large language models, and creating educational resources for state-of-the-art AI. Join the discussion. Changelog++ members support our work, get closer to the metal, and make the ads disappear. Join today! Sponsors: Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content...
2023-02-22
38 min
Machine Learning Street Talk (MLST)
#80 AIDAN GOMEZ [CEO Cohere] - Language as Software
We had a conversation with Aidan Gomez, the CEO of language-based AI platform Cohere. Cohere is a startup which uses artificial intelligence to help users build the next generation of language-based applications. It's headquartered in Toronto. The company has raised $175 million in funding so far. Language may well become a key new substrate for software building, both in its representation and how we build the software. It may democratise software building so that more people can build software, and we can build new types of software. Aidan and I discuss this in detail in this episode of...
2022-11-15
51 min
Techmeme Ride Home
Fri. 10/07 – What Is It With These Dang Bridges?
There’s another huge crypto hack and I’ll give you two guesses as to what folks think the culprit is. The Twitter/Elon trial is officially paused. Meta can’t get its own developers to use their metaverse products. Maybe my dream of Death Star style anklebots is over. And, of course, the weekend longreads suggestions.
Sponsors: Wealthfront.com/techmeme
Links:
Binance-linked blockchain hit by $570 million crypto hack (Reuters)
The Elon Musk vs. Twitter trial is on hold until October 28th (The Verge)
Meta’s flagship metaverse app is too buggy and...
2022-10-07
17 min
The Real Python Podcast
Moving NLP Forward With Transformer Models and Attention
What’s the big breakthrough for Natural Language Processing (NLP) that has dramatically advanced machine learning into deep learning? What makes these transformer models unique, and what defines “attention”? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, continues our talk about how machine learning (ML) models understand and generate text. This episode is a continuation of the conversation in episode #119. Jodie builds on the concepts of bag-of-words, word2vec, and simple embedding models. We talk about the breakthrough mechanism called “attention,” which allows for parallelization in building models. We also discu...
2022-08-12
50 min
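The bag-of-words baseline the episode builds on is worth seeing concretely, because it shows exactly what embeddings and attention later fix: word order is discarded. A minimal sketch assuming scikit-learn is installed.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Bag-of-words ignores word order entirely: these two sentences receive
# identical count vectors, the limitation that embeddings and
# attention-based models were introduced to overcome.
texts = ["the dog bit the man", "the man bit the dog"]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

print(vectorizer.get_feature_names_out())  # ['bit' 'dog' 'man' 'the']
print(X.toarray())                         # both rows: [1 1 1 2]
```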
Hacia Afuera con Omar Espejel
Ep 14 - Haritz Puerto (UKP Lab) - Making AI Answer Questions
Haritz speaks in a personal capacity and does not represent any company or institution in any way. All the information described here is my interpretation and not necessarily what Haritz meant. Haritz is a research scientist at the Ubiquitous Knowledge Processing (UKP) Lab at TU Darmstadt. Founded in 2009, UKP is one of the world's leading research centers in natural language processing. Haritz's work has focused on Question Answering, that is, algorithms for answering questions, generating questions, and graph neural networks. He did his...
2022-02-19
56 min
Post Mortem
#8 When the facts change, I change my model
"When the Facts Change, I Change My Mind. What Do You Do, Sir?" disait JM Keynes. L’économiste soulignait alors l’importance de réajuster ses a priori et sa représentation du monde lorsqu'on on est confronté à de nouveaux éléments. C’est la même chose lorsqu’on entraîne un modèle de machine learning et qu’on le déploie. Les données que l’on va rencontrer en production suivent-elles une distribution similaire aux données sur lesquelles on a entraîné le modèle? Si non, comment peut-on ajuster le tir?
2021-02-05
23 min
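The question the episode poses, whether production data still follow the training distribution, is often answered feature by feature with a two-sample test. A minimal sketch using SciPy's Kolmogorov-Smirnov test; the synthetic data and threshold are illustrative assumptions, not a recommendation.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)  # training data
prod_feature = rng.normal(loc=0.3, scale=1.0, size=5_000)   # drifted production data

# Two-sample Kolmogorov-Smirnov test: a small p-value suggests production
# data no longer follow the training distribution, signalling that the
# model may need recalibration or retraining.
stat, p_value = ks_2samp(train_feature, prod_feature)
if p_value < 0.01:  # illustrative threshold, tune per use case
    print(f"Drift detected (KS={stat:.3f}, p={p_value:.2e}): adjust the model.")
else:
    print("No significant drift detected.")
```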
Misreading Chat
#53 – BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Mukai talks about BERT, which brought transfer learning to neural-network NLP. Send your thoughts with the hashtag #misreading or to hello@misreading.chat.
https://misreading.chat/wp-content/uploads/2019/03/ep53.mp3
[1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Improving Language Understanding by Generative Pre-Training
GitHub – openai/gpt-2: Code for the paper “Language Models are Unsupervised Multitask Learners”
[1901.11504] Multi-Task Deep Neural Networks for Natural Language Understanding
Microsoft’s New MT-DNN Outperforms Google BERT – SyncedReview – Medium
Trained BERT with SentencePiece on Japanese Wikipedia and released the model – 原理的には可能 – a blog from the data-analysis community
Follow up
Jay Alammar جهاد العمار | LinkedIn
STV
2019-03-21
33 min
Misreading Chat
#51 – Attention Is All You Need
Mukai introduces the Transformer, a new neural network for natural language processing whose results have been drawing attention since last year. Send your thoughts with the hashtag #misreading or to hello@misreading.chat.
https://misreading.chat/wp-content/uploads/2019/02/ep51.mp3
[1706.03762] Attention Is All You Need
The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time
The Annotated Transformer
Episode 15 – Neural Machine Translation by Jointly Learning to Align and Translate – Misreading Chat
2019-03-01
22 min
Practical AI
OpenAI's new "dangerous" GPT-2 language model
This week we discuss GPT-2, a new transformer-based language model from OpenAI that has everyone talking. It’s capable of generating incredibly realistic text, and the AI community has lots of concerns about potential malicious applications. We help you understand GPT-2 and we discuss ethical concerns, responsible release of AI research, and resources that we have found useful in learning about language models. Join the discussion. Changelog++ members support our work, get closer to the metal, and make the ads disappear. Join today! Sponsors: Linode – Our cloud server of choice. Depl...
2019-02-25
40 min
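GPT-2's weights were eventually released in full, so the generation capability debated in this episode can now be reproduced in a few lines. A minimal sketch assuming the Hugging Face transformers library (which postdates the episode) is installed.

```python
from transformers import pipeline

# Downloads the smallest (124M-parameter) GPT-2 checkpoint and samples a
# continuation of the prompt.
generator = pipeline("text-generation", model="gpt2")

out = generator(
    "AI researchers worry about language models because",
    max_new_tokens=40,
    do_sample=True,    # sample rather than decode greedily
    temperature=0.8,   # lower values give more conservative continuations
)
print(out[0]["generated_text"])
```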