Showing episodes and shows of
Satya Mallick
Shows
Artificial Intelligence : Papers & Concepts
RF-DETR: Neural Architecture Search for Real-Time Detection Transformers
In this episode of Artificial Intelligence: Papers and Concepts, we break down RF-DETR, a new direction in object detection that challenges the idea of fixed-capacity models. Instead of choosing between speed and accuracy upfront, RF-DETR introduces an elastic detector that adapts its computation dynamically at inference time. We explore how RF-DETR reuses intermediate representations to scale up or down on demand, why this matters for real-world deployment on edge and cloud systems, and how this design enables more predictable performance across diverse hardware constraints. If you're building adaptive vision systems for edge devices, robotics, or...
2026-02-04
17 min
Artificial Intelligence : Papers & Concepts
YOLO26: Rethinking Real-Time Vision for the Edge
In this episode of Artificial Intelligence: Papers and Concepts, we break down YOLO26, a major shift in real-time object detection. Instead of chasing raw accuracy, YOLO26 is designed for speed, consistency, and edge deployment. We explore how removing non-maximum suppression (NMS) delivers predictable low-latency inference, why simplifying the loss functions makes the model easier to deploy on real hardware, and how new training ideas borrowed from large language models improve small-object detection. If you're building vision systems for robots, drones, factories, or mobile devices, this episode explains why YOLO26 may be the most practical YOLO y...
2026-02-03
16 min
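To make concrete what YOLO26 removes, here is a minimal sketch of classic greedy non-maximum suppression (a generic NumPy illustration, not YOLO26's code). The data-dependent loop is exactly the kind of post-processing that makes latency hard to predict on edge hardware:

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box [x1, y1, x2, y2] and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping rivals."""
    order = np.argsort(scores)[::-1]   # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # Suppress remaining boxes that overlap the kept box too much.
        order = rest[iou(boxes[i], boxes[rest]) < iou_thresh]
    return keep
```

The number of loop iterations depends on how many detections survive each round, which is why an NMS-free head gives more predictable inference time.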
Artificial Intelligence : Papers & Concepts
DeepSeek mHC
Why do some large AI models suddenly collapse during training—and how can geometry prevent it? In this episode of Artificial Intelligence: Papers and Concepts, we break down DeepSeek AI's Manifold-Constrained Hyperconnections (mHC), a new architectural approach that fixes training instability in large language models. We explore why traditional hyperconnections caused catastrophic signal explosions, and how constraining them to a geometric structure—doubly stochastic matrices on the Birkhoff polytope—restores stability at scale. You'll learn how mHC reduces signal amplification from 3,000× to ~1.6×, enables reliable training of 27B-parameter models, and even improves reasoning performance—all with minimal ov...
2026-01-05
12 min
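The geometric constraint described in the episode can be illustrated with Sinkhorn normalization, the standard way to push a positive matrix toward the doubly stochastic set (the Birkhoff polytope). This is a generic sketch of the idea, not DeepSeek's mHC implementation:

```python
import numpy as np

def sinkhorn(M, iters=50):
    """Alternately rescale rows and columns so each sums to 1.

    Repeated row/column normalization drives a positive matrix toward
    the Birkhoff polytope (doubly stochastic matrices). Generic sketch,
    not DeepSeek's actual mHC code.
    """
    M = np.abs(M) + 1e-9                           # Sinkhorn needs positive entries
    for _ in range(iters):
        M = M / M.sum(axis=1, keepdims=True)       # rows sum to 1
        M = M / M.sum(axis=0, keepdims=True)       # columns sum to 1
    return M

rng = np.random.default_rng(0)
W = sinkhorn(rng.random((4, 4)))
x = np.ones(4)
# A doubly stochastic matrix averages rather than amplifies: it maps the
# all-ones vector (roughly) to itself, so stacking many such mixing
# layers cannot blow the residual signal up.
print(W @ x)
```

This hints at why the constraint tames amplification: each mixing step redistributes signal mass instead of multiplying it.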
Artificial Intelligence : Papers & Concepts
Chinchilla Scaling Law
In this episode of Artificial Intelligence: Papers and Concepts, curated by Dr. Satya Mallick, we break down DeepMind's 2022 paper "Training Compute-Optimal Large Language Models"—the work that challenged the "bigger is always better" era of LLM scaling. You'll learn why many famous models were under-trained, what it means to be compute-optimal, and why the best performance comes from scaling model size and training data together. We also unpack the Chinchilla vs. Gopher showdown, why Chinchilla won with the same compute budget, and what this shift means for the future: data quality and curation may matter more...
2025-12-18
12 min
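The compute-optimal recipe reduces to simple arithmetic: with training compute C ≈ 6·N·D (N parameters, D tokens) and the widely quoted rule of thumb of roughly 20 tokens per parameter, you can back out the optimal split for any budget. A small sketch, assuming those two commonly cited readings of the paper:

```python
def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Compute-optimal (params, tokens) split under C = 6 * N * D.

    The paper finds N and D should grow in equal proportion; ~20 training
    tokens per parameter is the popular rule of thumb derived from it.
    """
    # Solve C = 6 * N * (k * N)  =>  N = sqrt(C / (6 * k)),  D = k * N
    n_params = (compute_flops / (6 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Chinchilla's own budget, C = 6 * 70e9 * 1.4e12 = 5.88e23 FLOPs:
n, d = chinchilla_optimal(5.88e23)
# → roughly 7.0e10 params (70B) and 1.4e12 tokens (1.4T)
```

Plugging in a larger budget shows why data, not just model size, becomes the bottleneck: tokens must scale with the square root of compute too.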
Artificial Intelligence : Papers & Concepts
Gradient-Based Planning
How should an AI or robot decide what to do next? In this episode, we explore a new approach to planning that rethinks how world models are trained. The episode is based on the paper "Closing the Train-Test Gap in World Models for Gradient-Based Planning". Many AI systems can predict the future accurately, yet struggle when asked to plan actions efficiently. We explain why this train–test mismatch hurts performance and how gradient-based planning offers a faster alternative to traditional trial-and-error or heavy optimization. The key idea is simple but powerful: if...
2025-12-13
12 min
Artificial Intelligence : Papers & Concepts
SAM3D: The Next Leap in 3D Understanding
Forget flat photos—SAM3D is rewriting how machines understand the world. In this episode, we break down the groundbreaking new model that takes the core ideas of Meta's Segment Anything Model and expands them into the third dimension, enabling instant 3D segmentation from just a single image. We start with the limitations of traditional 2D vision systems and explain why 3D understanding has always been one of the hardest problems in computer vision. Then we unpack the SAM3D architecture in simple terms: its depth-aware encoder, its multi-plane representation, and how it learns to infer 3D st...
2025-12-10
13 min
Artificial Intelligence : Papers & Concepts
DINOv3: A New Self-Supervised Learning (SSL) Vision Foundation Model
In this episode, we explore DINOv3, a new self-supervised learning (SSL) vision foundation model from Meta AI Research, emphasizing its ability to scale effortlessly to massive datasets and large architectures without relying on manual data annotation. The core innovations are scaling model and dataset size, introducing Gram anchoring to prevent the degradation of dense feature maps during long training, and employing post-hoc strategies for enhanced flexibility in resolution and text alignment. The authors present DINOv3 as a versatile visual encoder that achieves state-of-the-art performance across a broad range of tasks, including dense prediction (segmentation, depth estima...
2025-10-29
13 min
Artificial Intelligence : Papers & Concepts
dots.ocr: SOTA Document Parsing in a Compact VLM
dots.ocr is a powerful, multilingual document parsing model from rednote-hilab that achieves state-of-the-art performance by unifying layout detection and content recognition within a single, efficient vision-language model (VLM). Built upon a compact 1.7B parameter Large Language Model (LLM), it offers a streamlined alternative to complex, multi-model pipelines, enabling faster inference speeds. The model demonstrates superior capabilities across multiple industry benchmarks, including OmniDocBench, where it leads in text, table, and reading order tasks, and olmOCR-bench, where it achieves the highest overall score. Its key strengths include robust parsing of low-resource languages, task flexibility thr...
2025-10-28
12 min
Artificial Intelligence : Papers & Concepts
DeepSeek-OCR : A Revolutionary Idea
In this episode, we dive deep into DeepSeek-OCR, a cutting-edge open-source Optical Character Recognition (OCR) / Text Recognition model that's redefining accuracy and efficiency in document understanding. DeepSeek-OCR flips long-context processing on its head by rendering text as images and then decoding it back—shrinking context length by 7–20× while preserving high fidelity. We break down how the two-stage stack works—DeepEncoder (optical/vision encoding of pages) + MoE decoder (text reconstruction and reasoning)—and why this "context optical compression" matters for million-token workflows, from legal PDFs to scientific tables. We also dive into accuracy trade-offs (≈96–97% at ~10× compr...
2025-10-23
14 min
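The headline numbers reduce to simple token arithmetic: the compression ratio is just the raw text-token count divided by the vision tokens the rendered page is encoded into. The token counts below are hypothetical, chosen only to illustrate the ratio, not measurements from the paper:

```python
def compression_ratio(text_tokens, vision_tokens):
    """How many text tokens each vision token stands in for."""
    return text_tokens / vision_tokens

# A page whose raw text would cost ~2,000 text tokens, rendered as an
# image and encoded into ~200 vision tokens, sits in the ~10x regime
# where the paper reports ~96-97% decoding precision.
ratio = compression_ratio(2000, 200)  # → 10.0
```

At the aggressive end of the 7–20× range, the same page would shrink to ~100 vision tokens, which is where the accuracy trade-offs discussed in the episode start to bite.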
Artificial Intelligence : Papers & Concepts
nanochat by Karpathy - How to build your own ChatGPT for $100
"The best ChatGPT that $100 can buy." That's Andrej Karpathy's positioning for nanochat—a compact, end‑to‑end stack that goes from tokenizer training to a ChatGPT‑style web UI in a few thousand lines of Python (plus a tiny Rust tokenizer). It's meant to be read, hacked, and run so students, researchers, and tech enthusiats can understand the entire pipeline needed to train a baby version of ChatGPT. In this episode, we walk you through the nanochat repository. Resources nanochat github repo: https://github.com/karpathy/nanochat/ AI Consulting & Product Development Services: https://bigvi...
2025-10-21
12 min
Artificial Intelligence : Papers & Concepts
SmolVLM: Small Yet Mighty Vision Language Model
In this episode of Artificial Intelligence: Papers and Concepts, we explore SmolVLM, a family of compact yet powerful vision language models (VLMs) designed for efficiency. Unlike large VLMs that require significant computational resources, SmolVLM is engineered to run on everyday devices like smartphones and laptops. We dive into the research paper SmolVLM: Redefining Small and Efficient Multimodal Models and a related HuggingFace blog post, discussing key design choices such as optimized vision-language balance, pixel shuffle for token reduction, and learned positional tokens to improve stability and performance. We highlight how SmolVLM avoids common p...
2025-10-01
14 min
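The pixel-shuffle token reduction mentioned above can be sketched as a space-to-depth rearrangement: each r×r patch of the vision feature grid is folded into a single token with r² times the channels, cutting the token count by r². A generic NumPy sketch of the idea, not SmolVLM's actual implementation:

```python
import numpy as np

def pixel_shuffle_tokens(feats, r=2):
    """Fold each r x r patch of an (H, W, C) feature grid into one token.

    Returns (H*W / r**2) tokens of r*r*C channels each: fewer tokens for
    the language model to attend over, with no information discarded.
    """
    H, W, C = feats.shape
    assert H % r == 0 and W % r == 0
    x = feats.reshape(H // r, r, W // r, r, C)
    x = x.transpose(0, 2, 1, 3, 4)                  # (H/r, W/r, r, r, C)
    return x.reshape((H // r) * (W // r), r * r * C)

# A 16x16 grid of 64-channel features: 256 tokens become 64 wider tokens.
tokens = pixel_shuffle_tokens(np.zeros((16, 16, 64)), r=2)
print(tokens.shape)  # → (64, 256)
```

Quartering the vision-token count is what lets a small LLM backbone handle high-resolution images within a modest context budget.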
Artificial Intelligence : Papers & Concepts
Common Pitfalls in Computer Vision & AI Projects (and How to Avoid Them)
In this episode, we dig deep into the unglamorous side of AI and computer vision projects — the mistakes, misfires, and blind spots that too often derail even the most promising teams. Based on BigVision.ai's playbook "Common Pitfalls in Computer Vision & AI Projects", we walk through a field-tested catalog of pitfalls drawn from real failures and successes. We cover:
- Why ambiguous problem statements and fuzzy success criteria lead to early project drift
- The dangers of unrepresentative training data and how missing edge cases sabotage models
- Labeling mistakes, data leakage, and splits th...
2025-10-01
17 min
Free the Data Podcast
Giving Computers Vision with Satya Mallick (OpenCV)
Computer Vision is a popular field of Artificial Intelligence that enables applications such as facial recognition and self-driving cars. In this episode we hear from Dr. Satya Mallick, who has worked on all the various aspects of Computer Vision since the field's birth, and learn how you can break into the field of AI.
CHAPTERS:
0:00:00 Intro
0:01:22 Satya's Background
0:02:53 What is Computer Vision?
0:08:02 What is OpenCV?
0:12:18 Evolution of Computer Vision Field
0:14:35 What is Artificial Intelligence?
0:18:22 The birth of Machine Learning
0:20:26 The birth of Deep Learning
0:24:48 What is Machine Learning?
0:30:07 What is Unsupervised Learning?
0:34:10 How...
2024-05-16
2h 14 min
Leading With Data
#14: Solving Computer Vision Problems | Satya Mallick, CEO @ OpenCV and @ LearnOpenCV | Leading With Data Ep 14
In this episode of Leading With Data, we interact with Satya Mallick, CEO of OpenCV.org, the world's largest open-source computer vision library. He is also the founder of the well-known Computer Vision blog LearnOpenCV.com and of Big Vision, an AI consulting organization. Armed with a PhD in Computer Vision earned back in 2006, Satya has been continuously working on challenging Computer Vision and Deep Learning problems. Watch as Satya delves into: 👉 How he...
2023-12-13
54 min
AI to Uplift Humanity
What isn't AI, and can it be "safe"? Dr. Satya Mallick, CEO of OpenCV, answers beginner AI questions.
AI is a fascinating technological pursuit, but also a confusing one. Because it is mostly used as a marketing term, the AI landscape has grown murky, especially for beginners. In this interview we ask Dr. Satya Mallick, CEO of OpenCV, all your AI questions. Did you know, for example, that taking panorama photos or using HDR is computer vision, but not AI? Check out the show notes at: https://podcast.soar.com/uplift-humanity-podcast/satya-mallick/ Get 10 free hours of AI Video Search tech on your website at soar.com/deepsearch
2022-04-27
34 min
The Justin Brady Show
AI for idiots! I ask Dr. Satya Mallick, CEO of OpenCV, lots of stupid questions you're too scared to ask.
AI has turned into a marketing term, and the result is confusion for everyone about what AI is. So, what is AI, and what isn't? For example, did you know taking panorama photos or using HDR is computer vision, but not AI? The difference is whether the system, or machine, learns from data or follows a fixed process. Face recognition, for example, requires AI. What is OpenCV for? Why do we need an open-source library for AI? In a way, they are collecting the Lego blocks so builders can focus on building ne...
2022-04-07
35 min
Free the Data Podcast
Giving Computers Vision with Satya Mallick (OpenCV)
Computer Vision is a popular field of Artificial Intelligence that enables applications such as facial recognition and self-driving cars. In this episode we hear from Dr. Satya Mallick, who has worked on all the various aspects of Computer Vision since the field's birth, and learn how you can break into the field of AI.
How to Connect with Satya:
- COMPANY WEBSITE: https://learnopencv.com/
- TWITTER: https://twitter.com/learnopencv
- LINKEDIN: https://www.linkedin.com/in/satyamallick
- EMAIL...
2021-09-09
2h 13 min
The Bitter Bongs
Decade of Decay - Santan
In this new segment called Decade of Decay, we talk about Bengali movies from the '90s that were so bad they are good. In this week's episode we discuss the 1999 movie Santan, starring Satya Bandopadhay, Geeta Dey, Ranjit Mallick, Chumki Choudhury, Biplab Chatterjee, Rita Koiral, Tapas Pal and others.
2021-07-21
24 min
The Turing Test Podcast
Interview of Satya Mallick CEO of OpenCV
I sat down for an interview with Satya Mallick, and we discussed the 20th Anniversary of OpenCV and the OAK Kits. Watch this podcast to learn about awesome upcoming developments coming to OpenCV. ►YOLOv4 AI Course - https://augmentedstartups.info/yolov4release ►Satya's OpenCV Courses - http://bit.ly/SatyaOpenCVCourses
2021-07-09
46 min
Entrepreneur Mastermind Podcast
Episode 284 "New Blood"
We're live! Ladies and gentlemen, we have new blood on Entreprogrammers. Welcome to Satya Mallick! Satya is an entrepreneur and gives us a brief review of his work, a small presentation of self if you will. He has been involved in various businesses and had some success. Satya starts an interesting conversation about the skills an engineer needed 5 or 10 years ago versus what he's looking for in the future. The guys totally agree on this: AI is everywhere today, from self-driving cars to computers that make their own decisions. John talks a little ab...
2019-09-17
1h 22 min