Showing episodes and shows of Fiodar Kazhamiaka
Shows
Stanford MLSys Seminar
#62 Dan Fu - Improving Transfer and Robustness of Supervised Contrastive Learning
Dan Fu - An ideal learned representation should display transferability and robustness. Supervised contrastive learning is a promising method for training accurate models, but produces representations that do not capture these properties due to class collapse -- when all points in a class map to the same representation. In this talk, we discuss how to alleviate these problems to improve the geometry of supervised contrastive learning. We identify two key principles: balancing the right amount of geometric "spread" in the embedding space, and inducing an inductive bias towards subclass clustering. We introduce two mechanisms for achieving these aims in... (A short code sketch follows this entry.)
2022-04-27
56 min
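For background, here is a minimal PyTorch sketch of the standard supervised contrastive (SupCon) loss that the talk analyzes. The spread and subclass-clustering mechanisms from the talk itself are not reproduced; supcon_loss is an illustrative helper, not the speaker's code.

    import torch
    import torch.nn.functional as F

    def supcon_loss(z, labels, tau=0.1):
        # z: (N, d) embeddings; labels: (N,) integer class labels.
        # Class collapse happens when same-class embeddings converge to one point.
        z = F.normalize(z, dim=1)                      # unit-norm embeddings
        sim = z @ z.T / tau                            # scaled cosine similarities
        eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
        sim = sim.masked_fill(eye, float("-inf"))      # drop self-similarity
        pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
        log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        # Average log-likelihood of each anchor's same-class positives.
        return -(log_prob * pos.float()).sum(1).div(pos.sum(1).clamp(min=1)).mean()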
Stanford MLSys Seminar
#61 Kexin Rong - Big Data Analytics
Kexin Rong - Learned Indexing and Sampling for Improving Query Performance in Big-Data Analytics. Traditional data analytics systems improve query efficiency via fine-grained, row-level indexing and sampling techniques. However, to keep up with growing data volumes, increasingly many systems store and process datasets in large partitions containing hundreds of thousands of rows. These analytics systems must therefore adapt traditional techniques to work with coarse-grained data partitions as the basic unit for processing queries efficiently. In this talk, I will discuss two related ideas that combine learning techniques with partitioning designs to improve query efficiency in the... (A short code sketch follows this entry.)
2022-04-22
59 min
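To make the partition-as-unit idea concrete, here is a toy zone-map-style skip index in Python: each partition keeps min/max metadata for a column, and a range scan skips partitions that cannot contain matches. This is a simplified stand-in, not the learned indexing or sampling methods from the talk.

    # Each partition stores coarse min/max metadata instead of row-level indexes.
    partitions = [
        {"min": 0,   "max": 99,  "rows": list(range(0, 100))},
        {"min": 100, "max": 199, "rows": list(range(100, 200))},
        {"min": 50,  "max": 400, "rows": list(range(50, 401, 7))},  # overlapping range
    ]

    def range_scan(lo, hi):
        for p in partitions:
            if p["max"] < lo or p["min"] > hi:
                continue                 # partition pruned without reading any rows
            yield from (v for v in p["rows"] if lo <= v <= hi)

    matches = list(range_scan(120, 160))  # reads only the partitions that overlap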
Stanford MLSys Seminar
#60 Igor Markov - Looper: An End-to-End ML Platform for Product Decisions
Igor Markov - Looper: An End-to-End ML Platform for Product Decisions. Abstract: Modern software systems and products increasingly rely on machine learning models to make data-driven decisions based on interactions with users, infrastructure and other systems. For broader adoption, this practice must (i) accommodate product engineers without ML backgrounds, (ii) support fine-grain product-metric evaluation and (iii) optimize for product goals. To address shortcomings of prior platforms, we introduce general principles for and the architecture of an ML platform, Looper, wi...
2022-04-11
1h 00
Stanford MLSys Seminar
#59 Zhuohan Li - Alpa: Automated Model-Parallel Deep Learning
Zhuohan Li - Alpa: Automated Model-Parallel Deep Learning. Alpa (https://github.com/alpa-projects/alpa) automates model-parallel training of large deep learning models by generating execution plans that unify data, operator, and pipeline parallelism. Alpa distributes the training of large deep learning models by viewing parallelism at two hierarchical levels: inter-operator and intra-operator. Based on this view, Alpa constructs a new hierarchical space of massive model-parallel execution plans. Alpa designs a number of compilation passes to automatically derive the optimal parallel execution plan at each parallelism level and implements an efficient runtime to orchestrate the two-level parallel... (A short code sketch follows this entry.)
2022-04-05
55 min
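As a toy illustration of the two-level plan space, the sketch below enumerates pipeline splits (inter-operator) crossed with per-stage sharding degrees (intra-operator) and picks the cheapest under an invented cost model. Alpa's real compiler derives plans with dynamic programming and ILP, not brute force; every number here is made up.

    import itertools

    def stage_cost(num_layers, shards):
        compute = num_layers / shards        # idealized speedup from sharding
        comm = 0.1 * (shards - 1)            # invented communication penalty
        return compute + comm

    def plan_cost(layer_split, shard_choice):
        # Pipeline throughput is limited by the slowest stage.
        return max(stage_cost(n, s) for n, s in zip(layer_split, shard_choice))

    splits = [(8,), (4, 4), (2, 6)]          # ways to cut 8 layers into stages
    best_plan = min(
        ((split, shards)
         for split in splits
         for shards in itertools.product([1, 2, 4], repeat=len(split))),
        key=lambda plan: plan_cost(*plan),
    )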
Stanford MLSys Seminar
3/10/22 #58 Shruti Bhosale - Multilingual Machine Translation
Shruti Bhosale - Scaling Multilingual Machine Translation to Thousands of Language Directions. Existing work in translation has demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages. However, much of this work is English-centric, trained only on data that was translated from or to English. While this is supported by large sources of training data, it does not reflect translation needs worldwide. In this talk, I will describe how we create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100... (A short code sketch follows this entry.)
2022-03-18
57 min
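This line of work released public many-to-many checkpoints. Assuming the Hugging Face facebook/m2m100_418M checkpoint and its documented tokenizer interface, direct translation between two non-English languages (no English pivot) looks roughly like this:

    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
    tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

    tokenizer.src_lang = "fr"                    # source: French
    encoded = tokenizer("La vie est une boîte de chocolats.", return_tensors="pt")
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id("zh"),  # target: Chinese, directly
    )
    print(tokenizer.batch_decode(generated, skip_special_tokens=True))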
Stanford MLSys Seminar
3/3/22 #57 Vijay Janapa Reddi - TinyML, Harvard Style
Vijay Janapa Reddi - Tiny Machine Learning. Tiny machine learning (TinyML) is a fast-growing field at the intersection of ML algorithms and low-cost embedded systems. TinyML enables on-device analysis of sensor data (vision, audio, IMU, etc.) at ultra-low power consumption (less than 1 mW). Processing data close to the sensor allows for an expansive new variety of always-on ML use cases that preserve bandwidth, latency, and energy while improving responsiveness and maintaining privacy. This talk introduces the vision behind TinyML and showcases some of the interesting applications that TinyML is enabling in the field, from wildlife conservation to supporting public...
2022-03-04
57 min
Stanford MLSys Seminar
2/24/22 #56 Fait Poms - Interactive Model Development
Fait Poms - A vision for interactive model development: efficient machine learning by bringing domain experts in the loop. Building computer vision models today is an exercise in patience -- days to weeks for human annotators to label data, hours to days to train and evaluate models, weeks to months of iteration to reach a production model. Without tolerance for this timeline or access to the massive compute and human resources required, building an accurate model can be challenging if not impossible. In this talk, we discuss a vision for interactive model development with iteration cycles of minutes, not...
2022-02-28
55 min
Stanford MLSys Seminar
1/28/21 #10 Travis Addair - Deep Learning at Scale with Horovod
Travis Addair - Horovod and the Evolution of Deep Learning at Scale. Deep neural networks are pushing the state of the art in numerous machine learning research domains, from computer vision to natural language processing and even tabular business data. However, scaling such models to train efficiently on large datasets imposes a unique set of challenges that traditional batch data processing systems were not designed to solve. Horovod is an open source framework that scales models written in TensorFlow, PyTorch, and MXNet to train seamlessly on hundreds of GPUs in parallel. In this talk, we'll explain the con... (A short code sketch follows this entry.)
2022-02-23
59 min
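A minimal Horovod-with-PyTorch setup, following the pattern from Horovod's own documentation (the model is a stand-in; data loading and the training loop are elided):

    import torch
    import horovod.torch as hvd

    hvd.init()                                   # one process per GPU
    torch.cuda.set_device(hvd.local_rank())      # pin this process to its GPU

    model = torch.nn.Linear(784, 10).cuda()      # stand-in model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

    # Average gradients across all workers with ring-allreduce on each step.
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters()
    )

    # Start every worker from identical weights and optimizer state.
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    hvd.broadcast_optimizer_state(optimizer, root_rank=0)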
Stanford MLSys Seminar
2/17/22 #55 Doris Lee - Visualization for Data Science
Doris Lee - Always-on Dataframe Visualizations with Lux. Visualizations help data scientists discover trends and patterns, identify outliers, and derive insights from their data. However, existing visualization libraries in Python require users to write a substantial amount of code to plot even a single visualization, often hindering the flow of data exploration. In this talk, you will learn about Lux, a lightweight visualization tool built on top of pandas dataframes. Lux recommends visualizations to users for free as they explore their data within a Jupyter notebook, without the need to write additional code. Lux is used by data scientists... (A short code sketch follows this entry.)
2022-02-19
58 min
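Assuming Lux's published usage, the "no additional code" workflow is just importing the library next to pandas inside a notebook; an optional intent steers the recommendations. The CSV path and column name below are hypothetical.

    import pandas as pd
    import lux  # hooks into pandas DataFrames on import

    df = pd.read_csv("college.csv")      # hypothetical dataset

    # In Jupyter, displaying df now renders a widget that toggles between the
    # usual table view and a gallery of recommended visualizations.
    df

    df.intent = ["AverageCost"]          # hypothetical column: focus recommendations
    df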
Stanford MLSys Seminar
1/21/21 #9 Song Han - Reducing AI's Carbon Footprint
Song Han - TinyML: Reducing the Carbon Footprint of Artificial Intelligence in the Internet of Things (IoT). Deep learning is computation-hungry and data-hungry. We aim to improve both the computation efficiency and the data efficiency of deep learning. I will first talk about MCUNet[1], which brings deep learning to IoT devices. The technique is tiny neural architecture search (TinyNAS) co-designed with a tiny inference engine (TinyEngine), enabling ImageNet-scale inference on an IoT device with only 1 MB of flash. Next I will talk about TinyTL[2], which enables on-device training, reducing the memory footprint by 7-13x. Finally, I will describe D...
2022-02-15
56 min
Stanford MLSys Seminar
2/10/22 #54 Ellie Pavlick - Do Deep Models Learn Symbolic Reasoning?
Ellie Pavlick - Implementing Symbols and Rules with Neural Networks. Many aspects of human language and reasoning are well explained in terms of symbols and rules. However, state-of-the-art computational models are based on large neural networks which lack explicit symbolic representations of the type frequently used in cognitive theories. One response has been the development of neuro-symbolic models, which introduce explicit representations of symbols into neural network architectures or loss functions. In terms of Marr's levels of analysis, such approaches achieve symbolic reasoning at the computational level ("what the system does and why") by introducing symbols and...
2022-02-12
1h 01
Stanford MLSys Seminar
12/10/20 #8 Kayvon Fatahalian - Video Analysis in Hours, Not Weeks
Kayvon Fatahalian - From Ideas to Video Analysis Models in Hours, Not Weeks. My students and I often find ourselves as "subject matter experts" needing to create video understanding models that serve computer graphics and video analysis applications. Unfortunately, like many, we are frustrated by how a smart grad student, armed with a large *unlabeled* video collection, a palette of pre-trained models, and an idea of what novel object or activity they want to detect/segment/classify, requires days to weeks to create and validate a model for their task. In this talk I will discuss challenges we've faced in...
2022-02-07
1h 03
Stanford MLSys Seminar
2/3/22 #53 Cody Coleman - Data Selection for Data-Centric AI
Cody Coleman - Data Selection for Data-Centric AI: Data Quality Over Quantity. Data selection methods, such as active learning and core-set selection, improve the data efficiency of machine learning by identifying the most informative data points to label or train on. Across the data selection literature, there are many ways to identify these training examples. However, classical data selection methods are prohibitively expensive to apply in deep learning because of the large datasets and models involved. This talk will describe two techniques that make data selection methods more tractable. First, "selection via proxy" (SVP) avoids expensive training and... (A short code sketch follows this entry.)
2022-02-05
55 min
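A rough sketch of the "selection via proxy" idea from the talk: rank a pool of unlabeled points using a small, cheap proxy model rather than the expensive target model. The proxy choice and uncertainty metric here are simplifications, not the paper's exact recipe.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def select_via_proxy(X_labeled, y_labeled, X_pool, k):
        """Return indices of the k pool points the proxy is least sure about."""
        proxy = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
        probs = np.sort(proxy.predict_proba(X_pool), axis=1)
        margin = probs[:, -1] - probs[:, -2]   # top-two-class confidence gap
        return np.argsort(margin)[:k]          # smallest margin = most informative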
Stanford MLSys Seminar
1/27/22 #52 Bilge Acun - Sustainability for AI
Bilge Acun - Designing Sustainable Datacenters with and for AI. Machine learning has witnessed exponential growth in recent years. In this talk, we will first explore the environmental implications of AI's super-linear growth trend from a holistic perspective, spanning data, algorithms, and system hardware. System efficiency optimizations can significantly help reduce the carbon footprint of AI systems. However, predictions show that efficiency improvements alone will not be enough to reduce the overall resource needs of AI, as Jevons' paradox suggests that "efficiency increases consumption". Therefore, we need to design our datacenters with sustainability in mind...
2022-01-31
58 min
Stanford MLSys Seminar
12/3/20 #7 Matthias Poloczek - Bayesian Optimization
Matthias Poloczek - Scalable Bayesian Optimization for Industrial Applications. Bayesian optimization has become a powerful method for the sample-efficient optimization of expensive black-box functions. These functions have no closed form and are evaluated, for example, by running a complex economic simulation, by an experiment in the lab or in a market, or by a CFD simulation. Use cases arise in machine learning, e.g., when tuning the configuration of an ML model or when optimizing a reinforcement learning policy. Examples in engineering include the design of aerodynamic structures and materials discovery. In this talk I... (A short code sketch follows this entry.)
2022-01-24
59 min
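For orientation, a compact Bayesian optimization loop: fit a Gaussian-process surrogate, maximize the expected-improvement acquisition on a grid, evaluate, repeat. This uses scikit-learn for the GP; the industrial-scale methods in the talk go well beyond this sketch.

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor

    def black_box(x):                           # stand-in for a costly simulation
        return np.sin(3 * x) + 0.1 * x ** 2

    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(3, 1))         # small initial design
    y = black_box(X).ravel()
    grid = np.linspace(-2, 2, 200).reshape(-1, 1)

    for _ in range(10):
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        mu, sigma = gp.predict(grid, return_std=True)
        best = y.min()                          # we are minimizing
        z = (best - mu) / np.maximum(sigma, 1e-9)
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)  # expected improvement
        x_next = grid[np.argmax(ei)].reshape(1, 1)
        X = np.vstack([X, x_next])
        y = np.append(y, black_box(x_next).ravel())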
Stanford MLSys Seminar
01/20/22 #51 Fred Sala - Weak Supervision for Diverse Datatypes
Fred Sala - Efficiently Constructing Datasets for Diverse Datatypes. Building large datasets for data-hungry models is a key challenge in modern machine learning. Weak supervision frameworks have become a popular way to bypass this bottleneck. These approaches synthesize multiple noisy but cheaply acquired estimates of labels into a set of high-quality pseudolabels for downstream training. In this talk, I introduce a technique that fuses weak supervision with structured prediction, enabling weak supervision techniques to be applied to extremely diverse types of data. This approach allows for labels that can be continuous, manifold-valued (including, for example, points in hyperbolic space... (A short code sketch follows this entry.)
2022-01-21
53 min
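The baseline weak-supervision setup that the talk generalizes, in miniature: several noisy sources vote on each example and a fuser turns the votes into pseudolabels. Majority vote below is the simplest possible fuser; the talk's structured-prediction approach handles continuous and manifold-valued labels that this sketch cannot.

    import numpy as np

    ABSTAIN = -1

    # Rows are examples; columns are noisy labeling sources (heuristics, models...).
    votes = np.array([
        [1, 1, ABSTAIN],
        [0, ABSTAIN, 0],
        [1, 0, 1],
    ])

    def majority_vote(row):
        valid = row[row != ABSTAIN]
        if valid.size == 0:
            return ABSTAIN               # no source fired on this example
        return int(np.bincount(valid).argmax())

    pseudolabels = np.array([majority_vote(r) for r in votes])   # -> [1, 0, 1]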
Stanford MLSys Seminar
11/19/20 #6 Roy Frostig - The Story Behind JAX
Roy Frostig - JAX: accelerating machine learning research by composing function transformations in Python. JAX is a system for high-performance machine learning research and numerical computing. It offers the familiarity of Python+NumPy together with hardware acceleration, plus a set of composable function transformations: automatic differentiation, automatic batching, end-to-end compilation (via XLA), parallelization over multiple accelerators, and more. JAX's core strength is its guarantee that these user-wielded transformations can be composed arbitrarily, so that programmers can write math (e.g. a loss function) and transform it into pieces of an ML program (e.g. a vectorized, compiled... (A short code sketch follows this entry.)
2022-01-17
1h 06
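The composability claim in one snippet: grad, vmap, and jit are JAX's real transformations, and they nest in any order.

    import jax
    import jax.numpy as jnp

    def loss(w, x, y):                    # plain math: a mean-squared-error loss
        return jnp.mean((jnp.dot(x, w) - y) ** 2)

    grad_loss = jax.grad(loss)                               # differentiate
    per_example = jax.vmap(grad_loss, in_axes=(None, 0, 0))  # auto-batch
    fast = jax.jit(per_example)                              # compile via XLA

    w, x, y = jnp.ones(3), jnp.ones((8, 3)), jnp.zeros(8)
    grads = fast(w, x, y)                 # (8, 3): one gradient per example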
Stanford MLSys Seminar
01/13/22 #50 Deepak Narayanan - Resource-Efficient Deep Learning Execution
Deepak Narayanan - Resource-Efficient Deep Learning Execution. Deep learning models have enabled state-of-the-art results across a broad range of applications; however, training these models is extremely time- and resource-intensive, taking weeks on clusters with thousands of expensive accelerators in the extreme case. In this talk, I will describe two ideas that help improve the resource efficiency of model training. In the first half of the talk, I will discuss how pipelining can be used to accelerate distributed training. Pipeline parallelism facilitates model training with lower communication overhead than previous methods while still ensuring high compute... (A short code sketch follows this entry.)
2022-01-14
57 min
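A back-of-the-envelope model of why pipelining helps: splitting a batch into m microbatches across s stages shrinks the idle "bubble" to roughly (s - 1)/(m + s - 1) of the schedule. The toy calculation below counts forward-pass time slots only and is not any specific system's scheduler.

    def pipeline_stats(stages, microbatches):
        # Slots until the last microbatch drains through all stages.
        total_slots = stages + microbatches - 1
        useful = stages * microbatches            # stage-slots doing real work
        bubble = stages * total_slots - useful    # idle stage-slots
        return total_slots, bubble / (stages * total_slots)

    for m in (1, 4, 16):
        slots, frac = pipeline_stats(stages=4, microbatches=m)
        print(f"{m:2d} microbatches: {slots} slots, bubble fraction {frac:.2f}")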
Stanford MLSys Seminar
11/12/20 #5 Chip Huyen - Principles of Good Machine Learning Systems Design
Chip Huyen - Principles of Good Machine Learning Systems Design. This talk covers what it means to operationalize ML models. It starts by analyzing the differences between ML in research vs. in production, ML systems vs. traditional software, as well as myths about ML production. It then goes over the principles of good ML systems design and introduces an iterative framework for ML systems design, from scoping the project through data management, model development, deployment, and maintenance, to business analysis. It covers the differences between DataOps, ML engineering, MLOps, and data science, and where each fits into...
2022-01-10
1h 06
Stanford MLSys Seminar
11/5/20 #4 Alex Ratner - Programmatically Building & Managing Training Data with Snorkel
Alex Ratner - Programmatically Building & Managing Training Data with Snorkel. One of the key bottlenecks in building machine learning systems is creating and managing the massive training datasets that today's models require. In this talk, I will describe our work on Snorkel (snorkel.org), an open-source framework for building and managing training datasets, and describe three key operators for letting users build and manipulate training datasets: labeling functions, for labeling unlabeled data; transformation functions, for expressing data augmentation strategies; and slicing functions, for partitioning and structuring training datasets. These operators allow domain expert users to specify machine l... (A short code sketch follows this entry.)
2022-01-08
1h 13
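The labeling-function operator in Snorkel's open-source API (snorkel 0.9-style imports; the spam heuristics and toy dataframe are invented for illustration):

    import pandas as pd
    from snorkel.labeling import labeling_function, PandasLFApplier
    from snorkel.labeling.model import LabelModel

    ABSTAIN, HAM, SPAM = -1, 0, 1

    @labeling_function()
    def lf_contains_link(x):
        return SPAM if "http" in x.text else ABSTAIN

    @labeling_function()
    def lf_very_short(x):
        return HAM if len(x.text.split()) < 4 else ABSTAIN

    df = pd.DataFrame({"text": [
        "check out http://prizes.example now",
        "thanks!",
        "win big at http://casino.example",
    ]})

    L = PandasLFApplier([lf_contains_link, lf_very_short]).apply(df)

    # Denoise and combine the label matrix into probabilistic training labels.
    label_model = LabelModel(cardinality=2, verbose=False)
    label_model.fit(L)
    probs = label_model.predict_proba(L)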
Stanford MLSys Seminar
11/5/20 #3 Virginia Smith - On Heterogeneity in Federated Settings
Virginia Smith - On Heterogeneity in Federated Settings. A defining characteristic of federated learning is the presence of heterogeneity, i.e., that data and compute may differ significantly across the network. In this talk I show that the challenge of heterogeneity pervades the machine learning process in federated settings, affecting issues such as optimization, modeling, and fairness. In terms of optimization, I discuss FedProx, a distributed optimization method that offers robustness to systems and statistical heterogeneity. I then explore the role that heterogeneity plays in delivering models that are accurate and fair to all users/devices in... (A short code sketch follows this entry.)
2022-01-08
1h 00
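The FedProx idea in a few lines: each device's local objective gains a proximal term (mu/2) * ||w - w_global||^2 so heterogeneous local updates cannot drift too far before averaging. NumPy sketch; grad_local stands in for a device's own loss gradient.

    import numpy as np

    def fedprox_local_update(w_global, grad_local, mu=0.01, lr=0.1, steps=10):
        """Local SGD on f_i(w) + (mu / 2) * ||w - w_global||^2."""
        w = w_global.copy()
        for _ in range(steps):
            g = grad_local(w) + mu * (w - w_global)  # proximal pull toward global
            w -= lr * g
        return w

    def fedprox_round(w_global, client_grad_fns, **kwargs):
        # Server step: average the clients' proximally regularized solutions.
        updates = [fedprox_local_update(w_global, g, **kwargs)
                   for g in client_grad_fns]
        return np.mean(updates, axis=0)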
Stanford MLSys Seminar
10/22/20 #2 Matei Zaharia - Machine Learning at Industrial Scale: Lessons from the MLflow Project
Matei Zaharia - Machine Learning at Industrial Scale: Lessons from the MLflow Project. Although enterprise adoption of machine learning is still early, many enterprises in all industries already have hundreds of internal ML applications. ML powers business processes with an impact of hundreds of millions of dollars in industrial IoT, finance, healthcare and retail. Building and operating these applications reliably requires infrastructure that is different from traditional software development, which has led to significant investment in the construction of “ML platforms” specifically designed to run ML applications. In this talk, I’ll discuss some of the common... (A short code sketch follows this entry.)
2022-01-08
59 min
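MLflow's tracking API in miniature. The calls below are from the open-source library; the experiment name, values, and artifact path are placeholders.

    import mlflow

    mlflow.set_experiment("churn-model")            # placeholder experiment name

    with mlflow.start_run():
        mlflow.log_param("lr", 0.01)                # hyperparameters
        mlflow.log_param("n_estimators", 200)
        for epoch, loss in enumerate([0.90, 0.61, 0.45]):
            mlflow.log_metric("loss", loss, step=epoch)
        mlflow.log_artifact("model.pkl")            # assumes this file exists locally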
Stanford MLSys Seminar
10/15/20 #1 Marco Tulio Ribeiro - Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
Marco Tulio Ribeiro on "Beyond Accuracy: Behavioral Testing of NLP Models with CheckList". We will present CheckList, a task-agnostic methodology and tool for testing NLP models, inspired by principles of behavioral testing in software engineering. We will show a lot of fun bugs we discovered with CheckList, both in commercial models (Microsoft, Amazon, Google) and research models (BERT, RoBERTa for sentiment analysis, QQP, SQuAD). We'll also present comparisons between CheckList and the status quo, in a case study at Microsoft and a user study with researchers and engineers. We show that CheckList is a really helpful process... (A short code sketch follows this entry.)
2022-01-08
1h 00
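The flavor of a CheckList minimum functionality test (MFT), sketched without the library: expand a template into test cases and check that an expected label holds. The real checklist package provides richer templating and reporting; model here is any text-to-label callable you supply.

    def mft_negation(model):
        """model: callable mapping a sentence to 'positive' or 'negative'."""
        template = "I {negation} {verb} this airline."
        cases = [
            (template.format(negation="do not", verb="love"), "negative"),
            (template.format(negation="don't", verb="recommend"), "negative"),
            (template.format(negation="never", verb="liked"), "negative"),
        ]
        failures = []
        for text, expected in cases:
            predicted = model(text)
            if predicted != expected:
                failures.append((text, predicted))
        return failures                  # empty list means the test passed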
Stanford MLSys Seminar
01/06/22 #49 Beidi Chen - Pixelated Butterfly: Fast Machine Learning with Sparsity
Beidi Chen talks about "Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models." Overparameterized neural networks generalize well but are expensive to train. Ideally, one would like to reduce their computational cost while retaining their generalization benefits. Sparse model training is a simple and promising approach to achieve this, but there remain challenges, as existing methods struggle with accuracy loss, slow training runtime, or difficulty in sparsifying all model components. The core problem is that searching for a sparsity mask over a discrete set of sparse matrices is difficult and expensive. To address this, our main insight... (A short code sketch follows this entry.)
2022-01-08
53 min
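For intuition about the butterfly structure underlying the sparsity patterns in this talk: n = 2^L inputs flow through L levels of 2x2 mixers (the FFT's wiring with learnable weights), so a butterfly matrix costs O(n log n) instead of n^2. This shows the background structure only, not the flat block-sparse "pixelated" variant the paper proposes.

    import numpy as np

    def butterfly_apply(x, twiddles):
        """Apply a butterfly network to x (length a power of two).

        twiddles: one (n // 2, 2, 2) array of 2x2 mixing blocks per level.
        """
        n = x.shape[0]
        for level, tw in enumerate(twiddles):
            stride = 1 << level
            y = x.copy()
            pair = 0
            for block in range(0, n, 2 * stride):
                for j in range(stride):
                    a, b = block + j, block + j + stride
                    m = tw[pair]
                    y[a] = m[0, 0] * x[a] + m[0, 1] * x[b]
                    y[b] = m[1, 0] * x[a] + m[1, 1] * x[b]
                    pair += 1
            x = y
        return x

    rng = np.random.default_rng(0)
    n = 8                                        # 2**3 inputs -> 3 levels
    twiddles = [rng.standard_normal((n // 2, 2, 2)) for _ in range(3)]
    out = butterfly_apply(rng.standard_normal(n), twiddles)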