Sanket Makhija - Podcast Details

Shows

Code Impact
Spanner's Globally-Distributed Database: Query Execution
This paper details the evolution of Google's Spanner, a globally-distributed database system, from a key-value store to a fully-fledged SQL system. Key improvements discussed include distributed query execution, handling of transient failures via query restarts, efficient range extraction for data retrieval, and the adoption of a common SQL dialect. The authors also explain the transition from a Bigtable-like storage format to a more efficient blockwise-columnar store (Ressi). Finally, the paper highlights lessons learned during Spanner's large-scale deployment and outlines remaining challenges.
2025-02-03, 36 min

Code Impact
Change Data Capture (CDC): Three Implementation Methods
The article explores Change Data Capture (CDC), a method for tracking database changes, highlighting its advantages over traditional daily snapshots. It details three CDC implementation approaches: using database triggers (e.g., in PostgreSQL), capturing API requests and using a message broker (e.g., Kafka), and leveraging change streams within a data warehouse (e.g., Snowflake). The article compares these methods, weighing their pros and cons in terms of performance, scalability, and ease of implementation. A subsequent discussion critiques the presented methods, suggesting alternative, more robust solutions based on logical replication tools like Debezium.
2025-02-03, 16 min

Code Impact
DeepSeek-R1: Reasoning via Reinforcement Learning
This research paper introduces DeepSeek-R1, a large language model enhanced for reasoning capabilities using reinforcement learning (RL). Two versions are presented: DeepSeek-R1-Zero, trained purely via RL without supervised fine-tuning, and DeepSeek-R1, which incorporates additional multi-stage training and cold-start data for improved readability and performance. DeepSeek-R1 achieves results comparable to OpenAI's o1-1217 on various reasoning benchmarks. The study also explores distilling DeepSeek-R1's reasoning capabilities into smaller, more efficient models, achieving state-of-the-art results. Finally, the paper discusses unsuccessful attempts using process reward models and Monte Carlo Tree Search, providing valuable insights for future research.
2025-01-26, 19 min

Code Impact
Jira Cloud Performance Enhancement with Protobuf
This Atlassian blog post details the migration of Jira Cloud's Issue Service from JSON to Protocol Buffers (Protobuf) to enhance performance. The switch involved a phased approach to minimise downtime, creating new endpoints and logic to handle both formats concurrently before a complete transition. The results showcased significant improvements: 75% less Memcached CPU usage, 80% smaller data size, and a substantially faster response time. Challenges encountered included Protobuf's handling of null values and incompatibility with Spring's default error controller, which required workarounds. Ultimately, the migration yielded substantial performance gains and reduced infrastructure needs. https://www.atlassian...
2025-01-26, 20 min

Code Impact
Hyaline: Fast and Transparent Lock-Free Memory Reclamation
This research paper introduces Hyaline, a novel family of memory reclamation schemes for lock-free data structures in unmanaged C/C++ code. Hyaline leverages reference counting, but only during reclamation, minimising overhead during object access and balancing workload across threads. The paper details Hyaline's design, including a scalable multi-list version and robust extensions to handle stalled threads. Extensive testing across multiple architectures demonstrates Hyaline's superior performance and memory efficiency compared to existing schemes like epoch-based reclamation and hazard pointers, particularly in read-dominated and oversubscribed scenarios. The paper concludes by proving Hyaline's correctness and lock-freedom properties.
2025-01-25, 32 min

Code Impact
Trello's Kafka Migration
This Atlassian blog post details Trello's migration from RabbitMQ to Kafka for its websocket architecture. RabbitMQ's unreliability during network partitions and high costs associated with queue creation and deletion prompted the switch. The article compares various queuing systems, highlighting Kafka's superior failover capabilities and in-order message delivery. Trello implemented a master-client architecture with Kafka, resulting in improved performance, reduced costs, and fewer outages. Key performance improvements included a 33% decrease in memory usage and a substantial cost reduction.
2025-01-19, 12 min

Code Impact
Reliability Engineering: History, Practice, and Future
This podcast explores the field of reliability engineering, tracing its origins at Google with the development of Site Reliability Engineering (SRE). It differentiates reliability engineering from SRE, highlighting its broader applicability across various organisational structures. The podcast outlines four key promises of a successful reliability team: defining service levels (SLA/SLO/SLI), managing the service infrastructure, participating in technical design, and providing tactical support during incidents. Finally, it discusses the evolving landscape of reliability engineering, emphasising pragmatic approaches to balancing cost and reliability needs, and advocating for a more nuanced understanding of when to build versus buy solutions.
2025-01-19, 13 min

Code Impact
Debugging Large Distributed Systems: The Antithesis Approach
This podcast profiles Antithesis, a company developing a "multiverse debugger" for large, distributed systems. It traces the history of debugging tools, highlighting Antithesis's innovative approach using deterministic simulation testing (DST) to allow time travel debugging. The podcast includes a Q&A with Antithesis's co-founder, detailing the challenges of debugging large systems and how Antithesis addresses them. Furthermore, it discusses Antithesis's tech stack, engineering culture, and the trade-offs of using their complex, but potentially game-changing, technology. Finally, it considers the implications of widespread adoption of Antithesis's technology for the future of software development.
2025-01-19, 23 min

Code Impact
Shopify's Live Globe: Building a Black Friday Experience
This podcast details the creation of Shopify's interactive Black Friday/Cyber Monday live dashboard, nicknamed "Live Globe". The 2024 version, built by a six-person team in two months, features a spaceship-themed interface showcasing real-time sales data and boasts impressive technical specifications, including peak loads of nearly 30 million database reads per second. The design process involved extensive prototyping and the use of AI-generated imagery for inspiration. The podcast also highlights the technology stack (React Three Fiber, Go, Rails, Kafka, and Flink), the inclusion of numerous Easter eggs, and the challenges of performance optimisation and real-time data streaming. Finally, it explores the...
2025-01-19, 20 min

Code Impact
Wartime vs. Peacetime in Tech Companies
This podcast examines the contrasting "wartime" and "peacetime" operating modes in tech companies, drawing on the author's experiences at Uber and observations across the industry. It defines these modes in terms of leadership styles, employee behaviours, and organisational priorities, highlighting the differences in approaches to project management, performance reviews, and tech debt. The text explores the transitions between these modes, identifying common triggers and observable signs, and offers advice for employees and managers on thriving in each environment. Finally, it discusses the counterintuitive relationship between extended "wartime" periods and tech debt accumulation.
2025-01-19, 18 min

Code Impact
The First-Time Manager: A Practical Guide
Jim McCormick's "The First Time Manager" offers a practical guide for new managers, covering essential aspects like communication, delegation, and conflict resolution. The book employs a clear and relatable style, using real-world examples and actionable advice to help readers build foundational leadership skills. While some advice may be general, its comprehensive approach to fundamental management principles makes it a valuable resource for aspiring and new managers seeking a strong start in their careers. The book also touches on crucial aspects of personal development and emotional intelligence in leadership. Even experienced managers might find its refresher on core concepts beneficial.
2025-01-06, 28 min

Code Impact
On-the-Fly Sharing for Streamed Aggregation
This research paper details the development and implementation of efficient techniques for processing multiple, similar aggregate queries in data streaming systems. The authors address the challenges of scaling to handle hundreds of concurrent queries, each with potentially different time windows and selection predicates. Their proposed "on-the-fly" methods avoid computationally expensive static query analysis, offering significant performance improvements (up to an order of magnitude) over existing approaches. The techniques are validated through a performance study using real-world stock market data, demonstrating their practical effectiveness. The core contributions are novel algorithms for shared time slices, shared data fragments, and a combined...
2025-01-04, 14 min

Code Impact
Vercel Request Lifecycle From User Input to Global Delivery
The article details how Vercel's platform handles web requests, from initial user input to final response. Vercel's Edge Network directs requests to optimal data centres, minimising latency. A multi-layered firewall system protects against threats. Advanced routing features, including middleware, manage request flow. Finally, Edge caching and Vercel Functions optimise speed and scalability for dynamic content.
2024-12-23, 11 min

Code Impact
Bytedance Real-Time Recommendation System
This research paper introduces Monolith, a real-time recommendation system designed by Bytedance. Addressing limitations of existing deep learning frameworks, Monolith uses a novel collisionless embedding table to efficiently handle sparse, dynamic features, significantly improving model quality and memory usage. A key innovation is its online training architecture, enabling real-time model updates based on user feedback. The authors demonstrate Monolith’s superior performance through experiments and A/B tests, highlighting the trade-offs between real-time learning and system reliability. Finally, the paper compares Monolith to existing solutions, showcasing its advantages in scalability and efficiency for large-scale recommendation tasks.
2024-12-23, 15 min

Code Impact
Postgres Retrospective by Joseph M. Hellerstein
This article reminisces on the history of the Postgres project, spearheaded by Michael Stonebraker at UC Berkeley from the mid-1980s to the mid-1990s. It details Stonebraker's design philosophy and the project's technical innovations, including support for abstract data types, active databases, and novel storage and recovery mechanisms. The article highlights Postgres's evolution into the open-source PostgreSQL system, its significant commercial impact through various spin-off companies, and the lessons learned from its success. It also discusses the unexpected benefits of open-sourcing the research and the project's lasting influence on database technology. The author reflects on his own involvement...
2024-12-23, 21 min

Code Impact
Migrating Yelp Reservations from PostgreSQL to MySQL
This blog post details Yelp's in-place migration of their Yelp Reservations service database from PostgreSQL to MySQL. The migration, necessitated by maintenance and expertise limitations with PostgreSQL, involved significant code refactoring to address unsupported features and ensure data consistency. A gradual rollout strategy, employing multi-DB support and careful synchronisation, was implemented to minimise disruption. The process revealed several unexpected challenges, including issues with auto-incrementing keys and ProxySQL memory usage, highlighting the complexities of such large-scale database migrations. Ultimately, the switch to the company standard MySQL improved performance and maintainability.
2024-12-23, 17 min

Code Impact
FBDetect - Catching Tiny Performance Regressions at Hyperscale through In-Production Monitoring
Meta's FBDetect system, detailed in this research paper, is a robust, in-production performance regression detection system. It identifies minuscule performance regressions (as small as 0.005%) across millions of servers and hundreds of services by monitoring hundreds of thousands of time series metrics. Key to FBDetect's success are advanced techniques for subroutine-level performance analysis, filtering false positives, deduplicating correlated regressions, and root cause analysis. The paper validates FBDetect's effectiveness through simulations and real-world production data, showcasing its superiority over existing methods and highlighting the significance of its seven years of successful operation.
2024-12-23, 20 min

Code Impact
Amazon DynamoDB - A Decade of Scalable NoSQL
This paper details the architecture and evolution of Amazon DynamoDB, a fully managed NoSQL database service. Key features highlighted include its scalability, predictable performance, high availability (achieved through multi-region replication and sophisticated failure handling), and strong durability (guaranteed by techniques like write-ahead logging and continuous data verification). The authors discuss challenges faced during DynamoDB's development, such as handling uneven traffic distribution and optimising resource allocation, and explain the solutions implemented, including the shift from provisioned to on-demand capacity. Performance benchmarks are provided to demonstrate the system's consistent low latency even under extreme load.
2024-12-23, 21 min

Code Impact
Defining a Senior Software Engineer
This blog post discusses the multifaceted definition of a senior software engineer. Technical expertise is crucial, encompassing a T-shaped skill profile and a deep understanding of software development principles. However, soft skills, such as communication, leadership, and a growth mindset, are equally vital for moving projects and teams forward. The author suggests several strategies for professional growth, including pair programming, content creation, and seeking challenging tasks. Ultimately, the article posits that becoming a senior engineer is an ongoing journey of learning and improvement, rather than a fixed destination.
2024-12-23, 23 min

Code Impact
Amazon S3 Tables - Analytics Optimised Storage
Amazon Web Services (AWS) has launched Amazon S3 Tables, a new storage service optimised for analytical workloads. These tables, stored in a new type of S3 bucket, utilise the Apache Iceberg format for efficient querying with tools like Amazon Athena and Apache Spark. Offering significant performance improvements (up to 3x faster queries and 10x more transactions per second) over self-managed solutions, S3 Tables provide fully managed features including automatic compaction, snapshot management, and unreferenced file removal. The service integrates with other AWS analytics services and supports standard S3 APIs, offering enhanced security and scalability. Currently available in select US...
2024-12-23, 14 min

Code Impact
Amazon S3 Deep Dive: Scale, Decorrelation, and Velocity
This transcript from an AWS re:Invent 2024 session details Amazon S3's architecture and engineering principles. Two senior engineers explain how S3's massive scale enables efficient data management, utilising techniques like shuffle sharding to distribute workloads across millions of drives. They discuss the physics of data storage, showcasing how S3's design improves performance and reliability by mitigating the impact of hardware limitations and individual workload bursts. Erasure coding is highlighted as a key technology that ensures data durability and facilitates faster software deployment. Finally, the presentation emphasises how S3's fault-tolerant design, built on principles of decorrelation, ultimate...
2024-12-23, 25 min

Code Impact
Aurora DSQL Transactions and Durability
This blog post by Marc Brooker, an AWS engineer, explains the write operations within Aurora DSQL, a scalable SQL database. It details how Aurora DSQL uses optimistic concurrency control (OCC) combined with multiversion concurrency control (MVCC) to achieve strong consistency and snapshot isolation. The system uses an "adjudicator" service to manage write conflicts and a "Journal" for durable, ordered data replication. The author highlights the benefits of this approach for scalability, availability, and performance, emphasising the avoidance of locking. Finally, the post touches upon the system's consistency guarantees and the importance of schema design to minimise write conflicts.
2024-12-23, 11 min

Code Impact
Parsing Millions of URLs Per Second
This research article details the development and benchmarking of a high-performance URL parser compliant with the WHATWG standard. The authors created a C++ implementation leveraging vectorisation techniques, resulting in a parser significantly faster than existing solutions like curl and rust-url. Their parser was integrated into Node.js, leading to substantial performance improvements in URL processing within that environment. Extensive benchmarks across various datasets and platforms demonstrated the superior speed and efficiency of their new parser. The authors also provide open-source access to their code and datasets.
2024-12-23, 26 min

Code Impact
Tech Predictions for 2025 and Beyond
This article presents technology predictions for 2025 and beyond, focusing on several key themes. Firstly, it highlights a growing mission-driven workforce prioritising positive societal impact over solely financial gain. Secondly, it discusses the crucial role of technological innovation in addressing the global energy crisis, advocating for a blend of renewable and nuclear solutions alongside improved energy consumption practices. Thirdly, the text explores the use of AI-powered tools to combat the spread of misinformation and disinformation. Finally, it examines the emerging trend of intention-driven consumer technology, emphasising mindful usage and a reduction in constant digital distraction.
2024-12-23, 13 min

Code Impact
Amazon Aurora - Design Considerations for High Throughput Cloud-Native Relational Databases
This paper details the architecture and design of Amazon Aurora, a cloud-native relational database service. Key design choices focus on mitigating network bottlenecks inherent in high-throughput cloud systems by offloading redo processing to a separate, distributed storage service. This approach enhances durability and availability through a novel quorum model and segmented storage, significantly improving performance and reducing recovery times. The authors present performance benchmarks demonstrating Aurora's superior scalability and efficiency compared to traditional MySQL configurations, along with lessons learned from real-world customer deployments highlighting the system's suitability for modern cloud applications. Finally, the paper compares Aurora's design to related work i...
2024-12-23, 12 min

Code Impact
Building a Database on S3
This research paper investigates the feasibility and limitations of using Amazon S3, a cloud storage service, as a database system for web applications. The authors present protocols for managing reads, writes, and commits to S3, addressing issues of concurrency and consistency. They explore different consistency levels, including eventual consistency and stronger guarantees like atomicity and monotonic reads/writes, analysing their performance and cost implications using the TPC-W benchmark. The study highlights the trade-offs between consistency, availability, and scalability inherent in utilising S3 for database applications and proposes solutions to enhance transactional properties while retaining S3’s inherent advantages. The pa...
2024-12-23, 19 min

Code Impact
Why Events Are A Bad Idea (for High-Concurrency Servers)
This 2003 USENIX paper challenges the prevailing belief that event-based programming is superior to thread-based programming for high-concurrency servers. The authors argue that the perceived weaknesses of threads stem from flawed implementations rather than inherent limitations. They present a high-performance user-level thread package as evidence, supporting their claim that threads offer a simpler, more natural programming model. Furthermore, they propose compiler enhancements to further improve thread performance and safety. The paper compares and contrasts the two approaches, revisiting earlier work on their duality, and concludes that with proper compiler support, threads provide a superior solution for building scalable servers.
2024-12-23, 17 min

Code Impact
System Design For Beginners - Everything You Need
This Medium article by Shivam Bhadani provides a comprehensive guide to system design for beginners. It covers fundamental concepts like servers, latency, and throughput, progressing to advanced topics such as scaling strategies (vertical and horizontal), database scaling, microservices, caching, and message brokers. The author emphasises practical implementation alongside theory, using real-world examples and providing exercises to reinforce learning. The article concludes with advice on approaching system design problems and includes numerous illustrations. A strong focus is placed on distributed systems and their associated challenges, including consistency and leader election. Article: https://medium.com/@shivambhadani_/system-design-for-beginners-everything-you-need-in-one-article-c74eb702540b
2024-12-23, 15 min

Code Impact
Why Threads Are A Bad Idea (for most purposes)
This 1995 paper argues that threads, while powerful for achieving true CPU concurrency, are overly complex for most programming tasks. The author, John Ousterhout, contends that event-driven programming offers a simpler, more reliable alternative for applications such as GUIs and distributed systems. He highlights the difficulties of thread synchronisation, debugging, and performance optimisation, contrasting them with the relative ease of event handling. Ousterhout advocates using threads only when genuine parallel processing across multiple CPUs is essential, suggesting that even then, they should be confined to a core kernel within a predominantly single-threaded application. Ultimately, the paper promotes a pragmatic approach...
2024-12-23, 13 min

Code Impact
SQLite - Past, Present and the Future
This paper examines SQLite, the world's most widely deployed database engine, exploring its history, architecture, and performance characteristics. The authors benchmark SQLite against DuckDB, an analytics-focused database, across various workloads (OLTP, OLAP, and blob processing). Key performance bottlenecks in SQLite's OLAP capabilities are identified and addressed through optimisation, resulting in significant speed improvements. The study also considers the resource footprint of both databases, comparing their compilation times, library sizes, and data storage efficiency. Finally, the authors discuss future development directions for SQLite, balancing performance enhancements with its established strengths of portability, compactness, and reliability.
2024-12-23, 12 min

Code Impact
Prequal - Load Balancing for Distributed Systems
This research paper introduces Prequal, a novel load balancer designed to minimise latency in large-scale distributed systems like YouTube. Unlike traditional load balancers that focus on balancing CPU usage, Prequal prioritises estimated latency and requests in flight, actively probing servers for real-time load information. Extensive testing on YouTube and a controlled testbed demonstrated that Prequal significantly reduces tail latency, error rates, and resource consumption, compared to weighted round-robin and other load balancing strategies. The paper details Prequal's design, including its asynchronous probing mechanism and hot-cold lexicographic rule for replica selection, and its superior performance is attributed to its ability...
2024-12-23, 15 min

Code Impact
The RedMonk Programming Language Rankings: June 2024
This document presents the June 2024 RedMonk Programming Language Rankings, which are based on data extracted from GitHub and Stack Overflow. The report, authored by Stephen O'Grady, provides an analysis of the most popular programming languages and their relative popularity, highlighting trends in language adoption. The rankings offer a glimpse into the dynamic world of software development and serve as a valuable resource for developers and organisations alike. Reference: https://redmonk.com/sogrady/2024/09/12/language-rankings-6-24/
2024-10-25, 10 min

Code Impact
Event Sourcing Pattern In Microservices
Building a reliable microservices architecture requires careful consideration of data consistency. Join us as we explore event sourcing, a powerful pattern that ensures atomic updates across your database and message broker, eliminating data inconsistencies and paving the way for a robust and scalable system. Reference: https://microservices.io/patterns/data/event-sourcing.html
2024-10-25, 07 min

Code Impact
Domain Event Pattern In Microservices
This podcast explores domain events and their crucial role in modern software architecture. Join us as we discuss how domain events facilitate communication between services, support patterns like CQRS and Saga, and enable the development of robust and scalable applications. We'll also examine related concepts like DDD aggregates, transactional outboxes, and event sourcing to provide a comprehensive understanding of this vital architectural pattern. Reference: https://microservices.io/patterns/data/domain-event.html
2024-10-25, 14 min

Code Impact
Command Query Responsibility Segregation CQRS Pattern In Microservices
In this episode, we explore the Command Query Responsibility Segregation (CQRS) pattern, a powerful technique for implementing efficient queries in microservice architectures. We break down the challenges of querying data across multiple services and explain how CQRS offers a solution through the creation of dedicated read-only databases. Tune in to discover the benefits, drawbacks, and real-world examples of CQRS in action, helping you streamline data retrieval and enhance your microservices' performance. Reference: https://microservices.io/patterns/data/cqrs.html
2024-10-25, 07 min

Code Impact
API Composition Design Pattern In Microservices
Discover the API Composition pattern, its benefits, drawbacks, and practical examples using API Gateways. Learn about alternative solutions like the CQRS pattern and understand how it addresses the challenges posed by the Database per Service pattern. This podcast is your guide to efficient data management in a microservice world. Reference: https://microservices.io/patterns/data/api-composition.html
2024-10-22, 08 min

Code Impact
Command-Side Replica Design Pattern In Microservices
This podcast explores the world of microservices architecture, discussing patterns, benefits, and challenges. We'll cover topics like the Command-Side Replica pattern, Database per Service, and the Saga pattern. Learn how to design, implement, and manage complex systems using microservices. Join us as we break down the complexities of modern software development and empower you to build robust and scalable applications. Reference: https://microservices.io/patterns/data/command-side-replica.html
2024-10-22, 09 min

Code Impact
Saga Pattern In Microservices
In the realm of microservices, where applications are split into independent services, managing transactions that span multiple services becomes a challenge. This episode explores the Saga pattern, a powerful solution for coordinating distributed transactions. Discover how Sagas break down complex operations into smaller, manageable local transactions, ensuring data consistency without relying on traditional distributed transaction mechanisms. Learn about the two main Saga coordination styles – choreography and orchestration – and their trade-offs. We'll also discuss the benefits and drawbacks of using Sagas, including strategies for handling failures and maintaining isolation. Reference: https://microservices.io/patterns/data/saga...
2024-10-20, 10 min
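As a rough illustration of the orchestration style mentioned in this episode, the sketch below runs a sequence of local transactions and, when one fails, undoes the completed ones in reverse order. The function name run_saga and the (action, compensation) step format are assumptions made for this example and do not come from the episode or the referenced page.

```python
def run_saga(steps):
    """Run local transactions in order; compensate on failure.

    `steps` is a list of (action, compensation) pairs of zero-argument
    callables. Returns True if every action succeeded, False if the saga
    was rolled back through its compensations.
    """
    completed = []  # compensations for the steps that already succeeded
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Undo the finished steps, most recent first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True
```

In a real system each action would call another service, and the compensations would themselves need to be retried until they succeed; the sketch leaves that out.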

Code Impact
Shared Database Pattern In Microservices
Join us as we discuss the challenges of managing data in a microservices architecture. We'll focus on the "Shared Database" pattern, analysing its potential benefits and drawbacks, and providing insights to help you make informed architectural decisions for your applications. Reference: https://microservices.io/patterns/data/shared-database.html
2024-10-20, 10 min

Code Impact
Database Per Service Design Pattern in Microservice Architecture
In the world of microservices, data management is key. This episode explores the "Database-per-Service" pattern – a powerful approach to keeping your services loosely coupled and adaptable. We'll look at the benefits, drawbacks, and how to navigate challenges like distributed transactions and complex queries. Get ready to optimise your microservice architecture! Reference: https://microservices.io/patterns/data/database-per-service.html
2024-10-20, 12 min

Code Impact
The Polling Publisher Pattern Deep Dive
This episode focuses on the Polling Publisher pattern, a crucial technique for publishing messages in a microservices architecture. Explore its workings, benefits, drawbacks, and alternatives like Transaction Log Tailing. Reference: https://microservices.io/patterns/data/polling-publisher.html
2024-10-20, 07 min

Code Impact
The Power of Transaction Log Tailing
In this episode, we explore the Transaction Log Tailing pattern, a powerful technique for ensuring reliable message delivery in a microservices architecture. We'll discuss how this pattern works, its benefits and drawbacks, and how it compares to alternative solutions. Join us as we demystify transaction log tailing and unlock its potential for your microservices applications. References: https://microservices.io/patterns/data/transaction-log-tailing.html
2024-10-20, 09 min

Code Impact
Mastering Reliable Messaging in the Transactional Outbox Beyond Event Sourcing
While event sourcing is a popular approach for managing data consistency in microservices, there are alternative patterns worth considering. This episode explores the Transactional Outbox pattern, a versatile solution for reliably sending messages and events even when distributed transactions are not an option. Uncover the inner workings of this pattern and how it guarantees message ordering and delivery while maintaining the flexibility of your microservices architecture. Reference: https://microservices.io/patterns/data/transactional-outbox.html
2024-10-20, 14 min
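The core idea of the Transactional Outbox pattern can be sketched in a few lines: the business row and the outgoing message are written in one local transaction, and a separate relay publishes the messages afterwards. The sketch below uses Python's built-in sqlite3 module; the orders and outbox tables, the OrderPlaced payload, and the publish callback are hypothetical names chosen for illustration, and a real deployment would publish to a message broker.

```python
import json
import sqlite3


def create_schema(conn):
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn, order_id, item):
    """Write the order and its event in the same local transaction."""
    event = {"type": "OrderPlaced", "order_id": order_id}
    with conn:  # commits on success, rolls back both inserts on error
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_once(conn, publish):
    """Publish pending outbox rows in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # stand-in for a broker client call
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because a row is marked as published only after the callback returns, a crash between the two steps leads to a repeat delivery, so consumers of this sketch would need to tolerate duplicates.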

Code Impact
Microservices for Agile Organisations
In today's fast-paced business environment, agility is key. This podcast explores how microservices empower organisations to achieve rapid development cycles, adapt quickly to changing requirements, and maintain a competitive edge. Join us as we discuss the principles, practices, and real-world experiences of adopting microservices for increased agility. Reference: https://microservices.io/patterns/decomposition/service-per-team.html
2024-10-20, 05 min

Code Impact
Microservices Patterns, Practices, and Pitfalls
Are you considering adopting microservices or looking to improve your existing architecture? This podcast is your one-stop shop for all things microservices. We discuss proven patterns, best practices, and common pitfalls to help you navigate the challenges of this architectural style. We'll cover topics such as service communication, data consistency, testing, deployment, and more, drawing on real-world experiences and industry insights. Reference: https://microservices.io/patterns/index.html
2024-10-19, 08 min

Code ImpactSeamless Scaling Without Slowing DownIn the world of distributed systems, performance is paramount. Discover how the 'Consistent Core' pattern allows large data clusters to coordinate efficiently by leveraging a smaller, dedicated cluster for critical functions. We'll unpack the problem of quorum-based algorithms in large clusters and explain how 'Consistent Core' offers a practical alternative. Reference: https://martinfowler.com/articles/patterns-of-distributed-systems/consistent-core.html 2024-10-1903 min

Code ImpactUnpacking the "Emergent Leader" PatternIn distributed systems, leadership can emerge organically without explicit elections. This podcast unpacks the "Emergent Leader" pattern, where the oldest node in a cluster naturally assumes the role of coordinator. We'll discuss how this approach simplifies coordination, enhances fault tolerance, and enables efficient decision-making in complex, decentralized environments. Tune in to understand the intricacies of this powerful pattern and its implications for modern system design. Reference: https://martinfowler.com/articles/patterns-of-distributed-systems/emergent-leader.html 2024-10-1903 min

Code ImpactTaming Time in Distributed SystemsImagine a world where your online shopping experience becomes a confusing mess because different parts of the system can't agree on what time it is! This is the challenge of external consistency in distributed systems, and it's the topic of today's episode. We'll explore the Clock-Bound Wait pattern – a clever solution that ensures all nodes in a system are on the same page, time-wise, before reading or writing data. This episode will be particularly interesting for software developers, architects and anyone curious about the hidden complexities of building reliable distributed systems.2024-10-1808 min

Code ImpactHow Netflix Handles Real-Time UpdatesEver wonder how Netflix manages to send real-time updates to a billion devices worldwide? This podcast explores the fascinating story of Pushy, Netflix's custom-built WebSocket server. We'll hear from the engineers who built and scaled Pushy, and learn about the technical challenges they faced along the way. We'll also explore the future of Pushy, and how it will continue to evolve to meet the needs of Netflix's growing user base. Reference: https://netflixtechblog.com/pushy-to-the-limit-evolving-netflixs-websocket-proxy-for-the-future-b468bc0ff658 2024-10-1811 min

Code ImpactHow Netflix Uses Traffic Replays to Build ConfidenceIn this episode, we explore the fascinating world of traffic replay at Netflix. Discover how this powerful technique allows engineers to simulate real-world scenarios, test new features at scale, and identify potential issues before they impact users. We'll examine the 'Basic with Ads' launch as a case study and discuss the broader applications of traffic replay in software development. Reference: https://netflixtechblog.com/ensuring-the-successful-launch-of-ads-on-netflix-f99490fdf1ba 2024-10-1808 min

Code ImpactThe Practical Guide to GraphQL Adoption in PayPalGraphQL has generated a lot of buzz, but what does it take to successfully adopt it within your organization? This podcast cuts through the hype and provides a practical roadmap for GraphQL implementation. Drawing on PayPal's experiences, we'll explore the key factors to consider, from assessing your company's needs and setting realistic expectations to building a solid foundation, scaling knowledge, and establishing design standards. We'll also address common challenges, such as performance optimization, error handling, authentication, and the importance of investing in the GraphQL community.2024-10-1808 min

Code ImpactUnderstanding the Gitflow WorkflowThe Gitflow workflow is a branching model for Git that can be used to manage the development of software projects. In this episode, we will discuss the different branches in the Gitflow workflow and how they are used. We will also provide an example of how to use the Gitflow workflow to develop a new feature. Reference: https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow 2024-10-1807 min

Code ImpactGeospatial Analytics at Scale with PrestoGeospatial analytics is becoming increasingly important as companies seek to gain insights from location data. This podcast will discuss how Presto can be used to perform geospatial queries efficiently. We will also discuss the QuadTree data structure, which is used to index geospatial data in Presto. Reference: https://scontent.fblr1-4.fna.fbcdn.net/v/t39.8562-6/280858273_398229132019114_3101674654302651_n.pdf?_nc_cat=108&ccb=1-7&_nc_sid=e280be&_nc_ohc=_DdBGwQNZnIQ7kNvgGlIBGq&_nc_zt=14&_nc_ht=scontent.fblr1-4.fna&_nc_gid=AOTWRBja0lnMACuRisrKSQr&oh=00_AYA6MfDxtmeCQTL_-bQrVSEPZbBBA8XqiDGs0TvV1m9nMw&oe=6715C2B6 2024-10-1618 min

Code ImpactFrom Likes to Less Servers: Facebook's Journey to MyRocksThis podcast explores Facebook's transition from using the InnoDB storage engine to MyRocks, an LSM-tree-based storage engine built on RocksDB, for its massive User Database (UDB). The podcast will examine the challenges Facebook faced with InnoDB, such as index fragmentation and write amplification, and how MyRocks helped overcome these obstacles. It will discuss the key features and optimisations of MyRocks, including prefix bloom filters, tombstone management and bulk loading, and how these innovations led to a 62.3% reduction in instance size and fewer database servers. Reference: Research paper "MyRocks: LSM-Tree Database Storage Engine Serving Facebook's...2024-10-1615 min

Code ImpactArchitecture of a Database System: Tech Deep DiveThis episode explores the intricate architecture of Database Management Systems (DBMSs), those mission-critical software systems that power our digital world. From the earliest online server systems to modern-day giants, we trace the evolution of DBMS design, examining key components such as process models, parallel architectures, storage systems, transaction systems, query processors, and optimizers. Get ready for an insightful journey into the heart of database systems! Reference: https://dsf.berkeley.edu/papers/fntdb07-architecture.pdf 2024-10-1618 min

Code ImpactWhat Every Programmer Should Know About Memory Join us as we explore the intricate world of CPU and memory architecture, uncovering the hidden performance bottlenecks that plague software developers. We'll break down complex topics like cache behaviour, virtual memory, NUMA systems, and more, empowering you to write faster and more efficient code. From understanding the impact of cache lines and TLB misses to leveraging advanced techniques like prefetching and helper threads, we'll equip you with the knowledge to optimise your code for maximum performance. Whether you're a seasoned developer or just starting out, this podcast will provide valuable insights and practical tips to elevate your coding skills...2024-10-1610 min

Code ImpactBeyond Zstandard: Unexpected Wins in Discord's Performance JourneyDiscord's pursuit of a more efficient platform led them down a path filled with unexpected discoveries. This episode goes beyond the implementation of zstandard compression to explore the surprising optimisation of passive sessions. We'll analyse how a keen eye for detail and a data-driven approach uncovered hidden opportunities for improvement, resulting in substantial bandwidth savings. Reference: https://discord.com/blog/how-discord-reduced-websocket-traffic-by-40-percent 2024-10-1509 min

Code ImpactFrom Monolith to Microservices: A Razorpay Tech Deep DiveGet ready for a technical deep dive into Razorpay's evolution from a monolith to a microservices architecture. Our guest, Arjun, a payments platform engineer at Razorpay, walks us through the pivotal decisions and engineering feats that enabled Razorpay to scale its systems and handle the massive transaction volumes of India's booming digital payments landscape. We'll explore the challenges of splitting databases, the importance of the outboxer pattern and CDC pipelines for maintaining data consistency, and the intricate testing strategies used to guarantee system reliability. This episode offers invaluable insights for engineers grappling with...2024-10-1512 min

Code ImpactApache Kafka, Flink and Pinot: Open Source Powering Uber's Real-Time Data StackBuilding and scaling a real-time data infrastructure is a complex undertaking, fraught with challenges and valuable lessons. This episode takes a deep dive into Uber's journey, exploring the hurdles they encountered while managing petabytes of real-time data. We'll discuss the need for data consistency, availability, and freshness, the complexities of handling diverse use cases and user groups, and the constant need for system evolution. Tune in to learn from Uber's experiences and gain insights into building robust and scalable real-time data infrastructures. References: https://arxiv.org/pdf/2104.00087 2024-10-1511 min

Code ImpactInstant Purge: How Cloudflare Makes Content Disappear in the Blink of an EyeImagine a world where online content updates instantly, everywhere. This podcast explores Cloudflare's groundbreaking "Instant Purge" system, which removes outdated content from its cache in under 150 milliseconds. Discover the innovative technologies and strategies behind this feat, and how they impact the internet experience for users worldwide. Reference: https://blog.cloudflare.com/instant-purge/ 2024-10-1510 min

Code ImpactUber's MySQL Upgrade: A Smooth Ride to Version 8.0Join us as we go behind the scenes with Uber engineers to learn how they upgraded their massive MySQL fleet to version 8.0 - without any downtime! We'll explore the challenges they faced, the solutions they developed, and the lessons they learned along the way. From performance boosts to enhanced security, discover how this major upgrade paved the way for a smoother, more efficient Uber experience. Reference: https://www.uber.com/en-IN/blog/upgrading-ubers-mysql-fleet/ 2024-10-1511 min

Code ImpactData Wranglers: Mastering the Art of Data Integration with AWS Glue In this podcast, we explore the intricate world of data integration and how AWS Glue empowers data professionals to conquer its complexities. Through conversations with experts and real-world case studies, we uncover the secrets to efficiently extracting, cleaning, enriching, and loading data using Glue's powerful suite of tools. Whether you're a seasoned data engineer or a curious data scientist, join us as we break down the barriers to seamless data integration and unlock the true potential of your data. Reference: https://drive.google.com/file/d/1CxK5bTV8ZQgNFe3TI582W9N8PyLa5...2024-10-1511 min

Code ImpactRevolutionising Cloud Data Warehouses: The Power of Predicate Caching Cloud data warehouses have become an essential part of data analytics, offering speed, scalability, and elasticity for processing massive datasets. However, traditional caching methods like result caching and materialized views often struggle with data updates and can have high overhead. This episode explores predicate caching, a novel indexing technique that enhances query performance by caching ranges of qualifying data, leading to significant speed improvements without the drawbacks of traditional methods. Tune in to discover how predicate caching leverages the repetitive nature of real-world workloads, minimises resource usage, and adapts to data updates seamlessly. Reference: https://drive.google.com/file...2024-10-1505 min

Code ImpactBeyond the Cloud: Databases for the Serverless EdgeExplore the cutting-edge world of serverless computing at the network edge and the challenges of integrating databases for lightning-fast data access. Join us as we unpack Limbo, a groundbreaking approach to re-architecting SQLite, the world's most popular database, for a serverless future. We'll discuss the limitations of traditional database architectures in serverless environments and how asynchronous I/O could unlock a new era of performance and scalability. Reference: https://drive.google.com/file/d/1nuURga5TdctAorXCRPOraIgiLwT0n6S-/view 2024-10-1505 min

Code ImpactScaling Up with Amazon RedshiftThis podcast explores how businesses can use Amazon Redshift to scale their data warehousing needs. Each episode will feature experts who will share their insights on topics such as intelligent scaling, performance tuning, and best practices for using Redshift. Reference: https://drive.google.com/file/d/1E7cb5Ttj21JvI3svJC0QnkhncycX2PS-/view 2024-10-1504 min

Code ImpactInside GFS: A Deep Dive into Google's Distributed File SystemThis podcast explores the Google File System (GFS), a groundbreaking distributed file system designed to handle Google's massive data processing needs. We'll examine the key design principles behind GFS, such as its focus on fault tolerance, scalability, and high aggregate performance. Join us as we discuss the challenges of building and maintaining such a system, and the innovative solutions that Google engineers have implemented. Reference: https://drive.google.com/file/d/1S_hYRcjdo7aR0ShXuIuK5ePQm2U0FEs2/view 2024-10-1510 min

Code ImpactBuilding Highly Available and Durable Applications with DynamoDBIn this podcast, we'll explore how Amazon DynamoDB ensures high availability and data durability for mission-critical applications. We'll discuss how the service replicates data across multiple availability zones, employs techniques like log replicas to maintain write quorums, and continuously verifies data at rest to prevent data loss. We'll also cover strategies for failure detection, deployment best practices, and managing dependencies on external services. Learn how to leverage DynamoDB's robust features to build resilient and reliable applications. Reference document: https://drive.google.com/file/d/1nA7iL9b_WLlQKhuzAV9...2024-10-1519 min

Code ImpactScaling for Dhoni: How JioCinema Streams IPL to MillionsThis podcast goes behind the scenes with Prachi Sharma, a senior engineering director at JioCinema, to explore the technical challenges of live streaming the Indian Premier League (IPL) to millions of concurrent viewers. Discover how JioCinema’s engineering team prepares for massive traffic spikes, especially when MS Dhoni is playing, by using strategies like pre-scaling, multi-CDN optimisation, and graceful degradation. Learn about the crucial role of feature flags in mitigating issues during live matches, as well as the importance of rigorous auditing and testing of both front-end and back-end systems. Gain insights into the complexities of ad insertion at...2024-10-1419 min

Code ImpactA Flexible Large-Scale Similar Product Identification System in E-commerceThis research paper explores the development of a "Product Similarity Service" (PSS) for identifying similar products within e-commerce platforms. The authors highlight the challenges of defining "similarity" across diverse applications and the need for scalable solutions to handle massive product datasets. PSS leverages deep neural networks, multi-task learning, and distributed computing techniques to address these challenges. The system employs a hybrid approach, integrating product content information (e.g., images, titles) with customer behaviour data (e.g., co-purchases, co-views), and provides flexible configuration options for different applications. Experimental results demonstrate the effectiveness of PSS...2024-10-1409 min

Code ImpactSQL Has Problems. We Can Fix Them: Pipe Syntax In SQLThis document discusses the limitations of SQL and proposes a solution in the form of pipe syntax. The document is titled "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL" and focuses on the idea that SQL, a language used for managing data in databases, has deficiencies. The authors advocate for using pipe syntax to improve SQL's capabilities, suggesting that it would address some of the perceived problems within the existing SQL framework. Reference: https://drive.google.com/file/d/1hmtVJJ5Lud3Z1GebHqZGW1V-8Oe0g3lN/view 2024-10-1403 min

Code ImpactHow AWS Lambda Tackles Fast Loading of Large Container ImagesThis paper describes how Amazon Web Services (AWS) Lambda addresses the challenge of loading large container images quickly to support serverless applications. AWS Lambda uses a system that combines several techniques, including caching, deduplication, convergent encryption, erasure coding, and block-level demand loading. The paper describes how these techniques work together to achieve high scalability and low cold-start times. It also covers some challenges faced in implementing the system, such as ensuring data security and handling metadata. Finally, it presents the system’s performance characteristics and the authors' experiences with deploying it in production. ...2024-10-1418 min

Code ImpactWeb Wins: Real-World Success Stories in DevelopmentEver wonder what makes a website truly great? In this episode, we’re sharing real stories from the trenches of web development. From cutting down loading times to making sites more accessible, we explore how companies solved tough challenges to create fast, user-friendly web experiences. Whether you’re a developer or just curious about what goes on behind the screens, these case studies will give you practical tips and inspiration to level up your web game. Join us for a peek into what it really takes to build something people love online.2024-10-1418 min

Sourcing Challenge ShowSourcing Challenge Weekly - A New Year Ahead - 5th January 2021Dov Zavadskis and Mark Lundgren talk about this week in the World of Talent Sourcing. Time Stamp: 02:51 Inclusive Sourcing on Social Media by Maisha Cannon: https://youtu.be/O6ToPxETKUk Watch or listen to Maisha's Sourcing Challenge Show Episode from 2018: https://sourcingchallenge.com/episode12 Time Stamp: 33:30 Google Dorks List and Updated Database in 2021 posted by Jan Tegze written by Sanket Makhija: https://twitter.com/jantegze/status/1345434554010652674 Watch or listen to Jan's Sourcing Challenge Show Episode from 2018: https://sourcingchallenge.com/episode7 Follow Dov on his new Instagram: https://www.instagram.com...2021-01-0646 min