Showing episodes and shows of

ThousandEyes

Shows

The Internet ReportThe Internet ReportBeyond "Break-Fix": Assuring Performance in the AI EraAs AI transforms IT infrastructure, it’s also reshaping what it takes for IT operations teams to assure performance and maintain quality digital experiences. In this episode, we’ll explore the new challenges facing ITOps teams as AI becomes more integrated into IT environments, covering key digital resilience strategies and important considerations.CHAPTERS00:00 Intro00:54 AI & IT Infrastructure03:15 Assuring Performance on Your AI Journey04:55 Distributed Architecture07:49 Digital Resilience10:21 Catching Issues in the AI Era11:47 Performance Problems13:57 AI Readiness: A Journey, Not a Destination15:07 Get in Touch———...2025-07-1915 minThe Internet ReportThe Internet ReportITOps Lessons From Outages at Google Cloud and OpenAIService delivery chains are often made up of a longer string of dependencies than you might expect. When an outage happens, the root cause might not be in your systems or even with a third-party provider you depend on. It could actually trace back to yet another third-party provider they rely on.We saw this phenomenon in action recently when some Cloudflare services were affected by an outage ultimately caused by Google Cloud issues.Tune in to hear more about what happened at Google Cloud and Cloudflare, and also explore takeaways from a recent OpenAI...2025-06-2617 minThe Internet ReportThe Internet ReportLevel Up Your Cloud Monitoring With These TipsCloud monitoring requires holistic, end-to-end visibility across complex, interconnected environments rather than isolated metrics. 
Here are best practices CloudOps teams should keep in mind.———CHAPTERS00:00 Intro00:47 What CloudOps Should Focus On13:14 Decision-making16:25 AI17:53 Get in Touch———For additional insights, check out the links below:- The Ultimate Cloud Migration Survival Kit: https://www.thousandeyes.com/resources/the-ultimate-cloud-migration-survival-kit?utm_source=transistor&utm_medium=referral&utm_campaign=fy25q4_internetreport_q4fy25ep3_podcast- Cloud ROI: How To Measure Your Migration’s Impact: https://www.th...2025-06-0618 minThe Internet ReportThe Internet ReportDecoding Stealth Outages: Strategies for Digital ResilienceJust because an outage is subtle, doesn’t mean it’s harmless. Learn how to catch those pesky “stealth outages” that can so easily slip under the radar, and also unpack recent service disruptions at Slack, Microsoft 365, and X.CHAPTERS00:00 Intro00:56 Slack08:16 Microsoft 36511:22 X13:26 Outage Trends: By the Numbers16:26 Get in Touch———For additional insights, check out the links below:- The Five Phases of Internet Outage Recovery: https://www.thousandeyes.com/resources/five-phases-internet-outage-recovery-infographic?utm_source=transistor&utm_medium=referral&utm_campaign=fy25q4_internetreport...2025-05-2417 minThe Internet ReportThe Internet ReportA Story of Scale: Geoff Huston on the Evolution of Network ArchitectureJourney through the evolution of network architecture and explore what the future might hold in this conversation with APNIC’s Chief Scientist Geoff Huston.Geoff and The Internet Report team will cover how the Internet has transformed significantly over the past four decades, scaling to meet rapidly growing demand. 
And they’ll also discuss how the challenge to “scale still more” continues today as the networking community evolves infrastructure to support emerging technologies like artificial intelligence (AI).CHAPTERS00:00 Intro00:11 Meet Geoff Huston02:56 The Shift to Asymmetry10:58 The Challenge of Scale22:07 Moore's...2025-05-1045 minThe Internet ReportThe Internet ReportTroubleshooting Tips & Outages at Zoom, Spotify & MoreDive into recent service disruptions at Zoom, Spotify, SAP Concur, and Vanguard UK, and explore what they reveal about troubleshooting best practices for ITOps teams.Tune in now for insights from The Internet Report team or use the chapters below to jump to the sections that most interest you.CHAPTERS:00:00 Intro00:52 Zoom Outage04:40 SAP Concur Disruption 07:28 Spotify Outage10:58 Vanguard Outage13:59 By the Numbers16:01 Get in Touch———For additional insights, check out the links below:- The Internet Report’s latest blog: https://www.thous...2025-04-2616 minThe Internet ReportThe Internet ReportWhy Even 1% Packet Loss Can Impact User ExperiencesPacket loss can be bad news for network flows and customer experience. However, in our experience, NetOps teams tend to focus on major spikes in packet loss, while overlooking smaller amounts like 1 or 2%.This might be a mistake. 
Tune in for a deep dive into research findings suggesting that even 1% packet loss can significantly impact user experience—and recommendations for steps NetOps teams should take as a result.———CHAPTERS00:00 Intro01:07 The Surprising Impact of 1% Packet Loss02:50 Research Methodology08:17 Key Findings13:55 Recommendations for NetOps Teams16:48 Additional Research2025-04-1123 minThe Internet ReportThe Internet ReportUnderstanding Service Disruptions at X, Workday & MastercardGo under the hood of recent service disruptions at X, Workday, and Mastercard—and explore why it’s so important to quickly (and accurately) identify the root cause of an outage.———CHAPTERS00:00 Intro00:59 X Outage07:08 Workday Outage11:00 Mastercard Service Disruption14:48 By the Numbers16:05 Get in Touch———For additional insights, check out The Internet Report’s latest blog: https://www.thousandeyes.com/blog/internet-report-service-disruptions-x-workday-mastercard?utm_source=transistor&utm_medium=referral&utm_campaign=fy25q3_internetreport_q3fy25ep4_podcastAnd to learn more about how to deliver sea...2025-03-2216 minThe Internet ReportThe Internet ReportUnpacking the Slack Outage & Other Backend IssuesDive into the recent Slack outage and disruptions at Microsoft 365, Grafana Cloud, and Otter.ai—plus, explore key takeaways for ITOps teams.———CHAPTERS:00:00 Intro00:48 Slack Outage06:55 Microsoft 365 Outage11:44 A Pair of Otter.ai Outages14:21 Grafana Cloud Disruption15:55 By the Numbers17:58 Get in Touch———To learn more about how to deliver seamless digital experiences in a distributed IT landscape, read this eBook:  https://www.thousandeyes.com/resources/guide-to-next-generation-assurance-ebook?utm_source=transistor&utm_medium=referral&utm_campaign=fy25q3_internetreport_q3fy25ep3_podcast ———Want to get in to...2025-03-0818 minThe Internet ReportThe Internet ReportConfiguration Mishaps Strike Again: Asana Outages & More NewsOutages connected to configuration mishaps were a common theme last 
year, and we’ve continued to see incidents like these in 2025. Configuration changes triggered two consecutive Asana outages in early February, and configuration or update-related issues may also have contributed to recent disruptions at Barclays, ChatGPT, Jira, and Discord. Tune in to hear The Internet Report’s Mike Hicks unpack these incidents and discuss ways ITOps teams can guard against similar issues.———CHAPTERS:00:00 Intro01:06 Asana Outages11:40 ChatGPT Disruption19:34 Barclays Outage21:57 Jira Outage22:59 Discord Outage24:31 By the Numbers2025-03-0131 minThe Internet ReportThe Internet ReportThe Show Must Go On: ITOps Lessons From the Events IndustryWhat does it take to deliver successful digital experiences at major events like concerts and conferences? With special guest Dominic Hampton—Managing Director at attend2IT—we’ll explore the dynamic world of event IT and key takeaways ITOps teams at enterprise companies can apply to their own events as well as in their day-to-day operations. We’ll also discuss insights from recent incidents that impacted Azure, Microsoft 365, and more.CHAPTERS00:00 Intro01:34 Behind the Scenes of Event IT: Lessons for Enterprise ITOps22:42 Microsoft Azure Incident24:15 Microsoft 365 Disruption25:31 Atlassian Bitbucket Cloud Outage2025-01-3131 minThe Internet ReportThe Internet ReportConfiguration Change Trouble & Other 2024 Outage TrendsConfiguration changes played an outsized role in 2024 outages. 
Tune in to hear more about this and other outage trends—and learn how ITOps teams should plan accordingly in the year ahead.We’ll also share insights from recent incidents at OpenAI and Google Cloud’s Pub/Sub, and dive deeper into a degradation incident that Netflix experienced at the end of 2024.Read on to learn more, or use the chapters below to jump to the sections that most interest you.CHAPTERS00:00 Intro00:58 Cloud Service Provider (CSP) Outages Continue To Rise 01:52 Accidental Misconf...2025-01-1821 minThe Internet ReportThe Internet Report2024 Outage Trends Solidify; Plus OpenAI & Meta OutagesWith nearly a year of data available, the topline outage trends for 2024 are coming into focus. Tune in to see what the numbers are showing.The Internet Report team will discuss how Internet service provider (ISP) outage numbers are continuing to increase, while cloud service provider (CSP) outages are also becoming more frequent, indicating a changing landscape in service reliability. They’ll also unpack the recent OpenAI and Meta outages.———CHAPTERS:00:00 Intro00:49 Outage Trends Across 202407:37 OpenAI Outage13:10 Meta Outage18:48 Get in Touch ———For additional insights, check...2024-12-2119 minThe Internet ReportThe Internet ReportDigitalOcean, Reddit Outages & Worldline’s IT PerturbationsThe past few weeks are somewhat of a representative sample of 2024 from an outage perspective, with connectivity issues and updates at the root of the four recent incidents.Both DigitalOcean and real-time payments provider Worldline experienced connectivity issues to data centers that made services unreachable. Meanwhile, Microsoft and Reddit encountered problems following changes to their systems that appeared to have unexpected user impacts and had to be rolled back. 
Tune in to hear The Internet Report team unpack these incidents and discuss the latest outage trends.———CHAPTERS:00:00 Intro00:50 Reddit Server...2024-12-1415 minThe Internet ReportThe Internet ReportTalking Proactive Optimization, ChatGPT Issues & MorePowerful things happen when ITOps teams move beyond a break-fix approach and lean into proactive optimization. Instead of just responding to issues as they occur, when teams have independent visibility into their end-to-end service delivery chain, they can proactively identify possible areas for optimization and improvement. For example, streamlining one small part of a complex process could shave seconds off the total transaction time; do this for every part of the process, and the efficiency savings can quickly add up.In recent weeks, it appeared OpenAI’s ChatGPT may have been undergoing this type of opt...2024-11-2719 minThe Internet ReportThe Internet ReportDORA & ITOps Best Practices; Plus BMO, Google OutagesThe Digital Operational Resilience Act (DORA) goes into effect on January 17, 2025, and financial institutions serving the EU will need to meet an enhanced set of requirements related to risk management, network resilience, and incident reporting.While DORA is directly applicable to EU financial institutions, it prompts important discussions about resilience and ensuring digital experiences that are relevant for all IT operations teams, regardless of industry or region.Tune in to the podcast to hear The Internet Report team and special guest Bernie Clairmont, Product Solution Architect at ThousandEyes, dive deeper into DORA.As...2024-11-0830 minThe Internet ReportThe Internet ReportLet’s Talk Status Pages & Salesforce, Microsoft OutagesA recent Salesforce outage highlighted the limitations of status pages and the importance of considering a variety of data points when identifying the source of an outage.Tune in to hear The Internet Report team discuss what happened and why. 
They’ll also share insights from a recent Microsoft Outlook outage and cover the latest Internet outage trends.Listen now or use the chapters below to jump to the sections that most interest you.CHAPTERS00:00 Intro00:48 Salesforce Outage10:00 Microsoft Outlook Outage14:43 Outage Trends: By the Numbers17:22 Get in To...2024-10-2518 minThe Internet ReportThe Internet ReportServiceNow, Microsoft & Workday Outages, ExplainedA recent certificate problem impacted ServiceNow, and other issues prevented users from accessing key cloud services including Microsoft 365, Azure Virtual Desktop, and Workday.Tune in to hear what happened during these incidents and a separate data center fire that caused a Reliance Jio outage for customers across multiple areas of India.Listen now or use the chapters below to jump to the sections that most interest you.CHAPTERS00:00 Intro00:59 ServiceNow Outage03:20 Microsoft 365 Outage04:35 Azure Virtual Desktop Outage05:50 Workday Outage09:39 Reliance Jio Outage13:06 Outage Trends: By the...2024-10-0515 minCisco Champion RadioCisco Champion RadioS11|E20 Digital Experience Assurance with Cisco ThousandEyes: Cloud Insights & IntegrationBy combining the power of Cisco Networking Cloud, ThousandEyes Digital Experience Assurance, and Cisco's unmatched dataset, new innovations unlock proactive insights and automated operations across customers' entire digital ecosystem. Marko Tisler, Group Product Manager with Cisco ThousandEyes, and the Cisco Champions will discuss new digital experience assurance capabilities announced on June 4 at Cisco Live, as well as dig deeper into the now generally available Cisco Secure Access Experience Insights, powered by ThousandEyes. By integrating ThousandEyes with Secure Access, IT and security teams now have full visibility into key components of the digital experience: device, network, and application. 
This means that...2024-10-0200 minCisco Champion RadioCisco Champion RadioS11|E20 Digital Experience Assurance with Cisco ThousandEyes: Cloud Insights & IntegrationBy combining the power of Cisco Networking Cloud, ThousandEyes Digital Experience Assurance, and Cisco's unmatched dataset, new innovations unlock proactive insights and automated operations across customers' entire digital ecosystem. Marko Tisler, Group Product Manager with Cisco ThousandEyes, and the Cisco Champions will discuss new digital experience assurance capabilities announced on June 4 at Cisco Live, as well as dig deeper into the now generally available Cisco Secure Access Experience Insights, powered by ThousandEyes. By integrating ThousandEyes with Secure Access, IT and security teams now have full visibility into key components of the digital experience: device, network, and application. 
This means that...2024-09-2433 minThe Internet ReportThe Internet ReportManaging Traffic During Peak Demand; Plus, Microsoft, Akamai OutagesDuring high-traffic seasons like Black Friday or a much-anticipated product launch, maintaining good digital experiences for customers is vital. We’ve all heard tales of floods of eager shoppers crashing a website during a major sale—leaving them unable to make their coveted purchases. To guard against a breakdown like this during high-traffic periods, companies sometimes use various traffic management strategies such as digital waiting rooms.In this episode, The Internet Report team discusses the pros and cons of traffic management and looks at the different techniques used by ticketing platforms for the upcoming Oasis reunion tour conc...2024-09-2119 minThe Internet ReportThe Internet ReportThe Current Subsea Cable Ecosystem: Resiliency & What’s NextLet’s dive into the fascinating world of subsea cables. With special guest Murray Burling—Executive Director of Oceans and Environment at RPS—we’ll explore the current subsea cable ecosystem and chat about what the future might hold.Tune in for insights on how important subsea cables are for today’s digital experiences, how decisions are made on where to place them, the consequences of cable cuts, and route diversity and Internet resilience.CHAPTERS00:00 Intro02:29 Current Subsea Cable Ecosystem07:16 Subsea Cable Cuts15:15 Route Diversity & Internet Resilience18:51 What’s Next22:05 Get...2024-09-0722 minThe Internet ReportThe Internet ReportAnalyzing X’s Livestream & GitHub, Google OutagesExplore the recent Google Cloud and GitHub outages, plus get insights from a network perspective into the August 12 X livestream event featuring Elon Musk and Donald Trump.In the case of Google Cloud, a power issue in one of its European regions impacted connectivity and affected several services and networking equipment. 
The problems disrupted connectivity into the region as well as some Partner Interconnect connections and associated routes between other Google regions.Traffic to and from GitHub.com encountered an issue when a database configuration change resulted in critical services unexpectedly losing connectivity....2024-08-2416 minThe Internet ReportThe Internet ReportWhy NetOps Is the Real MVP of the Sports WorldThis week, The Internet Report team and special guest Dave Anderson—a tech industry veteran and co-host of "A Very Melbourne Podcast," which covers the Australian Football League and more—are chatting about how to assure great digital experiences at major sporting events.Large sporting events are always logistically complex, and today that’s even more the case with digital technology permeating every part of operational and experience delivery. And due to the real-time nature of live sports, any glitch can have a big impact on fan experiences—whether they’re at the stadium or joining in from their...2024-08-1033 minThe Internet ReportThe Internet ReportUnpacking the CrowdStrike Update, Azure Outage, & MoreOn July 19, many organizations around the globe—including airlines, banks, and hospitals—experienced outages as Windows machines reportedly got stuck in a boot loop that ultimately resulted in the Blue Screen of Death (BSOD). These disruptions had a common source: an update from CrowdStrike, a managed detection and response (MDR) service used to protect Windows endpoints from attack. Tune in to hear The Internet Report team’s insights on this CrowdStrike update and the ensuing IT outages. We’ll also dive into the separate Azure outage that occurred just hours before, as well as some other rece...2024-07-2717 minThe Internet ReportThe Internet ReportTwitter to X: Charting Performance and OutagesOn May 17, X reached a major milestone when the social media platform completed its full migration from twitter.com to x.com. 
While the number and frequency of outages did increase after the company’s acquisition by Elon Musk, following the domain migration, there don’t appear to have been any significant disruptions to the X.com platform. In this week’s podcast, The Internet Report team discusses what they observed during (and after) the domain migration, and analyzes X’s performance pre- and post-acquisition. ——— CHAPTERS:00:00 Intro00:55 Performance in the Twitter Era05:13 The Sale...2024-07-1718 minThe Internet ReportThe Internet ReportInsights From Outages at Starlink, Schwab & Internet ArchiveThree recent outages at Starlink, Charles Schwab, and the Internet Archive highlight key reminders for NetOps teams around backup options, the role of intelligence, and understanding your end-to-end service delivery chain.A subset of Starlink users were unable to establish a connection; some users of Schwab.com and its apps may have found themselves unable to transact or trade due to an authentication issue; and the Internet Archive and the Wayback Machine were intermittently overwhelmed by unexpected traffic floods.Tune in to learn more about what happened and why, or use the chapters below to...2024-06-2216 minThe Internet ReportThe Internet ReportCloud Outages Rise & Other H1 2024 Internet Outage TrendsBelieve it or not, we’re already about halfway through 2024. Looking at the outage data from this year so far, we see continued evolution, following patterns observed over the past few years. 
Notably, the percentage of cloud service provider (CSP) outages is still increasing—though at a more accelerated rate than seen in recent years. Tune in to learn more about this trend and other themes we’re noticing in the Internet ecosystem, as well as tips for how IT teams can respond to these evolving challenges.———CHAPTERS:00:00 Intro01:14 Cloud Inciden...2024-06-1521 minThe Internet ReportThe Internet ReportMeta and Salesforce Tackle Intermittent IssuesWhen it comes to assuring great digital experiences for your users, intermittent issues can be incredibly difficult to discover and diagnose because the service is both working and not working simultaneously—or, it may simply be running slow. Some users may experience issues, while for others, everything will work just fine. In this week’s episode, The Internet Report team will explore the complexities that intermittent issues can bring by examining two recent incidents at Meta and Salesforce. They’ll also cover an automation bug at Google Cloud that caused problems for a range of custome...2024-05-2517 minThe Internet ReportThe Internet ReportOutages at X, google.com, and jsDelivr + Why Details MatterExplore what happened during recent outages at google.com, X (formerly Twitter), and CDN service jsDelivr. The Internet Report team will also discuss why a detailed understanding of every component in your service delivery chain is vital to maintain the availability and resiliency of your service. If even one component encounters challenges, the entire service can be impacted. In jsDelivr’s case, for example, the detail at issue was an expired cert, which created problems serving content and impacted many websites that rely on the CDN service.Listen now to learn more or use...2024-05-1118 minThe Internet ReportThe Internet ReportInside the ChatGPT Outage & More News | Pulse UpdateGo under the hood of a ChatGPT outage, H&R Block’s Tax Day disruption, and more incidents from the past few weeks. 
The Internet Report team will also discuss Microsoft’s update on recent subsea cable cuts and the latest global outage trends.———CHAPTERS:00:00 Intro00:57 ChatGPT Outage03:35 Revisiting West Coast of Africa Cable Cuts09:07 H&R Block Outage11:32 Sky Mobile Outage12:25 Outage on unpkg CDN14:06 PlayHQ Outage16:40 Outage Trends: By the Numbers19:33 Get in Touch ———For more insights, check out the links below:2024-04-2720 minThe Internet ReportThe Internet ReportWhatsApp & Apple Outages; Plus ITOps Tax Day Survival TipsWith tax season coming to a close in the United States, IT teams at tax preparation companies and other organizations in the industry will be taking extra care to make sure that their systems can handle a spike in traffic due to a potential last-minute rush of filings. Tune in to hear The Internet Report hosts discuss how IT teams can navigate major spikes in demand and give customers the best possible digital experience, whether it’s Tax Day, Black Friday, or another high-traffic period.They’ll also unpack recent outages at companies including Apple and W...2024-04-1327 minThe Internet ReportThe Internet ReportHow Third-party Issues Led to McDonald’s, DMV Outages | Pulse UpdateThe end-to-end delivery of modern digital services can introduce a complex web of dependencies and failure points, which can stem from direct relationships as well as third-party providers, introducing layers of abstraction for operations teams to keep track of. Managing this complex ecosystem can be challenging. Unexpected issues may arise from seemingly insignificant components, surprising even the largest, most technologically sophisticated organizations.For example, in recent weeks, problems at third-party providers led to outages at McDonald’s and the Department of Motor Vehicles. 
Tune in to the episode to hear what happened and explore other inc...2024-03-3017 minThe Internet ReportThe Internet ReportMeta, LinkedIn, and Comcast Outages, Explained | Pulse UpdateOver a two-day period this past week, major social media platforms—Meta’s Facebook and Instagram, LinkedIn, and Discord—all experienced disruptions. In the same timeframe, Comcast was also impacted by an outage that affected access to specific services and applications.Meta experienced issues with its log-in process, Discord navigated unexpectedly high load volumes, Comcast dealt with 100% packet loss in part of its backbone, and—the following day—LinkedIn worked its way through a backend issue.These incidents each leave valuable reminders for NetOps teams as they seek to minimize downtime and assure exceptional digital experience...2024-03-1617 minThe Internet ReportThe Internet ReportAT&T Outage and Disruptions at Google Cloud, Front, and More | Pulse UpdateLoad is a fundamental but, at times, challenging variable for networks and operations teams to handle. In the past few weeks, ThousandEyes saw various load-related problems impact organizations including Google Cloud, Front, several Australian banks, and Minnesota State University Moorhead.Tune in to learn more about what happened during these incidents, as well as hear our commentary on the recent outage impacting AT&T. Use the timestamps below to jump to the sections that most interest you: CHAPTERS:00:00 Intro00:59 AT&T outage impacts cellular services nationwide04:40 Australian banks appear to lose online a...2024-03-0416 minThe Internet ReportThe Internet ReportSquare Outage, Data Center Issues & Planning for Resiliency | Pulse UpdateWhen outages happen, it’s what you do next that matters. 
It’s important to have a backup plan in place that you can quickly activate to minimize the impact of an incident.Over the past two weeks, companies initiated a range of resiliency actions, including asking customers to use alternate authentication methods (or to avoid logging out of a service), setting up a new contact center to re-establish lines of communication, and reverting to manual processes.Tune in to learn more about what happened during these and other recent incidents.CHAPTERS:00:00 Intr...2024-02-1717 minThe Internet ReportThe Internet ReportSecurity, Great Digital Experiences & Why Visibility MattersThe ThousandEyes Internet Intelligence team joins us from Cisco Live in Amsterdam, talking about a major theme from the event—security.Tune in to hear their thoughts on how visibility can help companies in their security efforts, the sovereignty of data in flight, and why you don’t have to choose between security and performance.———CHAPTERS00:00 Intro01:09 Evolving Security Landscape04:53 Security Excellence & Optimal Digital Experience10:13 Sovereignty of Data in Flight14:57 Key Takeaways15:55 Get in Touch———Want to get in touch?If you have questio...2024-02-1016 minThe Internet ReportThe Internet ReportUnderstanding the Microsoft Teams & Azure Disruptions | Pulse UpdateWhat happened during the recent Microsoft Teams and Azure disruptions? 
Go under the hood of these incidents and also explore other recent disruptions in this week’s Pulse Update.CHAPTERS- 01:03 Network issue leads to Microsoft Teams service disruption- 04:09 Azure Resource Manager exhausts capacity, causing service issues- 06:20 Oracle Cloud experiences network outage- 09:56 Jira users encounter 503s and other errors- 10:30 Sage outage impacts South Africa- 11:08 Red Hat experiences four search-related incidents- 11:45 Recent outage trends and numbersFor more insights on outage trends and an...2024-02-0316 minCisco Podcast NetworkCisco Podcast NetworkIkusi Intelligence with ThousandEyesIn this episode, host Brandon Copeland is joined by ThousandEyes Partner Ikusi. Ikusi Product Manager Diana Mendez Hernandez and ThousandEyes Channel Account Manager Alfonso Valencia discuss the creation of Ikusi Intelligence, how ThousandEyes has been integrated into their solution, and the value it has brought to their customers in the financial sector. Find out more about Ikusi here: https://www.ikusi.com/mx/servicios/full-visibility-with-thousandeyes/ Subscribe to our Partner Newsletter: https://www.thousandeyes.com/partners/newsletter/ Join our ThousandEyes Partners LinkedIn Group: https://www.linkedin.com/groups/12831038/2024-01-2410 minThe Internet ReportThe Internet ReportUnpacking Recent ChatGPT Issues & Other Outage News | Pulse UpdateWhat caused recent dips in performance for OpenAI’s ChatGPT? 
Tune in to hear The Internet Report team unpack this and other recent disruptions, including a hack that led to an outage at the Spanish branch of the Orange mobile network, and a blip for customers of the cloud services provider DigitalOcean.They’ll also cover the outage trends they’re seeing in 2024 so far and how extreme cold weather can cause problems for data centers.For more insights on outage trends and analysis of some of the most notable outages of 2023, register for the upcomi...2024-01-2024 minThe Internet ReportThe Internet Report2023 Internet Outage Trends & the New Outage Landscape | Pulse UpdateAs they launch into 2024, organizations are facing a different outage landscape than they had at the start of 2023. The past year saw increases in cloud service provider (CSP) outages, application outages, and the percentage of U.S.-centric outages—all of which point to an evolution in the way outages happen and the need for different strategies to minimize the impact of disruptions.In this episode, Mike Hicks (Principal Solutions Analyst at ThousandEyes) unpacks these trends and shares practical tips for mitigating disruptions and optimizing performance. Listen on YouTube or tune in on your favorite podcast platform....2024-01-1311 minThe Internet ReportThe Internet ReportInsights From the Ghosts of NetOps Past, Present, and FutureAs 2023 comes to a close, in the spirit of Dickens’ holiday classic “A Christmas Carol,” let’s reflect on the valuable insights left by the ghosts of network operations teams past, present, and yet to come. 
Tune in to hear host Mike Hicks (Principal Solutions Analyst at ThousandEyes) discuss lessons from the NetOps teams of the past, the current state of NetOps, and what the future might hold—all with the goal of helping teams take steps to optimize performance and deliver delightful digital experiences in 2024.And also check out Mike’s related article in TechRadar: ht...2023-12-2221 minThe Internet ReportThe Internet ReportPeering Issues, Internet Resilience, and Cloud Outage News | Pulse UpdateRecent changes appeared to trigger a series of events for two peering points internationally—with very different impacts. Tune in to learn more about these incidents, why they differed, and the lessons they leave.Mike Hicks, Principal Solutions Analyst at ThousandEyes, will also cover the latest outage numbers and explore other recent incidents, including an Oracle Cloud outage and a duo of disruptions at Alibaba Cloud.Interested in more outage analysis? Check out our Internet Outages Timeline, which covers several notable Internet outages and application issues from the past year, along with the lessons they le...2023-12-1313 minThe Internet ReportThe Internet ReportScaling To Meet the Black Friday Demand: Tips for IT TeamsAs companies gear up for Black Friday, The Internet Report team shares some best practices for delivering great customer experiences and minimizing downtime during one of the retail industry’s biggest days of the year. 
Mike Hicks, Principal Solutions Analyst at ThousandEyes, will cover some helpful case studies of Black Fridays that experienced some hiccups and what you can do to guard against similar disruptions.To learn more, check out the link below: - https://www.thousandeyes.com/blog/internet-report-episode-54-black-friday-2023———Want to get in touch?If you have q...2023-11-2215 minThe Internet ReportThe Internet ReportUnderstanding the Recent Workday and Cloudflare Outages | Pulse UpdateBackend-related incidents have been a recurring theme in outages across 2023, caused by everything from data center issues and hardware mishaps to failures at common (shared) services.Recently, we saw two examples of these backend issues when data center power problems led to outages at both Cloudflare and Workday.Tune in to hear more about what happened at Cloudflare and Workday, as well as our analysis of disruptions at OneLogin and GitLab.———CHAPTERS00:00 Intro01:00 OneLogin Disruption05:22 GitLab.com Availability Issues09:14 Workday and Cloudflare Outages31:16 Get in Touch—...2023-11-1432 minThe Internet ReportThe Internet ReportHalloween Special: Ghosts of Outages PastThis Halloween, The Internet Report team is sharing some of their most thrilling (and chilling) networking tales.Pull up a chair (and a big bowl of your favorite Halloween candy) to hear what happened—and important lessons learned.———CHAPTERS00:00 Intro01:40 Haunting obstacles with a dynamic routing protocol that thwarted crew changes on an oil platform10:00 A spooky code base rollout that unleashed memory leak mischief18:58 A chilling application rollout that failed to deliver on user expectations around the globe29:45 Mysterious application issues that sent shivers...2023-11-0143 minThe Internet ReportThe Internet ReportInsights From Outages at Citibank, DBS, and Other News | Pulse UpdateIn recent weeks, back-end infrastructure work and other backend-related issues impacted various online and consumer banking services, 
including DBS and Citibank in Singapore.Simple front-facing customer experiences that we’ve become accustomed to today can often mask considerable complexity on the backend. The service delivery chain of technologies powering the front end often comprises a mix of on-premises assets, cloud services, containers, and APIs.A degradation or outage to just one of those components can have massive impact. Depending on the architecture of the app and resilience of the backend, an incident in one part ca...2023-10-3024 minThe Internet ReportThe Internet ReportTalking Data Freshness + Slack, Cloudflare, and Google Outages | Pulse UpdateOutages and degradations can happen when underlying data isn’t fresh enough. In recent weeks, stale data may have contributed to incidents at both Slack and Cloudflare. Slack began experiencing issues when, by our best guess, its app stopped trusting the freshness of the data in the cache; and, separately, Cloudflare’s 1.1.1.1 DNS resolver ran into some issues related to stale root zone data.Watch this Pulse Update episode to hear more about the Cloudflare and Slack outages, and also explore recent disruptions at Google.For more insights, check out these links:- E...2023-10-1830 minThe Internet ReportThe Internet ReportInternet Outages: Why One Small Link Can Break the Whole Chain | Pulse UpdateProviding great digital experiences relies on a complex service delivery chain. The past few weeks brought multiple reminders that the root cause of cloud and app disruptions often comes down to one single link in this chain. While the component at issue may appear small, if it’s not functioning normally, the consequences can be significant. 
Additionally, the impact of a malfunctioning “link” is often intensified by a lack of understanding or visibility into the entire end-to-end service delivery chain, especially in situations where a change is made outside standard operating procedures or pipelines.In this ep...2023-10-0222 minThe Internet ReportThe Internet ReportData Center Disruptions, Square Down, and More News | Pulse UpdateIn a world that operates at “hyperscale,” the potential for hyperscale-sized problems is also very real. The measure of a good provider—and a well-engineered system—is how well they handle these anomalous conditions and minimize disruption.During recent weeks, some of these hyperscale-sized outages hit, including data center-focused disruptions that impacted companies like Square, Oracle OCI, NetSuite, and Microsoft Azure. Tune into this Pulse Update episode to go under the hood of these outages and discover how the companies responded—and important lessons learned.For more insights, check out these links:- The Int...2023-09-1633 minThe Internet ReportThe Internet ReportDisruptions at Slack and X + Thoughts on “Take Twos” | Pulse UpdateAn outage occurs, a change is rolled back, and everything stabilizes. But what happens when the change is attempted a second time?These second tries often go much more smoothly. While another outage might still occur during this “take two,” the impact is usually far less severe. The engineering team has learned from what went wrong the first time and is ready to stop at the first hint of trouble. Slack recently experienced a pair of disruptions that appear to illustrate this “take two” scenario: a longer disruption resulting from a routine database cluster migration, followed...2023-09-0221 minThe Internet ReportThe Internet ReportAn August Slack Outage and Why Context Matters | Pulse UpdateContext matters when working on a distributed web-based application or service where everything is linked and dependent on each part functioning correctly. 
It’s all too easy for one team to make a change that unexpectedly affects something another team is working on. Or the combined impact of both changes may also accidentally break something.To avoid such mishaps, teams should cut back on silos as much as possible.However, it’s hard to completely eliminate siloed operations or decision-making. But the potential negative effects of silos can be reduced if each team has a...2023-08-2134 minThe Internet ReportThe Internet ReportSharePoint Outage and Security Certificate Considerations | Pulse UpdateIn an end-to-end service delivery chain, isolated changes can have broad consequences. This played out recently when an erroneous SSL certificate change at Microsoft appeared to cause a SharePoint Online and OneDrive for Business outage.While this incident definitely underscores the importance of valid security certificates, it’s also a reminder of what can happen when even one component in an end-to-end service delivery chain experiences issues. Every component needs to work in sync to maintain the service’s availability. As a result, all changes, especially manual ones, should be made with care and teams shoul...2023-08-0526 minThe Internet ReportThe Internet ReportAzure Disruption, Meta App Issues, and Navigating Edge Cases | Pulse UpdateLet’s face it. Not every contingency can be planned for. 
Sometimes an outlier scenario pops up and causes an unexpected outage or disruption.Over the past few weeks, multiple companies appeared to be impacted by such edge cases: Azure; GitLab; and Meta’s WhatsApp, Facebook, Instagram, and Threads—its newest addition.Tune into the latest Pulse Update episode to learn more about what happened during these disruptions and why robust visibility is so important for navigating unexpected outlier scenarios.And for more insights, check out these links:- Internet Report: Pulse...2023-07-2119 minThe Internet ReportThe Internet ReportA Front Door, But No House: Explaining Application Outages | Pulse UpdateThe application opens, but users encounter errors when they try to do anything—what gives? It’s the curious case of the disappearing backend. Discover why application issues often show up like this, with the service reachable but unresponsive beyond rendering a basic landing page, and sometimes an accompanying error message.In this episode, hosts Mike Hicks and Brian Tobia discuss this common problem and explore related incidents at CBA, GitHub, and Microsoft Teams. They also unpack other recent outage trends and disruptions, including the UK emergency services outage.To learn more, check out t...2023-07-1017 minThe Internet ReportThe Internet ReportApplication Outages Up in 2023—What to Know | Pulse UpdateThough network outages are still far more common, application outages seem to be increasing in 2023—and having bigger impacts. Tune in to learn more about this trend and dive into incidents at Okta and Instagram. 
Host Mike Hicks will also explore other outage trends from the first half of the year in this special episode reflecting on the state of the Internet in 2023 thus far.To learn more, check out these links:- Internet Report: Pulse Update Blog: https://www.thousandeyes.com/blog/internet-report-pulse-update-application-outages-increasing?utm_source=transistor&utm_medium=referral&utm_campaign=InternetReportPulseEp132023-06-2921 minThe Internet ReportThe Internet ReportIs Spring Cleaning Causing an Outage Spike? | Pulse UpdateFor three consecutive years, there appears to have been a spike in outages and degradations in May. A potential “spring cleaning effect” may explain why. Tune in to learn more about this possible trend and explore what happened during recent incidents at Twitter; Microsoft 365; Slack; Instagram; Apple’s iMessage; and subscription-based streaming service, Max (formerly known as HBO Max).After watching, check out these links to dive deeper: Internet Report: Pulse Update Blog: https://www.thousandeyes.com/blog/internet-report-pulse-update-spring-cleaning-outage-spike?utm_source=transistor&utm_medium=referral&utm_campaign=InternetReportPulseEp12Explore the Instagram outage i...2023-06-1027 minThe Internet ReportThe Internet ReportHow Outages Can Impact Distributed Dev Teams | Pulse UpdateTune in to explore ways that outages can impact distributed software development teams and what companies can learn from recent incidents at GitHub, Google Cloud, and Apple.To learn more, check out these links: Internet Report: Pulse Update Blog: https://www.thousandeyes.com/blog/internet-report-pulse-update-outages-and-distributed-dev-teams?utm_source=transistor&utm_medium=referral&utm_campaign=InternetReportPulseEp11Explore the GitHub service degradation in the ThousandEyes platform (NO LOGIN REQUIRED): https://agiebiuwxkwqowctctfvdaazvvfpxzew.share.thousandeyes.com/———CHAPTERS00:00 Intro00:39 The Download04:40 Outage Trends: By the Numbers09:22 GitHub Service 
Degradation19:13 Update: Google...2023-05-2726 minThe Internet ReportThe Internet ReportRedundancy in the Cloud Era: Two Case Studies | Pulse UpdateWhen it comes to your technology strategy, it's a good idea to have more than one way to access every resource—just in case. As IT environments have changed, so has the thinking around the right approaches to achieve this desired redundancy.Two recent incidents at Google Cloud and Microsoft 365 reinforce the importance of redundancy—and the need for evolving strategies to meet this goal.To learn more, check out these links: Internet Report: Pulse Update Blog: https://www.thousandeyes.com/blog/internet-report-pulse-update-redundancy-in-cloud-era?utm_source=transistor&utm_medium=referral&utm_campaign=InternetReportPulseEp10Clo...2023-05-1626 minThe Internet ReportThe Internet ReportThe Anatomy of an Outage | Pulse UpdateUnderstanding the unique characteristics of different kinds of Internet outages can help you quickly recognize the type of incident you’re dealing with and take the right steps to mitigate its impact. This week’s episode discusses the anatomy of common outage categories and explores recent case studies:- Security-related incidents: Western Digital and SD Worx outages- A single-point-of-aggregation issue: SpaceX’s Starlink outage- Last-mile challenges: Vodafone UK outageTo learn more, check out the links below:- Internet Report: Pulse Update blog: https://www.thousandeyes.com/blog/in...2023-04-2917 minThe Internet ReportThe Internet ReportChatting About the ChatGPT Outage and Other Outage News | Pulse UpdateThis week’s Pulse Update unpacks OpenAI’s ChatGPT outage and discusses why the outage actually represented a pragmatic move on the part of OpenAI. 
We’ll also discuss global outage trends; explore other recent incidents at Dish Network, Microsoft, and Virgin Media UK; and look at why responses to performance problems vary, based on application characteristics and usage patterns.To learn more, check out the links below: - Internet Report: Pulse Update Blog: https://www.thousandeyes.com/blog/internet-report-pulse-update-chatgpt-outage?utm_source=youtube.com&utm_medium=referral&utm_campaign=InternetReportPulseEp8- Explore...2023-04-1728 minThe Internet ReportThe Internet ReportUnderstanding the UK Virgin Media Outages on April 4 | Outage Deep DiveOn April 4, 2023, Virgin Media UK (AS 5089) experienced two outages that impacted the reachability of its network and services to the global Internet. The two outages shared similar characteristics, including the withdrawal of routes to its network, traffic loss, and intermittent periods of service recovery. In this episode, we discuss how the outages unfolded and what IT teams can learn from this to help navigate similar incidents in the future. To learn more, check out the links below: - Blog: Virgin Media UK Outage Analysis: https://www.thousandeyes.com/blog/virgin-media-uk-outage-analysis-april-4-2023- Explore th...2023-04-0826 minThe Internet ReportThe Internet ReportExploring Application Errors at Okta, Twitch, Reddit & GitHub | Pulse UpdateHTTP 403, 503, and 504 status codes dominated the last few weeks as multiple companies experienced application degradations and outages. 
These incidents at companies like Okta, Twitch, Reddit, and GitHub leave important lessons on navigating similar issues and minimizing downtime for your own users.To learn more, check out the links below: - Internet Report: Pulse Update Blog: https://www.thousandeyes.com/blog/internet-report-pulse-update-application-errors- Explore the Okta and Reddit outages in the ThousandEyes platform (NO LOGIN REQUIRED): Okta: https://awoleuudwuvnwklukifbrpghghynjjwy.share.thousandeyes.comReddit: https://arblcshhhdpvslhwkuxtvukvmvnlobur.share.thousandeyes.com/view/tests/?ro...2023-04-0124 minThe Internet ReportThe Internet ReportTwitter Performance in the Elon Era, a Ransomware Attack & More Outage News | Pulse UpdateIt was an eventful fortnight on the Internet as Twitter, Dish Network, Akamai, and Ticketek Australia all experienced outages. Tune in to our latest episode for insights from our analysis of these events and practical tips for IT teams.To learn more, check out the links below: - Internet Report: Pulse Update Blog: https://www.thousandeyes.com/blog/internet-report-pulse-update-twitter-outages-and-more- Explore the Twitter and Dish Network outages in the ThousandEyes platform (NO LOGIN REQUIRED): Twitter: https://aonfgcjryeodugjpksvxpdhuxodyjaxf.share.thousandeyes.comDish Network: https://amwajhhgwjnienexcktrsjhsmoisvktt.share.thousandeyes.com- Also ex...2023-03-2025 minThe Internet ReportThe Internet ReportA Tale of Two Data Center Outages | Pulse UpdateIn the space of a week, we saw two data center-related incidents lead to long Microsoft and Oracle outages. Join us as we analyze these outages and ways IT teams can minimize downtime in similar situations. 
We’ll also discuss a series of application issues that impacted companies including Twitter and Tesla.To learn more, check out the links below: Internet Report: Pulse Update BlogExplore the Atlassian outage in the ThousandEyes platform (NO LOGIN REQUIRED)Chapters00:00 Intro00:34 The Download2:42 Outage Trends: By the Numbers4:54 Data Center Inc...2023-03-0422 minThe Internet ReportThe Internet ReportA Trio of Similar Incidents: Microsoft, Cloudflare, & Slack Outages | Pulse UpdateWe discuss insights from a recent trio of similar incidents at Microsoft, Cloudflare, and Slack, along with other outage news, including a Comcast outage that impacted some Philadelphia neighborhoods on Super Bowl Sunday. 00:00 Intro00:58 Outage Trends: By the Numbers4:33 Microsoft Outage (Jan. 25)4:58 Cloudflare Outage (Jan. 24)9:27 Slack Outage (Jan. 25)13:16 Microsoft Outlook Outage (Feb. 7)18:06 Square Outage (Feb. 7)20:39 Comcast Outage (Feb. 12)23:23 Get in TouchTo learn more, check out the links below:Internet Report: Pulse Update BlogExplore the January 25 and February 7 Microsoft outages in the ThousandEyes platform (NO LOGIN R...2023-02-1824 minThe Internet ReportThe Internet ReportThe Microsoft Outlook Outage, Explained | Outage Deep DiveLive from #CiscoLiveEMEA, we discuss the Feb. 7 Microsoft Outlook outage to understand how the event unfolded, why it may have played out the way it did, and what you can learn from this outage event.To dive deeper, check out the links below:Explore the outage in the ThousandEyes platform (NO LOGIN REQUIRED)Microsoft Outlook Outage Analysis Blog (Feb. 7)Microsoft Outage Analysis Blog (Jan. 
25)Want to get in touch?If you have questions, feedback, or guests you'd like to see featured on the show, send us a...2023-02-0816 minThe Internet ReportThe Internet ReportLessons From the FAA, Fastly, & Microsoft Outages | Pulse UpdateIn this episode, we cover the latest Internet trends and unpack important takeaways from the recent FAA, Fastly, and Microsoft outages. We also discuss how several early 2023 outages and disruptions reinforced the need for application monitoring and testing to counter, or at least anticipate the effect of, anomalous conditions on certain routes.00:00 Intro1:32 Outage Trends: Week of Jan. 307:07 FAA Outage (Jan. 11)11:04 Fastly Outage (Jan. 19)15:31 Microsoft 365 Outage (Jan. 17)19:52 Microsoft Outage (Jan. 25)28:40 Get in TouchTo learn more, check out the links below:Follow the Fastly and Microsoft outages...2023-02-0429 minThe Internet ReportThe Internet ReportUnderstanding the Microsoft Outage: Why Were Azure, Microsoft Teams, & Outlook Down? | Outage Deep DiveAt around 7:05 a.m. UTC on January 25, 2023, Microsoft started experiencing service-related issues. At the same time, ThousandEyes observed BGP withdrawals and a significant number of route changes that resulted in a high amount of packet loss, ultimately affecting various services like Outlook, Teams, SharePoint, and others. 00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. Join our co-hosts Angelique Medina, Head of Internet Intelligence, and Kemal Sanjta, Principal Internet Architect, both from ThousandEyes, as they discuss the January 25, 2023 Microsoft outage. 
2023-01-3127 minThe Internet ReportThe Internet ReportNotes on the Spotify Outage | Pulse UpdateThis episode covers the latest global network outage numbers and interesting end-of-year trends; how resilient application architectures, clouds, and networks are challenging old ways of thinking; and a deep dive into an outage that disrupted Spotify’s music streaming on December 14, 2022.To learn more, check out the links below: Internet Report Pulse Update BlogExplore the Spotify outage in the ThousandEyes platform (NO LOGIN REQUIRED) Part 1Part 2Part 3Chapters00:00 Intro1:12 Outage Trends: By the Numbers10:26 Spotify Outage19:11 Get in TouchWant to get in touch?If you have ques...2023-01-2019 minThe Internet ReportThe Internet ReportTwitter in the Elon Era + Microsoft & AWS Outages | Pulse UpdateThis is The Internet Report: Pulse Update, where we review and analyze significant outages and trends across the Internet from the previous two weeks. Every other week, we'll publish a new episode covering the latest tally of outage events and highlighting a few interesting outages. This week, in addition to our usual look at global and U.S. outage trends, we’ll take a brief look at how Twitter is holding up since its sale to Elon Musk, plus a couple of interesting outages at Microsoft and AWS.To learn more, read the blog....2022-12-1722 minThe Internet ReportThe Internet ReportUnpacking the Dec. 12 Quad9 BGP Route Leak | Outage Deep DiveStarting at ~12:12 UTC on Dec 12, 2022, an ISP in the Democratic Republic of Congo leaked a route belonging to the Quad9 DNS service, causing some traffic, including Verizon US customer traffic, to get routed to Africa for ~90 minutes. High traffic loss was observed throughout the incident, which was resolved at ~13:40 UTC. 00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 
Join our co-hosts Mike Hicks, Principal Solutions Analyst, and Kemal Sanjta, Principal Internet Architect, both from ThousandEyes, as they discuss the December 12th Quad9 BGP route leak. 2022-12-1426 minThe Internet ReportThe Internet ReportAn Eventful End to October for WhatsApp, Zscaler, Salesforce, and Facebook | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. In this episode, we unpack four notable outages that impacted WhatsApp, Zscaler, Salesforce, and Facebook, which all appear to have a common theme. Join our co-hosts Mike Hicks, Principal Solutions Analyst at ThousandEyes, and Chris Villemez, Technical Marketing Engineer at ThousandEyes, as they walk through each incident to understand what happened and discuss how network professionals can attempt to mitigate these types of scenarios in the future. FURTHER READING Facebook Outage Analysis → https://www.thousandeyes.com/blog/facebook-outage-analysis2022-11-0428 minThe Internet ReportThe Internet ReportUnpacking the March 28th Twitter Outage | Outage Deep DiveWe're back! 00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On this episode, our newest host, Chris Villemez, is joined by Kemal Sanjta to discuss a BGP-related incident that took down Twitter for many users around the globe on March 28th. 00:36 Under the Hood: Chris Villemez and Kemal Sanjta leverage their extensive operations experience managing the networks of large-scale SaaS, IoT, and cloud providers to analyze this incident using the ThousandEyes platform. They examine the scope of the outage, review the specific BGP changes that resulted in the ou...2022-04-1331 minThe Internet ReportThe Internet ReportUnpacking the December AWS Outages (December 7, 10, & 15, 2021) | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 
On today’s episode, our newest host and Technical Marketing Engineer, Chris Villemez, is joined by Kemal Sanjta, Principal Engineer, to dive into the details of the recent AWS outages from December 7th, 10th and 15th. They’ll walk through what ThousandEyes saw from its fleet of vantage points, as well as share some insight into what enterprises can learn from these incidents to build resilient cloud architectures.2021-12-1743 minThe Internet ReportThe Internet ReportThe Facebook Outage, Explained (10/4/21) | Outage Deep Dive00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:15 Headlines: Today we’re going to do a thorough analysis of the major Facebook outage that took place yesterday, Monday, October 4. I’m joined by Gustavo Ramos, ThousandEyes’ in-house expert on Network Engineering. ThousandEyes Blog: https://www.thousandeyes.com/blog/facebook-outage-analysis Analysis from Facebook: https://engineering.fb.com/2021/10/05/networking-traffic/outage-details/ 1:17 Under the Hood: We'll walk through the sequence of events that led to this outage, understand what went wrong (and what actions may have made the situation worse), and what lessons we...2021-10-0627 minThe Internet ReportThe Internet ReportWhen BGP Routes Accidentally Get Hijacked: A Lesson In Internet Vulnerability | Outage Deep Dive00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:08 Headlines: Today, Mike Hicks (Principal Solutions Analyst, ThousandEyes) and I discuss a recent BGP routing incident that had intermittent impacts on Amazon’s services, including Amazon.com and AWS compute resources, during a five-hour period on July 12. 01:04 Under the Hood: When we look into BGP routing at the time, we can see multiple BGP path changes due to a service provider erroneously inserting themselves into the path for a large number of Amazon routes. 
Watch this episode to see how the...2021-08-0318 minThe Internet ReportThe Internet ReportThe Akamai DNS Outage and the Case for CDN Redundancy (July 1-23, 2021) | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. I’m joined today by Mike Hicks, principal solutions analyst here at ThousandEyes, to cover the outage of Akamai’s DNS service. The outage, which occurred on July 22nd around 3:38 PM UTC (8:38AM PT), struck during the course of business hours in Europe and North America, resulting in widespread impacts to applications and services hosted within Akamai servers. The outage itself was short-lived and was resolved roughly one hour after the outage began. In this episode, we examine the customer impact, th...2021-07-2418 minThe Internet ReportThe Internet ReportBGP Routing Incident Shows Why the Shortest Path Isn’t Always the Chosen Path | Outage Deep Dive00:00 Welcome:This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:13 Headlines: Today, Kemal and I unpack an interesting BGP incident, in which a large-scale route leak briefly altered traffic patterns across the Internet. 00:58 Under the Hood: The incident began on Thursday, June 3rd at around 10:24 UTC, and resulted in a significant spike in packet loss that was noticeable in ThousandEyes tests. While this packet loss resolved within the hour (at around 10:48 UTC), we observed some interesting routing changes during this window—as traffic was diverted to a Russian telecom provider...2021-07-0321 minThe Internet ReportThe Internet ReportAkamai Prolexic Outage Analysis + Takeaways (Week of June 9-17, 2021) | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. I’m joined by ThousandEyes’ BGP expert, Kemal Sanjta, to review the June 16th outage of Prolexic Routed, a DDoS Mitigation Service operated by Akamai. 
According to a statement from Akamai, the outage was not due to a DDoS attack or system update, but instead a routing table limitation that was inadvertently exceeded. In this episode, Kemal and I analyzed what happened and how customers of Akamai Prolexic who had automated failover mechanisms in place were able to recover more quickly th...2021-06-2525 minCisco Champion RadioCisco Champion RadioS8|E25 Delivering Visibility and Actionable Insights with ThousandEyesThousandEyes is joining forces with the Cisco Catalyst 9000 family of switches to bring Internet and Cloud Intelligence to tens of thousands of organizations around the world, enabling them to see and manage networks and applications that they rely on but do not directly own or control. By bundling ThousandEyes with its most widely deployed switches for campus and branch environments, Cisco is solving a critical visibility gap for enterprises, who are increasingly leveraging cloud, Internet and SaaS to deliver digital services to their employees and internal systems. Cisco Catalyst 9300/9400 switches with a DNA Advantage or Premier license can now serve...2021-06-2146 minThe Internet ReportThe Internet ReportFastly’s Outage and Why CDN Redundancy Matters (Week of June 3-8) | Outage Deep Dive00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:12 Headlines: Today, I’m joined by Hans Ashlock, Director of Technology & Innovation at ThousandEyes, to unpack today’s major outage at Fastly, a popular CDN provider. 3:46 Under the Hood: The widespread outage occurred around 9:50 UTC, about 5:50 am ET, and mostly impacted users across Europe and Asia due to the timing. 
The outage lasted approximately one hour until...2021-06-0939 minThe Internet ReportThe Internet ReportBitcoin Dive Sparks Outage at a Popular Crypto Exchange (Weeks of May 17-June 2) | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. I’m joined today by Mike Hicks, Principal Solutions Analyst at ThousandEyes, to cover two recent application-related outages. The first occurred on May 19th around 12:50 UTC at Coinbase—a well-known cryptocurrency exchange. Around the time that news broke that the Chinese government would be imposing strict regulation on cryptocurrencies, users attempting to execute transactions were unable to access the application. From the ThousandEyes platform we were able to see a drop in availability around this time as well as increased...2021-06-0521 minThe Internet ReportThe Internet ReportDNS and BGP and DDoS Attacks—Oh, My! (May 11-17, 2021) | Outage Deep Dive00:00 Welcome 00:14 Headlines: DNS and BGP and DDoS Attacks—Oh, My! This week we cover a couple of recent service degradation incidents involving DNS providers. 2:19 Under the Hood: Kemal Sanjta, ThousandEyes’ resident BGP expert, joins us to discuss the May 6th disruption to Neustar’s UltraDNS service, which lasted nearly four hours. We discuss the BGP routing changes we observed during the incident and what they can tell us about the cause of the disruption. We also cover a separate incident involving Quad9, a public recursive resolver service, which the company says was caused by a DDoS attack on May 3rd. 16:19 Expert...2021-05-2132 minThe Internet ReportThe Internet ReportEven Magic Can't Stop Internet Outages (April 28-May 3, 2021) | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 
Today, we focused on an interesting outage that impacted Cloudflare Magic Transit, a relatively new offering from the CDN provider which aims to efficiently route and protect the network traffic of its customers. On May 3rd at approximately 3:00 PM PDT (10:00 PM UTC), ThousandEyes vantage points connecting to sites using Magic Transit began to detect significant packet loss at Cloudflare’s network edge—with the loss continuing at varying levels, for approximately 2 hours. While the outage impacted some Magic Transit customers more signi...2021-05-0810 minThe Internet ReportThe Internet ReportMicrosoft Teams Outage Highlights: Need to See Beyond App Front Door (Week of April 20-27, 2021) | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. We’re joined this week by Hans Ashlock, Director of Technology & Innovation at ThousandEyes, to discuss Tuesday’s Microsoft Teams outage. On Tuesday, April 27th, ThousandEyes tests began to detect an outage affecting the Teams service starting around 3 AM (PT) and lasting approximately 1.5 hours. While the outage occurred in the overnight hours for much of the Americas, the global nature of the outage resulted in service disruption for users connecting from Asia and Europe. Transaction views within the ThousandEyes platform show that...2021-04-2918 minThe Internet ReportThe Internet ReportMajor BGP Route Leak Disrupts Internet Traffic Globally (April 13-19, 2021) | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On today’s episode, we’re thrilled to be joined by Kemal Sanjta, ThousandEyes’ resident expert on BGP. This week, we’re going under the hood on the April 16th BGP leak at Vodafone India, which leaked more than 30,000 prefixes, causing a major disruption of Internet traffic to some services. 
While some news outlets reported that the incident lasted approximately 10 minutes (starting around 1:50AM UTC or 9:50AM ET), we found that it lasted quite a bit longer—more than an hour in the case...2021-04-2230 minThe Internet ReportThe Internet ReportFacebook Outage Analysis; Plus, Why Cross-Layer Visibility Is a Must for App Experience | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. We’re back from a short sabbatical to cover an interesting outage at Facebook in what appears to be an application outage compounded by a series of routing issues. On April 8th, for roughly 40 minutes, the Facebook application became unavailable for users around the globe who were attempting to connect to the service. Despite the short-lived nature of the outage, we observed prolonged performance degradation even after the application came back online for users. Suboptimal page load and response times, both of...2021-04-1524 minThe Internet ReportThe Internet ReportWhat Happened With Verizon’s Recent Outage (Week of Jan. 25-Feb. 1, 2021) | Outage Deep DiveOn today’s episode, we discuss the recent outage on Verizon’s network that had widespread impacts on users in the US. ThousandEyes Broadband Agents detected an outage starting around 11:30am EST that manifested as packet loss across multiple locations concentrated along Verizon backbone in the US east coast and midwest. While the outage was resolved approximately an hour later, users connecting from the Verizon network across the US experienced varying degrees of impact, depending on the services they were connecting to. This serves as yet another reminder that the context around an outage directly affects the scope of the disr...2021-02-0309 minThe Internet ReportThe Internet ReportAn IXP and a Streaming Music Provider Walk Into an Outage Bar (Week of Aug. 
17-23) | Outage Deep DiveThis is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On this week’s episode, Archana and I cover some recent outages that made headlines. This includes the Spotify outage, caused by an expired TLS certificate, that prevented users from accessing its platform. We also cover a widespread outage at Cogent during (what seems to be) a maintenance window. Then, we go “under the hood” on the prolonged outage at an IXP on August 18th to understand exactly what infrastructure was impacted and which downstream providers were subsequently impacted. We...2020-08-2521 minThe Fat Pipe Of The Packet Pushers PodcastsThe Fat Pipe Of The Packet Pushers PodcastsDay Two Cloud 060: Charting Global Internet Performance With ThousandEyes (Sponsored)If your idea of the Internet is to draw a generic cloud icon on the whiteboard, this is the episode for you. We all know that the Internet is important, and bandwidth is a big deal. We sort of have a notion that regions matter depending on where our customers are. But what’s really going on inside of that generic cloud you just drew? Shining some light on the mysterious tubes filled with cat memes is Angelique Medina, Director, Product Marketing at ThousandEyes. ThousandEyes is our sponsor for today’s episode. ThousandEyes has just released its in...2020-08-0545 minThe Internet ReportThe Internet ReportRansomware Attack Leaves Garmin Users Stuck Without a Paddle (Week of July 20-26) | Outage Deep DiveOn this week’s episode, I am joined by Deepak Ravi from our Dublin technical sales engineering team to discuss a recent outage at Garmin. Garmin confirmed that it was a victim of a ransomware attack, which took down several of its services including its website functions, customer support, customer facing applications, and company communications. 
In this episode, we walk through what we observed in the ThousandEyes platform during the time of the attack, and what the impacts were on users attempting to access Garmin services. We’re also joined by ThousandEyes’ CISO, Alexander Anoufriev, to talk about what ransomware attack...2020-07-2825 minThe Internet ReportThe Internet ReportDo Outages Come in 3’s? Diving Into Last Week’s Outages at GitHub, WhatsApp, and Cloudflare (Week of July 13-19) | Outage Deep DiveOn this week’s episode, we cover a couple of significant application-layer outages at GitHub and WhatsApp that occurred over the past week. Then, Archana and I do a deep-dive into a network-related outage at Cloudflare that affected the availability of its popular DNS service for approximately 30 minutes. We’ll share what we saw through our vantage points in the ThousandEyes platform, and you can read Cloudflare’s full explanation of the incident on their blog/2020-07-2125 minThe Fat Pipe Of The Packet Pushers PodcastsThe Fat Pipe Of The Packet Pushers PodcastsNetwork Break 286: Cisco To Acquire ThousandEyes; The Return Of Follow UpTake a Network Break! We start with some FU (follow up) from previous Network Break and Heavy Networking episodes, then dive into Cisco’s big ThousandEyes acquisition. We also tackle financial results from VMware, Dell Technologies, and HPE. Last but not least, scammers have targeted 5G fears with a bogus and extravagantly priced USB stick. Sponsor: Itential We’re sponsored in part today by Itential: intelligent automation for multi-domain and multi-vendor networks. Find out more on the Packet Pushers’ Heavy Networking episode 503 and at itential.com/packetpushers. 
Tech Bytes: Viavi Stay tuned after the news...2020-06-0248 minThe Internet ReportThe Internet ReportFacebook SDK Snafu Sidelines Spotify & Others, Plus, AWS Global Accelerator… Accelerates (Week of May 4-10) | Outage Deep DiveOn this week’s episode of The Internet Report, Archana and I cover some newsworthy updates that we’ve seen over the past week. We discuss a notable Facebook SDK outage that had ripple effects on other popular services that leverage its log-in functionality, including Spotify and TikTok. We also discuss a blog from AWS sharing their thoughts on the JEDI contract. We’re also joined by Arash Molavi, the lead Internet researcher here at ThousandEyes. Arash shares his insight into outages we’re seeing, discusses what constitutes an outage, and why loss, latency and jitter can impact end-user experien...2020-05-1226 minThe Fat Pipe Of The Packet Pushers PodcastsThe Fat Pipe Of The Packet Pushers PodcastsDay Two Cloud 046: A Cloud Checkup During Covid-19 With ThousandEyes (Sponsored)Day Two Cloud dives into the health and performance of the global cloud during the pandemic. Sponsor ThousandEyes measures, collects, and reports on Internet performance, giving them a unique perspective into how cloud providers are faring region by region, provider by provider, and service by service. ThousandEyes software agents instrument Internet and cloud providers, perform active probes, and measure network and application performance from customer sites around the world. Our guests sharing ThousandEyes’ findings are Archana Kesavan, Director of Product Marketing; and Angelique Medina, Director of Product Marketing. 
We discuss: * How public cloud providers are holding...2020-04-2949 minThe Internet ReportThe Internet ReportISP Outages On The Rise, Router Failure Takes Down Cloud Provider Services During COVID-19 (Week of March 23-29, 2020) | Outage Deep DiveOver the past month, ThousandEyes has been flooded with questions about how the Internet is holding up given the extra strain it’s been under with the sudden influx of remote workers, remote schoolers, and overall increased use due to COVID-19 related self-isolating and shelter in place orders. We’ve put out blogs and have conducted executive, media and analyst briefings. Network World and the IDG family of publications have even started publishing our data on a weekly basis to keep its readers up to date, as things are changing so frequently. Because of the continued interest in h...2020-03-3019 minThe Fat Pipe Of The Packet Pushers PodcastsThe Fat Pipe Of The Packet Pushers PodcastsTech Bytes: Reviewing 2019’s Most Impactful Internet Outages With ThousandEyes (Sponsored)Today’s show looks back at some of the most impactful Internet outages of 2019 with sponsor ThousandEyes. We’ll discuss what happened in these outages, who was affected, and lessons learned. Our guest is Angelique Medina, Director of Product Marketing at ThousandEyes. We examine a June 2019 incident that impacted large swathes of Google Cloud, and a June 2019 route leak that snared Cloudflare, though it wasn’t Cloudflare’s fault. We discuss: * How these outages occurred * Why Internet visibility is critical as more applications and services move to the cloud * How ThousandEyes can help netw...2020-01-2015 minThe Fat Pipe Of The Packet Pushers PodcastsThe Fat Pipe Of The Packet Pushers PodcastsHeavy Networking 473: Synthetic Transactions, SD-WAN Readiness, And Internet Outage Autopsies With ThousandEyes (Sponsored)Welcome to Heavy Networking, a uniquely nerdy podcast that puts the network at the center of the universe where it belongs. 
Today is a sponsored show with ThousandEyes and we’re going to feast on a smorgasbord of topics: first, a new synthetic transaction monitoring tool from ThousandEyes. Second, we’ll discuss why performance monitoring is critical to your SD-WAN readiness and ongoing operations. Third, we’ll explore postmortems on a couple of 2019’s Internet outages, including a major route leak that affected Cloudflare, and what that means when you’re relying on the Internet for critical busine...2019-09-2445 min