This research paper introduces Prequal, a load balancer designed to minimise latency in large-scale distributed systems such as YouTube. Unlike traditional load balancers that aim to equalise CPU usage across servers, Prequal selects servers using two signals, estimated latency and requests in flight, which it obtains by actively probing servers for real-time load information. Testing on YouTube and on a controlled testbed showed that Prequal significantly reduces tail latency, error rates, and resource consumption compared with weighted round-robin and other load balancing strategies. The paper details Prequal's design, including its asynchronous probing mechanism and its hot-cold lexicographic rule for replica selection, and attributes the performance gains to Prequal's ability to adapt dynamically to heterogeneous server capacities and varying workloads.
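
The hot-cold lexicographic rule named above can be illustrated with a minimal sketch. This summary does not define the rule, so the code assumes one reading of it: a probed replica counts as "hot" when its requests in flight (RIF) exceed a threshold taken from a quantile of recently observed RIF values, and "cold" otherwise. If any cold replica is available, the one with the lowest estimated latency is chosen; if every replica is hot, the one with the lowest RIF is chosen. The names `Probe`, `rif_quantile`, and `select_replica`, the nearest-rank quantile, and the tie-breaking order are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Probe:
    """One probe response from a replica (hypothetical structure)."""
    replica: str
    rif: int             # requests in flight reported by the replica
    latency_ms: float    # latency estimate reported by the replica


def rif_quantile(history: Sequence[int], q: float) -> int:
    """Nearest-rank quantile of recently observed RIF values (assumed method)."""
    if not history:
        raise ValueError("RIF history is empty")
    ordered = sorted(history)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


def select_replica(probes: Sequence[Probe], rif_threshold: int) -> Probe:
    """Pick a replica with a hot-cold lexicographic comparison.

    Replicas with RIF above the threshold are treated as hot. Cold replicas
    are always preferred over hot ones; among cold replicas latency decides,
    among hot replicas RIF decides.
    """
    if not probes:
        raise ValueError("probe pool is empty")
    cold = [p for p in probes if p.rif <= rif_threshold]
    if cold:
        # At least one cold replica: lowest estimated latency wins.
        return min(cold, key=lambda p: (p.latency_ms, p.rif))
    # Every replica is hot: fall back to the lowest requests in flight.
    return min(probes, key=lambda p: (p.rif, p.latency_ms))


if __name__ == "__main__":
    pool = [
        Probe("a", rif=2, latency_ms=30.0),
        Probe("b", rif=9, latency_ms=5.0),
        Probe("c", rif=3, latency_ms=12.0),
    ]
    print(select_replica(pool, rif_threshold=5).replica)
    print(select_replica(pool, rif_threshold=1).replica)
```

With a threshold of 5, replica `b` is hot despite reporting the lowest latency, so the choice falls to the cold replica with the lowest latency, `c`. With a threshold of 1, all three replicas are hot and the one with the fewest requests in flight, `a`, is selected. Treating RIF as the first key and latency as the second is what makes the comparison lexicographic: a heavily loaded replica is never preferred over a lightly loaded one on latency alone.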