GUIDE

What is low latency? And why is latency important?


What is latency in networking technology?

Latency is the delay, or the time it takes, for data to travel from one point to another in a network.

What is low latency (ping)?

Low latency is one of the most important performance characteristics of any website, program, or application on a network.

Latency, commonly measured with a "ping," is the round-trip time for data to travel between two points, typically expressed in milliseconds. Low latency is crucial for responsiveness in real-time applications, affecting sectors from gaming to financial trading and autonomous systems.

In development terms, low latency aims to improve the speed and interactivity of applications. Real-time software, such as video conferencing or online gaming, typically seeks latency under 60 milliseconds (ms) for smooth performance, with 30 ms or lower often considered "very low latency."

From a scientific perspective, achieving low latency involves understanding and mitigating physical and network-related limitations. Data, limited by the speed of light, can travel roughly 200 km per millisecond through fiber optics, setting an ultimate physical constraint on latency. For example, a round trip between New York and London (roughly 5,600 km each way) takes at least ~56 ms at the theoretical minimum, and around 65 ms in practice even under ideal conditions, no matter how small the payload.
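As a quick sanity check, that physical limit can be worked out directly. The sketch below (in Python, with assumed constants: ~200 km/ms in fiber and a ~5,570 km New York to London distance) computes the theoretical minimum round trip, ignoring all equipment and routing overhead:

```python
# Sketch: theoretical lower bound on propagation latency, assuming light in
# fiber travels at roughly 200,000 km/s (about two-thirds of its vacuum speed).
# Real routes are longer than the great-circle distance and add equipment delays.
FIBER_KM_PER_MS = 200  # ~200 km per millisecond in fiber

def min_rtt_ms(distance_km: float) -> float:
    """Minimum round-trip time over fiber, ignoring all other overhead."""
    return 2 * distance_km / FIBER_KM_PER_MS

print(f"New York -> London: ~{min_rtt_ms(5570):.0f} ms round trip")  # ~56 ms
```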

How do you measure latency?

Latency is measured in milliseconds (ms), and the closer it gets to zero, the better. It can be measured in various ways depending on the context, but common methods include:

  1. Ping (ICMP echo request): Sending a small packet of data from one device to another and measuring the round-trip time it takes for the packet to travel (a runnable sketch follows this list).

  2. Traceroute: Mapping the path and measuring latency at each hop between the source and destination.

  3. Application-level monitoring: Using specialized tools or built-in features to measure latency within specific applications or services.

  4. Network monitoring tools: Utilizing tools like SNMP (Simple Network Management Protocol) or packet analyzers to monitor latency across a network.

  5. Real-user monitoring (RUM): Collecting data from actual users' interactions with a website or application to measure the latency experienced in real-world scenarios.

Each method has its advantages and limitations, so the choice depends on the specific requirements and goals of the measurement.
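For illustration, here is a minimal way to sample latency and jitter from code. It is a sketch, not a production tool: it times TCP connection handshakes (which need no special privileges) rather than true ICMP echoes, and the host and port are placeholders you would replace with your own target:

```python
import socket
import statistics
import time

def sample_rtt(host: str, port: int = 443, samples: int = 5) -> list[float]:
    """Estimate round-trip latency (ms) by timing TCP connection handshakes."""
    rtts = []
    for _ in range(samples):
        start = time.perf_counter()
        # connect() returns after roughly one network round trip (SYN / SYN-ACK)
        with socket.create_connection((host, port), timeout=2):
            pass
        rtts.append((time.perf_counter() - start) * 1000)
    return rtts

rtts = sample_rtt("example.com")  # any reachable HTTPS host works here
print(f"avg latency: {statistics.mean(rtts):.1f} ms, "
      f"jitter (stdev): {statistics.stdev(rtts):.1f} ms")
```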

What is good latency?

Online Gaming

In online gaming, latency needs vary by genre due to differences in response sensitivity. Fast-paced games like first-person shooters (FPS), action, and battle royale titles require near-instant feedback, with under 30 ms considered good. Latency up to 100 ms is acceptable, but above 120 ms players may notice delays that disrupt gameplay, making competitive play frustrating due to the lag between actions and on-screen response.

Action Games (e.g., FPS): <30 ms ideal, <100 ms acceptable; low propagation and queuing latency, minimal jitter required.

In strategy-based games, such as real-time strategy (RTS) and multiplayer online battle arena (MOBA) games, latency can be slightly higher without significantly affecting the gameplay experience. These games do not require frame-perfect precision, but they still demand reasonably low latency to keep controls responsive and prevent delays in unit movement and combat actions. For these games, latency below 50 ms is ideal, with up to 120 ms being tolerable. Higher latency, such as 150 ms, is manageable but can lead to delays in issuing commands, which may affect competitive gameplay.

Strategy Games (RTS, MOBA): <50 ms is good for competitive play; <120 ms acceptable, moderate tolerance.

Turn-based games and casual mobile games have a much higher tolerance for latency because they do not rely on real-time input. A delay of several hundred milliseconds does not disrupt gameplay, as these games are not affected by immediate response times. Here, latency around 100–250 ms is generally acceptable, and even higher latency can be tolerated without negatively impacting the gaming experience.

Video Conferencing and VoIP

In live conferencing and VoIP, good latency—under 150 ms—ensures a smooth, natural conversation flow. Latency between 150-250 ms is manageable but may cause slight delays, leading to occasional talk-over. Beyond 300 ms, conversation flow suffers, with noticeable pauses and interruptions. Consistent latency, or low jitter, is also crucial to prevent stuttering or desynchronization.

Streaming Video:

Latency needs differ for on-demand and live streaming. For on-demand streaming (e.g., Netflix, YouTube), moderate latency of 200–500 ms is acceptable, as buffering smooths initial delays. Higher latency does not heavily impact experience due to lack of real-time needs.

For live streaming, low latency is crucial to maintain the "live" experience. Under three seconds is ideal, especially for interactive events like live Q&A. Latency up to five seconds is manageable for general live broadcasts, but anything longer can reduce viewer engagement.

  • On-Demand: under 200 ms is good, up to 500 ms acceptable; buffering handles minor delays.

  • Live Streaming: under 3 seconds for low-latency streaming.

AR/VR Applications

Augmented and virtual reality need ultra-low latency—under 20 ms—to feel real-time and prevent discomfort or motion sickness. In VR, low jitter is also critical, as fluctuations disrupt immersion and can cause nausea. AR similarly needs minimal latency to keep real-time overlays accurately aligned with the physical world.

<20 ms (good) to avoid motion sickness; requires minimal propagation latency, queuing latency, and jitter for smooth interaction.

High-Frequency Trading

<10 ms (good) for optimal performance, with leading firms targeting sub-millisecond delays; near-zero queuing and processing delays are critical.

General Web Browsing and E-Commerce

<100 ms (good) for optimal page loads, 200-500 ms generally acceptable. CDNs and caching help reduce latency.

Types of Latency and Benchmarks

  1. Propagation Latency: This is the time it takes for data to physically travel from the source to the destination across a network. It depends primarily on the physical distance and the speed of signal transmission, which varies by medium (fiber, copper, wireless, satellite).

    • Fiber Optic Cable: ~5 ms per 1,000 km.

    • Silver Cable: For a typical silver cable, the propagation speed of electrical signals is about 95% of the speed of light in a vacuum (300,000 km/s). This translates to approximately 3.5 nanoseconds of delay per meter, or 3.51 ms per 1,000 km, making it one of the fastest transmission mediums.

    • Copper Cable: Slower than fiber, approximately 5.5 ms per 1,000 km due to signal degradation and slower electrical transmission.

    • Satellite (Radio and Microwave Frequencies): ~500 ms round trip for geostationary orbit due to the long travel distance; approximately 30–60 ms for low Earth orbit.

  2. Data Transmission Latency: The time required to push all bits of the data packet onto the network link. This is influenced by the size of the packet and the bandwidth of the transmission link.

    • 1 Gbps Link: ~8 ms for a 1 MB file (1 MB = 8 megabits; 8 Mb ÷ 1,000 Mbps = 8 ms).

    • 10 Mbps Link: For a 1 MB file, transmission latency is approximately 800 ms (8 Mb ÷ 10 Mbps).

  3. Processing Latency: The time taken by network devices (routers, switches, firewalls) to inspect, analyze, and forward the data packets. Processing latency depends on device performance and complexity of the processing tasks (e.g., firewall checks, encryption).

    • High-Performance Routers: <1 ms per hop.

    • Standard Routers: Can vary widely but typically 1-10 ms per hop.

  4. Queuing Latency: The delay caused by data packets waiting in a queue for transmission due to network congestion or bandwidth constraints. Queuing latency varies significantly based on traffic load and can fluctuate.

    • Low Load: <1 ms.

    • Moderate Load: Average of 5-20 ms.

    • High Load: Up to 100 ms.

  5. Total Round-Trip Latency: The total time it takes for a data packet to travel from the source to the destination and back again. This includes all the latency types mentioned above and serves as a holistic measure of network responsiveness (a simple budget calculation follows this list).

    • Local Network (LAN): < 5 ms.

    • Within a City: Around 10-20 ms.

    • Intercontinental: 30-150 ms. (The PubNub network typically operates with under 30 ms of delay; check our global system status.)
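Putting the pieces together, a rough one-way latency budget can be sketched by summing the components above. Every number below (packet size, bandwidth, hop count, per-hop and queuing delays) is an illustrative assumption, not a measurement:

```python
# Sketch: a one-way latency budget built from the four components above.
def latency_budget_ms(distance_km: float, packet_bytes: int,
                      bandwidth_mbps: float, hops: int,
                      per_hop_ms: float = 1.0, queuing_ms: float = 5.0) -> float:
    propagation = distance_km / 200                              # fiber: ~200 km/ms
    transmission = (packet_bytes * 8) / (bandwidth_mbps * 1000)  # bits / (bits per ms)
    processing = hops * per_hop_ms                               # per-router delay
    return propagation + transmission + processing + queuing_ms

# Example: a 1,500-byte packet, New York -> London, 1 Gbps links, 10 hops
one_way = latency_budget_ms(5570, 1500, 1000, 10)
print(f"~{one_way:.1f} ms one way, ~{2 * one_way:.1f} ms round trip")
```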

What causes network latency?

The main factors are distance, data packet size, the medium through which the data travels, the transmission route, and processing and queuing delays.

How to improve (reduce or lower) latency?

As a business, the simplest solution to reducing latency for your users is to invest in networking solutions like edge computing, cloud platforms, or data stream networks. These work by connecting the user to the closest source of information, decreasing the distance that data packets have to travel.

Content delivery networks (CDNs) are networks of geographically distributed servers and data centers that increase speed by serving users from more (and closer) locations where data is stored. CDNs work alongside internet service providers and are commonly used for services like video streaming, loading webpages, and downloading software. The number and geographic spread of data centers help decrease load times and relieve bandwidth pressure across networks, greatly improving quality of service on the Internet.

Real-time networks build on CDNs by establishing persistent connections between clients and servers, which allow messages to flow freely and almost instantaneously. For example, PubNub's real-time data stream network uses multiple TCP-based transports, such as WebSockets, MQTT, and HTTP long polling, to provide persistent connections and quick data transmission. This optimizes the network for extremely high speeds and low latency.
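To illustrate the persistent-connection idea in code (a minimal sketch using the third-party websockets library and a placeholder echo endpoint, not PubNub's actual SDK), note how the handshake cost is paid once and each subsequent message costs only a network round trip:

```python
import asyncio
import time

import websockets  # third-party: pip install websockets

async def main() -> None:
    # wss://echo.example.com is a placeholder echo endpoint for illustration
    async with websockets.connect("wss://echo.example.com") as ws:
        # The TCP/TLS handshake cost is paid once; every later message
        # reuses the open socket, so per-message latency is just the RTT.
        for i in range(3):
            start = time.perf_counter()
            await ws.send(f"ping {i}")
            await ws.recv()  # wait for the echoed reply
            print(f"message {i}: {(time.perf_counter() - start) * 1000:.1f} ms")

asyncio.run(main())
```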

Improving latency involves optimizing various aspects of network communication and system performance. Here are some strategies to consider:

1. Use a Wired Connection: Wired connections typically offer lower latency compared to wireless connections, which can be affected by interference and signal strength.

2. Optimize Network Configuration: Ensure that your network hardware, such as routers and switches, are configured correctly and are capable of handling the desired traffic without bottlenecks.

3. Reduce Network Congestion: Minimize the number of devices sharing the network and prioritize critical traffic to reduce congestion and latency.

4. Utilize Content Delivery Networks (CDNs): CDNs distribute content across multiple servers globally, reducing the distance data needs to travel and improving latency for users accessing the content.

5. Implement Caching: Caching frequently accessed data locally can reduce the need to fetch data from distant servers, thereby improving latency.

6. Optimize Protocol Efficiency: Choose communication protocols that minimize overhead and reduce latency, such as using UDP instead of TCP for real-time applications (see the sketch after this list).

7. Use Quality of Service (QoS): Implement QoS mechanisms to prioritize critical traffic over less time-sensitive data, ensuring that latency-sensitive applications receive sufficient bandwidth and resources.

8. Optimize Application Design: Design applications to minimize round trips and unnecessary data transfers, optimizing data transmission.

9. Deploy Edge Computing: Utilize edge computing resources to process data closer to the end-user, reducing the distance data needs to travel.

10. Monitor and Analyze Performance: Continuously monitor network performance and latency metrics to identify bottlenecks and areas for improvement, allowing for proactive optimization.
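As a small, self-contained illustration of strategy 6, the sketch below times a UDP round trip against a local toy echo server. UDP sends datagrams with no connection setup or retransmission, which is the latency advantage real-time applications trade reliability for; the host, port, and server here are assumptions for the example:

```python
import socket
import threading
import time

def udp_echo_server(host: str = "127.0.0.1", port: int = 9999) -> None:
    """Toy echo server so the example is self-contained."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind((host, port))
    while True:
        data, addr = srv.recvfrom(1024)
        srv.sendto(data, addr)

threading.Thread(target=udp_echo_server, daemon=True).start()
time.sleep(0.1)  # give the server a moment to bind

cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
cli.settimeout(1.0)
start = time.perf_counter()
cli.sendto(b"ping", ("127.0.0.1", 9999))  # no handshake, unlike TCP
cli.recvfrom(1024)                        # wait for the echoed datagram
print(f"UDP round trip: {(time.perf_counter() - start) * 1000:.3f} ms")
```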

The relationship between low latency, bandwidth, and throughput

Bandwidth, throughput, and latency have a cause-and-effect relationship, where the capacity of one affects the others. For example, lower bandwidth increases transmission latency because each packet takes longer to serialize onto the link.

What is bandwidth? 

Bandwidth is the maximum capacity of a network connection. Higher bandwidth, or more capacity, generally lowers transmission latency, though it cannot reduce propagation delay.

What is throughput?

Throughput measures how much data a system can actually move in a given time. While latency measures time, throughput measures the amount of data that can be sent, typically in bits per second.
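One way to see the interplay is a back-of-the-envelope calculation: for window-based protocols such as TCP, throughput is roughly capped at window size divided by round-trip time, so rising latency throttles throughput even on a high-bandwidth link. The window size below is an assumed example value:

```python
# Sketch: throughput ceiling ~= window size / RTT for window-based protocols.
window_bytes = 64 * 1024  # assumed 64 KB receive window
for rtt_ms in (5, 50, 150):
    max_mbps = (window_bytes * 8) / (rtt_ms / 1000) / 1_000_000
    print(f"RTT {rtt_ms:3d} ms -> max ~{max_mbps:7.1f} Mbps")
```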

Other common names for latency:

  1. Lag (slang)

  2. Ping

  3. Delay

  4. Response time

  5. Jitter (strictly, the variation in latency rather than latency itself)

  6. Rubber-banding (often used in gaming contexts)

  7. Hitching (common in streaming and video applications)

  8. Buffering (specifically in media streaming)

Why low latency is essential in real-time applications

Latency is affected by many factors, from physical distance and transmission medium to internet service quality and the specific IP networks a route traverses. Companies and IT professionals have continuously innovated and introduced ways to improve network speeds and the general user experience.

PubNub provides a real-time data stream network that allows developers to build real-time applications with almost no latency and unlimited concurrent users. With five global data centers, latency of less than 100 milliseconds worldwide, and 99.999% uptime service level agreements, you can be confident that your application will reliably run in real time and scale to fit your needs. If you are interested in building your real-time application with PubNub, contact sales today.