Measuring broadcast message latency using system clock, good idea?

I want to measure broadcast message latency over our message broker on a 1 Gbps LAN.

Messages are transmitted in a pub/sub fashion: one publisher, many consumers. The publisher timestamps each message using the system clock (DateTime.Now in C#), and consumers measure latency by subtracting the message's timestamp from DateTime.Now.

double latency = (DateTime.Now - msg.NMSTimestamp).TotalMilliseconds;

All of the boxes on our LAN sync their time via NTP once an hour, yet I'm seeing significant latency and even negative times, in the range of +/- 1 second. I read that NTP should provide ~5ms accuracy in a LAN environment.

Is my measurement strategy fundamentally flawed? Is there another explanation for the negative latency? If I were only seeing large latencies I'd suspect our message queue was slow, but the negative ones really have me confused.

Answer1:

What are your negative values looking like in millis? If they're within 5ms, that's normal for NTP, as you know. There could even be up to 10ms of difference between two computers if one was 5ms ahead of true time and the other was 5ms behind. Beyond that, I would guess that there's some rounding error, lookahead/lookbehind error, or sync error somewhere in your system. There are many hardware and implementation details you have little control over that can produce inaccuracies. Generally, system clocks polled via DateTime.Now are accurate enough at the millisecond level, but hardware details like CPU throttling under load, pipelines, cache thrashing etc. can introduce enough error to be significant at that level.
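The bound described above can be written out as a small check. This is only a sketch, assuming each host's clock stays within a fixed error of true time. `ClockErrorBound`, `MaxSkewMs`, `WithinClockNoise` and `perHostErrorMs` are hypothetical names, and the 5ms figure is the one quoted in the question.

```csharp
using System;

static class ClockErrorBound
{
    // Worst-case disagreement between two clocks that are each within
    // perHostErrorMs of true time: one may be ahead while the other is behind.
    public static double MaxSkewMs(double perHostErrorMs) => 2 * perHostErrorMs;

    // True when a measured one-way latency (possibly negative) is small
    // enough that clock disagreement alone could account for it.
    public static bool WithinClockNoise(double measuredMs, double perHostErrorMs)
        => Math.Abs(measuredMs) <= MaxSkewMs(perHostErrorMs);
}
```

With a 5ms per-host error, `ClockErrorBound.WithinClockNoise(-8, 5)` is true, while `ClockErrorBound.WithinClockNoise(-1000, 5)` is false, so readings near -1 second point to something other than ordinary NTP error.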

If possible, set up your computers to synchronize with the NTP server at least a second apart from each other. If all computers try to sync on the hour, every hour, the NTP server will be flooded, and crowding and packet scheduling will make the time it reports less accurate. I think this is the most likely cause of what's going on. Also, make sure your network is as efficient as possible by reducing cable runs (about 100 m, roughly 328 ft, is the specified maximum for twisted-pair Ethernet, and in an EMI-noisy environment runs as short as 40 feet can cause serious problems), replacing hubs with switches, and minimizing wireless network use.

Answer2:

I have logged a handful of incidents of negative network latency measured by the same clock.

Windows fails to slew the clock (adjust it gradually) and steps it instead, so you see these whenever a sync happens.
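When both ends of an interval are on the same machine (a round trip, for example), one way to sidestep clock adjustments is to time with `System.Diagnostics.Stopwatch`, which reads a monotonic counter instead of the wall clock. A minimal sketch follows; `IntervalTimer` and `Measure` are hypothetical names, not part of any messaging API.

```csharp
using System;
using System.Diagnostics;

static class IntervalTimer
{
    // Times an action with Stopwatch. Stopwatch does not read the wall
    // clock, so a time sync that happens while the action runs cannot
    // make the result negative.
    public static TimeSpan Measure(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Usage would look like `IntervalTimer.Measure(() => SendAndWaitForEcho())`, where `SendAndWaitForEcho` is a placeholder for whatever publishes a message and blocks until it comes back. Stopwatch values from different machines are not comparable, so this does not help with the one-way measurement in the question.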

Windows does not guarantee 5ms resolution either: the system clock advances in discrete ticks (historically 18.2 per second, about 55ms each). My machine provides an epsilon of about 15ms.
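To see what epsilon a given machine provides, one rough approach is to spin until the clock value changes and record the smallest step observed. This is a sketch only; `ClockGranularity` and `SmallestObservedStep` are hypothetical names, and the result depends on the OS and runtime version.

```csharp
using System;

static class ClockGranularity
{
    // Busy-waits until DateTime.UtcNow changes, `samples` times, and
    // returns the smallest forward step seen between consecutive values.
    public static TimeSpan SmallestObservedStep(int samples = 20)
    {
        long smallest = long.MaxValue;
        long last = DateTime.UtcNow.Ticks;
        for (int i = 0; i < samples; i++)
        {
            long now;
            do { now = DateTime.UtcNow.Ticks; } while (now == last);
            // A backward step (clock adjusted mid-run) is skipped here.
            if (now > last) smallest = Math.Min(smallest, now - last);
            last = now;
        }
        return TimeSpan.FromTicks(smallest);
    }
}
```

Calling `ClockGranularity.SmallestObservedStep().TotalMilliseconds` gives the step in milliseconds. Any latency smaller than that step cannot be resolved by subtracting two system clock readings.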
