
How Much Can a Leap Second Cost? (The Pragmatic Testing)

On October 12, 2012, in Syndicated, by Association for Software Testing
A few months back, an incident occurred when a leap second was added to the world’s atomic clocks. It affected many organizations, including Reddit, Mozilla, LinkedIn and many airlines around the world. It seems clear that the computer systems (shall we say ‘software’) of these affected companies could not handle or recognize this leap second.

What is a Leap Second?
I am not sure how authoritative this definition is, but since Wikipedia has started sounding more & more authoritative, here is its definition of a leap second:

A leap second is a one-second adjustment that is occasionally applied to Coordinated Universal Time (UTC) in order to keep its time of day close to the mean solar time. What this means is that a single second is added to UTC to keep it synchronized with Earth’s rotation.

More info can be found at http://en.m.wikipedia.org/wiki/Leap_second.

When did it happen last?
The most recent leap second was inserted on June 30, 2012 at 23:59:60 UTC. So how did just one second create havoc? Isn’t it insane? It’s just one second, so what difference could it have made? Yes, it does appear insane, but many systems choked on this one second. Many websites, including LinkedIn and Mozilla, stopped responding. A few major Linux-based servers were brought down by the change, among them one of the world’s biggest airline reservation & booking systems, which is used by many large airlines across the world; hundreds of flights were delayed or canceled worldwide. Are you thinking that it was a loss of millions of dollars? Possibly yes! Maybe more!!
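To see why a timestamp such as 23:59:60 can trip software up, here is a small Python sketch. It is only an illustration using Python’s standard library, not the code of any affected system; the names LEAP, FMT and datetime_accepts are made up for this example. It shows that the datetime type refuses second 60, while the older time module tolerates it but maps it to the same epoch value as the following midnight.

```python
import calendar
import time
from datetime import datetime

# The leap second from the article, written as a plain timestamp string.
LEAP = "2012-06-30 23:59:60"
FMT = "%Y-%m-%d %H:%M:%S"


def datetime_accepts(text):
    """Return True if datetime.strptime can represent the timestamp."""
    try:
        datetime.strptime(text, FMT)
        return True
    except ValueError:
        # datetime only allows seconds in the range 0..59.
        return False


# time.strptime tolerates second 60 and keeps it in tm_sec.
parsed = time.strptime(LEAP, FMT)

# calendar.timegm does plain arithmetic, so 23:59:60 and the next
# midnight end up with the same epoch value: the extra second vanishes.
leap_epoch = calendar.timegm(parsed)
next_epoch = calendar.timegm(time.strptime("2012-07-01 00:00:00", FMT))

print("datetime accepts 23:59:60:", datetime_accepts(LEAP))
print("tm_sec from time.strptime:", parsed.tm_sec)
print("same epoch as next midnight:", leap_epoch == next_epoch)
```

Two parts of the same standard library disagree about whether this second exists, and that kind of disagreement is worth knowing about before the clock actually reads 23:59:60.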

Why didn’t someone test this before?
Lack of awareness, simple! Many people are calling it a ‘leap second defect’. In fact, the defect is not the leap second itself; it was only triggered by the leap second insertion. Most of the affected systems are based on Unix/Linux, and apparently there was a defect in the kernel of the open-source Linux operating system that was triggered the moment this extra second was added to UTC.

So, it was not a testing issue. Linux is an open-source system, and possibly no one imagined that something of this sort would ever happen. It is a diligence and preparedness issue. Google was aware of the leap second and handled it.
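Preparedness can be as simple as feeding a leap-second timestamp to your own code before the real one arrives. The sketch below is only an illustration under stated assumptions: parse_utc_tolerant is a hypothetical helper, not something used by the companies mentioned above, and folding :60 into :59 is just one possible policy that may not suit every application.

```python
import re
from datetime import datetime, timezone

# Matches timestamps of the form 'YYYY-MM-DD HH:MM:SS'.
_STAMP = re.compile(r"(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$")


def parse_utc_tolerant(text):
    """Hypothetical helper: parse a UTC timestamp, folding :60 into :59."""
    match = _STAMP.match(text)
    if match is None:
        raise ValueError(f"unrecognized timestamp: {text!r}")
    year, month, day, hour, minute, second = map(int, match.groups())
    if second == 60:
        # Assumption: repeating :59 is acceptable for this application.
        second = 59
    # datetime still rejects anything else out of range (e.g. second 61).
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)


def test_leap_second_is_accepted():
    """The leap second from June 30, 2012 should parse without an error."""
    result = parse_utc_tolerant("2012-06-30 23:59:60")
    assert result == datetime(2012, 6, 30, 23, 59, 59, tzinfo=timezone.utc)


def test_ordinary_second_is_unchanged():
    """Normal timestamps should pass through untouched."""
    result = parse_utc_tolerant("2012-07-01 00:00:00")
    assert result == datetime(2012, 7, 1, 0, 0, 0, tzinfo=timezone.utc)


if __name__ == "__main__":
    test_leap_second_is_accepted()
    test_ordinary_second_is_unchanged()
    print("leap second checks passed")
```

A check like this would not have caught a kernel defect, but it does show how cheaply the "what if the clock reads :60?" question can be asked of application code.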

However, some credit should be given to the debugging & fixing teams of these companies, who fixed the issue within a few hours. Surely someone must have tested the patch before rolling it out.









