Follow Us

Microsoft server crash nearly causes 800-plane pile-up

Failure to restart system caused data overload.

A major breakdown in Southern California's air traffic control system last week was partly due to a "design anomaly" in the way Microsoft Windows servers were integrated into the system, according to a report in the Los Angeles Times.

The radio system shutdown, which lasted more than three hours, left 800 planes in the air without contact to air traffic control, and led to at least five cases where planes came too close to one another, according to comments by the Federal Aviation Administration reported in the LA Times and The New York Times. Air traffic controllers were reduced to using personal mobile phones to pass on warnings to controllers at other facilities, and watched close calls without being able to alert pilots, according to the LA Times report.

The failure was ultimately down to a combination of human error and a design glitch in the Windows servers brought in over the past three years to replace the radio system's original Unix servers, according to the FAA.

The servers are timed to shut down after 49.7 days of use in order to prevent a data overload, a union official told the LA Times. To avoid this automatic shutdown, technicians are required to restart the system manually every 30 days. An improperly trained employee failed to reset the system, leading it to shut down without warning, the official said. Backup systems failed because of a software failure, according to a report in The New York Times.

The contract for designing the system, called Voice Switching and Control System (VSCS), was awarded to Harris Corporation in 1992 and the system was installed in the late 1990s, initially using Unix servers, according to Harris. In 2001, the company completed testing of the VSCS Control Subsystem Upgrade (VCSU), which replaced the original servers with off-the-shelf Dell hardware running Microsoft Windows 2000 Advanced Server. The upgrade was installed in California last year, according to the FAA.

Soon after installation, however, the FAA discovered that the system design could lead to a radio system shutdown, and put the maintenance procedure into place as a workaround, the LA Times said. The FAA reportedly said it has been working on a permanent fix but has only eliminated the problem in Seattle. The FAA is now planning to institute a second workaround - an alert that will warn controllers well before the software shuts down.

The shutdown is intended to keep the system from becoming overloaded with data and potentially giving controllers wrong information about flights, according to a software analyst cited by the LA Times.

Microsoft told Techworld it was aware of the reports but was not immediately able to comment.







Send to a friend

Email this article to a friend or colleague:

PLEASE NOTE: Your name is used only to let the recipient know who sent the story, and in case of transmission error. Both your name and the recipient's name and address will not be used for any other purpose.

Techworld White Papers

Desktop modernisation

On the one hand, there is the need to keep the existing desktop environment efficient, secure...

Download Whitepaper

Top 10 myths about virtualising business-critical applications

Even though virtualization has brought positive change to enterprise IT over the last decade,...

Download Whitepaper

Aligning CFO and CIO priorities

Forward-thinking organisations are viewing cloud computing as an investment in business...

Download Whitepaper

The new corporate network

Businesses can’t afford to have employee productivity suffer because they cannot use their...

Download Whitepaper

Techworld UK - Technology - Business

Techworld Awards

Techworld Awards 2012
Coming Soon

Opening for submissions 30th April 2012

 

Find out more

Techworld Mobile Site

Access Techworld's content on the move

Get the latest news, product reviews and downloads on your mobile device with Techworld's mobile site.

Find out more...
LogMeIn Rescue

Accelerate Your IT Efficiency

View the latest capacity management resources including whitepapers, videos and news.

Find out more...

Site Map

* *