Waving goodbye to the blue screen
System crashes are not only annoying; they can bring a company to its knees. So IT specialists will welcome a way to measure the dependability of standard operating systems (OS). After conducting hundreds of computer-stress experiments, a European consortium has developed a new benchmarking method for commercial and open source systems.
Gone are the days when performance was all that mattered. What information technology people want - but rarely get - is a reliable system, especially now that system developers increasingly use off-the-shelf operating systems for all their needs, including critical applications.
Two papers have emerged from recent dependability benchmarking tests on the Microsoft Windows family. The first is on Windows 2000, the goal being to define the benchmarking principles and to check them. Paper two, published by the IEEE Computer Society Press in June 2004, also includes Windows NT4 and Windows XP.
"We compared these systems' erroneous behaviour at the application layer," says Karama Kanoun, coordinator of the IST project DBench, which oversaw the tests. She says the dependability benchmark addresses three complementary measures: robustness, OS reaction time and OS restart time. Together these measurements can characterise a system's behaviour in the presence of application software faults. The project focused on users, such as system developers, who need dependability benchmarks to pick the best systems for their requirements.
The experiments involved a remote machine, the Benchmark Controller, linked to the computer running the operating system under test and a third machine that ran the workload (a database management application). The Benchmark Controller diagnosed and collected data whenever the operating system experienced an anomaly.
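As a rough illustration of how the three measures could be summarised from the data the Benchmark Controller collects, here is a minimal Python sketch. It is not the DBench tooling: the `Experiment` record, its field names, the units and the outcome labels ("ok", "abort", "hang") are all assumptions made for this example, and robustness is simply taken to be the share of experiments ending in each outcome.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class Experiment:
    """One fault-injection run (hypothetical record layout)."""
    outcome: str        # assumed labels, e.g. "ok", "abort", "hang"
    reaction_ms: float  # OS reaction time observed in this run (assumed unit)
    restart_s: float    # OS restart time after this run (assumed unit)


def summarise(experiments):
    """Summarise a non-empty list of runs into the three measures."""
    total = len(experiments)
    counts = Counter(e.outcome for e in experiments)
    return {
        # Robustness, here assumed to be the fraction of runs per outcome.
        "robustness": {label: n / total for label, n in counts.items()},
        "mean_reaction_ms": mean(e.reaction_ms for e in experiments),
        "mean_restart_s": mean(e.restart_s for e in experiments),
    }


if __name__ == "__main__":
    # Made-up values for illustration only, not results from the project.
    runs = [
        Experiment("hang", 10.0, 100.0),
        Experiment("abort", 20.0, 80.0),
        Experiment("abort", 30.0, 90.0),
        Experiment("ok", 20.0, 70.0),
    ]
    print(summarise(runs))
```

Running the same summary for each operating system under an identical workload and fault set is what would make the figures comparable between systems.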
So what did the tests on the three Windows systems reveal? "In terms of robustness, they behave similarly," says Kanoun. "But there was a noticeable difference in OS reaction and restart times: XP has the shortest reaction and restart times, followed by Windows NT, then Windows 2000." The project partners also noted that the application state at the end of the experiment (mainly the hang and abort states) significantly impacts the restart time for the three operating systems.
The coordinator believes that this dependability benchmarking, done in Europe and further afield, was a world first: "People previously focused on validating operating systems. Now the IT world can compare these systems in a standard way." Also original was the way the tests looked at response times in the presence of faults.
She hopes the project's results will accelerate the introduction of international standardisation bodies for dependability benchmarking, possibly within five years. The process is already gaining momentum, thanks to the Dependability Benchmarking Special Interest Group. Created in 1999, when DBench was starting, this group is made up of major IT players from the United States and Europe.
The project looked at four kinds of system, in the space, automotive and enterprise fields: general-purpose operating systems, real-time kernels, applications and online transaction processing. According to Kanoun, companies such as IBM, Sun Microsystems and Intel are now doing dependability benchmarking on e-commerce and Web applications, encouraged by some of the results emerging from DBench. Some of the project partners are also looking at Linux systems.
Ms Karama Kanoun
7 Avenue Colonel Roche
F-31077 Toulouse Cedex 4
Tara Morris | IST Results