Ames Laboratory puts the "squeeze" on communications technology
New parallel library allows maximum performance for communication networks
A new message-passing library that makes it possible to extract optimum performance from workstation and personal computer clusters, as well as from large massively parallel supercomputers, has been developed by researchers at the U.S. Department of Energy’s Ames Laboratory. The new library, called MP_Lite, supports and enhances the basic capabilities that most software programs require to communicate between computers.
Although MP_Lite could be scaled up easily, its objective is not to provide all the capabilities of the full message-passing interface, or MPI, standard. MPI is a widely used model that standardizes the syntax and functionality for message-passing programs, allowing a uniform interface from the application to the underlying communication network. Parallel libraries that offer the full MPI standard ease programming problems by reducing the need to repeat work, such as defining consistent data structures, data layouts and methods that implement key algorithms.
“Our goal with MP_Lite is to illustrate how to get better performance in a portable and user-friendly manner and to understand exactly where any inefficiencies in the MPI standard may be coming from,” said David Turner, an Ames Laboratory assistant scientist and the principal investigator working on the MP_Lite project. He explained that the MP_Lite library is smaller and much easier to work with than full MPI libraries. “It’s ideal for performing message-passing research that may eventually be used to improve full MPI implementations and possibly influence the MPI standard,” he said.
Turner noted that it was “mainly frustration” that led him to develop the MP_Lite library. “Most message-passing packages are large and clunky to work with, and are often difficult to install. If you run into any errors at all, they give you very cryptic messages that mean nothing unless you actually wrote the library,” he said. “So a lot of the reason I got into the project was not just to improve the efficiency, but also to make the message-passing more user-friendly.”
Offering an example, Turner said, “If two processors are communicating, and one waits a minute for a response from the other one – well, a minute is a very long time in this context – the library should put out a warning into a log file. But that’s something that’s not done. Most message-passing systems don’t tell you what’s wrong if a communication buffer overflows or a node is waiting for a message that never gets sent. What if there’s a five-minute wait for a message?” he continued. “Something is probably frozen up, so at that point the library should implement an abort and give the user as much information about the current state of the system as possible.” Turner noted that MP_Lite operates with minimal buffering and warns of potential problems. When possible, MP_Lite will write warnings to a log file and eventually time out when a lock-up occurs. “There’s a lot of these user-friendly aspects that I’d like to see put into other message-passing systems,” he said.
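The warn-then-abort behavior Turner describes can be illustrated with a short sketch. This is not MP_Lite code and does not use its API: the function `recv_with_watchdog`, the exception `ReceiveTimeout`, and the `poll` parameter are hypothetical names chosen for illustration, and a Python `queue.Queue` stands in for the communication channel. The one-minute and five-minute defaults come from Turner's example.

```python
import logging
import queue
import time

# Hypothetical logger name; attach a logging.FileHandler to send warnings to a log file.
log = logging.getLogger("watchdog_sketch")


class ReceiveTimeout(RuntimeError):
    """Raised when no message arrives before the abort threshold (hypothetical)."""


def recv_with_watchdog(inbox, source, warn_after=60.0, abort_after=300.0, poll=1.0):
    """Wait for a message on `inbox`, warning once after `warn_after` seconds
    and aborting with a description of the current state after `abort_after`."""
    start = time.monotonic()
    warned = False
    while True:
        elapsed = time.monotonic() - start
        remaining = abort_after - elapsed
        if remaining <= 0:
            # Abort and report as much state as this toy channel can offer.
            raise ReceiveTimeout(
                f"no message from node {source} after {elapsed:.1f}s; "
                f"messages queued in inbox: {inbox.qsize()}"
            )
        try:
            # Wake up periodically so the thresholds can be checked.
            return inbox.get(timeout=min(poll, remaining))
        except queue.Empty:
            pass
        if not warned and time.monotonic() - start >= warn_after:
            log.warning("still waiting on node %s after %.1fs", source, warn_after)
            warned = True
```

A caller would invoke `recv_with_watchdog(inbox, source=3)` in place of a plain blocking receive. Polling in short intervals is a design choice made for this sketch: it keeps the example to a single thread, at the cost of a small delay before each threshold is noticed.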
In addition to enhancing performance, another goal Turner has for MP_Lite is to tie it directly to a full MPI library. To do so, he’s been working with the DOE’s Argonne National Laboratory and running their MPICH library on top of MP_Lite. “By doing this, we can pass the good performance of MP_Lite on to the full MPI implementation,” he said. “So we combine the best of both, keeping the efficiency of my library and the greater functionality of Argonne’s.”
Turner said he named the library MP_Lite for several reasons. The small size of the library’s code makes it easy to install anywhere – it compiles in under a minute. And there’s much less code, so it’s more streamlined than a full MPI library. It also has its own syntax, which is simpler and can be used in place of the MPI syntax. The other reason Turner likes calling the library MP_Lite is the answer he’s able to give when responding to people who ask him, “I use this MPI function; why isn’t it in your library?” He simply replies, “Well, it’s ‘lite.’”
Turner admits that the work on MP_Lite suits him well. “I like the puzzle aspect of it. I like tuning codes and getting them to run on a scaling computer, and trying to squeeze more performance out of what’s there,” he said.
The research is funded by DOE’s Office of Mathematical, Information, and Computational Sciences. Ames Laboratory is operated for the DOE by Iowa State University. The Lab conducts research into various areas of national concern, including energy resources, high-speed computer design, environmental cleanup and restoration, and the synthesis and study of new materials.
Note: MP_Lite may be downloaded free of charge from: http://www.scl.ameslab.gov/Projects/MP_Lite/
David Turner | EurekAlert!