Sultan Qaboos University eschewed conventional logic by building a high performance computing facility out of off-the-shelf hardware, instead of buying an expensive supercomputer.
AL-BARWANI: Nowadays there's no university without an HPC facility available.
By Imthishan Giado
Sun 07 Feb 2010 04:00 AM

In the past, universities were not renowned for being at the forefront of technology. IT needs were often simplistic, with lab space playing second fiddle to the need to accommodate hordes of bright-eyed students in commodious classrooms.

Today, however, it's quite reasonable to say that universities are virtual nations unto themselves, microcosms of humanity where tech requirements span from basic connectivity to some serious number-crunching. But the latter need is hard to fill without building an expensive supercomputer, the price of which can run into several millions of dollars.

Fortunately, the inquiring minds at Oman's Sultan Qaboos University were able to find an alternative solution, through the deployment of a High Performance Computing (HPC) cluster from Sun Microsystems at the institution's Centre for Information Systems.

Muataz Al-Barwani, computational physicist, assistant professor in the physics department in the college of science and manager of the high performance computing facility, outlines the history of the implementation: "I've initiated this project back in 2007 and actually had to convince the university of the need of such a facility here at Sultan Qaboos University. Since then, it's all about talking to people and getting to see how many potential users there would be. Finally the university agreed so we started it around early 2007, with funds provided in the beginning of 2008."

The initial plan called for the project to be implemented in three phases. The first - which would eventually cost $204,000 - came to the tender stage in 2008 and drew six proposals, as Al-Barwani details.

"Basically, we posted an RFP on our procurement website and got six proposals from five different vendors. Sun had two partners proposing. Then we decided in July 2008 after different evaluations of the tenders and presentations, on the Sun solution," he confirms.

Equipment for the facility arrived in October and the facility went live in mid-November 2008. Unusually for an IT project, the CIO was not in charge of the implementation; that responsibility fell to Al-Barwani, who passed documentation to the dean of the faculty.

"He [the CIO] wasn't directly involved. I was the lead on the HPC system. This is high performance computing and you need very specialised people. My experience with HPC goes back all the way to the mid-1990s. I'm probably one of the most experienced here with HPC here on campus," he says.

During the evaluation stage, Al-Barwani rated support, and in particular how quickly the winning vendor would be able to respond, as a key factor. Despite the fact that the HPC facility would be using off-the-shelf hardware, he found that many vendors lacked the experience to provide a complete hardware and software system, as well as the necessary training.

"In some of the cases, the tenders might have been the cheapest, but they were not offering any training, hardly any support, the company had no experience in HPC. One of the things that helped is that Sun had sent an HPC specialist that was based in Dubai, so we had someone close by if we needed support. The other vendors unfortunately sent people who were not even familiar with what they were presenting," explains Al-Barwani.

He goes on to detail the system he has built at SQU: "The three phase system called for us to get a small system to start with. Phase two would be an expansion of the number of compute nodes, with additional storage. That's exactly what we've done. The HPC facility is basically an off-the-shelf system, a chassis with 10 blades. There's nothing unique about this particular hardware. They provided us with a full 42U rack so we have ten blades, each the equivalent of a normal PC.

"With dual-core processors, each node has four cores and out of the ten nodes, one is dedicated as a management node, the other is dedicated as a login-node, the remaining eight are compute nodes. So we have eight times four cores which equals 32 cores for computation. Our second phase will include another chassis with ten more blades. The new ones will be quad-core processor-based, so each blade will have eight computational cores," he continues.
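The core counts Al-Barwani quotes can be checked with a quick back-of-the-envelope calculation. The sketch below is illustrative only and rests on two assumptions the article does not confirm: that each blade carries two processors (which would explain four cores per node on dual-core chips and eight on quad-core), and that all ten phase two blades will serve as compute nodes. The function and variable names are hypothetical.

```python
# Rough core count for the cluster as described in the article.
# Assumption: two processors per blade, inferred from "dual-core" chips
# giving "four cores" per node; the article does not state this directly.
PROCESSORS_PER_BLADE = 2


def cores_per_blade(cores_per_processor, processors=PROCESSORS_PER_BLADE):
    """Cores available on a single blade."""
    return cores_per_processor * processors


def compute_cores(blades, service_nodes, cores_per_processor):
    """Cores left for computation once service nodes are set aside."""
    compute_blades = blades - service_nodes
    return compute_blades * cores_per_blade(cores_per_processor)


# Phase one: ten blades, one management node and one login node, dual-core.
phase_one = compute_cores(blades=10, service_nodes=2, cores_per_processor=2)

# Phase two: ten more blades, quad-core.
# Assumption: all ten are compute nodes, since the existing management and
# login nodes could serve both chassis; the article does not say.
phase_two = compute_cores(blades=10, service_nodes=0, cores_per_processor=4)

print(phase_one)              # 32
print(phase_two)              # 80
print(phase_one + phase_two)  # 112
```

Under those assumptions, the second phase would more than triple the number of cores available for computation.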

Understandably, SQU management initially had concerns about whether Al-Barwani's project could attract enough users. However, these proved to be unfounded, as the HPC facility now has 20 full-time users.

"They are mostly academics from different departments such as mechanical, civil and electrical engineering. Just the other day, a guy from Geography is interested in weather forecasting," he says.

Now that the system has proven itself, Al-Barwani has moved on to the expansion stage. Late last summer, he issued a tender for the second phase, which will cost in the vicinity of $116,000. Sun partners submitted two proposals, one of which was accepted and is now pending final approval from management.

While the phased approach is slightly slower, he says it has its advantages: "Of course, everyone hopes for a lot of money in one go to implement a larger system from the beginning. But when you start small, you can have changes in the second phase so you're not stuck with what you bought in one go. In phase one, we went with dual-core where each node had four cores. New Nehalem processors have come on the market which meant we can get good performance out of quad-core chips, so we went with that for the second phase."

Beyond the expansion, Al-Barwani has ambitious plans for the facility, one of few in the Middle East: "We're waiting for the upper management to approve a proposal for a steering committee that will oversee the general project of the HPC facility. My vision for the facility is to expand it into a full-fledged centre which can cater as a service to the users and also as a hub for interdisciplinary research. Once that is approved and in place, then hopefully things will be moving much smoother."

"If you look at it from the perspective of having no access to an HPC facility a year ago and what you have now, that in itself is an achievement. It is worth whatever investment we've put in because nowadays there's no university without an HPC facility of some kind," concludes Al-Barwani.

Clouds over the horizon

With the news in April last year that Oracle has bought Sun Microsystems for a considerable $7.4 billion, many users of the latter's hardware products have been worried that support may dry up or vanish altogether. HPC facility manager Muataz Al-Barwani is not worried, however.

"I have not put a contingency plan into place. But let's say that today, Sun stops building such systems. The hardware itself is not unique - it's been put together by Sun but it's Intel-based. You can still build on top of it with other systems.

"It won't be homogeneous any more, it'll be heterogeneous but it's not the end of the world. I don't see that happening in the next couple of years at least and it won't be something I will be worrying about at this stage," he assures.

