I came across an interesting article on Amazon HPC Cloud. As
was reported recently, Cycle Computing built a 50,000-core Amazon cluster for
Schrödinger, which makes simulation software for use in pharmaceutical and
biotechnology research. Amazon and Cycle Computing made a lot of noise about the
HPC capability of EC2, how great it is for HPC applications, and what a great
partner Schrödinger is.
When Schrödinger actually tried to run their simulations on
the cloud, the results were not that great. The Amazon EC2 architecture slows down
HPC applications that require a decent amount of communication between servers,
even at small scale. Schrödinger President Ramy Farid mentioned that they have
successfully run parallel jobs on Amazon's eight-core boxes, but when they tried
anything more than that, they got terrible performance. Farid was using
Amazon’s eight-core server instances, so running a job on 16 cores
simultaneously required two eight-core machines. “Slow interconnect speeds
between separate machines does become a serious issue.”
It is well known that Amazon EC2 is not a good place for HPC
applications, and that will not change unless they actually build the right
solution. But instead of taking the right steps, Cycle Computing CEO Jason
Stowe decided that the application is at fault: that the application tested
represents only 1% of HPC applications, and that Amazon cares about the other
99%. Jason, wake up! The applications tested are a good indication of many other
HPC applications. Don’t blame the application or the user, blame yourself for
building a lame solution. Deepak Singh, Amazon’s principal product manager for
EC2, also had smart things to say: “We’re interested in figuring out from our
customers what they want to run, and then deliver those capabilities to them.
There are certain specialized applications that require very specialized
hardware. It’s like one person running it in some secret national laboratory.” Deepak,
you need to wake up too and stop with these marketing responses. If you want
to host HPC applications, don’t call every example a “secret national
laboratory”, and stop calling standard solutions that anyone can buy from any
server manufacturer, such as InfiniBand, “very specialized hardware”.
Amazon is clearly not connected to its users, or potential
users, and until they try to understand what we need, the best thing is to avoid
them. There are much better HPC cloud solutions out there.