I came across an interesting article on Amazon's HPC cloud. As was reported recently, Cycle Computing built a 50,000-core Amazon cluster for Schrödinger, which makes simulation software for use in pharmaceutical and biotechnology research. Amazon and Cycle Computing made a lot of noise about the HPC capability of EC2, how great it is for HPC applications, and what a great partner Schrödinger is.
When Schrödinger actually tried to run their simulations on the cloud, the results were not that great. The Amazon EC2 architecture slows down HPC applications that require a decent amount of communication between servers, even at small scale. Schrödinger President Ramy Farid mentioned that they had successfully run parallel jobs on Amazon's eight-core boxes, but when they tried anything more than that, they got terrible performance. Farid was using Amazon’s eight-core server instances, so running a job on 16 cores simultaneously required two eight-core machines. “Slow interconnect speeds between separate machines does become a serious issue.”
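To make the interconnect point concrete, here is a rough sketch of a toy latency-plus-bandwidth model. It is not anything Schrödinger, Cycle Computing, or Amazon published: the function names, the workload, and both sets of network numbers are hypothetical, picked only to show the shape of the problem. The model assumes compute divides evenly across cores and that a job confined to one eight-core box pays no network cost at all.

```python
# Toy latency + bandwidth model of one simulation step.
# Every parameter value below is hypothetical, chosen only for illustration.

def step_time(cores, cores_per_node, compute_s, msgs_per_step,
              msg_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated wall time of one step: ideal compute split plus network cost."""
    compute = compute_s / cores
    nodes = -(-cores // cores_per_node)  # ceiling division
    if nodes == 1:
        comm = 0.0  # assumption: communication inside one machine is free
    else:
        comm = msgs_per_step * (latency_s + msg_bytes / bandwidth_bytes_per_s)
    return compute + comm

def speedup(cores, **params):
    """Speedup relative to running the same step on a single core."""
    return step_time(1, **params) / step_time(cores, **params)

# Hypothetical workload: 1 s of compute and 1000 messages of 8 KiB per step.
WORKLOAD = dict(cores_per_node=8, compute_s=1.0,
                msgs_per_step=1000, msg_bytes=8192)
# Hypothetical networks: a slow Ethernet-like link and a fast low-latency link.
SLOW_NET = dict(latency_s=100e-6, bandwidth_bytes_per_s=125e6)
FAST_NET = dict(latency_s=2e-6, bandwidth_bytes_per_s=4e9)

if __name__ == "__main__":
    for name, net in (("slow", SLOW_NET), ("fast", FAST_NET)):
        for cores in (8, 16):
            print(name, cores, round(speedup(cores, **WORKLOAD, **net), 2))
```

With these made-up numbers, going from 8 to 16 cores makes the job slower on the slow link and faster on the fast one, which is the pattern Farid describes when a job has to span two machines.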
It is well known that Amazon EC2 is not a good place for HPC applications, and that will not change unless they actually build the right solution. But instead of taking the right steps, Cycle Computing CEO Jason Stowe decided that the application is at fault: the application tested supposedly represents only 1% of HPC applications, and Amazon cares about the other 99%. Jason, wake up! The applications tested are a good indication for many other HPC applications. Don’t blame the application or the user, blame yourself for building a lame solution.
Deepak Singh, Amazon’s principal product manager for EC2, also had smart things to say: “We’re interested in figuring out from our customers what they want to run, and then deliver those capabilities to them. There are certain specialized applications that require very specialized hardware. It’s like one person running it in some secret national laboratory.” Deepak, you need to wake up too and stop with these marketing responses. If you want to host HPC applications, don’t call every example a “secret national laboratory”, and stop calling standard solutions that anyone can buy from any server manufacturer, such as InfiniBand, “very specialized hardware”.
Amazon is clearly not connected to its users, or potential users, and until it tries to understand what we need, the best thing is to avoid it. There are much better solutions for HPC clouds out there.