Sunday, September 25, 2011

Tier Network is Dead, Long Live the Flat Network

An interesting article was published a few days ago on The Register by Timothy Prickett Morgan, discussing traditional three-tier networks versus flat networks. While the topic is not new, it seems that more datacenter networks are now adopting high-performance architectures and going flat.

The three-tier network is the traditional hierarchical datacenter design. It was never fast enough or cost-effective enough, but it was championed by Cisco and therefore used in many places. The HPC side of the world went for flat networks, which deliver lower latency, higher utilization and, of course, much better cost effectiveness. Many of today's "web 2.0" applications adopt the HPC concepts of parallelism and suffer from low performance on a three-tier network. Moving to a flat network is the ideal situation for them.

It was funny to read that BLADE Network Technologies (now IBM) agrees that datacenter applications are going the HPC way, but that their company vision is to continue with the three-tier concept… well… in my eyes this is the right decision if you want to stay behind… A flat network, such as a Clos network, is the better way for any large-scale system, both for performance and for cost.
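To make the flat-network argument a bit more concrete, here is a small back-of-the-envelope sketch (my own, not from the article) that sizes a nonblocking two-tier Clos (leaf-spine) fabric built entirely from k-port switches: each leaf gives half its ports to hosts and half to uplinks, one per spine, so any host can reach any other in at most three switch hops at full bandwidth.

```python
def clos_two_tier(k):
    """Size a nonblocking two-tier Clos (leaf-spine) fabric built
    from k-port switches (k assumed even).

    Each leaf dedicates k/2 ports to hosts and k/2 uplinks, one to
    each spine, so host bandwidth equals uplink bandwidth per leaf."""
    spines = k // 2             # one uplink per leaf to every spine
    leaves = k                  # each k-port spine connects one port per leaf
    hosts = leaves * (k // 2)   # k/2 host-facing ports per leaf
    switches = leaves + spines
    return {"hosts": hosts, "switches": switches,
            "max_hops": 3}      # host -> leaf -> spine -> leaf -> host

# Example: 36-port switches, a common InfiniBand switch radix
print(clos_two_tier(36))  # {'hosts': 648, 'switches': 54, 'max_hops': 3}
```

The same uniform building block scales the fabric by just adding leaves and spines, which is exactly the cost story that the three-tier model loses.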

Thursday, September 22, 2011

The Texas Advanced Computing Center (TACC) Revealed Plans to Deploy a 10 Petaflops System

TACC announced today plans to deploy one of the world's fastest supercomputers – a 10-petaflop machine – at the beginning of 2013. The money is coming from the National Science Foundation's (NSF) "eXtreme Digital" (XD) program.

When completed, the system will comprise several thousand Dell servers with Intel Sandy Bridge-EP CPUs (PCI-Express Generation 3). The CPUs will provide 2 petaflops of peak performance. The system will also include Intel Knights Corner accelerators (finally another option besides NVIDIA) that will add an additional 8 petaflops, for a total of 10 petaflops. The cluster will be connected with an InfiniBand FDR 56Gb/s network, which according to HPCwire is from Mellanox (well… this is an easy guess…).

This is a great win for commodity HPC (we do not like proprietary), a great system for research, and a great win for the companies that really drive HPC innovation – Intel and Mellanox – and for the company that knows how to package it all together – Dell. I really hope that Dell will now increase its investment in HPC activities, such as workshops, as lately it seems they have gone the other way.

I hope to see many more!

Wednesday, September 14, 2011

$126 Million Approved by US Senate Appropriations Committee for Fiscal Year 2012 For Exascale Development

As has been reported, the Senate Appropriations Committee voted on September 7th to approve the amount as part of the fiscal 2012 energy and water appropriations bill; fiscal 2012 starts October 1st, 2011. In a report accompanying the bill, the Senate appropriators say that the Department of Energy plans to deploy the first Exascale system in 2018, despite the challenges.

Good news indeed. You can argue whether the funding is enough to accelerate development toward Exascale, but the indication of 2018 as the target year for the first Exascale machine is encouraging news. My only hope is that part of the funding will go toward commodity-based development. I would hate to see the entire amount go toward proprietary solutions (we saw what happened with Blue Waters…)

Since my systems are not as big as the ones in the large, well-known HPC centers (well… I do not have the funding that they get…), any development around standards-based solutions will help make my systems better, faster and stronger, as I can use the same building blocks. I would like to have my own small Exascale-like system in 2018 too… :-)

Thursday, September 8, 2011

All the Clouds Around Block my View…

Cloud computing is (well… has been, for quite some time now) the new buzzword. All the different vendors try to teach us what cloud computing is and why it is so good for us, and every day you can find 10 different conferences on cloud computing… I have never seen so many different explanations of a single term. Some define cloud computing as utility computing, others as service computing, grid computing or computing by the hour. I prefer to call it "stop wasting my time"… ;-)

The concept of a shared platform is not new, definitely not in the HPC world. My systems are shared by many users, each of whom runs different jobs and different applications and consumes the agreed allocation of compute power. It is true that my preference is to use a single operating system, but I can easily provision my system to support different flavors if needed. I don't need to provision by the core; doing it by the server is good enough. I don't need to face the issues of memory-access starvation, security, management of my data, and, no less important, the performance loss due to extra, unneeded software layers. We can probably define such systems as physical "cloud" HPC systems.

There are some interesting experiments around the concept of virtualized clouds for HPC, but I have not heard much from them. Two of them are at Berkeley and Argonne. The main idea there is to ease cluster management. Virtualization might be a solution for some of the future reliability issues, but it will need to be done differently than it is today. Well… talking about another buzzword… :-)

Monday, September 5, 2011

Cluster Debug Tools – Overview

Debugging is not something I enjoy doing, but sometimes I cannot avoid it… so I thought a short overview of some of the available tools would be helpful.

- TotalView: probably the most widely used debugger for parallel programs. It can be used with C, C++ and Fortran programs and supports all common forms of parallelism (pthreads, OpenMP and MPI). The vendor behind it, TotalView Technologies, was acquired by Rogue Wave in December 2009.
- DDT: the Distributed Debugging Tool, a product of Allinea Software. DDT is a graphical debugger for debugging complex parallel codes. It is supported on a variety of platforms for C, C++ and Fortran, and is a strong competitor to TotalView.
- PGDBG: The Portland Group Compiler Technology symbolic debugger for F77, F90, C, C++ and assembly-language programs. It is capable of debugging parallel programs on clusters, including threaded, MPI and hybrid MPI/threaded codes.
- GDB: the GNU debugger, a relatively powerful, command-line, single-process debugger. It is text-based, easy to learn, and suitable for a quick debug. C, C++ and Fortran programs are supported. GDB is not a parallel debugger, and not particularly well suited to debugging threaded codes, but it can be used with the graphical DDD front-end, and it can be used to debug a single process of a parallel job.
- DDD: the GNU DDD debugger, a graphical front-end for command-line debuggers such as GDB, DBX, WDB, Ladebug, JDB, XDB, the Perl debugger, the bash debugger and the Python debugger. DDD is not really a parallel debugger, but like GDB it can be used to debug a single process of a parallel program.
- STAT: the Stack Trace Analysis Tool gathers and merges stack traces from a parallel application's processes. It produces 2D spatial and 3D spatial-temporal call-prefix trees: the 2D spatial tree represents a single snapshot of the entire application, while the 3D spatial-temporal tree represents a series of snapshots taken over time.
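GDB and DDD only see one process at a time, while STAT's value is merging many stack traces into a single view of where the application is stuck. As a toy illustration of that merging idea (my own sketch in Python, unrelated to STAT's actual implementation, which works across MPI processes), the standard library can already collect the current stack of every thread in one process:

```python
import sys
import threading
import time
import traceback

def dump_all_stacks():
    """Return a merged text dump of every live thread's current stack.
    A crude, single-process cousin of the view STAT builds across ranks."""
    frames = sys._current_frames()          # {thread_id: topmost frame}
    out = []
    for t in threading.enumerate():
        frame = frames.get(t.ident)
        if frame is None:
            continue
        out.append(f"--- {t.name} ---")
        out.extend(line.rstrip() for line in traceback.format_stack(frame))
    return "\n".join(out)

def worker(started):
    started.set()
    time.sleep(0.5)        # park here so the dump catches us inside sleep()

started = threading.Event()
t = threading.Thread(target=worker, name="worker-0", args=(started,))
t.start()
started.wait()
print(dump_all_stacks())   # shows the stacks of MainThread and worker-0
t.join()
```

For a hung threaded code, a dump like this is often enough to see which threads are stuck and where, before reaching for the heavier tools above.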

If you are using other tools, please feel free to add them.