For some compute-demanding applications, data throughput is a critical element in achieving the needed performance. Throughput is of course also important for storage performance and scalability, for checkpointing and other cases, which are relevant to most applications. Today the fastest solution we can find is InfiniBand FDR, which runs at 56 gigabits per second. Taking the encoding overhead off, we are left with around 54 gigabits per second for actual data movement. Other options are QDR InfiniBand or 40 gigabit Ethernet. Ethernet is not what we use for our HPC systems – too much performance overhead.
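The overhead numbers above come from the line encoding each generation uses: QDR-era links use 8b/10b encoding (25% overhead), while FDR moved to 64b/66b (about 3% overhead). A minimal sketch of that arithmetic, with approximate signaling rates:

```python
# Sketch: effective data rate after line-encoding overhead.
# Signaling rates are approximate; the encodings (8b/10b for QDR,
# 64b/66b for FDR) are the standard ones for those link generations.

def effective_gbps(signal_gbps, payload_bits, coded_bits):
    """Raw signaling rate scaled by the encoding's payload fraction."""
    return signal_gbps * payload_bits / coded_bits

# QDR: 40 Gb/s signaling with 8b/10b encoding -> 32 Gb/s of data
qdr = effective_gbps(40, 8, 10)
# FDR: 56 Gb/s signaling with 64b/66b encoding -> ~54 Gb/s of data
fdr = effective_gbps(56, 64, 66)

print(f"QDR effective: {qdr:.1f} Gb/s")
print(f"FDR effective: {fdr:.1f} Gb/s")
```

This is why FDR's real-world advantage over QDR is larger than the headline 56-versus-40 comparison suggests: QDR loses a quarter of its signaling rate to encoding, FDR only a few percent.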
One can claim that there are 100 gigabit ports on some Ethernet switches, but these are for network aggregation, not for connecting to the server. These ports actually use 10 lanes of 10 gigabits each – not the desired 4-lane approach.
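The lane arithmetic is simple but worth spelling out. A sketch, assuming 25 Gb/s per lane for the 4-lane approach (the lane rate the next-generation 4-lane links are built around):

```python
# Sketch: lane counts behind a 100 Gb/s port (illustrative figures).
port_gbps = 100

# Aggregation-style Ethernet ports: 10 lanes of 10 Gb/s each
lanes_aggregation = port_gbps // 10   # 10 lanes

# The desired server-facing approach: 4 lanes of ~25 Gb/s each
lanes_server = port_gbps // 25        # 4 lanes

print(f"10G lanes needed: {lanes_aggregation}")
print(f"25G lanes needed: {lanes_server}")
```

Fewer, faster lanes are what make a 100 gigabit port practical at the server adapter, where cabling and connector width matter.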
We did see some announcements of real 100 gigabit HPC networks. The first 100 gigabit InfiniBand switch was announced back in June – not just higher throughput but also lower latency, so a win on both sides. While there is no indication yet of when the 100 gigabit InfiniBand adapters will be out, the switch announcement hints that the 100 gigabit era is getting close.
The higher the bandwidth, (typically) the higher the message rate. With InfiniBand FDR we already saw a much higher message rate than with any of the QDR options on the market – whether from Mellanox or from Intel. The increase in message rate was greater than the bandwidth difference, so it is also due to the new architecture of the latest InfiniBand FDR adapters. We base all of our systems on FDR nowadays. Waiting for EDR…