Parallel performance on BX900
OpenFOAM v1.6 has been successfully installed on a supercomputer at the Japan Atomic Energy Agency. The supercomputer is a hybrid system consisting of three computational server systems: (I) the Large-scale Parallel Computation Unit, (II) the Application Development Unit for the Next Generation Supercomputer, and (III) the SMP Server. The Large-scale Parallel Computation Unit uses PRIMERGY BX900, Fujitsu's latest blade server, with 2134 nodes (4268 CPUs, 17072 cores) connected by InfiniBand QDR, the latest high-speed interconnect technology. The details of the Large-scale Parallel Computation Unit are as follows.
CPU: Intel Xeon processor X5570 (2.93 GHz) × 2 CPUs per node
level 2 cache (L2): 256 KB per core
number of cores: 4 cores/CPU
node communication performance: 8 GB/s
OS: Red Hat Enterprise Linux 5
On the LINPACK benchmark, the supercomputer achieved 186.1 teraflops, which makes it the fastest in Japan on the latest TOP500 list as of this October.
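For a rough sense of what 186.1 teraflops means for this machine, the theoretical peak can be estimated from the node count and clock speed given above. This is only a back-of-the-envelope sketch: the value of 4 double-precision floating-point operations per cycle per core is an assumption about the X5570 and is not stated in the post.

```python
# Rough estimate of theoretical peak vs. measured LINPACK performance.
# Node, CPU, core, clock, and LINPACK figures are taken from the post.
NODES = 2134
CPUS_PER_NODE = 2
CORES_PER_CPU = 4
CLOCK_GHZ = 2.93
LINPACK_TFLOPS = 186.1

# Assumption (not from the post): each core retires 4 double-precision
# floating-point operations per clock cycle.
FLOPS_PER_CYCLE = 4

total_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU

# GHz * flops/cycle gives GFLOPS per core; divide by 1000 for TFLOPS.
peak_tflops = total_cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0
linpack_efficiency = LINPACK_TFLOPS / peak_tflops

print(f"total cores:        {total_cores}")
print(f"theoretical peak:   {peak_tflops:.2f} TFLOPS")
print(f"LINPACK efficiency: {linpack_efficiency:.1%}")
```

Under that assumption the machine peaks at roughly 200 teraflops, so the LINPACK run reaches a little over 90% of peak.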
I would like to report parallel performance up to 256 cores on the Large-scale Parallel Computation Unit. I thought it would be a good idea to share it in some form with other supercomputer users. I hope this information helps you, if only a little.
Here, a simplified three-dimensional dam break problem is chosen as the test case, and the two-phase flow is solved with the interFoam solver. The numerical conditions are the same as the experimental settings used by Martin and Moyce and by Koshizuka et al.
 J.C. Martin and W.J. Moyce, "Part IV. An experimental study of the collapse of liquid columns on a rigid horizontal plane", Phil. Trans. R. Soc. Lond. A, 244, 312-324 (1952).
 S. Koshizuka, H. Tamako, Y. Oka, "A particle method for incompressible viscous flow with fluid fragmentation", Computational Fluid Mechanics Journal, 113, 134-147 (1995).
The results show that interFoam scales well up to 128 cores and still maintains excellent performance on 256 cores. (Please see the attached file for details.)
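For anyone who wants to turn their own timings into the usual scaling numbers, here is a minimal sketch of how speedup and parallel efficiency can be computed relative to the smallest run. The function name `parallel_metrics` and the timings below are hypothetical placeholders; they are NOT the measured BX900 results, which are in the attached file.

```python
def parallel_metrics(timings):
    """Return {cores: (speedup, efficiency)} relative to the smallest run.

    timings maps a core count to the wall-clock time in seconds.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    metrics = {}
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        ideal_speedup = cores / base_cores
        metrics[cores] = (speedup, speedup / ideal_speedup)
    return metrics


# Placeholder wall-clock times in seconds (illustrative only, not measured).
example_timings = {8: 1600.0, 16: 800.0, 32: 400.0,
                   64: 200.0, 128: 100.0, 256: 62.5}

metrics = parallel_metrics(example_timings)
for cores, (speedup, efficiency) in metrics.items():
    print(f"{cores:4d} cores: speedup {speedup:6.2f}, efficiency {efficiency:.0%}")
```

With these made-up numbers the efficiency stays at 100% up to 128 cores and drops to 80% on 256 cores, which is the kind of pattern "scales well up to 128 cores" describes.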
Parallel performance up to the full machine (17072 cores) will be reported later.
If you instead plot the number of cells per core, what would the numbers be?
I usually try to go for approximately 50k cells/core; lower than that is not worth it.
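The cells-per-core rule of thumb is easy to check for any mesh. A small sketch follows, assuming a hypothetical mesh of 6.4 million cells (the actual mesh size of the dam break case is not given in this thread) and using the 50k cells/core threshold mentioned above; the helper names are made up for illustration.

```python
def cells_per_core(total_cells, cores):
    """Average number of cells each core gets after decomposition."""
    return total_cells / cores


def max_useful_cores(total_cells, min_cells_per_core=50_000):
    """Largest core count that keeps at least min_cells_per_core per core."""
    return max(1, total_cells // min_cells_per_core)


# Hypothetical mesh size, NOT the mesh used in the reported benchmark.
total_cells = 6_400_000

for cores in (8, 16, 32, 64, 128, 256):
    load = cells_per_core(total_cells, cores)
    note = "" if load >= 50_000 else "  (below 50k cells/core)"
    print(f"{cores:4d} cores: {load:9.0f} cells/core{note}")

print("max useful cores:", max_useful_cores(total_cells))
```

For this assumed mesh, 128 cores sits exactly at 50k cells/core and 256 cores falls to 25k cells/core.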
Dear Niklas Nordin
Thank you very much for your interest in my work. I would be happy to try to answer your question.