Parallel Process

jxs832 · September 20, 2018, 12:55

Hi,

I am currently running a 3D transient simulation using rhoCentralFoam in parallel. My current computer has 4 cores at 4.8 GHz (overclocked) and 16 GB of RAM (the run only uses about 8 GB). The simulation takes about 100 hours or more to converge.

I recently got the money to build a small cluster, and I have seen some success in creating clusters out of Raspberry Pis or similar boards. My question is which would be the better solution for my case. With the funds I have available, I can build one of the following systems:

- 34 nodes, each with a 2.1 GHz quad-core CPU and 2 GB of RAM
- 76 nodes, each with a 1.6 GHz quad-core CPU and 1 GB of RAM

I am not sure how the computational time would scale with the number of cores and the clock speed.

I should mention that I am using the scotch method in my decomposeParDict, so in theory each core handles roughly the same number of mesh nodes. Eventually, I want to increase my simulation to a few million mesh nodes.
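For reference, the raw totals of the three machines described in this post can be tabulated with a short Python sketch. The `Machine` class and the "core-GHz" figure (cores times clock) are illustrative assumptions, not anything from OpenFOAM; core-GHz is only a crude upper bound that ignores per-clock performance, memory bandwidth, and the network between boards.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Hypothetical helper describing one hardware option from the post."""
    name: str
    nodes: int
    cores_per_node: int
    clock_ghz: float
    ram_gb_per_node: float

    @property
    def cores(self) -> int:
        return self.nodes * self.cores_per_node

    @property
    def core_ghz(self) -> float:
        # Crude aggregate only: ignores IPC, memory bandwidth and interconnect.
        return self.cores * self.clock_ghz

    @property
    def total_ram_gb(self) -> float:
        return self.nodes * self.ram_gb_per_node

    @property
    def ram_gb_per_core(self) -> float:
        return self.ram_gb_per_node / self.cores_per_node


machines = [
    Machine("current workstation", 1, 4, 4.8, 16),
    Machine("34 x 2.1 GHz boards", 34, 4, 2.1, 2),
    Machine("76 x 1.6 GHz boards", 76, 4, 1.6, 1),
]

for m in machines:
    print(f"{m.name}: {m.cores} cores, {m.core_ghz:.1f} core-GHz, "
          f"{m.total_ram_gb:.0f} GB total RAM, {m.ram_gb_per_core:.2f} GB per core")
```

On paper both clusters have far more cores than the workstation, but only 0.5 GB or 0.25 GB of RAM per core, and every partition boundary has to be exchanged over the boards' network. How much of the nominal core-GHz turns into real speedup can only be settled by measurement.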

pbrady2013 · September 20, 2018, 17:14

Hi,

In my experience there is really no right answer in this case, as scale-out results depend not only on your hardware (don't forget that the network interconnects are just as important as the CPUs) but also on the compiler optimisations and the mesh/solver.

My personal preference is clock speed over count but I've no solid evidence to back that up, just intuition.

Ideally, you'd define the type of problems that you want to run and then test on some sample hardware before really scaling out. Then, to really squeeze out performance, you'd want to custom compile your main compiler, followed by your MPI implementation, and possibly look at your network stack and kernel.

Cheers,
-pete
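One way to carry out the "test on some sample hardware" step is to run the same case on a few core counts and compute speedup and parallel efficiency from the wall-clock times. The Python sketch below assumes a hypothetical helper `strong_scaling`; the timings in `sample` are made-up placeholders, not measurements from any real run.

```python
def strong_scaling(timings):
    """Return (cores, speedup, efficiency) rows for a fixed-size case.

    timings: dict mapping core count -> wall-clock seconds for the SAME mesh.
    Speedup and efficiency are relative to the smallest core count given.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Ideal speedup equals the ratio of core counts; efficiency is the
        # fraction of that ideal actually achieved.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


# Placeholder numbers for illustration only (NOT measured data).
sample = {4: 1000.0, 8: 520.0, 16: 290.0, 32: 200.0}

for cores, speedup, efficiency in strong_scaling(sample):
    print(f"{cores:3d} cores: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

If efficiency drops sharply as boards are added, the interconnect or the per-core mesh size is likely the limit, and more nodes of the same kind will not help much.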

RobertHB · September 21, 2018, 03:59

Quote:

Originally Posted by jxs832

I am not sure how the computational time would scale with the number of cores and the clock speed.

Here is a scalability study of OpenFOAM.

