Large case parallel efficiency

September 12, 2011, 20:15   #61
Arjun, Senior Member
Join Date: Mar 2009
Location: Nurenberg, Germany
Posts: 1,273
Quote:
Originally Posted by flavio_galeazzo
I would like to share my newest experience with the scalability of the linear solvers in OpenFOAM. I used to run simulations with compressible solvers, using the PCG linear solver for the pressure and PBiCG for the other variables. The scalability was very good up to 256 processors, and I could get one second of computational time per time step for a 14 million node grid.
Recently I moved to an incompressible solver, and in that case the GAMG linear solver was far superior to PCG for the pressure; the other variables stayed with PBiCG. However, to my surprise, the scalability was very poor this time. I got good results up to 32 processors, with about 10 seconds of computational time per time step, and increasing the number of processors did not improve the computational time.
Reading the ppt from Dr. Jasak, posted by lakeat, it became clear that the problem is actually the GAMG linear solver. This is very unfortunate, since the GAMG linear solver is indeed very helpful for the incompressible solver.
Here is the reason why the compressible case converges faster than the incompressible case.
In the case of the incompressible Navier-Stokes equations, the pressure correction is of elliptic type.

http://mathworld.wolfram.com/Ellipti...lEquation.html

In numerical terms, the diagonal element equals the sum of the absolute values of the off-diagonal elements, so the matrix is only weakly diagonally dominant. This makes the linear system very difficult to converge, which is why a very efficient method for it is critical to the efficiency of a CFD solver.

In the case of a compressible solver, the linear system has true diagonal dominance, and it can be converged easily.

About your scaling issue, that is, why the compressible case scales better: it is very hard to pinpoint the reason. But my guess is that (remember, I said my guess), since fewer iterations are involved, the effect of poor scaling is not as pronounced as in the incompressible case. Fewer iterations required == less time lost to the inefficiency of the linear solver.

In the case of a compressible solver, I might give up AMG completely, because BiCGStab etc. have better scaling than AMG and the system is now easy to converge due to diagonal dominance.
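
To make this concrete, here is a schematic form of the two pressure-correction matrices in generic finite-volume notation (the symbols a_P, a_N, p', b_P, \psi, V_P and \Delta t are illustrative only, not taken from any particular code):

    % incompressible (Poisson-type) pressure correction for an interior cell P:
    a_P \, p'_P = \sum_N a_N \, p'_N + b_P , \qquad a_P = \sum_N \lvert a_N \rvert
    % only weak diagonal dominance -> iterative solvers converge slowly

    % compressible case: the density-pressure coupling \rho' = \psi \, p' adds a positive
    % contribution to the diagonal, for example through the transient term:
    a_P = \sum_N \lvert a_N \rvert + \frac{\psi \, V_P}{\Delta t} \; > \; \sum_N \lvert a_N \rvert
    % strict diagonal dominance -> much easier to converge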

September 30, 2011, 02:32   #62
Flavio Galeazzo, Member
Join Date: Mar 2009
Location: Karlsruhe, Germany
Posts: 34
Thanks for the explanation, arjun. It makes more sense to me now.

I think I was too vague in my last post. Sorry, lakeat. When I said that it worked well, I meant that when solving an incompressible problem with a compressible solver, the results were very close; however, the scalability of the solver was much better.

February 3, 2012, 11:46   #63
Suresh kumar Kannan, Senior Member
Join Date: Mar 2009
Location: Luxembourg, Luxembourg, Luxembourg
Posts: 129
Hello Foamers,
Our university upgraded our cluster, and I now have access to a cluster with InfiniBand. My home directory was synchronized with the new cluster, so I can run my jobs. But I have no idea whether I am really making use of the InfiniBand.

I would prefer not to compile OF again just to make use of InfiniBand.

1) I need help on "linking Pstream against the system-compiled OpenMPI".

How do I do that?
And @Daniel:
PFLAGS = -DOMPI_SKIP_MPICXX
PINC = -I$(MPI_ARCH_PATH)/include
PLIBS = -L$(MPI_ARCH_PATH)/lib -lmpi

2) Could you please let me know where I can find these settings?

3) Also, how can I make sure whether I am using the system OpenMPI or OpenFOAM's OpenMPI when I submit my job?

I use the following script to submit the job:
mpirun -hostfile $OAR_NODEFILE -mca btl ^openib -mca plm_rsh_agent "oarsh" \
-np `wc -l $OAR_NODEFILE | cut -d " " -f 1` interFoam -parallel > log
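
Note that the caret in "-mca btl ^openib" tells Open MPI to exclude the openib BTL, so as written the job will not use the InfiniBand verbs transport. Below is a sketch of the same submission with openib left enabled (assuming this Open MPI was built with InfiniBand support; the OAR variables and the solver are unchanged):

    # either drop the "-mca btl" option entirely and let Open MPI choose its transports
    # (it prefers openib when available), or name them explicitly:
    mpirun -hostfile $OAR_NODEFILE -mca btl openib,sm,self -mca plm_rsh_agent "oarsh" \
    -np `wc -l $OAR_NODEFILE | cut -d " " -f 1` interFoam -parallel > log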

Any help or explanation would be really useful for me.
Thanks in Advance
regards
K.Suresh kumar

February 3, 2012, 12:00   #64
Daniel WEI (老魏), Senior Member
Join Date: Mar 2009
Location: Beijing, China
Posts: 689
Hi Kumar,

Here is my experience:
1. You don't need to take special care to link OpenMPI with InfiniBand; it is handled automatically when OpenMPI itself is compiled.
2. Running `which mpirun` will show you which MPI package you are using.
3. The other settings are in the wmake/ folder; reading etc/bashrc is also helpful. Just use the find command if you don't know where they are (a sketch of such checks follows below).
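
A minimal sketch of such checks, assuming Open MPI and OpenFOAM's usual environment variables (WM_MPLIB, MPI_ARCH_PATH, WM_PROJECT_DIR); the exact file names under wmake/rules/ differ between OpenFOAM versions:

    which mpirun                                  # which MPI launcher comes first in your PATH
    mpirun --version                              # Open MPI reports its version here
    echo "$WM_MPLIB  $MPI_ARCH_PATH"              # which MPI OpenFOAM's environment points to
    ompi_info | grep -i openib                    # non-empty if this Open MPI was built with InfiniBand (openib) support
    find "$WM_PROJECT_DIR"/wmake -name '*mplib*'  # wmake rules holding the PINC/PLIBS settings per MPI flavour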

Anyway, just test a simple case and plot the speedup. MPI over InfiniBand is much faster, and the speedup curve is much better.

February 3, 2012, 12:04   #66
Daniel WEI (老魏), Senior Member
Join Date: Mar 2009
Location: Beijing, China
Posts: 689
Another vital tip:

When using AFS, don't write output too frequently if you don't need it (including the log files). Too many I/O operations will slow down the entire process.
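
For reference, a minimal sketch of the standard system/controlDict entries that control how often results are written (the keyword names are the usual OpenFOAM ones; the values are only examples):

    # typical entries in system/controlDict:
    #   writeControl      timeStep;
    #   writeInterval     1000;     // write fields every 1000 steps instead of every step
    #   runTimeModifiable no;       // avoid re-reading dictionaries (extra file stats on AFS) every step
    # quick check of the current settings:
    grep -E 'writeControl|writeInterval|runTimeModifiable' system/controlDict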

February 3, 2012, 12:24   #67
Suresh kumar Kannan, Senior Member
Join Date: Mar 2009
Location: Luxembourg, Luxembourg, Luxembourg
Posts: 129
Hello Daniel,
Thanks for the prompt reply. I will set up a simple case, plot the efficiency, and then let you know.

regards
K.Suresh kumar

October 25, 2012, 14:09   #68
dw (1/153), Member
Join Date: Jul 2012
Posts: 32
Quote:
Originally Posted by arjun
Here is the reason why the compressible case converges faster than the incompressible case.
In the case of the incompressible Navier-Stokes equations, the pressure correction is of elliptic type.

http://mathworld.wolfram.com/Ellipti...lEquation.html

In numerical terms, the diagonal element equals the sum of the absolute values of the off-diagonal elements, so the matrix is only weakly diagonally dominant. This makes the linear system very difficult to converge, which is why a very efficient method for it is critical to the efficiency of a CFD solver.

In the case of a compressible solver, the linear system has true diagonal dominance, and it can be converged easily.

About your scaling issue, that is, why the compressible case scales better: it is very hard to pinpoint the reason. But my guess is that (remember, I said my guess), since fewer iterations are involved, the effect of poor scaling is not as pronounced as in the incompressible case. Fewer iterations required == less time lost to the inefficiency of the linear solver.

In the case of a compressible solver, I might give up AMG completely, because BiCGStab etc. have better scaling than AMG and the system is now easy to converge due to diagonal dominance.

When you talk about convergence, are you referring to steady RANS simulations only, or does your observation also apply to unsteady simulations? Thanks

October 25, 2012, 14:11   #69
dw (1/153), Member
Join Date: Jul 2012
Posts: 32
I am also struggling with MPI implementations.

With an InfiniBand cluster, will MVAPICH be substantially better than OpenMPI?

Does anyone have any real experience?

Thanks a million

October 27, 2012, 03:11   #70
Arjun, Senior Member
Join Date: Mar 2009
Location: Nurenberg, Germany
Posts: 1,273
Quote:
Originally Posted by 1/153
When you talk about convergence, are you referring to steady RANS simulations only, or does your observation also apply to unsteady simulations? Thanks
Yes, the situation does not change even if the calculation is unsteady. I was talking about the solution of the pressure-correction equation at any iteration of SIMPLE.