CFD Online Discussion Forums


tony January 31, 2008 12:31

speedup questions
Hi, all,

It seems that when I change from serial to parallel using 4 partitions with MPICH2 Local Parallel for Windows, the computational time increases (it almost doubles!), which is the opposite of what I expected.

I am using CFX11. My machine:

Fujitsu Siemens Celsius v830

AMD64 2.61 GHz, 2 x dual core (4 processors in total)

8 GB memory

Windows XP x64 edition

The mesh: total nodes: 581665

total elements: 2.7M (mostly tet)

I tried several cases with different meshes and different numbers of partitions, but I didn't see any speedup. The serial mode always seems to be faster.

Did I miss something in the solver setup or other software setup?

Thank you for your comments.

CycLone January 31, 2008 13:46

Re: speedup questions
Hi Tony,

That sounds unusual. You certainly have a large enough problem to get reasonable scaling and with 8GB of RAM, you have enough memory. Could you add the following:

1. How many iterations are you running?

2. How long is the partitioning taking vs. the iterations? (Look at the time reported just before the iterations start vs. the time after.)

3. How does the CPU time compare to the wall clock time?

4. How does the performance compare for 2 and 3 partitions?

5. Do you have a lot of domain interfaces (GGIs)?

6. Do other cases display similar behavior?
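Questions 2 and 3 come down to two simple ratios, which could be sketched as below. This is only an illustration: the function names and the sample timings are hypothetical and are not taken from any CFX output file.

```python
def partition_fraction(partition_s, total_wall_s):
    """Fraction of the total wall clock time spent partitioning the mesh."""
    return partition_s / total_wall_s


def cpu_to_wall_ratio(cpu_s, wall_s):
    """Ratio of CPU time to wall clock time for one process.

    Close to 1.0: the process was busy computing almost all the time.
    Well below 1.0: it spent time waiting (disk I/O, communication, other jobs).
    """
    return cpu_s / wall_s


# Hypothetical timings in seconds, for illustration only.
print(f"partitioning share: {partition_fraction(20.0, 2000.0):.1%}")
print(f"CPU/wall ratio:     {cpu_to_wall_ratio(1800.0, 2000.0):.2f}")
```

A small partitioning share together with a CPU/wall ratio near 1 would point away from partitioning and I/O as the bottleneck, and toward the parallel solve itself.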


tony January 31, 2008 15:49

Re: speedup questions
Hi, CycLone,

Thank you for your reply.

1. I haven't done any real comparison runs on a clean machine, as most previous posts did. I noticed that the CPU time for one iteration remains almost the same (maybe I am wrong here), so I just compared the CPU seconds for one iteration on the same def file using different numbers of partitions.

2. The summed CPU-time for mesh partitioning: 19s; The CPU time for each iteration: 500s

3. For this run I stopped the solver at the 17th iteration, and the wall clock time was 4 minutes longer than the CPU time (it wrote 4 *_full.bak files).

4. I will check this later. I tried it once before and didn't see any advantage from parallel computing.

5. Yes, I do have a lot of GGI interfaces, and one of the interfaces has a lot of 2D regions. Is this the reason?

6. I haven't tried other cases without interfaces.


Glenn Horrocks January 31, 2008 17:24

Re: speedup questions

Lots of GGI interfaces will reduce parallel efficiency. You may also have lots of small domains, which parallelise poorly. Writing results files parallelises poorly as well, so if you stop writing results files, performance should improve.

Try the benchmark.def file, which is located in the examples directory of CFX. Run it in serial and in parallel; you should get a reasonable speedup with that model. That would be a good test to see whether the issue is the model or your computer setup.

Glenn Horrocks

tony February 1, 2008 11:58

Re: speedup questions
Hi, Glenn,

Yes, that benchmark gives me reasonable speedup.

Partitions   Speedup
4            2.58
3            2.32
2            1.78

I think I have to minimize or stop writing backup files.
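To put the benchmark numbers above in context, a minimal sketch of the usual follow-up arithmetic: parallel efficiency (speedup divided by the number of partitions) and the Karp-Flatt estimate of the serial fraction. The speedup values are the ones reported in this post; the function and variable names are only illustrative, and the Karp-Flatt metric is a generic estimate, not something CFX reports.

```python
# Speedups reported above for benchmark.def, keyed by number of partitions.
speedups = {2: 1.78, 3: 2.32, 4: 2.58}


def efficiency(speedup, partitions):
    """Parallel efficiency: 1.0 would be ideal linear scaling."""
    return speedup / partitions


def serial_fraction(speedup, partitions):
    """Karp-Flatt metric: experimentally determined serial fraction.

    Lumps together everything that does not scale (serial code,
    communication, I/O) into one estimated fraction of the run.
    """
    return (1.0 / speedup - 1.0 / partitions) / (1.0 - 1.0 / partitions)


for p, s in sorted(speedups.items()):
    print(f"{p} partitions: efficiency {efficiency(s, p):.3f}, "
          f"serial fraction {serial_fraction(s, p):.3f}")
```

With these values the efficiency falls from about 0.89 on 2 partitions to about 0.65 on 4, while the estimated serial fraction rises from roughly 0.12 to 0.18. That is typical of overhead growing with the number of partitions on a single shared-memory machine.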


Glenn Horrocks February 3, 2008 18:26

Re: speedup questions

It looks like the backup files are the cause, then. It sounds like the simulation is spending more time writing backup files than solving, so you will get a big speedup by writing fewer backup files.

Glenn Horrocks
