CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)
Strange behaviour 1.6 decomposePar vs 1.7 decomposePar (https://www.cfd-online.com/Forums/openfoam/82742-strange-behaviour-1-6-decomposepar-vs-1-7-decomposepar.html)

BlueyTheDog December 3, 2010 23:48

Strange behaviour 1.6 decomposePar vs 1.7 decomposePar
 
Hi all,
I'm looking into the performance of our mods to OF and have come across a strange artifact. Running a job under 1.6 with, say, 32 processors and 5000 particles gives results for essentially 5000 particles after decomposePar -> solver -> reconstructPar.

When I run the same job under our code ported to 1.7, once again with 32 processors and 5000 particles, I get results for 156982 particles, basically 160K particles. Looking at the results for 1, 2, 4, 8, 16, 32 processors gives me 5K, 10K, 20K, 40K, 80K & 160K particles respectively. It seems that every time I double the number of processors, I double the number of particles I get results for.

I'm still a bit green with OF so before I attempt to delve deeper into the internal workings, I just wanted to know if there were any major changes in going from 1.6 -> 1.7 with respect to decomposePar/reconstructPar? I've diff'd the decomposeParDict files under both environments and they are the same.

Any thoughts, greatly appreciated
Andrew

wyldckat December 4, 2010 08:35

Greetings Andrew,

From what I've read in this forum, there have been some changes in simulations with particles, although I don't know the extent of the changes. I also think I read something in the bug reports about particles. So I have a few questions and a suggestion...

Questions:
  • How exactly do you execute "decomposePar -> solver -> reconstructPar"? Do you always run the solver with the additional argument "-parallel"?
  • Do you use the exact same case to run in both OpenFOAM versions? Or do you need to adapt any boundary conditions or simulation characteristics?
  • Have you tested running the tutorials/applications on which you have based your code and simulations?
Suggestion: try with OpenFOAM 1.7.1 and/or 1.7.x and see if the error still occurs. If it does with 1.7.x, you might want to report this as a bug at http://www.openfoam.com/bugs/

Best regards and good luck!
Bruno

BlueyTheDog December 4, 2010 22:55

Bruno,
Thanks for the reply. In answer to your questions:

Background:
I generate model files and job files on my Linux box, then upload them onto a supercomputer, xe, here in Perth, where jobs are queued and then run.

1) After uploading my model onto xe, I ssh into it and then manually run decomposePar from an ssh shell. I then set up an appropriate job file for the queuing system, place the job in the queue then come back later. Then once again via an ssh shell I run reconstructPar which rebuilds my job.

2) The model is the same in both cases. The only changes I have to make is in the job file which is for the queuing system, therefore not relevant to OF.

3) Haven't tried the tutorials yet. I'm currently in the middle of writing some stuff up so time is tight right at the moment, but I will be able to check things in a day or two.

So that I can narrow the search down to decomposePar, the solver, or reconstructPar: what files should I look at to check that things are the correct size? For example, decomposePar in both versions should break the model down so that certain files end up the same size. If under 1.7 a file is many times bigger than under 1.6, then decomposePar might be the culprit; similarly, what files would point to reconstructPar being the culprit?

Andrew

wyldckat December 5, 2010 18:06

Hi Andrew,

Quote:

Originally Posted by BlueyTheDog (Post 286040)
1) After uploading my model onto xe, I ssh into it and then manually run decomposePar from an ssh shell. I then set up an appropriate job file for the queuing system, place the job in the queue then come back later. Then once again via an ssh shell I run reconstructPar which rebuilds my job.

Mmmm, looks OK, but you didn't say how you launch the solver ;) I.e., are you sure that you use the "-parallel" argument when you call the solver!?

Quote:

Originally Posted by BlueyTheDog (Post 286040)
2) The model is the same in both cases. The only changes I have to make is in the job file which is for the queuing system, therefore not relevant to OF.

OK, then boundary conditions could have changed a bit between versions. The best way to verify this would be to try to decompose a smaller version of that case with a "normal PC" (or virtual machine), instead of using the supercomputer. But then again, I'm not familiar with the OpenFOAM's particle mechanism.

Quote:

Originally Posted by BlueyTheDog (Post 286040)
3) Haven't tried the tutorials yet. I'm currently in the middle of writing some stuff up so time is tight right at the moment, but I will be able to check things in a day or two.

When in doubt, always check/test the reference cases from the tutorials, because they are usually properly updated to keep up with OpenFOAM's internal changes. And also compare the same tutorial with the different OpenFOAM versions, since there might have been a bug fixed or another introduced between versions.

Quote:

Originally Posted by BlueyTheDog (Post 286040)
So that I can narrow the search down a bit, decomposePar or solver or reconstructPar, what files should I look at to see if things are the correct size. For example decomposePar in both versions should break the model down so that certain files are the same size etc. If under 1.7 a file is many times bigger than under 1.6, then decomposePar might be the culprit, or similarly what files would point to reconstructPar being the culprit?

Sadly, I have no clue which files hold the particle count :( And since I don't know the reference tutorial cases, I can't check it myself.
But I think the file sizes might not increase, depending on how the particle count information is stored; if it's stored as a particle count per cell, the file sizes will stay the same. Using diff should prove more efficient once you know which file it is, assuming of course that the mesh is kept identical.
The other option would be to create a utility application, a stripped-down version of your solver, that reads the particle lists and outputs the counts. I haven't seen such a utility in OpenFOAM, but it should be pretty simple to make.
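Such a counting utility could even be approximated outside OpenFOAM with a small script. A hedged sketch in Python, assuming each processor directory stores the cloud in `processorN/<time>/lagrangian/<cloudName>/positions` with a bare integer count line before the parenthesised list (verify this layout against your OpenFOAM version; `defaultCloud` below is just a placeholder name):

```python
import re
from pathlib import Path

def count_particles(positions_file):
    """Return the particle count from an OpenFOAM-style 'positions' file.

    Assumes the usual layout: a FoamFile header, then a bare integer line
    giving the number of entries, then a '(' ... ')' list. These format
    details are an assumption -- check them against your own case files.
    """
    for line in Path(positions_file).read_text().splitlines():
        line = line.strip()
        if re.fullmatch(r"\d+", line):  # first bare-integer line = count
            return int(line)
    raise ValueError(f"no count line found in {positions_file}")

def total_particles(case_dir, time_name, cloud="defaultCloud"):
    """Sum the counts over all processor0, processor1, ... directories."""
    total = 0
    pattern = f"processor*/{time_name}/lagrangian/{cloud}/positions"
    for pos in Path(case_dir).glob(pattern):
        total += count_particles(pos)
    return total
```

Comparing the summed processor counts against the reconstructed case would show immediately whether the inflation happens before or after reconstructPar.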

Best regards and good luck!
Bruno

BlueyTheDog December 5, 2010 22:11

Further to Bruno's questions:

Quote:

Originally Posted by wyldckat (Post 286103)
Hi Andrew,

Mmmm, looks OK, but you didn't say how you launch the solver ;) I.e., are you sure that you use the "-parallel" argument when you call the solver!?

Parallel is in place. The job files are of the format:

Quote:

#PBS -l walltime=20:00:00
#PBS -l nodes=8:ppn=8

module load openmpi/1.4.2-gcc4.4.4
export FOAM_INST_DIR=/scratch/af01/OpenFOAM-src
foamDotFile=$FOAM_INST_DIR/OpenFOAM-1.7.0/etc/bashrc
[ -f $foamDotFile ] && . $foamDotFile

cd /scratch/af01/alowe/1.7_Parallel/800nm_64
mpirun -np 64 dustParticleFoam -parallel </dev/null >run.log 2>error.log
Quote:

Originally Posted by wyldckat (Post 286103)
OK, then boundary conditions could have changed a bit between versions. The best way to verify this would be to try to decompose a smaller version of that case with a "normal PC" (or virtual machine), instead of using the supercomputer. But then again, I'm not familiar with the OpenFOAM's particle mechanism.

I've run the same file (in fact a range of files) under both 1.6 and 1.7, on my PC and on XE, on only a single core, hence no decomposePar/reconstructPar, and got the expected results, so I think things are pointing to a problem with those two.

Quote:

Originally Posted by wyldckat (Post 286103)
When in doubt, always check/test the reference cases from the tutorials, because they are usually properly updated to keep up with OpenFOAM's internal changes. And also
[snip]
utility in OpenFOAM, but it should be pretty simple to make.

Thanks for the suggestions Bruno, when I get the report finished, hopefully today/tomorrow, I'll revisit these.


Regards,
Andrew

BlueyTheDog January 16, 2011 12:14

Well, it looks like the cause of the problem was me. Some work I had done on the injection of particles was at fault. I didn't have a good handle on when OF broke things up into parallel operation, hence I had particles being injected on all cores rather than just one.
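This failure mode can be reproduced in miniature: if injection code runs unguarded on every rank, the global particle total scales with the number of ranks, which is exactly the doubling pattern reported above. A hedged Python sketch (the master-rank guard plays the role of checking Pstream::master() in OpenFOAM; this is an illustration, not the actual solver code):

```python
def injected_count(n_ranks, particles_per_injection, master_only):
    """Simulate particle injection across MPI-like ranks.

    With master_only=False every rank runs the injection, so the global
    total scales with the rank count. Guarding so only rank 0 injects
    (analogous to an if-master check in the solver) keeps it constant.
    """
    total = 0
    for rank in range(n_ranks):
        if master_only and rank != 0:
            continue  # non-master ranks skip the injection step
        total += particles_per_injection
    return total
```

For example, `injected_count(32, 5000, master_only=False)` gives 160000, matching the 5K, 10K, 20K, 40K, 80K, 160K progression for 1-32 processors, while `master_only=True` holds the total at 5000 regardless of processor count.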

wyldckat January 16, 2011 12:43

Hi Andrew,

I'm glad you managed to solve the problem! And too bad it wasn't the machine's fault ;)
I do hate the sentence "the machine is always right"... although sometimes it can be comforting knowing that it should be unlikely that the machine is simply toying with our minds/feelings :D

By the way, did you end up creating the little utility for counting the number of particles? Or did you implement the feature directly into your solver?

Best regards,
Bruno

BlueyTheDog January 16, 2011 18:12

Bruno,
I didn't need to write the utility; I used a much more powerful debugging tool: the good old printf(). By placing a few output statements, I managed to work out what happens with respect to the parallelisation, and then the penny dropped.

Andrew

