CFD Online Discussion Forums - [PyFoam] Running multiple instances of solver using MPI and PyFoam

Running multiple instances of solver using MPI and PyFoam

Hello community,
I have a question for the UNIX, Python and MPI people among you:
I have written a Python script that does variation studies on an OpenFOAM case. To save a lot of time (I have to do 500+ variations), I want to run the solver simultaneously on multiple cores, computing the flow with one set of parameters per core. I run the script using 'mpirun -np 6 python myScript.py'. That spawns six processes called 'python' on my machine, and everything is fine.
Let me outline the work that is done inside the script:
1) If I am the master process, set up the environment and create the parameter sets.
2) The master sends a set to each child process.
3) If I am a child process, receive a parameter set, do the calculation and send the result back to the master.
4) The master process receives the result and does some output and post-processing.
5) The master sends one of the remaining sets to the now idle process.
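
For readers following along, the five steps above can be sketched with mpi4py roughly as follows. This is a minimal illustration and not the actual script from this thread: the tag values, make_parameter_sets() and compute() are hypothetical placeholders, and the mpi4py import is deferred into main() only so that the helper functions can be loaded on a machine without MPI.

```python
import random
import time

TAG_WORK, TAG_STOP = 1, 2  # hypothetical message tags chosen for this sketch


def make_parameter_sets(n):
    """Build n illustrative parameter sets (stand-ins for real case settings)."""
    return [{"case_id": i, "inlet_velocity": 1.0 + 0.1 * i} for i in range(n)]


def compute(params):
    """Placeholder for 'do the calculation'; a real script would run the solver."""
    time.sleep(random.uniform(0.01, 0.05))  # jobs take unequal time
    return {"case_id": params["case_id"], "result": params["inlet_velocity"] ** 2}


def run_master(comm, MPI, tasks):
    """Steps 1, 2, 4, 5: hand out tasks and refill whichever worker answers first."""
    status = MPI.Status()
    pending = list(tasks)
    results = []
    active = 0
    for worker in range(1, comm.Get_size()):
        if pending:
            comm.send(pending.pop(), dest=worker, tag=TAG_WORK)
            active += 1
        else:
            comm.send(None, dest=worker, tag=TAG_STOP)
    while active:
        results.append(comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status))
        worker = status.Get_source()  # the rank that just became idle
        if pending:
            comm.send(pending.pop(), dest=worker, tag=TAG_WORK)
        else:
            comm.send(None, dest=worker, tag=TAG_STOP)
            active -= 1
    return results


def run_worker(comm, MPI):
    """Step 3: receive a parameter set, compute, send the result back to rank 0."""
    status = MPI.Status()
    while True:
        params = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == TAG_STOP:
            break
        comm.send(compute(params), dest=0, tag=TAG_WORK)


def main():
    try:
        from mpi4py import MPI  # deferred: helpers above work without MPI installed
    except ImportError:
        print("mpi4py is not installed; nothing to run")
        return
    comm = MPI.COMM_WORLD
    if comm.Get_rank() == 0:
        results = run_master(comm, MPI, make_parameter_sets(20))
        print(f"collected {len(results)} results")
    else:
        run_worker(comm, MPI)


if __name__ == "__main__":
    main()
```

Launched with something like 'mpirun -np 6 python sketch.py' (file name hypothetical), rank 0 only distributes work, so five parameter sets are processed at a time. With a single process there are no workers and the master returns an empty result list.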

That works beautifully in parallel as long as "do the calculation" is just a local function in my script. But as soon as I replace it with "call simpleFoam and wait for it to finish", things get weird. I noticed that the system call automatically creates a new process that usually runs on my master core (procID 0), and all the other processes wait for it to finish. So I get no parallelisation at all. I tried doing the call directly with Python's subprocess.Popen and tried using PyFoam.Execution.BasicRunner, but both ended with the same result.
I wonder if anyone can tell me how to run the call to OpenFOAM as a new thread inside my child process instead of as a new process on my master core.
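
As a point of reference, "call the solver and wait for it to finish" can be sketched with the standard subprocess module as below. An external executable always runs as its own process, started as a child of whichever rank makes the call. The helper name run_and_wait and the log file name are assumptions made for this sketch, and a small Python one-liner stands in for simpleFoam so that the example runs without OpenFOAM installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_wait(command, case_dir, log_name="log.solver"):
    """Run one command inside case_dir, log its output, block until it exits."""
    case_dir = Path(case_dir)
    with open(case_dir / log_name, "w") as log:
        completed = subprocess.run(
            command, cwd=case_dir, stdout=log, stderr=subprocess.STDOUT
        )
    return completed.returncode


# Stand-in command so the sketch runs anywhere. With OpenFOAM sourced,
# the command would be the solver itself, e.g. ["simpleFoam"].
demo_command = [sys.executable, "-c", "print('pretend solver finished')"]

with tempfile.TemporaryDirectory() as tmp:
    exit_code = run_and_wait(demo_command, tmp)
    log_text = (Path(tmp) / "log.solver").read_text()

print(exit_code, log_text.strip())
```

If every child rank calls such a function on its own cloned case directory, the solver processes are independent of each other; which core each one lands on is decided by the operating system scheduler.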

I hope this is not too far off-topic, although it might sound a bit complicated. Code snippets can be provided on request.

Thanks,
Björn

Greetings Björn,

This is quite interesting!!
Wait, then where does PyFoam enter in all of this? I guess probably in Pre/Post-processing?!

OK, let's put OpenFOAM aside for a bit and check some logistics details:

How are the Python scripts communicating with each other? Are they using MPI functions for the communication? Or are they simply using file transfer?
Each child has its own independent case folder to work with, correct?
What does each child do in the foamless version of the scripts? Do they run some massive redundant calculations in Python, depending on the instructions given by the master?
Instead of running simpleFoam directly, try running a shell script that does nothing more than loop around doing some crazy thing, like generating junk data and gzipping that junk.
Or perhaps you can move the contents of the dummy function to another Python script and have each child launch a new Python process to run that script on command; you know, something like: python child -> sh crazy_script.sh -> python crazy_function.py
And don't forget to give the function some level of randomness, since some real simulations may finish more quickly than others.
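
A dummy workload along those lines could look like the following sketch, which uses only the Python standard library. The function name junk_job and the size limits are arbitrary choices for illustration; the idea is simply to burn a random amount of CPU time by compressing random bytes.

```python
import gzip
import os
import random


def junk_job(rng, min_kib=64, max_kib=512, rounds=3):
    """Compress random bytes a few times; return the total compressed size."""
    total = 0
    for _ in range(rounds):
        size = rng.randint(min_kib, max_kib) * 1024  # random size per round
        total += len(gzip.compress(os.urandom(size)))
    return total


rng = random.Random(42)  # fixed seed only so that the job sizes are repeatable
produced = junk_job(rng)
print(f"dummy job produced {produced} compressed bytes")
```

Saved as its own script and launched by each child in place of the solver, something like this would show whether the external calls really run side by side.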

Mmm... OK, for now that's about it. My initial guess from your description is that the master process is confused about who is the master and who is the slave... so that's pretty much why I listed the items/questions above.

Best regards,
Bruno

Hi Bruno,
thanks for your reply. I tried a few things and can answer your questions.
1. There is only one Python script, but multiple instances of it. I use mpi4py to do the communication, so I check COMM_WORLD.MyRank() to decide if it's master or slave and communicate using MPI.send and MPI.recv. That works fine in the foamless version.
2. This is correct. I clone the case in question and then let each child do the processing independently.
3. The children receive an instance of a class that contains all the information they need for setting up the case. They clone the case, alter the boundary conditions (works perfectly) and then call a member function of the class that should start simpleFoam (but doesn't).
4. When I replace the call to simpleFoam by a heap of calculation, it works fine. I realized, though, that when I check the rank inside this member function it's always 0, probably because I am calling a member of a class that was created by the master process and then sent to the child. That is odd.
5. I used pyFoam.Execution.Runner because it looked as if that would create a new thread for a call to simpleFoam instead of creating a new process like subprocess.Popen. But the problem seems to be elsewhere, since it doesn't work even without using foam functions at all.
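
One possible reading of point 4, offered as an assumption and not as a diagnosis of the actual script: mpi4py's lowercase send and recv transfer Python objects by pickling them, so any attribute stored when the object was built travels with it. If the class records the rank in its constructor, every child would see the master's value. The class CaseJob and its attributes below are hypothetical, and a pickle round trip stands in for the MPI transfer.

```python
import pickle


class CaseJob:
    """Hypothetical job object that is built on the master and shipped to a child."""

    def __init__(self, case_name, creator_rank):
        self.case_name = case_name
        self.creator_rank = creator_rank  # captured once, where the object is built

    def describe(self, current_rank):
        return (
            f"{self.case_name}: created on rank {self.creator_rank}, "
            f"running on rank {current_rank}"
        )


job = CaseJob("case_001", creator_rank=0)   # the master (rank 0) builds the job
received = pickle.loads(pickle.dumps(job))  # stand-in for comm.send / comm.recv
print(received.describe(current_rank=3))
```

Under this assumption, querying the rank from the communicator inside the member function, instead of reading a stored attribute, would give the rank of the process that is actually executing it.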

I recompiled OpenMPI with --enable-mpi-threads, which now gives me level 3 thread support (checked with MPI.Query_thread()). My current idea is that it has something to do with the member function I am calling living in the address space of the master process only, because it was instantiated there. I will keep on trying different things, but if you have any ideas, feel free to post them here.
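
The thread-support check mentioned above might be written like this with mpi4py. It is only a sketch: the helper name describe_thread_support is made up here, and the function returns None when mpi4py is not installed so that it can be loaded anywhere.

```python
def describe_thread_support():
    """Return (level, name) of the thread support MPI provides, or None without mpi4py."""
    try:
        from mpi4py import MPI  # deferred so the function can be defined without MPI
    except ImportError:
        return None
    names = {
        MPI.THREAD_SINGLE: "THREAD_SINGLE",
        MPI.THREAD_FUNNELED: "THREAD_FUNNELED",
        MPI.THREAD_SERIALIZED: "THREAD_SERIALIZED",
        MPI.THREAD_MULTIPLE: "THREAD_MULTIPLE",
    }
    level = MPI.Query_thread()  # thread level the MPI library actually granted
    return level, names.get(level, "unknown")


print(describe_thread_support())
```

A reported level equal to MPI.THREAD_MULTIPLE corresponds to the "level 3" mentioned in the post, assuming the usual numbering in which the four levels run from 0 to 3.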

Thanks,
Björn

EDIT: Apparently, enabling MPI threads in OpenMPI did the trick. Plus, I found an odd call to the calculation function hidden deep inside my master process that messed up my observations slightly. Now I have it running and can run N-1 instances of simpleFoam at the same time using MPI and pyFoam. Communication is negligible compared to the actual simulation, so I get a speed-up of nearly N-1. Delightful!

Hi Björn,

I have a feeling that there is something very twisted about the whole thing... namely, it feels like it's the master script that is doing all the work and the slaves are just lying around doing nothing.
I say this because I can't figure out why MPI threads fixed the problem, unless it's the master itself that needs the threads. Otherwise, each slave process should be more than able to do the job on its own.

What does come to mind is that you have a blocking communication barrier that makes everyone wait for someone else to finish their job!

Either way, if you have 6 cores, want to get things done quickly and don't mind having a sluggish computer, try launching with "mpirun -np 7" instead of six, since the master only distributes tasks. Additionally, when you call simpleFoam, give it a lower runtime priority. I don't know exactly how this is done in Python, but it should be easy enough!
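
One way the lower priority could be arranged from Python on a Unix-like system is to prefix the solver command with the standard nice utility. This is a sketch under that assumption; the helper name run_with_lower_priority is made up, and a Python one-liner stands in for simpleFoam so that the example runs without OpenFOAM.

```python
import os
import subprocess
import sys


def run_with_lower_priority(command, increment=10, **kwargs):
    """Prefix the command with the POSIX `nice` utility and wait for it to finish."""
    return subprocess.run(["nice", "-n", str(increment), *command], **kwargs)


parent_niceness = os.nice(0)  # adding 0 just reports the current niceness
child = run_with_lower_priority(
    [sys.executable, "-c", "import os; print(os.nice(0))"],  # stand-in for the solver
    capture_output=True,
    text=True,
)
child_niceness = int(child.stdout.strip())
print(f"parent niceness {parent_niceness}, child niceness {child_niceness}")
```

A higher niceness value means a lower scheduling priority, so the solver runs yield to the master and to interactive work when the machine is oversubscribed.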

Best regards,
Bruno