CFD Online Discussion Forums

OpenFOAM Bugs: HPMPI Infiniband problem

carsten January 23, 2009 09:11

Hi there,

I'm not sure if this is a bug, but maybe...

When running OpenFOAM on our cluster (HP-MPI with InfiniBand interconnects), I have the problem that only _small_ cases work correctly. For larger cases the job fails immediately:

[snipped ...]

Pstream initialized with:
floatTransfer : 0
nProcsSimpleSum : 0
commsType : nonBlocking

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

[29] IOstream::check(const char* operation) : error in IOstream "IOstream" for operation operator>>(Istream&, List<t>&) : reading first tok
[29] file: IOstream at line 0.
[29] From function IOstream::fatalCheck(const char* operation) const
[29] in file db/IOstreams/IOstreams/IOcheck.C at line 73.
FOAM parallel run exiting
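
Editor's note: the bracketed prefix in the log above (`[29]`) is the rank of the process that wrote the line. When a failure like this comes and goes, it can help to check whether the errors always come from the same ranks (and hence the same nodes). The following is only a minimal sketch of that idea; the function name `messages_by_rank` and the sample text are illustrative assumptions, not part of OpenFOAM or HP-MPI.

```python
import re
from collections import defaultdict

# Lines written by an individual rank start with "[<rank>] ".
RANK_PREFIX = re.compile(r"^\[(\d+)\]\s?(.*)$")


def messages_by_rank(log_text):
    """Group rank-prefixed log lines by rank number (hypothetical helper)."""
    grouped = defaultdict(list)
    for line in log_text.splitlines():
        match = RANK_PREFIX.match(line.strip())
        if match:
            grouped[int(match.group(1))].append(match.group(2))
    return dict(grouped)


# Shortened excerpt of the log quoted in this post.
sample = """Create mesh for time = 0

[29] file: IOstream at line 0.
[29] From function IOstream::fatalCheck(const char* operation) const
FOAM parallel run exiting
"""

if __name__ == "__main__":
    for rank, lines in sorted(messages_by_rank(sample).items()):
        print(f"rank {rank}: {len(lines)} message line(s)")
```

Running this over the logs of several failed jobs would show whether the failing ranks repeat, which would point towards particular nodes rather than the code.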

This behaviour was previously reported when the port to HP-MPI was not yet finished. Strangely, for me the problem suddenly appeared with a version of OpenFOAM that I compiled some time ago and that had worked flawlessly until now. Since the code itself was not changed, I assume OpenFOAM is reacting to some change in the cluster environment. On the other hand, all other software on the cluster behaves normally, so there is probably no problem with the machine itself.

To complicate matters further, this problem only occurs if the InfiniBand stack is selected for MPI communication. If I switch to TCP/IP it works nicely, albeit slowly.

Any help is appreciated.


mattijs January 23, 2009 12:30

You could try increasing MPI_BUFFER_SIZE. HP-MPI might use the buffer space differently.
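
Editor's note: MPI_BUFFER_SIZE is an environment variable, so one way to try this suggestion is to launch the job with a larger value in its environment. The snippet below is a minimal sketch under assumptions: the helper names, the buffer value, and the launch command in the comment are hypothetical, and whether the variable reaches the remote ranks depends on how the MPI launcher propagates the environment.

```python
import os
import subprocess


def env_with_buffer(buffer_bytes, base_env=None):
    """Return a copy of the environment with MPI_BUFFER_SIZE set (hypothetical helper)."""
    if buffer_bytes <= 0:
        raise ValueError("buffer size must be positive")
    env = dict(os.environ if base_env is None else base_env)
    env["MPI_BUFFER_SIZE"] = str(int(buffer_bytes))
    return env


def run_with_buffer(command, buffer_bytes):
    """Run a caller-supplied launch command with the enlarged buffer."""
    # The launcher must forward the variable to every rank; that is an
    # assumption here and should be checked for the MPI in use.
    return subprocess.run(command, env=env_with_buffer(buffer_bytes), check=False)


# Hypothetical usage; the command line and the value are placeholders:
# run_with_buffer(["mpirun", "-np", "32", "icoFoam", "-parallel"], 200_000_000)
```

Keeping the change in the job's environment rather than in the installation's settings makes it easy to compare runs with the old and new values.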

carsten January 25, 2009 16:36

Thanks Mattijs.

It works again now, though not because of MPI_BUFFER_SIZE (HP-MPI reports explicitly if it is too small) but because of some other event I don't know about. It must be the phase of the moon or the like, because suddenly all versions run again, both for me and for a colleague. To be honest, I could puke. I spent three days hunting a ghost and still don't know what happened. I hope this won't happen again.

Many thanks for your time,


All times are GMT -4.