CFD Online Discussion Forums


KTG December 17, 2020 21:24

Openfoam solvers and snappyHexMesh stopped working in parallel after update
 
Hi everyone,


I recently updated to Fedora 33, and now I suddenly get this error whenever I run snappyHexMesh or an OpenFOAM solver in parallel:


Quote:

--> FOAM FATAL IO ERROR:
[4] error in IOstream "IOstream" for operation Foam::Istream& Foam::operator>>(Foam::Istream&, Foam::List<T>&) [with T = Foam::Vector<double>]
[4]
[4] file: IOstream at line 0.
[4]
[4] From bool Foam::IOstream::fatalCheck(const char*) const
[4] in file db/IOstreams/IOstreams/IOstream.C at line 63.


I have no idea what this means, or why it is happening. simpleFoam and snappyHexMesh produce the same error! Things still work in serial for some reason. I have tried updating, restarting, running various tutorial files, and keep getting the same behavior. I am not even 100% sure this is an installation issue. I removed the OF binary with dnf, and reinstalled it. Could this be a bug? If it is, it probably can't be reproduced off my computer.



Any help would be much appreciated.


Thanks

KTG December 17, 2020 23:04

I should clarify what I mean by "suddenly". The programs start ok, and then crash after a short time. For example, snappyHexMesh gets to Feature refinement iteration 2 before it throws the error, which is very strange.

KTG December 18, 2020 19:43

This is so insanely frustrating! I tried on a smaller casefile to see what would happen, and get a slightly different IO error!




Code:

[4] --> FOAM FATAL IO ERROR: 
[4] incorrect first token, expected '(', found on line 0: punctuation '['
[4] 
[4] file: IOstream at line 0.
[4] 
[4]    From Foam::Istream& Foam::operator>>(Foam::Istream&, Foam::List<T>&) [with T = Foam::Pair<int>]
[4]    in file lnInclude/ListIO.C at line 153.
[4] 
FOAM parallel run exiting


Here, snappy got further along than before, I think because the mesh is smaller. It's as if something is getting overloaded, causing the crash, and then it sends out an unrelated error from whatever part of the code was currently running.


It gives different behavior every time I run it! I just ran it again and snappy made it all the way through; I ran it again and it crashed during the castellated step; then again and it crashed during the snap! What could possibly cause this?!

olesen December 19, 2020 11:26

Could be mpi-related.

KTG December 19, 2020 22:51

I had that thought, and did the brainless thing of just reinstalling it - no change. I am not an expert on MPI, so if anyone has any ideas on how to diagnose this, let me know!
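
One low-effort way to start diagnosing this is to check which MPI launcher is actually being picked up on the PATH and what version it reports. The following is only a sketch: it assumes the launcher is called `mpirun` and that it accepts `--version` (Open MPI does); the function name `describe_mpirun` is made up for illustration.

```python
import shutil
import subprocess


def describe_mpirun(executable="mpirun"):
    """Return (path, first line of the version output), or (None, None) if not found.

    Assumption: the launcher accepts `--version`, as Open MPI's mpirun does.
    """
    path = shutil.which(executable)
    if path is None:
        return None, None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some launchers print the banner on stderr, so fall back to it.
    output = (result.stdout or result.stderr).strip()
    banner = output.splitlines()[0] if output else ""
    return path, banner


if __name__ == "__main__":
    path, banner = describe_mpirun()
    if path is None:
        print("mpirun not found on PATH")
    else:
        print(f"launcher: {path}")
        print(f"version:  {banner}")
```

If the reported path or version is not the one you expected after the Fedora upgrade, that points at the MPI installation rather than at OpenFOAM itself.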



Thanks

EternalSeekerX January 24, 2021 00:18

I am having the same issue but on Fedora 34
 
Quote:

Originally Posted by KTG (Post 791140)
I had that thought, and did the brainless thing of just reinstalling it - no change. I am not an expert on MPI, so if anyone has any ideas on how to diagnose this, let me know!
Thanks

It definitely seems like an MPI issue with the Open MPI version shipped in Fedora 33/34. Your best bet is to purge and reinstall openmpi and then rerun Allwmake to see if it works. Or edit the prefs file so that Allwmake uses the Open MPI provided in the ThirdParty folder.
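
Before rebuilding, it can help to see which MPI the sourced OpenFOAM environment is configured for. This is a rough sketch, assuming the usual OpenFOAM environment variables `WM_MPLIB`, `FOAM_MPI` and `MPI_ARCH_PATH` are set by the environment scripts in your installation; the helper name `openfoam_mpi_settings` is hypothetical.

```python
import os

# Variables normally exported when the OpenFOAM environment is sourced
# (assumption: your installation sets them; unset ones are reported as such).
MPI_VARS = ("WM_MPLIB", "FOAM_MPI", "MPI_ARCH_PATH")


def openfoam_mpi_settings(env=None):
    """Return the MPI-related OpenFOAM settings found in the given environment."""
    env = os.environ if env is None else env
    return {name: env.get(name, "<unset>") for name in MPI_VARS}


if __name__ == "__main__":
    for name, value in openfoam_mpi_settings().items():
        print(f"{name} = {value}")
```

Run it from a shell where OpenFOAM has been sourced. A `WM_MPLIB` of `SYSTEMOPENMPI` would typically mean the distribution's Open MPI is in use, while `OPENMPI` would point at the ThirdParty build.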

KTG January 25, 2021 15:53

Thanks for the info. I ended up tearing everything down and switching to CentOS to appease IT. Hopefully it is helpful to other Fedora people out there...

EternalSeekerX January 26, 2021 00:52

Seems like openmpi in F33 has bugs
 
Quote:

Originally Posted by KTG (Post 794419)
Thanks for the info. I ended up tearing everything down and switching to CentOS to appease IT. Hopefully it is helpful to other Fedora people out there...

Okay, I just built the latest OpenMPI (v4.1.0) myself and it works fine with all the OpenFOAM versions I have (namely 7, 8, 1912 and 2012). I think OpenMPI v4.0.4 is bugged.

olesen January 26, 2021 15:02

Quote:

Originally Posted by EternalSeekerX (Post 794439)
I think OpenMPI v4.0.4 is bugged

Thanks for the heads up.

EternalSeekerX February 14, 2021 03:14

An update
 
Quote:

Originally Posted by olesen (Post 794525)
Thanks for the heads up.

It seems the update to OpenMPI v4.0.5 fixed the issue. Either that, or compiling OF8 inside a Fedora 32 container and then transferring it to F34 did the trick?

Edit: Seems like an F33 issue, it works flawlessly inside the F32 Docker container I created.
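
For anyone landing here later, the version reports in this thread can be summed up as a quick check against the version banner that Open MPI prints. This is only a sketch: the version sets below come from the posts above and nothing else, the cause was never pinned down for certain, and the function names are made up for illustration.

```python
import re

# Taken from the reports in this thread only; not an authoritative list.
REPORTED_FAILING = {(4, 0, 4)}
REPORTED_WORKING = {(4, 0, 5), (4, 1, 0)}


def parse_open_mpi_version(banner):
    """Extract (major, minor, patch) from text like 'mpirun (Open MPI) 4.0.4'."""
    match = re.search(r"Open MPI\)?\s+v?(\d+)\.(\d+)\.(\d+)", banner)
    return tuple(int(part) for part in match.groups()) if match else None


def thread_status(banner):
    """Classify a version banner according to what was reported in this thread."""
    version = parse_open_mpi_version(banner)
    if version is None:
        return "unknown"
    if version in REPORTED_FAILING:
        return "reported failing"
    if version in REPORTED_WORKING:
        return "reported working"
    return "not mentioned"


if __name__ == "__main__":
    print(thread_status("mpirun (Open MPI) 4.0.4"))
```

Treat the result as a hint about where to look first, since the container build may also have played a part.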


All times are GMT -4.