Problems running on multiple nodes in a cluster

#1 | naveen k s (New Member) | February 14, 2019, 07:38
Dear Foamers,

I'm trying to mesh the Ahmed body geometry using the snappyHexMesh utility. I am using the following command to run on two different nodes:

mpirun --hostfile /home/p20170004/mesh3forlowyplus/mpi_machines -np 24 /opt/apps/openfoam/5.0/OpenFOAM-5.x/bin/foamExec snappyHexMesh -parallel


I end up with the following error:



/opt/apps/openfoam/5.0/OpenFOAM-5.x/etc/config.sh/mpi: line 46: mpicc: command not found
/opt/apps/openfoam/5.0/OpenFOAM-5.x/etc/config.sh/mpi: line 46: mpicc: command not found
snappyHexMesh: symbol lookup error: /opt/apps/openmpi/intel-built/lib/libopen-pal.so.20: undefined symbol: _intel_fast_memmove
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[62587,1],7]
Exit code: 127



All the necessary .txt files and the decomposeParDict have been created properly.

However, if I run the command:

mpirun --hostfile /home/p20170004/mesh3forlowyplus/mpi_machines -np 24 /opt/apps/openfoam/5.0/OpenFOAM-5.x/bin/foamExec snappyHexMesh

it works fine.

I want to use parallel computing to speed up my work. I have been trying to solve this for the past 15 days without any success.

I desperately need help.
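
One way to narrow this down, offered as a sketch rather than a confirmed diagnosis: the "undefined symbol: _intel_fast_memmove" message suggests that the Intel-built Open MPI library cannot find the Intel compiler runtime on the node where the failing rank started. Note also that without -parallel each rank normally runs as an independent serial job, so that run finishing does not show the MPI setup is sound. The snippet below runs ldd -r on the library named in the error and keeps only the unresolved entries. The helper name unresolved is hypothetical, and the script needs to be run on each node listed in the hostfile.

```shell
# Keep only the lines of ldd-style output (read from stdin) that report a problem.
unresolved() {
    grep -E 'not found|undefined symbol' || true
}

# Library path copied from the error message in the post above.
lib=/opt/apps/openmpi/intel-built/lib/libopen-pal.so.20

if [ -e "$lib" ]; then
    # ldd -r also processes relocations, so missing symbols are reported too.
    ldd -r "$lib" 2>&1 | unresolved
else
    echo "skipped: $lib not present on $(uname -n)"
fi
```

If one node prints unresolved entries and another prints nothing, the two nodes probably differ in LD_LIBRARY_PATH or in which Intel runtime libraries are installed.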

#2 | Andrew Somorjai (massive_turbulence, Senior Member) | February 14, 2019, 11:05
Well... when I had a problem like this it wasn't related to snappyHexMesh. It was either because my OpenFOAM and MPI directories weren't the same on two or more machines, or because there was a problem with a build.

Basically I rebuilt OpenFOAM on two separate machines in the same directory, and for some reason that worked. What does your bashrc look like on both machines? I think it only matters on the master node, though. You need to include your MPI libs, but given that mpirun works with -np 24, that's probably not the problem either.

I was also only using blockMesh or UNV-to-foam conversion when I finally got it to work, so it might be snappyHexMesh.
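
To compare what the shell actually sees on both machines, a small report like the following may help. It is only a sketch: the function name report_env is made up for illustration, and the list of tools is an assumption based on the commands mentioned in this thread. Since each MPI rank resolves its libraries on the node it runs on, it is probably worth running this on every node rather than only on the master.

```shell
# Print what the current shell on this host can find.
report_env() {
    printf 'host: %s\n' "$(uname -n)"
    for tool in mpicc mpirun snappyHexMesh; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s: %s\n' "$tool" "$(command -v "$tool")"
        else
            printf '%s: NOT FOUND\n' "$tool"
        fi
    done
    printf 'LD_LIBRARY_PATH: %s\n' "${LD_LIBRARY_PATH:-<unset>}"
}

report_env
```

Saved as, say, report_env.sh (a hypothetical file name), it can be run remotely with something like ssh node02 'sh -s' < report_env.sh, where node02 stands in for a name from your hostfile. Differences between the outputs point at what the non-interactive shell on the remote node is missing.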

#3 | naveen k s (New Member) | February 14, 2019, 12:23
Thank you for your reply. So you are suggesting that I recompile OpenFOAM on the two nodes separately, and that the directory should be the same for Open MPI and OpenFOAM? I would also like to mention that the problem arises only when I run in parallel.

#4 | Andrew Somorjai (massive_turbulence, Senior Member) | February 14, 2019, 17:36
You can also compile OpenFOAM on just one machine, but make sure you copy it to exactly the same directory on the other computer, and that it was compiled in the same directory you run it from, along with the third-party tools.
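
A quick way to check the "same exact directory" condition on every node is sketched below. It assumes an Open MPI style hostfile (host name in the first column) and passwordless ssh between nodes. The helper name hosts_in and the default file name are illustrative, and the install path is the one from the first post.

```shell
# Print the unique host names from a hostfile, skipping comments and blank lines.
hosts_in() {
    awk '!/^[ \t]*(#|$)/ { print $1 }' "$1" | sort -u
}

hostfile=${1:-mpi_machines}
foam_dir=/opt/apps/openfoam/5.0/OpenFOAM-5.x

if [ -f "$hostfile" ]; then
    for h in $(hosts_in "$hostfile"); do
        # -n keeps ssh from reading stdin; BatchMode fails instead of prompting.
        if ssh -n -o BatchMode=yes "$h" "test -d '$foam_dir'"; then
            echo "$h: $foam_dir present"
        else
            echo "$h: $foam_dir MISSING or host unreachable"
        fi
    done
else
    echo "skipped: hostfile '$hostfile' not found"
fi
```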

The way I have OpenFOAM set up, I use NFS to share the directory from one machine and cluster around that. I simply mount the OpenFOAM directory from a flash drive and then export it using exportfs.

The cluster nodes then mount that NFS directory at /mnt/external/openfoam, so the bashrc sees it at the same path it was originally built in. I didn't use the home/user/ folder for mine because I didn't have enough room.

The original OpenFOAM build lives on a flash drive, but that was also mounted at /mnt/external, and the build was done by my slave node compiling OpenFOAM and the third-party tools on the server/master node.
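
For anyone wanting to reproduce the NFS layout described above, the commands would look roughly like what this sketch prints. Only the path /mnt/external/openfoam comes from the post. The server name, the subnet and the export options are assumptions chosen for illustration, so adjust them to your network. The script only prints the lines and changes nothing on the system.

```shell
# Hypothetical values; only the share path is taken from the post.
server=master
subnet=192.168.1.0/24
share=/mnt/external/openfoam

# Line for /etc/exports on the machine that holds the build.
# Use rw instead of ro if the nodes must write into the share.
exports_line() {
    printf '%s %s(ro,sync,no_subtree_check)\n' "$1" "$2"
}

# Mount command for a node, using the same path as on the server
# so that the paths baked into the build still match.
mount_cmd() {
    printf 'mount -t nfs %s:%s %s\n' "$1" "$2" "$2"
}

echo "# on $server, append to /etc/exports and then run: exportfs -ra"
exports_line "$share" "$subnet"
echo "# on every other node, as root:"
mount_cmd "$server" "$share"
```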

I'm not sure all these precautions about directories and builds are necessary, but I had a lot of trouble from being careless and decided to do it this way. Running OpenFOAM from a LAN server also takes less disk space, that's all.

EDIT
I have a suggestion before you trash your installation: try running a foamJob on a simple blockMesh-based example with interFoam or something similar, using two machines, and see if that works. If not, it's probably a bad install. OpenFOAM has a shell file that contains all the directories, but I've never been able to make sense of all of it, so for the sake of time (with 8 cores) I just rebuild the whole thing in 20 minutes.
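
A two-machine smoke test along these lines could be sketched as follows. To keep it short it uses mpirun directly with checkMesh -parallel instead of foamJob and interFoam, which still exercises the parallel start-up on both nodes. The tutorial path is an assumption about the OpenFOAM 5.x layout, HOSTFILE is a hypothetical environment variable that must hold an absolute path to your hostfile, and the helper subdomains is illustrative. It reads numberOfSubdomains so that -np always matches the decomposition.

```shell
# Read numberOfSubdomains from a decomposeParDict.
subdomains() {
    awk '$1 == "numberOfSubdomains" { sub(/;.*/, "", $2); print $2; exit }' "$1"
}

hostfile=${HOSTFILE:-}

if [ -n "${FOAM_TUTORIALS:-}" ] && [ -n "$hostfile" ] && [ -f "$hostfile" ]; then
    (
        # Assumed tutorial location; check it exists in your installation.
        cp -r "$FOAM_TUTORIALS/multiphase/interFoam/laminar/damBreak/damBreak" smokeTest &&
        cd smokeTest &&
        blockMesh &&
        decomposePar &&
        # foamExec is used as in the first post so remote ranks get the environment.
        # To run interFoam instead, prepare the fields first as the tutorial's Allrun does.
        mpirun --hostfile "$hostfile" -np "$(subdomains system/decomposeParDict)" \
            "$WM_PROJECT_DIR/bin/foamExec" checkMesh -parallel
    )
else
    echo "skipped: source the OpenFOAM environment and set HOSTFILE first"
fi
```

If this small case also fails with the same symbol lookup error, the problem is in the MPI or OpenFOAM environment on the nodes and not in the snappyHexMesh case.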