interFoam process forking on HPC using PBS (https://www.cfd-online.com/Forums/openfoam-solving/127187-interfoam-process-forking-hpc-using-pbs.html)

JFM December 5, 2013 07:44

interFoam process forking on HPC using PBS
 
Good day All

I have an interFoam model running on an HPC (the Barrine cluster) with PBS scripting. When checking on the model, I find that the command:
Code:

pstree -ap 15754
generates the following output
Code:

b07b05:~ # pstree -ap 15754
mpirun,15754 -np 16 interFoam -parallel
  ├─interFoam,15756 -parallel
  │  ├─{interFoam},15763
  │  └─{interFoam},15766
  ├─interFoam,15757 -parallel
  │  ├─{interFoam},15765
  │  └─{interFoam},15767
  ├─interFoam,15758 -parallel
  │  ├─{interFoam},15761
  │  └─{interFoam},15764
  └─interFoam,15759 -parallel
      ├─{interFoam},15760
      └─{interFoam},15762

I interpret this as two PIDs being generated for each interFoam process - from what I understand this is forking and it is bad. Currently my models are running extremely slowly. Below is the PBS script I am using:
Code:

#!/bin/bash -l
# PBS directives

#PBS -N iF32ndslam1C
#PBS -l select=8:ncpus=4:mpiprocs=4:mem=4gb
#PBS -l walltime=336:00:00
#PBS -j oe
#PBS -A XXXXXXXXXXXXXXXXXX
#PBS -m abe
#PBS -M XXXXXXXXXXXXXXXXX

# load modules
module load python
module load openfoam/2.2.1
source /sw/OpenFOAM/2.2.1/OpenFOAM-2.2.1/etc/bashrc

# PBS-created environment variables & directories
export JOBWORK1=/work1/$USER/$PBS_JOBID-results
mkdir -p $JOBWORK1

echo 'Working directory: '$PBS_O_WORKDIR
echo 'Network storage: '$JOBWORK1
echo 'Temporary / scratch directory: '$TMPDIR

# go to work1 job directory
cd $JOBWORK1
cp -r $PBS_O_WORKDIR/* $JOBWORK1

# carry out computations
echo 'Working on node: '$(hostname)

# Execute scipy tests
python -c "import scipy ; scipy.test()" > scipy_test.out 2>&1

# Mesh the geometry
# blockMesh 2>&1 | tee -a $JOBWORK1/output.log

# Refine mesh to improve speed
# refineMesh 2>&1 | tee -a $JOBWORK1/output.log
# renumberMesh 2>&1 | tee -a $JOBWORK1/output.log

# Set the initial conditions
# setFields 2>&1 | tee -a $JOBWORK1/output.log

# Decompose the mesh for parallel run
decomposePar 2>&1 | tee -a $JOBWORK1/output.log 

# Run the solver
mpirun -np 32 interFoam -parallel 2>&1 | tee -a $JOBWORK1/output.log

# Reconstruct the parallel results
reconstructPar 2>&1 | tee -a $JOBWORK1/output.log

# Extract results (all for turbulent)
sample 2>&1 | tee -a $JOBWORK1/output.log
foamCalc UPrimeToMean 2>&1 | tee -a $JOBWORK1/output.log
Co 2>&1 | tee -a $JOBWORK1/output.log
flowType 2>&1 | tee -a $JOBWORK1/output.log
Lambda2 2>&1 | tee -a $JOBWORK1/output.log
Mach 2>&1 | tee -a $JOBWORK1/output.log
Pe 2>&1 | tee -a $JOBWORK1/output.log
Q 2>&1 | tee -a $JOBWORK1/output.log
streamFunction 2>&1 | tee -a $JOBWORK1/output.log
uprime 2>&1 | tee -a $JOBWORK1/output.log
vorticity 2>&1 | tee -a $JOBWORK1/output.log
createTurbulenceFields 2>&1 | tee -a $JOBWORK1/output.log
R 2>&1 | tee -a $JOBWORK1/output.log
# weight 2>&1 | tee -a $JOBWORK1/output.log
# weights 2>&1 | tee -a $JOBWORK1/output.log
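
For completeness, decomposePar above has to produce the same number of subdomains as the 32 ranks passed to mpirun. A minimal system/decomposeParDict sketch (the decomposition method chosen here is an assumption, not taken from the original case) would be:
Code:

FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

numberOfSubdomains 32;      // must equal the mpirun -np value

method             scotch;  // assumption; simple or hierarchical also work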

It has been suggested by IT to either:
  • remove the -parallel switch from the solver call (this does not appear to work), or
  • change the number of CPUs to match the number of nodes being called (???) - see the sketch below.
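
For reference on the second suggestion, the select line above requests 8 chunks x 4 mpiprocs = 32 MPI ranks, which is what -np 32 expects. A quick way to check what PBS actually handed out (a sketch, assuming a PBS Pro-style $PBS_NODEFILE with one line per MPI slot) is:
Code:

# Sketch: compare the slots PBS granted with the rank count given to mpirun
echo "MPI slots allocated by PBS: $(wc -l < $PBS_NODEFILE)"
echo "Distinct compute nodes:     $(sort -u $PBS_NODEFILE | wc -l)"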

Has anyone come across this previously, and how did you resolve the issue? Any assistance will be greatly appreciated.

Kind regards
JFM
:)

dkingsley December 5, 2013 11:59

I think you need to pass mpirun the hosts that PBS has allocated; I usually do something like this:
Code:

#!/bin/bash
#PBS -l nodes=16
#PBS -q PEM610
#PBS -V

. /apps/OpenFOAM/OpenFOAM-2.2.x/etc/bashrc

cd $FOAM_RUN

cp $PBS_NODEFILE ./CaseName/system/machines

mpirun -n 16 -hostfile ./CaseName/system/machines interFoam -case CaseName -parallel > CaseName.log 2>&1
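
A small variation on the same idea (a sketch, not part of the original reply) avoids hard-coding the rank count by reading it from the node file PBS provides:
Code:

# Sketch: let the PBS allocation drive both the host list and the rank count
NPROCS=$(wc -l < $PBS_NODEFILE)
cp $PBS_NODEFILE ./CaseName/system/machines
mpirun -n $NPROCS -hostfile ./CaseName/system/machines interFoam -case CaseName -parallel > CaseName.log 2>&1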

JFM February 4, 2014 08:49

[SOLVED] HPC PBS Script Working
 
Thank you dkingsley - I tried your recommendation and the HPC is now performing satisfactorily, with no forking messages. The issue was also partly related to the PCG solver; I am now using GAMG.

:D
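
For anyone making the same PCG-to-GAMG switch, a typical pressure-solver entry in system/fvSolution looks something like the sketch below (p_rgh is the usual pressure field for interFoam; the smoother and tolerances are assumptions, not JFM's actual settings):
Code:

p_rgh
{
    solver                GAMG;        // geometric-algebraic multigrid instead of PCG
    smoother              GaussSeidel;
    cacheAgglomeration    true;
    nCellsInCoarsestLevel 10;
    agglomerator          faceAreaPair;
    mergeLevels           1;
    tolerance             1e-07;       // assumed tolerances
    relTol                0.05;
}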

