big difference between clockTime and executionTime

[#1] Luca (LM4112), posted July 31, 2013, 17:21
Dear all,

I'm trying to run a DES simulation using a model with 30 million cells. I'm using a cluster with 6 processors, 12 cores each. I have noticed that there is a big difference between clockTime and executionTime:

ExecutionTime = 164.45 s ClockTime = 1000 s

I've used the scotch method to decompose the domain. Furthermore, the system uses InfiniBand, so I don't think it's a connection problem. Can anyone help me? Thank you.

best regards
Luca

[#2] Luca (LM4112), posted August 1, 2013, 07:09
bump....any help??

[#3] Laurence R. McGlashan (l_r_mcglashan), posted August 1, 2013, 07:39
Is there a lot of IO? What was the output of decomposePar?
__________________
Laurence R. McGlashan :: Website

[#4] Luca (LM4112), posted August 1, 2013, 09:33
Quote:
Originally Posted by l_r_mcglashan View Post
Is there a lot of IO? What was the output of decomposePar?
I'm sorry, but I'm not sure what you mean by IO. By the way, I've attached the decomposePar output.

Furthermore, I have another strange problem: after a few time steps, the program tried to use more than 12 CPUs on one of the nodes.

=>> PBS: job killed: ncpus 12.92 exceeded limit 12 (sum)

Do you know how to fix this problem as well? Thank you.

best regards
Luca
Attached Files
File Type: pdf decompose output copy .pdf (60.6 KB, 22 views)

[#5] Bruno Santos (wyldckat), posted August 16, 2013, 07:36
Greetings to all!

@Luca: What Laurence means by IO is hardware related Input-Output, namely the time spent in reading and writing files, as well as data exchange between machines.

Now, according to the PDF file you attached, the case was decomposed into 72 processors, not just 12!

As for the big time discrepancy, there is also the possibility that you are over-scheduling the machines (more processes running than there are cores available). In other words, there might be other applications running on the cluster alongside your own run. This would explain the big time discrepancy.
In addition, if we do the math: 1000/164 ~= 6.1, which suggests that there are probably 6 times more processes running than there are cores available... which makes some sense, since 6*12 = 72.

So, my guess is that the job is being incorrectly scheduled on the cluster.
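
The ratio used in this estimate can be read straight off a solver log. The following is only a minimal sketch: it assumes the log contains timing lines in the form quoted in the first post, and the function name time_ratio and the example log name are hypothetical.

```shell
#!/bin/sh
# Print ClockTime / ExecutionTime for the last timing line found in a solver log.
# Assumes lines of the form: "ExecutionTime = 164.45 s ClockTime = 1000 s".
# time_ratio is a hypothetical helper name, not part of OpenFOAM.
time_ratio() {
    awk '/^ExecutionTime/ && $3 > 0 { ratio = $7 / $3 }
         END { if (ratio != "") printf "%.1f\n", ratio }' "$1"
}

# Example call (the log file name is a placeholder):
#   time_ratio simulation.log
```

A value close to 1 would suggest that each process has a core to itself; a value close to the number of processes per core would be consistent with the over-scheduling guess above, although heavy file or network IO can also inflate it.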

Best regards,
Bruno

[#6] Luca (LM4112), posted August 16, 2013, 07:50
Dear Bruno,

thanks for the reply.
The system that I'm using has 12 processors per node. If I use only one node, everything is fine. I would like to use 6 nodes (72 processors), so I decomposed the domain into 72 subdomains. I'm sure that it is actually using 72 processors, because the output contains something like this:

nProcs : 16
Slaves :
15
(
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16833"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16834"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16835"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16836"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16837"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16838"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16839"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16840"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16841"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16842"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16843"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16844"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16845"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16846"
"cx1-9-2-2.cx1.hpc.ic.ac.uk.16847"
)

This output is from 1 node with 16 processors. When I try to use 6 nodes with 12 processors each, I get the same kind of output with a list of 72 slaves, so I think that the system is actually using 72 processors. What happens is:

1) clockTime is much greater than executionTime

2) after a few time steps, the program tried to use more than 12 CPUs on one of the nodes.

=>> PBS: job killed: ncpus 12.92 exceeded limit 12 (sum)

The person responsible for the HPC told me that, in his view, the second problem is an OpenFOAM bug and that I scheduled the job correctly.

I hope that I have explained the situation more clearly.

best regards

Luca

[#7] Bruno Santos (wyldckat), posted August 16, 2013, 07:59
Hi Luca,

Well, your description continues to indicate that the processes are all being launched on the same machine.
To confirm this, what's the output for when you use 72 processors, namely regarding the "nProcs" and "Slaves" output? (and please use the [CODE] marker, as explained in the second link on my signature, for posting the more than 72 lines of output )

In addition, how exactly was OpenFOAM installed on the cluster and with which MPI toolbox?
Or in other words, is the installed OpenFOAM using the cluster's MPI? Or using OpenFOAM's Open-MPI version?
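
For a long "Slaves" list, counting the entries per host is quicker than reading all the lines. This is a rough sketch only: it assumes each entry has the form "host.PID" on its own line, as in the listing Luca posted, and the function name slaves_per_host is hypothetical.

```shell
#!/bin/sh
# Count how many entries of the "Slaves" list in an OpenFOAM log sit on each host.
# Entries are assumed to look like "cx1-9-2-2.cx1.hpc.ic.ac.uk.16833" (host, dot, PID).
# The master process is reported on the separate "Host :" line and is not counted.
# slaves_per_host is a hypothetical helper name.
slaves_per_host() {
    grep -E '^"[^"]+\.[0-9]+"$' "$1" \
        | sed -E 's/^"(.*)\.[0-9]+"$/\1/' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example call (the log file name is a placeholder):
#   slaves_per_host simulation.log
```

If only one host name comes back for a job that requested several nodes, all processes were started on the same machine.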

Best regards,
Bruno

[#8] Luca (LM4112), posted August 16, 2013, 08:20
Dear Bruno,

I have attached the output and the run script that I used to launch the simulation.
I am using the HPC servers at Imperial College, which provide a list of available modules. As you can see from the run script, I have loaded the modules "open foam/2.1.1" and the openmpi libraries.

best regards
Luca

p.s.: thank you also for the other replies!!
Attached Files
File Type: pdf run_script.pdf (17.8 KB, 33 views)
File Type: pdf simulation_output.pdf (24.2 KB, 23 views)

[#9] Bruno Santos (wyldckat), posted August 16, 2013, 08:48
Hi Luca,

Well, the 72 processes are assigned only to the machine "cx1-2-2-1", according to the output.

OK, I did some researching online and since there are several PBS schedulers, I found the following:
From what I could find, the last two links are the most revealing, where it looks like you're missing the entry "mpiprocs", which constrains the number of processes per node:
Code:
#PBS -l select=6:ncpus=12:mpiprocs=12:icib=true:mem=45000mb
My guess is that since CX1 is being upgraded, the admins are not familiar with the new cluster settings, which would explain why "mpiprocs" is omitted in the official instructions: http://www3.imperial.ac.uk/ict/servi.../cx1%20changes

Best regards,
Bruno

[#10] Luca (LM4112), posted August 16, 2013, 08:59
Dear Bruno,

thank you a lot, you are very helpful. I will try to change the run script and I'll let you know.

best regards
Luca

[#11] Luca (LM4112), posted August 16, 2013, 10:40
Dear Bruno,

I have tried to use mpiprocs=12 but it gives the same error:

=>> PBS: job killed: ncpus 13.36 exceeded limit 12 (sum)
-bash: line 1: 27581 Terminated /var/spool/PBS/mom_priv/jobs/5074103.cx1b.SC
mpirun: abort is already in progress...hit ctrl-c again to forcibly terminate

--------------------------------------------------------------------------
mpirun noticed that process rank 54 with PID 27877 on node cx1-2-3-1.cx1.hpc.ic.ac.uk exited on signal 15 (Terminated).



Best regards,
Luca

[#12] Laurence R. McGlashan (l_r_mcglashan), posted August 16, 2013, 10:50
Do you not need to provide a machinefile to mpirun? Normally this is an environment variable filled by the cluster.

Is there not a sysadmin at Imperial that you can ask for help?
__________________
Laurence R. McGlashan :: Website

[#13] Bruno Santos (wyldckat), posted August 16, 2013, 11:07
@Laurence - quoting Luca:
Quote:
Originally Posted by LM4112 View Post
The person responsible for the HPC told me that, in his view, the second problem is an OpenFOAM bug and that I scheduled the job correctly.
@Luca: I agree with Laurence, it looks like the PBS isn't working properly with mpirun. Some more searching and I found this link: http://tsubame.gsic.titech.ac.jp/doc...ml/queues.html
It refers to the explicit usage of:
Code:
-hostfile $PBS_NODEFILE
For example, you can try using:
Code:
mpirun -np 72 -hostfile $PBS_NODEFILE renumberMesh -overwrite -parallel
If this doesn't work, try figuring out which variable has the machine list or machine file, by adding to the job script:
Code:
export > /full/path/to/your/work/folder/snooping_around.txt
The resulting "snooping_around.txt" file should have a long list of environment variables, including any references to PBS or hosts or machines and so on.
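
Since that dump can be long, a filter helps to pull out the interesting entries. This is a small sketch under assumptions: the keyword list (pbs, host, node, machine) is a guess rather than an official list of scheduler variables, and the function name scheduler_vars is hypothetical.

```shell
#!/bin/sh
# Show the lines of an "export" dump that look scheduler- or host-related.
# The keywords are a guess; extend the pattern if the cluster uses other names.
# scheduler_vars is a hypothetical helper name.
scheduler_vars() {
    grep -i -E 'pbs|host|node|machine' "$1"
}

# Example call:
#   scheduler_vars snooping_around.txt
```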

[#14] Luca (LM4112), posted August 16, 2013, 11:09
Quote:
Originally Posted by l_r_mcglashan View Post
Do you not need to provide a machinefile to mpirun? Normally this is an environment variable filled by the cluster.

Is there not a sysadmin at Imperial that you can ask for help?
The sysadmin says that it is an OpenFOAM bug, so now I'm trying a different version (so far I've used 2.1; now version 1.6 is running). By the way, I think it is quite improbable that it's a bug, as I haven't found the same kind of error reported anywhere. It is also strange that I get this error while a friend of mine, using the same command lines but not OpenFOAM, is managing to run DNS simulations on 6 nodes.

I am sorry, but I don't know what a machinefile for mpirun is.

best regards
Luca

[#15] Luca (LM4112), posted August 16, 2013, 11:22
I am sorry, I'm not sure I have understood correctly: do I have to write -hostfile $PBS_NODEFILE every time I use mpirun?

I mean, I have to write

Code:
mpirun -np 72 -hostfile $PBS_NODEFILE renumberMesh -overwrite -parallel
and

Code:
 mpirun -np 72 -hostfile $PBS_NODEFILE pisoFoam -parallel >/work/sb3712/Luca/32mm_DES_bo/simulation_bo.log
best regards,

Luca

[#16] Bruno Santos (wyldckat), posted August 16, 2013, 11:30
Quote:
Originally Posted by LM4112 View Post
I mean, I have to write

Code:
mpirun -np 72 -hostfile $PBS_NODEFILE renumberMesh -overwrite -parallel
and

Code:
 mpirun -np 72 -hostfile $PBS_NODEFILE pisoFoam -parallel >/work/sb3712/Luca/32mm_DES_bo/simulation_bo.log
Yes, that's the idea!

And in case it doesn't work, call that other export command before using the first mpirun:
Code:
export > /work/sb3712/Luca/32mm_DES_bo/snooping_around.txt

mpirun -np 72 -hostfile $PBS_NODEFILE renumberMesh -overwrite -parallel
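
Before trusting the host file, it may be worth checking what the scheduler actually wrote into it. The sketch below assumes the usual layout of one host name per line, repeated once per MPI slot; the function name slots_per_host is hypothetical.

```shell
#!/bin/sh
# Print "host count" for each host listed in a PBS-style node file
# (assumed layout: one host name per line, repeated once per MPI slot).
# slots_per_host is a hypothetical helper name.
slots_per_host() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Inside a job script this could run before the first mpirun.
# The guard keeps the snippet harmless outside a PBS job.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    slots_per_host "$PBS_NODEFILE"
fi
```

For the request discussed in this thread (6 nodes with 12 processes each), one would expect six host names with a count of 12 each; a single host name would point back to the scheduling problem.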

[#17] katakgoreng, posted January 10, 2014, 15:52
Hi guys,

I had pretty much the same problem as Luca (btw, I'm also from Imperial).
I can't seem to get more than 1 node running on the cluster (1 node = 16 cores).
I tried running the dam break case in parallel on the cluster using 2 nodes.
After meshing and decomposing the domain,

running,
Code:
mpirun -np 32 -hostfile $PBS_NODEFILE renumberMesh -overwrite -parallel > log.renumberMesh 2>&1
gives,
Code:
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.2.x                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.2.x-0ee7dc546f1b
Exec   : renumberMesh -overwrite -parallel
Date   : Jan 10 2014
Time   : 19:27:02
Host   : "cx1-11-2-1.cx1.hpc.ic.ac.uk"
PID    : 20276
Case   : /tmp/pbs.6213702.cx1b/damBreak32
nProcs : 32
Slaves : 
31
(
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20277"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20278"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20279"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20280"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20281"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20282"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20283"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20284"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20285"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20286"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20287"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20288"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20289"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20290"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20291"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.24996"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.24997"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.24998"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.24999"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25000"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25001"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25002"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25003"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25004"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25005"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25006"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25007"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25008"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25009"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25010"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25011"
)

Pstream initialized with:
    floatTransfer      : 0
    nProcsSimpleSum    : 0
    commsType          : nonBlocking
    polling iterations : 0
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

[25] #0  Foam::error::printStack(Foam::Ostream&)[19] #0  Foam::error::printStack(Foam::Ostream&)[16] #0  [21] #0  Foam::error::printStack(Foam::Ostream&)[22] #0  Foam::error::printStack(Foam::Ostream&)[30] #0  Foam::error::printStack(Foam::Ostream&)[26] #Foam::error::printStack(Foam::Ostream&)[17] #0  0  Foam::error::printStack(Foam::Ostream&)Foam::error::printStack(Foam::Ostream&)[20] #0[24] #0  Foam::error::printStack(Foam::Ostream&)[28] [31] [18] #0    Foam::error::printStack(Foam::Ostream&)#0  [23] #0  [27] #0  [29] #0  Foam::error::printStack(Foam::Ostream&)#0  Foam::error::printStack(Foam::Ostream&)Foam::error::printStack(Foam::Ostream&)Foam::error::printStack(Foam::Ostream&)Foam::error::printStack(Foam::Ostream&)Foam::error::printStack(Foam::Ostream&) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[25] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[16] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/lin in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[22] #1  Foam::sigSegv::sigHandler(int)ux64GccDPOpt/lib/libOpenFOAM.so"
[17] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[21] #1   in "/home/ehk112/OpenFOAFoam::sigSegv::sigHandler(int)M/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[29] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64 in "/home/ehk112/OpenFOAM/OpenFOAM-2.2GccDPOpt/lib/libOpenFOAM.so"
[24] #1  .x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[31] #1  Foam::sigSegv::sigHandler(int)Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[26] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[30] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[28] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[23] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[19] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[20] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[18] #1   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linuxFoam::sigSegv::sigHandler(int)64GccDPOpt/lib/libOpenFOAM.so"
[27] #1  Foam::sigSegv::sigHandler(int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[25] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[16] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[22] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[26] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[17] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[21] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[23] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[29] #2   in "/lib64/libc.so.6"
[25] #3  Foam::Time::setTime(Foam::instant const&, int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[24] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[28] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[31] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[19] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[30] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[18] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[27] #2   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[20] #2   in "/lib64/libc.so.6"
[22] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[16] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[26] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[17] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[21] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[23] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[24] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[28] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[29] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[31] #3  Foam::Time::setTime(Foam::instant const&, int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[25] #4   in "/lib64/libc.so.6"
[30] #3   in "/lib64/libc.so.6"
[19] #3  Foam::Time::setTime(Foam::instant const&, int)Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[27] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[18] #3  Foam::Time::setTime(Foam::instant const&, int) in "/lib64/libc.so.6"
[20] #3  Foam::Time::setTime(Foam::instant const&, int) in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[22] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[16] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[26] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[17] #4  
 in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[23] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[21] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[28] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[29] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[24] #4  
 in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[31] #4  

 in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[27] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[30] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[20] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[19] #4   in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[18] #4  
[25]  in "/home/ehk112/OpenFOAM/OpenFOAM-2.2.x/platforms/linux64GccDPOpt/bin/renumberMesh"
[25] #5  __libc_start_main

While running,
Code:
mpirun -np 32 -hostfile $PBS_NODEFILE interFoam -parallel > log.interFoam 2>&1
gives,
Code:
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.2.x                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.2.x-0ee7dc546f1b
Exec   : interFoam -parallel
Date   : Jan 10 2014
Time   : 19:28:09
Host   : "cx1-11-2-1.cx1.hpc.ic.ac.uk"
PID    : 20294
Case   : /tmp/pbs.6213702.cx1b/damBreak32
nProcs : 32
Slaves : 
31
(
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20295"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20296"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20297"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20298"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20299"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20300"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20301"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20302"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20303"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20304"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20305"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20306"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20307"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20308"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20309"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25236"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25237"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25238"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25239"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25240"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25241"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25242"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25243"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25244"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25245"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25246"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25247"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25248"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25249"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25250"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25251"
)

Pstream initialized with:
    floatTransfer      : 0
    nProcsSimpleSum    : 0
    commsType          : nonBlocking
    polling iterations : 0
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

[16] 
[16] 
[16] --> FOAM FATAL ERROR: 
[16] Cannot find file "points" in directory "polyMesh" in times 0 down to constant
[25] [26] 
[26] 
[26] --> FOAM FATAL ERROR: [28] 
[28] 
[28] --> FOAM FATAL ERROR: 
[28] Cannot find file "points" in directory "polyMesh" in times 0 down to constant
[28] 
[29] 
[29] 
[29] --> FOAM FATAL ERROR: 
[29] [30] 
[30] 
[30] --> FOAM FATAL ERROR: 
[30] Cannot find file "points" in directory "polyMesh" in times 0 down to constant
[30] 
[30]     From function Time::findInstance(const fileName&, const word&, const IOobject::readOption, const word&)
[30]     in file db/Time/findInstance.C at line [31] 
[31] 
[31] --> FOAM FATAL ERROR: 
[31] Cannot find file "points" in directory "polyMesh" in times 0 down to constant
[31] 
[31]     From function Time::findInstance(const fileName&, const word&, const IOobject::readOption, const word&)
[31] [16] 
[16]     From function Time::findInstance(const fileName&, const word&, const IOobject::readOption, const word&)
[16]     in file db/Time/findInstance.C at line 203.
[16] 
FOAM parallel run exiting
The $PBS_NODEFILE is given as
Code:
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-1
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
cx1-11-2-4
Is the problem due to the nodes having separate storage locations and the environment variables not being carried forward to the second node?

I would really appreciate it if someone could help me out.
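
One observation on the logs above: every rank that reports an error is numbered 16 or higher, and the case sits under /tmp. That pattern would be consistent with /tmp being local to each node, but this is an assumption that nothing in the thread confirms. A rough way to test it is to check, on each node, which processor directories actually contain a mesh. The sketch below does that for one case directory; the function name missing_points is hypothetical, and how to launch it on the second node depends on the cluster and is not shown.

```shell
#!/bin/sh
# Report which processor directories of a decomposed case lack a mesh points file.
# Usage: missing_points CASE_DIR NPROCS
# missing_points is a hypothetical helper name, not an OpenFOAM utility.
missing_points() {
    case_dir=$1
    nprocs=$2
    i=0
    while [ "$i" -lt "$nprocs" ]; do
        mesh_dir="$case_dir/processor$i/constant/polyMesh"
        # The points file may also be stored compressed as points.gz.
        if [ ! -e "$mesh_dir/points" ] && [ ! -e "$mesh_dir/points.gz" ]; then
            echo "processor$i"
        fi
        i=$((i + 1))
    done
}

# Example call (32 subdomains, case path taken from the job environment):
#   missing_points "$PWD" 32
```

If the second node lists processor16 to processor31 as missing while the first node lists nothing, the decomposed case was never copied to the second node's storage.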

Kind regards,
katakgoreng

Last edited by katakgoreng; January 10, 2014 at 17:15.

[#18] katakgoreng, posted January 10, 2014, 17:18
I also ran the Test-parallel application:

Code:
mpirun -np 32 -hostfile $PBS_NODEFILE Test-parallel -parallel > log.Testparallel 2>&1
Code:
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.2.x                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.2.x-0ee7dc546f1b
Exec   : Test-parallel -parallel
Date   : Jan 10 2014
Time   : 21:11:30
Host   : "cx1-11-2-1.cx1.hpc.ic.ac.uk"
PID    : 20659
Case   : /tmp/pbs.6213702.cx1b/damBreak32
nProcs : 32
Slaves : 
31
(
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20660"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20661"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20662"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20663"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20664"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20665"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20666"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20667"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20668"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20669"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20670"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20671"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20672"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20673"
"cx1-11-2-1.cx1.hpc.ic.ac.uk.20674"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25480"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25481"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25482"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25483"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25484"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25485"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25486"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25487"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25488"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25489"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25490"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25491"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25492"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25493"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25494"
"cx1-11-2-4.cx1.hpc.ic.ac.uk.25495"
)

Pstream initialized with:
    floatTransfer      : 0
    nProcsSimpleSum    : 0
    commsType          : nonBlocking
    polling iterations : 0
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

[2] [3] 
Starting transfers
[3] 
[3] slave sending to master 0
[5] 
Starting transfers
[5] 
[5] slave sending to master 0
[6] 
Starting transfers
[6] 
[6] slave sending to master 0
[6] slave receiving from master 0
[7] 
Starting transfers
[7] 
[7] slave sending to master 0
[7] slave receiving from master 0
[8] 
Starting transfers
[8] 
[8] slave sending to master 0
[8] slave receiving from master 0
[9] 
Starting transfers
[9] 
[9] slave sending to master 0
[9] slave receiving from master 0
[12] 
Starting transfers
[12] 
[12] slave sending to master 0
[12] slave receiving from master 0
[14] 
Starting transfers
[14] 
[14] slave sending to master 0
[14] slave receiving from master 0
[1] 
Starting transfers
[1] 
[1] slave sending to master 0
[1] slave receiving from master 0
[4] 
Starting transfers
[4] 
[4] slave sending to master 0
[4] slave receiving from master 0
[3] slave receiving from master 0
[5] slave receiving from master 0
[10] 
Starting transfers
[10] 
[10] slave sending to master 0
[10] slave receiving from master 0
[11] 
Starting transfers
[11] 
[11] slave sending to master 0
[11] slave receiving from master 0
[13] 
Starting transfers
[13] 
[13] slave sending to master 0
[13] slave receiving from master 0
[15] 
Starting transfers
[15] 
[15] slave sending to master 0
[15] slave receiving from master 0
[0] 
Starting transfers
[0] 
[0] master receiving from slave 1
[0] (0 1 2)
[0] master receiving from slave 2

Starting transfers
[28] 
Starting transfers
[28] 
[28] slave sending to master 0
[28] slave receiving from master 0
[18] 
Starting transfers
[18] 
[18] slave sending to master 0
[18] slave receiving from master 0
[20] 
Starting transfers
[20] 
[20] slave sending to master 0
[20] slave receiving from master 0
[21] 
Starting transfers
[21] 
[21] slave sending to master 0
[21] slave receiving from master 0
[25] 
Starting transfers
[25] 
[25] slave sending to master 0
[25] slave receiving from master 0
[29] 
Starting transfers
[29] 
[29] slave sending to master 0
[29] slave receiving from master 0
[30] 
Starting transfers
[30] 
[30] slave sending to master 0
[30] slave receiving from master 0
[31] 
Starting transfers
[31] 
[31] slave sending to master 0
[31] slave receiving from master 0
[16] 
Starting transfers
[16] 
[16] slave sending to master 0
[16] slave receiving from master 0
[17] 
Starting transfers
[17] 
[17] slave sending to master 0
[17] slave receiving from master 0
[19] 
Starting transfers
[19] 
[19] slave sending to master 0
[19] slave receiving from master 0
[22] 
Starting transfers
[22] 
[22] slave sending to master 0
[22] slave receiving from master 0
[23] 
Starting transfers
[23] 
[23] slave sending to master 0
[23] slave receiving from master 0
[24] 
Starting transfers
[24] 
[24] slave sending to master 0
[24] slave receiving from master 0
[26] 
Starting transfers
[26] 
[26] slave sending to master 0
[26] slave receiving from master 0
[27] 
Starting transfers
[27] 
[27] slave sending to master 0
[27] slave receiving from master 0
[2] 
[2] slave sending to master 0
[2] [0] (0 1 2)
[0] master receiving from slave 3
[0] (0 1 2)
[0] master receiving from slave 4
[0] (0 1 2)
[0] master receiving from slave 5
[0] (0 1 2)
[0] master receiving from slave 6
[0] (0 1 2)
[0] master receiving from slave 7
[0] (0 1 2)
[0] master receiving from slave 8
[0] (0 1 2)
[0] master receiving from slave 9
[0] (0 1 2)
[0] master receiving from slave 10
[0] (0 1 2)
[0] master receiving from slave 11
[0] (0 1 2)
[0] master receiving from slave 12
[0] (0 1 2)
[0] master receiving from slave 13
[0] (0 1 2)
[0] master receiving from slave 14
[0] (0 1 2)
[0] master receiving from slave 15
[0] (0 1 2)
[0] master receiving from slave 16
[0] (0 1 2)
[0] master receiving from slave 17
[0] (0 1 2)
[0] master receiving from slave 18
[0] (0 1 2)
[0] master receiving from slave 19
[0] (0 1 2)
[0] master receiving from slave 20
[0] (0 1 2)
[0] master receiving from slave 21
[0] (0 1 2)
[0] master receiving from slave 22
[0] (0 1 2)
[0] master receiving from slave 23
[0] (0 1 2)
[0] master receiving from slave 24
[0] (0 1 2)
[0] master receiving from slave 25
[0] (0 1 2)
[0] master receiving from slave 26
[0] (0 1 2)
[0] master receiving from slave 27
[0] (0 1 2)
[0] master receiving from slave 28
[0] (0 1 2)
[0] master receiving from slave 29
[0] (0 1 2)
[0] master receiving from slave 30
[0] (0 1 2)
[0] master receiving from slave 31
[0] (0 1 2)
[0] master sending to slave 1
[0] master sending to slave 2
[0] master sending to slave 3
[0] master sending to slave 4
[0] master sending to slave 5
[1] (0 1 2)
[4] (0 [3] (0 1 2)
[0] master sending to slave 6
[0] master sending to slave 7
[0] master sending to slave 8
[0] master sending to slave 9
[0] master sending to slave 10
[0] 1 2)
[6] (0 1 2)
[7] (0 1 2)
[8] (0 1 2)
[5] (0 1 2)
[10] (0 1 2)
[9] (0 1 2)
master sending to slave 11
[0] master sending to slave 12
[0] master sending to slave 13
[0] master sending to slave 14
[0] master sending to slave 15
[0] master sending to slave 16
[0] master sending to slave 17
[0] master sending to slave 18[15] (0 1 2)
[14] (0 1 2)
[11] (0 1 2)
[12] (0 1 2)
[13] (0 1 2)

[0] master sending to slave 19
[0] master sending to slave 20
[0] master sending to slave 21
[0] master sending to slave 22
[0] master sending to slave 23
[0] master sending to slave 24
[0] master sending to slave 25
[0] master sending to slave 26
[0] master sending to slave 27
[0] master sending to slave 28
[0] master sending to slave 29
[0] master sending to slave 30
[0] master sending to slave 31
End

Finalising parallel run
slave receiving from master 0
[2] (0 1 2)
[16] (0 1 2)
[17] (0 1 2)
[18] (0 1 2)
[19] (0 1 2)
[20] (0 1 2)
[22] (0 1 2)
[23] (0 1 2)
[21] (0 1 2)
[24] (0 1 2)
[26] (0 1 2)
[25] (0 1 2)
[29] (0 1 2)
[30] (0 1 2)
[31] (0 1 2)
[27] (0 1 2)
[28] (0 1 2)

Old   January 26, 2014, 10:43
Default
  #19
Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 8,488
Blog Entries: 34
Rep Power: 86
wyldckat is just really nice
Greetings katakgoreng,

I've finally managed to get around to this thread on my to-do list... A few questions:
  1. It's been 16 days since you last posted, did you manage to solve the problem?
  2. What were the exact steps you've taken to prepare the case?
  3. And have you tried using the "-case" option in the interFoam and renumberMesh applications? Example:
    Code:
    mpirun -np 32 -hostfile $PBS_NODEFILE renumberMesh -overwrite  -parallel -case /home/ehk112/OpenFOAM/ehk112-2.2.2/run/damBreak32 >  log.renumberMesh 2>&1
The Test-parallel utility doesn't need to access all of the files, which is why there weren't any problems.
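Since the solver, unlike Test-parallel, has to open the files of each rank, a quick check is whether every `processorN` directory written by decomposePar is present where the ranks will look for it. This is a minimal sketch; the function name `missing_processor_dirs` is hypothetical, and it only inspects the file system of the node it runs on.

```shell
#!/bin/bash
# Sketch: print the processorN directories that are expected for np ranks
# but absent from the given case directory (decomposePar writes
# processor0 .. processor(np-1)).
missing_processor_dirs() {
    local case_dir=$1 np=$2 i=0
    while [ "$i" -lt "$np" ]; do
        [ -d "$case_dir/processor$i" ] || echo "processor$i"
        i=$((i + 1))
    done
}

# Example call for a 32-rank run from the case directory:
# missing_processor_dirs "$PWD" 32
```

No output means all directories are visible from that node. Running the same check on the second node would show whether its ranks can see the decomposed case at all.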

Best regards,
Bruno

Old   February 19, 2014, 05:39
Default
  #20
Member
 
Join Date: Dec 2009
Posts: 45
Rep Power: 7
katakgoreng is on a distinguished road
Hi Bruno,

Sorry for the late response. My wife gave birth last month, so I took some time off from work.

The following is the PBS job script that I previously used to submit the job to the cluster.

Code:
#!/bin/bash
#
# --- SET THE PBS DIRECTIVES
#PBS -l walltime=2:00:00
#PBS -l select=2:ncpus=16:mpiprocs=16:mem=4000mb 
#PBS -e ehk112_err		
#PBS -o ehk112_out		
#PBS -m ae
#PBS -V

echo "============================================="
echo "FOLDER LOCATION AND NAME"
echo "============================================="
CASEFOLDER="damBreak32"
CASELOCATION="$WORK/CASEFOLDER/"
echo $CASEFOLDER
echo $CASELOCATION

echo "============================================="
echo "SOURCING SYSTEM BASHRC"
echo "============================================="
. $HOME/.bash_profile

echo "============================================="
echo "SOURCING OPENFOAM 2.2.x BASHRC"
echo "============================================="
. /home/ehk112/OpenFOAM/OpenFOAM-2.2.x/etc/bashrc

echo "============================================="
echo "COPYING CASE FILE INTO TEMP FOLDER"
echo "============================================="
cp -rf $CASELOCATION/$CASEFOLDER $TMPDIR
cd $TMPDIR/$CASEFOLDER

echo "============================================="
echo "RUNNING OPENFOAM BATCH SCRIPT"
echo "============================================="
./Allrun

echo "============================================="
echo "COPY RESULT INTO $WORK/OpenFOAM"
echo "============================================="
cd ..
mv -f $CASEFOLDER $WORK/OpenFOAM
The following is the Allrun OpenFOAM batch script:

Code:
#!/bin/bash

# ===============================
# PREPARE CASES
# ===============================
rm log.*
cp 0/backup/* 0/

# ===============================
# MESHING
# ===============================
blockMesh > log.blockMesh 2>&1

# ===============================
# SET FIELD
# ===============================
setFields > log.setField 2>&1

# ===============================
# DECOMPOSE DOMAIN
# ===============================
decomposePar > log.decomposePar 2>&1

# ===============================
# RENUMBER MESH                 
# ===============================
mpirun -np 32 -hostfile $PBS_NODEFILE renumberMesh -overwrite -parallel > log.renumberMesh 2>&1

# ===============================
# RUN APPLICATION
# ===============================
mpirun -np 32 -hostfile $PBS_NODEFILE interFoam -parallel > log.interFoam 2>&1
So what I did was:
1. Source the environment variables
2. Copy the case folder to the cluster's temporary folder
3. Execute the OpenFOAM batch script
4. After it finishes, copy the results back from the cluster's temporary folder
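As written, the Allrun steps keep going even when an earlier one fails, so a later error can hide the first one. A small wrapper that stops at the first failing step could look like the sketch below. The helper name `run_step` is made up for illustration and is not part of OpenFOAM; it simply reuses the `log.<name>` naming convention of the script above.

```shell
#!/bin/bash
# Sketch: run one step, send stdout and stderr to log.<name>,
# and return non-zero if the command fails.
run_step() {
    local name=$1
    shift
    if "$@" > "log.$name" 2>&1; then
        echo "$name: ok"
    else
        echo "$name: FAILED (see log.$name)" >&2
        return 1
    fi
}

# Example use inside Allrun:
# run_step blockMesh blockMesh || exit 1
# run_step decomposePar decomposePar || exit 1
# run_step interFoam mpirun -np 32 -hostfile "$PBS_NODEFILE" interFoam -parallel || exit 1
```

With this, the first log file that reports FAILED is the one to read.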

I will try the method that you proposed and report back later.

Kind regards,
katakgoreng
