
Virtual memory problem with parallel runs

Old   January 2, 2005, 17:16
  #1
Ali (Ali)
Guest
 
Posts: n/a
Dear friends,

I ran a case with around 300,000 cells on 32 processors without a problem. When I increased the grid points to around 1,200,000 and used 64 processors, it didn't start and gave the following error. I don't think this number of cells is too much for 64 processors. Has anybody experienced a similar error? I would appreciate it if you could let me know what's wrong.

The error message:
-------------------------------------
new cannot satisfy memory request.
This does not necessarily mean you have run out of virtual memory.
It could be due to a stack violation caused by e.g. bad use of pointers or an out of date shared library

Old   January 2, 2005, 20:06
  #2
Henry Weller (Henry)
Guest
 
Posts: n/a
It sounds like the case is running on one processor, maybe as 64 copies, each on its own processor. FOAM uses between 1 and 2 kB per cell depending on the code, so 1.2e6 cells sounds like it would fill 32-bit addressing for some of the codes.
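
To make the arithmetic concrete, here is a rough shell sketch of that estimate. The 1 to 2 kB per cell figure is the one quoted above; treating 1 kB as 1024 bytes, assuming an even split of cells over the processes, and the function name estimate_mib are all assumptions made for illustration.

```shell
#!/bin/bash
# estimate_mib CELLS KB_PER_CELL NPROCS
# Prints the approximate memory per process in MiB (integer arithmetic,
# rounded down), assuming the cells are split evenly over NPROCS processes.
estimate_mib() {
    local cells=$1 kb_per_cell=$2 nprocs=$3
    echo $(( cells * kb_per_cell / nprocs / 1024 ))
}

cells=1200000
for kb in 1 2; do
    echo "${kb} kB/cell: serial ~$(estimate_mib "$cells" "$kb" 1) MiB," \
         "64-way ~$(estimate_mib "$cells" "$kb" 64) MiB per process"
done
```

If every one of the 64 processes really loaded the whole mesh, each would need the serial figure (roughly 1.1 to 2.3 GiB), which is in the range where a 32-bit process can run out of address space. A properly decomposed run would need only a few tens of MiB per process by the same estimate.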

Old   January 2, 2005, 20:40
  #3
Ali (Ali)
Guest
 
Posts: n/a
Thanks a lot Henry,

You are right. I got the same error when I tried to run it on 1 machine, but this is a lot more processors. I'm using PBS to submit jobs to 64 processors, picked at random out of a larger cluster consisting of IBM BladeCenter dual 3.0 GHz processors with over 2 GB of memory each. Actually, for the smaller job (300,000 cells), the 32 parallel processors were only a little faster than a single machine (with a little more memory and approximately the same CPU speed). Is there any special partitioning method or other way of improving the parallel efficiency for irregular geometries? (At the moment I'm using the simple decomposition method.)

The thing is that I got this message twice when I submitted this job, and on the third try it worked. Surprisingly, it ran on 64 processors even though I had decreased the number of subdomains from 64 to 32 in decomposeParDict and decompositionDict. It seems that whatever the number of subdomains in these two dicts, it runs on 64 processors and gives no error. Is this what usually happens, or should it give an error or a message that the number of subdomains is not the same as the number of processors requested?

Regards,

Old   January 3, 2005, 07:40
  #4
Henry Weller (Henry)
Guest
 
Posts: n/a
I am surprised the speed-up was so small, we get much more than this. What is the inter-connect speed of your machine?

There are three decomposition techniques supplied with FOAM. Try the other two and look at the decomposition statistics decomposePar prints, which will give you an idea of how effective the approach is for your case.

You might also find it useful to play with

scheduledTransfer 1;
floatTransfer 0;
nProcsSimpleSum 16;

in .OpenFOAM-1.0/controlDict, in particular floatTransfer which could be set to 1 to enable the parallel transfer of data to be floats rather than doubles and possibly change scheduledTransfer and/or nProcsSimpleSum.

I don't understand why you have two decomposition dictionaries. You should have only one, and of course the information in it should correspond to the decomposition you are using!
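
A mismatch between the dictionary and the processor count (such as 32 subdomains in the dictionary with 64 processors requested) can be caught before the job is submitted. A minimal sketch, assuming the dictionary holds an entry of the form "numberOfSubdomains N;" (check this against your own file); the helper names and the demo file are made up for illustration and are not FOAM tools.

```shell
#!/bin/bash
# read_subdomains FILE
# Prints the first numberOfSubdomains value found in FILE (assumed entry name).
read_subdomains() {
    sed -n 's/^[[:space:]]*numberOfSubdomains[[:space:]]\{1,\}\([0-9]\{1,\}\)[[:space:]]*;.*/\1/p' "$1" | head -n 1
}

# check_np FILE NP
# Succeeds only if the dictionary and the requested process count agree.
check_np() {
    local dict=$1 np=$2 n
    n=$(read_subdomains "$dict")
    if [ -z "$n" ]; then
        echo "no numberOfSubdomains entry found in $dict" >&2
        return 2
    fi
    if [ "$n" -ne "$np" ]; then
        echo "mismatch: $dict has $n subdomains, job requests $np processes" >&2
        return 1
    fi
    echo "ok: $n subdomains for $np processes"
}

# Demo with a throwaway file standing in for the real dictionary.
demo=$(mktemp)
printf 'numberOfSubdomains 32;\nmethod simple;\n' > "$demo"
check_np "$demo" 64 || echo "fix the dictionary or the PBS request before submitting"
rm -f "$demo"
```

Calling check_np from the PBS script just before mpirun would stop the job early instead of letting it start with an inconsistent decomposition.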

Old   June 9, 2005, 16:27
  #5
Member
 
diablo80@web.de
Join Date: Mar 2009
Posts: 93
I got the same message as Ali when trying to decompose the case (decomposePar). Should I run it with mpirun? (I will try as soon as the parallel machine I am running on comes back to life...)

Processor 2
Number of cells = 1042872
Number of faces shared with processor 1 = 260718
Number of faces shared with processor 3 = 260718
Number of boundary faces = 9928

Processor 3
Number of cells = 1042872
Number of faces shared with processor 2 = 260718
Number of faces shared with processor 0 = 260718
Number of boundary faces = 9928
new cannot satisfy memory request.
This does not necessarily mean you have run out of virtual memory.
It could be due to a stack violation caused by e.g. bad use of pointers or an out of date shared library
Aborted
[luizebs@green01 oodles]$

Old   June 10, 2005, 05:22
  #6
Super Moderator
 
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,416
decomposePar has to hold the undecomposed case and all the pieces it decomposes into. So it uses on average twice the storage the single mesh uses.

Maybe you just ran out of memory? What does 'top' show when you run decomposePar?
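
One way to answer that without watching top by hand is to let the shell sample the process. A minimal sketch, assuming Linux (it reads VmHWM, the peak resident size, from /proc); peak_rss_kb is a made-up helper name, not a FOAM tool.

```shell
#!/bin/bash
# peak_rss_kb CMD [ARGS...]
# Runs CMD in the background, polls /proc/<pid>/status while it is alive and
# prints the last VmHWM value (peak resident set size, in kB) that was seen.
# Prints 0 if the process ended before any sample could be taken.
peak_rss_kb() {
    local pid hwm peak=0
    "$@" >/dev/null 2>&1 &
    pid=$!
    while kill -0 "$pid" 2>/dev/null; do
        hwm=$(awk '/^VmHWM:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
        if [ -n "$hwm" ]; then
            peak=$hwm
        fi
        sleep 0.2   # fractional sleep assumes GNU coreutils
    done
    wait "$pid" 2>/dev/null || true
    echo "$peak"
}

# Harmless stand-in so the sketch runs anywhere; for the real case use e.g.
#   peak_rss_kb decomposePar . GL3
echo "peak RSS: $(peak_rss_kb sleep 1) kB"
```

Comparing the printed figure with the memory installed on the node (for example the MemTotal line in /proc/meminfo) shows how close decomposePar came to the limit. If the process is killed for lack of memory, the last sample still gives a lower bound on what it needed.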

Old   June 10, 2005, 13:48
  #7
Member
 
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Yeah. I did run out of memory.

Then, I tried to run it with lamexec (I am not sure I could, I am very inexperienced in parallel computations...):

lamexec -np 4 decomposePar . GL3 </dev/null>& logd &

Is this the right command? (I tried with mpirun first, but it looks like decomposePar is not an MPI application, is it?)

Note there are 4 "Processor 3" in the output. I just printed the last 2.

Thanks for your help,
luiz


Processor 3
Number of cells = 1042872
Number of faces shared with processor 2 = 260718
Number of faces shared with processor 0 = 260718
Number of boundary faces = 9928

Processor 3
Number of cells = 1042872
Number of faces shared with processor 2 = 260718
Number of faces shared with processor 0 = 260718
Number of boundary faces = 9928
new cannot satisfy memory request.
This does not necessarily mean you have run out of virtual memory.
It could be due to a stack violation caused by e.g. bad use of pointers or an out of date shared library
1765 (n1) exited due to signal 6
[luizebs@green01 oodles]$

Old   June 10, 2005, 14:37
  #8
Super Moderator
 
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,416
You cannot run decomposePar in parallel.

Old   June 10, 2005, 14:54
  #9
Member
 
diablo80@web.de
Join Date: Mar 2009
Posts: 93
But then, how can I run a big mesh in parallel that does not fit the memory requirement of an isolated node?

Does that mean that my mesh size to be run in parallel is limited by the memory requirement of a single node?

Thanks a lot,
luiz

Old   June 10, 2005, 15:09
  #10
Super Moderator
 
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,416
What is normally done:
- have a computer with a lot of memory to do the decomposition on.
- run on smaller nodes.

Even better:
- do your mesh generation in parallel (and no, blockMesh does not run in parallel)

Old   June 10, 2005, 16:02
  #11
Member
 
diablo80@web.de
Join Date: Mar 2009
Posts: 93
thanks, Mattijs
Since this is the computer with the most memory I have, I have no other option but #2.

But I have no Gambit or other mesh generator on this parallel machine, which means I would have to generate a Gambit mesh on another computer (single node) and convert it using some FOAM utility (gambitToFoam). Question: does gambitToFoam run in parallel? Or does it have the same limitation as blockMesh?

If I cannot find a way to use Gambit on a parallel machine, I will probably have to use decomposePar, which will again have memory problems, right?

What about this: I sequentially construct 4 (the number of nodes) meshes, each 4 times smaller, using blockMesh and manually copy each generated polyMesh directory into processor0-3/constant/polyMesh.
Then I change the boundary conditions, trying to mimic a boundary file generated by decomposePar.

Do you think it would work?

Thanks,
Luiz

Old   June 10, 2005, 16:07
  #12
Senior Member
 
Hrvoje Jasak
Join Date: Mar 2009
Location: London, England
Posts: 1,758
Nope. decomposePar orders the faces on parallel boundaries in a special way (i.e. the ordering is the same on both sides of the parallel interface). The chances of getting this right without using decomposePar are slim AND you need to know exactly what you're doing...

Hmm,

Hrv
__________________
Hrvoje Jasak
Providing commercial FOAM/OpenFOAM and CFD Consulting: http://wikki.co.uk

Old   June 10, 2005, 18:05
  #13
Member
 
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Thanks Hrvoje,
In my case, I have a geometry that is homogeneous in the z-direction, and I am planning to partition it in the z-direction as well. Does this improve my chances?
Again, this is my only possibility, since I have no way to generate a parallel mesh ready to be used by FOAM (in other words, without the need to run gambitToFoam or decomposePar first).

BTW, which of the FOAM utilities can be run in parallel (with either mpirun or lamexec)? gambitToFoam, for instance? renumberMesh?

Thanks a lot again,
luiz

Old   June 10, 2005, 19:44
  #14
Senior Member
 
Hrvoje Jasak
Join Date: Mar 2009
Location: London, England
Posts: 1,758
Well, you might have half a chance but...
- you'll have to do a ton of mesh manipulation by hand because I bet the front and back planes will be numbered differently
- unless you grade the mesh (such that faces have different areas and you get a matching error), you might keep getting a running code with rubbish results
- Mattijs might have written some utilities for re-ordering parallel (cyclic?) faces, which may be re-used. (I'm sure he'll pitch in with some ideas - thanks, Mattijs) :-)

To put it straight, I have personally written the parallel mesh decomposition and reconstruction tools and I wouldn't want to be in your skin... It would be much easier to find a 64-bit machine and do the job there.

Alternatively, make thick slices for each CPU, decompose the mesh and then use mesh refinement on each piece separately to get the desired number of layers (or something similar).

BTW, have you considered how you are going to look at the results? paraFoam does not run in parallel either. Maybe some averaging in the homogeneous direction or interpolation to a coarser mesh is in order.

As for utilities, call them with no arguments and (most of them) should tell you. Off the cuff, I would say that mesh manipulation tools won't work in parallel but data post-processing (apart from graphics) will.

Good luck,

Hrv

Old   June 10, 2005, 21:55
  #15
Member
 
diablo80@web.de
Join Date: Mar 2009
Posts: 93
"BTW, have you considered how you are going to look at results - paraFoam does not run in parallel either. Maybe some averaging in the homogenous direction or interpolation to a coarser mesh is in order."

Yes. I've already built a coarser mesh and mapped successfully (from a not so refined mesh case).

Thanks a lot for your comments...
luiz

Old   June 13, 2005, 16:47
  #16
Member
 
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Ok. I reduced the mesh a little bit and was able to run decomposePar without problems.

But when I run the case (with mpirun) I get:
(It still looks like I have some memory problem (I mean, hopefully not me, but my simulation), doesn't it?)


[0] Case : GL2meules
[0] Nprocs : 4
[0] Slaves :
3
(
green02.5942
green03.4885
green04.4738
)

Create time

Create mesh, no clear-out for time = 150

MPI_Bsend: unclassified: No buffer space available (rank 2, MPI_COMM_WORLD)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (2, MPI_COMM_WORLD): - MPI_Bsend()
Rank (2, MPI_COMM_WORLD): - main()
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 22401 failed on node n0 (192.168.0.1) with exit status 1.
-----------------------------------------------------------------------------
[1]+ Exit 1 mpirun -np 4 glLES . GL2meules -parallel 1>&logm
[luizebs@green01 oodles]$

Thanks a lot,
luiz

Old   June 13, 2005, 16:54
  #17
Senior Member
 
Join Date: Mar 2009
Posts: 854
What happens if you increase MPI_BUFFER_SIZE?

Old   June 13, 2005, 17:10
  #18
Member
 
diablo80@web.de
Join Date: Mar 2009
Posts: 93
The same thing. (only rank 2 changed to rank 1)

What should be this value? It was 20000000.

Thanks,
luiz



green02.6630
green03.5573
green04.5426
)

Create time

Create mesh, no clear-out for time = 150

MPI_Bsend: unclassified: No buffer space available (rank 1, MPI_COMM_WORLD)
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 23152 failed on node n0 (192.168.0.1) with exit status 1.
-----------------------------------------------------------------------------
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD): - MPI_Bsend()
Rank (1, MPI_COMM_WORLD): - main()
[luizebs@green01 oodles]$ echo %MPI_BUFFER_SIZE
%MPI_BUFFER_SIZE
[1]+ Exit 1 mpirun -np 4 glLES . GL2meules -parallel </dev/null>&logm
[luizebs@green01 oodles]$

Old   June 14, 2005, 05:52
  #19
Super Moderator
 
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,416
Too hard to calculate.

Just double it (and make sure to 'lamwipe' and 'lamboot' so the new settings are known by lamd) and try again. Keep doing this until you don't get this message.
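
The doubling procedure can be scripted. A minimal sketch, assuming bash or a compatible shell: retry_doubling and fake_solver are made-up names, the starting value 20000000 is the one reported earlier in the thread, and the loop retries on any non-zero exit status, not only on the MPI_Bsend buffer error.

```shell
#!/bin/bash
# retry_doubling START MAX_TRIES CMD [ARGS...]
# Runs CMD with MPI_BUFFER_SIZE exported, doubling the value after each failure.
retry_doubling() {
    local size=$1 tries=$2 i=0
    shift 2
    while [ "$i" -lt "$tries" ]; do
        export MPI_BUFFER_SIZE=$size
        # With LAM/MPI, restart the daemons here so lamd sees the new value:
        #   lamwipe; lamboot
        if "$@"; then
            echo "succeeded with MPI_BUFFER_SIZE=$size"
            return 0
        fi
        size=$(( size * 2 ))
        i=$(( i + 1 ))
    done
    echo "still failing after $tries tries" >&2
    return 1
}

# Stand-in for the real run (hypothetical): fails until the buffer reaches 80e6.
fake_solver() { [ "$MPI_BUFFER_SIZE" -ge 80000000 ]; }

retry_doubling 20000000 5 fake_solver
# A real call might look like:
#   retry_doubling 20000000 5 mpirun -np 4 glLES . GL2meules -parallel
```

Capping the number of tries keeps a job that fails for an unrelated reason from growing the buffer without limit.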

Old   June 15, 2005, 16:01
  #20
Member
 
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Thanks Mattijs,
It is working now.

But what would be the consequences of an unnecessarily high value for this buffer size?

Where can I learn more about all these things (mostly Linux and running Linux in parallel)? I feel so weak (I only found out later that I should put my export MPI_BUFFER_SIZE=xxxxx in my bashrc, but I am not even sure why... I suspect it has to do with exporting to all nodes instead of just the current one...)

Could you provide some pointers (Linux and parallel stuff)? Books, online tutorials, etc...

I really feel the need to understand better what is happening, but I don't know how to start...

thanks,
luiz
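
On the bashrc question, a small experiment shows what export does. This is a sketch of general shell behaviour, not anything specific to FOAM or LAM; the value 40000000 is arbitrary.

```shell
#!/bin/bash
# A plain assignment stays in the current shell; export passes it to children.
unset MPI_BUFFER_SIZE
MPI_BUFFER_SIZE=40000000
plain=$(sh -c 'echo "${MPI_BUFFER_SIZE:-unset}"')

export MPI_BUFFER_SIZE
exported=$(sh -c 'echo "${MPI_BUFFER_SIZE:-unset}"')

echo "child sees without export: $plain"
echo "child sees with export:    $exported"
```

Programs started from a shell only inherit exported variables. Placing the export line in ~/.bashrc presumably helps because the shells started on the other nodes read that file as well, so the processes launched there get the same value; whether that holds on a given cluster depends on how the remote shells are started.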
