CFD Online Discussion Forums

CFD Online Discussion Forums (http://www.cfd-online.com/Forums/)
-   Main CFD Forum (http://www.cfd-online.com/Forums/main/)
-   -   Code Saturne mass sources MPI problem (http://www.cfd-online.com/Forums/main/75195-code-saturne-mass-sources-mpi-problem.html)

Pat84 April 19, 2010 07:06

Code Saturne mass sources MPI problem
 
Hello,

i have wrote a program to simulate effusion cooling holes with mass sources in the user routine "ustsma.f90", but if i try to run the case with more than one processor, the calculation will be killed by MPI.

The error message in the Terminal looks like follow:
Code:

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec has exited due to process rank 0 with PID 13747 on
node bk-003z-kp exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
Error running the calculation.

Check Kernel log (listing) and error* files for details


  ********************************************
        Error in calculation stage.
  ********************************************

The list shows this error massage:

Code:

-------------------------------------------------------------


ALL PHASES  : 1D-WALL THERMAL MODULE NOT ACTIVATED
                NFPT1D =          0

-------------------------------------------------------------


PHASE      1 : HEAD LOSS TREATMENT NOT ACTIVATED
                NCEPDC =          0

-------------------------------------------------------------

SIGSEGV signal (forbidden memory area access) intercepted!

Call stack:
  1: 0x414f2a    <ustsma_+0x3278>                (cs_solver)
  2: 0x7fdcf8c99632 <caltri_+0x3882>                (libsaturne.so.0)
  3: 0x7fdcf8c76476 <cs_run+0x856>                  (libsaturne.so.0)
  4: 0x7fdcf8c7674d <main+0x1fd>                    (libsaturne.so.0)
  5: 0x7fdcf6d2e586 <__libc_start_main+0xe6>        (libc.so.6)
  6: 0x40d1a9    ?                                (?)
End of stack

Is it possible, that i am not allowed to use the "save" option in variable declaration? That would be fatal, because i need to save variables for the following time steps.

has someone a idea?

Tanks, Patrick

Pat84 April 19, 2010 09:28

2 Attachment(s)
Hello again,

i have made a simple test case to analyze the problem and it appears to be a little difficult.

In that case i don't save any variable and with one processor the calculation runs very well, but with multi processor calculation the process ends with an error in very early stage.

Here is the error message appearing in the listing file:

Code:

ALL PHASES  : 1D-WALL THERMAL MODULE NOT ACTIVATED
                NFPT1D =          0

-------------------------------------------------------------


PHASE      1 : HEAD LOSS TREATMENT NOT ACTIVATED
                NCEPDC =          0

-------------------------------------------------------------


PHASE      1 : MASS SOURCE TERMS TREATMENT ACTIVATED
                ON A TOTAL OF          4 CELLS
-------------------------------------------------------------



===============================================================



                      MAIN CALCULATION                     
                      ================                     


===============================================================




===============================================================


 INSTANT    0.200000000E-04  TIME STEP NUMBER              1
 =============================================================


 --- Phase:          1
 ---------------------------------                         
 Property  Min. value  Max. value                         
 ---------------------------------                         
  Density  0.1161E+02  0.1161E+02
  visc.lam  0.4693E-04  0.4693E-04
  visc.tur  0.7198E-01  0.7198E-01
 ---------------------------------                         

 --- Diffusivity:                                           
 ---------------------------------------                   
 Scalar  Number  Min. value  Max. value                   
 ---------------------------------------                   
 Temp          1  0.6379E-01  0.6379E-01
 ---------------------------------------                   

SIGFPE signal (floating point exception) intercepted!

Call stack:
  1: 0x40f5e7    <ustsma_+0x157>                  (cs_solver)
  2: 0x7f4cdb7ef2c1 <tridim_+0x2ac1>                (libsaturne.so.0)
  3: 0x7f4cdb639d4c <caltri_+0x4f9c>                (libsaturne.so.0)
  4: 0x7f4cdb615476 <cs_run+0x856>                  (libsaturne.so.0)
  5: 0x7f4cdb61574d <main+0x1fd>                    (libsaturne.so.0)
  6: 0x7f4cd96cd586 <__libc_start_main+0xe6>        (libc.so.6)
  7: 0x40cb39    ?                                (?)
End of stack

floating point exception? Why in previous calculation with one processor there was no floating point exception ?!


If you want to try it by yourself, you can find the SRC files and the mesh in the attachment.

Edit:
.................................................. .....

This is a line from the running one processor calculation:
Code:

-------------------------------------------------------------


PHASE      1 : MASS SOURCE TERMS TREATMENT ACTIVATED
                ON A TOTAL OF          2 CELLS
-------------------------------------------------------------

You can see, that we have only 2 cells for mass sources. As you can see in the 2 processor case we got 4 cells ( 2x2 ).


The same with 4 processors:
Code:

-------------------------------------------------------------


PHASE      1 : MASS SOURCE TERMS TREATMENT ACTIVATED
                ON A TOTAL OF          8 CELLS
-------------------------------------------------------------

8 cells (4x2). 2 cells would be right.

Yvan Fournier April 19, 2010 10:31

Hello,

There is an error in your programming of ustsma.f90:

Instead of looping on cells selected through getcel (see reference example), you seem to be interested only in lstelt(1), and later on icetsm(1) or icetsm(2).

This means your programming is mesh-numbering dependent, and in parallel, if the subdomains assigned to some processors do not have any cells in the source term area, arrays such as lstelt and icetsm are empty for those processors, but you are still trying to access those arrays.

Try to follow the examples in the reference examples, at least for the way of looping on cells (with your own values of course).

Also, version 2.0-rc1, which you may find on announces in the forum at http://code-saturne.info, has a version of ustsma.f90 with English comments, which may help (it also has several bug corrections compared to beta-2 and more functionnality in the GUI).

Best regards,

Yvan Fournier

Pat84 April 19, 2010 17:29

Hello Yvan,

thank you for your attention. In the base ustsma.f90 the cell number is called by "CALL getfbr" or "CALL getcel"

Copy & Paste :
Code:

  iutile = 0
  if(iutile.eq.1) then

    if(iphas.eq.1) then

      ieltsm = 0

!    Cellules dont le centre est a 250 < x < 500

      CALL GETCEL('X < 500.0 and X > 250.0',NLELT,LSTELT)
      do ilelt = 1, nlelt
        ii = lstelt(ilelt)
        ieltsm = ieltsm + 1
        if (iappel.eq.2) icetsm(ieltsm) = ii
      enddo


!    Cellules de bord dont une face est de couleur 3

      CALL GETFBR('3',NLELT,LSTELT)
      do ilelt = 1, nlelt
        ifac = lstelt(ilelt)
        ii  = ifabor(ifac)
!      On ne comptabilise cette cellule que si on ne l'a pas deja vue
!        pour cela on prend la negation du test precedent
        if (.not.(xyzcen(1,ii).lt.500.d0.and.                    &
                  xyzcen(1,ii).gt.250.d0)    )then
          ieltsm = ieltsm + 1
          if (iappel.eq.2) icetsm(ieltsm) = ii
        endif
      enddo

    else
      ieltsm = 0
    endif

  endif

In my easy example i make a call for every cell, but in my effusion cooling routine i make one call for all boundary faces and pick the right cell by an algorithm that searches for the smallest distance between a point and a boundary face center (I need the cell next to the wall -> ifabor).

Is it also possible to make a loop from i = 1 , nfabor and find the right cell by an if case ( if(xyzcen(1,i).lt.1234.and.xyzcen(1,i).gt.1234.and .xyzcen(2,i).lt. .. .. ..)? Is it than able to run with multi processor and mpi?

Yvan Fournier April 19, 2010 18:27

Hello Patrick,

I am not sure I understand what you are trying to do, but in any case,
getcel, getfbr, and getfac are intended to be used to obtain a list of cells, boundary faces, or interior faces respectively, matching a given selection criteria (the string argument in the call). These functions return the number of local (i.e. for that processor) elements matching the criteria, and fill a list of these elements). Used thus, if no local elements match the criteria, nlelt = 0, and loops on nlelt are empty (avoiding any problems accessing empty arrays).

The different possible selection criteria are described in the user documentation pdf (that of version 2.0-rc1 should be generated using "make pdf" for the Kernel). You may use either group names, or geometric primitives, such as "x < 2", plane[...], box[...], sphere[...].

So if you have an element matching such a criteria on any processor, using the correct selection criterion will find it on the processor handling it, and return an empty list on others.

If you need something more complex, such as filtering candidate cells using getcel and then requiring the cell closest to a given point, you may use code similar to the one used for probes. In that case, you would want to take a look at code calling "findpt" in the main (non-user) code, and possibly look at the utility function prototypes in include/base/cs_parall.h or examples in users/base/usproj.f90.

Note: if you have a few holes, you may use a criteria such as

getcel('box[xmin1, ymin1, zmin1, xmax1, ymax1, zmax1] or box[xmin2, ymin2, zmin2, xmax2, ymax2, zmax2]', nlelt, lstele)

but if you have many holes, it may be cheaper to simple loop on cells and select those matching the criteria, for example (if holes are regularly spaced along x and may be simply tested with a "modulo" type condition):

nlelt = 0
do iel = 1, ncel
if ((abs(mod(xyzcen(1,iel) - x1)) .lt 1d-2) then
nlelt = nlelt + 1
lstelt(nlelt) = iel
endif
enddo

Look for similar coding in the examples from uskpdf.f90.

Best regards,

Yvan Fournier

Pat84 April 20, 2010 04:41

Hi Yvan,

my code is ready since the beginning of last weekend. I search my cells with this algorithm:

Code:

      if(iappel.eq.1) then
      call getfbr('all[]', nlelt, lstelt)
        do ilelt = 1, nlelt
          ifac = lstelt(ilelt)
          do zz = 1, nbohr
            if(ilelt.eq.1) then
              minabst(zz) = sqrt((mitts(zz,1)-cdgfbo(1,ifac))**2 +(mitts(zz,2)-cdgfbo(2,ifac))**2 &
                                +(mitts(zz,3)-cdgfbo(3,ifac))**2)
              mittfaces(zz) = ifac
            else
              abstand = sqrt((mitts(zz,1)-cdgfbo(1,ifac))**2 +(mitts(zz,2)-cdgfbo(2,ifac))**2 &
                                +(mitts(zz,3)-cdgfbo(3,ifac))**2)
              if(abstand.lt.minabst(zz)) then
                minabst(zz) = abstand
                mittfaces(zz) = ifac
              endif
            endif
          enddo
        enddo

With one processor the cells are right and there appears no problem .
My 2. cell on the other side i get by searching the face that fits the axis. that is performed by LR decomposition.

The best would be if i could perform the first and second appel by one processor, because taking apart a hole (the sources of inlet and outlet of the cooling hole would be on another part of the grid on different cpus) would break my calculation.

Yvan Fournier April 20, 2010 06:42

Hello Patrick,

Just a minor comment on your program: since you loop on all boundary faces,

call getfbr('all[]', nlelt, lstelt)
do ielt = 1, nlelt
ifac = lstelt(ielt)
...
enddo

Can be replaced simply be

do ifac = 1, nfabor
...
enddo

which may be slightly faster, and should give you the same results (possibly with a change of numbering).

For the more important stuff, there is no way to guarantee that the domain partitioning will not split a hole between 2 processors. You may check partitioning graphically (the postprocessing output has a time-independent field showing the partitioning, whether in EnSight, MED, or CGNS format), and even if you calculation crashed when calculating source terms, it should be available as it was output at an earlier stage.

Partitioning tools like METIS or SCOTCH may be assigned weight to help them keep neighboring elements on the same partitions, but I am not sure this can be guaranteed 100%, so we do not currently have the possibility of handling this in Code_Saturne.

If you have no more than 100-200 holes with only a few values per hole, you could handle holes the way we handle probes.

- For each hole and each processor, find the closest cell center to a hole center, using a call to findpt (see findpt.f90), which must returns both a cell number (node) and assciated processor number (ndrang) for all ranks.
Keep both in an array, either a new common block or a section in ituser.

In your case, you may not need just hole centers, but hole inlet centers and hole outlet centers, but the idea is the same.

- For each point/hole, you may broadcast the associated variable values from the processor containing the point to all others, using a call to parbcr (see example in isproj.f90, but arguments to parbcr are: owner_rank, n_values, values, so owner_rank for each point depends on the associated processor number for that point located which was saved in an array previously.

Look at the code in src/base/ecrhis.f90 (probe values output) for illustration if you need.

- At this stage, all processors should have a copy of all the hole values.

Now if the holes spread over several cells and you need to map individual inlet cells to outlet cells, this may not be enough, and you would need to go into Code_Saturne's C code to use fvm_locator type functions (used in parallel coupling with SYRTHES4, coupling of Code_Saturne with itself, turbomachinery applications, and cooling towers), which are designed to work in parallel, but require slighty more advanced coding, at least calling your own C functions from user fortran subroutines, as this API is not available from the Fortran part of the code?

Best regards,

Yvan Fournier

Pat84 April 20, 2010 07:37

hello

I think the best way to solve my problem is executing my subroutine on only one single processor. So does there exist a sbr / method to force the program, which runs on a multiprocessor machine, to execute parts / single sbrs (here ustsma) on only one processor? If so how can I make the result available to all processors afterwards?

best regards

Patrick

Pat84 April 21, 2010 05:29

Hello again,

i have tried the use of findpt and parbcr with the easy case i already attached in my 2. post.

It works, but it is not very comfortable to handle with a bunch of more sources. Finding the right face with the processor number, so that each processor could handle the own hole in- and outlets would need no communication and would be more comfortable. The problem here is, that the partitioner must be able to let the in- and outlet of a hole together. Or is it possible to make the partitioning of the grid by myself? It must then be possible to feed code saturne with the partitioning data from Metis.

Please answer my question in the post above, too. So I maybe have more possibilities to parallelize my subroutine.

Best regards,
Patrick

Yvan Fournier April 21, 2010 08:02

Hello,

For your previous question, you could call a subroutine only a a given processor (including paral.h and using "if (irangp.eq.0) then..." for example, but each processor has only a part of the data, so that would be basically useless for your needs.

Partitioning the grid yourself would thus be the best solution, and you have 2 alternatives:

- either modify src/appli/cs_partition.c in the Code_Saturne preprocessor's tree (requires some programming).

- or modify src/base/cs_preprocessor_data.c from the Code_Saturne Kernel code (just copy the modified version with your user subroutines).

In this case, you need to modify the _cell_rank_by_sfc function, replacing the call to fvm_io_num_create_from_coords and the following loop defining cell_rank (lines 1661-1669 in Code_Saturne 2.0-rc1) to define your own cell rank (using 0 to p-1 numbering). For a given cell i, the cell center's x, y, and z coordinates are obtained respectively by cell_center[i*3], cell_center[i*3 +1], and cell_center[i*3 +2] (that array is computed above the area you should modify).

You may thus define your own partitioning based on cell coordinates quite easily, but be aware that the quality of your partitioning (i.e. good load balancing as well as minimizing number of neighbors and number of faces on process boundaries) has a major impact on code performance.

If you define your own partitioning in this way, you will also need to tell the code not to run the partitioner in the runcase script by setting EXEC_PARTITION=no around line 215 (it is set to yes by default),
and you need to make sure that a previous PARTITION_OUTPUT is not in your case's DATA directory (otherwise, the code will use that partitioning instead, tough you could also modify cs_preprocessor_data.c further to avoid this).

Best regards,


All times are GMT -4. The time now is 19:22.