CFD Online Discussion Forums

CFD Online Discussion Forums (http://www.cfd-online.com/Forums/)
-   SU2 Installation (http://www.cfd-online.com/Forums/su2-installation/)
-   -   Parallel processing: Each iteration carried out 4 times (http://www.cfd-online.com/Forums/su2-installation/120596-parallel-processing-each-iteration-carried-out-4-times.html)

Akash C July 10, 2013 09:18

Parallel processing: Each iteration carried out 4 times
 
I ran into this problem earlier on my own single machine, where reinstalling the software fixed it. Now I am facing the same problem on our cluster, where reinstalling the software again and again would be a problem. The terminal output is below:
Code:

Iter    Time(s)    Res[Rho]    Res[RhoE]  CLift(Total)  CDrag(Total)

 Iter    Time(s)    Res[Rho]    Res[RhoE]  CLift(Total)  CDrag(Total)

 Iter    Time(s)    Res[Rho]    Res[RhoE]  CLift(Total)  CDrag(Total)

 Iter    Time(s)    Res[Rho]    Res[RhoE]  CLift(Total)  CDrag(Total)
    1  12.505000    1.870499      2.474769      0.147823      0.061620
    1  12.575000    1.870499      2.474769      0.147823      0.061620
    1  12.705000    1.870499      2.474769      0.147823      0.061620
    1  12.570000    1.870499      2.474769      0.147823      0.061620
    2  12.476667    1.841443      2.432168      0.188038      0.044570
    2  12.513333    1.841443      2.432168      0.188038      0.044570
    2  12.656667    1.841443      2.432168      0.188038      0.044570
    2  12.530000    1.841443      2.432168      0.188038      0.044570
    3  12.482500    1.769260      2.341379      0.237196      0.044837
    3  12.510000    1.769260      2.341379      0.237196      0.044837
    3  12.640000    1.769260      2.341379      0.237196      0.044837
    3  12.517500    1.769260      2.341379      0.237196      0.044837
    4  12.506000    1.759685      2.319083      0.255028      0.038207
    4  12.510000    1.759685      2.319083      0.255028      0.038207
    4  12.628000    1.759685      2.319083      0.255028      0.038207
    4  12.516000    1.759685      2.319083      0.255028      0.038207
    5  12.505000    1.778270      2.334846      0.262173      0.030858
    5  12.501667    1.778270      2.334846      0.262173      0.030858
    5  12.608333    1.778270      2.334846      0.262173      0.030858
    5  12.510000    1.778270      2.334846      0.262173      0.030858
    6  12.522857    1.766255      2.321529      0.268511      0.026032
    6  12.498571    1.766255      2.321529      0.268511      0.026032
    6  12.592857    1.766255      2.321529      0.268511      0.026032

If anyone else has faced the same problem or has a workaround, please help.

Thanks,
Akash

Akash C July 11, 2013 02:29

Problem solved.

Combas December 2, 2013 08:03

Hello Akash,

I have the same problem as you. I have updated my open mpi version (from 1.3 to 1.5.4) and tried another version of metis (5.1 instead of 4.3, but 5.1 does not seem to be compatible with SU2 v2.0.8).
Could you please tell me what you did to solve your problem?

Thank you in advance
Laurent

Akash C December 4, 2013 10:41

Hi Laurent,

I should have posted how I solved the problem, but it was quite some time ago and I don't remember exactly what I did.
The output means that each process is running the whole case serially instead of the intended parallel execution, and mpi is most likely at fault. Metis is partitioning the domain, so a problem with metis can be ruled out. Reinstallation did work for me once, and I was also using mpi v1.4.1.
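
One way to confirm this symptom from a saved log is to count how many rows are printed for each iteration number. The sketch below assumes the six-column layout shown in the first post (iteration number first, then five numeric values); the function name `count_iteration_copies` is hypothetical and not part of SU2.

```python
from collections import Counter


def count_iteration_copies(log_text):
    """Return {iteration: number of rows printed for that iteration}."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        # Assumed layout: Iter, Time(s), Res[Rho], Res[RhoE], CLift, CDrag.
        if len(fields) != 6 or not fields[0].isdigit():
            continue  # skip headers, blank lines, and other solver output
        try:
            [float(value) for value in fields[1:]]
        except ValueError:
            continue  # a six-word line that is not a history row
        counts[int(fields[0])] += 1
    return dict(counts)


sample = """\
 Iter    Time(s)    Res[Rho]    Res[RhoE]  CLift(Total)  CDrag(Total)
    1  12.505000    1.870499      2.474769      0.147823      0.061620
    1  12.575000    1.870499      2.474769      0.147823      0.061620
    2  12.476667    1.841443      2.432168      0.188038      0.044570
    2  12.513333    1.841443      2.432168      0.188038      0.044570
"""
print(count_iteration_copies(sample))  # {1: 2, 2: 2}
```

If the count matches the number of processes requested from the launcher, every process is probably running its own copy of the case; a working parallel run would normally print each iteration once.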

Hope this helps.
Akash

Combas December 4, 2013 17:06

Hello Akash,

Thank you for your answer. I think I will uninstall open mpi and mpich2, then reinstall everything (and say a prayer at the same time :o) )

Regards,
Laurent

Akash C December 6, 2013 12:59

I also had problems when I used open mpi; su2 parallel worked once I compiled it with mpich2. I assumed you were already using mpich2, hence I didn't mention this.

Hope using mpich2 v1.4.1 solves the problem.
Akash

Combas December 6, 2013 16:47

Thank you Akash for this clarification. So I am going to try with mpich2 only.

Have a nice weekend!
Laurent

Combas December 27, 2013 11:58

I reinstalled everything (including Linux) on my computer, since it did not work any better with mpich2. Unfortunately, it still does not work (with Ubuntu v12.04 LTS, SU2 obtained from GitHub, python 2.7 (with numpy 1.6.1 and scipy 0.9.0), metis v4.0.3 and mpich2 v1.4.1).

Everything seems fine during compilation, but when I launch the computation (tutorial no. 2) on 4 cores, every iteration is computed 4 times, and it runs slower than on 2 cores...

If anyone has the solution, I would be extremely grateful!
Laurent

Combas January 19, 2014 18:07

I finally found the source of my problem!
Thank you Akash for your help!

Here is the solution for anyone who has the same problem.

My OS is Ubuntu 12.04, and I wanted to use mpich2 for the parallel computations (since it seems that open-mpi does not work).
mpich2 was installed on my computer, but it didn't work because the file "mpirun" in "/etc/alternatives/" pointed to "/usr/bin/mpirun.openmpi" instead of "/usr/bin/mpirun.mpich2". So I changed the link by running "ln -s /usr/bin/mpirun.mpich2 mpirun" in the folder /etc/alternatives/, and it worked.

For information, in the ./configure options of SU2, I used "--with-MPI=/etc/alternatives/mpicxx"

It seems the same problem can occur with the file "mpiexec", which is located in the same directory as "mpirun". One of these two files (mpirun or mpiexec) is used to launch a parallel computation.
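
To see where these launchers point without browsing /etc/alternatives/ by hand, a small script can look each one up on the PATH and follow its symlinks. This is a minimal sketch using only the Python standard library; the helper name `resolve_launcher` and its `search_path` parameter are hypothetical, and what it prints depends entirely on the machine.

```python
import os
import shutil


def resolve_launcher(name, search_path=None):
    """Return (path found on PATH, final target after following symlinks)."""
    found = shutil.which(name, path=search_path)
    if found is None:
        return None, None  # launcher is not installed or not on PATH
    return found, os.path.realpath(found)


for launcher in ("mpirun", "mpiexec"):
    found, target = resolve_launcher(launcher)
    print(launcher, "->", found, "->", target)
```

If the final target belongs to a different MPI implementation than the one SU2 was compiled against, that matches the situation described above.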

I hope this helps anyone who has the same problem...
Laurent

rktchip January 28, 2014 16:26

Hey everyone,

just to echo some comments here, two important steps for getting su2 parallel to work are (1) compiling against an mpi library by defining the compiler (i.e. mpicxx), and
(2) using the mpirun call that goes with your mpi library (mpiexec or mpirun).
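
The two steps above can be checked together by comparing which MPI implementation the compiler wrapper and the launcher each resolve to. The sketch below is a heuristic only: it assumes the file names along the symlink chain carry a suffix such as `.openmpi` or `.mpich2`, as in the Ubuntu paths quoted earlier in this thread, and the function names are hypothetical.

```python
import os


def symlink_chain(path):
    """Return every path visited while following symlinks from `path`."""
    chain = [path]
    while os.path.islink(path):
        target = os.readlink(path)
        # Relative link targets are resolved against the link's own directory.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
        if path in chain:
            break  # guard against a symlink loop
        chain.append(path)
    return chain


def mpi_flavor(path):
    """Guess the MPI implementation from file names along the symlink chain."""
    for step in symlink_chain(path):
        name = os.path.basename(step).lower()
        if "openmpi" in name:
            return "openmpi"
        if "mpich" in name:
            return "mpich"
    return "unknown"


def same_mpi(compiler_wrapper, launcher):
    """True only when both files are recognised and share one flavor."""
    flavors = {mpi_flavor(compiler_wrapper), mpi_flavor(launcher)}
    return len(flavors) == 1 and "unknown" not in flavors
```

For the setup described earlier in the thread, `same_mpi("/etc/alternatives/mpicxx", "/etc/alternatives/mpirun")` would be the natural call. An "unknown" result only means the names carried no recognisable suffix, not that the installation is broken.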

