CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)
-   OpenFOAM Installation (https://www.cfd-online.com/Forums/openfoam-installation/)
-   -   [OpenFOAM.com] problems running in parallel on Mac OS X and Windows: only 1 cpu (https://www.cfd-online.com/Forums/openfoam-installation/170218-problems-running-parallel-mac-os-x-windows-only-1-cpu.html)

windscion April 26, 2016 16:18

problems running in parallel on Mac OS X and Windows: only 1 cpu
 
Okay, I have OpenFOAM+ under docker running under Mac OS X 10.10.5 (Yosemite), and I can generate output. Alas, when I try to run in parallel, e.g.,
% mpirun -np 6 icoFoam -parallel
it only runs on one core. I imagine this is a configuration issue, but I haven't a clue where to start.


pgh April 29, 2016 18:25

Hi. Docker on a Mac runs inside VirtualBox. Try opening VirtualBox on the Mac and, in the VM's settings, change the number of processors to 6 or 8 (whatever is available), and increase the memory as well.
Hope this answers your question.

rudolf.hellmuth September 22, 2016 05:20

I have 24 cores on a Windows workstation. I set VirtualBox to use 22 processors, but it only runs on up to 8 processors in parallel.

Does anyone know why?

Best regards,
Rudolf

pgh September 22, 2016 06:21

How are you checking that it is only using 8 processors in parallel?

rudolf.hellmuth September 22, 2016 08:42

Quote:

Originally Posted by pgh (Post 618874)
How are you checking that it is only using 8 processors in parallel?

I check the CPU usage in the Windows Task Manager. I've got 8 cores at 100%, a couple of cores oscillating around 10% because of my internet browser, and most of the rest at 0%.

Meanwhile, OpenFOAM+ has decomposed my mesh into 20 pieces, and the runParallel script says that it is using 20 processes. (?!!!) I suppose mpirun is sending the 20 mesh pieces to the 8 cores many times during the simulation.

I also have a feeling that the case solves more slowly when decomposed into 20 pieces than when decomposed into 8 pieces.
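
One plausible reading of the numbers above, stated here as an assumption rather than a diagnosis, is that the guest system only sees 8 CPUs, so the 20 MPI processes have to take turns on them. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and os.cpu_count() simply reports what the operating system running the script can see.

```python
import os


def oversubscription(n_subdomains, n_cpus=None):
    """Return how many MPI processes would share each visible CPU.

    n_subdomains: number of pieces the mesh was decomposed into.
    n_cpus: CPUs visible to the system; defaults to os.cpu_count().
    """
    if n_cpus is None:
        n_cpus = os.cpu_count() or 1
    return n_subdomains / n_cpus


# Numbers from the post: 20 mesh pieces, 8 busy cores.
ratio = oversubscription(20, 8)
print(f"{ratio:.1f} processes per CPU")  # prints "2.5 processes per CPU"
```

A ratio above 1 means some processes must wait for a CPU, which would be consistent with the 20-piece run feeling slower than an 8-piece run.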

Best regards,
Rudolf

wyldckat September 24, 2016 16:31

Quote:

Originally Posted by rudolf.hellmuth (Post 618894)
I check the CPU usage in the Windows Task Manager. I've got 8 cores at 100%, [...]

Quick question: Inside the virtual machine, before running the case in parallel, run:
Code:

lscpu
What do the following entries give you?
Code:

CPU(s):             
On-line CPU(s) list: 
Thread(s) per core: 
Core(s) per socket: 
Socket(s):           
NUMA node(s):
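
For anyone who wants to collect just these entries, here is a minimal Python sketch. It assumes Python 3 is available and that the lscpu output is in English with the "name: value" layout shown in this thread. The function name and the sample text are illustrative only; the sample values are the ones reported later in this thread.

```python
# Entries requested in the post above.
FIELDS = [
    "CPU(s)",
    "On-line CPU(s) list",
    "Thread(s) per core",
    "Core(s) per socket",
    "Socket(s)",
    "NUMA node(s)",
]


def parse_lscpu(text):
    """Return a dict with the requested lscpu entries found in `text`."""
    found = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip() in FIELDS:
            found[key.strip()] = value.strip()
    return found


# Sample text pasted from an lscpu run (values reported later in the thread).
SAMPLE = """\
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  8
Socket(s):           1
"""

for name, value in parse_lscpu(SAMPLE).items():
    print(f"{name}: {value}")
```

Entries that lscpu does not print, such as a missing NUMA line, are simply absent from the returned dict.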


rudolf.hellmuth September 26, 2016 05:10

1 Attachment(s)
Good morning, Bruno.

Thanks for replying. The result of lscpu is below. This post also has a screenshot showing the VirtualBox settings. I am new to virtual machines, which seem to be the bottleneck here. I don't know what else I can change besides giving the VM access to 22 processors. I'd appreciate it if you could explain what I can do to run OpenFOAM+ at full power.

Output of lscpu:
Code:

CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  8
Socket(s):           1
NUMA node(s):        ? (info not displayed; I suppose it's 1)

Thanks,
Rudolf

wyldckat September 26, 2016 18:59

Hi Rudolf,

Quick request: The image from the desktop did help to get a bit of perspective... but any chance you can also show the configuration window with the CPU settings for this virtual machine in VirtualBox?
I ask because VirtualBox may be showing messages about the configuration.

Because if lscpu is telling us that only 8 cores exist, then something weird is going on.

One possibility that comes to mind would be that the "of_plus_1606" container was limited somehow to only using 8 cores...

@Pawan: Have you had any similar experience with this?

Best regards,
Bruno

rudolf.hellmuth September 27, 2016 13:03

Hi Bruno,

I will be away from that machine for a couple of weeks. I am going to write back here then.

Thanks for helping, this is very appreciated.

Cheers,
Rudolf

rudolf.hellmuth October 10, 2016 06:25

1 Attachment(s)
Hi Bruno,

I am attaching a screen capture of the VirtualBox configuration window with the CPU settings. Is this of any help?

Thanks again for the help.

Thank you,
Rudolf

pgh October 10, 2016 06:33

Hi,
Thanks for the snapshot. It is strange. I do not have a Windows machine with that many cores, so I could not check it here. However, I will look into this.

Thanks
Pawan

rudolf.hellmuth October 10, 2016 08:51

Could this 8-core limit be set in Docker, instead of VirtualBox?

wyldckat October 10, 2016 20:11

Quick answers:
Quote:

Originally Posted by rudolf.hellmuth (Post 620877)
I am attaching a screen capture of the VirtualBox configuration window with the CPU settings. Is this of any help?

This looks awfully familiar... it's showing that 22 cores are selected out of 32, when your machine only has 24 cores, or perhaps 48 hyperthreads in total. I believe I read a post about this back when I first looked for information on this.

Quote:

Originally Posted by rudolf.hellmuth (Post 620895)
Could this 8-core limit be set in Docker, instead of VirtualBox?

I also checked this back then and it shouldn't be able to do that, at least not as far as I could figure out from the Docker manuals.


Can you please try upgrading your VirtualBox installation to the latest version? It shouldn't affect the Docker VM, although you might need to shut down the Docker VM first.


By the way, just to play it safe, what is the error message that the dialogue box is showing in the bottom bar? Namely where it states:
Quote:

Invalid settings detected
I'm guessing it's complaining that there is very little RAM chosen for the display card, in which case it can be left as-is.

rudolf.hellmuth October 11, 2016 07:55

I suppose it is working with 22 cores now, but I don't know which step fixed it. I am going to describe everything that I've done.

Quote:

Originally Posted by wyldckat (Post 620952)
Quick answers:
Can you please try upgrading your VirtualBox installation to the latest version? It shouldn't affect the Docker VM, although you might need to shut down the Docker VM first.

Upgrading VirtualBox was a pain... I got the error message:
Quote:

"The cabinet file common.cab required for this installation is corrupt...."
I checked the MD5 checksum and tried a previous version, but what actually worked was restarting the computer. I don't know why.
At least it worked.

Then Docker was having SSH problems (something to do with the IP...). I had to delete the default settings and rerun Docker Quickstart Terminal.

Afterwards, I got the following error message:
Quote:

unable to find the image "of_v1606plus_centos66:latest" locally
So I followed the troubleshooting hint given on the openfoam.com installation website:
Quote:

I have completed the OpenFOAM installation but clicking on OF_Create_Env throws an error: unable to find the image "of_v1606plus_centos66:latest" locally
Prerequisites: Right click on the Oracle Virtual Box shortcut and open it as Administrator. If default is running, right click over it and select Close → PowerOff to close it.
  • Click on the Docker Quickstart terminal shortcut on your desktop to open the terminal. Check for any error messages. If there are no error messages, go to the next step. In the case of errors, please address the particular error, referring to the other FAQs listed here.
  • Type the command docker images in the terminal. If the output shows no errors, go to the next step. In the case of errors, please address the particular error, referring to the other FAQs listed here.
  • Go to the installation folder, e.g. /c/Program Files (x86)/ESI/OpenFOAM/v1606+/Windows/scripts, from the Docker Quickstart terminal using the change directory (cd) command as below:
  • cd /c/
  • cd *x86*
  • cd ESI/OpenFOAM/v1606+/Windows/scripts
  • Type the command docker load -i of_v1606plus_centos66.tar in the terminal window. This might take a while to execute, depending on system memory. Please note that of_v1606plus_centos66.tar refers to version OpenFOAM-v1606+. The name of the tar file will change depending on the version.
  • Now click on OF_Create_Env to create the container.

However, I ran the command docker load -i of_v1606plus_centos66.tar after I had set the new default settings in VirtualBox to the configuration I wanted. I suppose that this ended up configuring the Docker container with 22 cores, but I am not 100% sure. I'd have to do all of this again without presetting the default settings in order to verify that the configuration process is reproducible.

The new lscpu:
Code:

CPU(s):                22
On-line CPU(s) list:  0-21
Thread(s) per core:  1
Core(s) per socket:  22
Socket(s):            1


Quote:

Originally Posted by wyldckat (Post 620952)
By the way, just to play it safe, what is the error message that the dialogue box is showing in the bottom bar? Namely where it states:

I'm guessing it's complaining that there is very little RAM chosen for the display card, in which case it can be left as-is.

Yes, that was the video memory. I wasn't paying attention to it because I am only using the terminal. The message disappeared when I increased the video memory. I don't think that would have allowed me to use the 22 cores. Would it?

The problem was solved for me, but I am not sure which of the above steps made the difference.

Thanks a million for your help, Bruno.

Best regards,
Rudolf

wyldckat October 30, 2016 10:58

Quick answer @Rudolf: Sorry for the delay in responding back to you on this. Since the problem was solved, I didn't give it priority.

Quote:

Originally Posted by rudolf.hellmuth (Post 621023)
I suppose it is working with 22 cores now, but I don't know which step fixed it. I am going to describe everything that I've done.

Many thanks for the detailed steps!! I'm guessing that the upgrade of VirtualBox solved the main problem.


Quote:

Originally Posted by rudolf.hellmuth (Post 621023)
Yes, that was the video memory. I wasn't paying attention to it because I am only using the terminal. The message disappeared when I increased the video memory. I don't think that would have allowed me to use the 22 cores. Would it?

It shouldn't have been the origin of the problem. But if all else had failed, that could still have been a plausible suspect, although it would likely only continue to give us a symptom and not the cause.

Quote:

Originally Posted by rudolf.hellmuth (Post 621023)
The problem was solved for me, but I am not sure which of the above steps made the difference.

Thanks a million for your help, Bruno.

You're welcome! And once again, many thanks for the detailed steps! This is one of those hard-to-isolate issues, because it's not straightforward to reproduce. Therefore, all of these steps can come in handy!

yMorH July 1, 2019 19:27

Hello, hello: problems with parallel solution on Mac
 
1 Attachment(s)
I'm a new foamer and I think I have a problem!

I ran a parallel tutorial case without the parallel option, and the log file ended with the following information:

Code:

Time = 627

smoothSolver:  Solving for Ux, Initial residual = 1.22514118e-05, Final residual = 1.15094602e-06, No Iterations 8
smoothSolver:  Solving for Uy, Initial residual = 1.22504357e-05, Final residual = 1.15430392e-06, No Iterations 8
smoothSolver:  Solving for Uz, Initial residual = 9.99532834e-05, Final residual = 7.21196592e-06, No Iterations 5
GAMG:  Solving for p, Initial residual = 2.71922536e-05, Final residual = 1.25330464e-06, No Iterations 2
time step continuity errors : sum local = 9.86043155e-05, global = 6.98390332e-17, cumulative = 1.43705357e-14
ExecutionTime = 698.01 s  ClockTime = 707 s


SIMPLE solution converged in 627 iterations

End

Then I tried to run it on two processors in parallel, and the log file ended with the following information:

Code:

Time = 627

smoothSolver:  Solving for Ux, Initial residual = 1.22514118e-05, Final residual = 1.15094602e-06, No Iterations 8
smoothSolver:  Solving for Uy, Initial residual = 1.22504357e-05, Final residual = 1.15430392e-06, No Iterations 8
smoothSolver:  Solving for Uz, Initial residual = 9.99532834e-05, Final residual = 7.21196592e-06, No Iterations 5
GAMG:  Solving for p, Initial residual = 2.71922536e-05, Final residual = 1.25330464e-06, No Iterations 2
time step continuity errors : sum local = 9.86043155e-05, global = 6.98390332e-17, cumulative = 1.43705357e-14
ExecutionTime = 885.54 s  ClockTime = 926 s


SIMPLE solution converged in 627 iterations

End
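
The two ClockTime values reported in these logs (707 s for the serial run, 926 s for the two-processor run) can be turned into a speedup and a parallel efficiency. The Python sketch below only restates that arithmetic; the function names are illustrative.

```python
def speedup(serial_s, parallel_s):
    """Speedup = serial wall-clock time / parallel wall-clock time."""
    return serial_s / parallel_s


def efficiency(serial_s, parallel_s, n_procs):
    """Parallel efficiency = speedup / number of processes."""
    return speedup(serial_s, parallel_s) / n_procs


# ClockTime values taken from the two logs above.
serial_clock = 707.0
parallel_clock = 926.0

s = speedup(serial_clock, parallel_clock)
e = efficiency(serial_clock, parallel_clock, 2)
print(f"speedup = {s:.2f}, efficiency = {e:.2f}")
# prints "speedup = 0.76, efficiency = 0.38"
```

A speedup below 1 means the two-processor run was slower than the serial one, while the identical residuals and iteration count in both logs show that the same problem was solved each time.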

Given that information, I thought that something was wrong, so I have been looking for the reason. According to this thread started by windscion, there are several possible causes, so I checked them.

My virtual machine is working with two processors, as can be seen in the Docker configuration (see the attached image).

Then I checked the CPU with the command lscpu, as recommended by wyldckat, and the terminal gave this information:

Code:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):          2
Vendor ID:          GenuineIntel
CPU family:          6
Model:              42
Model name:          Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz
Stepping:            7
CPU MHz:            2691.962
BogoMIPS:            5383.92
L1d cache:          32K
L1i cache:          32K
L2 cache:            256K
L3 cache:            4096K
Flags:              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx lm constant_tsc rep_good nopl xtopology nonstop_tsc pni pclmulqdq dtes64 ds_cpl ssse3 cx16 xtpr pcid sse4_1 sse4_2 popcnt aes xsave avx hypervisor lahf_lm kaiser xsaveopt arat

I think my VM is working with two processors; however, the parallel run takes the same number of iterations and even more clock time!

When I run decomposePar, I get the following output:
Code:

/*---------------------------------------------------------------------------*\
  =========                 |
  \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
   \\    /   O peration     | Website:  https://openfoam.org
    \\  /    A nd           | Version:  6
     \\/     M anipulation  |
\*---------------------------------------------------------------------------*/
Build  : 6-fa1285188035
Exec  : decomposePar
Date  : Jul 01 2019
Time  : 22:21:28
Host  : "f5b1076b78d1"
PID    : 8530
I/O    : uncollated
Case  : /home/openfoam/taylor_couetteParallel
nProcs : 1
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time



Decomposing mesh region0

Create mesh

Calculating distribution of cells
Selecting decompositionMethod simple

Finished decomposition in 0.23 s

Calculating original mesh data

Distributing cells to processors

Distributing faces to processors

Distributing points to processors

Constructing processor meshes

Processor 0
    Number of cells = 128932
    Number of faces shared with processor 1 = 952
    Number of processor patches = 1
    Number of processor faces = 952
    Number of boundary faces = 28560

Processor 1
    Number of cells = 128932
    Number of faces shared with processor 0 = 952
    Number of processor patches = 1
    Number of processor faces = 952
    Number of boundary faces = 28560

Number of processor faces = 952
Max number of cells = 128932 (0% above average 128932)
Max number of processor patches = 1 (0% above average 1)
Max number of faces between processors = 952 (0% above average 952)

Time = 0

Processor 0: field transfer
Processor 1: field transfer

End

Can somebody tell me what I am doing wrong?

wyldckat July 9, 2019 19:50

Quick question/answer: How exactly did you run the solver in parallel? What was the command you used?

Because if you used mpirun manually, then you probably forgot to add the "-parallel" option at the end, which is needed for the solver to truly run in parallel. Without it, you solved the same case twice, taking twice the RAM, which would explain it being slower.
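
One way to check this after the fact is to look at the "nProcs" line in the solver log. The Python sketch below assumes that the solver log carries the same header layout as the decomposePar output posted in this thread, and that a run started with "-parallel" reports the number of MPI processes there. The function names and the second sample header are illustrative, not taken from a real log.

```python
import re


def reported_nprocs(log_text):
    """Return every nProcs value found in the header(s) of a log."""
    pattern = r"^[ \t]*nProcs[ \t]*:[ \t]*(\d+)"
    return [int(n) for n in re.findall(pattern, log_text, re.MULTILINE)]


def ran_in_parallel(log_text, expected):
    """True only if the log has an nProcs line and all of them match."""
    values = reported_nprocs(log_text)
    return bool(values) and all(v == expected for v in values)


# Header fragment as posted in this thread (a serial tool run).
serial_header = "Exec  : decomposePar\nnProcs : 1\n"
# Hypothetical header fragment of a run launched with -parallel on 2 ranks.
parallel_header = "Exec  : simpleFoam -parallel\nnProcs : 2\n"

print(ran_in_parallel(serial_header, 2))    # prints "False"
print(ran_in_parallel(parallel_header, 2))  # prints "True"
```

If mpirun starts two copies without "-parallel", each copy would be expected to report nProcs : 1, so the check fails in that case as well.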

yMorH July 24, 2019 18:47

Indeed! I was running the solver without the -parallel option! A beginner mistake! There will be more! Oops!

I7aniel June 5, 2020 13:06

Quote:

Originally Posted by wyldckat (Post 738495)
Quick question/answer: How exactly did you run the solver in parallel? What was the command you used?

Because if you used mpirun manually, then you probably forgot to add the "-parallel" option at the end, which is needed for the solver to truly run in parallel. Without it, you solved the same case twice, taking twice the RAM, which would explain it being slower.

Hey, I have basically the same question: I run a simulation on 2 cores with mpirun -np 2 ... -parallel, but my execution time nearly doubles :( Any idea why that could be?

Code:

Build  : 6-4ed10cc0693c
Exec  : decomposePar
Date  : Jun 05 2020
Time  : 18:02:06
Host  : "ubuntu-opsi"
PID    : 30334
I/O    : uncollated
Case  : /home/ubuntu-vm/run/parallel_test
nProcs : 1
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time



Decomposing mesh region0

Create mesh

Calculating distribution of cells
Selecting decompositionMethod scotch

Finished decomposition in 0.01 s

Calculating original mesh data

Distributing cells to processors

Distributing faces to processors

Distributing points to processors

Constructing processor meshes

Processor 0
    Number of cells = 2500
    Number of faces shared with processor 1 = 42
    Number of processor patches = 1
    Number of processor faces = 42
    Number of boundary faces = 5166

Processor 1
    Number of cells = 2500
    Number of faces shared with processor 0 = 42
    Number of processor patches = 1
    Number of processor faces = 42
    Number of boundary faces = 5314

Number of processor faces = 42
Max number of cells = 2500 (0% above average 2500)
Max number of processor patches = 1 (0% above average 1)
Max number of faces between processors = 42 (0% above average 42)

Time = 0

Processor 0: field transfer
Processor 1: field transfer

End
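
The per-processor cell counts in a decomposePar log like the one above can be extracted with a few lines of Python, for instance to compare cases of different sizes. This is a minimal sketch that assumes the "Processor N" / "Number of cells = M" layout shown in this thread; the function name is illustrative. Whether 2500 cells per process is enough work to pay for the communication between processes is not answered in this thread, and the sketch only reports the numbers.

```python
import re


def cells_per_processor(log_text):
    """Map processor index -> number of cells, from a decomposePar log."""
    pattern = re.compile(
        r"^Processor (\d+)[ \t]*\n[ \t]*Number of cells = (\d+)",
        re.MULTILINE,
    )
    return {int(proc): int(cells) for proc, cells in pattern.findall(log_text)}


# Fragment of the log posted above.
SAMPLE_LOG = """\
Processor 0
    Number of cells = 2500
    Number of faces shared with processor 1 = 42

Processor 1
    Number of cells = 2500
    Number of faces shared with processor 0 = 42

Processor 0: field transfer
Processor 1: field transfer
"""

counts = cells_per_processor(SAMPLE_LOG)
print(counts)                # prints "{0: 2500, 1: 2500}"
print(sum(counts.values()))  # prints "5000"
```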

Code:

lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):          1
NUMA node(s):        1
Vendor ID:          GenuineIntel
CPU family:          6
Model:              78
Model name:          Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
Stepping:            3
CPU MHz:            2400.000
BogoMIPS:            4800.00
Hypervisor vendor:  KVM
Virtualization type: full
L1d cache:          32K
L1i cache:          32K
L2 cache:            256K
L3 cache:            3072K
NUMA node0 CPU(s):  0,1
Flags:              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase avx2 invpcid rdseed clflushopt flush_l1d


pgh June 5, 2020 15:55

It would be helpful if you could post the logs of both the serial and the parallel run.
Are you running the binaries in VirtualBox? If yes, please check how many processors and how much memory you have allocated to it.


All times are GMT -4.