Optimum way for running simulation in parallel
#1
Senior Member
krishna kant
Join Date: Feb 2016
Location: Hyderabad, India
Posts: 134
Rep Power: 11
Hello All
I am running a multiphase flow simulation in parallel, and there is a huge difference between the execution time and the clock time. I would like to know what the possible reasons for this could be. I am attaching an excerpt of my log file and my system info below.
Code:
PIMPLE: iteration 1
Selected 0 split points out of a possible 0.
Number of isoAdvector surface cells = 0
isoAdvection: Before conservative bounding: min(alpha) = 0, max(alpha) = 1 + -1
isoAdvection: After conservative bounding: min(alpha) = 0, max(alpha) = 1 + -1
isoAdvection: time consumption = 1%
Phase-1 volume fraction = 0  Min(alpha.water) = 0  1 - Max(alpha.water) = 1
solve the reinitialization equation
Interpolation routine for interface normal
Curvature Calculation
Creating isoSurface
Interpolating Curvature from iso-surface to cell centers
smoothSolver: Solving for Ux, Initial residual = 0.000593322427, Final residual = 1.70711359e-09, No Iterations 3
smoothSolver: Solving for Uy, Initial residual = 0.00260626766, Final residual = 6.52222249e-09, No Iterations 3
smoothSolver: Solving for Uz, Initial residual = 0.000199399075, Final residual = 1.1910404e-09, No Iterations 3
GAMG: Solving for p_rgh, Initial residual = 0.00639449018, Final residual = 3.29043924e-05, No Iterations 3
time step continuity errors : sum local = 3.53346848e-09, global = 4.19805384e-11, cumulative = 3.24236402e-08
GAMG: Solving for p_rgh, Initial residual = 0.000340716515, Final residual = 3.28157441e-06, No Iterations 3
time step continuity errors : sum local = 3.52358875e-10, global = -6.40922139e-12, cumulative = 3.2417231e-08
GAMG: Solving for p_rgh, Initial residual = 4.49729204e-05, Final residual = 7.00905977e-09, No Iterations 15
time step continuity errors : sum local = 7.51373052e-13, global = -1.42068391e-14, cumulative = 3.24172168e-08
smoothSolver: Solving for k, Initial residual = 0.000333755776, Final residual = 6.7287304e-07, No Iterations 1
ExecutionTime = 24298.66 s  ClockTime = 121287 s
Code:
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                40
On-line CPU(s) list:   0-39
Thread(s) per core:    2
Core(s) per socket:    10
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
Stepping:              2
CPU MHz:               1200.000
BogoMIPS:              4589.05
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              25600K
NUMA node0 CPU(s):     0-9,20-29
NUMA node1 CPU(s):     10-19,30-39
I am running 10 simulations, each using 4 processors. The grid has approximately 22K cells for a 2D case.
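For reference, the number of physical cores implied by the lscpu output above can be checked directly; this is only a minimal sketch using standard lscpu options:
Code:
# Summarise the topology: on this machine it should report 2 sockets,
# 10 cores per socket and 2 threads per core, i.e. 20 physical cores
# exposed as 40 logical CPUs.
lscpu | grep -E '^(Socket\(s\)|Core\(s\) per socket|Thread\(s\) per core|CPU\(s\)):'

# Per-CPU view: every CORE id appears twice, once per hardware thread.
lscpu --extended=CPU,CORE,SOCKET,ONLINE
With 10 runs of 4 ranks each, that is 40 MPI processes sharing those 20 physical cores.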
#2
New Member
Icaro A. Carvalho
Join Date: Dec 2020
Posts: 24
Rep Power: 6
Hi Krishna,
I sometimes get confused by the output of 'lscpu', so I hope I am not saying something wrong. I would suggest you try running these 10 simulations with 2 processors each and compare the CPU time with the clock time. I say this because I suspect you actually have 20 physical cores, and the way you are running now, you are using virtual cores, which OpenFOAM does not take advantage of. Hope that helps.
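One way to set that up, as a sketch only (<solver> is a placeholder for whatever solver you are running, and it assumes a decomposition method such as scotch that needs no per-direction coefficients):
Code:
# In system/decomposeParDict of each case set:
#     numberOfSubdomains 2;
# then re-decompose and launch with 2 ranks:
decomposePar -force
mpirun -np 2 <solver> -parallel > log.<solver> 2>&1 &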
#4
Senior Member
Klaus
Join Date: Mar 2009
Posts: 289
Rep Power: 23
- I understand you have two nodes, each with two sockets and 10 physical cores per socket?
- Make sure you switch off SMT/Hyper-Threading and use only physical cores!
- How fast is the link between the two nodes? (InfiniBand or something slower?)
- Maybe you are using too many cores for your small test case and wasting time on "unnecessary" communication (see discussion: MPIRun How many processors).
(A quick way to check the SMT state and the interconnect is sketched below.)
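This is only a sketch for checking those points from the shell; ibstat comes from the infiniband-diags package and only matters if the runs really span two machines:
Code:
# SMT / Hyper-Threading state: "Thread(s) per core: 2" means it is on;
# it is turned off in the BIOS (or via the kernel's SMT control).
lscpu | grep 'Thread(s) per core'

# Interconnect: an active InfiniBand port shows up here; if the command is
# missing or reports nothing, inter-node MPI traffic is going over Ethernet.
ibstat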
#5
Senior Member
Join Date: Apr 2020
Location: UK
Posts: 825
Rep Power: 16
Have you tried running top? Just type this from the command line and it will tell you how busy the processors are ... and it lets you check on Domenico's suggestion.
For example, if all is working smoothly, the processes for each run should be steaming away at 100% CPU. If they are always far below 100%, there is probably some bottleneck in the communication, or you are overloading the cores; if they are at 100% for a while and then drop to something small before returning to 100%, there may be a disk-writing bottleneck, etc.
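A minimal sketch of the invocation (plain top from procps, nothing OpenFOAM-specific):
Code:
# Interactive: press '1' to see per-core load, 'u' to filter by user.
top
# One-shot snapshot in batch mode, handy over ssh or for a log file:
top -b -n 1 | head -40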
#6
Senior Member
krishna kant
Join Date: Feb 2016
Location: Hyderabad, India
Posts: 134
Rep Power: 11
Hello All
Thank you for all the suggestions, and I apologize for my late reply; I was busy with ICLASS in those days and missed the reply notifications in my email. Now I am running other simulations with 1.125M cells, each using 4 processors. The problem is still there, and it seems to be one of multithreading and CPU utilization. The top command gives me this info:
Code:
  PID USER      PR  NI  VIRT   RES   SHR S %CPU %MEM     TIME+  COMMAND
16526 Rajesh    20   0 1822m  1.1g  7932 R 45.6  3.7  959:58.71 interFlowvAMR1
15585 Rajesh    20   0 1830m  993m  7788 R 43.8  3.1  945:34.24 interFlowvAMR1
15892 Rajesh    20   0 1797m  976m  8216 R 43.8  3.1  949:38.86 interFlowvAMR1
15893 Rajesh    20   0 1809m  1.0g  7600 R 43.8  3.2  945:08.07 interFlowvAMR1
16527 Rajesh    20   0 1813m  1.1g  8092 R 43.8  3.7  960:44.33 interFlowvAMR1
16524 Rajesh    20   0 1824m  1.2g  8120 R 42.0  3.7  958:22.57 interFlowvAMR1
15588 Rajesh    20   0 1826m  965m  7760 R 40.1  3.0  944:16.15 interFlowvAMR1
15894 Rajesh    20   0 1813m  1.0g  7844 R 40.1  3.2  947:49.50 interFlowvAMR1
16852 Rajesh    20   0 1815m  1.2g  7792 R 40.1  3.8  956:07.44 interFlowvAMR1
15586 Rajesh    20   0 1821m  1.0g  7824 R 38.3  3.3  946:58.00 interFlowvAMR1
16219 Rajesh    20   0 1808m  1.1g  8196 R 38.3  3.6  953:12.18 interFlowvAMR1
16221 Rajesh    20   0 1826m  1.1g  8180 R 38.3  3.6  954:00.33 interFlowvAMR1
15891 Rajesh    20   0 1817m  1.0g  8192 R 36.5  3.3  948:09.53 interFlowvAMR1
16218 Rajesh    20   0 1830m  1.1g  8220 R 36.5  3.4  947:10.34 interFlowvAMR1
16525 Rajesh    20   0 1795m  1.1g  8144 R 36.5  3.7  958:52.91 interFlowvAMR1
15587 Rajesh    20   0 1824m  1.1g  7600 R 34.7  3.5  945:53.87 interFlowvAMR1
16220 Rajesh    20   0 1830m  1.1g  8032 R 34.7  3.4  948:53.46 interFlowvAMR1
16851 Rajesh    20   0 1862m  1.3g  7760 R 34.7  4.0  955:12.77 interFlowvAMR1
16853 Rajesh    20   0 1830m  1.2g  7568 R 34.7  3.9  958:54.66 interFlowvAMR1
16854 Rajesh    20   0 1831m  1.2g  7772 R 31.0  3.9  956:42.15 interFlowvAMR1
Code:
grep -i 'ht' /proc/cpuinfo
Code:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm invpcid_single ssbd pti retpoline ibrs ibpb tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm cqm_llc cqm_occup_llc flush_l1d
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm invpcid_single ssbd pti retpoline ibrs ibpb tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm cqm_llc cqm_occup_llc flush_l1d
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm invpcid_single ssbd pti retpoline ibrs ibpb tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm cqm_llc cqm_occup_llc flush_l1d
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm invpcid_single ssbd pti retpoline ibrs ibpb tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm cqm_llc cqm_occup_llc flush_l1d
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm invpcid_single ssbd pti retpoline ibrs ibpb tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm cqm_llc cqm_occup_llc flush_l1d
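Those 31-46% CPU figures are consistent with two ranks sharing one physical core. A minimal sketch for checking that; the PID is just one taken from the top output above, and sibling pairs like (0,20), (1,21), ... are only the typical enumeration, so confirm them first:
Code:
# Which logical CPUs are hardware-thread siblings of cpu0?
cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list

# Which CPUs is a given solver process currently allowed to run on?
taskset -pc 16526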
#7
Senior Member
krishna kant
Join Date: Feb 2016
Location: Hyderabad, India
Posts: 134
Rep Power: 11
Is there any command to switch off multithreading in OpenFOAM? It is using virtual CPUs even if I try to use only physical CPUs by limiting the runs to 20 CPUs.
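As far as I know there is no OpenFOAM switch for this; the solver just runs one MPI rank per core on whatever CPUs the OS hands it. Two ways to keep the ranks on physical cores, sketched here under the assumption of Open MPI (the MPI usually bundled with OpenFOAM) and a reasonably recent kernel; interFlowvAMR1 is the solver name from the top output above:
Code:
# Option 1: switch SMT off at the OS level (the control file exists on
# kernels >= 4.19; on older systems disable Hyper-Threading in the BIOS).
cat /sys/devices/system/cpu/smt/active            # 1 = SMT is on
echo off | sudo tee /sys/devices/system/cpu/smt/control

# Option 2: leave SMT on, but pin each run to its own physical cores.
# On this enumeration CPUs 0-19 are normally the first hardware thread of
# each core (confirm with thread_siblings_list); --bind-to none makes
# Open MPI keep the taskset mask instead of doing its own placement.
taskset -c 0-3 mpirun --bind-to none -np 4 interFlowvAMR1 -parallel > log &  # from case 1's directory
taskset -c 4-7 mpirun --bind-to none -np 4 interFlowvAMR1 -parallel > log &  # from case 2's directory
# ...and so on, giving every run four cores nobody else uses.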