NACA0012 optimization fails on parallel run

#1 | Lazlo, New Member | February 23, 2020, 10:12
Hi everybody,


I am running SU2 7.0.1 on Linux (Fedora 31) on a single server with a multicore AMD CPU.

All tutorials run perfectly except the shape design ones. I encounter two problems with Inviscid_2D_Unconstrained_NACA0012:

1 - The CONTINUOUS_ADJOINT run fails when reading the surface sensitivity file (described in another thread, "Can not run any test cases"), whereas DISCRETE_ADJOINT works in a single process, but...

2 - SU2_CFD returns a segmentation fault when the same case is run in parallel. It occurs in DSN_002/DIRECT with error 139 (DSN_001 is fine). log_Direct.out ends with the call to ParMETIS. The error printed to the terminal is:

Code:
 --------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 0 on node server01 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
It is a pity, because I am really interested in this design capability.


Lazlo
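
A note for readers on the "error 139" above: by the usual shell convention, an exit status above 128 means the process was killed by signal (status - 128), so 139 corresponds to signal 11, which matches the "signal 11 (Segmentation fault)" line printed by mpirun. A minimal sketch of that arithmetic follows; the helper name `describe_exit_code` is made up for illustration and is not part of SU2.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell-style exit status into a readable description."""
    if code > 128:
        # Convention: status = 128 + signal number.
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(139))
```

On Linux, signal 11 is SIGSEGV, so the status only confirms that a rank crashed with an invalid memory access; it does not say where.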

#2 | jtrin (Jason Trinidad), New Member | February 24, 2020, 11:04
Hi Lazlo,

In my experience, shape design can be a memory gobbler. If the first design iteration runs fine and you then hit a segfault, my guess is that you may be running out of memory.

Have you tried running "top" or something similar on your compute nodes?
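
As a scriptable alternative to watching "top", the resident memory of each solver rank can be read from /proc on Linux. This is only a sketch under assumptions: the process name "SU2_CFD" is taken from this thread, and the function names `parse_vmrss_kb` and `rss_by_pid` are hypothetical helpers, not part of SU2.

```python
import os
import re


def parse_vmrss_kb(status_text: str) -> int:
    """Return the VmRSS value (in kB) from the text of /proc/<pid>/status, or 0."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else 0


def rss_by_pid(process_name: str = "SU2_CFD") -> dict:
    """Map PID -> resident memory in kB for every running process with this name."""
    usage = {}
    if not os.path.isdir("/proc"):
        # Not a Linux-style system; nothing to report.
        return usage
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() != process_name:
                    continue
            with open(f"/proc/{entry}/status") as f:
                usage[int(entry)] = parse_vmrss_kb(f.read())
        except OSError:
            # The process exited (or is not readable) between listing and reading.
            continue
    return usage


if __name__ == "__main__":
    for pid, kb in sorted(rss_by_pid().items()):
        print(f"PID {pid}: {kb / 1024:.1f} MiB resident")
```

Running it in a loop while the optimization is in DSN_002/DIRECT would show whether any rank grows sharply before the crash.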

#3 | Lazlo, New Member | February 24, 2020, 14:11
Thanks, jtrin.
Memory use stays very low with the NACA0012 test case (2 GB). The segfault only occurs in parallel mode; serial mode is fine.
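
Since the crash shows up only in parallel, one way to narrow it down is to rerun the failing direct solve on its own with different rank counts. The sketch below assumes the solver is launched as "mpirun -n N SU2_CFD config_CFD.cfg" (the form reported in the traceback later in this thread) and that config_CFD.cfg is still present in the design folder; `build_command` and `sweep` are hypothetical helper names.

```python
import subprocess


def build_command(ranks: int, solver: str = "SU2_CFD",
                  config: str = "config_CFD.cfg") -> list:
    """Build the mpirun command line for a given number of ranks."""
    return ["mpirun", "-n", str(ranks), solver, config]


def sweep(rank_counts, workdir: str, solver: str = "SU2_CFD") -> dict:
    """Run the solver once per rank count and record each return code."""
    results = {}
    for ranks in rank_counts:
        proc = subprocess.run(
            build_command(ranks, solver),
            cwd=workdir,                  # e.g. the DSN_002/DIRECT folder
            stdout=subprocess.DEVNULL,    # discard solver output; only the status matters here
            stderr=subprocess.DEVNULL,
        )
        results[ranks] = proc.returncode
    return results
```

For example, `sweep([1, 2, 4, 8], "DESIGNS/DSN_002/DIRECT")` would return a dictionary of return codes per rank count. Note that rerunning in that folder may overwrite output files there, so working on a copy is safer.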

#4 | talbring (Tim Albring), Super Moderator | February 26, 2020, 08:01
Can you post the complete Python stack trace?
Developer Director @ SU2 Foundation


#5 | Lazlo, New Member | February 26, 2020, 15:01
Thank you for your interest.
Here is the output:
Code:
Traceback (most recent call last):
  File "/home/lazlo/bin/shape_optimization.py", line 176, in <module>
    main()
  File "/home/lazlo/bin/shape_optimization.py", line 108, in main
    options.nzones      )
  File "/home/lazlo/bin/shape_optimization.py", line 152, in shape_optimization
    SU2.opt.SLSQP(project,x0,xb,its,accu)
  File "/home/lazlo/bin/SU2/opt/scipy_tools.py", line 133, in scipy_slsqp
    epsilon        = eps            )
  File "/usr/lib64/python3.7/site-packages/scipy/optimize/slsqp.py", line 208, in fmin_slsqp
    constraints=cons, **opts)
  File "/usr/lib64/python3.7/site-packages/scipy/optimize/slsqp.py", line 399, in _minimize_slsqp
    fx = func(x)
  File "/usr/lib64/python3.7/site-packages/scipy/optimize/optimize.py", line 300, in function_wrapper
    return function(*(wrapper_args + args))
  File "/home/lazlo/bin/SU2/opt/scipy_tools.py", line 383, in obj_f
    obj_list = project.obj_f(x)
  File "/home/lazlo/bin/SU2/opt/project.py", line 233, in obj_f
    return self._eval(konfig, func,dvs)
  File "/home/lazlo/bin/SU2/opt/project.py", line 202, in _eval
    vals = design._eval(func,*args)
  File "/home/lazlo/bin/SU2/eval/design.py", line 147, in _eval
    vals = eval_func(*inputs)
  File "/home/lazlo/bin/SU2/eval/design.py", line 244, in obj_f
    func += su2func(this_obj,config,state) * sign * scale * global_factor
  File "/home/lazlo/bin/SU2/eval/functions.py", line 92, in function
    aerodynamics( config, state )
  File "/home/lazlo/bin/SU2/eval/functions.py", line 255, in aerodynamics
    info = su2run.direct(config)
  File "/home/lazlo/bin/SU2/run/direct.py", line 77, in direct
    SU2_CFD(konfig)
  File "/home/lazlo/bin/SU2/run/interface.py", line 112, in CFD
    run_command( the_Command )
  File "/home/lazlo/bin/SU2/run/interface.py", line 292, in run_command
    raise exception(message)
RuntimeError: Path = /media/data/lazlo/Logiciels/git/SU2/Tutorials/Inviscid_2D_Unconstrained_NACA0012 DA/DESIGNS/DSN_002/DIRECT/,
Command = mpirun -n 8 /home/lazlo/bin/SU2_CFD config_CFD.cfg
SU2 process returned error '139'
[server01:6556 :0:6556] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace ====
[server01:6557 :0:6557] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace ====
[server01:6558 :0:6558] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace ====
[server01:6559 :0:6559] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace ====
[server01:6561 :0:6561] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace ====
[server01:6563 :0:6563] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x514)
==== backtrace ====
[server01:6555 :0:6555] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace ====
    0  /lib64/libucs.so.0(+0x1b25f) [0x7fb6ced4025f]
    1  /lib64/libucs.so.0(+0x1b42a) [0x7fb6ced4042a]
    2  /home/lazlo/bin/SU2_CFD() [0xb3d2e0]
    3  /home/lazlo/bin/SU2_CFD() [0xb3ecb9]
    4  /home/lazlo/bin/SU2_CFD() [0x8050f4]
    5  /home/lazlo/bin/SU2_CFD() [0x806447]
    6  /home/lazlo/bin/SU2_CFD() [0x806bf8]
    7  /home/lazlo/bin/SU2_CFD() [0x80cb1f]
    8  /home/lazlo/bin/SU2_CFD() [0x45a8e0]
    9  /lib64/libc.so.6(__libc_start_main+0xf3) [0x7fb6d4e8e1a3]
   10  /home/lazlo/bin/SU2_CFD() [0x4687be]
===================
    0  /lib64/libucs.so.0(+0x1b25f) [0x7f79a413225f]
    1  /lib64/libucs.so.0(+0x1b42a) [0x7f79a413242a]
    2  /home/lazlo/bin/SU2_CFD() [0xb3d96b]
    3  /home/lazlo/bin/SU2_CFD() [0xb3ecb9]
    4  /home/lazlo/bin/SU2_CFD() [0x8050f4]
    5  /home/lazlo/bin/SU2_CFD() [0x806447]
    6  /home/lazlo/bin/SU2_CFD() [0x806bf8]
    7  /home/lazlo/bin/SU2_CFD() [0x80cb1f]
    8  /home/lazlo/bin/SU2_CFD() [0x45a8e0]
    9  /lib64/libc.so.6(__libc_start_main+0xf3) [0x7f79a52801a3]
   10  /home/lazlo/bin/SU2_CFD() [0x4687be]
===================
    0  /lib64/libucs.so.0(+0x1b25f) [0x7f0cdc54a25f]
    1  /lib64/libucs.so.0(+0x1b42a) [0x7f0cdc54a42a]
    2  /home/lazlo/bin/SU2_CFD() [0xb3d96b]
    3  /home/lazlo/bin/SU2_CFD() [0xb3ecb9]
    4  /home/lazlo/bin/SU2_CFD() [0x8050f4]
    5  /home/lazlo/bin/SU2_CFD() [0x806447]
    6  /home/lazlo/bin/SU2_CFD() [0x806bf8]
    7  /home/lazlo/bin/SU2_CFD() [0x80cb1f]
    8  /home/lazlo/bin/SU2_CFD() [0x45a8e0]
    9  /lib64/libc.so.6(__libc_start_main+0xf3) [0x7f0cde6991a3]
   10  /home/lazlo/bin/SU2_CFD() [0x4687be]
===================
    0  /lib64/libucs.so.0(+0x1b25f) [0x7f2bb8f4125f]
    1  /lib64/libucs.so.0(+0x1b42a) [0x7f2bb8f4142a]
    2  /home/lazlo/bin/SU2_CFD() [0xb3d96b]
    3  /home/lazlo/bin/SU2_CFD() [0xb3ecb9]
    4  /home/lazlo/bin/SU2_CFD() [0x8050f4]
    5  /home/lazlo/bin/SU2_CFD() [0x806447]
    6  /home/lazlo/bin/SU2_CFD() [0x806bf8]
    7  /home/lazlo/bin/SU2_CFD() [0x80cb1f]
    8  /home/lazlo/bin/SU2_CFD() [0x45a8e0]
    0  /lib64/libucs.so.0(+0x1b25f) [0x7fe60d20b25f]
    1  /lib64/libucs.so.0(+0x1b42a) [0x7fe60d20b42a]
    2  /home/lazlo/bin/SU2_CFD() [0xb3d96b]
    3  /home/lazlo/bin/SU2_CFD() [0xb3ecb9]
    4  /home/lazlo/bin/SU2_CFD() [0x8050f4]
    5  /home/lazlo/bin/SU2_CFD() [0x806447]
    6  /home/lazlo/bin/SU2_CFD() [0x806bf8]
    7  /home/lazlo/bin/SU2_CFD() [0x80cb1f]
    8  /home/lazlo/bin/SU2_CFD() [0x45a8e0]
    9  /lib64/libc.so.6(__libc_start_main+0xf3) [0x7fe60f35a1a3]
   10  /home/lazlo/bin/SU2_CFD() [0x4687be]
===================
    0  /lib64/libucs.so.0(+0x1b25f) [0x7f32782c925f]
    1  /lib64/libucs.so.0(+0x1b42a) [0x7f32782c942a]
    2  /home/lazlo/bin/SU2_CFD() [0xb3d96b]
    3  /home/lazlo/bin/SU2_CFD() [0xb3ecb9]
    4  /home/lazlo/bin/SU2_CFD() [0x8050f4]
    5  /home/lazlo/bin/SU2_CFD() [0x806447]
    6  /home/lazlo/bin/SU2_CFD() [0x806bf8]
    7  /home/lazlo/bin/SU2_CFD() [0x80cb1f]
    8  /home/lazlo/bin/SU2_CFD() [0x45a8e0]
    9  /lib64/libc.so.6(__libc_start_main+0xf3) [0x7f327a4181a3]
   10  /home/lazlo/bin/SU2_CFD() [0x4687be]
===================
    0  /lib64/libucs.so.0(+0x1b25f) [0x7f6ff004825f]
    1  /lib64/libucs.so.0(+0x1b42a) [0x7f6ff004842a]
    2  /lib64/libc.so.6(cfree+0x20) [0x7f6ff11fb7b0]
    3  /home/lazlo/bin/SU2_CFD() [0x5f5882]
    4  /home/lazlo/bin/SU2_CFD() [0xb3d366]
    5  /home/lazlo/bin/SU2_CFD() [0xb3ecb9]
    6  /home/lazlo/bin/SU2_CFD() [0x8050f4]
    7  /home/lazlo/bin/SU2_CFD() [0x806447]
    8  /home/lazlo/bin/SU2_CFD() [0x806bf8]
    9  /home/lazlo/bin/SU2_CFD() [0x80cb1f]
   10  /home/lazlo/bin/SU2_CFD() [0x45a8e0]
   11  /lib64/libc.so.6(__libc_start_main+0xf3) [0x7f6ff11961a3]
   12  /home/lazlo/bin/SU2_CFD() [0x4687be]
===================
    9  /lib64/libc.so.6(__libc_start_main+0xf3) [0x7f2bbb0901a3]
   10  /home/lazlo/bin/SU2_CFD() [0x4687be]
===================
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 6 with PID 0 on node server01 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
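
The SU2_CFD frames in the backtraces above are raw addresses without function names. If the binary was built with debug information (an assumption; a stripped release build will not resolve), those addresses can be passed to binutils' addr2line to get function names and source lines. The sketch below only collects the unique solver addresses from the pasted output and builds the command; `solver_addresses` and `addr2line_command` are hypothetical helper names.

```python
import re

# Matches frames such as "    2  /home/lazlo/bin/SU2_CFD() [0xb3d2e0]"
FRAME = re.compile(r"^\s*\d+\s+(\S+?)\((.*?)\)\s+\[(0x[0-9a-fA-F]+)\]")


def solver_addresses(backtrace: str, binary_name: str = "SU2_CFD") -> list:
    """Return unique frame addresses inside the given binary, in first-seen order."""
    seen = []
    for line in backtrace.splitlines():
        match = FRAME.match(line)
        if match and match.group(1).endswith("/" + binary_name):
            address = match.group(3)
            if address not in seen:
                seen.append(address)
    return seen


def addr2line_command(binary_path: str, addresses: list) -> list:
    """Build an addr2line call: -f prints function names, -C demangles C++ names."""
    return ["addr2line", "-f", "-C", "-e", binary_path, *addresses]
```

The ranks' traces are interleaved in the output, but most frames repeat across ranks, so deduplicating leaves only a handful of addresses to look up.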

