What causes the error below in a Fluent calculation on a cluster? It just happens abruptly. How to solve it?

Old   July 14, 2022, 04:47
Default What causes the error below in a Fluent calculation on a cluster? It just happens abruptly. How to solve it?
  #1
Senior Member
 
Join Date: Dec 2017
Posts: 384
Rep Power: 9
hitzhwan is on a distinguished road
What causes the error below in a Fluent calculation on a cluster? It happens abruptly. How can I solve it?

Fatal error has happened to some of the processes!
Exiting ...



===============Message from the Cortex Process================================

Fatal error in one of the compute processes.

==============================================================================

==============================================================================
Stack backtrace generated for process id 2717 on signal 11 :
*** Error in `fluent': corrupted double-linked list: 0x0000000001f847c0 ***
======= Backtrace: =========
/usr/lib64/libc.so.6(+0x7bd95)[0x2af635affd95]
/usr/lib64/libc.so.6(+0x7de35)[0x2af635b01e35]
/usr/lib64/libc.so.6(__libc_malloc+0x4c)[0x2af635b0387c]
/usr/lib64/libc.so.6(__backtrace_symbols+0x10e)[0x2af635b8e33e]
fluent(print_back_trace_to_file+0x5a)[0x68d76a]
*** Error in `fluent': corrupted double-linked list: 0x0000000001f84760 ***
fluent[0x67f3b9]
/usr/lib64/libc.so.6(+0x35670)[0x2af635ab9670]
/usr/lib64/libc.so.6(+0x38dcd)[0x2af635abcdcd]
======= Backtrace: =========
/usr/lib64/libc.so.6(+0x38eb5)[0x2af635abceb5]
fluent[0x677829]
/usr/lib64/libc.so.6(+0x35670)[0x2af635ab9670]
/usr/lib64/libc.so.6(__select+0x33)[0x2af635b71943]
fluent[0x65eb86]
fluent(lreadf+0x29)[0x6e1b99]
/usr/lib64/libc.so.6(+0x7bd95)[0x2af635affd95]
/usr/lib64/libc.so.6(+0x7cec6)[0x2af635b00ec6]
/opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(+0x9bfbc)[0x2af634afafbc]
/opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(+0x9ca27)[0x2af634afba27]
/opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(PyDict_SetItem+0x67)[0x2af634afd487]
fluent(eval+0x497)[0x6db8d7]
/opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(_PyModule_Clear+0x14c)[0x2af634b015bc]
/opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(PyImport_Cleanup+0x24f)[0x2af634b8288f]
/opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(Py_Finalize+0xfe)[0x2af634b948de]
/opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libExpr.so(_ZN13PyInitializerD1Ev+0x6)[0x2af630f3a716]
/usr/lib64/libc.so.6(__cxa_finalize+0x9a)[0x2af635abd1da]
/opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libExpr.so(+0x635c3)[0x2af630ecb5c3]
======= Memory map: ========
00400000-0124d000 r-xp 00000000 00:26 425291452 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/cortex.19.2.0
0144d000-01477000 r--p 00e4d000 00:26 425291452 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/cortex.19.2.0
01477000-014fb000 rw-p 00e77000 00:26 425291452 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/cortex.19.2.0
014fb000-0164d000 rw-p 00000000 00:00 0
01eb2000-022cb000 rw-p 00000000 00:00 0 [heap]
2af62e099000-2af62e0ba000 r-xp 00000000 08:03 17043974 /usr/lib64/ld-2.17.so
2af62e0ba000-2af62e219000 rw-p 00000000 00:00 0
2af62e219000-2af62e220000 r--s 00000000 08:03 17305077 /usr/lib64/gconv/gconv-modules.cache
2af62e220000-2af62e299000 rw-p 00000000 00:00 0
2af62e29a000-2af62e29b000 rw-p 00000000 00:00 0
2af62e2ba000-2af62e2bb000 r--p 00021000 08:03 17043974 /usr/lib64/ld-2.17.so
2af62e2bb000-2af62e2bc000 rw-p 00022000 08:03 17043974 /usr/lib64/ld-2.17.so
2af62e2bc000-2af62e2bd000 rw-p 00000000 00:00 0
2af62e2bd000-2af62e550000 r-xp 00000000 00:26 867021368 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libimf.so
2af62e550000-2af62e74f000 ---p 00293000 00:26 867021368 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libimf.so
2af62e74f000-2af62e755000 r--p 00292000 00:26 867021368 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libimf.so
2af62e755000-2af62e7aa000 rw-p 00298000 00:26 867021368 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libimf.so
2af62e7aa000-2af62f489000 r-xp 00000000 00:26 867021450 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libsvml.so
2af62f489000-2af62f688000 ---p 00cdf000 00:26 867021450 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libsvml.so
2af62f688000-2af62f6c3000 r--p 00cde000 00:26 867021450 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libsvml.so
2af62f6c3000-2af62f6c8000 rw-p 00d19000 00:26 867021450 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libsvml.so
2af62f6c8000-2af62f730000 r-xp 00000000 00:26 867021370 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libintlc.so.5
2af62f730000-2af62f930000 ---p 00068000 00:26 867021370 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libintlc.so.5
2af62f930000-2af62f931000 r--p 00068000 00:26 867021370 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libintlc.so.5
2af62f931000-2af62f932000 rw-p 00069000 00:26 867021370 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libintlc.so.5
2af62f932000-2af62f933000 rw-p 00000000 00:00 0
2af62f933000-2af62fa92000 r-xp 00000000 00:26 867021443 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libirng.so
2af62fa92000-2af62fc92000 ---p 0015f000 00:26 867021443 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libirng.so
2af62fc92000-2af62fc93000 r--p 0015f000 00:26 867021443 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libirng.so
2af62fc93000-2af62fca6000 rw-p 00160000 00:26 867021443 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libirng.so
2af62fca6000-2af62fcbc000 r-xp 00000000 08:03 17044007 /usr/lib64/libpthread-2.17.so
2af62fcbc000-2af62febc000 ---p 00016000 08:03 17044007 /usr/lib64/libpthread-2.17.so
2af62febc000-2af62febd000 r--p 00016000 08:03 17044007 /usr/lib64/libpthread-2.17.so
2af62febd000-2af62febe000 rw-p 00017000 08:03 17044007 /usr/lib64/libpthread-2.17.so
2af62febe000-2af62fec2000 rw-p 00000000 00:00 0
2af62fec2000-2af62fee8000 r-xp 00000000 00:26 425291467 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libCxHoops.so
2af62fee8000-2af6300e7000 ---p 00026000 00:26 425291467 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libCxHoops.so
2af6300e7000-2af6300e8000 r--p 00025000 00:26 425291467 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libCxHoops.so
2af6300e8000-2af6300e9000 rw-p 00026000 00:26 425291467 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libCxHoops.so
2af6300e9000-2af630517000 r-xp 00000000 00:26 425291458 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libStateEngine.so
2af630517000-2af630717000 ---p 0042e000 00:26 425291458 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libStateEngine.so
2af630717000-2af630719000 r--p 0042e000 00:26 425291458 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libStateEngine.so
2af630719000-2af630733000 rw-p 00430000 00:26 425291458 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libStateEngine.so
2af630733000-2af630734000 rw-p 00000000 00:00 0
2af630734000-2af6307b9000 r-xp 00000000 00:26 425291459
hitzhwan is offline   Reply With Quote

Old   July 14, 2022, 11:42
Default
  #2
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,675
Rep Power: 66
LuckyTran has a spectacular aura about
This is the libc version of a segmentation fault which means it tried to access memory and then couldn't. Frankly this could be caused by anything. Maybe someone unplugged your RAM or spilt coffee on it. Or maybe you have code that tries to access variables that haven't been declared yet.
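
For anyone decoding the log in post #1: the backtrace header line names the process id and the signal number, and on Linux signal 11 is SIGSEGV, i.e. the segmentation fault described above. Below is a minimal Python sketch that pulls both values out of such a line. The helper name `describe_crash` and the regular expression are illustrative assumptions, not anything that ships with Fluent.

```python
import re
import signal

# Hypothetical pattern for the header line Fluent printed in the log above.
BACKTRACE_RE = re.compile(
    r"Stack backtrace generated for process id (\d+) on signal (\d+)"
)


def describe_crash(line):
    """Return (pid, signal_name) for a backtrace header line, or None."""
    match = BACKTRACE_RE.search(line)
    if match is None:
        return None
    pid = int(match.group(1))
    signum = int(match.group(2))
    try:
        # Map the number to its symbolic name on the current platform.
        name = signal.Signals(signum).name
    except ValueError:
        # Unknown signal number on this platform: fall back to plain text.
        name = "signal %d" % signum
    return pid, name


line = "Stack backtrace generated for process id 2717 on signal 11 :"
print(describe_crash(line))
```

This only tells you which process died and with which signal. It does not tell you which piece of code corrupted the memory in the first place.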
LuckyTran is online now   Reply With Quote

Old   July 14, 2022, 21:24
Default Hello, do you use a cluster? Do you find the cluster faster than a single PC?
  #3
Senior Member
 
Join Date: Dec 2017
Posts: 384
Rep Power: 9
hitzhwan is on a distinguished road
Quote:
Originally Posted by LuckyTran View Post
This is the libc version of a segmentation fault which means it tried to access memory and then couldn't. Frankly this could be caused by anything. Maybe someone unplugged your RAM or spilt coffee on it. Or maybe you have code that tries to access variables that haven't been declared yet.
Thank you. Do you use a cluster? Do you find the cluster faster than a single PC?
hitzhwan is offline   Reply With Quote

Old   July 14, 2022, 21:46
Default
  #4
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,675
Rep Power: 66
LuckyTran has a spectacular aura about
Yes I use a cluster. And I employ just a tiny bit of common sense when I do. Most problems that I run on a cluster don't fit on one PC. So yes, it's infinitely faster.
LuckyTran is online now   Reply With Quote

Old   July 14, 2022, 21:53
Default Why does it not fit on one PC? Why is the cluster faster than the PC?
  #5
Senior Member
 
Join Date: Dec 2017
Posts: 384
Rep Power: 9
hitzhwan is on a distinguished road
Quote:
Originally Posted by LuckyTran View Post
Yes I use a cluster. And I employ just a tiny bit of common sense when I do. Most problems that I run on a cluster don't fit on one PC. So yes, it's infinitely faster.
Why does it not fit on one PC? Why is the cluster faster than the PC?
hitzhwan is offline   Reply With Quote

Old   July 14, 2022, 22:04
Default
  #6
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,675
Rep Power: 66
LuckyTran has a spectacular aura about
Do you still have a Pentium CPU or do you have a modern multi-core CPU? Applications in general can run with more throughput via multithreading. Clusters are just massive versions of that.

I often need upwards of 200 GB of RAM to open my model. I only have 96 GB of RAM on my workstation. Even my smaller models that I can open on my workstation would take weeks to run. I run it on a cluster to get results on the same day.
LuckyTran is online now   Reply With Quote

Old   July 15, 2022, 03:51
Default Hi, my workstation has a Xeon Platinum 8273CL CPU. What about your workstation and clu
  #7
Senior Member
 
Join Date: Dec 2017
Posts: 384
Rep Power: 9
hitzhwan is on a distinguished road
Quote:
Originally Posted by LuckyTran View Post
Do you still have a Pentium CPU or do you have a modern multi-core CPU? Applications in general can run with more throughput via multithreading. Clusters are just massive versions of that.

I often need upwards of 200 GB of RAM to open my model. I only have 96 GB of RAM on my workstation. Even my smaller models that I can open on my workstation would take weeks to run. I run it on a cluster to get results on the same day.
Hi, my workstation has a Xeon Platinum 8273CL CPU. What about your workstation and cluster?
If the cluster and the workstation have the same hardware, do they run at the same speed?
hitzhwan is offline   Reply With Quote

Old   July 15, 2022, 11:15
Default
  #8
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,675
Rep Power: 66
LuckyTran has a spectacular aura about
If you have two computers with the same hardware not running at the same speed then either one is defective or you really did something wrong.

But instead of asking how others are not having issues, how about providing details relevant to yourself that might elucidate the issues you are having? That would be more helpful to you.
LuckyTran is online now   Reply With Quote

Old   July 16, 2022, 22:29
Default Sorry, maybe I did not give enough detail. The cluster has a CPU of E5-2605v4, an
  #9
Senior Member
 
Join Date: Dec 2017
Posts: 384
Rep Power: 9
hitzhwan is on a distinguished road
Quote:
Originally Posted by LuckyTran View Post
If you have two computers with the same hardware not running at the same speed then either one is defective or you really did something wrong.

But instead of asking how others are not having issues, how about providing details relevant to yourself that might elucidate the issues you are having? That would be more helpful to you.
Sorry, maybe I did not give enough detail. The cluster has an E5-2605v4 CPU and another PC has a 6226R CPU. Both runs use 24 cores, and everything else in the calculation model is the same. I find the cluster is 2 times faster than the other PC. What is the reason?
hitzhwan is offline   Reply With Quote

Old   July 17, 2022, 01:45
Default
  #10
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,675
Rep Power: 66
LuckyTran has a spectacular aura about
An E5-2650V4 has 12 cores, 24 hyperthreaded. A 6226R has 16 cores.


If you use 24 cores on an E5-2650V4, you'll likely saturate the cpu and it runs at 100%. If you use 24 cores on a 6226R, it will be very suboptimal. The first block of 16 will do their work, and then the remaining 8 must wait for the first block of 16 to finish. Not only does this guarantee you have at least 25% idle time, it doubles the number of cpu cycles needed to complete 1 iteration.
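
The arithmetic behind the 25% idle time and the doubling can be sketched in a few lines of Python. This is an idealised model and only an illustration of the reasoning above: it assumes each core runs one process at a time, every process takes the same time per iteration, and a round ends only when all of its processes finish. It ignores hyper-threading, memory bandwidth, and the fact that a real OS scheduler time-slices rather than running in strict rounds. The function name `oversubscription` is made up for this example.

```python
import math


def oversubscription(n_processes, n_cores):
    """Return (rounds per iteration, idle fraction) under the idealised model."""
    # Number of back-to-back rounds needed to give every process a core once.
    rounds = math.ceil(n_processes / n_cores)
    # Total core-time slots available across those rounds.
    core_slots = rounds * n_cores
    # Slots in which a core has nothing to run.
    idle_fraction = (core_slots - n_processes) / core_slots
    return rounds, idle_fraction


# 24 processes on a 16-core CPU (the 6226R case).
print(oversubscription(24, 16))  # (2, 0.25)

# 16 processes on the same CPU: one round, no idle slots.
print(oversubscription(16, 16))  # (1, 0.0)
```

With 24 processes on 16 cores the model needs two rounds instead of one, and 8 of the 32 core slots sit empty, which is where the 25% and the factor of two come from.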


Since I can't really tell what else might be wrong, I recommend running at less than capacity, i.e. 11 cores on both machines, for a fairer comparison.
LuckyTran is online now   Reply With Quote
