Cluster error: Fatal error in one of the compute processes |
January 4, 2023, 06:02 |
#1 |
New Member
Karla Jacinto
Join Date: Oct 2020
Posts: 5
Rep Power: 6 |
Hi,
I'm running my jobs on a cluster, and this is not the first time that, after some time steps without any error, this message appears:

===============Message from the Cortex Process================================
Fatal error in one of the computing processes.
==============================================================================

It usually happens after the time step has converged. There are other log files (fluent-error.log) with more information that I don't understand, such as:

myid (30): Fatal signal raised sig = SIGIOT
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/lnamd64/3ddp_node/fluent_mpi.20.1.0() [0x225cabc]
/lib64/libpthread.so.0(+0xf630) [0x7f29717ff630]
/lib64/libc.so.6(gsignal+0x37) [0x7f2967b24387]
/lib64/libc.so.6(abort+0x148) [0x7f2967b25a78]
/lib64/libc.so.6(+0x78f67) [0x7f2967b66f67]
/lib64/libc.so.6(+0x81329) [0x7f2967b6f329]
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/multiport/mpi/lnamd64/ibmmpi/lib/linux_amd64/libmpi.so.1(free+0x2e) [0x7f296641eece]
/lib64/libc.so.6(+0x39d10) [0x7f2967b27d10]
/lib64/libc.so.6(+0x39d37) [0x7f2967b27d37]
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/multiport/lnamd64/mpi/shared/libmport.so(+0x7996a) [0x7f297473796a]
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/multiport/lnamd64/mpi/shared/libmport.so(+0x79a85) [0x7f2974737a85]
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/multiport/lnamd64/mpi/shared/libmport.so(+0x81b5a) [0x7f297473fb5a]
/lib64/libpthread.so.0(+0x7ea5) [0x7f29717f7ea5]
/lib64/libc.so.6(clone+0x6d) [0x7f2967becb0d]
myid (30): Fatal signal raised sig = SIGSEGV

Does anyone know how to solve this? Thanks for your help.
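
A note for anyone reading a trace like the one above: on Linux, SIGIOT is another name for SIGABRT, so the first signal means the process called abort(), and the SIGSEGV follows it on the same rank. When fluent-error.log is long, it can help to condense it before asking for support. The following is only a minimal sketch, assuming the log lines have the same shape as the excerpt above; the function name `summarize` and both regular expressions are hypothetical helpers written for this example, not part of Fluent. It groups the signals by rank and counts how many stack frames fall in each library. It does not diagnose the cause.

```python
import re
from collections import Counter, defaultdict

# Assumed line shapes, taken from the excerpt in this thread:
#   myid (30): Fatal signal raised sig = SIGIOT
#   /lib64/libc.so.6(abort+0x148) [0x7f2967b25a78]
SIGNAL_RE = re.compile(r"myid \((\d+)\): Fatal signal raised sig = (\w+)")
FRAME_RE = re.compile(r"(/\S+?)\((\S*?)\)\s*\[(0x[0-9a-fA-F]+)\]")


def summarize(log_text):
    """Return (signals per rank, frame count per library file name)."""
    signals = defaultdict(list)
    for rank, sig in SIGNAL_RE.findall(log_text):
        signals[int(rank)].append(sig)

    libs = Counter()
    for path, _symbol, _address in FRAME_RE.findall(log_text):
        # Keep only the file name, e.g. "libc.so.6".
        libs[path.rsplit("/", 1)[-1]] += 1

    return dict(signals), libs


if __name__ == "__main__":
    # Hypothetical usage: the file name is the one mentioned in the post.
    with open("fluent-error.log", encoding="utf-8", errors="replace") as fh:
        signals, libs = summarize(fh.read())
    for rank, sigs in sorted(signals.items()):
        print(f"rank {rank}: {', '.join(sigs)}")
    for lib, count in libs.most_common():
        print(f"{count:3d} frames in {lib}")
```

Knowing which rank failed first and which libraries appear in its frames (here libc, libmpi and libmport) is the kind of detail a cluster administrator or vendor support will usually ask for.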
|
March 4, 2024, 11:14 |
|
#2 |
New Member
Christoph D
Join Date: Mar 2024
Posts: 1
Rep Power: 0 |
Did you find a solution for this problem? I have exactly the same problem (also on a cluster after simulating for some time, using ANSYS FLUENT 2022R2).
|
|
Tags |
cluster, cortex, error, fluent, hpc |