CFD Online Discussion Forums


saad3000 October 10, 2013 02:35

Linux Fluent HPC parallel system check failed
 
Dear all,

I am trying to run Fluent from the master node of a Linux cluster and spawn processes on node01, but it fails with the following errors:

fluent
/ansys_inc/v140/fluent/fluent14.0.0/bin/fluent -r14.0.0
/ansys_inc/v140/fluent/fluent14.0.0/bin/fluent -r14.0.0 3d -t4 -pinfiniband -mpi=openmpi -cnf=/gpfs1/iaf04/.fluent.launcher.host -ssh
bash: /ansys_inc/v140/fluent/fluent14.0.0/multiport/mpi_wrapper/bin/mpicheck.fl: No such file or directory
*** Parallel system check failed!
*** To disable this check, run FLUENT with -pcheck=0
/ansys_inc/v140/fluent/fluent14.0.0/cortex/lnamd64/cortex.14.0.0 -f fluent -newcx (fluent "3d -host -alnamd64 -r14.0.0 -t4 -cnf=/gpfs1/iaf04/.fluent.launcher.host -path/ansys_inc/v140/fluent -ssh")
[iaf04@hpc-mgt1 ~]$ fluent
/ansys_inc/v140/fluent/fluent14.0.0/bin/fluent -r14.0.0
/ansys_inc/v140/fluent/fluent14.0.0/bin/fluent -r14.0.0 3d -t4 -pinfiniband -mpi=openmpi -cnf=/gpfs1/iaf04/.fluent.launcher.host

We have InfiniBand with Open MPI, and password-less SSH access with a shared home folder.
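
Since the first error in the output above is that `mpicheck.fl` cannot be found, one thing that could be checked is whether that path is visible on every node, not only on the master. Below is a minimal sketch of such a check, not a confirmed fix. It assumes the hosts file lists one hostname per line and that `ssh` works without a password, as described above. The function names `check_path_on_host` and `check_hosts_file` and the `SSH_CMD` override are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: report whether a given path exists on each host in a hosts file.
# Assumption: one hostname per line in the hosts file.
# SSH_CMD is a hypothetical override so the check can be dry-run without ssh.

# check_path_on_host HOST PATH
# Prints "HOST: found" if PATH exists on HOST, otherwise a MISSING message.
check_path_on_host() {
    local host="$1" path="$2"
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    # stdin is redirected so ssh does not consume the hosts file in a loop.
    if ${SSH_CMD:-ssh} -o BatchMode=yes "$host" test -e "$path" < /dev/null; then
        echo "$host: found"
    else
        echo "$host: MISSING (or ssh failed)"
    fi
}

# check_hosts_file HOSTS_FILE PATH
# Runs check_path_on_host for every non-empty line of HOSTS_FILE.
check_hosts_file() {
    local hosts_file="$1" path="$2" host
    while IFS= read -r host; do
        if [ -n "$host" ]; then
            check_path_on_host "$host" "$path"
        fi
    done < "$hosts_file"
}
```

With the paths from the output above, it could be called as `check_hosts_file /gpfs1/iaf04/.fluent.launcher.host /ansys_inc/v140/fluent/fluent14.0.0/multiport/mpi_wrapper/bin/mpicheck.fl`. A `MISSING` line for node01 would suggest that `/ansys_inc` is not mounted or installed there, although it can also mean the SSH connection itself failed.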

I have also noticed that a cortexerror.log file appears in my home folder. Its content is:
Error [cortex] [time 10/10/13 9:25:50] [unreadable bytes]
1000000: fluent(CX_Primitive_Error+0x182) [0x4e12c2]
1000000: fluent(CX_Interrupt+0xa6) [0x4e2486]
1000000: fluent(CX_Await_Client+0x74) [0x4e25b4]
1000000: fluent() [0x4f4bac]
1000000: fluent(eval+0x7cc) [0x5a576c]
1000000: fluent(eval+0x906) [0x5a58a6]
1000000: fluent() [0x5a6f4e]
1000000: fluent(eval+0x603) [0x5a55a3]
1000000: fluent() [0x5a6f4e]
1000000: fluent(eval+0x603) [0x5a55a3]
1000000: fluent(eval+0x906) [0x5a58a6]
1000000: fluent() [0x5a6d38]
1000000: fluent(eval_errprotect+0x32) [0x5a6dc2]
1000000: fluent(eval+0x2ef) [0x5a528f]
1000000: fluent(eval+0x7b9) [0x5a5759]
==================

When I specify parallel on the same machine (the master node, for example), it works. It does not work when I specify node01.

Any ideas why the MPI processes are not communicating?

sanjeetk January 8, 2021 03:34

Hi,
Did you solve this? I am also trying to run Fluent on an HPC server with the PBS Pro job scheduler.

