CFD Online Discussion Forums

Ryan. July 17, 2017 16:39

Problem running Fluent in parallel
 
Hi,

I am trying to run a very simple case in parallel to make sure it works before running my actual simulation. The journal file only does a few iterations. When I run it in serial using "fluent 2ddp -g -i jet2.jou > ${OUTFILE}" in my script, I get the solution in a few seconds. However, when I run it in parallel using "fluent 2ddp -t2 -pnmpi cnf=$PBS_NODEFILE -g -i jet2.jou > ${OUTFILE}", it writes the message below to the log file, and the job then keeps running until it reaches the wall-time limit without solving the problem or reporting any errors. I would appreciate any help finding the problem. I am running this simulation in the background on a Linux cluster.

Quote:

Host spawning Node 0 on machine "l1439" (unix).
/apps/Ansys/15.0.7/v150/fluent/fluent15.0.7/bin/fluent -r15.0.7 2ddp -flux -node -alnamd64 -t24 -pethernet -mpi=pcmpi -mport 10.188.11.229:10.188.11.229:42773:0
Starting /apps/Ansys/15.0.7/v150/fluent/fluent15.0.7/multiport/mpi/lnamd64/pcmpi/bin/mpirun -e MPI_USE_MALLOPT_MMAP_MAX=0 -np 2 /apps/Ansys/15.0.7/v150/fluent/fluent15.0.7/lnamd64/2ddp_node/fluent_mpi.15.0.7 node -mpiw pcmpi -pic ethernet -mport 10.188.11.229:10.188.11.229:42773:0
Thanks in advance.

LuckyTran July 22, 2017 01:47

Quote:

Originally Posted by Ryan. (Post 657423)
Host spawning Node 0 on machine "l1439" (unix).
/apps/Ansys/15.0.7/v150/fluent/fluent15.0.7/bin/fluent -r15.0.7 2ddp -flux -node -alnamd64 -t24 -pethernet -mpi=pcmpi -mport 10.188.11.229:10.188.11.229:42773:0
Starting /apps/Ansys/15.0.7/v150/fluent/fluent15.0.7/multiport/mpi/lnamd64/pcmpi/bin/mpirun -e MPI_USE_MALLOPT_MMAP_MAX=0 -np 2 /apps/Ansys/15.0.7/v150/fluent/fluent15.0.7/lnamd64/2ddp_node/fluent_mpi.15.0.7 node -mpiw pcmpi -pic ethernet -mport 10.188.11.229:10.188.11.229:42773:0

After this block, Fluent should print the list of nodes that it's running on, and then it usually creates a cleanup script. If you don't see any text after this block, Fluent was not able to start any node processes, and that's why it sits there and does nothing.
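The check described above (is there any output after the mpirun launch line?) can be scripted. This is only a minimal sketch: the function name `fluent_log_status` and its three status strings are made up for illustration, and it assumes the launch line always begins with "Starting" and contains "mpirun", as it does in the quoted log.

```shell
# Hypothetical helper (not part of Fluent): classify a Fluent log file by
# whether anything was printed after the "Starting ... mpirun" launch line.
fluent_log_status() {
    local log="$1"
    awk '
        # Remember the launch line and reset the count of lines seen after it.
        /^Starting .*mpirun/ { seen = 1; n = 0; next }
        # Count non-blank lines that follow the launch line.
        seen && NF { n++ }
        END {
            if (!seen)      print "no-mpirun-line"   # e.g. a serial run
            else if (n > 0) print "started"          # something followed the launch
            else            print "stalled"          # nothing after the launch line
        }
    ' "$log"
}
```

For example, calling `fluent_log_status "${OUTFILE}"` a minute or so after the job starts would print `stalled` for a log that ends at the mpirun line, like the one quoted above.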

If you can run on your cluster in serial but not in parallel, then the problem is in the MPI setup on the cluster. That's because MPI itself is what starts the parallel Fluent processes on each node.
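One way to start narrowing down an MPI launch problem is to check that every host listed in the PBS node file can be reached non-interactively. The sketch below is an assumption-heavy illustration, not a diagnosis of this case: it assumes the MPI launcher reaches the nodes over ssh (which may not be true on a given cluster), and the function name `check_nodes` and the `REMOTE_SH` override are hypothetical.

```shell
# Hypothetical helper: try to run `hostname` on each unique host in a node file.
# REMOTE_SH is an assumed override for the remote-shell command; the default
# uses ssh in batch mode so that it fails instead of prompting for a password.
check_nodes() {
    local nodefile="$1" host failed=0
    local remote_sh="${REMOTE_SH:-ssh -o BatchMode=yes -o ConnectTimeout=5}"
    for host in $(sort -u "$nodefile"); do
        # $remote_sh is left unquoted on purpose so its options split into words.
        if $remote_sh "$host" hostname >/dev/null 2>&1 </dev/null; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Non-zero exit status if any host could not be reached.
    return "$failed"
}
```

Inside a PBS job this could be run as `check_nodes "$PBS_NODEFILE"` before launching Fluent. A `FAIL` line only shows that the host was not reachable this way; it does not by itself prove that this is why the parallel run hangs.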

Ryan. August 11, 2017 13:09

Quote:

Originally Posted by LuckyTran (Post 658009)
After this block, Fluent should print the list of nodes that it's running on, and then it usually creates a cleanup script. If you don't see any text after this block, Fluent was not able to start any node processes, and that's why it sits there and does nothing.

If you can run on your cluster in serial but not in parallel, then the problem is in the MPI setup on the cluster. That's because MPI itself is what starts the parallel Fluent processes on each node.


Thanks a lot. The problem was exactly as you described. I'm now using a different version and it works.

