fluent issue in windows hpc pack 2012 r2

Old   January 6, 2019, 07:31
Default fluent issue in windows hpc pack 2012 r2
  #1
New Member
 
moses
Join Date: Jan 2019
Posts: 12
Hello,

I have created an HPC cluster using Microsoft HPC Pack 2012 R2 and I'm trying to run Fluent on it.
I followed the post-installation tasks described at this link:


http://www.sharcnet.ca/Software/Ansy..._setupdan.html

The cluster has 2 nodes (it is a test cluster).
When I try to run Fluent in parallel mode I get a warning (pictures attached).
I also attached pictures of my Fluent Launcher configuration.
The problem is that the job fails after a little while and the Fluent GUI gets stuck.
I have ANSYS 18.2 installed on both nodes.
Any ideas what I'm doing wrong?


Update: if I run on one node only and choose the head node, it works, so the issue has to be with my compute node.
I also noticed this warning: FLUENT_INC=C:/PROGRA~1/ANSYSI~1/v182/fluent is not a shared directory


Update 2: running only on the compute node also succeeds.
When I try to run on both machines the job does not fail, but the GUI is stuck.


fluent console output:
Host spawning Node 0 on machine "WIN-6CQO3PDKEA1" (win64).

***
*** FLUENT_INC=C:/PROGRA~1/ANSYSI~1/v182/fluent is not a shared directory! Fluent may not work properly in parallel across a network!
***
Job has been submitted. ID: 3013.
Waiting for CCP scheduler@WIN-6CQO3PDKEA1 to start msmpi nodes ...
Job 3013 is Running.
Attached Images
File Type: png fluent 1.png (26.1 KB, 22 views)
File Type: png fluent 2.png (23.5 KB, 19 views)
File Type: png fluent 3.png (24.3 KB, 15 views)
File Type: png fluent 4.png (24.6 KB, 15 views)
File Type: png fluent 6.png (6.6 KB, 16 views)

Last edited by mosesHPC; January 6, 2019 at 09:58.

Old   January 6, 2019, 13:36
Default
  #2
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,674
Check out this thread

Old   January 6, 2019, 15:29
Default
  #3
New Member
 
moses
Join Date: Jan 2019
Posts: 12
Thanks for the reply.
I have already read that thread.
My goal is to set this up with Microsoft HPC Pack, since we will reuse the cluster for other applications later on.
Also, my cluster has AMD processors and that thread uses Intel MPI.

Old   January 7, 2019, 07:39
Default
  #4
New Member
 
moses
Join Date: Jan 2019
Posts: 12
Quote:
Originally Posted by LuckyTran View Post
Check out this thread
I tried this on my systems, which have AMD Ryzen Threadripper processors.
When installing Intel MPI I got a warning that my CPUs don't have Intel architecture (obviously),

but it installed anyway.
When I run ANSYS in parallel mode, this error occurs:
"cannot connect from [WINDOWS-HOSTNAME] to IP ADDRESS"
The error appears to come from mpiexec.exe.
I think the issue is my CPU.
Any ideas?

Old   January 7, 2019, 11:21
Default
  #5
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,674
You don't have to use Intel MPI if you don't want to or can't. The important point is that your workers need to be able to connect to the host machine and vice versa.


Quote:
Originally Posted by mosesHPC View Post
also i noticed this warning: FLUENT_INC=C:/PROGRA~1/ANSYSI~1/v182/fluent is not a shared directory
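
One way to test the "workers can connect to the host and vice versa" requirement beyond ping is a plain TCP connection attempt from each machine to the other. This is only a minimal sketch, assuming Python is available on the nodes. The function name `can_connect` is hypothetical, and port 445 (the standard SMB file-sharing port) is used purely as an example, since the ports the MPI services use are not stated in this thread.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example call (hypothetical node name; run it from each machine in turn):
# print(can_connect("COMPUTE-NODE", 445))
```

A successful ping only shows that ICMP gets through, so a check like this, run in both directions, may narrow down whether a specific TCP port is blocked or not listening.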

Old   January 7, 2019, 12:10
Default
  #6
New Member
 
moses
Join Date: Jan 2019
Posts: 12
Quote:
Originally Posted by LuckyTran View Post
You don't have to use Intel MPI if you don't want to or can't. The important point is that your workers need to be able to connect to the host machine and vice versa.
What alternatives do you suggest?
My nodes can ping the host and have access to the shared directory, and I have disabled the firewalls on all systems.

I did a bit more digging, and it looks like msmpi.exe cannot access the service on the main node or on any other node. It seems a certain service is not running or is refusing connections, but I couldn't find the service that the error message mentioned on any of the nodes.


Is it possible to get guides on running Fluent on other cluster stacks such as Rocks or OpenHPC, or even on running it distributed on Linux?
I appreciate the help.

Old   January 7, 2019, 14:37
Default
  #7
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,674
I'm assuming you have already entered your Windows username and password into the dialog box at some point.

And you no longer get this warning? Otherwise that means they are not being found!
Quote:
Originally Posted by mosesHPC View Post
also i noticed this warning: FLUENT_INC=C:/PROGRA~1/ANSYSI~1/v182/fluent is not a shared directory

From the head node, can you launch Fluent running only on the compute node? From the compute node, can you launch Fluent running only on the head node? I'm guessing neither of these will work.

Wildcard attempt at a fix: try selecting default for the MPI type instead of choosing msmpi.

Last edited by LuckyTran; January 7, 2019 at 18:18.

Old   January 7, 2019, 17:45
Default
  #8
New Member
 
moses
Join Date: Jan 2019
Posts: 12
I can launch Fluent on the head node alone.
I can also launch on the compute node alone (I bring down the head node and launch with one node, which results in the job being assigned to the compute node), but when I launch with both it just hangs.
It asks for a username and password after launch, which I entered.
If I don't use the HPC cluster, how can I run distributed ANSYS on systems with AMD processors (Ryzen type)?

Old   January 7, 2019, 18:21
Default
  #9
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,674
The fact that you have AMD processors is not important. Intel MPI should also have run on your systems, but I can see why you would want to use msmpi in order to use the job scheduler. It's purely an MPI installation problem: whether each machine has an open connection to the others, and whether they have sufficient privileges to access the directory containing the Fluent binaries.

Old   January 8, 2019, 00:48
Default
  #10
New Member
 
moses
Join Date: Jan 2019
Posts: 12
I have attached the console output from running Fluent on the different nodes.
When I run on both nodes the output seems a bit odd.
When I choose both nodes it hangs.
I don't know why I'm getting the error that the Fluent directory is not shared even though I have shared it (picture attached).


Any ideas what is causing it?
Could it be a licensing issue?
Attached Images
File Type: png fluent 5.png (6.7 KB, 9 views)
File Type: png fluent 7.png (73.0 KB, 15 views)
Attached Files
File Type: txt fluent output.txt (1.5 KB, 4 views)

Old   January 8, 2019, 10:03
Default
  #11
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,674
You have a very common issue, but the solution is not always apparent.

So I see you have shared some directory.

Quote:
Originally Posted by ghost82 View Post
Test usernames/password

On both machines 1 and 2 share a directory, for example the C:\ directory
Go to Start->Computer
Right click on C:\ then click on properties
Click on sharing tab->advanced sharing
Check Share this folder, click apply and click on permissions
Highlight Everyone in users and groups and assign full control (all checks under "Allow")
Click apply, ok, ok

From machine 1 go to Start->Computer and click on network on the left to see machine 2
Double click on it and access the shared folder C:\ on machine 2
You will be prompted for a username and a password
Type the windows username and password and see if you can access the shared folder

From machine 2, do the same
But can you do this last part? That is, can you actually visit the directory and create a file in it? Sharing locations in Windows is very buggy. Sharing anything under Program Files is even more awkward because Program Files is a restricted directory.

Daniele also mentioned some interesting things needed to be done to access the Fluent binaries.
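
The "visit the directory and create a file in it" check can also be scripted so it is easy to repeat from each machine. This is a minimal sketch, assuming Python is installed on the nodes. The function name `can_write` and the example share path are hypothetical.

```python
import os
import uuid


def can_write(directory: str) -> bool:
    """Try to create and then delete a small probe file inside directory."""
    # A random file name avoids clobbering anything that already exists.
    probe = os.path.join(directory, f"probe-{uuid.uuid4().hex}.tmp")
    try:
        with open(probe, "w") as handle:
            handle.write("ok")
        os.remove(probe)
        return True
    except OSError:
        # Missing path, access denied, unreachable share, and so on.
        return False


# Example call (hypothetical machine and share names):
# print(can_write(r"\\MACHINE2\C"))
```

Being able to browse a share does not guarantee write permission, which is why the sketch creates a real file instead of only listing the directory.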

Old   January 8, 2019, 10:29
Default
  #12
New Member
 
moses
Join Date: Jan 2019
Posts: 12
Quote:
Originally Posted by LuckyTran View Post
You have a very common issue, but the solution is not always apparent.

So I see you have shared some directory.


But can you do this last part? That is, can you actually visit the directory and create a file in it? Sharing locations in Windows is very buggy. Sharing anything under Program Files is even more awkward because Program Files is a restricted directory.

Daniele also mentioned some interesting things needed to be done to access the Fluent binaries.

I tried this part.
If I open Run and type \\IP_ADDRESS_OF_NODE, it opens the shared directory, and this works from both sides. However, if I open the Network section from My Computer, the other node is not shown (it works with the command but not through the GUI). Maybe that is the problem.
How can I resolve this?
As for the firewall, I have added a rule accepting all inbound connections on all nodes, and since I'm running Windows Server 2012 R2 I don't have Windows Defender to block anything.

Old   January 8, 2019, 11:48
Default
  #13
Senior Member
 
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,674
The trick of going to Run and using \\address\C$ almost always works because it is a hidden administrative share. All you need for this to work is administrative privileges (i.e. be logged into an admin account with the correct username and password) and certain remote services enabled.

If you can do everything in Run using \\address\blah\blah\blah but cannot access them directly in the network explorer, then... Unfortunately there's an entire list of services involved and any one of them can cause an error (ignore that it is a Win10 issue, because I've run into it all the way back to XP).

My guess is, if you try to use these, they will also not work.

Quote:
Originally Posted by ghost82 View Post
Remember to start fluent with these options:
- Working directory: \\workstation\path-to-fluent-working-directory
- Fluent Root Path: \\workstation\path-to\Ansys Inc\v192\fluent
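
To illustrate what those two options expect, a local path can be rewritten into its UNC form once the machine name and share name are known. This is a minimal sketch, assuming the share exposes the root of the drive the path lives on. The function name `to_unc` and the share name `C` are hypothetical, while the host name and local path are the ones shown earlier in this thread.

```python
import ntpath


def to_unc(host: str, share: str, local_path: str) -> str:
    """Rewrite a local Windows path as \\\\host\\share\\... (share = drive root)."""
    # splitdrive separates "C:" from the rest of the path.
    _drive, tail = ntpath.splitdrive(local_path)
    # Normalise forward slashes so the result uses backslashes throughout.
    return "\\\\" + host + "\\" + share + tail.replace("/", "\\")


print(to_unc("WIN-6CQO3PDKEA1", "C", "C:/PROGRA~1/ANSYSI~1/v182/fluent"))
# prints \\WIN-6CQO3PDKEA1\C\PROGRA~1\ANSYSI~1\v182\fluent
```

Whether Fluent accepts the resulting path still depends on the share permissions discussed above. The sketch only shows the path translation.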

Old   January 8, 2019, 12:21
Default
  #14
New Member
 
moses
Join Date: Jan 2019
Posts: 12
Quote:
Originally Posted by LuckyTran View Post
The trick of going to Run and using \\address\C$ almost always works because it is a hidden administrative share. All you need for this to work is administrative privileges (i.e. be logged into an admin account with the correct username and password) and certain remote services enabled.

If you can do everything in Run using \\address\blah\blah\blah but cannot access them directly in the network explorer, then... Unfortunately there's an entire list of services involved and any one of them can cause an error (ignore that it is a Win10 issue, because I've run into it all the way back to XP).

My guess is, if you try to use these, they will also not work.

I did try to run it with these options, and you are right: they do not work.
I will post a solution here if I find one.
I'm going to reinstall my cluster tomorrow.
Thank you for the help.

Old   January 9, 2019, 00:18
Default
  #15
New Member
 
moses
Join Date: Jan 2019
Posts: 12
I enabled some services, and the systems can now see each other on the network and access files, but the problem persists.
The services I enabled were these:


SSDP Discovery
Function Discovery Resource Publication
UPnP Device Host
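
The state of these three services can be checked on each node from a script. This is a minimal sketch, assuming Python is available and that the standard Windows `sc query` command is used. The short service names below are the usual ones for these display names, and the function names are hypothetical.

```python
import re
import subprocess

# Short service names (as used by `sc`) mapped to their display names.
SERVICES = {
    "SSDPSRV": "SSDP Discovery",
    "FDResPub": "Function Discovery Resource Publication",
    "upnphost": "UPnP Device Host",
}


def parse_state(sc_output: str) -> str:
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else "UNKNOWN"


def service_state(name: str) -> str:
    """Run `sc query <name>` on the local Windows machine and return its state."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_state(result.stdout)


# Example use on a Windows node:
# for short_name, display_name in SERVICES.items():
#     print(display_name, "->", service_state(short_name))
```

Running this on both nodes makes it easier to confirm that the same services are active everywhere.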


Neither HPC Pack nor distributed mode works for me.

Last edited by mosesHPC; January 9, 2019 at 01:26.

Old   January 9, 2019, 04:51
Default
  #16
New Member
 
moses
Join Date: Jan 2019
Posts: 12
I managed to get it working with distributed ANSYS and Intel MPI.
When I launch Fluent with both machines it works fine.
I opened a project using all the cores, but it gives me an error:


Ansys.Fluent.Cortex.CortexNotAvailableException: Exception of type 'Ansys.Fluent.Cortex.CortexNotAvailableException' was thrown.
at Ansys.Fluent.Data.SetupData.GetCommunicator(IReadLockContainer context)
at Ansys.Fluent.Data.SetupData.ReadCaseModelInfo(IFullContext context)
at Ansys.Fluent.Data.SetupData.ReadMeshAndModelInfo(IFullContext context)
at Ansys.Fluent.Data.SetupData.LoadFiles(IFullContext context)
at Ansys.Fluent.Commands.EditCommand.Execute(IFullContext context)
at Ansys.Core.Commands.Concurrency.CommandWorkUnit.executeInContext(CommandContext subContext, IExecutionEngineCallback tracer)
at Ansys.Core.Commands.Concurrency.BaseWorkUnit.doExecute(IExecutionEngineCallback executionEngine, CommandContext subContext)
at Ansys.Core.Commands.Concurrency.BaseWorkUnit.Execute(IExecutionEngineCallback executionEngine, Boolean dontCatchExceptions)
--- Ansys.Core.Commands.CommandFailedException: Exception of type 'Ansys.Fluent.Cortex.CortexNotAvailableException' was thrown.
CommandName: Fluent.Edit(Container="Setup")
at Ansys.Core.Commands.CommandAsyncResult.Wait(Int32 milliSecondsTimeout, Boolean exitContext)
at Ansys.Fluent.Commands.EditCommand.InvokeAndWait(IProtectedContext context, DataContainerReference Container, Boolean Interactive)
at Ansys.Fluent.Gui.OpenInFluentGui.Invoke(GuiOperationContext context)
at Ansys.UI.GuiOperationContext.Invoke(GuiOperationMetaData operationData)
at Ansys.UI.UIManager.InvokeOperationCore(String pseudoname, OperationDelegate callback, Boolean allowOSMessages, Boolean coreTransaction)


Is this an issue with the project itself or with my configuration?
The project loads fine when I open it on a single machine.

Old   January 9, 2019, 06:11
Default
  #17
New Member
 
moses
Join Date: Jan 2019
Posts: 12
It looks like one of the systems is running low on RAM (it only has 8 GB).
I can run the project with 15 cores but not with 16.

Old   January 12, 2019, 15:01
Default
  #18
New Member
 
moses
Join Date: Jan 2019
Posts: 12
OK, so I added some RAM and some other hardware,
but I noticed that I can only use up to 8 processes on the extra machines; otherwise I run into this error.
I will try to split those machines into multiple virtual machines to see whether I can use their full potential.
