Diverging solution on HPC cluster
Hi,
I am running Reef3D on an HPC cluster, but ever since I updated Reef3D to the latest version, the solutions have been diverging (values exceeding the critical velocity). Interestingly, the same control files run normally on a 4-core PC. Any suggestions would be greatly appreciated! Thanks in advance. |
Thanks for the feedback. What type of simulation are you running?
|
Hello, sir. Thank you for the prompt response.
I am simulating open channel flow around a cylinder in a rectangular domain. The umax parameter is exponentially increasing with each time step, crossing a value of 500 m/s. |
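The symptom described above (umax growing roughly exponentially per time step and passing 500 m/s) can be checked automatically on a series of umax values. The following is only a minimal sketch: the function name `looks_divergent`, the `window` and `min_ratio` parameters, and the idea of collecting umax into a Python list are all assumptions for illustration, not part of Reef3D. The 500 m/s limit is simply the value mentioned in the post.

```python
import math


def looks_divergent(umax_history, limit=500.0, window=5, min_ratio=1.5):
    """Heuristic divergence check on a list of umax values (hypothetical helper).

    Returns True if the latest value is non-finite or above `limit`, or if
    umax grew by at least `min_ratio` on each of the last `window` steps.
    """
    if not umax_history:
        return False
    latest = umax_history[-1]
    if not math.isfinite(latest) or latest > limit:
        return True
    # Need window + 1 samples to form `window` consecutive growth ratios.
    recent = umax_history[-(window + 1):]
    if len(recent) < window + 1:
        return False
    return all(prev > 0 and cur / prev >= min_ratio
               for prev, cur in zip(recent, recent[1:]))
```

A steady run such as `[1.0, 1.1, 1.05, 1.08, 1.1, 1.09]` is not flagged, while a series that doubles every step is flagged before it reaches the hard limit.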
Hi Priyanka,
could you please share your input files so that we can take a closer look? |
Hi Arun
I am attaching the control files. |
When you are running the case on the laptop, do you use the same mesh width?
|
While running on the laptop, I have kept the same mesh width.
PS: All the functions (except the M10 parameter) are kept the same in both the control.txt and ctrl.txt files. |
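One way to confirm that the laptop and cluster copies of a control file really differ only in the processor-count entry is to compare them entry by entry. This is a rough sketch under assumptions: it treats every non-empty line as "letter, number, values" (as in `N 40 6` or `M 20 2`), and the helper names `parse_control` and `diff_controls` are made up for this example.

```python
def parse_control(text):
    """Map each (letter, number) key, e.g. ("N", "40"), to its list of value tuples."""
    entries = {}
    for raw in text.splitlines():
        parts = raw.split()
        if len(parts) < 2:
            continue  # skip blank or incomplete lines
        key = (parts[0], parts[1])
        entries.setdefault(key, []).append(tuple(parts[2:]))
    return entries


def diff_controls(text_a, text_b, ignore=(("M", "10"),)):
    """Return {key: (values_in_a, values_in_b)} for keys whose values differ.

    Keys listed in `ignore` (by default M 10) are left out of the comparison.
    """
    a, b = parse_control(text_a), parse_control(text_b)
    keys = sorted((set(a) | set(b)) - set(ignore))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

Reading both files with `open(path).read()` and passing the contents to `diff_controls` should return an empty dictionary if only M 10 differs.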
I am not sure why it works locally and fails on the cluster, but the solution is most probably to avoid using N 40 6. This crashes cases for me locally as well with open channel flow.
Use N 40 3 and the problem should be solved. |
Yes, Arun is correct, N 40 6 can be less stable than N 40 3. But this should be independent of the processor count. We are currently looking into the matter.
Another idea: maybe you did not run make clean on the previous build on your HPC cluster before you compiled the newest version? Could you give it a go: run make clean and then compile again? |
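The clean-then-rebuild step suggested above can be scripted so it is done the same way every time on the cluster. This is a sketch only: it assumes the code is built by running `make` in the source directory, and the function name `clean_rebuild`, the `jobs` parameter, and the example path are hypothetical.

```python
import subprocess


def clean_rebuild(source_dir, jobs=4, run=subprocess.run):
    """Run `make clean`, then `make -j<jobs>`, inside source_dir.

    `run` is injectable so the sequence can be checked without building.
    Returns the list of commands that were issued.
    """
    commands = [["make", "clean"], ["make", f"-j{jobs}"]]
    for cmd in commands:
        # check=True raises CalledProcessError if a step fails,
        # so a failed clean never leads to a stale build.
        run(cmd, cwd=source_dir, check=True)
    return commands
```

For example, `clean_rebuild("/path/to/source")` would remove the old object files before compiling, where the path is a placeholder for wherever the source was unpacked on the cluster.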
A small update. With the latest version of the model, N 40 6 seems to work fine (but I would still advise caution while using it).
I do notice, though, that the cylinder in your domain has about the same diameter as the grid size. I would advise using a stretched grid to include more cells around the structure. And finally, something that might just solve the cluster problem: add M 20 2 to control.txt. (Since I can't really reproduce the error, I am limited to looking at all possible options, one by one.) Did you see a change after making a fresh compilation of the latest version? |
We have tested your files with 24 cores. Everything works, so I think it is a compilation problem. As I wrote before:
maybe you did not run make clean on the previous build on your HPC cluster before you compiled the newest version? Could you give it a go: run make clean and then compile again? |