CFD Online Discussion Forums


pdp.aero July 24, 2021 21:06

Could not find NPOIN= keyword with different numbers of cores
 
Hi there,


I am running a bunch of optimization problems. In one of them, when I increase the number of cores from 24 to 72, I get this:


------------------- Geometry Preprocessing ( Zone 0 ) -------------------


Error in "void CSU2ASCIIMeshReaderFVM::ReadMetadata()":
-------------------------------------------------------------------------
Could not find NPOIN= keyword.
Check the SU2 ASCII file format.
------------------------------ Error Exit -------------------------------





SU2_DEF has no problem and generates the deformed mesh, but when I use the deformed mesh to start a new design iteration with a direct run, it seems to fail to read the mesh.


The deformed mesh is okay. I mean, inside the mesh file I see everything that should be there. NPOIN is there.



Interestingly, this issue appears when I increase the number of cores. With 12 and 24 there is no problem, but with 32, 36, 48 and 72 the above error pops up.


Is there any trick to run this problem with at least 72 cores?


I am using FADO + SU2 v7.1.1


Thanks,
Pay

pcg July 25, 2021 06:18

Hello,
What happens if you run SU2_DEF on 12-24 cores and CFD on 72?
I had similar issues in the past due to filesystem problems when writing ASCII files in parallel.
Is 24 a "magic" number? E.g. does the job start using more than one compute node?
Check if the size of the mesh files is approximately the same.
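
Beyond comparing file sizes, a quick consistency check on the mesh file itself can show whether it was written completely. The following is a minimal sketch, assuming the usual SU2 ASCII layout in which the `NPOIN=` line is followed by one coordinate line per point and `%` starts a comment; the function name `check_su2_mesh` is hypothetical and not part of SU2 or FADO.

```python
from pathlib import Path


def check_su2_mesh(path):
    """Compare the point count declared by NPOIN= with the coordinate lines found.

    Assumes the SU2 ASCII layout: 'NPOIN= <n>' followed by n coordinate lines,
    with the next 'KEYWORD=' line (or end of file) closing the section.
    """
    mesh = Path(path)
    lines = mesh.read_text().splitlines()
    declared = None
    found = 0
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("NPOIN="):
            # The first number after the keyword is the declared point count.
            declared = int(stripped.split("=", 1)[1].split()[0])
            for following in lines[i + 1:]:
                s = following.strip()
                if not s or s.startswith("%"):
                    continue  # skip blank lines and comments
                if "=" in s:
                    break  # next keyword section starts here
                found += 1
            break
    return {
        "size_bytes": mesh.stat().st_size,
        "declared": declared,
        "found": found,
        "ok": declared is not None and declared == found,
    }
```

Running this on the mesh before and after deformation, for each core count, would show whether the file is missing the keyword entirely or was only partially written.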

pdp.aero July 26, 2021 12:41


Hi Pedro,


Thanks for the reply.


I did what you suggested and ended up with a different kind of error: it got stuck writing the restart file. Then I switched all reads and writes to ASCII, including the Tecplot surface and volume outputs. It worked, but again only for 24 cores.



A separate question: was this issue solved in 7.1.1?
https://github.com/su2code/SU2/issues/971


If I switch between 24 and 72 cores within a design iteration (24 when I deform, 72 when I run the direct and discrete adjoint), I get this:


Error in "virtual void CPhysicalGeometry::SetBoundVolume()":
-------------------------------------------------------------------------
The surface element (1, 200) doesn't have an associated volume element
------------------------------ Error Exit -------------------------------



Which kind of doesn't make sense, because if I go to the exact same directory and submit a single CFD job on 72 cores with the same deformed mesh and the same config file, it runs without this error. The error appears only when I switch the number of cores in my FADO Python script, with the first deform on 24 cores followed by the CFD run on 72 cores.



Yes, 24 is the magic number :) It's the maximum number of cores I can request on a single node.


Yep, the size of the deformed mesh stays about the same:
15128038 --> without deformation, including FFD
then:
15128200 --> after first deformation


I'm thinking maybe I must submit two separate jobs, one for DEF and one for CFD or AD!


Cheers,
Pay

pdp.aero August 1, 2021 17:40


I just want to confirm that switching between 12 or 24 cores for DEF and 72 for CFD and AD worked with ASCII restart.


The last error I was getting was solved by deleting an unnecessary custom class in FADO's optimization script, which someone had defined to manipulate ExternalRun so that it skipped the first deform iteration for no reason.


Cheers,
Pay
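
For anyone wanting to try the ASCII restart route from a script, here is a minimal sketch of a helper that rewrites one option in the text of an SU2 config file. The helper name `set_cfg_option` is hypothetical. The option names in the usage lines (`READ_BINARY_RESTART`, and `RESTART_ASCII` inside `OUTPUT_FILES`) are what I recall from the SU2 v7 config template and should be checked against the template shipped with your version.

```python
import re


def set_cfg_option(cfg_text, key, value):
    """Set 'KEY= value' in SU2-style config text.

    An existing uncommented 'KEY= ...' line is replaced; otherwise the
    option is appended. Lines starting with '%' (comments) are left alone.
    """
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key}= {value}"
    if pattern.search(cfg_text):
        # A function replacement avoids backslash escapes being interpreted.
        return pattern.sub(lambda match: new_line, cfg_text)
    return cfg_text.rstrip("\n") + "\n" + new_line + "\n"


# Usage (option names are assumptions, see above):
example = "% restart settings\nREAD_BINARY_RESTART= YES\nMESH_FILENAME= mesh.su2\n"
example = set_cfg_option(example, "READ_BINARY_RESTART", "NO")
example = set_cfg_option(example, "OUTPUT_FILES", "(RESTART_ASCII)")
```

Any Tecplot or ParaView entries you need would go in the same `OUTPUT_FILES` list; their exact ASCII names differ between SU2 releases, so they are left out here.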

pdp.aero August 19, 2021 09:47



Also, an extra point on SU2_DEF.



It doesn't write the whole mesh when it runs on more cores than a single node provides. I just had to keep SU2_DEF and SU2_DOT_AD running in serial.
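
The resulting split (serial SU2_DEF and SU2_DOT_AD, parallel CFD and adjoint) can be sketched as a small command builder. This is only an illustration: the function name `su2_command`, the config file names, and the `mpirun -n` launcher are assumptions, and your cluster may need `srun` or a different launcher instead.

```python
import shlex


def su2_command(executable, config, n_cores=1, launcher="mpirun"):
    """Build the shell command for one SU2 step.

    With n_cores <= 1 the executable is called directly (serial run);
    otherwise it is wrapped in '<launcher> -n <n_cores>'.
    """
    config = shlex.quote(config)
    if n_cores <= 1:
        return f"{executable} {config}"
    return f"{launcher} -n {n_cores} {executable} {config}"


# Hypothetical config names; core counts follow the setup described above.
commands = {
    "deform": su2_command("SU2_DEF", "deform.cfg"),            # serial
    "direct": su2_command("SU2_CFD", "direct.cfg", 72),        # parallel
    "adjoint": su2_command("SU2_CFD_AD", "adjoint.cfg", 72),   # parallel
    "dot": su2_command("SU2_DOT_AD", "adjoint.cfg"),           # serial
}
```

Each string can then be passed to whatever launches the corresponding step in the optimization script.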


All times are GMT -4.