CFD Online (www.cfd-online.com)
Forums > OpenFOAM

Can I modify runTimeWrite() to output results into one folder?


#1 | October 2, 2010, 01:54
Ovie Doro (ovie), Member, Joined: Jul 2009, Posts: 99
Hi,

I am running computations on a cluster using over 100 processors. However, my computations always get terminated because I exceed the number of files permitted in the work directory: the results are duplicated across all the processors, with a new set of files created at every output time step. My question: is there any way I can configure runTimeWrite() to write the results for each particular variable at every time step into the same file?

What I mean is: instead of having a folder for each time step, is it possible to have one folder (say, results) with one file per output variable, where the results at each time step are written into that same file, separated by white space or any delimiter of choice?

Or is there some other way to get around this?

Thanks.

#2 | October 2, 2010, 08:33
Bruno Santos (wyldckat), Super Moderator, Lisbon, Portugal
Greetings Ovie,

There is a variable in the controlDict file that controls the removal of previous times:
Code:
purgeWrite      0;
Change it to 1 and only the latest time snapshot will be kept.

Additionally, you might want to increase the value of writeInterval, but I guess you've already tried that.
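Taken together, the relevant write-control entries in system/controlDict look something like this (the values are only illustrative):

```
writeControl     timeStep;   // or runTime, adjustableRunTime, ...
writeInterval    500;        // write less often
purgeWrite       1;          // keep only the most recent time directory
writeCompression compressed; // gzip each written field file
```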

Best regards,
Bruno

#3 | October 2, 2010, 15:47
Ovie Doro (ovie)
Thanks Bruno.

I am aware of the different write options available in controlDict. However, I was hoping I could modify the write() function in regIOobject so I could write just ONE GIANT FILE for each variable and then transfer it to a directory or cluster where I don't have file size/number restrictions for post-processing. In any case, I have looked at the code and it's pretty daunting to pull this off.

I will stick to the purgeWrite option and write only the last few time steps. Hopefully I will be able to make meaningful animations out of the results.

Thanks for your reply.

#4 | October 2, 2010, 17:00
Bruno Santos (wyldckat)
Hi Ovie,

Sorry about that. I should have paid more attention to what you wrote.

Is FUSE (Filesystem in Userspace) an option on the cluster? You would mount a single file to be used as a file system. This would allow you to keep everything in one file while still using a normal OpenFOAM... although the cluster quota management system might see through the FUSE layer and impose the same file limit anyway.

There is also the possible solution of using an embedded file system, but it's not very practical either. If you do wish to try it, I believe that looking here (regIOobjectWrite) would be a good starting point for figuring out how to replace the normal system with the embedded one.

But if I were in your shoes, I would simply aim to find a good place to launch a shell script from within the solver (somewhere close to, or inside, "runTime.write()"), and do so only on the first processor ($FOAM_APP/test/parallel is a good reference for finding out which is the first processor). The shell script would then do the grunt work: package the most recent time snapshot from all processors into a single tar file and remove that snapshot afterwards. Compression would be activated in controlDict, not in the tar step.
When the simulation is complete, you would copy the case folder back to your personal system and unpack all the tar files in one go, so that the files all land in the right places on their own.
The downside of this method is that you might still hit the file limit after a while. To handle that, you could run a timed script that pulls files from the cluster every once in a while.

If you still want each variable stuffed into a single file, that will require some serious overhauling of the way OpenFOAM handles each file.

Best regards,
Bruno

#5 | October 2, 2010, 20:06
Ovie Doro (ovie)
Greetings Bruno,

Quote:
Originally Posted by wyldckat View Post
Hi Ovie,

Sorry about that. I should have paid more attention to what you wrote.
No worries..

Your ideas really sound fascinating and well thought out. Most of the suggestions are completely new to me, though (like the FUSE and embedded file system options). There are too many restrictions on user-installed software on the cluster, so I don't fancy my chances very much with those options if they are not already available. In any case, I will try to confirm whether either of them can be used.

Quote:
Originally Posted by wyldckat View Post
But, if I were in your shoes, I would simply aim for finding a good place to launch a shell script from within the solver (somewhere close or inside "runTime.write()"), and do so only on the first processor ($FOAM_APP/test/parallel is a good reference for knowing who's the first processor). The shell script would then take care of doing some grunt work and package the most recent time snapshot from all processors into a single tar file and remove that time snapshot after that - the option for compression would be actived in controlDict and not in the tar filling part.
When the simulation would be complete, you would get the case folder back into your personal system and then unpack all tar files in a single blow, in such a manner that the files would all go into the desired places on their own
The downside to this method is that you might hit the file limit nonetheless after a while. But for that you could run a timed script that would pull files from the cluster every once in a while.
I am really interested in this option and will try to explore it further, as it seems the most practical at the moment. If you have any specific ideas on how this could be implemented, be sure I will grab them with both hands. I really appreciate your assistance.

Thanks for your responses..

#6 | October 2, 2010, 20:58
Bruno Santos (wyldckat)
Hi Ovie,

Yes, for FUSE you'll have to ask your cluster administrator whether it is, or can be, installed. After that, if I'm not mistaken, you can build your own FUSE-based system (wikipedia link). But the simplest way would be to mount a file that is formatted as ext2/3 or zfs or whichever filesystem you desire (example on how to do this). The available space would be limited to that single fixed-size file, for example 500 MB, but you could stuff in as much as fits.
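A minimal sketch of that mount-a-file approach (the file name, size and mount point are my own illustrative choices, not anything OpenFOAM-specific):

```shell
#!/bin/sh
# Keep a whole case inside one file by formatting that file as a
# filesystem and loop-mounting it.

dd if=/dev/zero of=casefs.img bs=1M count=16 2>/dev/null  # 16 MB container file

# format the file as ext3 (needs e2fsprogs installed)
command -v mkfs.ext3 >/dev/null && mkfs.ext3 -F -q casefs.img

# Mounting the formatted file needs root (or help from the cluster admin):
#   mkdir -p casemnt
#   sudo mount -o loop casefs.img casemnt
#   ... run the case inside casemnt: the quota system now sees only
#   the single file casefs.img ...
#   sudo umount casemnt
```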

As for the embedded file system, it's a library that you would have to build yourself, and you would then change OpenFOAM's code to plug into the new library. A bunch of OpenFOAM's core code would need to be changed, but this way the modified OpenFOAM would always work with a single file per case. And it shouldn't need administrative permissions to do so.

As for the quickest implementation (aside from the FUSE/mount one), what I have in mind is:
  1. Take, for instance, the solver simpleFoam. It has a while loop that goes over each time iteration. Inside that loop we have "runTime.write()", which handles saving the files.
  2. Now we add new code on the line after that one, something like this:
    Code:
    if (Pstream::parRun() && Pstream::master())
    {
        system("/bin/sh ./timepacker");
    }
    This makes the solver ask the system to run the script using sh. Here is the information about the C/C++ code: Pstream and system call.
  3. Try to build the modified solver. It might be missing some header files, which should be indicated in the previous two links for Pstream and the system call.
  4. With the solver rebuilt, create a test script timepacker in a case folder and run simpleFoam in both single and parallel modes. The test script can, for example, run this:
    Code:
    #!/bin/sh
    find . >> testscript.log
    Don't forget to do:
    Code:
    chmod +x timepacker
  5. When it's confirmed to work locally, try the same case on the cluster, just to check whether the same command does the trick. It might be necessary to build the command string in the solver, something like (this is just a pseudo-code example):
    Code:
    string str = "/bin/sh " + caseFolder + "/timepacker";
    system(str.c_str());
  6. Once it works, write the actual code for the timepacker script. Test it locally, with the normal or the modified solver, while it's running, to see if the script behaves properly. Using the normal solver lets you work on the script and test it right away while the solver runs in parallel.
  7. NOTES:
    • With this script, whether during tests or during the real execution, do not use the purgeWrite option in controlDict, just in case something goes wrong in the meantime. It shouldn't, but it's possible: for example, if there is a nifty super-cache solution that lets the tar command finish in the background while control returns to the solver.
    • As I said in the previous post, use the compress option in controlDict and run tar without compression in the timepacker script. Also, the script should be the one to remove the time snapshots after it's done packing them.
  8. Once all is working as intended, do some math and see whether you will also need a local script (or one on the cluster, called from within the timepacker script) that automatically moves data from the cluster to the local machine every xxx MB, xx time snapshots, or elapsed hour, so you won't run dry on your cluster space quota.
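A bare-bones sketch of what the timepacker script from steps 4 to 6 could look like (the directory layout and file naming are assumptions, and this has not been tested on a real cluster):

```shell
#!/bin/sh
# timepacker: pack the most recent saved time snapshot from all
# processor* directories into one tar file, then delete the originals.
# Assumes it runs from the case directory of a decomposed case.

pack_latest_snapshot()
{
    # newest time directory in processor0 (time names start with a digit)
    latest=$(ls processor0 2>/dev/null | grep '^[0-9]' | sort -g | tail -n 1)
    [ -z "$latest" ] && return 0

    # pack that snapshot from every processor into one tar file;
    # no compression here, assuming writeCompression is on in controlDict
    tar -cf "snapshot_${latest}.tar" processor*/"$latest" || return 1

    # remove the packed time directories to stay under the file quota
    rm -rf processor*/"$latest"
}

pack_latest_snapshot
```

Note that it packs the newest time directory it finds; in practice you may prefer to pack the second-newest, in case the solver is still writing the latest one.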
Well... if this is the quickest solution, just imagine how long the embedded solution would take to implement. That mount option would really sweeten the deal.

Either of these solutions is enticing to implement, but time is of the essence.

Good luck! Keep us posted. If you have any questions, feel free to ask!

Best regards,
Bruno

#7 | October 2, 2010, 21:18
Ovie Doro (ovie)
Waoo!!...

You are very kind indeed! Thanks so much for the help.

But right now I am grappling with validating results from my phase-change simulation, on which I am to give a presentation in a few days. I will take a look at your very detailed work steps afterwards. But be sure that this dude is very thankful for your help.

NB
Just on the side (and please don't bother if you don't have to): do you have any ideas on how to implement 1-D simulations in OpenFOAM? I mean, does it make sense to solve pEqn.H for 1-D simulations? I am trying to simulate 1-D Stefan problems and I can't get my mind around solving continuity in 1-D.

Just asking.

Thanks again..

#8 | October 2, 2010, 23:01
Bruno Santos (wyldckat)
Hi Ovie,

Quote:
Originally Posted by ovie View Post
You are very kind indeed! Thanks so much for the help.
You're welcome. I'm interested in the subject of embedded file systems, but I don't have the time to get my hands dirty on it. I still like spending some time thinking about it once in a while, though.

Quote:
Originally Posted by ovie View Post
Just on the side (and please dont bother if you dont have to) do you have any ideas on how to implement 1-D simulations in OpenFOAM? I mean, does it make sense to solve the pEqn.H for 1-D simulations? I am trying to simulate 1-D stefan problems and I cant get my mind around solving continuity in 1-D.
I have barely any experience on the subject, but after reading up a little I do have a theory: use 2D, 3 x N cells, 1-D along N, and on the sides use cyclic patches instead of empty patches. In other words, you collapse 3D into 2D by using empty patches, and collapse 2D into 1D by using cyclic patches. This way you would emulate an infinite flat plate, reducing the influence of the sides on the flow.
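As a rough, untested blockMeshDict sketch of that idea (1-D along x with a single cell in y and z; the vertex and face numbering follow the cavity tutorial's ordering, and the patch names are my own):

```
vertices
(
    (0 0 0)   (1 0 0)   (1 0.1 0)   (0 0.1 0)
    (0 0 0.1) (1 0 0.1) (1 0.1 0.1) (0 0.1 0.1)
);

blocks
(
    hex (0 1 2 3 4 5 6 7) (100 1 1) simpleGrading (1 1 1)
);

patches
(
    patch inlet  ( (0 4 7 3) )
    patch outlet ( (2 6 5 1) )
    cyclic sides ( (1 5 4 0) (3 7 6 2) )        // collapses y
    empty frontAndBack ( (0 3 2 1) (4 5 6 7) )  // collapses z
);
```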

Good luck! Best regards,
Bruno

#9 | October 3, 2010, 10:58
David Gaden (marupio), Senior Member, Winnipeg, Canada
If you are comfortable writing OpenFOAM classes, you could write a class that extends regIOobject and give it all your output fields as member variables. In its writeData() function, you can open a file in constant, named after the time name, and call writeData() on each of the fields to write them to that file. Lastly, set purgeWrite to 1, as you don't need the duplicate output going to case/timeName.

Just an idea.

#10 | October 3, 2010, 15:45
Ovie Doro (ovie)
Quote:
Originally Posted by marupio View Post
If you are comfortable writing OpenFOAM classes, you could write a class that extends regIOobject, and give it all your output fields as member variables. For its writeData() function, you can open a file in constant named after the time name, and use writeData() for each of the fields to this file. Lastly, set purgeWrite to 1, as you don't need the duplicate output going to case/timeName.

Just an idea.
Thanks, Marupio, for the nice idea. Truth is, I have reviewed the code for the regIOobject class and it looks pretty complicated to me, even though I have written several OF class extensions before. If you could kindly elaborate with some ideas and work steps, that would be really nice. Wyldckat already gave some insights which I find very helpful. One question, though: would the number of files multiply with each time step, or would the output for each variable be written to the same file?

Quote:
Originally Posted by wyldckat
I have barely any experience on the subject, but I do have a theory, after reading up a little bit: use 2D, 3 x N cells, 1-D along N and on the sides use cyclic patches instead of empty patches. In other words, you collapse 3D into 2D by using empty patches and collapse 2D to 1D by using cyclic patches. This way you would emulate an infinite flat plate, thus reducing influence from the sides of the flow.
Greetings Bruno, and thanks for the suggestion. I have started a thread on 1-D OF simulations, and hopefully people who have managed to do this will provide some insight.

Thanks guys for the responses..

#11 | October 3, 2010, 17:17
David Gaden (marupio)
Quote:
Originally Posted by ovie View Post
Thanks Marupio for the nice idea. Truth is I have reviewed the code for the regIOobject class and it looks pretty complicated for me even though I have written several OF class extensions before. Maybe if you could kindly elaborate with some ideas and work steps then it would be really nice. Wyldckat already gave some insights which I find very helpful. One question though, would the number of files multiply with each time step or would the output for each variable be written to the same file?
You don't need to change regIOobject; just make a class that inherits from regIOobject. Any of the functions in regIOobject.H that are preceded by "virtual" can be overridden by your class, and the only one you really need is writeData().

I envision changing the way the solver stores its data. Rather than creating global-scope fields (e.g. volScalarField, volVectorField) in createFields.H, you create your custom object ioDataCompressor (or whatever you want to name it), e.g.:

Code:
    ioDataCompressor myData(runTime, mesh);
(I'm just guessing at what you will need in the constructor - runTime and mesh)

Then instead of
Code:
    solve(fvm::ddt(U) + ...)
you'd have something like:
Code:
    solve(fvm::ddt(myData.U()) + ...)
The key to getting this to work will be the ioDataCompressor::writeData() function. In this function, you open the file wherever you want and dump all the data into it. Something like:

Code:
OFstream os(fileName);
os << "// U field" << endl;
U.writeData(os);
os << "// p field" << endl;
p.writeData(os);
os << "// T field" << endl;
T.writeData(os);
etc. The number of files it produces depends on how you write this function. You could continuously append all the data to the same file, adding os << "Time = " << runTime.timeName() << endl; between snapshots, or you could have one file for each field... it's up to you.

I don't know what you intend to do with the data once it's done. If you output it in this manner you will have all the data, but the standard post-processors won't work. You'd either have to write a small application that converts the data back to what the post-processors expect, or use your own data-processing strategy, such as MATLAB scripts or something like that.

#12 | October 3, 2010, 17:21
David Gaden (marupio)
It might be easier to have some function that copies all your case/timeName directories into a zip file and deletes them periodically.

#13 | January 22, 2011, 20:52
Bruno Santos (wyldckat)
Greetings to all!

I'm reviving this thread simply to post a somewhat important update for future readers:
There is no need to modify the solver itself in order to run scripts whenever a time snapshot is saved!

Simply read this wiki page I recently added to openfoamwiki.net: Tip Function Object systemCall
It allows you to make system calls to shell scripts at the moments described on the wiki page. This is already built into OpenFOAM and simply not documented well enough; I haven't tracked down when it was added...
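For example, adding something like the following to the functions section of controlDict runs a script every time results are written (the entry name and script name are my own; see the wiki page for the exact details):

```
functions
{
    packTimes
    {
        type               systemCall;
        functionObjectLibs ("libsystemCall.so");
        executeCalls       ();
        writeCalls         ( "./timepacker" );
        endCalls           ();
    }
}
```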

Best regards,
Bruno
