CFD Online (www.cfd-online.com) — Forums > Hardware

Solid-state drive as a write buffer


Old   July 9, 2012, 17:11
Default Solid-state drive as a write buffer
  #1
JDR
New Member
 
Jonathan Regele
Join Date: Jul 2010
Posts: 6
Rep Power: 7
JDR is on a distinguished road
During long transient simulations where the entire simulation waits while a timestep is written to the hard drive, the write time can be considerable. Writing to a solid-state drive could speed up this step and reduce the overall run-time by a significant amount. However, SSDs are currently expensive, and I would like to try a less expensive solution.

What I would like to do is try to use a solid-state drive as an intermediate storage buffer that the code writes to before transferring the data to a traditional hard drive. Has anyone tried this before?

I'm currently looking at a new workstation and my rep said this was possible, but I haven't seen any indication that people have done this before. All opinions are welcome.

Thanks,
JDR

Old   July 10, 2012, 16:10
Default
  #2
Senior Member
 
Join Date: Mar 2009
Location: Austin, TX
Posts: 134
Rep Power: 9
kyle is on a distinguished road
I struggled with this too. Most solutions that use a fast drive as a cache for a slower drive are optimized for reading many small files quickly, which is exactly the opposite of what we care about. We want to write large files quickly.

In typical computing, writes are extremely important: if the power goes out or a machine crashes and recently written data is lost, you could lose a customer purchase or a record of a stock trade. Typical RAID and hybrid disk arrays have all kinds of mechanisms in place to ensure that if you write something and the power goes out or the computer crashes, that data will be preserved.

In CFD, we don't really care if we lose all the writes we did in the last 5 minutes. It only means that we lost 5 minutes of simulation time. Additionally, we know that our writes are going to be happening at somewhat consistent intervals (we won't be just saving data constantly for hours on end, there are breaks while the solver is iterating). What we want to be able to do is save data in 2GB-20GB bursts as quickly as possible.

The best solution I have found is to use a fileserver with at least 1.5x-2.0x as much RAM as the largest file you will ever save. You can get 32GB of RAM for less than $200. Then, you have to tell the kernel that you don't need it to be so paranoid about losing your writes. On Linux this is done by changing the value of "/proc/sys/vm/dirty_background_ratio" to something like 75, which lets dirty data fill up to 75% of your RAM before background writeback starts (the default is around 10). Now whenever you store a timestep, the solver can dump the results extremely quickly to the fileserver's RAM, and the fileserver can take its time writing out to disk while the solver continues on its way.
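For reference, a minimal sketch of what that kernel tuning looks like on a Linux fileserver (run as root). The 75 is the value from this post; the `vm.dirty_ratio` line is my addition — it is the companion hard limit at which writers start blocking, and you generally want it above the background ratio:

```shell
# Start background writeback only once dirty data reaches 75% of RAM
sysctl -w vm.dirty_background_ratio=75
# Hard limit at which writing processes are forced to block; keep it
# above the background ratio so the solver is never throttled mid-burst
sysctl -w vm.dirty_ratio=90
# Check how much dirty data is currently waiting to be flushed to disk
grep Dirty /proc/meminfo
# Add the same keys to /etc/sysctl.conf to make them survive a reboot
```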

Intel's "Smart Response Technology" is a piece of software for Windows that does exactly what you are asking about, but the speed of RAM is orders of magnitude faster than even SSDs. I think it only works on the Z68 chipset though (which is bogus, because it is just a software feature). There is also a young Linux project that does much the same thing, called bcache.

Old   July 10, 2012, 18:38
Default
  #3
JDR
New Member
 
Jonathan Regele
Join Date: Jul 2010
Posts: 6
Rep Power: 7
JDR is on a distinguished road
Kyle,

Thanks for your response. I'm glad others have been thinking about this as well.

Your solution seems interesting. Regular RAM would clearly be faster than an SSD, but I'm curious what type of configuration you're talking about. On a large cluster you have a fileserver, but what about just a workstation with the hard drive located internally? I suppose you could write to an external fileserver that had its own memory, but you would want to make sure you can still get SATA data speeds and not drop down to 1 Gbps Ethernet speeds. Is there a way to allocate memory to the hard drive as an I/O buffer?

I have heard a little bit about the whole hybrid drive concept, but based upon what you said, it seems unlikely that they contain the volume of memory required for our applications.

Old   July 10, 2012, 19:53
Default
  #4
Senior Member
 
Join Date: Mar 2009
Location: Austin, TX
Posts: 134
Rep Power: 9
kyle is on a distinguished road
I am using a small cluster with a dedicated fileserver. It is connected with 20Gb Infiniband, but even when I switched to just gig-e to test it, I was able to pretty much overwhelm the cheap hard drives whenever it saved.

You should be able to apply a similar technique to a single workstation setup, but the RAM requirements go up a lot because you have to store not only your transient history files, but also the simulation itself.
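On a single machine, the same "dump to RAM now, flush to disk later" effect can also be built into a solver's output path. A minimal sketch in Python (the class and file names are illustrative, not from any particular solver): the solver hands each completed timestep to an in-memory queue and keeps iterating, while a background thread drains the queue to disk.

```python
import os
import queue
import tempfile
import threading

class AsyncWriter:
    """Buffer timestep data in RAM and write it out on a background thread."""

    def __init__(self):
        self._q = queue.Queue()
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def save(self, path, data: bytes):
        # Returns immediately; the solver loop is never blocked on disk I/O.
        self._q.put((path, data))

    def _drain(self):
        while True:
            item = self._q.get()
            if item is None:      # sentinel: no more timesteps coming
                break
            path, data = item
            with open(path, "wb") as f:
                f.write(data)     # the slow disk write happens off-thread

    def close(self):
        # Queue the sentinel after any pending timesteps, then wait.
        self._q.put(None)
        self._thread.join()

# Usage: save one (dummy) timestep, then flush everything before exiting.
outdir = tempfile.mkdtemp()
writer = AsyncWriter()
writer.save(os.path.join(outdir, "step_0001.dat"), b"\x00" * 1024)
writer.close()
```

The trade-off is the same as with the kernel's write cache: anything still sitting in the queue when the machine dies is lost, which for CFD output usually just means redoing a few minutes of simulation.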

Old   July 13, 2014, 05:29
Question
  #5
Senior Member
 
Anna Tian's Avatar
 
Meimei Wang
Join Date: Jul 2012
Posts: 492
Rep Power: 7
Anna Tian is on a distinguished road
Quote:
Originally Posted by kyle View Post
I struggled with this too. Most solutions that use a fast drive for a cache for a slower drive are optimized for reading many small files quickly, which is exactly the opposite of what we care about. We want to write large files quickly.
[...]

I'm going to purchase new hardware for CFD, and I'm thinking about an SSD.

I noticed that it always takes a while to open or save a case file. If the case file is large, it takes more than 10 seconds, which breaks my concentration, and the accumulated waiting time is not small. We actually save and read files very frequently, because we don't want to lose our work and we need to run some tests to make sure the simulation won't diverge or that nothing else is wrong with it. My idea is to use a small SSD (e.g. 32 GB or 60 GB, not expensive) as the current working disk. Once it is full, I would move the finished simulation files to the usual disks. I also think that installing the CFD software on the SSD could further reduce the waiting time. Is this correct?
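The housekeeping step described here (sweeping finished cases off the small SSD onto the big disk) could be scripted roughly like this. The paths below are temporary stand-ins for an SSD scratch mount and an HDD archive, and the seven-day cutoff is an arbitrary example:

```python
import os
import shutil
import tempfile
import time

def sweep(scratch, archive, age_days=7):
    """Move files untouched for more than age_days from scratch to archive."""
    cutoff = time.time() - age_days * 86400
    moved = []
    for name in os.listdir(scratch):
        src = os.path.join(scratch, name)
        if os.path.isfile(src) and os.path.getmtime(src) < cutoff:
            shutil.move(src, os.path.join(archive, name))
            moved.append(name)
    return moved

# Demo with temporary directories standing in for the two disks.
scratch = tempfile.mkdtemp()   # e.g. the SSD working area
archive = tempfile.mkdtemp()   # e.g. the large HDD
open(os.path.join(scratch, "old.cas"), "w").close()
ten_days_ago = time.time() - 10 * 86400
os.utime(os.path.join(scratch, "old.cas"), (ten_days_ago, ten_days_ago))
open(os.path.join(scratch, "current.cas"), "w").close()

print(sweep(scratch, archive))   # -> ['old.cas']
```

Run from cron or a scheduler, this keeps the SSD from filling up without any manual file shuffling.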

I'm not a CFD hardware expert. Could anyone comment on this idea?

Btw, I use Fluent as the solver and ICEM to generate grids.
__________________
Best regards,
Meimei
