Writing results takes waaaayyy toooo looonnggg...
I have a ~10M element case that is taking forever to write results (and transient results) for. Just wondered if anyone has any suggestions on compression settings or hardware.
The transient results files are about 4GB and I'm using a Western Digital 1.5TB Green drive (~5400 RPM). Yes, I'm aware that it's probably the slowest storage solution you could get, but it takes way less time to transfer a 4GB movie file (or an equivalent amount in smaller files) on the same drive. I can't quite tell where the bottleneck is, whether it's the storage itself or the compression, but watching Task Manager I can see that "solver-hpmpi.exe" is definitely not idle. In fact, during the write process, there is some pretty heavy CPU usage. I'm testing a few different compression options now (none and high) to see what impact that has. I was initially using the default setting. I didn't have the same problem (long "write" time) on a cluster where the same simulation was running. I don't know how the storage was set up there, but most likely it was some sort of RAID configuration with faster 7200 RPM drives.
I'm currently running the no-compression test and it is still taking a very long time and showing a lot of CPU usage on all my cores.
I'm considering an SSD, but if my speed is CPU-limited, then that's money I could spend on long-term data storage instead.
Any advice to speed this up?
My RAM is 32GB Corsair Vengeance, 1600 MHz, 9-9-9-24-2N timings. The simulation uses about 26GB of the 32GB, so I shouldn't be into the page file yet.
My results files are about 4GB with default compression, which appears to be the same as high compression, based on my runs with high compression and no compression. No compression gives me a results file of about 20GB. Both took almost the same time to "write" according to the .out file.
A test on my drive with AS-SSD shows that sequential reads and writes run at about 50 MB/s, so a 4GB file should take less than 2 minutes to write. However, if I use the 4 KB block size, the test slows to a crawl. Could it be that the results files are being written in a large number of small pieces?
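One way to check the "many small pieces" idea outside the solver is to time the same amount of data written in 4 KB pieces versus a few large pieces. The Python sketch below is only a rough stand-in: I don't know how the solver actually writes its results files, and `timed_write` and its parameters are names made up for this test. Forcing a flush after every small piece is an assumed worst case, not something the solver is known to do.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, chunk_bytes, sync_each_chunk=False):
    """Write total_bytes of zeros to path in chunk_bytes pieces.

    Returns the elapsed wall-clock time in seconds. This is a
    hypothetical helper for testing the drive, not solver code.
    """
    chunk = b"\0" * chunk_bytes
    start = time.perf_counter()
    # buffering=0 so each write() call is passed straight to the OS
    with open(path, "wb", buffering=0) as f:
        written = 0
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            written += f.write(chunk[:n])
            if sync_each_chunk:
                # Assumed worst case: force every piece out to the disk
                os.fsync(f.fileno())
        os.fsync(f.fileno())  # make sure everything is on disk before timing stops
    return time.perf_counter() - start


if __name__ == "__main__":
    total = 4 * 1024 * 1024  # small 4 MiB test so it finishes quickly
    cases = [
        ("1 MiB pieces", 1024 * 1024, False),
        ("4 KiB pieces", 4096, False),
        ("4 KiB pieces, synced", 4096, True),
    ]
    with tempfile.TemporaryDirectory() as d:
        for label, chunk, sync in cases:
            t = timed_write(os.path.join(d, "test.bin"), total, chunk, sync)
            print(f"{label}: {t:.3f} s")
```

To test the actual results drive rather than the system temp folder, pass a path on that drive to `timed_write`. If the synced 4 KiB case is dramatically slower than the 1 MiB case, small writes could explain a slow "write" on this drive. If they are similar, the drive is probably not the problem.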
So I don't think it's a hard disk speed limitation anymore. The entire time I'm waiting for the results or transient results to "write" I have one processor on my system churning away at full load. It seems very similar to when I do a partition-only run.
Anyway, the CPU time required for the "write" gets handled by the master partition and takes about 4 hours. I don't know what is going on during this time. Can anyone shed some light on it for me? Is there some kind of setting to control the results files so that the "write" doesn't take this long?
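To put the numbers from this thread side by side, here is a quick back-of-the-envelope calculation in Python. It assumes the 4 hours is roughly wall-clock time and that the 50 MB/s sequential figure from AS-SSD applies. The variable names are just for illustration.

```python
MB = 1000**2
GB = 1000**3


def write_seconds(size_bytes, rate_bytes_per_s):
    """Ideal time to write size_bytes at a steady sequential rate."""
    return size_bytes / rate_bytes_per_s


seq_rate = 50 * MB            # measured sequential write speed
compressed_size = 4 * GB      # results file, default/high compression
uncompressed_size = 20 * GB   # results file, no compression
observed = 4 * 3600           # about 4 hours per "write", assumed wall-clock

t_compressed = write_seconds(compressed_size, seq_rate)
t_uncompressed = write_seconds(uncompressed_size, seq_rate)

print(f"4 GB at 50 MB/s:  {t_compressed:.0f} s")
print(f"20 GB at 50 MB/s: {t_uncompressed:.0f} s")
print(f"observed / ideal (20 GB): {observed / t_uncompressed:.0f}x")
print(f"effective rate for 4 GB in 4 h: {compressed_size / observed / MB:.2f} MB/s")
```

Even the uncompressed 20GB file should only need about 400 seconds at the sequential rate, so a 4-hour "write" is roughly 36 times longer than the drive alone would explain. That fits with the time going somewhere other than raw disk throughput.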
I'm running a 12-processor job, all local so it's not a network issue. I'm wondering if that one processor is de-partitioning the mesh. I've specified my mesh partitions with a separate file so that I don't have to partition the case at the beginning of each run.