I will have to install a new cluster whose nodes each have a lot of disk space (120 GB).
I understand that common configurations use an NFS server on a head (or master) node and avoid using disk space on the nodes. I have heard that I would see a performance drop if my nodes started writing to their own disks instead of writing to the NFS server on the master node. Is that the case?
Are there any other efficient alternatives to using NFS and concentrating all data storage on the master node?
What would be the ideal file system / architecture to install on a new cluster to go with OF? NFS concentrated on the master node, some other distributed FS like PVFS, or something else?
(Sorry if it is a dumb question, I am just starting to learn how to install a cluster)
Thanks a lot,
I would say it is the other way around. If all the nodes have to write results to the master node, that can become a bottleneck. So if you can dump all the results to local disks, that would be faster.
However, I find it a small price to pay for the convenience, so I personally always use NFS. Compare the time spent on I/O to the time spent doing calculations and see if it is worthwhile.
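If you want to put numbers on that comparison, a crude sketch is to time the same large write against a local directory and against the NFS mount. The script below is just an illustration; the paths you pass in (e.g. a local scratch directory vs. your NFS-mounted home) are up to your own setup.

```shell
#!/bin/sh
# Rough throughput check: time writing 256 MB of zeros into a given
# directory, then clean up.  Run once against local disk and once
# against the NFS mount, and compare the elapsed times.
dir=${1:-/tmp}
f="$dir/iobench.$$"
# conv=fsync flushes the data to disk so the page cache does not hide
# the real cost of the write.
time dd if=/dev/zero of="$f" bs=1M count=256 conv=fsync 2>/dev/null
rm -f "$f"
```

Then compare that I/O time against a typical solver iteration time; if the write is a small fraction of the compute, NFS convenience probably wins.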
The only time the NFS I/O will become a serious bottleneck is if you are:
1) Dumping every 10 or fewer timesteps, like when you are making animations.
2) Doing aero-acoustic calcs, which require wall pressure data for each timestep.
To alleviate this you can run with node-distributed data. I think there is a section in the manual describing how to do this. It hasn't been tested for a few years, though.
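For what it's worth, a minimal sketch of what node-distributed data looks like in `system/decomposeParDict`, written from memory and worth checking against the manual section mentioned above (the `roots` paths are example node-local directories, one entry per slave processor):

```
// system/decomposeParDict (fragment)
numberOfSubdomains  4;
method              simple;

// Decomposed case data lives on node-local disks rather than on a
// shared NFS mount.
distributed         yes;

// One root path per slave processor, pointing at local scratch space.
// These paths are examples -- substitute your own local directories.
roots
(
    "/local/scratch/case"
    "/local/scratch/case"
    "/local/scratch/case"
);
```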
I am reactivating this thread because we are thinking of introducing a parallel file system on our cluster. The nodes run OpenSUSE, each have one or two disks, and are connected via Gigabit Ethernet.
At the moment I am reading the available articles for parallel file systems. But there are so many (PVFS, GFS, OCFS, Panasas, pNFS, XtreemFS, ...) - I am a little confused.
Our main interest is to have a global file system for all nodes and to increase throughput mainly for pre- and post-processing. Redundancy is also an issue.
What is your experience?
My experience is that unless you are a sucker for punishment, going with a pre-configured or commercial PFS is by far the least painful option. If you are using it for post-processing, I would recommend a dedicated storage node based on something like the Panasas system. This is expensive, though, and several surveys of parallel file systems in the wild have not filled me with excitement. The only one I have come across that really seems to tick all the boxes is the Google FS, but unfortunately they aren't sharing. Hadoop has similar functionality, but its API is all Java-based (i.e. not POSIX-compliant).