June 28, 2019, 16:04
RAID 0 on 2-node cluster
#1
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Hi guys, I would kindly ask for some information.
I have a simple 2-node cluster; each node has a 1 TB HDD and runs Ubuntu 18.04 LTS. At the moment I use an NFS share so the nodes can exchange data. I use the cluster for simple CFD computations with OpenFOAM. I was thinking about trying a RAID 0 configuration, both to improve my knowledge and to speed up the writing step. I have some questions about it:
1) Is it possible to use RAID 0 across a 2-node cluster? I mean: Node A has 1 TB, Node B has 1 TB, and I would like to cluster Node A + Node B with their hard disks in RAID 0. Is this possible?
2) If 1) is possible: I know that RAID 0 is essentially a kind of parallelism between disks for reading and writing data. Must the operating system be installed only on the master node?
3) What about the NFS share? If the hard disks of the two nodes are in RAID 0, it is like having 1 + 1 TB of disks working in parallel, so sharing a folder between the two would no longer be needed, wouldn't it?
This is the first time I deal with this kind of problem, so I'm sorry if anything I wrote is incorrect; feel free to correct me, I would appreciate it. Thank you in advance for the time spent reading the post and answering.
Astan
June 28, 2019, 17:38
#2
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
1) I can't comment on whether you could. Maybe it could be done with enough dedication; I just don't know how. But I can comment on whether you should: absolutely not. The node interconnect would cripple the performance of a RAID 0. And let's not start on the increased probability of failure, on top of the already higher failure probability of RAID 0.
If you want RAID 0, put both disks into one machine and set it up there. The second node can then access it over the network. Using a RAID 0 (especially with spinning disks) as a drive for the operating system seems counter-intuitive: you give up reliability and gain little speed, if any. It can be used to speed up sequential reads and writes, e.g. for loading and storing simulation results.
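The "both disks in one machine, second node accesses over the network" setup is essentially what the existing NFS share already does; the striped volume would just be exported instead. A hedged sketch, assuming the array is mounted at /mnt/raid0 and the node addresses are 192.168.1.10/.11 (all paths and addresses here are assumptions):

```shell
# On the node holding the RAID 0 array: export the mounted volume via NFS.
echo '/mnt/raid0 192.168.1.11(rw,sync,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra                  # re-read /etc/exports

# On the other node: mount the share at the same path.
sudo mkdir -p /mnt/raid0
sudo mount -t nfs 192.168.1.10:/mnt/raid0 /mnt/raid0
```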
June 29, 2019, 03:36
#3
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Thank you very much for the answer flotus1, you have convinced me not to go that way.
Just one more question. If, for example, I buy 2 SSDs and mount them in the master node, could I do the following: boot the operating system from the master node's HDD, then use the 2 SSDs in RAID 0 to run the simulation with OpenFOAM and make them accessible to the client node (which is equipped with its own hard disk, not used to store simulation data)? In other words, can I put two SSDs in RAID 0 to run a simulation, even if the operating system is installed on only one disk? I'm sorry if the question is trivial, but as I said before, I'm new to this kind of problem. Thank you again for the time dedicated to this post.
Astan
June 29, 2019, 16:34
#4
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
So to recap: each node has its own HDD that holds the operating system. One of the nodes has both SSDs in RAID 0, and the other node connects to it via Ethernet.
Sounds pretty straightforward. Just keep in mind that you need a fast node interconnect to get any benefit from 2 striped SSDs. Is this all for tinkering, or are you trying to solve a specific problem?
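One way to check whether the interconnect would be the bottleneck is to measure the two paths separately; a sketch using iperf3 for the network and dd for the array (addresses and paths are assumptions):

```shell
# Network throughput between the nodes: start "iperf3 -s" on one node,
# then from the other node run the client against it.
iperf3 -c 192.168.1.10             # reports achievable bandwidth in Gbit/s

# Sequential write speed of the striped array (1 GiB test file, bypassing
# the page cache so the disks are actually measured).
dd if=/dev/zero of=/mnt/raid0/testfile bs=1M count=1024 oflag=direct status=progress
rm /mnt/raid0/testfile
```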
June 30, 2019, 06:41
#5
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Thank you again flotus1 for the answer.
Yes, exactly, that's my idea. I am not focusing on a specific problem; it is just to improve my knowledge and test "new" configurations. In your opinion, could this configuration improve the write speed for transient simulations (compared with the classic configuration where a single hard disk stores the data)? Thank you again for the time spent on this post.
Astan
June 30, 2019, 16:15
#6
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
That depends on quite a few factors, but theoretically: yes. RAID 0 with 2 disks doubles the theoretical throughput, sometimes at the expense of higher latency. You will need at least 10 Gigabit Ethernet to see an effect on the remote node.
Last edited by flotus1; July 1, 2019 at 05:07.
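The rough arithmetic behind that: the remote node sees the minimum of the array throughput and the link throughput. A sketch with an assumed per-SSD figure (~550 MB/s sequential, typical for SATA):

```shell
#!/bin/bash
# Back-of-the-envelope bottleneck estimate. 550 MB/s per SATA SSD is an
# assumption; striping two disks roughly doubles sequential throughput.
ssd=550                       # MB/s, one SSD, sequential
raid0=$((2 * ssd))            # ~1100 MB/s for the 2-disk stripe
gbe=$((1000 / 8))             # 1 GbE  ~ 125 MB/s
tengbe=$((10000 / 8))         # 10 GbE ~ 1250 MB/s
echo "RAID 0 array:      ${raid0} MB/s"
echo "seen over 1 GbE:   $(( raid0 < gbe ? raid0 : gbe )) MB/s"
echo "seen over 10 GbE:  $(( raid0 < tengbe ? raid0 : tengbe )) MB/s"
```

Over plain gigabit Ethernet the network caps the remote node at ~125 MB/s, well below even a single SSD, which is why the 10 GbE requirement matters.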
July 1, 2019, 05:34
#7
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Thank you very much flotus1, I really appreciate your answers.
I will think about it and then decide what to do! Thank you again.
Astan
July 1, 2019, 15:04
#8
Senior Member
Join Date: Oct 2011
Posts: 239
Rep Power: 16
Hello,
This is close to what I have built on one of our clusters. Each node, including the master, is equipped with two SSDs in RAID for the OS. The master then has two additional SSDs in RAID as well, where the NFS partition is mounted. The difference is that I did it for redundancy, so RAID 1, not to increase performance. We also use a 40 Gb/s interconnect. If you are doing it out of curiosity, go for it, but as flotus1 said, do not expect a global gain on your simulation without a fast interconnect and heavy solution writes.
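For reference, the redundancy variant described here differs from the striped setup only in the mdadm level; a sketch (device names and mount point are assumptions):

```shell
# RAID 1 mirrors the two SSDs instead of striping them: half the usable
# capacity and no sequential-write speedup, but the array survives the
# failure of either disk.
sudo mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
sudo mkfs.ext4 /dev/md1
sudo mount /dev/md1 /srv/nfs       # e.g. where the NFS partition lives
```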
July 4, 2019, 17:08
#9
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Hi naffrancois, thank you very much for your suggestion.
Yes, I think I will follow your advice if I decide to build it. Just out of curiosity: does the choice of operating system play an important role in parallel computation? For example, I use Ubuntu 18.04, but a lot of people use CentOS or RHEL for clusters. Thank you again for the time spent on this post.
Astan
July 5, 2019, 04:45
#10
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
The way I see it, the choice of Linux distribution on clusters and workstations is closely related to the support policies of hardware and software vendors. CentOS and RHEL are among the few Linux distros that are officially supported.
That does not mean that you cannot run any other distro, or that you will encounter a performance penalty with your favourite distro.
July 5, 2019, 14:34
#11
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
OK, understood! Thank you very much for your answers flotus1, I really appreciate them!
Astan
July 11, 2019, 19:13
#12
Senior Member
Join Date: Oct 2011
Posts: 239
Rep Power: 16
Quote:
July 16, 2019, 13:59 |
|
#13 |
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8 |
Hi naffrancois, thank you very much for the comment, I appreciate it!
Astan