June 28, 2019, 16:04
RAID 0 on 2-node cluster
#1
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Hi guys, I would kindly ask for some information.
I have a simple 2-node cluster; each node has a 1 TB HDD and runs Ubuntu 18.04 LTS. At the moment I use an NFS share so the nodes can exchange data. I use the cluster for simple CFD computations with OpenFOAM. I was thinking about trying a RAID 0 configuration, both to improve my knowledge and to speed up the writing step. I have some questions about it:
1) Is it possible to use RAID 0 across a 2-node cluster? I mean: Node A has 1 TB, Node B has 1 TB, and I would like to cluster Node A + Node B with their hard disks in RAID 0. Is this possible?
2) If 1) is possible: I know that RAID 0 is essentially a kind of parallelism between disks for reading and writing data. Must the operating system be installed only on the master node?
3) What about the NFS share? If the hard disks of the two nodes are in RAID 0, it is like having 1 + 1 TB of disks working in parallel, so sharing a folder between the two would no longer be needed, wouldn't it?
This is the first time I deal with this kind of problem, so I'm sorry if anything I wrote is incorrect; feel free to correct me, I would appreciate it. Thank you in advance for the time spent reading the post and answering.
Astan
June 28, 2019, 17:38
#2
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
1) I can't comment on whether you could. Maybe it could be done with enough dedication; I just don't know how. But I can comment on whether you should: absolutely not. The node interconnect would cripple the performance of a RAID 0. And let's not start on the increased probability of failure, on top of the already higher failure probability of RAID 0.
If you want RAID 0, put both disks into one machine and set it up there. The second node can then access it over the network. Using a RAID 0 (especially with spinning disks) as a drive for the operating system seems counter-intuitive: you give up reliability and gain little speed, if any. It can be used to speed up sequential reads and writes, e.g. for loading and storing simulation results.
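The "both disks in one machine, second node accesses over the network" setup is essentially what the existing NFS share already does; the striped volume would just be exported instead. A hedged sketch, assuming the array is mounted at /mnt/raid0 and the node addresses are 192.168.1.10/.11 (all paths and addresses here are assumptions):

```shell
# On the node holding the RAID 0 array: export the mounted volume via NFS.
echo '/mnt/raid0 192.168.1.11(rw,sync,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra                  # re-read /etc/exports

# On the other node: mount the share at the same path.
sudo mkdir -p /mnt/raid0
sudo mount -t nfs 192.168.1.10:/mnt/raid0 /mnt/raid0
```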
June 29, 2019, 03:36
#3
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Thank you very much for the answer flotus1, you have convinced me not to go that way.
Just one more question. If, for example, I buy 2 SSDs and mount them in the master node, could I do the following: boot the operating system from the master node's HDD, then use the 2 SSDs in RAID 0 to run the simulation with OpenFOAM and make them accessible to the client node (which is equipped with its own hard disk, not used to store simulation data)? In other words, can I put two SSDs in RAID 0 to run a simulation, even if the operating system is installed on only one disk? I'm sorry if the question is trivial, but as I said before, I'm new to this kind of problem. Thank you again for the time dedicated to this post.
Astan
June 29, 2019, 16:34
#4
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
So to recap: each node has its own HDD that holds the operating system. One of the nodes has both SSDs in RAID 0, and the other node connects to it via Ethernet.
Sounds pretty straightforward. Just keep in mind that you need a fast node interconnect to get any benefit from 2 striped SSDs. Is this all for tinkering, or are you trying to solve a specific problem?
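One way to check whether the interconnect would be the bottleneck is to measure the two paths separately; a sketch using iperf3 for the network and dd for the array (addresses and paths are assumptions):

```shell
# Network throughput between the nodes: start "iperf3 -s" on one node,
# then from the other node run the client against it.
iperf3 -c 192.168.1.10             # reports achievable bandwidth in Gbit/s

# Sequential write speed of the striped array (1 GiB test file, bypassing
# the page cache so the disks are actually measured).
dd if=/dev/zero of=/mnt/raid0/testfile bs=1M count=1024 oflag=direct status=progress
rm /mnt/raid0/testfile
```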
June 30, 2019, 06:41
#5
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Thank you again flotus1 for the answer.
Yes, exactly, that's my idea. I am not focusing on a specific problem; it is just to improve my knowledge and test "new" configurations. In your opinion, could this configuration improve the write speed for transient simulations (compared with the classic configuration where a single hard disk stores the data)? Thank you again for the time spent on this post.
Astan
June 30, 2019, 16:15
#6
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
That depends on quite a few factors, but theoretically: yes. RAID 0 with 2 disks doubles the theoretical throughput, sometimes at the expense of higher latency. You will need at least 10 Gigabit Ethernet to see an effect on the remote node.
Last edited by flotus1; July 1, 2019 at 05:07.
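The rough arithmetic behind that: the remote node sees the minimum of the array throughput and the link throughput. A sketch with an assumed per-SSD figure (~550 MB/s sequential, typical for SATA):

```shell
#!/bin/bash
# Back-of-the-envelope bottleneck estimate. 550 MB/s per SATA SSD is an
# assumption; striping two disks roughly doubles sequential throughput.
ssd=550                       # MB/s, one SSD, sequential
raid0=$((2 * ssd))            # ~1100 MB/s for the 2-disk stripe
gbe=$((1000 / 8))             # 1 GbE  ~ 125 MB/s
tengbe=$((10000 / 8))         # 10 GbE ~ 1250 MB/s
echo "RAID 0 array:      ${raid0} MB/s"
echo "seen over 1 GbE:   $(( raid0 < gbe ? raid0 : gbe )) MB/s"
echo "seen over 10 GbE:  $(( raid0 < tengbe ? raid0 : tengbe )) MB/s"
```

Over plain gigabit Ethernet the network caps the remote node at ~125 MB/s, well below even a single SSD, which is why the 10 GbE requirement matters.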
July 1, 2019, 05:34
#7
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Thank you very much flotus1, I really appreciate your answers.
I will think about it and then decide what to do! Thank you again.
Astan
July 1, 2019, 15:04
#8
Senior Member
Join Date: Oct 2011
Posts: 239
Rep Power: 16
Hello,
This is close to what I have built on one of our clusters. Each node, including the master, is equipped with two SSDs in RAID for the OS. The master then has two additional SSDs in RAID as well, where the NFS partition is mounted. The difference is that I did it for redundancy, so RAID 1, not to increase performance. We also use a 40 Gb/s interconnect. If you are doing it out of curiosity, go for it, but as flotus1 said, do not expect a global gain on your simulation without a fast interconnect and heavy solution writes.
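For reference, the redundancy variant described here differs from the striped setup only in the mdadm level; a sketch (device names and mount point are assumptions):

```shell
# RAID 1 mirrors the two SSDs instead of striping them: half the usable
# capacity and no sequential-write speedup, but the array survives the
# failure of either disk.
sudo mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
sudo mkfs.ext4 /dev/md1
sudo mount /dev/md1 /srv/nfs       # e.g. where the NFS partition lives
```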
July 4, 2019, 17:08
#9
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Hi naffrancois, thank you very much for your suggestion.
Yes, I think I will follow your advice if I decide to build it. Just out of curiosity: does the choice of operating system play an important role in parallel computation? For example, I use Ubuntu 18.04, but a lot of people use CentOS or RHEL for clusters. Thank you again for the time spent on this post.
Astan
July 5, 2019, 04:45
#10
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
The way I see it, the choice of Linux distribution on clusters and workstations is closely related to the support policies of hardware and software vendors. CentOS and RHEL are among the few Linux distros that are officially supported.
That does not mean that you cannot run any other distro, or that you will encounter a performance penalty with your favourite distro.
July 5, 2019, 14:34
#11
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
OK, understood! Thank you very much for your answers flotus1, I really appreciate them!
Astan
July 11, 2019, 19:13
#12
Senior Member
Join Date: Oct 2011
Posts: 239
Rep Power: 16
Quote:
July 16, 2019, 13:59 |
|
#13 |
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8 |
Hi naffrancois, thank you very much for the comment, I appreciate it!
Astan