Old cluster optimization

January 16, 2018, 02:39
Old cluster optimization
  #1
Member
 
Ivan
Join Date: Oct 2017
Location: 3rd planet
Posts: 34
Rep Power: 8
Noco is on a distinguished road
Hello!

I want to buy five old computers from 2010 at a good price (I already have three of them for testing):

- 2 CPU Xeon X5660
- Aquarius S5520SC
- Nvidia Quadro FX 1800
- RAM: 12x8192 MB DDR plus an SSD (1 of 5); 12x4096 MB DDR (4 of 5)
- Windows 7 Pro
- Ansys CFX R17

I also want to buy one new dual-CPU AMD Epyc 7301 machine in 2018 (it will be the start of a new cluster; I plan to buy more later).

Questions:
1. What is the fastest way to link these 5 computers? Infiniband is not possible, so which Ethernet or other interconnect should I use?

2. Do I need to change something (like DDR3, SSD) to get a significant improvement? Maybe switch to some Linux?

3. I cannot find proper instructions on how to set up parallel computing (using the Ansys parallel solver) in CFX R17. I am using old instructions for CFX 10 plus the Ansys manual. It works somehow, but for now 3% of the time is calculation and 97% of the time is writing to the SATA disk. I have never done this before. Where can I find good instructions for CFX R17 with the parallel solver for 6-7 or more computers?

January 16, 2018, 09:33
  #2
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
flotus1 has a spectacular aura about
Quote:
it works somehow, but as for now 3% of time is calculation, 97% of time - writing to sata disk
Not quite sure how you measured this or for which setup, but this could indicate one of two problems: 1) Your workload is heavily I/O bound, e.g. by writing lots of results to disk during calculation 2) Your models do not fit in memory
The solution to 1 is not adding more PCs, but making sure that you do not write more results than absolutely necessary and getting fast SSD storage.

Why exactly is infiniband not possible? Used IB cards are even cheaper than 10 gigabit ethernet cards.
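
One way to check the first possibility is to measure how fast the results disk can actually take sequential writes. The following is a minimal Python sketch, not part of the original post. The function name `measure_write_throughput` and the default sizes are illustrative assumptions, and it only times one large sequential write, which may not match the solver's real write pattern.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_mb=64, chunk_mb=4):
    """Return the sequential write throughput in MB/s for a file in `directory`."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    n_chunks = total_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # push the data to the device, not just the OS cache
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the temporary file
    return (n_chunks * chunk_mb) / elapsed


if __name__ == "__main__":
    # Hypothetical target: replace with the directory where the solver writes results.
    target = tempfile.gettempdir()
    print(f"{measure_write_throughput(target):.0f} MB/s")
```

If the figure is far below what the drive should deliver, the storage path is worth investigating before buying more compute nodes.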

January 16, 2018, 11:23
  #3
Member
 
Ivan
Join Date: Oct 2017
Location: 3rd planet
Posts: 34
Rep Power: 8
Noco is on a distinguished road
By the clock on the wall.

We always keep the CPU load graphs open during calculations. The CPUs were active about 3% of the time (roughly 5-7 minutes); then the computer was "thinking" for 7 hours, and only after those 7 hours did it deliver the result.

Normally we see:
1. Start the solver, make all adjustments, start iteration 1
- all CPU cores are at 80-100% load
- RAM is at 10-12% load

2. The iteration calculation finishes
- all CPU cores are at 1-2% load
- RAM is at 10-12% load
- the SATA disk is writing, verifying the write, and adding to the folder (we found a ghost-writing process from Windows 10 that made ghost copies and sometimes caused errors, and we deleted this process)

3. Next iteration
- the grid for the new model is generated according to the criteria
- all CPU cores are at 80-100% load
...

Our normal cycle time is about 11-13 minutes per iteration with 1 computer. With 3 computers running in parallel it is about 7 minutes, plus another 7 hours to deliver the result.
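
A rough check of these figures, as a Python sketch. It assumes (the post does not state this precisely) a single compute phase of about 7 minutes followed by the 7-hour wait, and it takes 12 minutes as the midpoint of the 11-13 minute range. The variable names are illustrative.

```python
# Figures quoted in the post above; the 12-minute midpoint is an assumption.
single_node_min = (11 + 13) / 2   # 11-13 minutes per iteration on 1 computer
three_node_min = 7.0              # about 7 minutes per iteration on 3 computers
wait_min = 7 * 60                 # the additional 7 hours before the result

# Parallel speedup and efficiency of the compute phase on 3 computers.
speedup = single_node_min / three_node_min
efficiency = speedup / 3

# Share of wall-clock time spent computing, assuming one compute phase
# followed by the 7-hour wait (assumption, see the lead-in).
compute_share = three_node_min / (three_node_min + wait_min)

print(f"speedup on 3 computers: {speedup:.2f}x")
print(f"parallel efficiency:    {efficiency:.0%}")
print(f"compute share of run:   {compute_share:.1%}")
```

Under these assumptions the compute phase is under 2% of the run, so even a much faster cluster would shorten the total time by at most about 7 minutes. The 7-hour phase is what needs to be investigated.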

January 17, 2018, 04:21
  #4
Rec
New Member
 
Sergey
Join Date: Jan 2018
Posts: 18
Rep Power: 8
Rec is on a distinguished road
Hello!

Could you tell me which IB switch model you use, which IB cards are installed in the computers, and what their bandwidth is? Do you use a trunk connection (2 or more links to 1 PC)?

Thank you.

Quote:
Originally Posted by flotus1 View Post
Not quite sure how you measured this or for which setup, but this could indicate one of two problems: 1) Your workload is heavily I/O bound, e.g. by writing lots of results to disk during calculation 2) Your models do not fit in memory
The solution to 1 is not adding more PCs, but making sure that you do not write more results than absolutely necessary and getting fast SSD storage.

Why exactly is infiniband not possible? Used IB cards are even cheaper than 10 gigabit ethernet cards.

January 18, 2018, 01:49
  #5
Member
 
Ivan
Join Date: Oct 2017
Location: 3rd planet
Posts: 34
Rep Power: 8
Noco is on a distinguished road
I am also interested in the IB switch model, IB cards, and bandwidth.

What is the best practice?

January 18, 2018, 02:11
  #6
Member
 
Ivan
Join Date: Oct 2017
Location: 3rd planet
Posts: 34
Rep Power: 8
Noco is on a distinguished road
Regarding my 'old' cluster, based on 5 computers, each with 2 CPU Xeon X5660 (2*6 cores, 2.8 GHz, 3 memory channels).

Please advise whether it makes sense to add the following machines to this cluster, using the Ansys parallel solver:

1. One computer from 2017
- 1 CPU i9 7980XE (1*18 cores, 2.6 GHz, 4 memory channels)
- ASUS S2066 PRIME X299-A RTL
- GPU: GeForce 1080 Ti, PCI-E, 11264 MB, InnoVision (does not actually help)
- 64 GB DDR4 2133 MHz (4x16)
- 512 GB SSD for system
- 2x8 TB SATA RAID0 for storage
- Windows 10
- ANSYS CFX R17

P.S. We tested the i9 on the same task with different numbers of cores and got results we cannot explain (attached): 10 cores are faster than 18.

2. Two computers from 2018
- 2 CPU AMD EPYC 7301 (2*16 cores, 2.2 GHz, 8 memory channels)
- Supermicro H11DSi
- GPU: none, or 1060 Ti
- 512 GB SSD for system
- 2x8 TB SATA RAID0 for storage
- Windows 10
- ANSYS CFX R17

What will be the bottlenecks of this system, and does it make sense to combine everything into one cluster? Or is it better to run the machines separately?
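
One candidate bottleneck that can be estimated from the listed specs is theoretical memory bandwidth per core. The Python sketch below is a back-of-the-envelope calculation, not a measurement. It assumes 8 bytes per transfer per channel, treats the "8 memory channels" of the EPYC as per CPU, and reads the i9's "2133 MHz" as 2133 MT/s. The memory speeds for the Xeon and EPYC machines are not given in the post and are placeholders; the `Node` class and its field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    sockets: int
    cores_per_socket: int
    channels_per_socket: int
    mega_transfers_per_s: int  # memory speed in MT/s

    def peak_bandwidth_gb_s(self) -> float:
        # Each DDR channel is 64 bits (8 bytes) wide.
        channels = self.sockets * self.channels_per_socket
        return channels * self.mega_transfers_per_s * 8 / 1000

    def bandwidth_per_core_gb_s(self, active_cores=None) -> float:
        # Default to all cores in the machine when no count is given.
        cores = active_cores or self.sockets * self.cores_per_socket
        return self.peak_bandwidth_gb_s() / cores


nodes = [
    Node("2x Xeon X5660", 2, 6, 3, 1333),  # ASSUMED DDR3-1333, not in the post
    Node("1x i9 7980XE", 1, 18, 4, 2133),  # DDR4 2133 as listed in the post
    Node("2x EPYC 7301", 2, 16, 8, 2666),  # ASSUMED DDR4-2666, not in the post
]

for node in nodes:
    print(f"{node.name}: {node.peak_bandwidth_gb_s():.1f} GB/s peak, "
          f"{node.bandwidth_per_core_gb_s():.2f} GB/s per core")

i9 = nodes[1]
print(f"i9 with 10 active cores: {i9.bandwidth_per_core_gb_s(10):.2f} GB/s per core")
```

For the i9 as listed this gives about 68 GB/s peak, or roughly 3.8 GB/s per core with all 18 cores against 6.8 GB/s per core with 10. If the solver is limited by memory bandwidth rather than core count, that could be one reason 10 cores are not slower than 18, but the sketch does not prove it.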
Attached Images
File Type: jpg i9 Testing v 3.jpg (75.4 KB, 21 views)

Last edited by Noco; January 18, 2018 at 03:24. Reason: Attachment renew
