Temperature Problem in Parallel Processing

Old   June 6, 2013, 10:12
Default Temperature Problem in Parallel Processing
  #1
Member
 
Payam
Join Date: Aug 2011
Posts: 66
Blog Entries: 3
Rep Power: 5
pdp.aero is on a distinguished road
Dear All,

I am trying to run SU2 on my Core i5 laptop, but after several iterations the system shuts down because of overheating. I know this may be off-topic for the SU2 forum, but maybe others have already had a similar experience or have a hint for solving it.

Best Regards
Payam

Last edited by pdp.aero; June 9, 2013 at 00:49.

Old   June 13, 2013, 17:37
Default
  #2
New Member
 
Join Date: Jun 2013
Posts: 2
Rep Power: 0
knaik is on a distinguished road
Hello Payam,

Unfortunately, this isn't a problem we have seen arise in the past.
As you say, it may well be an issue with your specific machine.
Perhaps other users have had similar experiences with temperature and will be able to offer their solutions here. If you are able to resolve the problem yourself, please do let us know - especially if it is related to running SU^2!

Many thanks,
Kedar

Old   June 14, 2013, 11:16
Default
  #3
Member
 
Payam
Join Date: Aug 2011
Posts: 66
Blog Entries: 3
Rep Power: 5
pdp.aero is on a distinguished road
Quote:
Originally Posted by knaik View Post
Hello Payam,

Unfortunately, this isn't a problem we have seen arise in the past.
As you say, it may well be an issue with your specific machine.
Perhaps other users have had similar experiences with temperature and will be able to offer their solutions here. If you are able to resolve the problem yourself, please do let us know - especially if it is related to running SU^2!

Many thanks,
Kedar
Thanks for your caring response. Yes, I guessed it isn't a normal problem.

Actually, the problem has been solved, in a way. It seems my OS (Ubuntu 12.04) is the cause. As I understand it, some dynamic CPU functions combined with a lack of hardware driver support give Ubuntu high CPU temperatures, especially under heavy load. So I guess there are two kinds of solution: 1- use a powerful cooling pad; 2- undervolt the CPU.
Since undervolting the CPU is very risky, I tried the cooling pad first, but it was not very effective. My sensors show 101 °C on average when running in parallel without the cooling pad and 98 °C with it, so it is still really high. It seems I have no choice except undervolting the CPU with Linux Processor Hardware Control (PHC).
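
For anyone who wants to log temperatures while the solver runs, here is a minimal Python sketch. It assumes a Linux kernel that exposes thermal zones under /sys/class/thermal (readings in millidegrees Celsius); which zones exist, and which one belongs to the CPU, varies between machines. The 95 °C warning limit and the function names are illustrative assumptions, not values from this thread.

```python
from pathlib import Path


def millidegrees_to_celsius(raw: str) -> float:
    """Convert a sysfs thermal reading such as '101000\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_zone_temperatures(base: str = "/sys/class/thermal") -> dict:
    """Return {zone_name: temperature in deg C} for every readable thermal zone."""
    temps = {}
    for temp_file in sorted(Path(base).glob("thermal_zone*/temp")):
        try:
            temps[temp_file.parent.name] = millidegrees_to_celsius(temp_file.read_text())
        except (OSError, ValueError):
            # Skip zones that cannot be read or do not report a plain integer.
            continue
    return temps


def is_too_hot(temps: dict, limit_c: float = 95.0) -> bool:
    """True if any zone is at or above limit_c (95 deg C is an assumed limit)."""
    return any(t >= limit_c for t in temps.values())


if __name__ == "__main__":
    readings = read_zone_temperatures()
    for zone, temp in readings.items():
        print(f"{zone}: {temp:.1f} C")
    if is_too_hot(readings):
        print("Warning: at least one thermal zone is above the limit")
```

Running it in a second terminal every few seconds during an SU2 run gives a rough picture of how quickly the temperature climbs.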


Sincerely,
Payam

Old   June 14, 2013, 13:48
Default
  #4
New Member
 
Join Date: Jun 2013
Posts: 2
Rep Power: 0
knaik is on a distinguished road
Thanks for the update Payam!
What an interesting (albeit unfortunate) problem.
Hopefully you're able to find a solution in the near future that doesn't involve scaling down the voltage.
Best of luck!
-Kedar

Old   June 15, 2013, 07:14
Default
  #5
Senior Member
 
Cean
Join Date: Feb 2010
Posts: 126
Rep Power: 6
shirazbj is on a distinguished road
Interesting.

I have this problem on 32-bit Windows 7. I was thinking maybe my CPU was overclocked a little. Now I know it is common.

Old   June 15, 2013, 19:18
Default
  #6
Member
 
Payam
Join Date: Aug 2011
Posts: 66
Blog Entries: 3
Rep Power: 5
pdp.aero is on a distinguished road
Quote:
Originally Posted by shirazbj View Post
Interesting.

I have this problem on 32-bit Windows 7. I was thinking maybe my CPU was overclocked a little. Now I know it is common.
Hi shirazbj,

You shouldn't have this problem on Windows; it isn't common there. Windows automatically reduces or optimizes the CPU voltage and prevents overheating, especially on machines whose CPU architecture has integrated graphics (e.g. i3, i5, i7) and overclocks its frequencies. If your machine is dual- or quad-core and it overheats when you load all of its cores, the overheating has another cause. My guess is that the .NET Framework on your system has mscorsvw.exe (the .NET Runtime Optimization Service) eating up your CPU, so when you load all your cores you run into thermal problems. By the way, I will post my results on this issue very soon and explain more there.

Best Regards
Payam

Old   June 16, 2013, 00:59
Default
  #7
Senior Member
 
Cean
Join Date: Feb 2010
Posts: 126
Rep Power: 6
shirazbj is on a distinguished road
Hi Payam,

How could a Windows machine not have a .NET program? I can't see mscorsvw.exe listed in Task Manager.

My cpu is i7-870, quad core with 8 threads, but without integrated graphics.

Thanks

Old   June 18, 2013, 20:55
Default
  #8
Member
 
Payam
Join Date: Aug 2011
Posts: 66
Blog Entries: 3
Rep Power: 5
pdp.aero is on a distinguished road
Quote:
Originally Posted by knaik View Post
Thanks for the update Payam!
What an interesting (albeit unfortunate) problem.
Hopefully you're able to find a solution in the near future that doesn't involve scaling down the voltage.
Best of luck!
-Kedar
Thank you Kedar. It is cool and challenging. I undervolted my CPU manually and successfully, then ran some speedup tests on my laptop and a survey of parallel processing performance using SU2. In other words, I tuned my CPU for running SU2 in parallel with temperature in mind. I got very interesting results. I thought others might face a similar issue in the future, so I decided to share my survey here.
First of all, undervolting the CPU has risks. It may cause some hardware or software damage, particularly under heavy computational load. But if you manage to do it correctly, your system can run at a lower temperature and use less energy. However, if you don't know what you are doing, you shouldn't do this. If you decide to do it, you will find a very good thread here. I used it as a guide too.
Here is a summary of my procedure for undervolting my CPU. My operating system is Ubuntu 12.04 LTS. First, install the kernel PHC (Processor Hardware Control) patch so that you can control your CPU voltage and frequency. Then unload your old CPU driver and load the appropriate PHC driver for your CPU. Finally, find the lowest voltage at which your CPU runs at its lowest frequency without crashing, and gradually increase the voltage until you get acceptable temperatures. During this step you need to stress test your CPU by loading all cores and watching how the temperature changes. The stress test can be done with CPUburn. My CPU's maximum frequency is 2534 MHz. I started from 1199 MHz, its lowest possible frequency, and gradually increased it to the maximum while tuning the voltage. At every frequency and voltage I ran the stress test for 10-15 minutes. My results are summarized below.
Code:
Test Number   Processor Frequency   VID (Voltage ID)   Max. Temperature
     1             1199 MHz                9                 65 °C
     3             1466 MHz               11                 68 °C
     4             1599 MHz               12                 70 °C
     7             1999 MHz               15                 79 °C
     8             2133 MHz               16                 83 °C
     9             2266 MHz               17                 89 °C
    12             2534 MHz               20                102 °C
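
As a rough illustration of how the table above can be used to pick a frequency cap, the following Python sketch selects the highest tested frequency whose stress-test temperature stays at or below a chosen limit. The rows are copied from the table; the 85 °C limit and the function name are assumptions made only for this example.

```python
# (frequency in MHz, VID, max temperature in deg C), copied from the table above
STRESS_RESULTS = [
    (1199, 9, 65),
    (1466, 11, 68),
    (1599, 12, 70),
    (1999, 15, 79),
    (2133, 16, 83),
    (2266, 17, 89),
    (2534, 20, 102),
]


def highest_safe_frequency(results, limit_c):
    """Return the row with the highest frequency whose max temperature
    is at or below limit_c, or None if no tested point qualifies."""
    safe = [row for row in results if row[2] <= limit_c]
    return max(safe, key=lambda row: row[0]) if safe else None


if __name__ == "__main__":
    # 85 deg C is an assumed limit, not a value recommended in this thread.
    print(highest_safe_frequency(STRESS_RESULTS, 85))
```

With an 85 °C limit this picks the 2133 MHz / VID 16 point; a stricter limit simply moves the choice further up the table.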
I set my CPU frequency based on the maximum temperature and then ran SU2 at that frequency. The temperature was the same as in the stress tests, so SU2 itself did not cause any thermal problem. My maximum speedup when running in parallel on 2 physical cores with 4 threads is 2.04497. It's really good. Acceleration methods like multigrid or GMRES play a great role, but parallel processing is always effective. Also, the aerodynamic coefficients for every test converged to exactly the same values in exactly the same number of iterations. Moreover, I found the answer to a question I always had about the computational performance of Intel and AMD processors. You can follow this link to see how frequency affects computational performance. I also found that frequency could play a great role if someone is going to develop dynamic partitioning for parallel processing.
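
To put the speedup figure in context, here is a small Python sketch of the usual definitions: speedup is serial run time divided by parallel run time, and parallel efficiency is speedup divided by the number of workers. Only the value 2.04497 comes from this post; the underlying run times were not given, so the function names and the choice of dividing by 2 cores or 4 threads are illustrative.

```python
def speedup(serial_time: float, parallel_time: float) -> float:
    """Classic speedup: how many times faster the parallel run is."""
    return serial_time / parallel_time


def parallel_efficiency(speedup_value: float, workers: int) -> float:
    """Speedup divided by the number of workers (cores or threads)."""
    return speedup_value / workers


if __name__ == "__main__":
    reported = 2.04497  # speedup reported above for 2 physical cores / 4 threads
    print("efficiency per physical core:", parallel_efficiency(reported, 2))
    print("efficiency per hardware thread:", parallel_efficiency(reported, 4))
```

Measured against the 2 physical cores the efficiency is about 1.02, and against the 4 hardware threads about 0.51, which suggests the extra threads added only a little on top of the two cores.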
After all this work, I found out there is an easier way to do this: just install the indicator-cpufreq package from its PPA. But the manual way is always better.
The aerodynamic coefficient results for the ONERA test case were good, but I still have some discrepancy in the sectional Cp distributions. I will post that as a new thread soon. Thanks.

Best Regards,
Payam

Old   June 18, 2013, 20:57
Default
  #9
Member
 
Payam
Join Date: Aug 2011
Posts: 66
Blog Entries: 3
Rep Power: 5
pdp.aero is on a distinguished road
Quote:
Originally Posted by shirazbj View Post
Hi Payam,

How could a Windows machine not have a .NET program? I can't see mscorsvw.exe listed in Task Manager.

My cpu is i7-870, quad core with 8 threads, but without integrated graphics.

Thanks
Anything is possible. I meant that my guess is the Microsoft .NET Framework is eating up your CPU. It's just a guess. If you are looking for mscorsvw.exe, you will find it under Task Manager > Resource Monitor > CPU.
You can make it finish its queued work, after which it stops using the CPU, by navigating to C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727 and entering
Code:
ngen.exe executeQueuedItems
in your command prompt.

Sincerely,
Payam

Old   July 27, 2014, 22:11
Default
  #10
Member
 
Payam
Join Date: Aug 2011
Posts: 66
Blog Entries: 3
Rep Power: 5
pdp.aero is on a distinguished road
I am posting the final solution to my question here. The thermal problem made my laptop shut down after about 20 iterations. At first, as described earlier in this thread, I got it working by undervolting the processor. However, when I upgraded my OS and reinstalled SU2 3.2.0, it happened again while ATLAS was optimizing itself for the processor. I looked into the problem again. This time I opened the back of the laptop, removed the fan, cleaned out the dust, and replaced the old thermal paste on the CPU and GPU with new paste. This problem has already been posted here as a bug. Cleaning the fan and replacing the thermal paste worked perfectly for me. The maximum CPU temperature reached 86 °C during ATLAS configuration, compared with the previous 104 °C that forced an unexpected shutdown of my system.

All in all, if anybody is facing unexpected shutdowns following an overheat warning, my first advice is to clean the cooling system, remove the old thermal paste, and apply new paste to the CPU and GPU.
