
Different results from AMD and Intel machine

February 17, 2011, 15:20  #1
Different results from AMD and Intel machine
MichaelCFD
New Member
Join Date: Feb 2011
Posts: 6
Hi,

I hope someone can help me out here: I get slightly different numerical results from an AMD machine and an Intel machine using exactly the same code (MPI parallel). For serial runs the difference is much smaller, showing up only in a few single cells around the 8th or 9th decimal digit, but it is still there.

Debugging has not turned up a problem, or at least not the root cause yet. Has anyone here had such an experience, or any ideas? Thanks.

Michael

Last edited by MichaelCFD; February 17, 2011 at 15:37.

February 17, 2011, 15:23  #2
MichaelCFD
New Member
Join Date: Feb 2011
Posts: 6
btw, the code is in C. Thanks.

February 17, 2011, 18:05  #3
Julien de Charentenay
Senior Member
Join Date: Jun 2009
Location: Australia
Posts: 231
Hi Michael,

It is not uncommon to get slightly different results on different architectures or operating systems.

For example, the following two declarations look equivalent, but may be treated slightly differently by the compiler:
double c = 0;
double c = 0.0;
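If it helps, here is a minimal C99 sketch (my own illustration, not taken from your code) that prints a double both as a hex float and as its raw 64-bit pattern, so you can check on each machine whether the two initializations really end up bit-identical:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Print a double as a hex float and as its raw 64-bit pattern. */
static void dump(const char *label, double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);  /* portable way to view the bits */
    printf("%s: %a  0x%016llx\n", label, x, (unsigned long long)bits);
}

int main(void)
{
    double c1 = 0;    /* integer literal converted to double */
    double c2 = 0.0;  /* double literal */
    dump("c1", c1);
    dump("c2", c2);
    return 0;
}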

I also assume that there is no random number generator in the code.

The main question is: do the differences significantly affect the results at convergence?

Regards, Julien
__________________
---
Julien de Charentenay

February 17, 2011, 18:22  #4
Martin Hegedus
Senior Member
Join Date: Feb 2011
Posts: 500
Is the code single precision or double precision?

Are they the same executable or did you recompile the code on each machine?

If you recompiled it, what level of optimization did you use?

What type of solver is it, structured, unstructured, implicit, explicit?

Is the solution steady or unsteady? If steady, did you converge it to machine zero? If unsteady, at what point do you see the difference build up?

You mentioned MPI. Does this mean you are using multiple machines, or just one machine with multiple cores? Also, how is your domain broken up? For example, I would expect an implicit structured chimera solver to converge differently depending on how the overall grid is partitioned and distributed among the various cores.
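To make that last point concrete, here is a small stand-alone C99 sketch (my own illustration, not your solver) showing that summing the same numbers in a different order, which is exactly what happens when a domain is split across cores and the partial sums are combined, does not have to give the same last bits:

#include <stdio.h>

int main(void)
{
    enum { N = 1000000, NPART = 4 };
    static double a[N];
    for (int i = 0; i < N; ++i)
        a[i] = 1.0 / (double)(i + 1);   /* some representative field values */

    /* single sweep, front to back (the "serial" ordering) */
    double s_serial = 0.0;
    for (int i = 0; i < N; ++i)
        s_serial += a[i];

    /* per-partition partial sums combined afterwards,
       mimicking a reduction over NPART ranks */
    double s_split = 0.0;
    for (int p = 0; p < NPART; ++p) {
        double part = 0.0;
        for (int i = p * (N / NPART); i < (p + 1) * (N / NPART); ++i)
            part += a[i];
        s_split += part;
    }

    printf("serial sum: %.17g\n", s_serial);
    printf("split  sum: %.17g\n", s_split);
    printf("difference: %g\n", s_serial - s_split);
    return 0;
}

Neither answer is wrong; they simply round differently, and that rounding noise is then amplified or damped by the solver.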

February 21, 2011, 10:31  #5
Different results using the same executable
MichaelCFD
New Member
Join Date: Feb 2011
Posts: 6
Thanks for all your responses and help.

I use the same executable, compiled on an Intel machine. There is no random number generator. I also tested outputting (printf) the following, as suggested:

double a = 0;
double a = 0.0;

The output is the same from AMD and Intel to many decimal digits.

Convergence is fine on both, and the parallel part is doing exactly the same thing. The problem is that even for serial runs there are digit differences between the two machines.

I wonder if anyone has tested a code like this. Maybe I should grab another code to test...

February 21, 2011, 12:37  #6
Martin Hegedus
Senior Member
Join Date: Feb 2011
Posts: 500
Assuming your code is double precision, both runs are doing exactly the same thing, and the case is steady state:

1) Euler should be run first for comparison, then laminar, then turbulent.
2) The results should be converged to machine zero. In general, the residual should be in the vicinity of 1.0e-15 to 1.0e-16.
3) Assuming you don't have some very fine cells, or cells with poor Jacobians, the differences in the state variables (i.e. rho, rho*v, p, etc.) between the two runs should be less than 1.0e-10, IMO (a simple comparison sketch follows below).
4) Integrated load values, such as lift and drag, can be off by much more, since they depend on integrating pressure differences.
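As a rough way to quantify item 3, something like the following C sketch could be used. It assumes two plain-text dumps of the same field, one value per line; the file names and format here are hypothetical, so adjust them to whatever your code actually writes out:

#include <stdio.h>
#include <math.h>

/* Compare two solution dumps value by value and report the largest
   absolute difference, e.g. ./fieldcmp amd_run.txt intel_run.txt */
int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s file_a file_b\n", argv[0]);
        return 1;
    }
    FILE *fa = fopen(argv[1], "r");
    FILE *fb = fopen(argv[2], "r");
    if (!fa || !fb) {
        perror("fopen");
        return 1;
    }

    double va, vb, maxdiff = 0.0;
    long n = 0;
    while (fscanf(fa, "%lf", &va) == 1 && fscanf(fb, "%lf", &vb) == 1) {
        double d = fabs(va - vb);
        if (d > maxdiff)
            maxdiff = d;
        ++n;
    }

    printf("compared %ld values, max |difference| = %g\n", n, maxdiff);
    fclose(fa);
    fclose(fb);
    return 0;
}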

February 22, 2011, 12:19  #7
I use the same executable for AMD and Intel machines
MichaelCFD
New Member
Join Date: Feb 2011
Posts: 6
Initially the difference is very small, around 1.e-14, but as the iterations go on it becomes significantly larger... Has anyone here tested their own code on different machines? Thanks.

February 22, 2011, 12:36  #8
Martin Hegedus
Senior Member
Join Date: Feb 2011
Posts: 500
Yes I have, and the answer you are looking for depends on the case you are running, the solution methodology, the grid, and what you are comparing. A simple answer does not exist.

You haven't given enough details to help address your question.

February 23, 2011, 21:51  #9
MichaelCFD
New Member
Join Date: Feb 2011
Posts: 6
So did you find any differences in the results from different machines? If so, what was the cause? My code is a typical CFD code: unsteady or steady, finite volume, ... But I do not think those factors should produce such machine-to-machine differences... Thanks.

February 23, 2011, 22:57  #10
Martin Hegedus
Senior Member
Join Date: Feb 2011
Posts: 500
For an unsteady result, once the results diverge, even by an epsilon, they will continue to diverge. So, in that case, a difference of 1e-8 in a field value isn't anything special.

However, my general experience is that field values for steady results converged to machine zero agree to within 1e-12 between AMD and Intel for solutions on high-quality grids. I take notice if the results differ by more than 1e-10; in my experience, when that is the case, there is a significant probability of a bug in the code. However, this is only true if the solution is independent of the number of cores solving the problem. For example, the solution during convergence of a steady problem with an implicit method very probably depends on the number of CPU cores solving it. This difference should diminish as the solution converges to machine zero, assuming that the right-hand side is independent of how the problem is broken up among the various cores.

But it is also important to take into account the nonlinearities of the flow being analyzed. Epsilon changes in a shock or a vortex can cause noticeable differences in other areas of the flow field.
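A toy C illustration of the first point (not your solver, of course): a chaotic nonlinear iteration started from two states that differ by only 1e-14 quickly produces visibly different trajectories.

#include <stdio.h>
#include <math.h>

int main(void)
{
    double x = 0.4;           /* baseline state */
    double y = 0.4 + 1.0e-14; /* the same state perturbed by an epsilon */

    for (int it = 1; it <= 60; ++it) {
        /* logistic map with r = 3.9, a standard chaotic toy problem */
        x = 3.9 * x * (1.0 - x);
        y = 3.9 * y * (1.0 - y);
        if (it % 10 == 0)
            printf("iteration %2d: |x - y| = %g\n", it, fabs(x - y));
    }
    return 0;
}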
