Home > Forums > Software User Forums > OpenFOAM > OpenFOAM Programming & Development

Strange crash, possibly memory related

November 4, 2014, 06:15   #1
Strange crash, possibly memory related
chriss85 (Senior Member; Join Date: Oct 2013; Posts: 397)
I'm currently writing a boundary condition based on TemperatureCoupledBase, and I'm seeing a crash when I call its kappa(const scalarField& Tp) function. It dies with a SIGSEGV in the field addition inside heThermo::alphaEff(const scalarField& alphat, const label patchi). However, when I add a simple console output, Info << "test" << endl;, to an intermediate function (kappaEff(const label patchI) in RASModel.H) or directly in my solver, it no longer crashes. Furthermore, the kappa function from TemperatureCoupledBase is called multiple times in the BC without crashing.

This smells very fishy, and I suspect memory corruption or something similar (maybe an incompatibility between compiled libraries?). Any suggestions on how to tackle such a problem?

I should add one more thing: I'm using a custom thermophysical library. The turbulence models reference the default thermophysical library; can this be a problem? I have tried compiling them against only my own library, but this didn't change anything.
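For the record, the standard generic tools for chasing a crash like this are valgrind and gdb on a serial run. A sketch only; mySolver is a placeholder for the actual solver binary, and the case must already be set up in the current directory:

```shell
# Run the solver under valgrind's memcheck tool; invalid reads/writes are
# reported with stack traces (expect a large slowdown).
valgrind --track-origins=yes --error-limit=no mySolver 2>&1 | tee valgrind.log

# Alternatively, catch the SIGSEGV in gdb and inspect the backtrace.
gdb --args mySolver
# (gdb) run
# (gdb) bt
```

A memory corruption that "moves" when unrelated code changes (like adding an Info statement) is exactly the pattern these tools are good at pinning down.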

November 4, 2014, 07:39   #2
chriss85 (Senior Member; Join Date: Oct 2013; Posts: 397)
Changing the number of PIMPLE iterations also avoids this crash...

I had to change something in the boundary condition, which also prevents the crash, so it's currently running. But I fear there is some kind of hidden memory corruption that will give me problems later on, or perhaps lead to wrong results, so I'm not happy with the status quo.

November 8, 2014, 16:45   #3
wyldckat, Bruno Santos (Retired Super Moderator; Join Date: Mar 2009; Location: Lisbon, Portugal; Posts: 10,974; Blog Entries: 45)
Greetings chriss85,

There is not enough information to diagnose the actual problem, so guessing is the only thing that can be done here: you were probably not properly using access by reference to a field/variable.
If a field is accessed too many times without relying on an intermediate reference variable, it can lead to memory problems, at least when running in parallel.

Best regards,
Bruno

November 10, 2014, 06:28   #4
chriss85 (Senior Member; Join Date: Oct 2013; Posts: 397)
The crash also happens in single-threaded runs.
Is there anything I can provide for a better diagnosis? The thermophysical library itself doesn't use any references in the kappa() function in which it crashes; it just uses some member variables of the class.

By not using an intermediate reference variable, do you mean copying the value/object via a copy constructor? If so, why would that lead to memory problems? I can see why it would be inefficient, but crashing?

November 10, 2014, 16:40   #5
wyldckat, Bruno Santos (Retired Super Moderator; Join Date: Mar 2009; Location: Lisbon, Portugal; Posts: 10,974; Blog Entries: 45)
Quick answers:
Quote (Originally Posted by chriss85):
Anything I can provide for better diagnosis?
A code snapshot? Let us see what you're seeing; show us what modifications you've made.
Even if you don't show the specific implemented equations, a mock-up of the code would give us a better idea of what you have in front of you.

Quote (Originally Posted by chriss85):
I can see why it would be inefficient, but crashing?
OpenFOAM relies on a very automated parallelization mechanism, which means that excessive requests for a particular field can lead to overlapping requests in the parallel communication, or flood the stack of pending communications.

November 11, 2014, 03:33   #6
chriss85 (Senior Member; Join Date: Oct 2013; Posts: 397)
I will post some excerpts later, though I don't think it will be possible to post the whole code; it is too involved and not easy to set up.

November 13, 2014, 02:35   #7
chriss85 (Senior Member; Join Date: Oct 2013; Posts: 397)
Here's the code of the BC in which the crash happened. (Un)fortunately, in the current version the crash no longer occurs. I took a close look at the changes when I made them, and I'm relatively sure they weren't the cause, so maybe something is still lurking in there. This is a region-coupled BC which limits the temperature to an evaporation temperature.
Code:
void TemperatureRadCoupledMixedLimitedFvPatchScalarField::updateCoeffs()
{
    if (updated())
    {
        return;
    }
    // Since we're inside initEvaluate/evaluate there might be processor
    // comms underway. Change the tag we use.
    int oldTag = UPstream::msgType();
    UPstream::msgType() = oldTag+1;

    // Get the coupling information from the mappedPatchBase
    const mappedPatchBase& mpp =
        refCast<const mappedPatchBase>(patch().patch());
    const polyMesh& nbrMesh = mpp.sampleMesh();
    const label samplePatchI = mpp.samplePolyPatch().index();
    const fvPatch& nbrPatch =
        refCast<const fvMesh>(nbrMesh).boundary()[samplePatchI];


    scalarField& Tp = *this;
    
    const TemperatureRadCoupledMixedLimitedFvPatchScalarField&
        nbrField = refCast
            <const TemperatureRadCoupledMixedLimitedFvPatchScalarField>
            (
                nbrPatch.lookupPatchField<volScalarField, scalar>(TnbrName_)
            );

    // Swap to obtain full local values of neighbour internal field
    scalarField TcNbr(nbrField.patchInternalField());
    
    //Note: distributing (-->interpolating) from one mesh to another of a directional quantity such as Qr 
    //        does not reverse the sign due to changing surface normal
    mpp.distribute(TcNbr);


    // Swap to obtain full local values of neighbour K*delta
    tmp<scalarField> KDeltaNbr(new scalarField(TcNbr.size(), 0.0));
    if (contactRes_ == 0.0)
    {
        // Swap to obtain full local values of neighbour K*delta
        KDeltaNbr() = nbrField.kappa(nbrField)*nbrPatch.deltaCoeffs();
    }
    else
    {
        KDeltaNbr() = contactRes_;
    }
    mpp.distribute(KDeltaNbr());

    scalarField KDelta(kappa(*this)*patch().deltaCoeffs());

    scalarField Qr(Tp.size(), 0.0);
    if (QrName_ != "none")
    {
        Qr = patch().lookupPatchField<volScalarField, scalar>(QrName_);
    }

    scalarField QrNbr(Tp.size(), 0.0);
    if (QrNbrName_ != "none")
    {
        QrNbr = nbrPatch.lookupPatchField<volScalarField, scalar>(QrNbrName_);
        mpp.distribute(QrNbr);
    }
    scalarField alpha(KDeltaNbr() - (Qr + QrNbr)/Tp);
    valueFraction() = alpha/(alpha + KDelta);
    refValue() = (KDeltaNbr()*TcNbr)/alpha;
    
    forAll(*this, faceI)
    {
        // Evaluating the value directly works better here.
        if
        (
            valueFraction()[faceI]*refValue()[faceI]
          + (1 - valueFraction()[faceI])*patchInternalField()()[faceI]
          > T_Evap_
        )
        {
            valueFraction()[faceI] = 1;
            refValue()[faceI] = T_Evap_;
            Info << "limiting temperature at patch to evaporation temperature "
                << T_Evap_ << endl;
        }
    }
    mixedFvPatchScalarField::updateCoeffs();
    tmp<scalarField> QNbr(new scalarField(TcNbr.size(), 0.0));
    // Swap to obtain full local values of neighbour K*delta
    //Second minus because the snGrad of neighbour field points in the wrong direction
    QNbr() = - (- nbrField.kappa(nbrField) * nbrField.snGrad());
    mpp.distribute(QNbr());
    
    Info << "patch: " << patch().name() << ", min T: " << min(Tp) << endl;
    //Info << "patch: " << patch().name() << ", dQ: " << sqrt(pow((min(Qr) - min(kappa(*this)*snGrad()) - (min(QNbr) - min(QrNbr)))/(min(-kappa(*this)*snGrad())), 2)) << endl;
    //The crash happened in this line
    Info << "patch: " << patch().name() << ", kappa: " << min(kappa(*this)) << endl;
    Info << "patch: " << patch().name() << ", deltaCoeffs: " << min(patch().deltaCoeffs()) << endl;
    Info << "patch: " << patch().name() << ", Tcenter: " << min(patchInternalField()) << endl;
    
    Info << "patch: " << patch().name() << ", Qr: " << std::abs(min(Qr)) << endl;
    //Info << "patch: " << patch().name() << ", Q: " << std::abs(min(-kappa(*this)*snGrad())) << endl;
    //Info << "patch: " << patch().name() << ", QTotal: " << std::abs(min(-kappa(*this)*snGrad() + Qr)) << endl;
    Info << "patch: " << patch().name() << ", alpha: " << min(alpha) << endl;
    Info << "patch: " << patch().name() << ", refValue: " << refValue() << endl;
    
    //Qr is positive for outgoing radiation, QrNbr is positive for radiation from wall to plasma
    /*
    evaporationHeatFlux_ = (- kappa(*this) * snGrad() + Qr) - (-(QNbr)  - QrNbr);
    */
    if (debug)
    {
        scalar Q = gSum(kappa(*this)*patch().magSf()*snGrad());

        Info<< patch().boundaryMesh().mesh().name() << ':'
            << patch().name() << ':'
            << this->dimensionedInternalField().name() << " <- "
            << nbrMesh.name() << ':'
            << nbrPatch.name() << ':'
            << this->dimensionedInternalField().name() << " :"
            << " heat transfer rate:" << Q
            << " walltemperature "
            << " min:" << gMin(*this)
            << " max:" << gMax(*this)
            << " avg:" << gAverage(*this)
            << endl;
    }

    // Restore tag
    UPstream::msgType() = oldTag;
}
The crash happened in the kappa() function call near the end (look for the comment). The thermophysical model I use is very simple; its kappa implementation only accesses member variables and should not be an issue.

November 16, 2014, 17:10   #8
wyldckat, Bruno Santos (Retired Super Moderator; Join Date: Mar 2009; Location: Lisbon, Portugal; Posts: 10,974; Blog Entries: 45)
Hi chriss85,

Quote (Originally Posted by chriss85):
The crash happened in the kappa() function call near the end (look for the comment).
There are a lot of calls to "kappa()". As for the exact one you're referring to, I'm not certain whether you mean the one inside the "if (debug)" block.

My guess is that you were triggering exactly the thing I was trying to describe, and that you're no longer triggering it only because you now probably have fewer calls being made.

From what I can see in this excerpt of code (thanks for posting a large excerpt, since it's easier to diagnose this way), the problem is that you're not applying the same pattern you already use here:
Code:
    const fvPatch& nbrPatch =
        refCast<const fvMesh>(nbrMesh).boundary()[samplePatchI];
to the calls to "kappa"; namely, something like this:
Code:
const scalarField & nbrFieldKappa = nbrField.kappa(nbrField);
const scalarField & localKappa = kappa(*this);
Since you don't do this, you force the code to create new entries on the variable stack or heap for each call (I never remember which is which), which can overload that storage and lead to the crash.

If the last "kappa()" you're referring to is the one inside the "if (debug)" block, then "gSum" can easily be the straw that broke the camel's back, because it performs a parallel summation, which requires more intermediate variables than you might expect.


If you can make it break again (use a finer mesh if you have to, or at least give the patches more cell faces), then try the solution I mentioned a few lines above; it may well fix the problem.

Best regards,
Bruno

November 17, 2014, 06:05   #9
chriss85 (Senior Member; Join Date: Oct 2013; Posts: 397)
Are you saying this can also happen without running in parallel?
I don't know much about the internals of the OpenFOAM code, so I can't be sure about this. Values that are needed temporarily inside a statement should be allocated on the stack and deallocated at the end of the scope. OpenFOAM uses the tmp class to wrap objects that would normally live on the stack, so they can live on the heap instead (which is faster, because the tmp handle on the stack can be copied without penalty). I could possibly see an issue with the tmp<> class, but I haven't looked into its implementation, so I can't tell for sure.

The crash was happening in the line I marked with a comment, before the debugging code.

November 17, 2014, 14:56   #10
wyldckat, Bruno Santos (Retired Super Moderator; Join Date: Mar 2009; Location: Lisbon, Portugal; Posts: 10,974; Blog Entries: 45)
Quote (Originally Posted by chriss85):
Are you saying that this can also happen without parallel running?
Given enough calls, it's possible that it could also have issues in serial.

I went back and reread your first post, and it seems the crash occurred while "kappa" was calling "heThermo::alphaEff" midway. Without being able to test it myself, my guess is that the crash occurs because there is a long operation going on in that particular line. If you split the operation into two or three steps, does it still crash in the same way?
Namely something like this:
Code:
evaporationHeatFlux_ = -kappa(*this) * snGrad();
evaporationHeatFlux_ += Qr;
evaporationHeatFlux_ += QNbr + QrNbr;
(I have seen similar issues when there are too many operations in a single line, but that was around 10 years ago, with overloaded operators in Fortran, and it was a compiler limitation.)


It's also possible that this is an optimization glitch made by the compiler and/or linker.


In addition, "heThermo::alphaEff" probably sits in a heavily templated section of the C++ code. If a change was made to any one of those template files, and that file is used in:
  • a library you link to;
  • and/or your solver;
  • and/or your boundary condition library;
then inconsistent builds of those binaries are more than enough to make it crash.
Now that I think about it, this is indeed the 1st or 2nd most likely suspect...
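If that is the cause, the usual remedy is a full clean rebuild of every custom library and the solver, so that all template instantiations come from the same headers. A sketch only; the directory names are placeholders for your own project layout:

```shell
# Rebuild everything that shares the modified template headers,
# so no stale instantiations are left behind (paths are placeholders).
cd $WM_PROJECT_USER_DIR/src/myThermo && wclean && wmake libso
cd $WM_PROJECT_USER_DIR/src/myBCs && wclean && wmake libso
cd $WM_PROJECT_USER_DIR/applications/mySolver && wclean && wmake
```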

Best regards,
Bruno

January 12, 2015, 11:26   #11
dzi (Member; Join Date: Nov 2011; Location: Berlin; Posts: 31)
Hello,
I found your thread, and maybe there is also a relation to this recent bug fix:
turbulentTemperatureRadCoupledMixedFvPatchScalarField::updateCoeffs(): refValue can become negative.
http://www.openfoam.org/mantisbt/view.php?id=1369
Best regards, Dirk
