CFD Online Discussion Forums

[PyFoam] PyFoam killing my SSH sessions (https://www.cfd-online.com/Forums/openfoam-community-contributions/134186-pyfoam-killing-my-ssh-sessions.html)

leroyv April 25, 2014 11:23

PyFoam killing my SSH sessions
 
Dear Foamers,

I recently decided to switch to PyFoam to run and monitor my cases. However, I encountered an annoying issue: some PyFoam functions seem to kill my SSH sessions. Here's the kind of output I get:
Code:

<some output>
patch 0 (start: 19800 size: 50) name: inlet
patch 1 (start: 19850 size: 50) name: outlet
patch 2 (start: 19900 size: 50) name: lowerWall
patch 3 (start: 19950 size: 50) name: upperWall
patch 4 (start: 20000 size: 200) name: cylinderWalls
patch 5 (start: 20200 size: 10000) name: back
patch 6 (start: 30200 size: 10000) name: front

End

Killing PID 29668
Connection to hpc4 closed by remote host.
Connection to hpc4 closed.

Sometimes it happens, sometimes it doesn't. Any idea what could be causing this behaviour?

wyldckat April 25, 2014 11:51

Greetings Vincent,

That's a curious occurrence. A few questions:
  1. Which Python version are you using?
  2. Can you give specific examples of the commands you're using?
  3. Are you using any scripts for launching the PyFoam scripts?
  4. Are the simulations being launched by a job scheduler?
Best regards,
Bruno

gschaider April 25, 2014 15:27

That is not a lot you're giving us to work with. What SHOULD happen? (In addition to the stuff that Bruno asked.)

I've never seen that kind of behaviour and can only guess (although most guesses say "it's not my (PyFoam) fault, but something weird involving the termination signal of the thread that runs the OF command"). But that is a REALLY wild guess.

leroyv April 27, 2014 06:07

Dear Bruno,

Thank you for answering so quickly. Here are the answers to your questions:
  1. I'm using Python 2.7.3
  2. An example of the kind of script I use is included below
  3. I am directly launching my Allrun.py script from an interactive ZSH session (over SSH)
  4. I don't use a job scheduler
Vincent

Allrun.py
Code:

#!/usr/bin/python

# Run script
# Requires pyFoam

import os, subprocess, configobj
from PyFoam.Applications.ClearCase import ClearCase
from PyFoam.Applications.Runner import Runner
from PyFoam.Applications.FromTemplate import FromTemplate

def touch(path):
    with open(path, 'a'):
        os.utime(path, None)

def main():
    """
    Steps for the run:
    - Cleanup the case (remove time directories)
    - Build the mesh
    - Run the createPatch tool
    - Run the mapFields utility (import average fields)
    - Run the decomposePar tool
    - Run the application
    - Run the postprocessing programs (reconstructPar, sample, foamCalc and foamCalcEx)
    - Generate the .foam files for visualization with ParaView
    """

    nProc = 2
    rootPath = os.getcwd()

    # Load case parameters
    CaseParameters = configobj.ConfigObj("CaseParameters.ini")

    # Cleanup
    ClearCase()

    # Build dictionaries
    os.chdir(os.path.join(rootPath, "constant/polyMesh/"))
    subprocess.call(["python", "blockMeshDict.py"])
    os.chdir(rootPath)

    os.chdir(os.path.join(rootPath, "system/"))
    subprocess.call(["python", "createPatchDict.py"])
    os.chdir(rootPath)

    for dictName in CaseParameters['Dictionaries']:
        FromTemplate(args=[ dictName, CaseParameters['Dictionaries'][dictName] ])

    # Move resulting files in 0/templates to their runtime location
    os.rename("0/templates/TTilde", "0/TTilde")

    # Build mesh
    Runner(args=["blockMesh"])

    # Run the createPatch utility
    Runner(args=["createPatch", "-overwrite"])

    # Map input average fields
    Runner(args=[
                "mapFields",
                "input/average"
                ])

    # Map input velocity deviations
    Runner(args=[
                "mapFields",
                "-consistent",
                "input/cell"
                ])
   

    # Decompose
    Runner(args=[
                "decomposePar",
                "-force"
                ])

    # Run application
    Runner(args=[
                "--proc=%d"%nProc,
                "thermoDownscalingFoam",
                ])

    # Reconstruct if necessary
    Runner(args=["reconstructPar"])

    # Create visualization files if necessary
    touch("case.foam")


if __name__ == "__main__":
    main()


leroyv April 27, 2014 06:13

Dear Bernhard,

Thank you for your answer. In the example output I posted, the killing happens right at the end of the execution of the blockMesh utility using a Runner object (see the Allrun.py script posted above). The script should then proceed with the execution of various utilities and a solver.

Vincent

gschaider April 27, 2014 19:11

I had a look: PyFoam only kills a run if it receives a keyboard interrupt (i.e. the user pressed Ctrl-C or similar). So it seems that something happened that "looks like a keyboard interrupt". But I have no idea what that could be in your case.

wyldckat April 28, 2014 14:38

Greetings to all!

There is a feature in SSH connections for keeping a connection alive, but I'm not certain how it works. One of the possible strategies might be similar to a keyboard character being sent periodically.

I'm not familiar enough with Python to know this, but when using bash scripts, it's possible to use the "-e" set option:
Code:

set -e
which makes the script terminate as soon as a command fails. It can be turned off later in the script with:
Code:

set +e
I can only guess that Python might have a similar feature, i.e. keep running unless a major error occurs.

Best regards,
Bruno
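
On the Python side, the closest analogue to "set -e" is probably exception handling: an uncaught exception stops a script, and subprocess.check_call() raises CalledProcessError when a command exits with a non-zero status. A minimal sketch, assuming the steps are plain external commands (run_all and the stand-in commands are hypothetical names, not part of PyFoam):

```python
import subprocess
import sys


def run_all(steps):
    """Run commands in order and stop at the first failure, like `set -e`.

    Returns (commands that completed, exit status of the failing one or 0).
    """
    done = []
    for cmd in steps:
        try:
            # check_call raises CalledProcessError on a non-zero exit status
            subprocess.check_call(cmd)
        except subprocess.CalledProcessError as err:
            return done, err.returncode
        done.append(cmd)
    return done, 0


if __name__ == "__main__":
    # Stand-in commands so the sketch runs without OpenFOAM installed
    ok = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_all([ok, bad, ok]))
```

Catching the exception around a single step and carrying on would play the role of "set +e".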

gschaider April 29, 2014 05:35

The problem is not the SSH connection closing (otherwise we wouldn't see the "Killing" message, which comes from PyFoam). The problem is that PyFoam thinks that somebody pressed Ctrl-C and decides to kill the process. That information comes from the thread subsystem, so there is not much I can do about it. Probably whatever passes on this wrong information also passes it to the running shell, and that closes the connection.

One thing: it seems that the output from PyFoam messes up some terminals. What you could try is to add the "--silent" option to the Runner calls. This prints nothing to the terminal, but you should still have the log files. (Don't use "--progress" because it tends to make this worse: some terminals have problems with '\r'.)

The other option is to go to PyFoam.Execution.BasicRunner.py and add print statements before self.run.interrupt() to find out how the interrupt got there. There should also be traces in ~/pyFoam/log/general, and the PyFoamState.TheState in the case should read "Interrupted" if PyFoam thinks that someone Ctrl-C'ed it. But that is only diagnostic, and I wouldn't know how to fix this.
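
One way to get at this kind of diagnostic without editing PyFoam itself might be a SIGINT handler in the calling script that records where the interrupt arrived. This is only a sketch: it assumes the interrupt reaches the process as a real SIGINT (which the thread does not confirm), the function names and the log path are hypothetical, and signal.signal() has to be called from the main thread.

```python
import os
import signal
import traceback


def describe_interrupt(signum, frame):
    """Build a text report: signal number, PID, parent PID and Python stack."""
    if frame is not None:
        stack = "".join(traceback.format_stack(frame))
    else:
        stack = "<no frame>\n"
    return "signal %d received in PID %d (parent %d)\n%s" % (
        signum, os.getpid(), os.getppid(), stack)


def install_sigint_logger(log_path):
    """Append a report to log_path on every SIGINT, then interrupt as usual."""
    def handler(signum, frame):
        with open(log_path, "a") as log:
            log.write(describe_interrupt(signum, frame))
        # Preserve the default behaviour after logging
        raise KeyboardInterrupt

    signal.signal(signal.SIGINT, handler)
```

Calling install_sigint_logger("sigint.log") at the top of the run script would leave a trace for each SIGINT delivered to that Python process; an interrupt generated inside PyFoam by other means would not show up there.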

leroyv April 29, 2014 05:52

Thank you for your advice, gentlemen. I will try to do what Bernhard suggests ASAP and I'll come back if I make any progress.

Regards,

Vincent

leroyv May 5, 2014 07:55

Update
 
Dear Bernhard,

I suppressed the terminal output using the --silent option. Killings still occurred. I noticed that they seem to happen mostly at the end of the execution of utilities (any of them), and quite randomly. I then tried running the utilities with the UtilityRunner class, and the killings stopped occurring.

However, I still get errors like:
Code:

Getting LinuxMem: [Errno 2] No such file or directory: '/proc/27467/status'
I browsed the forum and found this post: http://www.cfd-online.com/Forums/ope...tml#post193961. Might this be related to my problem?

gschaider May 5, 2014 12:54

Hm. UtilityRunner from PyFoam.Applications or PyFoam.Execution? In the latter case one big difference is that NO server process is constructed automatically. What you could try is whether the --no-server-process option makes a difference for the PyFoam.Applications.Runner class.

In any case: if you're writing scripts, PyFoam.Execution.UtilityRunner might be the better choice (the PyFoam.Applications classes are mainly for quickly "translating" what you did on the shell into a script).

leroyv May 9, 2014 10:44

Solution found
 
Okay, that seems to be the thing: session killings stopped right after I added the --no-server-process option. Thank you!
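
The fix reported here can be applied to every Runner call in a script like Allrun.py by building the argument list in one place. A minimal sketch: runner_args is a hypothetical helper, and the only options it emits are the ones mentioned in this thread (--no-server-process, --silent, --proc).

```python
def runner_args(command, no_server=True, silent=False, nproc=None):
    """Build an args list for PyFoam.Applications.Runner.

    command is the OpenFOAM command as a list, e.g. ["createPatch", "-overwrite"].
    Options are placed before the command, as in the Allrun.py script.
    """
    args = []
    if no_server:
        args.append("--no-server-process")
    if silent:
        args.append("--silent")
    if nproc is not None:
        args.append("--proc=%d" % nproc)
    return args + list(command)


# Intended use inside the run script (requires PyFoam, so left as comments):
#   from PyFoam.Applications.Runner import Runner
#   Runner(args=runner_args(["blockMesh"]))
#   Runner(args=runner_args(["thermoDownscalingFoam"], nproc=2))
```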

leroyv July 2, 2014 05:29

Hi,

I have been working on the problem again. I still get the getLinuxMem errors I mentioned. After testing on several configurations, it seems to be happening more frequently on my production configuration than on my dev computer. There are major differences between them, but the most important one is that the production computer has much faster processors than the development one.

This also happens only on very 'small' cases. When processing big amounts of data, I do not see these messages.

So my hypothesis would be: maybe this is just a matter of execution speed. Maybe the process started by any of the Runner classes terminates so quickly that, by the time a function that uses its PID runs (for example to read /proc/<PID>/status), the process is already gone and the lookup fails. What do you think about that?
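
The hypothesis can be illustrated without PyFoam: once a child process has exited and been waited for, its /proc/<PID> entry disappears, so a late read of /proc/<PID>/status fails with Errno 2. The sketch below tolerates that case instead of reporting an error. The function names and the choice of the VmPeak field are assumptions for illustration; the thread does not say which fields PyFoam reads.

```python
import os
import subprocess
import sys


def read_vm_peak_kb(pid):
    """Return VmPeak in kB from /proc/<pid>/status, or None if unavailable."""
    try:
        with open("/proc/%d/status" % pid) as status:
            for line in status:
                if line.startswith("VmPeak:"):
                    return int(line.split()[1])
    except (IOError, OSError):
        # The process has already gone away (or there is no /proc at all)
        return None
    return None


def demo_race():
    """Start a very short child, wait for it, then look at it 'too late'."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()
    return read_vm_peak_kb(child.pid)


if __name__ == "__main__":
    print("own process:", read_vm_peak_kb(os.getpid()))
    print("finished child:", demo_race())
```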

leroyv July 2, 2014 12:37

I guess I was right
 
I added " && sleep N", where N is a number to be chosen for every task, to give PyFoam enough time to sort things out. So the typical call looks like
Code:

UtilityRunner(
    logname="PyFoamUtility.decomposePar",
    silent=not(verbose > 1),
    argv=[
        "decomposePar",
        "-case", caseDirectory,
        "-force",
        " && sleep 1"
    ]).start()

Now, nearly all of the getLinuxMem OSErrors are gone.

gschaider July 2, 2014 18:35

That is an OK workaround, but not really a solution, as it slows things down on machines where everything would have worked anyway.

The problem (as you said) is timing: PyFoam puts the utility into another thread so that it can "look at it from outside". The problem is that sometimes the utility has finished before PyFoam can have a first look.

leroyv July 3, 2014 05:25

Ok, now I understand things better. I guess in that case, since I don't do any log analysis, I should be running the utilities directly using the subprocess.call() function.
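
A minimal sketch of that approach, assuming no log analysis is needed and that plain log files are enough: each utility is run with subprocess.call(), its output goes to a log file, and the sequence stops at the first failure. run_utility, run_steps and the log.<name> naming are hypothetical; the demo uses stand-in commands so that it runs without OpenFOAM.

```python
import os
import subprocess
import sys
import tempfile


def run_utility(cmd, log_path):
    """Run cmd, write its stdout and stderr to log_path, return the exit status."""
    with open(log_path, "w") as log:
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)


def run_steps(steps, log_dir):
    """Run (name, cmd) pairs in order; stop at the first non-zero exit status.

    Returns (name of the failing step or None, its exit status or 0).
    """
    for name, cmd in steps:
        status = run_utility(cmd, os.path.join(log_dir, "log." + name))
        if status != 0:
            return name, status
    return None, 0


if __name__ == "__main__":
    # Stand-in for real steps such as ("blockMesh", ["blockMesh"])
    demo = [("hello", [sys.executable, "-c", "print('hello')"])]
    print(run_steps(demo, tempfile.mkdtemp()))
```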


All times are GMT -4.