I just want to see the state of the process. Is it possible to attach a console to the process, so that I can invoke functions inside it and inspect some of the global variables?
It would be best if the process kept running without being affected (of course, performance can drop a little).
This will interrupt your process (unless you start it in a thread), but you can use the code module to start a Python console:
import code
code.interact()
This will block until the user exits the interactive console by executing exit().
The code module is available in at least Python v2.6, probably others.
I tend to use this approach in combination with signals for my Linux work (for Windows, see below). I slap this at the top of my Python scripts:
import code
import signal
signal.signal(signal.SIGUSR2, lambda sig, frame: code.interact())
And then trigger it from a shell with kill -SIGUSR2 <PID>, where <PID> is the process ID. The process then stops whatever it is doing and presents a console:
Python 2.6.2 (r262:71600, Oct 9 2009, 17:53:52)
[GCC 3.4.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>
Generally from there I'll load the server-side component of a remote debugger like the excellent WinPDB.
Windows is not a POSIX-compliant OS, and so does not provide the same signals as Linux. However, Python v2.2 and above expose a Windows-specific signal SIGBREAK (triggered by pressing CTRL+Pause/Break). This does not interfere with normal CTRL+C (SIGINT) operation, and so is a handy alternative.
Therefore a portable, but slightly ugly, version of the above is:
import code
import signal
signal.signal(
    vars(signal).get("SIGBREAK") or vars(signal).get("SIGUSR2"),
    lambda sig, frame: code.interact()
)
Advantages of this approach:
No external modules (all standard Python stuff)
Barely consumes any resources until triggered (2x import)
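One caveat with the one-line handlers above: code.interact() called without arguments starts the console in a fresh, nearly empty namespace, so the program's own globals are not directly visible. A minimal sketch of a handler that exposes the interrupted frame's variables instead is shown below. The helper names namespace_from_frame, open_console and install_debug_console are made up for this illustration, and merging globals and locals into one dictionary is an assumption about what is convenient, not part of the original recipe.

```python
import code
import signal


def namespace_from_frame(frame):
    """Build a console namespace from the interrupted frame (hypothetical helper)."""
    namespace = {}
    namespace.update(frame.f_globals)    # module-level names of the interrupted code
    namespace.update(frame.f_locals)     # locals take precedence over globals
    namespace["_frame"] = frame          # keep the frame itself for walking the stack
    return namespace


def open_console(sig, frame):
    """Signal handler: open a console that can see the interrupted code's names."""
    code.interact(
        banner="Signal %d received; exit the console to resume." % sig,
        local=namespace_from_frame(frame),
    )


def install_debug_console():
    """Register the handler on SIGBREAK (Windows) or SIGUSR2 (POSIX), if present."""
    signum = getattr(signal, "SIGBREAK", None) or getattr(signal, "SIGUSR2", None)
    if signum is not None:
        signal.signal(signum, open_console)   # must be called from the main thread
    return signum
```

Call install_debug_console() once at startup. Assignments made in the console land in the merged dictionary, so they do not rebind the program's variables, although mutable objects can still be changed in place.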
Here's the code I use in my production environment which will load the server-side of WinPDB (if available) and fall back to opening a Python console.
# Break into a Python console upon SIGUSR1 (Linux) or SIGBREAK (Windows:
# CTRL+Pause/Break). To be included in all production code, just in case.
def debug_signal_handler(signal, frame):
    del signal
    del frame

    try:
        import rpdb2
        print
        print
        print "Starting embedded RPDB2 debugger. Password is 'foobar'"
        print
        print
        rpdb2.start_embedded_debugger("foobar", True, True)
        rpdb2.setbreak(depth=1)
        return
    except StandardError:
        pass

    try:
        import code
        code.interact()
    except StandardError as ex:
        print "%r, returning to normal program flow" % ex

import signal
try:
    signal.signal(
        vars(signal).get("SIGBREAK") or vars(signal).get("SIGUSR1"),
        debug_signal_handler
    )
except ValueError:
    # Typically: ValueError: signal only works in main thread
    pass
If you have access to the program's source-code, you can add this functionality relatively easily.
See Recipe 576515: Debugging a running python process by interrupting and providing an interactive prompt (Python)
To quote:
This provides code to allow any python program which uses it to be interrupted at the current point, and communicated with via a normal python interactive console. This allows the locals, globals and associated program state to be investigated, as well as calling arbitrary functions and classes.
To use, a process should import the module, and call listen() at any point during startup. To interrupt this process, the script can be run directly, giving the process Id of the process to debug as the parameter.
Another implementation of roughly the same concept is provided by rconsole. From the documentation:
rconsole is a remote Python console with auto completion, which can be used to inspect and modify the namespace of a running script.
To invoke in a script do:
from rfoo.utils import rconsole
rconsole.spawn_server()
To attach from a shell do:
$ rconsole
Security note: The rconsole listener started with spawn_server() will accept any local connection and may therefore be insecure to use in shared hosting or similar environments!
Use pyrasite-shell. I can't believe it works so well, but it does. "Give it a pid, get a shell".
$ sudo pip install pyrasite
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope # If YAMA activated, see below.
$ pyrasite-shell 16262
Pyrasite Shell 2.0
Connected to 'python my_script.py'
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> globals()
>>> print(db_session)
>>> run_some_local_function()
>>> some_existing_local_variable = 'new value'
This launches the python shell with access to the globals() and locals() variables of that running python process, and other wonderful things.
I have only tested this personally on Ubuntu, but it seems to cater for OS X too.
Adapted from this answer.
Note: The line switching off the ptrace_scope property is only necessary for kernels/systems that have been built with CONFIG_SECURITY_YAMA on. Take care messing with ptrace_scope in sensitive environments because it could introduce certain security vulnerabilities. See here for details.
Why not simply use the pdb module? It allows you to stop a script, inspect the values of its variables, and execute the code line by line. And since it is built upon the Python interpreter, it also offers the features of the classic interpreter. To use it, just put these two lines in your code where you wish to stop and inspect it:
import pdb
pdb.set_trace()
Another possibility, without adding stuff to the python scripts, is described here:
https://wiki.python.org/moin/DebuggingWithGdb
Unfortunately, this solution also requires some forethought, at least to the extent that you need to be using a version of python with debugging symbols in it.
pdb_attach worked well for us for attaching the Python debugger to a long-running process.
The author describes it as follows:
This package was made in response to frustration over debugging long running processes. Wouldn't it be nice to just attach pdb to a running python program and see what's going on? Well that's exactly what pdb-attach does.
Set it up as follows in your main module:
import pdb_attach
pdb_attach.listen(50000) # Listen on port 50000.
When the program is running, attach to it by calling pdb_attach from the command line with the PID of the program and the port passed to pdb_attach.listen():
$ python -m pdb_attach <PID> 50000
(Pdb) # Interact with pdb as you normally would
You can use my project madbg. It is a python debugger that allows you to attach to a running python program and debug it in your current terminal. It is similar to pyrasite and pyringe, but supports python3, doesn't require gdb, and uses IPython for the debugger (which means pdb with colors and autocomplete).
For example, to see where your script is stuck, you could run:
madbg attach <pid>
After that you will have a pdb shell, in which you can invoke functions and inspect variables.
Using PyCharm, I was getting a failure to connect to process in Ubuntu. The fix for this is to disable YAMA. For more info see askubuntu
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
Related
I am trying to run code in the Python interpreter from a Python script (on Windows, using the terminal built into VS Code), but I can't make anything work. I have spent a lot of time with the subprocess module, and have also tried the os module, but the issue with those is that they cannot run code in the interpreter. I can make them start the interpreter, and I can enter code myself, which my script can get the result of (stdout and stderr), but the script cannot enter code into the interpreter. I have tried running multiple commands in a row, using \n\r in the commands, and a few other attempts, but the second command/line always runs only after I manually quit() the interpreter. I have tried almost all of the functions from the subprocess module, and have tried numerous configurations for stdin, stdout, and stderr.
So, my question is: How can I have a script enter code into the interpreter?
It would also be nice to collect the results in real time, so my script does not have to start and quit an instance of the interpreter every time it wants the results, but that is not a priority.
Example of the issue with the os module (the issue is more or less the same with subprocess):
My code:
import os
print(os.popen("python").read())
print(os.popen("1 + 1").read())
Result:
Python 3.10.8 (tags/v3.10.8:aaaf517, Oct 11 2022, 16:50:30) [MSC v.1933 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 1 + 2 #entered by me
>>> quit() #entered by me
3 #what the print statement returns
'1' is not recognized as an internal or external command,
operable program or batch file.
P.S. I am aware there is another question about this issue, but the only answer it has does not work for me. (When I use the module it suggests, Python cannot find the module even after I installed it.)
EDIT: my code with subprocess:
import subprocess as sp
c = sp.Popen("python", text=True, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
c.stdin.write("1 + 1")
c.stdin.close()
print(c.stdout.read())
Use the subprocess library like this:
import sys
import subprocess
p = subprocess.run(sys.executable, text=True, capture_output=True,
input='print(1+1)')
print(p.stdout)
print(p.stderr)
If you want to reuse a single child process, you have to implement a client and server system. One easy method is to implement a remote call with multiprocessing.Manager. See the example in the documentation.
As a side note, I don't recommend either approach unless you have a good reason for spawning a child process, such as sandboxing the execution environment. Otherwise just use eval() in the parent process: the child process would only do the same work that eval() does when called by the parent.
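The in-process route mentioned above can be sketched with the standard code module, which also keeps state between calls, so nothing has to be restarted to collect results. The CapturingInterpreter class below is a hypothetical helper written for this illustration; it assumes the code being run is trusted, since it executes inside the parent process.

```python
import code
import contextlib
import io


class CapturingInterpreter:
    """Run source strings in one persistent namespace and return what they print."""

    def __init__(self):
        self._interp = code.InteractiveInterpreter()

    def run(self, source):
        buffer = io.StringIO()
        # Both normal output and tracebacks are collected in the same buffer.
        with contextlib.redirect_stdout(buffer), contextlib.redirect_stderr(buffer):
            # "single" mode echoes expression values, like the >>> prompt does.
            self._interp.runsource(source, "<input>", "single")
        return buffer.getvalue()


interp = CapturingInterpreter()
interp.run("x = 20")          # state persists between calls
print(interp.run("x + 1"))    # the expression's value is echoed and captured
```

Each run() call here takes one complete statement; multi-line input would need extra handling, which this sketch leaves out.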
I often wish to perform Unix commands from inside Python, but I have found recently that some commands are not found. An example is the 'limit' command:
$ echo $SHELL
/bin/tcsh
$ limit vmemoryuse 1000m
$ python
Python 2.7.3 (default, Aug 3 2012, 20:09:51)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.system("echo $SHELL")
/bin/tcsh
0
>>> os.system("limit vmemoryuse 1000m")
sh: limit: command not found
32512
>>>
Another example is the 'setenv' command. Why do these commands not work inside Python? I have tried using both the 'os' and 'subprocess' modules without success. Does anybody know of another module or method that will allow me to successfully call these commands from inside Python?
That's because some shell commands are not really programs, but internal shell commands.
The classical example is cd: if it were an external program it would change the current directory of the new process, not the one of the shell, so it cannot be an external program.
Roughly speaking there are two types of internal shell commands:
Commands that are implemented by the shell for efficiency's sake, but that still exist as standalone programs: true, false, test, sleep...
Commands that change the environment of the shell, and so cannot be done from a child process: cd, umask, setenv, ulimit...
The commands in the first category are quite shell specific. The commands in the second category, not so much.
For details see the man page of the relevant shell (man bash for example).
And if you want to know about a specific command, run:
$ type -a <command>
type is a bashism and I don't know the equivalent in tcsh, but which is an external program, so this:
$ which -a <command>
will show you whether your command exists as an external program, but it knows nothing about shell internals.
If you need the functionality of an internal command (of type 2 above) in your Python program you need to use the relevant system call. Hopefully it will already be available in some module. If not, you would need to write your own wrapper in C.
About your specific commands:
The environment (setenv and getenv) can be manipulated with os.environ or os.getenv, os.putenv, etc.
For the process limits (limit) take a look at the resource module.
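As a rough sketch of both suggestions on a Unix system (the resource module is not available on Windows): os.environ plays the role of setenv, and resource.setrlimit with RLIMIT_AS is assumed here to be the closest counterpart of tcsh's "limit vmemoryuse". The variable name MY_SETTING and the helper functions are made up for this example.

```python
import os
import resource
import subprocess
import sys

# Counterpart of "setenv MY_SETTING 1000m": children started later inherit it.
os.environ["MY_SETTING"] = "1000m"


def limit_address_space(max_bytes):
    """Lower the soft address-space limit of the calling process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)   # the soft limit may not exceed the hard one
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))


def run_limited(command, max_bytes):
    """Run a command with the limit applied in the child only, not in this process."""
    return subprocess.run(
        command,
        preexec_fn=lambda: limit_address_space(max_bytes),
        capture_output=True,
        text=True,
    )


# The child reports the environment variable and the soft limit it actually sees.
child_code = (
    "import os, resource; "
    "print(os.environ['MY_SETTING'], resource.getrlimit(resource.RLIMIT_AS)[0])"
)
result = run_limited([sys.executable, "-c", child_code], 1000 * 1024 * 1024)
print(result.stdout)
```

Calling limit_address_space() directly, rather than through preexec_fn, would restrict the Python process itself, which is what running limit in the parent shell does.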
I'm running my Python program and have a point where it would be useful to jump in and see what's going on, and then step out again. Sort of like a temporary console mode.
In Matlab, I'd use the keyboard command to do this, but I'm not sure what the command is in python.
Is there a way to do this?
For instance:
for thing in set_of_things:
    enter_interactive_mode_here()
    do_stuff_to(thing)
When enter_interactive_mode_here() is called, I'd like to drop into a console there, look around, and then leave and have the program continue running.
code.interact() seems to work:
>>> import code
>>> def foo():
...     a = 10
...     code.interact(local=locals())
...     return a
...
>>> foo()
Python 3.6.5 (default, Apr 1 2018, 05:46:30)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> a
10
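Ctrl+D (Ctrl+Z followed by Enter on Windows) returns to the "main" interpreter.
You can read the locals, but modifying them does not work this way: the console operates on the dictionary returned by locals(), so assignments made in it are not written back to the function's variables.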
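The behaviour described above can be reproduced without typing anything, using code.InteractiveConsole.push() to stand in for keyboard input. This is a small sketch (the function name is made up) showing that the console changes the dictionary it was given, not the function's real local variable:

```python
import code


def inspect_and_modify():
    a = 10
    namespace = locals()                      # dictionary holding the current locals
    console = code.InteractiveConsole(locals=namespace)
    console.push("a = 99")                    # what a user might type at the prompt
    return a, namespace["a"]


print(inspect_and_modify())   # → (10, 99)
```

Mutable objects are a different matter: appending to a list found in the namespace changes the same list that the function sees.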
python -i myapp.py
This will execute myapp.py and drop you in the interactive shell. From there you can execute functions and check their output, with the whole environment (imports, etc.) of myapp.py loaded.
For something more sophisticated - it would be better to use a debugger like pdb, setting a breakpoint. Also, most IDEs (PyDev, PyCharm, Komodo...) have graphical debuggers.
I use pdb for this purpose. I realize Emil already mentioned this in his answer, but he did not include an example or elaborate on why it answers your question.
for thing in set_of_things:
    import pdb; pdb.set_trace()
    do_stuff_to(thing)
You can read and set variables by starting your command with an exclamation point. You can also move up and down the stack (commands u and d), which InteractiveConsole does not have built-in mechanisms to do.
To have the program continue executing, use the c command. In the above example it will enter the debugger on every loop iteration, so you might want to wrap the set_trace() call in an if statement.
You have options -- Python standard library or IPython.
The Python standard library has a code module which has an InteractiveConsole class whose purpose is to "Closely emulate the behavior of the interactive Python interpreter." This would probably be able to do what you want, but the documentation doesn't have any examples on how to use this, and I don't have any suggestions on where to go.
IPython, which is a more advanced Python terminal, has the option to embed a console at any point in your program built in. According to their documentation, you can simply do
from IPython import embed
for thing in set_of_things:
    embed()
    do_stuff_to(thing)
From Python 3.7 onwards, you can also use breakpoint() to get into the debugger, e.g.:
for thing in set_of_things:
    breakpoint()
    do_stuff_to(thing)
This is a little easier to remember and write, and will open your code in pdb by default.
However, it's also possible to set the PYTHONBREAKPOINT environment variable to the name of a callable, which could be another debugger such as pudb or ipdb, or it could be IPython's embed, or anything else.
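To illustrate, here is a minimal sketch of a conditional breakpoint() whose target is swapped at run time through sys.breakpointhook, the hook that breakpoint() calls and whose default implementation consults PYTHONBREAKPOINT. The recording_hook function and the sample data are invented for this example; a real session would leave the default hook in place so that pdb opens.

```python
import sys

hits = []


def recording_hook(*args, **kwargs):
    """Stand-in for a debugger entry point: just note that it was called."""
    hits.append(args)


def process(things):
    for thing in things:
        if thing < 0:            # break only on the items worth inspecting
            breakpoint(thing)    # arguments are passed through to the hook
    return len(things)


previous_hook = sys.breakpointhook
sys.breakpointhook = recording_hook      # swap the debugger entry point in code
try:
    process([1, -2, 3, -4])
finally:
    sys.breakpointhook = previous_hook   # restore whatever was installed before

print(hits)   # → [(-2,), (-4,)]
```

With the default hook in place, setting PYTHONBREAKPOINT=0 in the environment turns every breakpoint() call into a no-op, which is handy if such calls are left in long-running code.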
Most comfortable tool for me is ipdb.
ipdb exports functions to access the IPython debugger, which features tab completion, syntax highlighting, better tracebacks, better introspection with the same interface as the pdb module.
Completion and handy introspection is especially useful for debugging.
You can use ipdb.
To set your breakpoints, add import ipdb; ipdb.set_trace() where you want to jump into the debugger. Once you reach a breakpoint, you’ll be given an interactive shell and a few lines of code around your breakpoint for context.
https://www.safaribooksonline.com/blog/2014/11/18/intro-python-debugger/
When I launch a PowerShell script from Python, the delay seems to be approximately 45s, and I cannot figure out why.
I'm trying to run a PowerShell script (accessing some APIs only available to PowerShell) from a Python script.
I've tried a lot of permutations, and all incur ~45 second delay compared to just running the script from a command prompt, using an identical command line.
For example - sample.ps1 might say:
echo foo
And runner.py might say:
import subprocess
p = subprocess.Popen([POWERSHELL, '-File', 'sample.ps1'], stdout=subprocess.PIPE)
d = p.stdout.read()
Running the .ps1 script directly is fast, running it via runner.py (Python 2.7, 32bit on a 64bit machine) incurs 45 second delay.
The exact same thing occurs if I use "os.system", or Twisted's built-in process tools. So I suspect it's some subtle interaction between the Python interpreter and the Powershell interpreter, possibly related to creation of console windows, or handling of stdin/out/err streams? (which I know don't "really exist" in the same way on Windows)
I do not see any such delays; it is pretty snappy for me. (That will also depend on what your script actually does.) Try using call:
from subprocess import call
call(["powershell", "sample.ps1"])
PowerShell loads your user's profile by default. Use the -NoProfile argument to turn that behavior off:
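import subprocess
# POWERSHELL is the same placeholder for the PowerShell executable used in the question.
# stdout must be subprocess.PIPE for p.stdout.read() to work.
p = subprocess.Popen([POWERSHELL, '-NoProfile', '-File', 'sample.ps1'], stdout=subprocess.PIPE)
d = p.stdout.read()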
Is there a way on Linux to check what a running Python daemon process is doing? That is, without instrumenting the code and without terminating it? Preferably I'd like to get the name of the module and the line number in it that is currently running.
Conventional debugging tools such as strace, pstack and gdb are not very useful for Python code. Most stack frames just contain functions from the interpreter code, like PyEval_EvalFrameEx and PyEval_EvalCodeEx; they don't give you any hint about where in the .py file the execution is.
Some of the answers in Showing the stack trace from a running Python application are applicable in this situation:
pyrasite (this was the one that worked for me):
$ sudo pip install pyrasite
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
$ sudo pyrasite 16262 dump_stacks.py # dumps stacks to stdout/stderr of the python program
pyringe
pydbattach - couldn't get this to work, but the repository https://github.com/albertz/pydbattach contains pointer to other tools
pstack reportedly prints the python stack on Solaris
py-spy (https://github.com/benfred/py-spy) has a few useful tools for inspecting running Python processes. In particular, py-spy dump will print a stack trace (including function, file, and line) for every thread.
winpdb allows you to attach to a running python process, but to do this, you must start the python process this way:
rpdb2 -d -r script.py
Then, after setting a password:
A password should be set to secure debugger client-server communication.
Please type a password:mypassword
you can launch winpdb and use File > Attach to attach to the process (or File > Detach to detach from it).
On POSIX systems like Linux, you can use good old GDB; see
https://t37.net/debug-a-running-python-process-without-printf.html
and
https://wiki.python.org/moin/DebuggingWithGdb
There's also the excellent PyCharm IDE (free community version available) that can attach to a running Python process right from within the IDE, using Pdb 4 under the hood, see this blog entry:
http://blog.jetbrains.com/pycharm/2015/02/feature-spotlight-python-debugger-and-attach-to-process/
lptrace does exactly that. It allows you to attach to a running Python process and show currently executing functions, like strace does for system calls. You can call it like this:
vagrant@precise32:/vagrant$ sudo python lptrace -p $YOUR_PID
fileno (/usr/lib/python2.7/SocketServer.py:438)
meth (/usr/lib/python2.7/socket.py:223)
fileno (/usr/lib/python2.7/SocketServer.py:438)
meth (/usr/lib/python2.7/socket.py:223)
...
Note that it requires gdb to run, which isn't available on every server machine.
You can use madbg (by me). It is a python debugger that allows you to attach to a running python program and debug it in your current terminal. It is similar to pyrasite and pyringe, but newer, doesn't require gdb, and uses IPython for the debugger (which means colors and autocomplete).
To see the stack trace of a running program, you could run:
madbg attach <pid>
And in the debugger shell, enter:
bt
It's possible to debug Python with gdb. See Chapter 22: gdb Support in the Python Developer’s Guide.
For example, on Debian with Python 3.7:
# apt-get update -y && apt-get install gdb python3.7-dbg
# gdb
(gdb) source /usr/share/gdb/auto-load/usr/bin/python3.7-gdb.py
(gdb) attach <PID>
(gdb) py-bt
You can also use satella to do this. A nice side effect will be that every local variable in every stack frame will be printed out. The code would be:
from satella.instrumentation import Traceback
import sys
# sys._current_frames() maps each thread's ID to its topmost stack frame.
for thread_id, frame in sys._current_frames().items():
    sys.stderr.write("For stack frame of thread %s\n" % (thread_id,))
    tb = Traceback(frame)
    tb.pretty_print()
sys.stderr.write("End of stack frame dump\n")