I'm trying to test a small server application using Python. The application works a bit like a remote shell: it takes input and outputs some data based on that input.
My initial try looks like this:
#!/usr/bin/python3
import subprocess
nc = subprocess.Popen(['/usr/bin/ncat', '127.0.0.1', '9999'],
                      stdin=subprocess.PIPE, stdout=subprocess.PIPE)
nc.stdin.write(b'test')
This doesn't work and tells me:
write: Broken pipe
However, when I test this out in a Python 3 interactive interpreter, everything seems to work just fine:
$ /usr/bin/python3
Python 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> nc = subprocess.Popen(['ncat','127.0.0.1','9999'], stdin = subprocess.PIPE, stdout = subprocess.PIPE)
>>> nc.stdin.write(b'test')
4
When I run ncat in the shell it also waits for input as expected.
So what is going on? How is the interactive interpreter different from running a script and why does it affect the subprocesses?
Some notes:
I don't want to use sockets myself. The server I'm testing is run via xinetd, which means that the executable can also be tested by running it directly. With the current solution I could change ['/usr/bin/ncat','127.0.0.1','9999'] to ['./server_executable'] and test everything without the networking in between (see the sketch after these notes). With sockets I'd lose this functionality.
I know about pexpect and I have even considered using it. But even if I end up not using subprocess, I'd like to know what the heck is happening here.
When I provide the server executable to Popen, everything works fine. This could hint that there is some problem with ncat and how it is run (I don't know if it's relevant, but I use ncat version 7.40).
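To illustrate the swap from the first note, here is a minimal sketch; run_case is a hypothetical helper, and communicate() closes stdin and collects the output in one step:
import subprocess

def run_case(cmd, payload):
    # hypothetical helper: start the command, feed it payload, collect stdout
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = proc.communicate(payload)
    return out

# the same harness, with and without the networking in between
print(run_case(['/usr/bin/ncat', '127.0.0.1', '9999'], b'test'))
print(run_case(['./server_executable'], b'test'))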
Related
I am trying to run code in the Python interpreter from a Python script (on Windows, using the terminal built into VS Code), but I can't make anything work. I have spent a lot of time using the subprocess module, and have also tried the os module, but the issue with those is that they cannot run code in the interpreter. So, I can make them start the interpreter, and I can enter code myself, which my script can get the result of (stdout and stderr), but the script cannot enter code into the interpreter. I have tried running multiple commands in a row, using \n\r in the commands, and a few other attempts, but it always runs the second command/line after I manually quit() the interpreter. I have tried almost all of the functions from the subprocess module, and have tried numerous configurations for stdin, stdout, and stderr.
So, my question is: how can I have a script enter code into the interpreter?
It would also be nice to collect the results in real time, so my script does not have to start and quit an instance of the interpreter every time it wants the results, but that is not a priority.
Example of the issue with the os module (but the issue is more or less the same with subprocess):
My code:
import os
print(os.popen("python").read())
print(os.popen("1 + 1").read())
Result:
Python 3.10.8 (tags/v3.10.8:aaaf517, Oct 11 2022, 16:50:30) [MSC v.1933 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 1 + 2 #entered by me
>>> quit() #entered by me
3 #what the print statement returns
'1' is not recognized as an internal or external command,
operable program or batch file.
P.S. I am aware there is another question about this issue, but the only answer it has does not work for me. (When using the module they suggest, Python cannot find the module even after I installed it.)
EDIT: my code with subprocess:
import subprocess as sp
c = sp.Popen("python", text=True, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
c.stdin.write("1 + 1")
c.stdin.close()
print(c.stdout.read())
Use the subprocess library like this.
import sys
import subprocess
p = subprocess.run(sys.executable, text=True, capture_output=True,
                   input='print(1+1)')
print(p.stdout)
print(p.stderr)
If you want to reuse a single child process, you have to implement a client and server system. One easy method is to implement a remote call with multiprocessing.Manager. See the example in the documentation.
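For reference, a minimal sketch of such a client/server pair with multiprocessing's BaseManager; the class names, port, and authkey are illustrative, not taken from the documentation example:
# server.py: one long-lived process that evaluates expressions on request
from multiprocessing.managers import BaseManager

class Evaluator:
    def evaluate(self, expr):
        # runs inside the server process; the result is pickled back to the caller
        return eval(expr)

class EvalManager(BaseManager):
    pass

EvalManager.register('Evaluator', Evaluator)

if __name__ == '__main__':
    mgr = EvalManager(address=('127.0.0.1', 50000), authkey=b'secret')
    mgr.get_server().serve_forever()

# client.py: connects to the running server and calls into it
from multiprocessing.managers import BaseManager

class EvalManager(BaseManager):
    pass

EvalManager.register('Evaluator')
mgr = EvalManager(address=('127.0.0.1', 50000), authkey=b'secret')
mgr.connect()
ev = mgr.Evaluator()
print(ev.evaluate('1 + 1'))  # 2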
As a side note, I don't recommend any of this unless you have a good reason for spawning a child process, such as sandboxing an execution environment. Just use eval() in the parent process: the child process ends up doing the same work that eval() would have done in the parent.
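That is, with the expression hard-coded for illustration:
# no child process needed: evaluate directly in this process
# (eval() executes arbitrary code, so only feed it trusted input)
result = eval('1 + 1')
print(result)  # 2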
I often wish to perform Unix commands from inside Python, but I have found recently that some commands are not found. An example is the 'limit' command:
$ echo $SHELL
/bin/tcsh
$ limit vmemoryuse 1000m
$ python
Python 2.7.3 (default, Aug 3 2012, 20:09:51)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.system("echo $SHELL")
/bin/tcsh
0
>>> os.system("limit vmemoryuse 1000m")
sh: limit: command not found
32512
>>>
Another example is the 'setenv' command. Why do these commands not work inside Python? I have tried using both the 'os' and 'subprocess' modules without success. Does anybody know of another module or method that will allow me to successfully call these commands from inside Python?
That's because some shell commands are not really programs, but internal shell commands.
The classical example is cd: if it were an external program it would change the current directory of the new process, not that of the shell, so it cannot be an external program.
Roughly speaking there are two types of internal shell commands:
Commands that are implemented by the shell for efficiency's sake, but which also exist as standalone programs: true, false, test, sleep...
Commands that change the environment of the shell, and so cannot be done from a child process: cd, umask, setenv, ulimit...
The commands in the first category are quite shell specific. The commands in the second category, not so much.
For details see the man page of the relevant shell (man bash for example).
And if you want to know about a specific command, run:
$ type -a <command>
type is a bashism; I don't know the equivalent in tcsh, but which is an external program, so this:
$ which -a <command>
will show you whether your command exists as an external program, but it knows nothing about shell internals.
If you need the functionality of an internal command (of type 2 above) in your Python program you need to use the relevant system call. Hopefully it will already be available in some module. If not, you would need to write your own wrapper in C.
About your specific commands:
The environment (setenv and getenv) can be manipulated with os.environ or os.getenv, os.putenv, etc.
For the process limits (limit) take a look at the resource module.
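A minimal sketch of both, assuming Linux and that tcsh's vmemoryuse corresponds to RLIMIT_AS:
import os
import resource

# setenv MY_VAR value  ->  modify this process's environment
os.environ['MY_VAR'] = 'value'

# limit vmemoryuse 1000m  ->  cap this process's address space (in bytes)
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (1000 * 1024 * 1024, hard))
Child processes spawned afterwards (via os.system, subprocess, etc.) inherit both settings.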
I am observing different output from a compiled C++ binary, which calls some OpenCV libraries, when it is executed via the Python interpreter launched via manage.py ($ python2.7 manage.py shell) versus the standard Python interpreter ($ python2.7). The output obtained from the bash shell is equivalent to that of the standard Python interpreter.
It appears that something is different about the 'environment' of the python interpreter launched via manage.py as compared to the standard python shell. I'd like to know how to determine the difference between the two interpreters and ultimately how to have the result of the binary execution be the same.
Setup details:
connected to web server via ssh (putty)
Centos6-64bit
/bin/bash
From within my Django project directory I run the following and the processed image (result of executing the binary file) is as I expect:
$ python2.7
Python 2.7.1 (r271:86832, Sep 13 2011, 19:13:17)
[GCC 4.4.4 20100726 (Red Hat 4.4.4-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> import os
>>> subprocess.call(['/home/username/engine/binary','/home/username/imagetmp/image.jpg'])
0
>>>
From within my Django project directory I run the following and the processed image is NOT as I expect:
$ python2.7 manage.py shell
Python 2.7.1 (r271:86832, Sep 13 2011, 19:13:17)
[GCC 4.4.4 20100726 (Red Hat 4.4.4-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import subprocess
>>> import os
>>> subprocess.call(['/home/username/engine/binary','/home/username/imagetmp/image.jpg'])
0
>>>
On the command line I run the following and the processed image is as I expect.
$ pwd
/home/username/engine/
$ ./binary /home/username/imagetmp/image.jpg
$
Within both python interpreters I've compared the following:
sys.path (the result obtained from the python interpreter launched via manage.py has the path to my django project while the result from the standard python interpreter does not)
os.environ (the result obtained from the python interpreter launched via manage.py includes the DJANGO_SETTINGS_MODULE and CELERY_LOADER environment variables while the result from the standard python interpreter does not)
os.stat for every file in /home/username/engine and /home/username/engine/libs .. with no differences observed
I've also tried modifying the subprocess call which had no impact:
subprocess.call(['/home/username/engine/binary','/home/username/imagetmp/image.jpg'], env=os.environ)
So the differences I've noted are:
the (InteractiveConsole) line when the interpreter is started via $ python2.7 manage.py shell. I'm unsure what this extra line means, what its presence implies, and whether it's the cause of the difference I observe in behavior.
the slight differences in results of sys.path and os.environ
My conclusion is that there is something fundamental that I'm unaware of regarding the differences between the python interpreter launched via manage.py versus the standard python interpreter. Any thoughts you have on how to debug this situation would be greatly appreciated.
I'd say that probably the main difference to look at is the environment variable changes that you already noticed.
To make sure that the environment variables are the same in both shells, you can try to create your own customized environment using the env parameter to subprocess.Popen. Once you get the binary working on any of them, it should work the same way in the other one.
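For example, one way to test whether the Django-specific variables matter (paths and variable names taken from the question):
import os
import subprocess

# start from the current environment, then drop the Django-specific entries
env = os.environ.copy()
env.pop('DJANGO_SETTINGS_MODULE', None)
env.pop('CELERY_LOADER', None)

subprocess.call(['/home/username/engine/binary',
                 '/home/username/imagetmp/image.jpg'], env=env)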
The root cause of this issue was the way file paths were being handled in the binary. Once we realized this and remedied it in the binary, we observed the correct behaviour, i.e. execution in the python interpreter launched via manage.py led to the same result as execution in the standard python interpreter.
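If path handling ever depends on the working directory, pinning cwd in the call is one quick way to rule that out from either interpreter (a sketch, assuming that mechanism):
import subprocess

# force the binary's working directory regardless of how Python was started
subprocess.call(['/home/username/engine/binary',
                 '/home/username/imagetmp/image.jpg'],
                cwd='/home/username/engine')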
I just want to see the state of the process. Is it possible to attach a console to the process, so I can invoke functions inside it and see some of the global variables?
It's better if the process keeps running without being affected (of course performance can drop a little bit).
This will interrupt your process (unless you start it in a thread), but you can use the code module to start a Python console:
import code
code.interact()
This will block until the user exits the interactive console by executing exit().
The code module is available in at least Python v2.6, probably others.
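By default the console starts with a fresh namespace; to inspect your program's own state, pass it in via the local parameter (a minimal sketch, variable name illustrative):
import code

request_count = 42  # some program state worth inspecting

# hand the current namespace to the console so it can read and modify it
code.interact(local=dict(globals(), **locals()))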
I tend to use this approach in combination with signals for my Linux work (for Windows, see below). I slap this at the top of my Python scripts:
import code
import signal
signal.signal(signal.SIGUSR2, lambda sig, frame: code.interact())
And then trigger it from a shell with kill -SIGUSR2 <PID>, where <PID> is the process ID. The process then stops whatever it is doing and presents a console:
Python 2.6.2 (r262:71600, Oct 9 2009, 17:53:52)
[GCC 3.4.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>
Generally from there I'll load the server-side component of a remote debugger like the excellent WinPDB.
Windows is not a POSIX-compliant OS, and so does not provide the same signals as Linux. However, Python v2.2 and above expose a Windows-specific signal SIGBREAK (triggered by pressing CTRL+Pause/Break). This does not interfere with normal CTRL+C (SIGINT) operation, and so is a handy alternative.
Therefore a portable, but slightly ugly, version of the above is:
import code
import signal
signal.signal(
    vars(signal).get("SIGBREAK") or vars(signal).get("SIGUSR2"),
    lambda sig, frame: code.interact()
)
Advantages of this approach:
No external modules (all standard Python stuff)
Barely consumes any resources until triggered (2x import)
Here's the code I use in my production environment which will load the server-side of WinPDB (if available) and fall back to opening a Python console.
# Break into a Python console upon SIGUSR1 (Linux) or SIGBREAK (Windows:
# CTRL+Pause/Break). To be included in all production code, just in case.
def debug_signal_handler(signal, frame):
    del signal
    del frame
    try:
        import rpdb2
        print
        print
        print "Starting embedded RPDB2 debugger. Password is 'foobar'"
        print
        print
        rpdb2.start_embedded_debugger("foobar", True, True)
        rpdb2.setbreak(depth=1)
        return
    except StandardError:
        pass
    try:
        import code
        code.interact()
    except StandardError as ex:
        print "%r, returning to normal program flow" % ex

import signal
try:
    signal.signal(
        vars(signal).get("SIGBREAK") or vars(signal).get("SIGUSR1"),
        debug_signal_handler
    )
except ValueError:
    # Typically: ValueError: signal only works in main thread
    pass
If you have access to the program's source-code, you can add this functionality relatively easily.
See Recipe 576515: Debugging a running python process by interrupting and providing an interactive prompt (Python)
To quote:
This provides code to allow any python program which uses it to be interrupted at the current point, and communicated with via a normal python interactive console. This allows the locals, globals and associated program state to be investigated, as well as calling arbitrary functions and classes.
To use, a process should import the module, and call listen() at any point during startup. To interrupt this process, the script can be run directly, giving the process Id of the process to debug as the parameter.
Another implementation of roughly the same concept is provided by rconsole. From the documentation:
rconsole is a remote Python console with auto completion, which can be used to inspect and modify the namespace of a running script.
To invoke in a script do:
from rfoo.utils import rconsole
rconsole.spawn_server()
To attach from a shell do:
$ rconsole
Security note: The rconsole listener started with spawn_server() will accept any local connection and may therefore be insecure to use in shared hosting or similar environments!
Use pyrasite-shell. I can't believe it works so well, but it does. "Give it a pid, get a shell".
$ sudo pip install pyrasite
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope # If YAMA activated, see below.
$ pyrasite-shell 16262
Pyrasite Shell 2.0
Connected to 'python my_script.py'
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> globals()
>>> print(db_session)
>>> run_some_local_function()
>>> some_existing_local_variable = 'new value'
This launches the python shell with access to the globals() and locals() variables of that running python process, and other wonderful things.
I've only tested this personally on Ubuntu, but it seems to cater for OS X too.
Adapted from this answer.
Note: The line switching off the ptrace_scope property is only necessary for kernels/systems that have been built with CONFIG_SECURITY_YAMA on. Take care messing with ptrace_scope in sensitive environments because it could introduce certain security vulnerabilities. See here for details.
Why not simply use the pdb module? It allows you to stop a script, inspect element values, and execute the code line by line. And since it is built upon the Python interpreter, it also provides the features of the classic interpreter. To use it, just put these two lines in your code where you wish to stop and inspect:
import pdb
pdb.set_trace()
Another possibility, without adding stuff to the python scripts, is described here:
https://wiki.python.org/moin/DebuggingWithGdb
Unfortunately, this solution also requires some forethought, at least to the extent that you need to be using a version of python with debugging symbols in it.
pdb_attach worked well for us for attaching the Python debugger to a long-running process.
The author describes it as follows:
This package was made in response to frustration over debugging long running processes. Wouldn't it be nice to just attach pdb to a running python program and see what's going on? Well that's exactly what pdb-attach does.
Set it up as follows in your main module:
import pdb_attach
pdb_attach.listen(50000) # Listen on port 50000.
When the program is running, attach to it by calling pdb_attach from the command line with the PID of the program and the port passed to pdb_attach.listen():
$ python -m pdb_attach <PID> 50000
(Pdb) # Interact with pdb as you normally would
You can use my project madbg. It is a python debugger that allows you to attach to a running python program and debug it in your current terminal. It is similar to pyrasite and pyringe, but supports python3, doesn't require gdb, and uses IPython for the debugger (which means pdb with colors and autocomplete).
For example, to see where your script is stuck, you could run:
madbg attach <pid>
After that you will have a pdb shell, in which you can invoke functions and inspect variables.
Using PyCharm, I was getting a failure to connect to the process on Ubuntu. The fix for this is to disable YAMA. For more info see askubuntu:
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
I am wondering, is there a way to change the name of a script so that it is not called "python.exe" in the tasklist? The reason I am asking is that I am trying to make a batch file that runs a Python script. I want the batch file to check whether the script is already running; if it is, the batch file will do nothing. Thanks
Maybe you can try this: http://code.google.com/p/procname/
This library does not work on Windows, and shouldn't be used in production code. Manipulating the argv array is a rather dirty hack.
Generally I'd not try to identify processes by scanning the process table. This is not really reliable, as process names aren't guaranteed to be unique. Instead I'd spawn a simple server on localhost inside the python script. On startup, the script then tries to connect to that server and quits if the server is already running. This approach can later on also be expanded to support any kind of IPC.
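A minimal sketch of that approach; the port number is arbitrary, it just has to be fixed per script:
import socket
import sys

# try to claim a fixed localhost port; if it's taken, another instance owns it
lock_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    lock_socket.bind(('127.0.0.1', 47200))  # arbitrary fixed port
except OSError:
    sys.exit('Another instance is already running.')

# ... rest of the script; keep lock_socket referenced so the port stays bound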
You could use py2exe to convert the Python script to a .exe file which means you could then give it any name you like.
Alternatively you could use Python itself (rather than a .bat file), using the approaches given at Reading Command Line Arguments of Another Process (Win32 C code) to determine the names of the scripts being run by the 'python.exe' processes.
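As a hedged alternative to the Win32 C approach, the third-party psutil module can read other processes' command lines; a sketch, with the script name illustrative:
import psutil  # third-party: pip install psutil

def script_running(script_name):
    # scan all processes and look for the script name in their command lines
    for proc in psutil.process_iter(['cmdline']):
        cmdline = proc.info['cmdline'] or []
        if any(script_name in arg for arg in cmdline):
            return True
    return False

if script_running('my_script.py'):
    print('already running')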
I'd simply create a lockfile in the local filesystem and exit if it already exists.
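A minimal sketch of the lockfile idea (path illustrative); note that a stale lockfile survives a crash, so real code usually writes the PID into it and validates it:
import os
import sys
import tempfile

lockfile = os.path.join(tempfile.gettempdir(), 'my_script.lock')
try:
    # O_EXCL makes creation fail if the file already exists
    fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
except FileExistsError:
    sys.exit('Already running.')
try:
    pass  # ... the real work goes here ...
finally:
    os.close(fd)
    os.remove(lockfile)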
Copy python.exe to a file name of your choice.
C:\Python26>copy python.exe my_proc.exe
1 file(s) copied.
C:\Python26>my_proc.exe
Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
In the tasklist it is showing as my_proc.exe.
I've tried making a symlink of python.exe (mklink in Windows 7). Unfortunately it still shows up as python.exe in the task list.