How to get process PID for manual lock mechanism in Python?

I would like to make a simple locking mechanism in Python without having to rely on the existing libraries for locking (namely fcntl and probably others)
I already have a small stub, but after searching for a bit I couldn't find a good answer on how to manually create the lock file and put the process PID inside. Here is my stub:
dir_name = "/var/lock/mycompany"
file_name = "myapp.pid"
lock = os.path.join(dir_name, file_name)
if os.path.exists(lock):
    print >> sys.stderr, "already running under %s, exiting..." % lock
    # display process PID contained in the file, not relevant to my question
    sys.exit(ERROR_LOCK)
else:
    # create the file 'lock' and put the process PID inside
    pass
How can I get the current process PID and put it inside the lock file? I thought about looking at /proc filesystem but that seems a bit too much for such a simple task.
Thanks.

http://docs.python.org/library/os.html#os.getpid

open(lock, 'w').write(str(os.getpid()))

Note that os.getpid() returns an integer, so it must be converted with str(os.getpid()) first; write() wants a string argument.
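Putting the pieces together, here is a minimal sketch using the paths from the question. Note that opening with os.O_CREAT | os.O_EXCL makes the existence check and the file creation a single atomic step, which avoids the race between os.path.exists() and the subsequent open in the original stub:

import os
import sys

ERROR_LOCK = 1
lock = os.path.join("/var/lock/mycompany", "myapp.pid")

try:
    # O_EXCL makes this fail atomically if the file already exists.
    fd = os.open(lock, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
except OSError:
    sys.stderr.write("already running under %s, exiting...\n" % lock)
    sys.exit(ERROR_LOCK)
os.write(fd, str(os.getpid()).encode())
os.close(fd)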

Related

Check if process is running in Windows using only Python built-in modules

I know that there are a couple of ways to complete that task using psutil or win32ui modules. But I wonder if there's an option to do that using Python built-in modules only? I have also found this question:
Check if PID exists on Windows with Python without requiring libraries
But in that case the process is located by PID, and I want to do it using the process name.
Maybe this will help you:
import subprocess

s = subprocess.check_output('tasklist', shell=True)
if b"cmd.exe" in s:  # check_output returns bytes on Python 3
    print(s)
Without PyWin32, you're going to have to go at it the hard way and use Python's ctypes module. Fortunately, there is already a post about this here on StackOverflow:
How can I find a process by name and kill using ctypes?
You might also find this article useful for getting a list of the running processes:
http://code.activestate.com/recipes/305279-getting-process-information-on-windows/
If all you're trying to do is test whether a process is running by its process name, you could just import the check_output method from the subprocess module (not the entire module):
from subprocess import check_output

print('Test whether a named process appears in the task list.\n')
processname = input('Enter process name: ')  # or just assign a specific process name to the processname variable
tasks = check_output('tasklist')
if processname in str(tasks):
    print('{} is in the task list.'.format(processname))
else:
    print('{} not found.'.format(processname))
Output:
>>> Discord.exe
Discord.exe is in the task list.
>>> NotARealProcess.exe
NotARealProcess.exe not found.
(This works for me on Windows 10 using Python 3.10.) Note that since this is just searching for a specific string across the entire task list output, it will give false positives on partial process names (such as "app.exe" or "app" if "myapp.exe" is running) and other non-process text input that happens to be in the task list:
>>> cord.ex
cord.ex is in the task list.
>>> PID Session Name
PID Session Name is in the task list.
This code should generally work fine if you just want to find a known process name in the task list and are searching by the whole name. For more rigorous uses, you might want a more careful approach: parse the task list into structured records, compare against the name field only, and add some error checking for edge cases (one such sketch follows).
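If you need an exact match, one possible approach (a sketch, not tested on every Windows version) is to parse the CSV output of tasklist and compare the image-name field directly; /FO CSV and /NH are standard tasklist switches for CSV output without headers:

import csv
from subprocess import check_output

def process_running(name):
    # Field 0 of each CSV row is the image name, e.g. "cmd.exe".
    tasks = check_output(['tasklist', '/FO', 'CSV', '/NH'], text=True)
    return any(row and row[0].lower() == name.lower()
               for row in csv.reader(tasks.splitlines()))

print(process_running('Discord.exe'))

Because each image name is compared as a whole field, this avoids the partial-name false positives shown above.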

Python `tee` stdout of child process

Is there a way in Python to do the equivalent of the UNIX command line tee? I'm doing a typical fork/exec pattern, and I'd like the stdout from the child to appear in both a log file and on the stdout of the parent simultaneously without requiring any buffering.
In this Python code, for instance, the stdout of the child ends up in the log file, but not in the stdout of the parent.
pid = os.fork()
logFile = open(path, "w")
if pid == 0:
    os.dup2(logFile.fileno(), 1)
    os.execv(cmd, [cmd])  # note: execv also needs an argument list
edit: I do not wish to use the subprocess module. I'm doing some complicated stuff with the child process that requires me call fork manually.
Here is a working solution that does not use the subprocess module. (You could, however, use subprocess for the tee process while still using the exec* family of functions for your custom child: just pass stdin=subprocess.PIPE and duplicate that descriptor onto your stdout.)
import os, time, sys

pr, pw = os.pipe()

pid = os.fork()
if pid == 0:
    # Child: become the tee process, reading the pipe on stdin.
    os.close(pw)
    os.dup2(pr, sys.stdin.fileno())
    os.close(pr)
    os.execv('/usr/bin/tee', ['tee', 'log.txt'])
else:
    # Parent: send our stdout into the pipe.
    os.close(pr)
    os.dup2(pw, sys.stdout.fileno())
    os.close(pw)
    pid2 = os.fork()
    if pid2 == 0:
        # Replace with your custom process call
        os.execv('/usr/bin/yes', ['yes'])
    else:
        try:
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            pass
Note that the tee command internally does the same thing Ben suggested in his answer: it reads input and loops over the output file descriptors, writing to each of them. It may be more efficient because it is an optimized C implementation, but you pay the overhead of the extra pipes (I don't know for sure which solution is more efficient, but in my opinion reassigning a custom file-like object to stdout is the more elegant solution).
Some more resources:
How do I duplicate sys.stdout to a log file in python?
http://www.shallowsky.com/blog/programming/python-tee.html
In the following, SOMEPATH is the path to the child executable, in a format suitable for subprocess.Popen (see its docs).
import sys, subprocess

f = open('logfile.txt', 'w')
proc = subprocess.Popen(SOMEPATH, stdout=subprocess.PIPE)
while True:
    out = proc.stdout.read(1)
    if out == '' and proc.poll() is not None:
        break
    if out != '':
        # CR workaround since chars are read one by one, and Windows interprets
        # both CR and LF as ends of lines. Linux only has LF.
        if out != '\r':
            f.write(out)
        sys.stdout.write(out)
        sys.stdout.flush()
Would an approach like this do what you want?
import sys

class Log(object):
    def __init__(self, filename, mode, buffering):
        self.filename = filename
        self.mode = mode
        self.handle = open(filename, mode, buffering)

    def write(self, thing):
        self.handle.write(thing)
        sys.stdout.write(thing)
You'd probably need to implement more of the file interface for this to be really useful (and I've left out proper defaults for mode and buffering, if you want them). You could then do all your writes in the child process to an instance of Log. Or, if you wanted to be really magic, and you're sure you've implemented enough of the file interface that things won't fall over and die, you could potentially assign sys.stdout to be an instance of this class. Then I think any means of writing to stdout, including print, will go via the Log class.
Edit to add: Obviously, if you assign to sys.stdout, you will have to do something else in the write method to echo the output to the real stdout! I think you could use sys.__stdout__ for that.
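For example, here is a sketch of that "really magic" variant (the log file name is illustrative, and flush() is included because real file objects have it):

import sys

class TeeLog(object):
    def __init__(self, filename):
        self.handle = open(filename, 'w', 1)  # line-buffered

    def write(self, thing):
        self.handle.write(thing)
        sys.__stdout__.write(thing)  # echo to the real stdout

    def flush(self):
        self.handle.flush()
        sys.__stdout__.flush()

sys.stdout = TeeLog('child.log')
print('this line goes to both child.log and the terminal')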
Oh, you. I had a decent answer all prettied-up before I saw the last line of your example: execv(). Well, poop. The original idea was replacing each child process' stdout with an instance of this blog post's tee class, and split the stream into the original stdout, and the log file:
http://www.shallowsky.com/blog/programming/python-tee.html
But, since you're using execv(), the child process' tee instance would just get clobbered, so that won't work.
Unfortunately for you, there is no "out of the box" solution to your problem that I can find. The closest thing would be to spawn the actual tee program in a subprocess; if you wanted to be more cross-platform, you could fork a simple Python substitute.
First thing to know when coding a tee substitute: tee really is a simple program. In all the true C implementations I've seen, it's not much more complicated than this:
while ((character = read()) != EOF) {
    /* Write to all of the output streams in here, then write to stdout. */
}
Unfortunately, you can't just join two streams together. That would be really useful (so that the input of one stream would automatically be forwarded out of another), but we've no such luxury without coding it ourselves. So, Eli and I are going to have very similar answers. The difference is that, in my answer, the Python 'tee' is going to run in a separate process, via a pipe; that way, the parent thread is still useful!
(Remember: copy the blog post's tee class, too.)
import os, sys

# Open it for writing in binary mode.
logFile = open("bar", "wb")

# Verbose names, but I wanted to get the point across.
# These are file descriptors, i.e. integers.
parentSideOfPipe, childSideOfPipe = os.pipe()

# 'Tee' subprocess.
pid = os.fork()
if pid == 0:
    os.close(childSideOfPipe)  # otherwise EOF never arrives below
    while True:
        char = os.read(parentSideOfPipe, 1)
        if not char:  # EOF once all write ends are closed
            break
        logFile.write(char)
        os.write(1, char)
    os._exit(0)

# Actual command
pid = os.fork()
if pid == 0:
    os.dup2(childSideOfPipe, 1)
    os.execv(cmd, [cmd])  # execv also needs the argument list

# Parent: close both ends so the tee child sees EOF when the command exits.
os.close(parentSideOfPipe)
os.close(childSideOfPipe)
I'm sorry if that's not what you wanted, but it's the best solution I can find.
Good luck with the rest of your project!
The first obvious answer is to fork an actual tee process, but that is probably not ideal.
The tee code (from coreutils) merely reads each line and writes to each file in turn (effectively buffering).
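If you would rather keep that loop in Python than fork the real tee, a minimal sketch looks like this (reading fixed-size chunks rather than lines, to keep buffering to a minimum):

import os

def tee(in_fd, out_fds, chunk_size=4096):
    # Copy everything from in_fd to each fd in out_fds until EOF.
    while True:
        chunk = os.read(in_fd, chunk_size)
        if not chunk:  # empty read means EOF
            break
        for fd in out_fds:
            os.write(fd, chunk)

# e.g.: tee(pipe_read_end, [1, log_file.fileno()])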

Python: How to determine subprocess children have all finished running

I am trying to detect when an installation program finishes executing from within a Python script. Specifically, the application is the Oracle 10gR2 Database. Currently I am using the subprocess module with Popen. Ideally, I would simply use the wait() method to wait for the installation to finish executing; however, the documented command actually spawns child processes to handle the actual installation. Here is a sample of the failing code:
import os
import subprocess

OUI_DATABASE_10GR2_SUBPROCESS = ['sudo',
                                 '-u',
                                 'oracle',
                                 os.path.join(DATABASE_10GR2_TMP_PATH,
                                              'database',
                                              'runInstaller'),
                                 '-ignoreSysPrereqs',
                                 '-silent',
                                 '-noconfig',
                                 '-responseFile ' + ORACLE_DATABASE_10GR2_SILENT_RESPONSE]

oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
oracle_subprocess.wait()
There is a similar question here: Killing a subprocess including its children from python, but the selected answer does not address the children issue; instead it instructs the user to directly call the application that should be waited for. I am looking for a specific solution that will wait for all children of the subprocess. What if there are an unknown number of subprocesses? I will select the answer that addresses the issue of waiting for all children subprocesses to finish.
More clarity on failure: The child processes continue executing after the wait() command since that command only waits for the top level process (in this case it is 'sudo'). Here is a simple diagram of the known child processes in this problem:
Python subprocess module -> Sudo -> runInstaller -> java -> (unknown)
Ok, here is a trick that will work only under Unix. It is similar to one of the answers to this question: Ensuring subprocesses are dead on exiting Python program. The idea is to create a new process group. You can then wait for all processes in the group to terminate.
pid = os.fork()
if pid == 0:
    os.setpgrp()
    oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
    oracle_subprocess.wait()
    os._exit(0)
else:
    os.waitpid(-pid, 0)  # wait on the process group led by the child
I have not tested this. It creates an extra subprocess to be the leader of the process group, but avoiding that is (I think) quite a bit more complicated.
I found this web page to be helpful as well. http://code.activestate.com/recipes/278731-creating-a-daemon-the-python-way/
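If you also need to wait for grandchildren that are not your direct children (which waitpid cannot reap), one possible workaround (a sketch, untested) is to poll the process group with signal 0 until it is empty; this assumes nothing in the tree changes its own process group:

import errno
import os
import time

def wait_for_process_group(pgid, interval=0.5):
    while True:
        try:
            # Signal 0 delivers nothing; it only checks for existence.
            os.killpg(pgid, 0)
        except OSError as e:
            if e.errno == errno.ESRCH:  # no such process group: all done
                return
            raise
        time.sleep(interval)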
You can just use os.waitpid with the pid set to -1; this will wait for the subprocesses of the current process until they finish:
import os
import sys
import subprocess
proc = subprocess.Popen([sys.executable,
                         '-c',
                         'import subprocess;'
                         'subprocess.Popen("sleep 5", shell=True).wait()'])
pid, status = os.waitpid(-1, 0)
print(pid, status)
This is the output of pstree <pid>, showing the different subprocesses forked:
python───python───sh───sleep
Hope this can help :)
Check out the following link http://www.oracle-wiki.net/startdocsruninstaller which details a flag you can use for the runInstaller command.
This flag is definitely available for 11gR2, but I have not got a 10g database to try out this flag for the runInstaller packaged with that version.
Regards
Everywhere I look seems to say it's not possible to solve this in the general case. I've whipped up a library called 'pidmon' that combines some answers for Windows and Linux and might do what you need.
I'm planning to clean this up and put it on github, possibly called 'pidmon' or something like that. I'll post a link if/when I get it up.
EDIT: It's available at http://github.com/dbarnett/python-pidmon.
I made a special waitpid function that accepts a graft_func argument so that you can loosely define what sort of processes you want to wait for when they're not direct children:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, recursive=True,
graft_func=(lambda p: p.name == '???' and p.parent.pid == ???))
or, as a shotgun approach, to just wait for any processes started since the call to waitpid to stop again, do:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, graft_func=(lambda p: True))
Note that this is still barely tested on Windows and seems very slow on Windows (but did I mention it's on github where it's easy to fork?). This should at least get you started, and if it works at all for you, I have plenty of ideas on how to optimize it.

How to get environment from a subprocess?

I want to call a process via a Python program; however, this process needs some specific environment variables that are set by another process. How can I get the first process's environment variables to pass them to the second?
This is what the program look like:
import subprocess
subprocess.call(['proc1']) # this set env. variables for proc2
subprocess.call(['proc2']) # this must have env. variables set by proc1 to work
but the two processes don't share the same environment. Note that these programs aren't mine (the first is a big and ugly .bat file and the second a proprietary soft) so I can't modify them (OK, I can extract all that I need from the .bat, but it's very cumbersome).
N.B.: I am using Windows, but I prefer a cross-platform solution (but my problem wouldn't happen on a Unix-like ...)
Here's an example of how you can extract environment variables from a batch or cmd file without creating a wrapper script. Enjoy.
from __future__ import print_function
import sys
import subprocess
import itertools

def validate_pair(ob):
    try:
        if not (len(ob) == 2):
            print("Unexpected result:", ob, file=sys.stderr)
            raise ValueError
    except:
        return False
    return True

def consume(iterator):
    try:
        while True:
            next(iterator)
    except StopIteration:
        pass

def get_environment_from_batch_command(env_cmd, initial=None):
    """
    Take a command (either a single command or list of arguments)
    and return the environment created after running that command.
    Note that the command must be a batch file or .cmd file, or the
    changes to the environment will not be captured.
    If initial is supplied, it is used as the initial environment passed
    to the child process.
    """
    if not isinstance(env_cmd, (list, tuple)):
        env_cmd = [env_cmd]
    # construct the command that will alter the environment
    env_cmd = subprocess.list2cmdline(env_cmd)
    # create a tag so we can tell in the output when the proc is done
    tag = 'Done running command'
    # construct a cmd.exe command to accomplish this
    cmd = 'cmd.exe /s /c "{env_cmd} && echo "{tag}" && set"'.format(**vars())
    # launch the process
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, env=initial)
    # parse the output sent to stdout
    lines = proc.stdout
    # consume whatever output occurs until the tag is reached
    consume(itertools.takewhile(lambda l: tag not in l, lines))
    # define a way to handle each KEY=VALUE line
    handle_line = lambda l: l.rstrip().split('=', 1)
    # parse key/values into pairs
    pairs = map(handle_line, lines)
    # make sure the pairs are valid
    valid_pairs = filter(validate_pair, pairs)
    # construct a dictionary of the pairs
    result = dict(valid_pairs)
    # let the process finish
    proc.communicate()
    return result
So to answer your question, you would create a .py file that does the following:
env = get_environment_from_batch_command('proc1')
subprocess.Popen('proc2', env=env)
As you say, processes don't share the environment - so what you literally ask is not possible, not only in Python, but with any programming language.
What you can do is to put the environment variables in a file, or in a pipe, and either
have the parent process read them, and pass them to proc2 before proc2 is created, or
have proc2 read them, and set them locally
The latter would require cooperation from proc2; the former requires that the variables become known before proc2 is started.
Since you're apparently in Windows, you need a Windows answer.
Create a wrapper batch file, eg. "run_program.bat", and run both programs:
#echo off
call proc1.bat
proc2
The script will run and set its environment variables. Both scripts run in the same interpreter (cmd.exe instance), so the variables proc1.bat sets will be set when proc2 is executed.
Not terribly pretty, but it'll work.
(Unix people, you can do the same thing in a bash script: "source file.sh".)
You can use Process in psutil to get the environment variables for that Process.
If you want to implement it yourself, you can refer to the internal implementation of psutil, which adapts to several operating systems.
Currently supported operating systems are:
AIX
FreeBSD, OpenBSD, NetBSD
Linux
macOS
Sun Solaris
Windows
E.g., on Linux you can find the environment variables of the process with PID 7877 in the file /proc/7877/environ; just open and read it (as sketched below).
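A sketch of reading it directly (Linux only, and only for processes you have permission to inspect); the entries are NUL-separated, so binary mode plus a split is the simplest approach:

def read_proc_environ(pid):
    # /proc/<pid>/environ holds NUL-separated KEY=VALUE entries.
    with open('/proc/%d/environ' % pid, 'rb') as f:
        data = f.read()
    return dict(entry.split(b'=', 1)
                for entry in data.split(b'\x00') if b'=' in entry)

# e.g.: print(read_proc_environ(7877))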
Of course the best way to do this is to:
import os
from typing import Dict
from psutil import Process
process = Process(pid=os.getpid())
process_env: Dict = process.environ()
print(process_env)
You can find the other platform implementations in the source code.
Hope I can help you.
The Python standard module multiprocessing has a Queue system that allows you to pass pickle-able objects between processes. Processes can also exchange messages (pickled objects) using os.pipe. Remember that resources (e.g. database connections) and handles (e.g. file handles) can't be pickled.
You may find this link interesting :
Communication between processes with multiprocessing
The PyMOTW article about multiprocessing is also worth mentioning:
multiprocessing Basics
sorry for my spelling
Two things spring to mind: (1) make the processes share the same environment, by combining them somehow into the same process, or (2) have the first process produce output that contains the relevant environment variables, that way Python can read it and construct the environment for the second process. I think (though I'm not 100% sure) that there isn't any way to get the environment from a subprocess as you're hoping to do.
Environment is inherited from the parent process. Set the environment you need in the main script, not a subprocess (child).
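For example, a minimal sketch (MY_SETTING is a hypothetical variable that proc2 needs): anything placed in the environment passed to the child is visible to it:

import os
import subprocess

env = os.environ.copy()
env['MY_SETTING'] = 'value'  # hypothetical variable required by proc2
subprocess.call(['proc2'], env=env)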

Ensure a single instance of an application in Linux

I'm working on a GUI application in WxPython, and I am not sure how I can ensure that only one copy of my application is running at any given time on the machine. Due to the nature of the application, running more than once doesn't make any sense, and will fail quickly. Under Win32, I can simply make a named mutex and check that at startup. Unfortunately, I don't know of any facilities in Linux that can do this.
I'm looking for something that will automatically be released should the application crash unexpectedly. I don't want to have to burden my users with having to manually delete lock files because I crashed.
The Right Thing is advisory locking using flock(LOCK_EX); in Python, this is found in the fcntl module.
Unlike pidfiles, these locks are always automatically released when your process dies for any reason, have no race conditions relating to file deletion (as the file doesn't need to be deleted to release the lock), and there's no chance of a different process inheriting the PID and thus appearing to validate a stale lock.
If you want unclean shutdown detection, you can write a marker (such as your PID, for traditionalists) into the file after grabbing the lock, and then truncate the file to 0-byte status before a clean shutdown (while the lock is being held); thus, if the lock is not held and the file is non-empty, an unclean shutdown is indicated.
Complete locking solution using the fcntl module:
import fcntl
import sys

pid_file = 'program.pid'
fp = open(pid_file, 'w')
try:
    fcntl.lockf(fp, fcntl.LOCK_EX | fcntl.LOCK_NB)
except IOError:
    # another instance is running
    sys.exit(1)
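And a sketch of the unclean-shutdown marker described above: write the PID after taking the lock, and truncate the file again before a clean exit, while still holding the lock:

import atexit
import fcntl
import os
import sys

fp = open('program.pid', 'a+')
try:
    fcntl.lockf(fp, fcntl.LOCK_EX | fcntl.LOCK_NB)
except IOError:
    sys.exit(1)  # another instance holds the lock

fp.seek(0)
# Lock was free, yet the file is non-empty: the last run did not exit cleanly.
unclean_shutdown = fp.read().strip() != ''

fp.seek(0)
fp.truncate()
fp.write(str(os.getpid()))
fp.flush()

def mark_clean_exit():
    fp.seek(0)
    fp.truncate()  # an empty file signals a clean shutdown

atexit.register(mark_clean_exit)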
There are several common techniques, including using semaphores. The one I see used most often is to create a "pid lock file" on startup that contains the PID of the running process. If the file already exists when the program starts up, open it and grab the PID inside, then check whether a process with that PID is running. If it is, check the cmdline value in /proc/<pid> to see whether it is an instance of your program; if so, quit, otherwise overwrite the file with your PID. The usual name for the pid file is application_name.pid.
wxWidgets offers a wxSingleInstanceChecker class for this purpose: wxPython doc, or wxWidgets doc. The wxWidgets doc has sample code in C++, but the python equivalent should be something like this (untested):
name = "MyApp-%s" % wx.GetUserId()
checker = wx.SingleInstanceChecker(name)
if checker.IsAnotherRunning():
return False
This builds upon the answer by user zgoda. It mainly addresses a tricky concern having to do with write access to the lock file. In particular, if the lock file was first created by root, another user foo can no longer successfully rewrite this file, due to an absence of write permissions for user foo. The obvious solution seems to be to create the file with write permissions for everyone. This solution also builds upon a different answer by me, having to do with creating a file with such custom permissions. This concern is important in the real world where your program may be run by any user, including root.
import fcntl, os, stat, tempfile

app_name = 'myapp'  # <-- Customize this value

# Establish lock file settings
lf_name = '.{}.lock'.format(app_name)
lf_path = os.path.join(tempfile.gettempdir(), lf_name)
lf_flags = os.O_WRONLY | os.O_CREAT
lf_mode = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH  # This is 0o222, i.e. 146

# Create lock file
# Regarding umask, see https://stackoverflow.com/a/15015748/832230
umask_original = os.umask(0)
try:
    lf_fd = os.open(lf_path, lf_flags, lf_mode)
finally:
    os.umask(umask_original)

# Try locking the file
try:
    fcntl.lockf(lf_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
except IOError:
    msg = ('Error: {} may already be running. Only one instance of it '
           'can run at a time.'
           ).format(app_name)
    exit(msg)
A limitation of the above code is that if the lock file already existed with unexpected permissions, those permissions will not be corrected.
I would've liked to use /var/run/<appname>/ as the directory for the lock file, but creating this directory requires root permissions. You can make your own decision for which directory to use.
Note that there is no need to open a file handle to the lock file.
Here's the TCP port-based solution:
# Use a listening socket as a mutex against multiple invocations
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('127.0.0.1', 5080))
s.listen(1)
Look for a Python module that interfaces to SysV semaphores on Unix. The semaphores have a SEM_UNDO flag which will cause the resources held by a process to be released if the process crashes.
Otherwise as Bernard suggested, you can use
import os
os.getpid()
And write it to /var/run/application_name.pid. When the process starts, it should check whether the pid in /var/run/application_name.pid is listed in the ps table and quit if it is; otherwise it should write its own pid into /var/run/application_name.pid. In the following, var_run_pid is the pid you read from /var/run/application_name.pid:
cmd = "ps -p %s -o comm=" % var_run_pid
app_name = os.popen(cmd).read().strip()
if len(app_name) > 0:
    pass  # already running; bail out
The set of functions defined in semaphore.h -- sem_open(), sem_trywait(), etc -- are the POSIX equivalent, I believe.
If you create a lock file and put the pid in it, you can check your process id against it and tell if you crashed, no?
I haven't done this personally, so take with appropriate amounts of salt. :p
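That is the classic stale-pidfile test: read the saved PID and probe it with signal 0. A hedged sketch (the path is illustrative, and PID reuse means this can report a false positive):

import os

def pid_is_running(pid):
    try:
        os.kill(pid, 0)  # signal 0 checks existence without delivering anything
    except OSError:
        return False
    return True

try:
    with open('/var/run/myapp.pid') as f:  # illustrative path
        if pid_is_running(int(f.read().strip())):
            raise SystemExit('already running')
except (IOError, ValueError):
    pass  # no pidfile, or garbage contents: treat as not running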
Can you use the 'pidof' utility? If your app is running, pidof will write the Process ID of your app to stdout. If not, it will print a newline (LF) and return an error code.
Example (from bash, for simplicity):
linux# pidof myapp
8947
linux# pidof nonexistent_app
linux#
By far the most common method is to drop a file into /var/run/ called [application].pid which contains only the PID of the running process, or parent process.
As an alternative, you can create a named pipe in the same directory to be able to send messages to the active process, e.g. to open a new file.
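A sketch of that named-pipe idea (the path and handler are illustrative): the running instance blocks reading the FIFO, and later invocations simply write commands into it:

import os

FIFO_PATH = '/tmp/myapp.cmd'  # illustrative path

if not os.path.exists(FIFO_PATH):
    os.mkfifo(FIFO_PATH, 0o600)

# In the running instance: each open blocks until a writer connects.
while True:
    with open(FIFO_PATH) as fifo:
        for line in fifo:
            handle_command(line.strip())  # hypothetical handler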
I've made a basic framework for running these kinds of applications when you want to be able to pass the command line arguments of subsequent attempted instances to the first one. An instance will start listening on a predefined port if it does not find an instance already listening there. If an instance already exists, it sends its command line arguments over the socket and exits.
code w/ explanation
