How to get environment variables after running a subprocess - python

I am using the subprocess.call to execute a shell script from another application I am integrating with. This script sets environment variables with export MY_VAR=foo. Next, I need to execute more commands over subprocess with the environment that was set by the shell script.
How can I extract the environment state from the child process? subprocess.call only returns the exit code.
i.e. I want to run:
subprocess.call(["export", "MY_VAR=foo"]
subprocess.call(["echo", "$MY_VAR"]) # should print 'foo'.
I know that I can set the environment with the env keyword, but the point of my question is how to get the environment variables that a subprocess sets. In a shell you can source any script to get its declared environment variables. What's the alternative in Python?

I ran into this issue just recently. It seems that this is a difficult problem for reasons upstream of Python: posix_spawn doesn't give a way to read the environment variables of the spawned process, nor is there any easy way to read the environment of a running process.
Bash's source is specific to running bash code in the bash interpreter: it just evals the file in the current bash interpreter rather than starting a subprocess. This mechanism can't work if you are running bash code from Python.
It is possible to make a separate mechanism specific to running bash code from Python. The following is the best that I could manage; it would be nice to have a less flimsy solution.
import json
import os
import subprocess
import sys
from contextlib import AbstractContextManager
from typing import Dict

class BashRunnerWithSharedEnvironment(AbstractContextManager):
    """Run multiple bash scripts with persistent environment.

    Environment is stored to "env" member between runs. This can be updated
    directly to adjust the environment, or read to get variables.
    """

    def __init__(self, env=None):
        if env is None:
            env = dict(os.environ)
        self.env: Dict[str, str] = env
        self._fd_read, self._fd_write = os.pipe()

    def run(self, cmd, **opts):
        if self._fd_read is None:
            raise RuntimeError("BashRunner is already closed")
        write_env_pycode = ";".join(
            [
                "import os",
                "import json",
                f"os.write({self._fd_write}, json.dumps(dict(os.environ)).encode())",
            ]
        )
        write_env_shell_cmd = f"{sys.executable} -c '{write_env_pycode}'"
        cmd += "\n" + write_env_shell_cmd
        result = subprocess.run(
            ["bash", "-ce", cmd], pass_fds=[self._fd_write], env=self.env, **opts
        )
        self.env = json.loads(os.read(self._fd_read, 5000).decode())
        return result

    def __exit__(self, exc_type, exc_value, traceback):
        if self._fd_read:
            os.close(self._fd_read)
            os.close(self._fd_write)
            self._fd_read = None
            self._fd_write = None

    def __del__(self):
        self.__exit__(None, None, None)
Example:
with BashRunnerWithSharedEnvironment() as bash_runner:
    bash_runner.env.pop("A", None)

    res = bash_runner.run("A=6; echo $A", stdout=subprocess.PIPE)
    assert res.stdout == b'6\n'
    assert bash_runner.env.get("A", None) is None

    bash_runner.run("export A=2")
    assert bash_runner.env["A"] == "2"

    res = bash_runner.run("echo $A", stdout=subprocess.PIPE)
    assert res.stdout == b'2\n'

    res = bash_runner.run("A=6; echo $A", stdout=subprocess.PIPE)
    assert res.stdout == b'6\n'
    assert bash_runner.env.get("A", None) == "6"

    bash_runner.env["A"] = "7"
    res = bash_runner.run("echo $A", stdout=subprocess.PIPE)
    assert res.stdout == b'7\n'
    assert bash_runner.env["A"] == "7"

It is not possible, because the environment is changed only in the child process. You could have the child print its environment to STDOUT or STDERR and read it from there (see the sketch after the snippet below), but as soon as the subprocess terminates, you can no longer access anything from it.
# this is process #1
subprocess.call(["export", "MY_VAR=foo"])
# this is process #2 - it cannot see the environment of process #1
subprocess.call(["echo", "$MY_VAR"])  # should print 'foo'.

Not sure I see the problem here. You just need to remember the following:
each subprocess that gets started is independent of any setups done in previous subprocesses
if you want to set up some variables and use them, do both those things in ONE process
So make setupVars.sh like this:
export vHello="hello"
export vDate=$(date)
export vRandom=$RANDOM
And make printVars.sh like this:
#!/bin/bash
echo $vHello, $vDate, $vRandom
And make that executable with:
chmod +x printVars.sh
Now your Python looks like this:
import subprocess
subprocess.call(["bash","-c","source setupVars.sh; ./printVars.sh"])
Output
hello, Mon Jul 12 00:32:29 BST 2021, 8615

Related

When using subprocess.Popen to create tmux sessions, is it possible to change the environment variables for each new tmux session spawned from Popen?

I'm trying to spawn multiple tmux sessions with different environment variables from the same python3 script.
I have been passing {**os.environ, "CUDA_VISIBLE_DEVICES": str(device_id)} to the env keyword argument of subprocess.Popen.
for device_id in device_ids:
    new_env = {**os.environ, "CUDA_VISIBLE_DEVICES": str(device_id)}
    p = subprocess.Popen([
        'tmux', 'new', '-d', "-c", "./", '-s',
        sesh_name,
        "python3",
        path_to_script
    ], env=new_env)
I'm finding, however, that CUDA_VISIBLE_DEVICES is equal to the first device_id I pass, across all processes. What is the meaning of this!?
Is this an inherent issue with Popen and the subprocess module? If so, how do I fix it?
I've tried passing the device id to the script of the new process, but sadly torch won't let me update the environment variable after it's been imported, and it would be way more trouble than it's worth to rework the code for that.
EDIT: Providing minimal example
Save this script as test.py (or whatever else you fancy):
import subprocess
import os
def sesh(name):
    procs = []
    for device_id in [4, 5, 6]:
        proc_env = {**os.environ, "CUDA_VISIBLE_DEVICES": str(device_id)}
        p = subprocess.Popen(['tmux', 'new', '-d', "-c", "./", '-s', name+str(device_id), "python3", "deleteme.py"], env=proc_env)
        procs.append(p)
    return procs

if __name__=="__main__":
    sesh("foo")
Save this script as deleteme.py within the same directory:
import time
import os
if __name__=="__main__":
print(os.environ)
for i in range(11):
print("running")
if "CUDA_VISIBLE_DEVICES" in os.environ:
print(os.environ["CUDA_VISIBLE_DEVICES"])
else:
print("CUDA_VISIBLE_DEVICES not found")
time.sleep(5)
Then run test.py from the terminal.
$ python3 test.py
Then switch to the tmux sessions to figure out what environment is being created.
For anyone else running into this problem, you can use os.system instead of subprocess.Popen in the following way.
import os
def sesh(name, device_id, script):
    command = "tmux new -d -s \"{}{}\" \'export CUDA_VISIBLE_DEVICES={}; python3 {} \'"
    command = command.format(
        name,
        device_id,
        device_id,
        script
    )
    os.system(command)

if __name__=="__main__":
    sesh("foo", 4, "deleteme.py")

subprocess that prints to pseudo terminal is not using full terminal size

I have the following program that wraps top in a pseudo terminal and prints it back to the real terminal.
import os
import pty
import subprocess
import sys
import time
import select
stdout_master_fd, stdout_slave_fd = pty.openpty()
stderr_master_fd, stderr_slave_fd = pty.openpty()
p = subprocess.Popen(
    "top",
    shell=True,
    stdout=stdout_slave_fd,
    stderr=stderr_slave_fd,
    close_fds=True
)

stdout_parts = []
while p.poll() is None:
    rlist, _, _ = select.select([stdout_master_fd, stderr_master_fd], [], [])
    for f in rlist:
        output = os.read(f, 1000)  # This is used because it doesn't block
        sys.stdout.write(output.decode("utf-8"))
        sys.stdout.flush()
    time.sleep(0.01)
This works well; control sequences are handled as expected. However, the subprocess is not using the full dimensions of the real terminal.
For comparison, running the above program renders top in a small fixed-size area, while running top directly fills the whole terminal (screenshots omitted).
I didn't find any API in the pty library that suggests dimensions could be provided. The dimensions I get in practice for the pseudo terminal are a height of 24 lines and a width of 80 columns, so I'm assuming they are hardcoded somewhere.
Reading Emulate a number of columns for a program in the terminal, I found the following working solution, at least in my environment (OSX and xterm):
echo LINES=$LINES COLUMNS=$COLUMNS TERM=$TERM
which comes to LINES=40 COLUMNS=203 TERM=xterm-256color in my shell. Then setting the following in the script gives the expected output:
p = subprocess.Popen(
    "top",
    shell=True,
    stdout=stdout_slave_fd,
    stderr=stderr_slave_fd,
    close_fds=True,
    env={
        "LINES": "40",
        "COLUMNS": "203",
        "TERM": "xterm-256color"
    }
)
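Rather than hard-coding 40 and 203, one possible refinement (my addition, not part of the original answer) is to read the real terminal's size with shutil.get_terminal_size (Python 3.3+) and pass that through:
import os
import shutil

# Build the child environment from the real terminal's current size.
size = shutil.get_terminal_size()
pty_env = {
    **os.environ,
    "LINES": str(size.lines),
    "COLUMNS": str(size.columns),
}
# then e.g.: subprocess.Popen("top", shell=True, ..., env=pty_env)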
@Mugen's answer pointed me in the right direction but did not quite work; here is what worked for me personally:
import os
import subprocess
my_env = os.environ.copy()
my_env["LINES"] = "40"
my_env["COLUMNS"] = "203"
result = subprocess.Popen(
    cmd,
    stdout=subprocess.PIPE,
    env=my_env
).communicate()[0]
So I had to first copy my entire environment with the os library and then add the elements I needed to it.
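For what it's worth, the same environment can be built in one expression with dict unpacking (just a stylistic sketch):
import os

my_env = {**os.environ, "LINES": "40", "COLUMNS": "203"}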
The solutions provided by @leas and @Mugen did not work for me, but I eventually stumbled upon the ptyprocess Python module, which allows you to provide terminal dimensions when spawning a process.
For context, I am trying to use a Python script to run a PowerShell 7 script and capture the PowerShell script's output. The host OS is Ubuntu Linux 22.04.
My code looks something like this:
from ptyprocess import PtyProcessUnicode
# Run the PowerShell script
script_run_cmd = 'pwsh -file script.ps1 param1 param2'
p = PtyProcessUnicode.spawn(script_run_cmd.split(), dimensions=(24,130))
# Get all script output
script_output = []
while True:
    try:
        script_output.append(p.readline().rstrip())
    except EOFError:
        break
# Not sure if this is necessary
p.close()
I feel like there should be a class method to get all the output, but I couldn't find one and the above code works well for me.

Run cmd file using python

I have a cmd file "file.cmd" containing hundreds of lines of commands.
Example
pandoc --extract-media -f docx -t gfm "sample1.docx" -o "sample1.md"
pandoc --extract-media -f docx -t gfm "sample2.docx" -o "sample2.md"
pandoc --extract-media -f docx -t gfm "sample3.docx" -o "sample3.md"
I am trying to run these commands using a script so that I don't have to go to a file and click on it.
This is my code, and it produces no output:
import os

file1 = open('example.cmd', 'r')
Lines = file1.readlines()

# print(Lines)
for i in Lines:
    print(i)
    os.system(i)
You don't need to read the cmd file line by line. You can simply try the following:
import os
os.system('myfile.cmd')
or using the subprocess module:
import subprocess
proc = subprocess.Popen(['myfile.cmd'], shell=True, close_fds=True)
stdout, stderr = proc.communicate()
Example:
myfile.cmd:
#ECHO OFF
ECHO Greetings From Python!
PAUSE
script.py:
import os
os.system('myfile.cmd')
The cmd will open with:
Greetings From Python!
Press any key to continue ...
You can debug the issue by checking the exit code:
import os
return_code=os.system('myfile.cmd')
assert return_code == 0 #asserts that the return code is 0 indicating success!
Note: os.system works by calling system() in C and can only take up to 65533 arguments after a command (so it is a 16-bit issue). Passing one more argument will result in the return code 32512 (which implies the exit code 127).
The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function (os.system('command')).
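Following that advice, a minimal subprocess-based sketch (assuming Windows, where the .cmd file is executed by the shell, hence shell=True):
import subprocess

# Run the batch file through the shell and report the exit status.
result = subprocess.run('myfile.cmd', shell=True)
print(result.returncode)  # 0 indicates success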
Since it is a command file (cmd), and only the shell can run it, the shell argument must be set to True. And since you are setting the shell argument to True, the command needs to be a string, not a list.
Use the Popen method to spawn a new process and communicate() to wait on that process (you can also time it out). If you wish to communicate with the child process, provide the PIPEs (see my example, but you don't have to).
The code below is for Python 3.3 and beyond:
import subprocess

try:
    proc = subprocess.Popen('myfile.cmd', shell=True, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
    outs, errs = proc.communicate(timeout=15)  # timing out the execution, just if you want, you don't have to!
except subprocess.TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()
For older Python versions:
import time

proc = subprocess.Popen('myfile.cmd', shell=True)
t = 10
while proc.poll() is None and t >= 0:
    print('Still waiting')
    time.sleep(1)
    t -= 1
proc.kill()
In both cases (either Python version), if you don't need the timeout feature and you don't need to interact with the child process, just use:
proc = subprocess.Popen('myfile.cmd', shell=True)
proc.communicate()

Python subprocess in .exe

I'm creating a Python script that will copy files and folders over the network. It's cross-platform, so I make an .exe file using cx_Freeze.
I used the Popen method of the subprocess module.
If I run the .py file it works as expected, but when I create the .exe the subprocess is not created in the system.
I've gone through all the documentation of the subprocess module but I didn't find any solution.
Everything else (I am using Tkinter, which also works fine) works in the .exe except subprocess.
Any idea how I can call a subprocess from the .exe file?
This file is calling another .py file
def start_scheduler_action(self, scheduler_id, scheduler_name, list_index):
    scheduler_detail = db.get_scheduler_detail_using_id(scheduler_id)
    for detail in scheduler_detail:
        source_path = detail[2]
    if not os.path.exists(source_path):
        showerror("Invalid Path", "Please select valid path", parent=self.new_frame)
        return
    self.forms.new_scheduler.start_scheduler_button.destroy()
    # Create stop scheduler button
    if getattr(self.forms.new_scheduler, "stop_scheduler_button", None) == None:
        self.forms.new_scheduler.stop_scheduler_button = tk.Button(self.new_frame, text='Stop scheduler', width=10, command=lambda: self.stop_scheduler_action(scheduler_id, scheduler_name, list_index))
    self.forms.new_scheduler.stop_scheduler_button.grid(row=11, column=1, sticky=E, pady=10, padx=1)
    scheduler_id = str(scheduler_id)
    # Get python paths
    if sys.platform == "win32":
        proc = subprocess.Popen(['where', "python"], env=None, stdout=subprocess.PIPE)
    else:
        proc = subprocess.Popen(['which', "python"], env=None, stdout=subprocess.PIPE)
    out, err = proc.communicate()
    if err or not out:
        showerror("", "Python not found", parent=self.new_frame)
    else:
        try:
            paths = out.split(os.pathsep)
            # Create python path
            python_path = (paths[len(paths) - 1]).split('\n')[0]
            cmd = os.path.realpath('scheduler.py')
            #cmd='scheduler.py'
            if sys.platform == "win32":
                python_path = python_path.splitlines()
            else:
                python_path = python_path
            # Run the scheduler file using scheduler id
            proc = subprocess.Popen([python_path, cmd, scheduler_id], env=None, stdout=subprocess.PIPE)
            message = "Started the scheduler : %s" % (scheduler_name)
            showinfo("", message, parent=self.new_frame)
            # Add process id to scheduler table
            process_id = proc.pid
            # showinfo("pid", process_id, parent=self.new_frame)

            def get_process_id(name):
                child = subprocess.Popen(['pgrep', '-f', name], stdout=subprocess.PIPE, shell=False)
                response = child.communicate()[0]
                return [int(pid) for pid in response.split()]

            print(get_process_id(scheduler_name))
            # Add the process id in database
            self.db.add_process_id(scheduler_id, process_id)
            # Add the is_running status in database
            self.db.add_status(scheduler_id)
        except Exception as e:
            showerror("", e)
And this file is called:
def scheduler_copy():
    date = strftime("%m-%d-%Y %H %M %S", localtime())
    logFile = scheduler_name + "_" + scheduler_id + "_" + date + ".log"
    # file_obj = open(logFile, 'w')
    # Call __init__ method of xcopy file
    xcopy = XCopy(connection_ip, username, password, client_name, server_name, domain_name)
    check = xcopy.connect()
    # Create a log file for scheduler
    file_obj = open(logFile, 'w')
    if check is False:
        file_obj.write("Problem in connection..Please check connection..!!")
        return
    scheduler_next_run = schedule.next_run()
    scheduler_next_run = "Next run at: " + str(scheduler_next_run)
    # If checkbox_value selected copy all the file to new directory
    if checkbox_value == 1:
        new_destination_path = xcopy.create_backup_directory(share_folder, destination_path, date)
    else:
        new_destination_path = destination_path
    # Call backup method for copying data from source to destination
    try:
        xcopy.backup(share_folder, source_path, new_destination_path, file_obj, exclude)
        file_obj.write("Scheduler completed successfully..\n")
    except Exception as e:
        # Write the error message of the scheduler to log file
        file_obj.write("Scheduler failed to copy all data..\nProblem in connection..Please check connection..!!\n")
        # file_obj.write("Error while scheduling")
        # return
    # Write the details of scheduler to log file
    file_obj.write("Total skipped unmodified file:")
    file_obj.write(str(xcopy.skipped_unmodified_count))
    file_obj.write("\n")
    file_obj.write("Total skipped file:")
    file_obj.write(str(xcopy.skipped_file))
    file_obj.write("\n")
    file_obj.write("Total copied file:")
    file_obj.write(str(xcopy.copy_count))
    file_obj.write("\n")
    file_obj.write("Total skipped folder:")
    file_obj.write(str(xcopy.skipped_folder))
    file_obj.write("\n")
    # file_obj.write(scheduler_next_run)
    file_obj.close()
There is some awkwardness in your source code, but I won't spend time on that. For instance, if you want to find the source_path, it's better to use a for loop with break/else:
for detail in scheduler_detail:
    source_path = detail[2]
    break  # found
else:
    # not found: raise an exception
    ...
Some advice:
Try to separate the user interface code and the sub-processing, avoid mixing the two.
Use exceptions and exception handlers.
If you want portable code, avoid system calls (there is no pgrep on Windows).
Since your application is packaged in a virtualenv (I assume cx_freeze does this kind of thing), you have no access to the system-wide Python. You don't even have that on Windows. So you need to use the packaged Python (this is a best practice anyway).
If you want to call a Python script as a subprocess, that means you have two packaged applications: you need to create an exe for the main application and another for the scheduler.py script. But then it's not easy to communicate with it.
Another solution is to use multiprocessing to spawn a new Python process. Since you don't want to wait for the end of processing (which may be long), you need to create daemon processes. The way to do that is explained in the multiprocessing module.
Basically:
import time
from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.daemon = True
    p.start()
    # let it live and die, don't call: `p.join()`
    time.sleep(1)
Of course, we need to adapt that to your problem.
Here is how I would do that (I removed UI-related code for clarity):
import scheduler

class SchedulerError(Exception):
    pass

class YourClass(object):
    def start_scheduler_action(self, scheduler_id, scheduler_name, list_index):
        scheduler_detail = db.get_scheduler_detail_using_id(scheduler_id)
        for detail in scheduler_detail:
            source_path = detail[2]
            break
        else:
            raise SchedulerError("Invalid Path", "Missing source path", parent=self.new_frame)
        if not os.path.exists(source_path):
            raise SchedulerError("Invalid Path", "Please select valid path", parent=self.new_frame)
        p = Process(target=scheduler.scheduler_copy, args=('source_path',))
        p.daemon = True
        p.start()
        self.db.add_process_id(scheduler_id, p.pid)
To check if your process is still running, I recommend you use psutil. It's really a great tool!
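A small sketch of what that psutil check could look like (pip install psutil; the helper name here is mine):
import psutil

def scheduler_is_running(pid):
    # pid_exists guards against stale pids; is_running confirms the
    # process is still alive.
    return psutil.pid_exists(pid) and psutil.Process(pid).is_running()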
You can define your scheduler.py script like that:
def scheduler_copy(source_path):
    ...
Multiprocessing vs Threading Python
Quoting this answer: https://stackoverflow.com/a/3044626/1513933
The threading module uses threads, the multiprocessing module uses processes. The difference is that threads run in the same memory space, while processes have separate memory. This makes it a bit harder to share objects between processes with multiprocessing. Since threads use the same memory, precautions have to be taken or two threads will write to the same memory at the same time. This is what the global interpreter lock is for.
Here, the advantage of multiprocessing over multithreading is that you can kill (or terminate) a process; you can't kill a thread. You may need psutil for that.
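For illustration, terminating a multiprocessing.Process looks like this (a thread has no equivalent):
from multiprocessing import Process
import time

if __name__ == '__main__':
    p = Process(target=time.sleep, args=(60,))
    p.start()
    p.terminate()  # sends SIGTERM to the child (TerminateProcess on Windows)
    p.join()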
This is not the exact solution you are looking for, but the following suggestions should be preferred, for two reasons:
They are the more Pythonic way.
subprocess is slightly expensive.
Suggestions you can consider:
Don't use subprocess for fetching the system path. Check os.getenv('PATH') and look for python in it (a rough sketch follows this list). For Windows, one has to add the Python path manually, or else you can check directly in Program Files, I guess.
For checking process IDs you can try psutil. A wonderful answer is provided here: how do I get the process list in Python?
Calling another script from a Python script does not look cool. Not bad, but I would not prefer it at all.
In the above code, the line if sys.platform == "win32": has the same value in the if and else branches, so you don't need a conditional statement there.
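Here is the rough sketch of that PATH-scanning idea (the helper name is mine, and this is only illustrative):
import os

def find_python_in_path():
    exe = "python.exe" if os.name == "nt" else "python"
    for directory in os.getenv("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, exe)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None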
That said, you wrote pretty fine working code. Keep coding!
If you want to run a subprocess in an exe file, then you can use
import subprocess
program=('example')
arguments=('/command')
subprocess.call([program, arguments])

How to use an existing Environment variable in subprocess.Popen()

Scenario
In my Python script I need to run an executable file as a subprocess with some number of command-line parameters that the executable expects.
Example:
EG 1: myexec.sh param1 param2
EG 2: myexec.sh param1 $MYPARAMVAL
The executable and parameters are not known in advance, as they are configured and retrieved from an external source (XML config) at run time.
My code works when the parameter is a known, configured value (EG 1); however, the expectation is that a parameter could also be configured as an environment variable, which should then be interpreted at run time (EG 2).
In the example below I am using echo as a substitute for myexec.sh to demonstrate the scenario.
This is simplified to demonstrate the issue. 'cmdlst' is built from a configuration file; it could be any script with any number of parameters, and each value could be a literal or an environment variable.
test1.py
import subprocess
import os
cmdlst = ['echo','param1','param2']
try:
    proc = subprocess.Popen(cmdlst, stdout=subprocess.PIPE)
    jobpid = proc.pid
    stdout_value, stderr_value = proc.communicate()
except (OSError, subprocess.CalledProcessError) as err:
    raise
print stdout_value
RESULT TEST 1
python test1.py
--> param1 param2
test2.py
import subprocess
import os
cmdlst = ['echo','param1','$PARAM']
try:
    proc = subprocess.Popen(cmdlst, stdout=subprocess.PIPE)
    jobpid = proc.pid
    stdout_value, stderr_value = proc.communicate()
except (OSError, subprocess.CalledProcessError) as err:
    raise
print stdout_value
RESULT TEST 2
export PARAM=param2
echo $PARAM
--> param2
python test2.py
--> param1 $PARAM
I require Test 2 to produce the same result as Test 1, considering that $PARAM would only be known at run time and needs to be retrieved from the current environment.
I welcome your advice.
If you want to have the shell expand environment variables, you have to set shell=True
subprocess.Popen('echo param1 $PARAM', shell=True, stdout=subprocess.PIPE)
Alternatively, you could just query the environment variable yourself when constructing the command, and then there is no need for shell expansion
subprocess.Popen(['echo', 'param1', os.environ['PARAM']], stdout=subprocess.PIPE)
You could do:
cmdlist = ['echo','param',os.environ["PARAM"]]
Or:
cmdlist = ['echo', 'param1', '$PARAM']
proc = subprocess.Popen(cmdlist, stdout=subprocess.PIPE, env={'PARAM': os.environ['PARAM']})
You cannot access the environment variable using $VAR in an argument list; use os.environ[..] instead:
cmdlst = ['echo','param1',os.environ['PARAM']]
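One more option, since cmdlst is built from a configuration file and may contain $NAME placeholders: expand them yourself with os.path.expandvars before calling Popen (my suggestion, not from the answers above):
import os
import subprocess

cmdlst = ['echo', 'param1', '$PARAM']
expanded = [os.path.expandvars(arg) for arg in cmdlst]
proc = subprocess.Popen(expanded, stdout=subprocess.PIPE)
print(proc.communicate()[0])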
