I am trying to explore the results of different parameter settings on my Python script "train.py". For that, I use a wandb sweep. Each wandb agent executes the file "train.py" and passes some parameters to it. As per the wandb documentation (https://docs.wandb.ai/guides/sweeps/configuration#command), in the case of e.g. two parameters "param1" and "param2", each agent starts the file with the command
/usr/bin/env python train.py --param1=value1 --param2=value2
However, "train.py" expects
/usr/bin/env python train.py value1 value2
and parses the parameter values by position. I did not write train.py and would like to not change it if possible. How can I get wandb to pass the values without "--param1=" in front?
I don't think you can get positional arguments from W&B Sweeps. However, there's a little workaround you can try that won't require touching the train.py file.
You can create an invoker file, let's call it invoke.py, and use it to get rid of the keyword argument names. Something like this might work:
import sys
import subprocess

# Print usage when no program name is given
if len(sys.argv) <= 1:
    print(f"{sys.argv[0]} program_name param0=<param0> param1=<param1> ...")
    sys.exit(0)

program = sys.argv[1]
params = sys.argv[2:]

# Keep only the value of each "name=value" argument
posparam = []
for param in params:
    _, val = param.split("=", 1)
    posparam.append(val)

command = [sys.executable, program, *posparam]
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = process.communicate()

# Forward the captured output and the exit code of train.py
sys.stdout.write(out.decode())
sys.stdout.flush()
sys.stderr.write(err.decode())
sys.stderr.flush()
sys.exit(process.returncode)
This allows you to invoke your train.py file as follows:
$ python3 invoke.py /path/to/train.py param0=0.001 param1=20 ...
Now to perform W&B sweeps you can create a command: section (reference) in your sweeps.yaml file while sweeping over the parameters param0 and param1. For example:
program: invoke.py
...
parameters:
  param0:
    distribution: uniform
    min: 0
    max: 1
  param1:
    distribution: categorical
    values: [10, 20, 30]
command:
  - ${env}
  - ${program}
  - /path/to/train.py
  - ${args_no_hyphens}
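One thing to be aware of with the invoke.py above is that communicate() buffers all output until train.py exits, so training logs would only show up at the end of the run. A minimal variation is sketched below, assuming it is acceptable for train.py to inherit the wrapper's stdout and stderr directly. The names build_command and main are hypothetical helpers for this sketch, not part of W&B.

```python
import subprocess
import sys


def build_command(program, params):
    """Turn ['param0=0.001', 'param1=20'] into positional values for program."""
    # Split on the first "=" only, so values that contain "=" stay intact.
    values = [param.split("=", 1)[1] for param in params]
    return [sys.executable, program, *values]


def main(argv):
    """Run the target program with positional values and return its exit code."""
    if len(argv) <= 1:
        print(f"{argv[0]} program_name param0=<param0> param1=<param1> ...")
        return 0
    # No pipes here: the child writes straight to the inherited stdout/stderr.
    return subprocess.run(build_command(argv[1], argv[2:])).returncode


# Small demonstration of the argument rewriting only (nothing is launched here)
command = build_command("train.py", ["param0=0.001", "param1=20"])
print(command[1:])
```

To use this as a drop-in invoke.py, the file would end with sys.exit(main(sys.argv)). An argument without "=" raises an IndexError in this sketch, which may or may not be the behaviour you want.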
I have the following program that wraps top in a pseudo terminal and prints it back to the real terminal.
import os
import pty
import subprocess
import sys
import time
import select

stdout_master_fd, stdout_slave_fd = pty.openpty()
stderr_master_fd, stderr_slave_fd = pty.openpty()

p = subprocess.Popen(
    "top",
    shell=True,
    stdout=stdout_slave_fd,
    stderr=stderr_slave_fd,
    close_fds=True
)

stdout_parts = []
while p.poll() is None:
    rlist, _, _ = select.select([stdout_master_fd, stderr_master_fd], [], [])
    for f in rlist:
        output = os.read(f, 1000)  # This is used because it doesn't block
        sys.stdout.write(output.decode("utf-8"))
        sys.stdout.flush()
    time.sleep(0.01)
This works well: control sequences are handled as expected. However, the subprocess is not using the full dimensions of the real terminal.
For comparison, running the above program:
And running top directly:
I didn't find any API in the pty library to suggest dimensions could be provided.
The dimensions I get in practice for the pseudo terminal are a height of 24 lines and a width of 80 columns; I'm assuming it might be hardcoded somewhere.
Reading "Emulate a number of columns for a program in the terminal", I found the following working solution, at least in my environment (OSX and xterm):
echo LINES=$LINES COLUMNS=$COLUMNS TERM=$TERM
which comes to LINES=40 COLUMNS=203 TERM=xterm-256color in my shell. Then setting the following in the script gives the expected output:
p = subprocess.Popen(
    "top",
    shell=True,
    stdout=stdout_slave_fd,
    stderr=stderr_slave_fd,
    close_fds=True,
    env={
        "LINES": "40",
        "COLUMNS": "203",
        "TERM": "xterm-256color"
    }
)
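As an alternative to the LINES and COLUMNS variables, the size of a pseudo terminal can also be set on the file descriptor itself with the TIOCSWINSZ ioctl. The following is a minimal sketch, assuming a POSIX system; set_pty_size and get_pty_size are hypothetical helper names, and the 40x203 size is just the value from the shell output above.

```python
import fcntl
import os
import pty
import struct
import termios


def set_pty_size(fd, rows, cols):
    """Set the window size of the pseudo terminal behind fd."""
    # struct winsize holds rows, cols, xpixel, ypixel as unsigned shorts.
    winsize = struct.pack("HHHH", rows, cols, 0, 0)
    fcntl.ioctl(fd, termios.TIOCSWINSZ, winsize)


def get_pty_size(fd):
    """Read back (rows, cols) of the pseudo terminal behind fd."""
    packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, struct.pack("HHHH", 0, 0, 0, 0))
    rows, cols, _, _ = struct.unpack("HHHH", packed)
    return rows, cols


master_fd, slave_fd = pty.openpty()
set_pty_size(slave_fd, 40, 203)
size = get_pty_size(slave_fd)
print(size)
os.close(master_fd)
os.close(slave_fd)
```

In the program from the question, the call would go right after pty.openpty() and before subprocess.Popen, so that top sees the new size when it starts. The real terminal's size could be obtained with shutil.get_terminal_size() instead of hardcoding it.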
@Mugen's answer pointed me in the right direction but did not quite work; here is what worked for me personally:
import os
import subprocess

my_env = os.environ.copy()
my_env["LINES"] = "40"
my_env["COLUMNS"] = "203"
result = subprocess.Popen(
    cmd,
    stdout=subprocess.PIPE,
    env=my_env
).communicate()[0]
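So I first had to copy my entire environment with the os library and then add the variables I needed to it. Passing only the new variables as env would replace the whole environment of the child process, dropping things like PATH.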
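The same idea can be sketched without hardcoding the dimensions. This assumes that shutil.get_terminal_size() reports the size you want the child to see; the child here is only a stand-in that prints the two variables back, and child_env and reported are names made up for this example.

```python
import os
import shutil
import subprocess
import sys

# Size of the real terminal, with a fallback when there is no terminal attached
size = shutil.get_terminal_size(fallback=(80, 24))

# Copy the full environment, then add the two variables on top of it
child_env = os.environ.copy()
child_env["COLUMNS"] = str(size.columns)
child_env["LINES"] = str(size.lines)

# Stand-in child process that reports what it received
result = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['COLUMNS'], os.environ['LINES'])"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
reported = result.stdout.split()
print(reported)
```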
The solutions provided by @leas and @Mugen did not work for me, but I eventually stumbled upon the ptyprocess Python module, which allows you to provide terminal dimensions when spawning a process.
For context, I am trying to use a Python script to run a PowerShell 7 script and capture the PowerShell script's output. The host OS is Ubuntu Linux 22.04.
My code looks something like this:
from ptyprocess import PtyProcessUnicode

# Run the PowerShell script
script_run_cmd = 'pwsh -file script.ps1 param1 param2'
p = PtyProcessUnicode.spawn(script_run_cmd.split(), dimensions=(24, 130))

# Get all script output
script_output = []
while True:
    try:
        script_output.append(p.readline().rstrip())
    except EOFError:
        break

# Not sure if this is necessary
p.close()
I feel like there should be a class method to get all the output, but I couldn't find one and the above code works well for me.
I have this Python file that I have to run every day, so I'm making a batch file that I'll use to automate this process. The thing is: this Python script has an input function in it. Every day I have to run it, press "1", "enter", and that's it.
I've learned that with
python_location\python.exe python_script_location\test.py
I can run the script. I don't know, however, how to pass "1" to the input function that is triggered when I run the aforementioned batch code.
I've tried echo 1 | python_location\python.exe python_script_location\test.py and it gives me an 'EOF' error.
Here are some solutions. The idea is to write a piece of code that will check whether it needs to get the input from the user or from a set variable.
Solution 1:
Using command line arguments to set the input variable.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--some_var', default=None, required=False)
cli_args = parser.parse_args()

def get_input(var_name):
    if auto_input := getattr(cli_args, var_name, None):
        print("Auto input:", auto_input)
        return auto_input
    else:
        return input("Manual input: ")

some_var = get_input("some_var")
print(some_var)
If running manually, execute without arguments
$ python3 script.py
Manual input: 1
1
If running from a batch file, execute with arguments
$ python3 script.py --some_var=1
Auto input: 1
1
Solution 2:
Using environment variables to set the input variable.
import os

def get_input(var_name):
    if auto_input := os.getenv(var_name):
        print("Auto input:", auto_input)
        return auto_input
    else:
        return input("Manual input: ")

some_var = get_input("some_var")
print(some_var)
If running manually, execute without the environment variable
$ python3 script.py
Manual input: 1
1
If running from a batch file, execute with the environment variable
$ export some_var=1
$ python3 script.py
Auto input: 1
1
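If the script itself should stay untouched, another option is to feed the answer to its standard input from a small Python launcher. The sketch below assumes the script reads its answer with input() from stdin; the child_code string is only a stand-in for test.py. If the script calls input() more than once, each call needs its own line of text, and running out of lines might be one explanation for an EOF error.

```python
import subprocess
import sys

# Stand-in for the unmodified script: reads one line with input() and reports it
child_code = "choice = input(); print('got', choice)"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    input="1\n",  # text sent to the child's stdin; the newline plays the role of Enter
    capture_output=True,
    text=True,
    check=True,
)
answer = result.stdout.strip()
print(answer)
```

For the real script, the command list would be something like [sys.executable, r"python_script_location\test.py"], and the batch file would then call the launcher instead of the script.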
Let's say I have this snippet
list_command = 'mongo --host {host} --port {port} ' \
               '--username {username} --password {password} --authenticationDatabase {database} < {path}'

def shell_exec(cmd: str):
    import subprocess
    p = subprocess.call(cmd, shell=True)
    return p
Let's say these are the commands I'm trying to run on mongo
use users
show collections
db.base.find().pretty()
If I format the string list_command with the appropriate values and pass it to the function with shell=True, it works fine. But I'm trying to avoid shell=True for security purposes.
If I call it with shell=False, I get the following error:
2020-08-31T14:08:49.291+0100 E QUERY [thread1] SyntaxError: missing ; before statement @./mongo/user-01-09-2020:1:4
failed to load: ./mongo/user-01-09-2020
253
Your list_command is a shell command: in particular, it includes input redirection (via < {path}), which is a syntactic feature of the shell. To use it you need shell=True.
If you don’t want to use shell=True, you need to change the way you construct the argument (separate arguments need to be passed as separate items of a list rather than as a single string), and you need to pass the script into the standard input via an explicit pipe, by setting its input parameter:
cmd = ['mongo', '--host', '{host}', '--port', …]
subprocess.run(cmd, input=mongodb_script)
Using input raised the following error: TypeError: __init__() got an unexpected keyword argument 'input'.
I ended up doing the following:
import subprocess

def shell_exec(cmd: str, stdin=None):
    with open(stdin, 'rb') as f:
        return subprocess.call(cmd.split(), stdin=f)
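A slightly more defensive variant of the function above is sketched here. It assumes that arguments may contain quoted spaces (which cmd.split() would break apart, while shlex.split keeps them together) and that the caller wants the output back. run_with_stdin_file is a hypothetical name, and the child process is a stand-in that echoes its stdin rather than a real mongo client.

```python
import os
import shlex
import subprocess
import sys
import tempfile


def run_with_stdin_file(cmd, stdin_path):
    """Run cmd without a shell, feeding it the file at stdin_path on stdin."""
    # shlex.split honours quotes, unlike str.split; lists are used as given.
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    with open(stdin_path, "rb") as f:
        return subprocess.run(args, stdin=f, capture_output=True)


# Demonstration with a temporary script file and a stand-in for the client
with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as script:
    script.write("show collections\n")

echo_stdin = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
result = run_with_stdin_file(echo_stdin, script.name)
print(result.stdout)
os.unlink(script.name)
```

Opening the file and passing it as stdin does what the shell's < {path} redirection did, so the < part has to be left out of the command itself.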
Scenario
In my python script I need to run an executable file as a subprocess with x number of command line parameters which the executable is expecting.
Example:
EG 1: myexec.sh param1 param2
EG 2: myexec.sh param1 $MYPARAMVAL
The executable and parameters are not known in advance, as they are configured in and retrieved from an external source (XML config) at run time.
My code works when the parameter is a known, configured value (EG 1). However, the expectation is that a parameter could also be configured as an environment variable, which should be interpreted at run time (EG 2).
In the example below I am using echo as a substitute for myexec.sh to demonstrate the scenario.
This is simplified to demonstrate issue. 'cmdlst' is built from a configuration file, which could be any script with any number of parameters and values which could be a value or environment variable.
test1.py
import subprocess
import os

cmdlst = ['echo', 'param1', 'param2']
try:
    proc = subprocess.Popen(cmdlst, stdout=subprocess.PIPE)
    jobpid = proc.pid
    stdout_value, stderr_value = proc.communicate()
except (OSError, subprocess.CalledProcessError) as err:
    raise
print(stdout_value)
RESULT TEST 1
python test1.py
--> param1 param2
test2.py
import subprocess
import os

cmdlst = ['echo', 'param1', '$PARAM']
try:
    proc = subprocess.Popen(cmdlst, stdout=subprocess.PIPE)
    jobpid = proc.pid
    stdout_value, stderr_value = proc.communicate()
except (OSError, subprocess.CalledProcessError) as err:
    raise
print(stdout_value)
RESULT TEST 2
export PARAM=param2
echo $PARAM
--> param2
python test2.py
--> param1 $PARAM
I require Test 2 to produce the same result as Test 1, considering that $PARAM would only be known at run-time and need to be retrieved from the current environment.
I welcome your advice.
If you want to have the shell expand environment variables, you have to set shell=True:
subprocess.Popen('echo param1 $PARAM', shell=True, stdout=subprocess.PIPE)
Alternatively, you could just query the environment variable yourself when constructing the command, and then there is no need for shell expansion:
subprocess.Popen(['echo', 'param1', os.environ['PARAM']], stdout=subprocess.PIPE)
You could do:
cmdlist = ['echo','param',os.environ["PARAM"]]
Or:
cmdlist = ['echo','param1','$PARAM']
# env only sets the child's environment; without a shell, '$PARAM' is still passed literally
proc = subprocess.Popen(cmdlist, stdout=subprocess.PIPE, env={'PARAM': os.environ['PARAM']})
You cannot access the environment variable using $VAR without a shell. You'd use os.environ[..] instead:
cmdlst = ['echo','param1',os.environ['PARAM']]
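Since the command list comes from a configuration file and any argument might be a variable reference, one way to generalise the os.environ lookup is os.path.expandvars, which replaces $VAR and ${VAR} in a string with values from the current environment. This is a minimal sketch; expand_args is a hypothetical helper name, and PARAM is set inside the script only for the demonstration.

```python
import os


def expand_args(args):
    """Expand $VAR and ${VAR} in every argument using the current environment."""
    # Variables that are not defined are left unchanged by expandvars.
    return [os.path.expandvars(arg) for arg in args]


os.environ["PARAM"] = "param2"  # set here only for the demonstration
cmdlst = expand_args(["echo", "param1", "$PARAM"])
print(cmdlst)
```

The expanded list can then be passed to subprocess.Popen(cmdlst, stdout=subprocess.PIPE) exactly as in test2.py, with no shell involved.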
I already asked this and a few people gave good advice, but there were too many unknowns for me as I am a beginner. Therefore I decided to ask for help again without posting bad code.
I need a script that copies files to a directory while another command is still running.
Basically, I run the first command, it generates files (until the user presses enter), and then those files are gone (automatically removed).
What I would like is to copy those files (without having to press "Enter" as well).
I did this in bash, but I would like to achieve it in Python. Please see below:
while kill -0 $! 2>/dev/null;do
    cp -v /tmp/directory/* /tmp/
done
If the first script is purely command line, it should be fully manageable with a Python script.
General architecture :
the Python script starts the first one with the subprocess module
reads output from the first script until it gets the message asking to press enter
copies all files from the source directory to the destination directory
sends \r into the first script's input
waits for the first script to terminate
exits
General requirements :
the first script must be a purely CLI one
the first script must write to standard output/error and read from standard input - if it reads/writes to the physical terminal (/dev/tty on Unix/Linux or con: on Dos/Windows), it won't work
the end of processing must be identifiable in standard output/error
if the two requirements above are not met, the only way would be to wait a defined amount of time
Optional operation :
if there are other interactions in first script (read and/or write), it will be necessary to add the redirections in the script, it is certainly feasible, but will be a little harder
Configuration :
the command to be run
the string (from command output) that indicates first program has finished processing
the source directory
the destination directory
a pattern for file name to be copied
if a time is defined and there is no identifiable string in the output: the delay to wait before copying
A script like that should be simple to write and test and able to manage the first script as you want.
Edit : here is an example of such a script, still without timeout management.
import subprocess
import os
import shutil
import re

# default values for command execution - to be configured at installation
defCommand = "test.bat"
defEnd = "Appuyez"
defSource = "."
defDest = ".."
# BEWARE : pattern is in regex format !
defPattern = r"x.*\.txt"


class Launcher(object):
    '''
    Helper to launch a command, wait for a defined string from stderr or stdout
    of the command, copy files from a source folder to a destination folder,
    and write a newline to the stdin of the command.
    Limits : use blocking IO without timeout'''

    def __init__(self, command=defCommand, end=defEnd, source=defSource,
                 dest=defDest, pattern=defPattern):
        self.command = command
        self.end = end
        self.source = source
        self.dest = dest
        self.pattern = pattern

    def start(self):
        'Actually starts the command and copies the files'
        found = False
        pipes = os.pipe()  # use explicit pipes to mix stdout and stderr
        rx = re.compile(self.pattern)
        cmd = subprocess.Popen(self.command, shell=True, stdin=subprocess.PIPE,
                               stdout=pipes[1], stderr=pipes[1])
        os.close(pipes[1])
        while True:
            txt = os.read(pipes[0], 1024)
            #print(txt)  # for debug
            if not txt:  # EOF : command closed its output without the end string
                break
            if self.end in txt.decode(errors="replace"):
                found = True
                break
        # only try to copy files if end string found
        if found:
            for file in os.listdir(self.source):
                if rx.match(file):
                    shutil.copy(os.path.join(self.source, file), self.dest)
                    print("Copied : %s" % (file,))
            # copy done : write the newline to command input
            cmd.stdin.write(b"\n")
            cmd.stdin.close()
            try:
                cmd.wait()
                print("Command terminated with %d status" % (cmd.returncode,))
            except Exception:
                print("Calling terminate ...")
                cmd.terminate()
        os.close(pipes[0])


# allows to use the file either as an imported module or directly as a script
if __name__ == '__main__':
    # parse optional parameters
    # (no nargs=1 : values stay plain strings, like the defaults)
    import argparse
    parser = argparse.ArgumentParser(description='Launch a command and copy files')
    parser.add_argument('--command', '-c', default=defCommand,
                        help="full text of the command to launch")
    parser.add_argument('--endString', '-e', default=defEnd,
                        dest="end",
                        help="string that denotes that command has finished processing")
    parser.add_argument('--source', '-s', default=defSource,
                        help="source folder")
    parser.add_argument('--dest', '-d', default=defDest,
                        help="destination folder")
    parser.add_argument('--pattern', '-p', default=defPattern,
                        help="pattern (regex format) for files to be copied")
    args = parser.parse_args()
    # create and start a Launcher ...
    launcher = Launcher(args.command, args.end, args.source, args.dest,
                        args.pattern)
    launcher.start()
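For comparison, a closer Python analogue of the bash loop from the question (keep copying for as long as the first command is alive) could look like the sketch below. It assumes that polling the source directory at a short interval is acceptable. copy_while_running is a hypothetical name, and the child process and temporary directories only stand in for the real command and folders.

```python
import os
import shutil
import subprocess
import sys
import tempfile
import time


def copy_while_running(proc, source, dest, interval=0.1):
    """Copy every regular file from source to dest until proc exits."""
    copied = set()
    while proc.poll() is None:  # same role as `kill -0 $!` in the bash loop
        for name in os.listdir(source):
            path = os.path.join(source, name)
            if os.path.isfile(path):
                try:
                    shutil.copy(path, dest)
                    copied.add(name)
                except OSError:
                    pass  # the file vanished between listdir and copy
        time.sleep(interval)
    return copied


# Demonstration with temporary folders and a short-lived stand-in command
source = tempfile.mkdtemp()
dest = tempfile.mkdtemp()
with open(os.path.join(source, "x1.txt"), "w") as f:
    f.write("data")

proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"])
copied = copy_while_running(proc, source, dest)
print(sorted(copied))
```

Unlike the Launcher class, this does not need to recognise any prompt in the output, at the cost of copying the same files repeatedly while the command runs.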