I am trying to run a batch process using a job array in Slurm. I know a shell command that extracts a value from the array (stored in a text file), but I failed to assign its output to a Python variable.
I have to assign a variable in a Python Slurm script. I used a shell command to extract values from the array, but I get errors when assigning the result to the variable. I tried subprocess, os.system and os.popen.
Or is there a way to read values from a text file directly into a Python variable?
start_date = os.system('$(cat startdate.txt | sed -n ${SLURM_ARRAY_TASK_ID}p)')
start_date = subprocess.check_output("$(cat startdate.txt | sed -n ${SLURM_ARRAY_TASK_ID}p)", shell=True)
start_date = os.popen('$(cat startdate.txt | sed -n ${SLURM_ARRAY_TASK_ID}p)').read()
start_date = '07-24-2004'
Don't use $(...). That executes the command and then tries to execute the command's output as another command. You want the output sent back to Python, not re-executed by the shell.
start_date = subprocess.check_output("cat startdate.txt | sed -n ${SLURM_ARRAY_TASK_ID}p", shell=True)
Barmar is correct: the $(...) part is why you are not getting what you want. The real question, though, is why you would use cat and sed at all when you are already in Python. Just open the file and pull out the information you want:
import os
with open("startdate.txt", "r") as fh:
    lines = fh.readlines()
# SLURM_ARRAY_TASK_ID is a string; sed -n Np counts from 1, list indexes from 0
start_date = lines[int(os.environ['SLURM_ARRAY_TASK_ID']) - 1].strip()
The .strip() part gets rid of the newline character.
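A minimal sketch of the same idea wrapped in a function, assuming startdate.txt holds one date per line and that array task IDs start at 1 (as sed -n Np implies). The helper name read_task_line and the fallback to line 1 when the variable is unset are illustrative assumptions, not part of the answer above.

```python
import os


def read_task_line(path, task_id=None):
    """Return line number `task_id` (1-based) of `path`, without the newline."""
    if task_id is None:
        # Slurm exports the array index as a string, so convert it to int.
        # Assumption: fall back to line 1 when run outside a job array.
        task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "1"))
    with open(path, "r") as fh:
        # Stream the file instead of loading every line into memory.
        for lineno, line in enumerate(fh, start=1):
            if lineno == task_id:
                return line.strip()
    raise IndexError("%s has no line %d" % (path, task_id))
```

Inside the job script this would be called as start_date = read_task_line("startdate.txt").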
Related
I have a simple shell script script.sh:
echo "ubuntu:$1" | sudo chpasswd
I need to open the script, read it, insert the argument, and save it as a string like so: 'echo "ubuntu:arg_passed_when_opening" | sudo chpasswd' using Python.
All the options suggested here actually execute the script, which is not what I want.
Any suggestions?
You would do this the same way you read any text file, and you can use sys.argv to get the argument passed when running the Python script.
Example:
import sys
with open('script.sh', 'r') as sfile:
    modified_file_contents = sfile.read().replace('$1', sys.argv[1])
With this method, modified_file_contents is a string containing the text of the file, but with $1 replaced by the argument passed to the Python script when it was run.
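The substitution step on its own can be sketched as below, using the script text from the question as an inline string so the example runs without script.sh. The function name fill_first_argument is a hypothetical helper.

```python
def fill_first_argument(script_text, argument):
    """Return script_text with every literal "$1" replaced by argument."""
    # Plain text substitution: nothing is executed.
    return script_text.replace("$1", argument)


template = 'echo "ubuntu:$1" | sudo chpasswd'
command_string = fill_first_argument(template, "arg_passed_when_opening")
print(command_string)
```

One caveat of a plain replace: it would also rewrite the first two characters of $10, $11 and so on, which does not matter for this script but could for a longer one.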
I am running a Python script which takes the dump of CSVs from a Postgres database and then I want to escape double quotes in all these files. So I am using sed to do so.
In my Python code:
sed_for_quotes = 'sed -i s/\\"//g /home/ubuntu/PTOR/csvdata1/'+table+'.csv'
subprocess.call(sed_for_quotes, shell=True)
The process completes without any error, but when I load these tables into Redshift, I get the error No delimiter found. On checking the CSV, I find that one of the rows is only half-written: for example, if it is a timestamp column, only half of the value is there, and there is no data after it in the file (while the actual CSV had that data before running sed). That leads to the No delimiter found error.
But when I run sed -i s/\"//g filename.csv on these files in the shell, it works fine and the CSV has all the rows after running sed. I have checked that there is no problem with the data in the files.
Why does this not work from a Python program? I have also tried using sed -i.bak in the Python program, but that makes no difference.
Please note that I am using an extra backslash (\) in the Python code because I need to escape the other backslash.
Other approaches tried:
Using subprocess.Popen without any buffer size and with a positive buffer size, but that didn't help.
Using subprocess.Popen(sed_for_quotes,bufsize=-4096) (negative buffer size) worked for one of the files which was giving the error, but then I encountered the same problem in another file.
Do not use an intermediate shell when you do not need one. Also check the return code of the subprocess to make sure it completed successfully (check_call does this for you):
path_to_file = ... # e.g. '/home/ubuntu/PTOR/csvdata1/' + table + '.csv'
subprocess.check_call(['sed', '-i', 's/"//g', path_to_file])
By "intermediate" shell I mean the shell process that subprocess starts to parse the command (splitting it on whitespace, among other things) and then run it (sed, in this example). Since you know exactly which arguments sed should be invoked with, you do not need any of this, and it is best avoided.
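If sed is not strictly required, the same edit can be done without any subprocess at all. The sketch below swaps sed for Python's own string handling, so it is an alternative to the answer above rather than its method. It assumes the CSV fits in memory, and strip_double_quotes is a hypothetical helper name.

```python
def strip_double_quotes(path):
    """Remove every double-quote character from the file at `path`, in place."""
    # newline="" keeps the file's original line endings untouched.
    with open(path, "r", newline="") as fh:
        text = fh.read()
    with open(path, "w", newline="") as fh:
        fh.write(text.replace('"', ""))
```

Because the file is read completely before it is reopened for writing, a failed read leaves the original untouched.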
Put your sed into a shell script, e.g.
#!/bin/bash
# Parameter $1 = Filename
sed -i 's/"//g' "$1"
Call your shell script using subprocess:
sed_for_quotes = 'my_sed_script /home/ubuntu/PTOR/csvdata1/'+table+'.csv'
To turn that string into an argument list for subprocess, use shlex.split (docs.python.org/3.6):
shlex.split(s, comments=False, posix=True)
Split the string s using shell-like syntax.
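A short sketch of what shlex.split produces for commands like the ones in this thread. The file names my table.csv and data.csv are made up for illustration; the first shows that a quoted path containing a space stays one argument.

```python
import shlex

# Hypothetical file name with a space, quoted the way a shell would need it.
command = "my_sed_script '/home/ubuntu/PTOR/csvdata1/my table.csv'"
argv = shlex.split(command)
print(argv)

# The sed invocation itself: the single quotes protect the double quote.
sed_argv = shlex.split("sed -i 's/\"//g' data.csv")
print(sed_argv)
```

Either list can then be passed to subprocess.check_call without shell=True.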
I'm running grep from within a python script like so:
last_run_start = os.system("cat %(file)s | grep '[0-24]:[0-59]' | tail -n1" % locals())
Which pulls out the last timestamp in file. When I do this through the Python command line, or use that grep command through regular terminal, I get what would be expected - the last line containing a timestamp.
However, when run from this script last_run_start is returning this:
18:23:45
0
What's causing this '0' to appear, let alone on a new line? More importantly, how can I remove it from last_run_start?
os.system returns the exit code of the command you've run, which in this case seems to be 0.
The output of the command goes directly to stdout and isn't stored in last_run_start. If you want to capture it, use Popen or check_output from the subprocess module.
I guess the 0 ends up being printed because you're printing last_run_start somewhere.
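The difference between an exit status and captured output can be sketched as follows. So that the example runs anywhere, it uses the current Python interpreter as a stand-in for the cat/grep/tail pipeline; that stand-in command is an assumption made purely for illustration.

```python
import subprocess
import sys

# Stand-in command (assumption): a child Python process printing a timestamp.
command = [sys.executable, "-c", "print('18:23:45')"]

# check_output returns what the command wrote to stdout (bytes in Python 3).
raw = subprocess.check_output(command)
last_run_start = raw.decode().strip()

# call returns only the exit status, like os.system; here the output is discarded.
status = subprocess.call(command, stdout=subprocess.DEVNULL)

print(repr(last_run_start), status)
```

Here last_run_start holds the text and status holds the 0, which are the two things that appeared mixed together in the question.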
I am fairly new to Python. The goal is to execute a shell command using subprocess, then parse and retrieve the output printed by the shell. The execution fails as shown in the sample output below, which follows the code snippet.
Code snippet:
testStr = "cat tst.txt | grep Location | sed -e '/.*Location: //g' "
print "testStr = "+testStr
testStrOut = subprocess.Popen([testStr],shell=True,stdout=subprocess.PIPE).communicate()[0]
Output:
testStr = cat tst.txt | grep Location | sed -e '/.*Location: //g'
cat: tst.txt: No such file or directory
sed: -e expression #1, char 15: unknown command: `/'
Is there a workaround or a function that could be used?
Appreciate your help
Thanks
I suppose your main errors are not Python-related. To be precise, there are three of them:
You forgot to import subprocess.
It should be sed -e 's/.*Location: //g'. You wrote ///g instead of s///g.
tst.txt does not exist.
You should be passing testStr directly as the first argument, rather than enclosing it in a list. See subprocess.Popen, the paragraph that starts "On Unix, with shell=True: ...".
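For comparison, the grep and sed steps of that pipeline can also be written directly in Python. This is a sketch of an alternative, not the fix the answer describes; extract_locations is a hypothetical helper, and it operates on text that has already been read from the file.

```python
import re


def extract_locations(text):
    """Mimic: grep Location | sed -e 's/.*Location: //g' on a block of text."""
    results = []
    for line in text.splitlines():
        if "Location" in line:
            # Drop everything up to and including the last "Location: ".
            results.append(re.sub(r".*Location: ", "", line))
    return results
```

With the file contents in hand, extract_locations(open("tst.txt").read()) would return the values as a list, provided tst.txt exists in the working directory.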
I am trying to write Python code that takes some information from top and writes it to a file.
I want to enter just the name of the application and have the file generated. The problem I am having is that I can't capture the output of the pidof command to use it in Python. My code looks like this:
import os
a = input('Name of the application')
val=os.system('pidof ' + str(a))
os.system('top -d 30 | grep' + str(val) + '> test.txt')
os.system('awk '{print $10, $11}' test.txt > test2.txt')
The problem is that val is always 0, although the command does return the PID I want. Any input would be great.
First up, in Python 2 the use of input() is discouraged because it evaluates what the user types as a Python expression. Use raw_input() instead (in Python 3, input() already returns the raw string):
app = raw_input('Name of the application: ')
Next up, the return value from system('pidof') isn't the PID, it's the exit code from the pidof command, i.e. zero on success, non-zero on failure. You want to capture the output of pidof.
import subprocess
# Python 2.7+
pid = int(subprocess.check_output(['pidof', app]))
# Python 2.4+
pid = int(subprocess.Popen(['pidof', app], stdout=subprocess.PIPE).communicate()[0])
# Older (deprecated)
pid = int(os.popen('pidof ' + app).read())
The next line is missing a space after the grep and would have resulted in a command like grep1234. Using the string formatting operator % will make this a little easier to spot:
os.system('top -d 30 | grep %d > test.txt' % (pid))
The third line is badly quoted and should have caused a syntax error. Watch out for the single quotes inside of single quotes.
os.system("awk '{print $10, $11}' test.txt > test2.txt")
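A Python 3 sketch of the PID capture, with one extra precaution: pidof prints several space-separated PIDs when more than one matching process is running, in which case a bare int(...) raises ValueError. The helper names parse_pids and pids_of are hypothetical, and pids_of assumes pidof is installed.

```python
import subprocess


def parse_pids(output):
    """Turn pidof output such as b"1234 5678" into a list of ints."""
    if isinstance(output, bytes):
        output = output.decode()
    return [int(field) for field in output.split()]


def pids_of(app):
    """Return the PIDs reported by pidof for `app`, or [] if none are running."""
    try:
        # Argument list, no shell: the name is passed to pidof verbatim.
        return parse_pids(subprocess.check_output(["pidof", app]))
    except subprocess.CalledProcessError:
        # pidof exits non-zero when it finds no matching process.
        return []
```

Passing the name as a list element also means that whatever the user types is never interpreted by a shell.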
Instead of os.system, I recommend using the subprocess module: http://docs.python.org/library/subprocess.html#module-subprocess
With that module, you can communicate (input and output) with a shell. The documentation explains the details of how to use it.
Hope this helps!