The problem: I want to iterate over folder in search of certain file type, then execute it with a program and the name.ext as argument, and then run python script that changes the output name of the first program.
I know there is probably a better way to do the above, but the way I thought of was this:
[BAT]
for /R "C:\..\folder" %%a IN (*.extension) do ( SET name=%%a "C:\...\first_program.exe" "%%a" "C:\...\script.py" "%name%" )
[PY]
import io
import sys
def rename(i):
name = i
with open('my_file.txt', 'r') as file:
data = file.readlines()
data[40] ='"C:\\\\Users\\\\UserName\\\\Desktop\\\\folder\\\\folder\\\\' + name + '"\n'
with open('my_file.txt', 'w') as file:
file.writelines( data )
if __name__ == "__main__":
rename(sys.argv[1])
Expected result: I wish the python file changed the name, but after putting it once into the console it seems to stay with the script. The BAT does not change it and it bothers me.
PS. If there is a better way, I'll be glad to get to know it.
This is the linux bash version, I am sure you can change the loop etc to make it work as batch file instead of your *.exe I use cat as a generic input output example
#! /bin/sh
for f in *.txt
do
suffix=".txt"
name=${f%$suffix}
cat $f > tmp.dat
awk -v myName=$f '{if(NR==5) print $0 myName; else print $0 }' tmp.dat > $name.dat
done
This produces "unique" output *.dat files named after the input *.txt files. The files are treated by cat (virtually your *.exe) and the output is put into a temorary file. Eventually, this is handled by awk changing line 5 here. with the output placed in the unique file, as mentioned above.
Related
How come this command python test.py <(cat file1.txt) does not work accordingly. I could've sworn I had this working previously. Basically, I would like to get the output of that cat command as an input to the python script.
This command
cat file1.txt | python test.py
works okay, which outputs:
reading: file11
reading: file12
Which are based on the following scripts/files below.
The reason I want this to work is because I really want to feed in 2 input files like
python test.py <(cat file1.txt) <(cat file2.txt)
And would like some python output like:
reading: file11 file21
reading: file12 file22
I know this is a very simple example, and I can just read in or open() both files inside the python script and iterate accordingly. This is a simplified version of my current screnario, the cat command is technically another executable doing other things, so its not as easy as just reading/opening the file to read.
Sample script/files:
test.py:
import sys
for line in sys.stdin:
print("reading: ", line.strip())
sys.stdin.close()
file1.txt:
file11
file12
file2.txt:
file21
file22
changing test.py to:
import sys
input1 = open(sys.argv[1], "r")
input2 = open(sys.argv[2], "r")
for line1, line2 in zip(input1, input2):
print("reading: ", line1.strip(), line2.strip())
input1.close()
input2.close()
will enable python test.py <(cat file1.txt) <(cat file2.txt) to work
Actually it depends on shell you are using.
I guess you use bash which unfortunately can't have it working as only last redirection from specific descriptor is taken. You could create temporary file, redirect output of scripts to it and then feed your main script with tmp file.
Or if you don't mind you can switch e.g to zsh, which has such feature enabled by default.
I want to run a bash script from a python program. The script has a command like this:
find . -type d -exec bash -c 'cd "$0" && gunzip -c *.gz | cut -f 3 >> ../mydoc.txt' {} \;
Normally I would run a subprocess call like:
subprocess.call('ls | wc -l', shell=True)
But that's not possible here because of the quoting signs. Any suggestions?
Thanks!
While the question is answered already, I'll still jump in because I assume that you want to execute that bash script because you do not have the functionally equivalent Python code (which is lees than 40 lines basically, see below).
Why do this instead the bash script?
Your script now is able to run on any OS that has a Python interpreter
The functionality is a lot easier to read and understand
If you need anything special, it is always easier to adapt your own code
More Pythonic :-)
Please bear in mind that is (as your bash script) without any kind of error checking and the output file is a global variable, but that can be changed easily.
import gzip
import os
# create out output file
outfile = open('/tmp/output.txt', mode='w', encoding='utf-8')
def process_line(line):
"""
get the third column (delimiter is tab char) and write to output file
"""
columns = line.split('\t')
if len(columns) > 3:
outfile.write(columns[3] + '\n')
def process_zipfile(filename):
"""
read zip file content (we assume text) and split into lines for processing
"""
print('Reading {0} ...'.format(filename))
with gzip.open(filename, mode='rb') as f:
lines = f.read().decode('utf-8').split('\n')
for line in lines:
process_line(line.strip())
def process_directory(dirtuple):
"""
loop thru the list of files in that directory and process any .gz file
"""
print('Processing {0} ...'.format(dirtuple[0]))
for filename in dirtuple[2]:
if filename.endswith('.gz'):
process_zipfile(os.path.join(dirtuple[0], filename))
# walk the directory tree from current directory downward
for dirtuple in os.walk('.'):
process_directory(dirtuple)
outfile.close()
Escape the ' marks with a \.
i.e. For every: ', replace with: \'
Triple quotes or triple double quotes ('''some string''' or """some other string""") are handy as well. See here (yeah, its python3 documentation, but it all works 100% in python2)
mystring = """how many 'cakes' can you "deliver"?"""
print(mystring)
how many 'cakes' can you "deliver"?
Having some issues calling awk from within Python. Normally, I'd do the following to call the command in awk from the command line.
Open up command line, in admin mode or not.
Change my directory to awk.exe, namely cd R\GnuWin32\bin
Call awk -F "," "{ print > (\"split-\" $10 \".csv\") }" large.csv
My command is used to split up the large.csv file based on the 10th column into a number of files named split-[COL VAL HERE].csv. I have no issues running this command. I tried to run the same code in Python using subprocess.call() but I'm having some issues. I run the following code:
def split_ByInputColumn():
subprocess.call(['C:/R/GnuWin32/bin/awk.exe', '-F', '\",\"',
'\"{ print > (\\"split-\\" $10 \\".csv\\") }\"', 'large.csv'],
cwd = 'C:/R/GnuWin32/bin/')
and clearly, something is running when I execute the function (CPU usage, etc) but when I go to check C:/R/GnuWin32/bin/ there are no split files in the directory. Any idea on what's going wrong?
As I stated in my previous answer that was downvoted, you overprotect the arguments, making awk argument parsing fail.
Since there was no comment, I supposed there was a typo but it worked... So I suppose that's because I should have strongly suggested a full-fledged python solution, which is the best thing to do here (as stated in my previous answer)
Writing the equivalent in python is not trivial as we have to emulate the way awk opens files and appends to them afterwards. But it is more integrated, pythonic and handles quoting properly if quoting occurs in the input file.
I took the time to code & test it:
def split_ByInputColumn():
# get rid of the old data from previous runs
for f in glob.glob("split-*.csv"):
os.remove(f)
open_files = dict()
with open('large.csv') as f:
cr = csv.reader(f,delimiter=',')
for r in cr:
tenth_row = r[9]
filename = "split-{}.csv".format(tenth_row)
if not filename in open_files:
handle = open(filename,"wb")
open_files[filename] = (handle,csv.writer(handle,delimiter=','))
open_files[filename][1].writerow(r)
for f,_ in open_files.values():
f.close()
split_ByInputColumn()
in detail:
read the big file as csv (advantage: quoting is handled properly)
compute the destination filename
if filename not in dictionary, open it and create csv.writer object
write the row in the corresponding dictionary
in the end, close file handles
Aside: My old solution, using awk properly:
import subprocess
def split_ByInputColumn():
subprocess.call(['awk.exe', '-F', ',',
'{ print > ("split-" $10 ".csv") }', 'large.csv'],cwd = 'some_directory')
Someone else posted an answer (and then subsequently deleted it), but the issue was that I was over-protecting my arguments. The following code works:
def split_ByInputColumn():
subprocess.call(['C:/R/GnuWin32/bin/awk.exe', '-F', ',',
'{ print > (\"split-\" $10 \".csv\") }', 'large.csv'],
cwd = 'C:/R/GnuWin32/bin/')
I am converting some Python scripts I wrote in a Windows environment to run in Unix (Red Hat 5.4), and I'm having trouble converting the lines that deal with filepaths. In Windows, I usually read in all .txt files within a directory using something like:
pathtotxt = "C:\\Text Data\\EJC\\Philosophical Transactions 1665-1678\\*\\*.txt"
for file in glob.glob(pathtotxt):
It seems one can use the glob.glob() method in Unix as well, so I'm trying to implement this method to find all text files within a directory entitled "source" using the following code:
#!/usr/bin/env python
import commands
import sys
import glob
import os
testout = open('testoutput.txt', 'w')
numbers = [1,2,3]
for number in numbers:
testout.write(str(number + 1) + "\r\n")
testout.close
sourceout = open('sourceoutput.txt', 'w')
pathtosource = "/afs/crc.nd.edu/user/d/dduhaime/data/hill/source/*.txt"
for file in glob.glob(pathtosource):
with open(file, 'r') as openfile:
readfile = openfile.read()
souceout.write (str(readfile))
sourceout.close
When I run this code, the testout.txt file comes out as expected, but the sourceout.txt file is empty. I thought the problem might be solved if I change the line
pathtosource = "/afs/crc.nd.edu/user/d/dduhaime/data/hill/source/*.txt"
to
pathtosource = "/source/*.txt"
and then run the code from the /hill directory, but that didn't resolve my problem. Do others know how I might be able to read in the text files in the source directory? I would be grateful for any insights others can offer.
EDIT: In case it is relevant, the /afs/ tree of directories referenced above is located on a remote server that I'm ssh-ing into via Putty. I'm also using a test.job file to qsub the Python script above. (This is all to prepare myself to submit jobs on the SGE cluster system.) The test.job script looks like:
#!/bin/csh
#$ -M dduhaime#nd.edu
#$ -m abe
#$ -r y
#$ -o tmp.out
#$ -e tmp.err
module load python/2.7.3
echo "Start - `date`"
python tmp.py
echo "Finish - `date`"
Got it! I had misspelled the output command. I wrote
souceout.write (str(readfile))
instead of
sourceout.write (str(readfile))
What a dunce. I also added a newline bit to the line:
sourceout.write (str(readfile) + "\r\n")
and it works fine. I think it's time for a new IDE!
You haven't really closed the file. The function testout.close() isn't called, because you have forgotten the parentheses. The same is for sourceout.close()
testout.close
...
sourceout.close
Has to be:
testout.close()
...
sourceout.close()
If the program finishes all files are automatically closed so it is only important if you reopen the file.
Even better (the pythonic version) would be to use the with statement. Instead of this:
testout = open('testoutput.txt', 'w')
numbers = [1,2,3]
for number in numbers:
testout.write(str(number + 1) + "\r\n")
testout.close()
you would write this:
with open('testoutput.txt', 'w') as testout:
numbers = [1,2,3]
for number in numbers:
testout.write(str(number + 1) + "\r\n")
In this case the file will be automatically closed even when an error occurs.
In the Linux kernel, I can send a file to the printer using the following command
cat file.txt > /dev/usb/lp0
From what I understand, this redirects the contents in file.txt into the printing location. I tried using the following command
>>os.system('cat file.txt > /dev/usb/lp0')
I thought this command would achieve the same thing, but it gave me a "Permission Denied" error. In the command line, I would run the following command prior to concatenating.
sudo chown root:lpadmin /dev/usb/lp0
Is there a better way to do this?
While there's no reason your code shouldn't work, this probably isn't the way you want to do this. If you just want to run shell commands, bash is much better than python. On the other hand, if you want to use Python, there are better ways to copy files than shell redirection.
The simplest way to copy one file to another is to use shutil:
shutil.copyfile('file.txt', '/dev/usb/lp0')
(Of course if you have permissions problems that prevent redirect from working, you'll have the same permissions problems with copying.)
You want a program that reads input from the keyboard, and when it gets a certain input, it prints a certain file. That's easy:
import shutil
while True:
line = raw_input() # or just input() if you're on Python 3.x
if line == 'certain input':
shutil.copyfile('file.txt', '/dev/usb/lp0')
Obviously a real program will be a bit more complex—it'll do different things with different commands, and maybe take arguments that tell it which file to print, and so on. If you want to go that way, the cmd module is a great help.
Remember, in UNIX - everything is a file. Even devices.
So, you can just use basic (or anything else, e.g. shutil.copyfile) files methods (http://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files).
In your case code may (just a way) be like that:
# Read file.txt
with open('file.txt', 'r') as content_file:
content = content_file.read()
with open('/dev/usb/lp0', 'w') as target_device:
target_device.write(content)
P. S. Please, don't use system() call (or similar) to solve your issue.
under windows OS there is no cat command you should usetype instead of cat under windows
(**if you want to run cat command under windows please look at: https://stackoverflow.com/a/71998867/2723298 )
import os
os.system('type a.txt > copy.txt')
..or if your OS is linux and cat command didn't work anyway here are other methods to copy file..
with grep:
import os
os.system('grep "" a.txt > b.txt')
*' ' are important!
copy file with sed:
os.system('sed "" a.txt > sed.txt')
copy file with awk:
os.system('awk "{print $0}" a.txt > awk.txt')