Need a script to iterate over files and execute a command


Please bear with me, I've not used python before, and I'm trying to get some rendering done as quickly as possible and getting stopped in my tracks with this.
I'm outputting the .ifd files to a network drive (Z:), and they are stored in a folder structure like this:
Z:
- \0001
- \0002
- \0003
I need to iterate over the ifd files within a single folder, but the number of files is not static so there also needs to be a definable range (1-300, 1-2500, etc). The script therefore has to be able to take an additional two arguments for a start and end range.
On each iteration it executes something called 'mantra' using this statement;
mantra -f file.FRAMENUMBER.ifd outputFile.FRAMENUMBER.png
I've found a script on the internet that is supposed to do something similar;
import sys, os
#import command line args
args = sys.argv
# get args as string
szEndRange = args.pop()
szStartRange = args.pop()
#convert args to int
nStartRange = int(szStartRange, 10);
nEndRange = int(szEndRange, 10);
nOrd = len(szStartRange);
#generate ID range
arVals = range(nStartRange, nEndRange+1);
for nID in arVals:
    szFormat = 'mantra -V a -f testDebris.%%(id)0%(nOrd)dd.ifd' % {"nOrd": nOrd};
    line = szFormat % {"id": nID};
    os.system(line);
The problem I'm having is that I can't get it to work. It seems to iterate, and do something - but it looks like it's just spitting out ifds into a different folder somewhere.
TLDR;
I need a script which will at least take two arguments;
startFrame
endFrame
and from those create a frameRange, which is then used to iterate over all ifd files executing the following command;
mantra -f fileName.currentframe.ifd fileName.currentFrame.png
If I were able to specify the filename, the files directory and the output directory, that'd be great too. I've tried doing that manually, but there must be some convention I don't know, as it was coming up with errors when I tried (stopping at the colon).
If anyone could hook me up or point me in the right direction that'd be swell. I know I should try and learn python, but I'm at my wits end with the rendering and need a helping hand.

import os, subprocess, sys
if len(sys.argv) != 3:
    print('Must have 2 arguments!')
    print('Correct usage is "python answer.py input_dir output_dir"')
    sys.exit(1)
input_dir = sys.argv[1]
output_dir = sys.argv[2]
input_file_extension = '.txt'
cmd = 'currentframe'  # stands in for the frame number in the file names
# iterate over the contents of the directory
for f in os.listdir(input_dir):
    # index of last period in string
    fi = f.rfind('.')
    # separate filename from extension
    file_name = f[:fi]
    file_ext = f[fi:]
    # create args
    input_str = '%s.%s.ifd' % (os.path.join(input_dir, file_name), cmd)
    # pass the directory and the file name to os.path.join as separate components
    output_str = '%s.%s.png' % (os.path.join(output_dir, file_name), cmd)
    cli_args = ['mantra', '-f', input_str, output_str]
    # call mantra with an argument list (no shell needed); non-zero means failure
    if subprocess.call(cli_args):
        print('An error has occurred with command "%s"' % ' '.join(cli_args))
This should be enough for you to use as is or with slight modification.

Instead of specifying a start and end range explicitly, you could just do:
import os
# os.walk() yields a (path, dirs, files) tuple per directory; take the first one
path, dirs, files = next(os.walk("/Your/Path/Here"))
nEndRange = len(files)
#generate ID range
arVals = range(1, nEndRange+1);
os.walk() lists the files in the folder that you specified, and len(files) counts them.
Although, an even easier way of getting your desired output is like this:
import os
for filename in os.listdir('dirname'):
    # you might need to play around with the output file name
    line = 'mantra -f ' + filename + ' outputFile.FRAMENUMBER.png'
    os.system(line);
This works because os.listdir() returns every name in the specified directory and filename takes each one in turn, so you don't even need to count them.

A little help building the command:
for nID in arVals:
    command = 'mantra -V a -f '
    # filename is the base name of the sequence, e.g. 'testDebris'
    infile = '{0}.{1:04d}.ifd '.format(filename, nID)
    outfile = '{0}.{1:04d}.png '.format(filename, nID)
    os.system(command + infile + outfile);
And definitely use os.walk or os.listdir like #logic recommends:
for file in os.listdir("Z:"):
    filebase = os.path.splitext(file)[0]
    command = 'mantra -V a -f {0}.ifd {0}.png'.format(filebase)
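Putting the pieces from the answers together, here is a minimal sketch of a frame-range version. It only assumes the `mantra -f input.ifd output.png` form quoted in the question; the function names, the `Z:\renders` output folder and the four-digit padding are hypothetical choices for illustration, not something the question confirms.

```python
import os
import subprocess


def build_commands(input_dir, output_dir, base_name, start, end, pad=4):
    """Return one mantra argument list per frame in start..end (inclusive)."""
    commands = []
    for frame in range(start, end + 1):
        # e.g. base_name 'testDebris', frame 7, pad 4 -> 'testDebris.0007'
        stem = "{0}.{1:0{2}d}".format(base_name, frame, pad)
        ifd_path = os.path.join(input_dir, stem + ".ifd")
        png_path = os.path.join(output_dir, stem + ".png")
        commands.append(["mantra", "-f", ifd_path, png_path])
    return commands


def render(commands):
    """Run each command in turn and return the ones that failed."""
    failed = []
    for cmd in commands:
        # Passing a list avoids shell quoting problems with drive letters and spaces.
        if subprocess.call(cmd) != 0:
            failed.append(cmd)
    return failed


# Dry run: print the commands for frames 1-3 instead of executing them.
# 'Z:\\renders' is a made-up output folder.
for cmd in build_commands("Z:\\0001", "Z:\\renders", "testDebris", 1, 3):
    print(" ".join(cmd))
```

Once the printed commands look right, passing the same list to `render()` would execute them one after another. The start and end values could just as well come from `sys.argv`, converted with `int()`.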

Related

How to execute a python script in bash in parallel

I am trying to run a bash script multiple times on a cluster. The issue, however, is that I need to grab certain file names to fill in the command, which I only know how to do via Python.
Note: I want to run the last line (the line that calls the script) in parallel, in groups of about two. How can I do this?
I have thought of outputting all the commands to a .txt file and catting that in parallel. However, I feel that this is not the most efficient approach.
Thank you for any help
The script looks like this:
#!/usr/bin/python
import os
import sys
cwd = os.getcwd()
for filename in os.listdir(cwd):
    if "_2_" in filename:
        continue
    elif "_1_" in filename:
        in1 = os.path.join(cwd, filename)
        secondread = filename.replace("_1.fastq_1_trimmed.fq","_1.fastq_2_trimmed.fq")
        in2 = os.path.join(cwd, secondread)
        outrename = filename.replace("_1.fastq_1_trimmed.fq",".bam")
        out = "/home/blmatsum/working/bamout/" + outrename
        cmd = "bbmap.sh ref=/home/blmatsum/working/datafiles/sequence.phages_clustered.fna in={} in2={} out={}".format(in1,in2,out)
        os.system(cmd)
An example of the commands I want to run would be:
bbmap.sh ref=/home/working/datafiles/sequence.phages_clustered.fna in=/home/working/trimmed/SRR7077355_1.fastq_1_trimmed.fq in2=/home/working/trimmed/SRR7077355_1.fastq_2_trimmed.fq out=/home/working/bamout/SRR7077355.bam
bbmap.sh ref=/home/working/datafiles/sequence.phages_clustered.fna in=/home/working/trimmed/SRR7077366_1.fastq_1_trimmed.fq in2=/home/working/trimmed/SRR7077366_1.fastq_2_trimmed.fq out=/home/working/bamout/SRR7077366.bam
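One possible way to run these commands two at a time from Python itself is a small thread pool that starts one subprocess per task. This is only a sketch and assumes Python 3 (for `concurrent.futures`); the function names and the `workers` parameter are made up for illustration, and the file-name suffixes are taken from the script above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

SUFFIX_1 = "_1.fastq_1_trimmed.fq"
SUFFIX_2 = "_1.fastq_2_trimmed.fq"


def build_commands(filenames, in_dir, out_dir, ref):
    """Build one bbmap.sh argument list per forward-read file."""
    commands = []
    for name in sorted(filenames):
        if not name.endswith(SUFFIX_1):
            continue  # skip second reads and unrelated files
        sample = name[: -len(SUFFIX_1)]
        commands.append([
            "bbmap.sh",
            "ref=" + ref,
            "in={0}/{1}".format(in_dir, name),
            "in2={0}/{1}{2}".format(in_dir, sample, SUFFIX_2),
            "out={0}/{1}.bam".format(out_dir, sample),
        ])
    return commands


def run_parallel(commands, workers=2):
    """Run the commands with at most `workers` running at once; return exit codes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread blocks in subprocess.call until its command finishes.
        return list(pool.map(subprocess.call, commands))
```

Calling `run_parallel(build_commands(os.listdir(cwd), cwd, out_dir, ref))` would then keep two `bbmap.sh` processes busy until the list is exhausted. Threads are sufficient here because the real work happens in the child processes.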

Getting rid of the apostrophe in subprocess.Popen to move files

My script generates multiple files that contain random names based on the info it extracts. I created this test to try to move all new files created during a run into a new directory named after the file being run.
When I use os.popen("mv " + moveFiles +' ' + filename + "_dir") it works just fine, but os.popen is considered insecure due to shellshock.
When switching to cmd = Popen(["mv", str(moveFiles), filename + "_dir"]), I get the following error:
mv: cannot stat '/home/test/testing/TestFile1.txt
/home/test/testing/TestFile2.txt': No such file or directory
I believe this is due to it adding an apostrophe at the beginning and end of the moveFiles variable, so that it tries to move it as 1 file rather than 2. It works when a single file is created, but any more than that results in the error. Is there a way to remove this?
'/home/test/testing/TestFile1.txt /home/test/testing/TestFile2.txt'
from subprocess import Popen, PIPE
def createDir(filename):
    """
    createDir creates the folder of the file/argument given (Example.txt_dir)
    """
    Dir = str(filename) + "_dir"
    cmd = Popen(["mkdir", Dir], stdout=PIPE, stderr=PIPE)
def createFiles(filename):
    """
    createFiles creates test files to move into Example.txt_dir
    """
    with open('TestFile1.txt', 'w') as m:
        cmd = Popen(["file", filename], stdout=m, stderr=PIPE)
        print('Saved as TestFile1.txt')
    with open('TestFile2.txt', 'w') as m:
        cmd = Popen(["file", filename], stdout=m, stderr=PIPE)
        print('Saved as TestFile2.txt')
def dirDifference(dir1, dir2):
    """
    dirDifference compares 2 paths, 1 before being run and 1 after, to get a list of all new files to be moved
    """
    #Compares Directory before and after running
    dif = [i for i in dir1 + dir2 if i not in dir1 or i not in dir2]
    separator = ' '
    x = separator.join(map(str, dif))
    return x
def moveDir(filename, moveFiles):
    """
    moveDir: Moves the new files to the directory.
    """
    Dir = str(filename) + "_dir"
    cmd = Popen(["mv", moveFiles, filename + "_dir"])

Your suspicion is correct: the problem is that you have two filenames joined together with a space. Since you're using Popen() and not os.popen() you're bypassing shell interpretation of the arguments, which means that individual filenames aren't getting separated. This is the same as if you had used quotes on the command line:
mv 'file1 file2' destination
mv: cannot stat 'file1 file2': No such file or directory
You've asked it to move a single file whose name has a space in the middle. What you need to do is make each filename a separate element in the list in Popen():
cmd = Popen(["mv", file1, file2, destination])
In the case of your code above, instead of dirDifference() returning filenames joined together with spaces, it could simply return a list, which you could use with Popen():
cmd = Popen(["mv"] + moveFiles + [filename + "_dir"])
(making sure that moveFiles is a non-empty list of course)
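As a rough sketch of that suggestion, the difference can be kept as a list and spliced into the argument list for mv. The helper names below are hypothetical, and `dir_difference` here only reports names that are new in the second listing, which is a simplification of the comparison in the question.

```python
from subprocess import Popen


def dir_difference(before, after):
    """Return the names that appear in `after` but not in `before`, as a list."""
    known = set(before)
    return [name for name in after if name not in known]


def mv_command(move_files, dest_dir):
    """Build the argument list for mv, or None when there is nothing to move."""
    if not move_files:
        return None
    # One list element per file name, so no shell splitting is required.
    return ["mv"] + list(move_files) + [dest_dir]


def move_new_files(move_files, dest_dir):
    """Run mv on the new files and return its exit status (0 if nothing to do)."""
    args = mv_command(move_files, dest_dir)
    if args is None:
        return 0
    return Popen(args).wait()
```

With `before = os.listdir(path)` taken ahead of the run and `after = os.listdir(path)` taken afterwards, `move_new_files(dir_difference(before, after), filename + "_dir")` would move every new file in a single mv call.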

How to use os.system to convert all files in a folder at once using external python script

I've managed to find out how to convert a file from one file extension to another (.evtx to .xml) using an external script. Below is what I am using:
os.system("file_converter.py file1.evtx > file1.xml")
This successfully converts a file from .evtx to .xml using the external script I called (file_converter.py).
I am now trying to find out how I can use os.system, or perhaps another method, to convert more than one file at once. I would like my program to dive into a folder and convert all 10 files I have there to .xml format in one go.
My questions are these: how is this possible, given that os.system only takes one argument? And how do I make it look through a directory? The first file I converted was in my home directory, but the folder with the 10 files is inside another folder. I also want each file name to stay the same, with the only difference being '.evtx' changed to '.xml' at the end.
The file "file_converter.py" is downloadable from here

import threading
import os
base_dir = "C:\\Users\\carlo.zanocco\\Desktop\\test_dir\\"
def file_converter(file):
    # os.listdir() returns bare names, so build the full path first
    path = os.path.join(base_dir, file)
    os.system("file_converter.py {0} > {1}".format(path, path.replace(".evtx", ".xml")))
for file in os.listdir(base_dir):
    # only convert .evtx files
    if file.endswith(".evtx"):
        threading.Thread(target=file_converter, args=(file,)).start()
Here is my sample code.
You can start multiple threads to run the conversions "concurrently". The program goes through all the files in the directory and converts each one.
EDIT python2.7 version
Now that we have more information about what you want, I can help you.
This program converts multiple files from one folder concurrently; it also looks into the subfolders.
import subprocess
import os
base_dir = "C:\\Users\\carlo.zanocco\\Desktop\\test_dir\\"
commands_to_run = list()
#Search all files
def file_list(directory):
    allFiles = list()
    for entry in os.listdir(directory):
        fullPath = os.path.join(directory, entry)
        #if it is a directory, search it for more files
        if os.path.isdir(fullPath):
            allFiles = allFiles + file_list(fullPath)
        else:
            #check that the file has the right extension and append the command to execute later
            if(entry.endswith(".evtx")):
                commands_to_run.append("C:\\Python27\\python.exe file_converter.py {0} > {1}".format(fullPath, fullPath.replace(".evtx", ".xml")))
    return allFiles
print "Searching for files"
file_list(base_dir)
print "Running conversion"
processes = [subprocess.Popen(command, shell=True) for command in commands_to_run]
print "Waiting for converted files"
for process in processes:
    process.wait()
print "Conversion done"
The subprocess module can be used in two ways here:
subprocess.Popen: runs the process and continues execution without waiting for it
subprocess.call: runs the process and waits for it; this function returns the exit status, where zero indicates that the process terminated successfully
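To make the difference concrete, here is a small sketch. The command is a hypothetical stand-in (a Python one-liner run through `sys.executable`) so that it works without file_converter.py being present.

```python
import subprocess
import sys

# Hypothetical stand-in for the real converter command.
cmd = [sys.executable, "-c", "print('converted')"]

# subprocess.call blocks until the child exits and returns its exit status.
status = subprocess.call(cmd)

# subprocess.Popen returns immediately, so several children can run at once;
# wait() then blocks until each one finishes and returns its exit status.
procs = [subprocess.Popen(cmd) for _ in range(3)]
statuses = [p.wait() for p in procs]

print(status, statuses)
```

Starting all the `Popen` objects first and only waiting afterwards is what lets the conversions overlap; calling `subprocess.call` in a loop would run them strictly one after another.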
EDIT python3.7 version
If you want to solve all your problems, just implement the code that you shared from GitHub in your program. You can easily turn it into a function.
import threading
import os
import Evtx.Evtx as evtx
import Evtx.Views as e_views
base_dir = "C:\\Users\\carlo.zanocco\\Desktop\\test_dir\\"
def convert(file_in, file_out):
    tmp_list = list()
    with evtx.Evtx(file_in) as log:
        tmp_list.append(e_views.XML_HEADER)
        tmp_list.append("<Events>")
        for record in log.records():
            try:
                tmp_list.append(record.xml())
            except Exception as e:
                print(e)
        tmp_list.append("</Events>")
    with open(file_out, 'w') as final:
        final.writelines(tmp_list)
#Search all files
def file_list(directory):
    allFiles = list()
    for entry in os.listdir(directory):
        fullPath = os.path.join(directory, entry)
        #if it is a directory, search it for more files
        if os.path.isdir(fullPath):
            allFiles = allFiles + file_list(fullPath)
        else:
            #check that the file has the right extension and start the conversion in a thread
            if(entry.endswith(".evtx")):
                threading.Thread(target=convert, args=(fullPath, fullPath.replace(".evtx", ".xml"))).start()
    return allFiles
print("Searching and converting files")
file_list(base_dir)
If you want to see the files while they are being generated, change the convert function as follows:
def convert(file_in, file_out):
    with evtx.Evtx(file_in) as log:
        with open(file_out, 'a') as final:
            final.write(e_views.XML_HEADER)
            final.write("<Events>")
            for record in log.records():
                try:
                    final.write(record.xml())
                except Exception as e:
                    print(e)
            final.write("</Events>")
UPDATE
If you want to delete the '.evtx' files after the conversion, you can simply add the following lines at the end of the convert function:
try:
    os.remove(file_in)
except OSError as ex:
    raise ex
A plain try .. except is enough here, because the thread is only started when the input path is a file.
In general, os.remove() raises an exception if the file doesn't exist, so in other situations you would check os.path.isfile() first.

import os, sys
DIR = "D:/Test"
# ...or as a command line argument
DIR = sys.argv[1]
for f in os.listdir(DIR):
    path = os.path.join(DIR, f)
    name, ext = os.path.splitext(f)
    if ext == ".txt":
        new_path = os.path.join(DIR, f"{name}.xml")
        os.rename(path, new_path)
This iterates over a directory and renames every .txt file to .xml. Note that it only changes the extension; the file contents are not converted.

Why is my write function not creating a file?

According to all the sources I've read, the open method creates a file or overwrites one with an existing name. However, when I try to use it I get an error:
File not found - newlist.txt (Access is denied)
I/O operation failed.
I tried to read a file, and couldn't. Are you sure that file exists? If it does exist, did you specify the correct directory/folder?
def getIngredients(path, basename):
    ingredient = []
    filename = path + '\\' + basename
    file = open(filename, "r")
    for item in file:
        if item.find("name") > -1:
            startindex = item.find("name") + 5
            endindex = item.find("<//name>") - 7
            ingredients = item[startindex:endindex]
            ingredient.append(ingredients)
    del ingredient[0]
    del ingredient[4]
    for item in ingredient:
        printNow(item)
    file2 = open('newlist.txt', 'w+')
    for item in ingredient:
        file2.write("%s \n" % item)
As you can see, I'm trying to write the list I've made into a file, but it's not creating the file like it should. I've tried all the different modes for the open function and they all give me the same error.

It looks like you do not have write access to the current working directory. You can get the Python working directory with import os; print os.getcwd().
You should then check whether you have write access in this directory. This can be done in Python with
import os
cwd = os.getcwd()
print "Write access granted to current directory", cwd, '>', os.access(cwd, os.W_OK)
If you get False (no write access), then you must put your newlist.txt file somewhere else (maybe at path + '/newlist.txt'?).
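As a small sketch of that idea, the file can be written to an explicitly chosen directory after checking write access, instead of relying on the current working directory. The `save_list` helper and the sample items are made up for illustration.

```python
import os
import tempfile


def save_list(items, directory, filename="newlist.txt"):
    """Write one item per line into directory/filename and return the full path."""
    if not os.access(directory, os.W_OK):
        raise IOError("no write access to " + directory)
    full_path = os.path.join(directory, filename)
    # 'with' closes the file, which also flushes the lines to disk
    with open(full_path, "w") as out:
        for item in items:
            out.write("%s\n" % item)
    return full_path


# Example: write into a fresh temporary directory, which is normally writable.
demo_dir = tempfile.mkdtemp()
print(save_list(["flour", "sugar"], demo_dir))
```

In the question's function, passing `path` as the directory would place newlist.txt next to the input file, provided that folder is writable.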

Are you certain that the directory you're trying to create the file in exists?
If it does NOT, then the OS won't be able to create the file.

This looks like a permissions problem.
Either the directory does not exist, or your user doesn't have permission to write into this directory.

I guess the possible problems may be:
1) You are passing the path and basename as parameters. If you are passing the parameters as strings, then you may get this problem:
For example:
def getIngredients(path, basename):
    ingredient = []
    filename = path + '\\' + basename
getIngredients("D","newlist.txt")
If you pass the parameters as above, you are effectively doing this:
filename = "D" + "\\" + "newlist.txt"
2) You did not include a colon (:) after the drive letter in the path, so the filename becomes "D\newlist.txt" instead of "D:\newlist.txt".
3) Maybe the file does not exist.

Python Windows CMD mklink, stops working without error message

I want to create symlinks for each file in a nested directory structure, where all symlinks will be put in one large flat folder, and have the following code by now:
import os, re, subprocess
# loop over directory structure:
# for all items in current directory,
# if item is directory, recurse into it;
# else it's a file, then create a symlink for it
def makelinks(folder, targetfolder, cmdprocess = None):
    if not cmdprocess:
        cmdprocess = subprocess.Popen("cmd",
                                      stdin = subprocess.PIPE,
                                      stdout = subprocess.PIPE,
                                      stderr = subprocess.PIPE)
    print(folder)
    for name in os.listdir(folder):
        fullname = os.path.join(folder, name)
        if os.path.isdir(fullname):
            makelinks(fullname, targetfolder, cmdprocess)
        else:
            makelink(fullname, targetfolder, cmdprocess)
#for a given file, create one symlink in the target folder
def makelink(fullname, targetfolder, cmdprocess):
    linkname = os.path.join(targetfolder, re.sub(r"[\/\\\:\*\?\"\<\>\|]", "-", fullname))
    if not os.path.exists(linkname):
        try:
            os.remove(linkname)
            print("Invalid symlink removed:", linkname)
        except: pass
    if not os.path.exists(linkname):
        cmdprocess.stdin.write("mklink " + linkname + " " + fullname + "\r\n")
So this is a top-down recursion where first the folder name is printed, then the subdirectories are processed. If I run this now over some folder, the whole thing just stops after 10 or so symbolic links.
The program still seems to run but no new output is generated. It created 9 symlinks for some files in the # tag & reencode and the first three files in the ChillOutMix folder. The cmd.exe Window is still open and empty, and shows in its title bar that it is currently processing the mklink command for the third file in ChillOutMix.
I tried to insert a time.sleep(2) after each cmdprocess.stdin.write in case Python is just too fast for the cmd process, but it doesn't help.
Does anyone know what the problem might be?

Why not just execute mklink directly?

Try this at the end:
    if not os.path.exists(linkname):
        fullcmd = "mklink " + linkname + " " + fullname + "\r\n"
        print(fullcmd)
        cmdprocess.stdin.write(fullcmd)
See what commands it prints. You may see a problem.
It may need double quotes around mklink's arguments, since they sometimes contain spaces.
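A minimal sketch of the quoting idea, assuming the paths themselves contain no double quotes (Windows does not allow that character in file names). The helper names and example paths are hypothetical.

```python
def quote(path):
    """Wrap a Windows path in double quotes so that spaces stay inside one argument."""
    return '"{0}"'.format(path)


def mklink_line(linkname, fullname):
    """Build the line to send to cmd.exe, with both paths quoted."""
    return "mklink {0} {1}\r\n".format(quote(linkname), quote(fullname))


# Example with spaces in both paths (made-up locations).
print(mklink_line("C:\\flat\\a b.mp3", "D:\\music\\Chill Out\\a b.mp3"))
```

The result of `mklink_line(linkname, fullname)` could then replace the string concatenation passed to `cmdprocess.stdin.write`.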
