DNA assembly with Velvet: it runs velveth but doesn't recognise velvetg

I have the code below to run Velvet. velveth runs with no problems, but velvetg does not recognise its parameters. I have checked the documentation and cannot see any difference from what I have. When the programme reaches velvetg, I get the following message: [0.000000] unknown option: -ins_length 500.
import glob, sys, os, subprocess
def velvet_project():
    print 'starting_velvet'
    # This is the directory where I copied the two test files. H* matches the
    # subdirectories; *.fastq.gz matches all the files with that extension.
    folders = glob.glob('/home/my_name/fastqs_test/H*')
    #print folders
    for folder in folders:
        print folder
        #looking for fastqs in each folder
        fastqs=glob.glob(folder + '/*.fastq.gz')
        #print fastqs
        strain_id = os.path.basename(folder)
        output= '/home/my_name/velvet_results/' + strain_id + '_velvet'
        if os.path.exists(output):
            print 'velvet folder already exists'
        else:
            os.makedirs(output)
        #cmd is a command line within the programme#
        cmd=['velveth', output, '59', '-fastq.gz','shortPaired',fastqs[0],fastqs[1]]
        #print cmd
        my_file=subprocess.Popen(cmd)#I got this from the documentation.
        my_file.wait()
        print 'velveth has finished'
        cmd_2=['velvetg', output, '-ins_length 500', '-exp_cov auto', '-scaffoding no']
        print cmd_2
        my_file_2=subprocess.Popen(cmd_2)
        my_file_2.wait()
        print "velvet has finished :)"
print 'start'
velvet_project()
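A likely cause, sketched under the assumption that velvetg expects each option name and its value as separate command-line arguments: subprocess hands every list element to the program as exactly one argument, so '-ins_length 500' arrives as a single, unknown option. The helper name build_velvetg_cmd and the example output directory are hypothetical; the option names are the ones used in the question.

```python
def build_velvetg_cmd(output_dir, options):
    """Return an argument list in which every flag and every value is its own element."""
    cmd = ['velvetg', output_dir]
    for flag, value in options:
        # Two separate elements: the flag, then its value as a string.
        cmd.extend([flag, str(value)])
    return cmd

# Hypothetical output directory, following the naming scheme in the question.
cmd_2 = build_velvetg_cmd(
    '/home/my_name/velvet_results/H1_velvet',
    [('-ins_length', 500), ('-exp_cov', 'auto')],
)
print(cmd_2)
```

The resulting list can be passed to subprocess.Popen exactly as before. The question also spells one option -scaffoding; that may be a typo for -scaffolding, so it is worth checking velvetg's own help output before adding it.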

Related

Running vulture from a python script

I'm trying to find a way to run vulture (which finds unused code in Python projects) from inside a Python script.
The vulture documentation can be found here:
https://pypi.org/project/vulture/
Does anyone know how to do it?
The only way I know to use vulture is through shell commands.
I tried to run the shell command from the script using the subprocess module, something like this:
process = subprocess.run(['vulture', '.'], check=True,
                         stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True)
which I thought would have the same effect as running the shell command "vulture .",
but it doesn't work.
Can anyone help?
Thanks

Vulture dev here.
The Vulture package exposes an API called scavenge, which it uses internally to run the analysis after parsing command-line arguments (in vulture.main).
It takes in a list of Python files/directories. For each directory, Vulture analyzes all contained *.py files.
To analyze the current directory:
import vulture
v = vulture.Vulture()
v.scavenge(['.'])
If you just want to print the results to stdout, you can call:
v.report()
However, it's also possible to perform custom analysis or filtering over Vulture's results. The get_unused_code method of the Vulture object returns a list of vulture.Item objects, which hold the name, type and location of unused code.
For the sake of this answer, I'm just going to print the names of all unused objects:
for item in v.get_unused_code():
    print(item.name)
For more info, see - https://github.com/jendrikseipp/vulture

I see you want to capture the output shown on the console.
The code below might help:
import tempfile
import subprocess
def run_command(args):
    # args is a single command string, because shell=True is used
    with tempfile.TemporaryFile() as t:
        try:
            out = subprocess.check_output(args, shell=True, stderr=t)
            return_code = 0
        except subprocess.CalledProcessError as e:
            # a nonzero exit status still carries the captured stdout
            out = e.output
            return_code = e.returncode
        t.seek(0)
        # both streams are bytes in Python 3, so decode before concatenating
        console_output = '--- Provided Command: --- ' + '\n' + args + '\n' + t.read().decode() + out.decode() + '\n'
    return return_code, console_output
Your expected output will be available in console_output.
Link:
https://docs.python.org/3/library/subprocess.html
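A shorter variant of the same idea, as a sketch only. It assumes the failure in the question comes from check=True, which makes subprocess.run raise CalledProcessError whenever the command exits with a nonzero status; whether vulture does so when it reports findings is an assumption here, not something the question confirms. The function name run_and_capture is hypothetical, and the child process below is a stand-in started with the current interpreter rather than vulture itself.

```python
import subprocess
import sys

def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (exit status, combined output)."""
    completed = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into the captured stdout
        universal_newlines=True,    # decode the captured bytes to str
    )
    # No check=True: a nonzero exit status is returned instead of raised.
    return completed.returncode, completed.stdout

# Stand-in child that prints one line and exits with status 3.
demo_cmd = [sys.executable, '-c',
            'import sys; print("unused function f"); sys.exit(3)']
status, output = run_and_capture(demo_cmd)
print(status, output)
```

Replacing demo_cmd with ['vulture', '.'] would apply the same pattern to the command from the question, provided vulture is installed and on the PATH.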

How do I call a python function with parameters in CGI?

I want to build an interface for browsing an Apache2 server using python scripts.
I spent the past few days learning Python and today getting familiar with CGI. I want to test a few things, such as letting the user navigate to a path of their choice from the base directory
/var/www/cgi-bin
by entering the path they want to visit, for example
/etc/httpd/conf.d.
For that I have the change_path.py script, which looks like this:
import os
def changePath(path):
    os.chdir(path)
I already got this script running to make sure everything is set up properly:
#!/usr/bin/python
# Import modules for CGI handling
import cgi, cgitb, os
cwd = os.getcwd()
print "Content-type:text/html\r\n\r\n"
print "<html>"
print "<head>"
print "<title>TestScript</title>"
print "</head>"
print "<body>"
print "<h2> Current working directory is: %s</h2>" % cwd
print "</body>"
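One possible way to get a parameter into the script, sketched in Python 3 under the assumption that the path arrives in the URL query string, which a CGI server exposes through the QUERY_STRING environment variable. The parameter name path and the helper names requested_path and render_page are hypothetical; the base directory is the one named in the question.

```python
import html
import os
from urllib.parse import parse_qs

BASE_DIR = '/var/www/cgi-bin'  # base directory named in the question

def requested_path(query_string, default=BASE_DIR):
    """Return the first 'path' value from a query string, or the default."""
    params = parse_qs(query_string)
    values = params.get('path')  # 'path' is a hypothetical parameter name
    return values[0] if values else default

def render_page(query_string):
    """Build the CGI response text for the given query string."""
    path = requested_path(query_string)
    # Escape the value so user input cannot inject markup into the page.
    body = "<h2>Requested directory is: %s</h2>" % html.escape(path)
    return "Content-type: text/html\r\n\r\n<html><body>%s</body></html>" % body

# In a real CGI run the query string would come from os.environ['QUERY_STRING'].
example = requested_path('path=%2Fetc%2Fhttpd%2Fconf.d')
print(render_page(os.environ.get('QUERY_STRING', '')))
```

The returned value could then be handed to changePath, ideally after checking that it names a directory the user is allowed to visit.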

Getting subprocess.run(....)'s output written to a file in python3.5 while benchmarking

I am writing a benchmarking driver program in Python. It takes some C++ source files (.cpp) as input and, for each one, compiles it into an executable (.out), runs that executable with a command-line argument, and measures the time it takes using /usr/bin/time.
In other words, the driver program tries to automate this statement (used to measure the timing of an executable):
/usr/bin/time ./way1.out 10 > way1.output
In this statement the input to way1.out (the C++ executable) is 10, the output of the C++ program is written to way1.output, and the timing information is printed to the console by /usr/bin/time. Of course, instead of just 10, the driver program has to run this command for all numbers from 1 to 10^6. It does this for every input C++ source file and, for each source file, writes the output of /usr/bin/time (for each value from 1 to 10^6) to another file, which will later be parsed for the benchmarking results of that source code.
This is my driver.py:
#!/usr/bin/env python3.5
import subprocess
import sys
n_limit = 1000000
files_list = ["./src/way1.cpp", "./src/way2.cpp"]
def compile_programs(files_list):
    try:
        for eachFile in files_list:
            subprocess.run(["g++", "-std=c++14", eachFile, "-o", eachFile.split(".")[1].split("/")[2] + ".out"], check=True)
    except:
        print("Some error occured")
        return False
    return True
def benchmark(files_list):
    if compile_programs(files_list):
        print("\n\n Compilation Successful")
    else:
        raise Exception("Compilation Problem")
    print("\n\n Benchmarking started..")
    for eachFile in files_list:
        current_file_name = eachFile.split(".")[1].split("/")[2]
        with open(current_file_name + ".results", 'w') as each_file_bench_results:
            for n in range(1, n_limit + 1):
                print(" Currently running for n =", n, " for filename:", eachFile)
                with open(current_file_name+".output", 'w') as current_output_file:
                    completed_process = subprocess.run(["/usr/bin/time", "./" + current_file_name + ".out", str(n)], stdout=current_output_file)
                    each_file_bench_results.write(completed_process.stdout)
                subprocess.run(["rm", current_file_name + ".output"])
        print()
    print("\n\n Benchmarking Complete.. Results files are with '.results' extensions")
if __name__ == "__main__":
    if (len(sys.argv) == 1):
        print("Using default ./src/way1.cpp and ./src/way2.cpp")
        benchmark(files_list)
    else:
        benchmark(*sys.argv[1:])
So I used Python 3's subprocess module and its run method, i.e. subprocess.run, on this line:
completed_process = subprocess.run(["/usr/bin/time", "./" + current_file_name + ".out", str(n)], stdout=current_output_file)
The C++ programs receive the input, execute, and write their output to a file, but the output of /usr/bin/time is printed on the terminal, so I tried this:
each_file_bench_results.write(completed_process.stdout)
But it turns out that completed_process.stdout is None, so it cannot be written to the file; if I comment this statement out, the output of /usr/bin/time is printed to the terminal.
So my question is: how do I get the output of /usr/bin/time written to each_file_bench_results?

Try redirecting both STDOUT and STDERR:
completed_process = subprocess.run(
    ["/usr/bin/time", "./" + current_file_name + ".out", str(n)],
    stdout=current_output_file, stderr=each_file_bench_results
)
It appears that /usr/bin/time (at least on my system) writes at least part of its output to STDERR. You can also use subprocess.check_output(), which returns the command's output to your program so you can process it before writing it anywhere.
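As background for the None in the question: CompletedProcess.stdout and CompletedProcess.stderr are only filled in when the corresponding stream is set to subprocess.PIPE; a stream redirected to a file object goes straight to that file. The sketch below shows one way to keep the program's output in a file while getting the stderr text back as a string. The function name run_and_split is hypothetical, and the child process is a stand-in started with the current interpreter instead of /usr/bin/time.

```python
import os
import subprocess
import sys
import tempfile

def run_and_split(cmd, stdout_path):
    """Send the child's stdout to stdout_path and return its stderr as text."""
    with open(stdout_path, 'w') as out_file:
        completed = subprocess.run(
            cmd,
            stdout=out_file,            # written directly to the file
            stderr=subprocess.PIPE,     # captured into completed.stderr
            universal_newlines=True,    # decode stderr to str
        )
    return completed.stderr

# Stand-in child: one line on stdout, one line on stderr.
child = [sys.executable, '-c',
         'import sys; print("result"); sys.stderr.write("timing info\\n")']
with tempfile.TemporaryDirectory() as scratch:
    out_path = os.path.join(scratch, 'way1.output')
    timing_text = run_and_split(child, out_path)
    with open(out_path) as f:
        program_output = f.read()
print(timing_text, program_output)
```

In the benchmarking loop, the returned text could be written to each_file_bench_results after any filtering the later parsing step needs.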

Python subprocess call rsync

I am trying to run rsync for each folder in a folder.
__author__ = 'Alexander'
import os
import subprocess
root ='/data/shares'
arguments=["--verbose", "--recursive", "--dry-run", "--human-readable", "--remove-source-files"]
remote_host = 'TL-AS203'
for folder in os.listdir(root):
    print 'Sync Team ' + folder.__str__()
    path = os.path.join(root,folder, 'in')
    if os.path.exists(path):
        folder_arguments = list(arguments)
        print (type(folder_arguments))
        folder_arguments.append("--log-file=" + path +"/rsync.log")
        folder_arguments.append(path)
        folder_arguments.append("transfer#"+remote_host+":/data/shares/"+ folder+"/out")
        print "running rsync with " + str(folder_arguments)
        returncode = subprocess.call(["rsync",str(folder_arguments)])
        if returncode == 0:
            print "pull successfull"
        else:
            print "error during rsync pull"
    else:
        print "not a valid team folder, in not found"
If I run this I get the following output:
Sync Team IT-Systemberatung
<type 'list'>
running rsync with ['--verbose', '--recursive', '--dry-run', '--human-readable', '--remove-source-files', '--log-file=/data/shares/IT-Systemberatung/in/rsync.log', '/data/shares/IT-Systemberatung/in', 'transfer#TL-AS203:/data/shares/IT-Systemberatung/out']
rsync: change_dir "/data/shares/IT-Systemberatung/['--verbose', '--recursive', '--dry-run', '--human-readable', '--remove-source-files', '--log-file=/data/shares/IT-Systemberatung/in/rsync.log', '/data/shares/IT-Systemberatung/in', 'transfer#TL-AS203:/data/shares/IT-Systemberatung" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1040) [sender=3.0.4]
error during rsync pull
Sync Team IT-Applikationsbetrieb
not a valid team folder, in not found
transfer#INT-AS238:/data/shares/IT-Systemberatung
If I manually start rsync from bash with these arguments, everything works fine. I also tried it with shell=True, but with the same result.

You need to do:
returncode = subprocess.call(["rsync"] + folder_arguments)
Calling str() on a list returns the string representation of the Python list, which is not what you want to pass as an argument to rsync.
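The difference can be seen without running rsync at all. This is a small illustration with a shortened, made-up argument list; the variable names wrong and right are only for the comparison.

```python
arguments = ['--verbose', '--dry-run', '/data/shares/team/in']

# str() collapses the whole list into ONE argument that looks like Python source.
wrong = ['rsync', str(arguments)]

# List concatenation keeps every option as its own argument.
right = ['rsync'] + arguments

print(len(wrong), wrong[1])
print(len(right), right)
```

With the first form rsync receives a single argument containing brackets, quotes and commas, and tries to treat it as a path, which matches the change_dir error shown in the question.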

You do an os.chdir(os.path.join(root,folder)), but never go back.
In order to properly resume operation on the next folder, you should either remember the previous os.getcwd() and return to it, or just do os.chdir('..') at the end of each loop iteration.
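The "remember and return" variant can be wrapped in a context manager so the original directory is restored even if an error occurs. This is a sketch; the name working_directory is hypothetical, and a temporary directory stands in for the team folders.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def working_directory(path):
    """Change into path for the duration of the with-block, then change back."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        os.chdir(previous)

start = os.getcwd()
with tempfile.TemporaryDirectory() as scratch:
    with working_directory(scratch):
        inside = os.getcwd()
after = os.getcwd()
print(start == after)
```

Inside the loop, each folder would be processed in its own with-block, so the next iteration always starts from the same place.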

Path Problems (command line)

I have created a script with an array containing file names. The script searches for pdf files through directories and sub-directories by recursion and adds them to an array. It then outputs a string into the command line for pdftk so as to merge them.
pdftk takes arguments such as:
pdftk inputpdf1.pdf inputpdf2.pdf cat output output.pdf
However, it seems that the path I pass in is not correct, judging by the error message I get from the Windows cmd (listed below). I get the same error on Ubuntu.
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
C:\Documents and Settings\student3>cd C:\Documents and Settings\student3\Desktop
\Test
C:\Documents and Settings\student3\Desktop\Test>pdftest.py
Merging C:\Documents and Settings\student3\Desktop\Test\1.pdf
pdftk "C:\Documents and Settings\student3\Desktop\Test\1.pdf" cat outputC:\Docum
ents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
pdftk "C:\Documents and Settings\student3\Desktop\Test\1.pdf" cat outputC:\Docum
ents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
Merging C:\Documents and Settings\student3\Desktop\Test\2.pdf
pdftk "C:\Documents and Settings\student3\Desktop\Test\2.pdf" cat outputC:\Docum
ents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
pdftk "C:\Documents and Settings\student3\Desktop\Test\2.pdf" cat outputC:\Docum
ents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
Merging C:\Documents and Settings\student3\Desktop\Test\brian\1.pdf
pdftk "C:\Documents and Settings\student3\Desktop\Test\brian\1.pdf" cat outputC:
\Documents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
pdftk "C:\Documents and Settings\student3\Desktop\Test\brian\1.pdf" cat outputC:
\Documents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
Merging C:\Documents and Settings\student3\Desktop\Test\brian\2.pdf
pdftk "C:\Documents and Settings\student3\Desktop\Test\brian\2.pdf" cat outputC:
\Documents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
pdftk "C:\Documents and Settings\student3\Desktop\Test\brian\2.pdf" cat outputC:
\Documents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
Merging C:\Documents and Settings\student3\Desktop\Test\testing\1.pdf
pdftk "C:\Documents and Settings\student3\Desktop\Test\testing\1.pdf" cat output
C:\Documents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
pdftk "C:\Documents and Settings\student3\Desktop\Test\testing\1.pdf" cat output
C:\Documents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
Merging C:\Documents and Settings\student3\Desktop\Test\testing\2.pdf
pdftk "C:\Documents and Settings\student3\Desktop\Test\testing\2.pdf" cat output
C:\Documents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
pdftk "C:\Documents and Settings\student3\Desktop\Test\testing\2.pdf" cat output
C:\Documents and Settings\student3\Desktop\Test\Output\.pdf
Error: Unexpected text in page reference, here:
outputC:\Documents
Exiting.
Acceptable keywords, here, are: "even", "odd", or "end".
Errors encountered. No output created.
Done. Input errors, so no output created.
Finished Processing
C:\Documents and Settings\student3\Desktop\Test>
This is the code for the script:
#----------------------------------------------------------------------------------------------
# Name: pdfMerger
# Purpose: Automatic merging of all PDF files in a directory and its sub-directories and
# rename them according to the folder itself. Requires the pyPDF Module
#
# Current: Processes all the PDF files in the current directory
# To-Do: Process the sub-directories.
#
# Version: 1.0
# Author: Brian Livori
#
# Created: 03/08/2011
# Copyright: (c) Brian Livori 2011
# Licence: Open-Source
#---------------------------------------------------------------------------------------------
#!/usr/bin/env python
import os
import glob
import sys
import fnmatch
import subprocess
path = str(os.getcwd())
x = 0
def process_file(_, path, filelist):
    os.path.walk(os.path.realpath(topdir), process_file, ())
    input_param = " ".join('"' + x + '"' for x in glob.glob(os.path.join(path, "*.pdf"))
    output_param = '"' + os.path.join(path, os.path.basename(path) + ".pdf") + '"'
    cmd = "pdftk " + input_param + " cat output " + output_param
    os.system(cmd)
    for filenames in os.walk (path):
        if "Output" in filenames:
            filenames.remove ("Output")
        if os.path.exists(final_output) != True:
            os.mkdir(final_output)
            sp = subprocess.Popen(cmd)
            sp.wait()
        else:
            sp = subprocess.Popen(cmd)
            sp.wait()
def files_recursively(topdir):
    os.path.walk(os.path.realpath(topdir), process_file, ())
files_recursively(path)
print "Finished Processing"
What exactly am I doing wrong?
File "C:\Documents and Settings\student3\Desktop\Test\pdftest2.py", line 32
output_param = '"' + os.path.join(path, os.path.basename(path) + ".pdf") + '"'
^
SyntaxError: invalid syntax
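A note on the SyntaxError itself: the line before the one the traceback points at, the input_param assignment, opens three parentheses and closes only two, and older Python versions report an unclosed parenthesis on the following line. Below is a balanced version of that expression as a Python 3 sketch; the helper name quoted_pdf_args is hypothetical, the sorted() call is an addition for a predictable order, and a temporary directory with empty files stands in for the real PDFs.

```python
import glob
import os
import tempfile

def quoted_pdf_args(path):
    """Return every *.pdf in path, each wrapped in double quotes, space-separated."""
    pdfs = sorted(glob.glob(os.path.join(path, "*.pdf")))
    # Three opening parentheses above, three closing ones: the call is balanced.
    return " ".join('"' + name + '"' for name in pdfs)

with tempfile.TemporaryDirectory() as demo_dir:
    # Empty placeholder files; only their names matter here.
    for name in ("1.pdf", "2.pdf"):
        open(os.path.join(demo_dir, name), "w").close()
    joined = quoted_pdf_args(demo_dir)
print(joined)
```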

You need to escape the paths by enclosing them in double quotes, because of the whitespace. Otherwise, your shell will interpret every space as a separator between arguments.
" ".join('"' + str(f) + '"' for f in filesArr)
Several more things:
You call PDFTK for every PDF. You should move that call out of the loop and build an input list of files (assuming you want to merge all input PDFs into one output PDF).
You are missing a space after cat output:
... " cat output " + outputpath + ext)
Your outputpath variable is empty.
Edit:
Your code is a little bit confusing. I would change the process_file method to this:
def process_file(_, path, filelist):
    input_param = " ".join('"' + x + '"' for x in glob.glob(os.path.join(path, "*.pdf")))
    output_param = r'"C:\ENTER\OUTPUT\PATH\HERE.PDF"'
    cmd = "pdftk " + input_param + " cat output " + output_param
    os.system(cmd)
I don't really understand why you need all those assignments there.
Edit 2:
Here is my full script:
#!/usr/bin/env python
import os
import glob
def process_file(_, path, filelist):
    # quote every PDF in this directory so spaces in paths survive the shell
    input_param = " ".join('"' + x + '"' for x in glob.glob(os.path.join(path, "*.pdf")))
    # name the merged file after the directory it lives in
    output_param = '"' + os.path.join(path, os.path.basename(path) + ".pdf") + '"'
    cmd = "pdftk " + input_param + " cat output " + output_param
    print cmd
    os.system(cmd)
def files_recursively(topdir):
    # os.path.walk (Python 2 only) calls process_file once per directory
    os.path.walk(os.path.realpath(topdir), process_file, ())
if __name__ == "__main__":
    files_recursively(os.getcwd())
Commands it produces:
pdftk "/home/user/pdf/Test1.pdf" "/home/user/pdf/Test3.pdf" "/home/user/pdf/Test2.pdf" cat output "/home/user/pdf/pdf.pdf"
pdftk "/home/user/pdf/Sub3/Test1.pdf" "/home/user/pdf/Sub3/Test3.pdf" "/home/user/pdf/Sub3/Test2.pdf" cat output "/home/user/pdf/Sub3/Sub3.pdf"
pdftk "/home/user/pdf/Sub2/Test1.pdf" "/home/user/pdf/Sub2/Test3.pdf" "/home/user/pdf/Sub2/Test2.pdf" cat output "/home/user/pdf/Sub2/Sub2.pdf"
pdftk "/home/user/pdf/Sub2/SubSub21/Test1.pdf" "/home/user/pdf/Sub2/SubSub21/Test3.pdf" "/home/user/pdf/Sub2/SubSub21/Test2.pdf" cat output "/home/user/pdf/Sub2/SubSub21/SubSub21.pdf"
pdftk "/home/user/pdf/Sub2/SubSub22/Test1.pdf" "/home/user/pdf/Sub2/SubSub22/Test3.pdf" "/home/user/pdf/Sub2/SubSub22/Test2.pdf" cat output "/home/user/pdf/Sub2/SubSub22/SubSub22.pdf"
pdftk "/home/user/pdf/Sub1/Test1.pdf" "/home/user/pdf/Sub1/Test3.pdf" "/home/user/pdf/Sub1/Test2.pdf" cat output "/home/user/pdf/Sub1/Sub1.pdf"
pdftk "/home/user/pdf/Sub1/SubSub2/Test1.pdf" "/home/user/pdf/Sub1/SubSub2/Test3.pdf" "/home/user/pdf/Sub1/SubSub2/Test2.pdf" cat output "/home/user/pdf/Sub1/SubSub2/SubSub2.pdf"
pdftk "/home/user/pdf/Sub1/SubSub1/Test1.pdf" "/home/user/pdf/Sub1/SubSub1/Test3.pdf" "/home/user/pdf/Sub1/SubSub1/Test2.pdf" cat output "/home/user/pdf/Sub1/SubSub1/SubSub1.pdf"

Instead of os.system() you should use subprocess.Popen. The subprocess module deals properly with spaces in filenames if you give the command and arguments as a list.
On Windows: the Popen class uses CreateProcess() to execute the child
program, which operates on strings. If args is a sequence, it will be
converted to a string using the list2cmdline method. Please note that
not all MS Windows applications interpret the command line the same
way: The list2cmdline is designed for applications using the same
rules as the MS C runtime.
In your example, that would be
cmd = ["pdftk"] + files_arr + ["cat", "output", outputpath + ext]
and then
sp = subprocess.Popen(cmd)
sp.wait()
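Put together, the list form might look like the sketch below. The helper name build_pdftk_cmd is hypothetical, and the paths are taken from the console output in the question with an assumed output file name. The command is only built and printed here; running it requires pdftk to be installed.

```python
def build_pdftk_cmd(input_pdfs, output_pdf):
    """Return the pdftk argument list; no manual quoting of paths is needed."""
    return ["pdftk"] + list(input_pdfs) + ["cat", "output", output_pdf]

inputs = [
    r"C:\Documents and Settings\student3\Desktop\Test\1.pdf",
    r"C:\Documents and Settings\student3\Desktop\Test\2.pdf",
]
# Hypothetical output name; the question leaves the file name empty.
cmd = build_pdftk_cmd(
    inputs, r"C:\Documents and Settings\student3\Desktop\Test\Output\Test.pdf")
print(cmd)

# To actually merge the files (assumes pdftk is on the PATH):
# import subprocess
# subprocess.check_call(cmd)
```

Because each path is one list element, the spaces in "Documents and Settings" stay inside a single argument, and "cat" and "output" can no longer run together with the output path.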
