Converting a file from .sam to .bam using python subprocess - python

I would like to start out by saying any help is greatly appreciated. I'm new to Python and scripting in general. I am trying to use samtools view to convert a file from .sam to .bam. I need to do in Python what this Bash command does:
samtools view -bS aln.sam > aln.bam
I understand that shell operators like |, > and < are handled in Python through the stdin, stdout and stderr arguments of subprocess. I have tried a few different methods and still can't get my Bash command converted correctly. I have tried:
cmd = subprocess.call(["samtools view","-bS"], stdin=open(aln.sam,'r'), stdout=open(aln.bam,'w'), shell=True)
and
from subprocess import Popen
with open(SAMPLE+ "."+ TARGET+ ".sam",'wb',0) as input_file:
    with open(SAMPLE+ "."+ TARGET+ ".bam",'wb',0) as output_file:
        cmd = Popen([Dir+ "samtools-1.1/samtools view",'-bS'],
                    stdin=(input_file), stdout=(output_file), shell=True)
but I am still not getting samtools to convert the .sam file to a .bam file. What am I doing wrong?

Abukamel is right, but in case you (or others) are wondering about your specific examples:
You're not too far off with your first attempt, just a few minor items:
Filenames should be in quotes
The Bash command passes aln.sam to samtools as a named argument rather than feeding it through stdin, so the Python call should do the same
"samtools" and "view" should be separate items in the argument list
You don't need "shell=True" since you're not using shell tricks like redirection
So you can do:
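import subprocess
# Open the output in binary mode (BAM is a binary format) and close it when done.
with open('aln.bam', 'wb') as output_file:
    subprocess.call(["samtools", "view", "-bS", "aln.sam"],
                    stdout=output_file)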
Your second example has more or less the same issues, so would need to be changed to something like:
from subprocess import Popen
with open('aln.bam', 'wb', 0) as output_file:
    cmd = Popen(["samtools", "view", '-bS', 'aln.sam'],
                stdout=output_file)
    cmd.wait()  # wait for samtools to finish before moving on
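The pattern in both corrected examples (argument list, no shell, stdout pointed at an open file) can be wrapped in a small helper. This is only a minimal sketch: `run_to_file` is a hypothetical name, not part of subprocess or samtools, and the demonstration uses the Python interpreter as a stand-in command so that it runs without samtools installed.

```python
import os
import subprocess
import sys
import tempfile


def run_to_file(args, out_path):
    """Run a command given as a list (no shell) and write its stdout to out_path.

    Hypothetical helper: roughly what `command > out_path` does in a shell.
    """
    # Binary mode matters for formats such as BAM, which are not text.
    with open(out_path, "wb") as out_file:
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(args, stdout=out_file, check=True)


# Stand-in command so the sketch runs anywhere. For the question's case the
# call would be: run_to_file(["samtools", "view", "-bS", "aln.sam"], "aln.bam")
demo_path = os.path.join(tempfile.mkdtemp(), "demo_output.bin")
run_to_file([sys.executable, "-c", "import sys; sys.stdout.write('demo')"], demo_path)
```

Using `check=True` is a design choice here: a failed conversion raises an exception instead of silently leaving an empty or partial output file behind.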

You can pass execution to the shell with the keyword argument shell=True:
subprocess.call('samtools view -bS aln.sam > aln.bam', shell=True)
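If the file names come from variables rather than being hard-coded, a shell command string needs quoting so that spaces or special characters are not interpreted by the shell. A minimal sketch, assuming a POSIX shell; `build_shell_command` is a hypothetical helper name used only for illustration:

```python
import shlex


def build_shell_command(sam_path, bam_path):
    """Build the samtools command string with both paths quoted for the shell."""
    return "samtools view -bS {} > {}".format(
        shlex.quote(sam_path), shlex.quote(bam_path)
    )


# Paths containing spaces are wrapped in single quotes; simple names are left as-is.
print(build_shell_command("my sample.sam", "my sample.bam"))
```

The resulting string could then be passed to `subprocess.call(..., shell=True)` as in the answer above.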

Related

How to get shell output into python after it has run the bash command

Does anyone know how to get the output of a Bash command run from Python back into Python?
I am running:
import subprocess
output = subprocess.run("ls -l", shell=True, stdout=subprocess.PIPE,
                        universal_newlines=True)
print(output.stdout)
How do I get the output of ls -l back into my Python code? I can have it dump into a file and then read that file in my Python code, but is there an easier way, where my code can read the output directly, without the additional file?
Thank you in advance.
You could use subprocess.getoutput.
subprocess.getoutput("ls -l")
The function returns a string, which you can then parse.
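As an alternative that avoids the shell altogether, the output can also be captured with subprocess.run and an argument list. This is a minimal sketch, assuming Python 3.7 or later (for `capture_output` and `text`); `capture_lines` is a hypothetical helper name, and the Python interpreter stands in for `ls -l` so the example runs on any platform.

```python
import subprocess
import sys


def capture_lines(args):
    """Run a command without a shell and return its stdout as a list of lines."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# Stand-in for ["ls", "-l"] so the sketch also runs where ls is unavailable.
lines = capture_lines([sys.executable, "-c", "print('first'); print('second')"])
print(lines)
```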

Executing awk command from python

I am trying to execute the following awk command from a python script
awk 'BEGIN {FS="\t"}; {print $1"\t"$2}' file_a > file_b
For this, I tried to use subprocess as follows:
subprocess.check_output(["awk", 'BEGIN {FS="\t"}; {print $1"\t"$2}',
                         file_a, ">",
                         file_b])
where file_a and file_b are strings pointing to the path of the files.
From this, I am getting the error
awk: cannot open > (No such file or directory)
I'm sure I'm passing the arguments to subprocess in the wrong way, but I can't figure out what's wrong.
While it may look like it in your shell of choice, >, <, and | are not actually passed as arguments to the program you run. Rather, they're a special part of the shell that the program never gets to see.
Since they're part of the shell, and not part of the OS or program, you have to emulate their effects yourself with the normal facilities the language gives you. In your case, since you're trying to redirect output to a file, simply use Python's open() as you would normally. The subprocess API supports arguments to specify stdout, stdin, and stderr, and you can supply any file object for those.
Check it out:
with open(file_b, 'wb') as f:
    subprocess.call(["awk", 'BEGIN {FS="\t"}; {print $1"\t"$2}', file_a], stdout=f)
Since subprocess.check_output redirects output already, it doesn't take the stdout argument. Using subprocess.call avoids this. If you also need the output later in the script, you can instead assign the return value of check_output to a variable, and then save that to file_b.
If you use a lot of shell commands, you might also want to check out Plumbum, which gives you a large set of fairly silly shell-like operator overloads.
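The same idea extends to the | operator mentioned above: instead of asking a shell to build the pipe, the stdout of one process can be connected to the stdin of the next. The following is a minimal sketch of that pattern; `pipe` is a hypothetical helper name, and both commands are stand-ins run through the Python interpreter so the example does not depend on awk or sort being installed.

```python
import subprocess
import sys

# Stand-in commands; in practice these would be real programs,
# for example ["awk", ...] feeding ["sort"].
producer_cmd = [sys.executable, "-c", "print('b'); print('a')"]
consumer_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(''.join(sorted(sys.stdin)))",
]


def pipe(first_cmd, second_cmd):
    """Emulate `first_cmd | second_cmd` and return the final stdout as text."""
    first = subprocess.Popen(first_cmd, stdout=subprocess.PIPE)
    second = subprocess.Popen(
        second_cmd, stdin=first.stdout, stdout=subprocess.PIPE, text=True
    )
    # Close our copy so the first process is notified if the second exits early.
    first.stdout.close()
    output, _ = second.communicate()
    first.wait()
    return output


print(pipe(producer_cmd, consumer_cmd))
```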

python subprocess.popen redirect to create a file

I've been searching for how to do this without any success. I've inherited a python script for performing an hourly backup on our database. The original script is too slow, so I'm trying a faster utility. My new command would look like this if typed into a shell:
pg_basebackup -h 127.0.0.1 -F t -X f -c fast -z -D - --username=postgres > db.backup.tgz
The problem is that the original script uses call(cmd) and it fails if the above is the cmd string. I've been looking for how to modify this to use popen but cannot find any examples where a file-creating redirect is used, as in >. The pg_basebackup command as shown will output to stdout. The only way I've succeeded so far is to change -D - to -D some.file.tgz and then move the file to the archive, but I'd rather do this in one step.
Any ideas?
Jay
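Maybe like this?
with open("db.backup.tgz", "wb") as backup_file:
    # Only stdout goes to the archive; sending stderr there too would mix
    # error messages into the backup data.
    p = subprocess.Popen(cmd_without_redirector, stdout=backup_file, shell=True)
    p.wait()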
Hmmm... The pg_basebackup executable must be able to attach to that file. If I open the file in the manner you suggest, I don't know the correct syntax in python to be able to do that. If I try putting either " > " or " >> " in the string to call with cmd(), python pukes on it. That's my real problem that I'm not finding any guidance on.
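To address the follow-up: the program does not need to "attach" to the file itself. Python opens the file and hands it to the child process as its stdout, which is what the shell does for >. A minimal sketch of that idea, assuming the command is kept as a string as in the inherited script; `backup_to_file` is a hypothetical helper name, and shlex.split is used to turn the string into an argument list so that no shell is needed.

```python
import shlex
import subprocess

# The command from the question, minus the shell-only "> db.backup.tgz" part.
cmd = "pg_basebackup -h 127.0.0.1 -F t -X f -c fast -z -D - --username=postgres"
args = shlex.split(cmd)


def backup_to_file(args, out_path):
    """Run args without a shell, writing stdout to out_path; return the exit code."""
    # Binary mode, since the output is a compressed archive.
    with open(out_path, "wb") as out_file:
        return subprocess.call(args, stdout=out_file)


# Usage (requires pg_basebackup on PATH):
# exit_code = backup_to_file(args, "db.backup.tgz")
```

Checking the returned exit code before archiving the file is advisable, since a failed run can still leave a partial file behind.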

Using cat command in Python for printing

On Linux, I can send a file to the printer using the following command
cat file.txt > /dev/usb/lp0
From what I understand, this redirects the contents in file.txt into the printing location. I tried using the following command
>>> os.system('cat file.txt > /dev/usb/lp0')
I thought this command would achieve the same thing, but it gave me a "Permission Denied" error. In the command line, I would run the following command prior to concatenating.
sudo chown root:lpadmin /dev/usb/lp0
Is there a better way to do this?
While there's no reason your code shouldn't work, this probably isn't the way you want to do this. If you just want to run shell commands, bash is much better than python. On the other hand, if you want to use Python, there are better ways to copy files than shell redirection.
The simplest way to copy one file to another is to use shutil:
shutil.copyfile('file.txt', '/dev/usb/lp0')
(Of course if you have permissions problems that prevent redirect from working, you'll have the same permissions problems with copying.)
You want a program that reads input from the keyboard, and when it gets a certain input, it prints a certain file. That's easy:
import shutil
while True:
    line = raw_input() # or just input() if you're on Python 3.x
    if line == 'certain input':
        shutil.copyfile('file.txt', '/dev/usb/lp0')
Obviously a real program will be a bit more complex—it'll do different things with different commands, and maybe take arguments that tell it which file to print, and so on. If you want to go that way, the cmd module is a great help.
Remember, in UNIX - everything is a file. Even devices.
So you can just use the basic file methods (or anything else, e.g. shutil.copyfile): http://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files
In your case, the code could look like this (just one way to do it):
# Read file.txt
with open('file.txt', 'r') as content_file:
    content = content_file.read()
with open('/dev/usb/lp0', 'w') as target_device:
    target_device.write(content)
P. S. Please, don't use system() call (or similar) to solve your issue.
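For larger files, or files that are not plain text, a variant that streams the data in binary chunks may be preferable to reading everything into a string first. A minimal sketch using shutil.copyfileobj; `send_file` is a hypothetical helper name and the chunk size is an arbitrary choice.

```python
import shutil


def send_file(source_path, target_path, chunk_size=64 * 1024):
    """Copy source_path to target_path in chunks, without loading it all into memory."""
    with open(source_path, "rb") as source, open(target_path, "wb") as target:
        shutil.copyfileobj(source, target, chunk_size)


# Usage for the question's case (needs write permission on the device):
# send_file("file.txt", "/dev/usb/lp0")
```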
Under Windows there is no cat command; you should use type instead of cat.
(If you want to run the cat command under Windows, see: https://stackoverflow.com/a/71998867/2723298)
import os
os.system('type a.txt > copy.txt')
Or, if your OS is Linux and the cat command didn't work, here are other ways to copy a file.
With grep:
import os
os.system('grep "" a.txt > b.txt')
*The empty quotes "" are important!
Copy a file with sed:
os.system('sed "" a.txt > sed.txt')
Copy a file with awk:
# Single quotes around the awk program keep the shell from expanding $0.
os.system("awk '{print $0}' a.txt > awk.txt")

os.system: saving shell variables with multiple commands in one method

I am having a problem using my command/commands with one instance of os.system.
Unfortunately I have to use os.system as I have no control over this, as I send the string to the os.system method. I know I should really use subprocess module for my case, but that ain't an option.
So here is what I am trying to do.
I have a string like below:
cmd = "export BASE_PATH=`pwd`; export fileList=`python OutputString.py`; ./myscript --files ${fileList}; cp outputfile $BASE_PATH/.;"
This command then gets sent to the os.system module like so
os.system(cmd)
Unfortunately, when I consult my log file I get something that looks like this
os.system(r"""export BASE_PATH=/tmp/bla/bla; export fileList=; ./myscript --files ; cp outputfile /.;""")
As you can see, BASE_PATH seems to be set correctly in the export, but when I use it in cp outputfile $BASE_PATH/. it expands to an empty string.
fileList is also empty, although python OutputString.py should print a file list into this variable.
My thoughts:
Are these bugs due to a new process being started for each command, so that I lose the value of BASE_PATH in the next command?
Also, I am not sure why fileList is empty.
Is there a solution to my above problem using os.system and my command string?
Please Note I have to use os.system module. This is out of my control.
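On the "new process for each command" hypothesis: on POSIX systems, os.system passes the whole string to a single shell, so a variable exported in one ;-separated command is normally still visible in the next. The following minimal sketch illustrates only that point. It assumes a POSIX shell, and the variable name GREETING and the temporary output file are made up for the demonstration.

```python
import os
import shlex
import tempfile

# Write the result to a temporary file, since os.system does not capture output.
out_path = os.path.join(tempfile.mkdtemp(), "result.txt")

# One os.system call hands the whole string to a single shell, so a variable
# exported in the first command is still set when the second command runs.
status = os.system(
    "export GREETING=`echo hello`; echo \"$GREETING\" > " + shlex.quote(out_path)
)

with open(out_path) as result_file:
    print(status, result_file.read().strip())
```

If this behaves as expected on the target machine, separate processes per command would not by themselves explain the empty values. The logged string already shows `pwd` replaced by /tmp/bla/bla, which may mean the command is being expanded somewhere before it reaches os.system, but the sketch cannot confirm that.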
