Trying to avoid shell=True in a Python subprocess - python

I need to concatenate multiple files that begin with the same name inside a Python program. My idea, in a bash shell, would be to do something like
cat myfiles* > my_final_file
but there are two shell operators to use: * and >. This could be easily solved using
subprocess.Popen("cat myfiles* > my_final_file", shell=True)
but everybody says that using shell=True is something you have to avoid for security and portability reasons. How can I execute that piece of code, then?

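You have to expand the pattern in Python (sorted, because the shell sorts glob matches while glob.glob does not guarantee an order):
import glob
import subprocess
with open("my_final_file", "wb") as output:
    subprocess.check_call(["cat"] + sorted(glob.glob("myfiles*")), stdout=output)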
or better, do everything in Python:
with open("my_final_file", "wb") as output:
    for filename in sorted(glob.glob("myfiles*")):
        with open(filename, "rb") as inp:
            output.write(inp.read())
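For large files, reading each input fully into memory may be undesirable. The following is a minimal sketch of the same pure-Python approach that streams the data with shutil.copyfileobj instead; the function name concat_files and its parameters are illustrative, not part of the original answer.

```python
import glob
import shutil


def concat_files(pattern, dest):
    """Concatenate every file matching `pattern` into `dest`, in sorted order."""
    # Sort the matches to mimic the order a shell glob would produce
    sources = sorted(glob.glob(pattern))
    with open(dest, "wb") as output:
        for filename in sources:
            with open(filename, "rb") as inp:
                # copyfileobj copies in fixed-size chunks instead of reading the whole file
                shutil.copyfileobj(inp, output)
    return sources
```

Calling concat_files("myfiles*", "my_final_file") would then play the role of cat myfiles* > my_final_file, assuming the destination name does not itself match the pattern.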

Related

python read data from variable and write to a file (no print) which i wanted to use later as backup data

I am new to Python and programming, and I am starting to try a few things for my project.
My problem is as below:
p = subprocess.Popen([Some command which gives me output], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
p.wait()
content = p.stdout.readlines()
for line in content:
    filedata = line.lstrip().rstrip()
    -----> I want to save this filedata output to a file.
If I use print filedata it works and gives me exactly what I wanted, but I do not want to print; I want to use this data later.
Thanks in advance.
You can do that in the following two ways.
Option one uses the more traditional way of file handling. It uses the with statement, so you don't have to worry about closing the file.
Option two uses the pathlib module, which is new in Python 3.4 (I recommend this one).
somefile.txt stands for the full file path in the file system. I've also included documentation links, and I highly recommend going through them.
OPTION ONE
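import subprocess
cmd = ["some", "command"]  # placeholder for the command which gives the output
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True)
content, _ = p.communicate()  # reads all output, then waits; wait() before reading can block on a full pipe
with open('somefile.txt', 'a') as file:
    for line in content.splitlines():
        file.write(line.strip() + '\n')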
Documentation for The with Statement
OPTION TWO - For Python 3.4 or above
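import pathlib
import subprocess
cmd = ["some", "command"]  # placeholder for the command which gives the output
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True)
content, _ = p.communicate()
lines = [line.strip() for line in content.splitlines()]
# write_text replaces the file's contents, so write all lines in a single call
pathlib.Path('somefile.txt').write_text('\n'.join(lines) + '\n')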
Documentation on Pathlib module
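Since the question asks to keep the data for later use rather than print it, it may help to separate capturing from saving. The sketch below assumes Python 3.5 or later (for subprocess.run); the helper name capture_lines, the child command, and the backup path are hypothetical stand-ins.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def capture_lines(cmd):
    """Run `cmd` and return its combined stdout/stderr as a list of stripped lines."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, as in the question
        universal_newlines=True,   # decode the output to str
    )
    return [line.strip() for line in result.stdout.splitlines()]


# Hypothetical command: a child Python process that prints two padded lines
cmd = [sys.executable, "-c", "print('  first  '); print('  second  ')"]
lines = capture_lines(cmd)

# The data stays available in `lines`; writing a backup copy is a separate step
backup_path = Path(tempfile.mkdtemp()) / "backup.txt"
backup_path.write_text("\n".join(lines) + "\n")
```

Keeping the stripped lines in a list means the same data can be reused later in the program and written to disk only when a backup is actually wanted.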

Python writes 0 to the file

import os
fileHandle = open('booksNames.txt', 'r+')
def getData():
    data = os.system('dir /b /a /s *.pdf *.epub *.mobi')
    fileHandle.writelines(str(data))
    fileHandle.close()
I'm trying to write the data returned by the os.system function to a file. But the only thing that gets written in file is 0. Here are some other variations that I tried as well.
import os
fileHandle = open('booksNames.txt', 'r+')
getData = lambda:os.system('dir /b /a /s *.pdf *.epub *.mobi')
data = getData()
fileHandle.writelines(str(data))
fileHandle.close()
In the output window it gives perfect output, but when writing to a text file it writes zero. I've also tried using return, but to no avail. Please help.
Use the subprocess module. There are a number of methods, but the simplest is:
>>> import subprocess
>>> with open('out.txt','w') as f:
...     subprocess.call(['dir','/b','/a','/s','*.pdf','*.epub','*.mobi'],stdout=f,stderr=f,shell=True)
...
0
Zero is the exit code, but the content will be in out.txt.
For Windows (I assume you are using Windows since you are using the 'dir' command, not the Unix/Linux 'ls'), simply let the command do the work:
os.system('dir /b /a /s *.pdf *.epub *.mobi >> booksNames.txt')
Using '>>' appends to any existing file; use '>' to write a new file.
I liked the other solution using subprocess, but since this is OS-specific anyway, I think this is simpler.
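Both answers above rely on the Windows dir command. os.system returns the command's exit status rather than its output, which is why 0 ended up in the file. As a portable alternative, the listing can be built in Python itself. This is only a sketch using os.walk; the function names list_books and write_listing are made up for illustration.

```python
import os


def list_books(root, extensions=(".pdf", ".epub", ".mobi")):
    """Return full paths of files under `root` whose names end with one of `extensions`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare case-insensitively so "BOOK.PDF" is matched too
            if name.lower().endswith(extensions):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def write_listing(root, out_path):
    """Write one matching path per line to `out_path` and return the number written."""
    books = list_books(root)
    with open(out_path, "w") as handle:
        handle.writelines(path + "\n" for path in books)
    return len(books)
```

For example, write_listing(".", "booksNames.txt") would recurse from the current directory, roughly like dir /b /s with the three patterns.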

Converting a file from .sam to .bam using python subprocess

I would like to start out by saying any help is greatly appreciated. I'm new to Python and scripting in general. I am trying to use a program called samtools view to convert a file from .sam to .bam. I need to be able to do in Python what this BASH command is doing:
samtools view -bS aln.sam > aln.bam
I understand that BASH commands like | > < are done using the subprocess stdin, stdout and stderr in Python. I have tried a few different methods and still can't get my BASH script converted correctly. I have tried:
cmd = subprocess.call(["samtools view","-bS"], stdin=open(aln.sam,'r'), stdout=open(aln.bam,'w'), shell=True)
and
from subprocess import Popen
with open(SAMPLE+ "."+ TARGET+ ".sam",'wb',0) as input_file:
    with open(SAMPLE+ "."+ TARGET+ ".bam",'wb',0) as output_file:
        cmd = Popen([Dir+ "samtools-1.1/samtools view",'-bS'],
                    stdin=(input_file), stdout=(output_file), shell=True)
in Python and am still not getting samtools to convert a .sam to a .bam file. What am I doing wrong?
Abukamel is right, but in case you (or others) are wondering about your specific examples....
You're not too far off with your first attempt, just a few minor items:
Filenames should be in quotes
samtools reads from a named input file, not from stdin
You don't need "shell=True" since you're not using shell tricks like redirection
So you can do:
import subprocess
with open('aln.bam', 'wb') as output_file:
    subprocess.call(["samtools", "view", "-bS", "aln.sam"], stdout=output_file)
Your second example has more or less the same issues, so would need to be changed to something like:
from subprocess import Popen
with open('aln.bam', 'wb', 0) as output_file:
    cmd = Popen(["samtools", "view", '-bS', 'aln.sam'],
                stdout=output_file)
    cmd.wait()  # Popen returns immediately; wait for samtools to finish
You can pass execution to the shell with the keyword argument shell=True:
subprocess.call('samtools view -bS aln.sam > aln.bam', shell=True)
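The shell-free pattern in the answers above generalizes to any "command > file" redirect. Here is a small sketch of it as a reusable helper; run_to_file is a hypothetical name, and check_call is used so that a failing command raises an exception instead of silently leaving a partial output file.

```python
import subprocess


def run_to_file(cmd, out_path):
    """Run `cmd` (a list of arguments, no shell) with stdout redirected to `out_path`."""
    # Binary mode: the child process writes raw bytes straight to the file
    with open(out_path, "wb") as output_file:
        # check_call raises CalledProcessError if the command exits with a non-zero status
        subprocess.check_call(cmd, stdout=output_file)


# With samtools installed, the conversion from the question would be:
# run_to_file(["samtools", "view", "-bS", "aln.sam"], "aln.bam")
```

Because the command is passed as a list, file names containing spaces or shell metacharacters need no quoting.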

Python: Using Popen() versus File Objects to write to a file in linux

I noticed I have two alternatives to writing to a file in Linux within a python script. I can either create a Popen object and write to a file using shell redirection (e.g. ">" or ">>") - or I can use File Objects (e.g. open(), write(), close()).
I've played around with both for a short while and noticed that using Popen involves less code if I need to use other shell tools. For instance, below I try to get a checksum of a file and write it to a temporary file named with the PID as a unique identifier. (I know $$ will change if I call Popen again but pretend I don't need to):
Popen("md5sum " + filename + " >> /dir/test/$$.tempfile", shell=True, stdout=PIPE).communicate()[0]
Below is a (hastily written) rough equivalent using file objects. I use os.getpid instead of $$ but I still use md5sum and have to call Popen still.
PID = str(os.getpid())
manifest = open('/dir/test/' + PID + '.tempfile','w')
hash = Popen("md5sum " + filename, shell=True, stdout=PIPE).communicate()[0]
manifest.write(hash)
manifest.close()
Are there any pros/cons to either approach? I'm actually trying to port bash code over to Python and would like to use more Python, but I'm not sure which way I should go here.
Generally speaking, I would write something like:
PID = str(os.getpid())
with open('/dir/test/' + PID + '.tempfile', 'w') as manifest:
    p = Popen(['md5sum', filename], stdout=manifest)
    p.wait()
This avoids any shell injection vulnerabilities. You also know the PID as you're not picking up the PID of the spawned subshell.
Edit: the md5 module has been deprecated since Python 2.5 and was removed in Python 3; you should use the hashlib module instead.
hashlib version
to file:
import hashlib
with open('py_md5', mode='w') as out:
    with open('test.txt', mode='rb') as input:
        out.write(hashlib.md5(input.read()).hexdigest())
to console:
import hashlib
with open('test.txt', mode='rb') as input:
    print(hashlib.md5(input.read()).hexdigest())
md5 version (Python 2 only)
Python 2's md5 module provides an identical tool:
import md5
# open file to write
with open('py_md5', mode='w') as out:
    with open('test.txt', mode='rb') as input:
        out.write(md5.new(input.read()).hexdigest())
If you just want the md5 hexadecimal digest string, you can print it instead of writing it out to a file:
import md5
# open file to read
with open('test.txt', mode='rb') as input:
    print(md5.new(input.read()).hexdigest())
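The snippets above read the whole file into memory before hashing. For large files, a chunked variant of the hashlib approach may be preferable. This is a sketch only; md5_hexdigest and the 64 KiB chunk size are arbitrary choices for illustration.

```python
import hashlib


def md5_hexdigest(path, chunk_size=65536):
    """Return the MD5 hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as infile:
        # iter() with a sentinel keeps calling read() until it returns b"" at end of file
        for chunk in iter(lambda: infile.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string is the same hexadecimal digest that md5sum prints at the start of its output line, so no subprocess is needed at all.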

Using cat command in Python for printing

On Linux, I can send a file to the printer using the following command:
cat file.txt > /dev/usb/lp0
From what I understand, this redirects the contents in file.txt into the printing location. I tried using the following command
>>> os.system('cat file.txt > /dev/usb/lp0')
I thought this command would achieve the same thing, but it gave me a "Permission Denied" error. In the command line, I would run the following command prior to concatenating.
sudo chown root:lpadmin /dev/usb/lp0
Is there a better way to do this?
While there's no reason your code shouldn't work, this probably isn't the way you want to do this. If you just want to run shell commands, bash is much better than python. On the other hand, if you want to use Python, there are better ways to copy files than shell redirection.
The simplest way to copy one file to another is to use shutil:
shutil.copyfile('file.txt', '/dev/usb/lp0')
(Of course if you have permissions problems that prevent redirect from working, you'll have the same permissions problems with copying.)
You want a program that reads input from the keyboard, and when it gets a certain input, it prints a certain file. That's easy:
import shutil
while True:
    line = raw_input() # or just input() if you're on Python 3.x
    if line == 'certain input':
        shutil.copyfile('file.txt', '/dev/usb/lp0')
Obviously a real program will be a bit more complex—it'll do different things with different commands, and maybe take arguments that tell it which file to print, and so on. If you want to go that way, the cmd module is a great help.
Remember, in UNIX everything is a file, even devices.
So you can just use the basic file methods, or anything else such as shutil.copyfile (http://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files).
In your case the code may look like this (just one way to do it):
# Read file.txt
with open('file.txt', 'r') as content_file:
    content = content_file.read()
# Write its contents to the printer device
with open('/dev/usb/lp0', 'w') as target_device:
    target_device.write(content)
P. S. Please, don't use system() call (or similar) to solve your issue.
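As noted earlier in this thread, copying in Python hits the same permissions problem as the shell redirect. A sketch of how that failure could be surfaced explicitly follows; it assumes Python 3.3 or later (for PermissionError), and send_to_device is a hypothetical helper name.

```python
import shutil


def send_to_device(source, device="/dev/usb/lp0"):
    """Copy `source` to `device`; return True on success, False if permission is denied."""
    try:
        with open(source, "rb") as src, open(device, "wb") as dst:
            shutil.copyfileobj(src, dst)
    except PermissionError:
        # Same failure the shell redirect reports; the device's ownership or mode must be fixed first
        return False
    return True
```

A False result corresponds to the "Permission Denied" error from the question, so the caller can print a hint about the device's group or ownership instead of a traceback.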
Windows has no cat command; use type instead of cat there
(if you want to run a cat command under Windows, see https://stackoverflow.com/a/71998867/2723298):
import os
os.system('type a.txt > copy.txt')
Or, if your OS is Linux and the cat command didn't work, here are other ways to copy a file.
With grep:
import os
os.system('grep "" a.txt > b.txt')
The empty pattern "" is important!
Copy a file with sed:
os.system('sed "" a.txt > sed.txt')
Copy a file with awk (single quotes around the awk program keep the shell from expanding $0):
os.system("awk '{print $0}' a.txt > awk.txt")
