How do I pass user-input filenames to ImageMagick safely?

I am generating an ImageMagick bash command using Python. Something like
import subprocess
input_file = "hello.png"
output_file = "world.jpg"
subprocess.run(["convert", input_file, output_file])
where there might be more arguments before input_file or output_file. My question is, if either of the filenames is user provided and the user provides a filename that can be parsed as a command line option for ImageMagick, isn't that unsafe?

If the filename starts with a dash, ImageMagick could indeed take it for an option instead of a filename. Most programs (including, AFAIK, the ImageMagick command-line tools) follow the convention that a double dash (--) marks the end of the options. If you run
subprocess.run(["convert", "--", input_file, output_file])
you should be safe in this respect.

From the man page (and a few tests), convert requires an input file and an output file. If you only allow two tokens and if a file name is interpreted as an option then convert is going to miss at least one of the files, so you'll get an ugly message but you should be fine.
Otherwise you can prefix any file name that starts with - with ./ (except - itself, which is stdin or stdout depending on position), so that it becomes an unambiguous file path to the same file.
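The ./ prefix suggested above can be sketched as a small helper. The function name safe_path_arg is hypothetical, and the sketch only builds the argument list rather than running convert. It passes a lone - through unchanged, which may not be what you want for untrusted input.

```python
def safe_path_arg(filename: str) -> str:
    """Return a spelling of filename that cannot be parsed as an option.

    Hypothetical helper: names starting with "-" get a "./" prefix so they
    are read as paths. A lone "-" is returned unchanged because it means
    stdin or stdout; reject it beforehand if users must not reach those.
    """
    if filename != "-" and filename.startswith("-"):
        return "./" + filename
    return filename


input_file = "-resize.png"   # a user-supplied name that looks like an option
output_file = "world.jpg"
command = ["convert", safe_path_arg(input_file), safe_path_arg(output_file)]
print(command)
```

The resulting list can be handed to subprocess.run as in the question. The prefix only addresses names that look like options; it does not validate the path in any other way.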

Related

Handling quotes and spaces in filenames

I want to create a Python (3) script that passes files to a Linux shell program. Straightforward enough to do, but I'm not sure how to pass filenames that could contain single- or double-quotes and spaces to the shell. I would presumably need to delimit filenames in case they contain spaces.
I might consider a command string something like f"wc -c '{filename}'", but that would break down if I encounter a filename containing a single quote. Likewise if I delimit with double-quotes and encounter a file containing those.
As something like Bob's "special" file would be a valid ext4 filename, how do I cope with all the possibilities?

As Tim Roberts mentioned in the comments, you can use the subprocess module to sidestep this problem. Here is a short example (assuming you have a list of filenames) that passes each filename to wc -c:
from subprocess import run
# assuming you have got a list of filenames
filenames = ['test.py', "Bob's special file", 'test space.py']
for filename in filenames:
    run(['wc', '-c', filename])
By the way, if you want to use Python to get all filenames under one specific directory,
you might consider os.listdir.
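A minimal sketch of why the list form avoids the quoting problem: each list element reaches the child process as one argument, untouched. To keep it runnable anywhere, the Python interpreter stands in for wc here (an assumption of this sketch, not part of the answer above), and the helper name echo_argument is hypothetical. shlex.quote from the standard library is shown for the case where a POSIX shell command string really is unavoidable.

```python
import shlex
import subprocess
import sys


def echo_argument(argument: str) -> str:
    """Run a child process that prints its first argument and return that text.

    Hypothetical helper: the Python interpreter stands in for a program
    such as wc, so the sketch does not depend on any external tool.
    """
    child_code = "import sys; sys.stdout.write(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, argument],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


filenames = ["test.py", "Bob's \"special\" file", "test space.py"]
for filename in filenames:
    # No shell is involved, so quotes and spaces arrive unchanged.
    assert echo_argument(filename) == filename

# If a shell command string is required, let shlex.quote do the quoting.
shell_command = "wc -c " + shlex.quote(filenames[1])
print(shell_command)
```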

How do you use Python Ghostscript's high-level interface to convert a .pdf file into multiple .png files?

I am trying to convert a .pdf file into several .png files using Ghostscript in Python. The other answers on here were pretty old hence this new thread.
The following code was given as an example on pypi.org of the 'high level' interface, and I am trying to model my code after the example code below.
import sys
import locale
import ghostscript
args = [
    "ps2pdf", # actual value doesn't matter
    "-dNOPAUSE", "-dBATCH", "-dSAFER",
    "-sDEVICE=pdfwrite",
    "-sOutputFile=" + sys.argv[1],
    "-c", ".setpdfwrite",
    "-f", sys.argv[2]
    ]
# arguments have to be bytes, encode them
encoding = locale.getpreferredencoding()
args = [a.encode(encoding) for a in args]
ghostscript.Ghostscript(*args)
Can someone explain what this code is doing? And can it be used somehow to convert a .pdf into .png files?
I am new to this and am truly confused. Thanks so much!

That's calling Ghostscript, obviously. From the arguments it's not spawning a process, it's linked (either dynamically or statically) to the Ghostscript library.
The args are Ghostscript arguments. These are documented in the Ghostscript documentation, you can find it online here. Because it mimics the command line interface, where the first argument is the calling program, the first argument here is meaningless and can be anything you want (as the comment says).
The next three arguments set NOPAUSE so the entire input is processed without pausing between pages, BATCH so that Ghostscript exits on completion instead of returning to the interactive prompt, and SAFER (which prevents some potentially dangerous operations and is now the default anyway).
Then it selects a device. In Ghostscript (due to the PostScript language) devices are what actually output stuff. In this case the device selected is the pdfwrite device, which outputs PDF.
Then there's the OutputFile, you can probably guess that this is the name (and path) of the file where the output is to be written.
The next three arguments, -c .setpdfwrite -f, are frankly archaic and pointless. They were once recommended when using the pdfwrite device (and only the pdfwrite device) but they have no useful effect these days.
The very last argument is, of course, the input file.
Certainly you can use Ghostscript to render PDF files to PNG. You want to use one of the PNG devices; there are several, depending on what colour depth you want to support. Unless you have some unusual requirement, just use png16m. If your input file contains more than one page, you'll want to set the OutputFile to use %d so that it writes one file per page.
More details on all of this can, of course, be found in the documentation.
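As a hedged sketch of the PNG advice above, the argument list from the question could be adapted roughly as follows. The function names build_png_args and render_pdf_to_png, the 150 dpi resolution, and the page-%03d.png pattern are illustrative assumptions. The final call mirrors the pypi.org example quoted in the question and needs the ghostscript package installed.

```python
import locale


def build_png_args(pdf_path: str, output_pattern: str, dpi: int = 150) -> list:
    """Build Ghostscript arguments that render each PDF page to a PNG file.

    Hypothetical helper. output_pattern should contain %d (or e.g. %03d)
    so that Ghostscript writes one file per page.
    """
    return [
        "pdf2png",                       # placeholder program name, value is ignored
        "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=png16m",               # 24-bit colour PNG device
        "-r{}".format(dpi),              # rendering resolution in dpi
        "-sOutputFile=" + output_pattern,
        "-f", pdf_path,
    ]


def render_pdf_to_png(pdf_path: str, output_pattern: str, dpi: int = 150) -> None:
    """Run Ghostscript through the python-ghostscript high-level interface."""
    import ghostscript  # imported lazily so build_png_args works without it

    # Encode to bytes, as in the example quoted in the question.
    encoding = locale.getpreferredencoding()
    args = [a.encode(encoding) for a in build_png_args(pdf_path, output_pattern, dpi)]
    ghostscript.Ghostscript(*args)


print(build_png_args("input.pdf", "page-%03d.png"))
```

With that pattern, render_pdf_to_png("input.pdf", "page-%03d.png") should write page-001.png, page-002.png and so on, one file per page.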

Open a command prompt at a specific location followed by running a script at that location with some command line arguments using python

I was able to open a command prompt and change the directory to the required location using the subprocess module, but I was unable to pass further arguments to run an application along with some command-line arguments. I am new to the subprocess module; I searched Stack Overflow but couldn't find what I needed.
My code:
import subprocess
path = r"C:/Users/Application_Folder"
p = subprocess.Popen(r"cmd.exe", cwd="C:/Project_Files", shell=True)
Desired output:
Path: C:\Users\Application_folder\Application.exe
Need to open the cmd prompt in windows at the Application_folder location,
run the Application.exe by passing some command line arguments, using python

Just pass the command line you actually want to execute, with the executable path and whatever arguments you want to pass:
command_line = [r'C:\Users\Application_Folder\Application.exe', '/argument1', '/argument2']
p = subprocess.Popen(command_line, cwd=r'C:\Project_Files')
A couple of notes to keep in mind:
You shouldn't use shell=True. It's not necessary here -- in fact it's almost never necessary -- but it does introduce a potential security risk.
The whole point of raw string literals (starting with r' or r") is to change how backslash characters within the string are interpreted. r'C:\Program Files' is exactly the same string as "C:\\Program Files". If your string doesn't have backslashes in it, don't bother using the r prefix.
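A small sketch of the same idea that runs on any platform: the Python interpreter stands in for Application.exe and a temporary directory stands in for the project folder (both are assumptions of this sketch, and run_in_directory is a hypothetical name). It shows that cwd sets the child's working directory without any shell or cd step.

```python
import os
import subprocess
import sys
import tempfile


def run_in_directory(command: list, directory: str) -> str:
    """Run command (a list of arguments) with directory as its working directory.

    Hypothetical helper: returns whatever the child wrote to stdout.
    """
    result = subprocess.run(
        command,
        cwd=directory,        # the child starts here; no "cd" or shell needed
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


with tempfile.TemporaryDirectory() as project_dir:
    # Stand-in for Application.exe: a child that reports its working directory.
    child = [sys.executable, "-c", "import os; print(os.getcwd())"]
    reported = run_in_directory(child, project_dir).strip()
    same_directory = os.path.samefile(reported, project_dir)

print(same_directory)

# The r prefix only changes how backslashes in the literal are read.
print(r"C:\Program Files" == "C:\\Program Files")
```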

Passing piped data to Python program and also an input file

I want to pass data to a Python file using a pipe and also specifying an input file like:
cat file.txt|python script.py -u configuration.txt
I currently have this:
for line in fileinput.input(mode='rU'):
    print(line)
I know there can be something with sys.argv but maybe using fileinput there is a clean way to do it?
Thanks.

From the documentation:
If a filename is '-', it is also replaced by sys.stdin. To specify an alternative list of filenames, pass it as the first argument to input().
So you can create a list containing '-' as well as the contents of sys.argv[1:] (the default), and pass that to input(). Or alternatively just put - in the list of arguments of your Python program:
cat file.txt|python script.py -u - configuration.txt
or
cat file.txt|python script.py -u configuration.txt -
depending on whether you want data provided on standard input to be processed before or after the contents of configuration.txt.
If you want to do anything more complicated than just processing the contents of standard input as if it were an input file, you probably should not be using the fileinput module.
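A minimal sketch of the first suggestion, passing an explicit list that contains '-' to fileinput.input(). The helper name read_lines is hypothetical, and the demo swaps in an io.StringIO for standard input so it can run without a real pipe. Note that an option such as -u would have to be removed from the list first, since fileinput treats every entry as a filename.

```python
import fileinput
import io
import os
import sys
import tempfile


def read_lines(sources: list) -> list:
    """Return all lines from sources in order; "-" stands for standard input.

    Hypothetical helper around fileinput.input(files=...).
    """
    with fileinput.input(files=sources) as stream:
        return [line for line in stream]


# Demo: a StringIO stands in for data piped to standard input.
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "configuration.txt")
    with open(config_path, "w") as handle:
        handle.write("from configuration\n")

    original_stdin = sys.stdin
    sys.stdin = io.StringIO("from pipe\n")
    try:
        # "-" comes first, so the piped data is processed before the file.
        piped_first = read_lines(["-", config_path])
    finally:
        sys.stdin = original_stdin

print(piped_first)
```

In a real script the list would typically be built from sys.argv[1:], with '-' placed before or after the filenames depending on the order you want.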

How to escape a spacebar in a path name with subprocess?

I'm trying to convert a file from .m4a to .mp3 using ffmpeg and I need to access to the music folder.
The path name of this folder is : C:\\Users\A B\Desktop\Music
I can't access it with subprocess.call() because only C:\\Users\A gets recognized. The white space is not processed.
Here's my python script :
import constants
import os
import subprocess
path = 'C:\\Users\A B\Desktop\Music'
def main():
    files = sorted(os.listdir(path), key=lambda x: os.path.getctime(os.path.join(path, x)))
    if "Thumbs.db" in files: files.remove("Thumbs.db")
    for f in files:
        if f.lower()[-3:] == "m4a":
            process(f)
def process(f):
    inFile = f
    outFile = f[:-3] + "mp3"
    subprocess.call('ffmpeg -i {} {} {}'.format('C:\\Users\A B\Desktop\Music', inFile, outFile))
main()
When I run it I get an error that states :
C:\Users\A: No such file or directory
Does anyone know how to pass my full path name (C:\Users\A B\Desktop\Music) to subprocess.call()?

A note before anything else: spaces or not, the command line -i <directory> <infilename> <outfilename> is not correct for ffmpeg, which expects the -i option, then the input file, then the output file, not a directory first. So you have more than one problem here (which explains the "permission denied" message you had, because ffmpeg was trying to open a directory as a file!)
I suppose that you want to:
read all files from directory
convert them all to a file located in the same directory
In that case, you could add quotes around both the input and output absolute paths like this:
subprocess.call(r'ffmpeg -i "{0}\{1}" "{0}\{2}"'.format(r'C:\Users\A B\Desktop\Music', inFile, outFile))
That would work, but it's not the best approach: building the string with format is wasted effort when you already have all the arguments, and there may be other characters to escape that you don't know about. Don't reinvent the wheel.
The best way to do it is to pass the arguments in a list so the subprocess module handles the quoting/escaping when necessary:
path = r'C:\Users\A B\Desktop\Music' # use raw prefix to avoid backslash escaping
subprocess.call(['ffmpeg','-i',os.path.join(path,inFile), os.path.join(path,outFile)])
Aside: if you're the user in question, it's even better to do:
path = os.path.join(os.getenv("USERPROFILE"), 'Desktop', 'Music')
and you could even run the process in the path directory with cwd option:
subprocess.call(['ffmpeg','-i',inFile, outFile],cwd=path)
and if you're not, be sure to run the script with elevated privileges or you won't get access to another user directory (read-protected)
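Putting the list-based approach together, a rough sketch of the whole loop might look like this. The helper name build_ffmpeg_commands is hypothetical; it only builds the argument lists (sorted by name rather than by creation time) so it can run without ffmpeg installed, and the demo uses a temporary folder with a space in its path.

```python
import os
import tempfile


def build_ffmpeg_commands(folder: str) -> list:
    """Return one ffmpeg argument list per .m4a file found in folder.

    Hypothetical helper: it only builds the commands; pass each one to
    subprocess.call or subprocess.run to do the conversion.
    """
    commands = []
    for name in sorted(os.listdir(folder)):
        base, extension = os.path.splitext(name)
        if extension.lower() != ".m4a":
            continue
        source = os.path.join(folder, name)
        target = os.path.join(folder, base + ".mp3")
        # Each path is a single list element, so spaces need no quoting.
        commands.append(["ffmpeg", "-i", source, target])
    return commands


# Demo folder whose path contains a space, like C:\Users\A B\Desktop\Music.
with tempfile.TemporaryDirectory() as tmp:
    music = os.path.join(tmp, "A B", "Music")
    os.makedirs(music)
    for name in ("first song.m4a", "notes.txt"):
        open(os.path.join(music, name), "w").close()
    commands = build_ffmpeg_commands(music)

for command in commands:
    print(command)
```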
