How can I localize argparse generated messages in a portable way? - python

Context:
I am developing a small module to automatically rename photographs in a directory according to their exif timestamp (goal: easily mixing pictures from different cameras or smartphones). It works smoothly either as a Python package or directly from the command line through a tiny wrapper using argparse.
And I have just had the (rather stupid) idea to localize it in a non-English language. OK, gettext is my friend for all my own code, but when I came to the argparse-generated messages, I found myself on slippery ground...
Current research:
I have already found some resources on SO:
How to make python's argparse generate Non-English text?
Translate argparse's internal strings
Both end in adding the relevant strings from argparse into a po/mo file and let the argparse module automatically use the translated strings because internally it uses the _(...) wrapper. So far, so good.
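A minimal sketch of what those answers describe, assuming argparse looks its messages up through the module-level gettext functions (an implementation detail, not documented behaviour). The `locale` directory and the `argparse` domain name are assumptions here: they have to match wherever the compiled argparse.mo file is actually installed.

```python
import gettext

# Assumed layout: <LOCALE_DIR>/<lang>/LC_MESSAGES/argparse.mo
LOCALE_DIR = "locale"   # hypothetical directory holding the compiled catalogs
DOMAIN = "argparse"     # hypothetical domain name (base name of the .mo file)

# Tell gettext where the catalog for this domain lives...
gettext.bindtextdomain(DOMAIN, LOCALE_DIR)
# ...and make it the global default domain used by gettext.gettext().
gettext.textdomain(DOMAIN)

# When no matching .mo file is found, gettext returns the original string,
# so a missing translation degrades to English instead of failing.
message = gettext.gettext("usage: ")
```

Note that textdomain() changes the default domain for the whole process, so the program's own strings would probably be better served by a separate object obtained from gettext.translation().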
My problem:
I feel this more as a workaround than a clean and neat solution because:
I could not find a word advising it in official Python documentation
It looks like a work in progress: implemented but not documented, so some strings could change in a future Python release (or did I miss something?)
Current code:
parser = argparse.ArgumentParser(
    prog=prog,
    description="Rename pictures according to their exif timestamp")
parser.add_argument("-v", "--version", action="version",
                    version="%(prog)s " + __version__)
parser.add_argument("--folder", "-f", default=".",
                    help="folder containing files to rename")
parser.add_argument("-s", "--src_mask", default="DSCF*.jpg",
                    help="pattern to select the files to rename")
parser.add_argument("-d", "--dst_mask", default="%Y%m%d_%H%M%S",
                    help="format for the new file name")
parser.add_argument("-e", "--ext", default=".jpg", dest="ext_mask",
                    help="extension for the new file name")
parser.add_argument("-r", "--ref_file", default="names.log",
                    help="a file to remember the old names")
parser.add_argument("-D", "--debug", action="store_true",
                    help="print a line per rename")
parser.add_argument("-X", "--dry_run", action="store_true", dest="dummy",
                    help="process normally except no rename occurs")
# subcommands configuration (rename, back, merge)
subparser = parser.add_subparsers(dest='subcommand', help="sub-commands")
ren = subparser.add_parser("rename",
                           help="rename files by using their exif timestamp")
ren.add_argument("files", nargs="*",
                 help="files to process (default: src_mask)")
back = subparser.add_parser("back",
                            help="rename files back to their original name")
back.add_argument("files", nargs="*",
                  help="files to process (default: content of ref_file)")
merge = subparser.add_parser("merge",
                             help="merge files from a different folder")
merge.add_argument("src_folder", metavar="folder",
                   help="folder from where merge picture files")
merge.add_argument("files", nargs="*",
                   help="files to process (default: src_mask)")
I know how to wrap my own strings with _(), and I could probably have acceptable translations for the usage and help messages, but there are plenty of error messages when the user gives a wrong syntax, and I would like to prevent English error messages in the middle of a French-speaking program...
Question:
Is there any guarantee that the implementation of strings in the argparse module will not change, or is there a more robust/portable way to provide translations for its messages?

After some more research and @hpaulj's great comments, I can confirm:
the localizable messages from argparse are upward compatible from 3.3 to current version (old messages were never changed but new messages were added for new features)
the above is not true before 3.3
there are slight differences in 2.7
That means that only 2 paths are possible here:
accept the risk and provide a translation for the current version that will work with any Python version >= 3.3. The risk is that a future version breaks the translation or adds new (untranslated) messages. Nothing more to say, because this explicitly relies on implementation details of the module
do not use the argparse module at all and build a custom parser based on getopt. It is probably an acceptable option for simple use cases that do not require the full power of argparse
Neither of them is really good, but I cannot imagine a better one...
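To give an idea of the second path, here is a rough sketch of a getopt-based parser whose error messages are entirely under the program's control. It is reduced to two of the options above; parse_cli and its message text are hypothetical, and the `_` parameter stands for whatever gettext function the program already uses.

```python
import getopt


def parse_cli(argv, _=lambda text: text):
    """Parse a reduced option set: -f/--folder VALUE and -D/--debug.

    `_` is the translation function (identity by default), so every
    message shown to the user can go through the program's own catalog.
    """
    options = {"folder": ".", "debug": False}
    try:
        opts, rest = getopt.getopt(argv, "f:D", ["folder=", "debug"])
    except getopt.GetoptError as exc:
        # exc.opt holds the offending option name; the wording is ours.
        raise ValueError(_("unknown or incomplete option: %s") % exc.opt) from exc
    for name, value in opts:
        if name in ("-f", "--folder"):
            options["folder"] = value
        elif name in ("-D", "--debug"):
            options["debug"] = True
    return options, rest
```

Calling parse_cli(sys.argv[1:], _) would then return the option dictionary plus the remaining positional arguments, at the price of re-implementing help output and sub-commands by hand.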
I will try to set up a project on GitHub or GitLab providing the pot file and the French translation for the current argparse, and make it available on PyPI. If it ever exists, I will add references for it here and shall be glad to include other languages there.
A beta version of a project giving French translations for the argparse module is currently available on GitHub
I call it beta because at this time, it has not been extensively tested, but it can be used either directly or as an example of what could be done. The binary wheel contains a little-endian mo file, but the source distribution allows the mo file to be generated automatically on the target system with no additional dependency by including a copy of the msgfmt.py file from the Tools/i18n directory of CPython.

Related

How do I tell a .py script to look for folders and file in the directory it is in?

I have a .py script that pulls in data from Google Sheets and outputs it in a yaml format. This is for my Hugo powered website which is served via Netlify. As I understand it, Netlify is capable of running Python too so I thought I could upload the web content and the python file in the same directory. This is required for updating the content dynamically, and I expect the python file to run every time I trigger a build for the website. However, the python file requires certain credentials to work.
My code currently looks like this:
# Set location to write new files to.
outputpath = Path("D:/content/submission/")
#Read JSON:
json = Path("D:/credentials.json")
These are hardcoded local paths. When I bundle the script in the website directory, what paths should I write in so that when the script runs, these files are read in and outputted correctly?
I would want to output in my content/submission folder and read in from my creds/credentials.json. Should I just put these paths in? Will it know that it has to look within the directory for these folders, or is there something I need to add to the script that tells it to work within the directory it is sitting in?
🧨 First, credentials and secrets are best kept out of files (and especially out of source control).
For general file locations however, you can use something like:
pa_same_but_json = Path(__file__).with_suffix(".json")
pa_same_directory = Path(__file__).parent / "nosecrets.json"
To answer your comments:
(Mind you, I am not 100% sure about Windows drives in the following.)
parent is an attribute on Path objects allowing you to "climb up" the hierarchy.
Path(r"c:\temp\foo.json").parent returns the same as Path(r"c:\temp")
Yes, you can do mypath.parent.parent
/ is a path concatenation operator when applied to Path objects
So
myfile = os.path.join("c:", "temp", "foo.json")
and
myfile_as_a_Path = Path("c:") / "temp" / "foo.json"
are the same, except for one being a string, the other a Path instance. Once the first Path has been built (on C:), the rest of the expression is operating on Path instances, and pathlib re-purposes the division operator (through the __truediv__ magic method, which is normally intended for arithmetic) to support path concatenation. This works because most operations on Path instances return another Path, which allows this type of chaining.
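To make the operator trick less magical, here is a toy sketch of the mechanism. Segment is a hypothetical class written only for illustration (it is not how pathlib is implemented); the last line does the same thing with a real pure path object.

```python
from pathlib import PurePosixPath


class Segment:
    """Toy class (hypothetical) that joins text with '/' on division."""

    def __init__(self, text):
        self.text = text

    def __truediv__(self, other):
        # Called for `segment / other`; returning a new Segment is what
        # makes chaining like a / b / c possible.
        return Segment(f"{self.text}/{other}")

    def __str__(self):
        return self.text


joined = Segment("c:") / "temp" / "foo.json"
same_idea = PurePosixPath("c:") / "temp" / "foo.json"
```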
It's best not to write way up the hierarchy in a hosted/VM context (you never know directory structure above or if you have permissions), but something based on your script location might be
pa_current = Path(__file__).parent
# could be `content/submission` but that's assuming you're always on Posix
# systems. Letting Pathlib do the work is safer, even if Windows probably
# puts up with `/`
pa_write = pa_current / "content" / "submission"
pa_read = pa_current / "credentials.json"
At this point these are Path instances, but they are really not much different from strings, except that they have smarter methods to manipulate them. They don't know or care whether the files exist.
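A small sketch of the same manipulations on a made-up location, to show that nothing needs to exist on disk. The /site/tools/fetch_sheet.py path and the creds folder are assumptions for illustration; in the real script the starting point would be Path(__file__).

```python
from pathlib import PurePosixPath

# Hypothetical script location; a real script would use Path(__file__).
script = PurePosixPath("/site/tools/fetch_sheet.py")

pa_current = script.parent                           # /site/tools
pa_write = pa_current / "content" / "submission"     # /site/tools/content/submission
pa_read = pa_current / "creds" / "credentials.json"  # /site/tools/creds/credentials.json
pa_same_but_json = script.with_suffix(".json")       # /site/tools/fetch_sheet.json
```

With concrete Path objects, pa_write.mkdir(parents=True, exist_ok=True) would create the output folder before writing to it.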
P.S.
🧨 A consideration is that, in many web contexts, writing to code directories (like what happens in a content/submission under the python scripts) is a security goof as well.
Maybe pa_write = pa_current.parent.parent / "uploads" / "content" / "submission" would be better.
Specifically when it comes to user uploads and secrets, please refer to best practices for your platform, not just what Python can do. This answer was about pathlib.Path, not Hugo uploads.

parse_args all .png files from a parser argument

I would like args.pics to return something like ['pic1.png', 'pic2.png', 'pic3.png'] (to arbitrarily parse all files of .png format) after running the following (test.py):
import argparse
import os

def parser_arg():
    par = argparse.ArgumentParser()
    parser = par.add_argument_group('pictures')
    parser.add_argument("-p", "--pics", nargs="+", help="picture files", required=True)
    arguments = par.parse_args()
    return arguments

args = parser_arg()
And upon running the script via command line, and inputting
python test.py -p ../User/Desktop/Data/*.png
then args.pics returns ['../User/Desktop/Data/*.png'] instead..
Am I using the right approach? I heard using *.png will be expanded into the .png files once inputted but it doesn't seem to be the case on my end.
Edits: I'm using Anaconda Prompt on Windows 10 if it helps.
There are a couple of things that could be going on. One possibility is that ../User/Desktop/Data/*.png does not match any files, so does not get expanded. This would happen on a UNIX-like shell only (or PowerShell I suppose). The other possibility is that you are using cmd.exe on Windows, which simply does not do wildcard expansion at all. Given that you are using Anaconda prompt on Windows, I would lean towards the latter possibility as the explanation.
Since you are looking for a list of all the PNGs in a folder, you don't need to rely on the shell at all. There are lots of ways of doing the same thing in Python, with and without integrating in argparse.
Let's start by implementing the listing functionality. Given a directory, here are some ways to get a list of all the PNGs in it:
Use glob.glob (recommended option). This can either recurse into subdirectories or not, depending on what you want:
mydir = '../User/Desktop/Data/'
pngs = glob.glob(os.path.join(mydir, '*.png'))
To recurse into subfolders, use a ** component in the pattern, for example os.path.join(mydir, '**', '*.png'), together with the recursive=True keyword-only option.
Use os.walk. This is much more flexible (and therefore requires more work), but also lets you have recursive or non-recursive solutions:
mydir = '../User/Desktop/Data/'
pngs = []
for path, dirs, files in os.walk(mydir):
    pngs.extend(f for f in files if f.lower().endswith('.png'))
    del dirs[:]
To enable recursion, just delete the line del dirs[:], which suppresses subdirectory search.
A related method that is always non-recursive is to use os.listdir, which is Python's rough equivalent to the ls or dir commands:
mydir = '../User/Desktop/Data/'
pngs = [f for f in os.listdir(mydir) if f.lower().endswith('.png')]
This version does not check if something is actually a file. It assumes you don't have folder names ending in .png.
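If folder names ending in .png are a concern, a non-recursive variant that does check for regular files could look like the sketch below. It swaps in os.scandir, which is not one of the methods listed above, and the name list_png_files is hypothetical.

```python
import os


def list_png_files(directory):
    """Return the sorted names of regular files in `directory` ending in .png."""
    with os.scandir(directory) as entries:
        return sorted(
            entry.name
            for entry in entries
            # is_file() filters out sub-folders, even ones named like "x.png"
            if entry.is_file() and entry.name.lower().endswith(".png")
        )
```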
Let's say you've picked one of these methods and created a function that accepts a folder and returns a file list:
def list_pngs(directory):
    return glob.glob(os.path.join(directory, '*.png'))
Now that you know how to list files in a folder, you can easily plug this into argparse at any level. Here are a couple of examples:
Just get all your directories from the argument and list them out afterwards:
parser.add_argument("-p", "--pics", action='store', help="picture files", required=True)
Once you've processed the arguments:
print(list_pngs(args.pics))
Integrate directly into argparse with the type argument:
parser.add_argument("-p", "--pics", action='store', type=list_pngs, help="picture files", required=True)
Now you can use the argument directly, since it will be converted into a list directly:
print(args.pics)
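Put together, the type-based variant might look like the following sketch. The directory check and the build_parser helper are additions for illustration, not part of the snippets above; raising argparse.ArgumentTypeError lets argparse report a bad folder as a normal usage error.

```python
import argparse
import glob
import os


def list_pngs(directory):
    """Convert a folder argument into a sorted list of the PNG paths in it."""
    if not os.path.isdir(directory):
        # argparse turns this exception into a regular command-line error.
        raise argparse.ArgumentTypeError(f"not a directory: {directory}")
    return sorted(glob.glob(os.path.join(directory, "*.png")))


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--pics", type=list_pngs, required=True,
                        help="folder containing the picture files")
    return parser
```

With this, python test.py -p ../User/Desktop/Data would leave the list of PNG paths in args.pics without depending on the shell.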
Your approach is correct. However, your script will only receive the expanded list of files as parameters if your shell supports globbing and the pattern actually matches any files. Otherwise, it will be the pattern itself in most cases.
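The Anaconda Command Prompt uses cmd.exe by default, which doesn't support wildcard expansion. You could switch to a shell that expands wildcards itself, such as a Unix-like shell; be aware that PowerShell on Windows generally passes wildcards unexpanded to external programs such as python, so it may not help here. Alternatively, you can do the expansion in your application as described in Mad Physicist's answer.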
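Doing the expansion in the application while keeping the original nargs="+" interface could be sketched like this. The helper names expand_patterns and parse_pics are hypothetical; a pattern that matches nothing is kept as typed so that a later "file not found" error still names it.

```python
import argparse
import glob


def expand_patterns(patterns):
    """Expand each wildcard pattern; keep it unchanged if nothing matches."""
    expanded = []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        expanded.extend(matches if matches else [pattern])
    return expanded


def parse_pics(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--pics", nargs="+", required=True,
                        help="picture files (wildcards allowed)")
    args = parser.parse_args(argv)
    # Works whether or not the shell already expanded the wildcards.
    args.pics = expand_patterns(args.pics)
    return args
```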

Fastest way to implement listdir function in the python dropbox-api

I've a question regarding python and dropbox-api.
I would need to download the whole content of a specific dropbox folder; standard python sftp library allows you to do this through sftp.listdir(), dropbox-api doesn't seem to support this feature. You could use the DropboxClient.get_file(from_path, rev=None, start=None, length=None), but this implies you know the from_path value (that must be a file, not a folder).
I'm wondering whether the solution below would be the correct way to achieve the sftp.listdir() feature.
Please note the below is pseudo-code, I didn't post client initialisation for the sake of brevity.
dir_content = []
folder_metadata = dropbox_client.metadata(my_folder)  # this gives you folder metadata information
folder_content = folder_metadata["contents"]
for element in folder_content:
    path = element["path"]
    # rough check for a file rather than a folder: the path contains a dot
    if len(path.split(".")) > 1:
        dir_content.append(path)
Any suggestion here?
Alessio

Is it possible to obtain the path of the script using argparse?

I would like to know how to get the path where the script is stored with argparse, if possible, because if I run the script from another path (I have the path of the script in the %PATH% variable) it uses by default the relative path.
I know that I can obtain it using:
import sys
sys.argv[0]
but I would like to know if it is possible to access it directly from the argparse module.
Thanks
Edit: I have my reply and I am satisfied.
To explain better the question: I have a script called mdv.py that I use to transform markdown files into html. I would like to call it from any location in my computer.
The script is in:
c:\Python27\markdown
in this path there are other files and a folder templates that I use to generate my HTML (a default stylesheet and files for header, body and footer).
These files are in:
C:\Python\27\markdown\markdown\templates
When I call the script from a non standard path, for example c:\dropbox\public it looks in c:\dropbox\public\templates for these files and not in c:\python27\markdown\templates where they are saved.
I hope this explains it better. Sorry, I'm not a native English speaker.
I think you are looking for the prog parameter; you can interpolate sys.argv[0] into your help strings with %(prog)s.
The value for prog can be set when creating the ArgumentParser() instance; it is the first parameter:
parser = argparse.ArgumentParser('some_other_name')
and can be retrieved with the .prog attribute:
print(parser.prog) # prints "some_other_name"
However, argparse calls os.path.basename() on this name, and does not store the directory of the program anywhere.
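Since argparse does not keep the directory, it has to be computed separately. A minimal sketch, assuming the templates folder sits next to the script; the helper name script_dir is hypothetical:

```python
import os
import sys


def script_dir(script_path=None):
    """Return the absolute directory of the script (sys.argv[0] by default)."""
    path = sys.argv[0] if script_path is None else script_path
    return os.path.dirname(os.path.abspath(path))


# Resolve resources relative to the script, not to the current directory.
templates_dir = os.path.join(script_dir(), "templates")
```

Inside the script itself, passing __file__ to script_dir() is usually more dependable than sys.argv[0].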

Print PDF document with python's win32print module?

I'm trying to print a PDF document with the win32print module. Apparently this module can only accept PCL or raw text. Is that correct?
If so, is there a module available to convert a PDF document into PCL?
I contemplated using ShellExecute; however, this is not an option since it only allows printing to the default printer. I need to print to a variety of printers on servers across various networks.
Thanks for your help,
Pete
I ended up using Ghostscript to accomplish this task. There is a command line tool that relies on Ghostscript called gsprint.
You don't even need Acrobat installed to print PDFs in this fashion which is quite nice.
Here is an example:
on the command line:
gsprint -printer \\server\printer "test.pdf"
from python:
win32api.ShellExecute(0, 'open', 'gsprint.exe', '-printer "\\\\' + self.server + '\\' + self.printer_name + '" ' + file, '.', 0)
Note that I've added the gsprint directory to my PATH variable in these examples, so I don't have to include the entire path when calling the executable.
There is one downside, however. The code is licensed under the GPL, so it's not very useful in commercial software.
Hope this helps someone,
Pete
I was already using the win32api.ShellExecute approach and needed to print to a non-default printer. The best way I could work out was to temporarily change the default printer. So right before I do the print I store what the current default printer is, change it, and then set it back after printing. Something like:
tempprinter = "\\\\server01\\printer01"
currentprinter = win32print.GetDefaultPrinter()
win32print.SetDefaultPrinter(tempprinter)
win32api.ShellExecute(0, "print", filename, None, ".", 0)
win32print.SetDefaultPrinter(currentprinter)
I'm not going to claim it's pretty, but it worked and it allowed me to leave my other code untouched.
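One weakness of the swap above is that the original default printer is not restored if the print call raises. A possible hardening is sketched below; temporary_default is a hypothetical helper, written against plain callables so that it carries no pywin32 dependency of its own.

```python
from contextlib import contextmanager


@contextmanager
def temporary_default(get_default, set_default, temporary_value):
    """Switch a global default for the duration of a with-block.

    get_default and set_default are callables, for example
    win32print.GetDefaultPrinter and win32print.SetDefaultPrinter.
    """
    previous = get_default()
    set_default(temporary_value)
    try:
        yield previous
    finally:
        # Runs even when the body raises, so the old default always returns.
        set_default(previous)
```

The ShellExecute call from the snippet above would then go inside a block such as: with temporary_default(win32print.GetDefaultPrinter, win32print.SetDefaultPrinter, tempprinter):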
I am not sure how to specifically get win32print to work, but there might be a couple of other options. Reportlab is often mentioned when creating PDFs from Python. If you are already invested in your approach, maybe using PyX or pypsg to generate the Postscript files and then feeding that into win32print would work.
