Internationalizing a Python 2.6 application via Babel

We're evaluating Babel 0.9.5 [1] under Windows for use with Python 2.6 and have the following questions that we've been unable to answer by reading the documentation or googling.
1) I would like to use an _-like abbreviation for ungettext. Is there a consensus on whether one should use n_ or N_ for this?
n_ does not appear to work. Babel does not extract text.
N_ appears to partially work. Babel extracts text like it does for gettext, but does not format for ngettext (missing plural argument and msgstr[ n ].)
2) Is there a way to set the initial msgstr fields like the following when creating a POT file?
I suspect there may be a way to do this via Babel cfg files, but I've been unable to find documentation on the Babel cfg file format.
"Project-Id-Version: PROJECT VERSION\n"
"Language-Team: en_US \n"
3) Is there a way to preserve 'obsolete' msgid/msgstr's in our PO files? When I use the Babel update command, newly created obsolete strings are marked with #~ prefixes, but existing obsolete message strings get deleted.
Thanks,
Malcolm
[1] http://babel.edgewall.org/

By default, pybabel extract recognizes the following keywords: _, gettext, ngettext, ugettext, ungettext, dgettext, dngettext, N_. Use the -k option to add others. N_ is often used for NULL-translations (also called deferred translations), which is why it is extracted as a plain message rather than a plural one.
Update: The -k option can also specify which arguments of the function are put in the catalog. So, if you use n_ = ngettext, try pybabel extract -k n_:1,2 ....
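As a minimal sketch of the alias that -k n_:1,2 is meant to match, assuming Python 3 and the standard gettext module (on Python 2.6 the method would be ungettext instead). NullTranslations stands in for a real catalog, and describe_files is a hypothetical helper used only for illustration:

```python
import gettext

# NullTranslations is a stand-in for a real catalog; it returns the msgids unchanged.
translations = gettext.NullTranslations()
_ = translations.gettext
n_ = translations.ngettext  # the alias that `-k n_:1,2` tells the extractor about


def describe_files(count):
    # Arguments 1 and 2 are the singular and plural msgids; argument 3 selects the form.
    return n_("%d file", "%d files", count) % count
```

The 1,2 in the keyword specification names the positions of the singular and plural msgids, which is what lets the extractor emit a msgid_plural entry for each call.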

To answer question 2):
If you run Babel via pybabel extract, you can set Project-Id-Version via the --project and --version options.
If you run Babel via setup.py extract_messages, then Project-Id-Version is taken from the distribution (project name and version in the setup.py file).
Both ways also support the options --msgid-bugs-address and --copyright-holder for setting the POT metadata.

Related

Obtaining metadata "Where from" of a file on Mac

I am trying to obtain the "Where from" extended file attribute which is located on the "get info" context-menu of a file in MacOS.
Example
When right-clicking the file and displaying its info, this metadata is shown.
The highlighted part in the image below shows the information I want to obtain (the link of the website where the file was downloaded from).
I want to use this Mac-specific function using Python.
I thought of using OS tools but couldn't figure out any.

TL;DR: Get the extended attribute like MacOS's "Where from" by e.g. pip-install pyxattr and use xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms").
Extended Attributes on files
These extended file attributes like your "Where From" in MacOS (since 10.4) store metadata not interpreted by the filesystem. They exist for different operating systems.
using the command-line
You can also query them on the command-line with tools like:
exiftool:
exiftool -MDItemWhereFroms -MDItemTitle -MDItemAuthors -MDItemDownloadedDate /path/to/file
xattr (on macOS the command itself is apparently a Python script)
xattr -p -l -x /path/to/file
On MacOS many attributes are displayed in property-list format, thus use -x option to obtain hexadecimal output.
using Python
Ture Pålsson pointed out the keywords that were the missing link. Such common and appropriate terms are helpful when searching the Python Package Index (PyPI):
Search PyPI by keywords: extended file attributes, metadata:
xattr
pyxattr
osxmetadata, requires Python 3.7+, MacOS only
For example to list and get attributes use (adapted from pyxattr's official docs)
import xattr
xattr.listxattr("file.pdf")
# ['user.mime_type', 'com.apple.metadata:kMDItemWhereFroms']
xattr.getxattr("file.pdf", "user.mime_type")
# 'text/plain'
xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms")
# ['https://example.com/downloads/file.pdf']
However you will have to convert the MacOS specific metadata which is stored in plist format, e.g. using plistlib.
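The conversion step could look roughly like the sketch below, assuming Python 3 and that the attribute value is the raw binary property list returned by xattr.getxattr. decode_where_froms is a hypothetical helper name, and the sample bytes are generated here only so the snippet runs without a real file:

```python
import plistlib


def decode_where_froms(raw):
    """Decode the raw bytes of a kMDItemWhereFroms attribute into a list of URLs."""
    # plistlib.loads detects the binary ("bplist00") format automatically.
    value = plistlib.loads(raw)
    return list(value)


# Stand-in for the bytes that xattr.getxattr(...) would return on macOS.
sample = plistlib.dumps(
    ["https://example.com/downloads/file.pdf", "https://example.com/"],
    fmt=plistlib.FMT_BINARY,
)
urls = decode_where_froms(sample)
```

On a real file, the argument would be the bytes returned by xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms").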
File metadata on MacOS
Mac OS X 10.4 (Tiger) introduced Spotlight, a system for extracting (or harvesting), storing, indexing, and querying metadata. It provides an integrated system-wide service for searching and indexing.
This metadata is stored as extended file attributes having keys prefixed with com.apple.metadata:. The "Where from" attribute for example has the key com.apple.metadata:kMDItemWhereFroms.
using Python
Use osxmetadata to get functionality similar to macOS's md* utilities:
from osxmetadata import OSXMetaData
filename = 'file.pdf'
meta = OSXMetaData(filename)
# get and print "Where from" list, downloaded date, title
print(meta.wherefroms, meta.downloadeddate, meta.title)
See also
MacIssues (2014): How to look up file metadata in OS X
OSXDaily (2018): How to View & Remove Extended Attributes from a File on Mac OS
Ask Different: filesystem - What all file metadata is available in macOS?
Query Spotlight for a range of dates via PyObjC
Mac OS X : add a custom meta data field to any file

macOS stores metadata such as the "Where from" attribute under the key com.apple.metadata:kMDItemWhereFroms.
import xattr
value = xattr.getxattr("sublime_text_build_4121_mac.zip",'com.apple.metadata:kMDItemWhereFroms').decode("ISO-8859-1")
print(value)
'bplist00¢\x01\x02_\x10#https://download.sublimetext.com/sublime_text_build_4121_mac.zip_\x10\x1chttps://www.sublimetext.com/\x08\x0bN\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00m'
I had faced a similar problem long ago. We did not use Python to solve it.

How can I localize argparse generated messages in a portable way?

Context:
I am developing a small module to automatically rename photographs in a directory according to their exif timestamp (goal: easily mixing pictures from different cameras or smartphones). It works smoothly either as a Python package or directly from the command line through a tiny wrapper using argparse.
And I have just had the (rather stupid) idea to localize it into a non-English language. OK, gettext is my friend for all my own code, but when I came to the argparse-generated messages, I found myself on shaky ground...
Current research:
I have already found some resources on SO:
How to make python's argparse generate Non-English text?
Translate argparse's internal strings
Both end in adding the relevant strings from argparse into a po/mo file and let the argparse module automatically use the translated strings because internally it uses the _(...) wrapper. So far, so good.
My problem:
I feel this more as a workaround than a clean and neat solution because:
I could not find a word advising it in official Python documentation
It looks like a work in progress: implemented but not documented, so some strings could change in a future Python release (or did I miss something?)
Current code:
parser = argparse.ArgumentParser(
    prog=prog,
    description="Rename pictures according to their exif timestamp")
parser.add_argument("-v", "--version", action="version",
                    version="%(prog)s " + __version__)
parser.add_argument("--folder", "-f", default=".",
                    help="folder containing files to rename")
parser.add_argument("-s", "--src_mask", default="DSCF*.jpg",
                    help="pattern to select the files to rename")
parser.add_argument("-d", "--dst_mask", default="%Y%m%d_%H%M%S",
                    help="format for the new file name")
parser.add_argument("-e", "--ext", default=".jpg", dest="ext_mask",
                    help="extension for the new file name")
parser.add_argument("-r", "--ref_file", default="names.log",
                    help="a file to remember the old names")
parser.add_argument("-D", "--debug", action="store_true",
                    help="print a line per rename")
parser.add_argument("-X", "--dry_run", action="store_true", dest="dummy",
                    help="process normally except no rename occurs")
# subcommands configuration (rename, back, merge)
subparser = parser.add_subparsers(dest='subcommand', help="sub-commands")
ren = subparser.add_parser("rename", help=
                           "rename files by using their exif timestamp")
ren.add_argument("files", nargs="*",
                 help="files to process (default: src_mask)")
back = subparser.add_parser("back",
                            help="rename files back to their original name")
back.add_argument("files", nargs="*",
                  help="files to process (default: content of ref_file)")
merge = subparser.add_parser("merge",
                             help="merge files from a different folder")
merge.add_argument("src_folder", metavar="folder",
                   help="folder from where merge picture files")
merge.add_argument("files", nargs="*",
                   help="files to process (default: src_mask)")
I know how to wrap my own strings with _(), and I could probably have acceptable translations for the usage and help messages, but there are plenty of error messages when the user gives a wrong syntax, and I would like to prevent English error messages in the middle of a French-speaking program...
Question:
Is there any guarantee that the implementation of strings in the argparse module will not change, or is there a more robust/portable way to provide translations for its messages?

After some more research and @hpaulj's great comments, I can confirm:
the localizable messages from argparse are upward compatible from 3.3 to current version (old messages were never changed but new messages were added for new features)
the above is not true before 3.3
there are slight differences in 2.7
That means that only 2 paths are possible here:
accept the risk and provide a translation for the current version that will work with any Python version >= 3.3 - the risk is that a future version breaks the translation or adds new (untranslated) messages. Nothing more to say because this explicitly relies on the implementation details of the module
do not use at all argparse module and build a custom parser based on getopt. It is probably an acceptable option for simple use cases that do not require the full power of argparse
None of them are really good, but I cannot imagine a better one...
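One way to keep the first path's risk visible is to check, for each Python version you support, that the msgids in your catalog still occur in the installed argparse source. This is only a rough sketch under stated assumptions: missing_msgids is a hypothetical helper, the argparse source file must be available, and a message split across several string literals in the source would be reported as missing even though it still exists:

```python
import argparse
import inspect


def missing_msgids(msgids):
    """Return the msgids that no longer appear verbatim in the installed argparse source."""
    # Plain substring search on the module source; it does not parse the code.
    source = inspect.getsource(argparse)
    return [msgid for msgid in msgids if msgid not in source]


# "usage: " is one of the strings argparse wraps with _(...).
still_missing = missing_msgids(["usage: "])
```

Running such a check in a test suite would flag a changed or removed message when a new Python release comes out, although it cannot detect newly added messages.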
I will try to set up a project on GitHub or GitLab providing the pot file and the French translation for the current argparse, and make it available on PyPI. If it ever exists, I will add references to it here and shall be glad to include other languages there.
A beta version of a project giving French translations for the argparse module is currently available on GitHub.
I call it beta because at this time, it has not been extensively tested, but it can be used either directly or as an example of what could be done. The binary wheel contains a little endian mo file, but the source distribution allows the mo file to be generated automatically on the target system with no additional dependency by including a copy of the msgfmt.py file from the Tools i18n of CPython.

Sphinx apidoc section titles for Python module/package names

When I run sphinx-apidoc and then make html it produces doc pages that have "Subpackages" and "Submodules" sections and "module" and "package" at the end of each module/package name in the table of contents (TOC). How might I prevent these extra titles from being written without editing the Sphinx source?
Here's an example of the doc pages I would like to make (notice the TOC):
http://selenium.googlecode.com/svn/trunk/docs/api/py/index.html#documentation
I understand it is due to the apidoc.py file in the sphinx source (line 88):
https://bitbucket.org/birkenfeld/sphinx/src/ef3092d458cc00c4b74dd342ea05ba1059a5da70/sphinx/apidoc.py?at=default
I could manually edit each individual .rst file to delete these titles or just remove those lines of code from the script but then I'd have to compile the Sphinx source code. Is there an automatic way of doing this without manually editing the Sphinx source?

I was struggling with this myself when I found this question... The answers given didn't quite do what I wanted so I vowed to come back when I figured it out. :)
In order to remove 'package' and 'module' from the auto-generated headings and have docs that are truly automatic, you need to make changes in several places so bear with me.
First, you need to handle your sphinx-apidoc options. What I use is:
sphinx-apidoc -fMeET ../yourpackage -o api
Assuming you are running this from inside the docs directory, this will source yourpackage for documentation and put the resulting files at docs/api. The options I'm using here will overwrite existing files, put module docs before submodule docs, put documentation for each module on its own page, abstain from creating module/package headings if your docstrings already have them, and it won't create a table of contents file.
That's a lot of options to remember, so I just add this to the end of my Makefile:
buildapi:
	sphinx-apidoc -fMeET ../yourpackage -o api
	@echo "Auto-generation of API documentation finished. " \
	      "The generated files are in 'api/'"
With this in place, you can just run make buildapi to build your docs.
Next, create an api.rst file at the root of your docs with the following contents:
API Documentation
=================

Information on specific functions, classes, and methods.

.. toctree::
   :glob:

   api/*
This will create a table of contents with everything in the api folder.
Unfortunately, sphinx-apidoc will still generate a yourpackage.rst file with an ugly 'yourpackage package' heading, so we need one final piece of configuration. In your conf.py file, find the exclude_patterns option and add this file to the list. It should look something like this:
exclude_patterns = ['_build', 'api/yourpackage.rst']
Now your documentation should look exactly like you designed it in the module docstrings, and you never have to worry about your Sphinx docs and your in-code documentation being out of sync!

It's probably late, but the options maxdepth or titlesonly should do the trick.
More details :
http://sphinx-doc.org/latest/markup/toctree.html

The answer by Jen Garcia helped a lot, but it requires repeating package names in docstrings. I used a Perl one-liner in my Makefile to remove the "module" or "package" suffix:
docs:
	rm -rf docs/api docs/_build
	sphinx-apidoc -MeT -o docs/api wdmapper
	for f in docs/api/*.rst; do\
		perl -pi -e 's/(module|package)$$// if $$. == 1' $$f ;\
	done
	$(MAKE) -C docs html

I didn't want to use the titles within my docstrings as I was following numpy style guidelines. So I first generate the rst files and then run the following python script as a post-processing step.
from pathlib import Path

src_dir = Path("source/api")
for file in src_dir.iterdir():
    print("Processed RST file:", file)
    with open(file, "r") as f:
        lines = f.read()
    # Drop the section headings that sphinx-apidoc inserts.
    junk_strs = ["Submodules\n----------", "Subpackages\n-----------"]
    for junk in junk_strs:
        lines = lines.replace(junk, "")
    # Remove the " module" suffix from page titles.
    lines = lines.replace(" module\n=", "\n")
    with open(file, "w") as f:
        f.write(lines)
This script is kept in the same directory as the makefile. I also add the following lines to the makefile.
html:
	# rm -r "$(BUILDDIR)"
	python rst_process.py
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
Now running make html builds the documentation in the way I want.

I'm not sure I'm 100% answering your question, but I had a similar experience and I realized I was running sphinx-apidoc with the -f flag each time, which created the .rst files fresh each time.
Now I'm allowing sphinx-apidoc to generate the .rst files once, but not overwriting them, so I can modify them to change titles/etc. and then run make html to propagate the changes. If I want to freshly generate .rst files I can just remove the files I want to regenerate or pass the -f flag in.
So you do have to change the rst files but only once.

In newer versions of Apidoc, you can use a custom Jinja template to control the generated output.
The default templates are here: https://github.com/sphinx-doc/sphinx/tree/5.x/sphinx/templates/apidoc
You can make a local copy of each template using the same names (e.g. source/_templates/toc.rst_t) and invoke sphinx-apidoc with the --templatedir option (e.g. sphinx-apidoc --templatedir source/_templates).
Once you are using your own template file, you can customize it however you want. For example, you can remove the ugly "package" and "module" suffix, which is added at this stage.

Trouble with gettext windows and domain

I have a problem with gettext on windows.
I'm using the gettext module from Python and a third-party module named gettext_windows:
http://bazaar.launchpad.net/~bialix/gettext-py-windows/trunk/view/head:/gettext_windows.py
The code is the following:
gettext_windows.setup_env()
_ = gettext.gettext
self._appName = "bitbucket"
self._localeDir = os.getcwd() + "\\data\\locale\\"
self._languages = ["it_IT", "pl_PL"]
if gettext_windows.get_language()[0] in self._languages:
    lang = gettext_windows.get_language()[0]
    self._translation = gettext.translation(self._appName, self._localeDir, lang)
    self._translation.install(unicode=True)
To create the .po/.mo files I'm using PoEdit.
Then I save these files and put them in:
data
----locale/
--------it_IT/
------------LC_MESSAGES/
----------------bitbucket.mo
----------------bitbucket.po
data
----locale/
--------pl_PL/
------------LC_MESSAGES/
----------------bitbucket.mo
----------------bitbucket.po
When I try to execute my app, I get the following error:
No translation files found for domain bitbucket
Can anybody explain to me what's wrong?
The files are in the right directory.
If I try to use the find() method from the gettext module:
print gettext.find('bitbucket', self._localeDir, self._languages, all=True)
It works properly and returns the *.mo files for the it_IT/pl_PL languages.

I would recommend following the instructions on the wxPython wiki: http://wiki.wxpython.org/Internationalization#How_to_get_gettext_tools_for_Win32
If you get stuck, ask for help on the wxPython mailing list. There are multiple people there who have written this kind of support into their applications.
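One possible cause, not confirmed anywhere in this thread, is visible in the question's code: gettext.translation() and gettext.find() expect languages to be a list of language codes, and the working find() call passes a list while the failing translation() call passes a single string. The sketch below, assuming Python 3 and a throwaway locale directory with an empty placeholder file, shows the difference using find(), which only checks that the file exists:

```python
import gettext
import os
import tempfile

# Build a throwaway locale tree with an empty placeholder catalog file.
localedir = tempfile.mkdtemp()
mo_dir = os.path.join(localedir, "it_IT", "LC_MESSAGES")
os.makedirs(mo_dir)
open(os.path.join(mo_dir, "bitbucket.mo"), "wb").close()

# A bare string is iterated character by character, so no language matches.
as_string = gettext.find("bitbucket", localedir, "it_IT")
# A list is treated as a sequence of language codes.
as_list = gettext.find("bitbucket", localedir, ["it_IT"])
```

If this is indeed the problem, passing [lang] instead of lang as the third argument of gettext.translation() in the question's code would be the corresponding change.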

How can I make this long_description and README differ by a couple of sentences?

For a package of mine, I have a README.rst file that is read into the setup.py's long description like so:
readme = open('README.rst', 'r')
README_TEXT = readme.read()
readme.close()
setup(
...
long_description = README_TEXT,
....
)
This way I can have the README file show up on my github page every time I commit and on the pypi page every time I python setup.py register. There's only one problem. I'd like the github page to say something like "This document reflects a pre-release version of envbuilder. For the most recent release, see pypi."
I could just put those lines in README.rst and delete them before I python setup.py register, but I know that there's going to be a time that I forget to remove the sentences before I push to pypi.
I'm trying to think of the best way to automate this so I don't have to worry about it. Anyone have any ideas? Is there any setuptools/distutils magic I can do?

You can just use a ReST comment with some text like "split here", and then split on that in your setup.py. Ian Bicking does that in virtualenv with index.txt and setup.py.
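A rough sketch of the splitting idea, assuming Python 3. The marker text and the helper name long_description_from are hypothetical and are not taken from virtualenv's actual setup.py:

```python
# Hypothetical marker: a ReST comment, so it is invisible in the rendered README.
SPLIT_MARKER = ".. split here"


def long_description_from(readme_text, marker=SPLIT_MARKER):
    """Return the part of the README that follows the marker, or all of it if absent."""
    _, found, tail = readme_text.partition(marker)
    return tail.lstrip("\n") if found else readme_text
```

The pre-release note would sit above the marker in README.rst, and setup.py would pass long_description_from(README_TEXT) as long_description.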

Another option is to side-step the issue completely by adding a paragraph that works in both environments: "The latest unstable code is on github. The latest stable kits are on pypi."
After all, why assume that pypi people don't want to be pointed to github? This would be more helpful to both audiences, and simplifies your setup.py.

You could always do this:
GITHUB_ALERT = 'This document reflects a pre-release version...'
readme = open('README.rst', 'r')
README_TEXT = readme.read().replace(GITHUB_ALERT, '')
readme.close()
setup(
...
long_description = README_TEXT,
....
)
But then you'd have to keep that GITHUB_ALERT string in sync with the actual wording of the README. Using a regular expression instead (to, say, match a line beginning with Note for Github Users: or something) might give you a little more flexibility.
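The regular-expression variant might look like the following sketch, assuming Python 3 and that the note occupies a single line starting with the prefix suggested above. strip_github_notes is a hypothetical helper name:

```python
import re

# Hypothetical prefix; any line starting with it is dropped from the PyPI text.
GITHUB_NOTE = re.compile(r"^Note for Github Users:.*\n?", re.MULTILINE)


def strip_github_notes(readme_text):
    """Remove every line that starts with the GitHub-only prefix."""
    return GITHUB_NOTE.sub("", readme_text)
```

Only the prefix has to stay in sync between README.rst and setup.py, so the wording of the note itself can change freely.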
