Why does Sphinx generate json? - python

I notice that Sphinx has the ability to generate documentation in JSON. What are these files used for?

As the docs say, it's "for use of a web application (or custom postprocessing tool) that doesn’t use the standard HTML templates."
json's a good simple way for language-agnostic data interchange, so, why not?-)

I assume you're talking about the SerializingHTMLBuilder, in which case I think the answer might be that there isn't necessarily a specific purpose in mind. Rather, many things provide conversion routines of various kinds with a "loads/dumps" API convention, and the json module (known as simplejson before it was brought into the standard library in Python 2.6) is but one of many such packages.
Presumably some people would prefer to work with data in JSON format for their own purposes. If I were trying to build some sort of dynamic Javascripty documentation system, I could well imagine choosing to use JSON as the way to get documentation from the backend out to the client in a manageable format, if for some reason HTML or XML didn't seem like the better option.
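As a rough sketch of what a "custom postprocessing tool" might do with that output: the JSON builder (sphinx-build -b json) writes one .fjson file per page, and each file can be read with the standard json module. The helper name load_page and the sample file content below are made up for illustration, and the "title" and "body" keys are assumed from the builder's documented page context, so check them against your own build output.

```python
import json
import tempfile
from pathlib import Path


def load_page(fjson_path):
    """Return (title, body_html) from one page written by Sphinx's JSON builder."""
    with open(fjson_path, encoding="utf-8") as fh:
        page = json.load(fh)
    # "title" and "body" are assumed keys; fall back to empty strings if absent.
    return page.get("title", ""), page.get("body", "")


# Stand-in for a real file from the build directory; the content is invented.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "index.fjson"
    sample.write_text(
        json.dumps({"title": "Welcome", "body": "<p>Hello</p>"}),
        encoding="utf-8",
    )
    title, body = load_page(sample)
```

From there the body HTML can be dropped into whatever templates the web application uses instead of Sphinx's own.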

Related

parse and import data structure in Python from Franca interface descriptive language (fidl) files

Is there a Python library for reading and parsing fidl files? Ideally I would like to represent the datastructure described in the fidl as a Python dictionary, so that it can then be manipulated easily.
I was looking into pyfranca but it seems unmaintained and lacking proper documentation even for basic stuff.
Honestly, I could not find much more, so any help would be appreciated.

convert Doxygen HTML documentation to JSON/XML (aka reverse-generate documentation)

Maya has a pretty good HTML documentation for their C++ API (which I'm pretty sure comes from Doxygen).
However, the Python docs/API is lacking (eg no method return type information). If I had access to the docs in a structured way (JSON/XML/whatever), I could easily generate proper .pyi files with better type info.
Is there an easy way to convert Doxygen HTML documentation to JSON or something?

Is there a reliable python library for taking a BibTex entry and outputting it into specific formats?

I'm developing using Python and Django for a website. I want to take a BibTex entry and output it in a view in 3 different formats, MLA, APA, and Chicago. Is there a library out there that already does this or am I going to have to manually do the string formatting?
There are the following projects:
BibtexParser
Pybtex
Pybliographer
BabyBib
If you need complex parsing and output, Pybtex is recommended. Example:
>>> from pybtex.database.input import bibtex
>>> parser = bibtex.Parser()
>>> bib_data = parser.parse_file('examples/foo.bib')
>>> list(bib_data.entries.keys())
['ruckenstein-diffusion', 'viktorov-metodoj', 'test-inbook', 'test-booklet']
>>> print(bib_data.entries['ruckenstein-diffusion'].fields['title'])
Predicting the Diffusion Coefficient in Supercritical Fluids
Good luck.
Having tried them, all of these projects are bad, for various reasons: terrible APIs, bad documentation, and a failure to parse valid BibTeX files. The implementation you want, which in my experience doesn't show up in most Google searches, is biblib. This text from the README should sell it:
There are a lot of BibTeX parsers out there. Most of them are complete nonsense based on some imaginary grammar made up by the module's author that is almost, but not quite, entirely unlike BibTeX's actual grammar. BibTeX has a grammar. It's even pretty simple, though it's probably not what you think it is. The hardest part of BibTeX's grammar is that it's only written down in one place: the BibTeX source code.
The accepted answer of using pybtex is fraught with danger as Pybtex does not preserve the bibtex format of even simple bibtex files. (https://bitbucket.org/pybtex-devs/pybtex/issues/130/need-to-specially-represent-bibtex-markup)
Pybtex is therefore losing bibtex information when reading and re-writing a simple .bib file without making any changes. Users should be very careful following the recommendations to use pybtex.
I will try biblib as well and report back but the accepted answer should be edited to not recommend pybtex.
Edit:
I was able to import the data using Bibtex Parser, without any loss of data. However, I had to compile from https://github.com/sciunto-org/python-bibtexparser as the version installed via pip was bugged at the time. Users should verify that pip is getting the latest version.
As for exporting, once the data has been imported via BibTex Parser, it's in a dictionary, and can be exported as the user desires. BibTex Parser does not have built in functions for exporting in common formats. As I did not need this functionality, I didn't specifically test it. However, once imported into a dictionary, the string output can be converted to any citation format rather easily.
Here, pybtex and a custom style file can help. I used the style file provided by the journal and compiled in LaTeX instead, but Pybtex has Python style files (and also allows ingesting .bst files). So I would recommend taking the BibTex Parser input and transferring it to Pybtex (or similar) for outputting in a certain style.
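To illustrate the "dictionary to citation string" step described above, here is a minimal sketch using only plain Python. The function name format_apa_like and the sample entry are hypothetical, the entry is assumed to be a flat dict with author, year, title and journal keys, and the output is only APA-flavoured; a real style has many more rules (name abbreviation, italics, volume and page handling).

```python
def format_apa_like(entry):
    """Build a rough APA-flavoured string from a flat dict of BibTeX fields."""
    # BibTeX separates multiple authors with the word "and".
    authors = [a.strip() for a in entry["author"].split(" and ")]
    if len(authors) > 1:
        author_text = ", ".join(authors[:-1]) + ", & " + authors[-1]
    else:
        author_text = authors[0]
    return f"{author_text} ({entry['year']}). {entry['title']}. {entry['journal']}."


# Invented entry, not a real reference.
entry = {
    "author": "Doe, J. and Roe, R.",
    "year": "2001",
    "title": "An Example Title",
    "journal": "Journal of Examples",
}
citation = format_apa_like(entry)
# citation == "Doe, J., & Roe, R. (2001). An Example Title. Journal of Examples."
```

Writing one such function per style (MLA, APA, Chicago) is manageable for a handful of entry types, but a style engine is the safer route once edge cases pile up.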
The closest thing I know of is the pybtex package.

What's a good document standard to use programmatically?

I'm writing a program that requires input in the form of a document, it needs to replace a few values, insert a table, and convert it to PDF. It's written in Python + Qt (PyQt). Is there any well known document standard which can be easily used programmatically? It must be cross platform, and preferably open.
I have looked into Microsoft Doc and Docx, which are binary formats and I can't edit them. Python has bindings for it, but they're only on Windows.
Open Office's ODT/ODF is XML inside a zip archive, so I can edit that one, but there are no command-line utilities or any other way to programmatically convert the file to a PDF. Open Office provides bindings, but you need to run Open Office from the command line, start a server, etc. And my clients may not have Open Office installed.
RTF is readable from Python, but I couldn't find any way/libraries to convert RTF documents to PDF.
At the moment I'm exporting from Microsoft Word to HTML, replacing the values and using PyQt to convert it to a PDF. However it loses formatting features and looks awful. I'm surprised there isn't a well known library which lets you edit a variety of document formats and convert them into other formats, am I missing something?
Update: Thanks for the advice, I'll have a look at using Latex.
Thanks,
Jackson
Have you looked into using LaTeX documents?
They are perfect to use programmatically (compiling documents? You gotta love that...), and you have several Python frameworks you can use such as plasTeX and PyTex.
Exporting a LaTeX document to PDF is almost immediate.
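A minimal sketch of that workflow, assuming a hand-written template with made-up @@NAME@@ and @@ROWS@@ placeholders (chosen because $ and {} already have meaning in LaTeX) and assuming pdflatex is installed and on the PATH. The function names, file name and sample values are all illustrative, and the inserted values are not escaped for LaTeX special characters.

```python
import shutil
import subprocess
from pathlib import Path

TEMPLATE = r"""\documentclass{article}
\begin{document}
Dear @@NAME@@,

\begin{tabular}{lr}
Item & Price \\
\hline
@@ROWS@@
\end{tabular}
\end{document}
"""


def render(name, rows):
    """Fill the placeholders; rows is a list of (item, price) pairs."""
    body = "\n".join(f"{item} & {price} \\\\" for item, price in rows)
    return TEMPLATE.replace("@@NAME@@", name).replace("@@ROWS@@", body)


def compile_pdf(tex_source, out_dir="."):
    """Write the source and run pdflatex if available; return the PDF path or None."""
    if shutil.which("pdflatex") is None:
        return None
    tex_path = Path(out_dir) / "letter.tex"
    tex_path.write_text(tex_source, encoding="utf-8")
    subprocess.run(
        ["pdflatex", "-interaction=nonstopmode",
         "-output-directory", str(out_dir), str(tex_path)],
        check=True,
    )
    return tex_path.with_suffix(".pdf")


tex = render("Alice", [("Widget", "9.99"), ("Gadget", "19.50")])
```

Calling compile_pdf(tex) then produces letter.pdf next to the source, provided a TeX distribution is present on the client machine.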
Since you're already using PyQt anyway, it might be worth looking at Qt's built-in RTF processing module which looks decent. Here's the documentation on detailed content manipulation including inserting tables. Also the QPrinter module's default print-to-file format happens to be PDF.
Without knowing more about your particular needs it's hard to say if these would do what you want, but since your application already has PyQt as a dependency, seems silly to introduce any more without evaluating the functionality you've already got available.
The non-GUI parts of the Qt framework are often overlooked though.
edit: included more links.
You might want to try ReportLab. The open source version can write PDFs, and the commercial version has a lot of really nice abstractions to allow output to a variety of different formats from a single input.
I don't know what kind of audience your program has, but TeX is good and I would go with it.
Another possible choice is the Excel format, parsed with xlrd.
I've used it a couple of times and it's pretty straightforward.
An Excel file is a good choice for the following reasons:
It is a well-known format that is easy to edit
You could prepare a predefined template with constraints and tables
Creating XML documents, transforming them to XSL/fo and rendering with Fop or RenderX. If you use docbook as the primary input, there are toolchains freely available for converting that to PDF, RTF, HTML and so forth.
It is rather quirky to use and not my idea of fun, but it does deliver and can be embedded in an application, AFAICT.
Creating docbook is very straightforward as it has a wide range of semantic tags, table support etc to give a "meaningful" markup which can be reliably formatted. The XSL stylesheets are modular and allow parts to be customized or replaced to generate your own look and feel.
It works well for relatively free flow documents with lots of text.
For fill-in-the-blanks kinds of documents, a regular reporting engine may be a better fit, or some straightforward XSL stylesheets spitting out the XSL-FO directly.

Is there a way to correctly sort unicode strings in SQLite using Python?

Is there a simple way to order rows with unicode data in SQLite?
SQLite has a BYOS (Bring Your Own Sorter) policy. See the FAQ for more details. They chose not to include (by default) any Unicode-aware sorting algorithm, to keep the SQLite library svelte and easy to statically link in.
However, you can create a collator, that sorts however you please, then tell SQLite to use it. As the other poster hinted at, there are collators in the source tree that do this using ICU. However, you can also use your own, which makes sense if you're using a library like GLib that has its own Unicode-awareness.
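In Python, registering your own collator goes through sqlite3's Connection.create_collation. The sketch below is only a crude stand-in for real Unicode collation: the collation name unicode_simple, the helper names and the sample words are made up, and the key merely strips combining accents and ignores case. A locale-aware library such as ICU would be needed for correct language-specific ordering.

```python
import sqlite3
import unicodedata


def sort_key(text):
    """Crude key: drop combining accents and ignore case."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def unicode_collation(a, b):
    """Return -1, 0 or 1, as SQLite expects from a collation callback."""
    ka, kb = sort_key(a), sort_key(b)
    return (ka > kb) - (ka < kb)


conn = sqlite3.connect(":memory:")
conn.create_collation("unicode_simple", unicode_collation)
conn.execute("CREATE TABLE words (w TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("zebra",), ("Émile",), ("apple",), ("eclair",)],
)

# Default BINARY ordering compares raw bytes, so "Émile" lands after "zebra".
binary_rows = [r[0] for r in conn.execute("SELECT w FROM words ORDER BY w")]
# With the custom collation, "Émile" sorts among the other e-words.
rows = [r[0] for r in conn.execute(
    "SELECT w FROM words ORDER BY w COLLATE unicode_simple")]
```

The collation has to be registered on every connection that uses it, since SQLite does not store the Python callback in the database file.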
There is a library called ICU that can do proper unicode sorting for you; there's a good description in this other question:
How to sort text in sqlite3 with specified locale?
