convert Doxygen HTML documentation to JSON/XML (aka reverse-generate documentation)

Maya has pretty good HTML documentation for its C++ API (which I'm pretty sure is generated by Doxygen).
However, the Python docs/API are lacking (e.g. no method return type information). If I had access to the docs in a structured format (JSON/XML/whatever), I could easily generate proper .pyi files with better type info.
Is there an easy way to convert Doxygen HTML documentation to JSON or a similar format?
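One possible starting point is to scrape the pages with the standard-library HTML parser. The sketch below assumes the pages follow Doxygen's usual markup, where each member prototype sits in a `td` with class `memname`, followed by cells with class `paramtype` and `paramname`. That assumption should be checked against the actual Maya pages. The `MemberScraper` class and the sample HTML (a made-up `Widget::resize` method) are hypothetical.

```python
import json
from html.parser import HTMLParser


class MemberScraper(HTMLParser):
    """Collect member prototypes from Doxygen-style <td> cells (assumed markup)."""

    WANTED = {"memname", "paramtype", "paramname"}

    def __init__(self):
        super().__init__()
        self.members = []     # one dict per documented member
        self._current = None  # class of the <td> being read, if it is a wanted one
        self._text = []       # text fragments collected inside that <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            classes = (dict(attrs).get("class") or "").split()
            hits = [c for c in classes if c in self.WANTED]
            self._current = hits[0] if hits else None
            self._text = []

    def handle_data(self, data):
        if self._current:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "td" or not self._current:
            return
        # Collapse whitespace (including non-breaking spaces) and drop a trailing comma.
        text = " ".join("".join(self._text).split()).rstrip(",")
        if self._current == "memname":
            return_type, _, name = text.rpartition(" ")
            self.members.append({"name": name, "returns": return_type, "params": []})
        elif self.members:
            params = self.members[-1]["params"]
            if self._current == "paramtype":
                params.append({"type": text, "name": ""})
            elif params:
                params[-1]["name"] = text
        self._current = None


# Hypothetical sample in the assumed Doxygen layout.
SAMPLE = """
<table class="memname"><tr>
  <td class="memname">int Widget::resize </td><td>(</td>
  <td class="paramtype">int&#160;</td>
  <td class="paramname"><em>width</em>, </td>
</tr><tr>
  <td class="paramkey"></td><td></td>
  <td class="paramtype">int&#160;</td>
  <td class="paramname"><em>height</em>&#160;</td>
</tr></table>
"""

scraper = MemberScraper()
scraper.feed(SAMPLE)
print(json.dumps(scraper.members, indent=2))
```

Default argument values, overloads and nested tables are not handled here, so real pages would likely need more cases.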

Related

parse and import data structure in Python from Franca interface descriptive language (fidl) files

Is there a Python library for reading and parsing fidl files? Ideally I would like to represent the data structure described in the fidl file as a Python dictionary, so that it can then be manipulated easily.
I was looking into pyfranca but it seems unmaintained and lacking proper documentation even for basic stuff.
Honestly, I could not find much more, so any help would be appreciated.

Python create xml from xsd

I am new to Python and need to implement an interface to an accounting tool. I received some XSD files which describe the interface.
What is the easiest way to generate the XML according to the XSD?
Is there any module I can use?
Do I have to create the XML all by myself and I can use the XSD just to verify it?
How do I best proceed?

I think generateDS is the solution to your problem.
As described from chapter 5 of its documentation onward, the command
python generateDS.py -o people.py -s peoplesubs.py people.xsd
reads the XSD file and creates several classes and subclasses. It generates many data structures, with getters and setters for accessing and using the data :)
If there is any XML file that complies with that XSD, it can be read straight away by using
import people
rootObject = people.parse('people.xml')
within the code. More information is given in chapter 12.
The aforementioned classes also provide methods to export data as XML.
The documentation is good, and I highly recommend using it for any future project.

There are some projects on GitHub that do this using the xmlschema library, for instance fortesp/xsd2xml or miaozn/xsd2xml (Python 2).
For instance, with the former:
xmlgenerator = XMLGenerator('resources/pain.001.001.09.xsd', True, DataFacet())
print(xmlgenerator.execute()) # Output to console
xmlgenerator.write('filename.xml') # Output to file
Unfortunately none of these are properly packaged though.
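For the fallback the question mentions, building the XML by hand and using the XSD only to verify it, the standard library's `xml.etree.ElementTree` is enough for the building part. This is a minimal sketch with a hypothetical `people`/`person`/`name` structure. The real element names would come from the XSD, and validation would still need a separate schema-aware tool.

```python
import xml.etree.ElementTree as ET


def build_people(names):
    """Build a <people> tree with one <person><name>...</name></person> per entry."""
    root = ET.Element("people")  # hypothetical root element
    for name in names:
        person = ET.SubElement(root, "person")
        ET.SubElement(person, "name").text = name
    return root


root = build_people(["Alice", "Bob"])
# encoding="unicode" returns a str without an XML declaration.
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```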

How can I use both Sphinx and Pycco for Python documentation?

I'd like to be able to use Sphinx for the main project documentation, so the docstrings must be in reStructuredText, but I'd also like to generate HTML for reading the inline comments in the style of Pycco, which uses Markdown.
Is there a tool or combination of tools that will allow me to convert only the docstrings from reStructuredText to Markdown?

The pyment tool can convert docstrings to different formats. You could start with that, and then write a subclass of DocToolsBase to format docstrings the way you like.
See the question "What is the standard Python docstring format?" for more about Python docstring conventions and tooling.

Pandoc may be the right tool; see its project page for details. I once wanted to try it, but it needed a Haskell runtime environment, which is rather large, so I gave up.
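If neither tool fits, a hand-rolled converter is another option. The sketch below is not how pyment or Pandoc work; it uses only the standard library to pull docstrings out with `ast` and rewrite a very small subset of reStructuredText (inline literals, `:param:` and `:returns:` fields) as Markdown. The function names and the sample source are hypothetical.

```python
import ast
import re


def rst_to_markdown(text):
    """Convert a small subset of reST markup to Markdown."""
    # ``literal`` -> `literal`
    text = re.sub(r"``(.+?)``", r"`\1`", text)
    # :param name: description -> - **name**: description
    text = re.sub(r"^[ \t]*:param (\w+):[ \t]*", r"- **\1**: ", text, flags=re.M)
    # :returns: description -> - **Returns**: description
    text = re.sub(r"^[ \t]*:returns?:[ \t]*", "- **Returns**: ", text, flags=re.M)
    return text


def docstrings_as_markdown(source):
    """Map each function or class name in `source` to its converted docstring."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)  # also strips the common indentation
            if doc:
                result[node.name] = rst_to_markdown(doc)
    return result


# Hypothetical module source with a reST docstring.
SOURCE = '''
def scale(value, factor):
    """Multiply ``value`` by ``factor``.

    :param value: number to scale
    :param factor: multiplier
    :returns: the scaled number
    """
    return value * factor
'''

converted = docstrings_as_markdown(SOURCE)
print(converted["scale"])
```

Directives, cross-references and nested markup are left untouched, so anything beyond simple docstrings would need a real reST parser.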

What's a good document standard to use programmatically?

I'm writing a program that requires input in the form of a document; it needs to replace a few values, insert a table, and convert the result to PDF. It's written in Python + Qt (PyQt). Is there any well-known document standard which can be easily used programmatically? It must be cross-platform, and preferably open.
I have looked into Microsoft DOC and DOCX, which are binary formats that I can't edit. Python has bindings for them, but only on Windows.
Open Office's ODT/ODF is a zipped archive of XML files, so I can edit that one, but there are no command-line utilities or any other way to programmatically convert the file to a PDF. Open Office provides bindings, but you need to run Open Office from the command line, start a server, etc. And my clients may not have Open Office installed.
RTF is readable from Python, but I couldn't find any libraries to convert RTF documents to PDF.
At the moment I'm exporting from Microsoft Word to HTML, replacing the values and using PyQt to convert it to a PDF. However, it loses formatting features and looks awful. I'm surprised there isn't a well-known library which lets you edit a variety of document formats and convert them into other formats. Am I missing something?
Update: Thanks for the advice, I'll have a look at using LaTeX.
Thanks,
Jackson

Have you looked into using LaTeX documents?
They are perfect to use programmatically (compiling documents? You gotta love that...), and there are several Python frameworks you can use, such as plasTeX and PyTex.
Exporting a LaTeX document to PDF is almost immediate.
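As a rough illustration of the "replace a few values, insert a table, convert to PDF" workflow with LaTeX, here is a minimal sketch that uses plain string substitution rather than plasTeX or PyTex. The template, the `@@...@@` placeholder convention and the function names are all made up for this example, and `write_pdf` assumes a TeX installation that provides `pdflatex`.

```python
import subprocess
from pathlib import Path

# Hypothetical template; @@...@@ placeholders are a convention chosen for this sketch.
TEMPLATE = r"""\documentclass{article}
\begin{document}
Invoice for @@CUSTOMER@@

\begin{tabular}{lr}
Item & Price \\
\hline
@@ROWS@@
\end{tabular}
\end{document}
"""


def latex_escape(text):
    """Escape a few LaTeX special characters that commonly appear in data."""
    for char in "&%$#_":
        text = text.replace(char, "\\" + char)
    return text


def render(customer, items):
    """Fill the template with a customer name and (name, price) table rows."""
    rows = "\n".join(
        f"{latex_escape(name)} & {price:.2f} \\\\" for name, price in items
    )
    filled = TEMPLATE.replace("@@CUSTOMER@@", latex_escape(customer))
    return filled.replace("@@ROWS@@", rows)


def write_pdf(tex_source, out_dir="."):
    """Write the source to disk and run pdflatex on it (needs a TeX installation)."""
    tex_path = Path(out_dir) / "document.tex"
    tex_path.write_text(tex_source, encoding="utf-8")
    subprocess.run(
        ["pdflatex", "-interaction=nonstopmode", tex_path.name],
        cwd=out_dir,
        check=True,
    )
    return tex_path.with_suffix(".pdf")


tex = render("Smith & Co", [("Widget", 9.5), ("Gadget", 12)])
print(tex)
```

Calling `write_pdf(tex)` would then produce `document.pdf` next to the `.tex` file. The escaping here ignores backslashes, braces, `~` and `^`, so untrusted input would need a fuller escape table.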

Since you're already using PyQt anyway, it might be worth looking at Qt's built-in rich text processing module, which looks decent. Its documentation covers detailed content manipulation, including inserting tables. Also, QPrinter's default print-to-file format happens to be PDF.
Without knowing more about your particular needs it's hard to say if these would do what you want, but since your application already has PyQt as a dependency, it seems silly to introduce any more dependencies without evaluating the functionality you've already got available.
The non-GUI parts of the Qt framework are often overlooked though.
edit: included more links.

You might want to try ReportLab. The open source version can write PDFs, and the commercial version has a lot of really nice abstractions to allow output to a variety of different formats from a single input.

I don't know what kind of audience your program has; TeX is good and I would go with it.
Another possible choice is the Excel format, parsed with xlrd.
I've used it a couple of times and it's pretty straightforward.
An Excel file is a good choice for the following reasons:
It is a well-known format that is easy to edit
You could prepare a predefined template with constraints and a table

Another option is creating XML documents, transforming them to XSL-FO and rendering with FOP or RenderX. If you use DocBook as the primary input, there are freely available toolchains for converting it to PDF, RTF, HTML and so forth.
It is rather quirky to use and not my idea of fun, but it does deliver and can be embedded in an application, AFAICT.
Creating DocBook is very straightforward, as it has a wide range of semantic tags, table support, etc. to give a "meaningful" markup which can be reliably formatted. The XSL stylesheets are modular and allow parts to be customized or replaced to generate your own look and feel.
It works well for relatively free-flowing documents with lots of text.
For fill-in-the-blanks kinds of documents, a regular reporting engine may be a better fit, or some straightforward XSL stylesheets that emit the XSL-FO directly.

Why does Sphinx generate json?

I notice that Sphinx has the ability to generate documentation in JSON. What are these files used for?

As the docs say, it's "for use of a web application (or custom postprocessing tool) that doesn’t use the standard HTML templates."
json's a good simple way for language-agnostic data interchange, so, why not?-)

I assume you're talking about the SerializingHTMLBuilder, in which case I think the answer might be that there isn't necessarily a specific purpose in mind. Rather, many things provide conversion routines of various kinds with a "loads/dumps" API convention, and the json module (known as simplejson before it was brought into the standard library in 2.6) is but one of many such packages.
Presumably some people would prefer to work with data in JSON format for their own purposes. If I were trying to build some sort of dynamic Javascripty documentation system, I could well imagine choosing to use JSON as the way to get documentation from the backend out to the client in a manageable format, if for some reason HTML or XML didn't seem like the better option.
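To make the "custom postprocessing tool" idea concrete, here is a minimal sketch of consuming one page written by Sphinx's JSON builder (`sphinx-build -b json`). It assumes each page is a `.fjson` file whose JSON object has `title` and `body` keys holding HTML fragments. The wrapper template and the sample file are hypothetical stand-ins for real build output.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical replacement for Sphinx's standard HTML templates.
PAGE_TEMPLATE = "<html><head><title>{title}</title></head><body>{body}</body></html>"


def render_page(fjson_path):
    """Wrap one Sphinx JSON page in a custom HTML shell."""
    page = json.loads(Path(fjson_path).read_text(encoding="utf-8"))
    return PAGE_TEMPLATE.format(title=page["title"], body=page["body"])


# Stand-in for a file that the JSON builder would write.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "index.fjson"
    sample.write_text(
        json.dumps({"title": "Welcome", "body": "<p>Hello</p>"}),
        encoding="utf-8",
    )
    html = render_page(sample)

print(html)
```

A web application could serve the same fields through its own templating layer instead of a format string.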
