I am currently using AsciiDoc for documenting my software projects because it supports PDF and HTML help generation. I am currently running it through Cygwin so that the a2x toolchain functions properly. This works well for me but is a pain to setup on other Windows computers. I have been looking for alternative methods and recently revisited Sphinx. Noticing that it now produces HTML help files I gave it a try and it seems to work well in the small tests I performed.
My question is, is there a way to specify map id's for context sensitive help in the text so that my Windows programs can call the proper help API and the file is launched and opened to the desired location?
In AsciiDoc I am using pass::[<?dbhh topicname="_about" topicid="801"?>]. By using these constructs a context.h and alias.h are generated along with the other HTML help files (context sensitive help information).
I do not know about AcsiiDoc much, but in Sphinx you can reference arbitrary locations by placing anchors where you need them. See :ref: role.
Related
I have been looking around on and off for the last week to see if anybody has performed this type of work. Unfortunately I have found very little that is python specific.
I have a repository with ~10k solidworks parts. I would like to analyse these files in batch and collect information like volume, material, etc, so to get some general statistical information. Ideally this would be in python but solutions in other languages are more than welcome.
Most answers I found are about creating addons in VB,C#, C++ to interact with Solidworks API but nothing about doing general statistical analysis of just the parts. I don't want to interact with the application or build features, I just want to look at what's inside the files without having Solidworks. I am also working on Linux which is not supported by Solidworks.
Нi, 6F4E37
I see two ways to get what you want from solidworks files, unfortunately they both involve Windows and C#/VB code.
Without SolidWorks application. Use SW Document manager - a
library that allows you to access meta information of your parts.
You will be able to get some information about your part including
volume.
I'm unaware of any attempts to run Document Manager on Wine, please
share your results if you'll try.
Note that Document manager library license is free, provided that
you have active SolidWorks subscription.
Using SolidWorks API. You don't have to create an add-in, you can
connect to solidworks from a standalone application :
SldWorks swApp = (SldWorks)Activator.CreateInstance(System.Type.GetTypeFromProgID("SldWorks.Application"));
Obviously for that approach you'll need to have solidowrks installed on your machine.
Also note that SolidWorks is not the most stable application, and it'll crash every 200-400 files that you process, so you you'll choose this approach you'll need to keep an eye on SolidWorks instance and restart it if needed.
My program’s documentation is mainly written in Sphinx, but it also includes two custom HTML pages:
an example report produced by the program;
an extended reference on certain features of the program.
These two HTML files are produced by the program itself, not by Sphinx.
I want to host my docs on Read the Docs, and it would be very convenient for me to build and host the two custom pages, versioned, together with the Sphinx docs.
My program is already installed in the RtD build environment as I have the Install Project option enabled. And since the RtD docs mention writing your own builder, I gather it might be possible to invoke my program from there and have it dump the HTML content in a specific place.
So I really have two questions:
Is this an appropriate use of Read the Docs? I guess it’s not designed to host arbitrary Web pages — but then again, those files are not arbitrary, they are an important part of the docs.
How would I implement it? I’m having a hard time making sense of the RtD API: is this “builder” related in any way to Sphinx builders? how do I hook it up to RtD? perhaps there is an example somewhere?
I achieved the desired result using Sphinx’s html_extra_path feature:
A list of paths that contain extra files [...] They are copied to the output directory.
To generate these files, I haven’t found a better place than right in my conf.py, which seems a bit precarious, but works so far. Of course, Install your project inside a virtualenv needs to be enabled in Read the Docs advanced settings.
Now my custom notices.html and showcase.html are treated just like the .html pages produced by Sphinx itself, with versioning and redirects: http://httpolice.readthedocs.io/page/notices.html
Problem
On the Mac OS X platform, I would like to write a script, either in Python or Tcl to search for text within a PDF file and extract the relevant parts. I appreciate any help.
Background
I am writing scripts to look inside a PDF to determine if it is a bill, from what company, and for what period. Based on these information, I rename the PDF and move it to an appropriate directory. For example, file such as Statement_03948293929384.pdf might become 2012-07-15 Water Bill.pdf and moved to my Utilities folder.
What have I done so far?
I have searched for PDF-to-plain-text tools, but not found anything yet
I have looked into the Tcl wiki and found an example, but could not get it to work (I searched for text in PDF, but not found).
I am looking into pdf-parser.py by Didier Stevens
I heard of a Python package called pyPdf and will look at it next.
Update
I have found a command-line tool called pdftotext written by Glyph & Cog, LLC; built and packaged by Carsten Bluem. This tool is straight forward and it solves my problem. I am still looking out for those tools that can search PDF directly, without having to convert to text file.
I have successfully used PyODConverter to convert to/from PDFs (there is also a more powerful Java version). Once you have the PDF converted to text it should be trivial to do the searching. Also I believe iText should be capable of doing similar things, but I haven't tested it.
I'm writing a program that requires input in the form of a document, it needs to replace a few values, insert a table, and convert it to PDF. It's written in Python + Qt (PyQt). Is there any well known document standard which can be easily used programmatically? It must be cross platform, and preferably open.
I have looked into Microsoft Doc and Docx, which are binary formats and I can't edit them. Python has bindings for it, but they're only on Windows.
Open Office's ODT/ODF is zipped in an xml file, so I can edit that one but there's no command line utilities or any way to programmatically convert the file to a PDF. Open Office provides bindings, but you need to run Open Office from the command line, start a server, etc. And my clients may not have Open Office installed.
RTF is readable from Python, but I couldn't find any way/libraries to convert RTF documents to PDF.
At the moment I'm exporting from Microsoft Word to HTML, replacing the values and using PyQt to convert it to a PDF. However it loses formatting features and looks awful. I'm surprised there isn't a well known library which lets you edit a variety of document formats and convert them into other formats, am I missing something?
Update: Thanks for the advice, I'll have a look at using Latex.
Thanks,
Jackson
Have you looked into using LaTeX documents?
They are perfect to use programatically (compiling documents? You gotta love that...), and you have several Python frameworks you can use such as plasTeX and PyTex.
Exporting a LaTeX documents to PDF is almost immediate.
Since you're already using PyQt anyway, it might be worth looking at Qt's built-in RTF processing module which looks decent. Here's the documentation on detailed content manipulation including inserting tables. Also the QPrinter module's default print-to-file format happens to be PDF.
Without knowing more about your particular needs it's hard to say if these would do what you want, but since your application already has PyQt as a dependency, seems silly to introduce any more without evaluating the functionality you've already got available.
The non-GUI parts of the Qt framework are often overlooked though.
edit: included more links.
You might want to try ReportLab. The open source version can write PDFs, and the commercial version has a lot of really nice abstractions to allow output to a variety of different formats from a single input.
I don't know the kind of odience of your program, Tex is good and i would go with it.
Another possible choice is Excel format, parsing it with xlrd.
I've used it a couple of time and it's pretty straightforward.
Excel file is a good for the following reasons:
Well known format easy to edit
You could prepare a predefined template with constrains and table
Creating XML documents, transforming them to XSL/fo and rendering with Fop or RenderX. If you use docbook as the primary input, there are toolchains freely available for converting that to PDF, RTF, HTML and so forth.
It is rather quirky to use and not my idea of fun, but is does deliver and can be embedded in an application, AFAICT.
Creating docbook is very straightforward as it has a wide range of semantic tags, table support etc to give a "meaningful" markup which can be reliably formatted. The XSL stylesheets are modular and allow parts to be customized or replaced to generate your own look and feel.
It works well for relatively free flow documents with lots of text.
For filling in the blanks kind of documents, a regular reporting engine may be a better fit, or some straighforward XSL stylesheets spitting out the XSL-fo directly.
I am looking for a help viewer like Windows CHM that basically provides support for
adding content in HTML format
define Table of Contents
decent search
It should work on Windows, Mac and Linux. Bonus points for also having support for generating a "plain HTML/javascript" version that can be viewed in any browser (albeit without search support).
Language preference: Python
wxHtmlHelpController, which is part of wxWidgets, is a cross-platform viewer for HtmlHelp.
I'm not sure how easy it is to use it from a non-wxWidgets program, but I think it can be done.
wxHtmlHelpController doesn't support any scripting within pages, nor does it support css.