Cross-platform help viewer with search functionality - python

I am looking for a help viewer like Windows CHM that basically provides support for
adding content in HTML format
define Table of Contents
decent search
It should work on Windows, Mac and Linux. Bonus points for also having support for generating a "plain HTML/javascript" version that can be viewed in any browser (albeit without search support).
Language preference: Python

wxHtmlHelpController, which is part of wxWidgets, is a cross-platform viewer for HtmlHelp.
I'm not sure how easy it is to use it from a non-wxWidgets program, but I think it can be done.

wxHtmlHelpController doesn't support any scripting within pages, nor does it support css.

Related

Creating PDFs from HTML/Javascript in Python with no OS dependencies

Is there any way to use Python to create PDF documents from HTML/CSS/Javascript, without introducing any OS-level dependencies?
It seems every existing solution requires special supplemental software, but upon reviewing PDF formatting specifications and HTML/CSS/Javascript rendering, there doesn't appear to be a reason why a Python solution can't exist without them. Some solutions come close, such as pyppeteer, but it still leans on a headless Chrome installation locally. These dependencies mean that microservices can't be leveraged, even though PDF generation would otherwise seem to be a viable use case for them.
While similar questions have come up many times over on SO, there doesn't appear to have been a viable technique shown without having to install specialized dependencies on the OS.
Some similar questions which routinely recommend wkhtmltopdf or are otherwise out of date (e.g., moving PDF printing support outside of Chrome is dead now):
How to convert webpage into PDF by using Python
How to convert a local HTML file to PDF using Python in Windows
HTML to PDF conversion using Chrome pdfium
How does Chrome render PDFs from HTML so well?
Convert a HTML/CSS/Javascript file to PDF using Python?
If I've somehow missed a viable approach, please feel free to mark this as a duplicate with my thanks!
Edit February 2021: It appears that the cefpython project may meet these demands - PDF printing support seems like it could be implemented in the near future.
So to clarify and formalize what others have said:
If you want to create PDF documents from HTML/CSS/javascript content, you will necessarily need a javascript engine (because you obviously need to execute the javascript if it affects the visuals of the document). This is the most complex component that you need.
As for now, there is no ECMAscript compliant engine written in pure python that is well-maintained (that would be a huge project)... There will probably never be one, since compilers and VMs for languages need to be performant and are thus usually written in a performant low-level language.
So you will always need compiled binaries for that and the HTML renderers which are less complex but also need to be performant if used in browsers, so usually they're also C++ or the likes.
The javascript engine and HTML renderer are the major part of a browser, so a headless browser is a good solution to this requirement.
Try this library: xhtml2pdf
It worked for me. Here is the documentation: doc
Some sample code:
from xhtml2pdf import pisa
def convert_html_to_pdf(source_html, output_filename):
# open output file for writing (truncated binary)
result_file = open(output_filename, "w+b")
# convert HTML to PDF
pisa_status = pisa.CreatePDF(
source_html, # the HTML to convert
dest=result_file) # file handle to recieve result
# close output file
result_file.close() # close output file
# return False on success and True on errors
return pisa_status.err
# Define your data
source_html = open('2020-06.html')
output_filename = "test.pdf"
convert_html_to_pdf(source_html, output_filename)

Getting started with GUI for student who has only worked with console

So far I have only been taught how to program/script at my school for console programs but wanted to start making some applications with an actual interface other then the command-line, unfortunately I have no idea where to begin. I tried to look it up but all I found were guides on how to "design" them not program them. As such I would like to ask you how exactly should I get started on this, I know this is a rather broad question but just a few links to some study material that can help me get started is enough. The languages I have been taught are Visual Basic and Python but I also know HTML and CSS, I am only slightly familiar with JavaScript.
(Bonus Question: so when looking at the website for Atom.io I saw that the editor was made in JavaScript, node.js, HTML and CSS. I was wondering how did they use HTML and CSS for a desktop app? Also, is it possible to do that without node.js and JavaScript, say for example with Python?)
Recommend begin with:
Visual Basic: Windows Forms(have a nice GUI editor, easier than TkInter)
Python: TkInter(Everything simple with pack()).
About HTML/CSS/JS desktop app use Electron http://electron.atom.io/
Given you already know python , you can use python library "Tkinter" to get started.
https://www.tutorialspoint.com/python/python_gui_programming.htm
https://wiki.python.org/moin/TkInter
Have a look at Graphics Py document created by John Zelle
http://mcsp.wartburg.edu/zelle/python/graphics/graphics.pdf

What's a good document standard to use programmatically?

I'm writing a program that requires input in the form of a document, it needs to replace a few values, insert a table, and convert it to PDF. It's written in Python + Qt (PyQt). Is there any well known document standard which can be easily used programmatically? It must be cross platform, and preferably open.
I have looked into Microsoft Doc and Docx, which are binary formats and I can't edit them. Python has bindings for it, but they're only on Windows.
Open Office's ODT/ODF is zipped in an xml file, so I can edit that one but there's no command line utilities or any way to programmatically convert the file to a PDF. Open Office provides bindings, but you need to run Open Office from the command line, start a server, etc. And my clients may not have Open Office installed.
RTF is readable from Python, but I couldn't find any way/libraries to convert RTF documents to PDF.
At the moment I'm exporting from Microsoft Word to HTML, replacing the values and using PyQt to convert it to a PDF. However it loses formatting features and looks awful. I'm surprised there isn't a well known library which lets you edit a variety of document formats and convert them into other formats, am I missing something?
Update: Thanks for the advice, I'll have a look at using Latex.
Thanks,
Jackson
Have you looked into using LaTeX documents?
They are perfect to use programatically (compiling documents? You gotta love that...), and you have several Python frameworks you can use such as plasTeX and PyTex.
Exporting a LaTeX documents to PDF is almost immediate.
Since you're already using PyQt anyway, it might be worth looking at Qt's built-in RTF processing module which looks decent. Here's the documentation on detailed content manipulation including inserting tables. Also the QPrinter module's default print-to-file format happens to be PDF.
Without knowing more about your particular needs it's hard to say if these would do what you want, but since your application already has PyQt as a dependency, seems silly to introduce any more without evaluating the functionality you've already got available.
The non-GUI parts of the Qt framework are often overlooked though.
edit: included more links.
You might want to try ReportLab. The open source version can write PDFs, and the commercial version has a lot of really nice abstractions to allow output to a variety of different formats from a single input.
I don't know the kind of odience of your program, Tex is good and i would go with it.
Another possible choice is Excel format, parsing it with xlrd.
I've used it a couple of time and it's pretty straightforward.
Excel file is a good for the following reasons:
Well known format easy to edit
You could prepare a predefined template with constrains and table
Creating XML documents, transforming them to XSL/fo and rendering with Fop or RenderX. If you use docbook as the primary input, there are toolchains freely available for converting that to PDF, RTF, HTML and so forth.
It is rather quirky to use and not my idea of fun, but is does deliver and can be embedded in an application, AFAICT.
Creating docbook is very straightforward as it has a wide range of semantic tags, table support etc to give a "meaningful" markup which can be reliably formatted. The XSL stylesheets are modular and allow parts to be customized or replaced to generate your own look and feel.
It works well for relatively free flow documents with lots of text.
For filling in the blanks kind of documents, a regular reporting engine may be a better fit, or some straighforward XSL stylesheets spitting out the XSL-fo directly.

HTML page to PDF in Python?

Is there a library available to convert a HTML page (text, images, layout elements etc. ) to a PDF file.
I have an HTML page with figures, text and tables with numbers etc. which I want my clients to be able to download as PDF. How do I do this with Python?
Not too familiar with python, and prince is nice if you are willing to shell out the cash. There is this http://github.com/antialize/wkhtmltopdf that uses webkit. It is a simple command line utility that you can call and it will honor html+css. As far as I know, it is the only free tool to do so well. There is a ruby gem for it http://github.com/jdpace/PDFKit, not that it helps you but might give you some ideas.
Well, there are the reportlab and html2pdf modules, but for best results I'd probably try calling Prince externally (http://www.princexml.com/doc/6.0/python/) .
Have you heard of xhtml2pdf/pisa?
It has the ability to work as a python module or as a separate command line utility.
You can use the documentation here to get started:
http://www.xhtml2pdf.com/doc/pisa-en.html

Using Sphinx to create context-sensitive help files in HTML

I am currently using AsciiDoc for documenting my software projects because it supports PDF and HTML help generation. I am currently running it through Cygwin so that the a2x toolchain functions properly. This works well for me but is a pain to setup on other Windows computers. I have been looking for alternative methods and recently revisited Sphinx. Noticing that it now produces HTML help files I gave it a try and it seems to work well in the small tests I performed.
My question is, is there a way to specify map id's for context sensitive help in the text so that my Windows programs can call the proper help API and the file is launched and opened to the desired location?
In AsciiDoc I am using pass::[<?dbhh topicname="_about" topicid="801"?>]. By using these constructs a context.h and alias.h are generated along with the other HTML help files (context sensitive help information).
I do not know about AcsiiDoc much, but in Sphinx you can reference arbitrary locations by placing anchors where you need them. See :ref: role.

Categories

Resources