Python HTML to PDF with floating divs - python

Is there a way to convert XHTML/HTML with CSS to PDF with floating divs?
I have tried pisa/xhtml2pdf in Python and dompdf in PHP; neither is able to do so.
Is there any way?

See html-tables-to-pdf-in-php-neither-dompdf-nor-html2ps-pdf-are-working.
A possible path is to use a layout (rendering) engine such as WebKit or Gecko.
The rendered HTML page can then be saved as PDF. One tool that works this way is the wkhtmltopdf project.
(I know this is not specific to Python or PHP, but you can still drive the tool from a script.)
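Driving the tool from a script can be sketched as follows. This is a minimal sketch, assuming the `wkhtmltopdf` binary is installed and on the PATH; the helper names `build_wkhtmltopdf_command` and `html_to_pdf` are hypothetical and not part of any library.

```python
import subprocess
from pathlib import Path


def build_wkhtmltopdf_command(source, output_pdf, executable="wkhtmltopdf"):
    """Return the argument list for one wkhtmltopdf run.

    `source` may be a URL or a path to a local HTML file.
    """
    return [executable, str(source), str(output_pdf)]


def html_to_pdf(source, output_pdf, executable="wkhtmltopdf"):
    """Render `source` to `output_pdf` by running wkhtmltopdf as a subprocess."""
    command = build_wkhtmltopdf_command(source, output_pdf, executable)
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(command, check=True)
    return Path(output_pdf)
```

Calling `html_to_pdf("page.html", "page.pdf")` would then leave the layout, including floats, to the WebKit engine inside the tool.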

I found a blog post that does this very thing:
http://web.archive.org/web/20130525082452/http://notes.alexdong.com/xhtml-to-pdf-using-pyqt4-webkit-and-headless
I got it working rather quickly using the more mature PyQt4 module.

Related

Import python projects to a HTML page

Suppose I have a Python game and I want to "post" it on a site like Friv that I am making. Is there any way
for me to import "game.py" into "site.html" so that it shows when I visit the site? I searched and found suggestions to use Django, but I would need to move all the HTML code that I already have to another application.
The language of browsers is JavaScript.
There is a project called PyJs which translates Python code to JavaScript. It is useful in your case, where you want to run Python code inside web browsers.
Finally you can use your resulting JavaScript files to fill up your HTML page.
In addition to PyJs, there are numerous other projects that "run Python code in a browser", like Brython. However, none of them has been standardized, and if you want a robust game in your browser, use JavaScript!
There are a number of projects that compile Python into JavaScript so that it can run in the browser.
Here are two links that might help
Web Browser Programming: https://wiki.python.org/moin/WebBrowserProgramming
PyGame Trinket: https://trinket.io/features/pygame
The way I integrate Python code in HTML is to use a templating language like Jinja2. If you want to write full Python code in HTML, then you need a transpiler like PyJs. But since you want to integrate the same code into multiple programs, why not use Flask and make an API? It is much easier.
Django is an option, but it has a steep learning curve. You can build the UI in HTML and get the data from Python through the API.
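The "HTML for the UI, Python behind an API" idea can be sketched with only the standard library. Note that this swaps in `http.server` for Flask so the example has no dependencies; the `/api/state` route, the `game_state` function and its data are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def game_state():
    """Hypothetical data that the HTML page wants to display."""
    return {"title": "My Game", "high_scores": [120, 95, 80]}


class ApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only one route exists in this sketch; everything else is a 404.
        if self.path != "/api/state":
            self.send_error(404)
            return
        body = json.dumps(game_state()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve the API on localhost until the process is interrupted."""
    HTTPServer(("127.0.0.1", port), ApiHandler).serve_forever()
```

After calling `serve()`, JavaScript in the existing page could request `/api/state` and render the returned JSON. With Flask the handler class would shrink to a single decorated function, but the division of labour stays the same.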

Get a result page from a search in Google -> to PDF

Here is what I'm trying to do: through a python script, I would like to get the first 5 pages of results of a Google search and save them as PDF files in a folder.
What do you suggest?
(1) Should I start by parsing the HTML pages one by one and then find a tool to convert them to PDF?
(2) Or is there a way to do all the steps directly in one go, with a module I don't know yet?
Thank you very much in advance for your insights!
Use the standard Python library to download the file(s). Then you can use http://www.xhtml2pdf.com/ to convert the pages to PDF.
Note: Most web pages use a lot of JavaScript to do all kinds of magic. So for many pages, only a full-blown web browser will get you nice/useful results. If you run into this problem, then there is no pure Python solution. Try phantomjs as explained here:
phantomjs rasterize.js 'http://en.wikipedia.org/w/index.php?title=Jakarta&printable=yes' jakarta.pdf
PS: I found these solutions by googling for "python convert html to pdf". You should try it once in a while.
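For the "first 5 pages of results" part of the question, the phantomjs command above can be run once per result page from Python. This is a rough sketch under several assumptions: `phantomjs` and its `rasterize.js` example script are available in the working directory, the result pages are addressed through the `start` offset in the search URL (10 results per page), and Google does not block the automated requests. The function names and the `results_N.pdf` file names are made up for illustration.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlencode


def result_page_urls(query, pages=5, per_page=10):
    """Build the URLs of the first `pages` result pages for `query`."""
    return [
        "https://www.google.com/search?"
        + urlencode({"q": query, "start": i * per_page})
        for i in range(pages)
    ]


def rasterize_commands(query, out_dir, pages=5):
    """Pair each result page with a phantomjs command that writes one PDF."""
    out_dir = Path(out_dir)
    return [
        ["phantomjs", "rasterize.js", url, str(out_dir / f"results_{n}.pdf")]
        for n, url in enumerate(result_page_urls(query, pages), start=1)
    ]


def save_results(query, out_dir, pages=5):
    """Create `out_dir` and render every result page into it."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for command in rasterize_commands(query, out_dir, pages):
        # check=True stops at the first page that fails to render.
        subprocess.run(command, check=True)
```

Keeping the URL and command construction separate from `save_results` makes it possible to inspect what would be run before any browser is started.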

How to automatically generate a PDF of a website?

I have a website that has some charts and graphs made using JavaScript libraries. What's a good way to, server-side, auto-generate the HTML, CSS, and JS, and then capture the result in a PDF / PNG / JPG? I'd like to auto-generate reports and email them to my users.
Any programming language is fine, but Ruby / Rails would be best.
I've heard of the wkhtmltopdf project. With the help of the WebKit rendering engine it produces PDFs from a webpage. It offers Python bindings. Ruby bindings are also available: PDFKit.
wkhtmltopdf is a good tool to use. I've just used it to generate 500+ PDF documents in one day using a rake task. If you're interested in gems that take advantage of wkhtmltopdf, then you can try WickedPDF or PDFKit.

HTML page to PDF in Python?

Is there a library available to convert an HTML page (text, images, layout elements etc.) to a PDF file?
I have an HTML page with figures, text and tables with numbers etc. which I want my clients to be able to download as PDF. How do I do this with Python?
I'm not too familiar with Python, and Prince is nice if you are willing to shell out the cash. There is also http://github.com/antialize/wkhtmltopdf, which uses WebKit. It is a simple command line utility that you can call, and it will honor HTML + CSS. As far as I know, it is the only free tool that does so well. There is a Ruby gem for it, http://github.com/jdpace/PDFKit; not that it helps you, but it might give you some ideas.
Well, there are the reportlab and html2pdf modules, but for best results I'd probably try calling Prince externally (http://www.princexml.com/doc/6.0/python/).
Have you heard of xhtml2pdf/pisa?
It has the ability to work as a python module or as a separate command line utility.
You can use the documentation here to get started:
http://www.xhtml2pdf.com/doc/pisa-en.html
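Used as a Python module, the conversion might look like the sketch below. It assumes the `xhtml2pdf` package is installed and uses its `pisa.CreatePDF` entry point; the helpers `build_report_html` and `write_pdf` and the sample table layout are hypothetical.

```python
from html import escape


def build_report_html(title, rows):
    """Build a small HTML page with a heading and a two-column table."""
    cells = "".join(
        f"<tr><td>{escape(str(name))}</td><td>{escape(str(value))}</td></tr>"
        for name, value in rows
    )
    return f"<html><body><h1>{escape(title)}</h1><table>{cells}</table></body></html>"


def write_pdf(html, output_path):
    """Convert an HTML string to a PDF file; return True on success."""
    # Imported here so the HTML helper above works without xhtml2pdf installed.
    from xhtml2pdf import pisa

    with open(output_path, "wb") as pdf_file:
        status = pisa.CreatePDF(html, dest=pdf_file)
    # pisa reports problems through the `err` counter on the returned status.
    return status.err == 0
```

For example, `write_pdf(build_report_html("Report", [("Revenue", 10)]), "report.pdf")` would write the table to `report.pdf`. Simple tables and text tend to convert well, while CSS that depends on a full browser layout engine may not.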

Pure python solution to convert XHTML to PDF

I am after a pure Python solution (for the GAE) to convert webpages to pdf.
I had a look at reportlab but the documentation focuses on generating pdfs from scratch, rather than converting from HTML.
What do you recommend? - pisa?
Edit:
My use case is I have a HTML report that I want to make available in PDF too. I will make updates to this report structure so I don't want to maintain a separate PDF version, but (hopefully) convert automatically.
Also because I generate the report HTML I can ensure it is well formed XHTML to make the PDF conversion easier.
Pisa claims to support what I want to do:
"pisa is a html2pdf converter using the ReportLab Toolkit, the HTML5lib and pyPdf. It supports HTML 5 and CSS 2.1 (and some of CSS 3). It is completely written in pure Python so it is platform independent. The main benefit of this tool that a user with Web skills like HTML and CSS is able to generate PDF templates very quickly without learning new technologies. Easy integration into Python frameworks like CherryPy, KID Templating, TurboGears, Django, Zope, Plone, Google AppEngine (GAE) etc."
So I will investigate it further.
Have you considered pyPdf? I doubt it has anything like the functional richness you require, but it IS a start, and it is pure Python. The PdfFileWriter class would be the one to generate PDF output; unfortunately it requires PageObject instances and doesn't provide real ways to put those together, except by extracting them from existing PDF documents. All the richer PDF page-generation packages I can find appear to depend on reportlab or other non-pure-Python libraries :-(.
What you're asking for is a pure Python HTML renderer, which is a big task to say the least ('real' renderers like webkit are the product of thousands of hours of work). As far as I'm aware, there aren't any.
Instead of looking for an HTML to PDF converter, what I'd suggest is building your report in a format that's easily converted to both - for example, you could build it as a DOM (a set of linked objects), and write converters for both HTML and PDF output. This is a much more limited problem than converting HTML to PDF, and hence much easier to implement.
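The "one report model, two converters" idea could look roughly like this. Everything here is an illustrative assumption: the node classes are made up, and `to_draw_ops` only produces a flat list of (style, text) pairs that a real PDF backend would still have to draw, so it stands in for the PDF converter rather than implementing one.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Tuple, Union


@dataclass
class Heading:
    text: str


@dataclass
class Paragraph:
    text: str


@dataclass
class Table:
    rows: List[List[str]]


Node = Union[Heading, Paragraph, Table]


def to_html(report: List[Node]) -> str:
    """Convert the report model to HTML markup."""
    parts = []
    for node in report:
        if isinstance(node, Heading):
            parts.append(f"<h1>{escape(node.text)}</h1>")
        elif isinstance(node, Paragraph):
            parts.append(f"<p>{escape(node.text)}</p>")
        elif isinstance(node, Table):
            rows = "".join(
                "<tr>" + "".join(f"<td>{escape(c)}</td>" for c in row) + "</tr>"
                for row in node.rows
            )
            parts.append(f"<table>{rows}</table>")
    return "\n".join(parts)


def to_draw_ops(report: List[Node]) -> List[Tuple[str, str]]:
    """Flatten the report into (style, text) pairs for a PDF backend to draw."""
    ops = []
    for node in report:
        if isinstance(node, Heading):
            ops.append(("heading", node.text))
        elif isinstance(node, Paragraph):
            ops.append(("body", node.text))
        elif isinstance(node, Table):
            for row in node.rows:
                ops.append(("table_row", " | ".join(row)))
    return ops
```

Because both converters walk the same small set of node types, adding a new element to the report means adding one branch per converter, which is far less work than rendering arbitrary HTML.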
