Write data to excel template - python

I need to create some excel tables, but these tables don't have simple look.
There are some pictures, some special fonts etc.
But the complicated parts are static, that means always the same.
So my idea was, I will create an excel-template with these tricky parts and then from python just insert dynamic data to this template.
I am working with pandas framework, but I didn't find a way how to do that with or without this framework.
Any idea?

There isn't an easy way to do this with any of the usual "direct file manipulation" libraries in Python (xlrd, xlwt, XlsxWriter, OpenPyXL; these are what pandas uses). The reason is that the structure of a workbook file is such that it's impossible or prohibitively difficult (depending on whether you're talking about .xls or .xlsx) to do anything resembling "in-place" editing, short of re-implementing Excel itself.
So for what you're trying to do, your best option is to let Excel do the work. (I'm assuming you can run Excel, since you mention that you'd like to create Excel templates.) There are ways to automate Excel, the most straightforward probably being Microsoft's VBA or VBScript. But if you want to do it in Python, you can, using PyWin32 or pywinauto.

Related

Uploading files using design pattern with python

I used to upload csv, excel, json or geojson files in my a postegreSQL using Python/Django.
I noticed that the scripts is redundant and sometimes difficult to maintain when we need to update key or columns. Is there a way to use design pattern? I have never used it before.
Any suggestion or links could be hep!

Search office documents for string python

I've been looking for a fast and relatively easy way of searching (grep-ish) for user-defined strings in files of varying formats, i.e xlsx, docx, pptx, pdf using Python.
My research has led me to believe that there might not be a convenient way of doing this, as per a single module or similar. Am I forced to use a separate module for each file type? And if so are these approriate?
docx
openpyxl
pptx
slate
I also looked at forms of decompression to get to the xml-files containing actual text but it seems unwieldy. I just want to be sure that there is no simple, uniform way of handling all of these different filetypes.
Well, I've mostly figured it out. In the end I decided to use powershell combined with "itextsharp.dll" to process the files. It turned out to be simpler than using portable python. Thanks for the answers:-)

Programming Externally Linked Images in Excel

This may be a long shot, but I figured it's worth asking. I need a way to programmatically insert externally linked images in excel, meaning that every time you open the file, the spreadsheet will contact the url at which the image is located. It's easy to do this manually in excel, but I want to do it programmatically, preferably with python. I've tried using the openpyxl and XlsxWriter libraries, but neither have this specific functionality. My only other option is to look for the excel source code so I can see how an externally linked image is represented by excel. I don't suppose Microsoft makes that source code public, do they?
Thanks for any suggestions

Generating correct excel xls format

I created a little script in python to generate an excel compatible xml file (saved with xls extension). The file is generated from a part database so I can place an order with the extracted data.
On the website for ordering the parts, you can import the excel file so the order fills automatically. The problem here is that each time I want to make an order, I have to open excel and save the file with xls extension of type MS Excel 97-2003 to get the import working.
The excel document then looks exactly the same, but when opened with notepad, we cannot see the xml anymore, only binary dump.
Is there a way to automate this process, by running a bat file or maybe adding some line to my python script so it is converted in the proper format?
(I know that question has been asked before, but it never has been answered)
There are two basic approaches to this.
You asked about the first: Automating Excel to open and save the file. There are in fact two ways to do that. The second is to use Python tools that can create the file directly in Python without Excel's help. So:
1a: Automating Excel through its automation interface.
Excel is designed to be controlled by external apps, through COM automation. Python has a great COM-automation interface inside of pywin32. Unfortunately, the documentation on pywin32 is not that great, and all of the documentation on Excel's COM automation interface is written for JScript, VB, .NET, or raw COM in C. Fortunately, there are a number of questions on this site about using win32com to drive Excel, such as this one, so you can probably figure it out yourself. It would look something like this:
import win32com.client
excel = win32com.client.Dispatch('Excel.Application')
spreadsheet = excel.Workbooks.Open('C:/path/to/spreadsheet.xml')
spreadsheet.SaveAs('C:/path/to/spreadsheet.xls', fileformat=excel.xlExcel8)
That isn't tested in any way, because I don't have a Windows box with Excel handy. And I vaguely remember having problems getting access to the fileformat names from win32com and just punting and looking up the equivalent numbers (a quick google for "fileformat xlExcel8" shows that the numerical equivalent is 56, and confirms that's the right format for 97-2003 binary xls).
Of course if you don't need to do it in Python, MSDN is full of great examples in JScript, VBA, etc.
The documentation you need is all on MSDN (since the Office Developer Network for Excel was merged into MSDN, and then apparently became a 404 page). The top-level page for Excel is Welcome to the Excel 2013 developer reference (if you want a different version, click on "Office client development" in the navigation thingy above and pick a different version), and what you mostly care about is the Object model reference. You can also find the same documentation (often links to the exact same webpages) in Excel's built-in help. For example, that's where you find out that the Application object has a Workbooks property, which is a Workbooks object, which has Open and Add methods that return a Workbook object, which has a SaveAs method, which takes an optional FileFormat parameter of type XlFileFormat, which has a value xlExcel8 = 56.
As I implied earlier, you may not be able to access enumeration values like xlExcel8 for some reason which I no longer remember, but you can look the value up on MSDN (or just Google it) and put the number 56 instead.
The other documentation (both here and elsewhere within MSDN) is usually either stuff you can guess yourself, or stuff that isn't relevant from win32com. Unfortunately, the already-sparse win32com documentation expects you to have read that documentation—but fortunately, the examples are enough to muddle your way through almost everything but the object model.
1b: Automating Excel via its GUI.
Automating a GUI on Windows is a huge pain, but there are a number of tools that make it a whole lot easier, such as pywinauto. You may be able to just use swapy to write the pywinauto script for you.
If you don't need to do it in Python, separate scripting systems like AutoIt have an even larger user base and even more examples to make your life easier.
2: Doing it all in Python.
xlutils, part of python-excel, may be able to do what you want, without touching Excel at all.

What's a good document standard to use programmatically?

I'm writing a program that requires input in the form of a document, it needs to replace a few values, insert a table, and convert it to PDF. It's written in Python + Qt (PyQt). Is there any well known document standard which can be easily used programmatically? It must be cross platform, and preferably open.
I have looked into Microsoft Doc and Docx, which are binary formats and I can't edit them. Python has bindings for it, but they're only on Windows.
Open Office's ODT/ODF is zipped in an xml file, so I can edit that one but there's no command line utilities or any way to programmatically convert the file to a PDF. Open Office provides bindings, but you need to run Open Office from the command line, start a server, etc. And my clients may not have Open Office installed.
RTF is readable from Python, but I couldn't find any way/libraries to convert RTF documents to PDF.
At the moment I'm exporting from Microsoft Word to HTML, replacing the values and using PyQt to convert it to a PDF. However it loses formatting features and looks awful. I'm surprised there isn't a well known library which lets you edit a variety of document formats and convert them into other formats, am I missing something?
Update: Thanks for the advice, I'll have a look at using Latex.
Thanks,
Jackson
Have you looked into using LaTeX documents?
They are perfect to use programatically (compiling documents? You gotta love that...), and you have several Python frameworks you can use such as plasTeX and PyTex.
Exporting a LaTeX documents to PDF is almost immediate.
Since you're already using PyQt anyway, it might be worth looking at Qt's built-in RTF processing module which looks decent. Here's the documentation on detailed content manipulation including inserting tables. Also the QPrinter module's default print-to-file format happens to be PDF.
Without knowing more about your particular needs it's hard to say if these would do what you want, but since your application already has PyQt as a dependency, seems silly to introduce any more without evaluating the functionality you've already got available.
The non-GUI parts of the Qt framework are often overlooked though.
edit: included more links.
You might want to try ReportLab. The open source version can write PDFs, and the commercial version has a lot of really nice abstractions to allow output to a variety of different formats from a single input.
I don't know the kind of odience of your program, Tex is good and i would go with it.
Another possible choice is Excel format, parsing it with xlrd.
I've used it a couple of time and it's pretty straightforward.
Excel file is a good for the following reasons:
Well known format easy to edit
You could prepare a predefined template with constrains and table
Creating XML documents, transforming them to XSL/fo and rendering with Fop or RenderX. If you use docbook as the primary input, there are toolchains freely available for converting that to PDF, RTF, HTML and so forth.
It is rather quirky to use and not my idea of fun, but is does deliver and can be embedded in an application, AFAICT.
Creating docbook is very straightforward as it has a wide range of semantic tags, table support etc to give a "meaningful" markup which can be reliably formatted. The XSL stylesheets are modular and allow parts to be customized or replaced to generate your own look and feel.
It works well for relatively free flow documents with lots of text.
For filling in the blanks kind of documents, a regular reporting engine may be a better fit, or some straighforward XSL stylesheets spitting out the XSL-fo directly.

Categories

Resources