Creating of HDF5 file of SPSS file for and by Python

Creating of HDF5 file of SPSS file for and by Python - python

What is the easiest way to create a HDF5-file of an SPSS-file by Python?

If you haven't already stumbled upon it, check out h5py. Iterating over SPSS's data with SPSS's addon python module, and placing the data in an h5py object should be all you need to do.

The Python code at this link
http://code.activestate.com/recipes/577811-python-reader-writer-for-spss-sav-files-linux-mac-/
reads and write sav files. It uses the free i/o modules produced by IBM SPSS.

Related

Do I need to have Excel installed to use Openpyxl "interactively"? Can it interact with .ODS format?

I have currently written a Python script that uses UNO to interact with a LibreOffice .ods document. However, it is very unstable and is causing a lot of system errors and crashing on Ubuntu 17.04.
I read through the openpyxl docs but I can't find the answer to this quick question: Can I use openpyxl on Ubuntu to dynamically manipulate a .XLSX or a .ODS document with embedded formulas?
I.e. I want to open the document, read data from the cells, update the cell values in a loop, read the new output data from the document's own formulas, and then load this output into a numpy array and close the document without saving it.
Does Excel/libreoffice have to be installed for openpyxl to use the embedded formulas in this way?
Thank you

OpenPyXL doesn't need and cannot use a running instance of Excel. It doesn't interact with .ods at all. The closest thing to OpenPyXL for .ods is pyexcel-ods. The only practical way to evaluate Excel formulas is with a running instance of Excel. Likewise, the only practical way to evaluate LibreOffice formulas is with a running instance of LibreOffice.
So if you want to use OpenPyXL or pyexcel-ods "interactively", you would have to not use Excel/LibreOffice formulas but instead do all the calculations with Python.
If you need the formulas and you need to run it in Linux, then using UNO in some form (to control a running instance of LibreOffice) is probably still your best bet. Note that there are a few different Python interfaces to UNO, such as PyOO and UnoTools. I don't know what you're using now.
Finally, if the UNO-based methods are really too unstable, and you do have access to Excel, then you could try xlwings. It would be fairly convoluted to get this to run in Linux (it might require a virtual machine; testimonials for Excel under Wine have been mixed at best).

Manipulate Excel from Python

I would like to manipulate an Excel spreadsheet from Python, e.g. open it, and execute an embedded VBA macro, or some set of actions on the data inside. I have a choice of whether I would like to work on Windows or Linux for this.
There are packages ways to do this (xlwings package from this answer) but I am looking for something more native to explore what options are there.
I think one way to do this is the COM interface on Windows, which xlwings itself wraps. I found win32com python module, but seem to have very little documentation on manipulating particular objects, and what methods and attributes are available. Can someone please point me to a good information source (books are also ok)?
Alternatively, are there other options to do this, and if so, where can I read up on them?

I have gotten around this by using the Auto_Open() function in VBA, which runs when the Excel file opens.
Public Sub Auto_Open()
''' Run other macro here
End Sub
You could then open the Excel file using subprocess.popen. This is a bit of a hack, but avoids using win32com. I had to take this route because win32com wouldn't work on our locked down computers.

Beginner Python: Saving an excel file while it is open

I have a simple problem that I hope will have a simple solution.
I am writing python(2.7) code using the xlwt package to write excel files. The program takes data and writes it out to a file that is being saved constantly. The problem is that whenever I have the file open to check the data and python tries to save the file the program crashes.
Is there any way to make python save the file when I have it open for reading?

My experience is that sashkello is correct, Excel locks the file. Even OpenOffice/LibreOffice do this. They lock the file on disk and create a temp version as a working copy. ANY program trying to access the open file will be denied by the OS. The reason for this is because many corporations treat Excel files as databases but the users have no understanding of the issues involved in concurrency and synchronisation.
I am on linux and I get this behaviour (at least when the file is on a SAMBA share). Look in the same directory as your file, if a file called .~lock.[filename]# exists then you will be unable to read your file from another program. I'm not sure what enforces this lock but I suspect it's an NTFS attribute. Note that even a simple cp or cat fails: cp: error reading ‘CATALOGUE.ods’: Input/output error
UPDATE: The actual locking mechanism appears to be 'oplocks`, a concept connected to Windows shares: http://oreilly.com/openbook/samba/book/ch05_05.html . If the share is managed by Samba the workaround is to disable locks on certain file types, eg:
veto oplock files = /*.xlsx/
If you aren't using a share or NTFS on linux then I guess you should be able to RW the file as long as your script has write permissions. By default only the user who created the file has write access.
WORKAROUND 2: The restriction only seems to apply if you have the file open in Excel/LO as writable, however LO at least allows you to open a file as read-only (Go to File -> Properties -> Security, set Read-Only, Save and re-open the file). I don't know if this will also make it RO for xlwt though.

Hah, funny I ran across your post. I actually just implemented this tonight.
The issue is that Excel files write, and that's it, not both. You cannot read/write off the same object. So if you have another method to save data please do. I'm in a position where I don't have an option.. and so might you.
You're going to need xlutils it's the bread and butter to this.
Here's some example code:
from xlutils.copy import copy
wb_filename = 'example.xls'
wb_object = xlrd.open_workbook(wb_filename)
# And then you can read this file to your hearts galore.
# Now when it comes to writing to this, you need to copy the object and work off that.
write_object = copy(wb_object)
# Write to it all you want and then save that object.
And that's it, now if you read the object, write to it, and read the original one again it won't be updated. You either need to recreate wb_object or you need to create some sort of table in memory that you can keep track of while working through it.

How to open Excel instance in python on MAC?

I think this question has been asked before but it's not clear, in the original question the user has provided excel.exe which is a windows executable extension and not for mac.
I need to open new Excel instance in Python on MAC.
which module should I import?
I'm a newbie I have completed learning python language, but have trouble understanding documentation.

If all you need to do is launch Excel, the best way to do it is to use LaunchServices to do it.
If you have PyObjC (which you do if you're using the Python that Apple pre-installs on 10.6 and later; otherwise, you may have to install it):
import Foundation
ws = Foundation.NSWorkspace.sharedWorkspace()
ws.launchApplication_('Microsoft Excel')
If not, you can always use the open tool:
import subprocess
subprocess.check_call(['open', '-a', 'Microsoft Excel'])
Either way, you're effectively launching Excel the same way as if the user double-clicked the app icon in Finder.
If you want to make Excel do something simple like open a specific document, that's not much harder. Look at the NSWorkspace or open documentation to see how to do whatever you want.
If you actually want to control Excel—e.g., open a document, make some changes, and save it—you'll want to use its AppleScript interface.
Apple's recommended way of doing that is via ScriptingBridge, or using a dual-language approach (write AppleScripts and execute them via NSAppleScript—which, in Python, you do through PyObjC). However, I'd probably use appscript (get the code from here). Despite the fact that it's been abandoned by its original creator, and is only being sparsely maintained, and will probably eventually stop working with some future OS X version, it's still much better than the official solutions.
Here's a sample (untested, because I don't have Excel here):
import appscript
excel = appscript.app('Microsoft Excel')
excel.workbooks[1].column[2].row[2].formula.set('=A2+1')

From the comments it is not completely clear if you need to 'update' an Excel file with data, and just assume that you need Excel to do so, or that you need to change some excel files to include new data.
It is usually much easier, and certainly faster (wrt excution speed) to go with 'updating' an Excel file without starting Excel. However updating is not the right word: you have to read in the file and write it out new. You can of course overwrite the orginal file, so it looks like an update.
For 'updating' you can use the trio xlrd, xlwt, xlutils if the files you work with are .xls files (Excel 2003). IIRC xlwt does not support .xlsx for writing (but xlrd can read those files).
For .xlsx files I use openpyxl,
Both are good enough for writing things like data, formula and basic formatting.
If you have existing Excel files which you use as 'templates' with information that would get lost if you read/write using one of the above packages, then you have to go with updating the file in Excel. I had to do so because I had no easy way to include Visual Basic macros and very specific formatting specified by a client. And sometimes it is just easier to visually setup a spreadsheet and then just fill the cells programmatically. But this was all done on Windows.
If you really have to drive Excel on Mac, because you need to use existing files as templates, I suggest you look at Applescript. Or, if it is an option, look at OpenOffice/LibreOffice PyUno interface.

Creating excel chart from text file

I am running a simulation that saves some result in a file from which I take the data I need and save it in
result.dat
like so:
SAN.statisticsCollector.Network Response Time
Min: 0.210169
Max: 8781.55
average: 346.966666667
I do all this using python, and it was easy to convert result.dat into an excel file using xlwt. The problem is that creating charts using xlwt is not possible. I than came across Jpype, but installation on my ubuntu 12.04 machine was a headache. I'm probably just being lazy but still - is there any other way, not necessarily python-related, to convert result.dat into an excel file with charts?
Thanks
P.s the file I want to create is a spreadsheet, not Microsoft's Excel!

there is a now a new possibility: http://pythonhosted.org/openpyxl/charts.html
and http://xlsxwriter.readthedocs.org/chart.html

The main problem is that currently there's no Python library that implements MS Excel chart creation, and, obviously, they will not appear due to lack of good chart format documentation (as python-excel.org guys told) and its huge complecity.
There are two other options though:
Another option is to use 3-rd party tools (like JPype that you've mentioned) combining them with Python scripts. As far as I know, except Java smartXML there's no libraries that are capable of creating excel charts (or course, there are ones for .NET, e.g. expertXLS) and I'm not sure it will run on Mono + IronPython, though you can try.
The third option is Win32 COM API, e.g. like described in this SO post, which is not quite an option for you due to your working operating system.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.