Is it possible, from SPSS (using Python), to write the variable list and variable labels into a newly created Excel file?
Yes, look up DISPLAY DICTIONARY and/or CODEBOOK. You would then export these outputs from SPSS's output viewer to Excel with the OUTPUT EXPORT command.
If you need something more customized, you can either capture the output via OMS, manipulate it as you please, and then export to Excel, or you can use the Python APIs directly to retrieve variable and value labels and then write the results to Excel (using any Python/Excel writing library of your choice, such as xlwt or xlsxwriter, to name a couple).
The latter requires much more programming knowledge whereas the former can all be done with native SPSS syntax.
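A minimal sketch of the Python route, assuming the code runs inside SPSS with the Python integration plug-in (which provides the spss module) and that openpyxl is installed. The helper names and the output path are made up for illustration.

```python
def collect_dictionary(count, get_name, get_label):
    """Return one (name, label) pair per variable index 0..count-1."""
    return [(get_name(i), get_label(i)) for i in range(count)]


def write_dictionary(rows, path):
    """Write the pairs to a new workbook with a header row."""
    from openpyxl import Workbook  # third-party, imported only when needed

    workbook = Workbook()
    sheet = workbook.active
    sheet.title = "Variables"
    sheet.append(["Variable", "Label"])
    for name, label in rows:
        sheet.append([name, label])
    workbook.save(path)


def export_from_spss(path):
    """Only works inside SPSS, where the spss module is importable."""
    import spss

    rows = collect_dictionary(
        spss.GetVariableCount(), spss.GetVariableName, spss.GetVariableLabel
    )
    write_dictionary(rows, path)
```

From a BEGIN PROGRAM block you would then call something like export_from_spss("dictionary.xlsx"), where the file name is a placeholder.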
I have done something similar (producing a customized data dictionary) taking the Python programming approach and found this module written by an unknown author very useful as a basis.
(This assumes you meant an automated way of achieving this; otherwise you could just copy and paste the columns of variable names and labels into Excel! Value labels can't be done similarly, though, for obvious reasons.)
You can also consult the discussion on
http://www.spssforum.com/viewtopic.php?f=12&t=12076
that also includes some Python code. It covers only variable labels (using the GetVariableLabel function), but value labels are an easy extension. How you write them depends a bit on the layout you want: on separate lines, or following the variable.
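You can also do it like this, followed by code using e.g. openpyxl:
from savReaderWriter import SavHeaderReader

filename = "data.sav"  # placeholder: path to your .sav file
with SavHeaderReader(filename) as header:
    report = str(header)     # plain-text report of the dictionary
    metadata = header.all()  # all header metadata in one object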
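The openpyxl part could look roughly like the sketch below. It assumes the metadata exposes a list of variable names, a dict of variable labels, and a dict of value labels per variable (I believe savReaderWriter calls these varNames, varLabels and valueLabels, but check your version), and that strings may come back as bytes under Python 3. The function names are made up.

```python
def _text(value):
    """Decode bytes to str; leave everything else unchanged."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def dictionary_rows(var_names, var_labels, value_labels):
    """Flatten metadata into (variable, label, value, value label) rows."""
    var_labels = {_text(k): _text(v) for k, v in var_labels.items()}
    value_labels = {_text(k): v for k, v in value_labels.items()}
    rows = []
    for raw_name in var_names:
        name = _text(raw_name)
        label = var_labels.get(name, "")
        codes = value_labels.get(name, {})
        if not codes:
            # Variable without value labels: one row, empty value columns.
            rows.append((name, label, None, None))
        for value, text in sorted(codes.items()):
            # One row per value label, repeated under the same variable.
            rows.append((name, label, _text(value), _text(text)))
    return rows


def write_rows(rows, path):
    """Write the flattened rows to a new .xlsx file."""
    from openpyxl import Workbook  # third-party, imported only when needed

    workbook = Workbook()
    sheet = workbook.active
    sheet.append(["Variable", "Label", "Value", "Value label"])
    for row in rows:
        sheet.append(list(row))
    workbook.save(path)
```

This puts value labels on separate lines under each variable; if you would rather have them following the variable on one line, join them into a single cell instead.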
I want to be able to open an xlsx file in Python and select a different dropdown value in a cell which should trigger an update for the entire spreadsheet based on the new value (just how it currently does so if I manually select a different value). How can I do this in Python and which library can help me?
TL;DR: You can't.
In order to get cascading execution, you need to access the Excel execution engine. Python libraries do not have a copy of this.
If you wish to change additional values in the spreadsheet, you will need to write your Python code to make the changes.
Caveat: Technically, there is a way to do it using pywin32 if you have a version of Excel installed. In this case Python is simply feeding instructions to Excel, no differently than if you were using VBA. It is significantly more complicated than changing a value using a library such as openpyxl.
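For the pywin32 route, a rough sketch might look like this. It assumes Windows with Excel installed and pywin32 available. The function name, sheet name and cell address are placeholders, and I have not tested this against a workbook with a data-validation dropdown.

```python
import os


def set_cell_and_recalculate(path, sheet_name, cell, value, out_path):
    """Change one cell through Excel itself so dependent cells update."""
    import win32com.client  # pywin32; needs Windows with Excel installed

    excel = win32com.client.Dispatch("Excel.Application")
    excel.Visible = False
    excel.DisplayAlerts = False
    try:
        # Excel resolves relative paths against its own working directory,
        # so pass absolute paths.
        workbook = excel.Workbooks.Open(os.path.abspath(path))
        sheet = workbook.Worksheets(sheet_name)
        sheet.Range(cell).Value = value   # e.g. the cell holding the dropdown
        excel.CalculateFull()             # ask Excel to recalculate everything
        workbook.SaveAs(os.path.abspath(out_path))
        workbook.Close(False)             # already saved; do not prompt
    finally:
        excel.Quit()
```

Here Excel does all the work; Python only tells it which cell to change and where to save.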
I created a .xlsx file using the Python module XlsxWriter.
It works well, but any cells that contain formulas show up as empty or 0 when I run
unoconv -f pdf spreadsheet.xlsx
How do I force unoconv to recalculate the values before converting to PDF?
As of this writing, apparently unoconv is purposely not doing any recalculation because it can cause errors. See Issue 97. The relevant part of unoconv's source code is commented out.
So one thing you could try is to go into your copy of unoconv and uncomment those lines (make them active again). You may get errors or you may be fine.
Another thing you could try, if your formulas are simple, is to calculate them using Python and write the results along with the formulas using XlsxWriter. Notice that the worksheet.write_formula() method has an optional value parameter. Quoting from the official docs:
XlsxWriter doesn’t calculate the result of a formula and instead stores the value 0 as the formula result. It then sets a global flag in the XLSX file to say that all formulas and functions should be recalculated when the file is opened. This is the method recommended in the Excel documentation and in general it works fine with spreadsheet applications. However, applications that don’t have a facility to calculate formulas, such as Excel Viewer, or some mobile applications will only display the 0 results.
If required, it is also possible to specify the calculated result of the formula using the optional value parameter. This is occasionally necessary when working with non-Excel applications that don’t calculate the result of the formula: [...]
Note that if the formulas are such that you can calculate them with Python, and you only care about the final PDF (no one is going to use the spreadsheet, and after you convert it to PDF you just throw it away anyway), then you could write the values directly, without the formulas at all.
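As a sketch of the optional value parameter mentioned above, assuming a simple column of numbers with a SUM underneath (the helper names and layout are invented for illustration):

```python
def sum_formula(first_row, last_row, col="A"):
    """Build a SUM formula over a column range, e.g. =SUM(A1:A3)."""
    return "=SUM({0}{1}:{0}{2})".format(col, first_row, last_row)


def write_with_cached_result(path, numbers):
    """Write numbers in column A and a SUM whose result is precomputed."""
    import xlsxwriter  # third-party, imported only when needed

    workbook = xlsxwriter.Workbook(path)
    worksheet = workbook.add_worksheet()
    for row, number in enumerate(numbers):
        worksheet.write_number(row, 0, number)
    # Arguments: row, col, formula, cell_format, value. The last one is the
    # result stored for applications that do not recalculate.
    worksheet.write_formula(
        len(numbers), 0, sum_formula(1, len(numbers)), None, sum(numbers)
    )
    workbook.close()
```

With the result stored in the file, a converter that skips recalculation should show the computed total instead of 0, as long as your Python calculation matches what the formula would give.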
I'll start off by saying that I'm new to Python. I'm trying to create an application that is a simple Q+A and will export the answers to specific cells of an Excel file. I have an existing spreadsheet that I would like to modify and save as a separate output file, leaving the original untouched. I've seen various ways to append to the file, but they overwrite the original.
As an example, I would like this code:
hq = input('Headquarters: ')
to put the response in cell S1.
Am I way off base trying to use Python for this task? Any help would be greatly appreciated!
-Paul
There may not be very straightforward solutions but there are a couple of tools which might help you.
The first one is openpyxl: https://openpyxl.readthedocs.org/en/2.0.2/# If you have xlsx files, you should be able to modify them with this.
You might also be able to do what you want by using the xlutils module: http://pythonhosted.org/xlutils/index.html However, you'll then need to first read the file, then edit it, and then save it to another file. Formatting may be lost, etc.
This is heavily YMMV due to the not-so-well-defined file format, but I'd start with openpyxl.
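With openpyxl, a minimal sketch could be the following. The file names and the save_answers helper are placeholders, and it assumes the answers belong on the workbook's active sheet.

```python
import os


def save_answers(template_path, output_path, answers):
    """Copy a workbook to a new file with the given cells filled in."""
    if os.path.abspath(template_path) == os.path.abspath(output_path):
        raise ValueError("output_path would overwrite the original file")

    from openpyxl import load_workbook  # third-party, imported when needed

    workbook = load_workbook(template_path)
    sheet = workbook.active
    for cell, value in answers.items():
        sheet[cell] = value          # e.g. sheet["S1"] = "Chicago"
    workbook.save(output_path)       # the original file is never written to


if __name__ == "__main__":
    hq = input("Headquarters: ")
    save_answers("original.xlsx", "filled.xlsx", {"S1": hq})
```

Saving under a different name is what leaves the original untouched; the workbook is only modified in memory until save is called.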
What solutions exist (and what are the pros/cons) for including free-format, rich text data in an Excel file (alongside the normal tabular data)?
This is the kind of data I'd like to include (in a separate worksheet):
Note that we're currently using openpyxl to generate Excel files from Python. We can use something other than Python/OpenPyXL if necessary, but keeping with Excel is a must (the accountants who use these reports won't use anything else).
The Excel specification makes it pretty tricky to do what you want. Currently, the smallest unit that you can apply styles to in openpyxl is a cell. As long as you can work with that restriction, you should be okay. Outlining is also supported.
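Working within the one-style-per-cell restriction, a sketch could put each paragraph in its own cell on a separate worksheet and style whole cells. The "# " heading convention, the function names and the column width are assumptions made for this example.

```python
def layout_notes(text):
    """Split free-form text into (row, paragraph, is_heading) tuples."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    rows = []
    for row, paragraph in enumerate(paragraphs, start=1):
        is_heading = paragraph.startswith("# ")
        body = paragraph[2:] if is_heading else paragraph
        rows.append((row, body, is_heading))
    return rows


def write_notes(workbook, text, title="Notes"):
    """Add a worksheet with one styled cell per paragraph."""
    from openpyxl.styles import Alignment, Font  # third-party

    sheet = workbook.create_sheet(title)
    sheet.column_dimensions["A"].width = 100
    for row, paragraph, is_heading in layout_notes(text):
        cell = sheet.cell(row=row, column=1, value=paragraph)
        # Styles apply to the whole cell, so headings get their own cell.
        cell.font = Font(bold=is_heading, size=14 if is_heading else 11)
        cell.alignment = Alignment(wrap_text=True, vertical="top")
    return sheet
```

Anything that needs mixed formatting inside one paragraph would have to be split across cells with this approach.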
I am very new to Python and want to make a script for a spreadsheet I use at work. Basically, I need to associate an address with multiple 5-character reference codes. There are multiple addresses, each with a corresponding group of reference codes.
i.e:
Address:
1234 E. 32nd Street,
New York, NY, 10001
Ref #'s
RL081
RL089
LA063
Address 2:
etc....
I need my script to look up a location by ref code. This information is then used to build a new spreadsheet (each row needs an address and the address is looked up using a ref code). What is the best way to use this info in python? Would it be a dictionary? Should I put the addresses / ref codes into an XML type file?
Thanks
Edit (clarification):
Basically, I have those addresses and corresponding ref codes (they could be in a plain text document, I could organize them in a spreadsheet, or whatever so python can use them). The script I'm building needs to use those ref codes to enter an address into a new spreadsheet. Basically, I input a half complete spreadsheet and the script fills in the addresses based on the ref code in each row.
Import into what?
If you have everything in a spreadsheet, Python has a very good CSV reader library. Once you've read it in, the challenge becomes what to do with it.
If you are looking at a medium-term solution, I'd recommend using SQLite to set up a simple database that can manage the information in a more structured way. SQLite scales well in the beginning stages of a project, and it becomes fairly trivial to migrate to a fully-fledged RDBMS like PostgreSQL or MySQL if that becomes necessary.
From there it becomes a case of writing the libraries you need to manipulate your data, and present it. In the initial stages this can be done using the command line but by using an SQL database this can be exposed through a webpage for multiple people down the line without worrying about managing data integrity.
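As a rough sketch of the SQLite idea, using only the standard library. The table and column names are made up, and the sample row is taken from the question.

```python
import sqlite3


def build_database(pairs, path=":memory:"):
    """Create a ref_code -> address table and load (code, address) pairs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS locations ("
        "ref_code TEXT PRIMARY KEY, address TEXT NOT NULL)"
    )
    conn.executemany("INSERT OR REPLACE INTO locations VALUES (?, ?)", pairs)
    conn.commit()
    return conn


def lookup_address(conn, ref_code):
    """Return the address for a ref code, or None if it is unknown."""
    row = conn.execute(
        "SELECT address FROM locations WHERE ref_code = ?", (ref_code,)
    ).fetchone()
    return row[0] if row else None


address = "1234 E. 32nd Street, New York, NY, 10001"
conn = build_database([(code, address) for code in ("RL081", "RL089", "LA063")])
found = lookup_address(conn, "RL089")
```

Passing a file name instead of ":memory:" would keep the data between runs.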
I prefer to use JSON over XML for storing data that will later be used in Python. The json module is fairly robust and easy to use. Since you will be performing lookups, I would definitely load the information as a Python dictionary. Since you'll be querying by ref code, you'll want to use those as keys and have the address as the value.
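For example, assuming you store the data as JSON grouped by address (a layout I am making up here, since it matches how the question lists it), you could invert it into a dictionary keyed by ref code:

```python
import json


def load_lookup(json_text):
    """Invert {address: [ref codes]} JSON into {ref code: address}."""
    by_address = json.loads(json_text)
    lookup = {}
    for address, codes in by_address.items():
        for code in codes:
            lookup[code] = address
    return lookup


sample = json.dumps(
    {"1234 E. 32nd Street, New York, NY, 10001": ["RL081", "RL089", "LA063"]}
)
lookup = load_lookup(sample)
```

After that, lookup["RL081"] gives the address directly, and lookup.get(code) returns None for a code that is not in the file.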
I need my script to look up a location by ref code
Since this is the only requirement you've stated, I would recommend using a dict where keys are ref codes and values are addresses.
I'm not sure why you are asking about "file types". It seems you already have all this information stored in a spreadsheet - no need to write a new file.
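To fill in the half-complete spreadsheet, one option is to export it as CSV and run each row through the dict. This is only a sketch: the column names ref_code and address are assumptions about your file, and unknown codes are left blank.

```python
import csv


def fill_addresses(in_file, out_file, lookup,
                   code_field="ref_code", address_field="address"):
    """Copy CSV rows, filling empty address cells from the lookup dict."""
    reader = csv.DictReader(in_file)
    fieldnames = list(reader.fieldnames)
    if address_field not in fieldnames:
        fieldnames.append(address_field)
    writer = csv.DictWriter(out_file, fieldnames=fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if not row.get(address_field):
            # Unknown ref codes leave the address empty rather than failing.
            row[address_field] = lookup.get(row[code_field], "")
        writer.writerow(row)
```

You would call it with two open files, for example open("in.csv", newline="") and open("out.csv", "w", newline=""), where both file names are placeholders.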