Saving data from SPSS to Excel - custom sheet name - python

Is it possible, when exporting a dataset from SPSS to Excel, to control the name of the worksheet the data is saved into? The "SAVE TRANSLATE OUTFILE" command does not allow for this. I have SPSS 21 with Python installed (although I am fairly new to Python).

Yes. See this page on the IBM website for details.
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
SAVE TRANSLATE
/TYPE=ODBC
/CONNECT='DSN=Excel Files;DBQ=C:\Daten\Temp\EmployeeDataExcelExport.xlsx;'
/ENCRYPTED
/MISSING=IGNORE
/REPLACE
/TABLE='EmployeeData'.
EDIT:
The syntax provided in the link on the IBM website does NOT work for me; however, the syntax below does:
save translate
/connect="dsn=excel files;dbq=C:\Temp\EmployeeDataExcelExport.xls;driverid=790;maxbuffersize=2048;pagetimeout=5;"
/table="EmployeeData"
/type=odbc /map /replace.
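
Since Python is installed alongside Statistics, the working syntax above could also be generated from Python so that the sheet name becomes a parameter. The following is only a minimal sketch: the helper name `build_save_translate` is hypothetical, and it assumes the generated syntax is run with `spss.Submit` from the Statistics Python plug-in inside a BEGIN PROGRAM block.

```python
def build_save_translate(xls_path, sheet_name):
    """Return SAVE TRANSLATE syntax that writes the active dataset to
    xls_path via ODBC, using sheet_name as the table (worksheet) name.

    Hypothetical helper: it only fills two values into the syntax
    quoted in the answer above.
    """
    if '"' in xls_path or '"' in sheet_name:
        raise ValueError("double quotes are not supported in this sketch")
    connect = (
        "dsn=excel files;dbq={0};driverid=790;"
        "maxbuffersize=2048;pagetimeout=5;"
    ).format(xls_path)
    return (
        'save translate\n'
        '/connect="{0}"\n'
        '/table="{1}"\n'
        '/type=odbc /map /replace.'
    ).format(connect, sheet_name)


if __name__ == "__main__":
    syntax = build_save_translate(r"C:\Temp\EmployeeDataExcelExport.xls",
                                  "EmployeeData")
    print(syntax)
    # Inside SPSS (BEGIN PROGRAM PYTHON. ... END PROGRAM.) the syntax
    # could then be run with:
    #     import spss
    #     spss.Submit(syntax)
```

The same bitness caveat for the ODBC driver applies, because the generated command is unchanged apart from the path and the table name.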

SAVE TRANSLATE with TYPE=ODBC relies on ODBC drivers, which means that the bitness of Statistics and Office has to match: 64-bit Statistics with 32-bit Office won't work. Otherwise, you can write to an Excel file with SAVE TRANSLATE and then use VBA automation via a Basic script in Statistics to rename the sheet. A Basic module available from the SPSS Community website writes output tables to an Excel file and does some sheet renaming, which you could adapt for your purposes.
You can find the module here:
https://www.ibm.com/developerworks/community/files/app?lang=en#/file/8e0dfcb6-aa57-4639-a20e-1780010cfe83

Related

openpyxl corrupts spreadsheet if it contains a data source

I use openpyxl to interact with Excel files using Python 3.7. I open and save my .xlsx spreadsheets as follows:
from openpyxl import load_workbook
wb = load_workbook('file.xlsx', read_only=False)
wb.save('file.xlsx')
If file.xlsx contains no links to external data sources (such as SQL Server or PostgreSQL), then there is no problem with the saved file and it opens fine in Excel after being processed by my Python script.
However, if file.xlsx does contain a link to external data, then upon executing the above script, the output file is now corrupted. When opening the file in Excel, the following error is reported and I have the option of attempting to recover it. When recovering, the data remains but all links to the data source are gone.
> We found a problem with some content in file.xlsx. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes.
It is easy to reproduce this error as follows:
Create a blank spreadsheet and save it as file.xlsx.
Run the above three lines of Python code to open and save the file. You will see this works fine and has no impact on the spreadsheet.
Now open file.xlsx in Excel and, from the Data tab, choose a data source. You can choose any data source (link to a csv file, a table within Excel, or an external data source - it doesn't matter).
Save the spreadsheet, then run the above Python script (which again, simply opens and saves it).
Open file.xlsx in Excel. You will see that it is now corrupted.
My conclusion is that, at the moment, openpyxl doesn't support spreadsheets that contain links to external data. It would be useful to have this confirmed, or for a workaround to the above issue to be proposed.
Thanks!!
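
One way to guard against this, independent of openpyxl, is to check whether a workbook contains data connections before a script re-saves it. The sketch below relies on the fact that an .xlsx file is a ZIP archive. It assumes that data connections show up as parts named `xl/connections.xml`, `xl/queryTables/...` or `xl/externalLinks/...`, which is where Excel normally stores them. The function name `external_data_parts` is made up for illustration, and the check only detects such parts; it does not preserve them.

```python
import zipfile

# Part-name prefixes that (by assumption) indicate links to external data.
SUSPECT_PREFIXES = (
    "xl/connections.xml",
    "xl/queryTables/",
    "xl/externalLinks/",
)


def external_data_parts(xlsx):
    """Return the sorted names of parts in the workbook that look like
    data connections. `xlsx` is a path or a binary file-like object."""
    with zipfile.ZipFile(xlsx) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.startswith(SUSPECT_PREFIXES)
        )
```

A script could call `external_data_parts('file.xlsx')` first and skip the open-and-save step whenever the returned list is not empty.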

Convert a CSV/Excel file into an Excel file with formatting as table

Context:
I have a query that runs daily. I would like to email the query result data to stakeholders, but I need it pre-formatted for them, as they may not have the skills required to format the data as a table. I have SSRS, which is great for presenting the data, but I still want the file emailed to them for filtering. I can generate CSV or Excel from the query and automate emailing the output, but not with the table formatting already applied.
Problem:
I have a simple CSV/Excel file output from a query.
cust_id,cust_name
1,bishop
2,ripley
I want to convert this CSV/Excel output file to a file that is pre-formatted as a sortable, filterable table with headers, and automate this process.
[Image: Excel formatted table]
Is this possible from Python or some other server-friendly code snippet? Either CSV >> Excel Formatted or Excel >> Excel Formatted. Both starting file types are fine in this instance.
Limitation:
I cannot install, update or import packages that are not part of Python's standard library.
Python's xlsxwriter should have all the functionality you need.
Module page here: http://xlsxwriter.readthedocs.io/index.html
Example for table formatting here http://xlsxwriter.readthedocs.io/tutorial02.html
Example for adding auto filters here: http://xlsxwriter.readthedocs.io/example_autofilter.html
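
As a rough illustration of the table formatting mentioned above, the following sketch reads the sample CSV with the standard `csv` module and writes it out as an Excel table using xlsxwriter's `add_table`. It assumes xlsxwriter can be installed; the function names, the sheet name "Data" and the output file name are placeholders.

```python
import csv
import io


def read_csv_rows(csv_file):
    """Split an open CSV file into (header, data_rows)."""
    rows = list(csv.reader(csv_file))
    return rows[0], rows[1:]


def write_table(xlsx_path, header, data_rows):
    """Write the rows as a sortable, filterable Excel table."""
    import xlsxwriter  # third-party; imported here so the CSV part runs without it

    workbook = xlsxwriter.Workbook(xlsx_path)
    worksheet = workbook.add_worksheet("Data")
    # add_table takes first_row, first_col, last_row, last_col (zero-based);
    # row 0 holds the header, so the last row index equals len(data_rows).
    worksheet.add_table(
        0, 0, len(data_rows), len(header) - 1,
        {"data": data_rows, "columns": [{"header": name} for name in header]},
    )
    workbook.close()


if __name__ == "__main__":
    sample = io.StringIO("cust_id,cust_name\n1,bishop\n2,ripley\n")
    header, data_rows = read_csv_rows(sample)
    print(header, data_rows)
    # write_table("customers.xlsx", header, data_rows)  # needs xlsxwriter
```

Values read by `csv.reader` are strings, so numeric columns such as cust_id would be stored as text unless they are converted before writing.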
Upon reading the updated Limitation (can't install modules on the server), it's clear that using xlsxwriter without something like Anaconda is a no-go. I don't have a fallback solution: I don't know of any module that comes with the standard Python installation that does what you need.

how to connect to an external API using python?

I am trying to write a script that reads usernames from an Excel sheet in a loop, connects to a website's external API to get the corresponding user IDs, and writes the responses back to the Excel sheet. Please help me with example code.
I need help with two things:-
1:- How to read the elements of a particular column from an Excel sheet
2:- How to write code that feeds the usernames from the Excel sheet to the website's API in a loop and retrieves the user IDs
To read the information from the Excel sheet (saved as CSV), take a look at https://docs.python.org/2/library/csv.html
How to retrieve the user IDs depends on the API itself, so you would need to provide more information.
Additionally, you might want to look at this Python library for the Instagram API.
This site contains pointers to the best information available about working with Excel files in the Python programming language.
This site will show you how to use APIs with python.
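
The two steps asked about could be sketched roughly as follows, using only the standard library and assuming the sheet has been saved as CSV. Everything about the API here is hypothetical: the URL `https://api.example.com/users`, the `username` query parameter and the `{"id": ...}` JSON reply are placeholders that would have to be replaced with whatever the real API documents.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with the real API's URL and parameters.
API_URL = "https://api.example.com/users"


def read_column(csv_file, column):
    """Return the values of one named column from an open CSV file."""
    return [row[column] for row in csv.DictReader(csv_file)]


def fetch_user_id(username):
    """Ask the (hypothetical) API for a user's ID.
    Assumes a JSON reply of the form {"id": ...}."""
    url = API_URL + "?" + urllib.parse.urlencode({"username": username})
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))["id"]


def lookup_user_ids(usernames, fetch=fetch_user_id):
    """Map each username to its ID by calling fetch once per name."""
    return {name: fetch(name) for name in usernames}


def write_results(csv_file, ids_by_name):
    """Write username/user_id pairs as CSV, which Excel can open."""
    writer = csv.writer(csv_file)
    writer.writerow(["username", "user_id"])
    for name, user_id in ids_by_name.items():
        writer.writerow([name, user_id])
```

Passing `fetch` as a parameter keeps the loop independent of the API, so the real request logic (authentication, error handling, rate limits) can be swapped in once the API details are known.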

Generating correct excel xls format

I created a little script in python to generate an excel compatible xml file (saved with xls extension). The file is generated from a part database so I can place an order with the extracted data.
On the website for ordering the parts, you can import the excel file so the order fills automatically. The problem here is that each time I want to make an order, I have to open excel and save the file with xls extension of type MS Excel 97-2003 to get the import working.
The excel document then looks exactly the same, but when opened with notepad, we cannot see the xml anymore, only binary dump.
Is there a way to automate this process, by running a bat file or maybe adding some line to my python script so it is converted in the proper format?
(I know that question has been asked before, but it never has been answered)
There are two basic approaches to this.
You asked about the first: Automating Excel to open and save the file. There are in fact two ways to do that. The second is to use Python tools that can create the file directly in Python without Excel's help. So:
1a: Automating Excel through its automation interface.
Excel is designed to be controlled by external apps, through COM automation. Python has a great COM-automation interface inside of pywin32. Unfortunately, the documentation on pywin32 is not that great, and all of the documentation on Excel's COM automation interface is written for JScript, VB, .NET, or raw COM in C. Fortunately, there are a number of questions on this site about using win32com to drive Excel, such as this one, so you can probably figure it out yourself. It would look something like this:
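import win32com.client
from win32com.client import constants
# EnsureDispatch generates the type-library wrappers that populate `constants`
excel = win32com.client.gencache.EnsureDispatch('Excel.Application')
spreadsheet = excel.Workbooks.Open(r'C:\path\to\spreadsheet.xml')
spreadsheet.SaveAs(r'C:\path\to\spreadsheet.xls', FileFormat=constants.xlExcel8)
excel.Quit()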
That isn't tested in any way, because I don't have a Windows box with Excel handy. And I vaguely remember having problems getting access to the fileformat names from win32com and just punting and looking up the equivalent numbers (a quick google for "fileformat xlExcel8" shows that the numerical equivalent is 56, and confirms that's the right format for 97-2003 binary xls).
Of course if you don't need to do it in Python, MSDN is full of great examples in JScript, VBA, etc.
The documentation you need is all on MSDN (since the Office Developer Network for Excel was merged into MSDN, and then apparently became a 404 page). The top-level page for Excel is Welcome to the Excel 2013 developer reference (if you want a different version, click on "Office client development" in the navigation thingy above and pick a different version), and what you mostly care about is the Object model reference. You can also find the same documentation (often links to the exact same webpages) in Excel's built-in help. For example, that's where you find out that the Application object has a Workbooks property, which is a Workbooks object, which has Open and Add methods that return a Workbook object, which has a SaveAs method, which takes an optional FileFormat parameter of type XlFileFormat, which has a value xlExcel8 = 56.
As I implied earlier, you may not be able to access enumeration values like xlExcel8 for some reason which I no longer remember, but you can look the value up on MSDN (or just Google it) and put the number 56 instead.
The other documentation (both here and elsewhere within MSDN) is usually either stuff you can guess yourself, or stuff that isn't relevant from win32com. Unfortunately, the already-sparse win32com documentation expects you to have read that documentation—but fortunately, the examples are enough to muddle your way through almost everything but the object model.
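
To make the "just use 56" fallback concrete, here is a sketch of the conversion wrapped in a function that a batch file or the generating script could call. Like the snippet above it is untested and assumes Windows, Excel and pywin32; the helper names `xls_target` and `convert_to_xls` are invented for this example.

```python
from pathlib import PureWindowsPath

XL_EXCEL8 = 56  # numeric value of XlFileFormat.xlExcel8 (97-2003 .xls)


def xls_target(source_path):
    """Return source_path with its extension replaced by .xls."""
    return str(PureWindowsPath(source_path).with_suffix(".xls"))


def convert_to_xls(source_path):
    """Open source_path in Excel and re-save it as a 97-2003 workbook.
    Requires Windows, Excel and pywin32; untested sketch."""
    import win32com.client  # imported here so xls_target works without pywin32

    target = xls_target(source_path)
    excel = win32com.client.Dispatch("Excel.Application")
    excel.DisplayAlerts = False  # suppress "overwrite?" style prompts
    try:
        workbook = excel.Workbooks.Open(source_path)
        workbook.SaveAs(target, FileFormat=XL_EXCEL8)
        workbook.Close(False)  # False: do not save changes again
    finally:
        excel.Quit()  # always release the Excel instance
    return target
```

The `try`/`finally` is there so that a failed open or save does not leave an invisible Excel process running.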
1b: Automating Excel via its GUI.
Automating a GUI on Windows is a huge pain, but there are a number of tools that make it a whole lot easier, such as pywinauto. You may be able to just use swapy to write the pywinauto script for you.
If you don't need to do it in Python, separate scripting systems like AutoIt have an even larger user base and even more examples to make your life easier.
2: Doing it all in Python.
xlutils, part of python-excel, may be able to do what you want, without touching Excel at all.
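
If the goal is simply a genuine binary .xls file, one option within the python-excel family is to have the script write it directly with xlwt instead of producing XML and converting it afterwards. This is a swapped-in suggestion rather than something the xlutils pointer above spells out, and whether the ordering website accepts the result is an assumption that would need checking. The sample part data and function names below are made up.

```python
def iter_cells(rows):
    """Yield (row_index, col_index, value) for a list of row sequences."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            yield r, c, value


def write_xls(path, rows, sheet_name="Order"):
    """Write rows to a binary Excel 97-2003 file using xlwt."""
    import xlwt  # third-party (python-excel); imported lazily

    workbook = xlwt.Workbook()
    sheet = workbook.add_sheet(sheet_name)
    for r, c, value in iter_cells(rows):
        sheet.write(r, c, value)
    workbook.save(path)


if __name__ == "__main__":
    # Made-up sample data standing in for the parts database export.
    order = [["part_number", "quantity"], ["ABC-123", 4], ["XYZ-789", 10]]
    print(list(iter_cells(order))[:3])
    # write_xls("order.xls", order)  # needs xlwt installed
```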

IronPython Excel RTDServer

How do you build an Excel RTDServer in Python/IronPython? (I.e., I want to implement the IRTDServer interface in Python/IronPython so I can push data into Excel in real time from Python/IronPython.)
I have looked all over the place but the only examples that I'm able to find are in C#.
You can use pyrtd, which implements an Excel RTD server.
You can find the code on the project's Google Code page.
