In my company, we use Linux in our development and production environments. However, we keep one machine running Windows and Excel because we use a third-party Excel add-in to pull financial market data onto that machine. The add-in provides functions (used just like built-in Excel functions) that fetch the data onto the local machine, and we then send the data to a MySQL database. We've also written some VBA scripts to automate the task, but we are still not satisfied with the result.
I'm considering using Python to do all of this, but before jumping in, I need to find a Python package that can do the following:
Can Python manipulate Excel (with its add-ins) and use its functions without opening Excel?
If Excel has to be open, how can I run the script automatically every day, or at specific times of day? (The market data needs to be fetched at specific times.)
Thanks for any suggestions.
You'll need the Python Win32 extensions - http://sourceforge.net/projects/pywin32/
(now migrated to GitHub: https://github.com/mhammond/pywin32)
Then you can use COM.
from win32com.client import Dispatch
excel = Dispatch('Excel.Application')
wb = excel.Workbooks.Open(r'c:\path\to\file.xlsx')
ws = wb.Sheets('My Sheet')
# do other stuff, just like VBA
wb.Close()
excel.Quit()
You can register your script with Windows Task Scheduler so that it runs at the times you need.
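As a rough sketch of the Task Scheduler step, the task can also be registered from Python by calling the Windows schtasks command-line tool. The task name, script path and start time below are hypothetical placeholders, and the helper build_schtasks_command is made up for illustration; it only builds the argument list, so nothing is registered until you run it yourself.

```python
import subprocess
import sys


def build_schtasks_command(task_name, script_path, start_time, python_exe=sys.executable):
    """Return the schtasks argument list that registers a daily task."""
    # /TR takes the whole command line as one string; both paths are quoted
    # so that spaces inside them survive.
    action = f'"{python_exe}" "{script_path}"'
    return [
        "schtasks", "/Create",
        "/SC", "DAILY",      # run once per day
        "/TN", task_name,    # name shown in Task Scheduler
        "/TR", action,       # command to execute
        "/ST", start_time,   # start time in 24-hour HH:MM format
        "/F",                # overwrite an existing task with the same name
    ]


# Hypothetical task name, script path and time; adjust to your setup.
cmd = build_schtasks_command(
    "FetchMarketData", r"C:\scripts\fetch_market_data.py", "08:30"
)
print(subprocess.list2cmdline(cmd))
# On the Windows machine, register the task with:
# subprocess.run(cmd, check=True)
```

If the data has to be fetched at several times per day, one simple option is to register one task per time with different task names.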
As an alternative, you might consider openpyxl.
import openpyxl
wb = openpyxl.Workbook()
# wb.active replaces the deprecated get_active_sheet()
ws = wb.active
ws.title = 'My Title'
wb.save('C:\\Development\\Python\\alpha.xlsx')
Here's a chapter from the book I'm working through.
https://automatetheboringstuff.com/chapter12/
Good luck.
I am regularly receiving a spreadsheet from an external source (via google docs) that I have to convert into a local (kinda proprietary) format. To do that, I have written a script that can convert the spreadsheet as an ODS file into the needed format (non-ODS).
This script needs to interact with a lot of higher-level business-specific PHP stuff, so I use PhpSpreadsheet for this purpose (https://github.com/PHPOffice/PhpSpreadsheet/).
This PHP library theoretically does everything I need, but it cannot deal with overly complex spreadsheets without taking a gigantic amount of time resolving all the cross-referencing formulas. To speed up the processing in the script, I prepare the ODS file by hand, converting all formulas to values (select all cells in the needed sheets, then trigger [Data] > [Calculate] > [Formula to Value]). Then I delete all the unneeded sheets (which otherwise only contain source data for the replaced formulas). The resulting file is a lot smaller and does not contain any formulas. The PHP script finishes within a few seconds with the simplified spreadsheet file, while it runs out of memory after a long while with the original spreadsheet file.
I now want to automate this process of converting all the formulas to values using a new Python script. (This needs to happen on a Linux server, so my best bet would be a headless LibreOffice controlled via a UNO socket from Python, correct?)
So far I have managed to connect to the libreoffice UNO socket and manipulate the cells via the old OpenOffice-API (https://www.openoffice.org/api/docs/common/ref/com/sun/star/sheet/module-ix.html).
My current big question is:
How do I apply the UI's "Formula to Value" functionality to all cells of a sheet at once via the UNO API in Python?
I have tried searching the old OpenOffice API documentation for this for a while, but so far I cannot find what I am looking for.
Currently the python script looks (in essence) like this:
import uno
localContext = uno.getComponentContext()
resolver = localContext.ServiceManager.createInstanceWithContext(
    "com.sun.star.bridge.UnoUrlResolver",
    localContext
)
context = resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
serviceManager = context.ServiceManager
desktop = serviceManager.createInstanceWithContext("com.sun.star.frame.Desktop", context)
# com.sun.star.lang.XComponent / com.sun.star.sheet.SpreadsheetDocument
document = desktop.getCurrentComponent()
# com.sun.star.sheet.XSpreadsheets / XNameAccess
sheets = document.getSheets()
# com.sun.star.sheet.XSpreadsheet
# https://www.openoffice.org/api/docs/common/ref/com/sun/star/sheet/XSpreadsheet.html
sheet = sheets.getByName('OneOfTheSheets')
#print(sheet.getCellRangeByName("A1:AP1000"))
# WAY TOO SLOW AND DESTRUCTIVE:
for row in range(0, 1000):
    for column in range(0, 42):
        cell = sheet.getCellByPosition(column, row)
        cell.setFormula(cell.getString())
Thank you for any help you can provide.
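One approach that is often suggested for this, offered here as a sketch and not as a confirmed answer from the thread: Calc cell ranges expose com.sun.star.sheet.XCellRangeData, whose getDataArray() returns the calculated cell contents and whose setDataArray() writes plain values. Reading a range and writing the same array back should therefore replace the formulas with their results in one call per range instead of one call per cell. The helper name freeze_formulas is made up for illustration, and how cells containing errors behave is not covered here.

```python
def freeze_formulas(cell_range):
    """Replace every formula in cell_range with its current result.

    cell_range is assumed to be a UNO cell range, for example the object
    returned by sheet.getCellRangeByName("A1:AP1000").
    """
    # getDataArray() returns the computed contents (numbers and strings)
    # as a tuple of row tuples; the formulas themselves are not included.
    data = cell_range.getDataArray()
    # Writing the same array back stores the results as constants,
    # overwriting the formulas in a single call.
    cell_range.setDataArray(data)
    return len(data)  # number of rows processed


# Usage with the connection from the script above (requires LibreOffice):
# cell_range = sheet.getCellRangeByName("A1:AP1000")
# freeze_formulas(cell_range)
```

Unlike the cell-by-cell loop, this keeps numbers as numbers, because the values are never converted to strings on the way.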
I am a new Python user and am trying to get an Excel spreadsheet to scroll automatically in a loop as a video test.
Using VBA, it seems that the SmallScroll method is an easy way to scroll Excel.
Example:
Worksheets("Sheet1").Activate
ActiveWindow.SmallScroll down:=3
I can create an Excel worksheet from Python (following a tutorial on using VBA from Python):
import win32com.client
ExcelApp = win32com.client.Dispatch("Excel.Application")
ExcelApp.visible = True
#This creates a new workbook
ExcelWorkbook = ExcelApp.Workbooks.Add()
# Add a new sheet
Excelwkrsht = ExcelWorkbook.Worksheets.Add()
However, if I try to call the method
scroll1 = Excelwkrsht.SmallScroll(3)
I get an error.
Any thoughts?
Thanks.
If you search for "automate tasks Python" you will find plenty of material.
If you are trying to automate tasks with Excel files, there is a module for that.
You can install the openpyxl module.
There is a full chapter here with a lot of useful information.
Hope that helps you!
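On the error in the question itself: in the Excel object model, SmallScroll belongs to the Window object (which is why the VBA example calls it on ActiveWindow), not to Worksheet. A minimal sketch under that assumption is shown below; the helper name scroll_window and its parameters are made up for illustration, and excel_app is expected to be the win32com Application object from the question.

```python
import time


def scroll_window(excel_app, steps, rows_per_step=3, delay=0.5):
    """Scroll Excel's active window down in small increments."""
    for _ in range(steps):
        # SmallScroll is a method of the Window object; its first
        # parameter (Down) is the number of rows to scroll.
        excel_app.ActiveWindow.SmallScroll(rows_per_step)
        time.sleep(delay)  # pause so the movement is visible


# Usage on a Windows machine with Excel and pywin32 installed:
# import win32com.client
# ExcelApp = win32com.client.Dispatch("Excel.Application")
# ExcelApp.Visible = True
# ExcelApp.Workbooks.Add()
# scroll_window(ExcelApp, steps=10)
```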
I'm currently implementing a tool to automate parts of my daily work. For this I need a Python tool that creates an Excel file (workbook) containing various information and encrypts the sheets of the file.
The first part which creates the file and fills it with the data works perfectly.
But the encryption doesn't work at all.
I'm using win32com, win32com.client and openpyxl. The workbook has two different sheets, named "1" and "2".
My Workbook:
import win32com.client
import os, sys, win32com, os.path, time
excel = win32com.client.Dispatch("Excel.Application")
excel.Visible = True
workbook = excel.Workbooks.Open(reading_path) ####this is the path where the file is stored
sheet = workbook.Worksheets(1)
So I searched through other topics and got the following:
import openpyxl
sheet.protection.set_password('test')
sheet.save(saving_path)
Unfortunately this doesn't work. My shell reports an AttributeError. In detail:
AttributeError: <unknown>.set_password
Does anyone know another way to encrypt just the sheets in Excel with Python?
Thanks a lot for your help!
It is not entirely clear what you mean by "encrypting the sheet" as the openpyxl code you refer to has nothing to do with encryption; see the warning in the documentation. Excel does support encryption of entire workbooks though, but that appears to be different from what you want.
In any case, your code fails because the sheet you get from win32com is a wildly different beast than what openpyxl expects. For example, sheet being based on COM requires an Excel process to run for manipulation to be possible, while openpyxl does not even require Excel to be available on the host machine.
Now in your particular case, you do not actually need openpyxl (although you might find that using it over win32com has plenty of benefits), and you could stay entirely within COM. As such, adding password protection is possible through Worksheet.Protect which in your case would boil down to simply running
sheet.Protect('test')
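Since the workbook in the question has two sheets, the same call can be applied in a loop. The following is only a sketch: the helper name protect_all_sheets is made up for illustration, and workbook is assumed to be the COM Workbook object returned by excel.Workbooks.Open(...) in the question.

```python
def protect_all_sheets(workbook, password):
    """Password-protect every worksheet of a COM Workbook and save it."""
    protected = []
    for sheet in workbook.Worksheets:
        # Worksheet.Protect takes the password as its first argument.
        sheet.Protect(password)
        protected.append(sheet.Name)
    # Save() writes the workbook back to the path it was opened from.
    workbook.Save()
    return protected


# Usage with the objects from the question (requires Excel and pywin32):
# workbook = excel.Workbooks.Open(reading_path)
# protect_all_sheets(workbook, 'test')
```

Keep in mind that this is sheet protection against editing, not encryption of the file contents.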
Is there a way to update a spreadsheet in real time while it is open in Excel? I have a workbook called Example.xlsx which is open in Excel and I have the following python code which tries to update cell B1 with the string 'ID':
import openpyxl
wb = openpyxl.load_workbook('Example.xlsx')
sheet = wb['Sheet']
sheet['B1'] = 'ID'
wb.save('Example.xlsx')
On running the script I get this error:
PermissionError: [Errno 13] Permission denied: 'Example.xlsx'
I know it's because the file is currently open in Excel, but I was wondering if there is another way or module I can use to update a sheet while it's open.
I have actually figured this out, and it's quite simple using xlwings. The following code opens an existing Excel file called Example.xlsx and updates it in real time; in this case it puts the value 45 in cell B2 as soon as you run the script.
import xlwings as xw
wb = xw.Book('Example.xlsx')
sht1 = wb.sheets['Sheet']
sht1.range('B2').value = 45
You've already worked out why you can't use openpyxl to write to the .xlsx file: it's locked while Excel has it open. You can't write to it directly, but you can use win32com to communicate with the copy of Excel that is running via its COM interface.
You can download win32com from https://github.com/mhammond/pywin32 .
Use it like this:
from win32com.client import Dispatch
xlApp = Dispatch("Excel.Application")
wb=xlApp.Workbooks.Item("MyExcelFile.xlsx")
ws=wb.Sheets("MyWorksheetName")
At this point, ws is a reference to a worksheet object that you can change. The objects you get back aren't Python objects but a thin Python wrapper around VBA objects that obey their own conventions, not Python's.
There is some useful if rather old Python-oriented documentation here: http://timgolden.me.uk/pywin32-docs/contents.html
There is full documentation for the object model here: https://msdn.microsoft.com/en-us/library/wss56bz7.aspx but bear in mind that it is addressed to VBA programmers.
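To connect this back to the question, writing 'ID' into cell B1 of the open workbook could look roughly like the sketch below. The helper name write_cell is made up for illustration; the workbook and sheet names are the ones from the question, and excel_app is assumed to be the object returned by Dispatch("Excel.Application").

```python
def write_cell(excel_app, workbook_name, sheet_name, address, value):
    """Set one cell in a workbook that is already open in Excel."""
    workbook = excel_app.Workbooks.Item(workbook_name)
    sheet = workbook.Sheets(sheet_name)
    # Range(...).Value follows the Excel object model, so the address is
    # an A1-style string and not a Python index.
    sheet.Range(address).Value = value
    # Read the value back so the caller can check what Excel stored.
    return sheet.Range(address).Value


# Usage on a Windows machine with Example.xlsx open in Excel:
# from win32com.client import Dispatch
# xlApp = Dispatch("Excel.Application")
# write_cell(xlApp, "Example.xlsx", "Sheet", "B1", "ID")
```

The change appears in the running Excel window immediately; the file on disk is only updated when the workbook is saved.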
If you want to stream real-time data into Excel from Python, you can use an RTD function. If you've ever used the Bloomberg add-in for accessing real-time market data in Excel, then you'll be familiar with RTD functions.
The easiest way to write an RTD function for Excel in Python is to use PyXLL. You can read how to do it in the docs here: https://www.pyxll.com/docs/userguide/rtd.html
There's also a blog post showing how to stream live tweets into Excel using Python here: https://www.pyxll.com/blog/a-real-time-twitter-feed-in-excel/
If you wanted to write an RTD server to run outside of Excel you have to register it as a COM server. The pywin32 package includes an example that shows how to do that, however it only works for Excel prior to 2007. For 2007 and later versions you will need this code https://github.com/pyxll/exceltypes to make that example work (see the modified example from pywin32 in exceltypes/demos in that repo).
You can't change an Excel file that's being used by another application because the file format does not support concurrent access.
When exporting a dataset from SPSS to Excel, is it possible to control the name of the worksheet the data is saved into? The "SAVE TRANSLATE OUTFILE" command does not allow for this. I have SPSS 21 with Python installed (although I am fairly new to Python).
Yes. See this page on the IBM website for details.
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
SAVE TRANSLATE
/TYPE=ODBC
/CONNECT='DSN=Excel Files;DBQ=C:\Daten\Temp\EmployeeDataExcelExport.xlsx;'
/ENCRYPTED
/MISSING=IGNORE
/REPLACE
/TABLE='EmployeeData'.
EDIT:
The syntax from the page on the IBM website does NOT work for me; however, the following does:
save translate
/connect="dsn=excel files;dbq=C:\Temp\EmployeeDataExcelExport.xls;driverid=790;maxbuffersize=2048;pagetimeout=5;"
/table="EmployeeData"
/type=odbc /map /replace.
SAVE TRANSLATE relies on ODBC drivers, which means that your Statistics and Office bitness has to match - 64-bit Statistics with 32-bit Office won't work. Otherwise, you can write to an Excel file with SAVE TRANSLATE and then use VBA automation via a Basic script in Statistics to rename the sheet. There is a Basic module available from the SPSS Community website that writes output tables to an Excel file that does some sheet renaming that you could adapt for your purposes.
You can find the module here
https://www.ibm.com/developerworks/community/files/app?lang=en#/file/8e0dfcb6-aa57-4639-a20e-1780010cfe83