I have a Python script (gui.py - a calculation program, written by somebody else). I try to run this from excel with input data from the same excel file. It works fine when I start the script as:
objShell.Run PythonExe & PythonScript3
In the python script I used the following to get f.ex data from H5 cell, works also fine:
import openpyxl
wb = openpyxl.load_workbook("01.xlsm")
ws = wb.active
mastNumber = ws['H5']
But I don't want to edit a lot in gui.py (have just simple changes) so I planned to use a 2nd script (caller.py) which gets the data from Excel, then I import this to gui.py and there I just use the variable from caller.py. It works also as long as I start gui.py directly. When I start it from Excel I get an error msg.
error msg
So as long as the flow is not Excel -> gui-py -> which imports caller to get data from same Excel file -everything works fine.
I am open to any solution for this problem or a completely new approach if there is better for somebody with limited programming skills.
Looking at the error, can you try putting complete path of the excel file at line wb = openpyxl.load_workbook("01.xlsm") instead of just 01.xlsm.
I'm trying to create a Python script (I'm using Python 3.7.3 with UTF-8 encoding on Windows 10 64-bit with Microsoft Office 365) that exports user selected worksheets to PDF, after the user has selected the Excel-files.
The Excel-files contain a lot of different settings for page setup and each worksheet in each Excel-file has a different page setup.
The task is therefore that I need to read all current variables regarding page setup to be able to assign them to the related variables for export.
The problem is when I'm trying to get Excel to return the current print area of the worksheet, which I can't figure out.
As far as I understand I need to be able to read the current print area, to be able to set it for the export.
The Excel-files are a mixture of ".xlxs" and ".xlsm".
I've tried using all kind of different methods from the Excel VBA documentation, but nothing has worked so far e.g. by adding ".Range" and ".Address" etc.
I've also tried the ".UsedRange", but there is no significant difference in the cells that I can search for and I can't format them in a specific way so I can't use this.
I've also tried using the "IgnorePrintAreas = False" variable in the "ExportAsFixedFormat"-function, but that didn't work either.
#This is some of the script.
#I've left out irrelevant parts (dialogboxes etc.) just to make it shorter
#Import pywin32 and open Excel and selected workbook.
import win32com.client as win32
excel = win32.gencache.EnsureDispatch("Excel.Application")
excel.Visible = False
wb = excel.Workbooks.Open(wb_path)
#Select the 1st worksheet in the workbook
#This is just used for testing
wb.Sheets([1]).Select()
#This is the line I can't get to work
ps_prar = wb.ActiveSheet.PageSetup.PrintArea
#This is just used to test if I get the print area
print(ps_prar)
#This is exporting the selected worksheet to PDF
wb.Sheets([1]).Select()
wb.ActiveSheet.ExportAsFixedFormat(0, pdf_path, Quality = 0, IncludeDocProperties = True, IgnorePrintAreas = False, OpenAfterPublish = True)
#This closes the workbook and the Excel-file (although Excel sometimes still exists in Task Manager
wb.Close()
wb = None
excel.Quit()
excel = None
If I leave the code as above and try and open a test Excel-file (.xlxs) with a small PrintArea (A1:H8) the print function just gives me a blank line.
If I add something to .PrintArea (as mentioned above) I get 1 of 2 errors:
"TypeError: 'str' object is not callable".
or
"ps_prar = wb.ActiveSheet.PageSetup.PrintArea.Range
AttributeError: 'str' object has no attribute 'Range'"
I'm hoping someone can help me in this matter - thanks, in advance.
try
wb = excel.Workbooks.OpenXML(wb_path)
insead of
wb = excel.Workbooks.Open(wb_path)
My problem was with a german version of ms-office. It works now. Check here https://social.msdn.microsoft.com/Forums/de-DE/3dce9f06-2262-4e22-a8ff-5c0d83166e73/excel-api-interne-namen?forum=officede
I wrote a Python script to prompt users for a spreadsheet to rename photos automatically based on the column called "Rename_Code" in the spreadsheet. The column is extracted and saved as a batch file, called "Rename_Code.bat". Below is the script:
import pandas as pd from tkinter.filedialog
import askopenfilename
#prompt user to browser excel sheet and load the spreadsheet in python
xl = pd.ExcelFile(askopenfilename() , index_col=None)
# load a sheet into a dataframe called df1
df1 = xl.parse('Sheet1')
# assign "Rename Code" column as rename_col
rename_col = df1['Rename_Code']
# extract the column and write as a batch file
rename_col.to_csv('Rename_Code.bat', index=False)
from subprocess import Popen p = Popen("Rename_Code.bat", cwd=r"C:\Users\username\Documents\XXXXX") stdout, stderr = p.communicate()
What I am trying to do is convert this script (in .pyw format) into an .exe so that anyone can use it without having Python installed in their computers. I used py2exe to convert the script. However, it kicked an error when I double clicked on the .exe. It says "Failed to launch application".
I am wondering if there is something wrong with the script. Alternatively, is there a better way to execute this, perhaps with Window Services? If yes, how? I am a novice programmer, most of my learning in programming involves a lot of trial and error and research.
I truly appreciate your feedback and help!
I'm on a Debian GNU/Linux computer, working with Python 2.7.9.
As a part of my job, I have been making python scripts that read inputs in various formats (e.g. Excel, Csv, Txt) and parse the information to more standarized files. It's not my first time opening or working with Excel files.
There's a particular file which is giving me problems, I just can't open it. When I tried with xlrd (version 0.9.3), it gave me the following error:
xlrd.open_workbook('sample.xls')
XLRDError: Unsupported format, or corrupt file: BOF not
workbook/worksheet: op=0x0009 vers=0x0002 strm=0x000a build=0 year=0
-> BIFF21
I tried to investigate the matter on my own, found a couple of answers in StackOverflow but I couldn't open it anyway. This particular answer I found may be the problem (the second explanation), but it doesn't include a workaround: https://stackoverflow.com/a/16518707/4345659
A tool that could conert the file to csv/txt would also solve the problem.
I already tried with:
xlrd
openpyxl
xlsx2csv (the shell tool)
A sample file is available here:
https://ufile.io/r4m6j
As a side note, I can open it with LibreOffice Calc and MS Excel, so I could eventually change it to csv that way. The thing is, I need to do it all with a python script.
Thanks in advance!
It seems like MS Problem. The xls file is very strange, maybe you should contact xlrd support.
But I have a crazy workaround for you: xls2ods. It works for me even though xls2csv doesn't (SiC!).
So, install catdoc first:
$sudo apt-get install catdoc
Then convert your xls file to ods and open ods using pyexcel_ods or whatever you prefer. To use pyexcel_ods install it first using pip install pyexcel_ods.
import subprocess
from pyexcel_ods import get_data
file_basename = 'sample'
returncode = subprocess.call(['xls2ods', '{}.xls'.format(file_basename)])
if returnecode > 0:
# consider to use subprocess.Popen if you need more control on stderr
exit(returncode)
data = get_data('{}.ods'.format(file_basename))
print(data)
I'm getting following output:
OrderedDict([(u'sample',
[[u'labo',
u'codfarm',
u'farmacia',
u'direccion',
u'localidad',
u'nom_medico',
u'matricula',
u'troquel',
u'producto',
u'cant_total']])])
Here is a kludge I would use:
Assuming you have LibreOffice on Debian, you could either convert all your *.xls files into *.csv using:
import os
os.system("libreoffice --headless --convert-to csv *.xls")
#or use os.call
... and then work consistently with csv.
Or you could convert only the corrupted file(s) when needed using a try/except block:
import os
try:
xlrd.open_workbook('sample.xls')
except XLRDError:
os.system("libreoffice --headless --convert-to csv sample.xls")
# mycsv = open("sample.csv", "r")
# for line in mycsv.readlines():
# ...
# ...
OBS: Keep LibreOffice closed while running the script.
Alternatively there are other tools out there to do the conversion. Here is one (which I have not tested): https://github.com/dilshod/xlsx2csv
If you are targeting windows, if you have Excel installed, and if you are familiar with Excel VBA, you will have a quick solution using the comtypes package:
http://pythonhosted.org/comtypes/
You will have direct access to Excel by its COM interfaces.
This code open an xls file and saves it as a cvs file, using the comtypes package:
import comtypes.client as cl
progId = "Excel.Application.15"
xl = cl.CreateObject(progId)
wb = xl.Workbooks.Open(r"C:\Users\aUser\Desktop\thermoList.xls")
wb.SaveAs(r"C:\Users\aUser\Desktop\thermoList.csv",FileFormat=6)
xl.DisplayAlerts = False
xl.Quit()
I could not test it with "sample.xls" which is corrupt.
Your could try with another file.
You might need to adjust the progId according to your version of Excel.
It's a file format issue. I'm not sure what file type is it but it's not Excel. I just open and saved the file with sample2.xls name and compare the types:
How are you creating this file?
If you need to get the words as a list of strings:
text_file = open("sample.xls", "r")
lines = text_file.read().replace(chr(200), '').replace(chr(0), '').replace(chr(1), '').replace(chr(5), '').replace(chr(2), '').replace(chr(3), '').replace(chr(4), '').replace(chr(6), '').replace(chr(7), '').replace(chr(8), '').replace(chr(9), '').replace(chr(10), '').replace(chr(12), '').replace(chr(15), '').replace(chr(16), '').replace(chr(17), '').replace(chr(18), '').replace(chr(49), '').replace('Arial', '')
for line in lines.split(chr(128)):
print(line)
the output:
The file you provided is corrupted, so there is no way for other responders to test it and recommend a good solution. And exception you posted confirming that.
As a solution you can try to debug some things, please see some steps below:
You mentioned you tried the xlrd library. Try to check if your xlrd module is upto date by executing this:
Python 2.7.9
>>> import xlrd
>>> xlrd.__VERSION
update to the latest official version if needed
Try to open any other *.xls file and see if it works with Python version you're using and current library.
Check module documentation it's pretty good, and there are some different things described how to use this module on various platforms( Win vs. Linux)http://xlrd.readthedocs.io/en/latest/dates.html
You always can rich out to the community (there is still a chance that you might be getting into some weird state or bug) the link is here https://github.com/python-excel/xlrd/issues
Hope that helps.
Unable to open your Excel either. Just as yadayada said, I think it is the problem of data source. If you really want to figure out the reason, I suggest you ask questions about the excel instead of python.
It's always work for me with any xls or xlsx files:
def csv_from_excel(filename_xls, filename_csv):
wb = xlrd.open_workbook(filename_xls, encoding_override='YOUR_ENCODING_HERE (f.e. "cp1251"')
sh = wb.sheet_by_index(0)
your_csv_file = open(filename_csv, 'wb')
wr = unicodecsv.writer(your_csv_file)
for rownum in xrange(sh.nrows):
wr.writerow(sh.row_values(rownum))
your_csv_file.close()
So, i don't work directly with excel file before convert them to csv. Mb it will help you
Why do I receive this warning message every time I run my code? (below). Is it possible to get rid of it? If so, how do I do that?
My code:
from openpyxl import load_workbook
from openpyxl import Workbook
wb = load_workbook('NFL.xlsx', data_only = True)
ws = wb.active
sh = wb["Sheet1"]
ptsDiff = (sh['J127'].value)
print ptsDiff
The code works but I get this warning message:
Warning (from warnings module):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/openpyxl/reader/worksheet.py", line 320
warn(msg)
UserWarning: Unknown extension is not supported and will be removed
This error happens when openpyxl cannot understand/read an extension (source). Here is the list of built-in extensions openpyxl currently knows that is doesn't support:
Conditional Formatting
Data Validation
Sparkline Group
Slicer List
Protected Range
Ignored Error
Web Extension
Slicer List
Timeline Ref
Also see the Worksheet extension list specification.
Try to add single quotes to your data_only parameter like this:
wb = load_workbook('NFL.xlsx', data_only = **'True'**)
This works for me.
Using python 3.5 under Anaconda3, Excel 2016, Windows10 -- I had the same problem initially with an xlsx file. Tried to make it into a csv and did not work. What worked was: select the entire spreadsheet, copy on a Notepad, select the notepad text, paste in a new spreadsheet, save as xslx. It looks like any extra formatting would result in a warning.
It is already listed in the first answer what is wrong with it If you only want to get rid of the error that is given in red for some reason. You can go to the file location of the error and # the line where is says warn(msg) this will stop the error being displayed the code still works fine in my experience.I am not sure if this will work after compiled but this should work in the same machine.
PS:I had the same error and this is what I did because I though it could be confusing for the end user
PS:You can use a try and except error catcher too but this is quicker.