I want to convert .xls files to .xlsx, so I am using the win32com module.
This is my code:
import os
import win32com.client as win32
address = os.getcwd()
fname = address + "\\Bundles.xls"
fname2 = address + "\\searchresults.xls"
excel = win32.gencache.EnsureDispatch('Excel.Application')
excel2 = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(fname)
wb5 = excel.Workbooks.Open(fname2)
wb.SaveAs(fname+"x", FileFormat = 51)
wb5.SaveAs(fname2+"x", FileFormat = 51) #FileFormat = 51 is for .xlsx extension
wb.Close()
wb5.Close() #FileFormat = 56 is for .xls extension
excel.Application.Quit()
excel2.Application.Quit()
print('File .xls convert .xlsx successful!!')
Then I got an error. Here is the traceback:
Traceback (most recent call last):
File "c:/Users/shenshuaic/Desktop/SFP Program/win32test.py", line 7, in <module>
excel = win32.gencache.EnsureDispatch('Excel.Application')
File "C:\Users\shenshuaic\AppData\Roaming\Python\Python37\site-packages\win32com\client\gencache.py", line 527, in EnsureDispatch
disp = win32com.client.Dispatch(prog_id)
File "C:\Users\shenshuaic\AppData\Roaming\Python\Python37\site-packages\win32com\client\__init__.py", line 96, in Dispatch
return __WrapDispatch(dispatch, userName, resultCLSID, typeinfo, clsctx=clsctx)
File "C:\Users\shenshuaic\AppData\Roaming\Python\Python37\site-packages\win32com\client\__init__.py", line 37, in __WrapDispatch
klass = gencache.GetClassForCLSID(resultCLSID)
File "C:\Users\shenshuaic\AppData\Roaming\Python\Python37\site-packages\win32com\client\gencache.py", line 183, in GetClassForCLSID
mod = GetModuleForCLSID(clsid)
File "C:\Users\shenshuaic\AppData\Roaming\Python\Python37\site-packages\win32com\client\gencache.py", line 226, in GetModuleForCLSID
mod = GetModuleForTypelib(typelibCLSID, lcid, major, minor)
File "C:\Users\shenshuaic\AppData\Roaming\Python\Python37\site-packages\win32com\client\gencache.py", line 266, in GetModuleForTypelib
AddModuleToCache(typelibCLSID, lcid, major, minor)
File "C:\Users\shenshuaic\AppData\Roaming\Python\Python37\site-packages\win32com\client\gencache.py", line 552, in AddModuleToCache
dict = mod.CLSIDToClassMap
AttributeError: module 'win32com.gen_py.00020813-0000-0000-C000-000000000046x0x1x9' has no attribute 'CLSIDToClassMap'
It looks like you're running into a bug that happens when you use early binding with win32com. My recommendation is to use late binding if you can, as you won't receive the error. If you do need to use early binding, then you need to delete the auto-generated Python code from win32com. Go to this location on your system:
C:\Users\<USERNAME>\AppData\Local\Temp\gen_py
You'll see some folders in there, where each folder represents the version of Python that the code was generated for. For example, if you see 3.7, that means Python 3.7. Whichever one you see, go inside and you should see a bunch of Python files, where each file represents a different object that you've specified early binding for.
All you need to do is delete the 3.7 folder (or whatever it is called) and re-run your code. That fixes the issue 90% of the time.
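The manual cleanup described above could also be scripted. This is only a sketch: the helper name `clear_gen_py_cache` is hypothetical, and it assumes the cache lives in a `gen_py` folder under the user's temp directory, as in the path shown above.

```python
import os
import shutil
import tempfile


def clear_gen_py_cache(cache_dir=None):
    """Delete the win32com gen_py cache folder; return True if it was removed."""
    if cache_dir is None:
        # Assumption: the cache sits in <temp dir>/gen_py, as in the path above.
        cache_dir = os.path.join(tempfile.gettempdir(), "gen_py")
    if not os.path.isdir(cache_dir):
        return False  # nothing to clean up
    shutil.rmtree(cache_dir)  # removes every per-version subfolder (e.g. 3.7)
    return True
```

Calling `clear_gen_py_cache()` once before `EnsureDispatch` should let win32com regenerate the wrapper code on the next run.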
Now with your code, I would recommend you modify it a little since you really do not need to have two instances of Excel opened at the same time.
import os
import win32com.client as win32
address = os.getcwd()
file_name_1 = address + "\\Bundles.xls"
file_name_2 = address + "\\searchresults.xls"
# Drop the old .xls extension so the new files end in .xlsx
new_file_name_1 = os.path.splitext(file_name_1)[0] + "_converted.xlsx"
new_file_name_2 = os.path.splitext(file_name_2)[0] + "_converted.xlsx"
excel = win32.gencache.EnsureDispatch('Excel.Application')
file_1 = excel.Workbooks.Open(file_name_1)
file_2 = excel.Workbooks.Open(file_name_2)
file_1.SaveAs(new_file_name_1, FileFormat = 51)
file_2.SaveAs(new_file_name_2, FileFormat = 51)
file_1.Close()
file_2.Close()
excel.Application.Quit()
print('File .xls convert .xlsx successfully!!')
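If late binding is an option, the conversion might look like the sketch below. The function names `xlsx_path_for` and `convert_to_xlsx` are hypothetical; `win32com.client.dynamic.Dispatch` is the late-bound counterpart of `EnsureDispatch`, and the code assumes Windows with Excel and pywin32 installed.

```python
import os


def xlsx_path_for(xls_path):
    """Return the same path with its extension replaced by .xlsx."""
    root, _ext = os.path.splitext(xls_path)
    return root + ".xlsx"


def convert_to_xlsx(xls_paths):
    """Convert each .xls file using a single late-bound Excel instance."""
    # Imported here so this module still imports where pywin32 is missing.
    from win32com.client import dynamic

    excel = dynamic.Dispatch("Excel.Application")  # late binding, no gen_py code
    try:
        for path in xls_paths:
            source = os.path.abspath(path)  # Excel expects absolute paths
            workbook = excel.Workbooks.Open(source)
            workbook.SaveAs(xlsx_path_for(source), FileFormat=51)  # 51 = .xlsx
            workbook.Close()
    finally:
        excel.Quit()  # always shut Excel down, even after an error
```

A call such as `convert_to_xlsx(["Bundles.xls", "searchresults.xls"])` would then write `Bundles.xlsx` and `searchresults.xlsx` next to the originals.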
I am a bit lost. I'm trying to move a bunch of files to a new folder on an FTP server using Python. I have tried a lot of functions, but what seems to work best is the ftp.rename function. It works for moving a single file to a new folder, but it doesn't work for many files (like in my screenshot) in a for loop.
Do you know another technique to move files efficiently into a new folder?
Please help
This is the code to move a single file :
ftp = ftplib.FTP(host, username, password)
ftp.encoding = "utf-8"
FtpImage = ftp.mkd("image")
ftp.rename("img1.png", "/image/img1.png")
ftp.quit()
This is the code to move a bunch of files at the same time :
ftp = ftplib.FTP(host, username, password)
ftp.encoding = "utf-8"
#creating a list with all my files
dirList = ftp.nlst()
#creating a folder
ftpFolder = ftp.mkd("folder1")
#moving my file using their name and adding the folder1 to their name
for file in dirList:
    ftp.rename(file, "/folder1/" + file)
    # shutil.move(file, "/folder1/" + file )
ftp.quit()
Error that I get when I run the second program:
DeprecationWarning: The Tix Tk extension is unmaintained, and the tkinter.tix wrapper module is deprecated in favor of tkinter.ttk
from tkinter.tix import IMAGETEXT
Traceback (most recent call last):
File "\\wsl$\Ubuntu\home\q******\projet_python\FTP-sorting\test.py", line 26, in <module>
ftp.rename(file, "/folder1/")
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2288.0_x64__qbz5n2kfra8p0\lib\ftplib.py", line 604, in rename
return self.voidcmd('RNTO ' + toname)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2288.0_x64__qbz5n2kfra8p0\lib\ftplib.py", line 286, in voidcmd
return self.voidresp()
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2288.0_x64__qbz5n2kfra8p0\lib\ftplib.py", line 259, in voidresp
resp = self.getresp()
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2288.0_x64__qbz5n2kfra8p0\lib\ftplib.py", line 254, in getresp
raise error_perm(resp)
ftplib.error_perm: 550 Rename /folder1/: Device or resource busy
I was able to find a way to sort my different files:
I first had to split my dirList (the list with all the files) into new sub-lists (like allDivers), and then I used the following code:
for file in allDivers:
    destination_folder = "/divers/"
    destination = destination_folder + file
    ftp.rename(file, destination)
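The loop above can be wrapped in a small helper. Note that the traceback shows a rename whose target is the folder itself ("/folder1/"), which may be why the server refused it; the target should include the file name. This is a sketch: `move_files` is a hypothetical name, and it assumes every entry in `names` is a regular file rather than a directory.

```python
import posixpath


def move_files(ftp, names, destination_folder):
    """Rename each file into destination_folder; return the list of new paths."""
    moved = []
    for name in names:
        # The target must include the file name, not just the folder.
        target = posixpath.join(destination_folder, posixpath.basename(name))
        ftp.rename(name, target)
        moved.append(target)
    return moved
```

With an open `ftplib.FTP` connection, `move_files(ftp, allDivers, "/divers")` would do the same work as the loop.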
I have this code:
import xlrd
path = "C:\\Users\\m.macapanas\\Desktop\\OFCCP_Default_Values.xlsm"
excel_workbook = xlrd.open_workbook(path)
excel_worksheet = excel_workbook.sheet_by_index(0)
#Read from Excel Worksheet
print("Your Worksheet has " + str(excel_worksheet.ncols) + " columns")
print("Your Worksheet has " + str(excel_worksheet.nrows) + " rows")
for row in range(excel_worksheet.nrows):
    for col in range(excel_worksheet.ncols):
        print(excel_worksheet.cell_value(row, col), end='')
        print('\t', end='')
    print()
The result is this error:
Traceback (most recent call last):
File "C:/Users/m.macapanas/IdeaProjects/OFCCP Tool/Read Excel File with Python/Pandas.py", line 4, in <module>
excel_workbook = xlrd.open_workbook(path)
File "C:\Users\m.macapanas\AppData\Roaming\Python\Python36\site-packages\xlrd\__init__.py", line 141, in open_workbook
ragged_rows=ragged_rows,
File "C:\Users\m.macapanas\AppData\Roaming\Python\Python36\site-packages\xlrd\xlsx.py", line 808, in open_workbook_2007_xml
x12book.process_stream(zflo, 'Workbook')
File "C:\Users\m.macapanas\AppData\Roaming\Python\Python36\site-packages\xlrd\xlsx.py", line 265, in process_stream
meth(self, elem)
File "C:\Users\m.macapanas\AppData\Roaming\Python\Python36\site-packages\xlrd\xlsx.py", line 392, in do_sheet
sheet = Sheet(bk, position=None, name=name, number=sheetx)
File "C:\Users\m.macapanas\AppData\Roaming\Python\Python36\site-packages\xlrd\sheet.py", line 326, in __init__
self.extract_formulas = book.extract_formulas
AttributeError: 'Book' object has no attribute 'extract_formulas'
The xlrd documentation states in a warning:
This library will no longer read anything other than .xls files.
Your error is popping up when you attempt to open a workbook for the file "C:\\Users\\m.macapanas\\Desktop\\OFCCP_Default_Values.xlsm", which has a .xlsm extension.
The xlrd library explicitly doesn't support reading the newer file formats like .xlsm. So you'll either have to switch libraries or find a way to downgrade your input file to supported .xls format.
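One way to avoid handing a newer-format file to xlrd is to pick the reader from the file extension. The sketch below uses a hypothetical helper, `pick_excel_engine`; the returned strings match the engine names that `pandas.read_excel` accepts for its `engine` parameter.

```python
import os


def pick_excel_engine(path):
    """Return the reader library name suited to the file extension."""
    extension = os.path.splitext(path)[1].lower()
    if extension == ".xls":
        return "xlrd"      # legacy binary format
    if extension in (".xlsx", ".xlsm"):
        return "openpyxl"  # newer zip-based formats
    raise ValueError("Unsupported Excel extension: " + repr(extension))
```

For example, `pandas.read_excel(path, engine=pick_excel_engine(path))` would route an .xlsm file to openpyxl instead of xlrd.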
Issue
Analyze the error
line 4, in excel_workbook = xlrd.open_workbook(path)
Your script fails to open the workbook.
AttributeError: 'Book' object has no attribute 'extract_formulas'
The AttributeError states that it cannot find extract_formulas as an attribute of xlrd's Book object.
Caused by unsupported file format (.xlsx/.xlsm)
As Nathaniel Ford's answer explained:
xlrd (as of current version 2.0.1) only supports older Excel file-format .xls
See also
Pandas cannot open an Excel (.xlsx) file
Why is python xlrd errors when opening a .xlsm instead of .xls
Alternative solution
Research on Stackoverflow gave:
How can I open an Excel file in Python?
Working with Excel Files in Python is a great resources-collection which lists popular libraries.
Ported to OpenPyXL
There on top: openpyxl
The recommended package for reading and writing Excel 2010 files (ie: .xlsx)
After installing using:
pip install openpyxl
Your code might be ported to this library like:
from openpyxl import load_workbook
path = "C:\\Users\\m.macapanas\\Desktop\\OFCCP_Default_Values.xlsm"
excel_workbook = load_workbook(filename = path)
excel_worksheet = excel_workbook.worksheets[0]  # first worksheet
# Read from Excel Worksheet
# openpyxl exposes max_column / max_row instead of xlrd's ncols / nrows
print("Your Worksheet has " + str(excel_worksheet.max_column) + " columns")
print("Your Worksheet has " + str(excel_worksheet.max_row) + " rows")
# iter_rows(values_only=True) yields one tuple of cell values per row
for row in excel_worksheet.iter_rows(values_only=True):
    for value in row:
        print(value, end='')
        print('\t', end='')
    print()
I have been trying to get openpyxl working with PyCharm, but the Excel documents appear with a question mark, and when I try to run the code it raises FileNotFoundError.
import openpyxl as xl
wb = xl.load_workbook("transactions.xlsx")
print(wb)
I expect the output to be the cell values, but instead I get this:
Traceback (most recent call last):
File "C:/Users/nicol/.PyCharmCE2019.1/config/scratches/excel_work.py", line 3, in <module>
wb = xl.load_workbook("transactions.xlsx")
File "C:\Users\nicol\PycharmProjects\FirstProject\venv\lib\site-packages\openpyxl\reader\excel.py", line 311, in load_workbook
data_only , keep_links)
File "C:\Users\nicol\PycharmProjects\FirstProject\venv\lib\site-packages\openpyxl\reader\excel.py", line 126, in __init__
self.archive = _validate_archive(fn)
File "C:\Users\nicol\PycharmProjects\FirstProject\venv\lib\site-packages\openpyxl\reader\excel.py", line 98, in _validate_archive
archive = ZipFile(filename, 'r')
File "C:\Users\nicol\AppData\Local\Programs\Python\Python37-32\lib\zipfile.py", line 1204, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'transactions.xlsx'
Add the full path to the file, like:
C:\Users\mee\Desktop\Test
import openpyxl as xl
wb = xl.load_workbook(r"C:\Users\mee\Desktop\Test\transactions.xlsx")  # Change your path
print(wb)
You must use the full path or change the working directory.
For example:
import openpyxl as xl
import os
os.chdir("c:/user/sam/desktop/test")
wb = xl.load_workbook("transactions.xlsx")
print(wb)
Here transactions.xlsx is in the test folder.
It is also always a good idea to check if the "filename string" actually refers to a file. In order to check this, use something like
import os
absolute_filename = r"C:\Users\mee\Desktop\Test\transactions.xlsx"
if not os.path.isfile(absolute_filename):
    print("ERROR: File not found!")
    exit(-1)
This way you can be sure the file is actually there! If it isn't, all libraries (e.g. openpyxl) will throw some sort of error/exception.
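The existence check can be combined with building an absolute path, so the result no longer depends on the working directory the IDE happens to use. This is a minimal sketch; `resolve_data_file` is a hypothetical helper, not part of openpyxl.

```python
from pathlib import Path


def resolve_data_file(name, base_dir):
    """Return the absolute path of name inside base_dir, or raise if it is missing."""
    path = (Path(base_dir) / name).resolve()
    if not path.is_file():
        raise FileNotFoundError("No such file: " + str(path))
    return path
```

In a script, `resolve_data_file("transactions.xlsx", Path(__file__).parent)` would look for the workbook next to the script itself, and the result can be passed straight to `load_workbook`.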
I've faced the same issue. It worked for me when I copied the relative path, which is the path starting from the project name.
from openpyxl import Workbook, load_workbook
wb = load_workbook('Projects/automate_excel/book1.xlsx')
I hope it works for you! Also, make sure that your Excel file and your Python file are in the same folder.
I'm trying to use pandas to parse an .xlsm document. My code worked perfectly with the example file I was given, but once I got the rest of the documents, it failed with the above error. Here's the offending stack trace:
Traceback (most recent call last):
File "########/UnsupervisedCAM.py", line 9, in <module>
info_dict = read_excel_to_dict('files/' + filename)
File "########\readCAM.py", line 7, in read_excel_to_dict
df = pandas.read_excel(filename, parse_cols='E,G,I,K,Q,O')
File "########\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 191, in read_excel
io = ExcelFile(io, engine=engine)
File "########\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 249, in __init__
self.book = xlrd.open_workbook(io)
File "########\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\__init__.py", line 441, in open_workbook
ragged_rows=ragged_rows,
File "########\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 87, in open_workbook_xls
ragged_rows=ragged_rows,
File "########\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 595, in biff2_8_load
raise XLRDError("Can't find workbook in OLE2 compound document")
xlrd.biffh.XLRDError: Can't find workbook in OLE2 compound document
I'm not even sure where to start... Haven't found anything of use online.
I got the same error message and could solve it by removing the password protection of the xlsx-file.
(not saying that it's the only reason for the error, but worth checking!)
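To see which kind of file you are dealing with before loading it, you can inspect the container format. The sketch below rests on two points: a normal .xlsx/.xlsm file is a zip archive, and OLE2 compound documents start with the signature bytes D0 CF 11 E0 A1 B1 1A E1. That a password-protected workbook shows up as OLE2 is an assumption consistent with the error message above; `sniff_excel_container` is a hypothetical helper.

```python
import zipfile

# Signature bytes at the start of every OLE2 compound document.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_excel_container(path):
    """Return 'ole2', 'zip', or 'unknown' depending on the file's container."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header == OLE2_MAGIC:
        return "ole2"  # legacy .xls, or possibly an encrypted newer workbook
    if zipfile.is_zipfile(path):
        return "zip"   # regular .xlsx / .xlsm
    return "unknown"
```

An .xlsm file that reports `"ole2"` is a candidate for the password-protection check mentioned above.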
After a lot of searching, the only way I've found to do this is to open and save all the excel documents, which seems to 'strip' them of their OLE2 format. I automated the process with the following vbs script:
Dim objFSO, objFolder, objFile
Dim objExcel, objWB
Set objExcel = CreateObject("Excel.Application")
Set objFSO = CreateObject("scripting.filesystemobject")
MyFolder = "<PATH/TO/FILES>"
Set objFolder = objFSO.GetFolder(MyFolder)
For Each objFile In objFolder.Files
    If Right(objFile.Name, 4) = "<EXTENSION>" Then
        Set objWB = objExcel.Workbooks.Open(objFile)
        objWB.Save
        objWB.Close
    End If
Next
objExcel.Quit
Set objExcel = Nothing
Set objFSO = Nothing
Wscript.Echo "Done"
Make sure to change the path to the folder and extension.
In case you face this issue over Jupyter notebook as I did when searching for the error, you can simply restart the kernel and the issue gets resolved.
I am using Python openpyxl to import Excel files which are generated by an online tool. When I import the files generated in the morning, I get an error like this:
Traceback (most recent call last):
File "test4.py", line 8, in <module>
wb = openpyxl.load_workbook (temp2)
File "C:\Python27\lib\site-packages\openpyxl\reader\excel.py", line 201, in load_workbook
wb.properties = DocumentProperties.from_tree(src)
File "C:\Python27\lib\site-packages\openpyxl\descriptors\serialisable.py", line 89, in from_tree
return cls(**attrib)
File "C:\Python27\lib\site-packages\openpyxl\packaging\core.py", line 106, in __init__
self.modified = modified
File "C:\Python27\lib\site-packages\openpyxl\descriptors\base.py", line 267, in __set__
value = W3CDTF_to_datetime(value)
File "C:\Python27\lib\site-packages\openpyxl\utils\datetime.py", line 40, in W3CDTF_to_datetime
dt = [int(v) for v in match.groups()[:6]]
AttributeError: 'NoneType' object has no attribute 'groups'
The strange thing is that I only get this error when importing files that the online tool generated in the morning. When I try the same file generated in the afternoon, it works very well. I'm confused about where the problem is. There are no fields in the Excel files related to time, and the files generated in the morning and in the afternoon are exactly the same except for the modified time. Can anybody help me with it? Thank you.
Excel files created by this online tool are not fully compatible with openpyxl.
The function load_workbook opens the Excel file through zipfile, reads the workbook-level information from 'docProps/core.xml', and assigns it to the workbook's wb.properties. One piece of that information is the modified time.
The value of modified raises the error because it cannot be converted into a datetime. The value must match openpyxl.utils.datetime.W3CDTF_REGEX, that is, the W3CDTF (W3C Date and Time Formats) pattern. When it does not match, the regex match is None, which is why the traceback ends in 'NoneType' object has no attribute 'groups'.
You can check whether the file's modified time conforms to W3CDTF. Here is the code:
from openpyxl.reader.excel import _validate_archive
archive = _validate_archive('/path/to/yourexcel.xlsx')
valid_files = archive.namelist()
# the list should contain the core properties file, normally 'docProps/core.xml'
print(valid_files)
# read 'docProps/core.xml'
wb_info = archive.read('docProps/core.xml')
print(wb_info)
In wb_info, you will find something like
<dcterms:modified xsi:type="dcterms:W3CDTF">2017-04-01T22:48:48Z</dcterms:modified>.
Compare the wb_info of Excel files from the online tool with that of files created on your PC.
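The same check can be done with the standard library alone, which avoids relying on openpyxl's private `_validate_archive` function. This is a sketch: the function names are hypothetical, and `W3CDTF_SKETCH` is a simplified pattern written for illustration, not the exact regex that openpyxl uses.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Fully qualified tag name of <dcterms:modified> in docProps/core.xml.
DCTERMS_MODIFIED = "{http://purl.org/dc/terms/}modified"

# Simplified W3CDTF date-time pattern (an assumption, not openpyxl's own regex).
W3CDTF_SKETCH = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?$"
)


def read_modified(xlsx_path):
    """Return the raw text of dcterms:modified, or None if it is absent."""
    with zipfile.ZipFile(xlsx_path) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    node = root.find(DCTERMS_MODIFIED)
    if node is None or node.text is None:
        return None
    return node.text.strip()


def looks_like_w3cdtf(value):
    """Return True if value resembles a W3CDTF date-time string."""
    return bool(value) and W3CDTF_SKETCH.match(value) is not None
```

Running `looks_like_w3cdtf(read_modified(path))` on a morning file and an afternoon file should show whether the modified timestamp is the part that differs.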