I am trying to find a Python library that can overwrite an existing cell in an Excel file to change its contents.
What I want to do:
1. Read from an .xlsx file.
2. Compare the cell data to determine whether a change is needed.
3. Change the data in a cell, e.g. overwrite the date in cell 'O2'.
4. Save the file.
I have tried the following libraries:
- xlsxwriter
- a combination of xlrd, xlwt and xlutils
- openpyxl
Results so far:
- xlsxwriter: only writes to a new Excel sheet and file.
- xlrd/xlwt/xlutils combination: reads from .xlsx but only writes to .xls.
- openpyxl: reads from an existing file, but I could not get it to write to existing cells; I could only create new rows and cells, or an entirely new workbook.
Any suggestions would be greatly appreciated. Are there other libraries? Or how can I use the libraries above to overwrite data in an existing file?
from win32com.client import Dispatch
import os
xl = Dispatch("Excel.Application")
xl.Visible = True # otherwise excel is hidden
# newest excel does not accept forward slash in path
wbs_path = r'C:\path\to\a\bunch\of\workbooks'
for wbname in os.listdir(wbs_path):
    if not wbname.endswith(".xlsx"):
        continue
    wb = xl.Workbooks.Open(wbs_path + '\\' + wbname)
    sh = wb.Worksheets("name of sheet")
    sh.Range("A1").Value = "some new value"
    wb.Save()
    wb.Close()
xl.Quit()
Alternatively, you can use xlwings, which (if I had to guess) seems to use this approach under the hood.
>>> import xlwings as xw
>>> wb = xw.Book() # this will create a new workbook
>>> wb = xw.Book('FileName.xlsx') # connect to an existing file in the current working directory
>>> wb = xw.Book(r'C:\path\to\file.xlsx') # on Windows: use raw strings to escape backslashes
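For the workflow in the question (read an .xlsx file, compare a cell, overwrite it, save), a minimal openpyxl sketch could look like the following. The function name `overwrite_cell_if_changed` is made up for illustration; `load_workbook`, cell indexing such as `ws["O2"]`, and `Workbook.save` are regular openpyxl calls. openpyxl may not keep everything when it re-saves a file (its documentation warns that some items, such as shapes, can be lost), so it is worth trying on a copy first.

```python
from openpyxl import load_workbook


def overwrite_cell_if_changed(path, cell_ref, new_value, sheet_name=None):
    """Overwrite one cell in an existing .xlsx file if its value differs.

    Returns True if the file was modified and saved, False otherwise.
    """
    wb = load_workbook(path)
    # Use the named sheet if one is given, otherwise the active sheet.
    ws = wb[sheet_name] if sheet_name else wb.active
    if ws[cell_ref].value == new_value:
        return False          # nothing to change, file left untouched
    ws[cell_ref] = new_value  # assignment overwrites the existing cell
    wb.save(path)             # saving to the same path replaces the file
    return True
```

A call such as `overwrite_cell_if_changed(path, "O2", "some new value")` would then cover steps 1 to 4 of the question. Date cells typically come back as `datetime.datetime` objects, so compare against a value of the same type.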
After saving my DataFrame to a CSV file in a specific location, the file doesn't appear there. Is there any reason why it might not be showing up?
Here is the code to save my dataframe to csv:
df.to_csv(r'C:\Users\gibso\OneDrive\Documents\JOSEPH\export_dataframe.csv', index = False)
Even saving an empty DataFrame does not seem to work.
import pandas as pd
olympics={}
df = pd.DataFrame(olympics)
df.to_csv(r'C:\Users\gibso\OneDrive\Documents\JOSEPH\export_dataframe.csv', index = False)
Thanks for the help!
I would rather use the module openpyxl. Example of saving:
import openpyxl
workbook = openpyxl.Workbook()
sheet = workbook.active
# Work on your workbook. Once finished:
workbook.save(file_name) # file_name is a variable you must define
Don't forget to install openpyxl with pip first!
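When a saved file seems to be missing, one quick check is to resolve the target path and ask the file system directly, right after writing. The sketch below is library-agnostic and uses only the standard library; the helper name `describe_output` is made up for illustration.

```python
from pathlib import Path


def describe_output(path):
    """Return (absolute_path, size_in_bytes) for a file that was just written.

    Raises FileNotFoundError if nothing exists at that location.
    """
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"No file at {resolved}")
    return resolved, resolved.stat().st_size
```

Calling `print(describe_output(file_name))` right after `df.to_csv(file_name, ...)` or `workbook.save(file_name)` shows where the file actually landed. If it raises, the write failed or went somewhere else; if it returns, the file exists at the printed location.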
I have multiple Excel workbooks with the same format but different monthly data. Using Python, I want to copy this data into an existing worksheet of an existing Master workbook (which has the same data format as the other workbooks) without losing the formatting in the Master file.
I have tried the xlwings and pywin32 libraries. The xlwings code below copied the contents of a source workbook into the Result workbook, but into a separate sheet. I want the data to be copied into a specified sheet of the Master workbook! (Both libraries produced the same result.)
#Using xlwings
import xlwings as xw
path1='C:\\Users\\G852589\\data transfer\\data1.xlsx'
#path0 = 'C:\\Users\\G852589\\data transfer\\data2.xlsx'
path2='C:\\Users\\G852589\\data transfer\\Result.xlsx'
wb1 = xw.Book(path1)
wb2 = xw.Book(path2)
ws1 = wb1.sheets(1)
ws1.api.Copy(Before=wb2.sheets(1).api)
wb2.save()
wb2.app.quit()
#Using pywin32
import os
import win32com.client as win32
from win32com.client import Dispatch
path1='C:\\Users\\G852589\\data transfer\\data1.xlsx'
#path0 = 'C:\\Users\\G852589\\data transfer\\data2.xlsx'
path2='C:\\Users\\G852589\\data transfer\\Result.xlsx'
xl=Dispatch('Excel.Application')
xl.Visible = True
wb1= xl.Workbooks.Open(Filename=path1)
wb2= xl.Workbooks.Open(Filename=path2)
ws1 =wb1.Worksheets(1)
ws1.Copy(Before=wb2.Worksheets(1))
wb2.Close(SaveChanges=True)
xl.Quit()
I need to be able to copy data from several workbook sheets into specified existing sheets in the Result workbook.
I have attached a screenshot showing what I am trying to achieve. Data 1 and 2 are the original data files; the result sheet is how I want my Master workbook to look after the files have been copied.
https://i.stack.imgur.com/0G4lM.png
Use the pandas library for this:
import pandas as pd
import os
# collect files names
files_list = os.listdir('files_folder')
# collect data frames from each file
data_list = []
for file in files_list:
    df = pd.read_excel('files_folder/'+file)
    data_list.append(df)
# concat all data frames into one
result = pd.concat(data_list, sort=True)
result.to_excel('final_data.xlsx')
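Note that `to_excel` writes a fresh file, so formatting in an existing Master workbook is not carried over. If the goal is to add rows to a sheet that already exists, one option is to append with openpyxl instead. This is a sketch under assumptions: the function name and the `header_rows=1` default are hypothetical, and every source sheet is assumed to share the Master sheet's column layout.

```python
from openpyxl import load_workbook


def append_to_master(master_path, sheet_name, source_paths, header_rows=1):
    """Append data rows from the first sheet of each source workbook
    to an existing sheet of the master workbook, then save the master.

    Returns the number of rows appended.
    """
    master = load_workbook(master_path)
    target = master[sheet_name]          # KeyError if the sheet is missing
    appended = 0
    for path in source_paths:
        # data_only=True reads cached formula results instead of formulas.
        source = load_workbook(path, read_only=True, data_only=True)
        rows = source.worksheets[0].iter_rows(
            min_row=header_rows + 1, values_only=True
        )
        for row in rows:
            if all(value is None for value in row):
                continue                 # skip completely empty rows
            target.append(list(row))     # goes below the last used row
            appended += 1
        source.close()
    master.save(master_path)
    return appended
```

Cells that already exist in the Master sheet keep their styles when the workbook is loaded and saved this way; the appended cells get default formatting unless they are styled separately.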
I have created an Excel file using the xlwt library in Python. Now I need to re-open the file and append new sheets/columns to it. Is it possible to do this in Python?
After investigating today (2014-2-18), I cannot see a way to read in an XLS file using xlwt; it can only write new files. I think it is better to use openpyxl, which works with .xlsx files. Here is a simple example:
from openpyxl import Workbook, load_workbook
wb = Workbook()
ws = wb.create_sheet()
ws.title = 'Pi'
# current openpyxl versions address cells as ws['F5']
ws['F5'] = 3.14159265
# openpyxl reads and writes the .xlsx format, so use that extension
wb.save(filename=r'C:\book2.xlsx')
# Re-opening the file:
wb_re_read = load_workbook(filename=r'C:\book2.xlsx')
sheet = wb_re_read['Pi']
print(sheet['F5'].value)
See other examples here: http://pythonhosted.org/openpyxl/usage.html (where this modified example is taken from)
You read in the file using xlrd, and then 'copy' it to an xlwt Workbook using xlutils.copy.copy().
Note that you'll need to install both xlrd and xlutils libraries.
Note also that not everything gets copied over. Things like images and print settings are not copied, for example, and have to be reset.
I would like to access the worksheets of a spreadsheet. I've copied the main workbook to another workbook using xlutils.copy(), but I don't know the right way to access its worksheets using the xlwt module.
My sample code:
import xlrd
import xlwt
from xlutils.copy import copy
wb1 = xlrd.open_workbook('workbook1.xls', formatting_info=True)
wb2 = copy(wb1)
worksheet_name = 'XYZ'  # worksheet_name is an iterative parameter
worksheet = wb2.get_sheet(worksheet_name)
Could someone please tell me the right call to access the existing worksheets in a workbook using the xlwt module? I know we can use the 'add_sheet' method to add a worksheet to an existing workbook with xlwt.
Any help, appreciated.
You can do sheets = wb1.sheets() to get a list of sheet objects, then call .name on each to get their names. To find the index of your sheet, use
[s.name for s in sheets].index(sheetname)
The sheets() method is curiously absent from the xlwt.Workbook class, so the other answer using that method will not work - only xlrd.book (for reading XLS files) has a sheets() method. Because all the class attributes are private, you have to do something like this:
import itertools

def get_sheet_by_name(book, name):
    """Get a sheet by name from xlwt.Workbook, a strangely missing method.
    Returns None if no sheet with the given name is present.
    """
    # Note, we have to use exceptions for flow control because the
    # xlwt API is broken and gives us no other choice.
    try:
        for idx in itertools.count():
            sheet = book.get_sheet(idx)
            if sheet.name == name:
                return sheet
    except IndexError:
        return None
If you don't need it to return None for a non-existent sheet then just remove the try/except block. If you want to access multiple sheets by name repeatedly it would be more efficient to put them in a dictionary, like this:
sheets = {}
try:
    for idx in itertools.count():
        sheet = book.get_sheet(idx)
        sheets[sheet.name] = sheet
except IndexError:
    pass
Well, here is my answer. Let me take it step by step.
Considering the previous answers, xlrd is the right module for getting the worksheets.
open_workbook returns an xlrd.Book object:
rb = open_workbook('sampleXLS.xls',formatting_info=True)
nsheets is an integer attribute holding the total number of sheets in the workbook:
numberOfSheets=rb.nsheets
Since you want to modify the Excel file, copy it to a new writable workbook wb:
wb = copy(rb)
There are two ways to get the sheet information:
a. If you just want to read a sheet, use sheet=rb.sheet_by_index(sheetNumber)
b. If you want to edit a sheet, use ws = wb.get_sheet(sheetNumber) (this is the one the question asks for)
Now you know how many sheets the workbook has and how to get each one individually.
Putting it all together:
Sample code:
reference: http://www.simplistix.co.uk/presentations/python-excel.pdf
from xlrd import open_workbook
from xlutils.copy import copy
from xlwt import Workbook
rb = open_workbook('sampleXLS.xls',formatting_info=True)
numberOfSheets=rb.nsheets
wb = copy(rb)
for each in range(numberOfSheets):
    sheet=rb.sheet_by_index(each)
    ws = wb.get_sheet(each)
    ## both prints will give you the same thing
    print(sheet.name)
    print(ws.name)
Hi, I'm using the xlrd module to read Excel files. How can I rename the first worksheet of each file?
Thank you.
I don't think you can modify files with either xlrd or xlwt. You can however copy the file with xlrd and then modify and write the copy with xlwt.
Here's an example adapted from here: writing to existing workbook using xlwt:
from xlutils.copy import copy
from xlrd import open_workbook
# open the file you're interested in
# (xlrd 2.0 and later reads only .xls files, so an .xls input is used here)
rb = open_workbook('some_document.xls')
# copy it to a writable variant
wb = copy(rb)
# find the index of a sheet you wanna rename,
# let's say you wanna rename Sheet1
idx = rb.sheet_names().index('Sheet1')
# now rename the sheet in the writable copy
wb.get_sheet(idx).name = u'Renamed Sheet1'
# save the new spreadsheet (xlwt writes the .xls format, so use that extension)
wb.save('new_some_document.xls')
# done
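For .xlsx files, which xlrd 2.0 and later no longer reads, the same rename can be sketched with openpyxl instead. The function name below is hypothetical; `worksheets[0]` is the first sheet of the workbook and `title` is the attribute openpyxl uses for a sheet's name.

```python
from openpyxl import load_workbook


def rename_first_sheet(src_path, dst_path, new_title):
    """Rename the first worksheet of an .xlsx file and save the result.

    Returns the previous title of the sheet.
    """
    wb = load_workbook(src_path)
    first = wb.worksheets[0]
    old_title = first.title
    first.title = new_title   # Excel limits sheet names to 31 characters
    wb.save(dst_path)         # pass src_path again to overwrite in place
    return old_title
```

To process several files, the function can be called once per file inside a loop over their paths.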