I have created an Excel sheet using the xlwt library in Python. Now I need to re-open the file and append new sheets or columns to it. Is it possible to do this in Python?
After investigating today (2014-02-18), I cannot see a way to read an XLS file using xlwt; you can only write a fresh file. I think it is better to use openpyxl, which reads and writes the newer .xlsx format. Here is a simple example:
from openpyxl import Workbook, load_workbook
wb = Workbook()
ws = wb.create_sheet()
ws.title = 'Pi'
ws['F5'] = 3.14159265
# openpyxl writes the .xlsx format, so use that extension
wb.save(filename=r'C:\book2.xlsx')
# Re-opening the file:
wb_re_read = load_workbook(filename=r'C:\book2.xlsx')
sheet = wb_re_read['Pi']
print(sheet['F5'].value)
See other examples here: http://pythonhosted.org/openpyxl/usage.html (where this modified example is taken from)
You read in the file using xlrd, and then 'copy' it to an xlwt Workbook using xlutils.copy.copy().
Note that you'll need to install both xlrd and xlutils libraries.
Note also that not everything gets copied over. Things like images and print settings are not copied, for example, and have to be reset.
I get a huge Excel sheet (a normal table with header and data) on a regular basis, and I need to filter and delete some data and split the table up into separate sheets based on some rules. I think I can save some time by using Python for that tedious task, because the filtering, deleting and splitting into several sheets always follow the same rules, which can be defined logically.
Unfortunately the sheet and its data are partially color-coded (cells and font), and I need to maintain this formatting in the resulting sheets. Is there a way of doing that with Python? I think I need a pointer in the right direction. I only found workarounds with pandas, but those do not allow me to keep the formatting.
You can take a look at an excellent Python library for Excel called openpyxl.
Here's how you can use it.
First, install it through your command prompt using:
pip install openpyxl
Open an existing file:
import openpyxl
wb_obj = openpyxl.load_workbook(path) # Open the workbook; path is the location of the .xlsx file
Deleting rows:
import openpyxl
from openpyxl import load_workbook
wb = load_workbook(path)
ws = wb.active
ws.delete_rows(7)
Inserting rows:
import openpyxl
from openpyxl import load_workbook
wb = load_workbook(path)
ws = wb.active
ws.insert_rows(7)
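For the splitting part of the question, here is a rough sketch of how rows could be distributed over new sheets while carrying over the font and fill of each cell. It assumes the first row is the header, that the values in the chosen column are valid sheet titles, and that none of them collides with an existing sheet name. The helper names `copy_row` and `split_by_column` are made up for this example.

```python
from copy import copy

from openpyxl import load_workbook


def copy_row(cells, target, target_row):
    """Copy values and basic styling of `cells` into `target` at `target_row`."""
    for cell in cells:
        new_cell = target.cell(row=target_row, column=cell.column, value=cell.value)
        if cell.has_style:
            # Style objects are shared, so assign copies to the new cell
            new_cell.font = copy(cell.font)
            new_cell.fill = copy(cell.fill)
            new_cell.number_format = cell.number_format


def split_by_column(wb, source_title, key_column):
    """Create one sheet per distinct value found in `key_column` (1-based)."""
    source = wb[source_title]
    header = next(source.iter_rows(min_row=1, max_row=1))
    for row in source.iter_rows(min_row=2):
        key = str(row[key_column - 1].value)
        if key in wb.sheetnames:
            target = wb[key]
        else:
            target = wb.create_sheet(title=key)
            copy_row(header, target, 1)
        copy_row(row, target, target.max_row + 1)


# Usage (hypothetical file and sheet names):
# wb = load_workbook("report.xlsx")
# split_by_column(wb, "Data", key_column=2)
# wb.save("report_split.xlsx")
```

Only font, fill and number format are copied here; borders, column widths and conditional formatting would need the same treatment if they matter.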
I want to add an object to my Excel sheet.
I'm using openpyxl.
In Excel you do it via:
Insert -> Object
Is there a way to do it through openpyxl or any other Excel tool that works with Python?
While this is not currently possible with openpyxl I suspect it would be fairly straightforward to add the relevant functionality using the add_image() method as a starting place.
import openpyxl
from openpyxl.drawing.image import Image  # image support requires Pillow
wb = openpyxl.Workbook()
ws = wb.worksheets[0]
picture = Image('/path/to/picture')
# the second argument is the cell to put the image in (its top-left anchor)
ws.add_image(picture, 'A1')
wb.save('whatever you want to save the workbook as')
This code of course refers to creating a new workbook and adding the image into it. To add the image to your preexisting workbook you would obviously just load that workbook using load_workbook.
I want to fill in some pandas data frames into an existing excel file. I followed the instructions in:
How to write to an existing excel file without overwriting data (using pandas)?
using:
from openpyxl import load_workbook
import pandas as pd
import numpy as np
book=load_workbook("excel_proc.xlsx")
writer=pd.ExcelWriter("excel_proc.xlsx", engine="openpyxl")
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
data_df.to_excel(writer, sheet_name="example", startrow=100, startcol=5, index=False)
writer.save()
However, the existing sheets will be deleted, the "example" sheet is generated and only the df is integrated at the defined location. What did I do wrong? I want the "data_df" written into the existing excel file in the existing "example" sheet, keeping the other sheets and data.
Thanks
Example df:
data_df=pd.DataFrame(np.arange(12).reshape((2, 6)), index=["Time","Value"])
I resolved the problem on my own. I realised that even load_workbook could not load my file, so I updated the openpyxl package (conda install openpyxl). The version that did not work was v2.3.2 (Python 3.5); the version that works now is v2.4.0.
I do not really know whether that was the actual cause, but now the Excel files are filled in at the defined locations and the existing data is kept.
You might be interested in learning xlwings, which makes it a lot easier to work with Excel files from Python.
In any case I would start by reading the existing data in the sheet, combine the data as you wish in python, and finally overwrite the sheet.
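The goal in the question, placing a table at a fixed offset in an existing sheet without touching anything else, can also be sketched with openpyxl alone. The function name `write_block` is hypothetical, and the rows are assumed to be plain Python lists, for example from `data_df.values.tolist()`. Note that openpyxl rows and columns are 1-based, whereas pandas' `startrow` and `startcol` are 0-based.

```python
from openpyxl import load_workbook


def write_block(path, sheet_name, rows, start_row, start_col):
    """Write `rows` (a list of lists) into an existing sheet with its top-left
    corner at (start_row, start_col), both 1-based. Other cells and sheets
    are left as loaded."""
    wb = load_workbook(path)
    ws = wb[sheet_name]
    for r, row in enumerate(rows, start=start_row):
        for c, value in enumerate(row, start=start_col):
            ws.cell(row=r, column=c, value=value)
    wb.save(path)
```

With the question's numbers, `startrow=100, startcol=5` would correspond to `start_row=101, start_col=6` here.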
I am trying to find a library that overwrites an existing cell to change its contents using Python.
what I want to do:
read from .xlsx file
compare cell data to determine whether a change is needed.
change data in the cell, e.g. overwrite the date in cell 'O2'
save file.
I have tried the following libraries:
xlsxwriter
combination of:
xlrd
xlwt
xlutils
openpyxl
xlsxwriter: only writes to a new Excel sheet and file.
combination: works to read from .xlsx but only writes to .xls
openpyxl: reads from an existing file but doesn't seem to write to existing cells; as far as I can tell it can only create new rows and cells, or an entire new workbook
Any suggestions would be greatly appreciated. Are there other libraries? How can I use the libraries above to overwrite data in an existing file?
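For what it's worth, openpyxl can assign to a cell of a workbook it has loaded and then save the file again, which covers the read, compare, change and save steps listed above. A minimal sketch, assuming an .xlsx file; the function name `update_cell_if_changed` is made up for illustration. One caveat: openpyxl does not read every kind of item in a file, so things like shapes may be lost when an existing workbook is loaded and saved.

```python
from openpyxl import load_workbook


def update_cell_if_changed(path, sheet_name, cell_ref, new_value):
    """Overwrite one cell in place; return True if the file was modified."""
    wb = load_workbook(path)
    ws = wb[sheet_name]
    if ws[cell_ref].value == new_value:
        return False  # nothing to do, leave the file untouched
    ws[cell_ref] = new_value
    wb.save(path)  # saving under the same name overwrites the file
    return True
```

A call such as `update_cell_if_changed("book.xlsx", "Sheet1", "O2", "2020-01-01")` would then rewrite only cell O2, where the file and sheet names are placeholders.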
from win32com.client import Dispatch
import os
xl = Dispatch("Excel.Application")
xl.Visible = True # otherwise excel is hidden
# newest excel does not accept forward slash in path
wbs_path = r'C:\path\to\a\bunch\of\workbooks'
for wbname in os.listdir(wbs_path):
    if not wbname.endswith(".xlsx"):
        continue
    wb = xl.Workbooks.Open(wbs_path + '\\' + wbname)
    sh = wb.Worksheets("name of sheet")
    sh.Range("A1").Value = "some new value"
    wb.Save()
    wb.Close()
xl.Quit()
Alternatively you can use xlwings, which (if I had to guess) seems to be using this approach under the hood.
>>> import xlwings as xw
>>> wb = xw.Book() # this will create a new workbook
>>> wb = xw.Book('FileName.xlsx') # connect to an existing file in the current working directory
>>> wb = xw.Book(r'C:\path\to\file.xlsx') # on Windows: use raw strings to escape backslashes
Hi, I'm using the xlrd module to read an Excel file. How can I rename the first worksheet of each Excel file?
Thank you.
I don't think you can modify files with either xlrd or xlwt. You can however copy the file with xlrd and then modify and write the copy with xlwt.
Here's an example adapted from here: writing to existing workbook using xlwt:
from xlutils.copy import copy
from xlrd import open_workbook
# open the file you're interested in (xlrd and xlwt work with the .xls format)
rb = open_workbook('some_document.xls')
# copy it to a writable variant
wb = copy(rb)
# find the index of a sheet you wanna rename,
# let's say you wanna rename Sheet1
idx = rb.sheet_names().index('Sheet1')
# now rename the sheet in the writable copy
wb.get_sheet(idx).name = u'Renamed Sheet1'
# save the new spreadsheet (xlwt always writes .xls content)
wb.save('new_some_document.xls')
# done
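If the files are .xlsx rather than .xls, the same rename can be sketched with openpyxl, which edits the loaded workbook directly. This is an alternative to the xlrd/xlwt route above, not part of it; the function name and the folder layout are assumptions made for the example.

```python
from pathlib import Path

from openpyxl import load_workbook


def rename_first_sheets(folder, new_title):
    """Rename the first worksheet of every .xlsx file in `folder`, in place.

    Returns the names of the files that were rewritten."""
    renamed = []
    for path in sorted(Path(folder).glob("*.xlsx")):
        wb = load_workbook(str(path))
        wb.worksheets[0].title = new_title
        wb.save(str(path))
        renamed.append(path.name)
    return renamed
```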