Python Openpyxl - Add column in write_only spreadsheet - python

I'm using Python with the openpyxl library, but I can't call insert_cols() when the workbook is created with write_only=True. Basically, I just want to add a new column to my spreadsheet while it is in write-only mode.
insert_cols() works when I load the workbook with load_workbook(), but not in write-only mode. I have to use write-only mode because my spreadsheets are quite large.
Any ideas on how to add a new column are appreciated.
Thank you.
This is my code:
import openpyxl
from openpyxl import Workbook
from openpyxl import load_workbook

wb = load_workbook(filename=r'path\myExcel.xlsx', read_only=True)
ws = wb['PC Details']
wb_output = Workbook(write_only=True)
ws_output = wb_output.create_sheet(title='PC Details')
for row in ws.rows:
    rowInCorrectFormat = [cell.value for cell in row]
    ws_output.append(rowInCorrectFormat)
    for cell in row:
        print(cell.value)
### THIS IS THE PART OF THE CODE WHICH DOES NOT WORK
ws_output.insert_cols(12)
ws_output['L5'] = 'OK or NOT GOOD?'
###
wb_output.save(r'path\test_Output_optimized.xlsx')
This is the exact error that I'm getting:
ws_output.insert_cols(12)
AttributeError: 'WriteOnlyWorksheet' object has no attribute 'insert_cols'

The problem lies in the write_only=True flag. A workbook created with this flag contains WriteOnlyWorksheet objects, which differ from regular worksheets: rows can only be added with append(), and cells cannot be addressed or changed at arbitrary positions.
Methods such as insert_cols() and insert_rows() are therefore not available on such worksheets.
Possible solutions are to drop the flag, or to add the data in the way the official documentation describes for write-only mode.
For working with workbooks you might also find this article interesting. https://medium.com/aubergine-solutions/working-with-excel-sheets-in-python-using-openpyxl-4f9fd32de87f
You can read more in the official documentation. https://openpyxl.readthedocs.io/en/stable/optimized.html
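One way to get the effect of insert_cols() in write-only mode is to put the extra value into each row before it is appended. The following is only a sketch under that assumption, not an openpyxl feature: the helper names with_inserted_column and copy_with_new_column, and the header and fill parameters, are made up for illustration.

```python
def with_inserted_column(row_values, col_idx, new_value=None):
    """Return a copy of row_values with new_value inserted as column col_idx (1-based)."""
    values = list(row_values)
    # Pad short rows with None so the new cell lands in the requested column.
    while len(values) < col_idx - 1:
        values.append(None)
    values.insert(col_idx - 1, new_value)
    return values


def copy_with_new_column(src_path, dst_path, sheet_name, col_idx, header, fill=None):
    """Stream one sheet from src_path to dst_path, adding a column on the way.

    Hypothetical helper: `header` goes into the first row of the new column,
    `fill` into every other row.
    """
    # Imported here so the pure helper above can be used without openpyxl.
    from openpyxl import Workbook, load_workbook

    wb_in = load_workbook(filename=src_path, read_only=True)
    ws_in = wb_in[sheet_name]
    wb_out = Workbook(write_only=True)
    ws_out = wb_out.create_sheet(title=sheet_name)

    for row_number, row in enumerate(ws_in.iter_rows(values_only=True), start=1):
        new_value = header if row_number == 1 else fill
        ws_out.append(with_inserted_column(row, col_idx, new_value))

    wb_out.save(dst_path)
    wb_in.close()  # read-only workbooks keep the file open until closed
```

With the file names from the question, a call could look like copy_with_new_column(r'path\myExcel.xlsx', r'path\test_Output_optimized.xlsx', 'PC Details', 12, 'OK or NOT GOOD?'). Both workbooks are streamed row by row, so memory use should stay low even for large files.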

Related

How to split Excel data into separate worksheets and maintain cell formatting with Python 3?

I get a huge Excel sheet (a normal table with header and data) on a regular basis, and I need to filter and delete some data and split the table up into separate sheets based on some rules. I think I can save some time if I use Python for that tedious task, because the filtering, deleting and splitting into several sheets always follow the same rules, which can be defined logically.
Unfortunately the sheet and the data are partially color-coded (cells and font), and I need to maintain this formatting in the resulting sheets. Is there a way of doing that with Python? I think I need a pointer in the right direction. I only found workarounds with pandas, but that does not allow me to keep the formatting.
You can take a look at an excellent Python library for Excel called openpyxl.
Here's how you can use it.
First, install it from your command prompt using:
pip install openpyxl
Open an existing file:
import openpyxl
wb_obj = openpyxl.load_workbook(path)  # open the workbook; path is the path to your .xlsx file
Deleting rows:
from openpyxl import load_workbook
wb = load_workbook(path)
ws = wb.active
ws.delete_rows(7)  # delete row 7
Inserting rows:
from openpyxl import load_workbook
wb = load_workbook(path)
ws = wb.active
ws.insert_rows(7)  # insert an empty row before row 7
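Since the question is about keeping the color coding, here is a rough sketch of how cell styles could be carried over when rows are copied into another sheet. It assumes regular (not read-only or write-only) openpyxl worksheets; copy_cell and copy_rows are illustrative names, not part of the library.

```python
from copy import copy


def copy_cell(src, dst):
    """Copy the value and the main style attributes from cell src to cell dst."""
    dst.value = src.value
    if src.has_style:
        # Style objects are copied so the two cells do not share one instance.
        dst.font = copy(src.font)
        dst.fill = copy(src.fill)
        dst.border = copy(src.border)
        dst.alignment = copy(src.alignment)
        dst.number_format = src.number_format


def copy_rows(ws_src, ws_dst, row_numbers, start_row=1):
    """Copy the given source rows (1-based) into ws_dst, starting at start_row."""
    for dst_row, src_row in enumerate(row_numbers, start=start_row):
        for src in ws_src[src_row]:  # ws[n] yields the cells of row n
            copy_cell(src, ws_dst.cell(row=dst_row, column=src.column))
```

The rule that decides which rows go to which sheet would live outside these helpers, for example as a list of row numbers per target sheet created with wb.create_sheet().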

Insert an object to excel by using python

I want to add an object to my Excel sheet.
I'm using openpyxl.
In Excel you do it with:
Insert->Object
Is there a way to do it through openpyxl or any other Excel tool that works with Python?
While this is not currently possible with openpyxl, I suspect it would be fairly straightforward to add the relevant functionality using the add_image() method as a starting place.
from openpyxl import Workbook
from openpyxl.drawing.image import Image  # needs the Pillow package
wb = Workbook()
ws = wb.worksheets[0]
picture = Image('/path/to/picture')
ws.add_image(picture, 'A1')  # 'A1' is the cell to put the image in
wb.save('whatever you want to save the workbook as')
This code of course refers to creating a new workbook and adding the image into it. To add the image to your preexisting workbook you would obviously just load that workbook using load_workbook.

openpyxl.load_workbook(file, data_only=True) doesn't work?

Why does x = "None" instead of "500"?
I have tried everything that I know and searched 1 hour for answer...
Thank you for any help!
import openpyxl
wb = openpyxl.Workbook()
sheet = wb.active
sheet["A1"] = 200
sheet["A2"] = 300
sheet["A3"] = "=SUM(A1+A2)"
wb.save("writeFormula.xlsx")
wbFormulas = openpyxl.load_workbook("writeFormula.xlsx")
sheet = wbFormulas.active
print(sheet["A3"].value)
wbDataOnly = openpyxl.load_workbook("writeFormula.xlsx", data_only=True)
sheet = wbDataOnly.active
x = (sheet["A3"].value)
print(x) # None? Should print 500?
From the documentation
openpyxl never evaluates formula
Documentation says:
data_only controls whether cells with formulae have either the formula
(default) or the value stored the last time Excel read the sheet.
So, if the .xlsx file (writeFormula.xlsx) has never been opened and saved in Excel, there is no cached value stored for the formula. As a result, your program gets None.
If you want your program to return 500, open writeFormula.xlsx manually in Excel and save it. Then comment out the file-creation part of your program, so that the file is not overwritten again. You will get 500.
I have already tried it, and it works. Tell me if you have a different opinion. Thanks.
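To check whether a file still lacks cached results, the same workbook can be loaded twice, once with formulas and once with data_only=True, and the two views compared. This is a small sketch of that idea; needs_recalculation and find_uncached_formulas are hypothetical helper names, and only the active sheet is checked.

```python
def needs_recalculation(formula_value, cached_value):
    """True if the cell holds a formula but no cached result is stored for it."""
    is_formula = isinstance(formula_value, str) and formula_value.startswith("=")
    return is_formula and cached_value is None


def find_uncached_formulas(path):
    """Return coordinates of formula cells on the active sheet without a cached value."""
    # Imported here so the pure helper above can be used without openpyxl.
    from openpyxl import load_workbook

    ws_formulas = load_workbook(path).active
    ws_values = load_workbook(path, data_only=True).active

    missing = []
    for row in ws_formulas.iter_rows():
        for cell in row:
            if needs_recalculation(cell.value, ws_values[cell.coordinate].value):
                missing.append(cell.coordinate)
    return missing
```

For the writeFormula.xlsx file from the question, the result would be expected to contain A3 until the file has been opened and saved in Excel.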
There is an easy way to launch Excel and get the formula values updated.
Sample code snippet (Windows with Excel and pywin32 installed; inputFile is the full path to the workbook):
import openpyxl
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
workbook = excel.Workbooks.Open(inputFile)
workbook.Save()
workbook.Close()
excel.Quit()
# For reading the data back, use data_only=True.
oxl = openpyxl.load_workbook(inputFile, data_only=True)
Check the format of the cell in Excel.
I was running into this issue as well. The documentation indicated that you have to open the workbook in the Excel application and save it again; the value is then returned as the last calculated one.
I did that and I still got None as my return value.
As with many Excel/VBA issues, it turned out to be a format issue. I had the cell formatted as 'Accounting' instead of 'Number'. After changing it to Number, it worked.
I had the same question. The solution is to open the xlsx file manually in Excel, save it, and close it. After that, the wbDataOnly part of the program returns 500.

How to append to an existing excel sheet with XLWT in Python

I have created an excel sheet using XLWT plugin using Python. Now, I need to re-open the excel sheet and append new sheets / columns to the existing excel sheet. Is it possible by Python to do this?
After investigating today (2014-2-18), I cannot see a way to read in an XLS file using xlwt. You can only write a new file from scratch. I think it is better to use openpyxl, which reads and writes the .xlsx format. Here is a simple example:
from openpyxl import Workbook, load_workbook
wb = Workbook()
ws = wb.create_sheet()
ws.title = 'Pi'
ws['F5'] = 3.14159265
wb.save(filename=r'C:\book2.xlsx')
# Re-opening the file:
wb_re_read = load_workbook(filename=r'C:\book2.xlsx')
sheet = wb_re_read['Pi']
print(sheet['F5'].value)
See other examples here: http://pythonhosted.org/openpyxl/usage.html (where this modified example is taken from)
You read in the file using xlrd, and then 'copy' it to an xlwt Workbook using xlutils.copy.copy().
Note that you'll need to install both xlrd and xlutils libraries.
Note also that not everything gets copied over. Things like images and print settings are not copied, for example, and have to be reset.
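The xlrd plus xlutils route described above could look roughly like the following sketch. It assumes the xlrd, xlwt and xlutils packages are installed; append_sheet_to_xls is a made-up helper name, and the file is overwritten in place.

```python
def append_sheet_to_xls(path, sheet_name, rows):
    """Re-open an existing .xls file and add a new sheet filled with rows."""
    # Third-party packages, imported here so the function can be defined without them.
    import xlrd
    from xlutils.copy import copy

    # formatting_info=True asks xlrd to keep cell formatting for the copy.
    book_in = xlrd.open_workbook(path, formatting_info=True)
    book_out = copy(book_in)  # xlrd Book -> writable xlwt Workbook

    sheet = book_out.add_sheet(sheet_name)
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            sheet.write(r, c, value)  # xlwt uses 0-based row and column indices

    book_out.save(path)
```

To add columns to an existing sheet instead of a new sheet, book_out.get_sheet(0) returns the writable copy of the first sheet, and write() can be called on it with column indices past the existing data.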

Extracting Hyperlinks From Excel (.xlsx) with Python

I have been looking at mostly the xlrd and openpyxl libraries for Excel file manipulation. However, xlrd currently does not support formatting_info=True for .xlsx files, so I can not use the xlrd hyperlink_map function. So I turned to openpyxl, but have also had no luck extracting a hyperlink from an excel file with it. Test code below (the test file contains a simple hyperlink to google with hyperlink text set to "test"):
import openpyxl
wb = openpyxl.load_workbook('testFile.xlsx')
ws = wb.get_sheet_by_name('Sheet1')
r = 0
c = 0
print ws.cell(row = r, column = c). value
print ws.cell(row = r, column = c). hyperlink
print ws.cell(row = r, column = c). hyperlink_rel_id
Output:
test
None
I guess openpyxl does not currently support formatting completely either? Is there some other library I can use to extract hyperlink information from Excel (.xlsx) files?
This is possible with openpyxl:
import openpyxl
wb = openpyxl.load_workbook('yourfile.xlsm')
ws = wb['Sheet1']
# This will fail if there is no hyperlink to target
print(ws.cell(row=2, column=1).hyperlink.target)
Starting from at least openpyxl-2.4.0b1, this bug https://bitbucket.org/openpyxl/openpyxl/issue/152/hyperlink-returns-empty-string-instead-of was fixed. The hyperlink attribute of a cell now returns a Hyperlink object:
hl_obj = ws.cell(row=r, column=c).hyperlink  # Hyperlink object for the cell, or None
if hl_obj:
    print(hl_obj.display)
    print(hl_obj.target)
    print(hl_obj.tooltip)  # you can see it when hovering the mouse over the hyperlink in Excel
    print(hl_obj)  # to see other attributes if you need them
FYI, the problem with openpyxl is an actual bug.
And, yes, xlrd cannot read the hyperlink without formatting_info, which is currently not supported for xlsx.
In my experience, getting good .xlsx interaction requires moving to IronPython. This lets you work with the Common Language Runtime (clr) and interact directly with Excel.
http://ironpython.net/
import clr
clr.AddReference("Microsoft.Office.Interop.Excel")
import Microsoft.Office.Interop.Excel as Excel
excel = Excel.ApplicationClass()
wb = excel.Workbooks.Open('testFile.xlsx')
ws = wb.Worksheets['Sheet1']
address = ws.Cells(row, col).Hyperlinks.Item(1).Address
A successful solution I've worked with is to install unoconv on the server and implement a method that invokes this command-line tool via the subprocess module to convert the file from xlsx to xls, since hyperlink_map.get() works with xls.
For direct manipulation of Excel files it's also worth looking at the excellent XlWings library.
import openpyxl
wb = openpyxl.load_workbook('yourfile.xlsx')
ws = wb['Sheet1']
try:
    print(ws.cell(row=2, column=1).hyperlink.target)
except AttributeError:
    # hyperlink is None when the cell has no hyperlink
    print(ws.cell(row=2, column=1).value)
To handle the exception "'NoneType' object has no attribute 'target'", we can wrap the access in a try/except block. Even if the given cell has no hyperlink, the code then prints the content of the cell.
Instead of just .hyperlink, using .hyperlink.target should work. I was getting None as well when using just .hyperlink on the cell object before that.
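The None check can also be written without try/except. The sketch below assumes a reasonably recent openpyxl in which cell.hyperlink is either None or an object with a target attribute; hyperlink_target and collect_hyperlinks are illustrative names.

```python
def hyperlink_target(cell):
    """Return the hyperlink target of a cell, or None if the cell has no hyperlink."""
    link = cell.hyperlink
    return link.target if link is not None else None


def collect_hyperlinks(path, sheet_name):
    """Map cell coordinates to hyperlink targets for one sheet of a workbook."""
    # Imported here so the pure helper above can be used without openpyxl.
    from openpyxl import load_workbook

    ws = load_workbook(path)[sheet_name]
    found = {}
    for row in ws.iter_rows():
        for cell in row:
            target = hyperlink_target(cell)
            if target:
                found[cell.coordinate] = target
    return found
```

Calling collect_hyperlinks('testFile.xlsx', 'Sheet1') on the test file from the question should return a dictionary with one entry for the cell that holds the link.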
