openpyxl.load_workbook(file, data_only=True) doesn't work?


Why is x None instead of 500?
I have tried everything I know and searched for an hour for an answer...
Thank you for any help!
import openpyxl
wb = openpyxl.Workbook()
sheet = wb.active
sheet["A1"] = 200
sheet["A2"] = 300
sheet["A3"] = "=SUM(A1+A2)"
wb.save("writeFormula.xlsx")
wbFormulas = openpyxl.load_workbook("writeFormula.xlsx")
sheet = wbFormulas.active
print(sheet["A3"].value)
wbDataOnly = openpyxl.load_workbook("writeFormula.xlsx", data_only=True)
sheet = wbDataOnly.active
x = (sheet["A3"].value)
print(x) # None? Should print 500?

From the documentation:
openpyxl never evaluates formulae.

The documentation says:
data_only controls whether cells with formulae have either the formula (default) or the value stored the last time Excel read the sheet.
So if writeFormula.xlsx has never been opened and saved in Excel, there is no stored value for the formula, and your program gets None.
If you want your program to return 500, open writeFormula.xlsx manually in Excel and save it. Then comment out the file-creation part of your program (otherwise it overwrites the file again) and rerun it. You will get 500.
I have tried it and it works. Tell me if you have a different opinion. Thanks.
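To make "the value stored" more concrete, it may help to look at how a formula cell is laid out inside the sheet XML of an .xlsx file. The sketch below uses only the standard library and two hand-written XML fragments; the fragments and the function name cached_value are illustrative assumptions, not taken from a real file. A formula cell carries an <f> element with the formula and, only when an application such as Excel has calculated it, a <v> element holding the cached result.

```python
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hand-written fragments for illustration only (not copied from a real file).
SAVED_BY_EXCEL = (
    '<c xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" r="A3">'
    "<f>SUM(A1+A2)</f><v>500</v></c>"
)
WRITTEN_WITHOUT_CALCULATION = (
    '<c xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" r="A3">'
    "<f>SUM(A1+A2)</f><v></v></c>"
)


def cached_value(cell_xml):
    """Return the text of the cached <v> element, or None if it is empty or absent."""
    cell = ET.fromstring(cell_xml)
    v = cell.find("m:v", NS)
    return v.text if v is not None and v.text else None


if __name__ == "__main__":
    print(cached_value(SAVED_BY_EXCEL))
    print(cached_value(WRITTEN_WITHOUT_CALCULATION))
```

With data_only=True, openpyxl hands back the cached part, which is why a file that no spreadsheet application has calculated yields None.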

There is an easy way to launch Excel and get the formula values updated.
Sample code snippet:
import openpyxl
import win32com.client as win32
# inputFile is a placeholder for the full path to your .xlsx file
excel = win32.gencache.EnsureDispatch('Excel.Application')
workbook = excel.Workbooks.Open(inputFile)
workbook.Save()
workbook.Close()
excel.Quit()
# To read the data back, load the workbook with data_only=True.
oxl = openpyxl.load_workbook(inputFile, data_only=True)

Check the format of the cell in Excel.
I was running into this issue as well. The documentation indicated that you have to open the workbook in the Excel application and resave it; after that, the returned value is the last calculated one.
I did that and I still got None as my return value.
As with many Excel/VBA issues, it turned out to be a format issue. I had the cell formatted as 'Accounting' instead of 'Number'. After changing it to 'Number', it worked.

I had the same question. The solution is to open the xlsx file manually, save it, and close it. After that, run the wbDataOnly part of the program again and you get the value 500.

Related

Unable to save formulas under excel file when it is saved using openpyxl lib

Formulas in the Excel sheet are removed when the file is saved through an openpyxl Python script.
Is there any way to save an Excel file from a Python script without removing the formulas?
Expected: formulas are kept and the data can be read through openpyxl.
Actual: the data is read, but the formulas are removed.

If you load the file with data_only=True, you read the value of each formula, not the formula itself. Saving that workbook then writes the values in place of the formulas.
From the docs:
data_only controls whether cells with formulae have either the formula (default) or the value stored the last time Excel read the sheet.

With xlwings, this issue is resolved.

I was able to resolve this issue for my assignment.
First, load the workbook without the data_only parameter and get the sheet, e.g.:
exl = openpyxl.load_workbook(exlFile)
sheet = exl["Sheet1"]
Then load the same file again, this time with data_only=True:
exl1 = openpyxl.load_workbook(exlFile, data_only=True)
sheet1 = exl1["Sheet1"]
Now read data from sheet1 and write changes to sheet.
When saving the workbook, use exl.save(exlFile), not exl1.save(exlFile).
With this I was able to retain all the formulas and still update the required cells.
Let me know if this is sufficient or if you need more info.
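The two-workbook approach described above can be sketched as a small helper. This is a minimal sketch under stated assumptions: the function name update_keeping_formulas, the demo workbook, and the cell addresses are hypothetical, and the demo file is created by openpyxl itself only so the example runs without Excel.

```python
import os
import tempfile

import openpyxl


def update_keeping_formulas(path, target, source):
    """Copy the value read from `source` into `target` without losing formulas."""
    editable = openpyxl.load_workbook(path)                # formulas intact
    cached = openpyxl.load_workbook(path, data_only=True)  # used for reading only
    editable.active[target] = cached.active[source].value
    # Save the formula workbook. Saving `cached` would replace formulas by values.
    editable.save(path)


def demo():
    """Build a throwaway workbook, update it, and return what was preserved."""
    wb = openpyxl.Workbook()
    ws = wb.active
    ws["A1"] = 200
    ws["A2"] = "=A1*2"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "book.xlsx")
        wb.save(path)
        update_keeping_formulas(path, "B1", "A1")
        reloaded = openpyxl.load_workbook(path).active
        return reloaded["A2"].value, reloaded["B1"].value


if __name__ == "__main__":
    print(demo())
```

One caveat: a file saved this way no longer carries cached results for its formulas, so it has to be opened and saved in Excel again before data_only=True returns their values.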

Openpyxl not writing to my Excel spreadsheet

I am trying to write to an Excel worksheet but the code does not do anything. I have the file name right and it correctly detects the only sheet ('Sheet1') but when I try to write to a cell nothing happens. I am running Microsoft Office 365 if that matters.
I have tried
wb = openpyxl.load_workbook('Spendings 2019.xlsx')
ws = wb.active
ws['B3'] = 4
This does not change the Excel file at all when run.

Did you read the docs? To see a cell's value you need
print(cell.value)
as explained at https://openpyxl.readthedocs.io/en/stable/usage.html
Also, if you are making changes to the spreadsheet, you need to save it, e.g. with wb.save('Spendings 2019.xlsx').
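A minimal sketch of the missing step: assigning to a cell only changes the workbook object in memory, and wb.save() is what writes it to disk. The helper name write_and_reload and the temporary stand-in file are assumptions made so the example runs on its own.

```python
import os
import tempfile

import openpyxl


def write_and_reload(path):
    """Write 4 into B3, save, and return the value read back from disk."""
    wb = openpyxl.load_workbook(path)
    ws = wb.active
    ws["B3"] = 4   # changes only the in-memory workbook
    wb.save(path)  # writes the change to the file
    return openpyxl.load_workbook(path).active["B3"].value


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "spendings.xlsx")
        openpyxl.Workbook().save(path)  # stand-in for the real spreadsheet
        return write_and_reload(path)


if __name__ == "__main__":
    print(demo())
```

If the file is open in Excel at the time, the save may fail with a permission error on Windows, so it is worth closing the spreadsheet first.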

Openpyxl is not able to read function value referring to a cell which it wrote into previously

I have a problem with the openpyxl package. To illustrate the issue, I have prepared a simple example.
I have an Excel file which contains nothing but the formula =A1 in cell B1. I would like to (1) write the value 123 into cell A1, (2) save the workbook, (3) open it again and (4) read the content of cell B1. Instead of 123 I get None. Below is a simple script which (I hope) does what I just described. Can anyone see what I am doing wrong?
import openpyxl
# open file and select sheet
wb = openpyxl.load_workbook('example.xlsx')
sheet = wb['Sheet1']
# write value into cell A1
sheet['A1'].value = 123
# save the file and close it
wb.save('example.xlsx')
wb.close()
# open the file again and select sheet
wb = openpyxl.load_workbook('example.xlsx', data_only=True)
sheet = wb['Sheet1']
# read value from cell containing reference to cell A1 => why does it return None?
print(sheet['B1'].value)
# close the file
wb.close()
Many thx,
Macky
PS: I am using python 3.5.5, openpyxl 2.5.6 and MS Office 2013 on Win7.

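Try setting the data_only parameter to True while loading the workbook.
from openpyxl import load_workbook
wb = load_workbook("example.xlsx", data_only=True)
sheet = wb["Sheet1"]
print(sheet['B1'].value)
This prints the result that was cached for the formula in B1 the last time Excel saved the file. openpyxl does not compute it, so the value is None if the file has not been calculated and saved by Excel since the last change.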

openpyxl does not and will not calculate the result of formulas, hence the formula B1=A1 will only be calculated when you open the excel sheet or use another program that will calculate it. There are other libraries that I believe can help, like pycel.

Since the above comments do not fully answer my question, I will add a link to another complementary thread. Hopefully it will help others facing the same problem. I did not realize that the updated excel file has to be opened and saved using Excel application...
Regards,
Macky

Python openpyxl data_only=True returning None

I have a simple excel file:
A1 = 200
A2 = 300
A3 = =SUM(A1:A2)
This file works in Excel and shows the proper value for the SUM, but with the openpyxl module for Python I cannot get the value in data_only=True mode.
Python code from the shell:
wb = openpyxl.load_workbook('writeFormula.xlsx', data_only = True)
sheet = wb.active
sheet['A3']
<Cell Sheet.A3> # python response
print(sheet['A3'].value)
None # python response
while:
wb2 = openpyxl.load_workbook('writeFormula.xlsx')
sheet2 = wb2.active
sheet2['A3'].value
'=SUM(A1:A2)' # python response
Any suggestions as to what I am doing wrong?

It depends upon the provenance of the file. data_only=True depends upon the value of the formula being cached by an application like Excel. If, however, the file was created by openpyxl or a similar library, then it's probable that the formula was never evaluated and, thus, no cached value is available and openpyxl will report None as the value.
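One way to check the provenance point in practice is to load the file twice and list the formula cells for which no cached value comes back. This is a sketch only: the helper name uncached_formula_cells is hypothetical, it looks at the active sheet alone, and the demo rebuilds the file from the question with openpyxl so that no cached value can exist.

```python
import os
import tempfile

import openpyxl


def uncached_formula_cells(path):
    """Return coordinates of formula cells whose cached value reads back as None."""
    formulas = openpyxl.load_workbook(path).active
    values = openpyxl.load_workbook(path, data_only=True).active
    missing = []
    for row in formulas.iter_rows():
        for cell in row:
            # data_type "f" marks a formula when the file is loaded normally.
            if cell.data_type == "f" and values[cell.coordinate].value is None:
                missing.append(cell.coordinate)
    return missing


def demo():
    """Create the workbook from the question with openpyxl alone and inspect it."""
    wb = openpyxl.Workbook()
    sheet = wb.active
    sheet["A1"] = 200
    sheet["A2"] = 300
    sheet["A3"] = "=SUM(A1:A2)"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "writeFormula.xlsx")
        wb.save(path)
        return uncached_formula_cells(path)


if __name__ == "__main__":
    print(demo())
```

A non-empty result suggests the file still needs to be opened and saved by Excel (or another application that calculates formulas) before data_only=True is useful.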

I have replicated the issue with Openpyxl and Python.
I am currently using openpyxl version 2.6.3 and Python 3.7.4. I am also assuming that you are trying to complete an exercise from Automate the Boring Stuff with Python (ATBSWP) by Al Sweigart.
I tried and tested Charlie Clark's answer, considering that Excel may indeed cache values. I opened the spreadsheet in Excel, copied and pasted the formula into the same exact cell, and finally saved the workbook. Upon reopening the workbook in Python with Openpyxl with the data_only=True option, and reading the value of this cell, I saw the proper value, 500, instead of the wrong value, the None type.
I hope this helps.

I had the same issue. This may not be the most elegant solution, but this is what worked for me:
import xlwings
from openpyxl import load_workbook
excel_app = xlwings.App(visible=False)
excel_book = excel_app.books.open('writeFormula.xlsx')
excel_book.save()
excel_book.close()
excel_app.quit()
workbook = load_workbook(filename='writeFormula.xlsx', data_only=True)

I have a suggestion for this problem: convert the xlsx file to csv :).
You will still have the original xlsx file. The conversion is done by LibreOffice (that is the subprocess call below). You can also use pandas for this as a more Pythonic way.
from subprocess import call
from openpyxl import load_workbook
import csv
filename = "test"
wb = load_workbook(filename + ".xlsx")
spread_range = wb['Sheet1']
# A1 holds the formula to be evaluated; openpyxl prints the formula text
print(spread_range.cell(row=1, column=1).value)
wb.close()
# this step can be done with subprocess or os.system():
# libreoffice --headless --convert-to csv $filename --outdir $outdir
call(["libreoffice", "--headless", "--convert-to", "csv", filename + ".xlsx"])
with open(filename + ".csv", newline='') as f:
    data = list(csv.reader(f))
# first field of the first CSV row, i.e. the value exported for A1
print(data[0][0])
or
import pandas as pd
# read the Excel file into a DataFrame
df = pd.read_excel("Test.xlsx")
# show the DataFrame
print(df)
I hope this helps somebody :-)

Yes, @Beno is right. If you want to edit the file without touching it yourself, you can make a little "robot" that edits your Excel file.
WARNING: This is a roundabout way to edit the Excel file. The timing depends on your machine, so make sure you set time.sleep properly before continuing with the rest of the code.
For instance, I use time.sleep, subprocess.Popen, and pywinauto.keyboard.send_keys to add some text to a cell of my choice and then save the file. After that, data_only=True works perfectly.
For more info, see the pywinauto.keyboard documentation.
# import these stuff
import subprocess
from pywinauto.keyboard import send_keys
import time
import pygetwindow as gw
import pywinauto
excel_path = r"C:\Program Files\Microsoft Office\root\Office16\EXCEL.EXE"
excel_file_path = r"D:\test.xlsx"
def focus_to_window(window_title=None):  # function to focus to window. https://stackoverflow.com/a/65623513/8903813
    window = gw.getWindowsWithTitle(window_title)[0]
    if not window.isActive:
        pywinauto.application.Application().connect(handle=window._hWnd).top_window().set_focus()
subprocess.Popen([excel_path, excel_file_path])
time.sleep(1.5) # wait for Excel to open. Depends on your machine, set it properly
focus_to_window("Excel") # focus on the opened file
send_keys('%{F3}') # Excel's name box | ALT+F3
send_keys('AA1{ENTER}') # whichever cell you want to insert something into | Type 'AA1' then press Enter
send_keys('Stackoverflow.com') # put whatever you want | Type 'Stackoverflow.com'
send_keys('^s') # save | CTRL+S
send_keys('%{F4}') # exit | ALT+F4
print("Done")
Sorry for my bad english.

As others already mentioned, openpyxl only reads the cached formula value in data_only mode. I have used PyWin32 to open and save each XLSX file before it is processed by openpyxl, so that the formula results can be read. This works well for me, as I don't process large files. This solution works only if you have MS Excel installed on your PC.
import os
import win32com.client
from openpyxl import load_workbook
# Open and save the XLSX file so the result of each stored formula is evaluated and cached for openpyxl to read.
# path and file are placeholders for the folder and file name of your workbook
excel_file = os.path.join(path, file)
excel = win32com.client.gencache.EnsureDispatch('Excel.Application')
excel.DisplayAlerts = False # disabling prompts to overwrite existing file
excel.Workbooks.Open(excel_file)
excel.ActiveWorkbook.SaveAs(excel_file, FileFormat=51, ConflictResolution=2)
excel.DisplayAlerts = True # enabling prompts
excel.ActiveWorkbook.Close()
# data_only=True is needed to get the cached values instead of the formulas
wb = load_workbook(excel_file, data_only=True)
# read your formula values with openpyxl and do other stuff here

I ran into the same issue. After reading through this thread I managed to fix it by simply opening the excel file, making a change then saving the file again. What a weird issue.

Extracting Hyperlinks From Excel (.xlsx) with Python

I have been looking at mostly the xlrd and openpyxl libraries for Excel file manipulation. However, xlrd currently does not support formatting_info=True for .xlsx files, so I can not use the xlrd hyperlink_map function. So I turned to openpyxl, but have also had no luck extracting a hyperlink from an excel file with it. Test code below (the test file contains a simple hyperlink to google with hyperlink text set to "test"):
import openpyxl
wb = openpyxl.load_workbook('testFile.xlsx')
ws = wb['Sheet1']
# openpyxl 2.0 and later use 1-based row and column indexes
r = 1
c = 1
print(ws.cell(row=r, column=c).value)
print(ws.cell(row=r, column=c).hyperlink)
# attribute from older openpyxl releases; it may not exist in current ones
print(ws.cell(row=r, column=c).hyperlink_rel_id)
Output:
test
None
I guess openpyxl does not currently support formatting completely either? Is there some other library I can use to extract hyperlink information from Excel (.xlsx) files?

This is possible with openpyxl:
import openpyxl
wb = openpyxl.load_workbook('yourfile.xlsm')
ws = wb['Sheet1']
# This will fail if there is no hyperlink to target
print(ws.cell(row=2, column=1).hyperlink.target)

Starting from at least openpyxl-2.4.0b1, this bug (https://bitbucket.org/openpyxl/openpyxl/issue/152/hyperlink-returns-empty-string-instead-of) is fixed. The cell's hyperlink attribute now returns a Hyperlink object:
hl_obj = ws.cell(row=r, column=c).hyperlink  # Hyperlink object for the cell, or None
if hl_obj:
    print(hl_obj.display)
    print(hl_obj.target)
    print(hl_obj.tooltip)  # shown when hovering the mouse over the hyperlink in Excel
    print(hl_obj)  # to see the other attributes if you need them

FYI, the problem with openpyxl is an actual bug.
And, yes, xlrd cannot read the hyperlink without formatting_info, which is currently not supported for xlsx.

In my experience, getting good .xlsx interaction requires moving to IronPython. This lets you work with the Common Language Runtime (clr) and interact directly with Excel.
http://ironpython.net/
import clr
clr.AddReference("Microsoft.Office.Interop.Excel")
import Microsoft.Office.Interop.Excel as Excel
excel = Excel.ApplicationClass()
wb = excel.Workbooks.Open('testFile.xlsx')
ws = wb.Worksheets['Sheet1']
address = ws.Cells(row, col).Hyperlinks.Item(1).Address

A successful solution I've worked with is to install unoconv on the server and implement a method that invokes this command-line tool via the subprocess module to convert the file from xlsx to xls, since hyperlink_map.get() works with xls.

For direct manipulation of Excel files it's also worth looking at the excellent XlWings library.

import openpyxl
wb = openpyxl.load_workbook('yourfile.xlsx')
ws = wb['Sheet1']
try:
    # This fails if there is no hyperlink
    print(ws.cell(row=2, column=1).hyperlink.target)
except AttributeError:
    print(ws.cell(row=2, column=1).value)
To handle the exception "'NoneType' object has no attribute 'target'", we can wrap the call in a try/except block. Then, even if the given cell has no hyperlink, the code prints the content of the cell.

Using .hyperlink.target instead of just .hyperlink should work. I was also getting None when I used only .hyperlink on the cell object.
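The answers above can be combined into a small helper that checks for a missing hyperlink explicitly instead of catching an exception. A minimal sketch, assuming a reasonably recent openpyxl where cell.hyperlink is either None or a Hyperlink object; the name link_target and the in-memory demo workbook are made up for illustration.

```python
import openpyxl


def link_target(cell):
    """Return the hyperlink target of a cell, or None if the cell has no hyperlink."""
    link = cell.hyperlink
    return link.target if link is not None else None


def demo():
    """Build a workbook in memory with one linked and one plain cell."""
    wb = openpyxl.Workbook()
    ws = wb.active
    ws["A1"] = "test"
    ws["A1"].hyperlink = "https://www.google.com"  # a string is wrapped in a Hyperlink
    ws["A2"] = "plain text"
    return link_target(ws["A1"]), link_target(ws["A2"])


if __name__ == "__main__":
    print(demo())
```

Links that point inside the workbook (to another sheet or cell) are stored in the Hyperlink object's location attribute rather than target, so a fuller version might check both.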
