Write formula to Excel with Python error - python

I am trying to follow this question to add a formula to my Excel file using Python and the openpyxl package.
That link covers exactly what I need for my task.
But in this code:
for i, cellObj in enumerate(Sheet.columns[2], 1):
    cellObj.value = '=IF($A${0}=$B${0}, "Match", "Mismatch")'.format(i)
I get an error at Sheet.columns[2]. Any idea why? I followed the complete code.
I am on Python 2.7.13, if that helps with this error.
UPDATE
Complete code:
import openpyxl
wb = openpyxl.load_workbook('test1.xlsx')
print wb.get_sheet_names()
Sheet = wb.worksheets[0]
for i, cellObj in enumerate(Sheet.columns[2], 1):
    cellObj.value = '=IF($A${0}=$B${0}, "Match", "Mismatch")'.format(i)
Error message:
    for i, cellObj in enumerate(Sheet.columns[2], 1):
TypeError: 'generator' object has no attribute '__getitem__'

ws.columns and ws.rows are properties that return generators, and a generator cannot be indexed. However, openpyxl also supports slicing and indexing on the worksheet itself for rows and columns.
So ws['C'] will give you the cells in the third column (as a tuple).
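The TypeError in the question is ordinary generator behaviour rather than anything specific to Excel. The sketch below uses a plain Python generator as a stand-in for ws.columns (the name fake_columns and its sample values are made up for illustration) to show the failure and two generic ways around it:

```python
from itertools import islice


def fake_columns():
    """Hypothetical stand-in for ws.columns: yields one tuple per column."""
    yield ("A1", "A2", "A3")
    yield ("B1", "B2", "B3")
    yield ("C1", "C2", "C3")


# Indexing a generator directly fails, as in the question.
try:
    fake_columns()[2]
    indexing_failed = False
except TypeError:
    indexing_failed = True

# Option 1: materialise the generator into a list, then index it.
third_via_list = list(fake_columns())[2]

# Option 2: skip ahead lazily without building the whole list.
third_via_islice = next(islice(fake_columns(), 2, None))

print(indexing_failed, third_via_list, third_via_islice)
```

With a real worksheet, indexing the sheet by column letter as described above is usually the shorter route.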

For other Stack adventurers looking to copy/paste a formula:
from openpyxl import load_workbook
# Writing from pandas back to an existing Excel workbook
wb = load_workbook(filename=myfilename, read_only=False, keep_vba=True)
ws = wb['Mysheetname']
# Paste a VLOOKUP formula: look at column A, put the result in column AC.
for i, cellObj in enumerate(ws['AC'], 1):
    cellObj.value = "=VLOOKUP($A${0}, 'LibrarySheet'!C:D,2,FALSE)".format(i)
One issue: I have a header and the formula overwrites it. Does anyone know how to start from row 2?

If you want to start from another row, you can either use an if statement to skip the first row or specify the range in the enumeration. A coded example is below:
from openpyxl import load_workbook
wb = load_workbook(filename=myfilename, read_only=False, keep_vba=True)
ws = wb['Mysheetname']
# using an if statement
for i, cellObj in enumerate(ws['AC'], 1):
    if i > 1:
        cellObj.value = "=VLOOKUP($A${0}, 'LibrarySheet'!C:D,2,FALSE)".format(i)
# specifying a range, up to the max row on the worksheet - or you can specify an exact range
for i, cellObj in enumerate(ws['AC2:AC' + str(ws.max_row)], 2):
    cellObj[0].value = "=VLOOKUP($A${0}, 'LibrarySheet'!C:D,2,FALSE)".format(i)
The second method requires you to begin the index at 2. Each item it yields is a one-cell tuple (one row of the range) rather than a cell object, so you need cellObj[0].value to reach the cell's value.
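To make the shape of the range result concrete, here is a minimal sketch that uses nested tuples of strings in place of openpyxl cells. The variable cell_range is a hypothetical stand-in for ws['AC2:AC4'], and the dict stands in for assigning to each cell's value:

```python
# Hypothetical stand-in for ws['AC2:AC4']: a tuple of rows,
# where each row is itself a tuple holding a single cell.
cell_range = (("AC2",), ("AC3",), ("AC4",))

template = "=VLOOKUP($A${0}, 'LibrarySheet'!C:D,2,FALSE)"

formulas = {}
# Start enumerate at 2 so the counter matches the Excel row number.
for i, row in enumerate(cell_range, 2):
    cell = row[0]  # unpack the single cell from the row tuple
    formulas[cell] = template.format(i)

for cell, formula in formulas.items():
    print(cell, formula)
```

The counter passed to format() only stays correct if the start value given to enumerate matches the first row of the range.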

Fortunately, it is now easy to write formulas into specific cells. There are also simpler accessors to use, such as:
wb.sheetnames instead of wb.get_sheet_names()
sheet = wb['SHEET_NAME'] instead of sheet = wb.get_sheet_by_name('SHEET_NAME')
And formulas can be easily inserted with:
sheet['A1'] = '=SUM(1+1)'

Related

Python openpyxl to automate entire column in excel

import openpyxl
i = 2
workbook = openpyxl.load_workbook()
sheet = workbook.active
for i, cellObj in enumerate(sheet['I'], 2):
    cellObj.value = '=IF(ISNUMBER(A2)*(A2<>0),A2,IF(ISNUMBER(F2)*(F2<>0),F2,IF(ISBLANK(A2)*ISBLANK(F2)*ISBLANK(H2),0,H2)))'
workbook.save()
Using openpyxl, I tried to apply a formula to the entire column 'I', but it does not behave as intended: I wanted the formula to start from I2, but it starts from I1, and the output is wrong as well.
I have attached a screenshot.
Can someone please correct the code?
Output of print(list(enumerate(sheet['I']))):
You'd probably be better off doing it this way: skip row 1 automatically by starting the iteration at row 2, and build the formula from each cell's row number.
import openpyxl
excelfile = 'foo.xlsx'
workbook = openpyxl.load_workbook(excelfile)
sheet = workbook.active
mr = sheet.max_row  # Last row to add formula to
for row in sheet.iter_rows(min_col=9, max_col=9, min_row=2, max_row=mr):
    for cell in row:
        cr = cell.row  # Get the current row number to use in formula
        cell.value = f'=IF(ISNUMBER(A{cr})*(A{cr} <> 0), A{cr}, IF(ISNUMBER(F{cr})*(F{cr} <> 0), F{cr}, IF(ISBLANK(A{cr})*ISBLANK(F{cr})*ISBLANK(H{cr}), 0, H{cr})))'
workbook.save(excelfile)
If you know the from and to row numbers, then you can use it like this:
from openpyxl import load_workbook
wb = load_workbook(filename="/content/sample_data/Book1.xlsx")
ws = wb.active
from_row = 2
to_row = 4
for i in range(from_row, to_row+1):
    ws[f"C{i}"] = f'=_xlfn.CONCAT(A{i}, "_", B{i})'
wb.save("/content/sample_data/formula.xlsx")
Input (Book1.xlsx):
Output (formula.xlsx):
I don't have your data, so I did not test the following formula, but your formula can be rewritten as an f-string like this:
for i in range(from_row, to_row+1):
    ws[f"I{i}"] = f'=IF(ISNUMBER(A{i})*(A{i}<>0),A{i},IF(ISNUMBER(F{i})*(F{i}<>0),F{i},IF(ISBLANK(A{i})*ISBLANK(F{i})*ISBLANK(H{i}),0,H{i})))'
It formats the formula as:
=IF(ISNUMBER(A2)*(A2<>0),A2,IF(ISNUMBER(F2)*(F2<>0),F2,IF(ISBLANK(A2)*ISBLANK(F2)*ISBLANK(H2),0,H2)))
=IF(ISNUMBER(A3)*(A3<>0),A3,IF(ISNUMBER(F3)*(F3<>0),F3,IF(ISBLANK(A3)*ISBLANK(F3)*ISBLANK(H3),0,H3)))
=IF(ISNUMBER(A4)*(A4<>0),A4,IF(ISNUMBER(F4)*(F4<>0),F4,IF(ISBLANK(A4)*ISBLANK(F4)*ISBLANK(H4),0,H4)))

How to paste values only in Excel using Python and openpyxl

I have an Excel worksheet.
In column J I have some source data, which I used to make calculations in column K.
Column K has the values I need, but when I click on a cell the formula shows up.
I only want the values from column K, not the formula.
I read somewhere that I need to set data_only=True, which I have done.
I then pasted the data from column K into column L (with the intention of later deleting columns J and K).
I thought that column L would have only the values from K, but if I click on a cell, the formula still shows up.
How do I simply paste values only from one column to another?
import openpyxl
wb = openpyxl.load_workbook('edited4.xlsx', data_only=True)
sheet = wb['Sheet1']
last_row = 100
for i in range(2, last_row):
    cell = "K" + str(i)
    a_cell = "J" + str(i)
    sheet[cell] = '=IF(' + a_cell + '="R","Yes","No")'
rangeselected = []
for i in range (1, 100,1):
    rangeselected.append(sheet.cell(row = i, column = 11).value)
for i in range (1, 1000,1):
    sheet.cell(row=i, column=12).value = rangeselected[i-1]
wb.save('edited4.xlsx')
It's been a while since I've used openpyxl. But:
openpyxl doesn't evaluate Excel formulas. It reads either the formula string or the result of the last calculation run by Excel*. This means that if a formula is created outside of Excel, and the file has never been opened by Excel, then only the formula will be available. Unless you need to display (for historical purposes, etc.) what the formula is, you should do the calculation in Python, which will be faster and more efficient anyway.
* When I say Excel, I also include any Excel-like spreadsheet that will cache the results of the last run.
Try this (adjust column numbers as desired):
import openpyxl
wb = openpyxl.load_workbook('edited4.xlsx', data_only=True)
sheet = wb['Sheet1']
last_row = 100
data_column = 11
test_column = 12
result_column = 13
for i in range(2, last_row):
    if sheet.cell(row=i, column=test_column).value == "R":
        sheet.cell(row=i, column=result_column).value = "Yes"
    else:
        sheet.cell(row=i, column=result_column).value = "No"
wb.save('edited4.xlsx')
If you have a well-formed data sheet, you could probably shorten this by another step or two by using enumerate() and Worksheet.iter_rows() but I'll leave that to your imagination.
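As one way to picture that, here is a minimal sketch of the same Yes/No rule applied with enumerate(). A plain list stands in for the values read from the worksheet column; the list contents and the helper name yes_no are assumptions made for illustration, so nothing here touches openpyxl:

```python
def yes_no(value):
    """Mirror the rule in =IF(cell="R","Yes","No").

    Note: this comparison is case-sensitive, whereas Excel's "=" on text is not.
    """
    return "Yes" if value == "R" else "No"


# Hypothetical column values for rows 2, 3, 4, 5 of the sheet.
source_values = ["R", "X", None, "R"]

# Start counting at 2 so each key matches the Excel row number.
results = {row: yes_no(value) for row, value in enumerate(source_values, start=2)}

print(results)
```

With a real worksheet, the loop would write the result of yes_no() into the result cell of each row instead of collecting it in a dict.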

Openpyxl - Copy range of cells(with formula) from a workbook to another

I'm trying to copy specific rows from Workbook 1 and append them to the existing data in Workbook 2.
Copy the highlighted rows from Workbook 1 and append them in Workbook 2 below 'March'.
So far I have succeeded in copying and pasting the range, but there are two problems:
1. Cells are shifted.
2. The percentage (formula) is missing, leaving only numeric values.
See Result here
import openpyxl as xl
source = r"C:\Users\Desktop\Test_project_20200401.xlsx"
wbs = xl.load_workbook(source)
wbs_sheet = wbs["P2"] #selecting the sheet
destination = r"C:\Users\Desktop\Try999.xlsx"
wbd = xl.load_workbook(destination)
wbd_sheet = wbd["A3"] #select the sheet
row_data = 0
for row in wbs_sheet.iter_rows():
    for cell in row:
        if cell.value == "Yes":
            row_data += cell.row
for row in wbs_sheet.iter_rows(min_row=row_data, min_col = 1, max_col=250, max_row = row_data+1):
    wbd_sheet.append((cell.value for cell in row))
wbd.save(destination)
Does anyone have any idea on how can I solve this?
Any feedback/solution would help!
Thanks!
I think min_col should = 0
Range("A1").Formula (in VBA) gets the formula.
Range("A1").Value (in VBA) gets the value.
So try using .formula in Python
(thanks to: Get back a formula from a cell - VBA ... if this works)
Just want to add my own solution here.
What I did was iterate through the columns and apply cell.number_format = '0%', which displays each cell value as a percentage.
for col in ws.iter_cols(min_row=1, min_col=2, max_row=250, max_col=250):
    for cell in col:
        cell.number_format = '0%'
More info can be found in here:
https://openpyxl.readthedocs.io/en/stable/_modules/openpyxl/styles/numbers.html

Finding Range of active/selected cell in Excel using Python and xlwings

I am trying to write a simple function in Python (with xlwings) that reads a current 'active' cell value in Excel and then writes that cell value to the cell in the next column along from the active cell.
If I specify the cell using an absolute reference, for example range(3, 2), then everything is OK. However, I can't seem to find the row and column values of whichever cell is selected when the function is run.
I have found a lot of examples where the reference is specified but not where the active cell range can vary depending on the user selection.
I have tried a few ideas. The first option is trying to use the App.selection that I found in the v0.10.0 xlwings documentation but this doesn't seem to return a range reference that can be used - I get an error "Invalid parameter" when trying to retrieve the row from 'cellRange':
def refTest():
    import xlwings as xw
    wb = xw.Book.caller()
    cellRange = xw.App.selection
    rowNum = wb.sheets[0].range(cellRange).row
    colNum = wb.sheets[0].range(cellRange).column
    url = wb.sheets[0].range(rowNum, colNum).value
    wb.sheets[0].range(rowNum, colNum + 1).value = url
The second idea was to try to read the row and column directly from the cell selection but this gives me the error "Property object has no attribute 'row'":
def refTest():
    import xlwings as xw
    wb = xw.Book.caller()
    rowNum = xw.App.selection.row
    colNum = xw.App.selection.column
    url = wb.sheets[0].range(rowNum, colNum).value
    wb.sheets[0].range(rowNum, colNum + 1).value = url
Is it possible to pass the range of the active/selected cell from Excel to Python with xlwings? If anyone is able to shed some light on this then I would really appreciate it.
Thanks!
You have to get the app object from the workbook. You'd only use xw.App directly if you wanted to instantiate a new app. Also, selection returns a Range object, so do this:
cellRange = wb.app.selection
rowNum = cellRange.row
colNum = cellRange.column

Using generators outside of a loop

Relatively new to Python, so please excuse the newbie question, but Google isn't helping at this time.
I have 100 very large xlsx files from which I need to extract the first row (specifically cell A2). I found this gem of a tool called openpyxl, which will iterate through my data files without loading everything into memory. It uses a generator to get the relevant row on each call.
The thing that I can't get is how to initialize a generator outside of a loop. Right now my code is:
from openpyxl import load_workbook
wb = load_workbook(filename = "merged01.xlsx", use_iterators= True)
sheetName = wb.get_sheet_names()
ws = wb.get_sheet_by_name(name = sheetName[0])
row = ws.iter_rows() #row is a generator
for cell in row:
    break
print (cell[1].internal_value) # A2
But there has to be a better way of doing this such as:
...
row = ws.iter_rows() #row is a generator
cell = row.first # line I'm trying to KISS
print (cell[1].internal_value) # A2
cell = next(row)
The next function retrieves the next value from any iterator.
You're looking for next().
cell = next(row)
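A small sketch of next() on a plain generator, used here as a stand-in for ws.iter_rows() (the generator fake_iter_rows and its values are hypothetical). The optional second argument is a default that is returned once the iterator is exhausted, which avoids a StopIteration when there are no more rows:

```python
def fake_iter_rows():
    """Hypothetical stand-in for ws.iter_rows(): yields one tuple per row."""
    yield ("A1 value", "B1 value")
    yield ("A2 value", "B2 value")


rows = fake_iter_rows()

first_row = next(rows)        # first item, no loop needed
second_row = next(rows)       # each call advances the generator by one
exhausted = next(rows, None)  # default instead of raising StopIteration

print(first_row, second_row, exhausted)
```

Because each call consumes one item, calling next() twice is one way to reach the second row without a for loop.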
