Replace missing values in excel worksheet using openpyxl module

I'm trying to replace cells in my Excel worksheet that contain a hyphen "-" with the average of the cell above and the cell below. I've been trying to do this by looping through each row in column 3:
import math
from openpyxl import load_workbook
import openpyxl
d_filename="Snow.xlsx"
wb = load_workbook(d_filename)
sheet_ranges=wb["PIT 1"]
def interpolatrion_of_empty_cell():
    for i in range(7,31):
        if i =="-":
            sheet_ranges.cell(row = i, column = 3).value = mean(i-1,i+1)
        else:
            sheet_ranges.cell(row = i, column = 3).value
    wb.save(filename = d_filename)
Is this just too easy to do, or is it not possible with openpyxl?
cheers//
Smiffo

The values are not replaced because you compare i with "-". i is a row index, not the value of a cell, so the comparison is never true. Likewise, the mean is computed from the indices i-1 and i+1 rather than from the values of the cells above and below.
You could solve this in the following way:
def interpolatrion_of_empty_cell():
    for i in range(7, 31):
        cell_value = sheet_ranges.cell(row=i, column=3).value
        if cell_value == "-":
            # The cell above is row i - 1; the cell below is row i + 1.
            top_value = sheet_ranges.cell(row=i - 1, column=3).value
            bottom_value = sheet_ranges.cell(row=i + 1, column=3).value
            sheet_ranges.cell(row=i, column=3).value = (float(top_value) + float(bottom_value)) / 2
Note that this may require tweaking, as it does not account for cases where the cell above or below is itself "-", is not a number, or is empty.
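A minimal sketch of how those edge cases might be handled, assuming that "the nearest numeric cell" on each side is an acceptable substitute when the direct neighbour is "-", empty, or text, and that a cell with no numeric neighbour on one side should simply be left alone. The names to_number and fill_hyphens are made up for this illustration; ws can be any object that offers cell(row=..., column=...).value the way an openpyxl worksheet does.

```python
def to_number(value):
    """Return value as a float, or None if it is empty, "-", or not numeric."""
    if value is None or isinstance(value, bool):
        return None
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def fill_hyphens(ws, column=3, first_row=7, last_row=30):
    """Replace "-" cells with the mean of the nearest numeric cells above and below.

    Only rows first_row..last_row (inclusive) are read or written.
    Returns the list of rows that were filled.
    """
    rows = range(first_row, last_row + 1)
    # Read everything first so that filled-in values never feed later means.
    numbers = {r: to_number(ws.cell(row=r, column=column).value) for r in rows}

    filled = []
    for r in rows:
        if ws.cell(row=r, column=column).value != "-":
            continue
        # Walk upwards, then downwards, until a numeric value turns up.
        above = next((numbers[k] for k in range(r - 1, first_row - 1, -1)
                      if numbers[k] is not None), None)
        below = next((numbers[k] for k in range(r + 1, last_row + 1)
                      if numbers[k] is not None), None)
        if above is None or below is None:
            continue  # no numeric neighbour on one side: leave the "-" in place
        ws.cell(row=r, column=column).value = (above + below) / 2
        filled.append(r)
    return filled
```

With the workbook from the question, this could be called as fill_hyphens(sheet_ranges, column=3, first_row=7, last_row=30), followed by wb.save(d_filename).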

Related

Python openpyxl to automate entire column in excel

import openpyxl
i=2
workbook= openpyxl.load_workbook()
sheet = workbook.active
for i, cellObj in enumerate (sheet['I'],2):
    cellObj.value = '=IF(ISNUMBER(A2)*(A2<>0),A2,IF(ISNUMBER(F2)*(F2<>0),F2,IF(ISBLANK(A2)*ISBLANK(F2)*ISBLANK(H2),0,H2)))'
workbook.save()
Using openpyxl, I tried to apply a formula to the entire column 'I', but it is not working as intended: I wanted the formula to start from I2, but it starts from I1 and gives the wrong output as well.
I have attached a screenshot.
Can someone please correct the code?
Output of print(list(enumerate(sheet['I']))):

You'd probably be better off doing it this way: skip row 1 automatically by starting the iteration at row 2, and build the formula from each cell's row number.
import openpyxl
excelfile = 'foo.xlsx'
workbook= openpyxl.load_workbook(excelfile)
sheet = workbook.active
mr = sheet.max_row # Last row to add formula to
for row in sheet.iter_rows(min_col=9, max_col=9, min_row=2, max_row=mr):
    for cell in row:
        cr = cell.row # Get the current row number to use in formula
        cell.value = f'=IF(ISNUMBER(A{cr})*(A{cr} <> 0), A{cr}, IF(ISNUMBER(F{cr})*(F{cr} <> 0), F{cr}, IF(ISBLANK(A{cr})*ISBLANK(F{cr})*ISBLANK(H{cr}), 0, H{cr})))'
workbook.save(excelfile)
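For context on why the code in the question starts at I1: sheet['I'] returns every cell in the column beginning with row 1, and the second argument of enumerate() only changes the counter, not where iteration begins. The question's formula string also hard-codes row 2 and never uses the counter. The small illustration below uses plain strings as stand-ins for cells, so it runs without a workbook.

```python
# Stand-ins for the cells returned by sheet['I']; a column slice always
# begins at row 1, whatever start value enumerate() is given.
column_cells = ["I1", "I2", "I3", "I4"]

# enumerate(..., 2) only changes the counter, not where iteration begins.
mislabelled = list(enumerate(column_cells, 2))

# Skipping the header explicitly keeps counter and row in step.
aligned = list(enumerate(column_cells[1:], 2))

print(mislabelled)  # [(2, 'I1'), (3, 'I2'), (4, 'I3'), (5, 'I4')]
print(aligned)      # [(2, 'I2'), (3, 'I3'), (4, 'I4')]
```

The same slicing works on the real tuple, for example enumerate(sheet['I'][1:], 2), provided the counter is then used inside the formula string.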

If you know the from and to row numbers, then you can use it like this:
from openpyxl import load_workbook
wb = load_workbook(filename="/content/sample_data/Book1.xlsx")
ws = wb.active
from_row = 2
to_row = 4
for i in range(from_row, to_row+1):
    ws[f"C{i}"] = f'=_xlfn.CONCAT(A{i}, "_", B{i})'
wb.save("/content/sample_data/formula.xlsx")
Input (Book1.xlsx):
Output (formula.xlsx):
I don't have your data, so I have not tested the following, but your formula can be translated into an f-string like this:
for i in range(from_row, to_row+1):
    ws[f"I{i}"] = f'=IF(ISNUMBER(A{i})*(A{i}<>0),A{i},IF(ISNUMBER(F{i})*(F{i}<>0),F{i},IF(ISBLANK(A{i})*ISBLANK(F{i})*ISBLANK(H{i}),0,H{i})))'
It formats the formula as:
=IF(ISNUMBER(A2)*(A2<>0),A2,IF(ISNUMBER(F2)*(F2<>0),F2,IF(ISBLANK(A2)*ISBLANK(F2)*ISBLANK(H2),0,H2)))
=IF(ISNUMBER(A3)*(A3<>0),A3,IF(ISNUMBER(F3)*(F3<>0),F3,IF(ISBLANK(A3)*ISBLANK(F3)*ISBLANK(H3),0,H3)))
=IF(ISNUMBER(A4)*(A4<>0),A4,IF(ISNUMBER(F4)*(F4<>0),F4,IF(ISBLANK(A4)*ISBLANK(F4)*ISBLANK(H4),0,H4)))

How to paste values only in Excel using Python and openpyxl

I have an Excel worksheet.
In column J I have some source data which I used to make calculations in column K.
Column K has the values I need, but when I click on a cell the formula shows up.
I only want the values from column K, not the formula.
I read somewhere that I need to set data_only=True, which I have done.
I then pasted the data from column K to column L (with the intention of later deleting columns J and K).
I thought that column L would have only the values from K, but if I click on a cell, the formula still shows up.
How do I simply paste values only from one column to another?
import openpyxl
wb = openpyxl.load_workbook('edited4.xlsx', data_only=True)
sheet = wb['Sheet1']
last_row = 100
for i in range(2, last_row):
    cell = "K" + str(i)
    a_cell = "J" + str(i)
    sheet[cell] = '=IF(' + a_cell + '="R","Yes","No")'
rangeselected = []
for i in range (1, 100,1):
    rangeselected.append(sheet.cell(row = i, column = 11).value)
for i in range (1, 1000,1):
    sheet.cell(row=i, column=12).value = rangeselected[i-1]
wb.save('edited4.xlsx')

It's been a while since I've used openpyxl. But:
Openpyxl doesn't run an Excel formula. It reads either the formula string or the result of the last calculation run by Excel*. This means that if a formula is written outside of Excel, and the file has never been opened by Excel, then only the formula will be available. Unless you need to display (for historical purposes, etc.) what the formula is, you should do the calculation in Python, which will be faster and more efficient anyway.
* When I say Excel, I also include any Excel-like spreadsheet that will cache the results of the last run.
Try this (adjust column numbers as desired):
import openpyxl
wb = openpyxl.load_workbook('edited4.xlsx', data_only=True)
sheet = wb['Sheet1']
last_row = 100
data_column = 11
test_column = 12
result_column = 13
for i in range(2, last_row):
    if sheet.cell(row=i, column=test_column).value == "R":
        sheet.cell(row=i, column=result_column).value = "Yes"
    else:
        sheet.cell(row=i, column=result_column).value = "No"
wb.save('edited4.xlsx')
If you have a well-formed data sheet, you could probably shorten this by another step or two by using enumerate() and Worksheet.iter_rows() but I'll leave that to your imagination.
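For what it's worth, here is a rough sketch of what the Worksheet.iter_rows() variant might look like. The function name mark_yes_no and its parameters are invented for this example. Leaving last_row as None lets iter_rows() run to the sheet's last used row, which is an assumption about what "well-formed" means here.

```python
def mark_yes_no(ws, test_column, result_column, first_row=2, last_row=None):
    """Write "Yes"/"No" into result_column depending on test_column == "R".

    ws is expected to be an openpyxl worksheet (anything offering iter_rows()
    and cell() in the same way would do). Returns the number of rows written.
    """
    written = 0
    # Restricting min_col and max_col to one column yields one-cell rows.
    for (cell,) in ws.iter_rows(min_row=first_row, max_row=last_row,
                                min_col=test_column, max_col=test_column):
        result = "Yes" if cell.value == "R" else "No"
        # Plain strings are written, so no formula ends up in the file.
        ws.cell(row=cell.row, column=result_column).value = result
        written += 1
    return written
```

With the workbook above this could be called as mark_yes_no(sheet, test_column=12, result_column=13) before wb.save('edited4.xlsx').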

Openpyxl - Copy range of cells(with formula) from a workbook to another

I'm trying to copy specific rows from Workbook 1 and append them to the existing data in Workbook 2.
Copy the highlighted rows from Workbook 1 and append them in Workbook 2 below 'March'.
So far I have succeeded in copying and pasting the range, but there are two problems:
1. The cells are shifted.
2. The percentage (formula) is missing, leaving only numeric values.
See the result here.
import openpyxl as xl
source = r"C:\Users\Desktop\Test_project_20200401.xlsx"
wbs = xl.load_workbook(source)
wbs_sheet = wbs["P2"] #selecting the sheet
destination = r"C:\Users\Desktop\Try999.xlsx"
wbd = xl.load_workbook(destination)
wbd_sheet = wbd["A3"] #select the sheet
row_data = 0
for row in wbs_sheet.iter_rows():
    for cell in row:
        if cell.value == "Yes":
            row_data += cell.row
for row in wbs_sheet.iter_rows(min_row=row_data, min_col = 1, max_col=250, max_row = row_data+1):
    wbd_sheet.append((cell.value for cell in row))
wbd.save(destination)
Does anyone have any idea on how can I solve this?
Any feedback/solution would help!
Thanks!

I think min_col should = 0
Range("A1").Formula (in VBA) gets the formula.
Range("A1").Value (in VBA) gets the value.
So try using .formula in Python
(thanks to: Get back a formula from a cell - VBA ... if this works)

Just want to add my own solution in here.
What I did was iterate through the columns and apply cell.number_format = '0%', which displays the cell value as a percentage.
for col in ws.iter_cols(min_row=1, min_col=2, max_row=250, max_col=250):
    for cell in col:
        cell.number_format = '0%'
More info can be found in here:
https://openpyxl.readthedocs.io/en/stable/_modules/openpyxl/styles/numbers.html
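Another option, sketched here under assumptions rather than tested against the workbooks in the question, is to copy cell by cell to explicit coordinates and carry the number format along, instead of relying on append(). When a workbook is loaded without data_only=True, openpyxl returns a formula cell's text through cell.value, so the formula itself is copied. The helper name copy_rows and its parameters are hypothetical.

```python
def copy_rows(src_ws, dst_ws, src_rows, dst_start_row, max_col):
    """Copy the given source rows into dst_ws, starting at dst_start_row.

    Values are copied cell by cell into the same column they came from, so
    nothing shifts sideways. A formula is copied as text (e.g. "=B2/C2"),
    so its references are NOT adjusted to the new row.
    Returns the first free row after the copied block.
    """
    src_rows = list(src_rows)
    for offset, src_row in enumerate(src_rows):
        dst_row = dst_start_row + offset
        for col in range(1, max_col + 1):
            src_cell = src_ws.cell(row=src_row, column=col)
            dst_cell = dst_ws.cell(row=dst_row, column=col)
            dst_cell.value = src_cell.value
            # Carry over the display format so 0.25 still shows as 25%.
            dst_cell.number_format = src_cell.number_format
    return dst_start_row + len(src_rows)
```

With the sheets from the question this might be called as copy_rows(wbs_sheet, wbd_sheet, [row_data, row_data + 1], wbd_sheet.max_row + 1, 250). Because the formula text is copied unchanged, any relative references would still need adjusting by hand if the destination row differs from the source row.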

how to write to a new cell in python using openpyxl

I wrote code which opens an excel file and iterates through each row and passes the value to another function.
import openpyxl
wb = load_workbook(filename='C:\Users\xxxxx')
for ws in wb.worksheets:
    for row in ws.rows:
        print row
        x1=ucr(row[0].value)
        row[1].value=x1 # i am having error at this point
I am getting the following error when I tried to run the file.
TypeError: IndexError: tuple index out of range
Can I write the returned value x1 to the row[1] column? Is it possible to write to Excel using row[1] instead of accessing single cells like ws['c1']=x1?

Try this:
from openpyxl import load_workbook
wb = load_workbook(filename='xxxx.xlsx')
ws = wb.worksheets[0]
ws['A1'] = 1
ws.cell(row=2, column=2).value = 2
This will set Cells A1 and B2 to 1 and 2 respectively (two different ways of setting cell values in a worksheet).
The second method (specifying row and column) is most useful for your situation:
from openpyxl import load_workbook
wb = load_workbook(filename='xxxxx.xlsx')
for ws in wb.worksheets:
    for index, row in enumerate(ws.rows, start=1):
        print(row)
        x1 = ucr(row[0].value)
        # ws.cell() creates the cell if needed, so column 2 need not exist yet
        ws.cell(row=index, column=2).value = x1

Insert column using openpyxl

I'm working on a script that modifies an existing excel document and I need to have the ability to insert a column between two other columns like the VBA macro command .EntireColumn.Insert.
Is there any method with openpyxl to insert a column like this?
If not, any advice on writing one?

Here is an example of a much faster way:
import openpyxl
filename = "filename.xlsx"
wb = openpyxl.load_workbook(filename)
sheet = wb.worksheets[0]
# this statement inserts a column before column 2
sheet.insert_cols(2)
wb.save(filename)

I haven't found anything like .EntireColumn.Insert in openpyxl.
The first thought that comes to mind is to insert the column manually by modifying _cells on the worksheet. I don't think it's the best way to insert a column, but it works. Note that the code is Python 2 (xrange, iteritems) and relies on an old openpyxl API:
from openpyxl.workbook import Workbook
from openpyxl.cell import get_column_letter, Cell, column_index_from_string, coordinate_from_string
wb = Workbook()
dest_filename = r'empty_book.xlsx'
ws = wb.worksheets[0]
ws.title = "range names"
# inserting sample data
for col_idx in xrange(1, 10):
    col = get_column_letter(col_idx)
    for row in xrange(1, 10):
        ws.cell('%s%s' % (col, row)).value = '%s%s' % (col, row)
# inserting column between 4 and 5
column_index = 5
new_cells = {}
ws.column_dimensions = {}
for coordinate, cell in ws._cells.iteritems():
    column_letter, row = coordinate_from_string(coordinate)
    column = column_index_from_string(column_letter)
    # shifting columns
    if column >= column_index:
        column += 1
    column_letter = get_column_letter(column)
    coordinate = '%s%s' % (column_letter, row)
    # it's important to create new Cell object
    new_cells[coordinate] = Cell(ws, column_letter, row, cell.value)
ws._cells = new_cells
wb.save(filename=dest_filename)
I understand that this solution is very ugly, but I hope it helps you think in the right direction.
