Move to adjacent cells using openpyxl - python

I have an algorithm that finds a value in a cell; for this case, let's say that cell is C10. I need to look next to it in column D for a value, and if that value doesn't match what I need, go up one cell and check for a match, and so on. I have this so far:
bits = []
for row in ws.iter_rows(row_offset=4, column_offset=3):
    # skip over empty rows
    if row:
        # current cell is in column C
        cell = row[2]
        try:
            # find the lowest address in the excel sheet
            if cell.internal_value == min(address):
                # somehow match up in column D
                for '''loop and search col D''':
                    if str(row[3].internal_value).upper() in ('CONTROL 1', 'CON 1'):
                        # add bits
                        for cell in row[4:]:
                            bits.append(cell.internal_value)
        # pass over cells that aren't a number, ie values that will never match an address
        except ValueError:
            pass
        except TypeError:
            pass
Is there a way to do this? I know that row[3] compares against column D, but if it isn't a match the first time, I don't know how to go up the column. In other words, changing the index in row[index] moves along the row, and I need to know how to move along the column.
Thanks!

bits = []
min_address = False
for row in ws.iter_rows(row_offset=4, column_offset=3):
    c = row[2]
    d = row[3]
    if not d.internal_value:  # d will always have a value if the row isn't blank
        if min_address:
            break  # bits is what you want it to be now
        bits = []  # reset bits every time we hit a new row
        continue  # this will just skip to next row
    for bits_cell in row[4:]:
        if bits_cell.internal_value:
            bits.append(bits_cell.internal_value)
    if c.internal_value:
        if c.internal_value == min(address):
            min_address = True  # we set it to true, then kept going until blank row
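To address the "how do I move up the column" part directly: one option is to look cells up by row and column number with ws.cell(row=..., column=...) instead of indexing into the row tuple. The following is only a minimal sketch on an in-memory workbook with made-up data; the helper name find_label_above and the cell layout are assumptions for illustration, not part of the original sheet.

```python
from openpyxl import Workbook


def find_label_above(ws, start_row, column, wanted):
    """Walk upward from start_row in one column until a wanted label is found.

    Returns the matching row number, or None if the top of the sheet is reached.
    (Hypothetical helper, for illustration only.)
    """
    for row_idx in range(start_row, 0, -1):  # openpyxl rows are 1-based
        value = ws.cell(row=row_idx, column=column).value
        if value is not None and str(value).upper() in wanted:
            return row_idx
    return None


# Made-up layout: the address is in C10, its label sits two rows up in D8,
# and the bits live to the right of the label in E8 and F8.
wb = Workbook()
ws = wb.active
ws["C10"] = 100
ws["D8"] = "Control 1"
ws["E8"] = 1
ws["F8"] = 0

label_row = find_label_above(ws, start_row=10, column=4,
                             wanted={"CONTROL 1", "CON 1"})
bits = [ws.cell(row=label_row, column=col).value for col in (5, 6)]
```

If you already hold a cell object, cell.offset(row=-1) returns the cell directly above it, which is another way to step up a column one cell at a time.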

Related

Openpyxl: backfill a row that has a cell with a partial string match?

I'm working on a program that filters a large .xlsx file and then splits those filtered sets out onto new sheets in a new workbook.
One problem I'm trying to fix is recreating the old format on the new sheets using openpyxl. I can't figure out how to make a partial string match without getting a TypeError.
Relevant code snippet:
from openpyxl.styles import PatternFill, Font

def set_style(sheet, type_row, group1, group2):
    for row in sheet.iter_rows():
        for cell in row:
            if row[type_row].value in group1:
                if 'START:' in cell.value:
                    cell.fill = PatternFill(start_color='00FF00', end_color='00FF00', fill_type='solid')
                    cell.font = Font(bold=True)
                if 'END:' in cell.value:
                    cell.fill = PatternFill(start_color='99FF99', end_color='99FF99', fill_type='solid')
                    cell.font = Font(bold=True)
            if row[type_row].value in group2:
                cell.font = Font(bold=True)
                cell.fill = PatternFill(start_color='FFA500', end_color='FFA500', fill_type='solid')
The if-statement for group2 works fine on its own; the error only appears when I check for "START:" or "END:".
Any help would be appreciated!
This is how I solved it for a near match in one of my use cases.
Please see if it helps:
# Method to get row indexes of searched values in a specific column with near match condition
# Arguments to this method are:- Row number for header row, Column name & Search value
def get_row_idx_lst_based_on_search_val_specific_col_near_match(self, _header_row_num, _col_name, _search_val):
    # Fetch the column index from column name using 'ref_col_name_letter_map' method
    _col_idx = column_index_from_string(self.ref_col_name_letter_map(_header_row_num)[_col_name])
    # Get the list of indexes where near match of searched value is found excluding the header row
    # The Excel columns start with 1, however when iterating, the tuples start with index 0
    _row_idx_list = [_xl_row_idx for _xl_row_idx, _row_val in
                     enumerate(self.my_base_active_ws.iter_rows(values_only=True), start=1) if
                     str(_search_val) in str(_row_val[_col_idx - 1]) if _xl_row_idx != _header_row_num]
    # Return type is list
    return _row_idx_list
I managed to find out what the problem was (some cells are empty or hold numbers, and the in operator needs a string) and worked around it:
if row[5].value is not None:
    if isinstance(row[5].value, str):
        if 'START:' in row[5].value:
            #<rest of code here>
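The TypeError in the question comes from applying the in operator to a cell value that is None or a number. A small guard that only runs the substring test on strings avoids it. This is a sketch with made-up sample values; contains_marker is a hypothetical helper name, not an openpyxl function.

```python
def contains_marker(value, marker):
    """Return True only when value is a string that contains marker."""
    return isinstance(value, str) and marker in value


# Made-up values of the kinds a worksheet cell can hold.
samples = [None, 42, 3.5, "START: batch 7", "END: batch 7"]
starts = [v for v in samples if contains_marker(v, "START:")]
```

Inside the styling loop, the test would then read contains_marker(cell.value, 'START:') instead of 'START:' in cell.value.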

How to select a dynamic range based on a cell value in Excel with Python

I am having a hard time trying to find anything relating to my question. All I have found so far is selecting ranges based off of a static range, but unfortunately the data can change from week to week.
There are multiple data blocks with different rows and columns located in the same sheet but have titles above the data. My goal is to find a title i.e. row 36 or 40, move a row down and essentially do a ctrl+down ctrl+right for selecting a range and then creating a table and naming a table based off of the title.
import openpyxl

def tables(title):
    for cell in pws_sheet["A"]:  # pws_sheet["A"] will return all cells on the A column until the last one
        if (cell.value is not None):  # check if cell is not empty
            if title in cell.value:  # check if the value of the cell contains the title
                row_coord = cell.row  # put row number into a variable

tables("All Call Distribution by Hour")
I'm currently able to find the row based on the title and save it into a variable, but I am lost on how to find the bottom-right corner of each data block, select it as a range, and create the table from that range.
EDIT 1: The title row is correct, but end_row is acting like max_row, and num_cols is showing the cell values instead of a single maximum column count for that table.
def find_table(title, sheet):
    title_row = None
    for row in sheet.iter_rows():
        if row[0].value == title:
            # Find the title row
            title_row = row[0].row
        if row[0].value is None and title_row:
            end_row = row[0].row - 1
            num_cols = [cell.value for cell in sheet[title_row+1] if cell.value is not None]
        else:
            # The last row in the sheet
            end_row = row[0].row
    print(f"Row: {title_row}, Column: {num_cols}, End Row: {end_row}")
    return title_row, num_cols, end_row

OUTPUTS: Row: 40, Column: ['Within', '# Calls', '% Calls'], End Row: 138
For selecting the cells you want, try something like this
def find_table(sheet, title):
    title_row = None
    for row in sheet.iter_rows():
        if row[0].value == title:
            # Find the title row
            title_row = row[0].row
        if row[0].value is None and title_row:
            end_row = row[0].row - 1
            break
        else:
            # The last row in the sheet
            end_row = row[0].row
    return title_row, end_row
You can find the number of columns for the given table with:
num_cols = len([cell.value for cell in sheet[title_row+1] if cell.value is not None])
That should give you the start and end rows, and the number of columns. You can then select those cells and use them to "make a table" in whatever form that takes for your specific example.
If you want to select a range of cells using Excel's 'A1' style notation, you can use openpyxl.utils.cell.get_column_letter(idx) to translate a numeric column number into the corresponding letter.
This solution is quite simplistic, and makes some assumptions about the format of your excel sheets, such as that the data always starts in ColumnA, that an empty cell in ColumnA indicates a totally empty row, and that the heading row always follows the title row. You would also probably want to add some error handling - for example, what if the title row is not found?
Hopefully this can give you a start in the right direction though, and some ideas to try out.
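Putting those pieces together, the start row, end row, and column count can be turned into an 'A1' style range string. The sketch below makes the same assumptions as the answer (data starts in column A, a blank cell in column A ends the block, the header row directly follows the title) and runs on a made-up in-memory sheet; table_ref is a hypothetical helper name.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter


def table_ref(sheet, title):
    """Return an A1-style range for the block under `title`, or None if absent."""
    title_row = None
    end_row = None
    for row in sheet.iter_rows():
        first = row[0]
        if title_row is None:
            if first.value == title:
                title_row = first.row
            continue
        if first.value is None:  # blank cell in column A ends the block
            break
        end_row = first.row
    if title_row is None or end_row is None:
        return None  # title not found, or nothing below it
    header_row = title_row + 1
    num_cols = len([c for c in sheet[header_row] if c.value is not None])
    return f"A{header_row}:{get_column_letter(num_cols)}{end_row}"


# Made-up sheet: title in row 1, header in row 2, data in rows 3-4,
# blank row 5, then the start of another block in row 6.
wb = Workbook()
ws = wb.active
ws.append(["Calls by Hour"])
ws.append(["Within", "# Calls", "% Calls"])
ws.append(["0-10s", 12, 0.4])
ws.append(["10-20s", 18, 0.6])
ws["A6"] = "Another block"

ref = table_ref(ws, "Calls by Hour")
missing = table_ref(ws, "No such title")
```

Returning None when the title is missing is one way to handle the error case mentioned above; raising an exception would work equally well.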

Deleting row in worksheet

I'm trying to delete a whole row from a worksheet by index. I'm doing this because I want to apply the 3-sigma clipping method. Here is my code:
import openpyxl
from statistics import mean, stdev

wb = openpyxl.load_workbook('try1.xlsx')
sheet = wb.get_sheet_by_name('Blad1')
v = []
for i in range(1, 555):
    v.append(sheet['T'][i].value)
m = mean(v)
s = 3 * stdev(v)
# clipping in the list BUT i dont know how to delete a row from a worksheet
for i in range(0, len(v)-1):
    # if the value is more then 3 sigma (=s) away from the mean, i want to delete the whole row
    # of information
    if v[i] >= m + s or v[i] <= m - s:
        # tryin to delete row 'Ai':'Zi'
        sheet.delete(sheet[['A'][i]:['Z'][i]])
To delete a row, use sheet.delete_rows(idx, amount).
Here idx is the 1-based number of the first row to delete, and amount is the number of rows (not columns) to delete starting from idx; it defaults to 1. For example, sheet.delete_rows(1, 26) deletes rows 1 through 26, across every column.
If you only want to empty a cell, you can do sheet['A1'] = "".
You can also use move_range to shift cells; the moved cells overwrite whatever is at the destination. For example:
sheet.move_range("A2:Z2", rows=-1, cols=2)
This moves the cells in the range A2:Z2 up one row and right two columns. You can change the values of rows and cols to whatever you need.
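Applied to the 3-sigma clipping in the question, one thing to watch is that deleting a row shifts every row below it up by one. A common way around this is to collect the row numbers first and delete from the bottom up. The sketch below uses a made-up in-memory sheet with the readings in column B; the data and column layout are assumptions for illustration.

```python
from statistics import mean, stdev

from openpyxl import Workbook

# Made-up data: a header row, then 19 ordinary readings and one outlier.
wb = Workbook()
sheet = wb.active
sheet.append(["id", "reading"])
readings = [10.0] * 19 + [500.0]
for i, value in enumerate(readings, start=1):
    sheet.append([i, value])

values = [row[1].value for row in sheet.iter_rows(min_row=2)]
m = mean(values)
s = 3 * stdev(values)

# Collect the 1-based sheet row numbers of values more than 3 sigma from the mean.
outlier_rows = [row[1].row for row in sheet.iter_rows(min_row=2)
                if abs(row[1].value - m) > s]

# Delete from the bottom up so earlier deletions do not shift later targets.
for row_idx in sorted(outlier_rows, reverse=True):
    sheet.delete_rows(row_idx, 1)

remaining = [row[1].value for row in sheet.iter_rows(min_row=2)]
```

With a real file you would load it with openpyxl.load_workbook, run the same loop, and call wb.save(...) afterwards.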

How to evaluate rows in a csv file python

I have this code:
import csv

def main():
    file1 = open("filepath", "r")
    reader = csv.reader(file1)
    i = next(reader)
    for row in file1:
        if i[3] < i[4]:
            print("troubling")
        elif i[3] < i[5]:
            print("concerning")
        else:
            print("None")

main()
So far, this splits my columns up so I can compare them with each other. However, it is now comparing the entire column rather than comparing within each row. How can I make it evaluate each row instead of comparing two entire columns? Right now column 4's value is the greatest, so it prints "troubling" 100 times. I want it to print "troubling" only if a given row's 4th column is greater than that same row's 3rd column. Thank you in advance for your help.
It is simpler to iterate over the reader object. Try the following:
import csv

def main():
    file1 = open("filepath", "r")
    reader = csv.reader(file1)
    for row in reader:
        if row[3] < row[4]:
            print("troubling")
        #... and so on
Each row in reader is a list, so you access each column using the appropriate index.
I believe the error comes from using the variable 'i' inside the loop instead of row. The value of 'i' does not change within the loop, so you get the same result every time. The loop should also iterate over reader rather than file1, so that each row is a list of fields rather than a raw line of text.
for row in reader:
    if row[3] < row[4]:
        print("troubling")
    elif row[3] < row[5]:
        print("concerning")
    else:
        print("None")
I believe this should solve your issue.
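One more thing to be aware of: csv.reader yields every field as a string, so row[3] < row[4] compares text, and "9" < "100" is False. If the columns hold numbers, converting them first gives the intended comparison. A minimal sketch on made-up data follows; the column names and the classify helper are assumptions for illustration.

```python
import csv
import io

# Made-up data standing in for the real file; the first line is a header.
sample = io.StringIO(
    "id,name,region,baseline,current,target\n"
    "1,a,x,5,10,7\n"
    "2,b,x,9,8,100\n"
    "3,c,y,9,8,7\n"
)


def classify(row):
    """Compare columns 3, 4 and 5 of one row as numbers."""
    baseline, current, target = (float(row[i]) for i in (3, 4, 5))
    if baseline < current:
        return "troubling"
    if baseline < target:
        return "concerning"
    return "None"


reader = csv.reader(sample)
next(reader)  # skip the header row
labels = [classify(row) for row in reader]
```

The second data row is the interesting one: compared as strings, "9" < "100" would be False, but as numbers 9 < 100 holds, so it is classified as "concerning".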

How do I apply a style to the whole row using XLWT Python Excel?

I'm trying to apply a style that will highlight the whole row if one of the columns contains the value "Assets". The code below will highlight only the column with "Assets" in it, instead of the entire row. Is there a way to apply the style to the whole row?
for row in csv_input:
    # Iterate through each column
    for col in range(len(row)):
        # Apply different styles depending on row
        if row_count == 0:
            sheet.write(row_count, col, row[col], headerStyle)
        elif row_count == 3:
            sheet.write(row_count, col, row[col], subheadStyle)
        elif "Assets" in row[col]:
            sheet.write(row_count, col, row[col], highlightStyle)
        else:
            if (is_number(row[col]) == True):
                sheet.write(row_count, col, float(row[col]), rowStyle)
            else:
                sheet.write(row_count, col, row[col], rowStyle)
As you can see, depending on the row I am applying different styles. How can I make it so that any row that contains the keyword "Assets" will be highlighted? Thanks!
Your main problem is that your code is checking for "Assets" after it has written some cells in the row. You need to do your "what style to use for the whole row" tests before you write any cells in the row. Setting a style on the xlwt Row object doesn't work; that's a default style for use with cells that don't have any formatting applied otherwise.
Other problems:
You wrote: 'contains the value "Assets". The code below will highlight only the column with "Assets" in it'. This is ambiguous. Suppose a cell value is exactly equal to "Equity Assets"; what do you want to do? Note: your code will highlight such a cell and those to its right. Also, it's not apparent whether the "Assets"-bearing cell should be the first (example in your comment on another answer) or any cell (as per your code).
Some of your choices for variable names make your code very hard to read e.g. row is a list of cell values but col is a column index. Use enumerate() where possible.
Try something like this:
for row_index, cell_values in enumerate(csv_input):
    # Determine what style to use for the whole row
    if row_index == 0:
        common_style = headerStyle
    elif row_index == 3:
        common_style = subheadStyle
    elif "Assets" in cell_values:
        # perhaps elif any("Assets" in cell_value for cell_value in cell_values):
        # perhaps elif cell_values and cell_values[0] == "Assets":
        # perhaps elif cell_values and "Assets" in cell_values[0]:
        common_style = highlightStyle
    else:
        common_style = rowStyle
    # Iterate over the columns
    for col_index, cell_value in enumerate(cell_values):
        if common_style == rowStyle and is_number(cell_value):
            cell_value = float(cell_value)
        sheet.write(row_index, col_index, cell_value, common_style)
I'm curious about the is_number function ... I'd use this:
def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False
which automatically leads to:
if common_style == rowStyle:
    try:
        cell_value = float(cell_value)
    except ValueError:
        pass
and also raises the question of whether you should perhaps have different styles for numbers and text.
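The ambiguity raised earlier ("exactly equal" versus "contains") maps onto two different tests in Python: keyword in cell_values checks whether some cell equals the keyword exactly, while any(keyword in value ...) checks for a substring in any cell. A small sketch with a made-up row shows the difference; row_has_keyword is a hypothetical helper, not part of xlwt.

```python
def row_has_keyword(cell_values, keyword, exact=False):
    """Report whether a row should count as containing keyword.

    exact=True  -> some cell is exactly equal to keyword
    exact=False -> keyword appears as a substring of some cell
    """
    if exact:
        return keyword in cell_values
    return any(keyword in str(value) for value in cell_values)


# Made-up row: "Assets" only appears inside a longer cell value.
row = ["Equity Assets", "1200.50", "n/a"]
exact_match = row_has_keyword(row, "Assets", exact=True)
substring_match = row_has_keyword(row, "Assets")
```

Whichever rule you settle on, evaluating it once per row before writing any cells keeps the whole row in a single style.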
