Get the maximum value from an excel column using XLRD - python

I would like to get the maximum value from the 25th column of an Excel spreadsheet using xlrd. Here's what I did:
import xlrd
book = xlrd.open_workbook("File Location\Filename.xlsx")
sheet = book.sheet_by_index(0)
def follow_up():
    col = 24
    row = 1
    max = 1
    while row < 100:
        a = sheet.cell(row, col).value
        if a>max:
            return a
        else:
            return max
        row+=1
print(follow_up())
I run into an issue for columns with fewer than 100 values (I get an IndexError), and the code won't work for columns with more than 100 values. This could be fixed if I knew how to get the number of values in a column, but I was wondering if anyone knows a "cleaner" way to do this.

Try:
import xlrd
book = xlrd.open_workbook("File Location\Filename.xlsx")
sheet = book.sheet_by_index(0)
def follow_up():
    col = 24
    return max(sheet.col_values(col))
print(follow_up())
I hope this helps.

Try this:
import xlrd
book = xlrd.open_workbook("File Location\Filename.xlsx")
sheet = book.sheet_by_index(0)
def follow_up():
    col = 24
    return max(sheet.col_values(col, start_rowx=1, end_rowx=101))
    # with start_rowx and end_rowx you can define the range
    # we start with 1 so as to skip the header row
print(follow_up())
col_values() returns a list of all values in the column you mention.
Hope this helps :)
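If the column mixes numbers with text or blank cells, calling max() on the raw list can raise a TypeError in Python 3, because strings and floats cannot be compared. A minimal sketch of one way around that, assuming the list comes from something like sheet.col_values(col, start_rowx=1); the helper name max_numeric is hypothetical and not part of xlrd:

```python
def max_numeric(values):
    """Return the largest int/float in values, or None if there is none."""
    # bool is a subclass of int, so exclude it explicitly
    numbers = [v for v in values
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    return max(numbers) if numbers else None


# Stand-in for sheet.col_values(24, start_rowx=1); xlrd reports empty cells as ''
column = [3.0, '', 17.5, 'n/a', 8.0]
print(max_numeric(column))  # → 17.5
```

Returning None for a column without numbers is a design choice made here for illustration; raising an error would be just as reasonable.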

Related

How can I find the last non-empty row of excel using openpyxl 3.03?

How can I find the number of the last non-empty row of a whole xlsx sheet using Python and openpyxl?
The file can have empty rows between the cells and the empty rows at the end could have had content that has been deleted. Furthermore I don't want to give a specific column, rather check the whole table.
For example the last non-empty row in the picture is row 13.
I know the subject has been extensively discussed but I haven't found an exact solution on the internet.
# Open file with openpyxl
from openpyxl import load_workbook

to_be = load_workbook(FILENAME_xlsx)
s = to_be.active
last_empty_row = len(list(s.rows))
print(last_empty_row)
## Output: 13
s.rows is a generator; converting it to a list gives one tuple of cells per row.
If you are looking for the last non-empty row of a whole xlsx sheet using Python and openpyxl, try this:
import openpyxl

def last_active_row(input_file, sheet_name):
    workbook = openpyxl.load_workbook(input_file)
    wp = workbook[sheet_name]
    # Walk upward from the bottom of the used range
    for row in range(wp.max_row, 0, -1):
        # A row is non-empty if any of its cells holds a value
        if any(wp.cell(row, col).value is not None
               for col in range(1, wp.max_column + 1)):
            return row
    return 0  # no cell in the sheet holds a value

if __name__ == '__main__':
    # input_file and sheet_name must be set to your workbook path and sheet name
    print("The last active row is:", last_active_row(input_file, sheet_name))
This should help.
openpyxl's Worksheet class has the attribute max_row.
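Because max_row describes the extent of the sheet's used range, it may include trailing rows that no longer hold any values, which is the case the question describes. A sketch of the bottom-up scan on plain row tuples, assuming they come from something like ws.iter_rows(values_only=True); last_non_empty_row is a hypothetical helper name, not an openpyxl function:

```python
def last_non_empty_row(rows):
    """Return the 1-based index of the last row with any non-None value, or 0."""
    rows = list(rows)
    # Scan from the bottom so trailing empty rows are skipped first
    for index in range(len(rows), 0, -1):
        if any(value is not None for value in rows[index - 1]):
            return index
    return 0


# Stand-in for list(ws.iter_rows(values_only=True))
rows = [
    ("a", None),
    (None, None),   # empty row in the middle
    (None, "b"),
    (None, None),   # empty row at the end
]
print(last_non_empty_row(rows))  # → 3
```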

Openpyxl - Copy range of cells(with formula) from a workbook to another

I'm trying to copy specific rows from Workbook 1 and append it to the existing data in Workbook 2.
Copy the highlighted rows from Workbook 1 and append them in Workbook 2 below 'March'.
So far I have managed to copy and paste the range, but there are two problems:
1. Cells are shifted.
2. The percentage (formula) is missing, leaving only numeric values.
See Result here
import openpyxl as xl
source = r"C:\Users\Desktop\Test_project_20200401.xlsx"
wbs = xl.load_workbook(source)
wbs_sheet = wbs["P2"]  # select the sheet
destination = r"C:\Users\Desktop\Try999.xlsx"
wbd = xl.load_workbook(destination)
wbd_sheet = wbd["A3"]  # select the sheet
row_data = 0
for row in wbs_sheet.iter_rows():
    for cell in row:
        if cell.value == "Yes":
            row_data += cell.row
for row in wbs_sheet.iter_rows(min_row=row_data, min_col=1, max_col=250, max_row=row_data+1):
    wbd_sheet.append((cell.value for cell in row))
wbd.save(destination)
Does anyone have any idea on how can I solve this?
Any feedback/solution would help!
Thanks!
I think min_col should = 0
Range("A1").Formula (in VBA) gets the formula.
Range("A1").Value (in VBA) gets the value.
So try using .formula in Python
(thanks to: Get back a formula from a cell - VBA ... if this works)
I just want to add my own solution here.
I iterated through the columns and set cell.number_format = '0%', which displays each cell value as a percentage.
for col in ws.iter_cols(min_row=1, min_col=2, max_row=250, max_col=250):
    for cell in col:
        cell.number_format = '0%'
More info can be found here:
https://openpyxl.readthedocs.io/en/stable/_modules/openpyxl/styles/numbers.html
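On the row selection in the question's code: row_data += cell.row adds row numbers together, so if more than one cell contains "Yes" the result is a sum rather than a row number. A sketch of collecting the matching row numbers instead, on plain row tuples assumed to come from something like wbs_sheet.iter_rows(values_only=True); rows_containing is a hypothetical helper, not part of openpyxl:

```python
def rows_containing(rows, target):
    """Return the 1-based numbers of the rows that contain `target` in any cell."""
    return [number for number, row in enumerate(rows, start=1) if target in row]


# Stand-in for wbs_sheet.iter_rows(values_only=True); the data is made up
rows = [
    ("Month", "Copy?"),
    ("April", "Yes"),
    ("May", "No"),
    ("June", "Yes"),
]
print(rows_containing(rows, "Yes"))  # → [2, 4]
```

Each returned number could then be passed as both min_row and max_row to copy exactly that row.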

Using Python to Count the Number of Populated Cells in the First Column

I am new to Python and trying to learn. I have a spreadsheet that I need to open, count the populated cells in the first column, and print the count. Here is my code; I keep getting a traceback. Can you please assist?
book = xlrd.open_workbook('C:\\Users\\am\\Book1.xlsx')
sheet = book.sheet_by_name('Sheet1')
count = 0
for row in sheet.col(1):
    count = count + 1
print(count)
You can use the col_slice function for this.
import xlrd
book = xlrd.open_workbook('C:\\Users\\am\\Book1.xlsx')
sheet = book.sheet_by_name('Sheet1')
cells = sheet.col_slice(start_rowx=0,
                        end_rowx=sheet.nrows,
                        colx=0)
count = len(cells)
By the way, the first column's index is 0, as usual in Python.
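Note that len(cells) counts every cell in the slice, including blank cells that sit inside the used range. If "populated" should exclude those, one option is to count only the non-empty values. A sketch on a plain list, assuming it came from something like sheet.col_values(0) (xlrd reports empty cells as ''); count_populated is a hypothetical helper name:

```python
def count_populated(values):
    """Count values that are neither None nor an empty/whitespace-only string."""
    count = 0
    for value in values:
        if value is None:
            continue
        if isinstance(value, str) and not value.strip():
            continue
        count += 1
    return count


# Stand-in for sheet.col_values(0); the data is made up
first_column = ["Name", "Ann", "", "Bob", "  ", 42.0]
print(count_populated(first_column))  # → 4
```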

Python Openpyxl iter_rows and add defined value in each cell

Question: Can someone please let me know how I can achieve the following task:
I've defined the column, but I need a specific value to go into each cell within that column.
Also, if column 6 only has x rows, then I want column 7 to have only x rows with the values pasted in as well.
This is the code I've tried:
import openpyxl
wb = openpyxl.load_workbook(filename=r'C:\Users\.spyder-py3\data\BMA.xlsx')
ws = wb.worksheets[0]
for row in ws.iter_rows('G{}:G{}'.format(ws.min_row,ws.max_row)):
    for cell in row:
        ws.cell(row=cell, column=7).value = 'BMA'
wb.save(r'C:\Users\.spyder-py3\data\BMA.csv')
wb.close()
I figured out most of the issue by looking at this answer:
https://stackoverflow.com/a/15004956/9649146
This is the code I ended up with:
import openpyxl
wb = openpyxl.load_workbook(filename=r'C:\Users\.spyder-py3\data\AAXN.xlsx')
ws = wb.worksheets[0]
r = 2
for row in ws.iter_rows('G{}:G{}'.format(ws.min_row,ws.max_row)):
    for cell in row:
        ws.cell(row=r, column=7).value = 'AAXN'
        r += 1
wb.save(r'C:\Users\.spyder-py3\data\AAXN.csv')
wb.close()
Or, you can do something like this:
for row in filesheet.iter_rows(min_row=2, max_row=filesheet.max_row):
    filesheet.cell(row=row[0].row, column=7).value = 'my value'

How to grab the value in a certain column in a certain row using XLRD

I am trying to loop through a spreadsheet and grab the value of a cell in a row under a certain column, like so:
# Row by row, go through the originalWorkSheet and save the values from the selected columns
numberOfRowsInOriginalWorkSheet = originalWorkSheet.nrows - 1
rowCounter = 0
while rowCounter <= numberOfRowsInOriginalWorkSheet:
    row = originalWorkSheet.row(rowCounter)
    # Grab the values in certain columns, say with the
    # column name "Promotion" and save them to a variable
    rowCounter += 1
Is this possible? My Google-fu has failed me on this one.
Thank you for the help!
The simplest way:
from xlrd import open_workbook

book = open_workbook(path_to_file)
sheet = book.sheet_by_index(0)
for i in range(1, sheet.nrows):
    row = sheet.row_values(i)
    variable = row[0]  # replace 0 with the index of the column you need
Or you can loop over the row list and print each cell value:
book = open_workbook(path_to_file)
sheet = book.sheet_by_index(0)
for i in range(1, sheet.nrows):
    row = sheet.row_values(i)
    for cnt in range(len(row)):
        print(row[cnt])
Hope this helps.
There are many ways to do this; take a look at the docs.
Something like this:
promotion_col_index = <promotion column index>
list_of_promotion_cells = originalWorkSheet.col(promotion_col_index)
list_of_promotion_values = [cell.value for cell in list_of_promotion_cells]
will get you a list of the values in the "Promotion" column.
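When only the header text is known, the column index can be looked up from the header row first. A sketch on plain lists, assuming the first row holds the headers and each row looks like the output of sheet.row_values(i); column_values_by_header is a hypothetical helper, not an xlrd function:

```python
def column_values_by_header(rows, header):
    """Return the values under `header`, skipping the header row itself."""
    if not rows:
        return []
    col_index = rows[0].index(header)  # raises ValueError if the header is missing
    return [row[col_index] for row in rows[1:]]


# Stand-in for [sheet.row_values(i) for i in range(sheet.nrows)]; the data is made up
rows = [
    ["Item", "Promotion", "Price"],
    ["Tea", "2-for-1", 3.5],
    ["Coffee", "", 4.0],
]
print(column_values_by_header(rows, "Promotion"))  # → ['2-for-1', '']
```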
