I am new to Python and trying to learn. I have a spreadsheet that I need to open, count the populated cells in the first column, and print the count. Here is my code; I keep getting a traceback. Can you please assist?
book = xlrd.open_workbook('C:\\Users\\am\\Book1.xlsx')
sheet = book.sheet_by_name('Sheet1')
count = 0
for row in sheet.col(1):
    count = count + 1
print(count)
You can use the col_slice function for this.
import xlrd
book = xlrd.open_workbook('C:\\Users\\am\\Book1.xlsx')
sheet = book.sheet_by_name('Sheet1')
cells = sheet.col_slice(start_rowx=0,
                        end_rowx=sheet.nrows,
                        colx=0)
count = len(cells)
By the way, the first column's index is 0, as usual in Python.
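Note that len(cells) counts every cell in the slice, blank ones included. If only populated cells should be counted, one option is to filter the values first. The following is a minimal sketch, assuming the column has already been read into a plain list (for example with sheet.col_values(0)); count_populated is a hypothetical helper name, not part of xlrd.

```python
def count_populated(values):
    """Count entries that are neither None nor an empty/whitespace-only string."""
    count = 0
    for value in values:
        if value is None:
            continue
        if isinstance(value, str) and value.strip() == "":
            continue
        count += 1
    return count


# Values shaped like the output of sheet.col_values(0); 0 is a real value and is counted
sample_column = ["Name", "Alice", "", "Bob", None, 0]
print(count_populated(sample_column))
```

Also worth checking when the traceback mentions the file format: xlrd 2.0 and later read only .xls files, so an .xlsx workbook needs an older xlrd or a library such as openpyxl.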
How can I find the number of the last non-empty row of a whole xlsx sheet using Python and openpyxl?
The file can have empty rows between the cells and the empty rows at the end could have had content that has been deleted. Furthermore I don't want to give a specific column, rather check the whole table.
For example the last non-empty row in the picture is row 13.
I know the subject has been extensively discussed but I haven't found an exact solution on the internet.
from openpyxl import load_workbook

# Open file with openpyxl
to_be = load_workbook(FILENAME_xlsx)
s = to_be.active
# Count the rows openpyxl reports for the active sheet
last_row = len(list(s.rows))
print(last_row)
## Output: 13
s.rows is a generator; converting it to a list gives one tuple of cells per row.
If you are looking for the last non-empty row of a whole xlsx sheet using Python and openpyxl, try this:
import openpyxl

def last_active_row(input_file, sheet_name):
    workbook = openpyxl.load_workbook(input_file)
    wp = workbook[sheet_name]
    # Walk upward from the last row openpyxl knows about
    for row in range(wp.max_row, 0, -1):
        # A row counts as non-empty if any of its cells holds a value
        if any(wp.cell(row, col).value is not None
               for col in range(1, wp.max_column + 1)):
            return row
    return 0  # the sheet has no values at all

if __name__ == '__main__':
    # input_file and sheet_name must be set to your workbook path and sheet name
    print("The last active row is:", last_active_row(input_file, sheet_name))
This should help.
openpyxl's Worksheet class has the attribute max_row. Keep in mind that it may still count trailing rows whose content was deleted.
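Since the row count reported by the worksheet may include trailing rows that no longer hold values, scanning the row contents is a safer check. The sketch below assumes the rows are supplied as tuples of cell values, such as those produced by ws.iter_rows(values_only=True) in openpyxl; last_non_empty_row is a hypothetical helper name.

```python
def last_non_empty_row(rows):
    """Return the 1-based index of the last row containing any value, or 0 if none."""
    last = 0
    for index, row in enumerate(rows, start=1):
        if any(value is not None and value != "" for value in row):
            last = index
    return last


# Made-up rows shaped like ws.iter_rows(values_only=True)
rows = [
    ("id", "name"),
    (1, "a"),
    (None, None),  # gap in the middle of the data
    (2, None),
    (None, None),  # trailing row whose content was deleted
]
print(last_non_empty_row(rows))
```

Because the function checks every column of every row, gaps in the middle of the table and emptied rows at the end are both handled without naming a specific column.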
I am struggling to write code that finds the first empty row of a Google Sheet.
I am using the gspread package from github.com/burnash/gspread
I would be glad if someone can help :)
So far I have only imported the modules and opened the worksheet:
scope = ['https://spreadsheets.google.com/feeds']
credentials = ServiceAccountCredentials.from_json_keyfile_name('ddddd-61d0b758772b.json', scope)
gc = gspread.authorize(credentials)
sheet = gc.open("Event Discovery")
ws = sheet.worksheet('Event Discovery')
I want a function that finds row 1158, which is currently the first empty row of the worksheet. Every time that row is filled, the function should find the next empty row.
I solved this using:
def next_available_row(worksheet):
    str_list = list(filter(None, worksheet.col_values(1)))
    return str(len(str_list)+1)
scope = ['https://spreadsheets.google.com/feeds']
credentials = ServiceAccountCredentials.from_json_keyfile_name('auth.json', scope)
gc = gspread.authorize(credentials)
worksheet = gc.open("sheet name").sheet1
next_row = next_available_row(worksheet)
#insert on the next available row
worksheet.update_acell("A{}".format(next_row), somevar)
worksheet.update_acell("B{}".format(next_row), somevar2)
This alternative method resolves issues with the accepted answer by accounting for rows that may have skipped values (such as fancy header sections in a document) as well as sampling the first N columns:
def next_available_row(sheet, cols_to_sample=2):
    # looks for empty row based on values appearing in 1st N columns
    cols = sheet.range(1, 1, sheet.row_count, cols_to_sample)
    return max([cell.row for cell in cols if cell.value]) + 1
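One edge case in this approach is a completely empty sheet: max() raises ValueError when no cell has a value. A possible variant is sketched below. It takes the list of cells directly (as returned by sheet.range(...)) so that it can be tried without a live spreadsheet. The Cell namedtuple is only a stand-in for gspread's cell objects, and next_available_row_from_cells is a hypothetical name.

```python
from collections import namedtuple

# Stand-in for gspread cell objects, which expose .row, .col and .value
Cell = namedtuple("Cell", ["row", "col", "value"])


def next_available_row_from_cells(cells):
    """Return the row after the last populated cell; row 1 if nothing is populated."""
    last_used = max((cell.row for cell in cells if cell.value), default=0)
    return last_used + 1


cells = [
    Cell(1, 1, "Header"),
    Cell(2, 1, ""),      # skipped value inside the data
    Cell(3, 1, "x"),
    Cell(4, 1, ""),
]
print(next_available_row_from_cells(cells))
print(next_available_row_from_cells([]))
```

The default=0 argument to max() is what makes the empty case return row 1 instead of raising.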
If you can count on all of your previous rows being filled in:
len(sheet.get_all_values()) + 1
will give you the first free row.
get_all_values returns a 2D list of the sheet's data. Each nested list is a row, so the length of the 2D list is the number of rows that have any data.
A similar problem is finding the first free column:
from xlsxwriter.utility import xl_col_to_name
# Rectangular 2D list, so it doesn't matter which row's length you check
column_count = len(sheet.get_all_values()[0])
column = xl_col_to_name(column_count)
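If installing xlsxwriter only for this conversion is undesirable, the mapping from a zero-based column index to column letters can be sketched in plain Python. col_to_name below is a hypothetical helper written for illustration. It is intended to give the same letters as xl_col_to_name for zero-based indices, but it is not that library's implementation.

```python
def col_to_name(col):
    """Convert a zero-based column index to spreadsheet letters (0 -> 'A', 26 -> 'AA')."""
    if col < 0:
        raise ValueError("column index must be non-negative")
    name = ""
    col += 1  # switch to 1-based so the bijective base-26 arithmetic works
    while col > 0:
        col, remainder = divmod(col - 1, 26)
        name = chr(ord("A") + remainder) + name
    return name


# Made-up data shaped like sheet.get_all_values(): three used columns (A, B, C)
all_values = [["id", "name", "city"], ["1", "Ann", "Oslo"]]
first_free_column = col_to_name(len(all_values[0]))
print(first_free_column)
```

Because the index is zero-based, passing the number of used columns yields the letter of the first free one.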
def find_empty_cell():
    # wks is the gspread worksheet opened elsewhere
    alphabet = list(map(chr, range(65, 91)))
    for letter in alphabet[0:2]:  # look only at columns A and B
        for x in range(1, 1000):
            cell_coord = letter + str(x)
            # an empty cell may come back as "" or None
            if not wks.acell(cell_coord).value:
                return cell_coord
I use this somewhat sloppy function to find the first empty cell. I can't search for an empty row because the other columns already have values.
Oh, and map behaves differently between Python 2.7 and 3.6: in Python 3 it returns an iterator, so I had to wrap the alphabet in list() before slicing it.
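Fetching cells one at a time issues a separate request per cell, which gets slow over a thousand rows. A possible alternative is to read each column once and search the resulting lists locally. The sketch below assumes the column contents are passed in as a dict of letter to list of values (for example built from wks.col_values(1) and wks.col_values(2)); first_empty_cell and max_rows are hypothetical names.

```python
def first_empty_cell(columns, max_rows=1000):
    """Return the A1-style address of the first empty cell, column by column.

    columns maps a column letter to its list of values, top to bottom.
    Rows past the end of a list are treated as empty.
    """
    for letter, values in columns.items():
        for row in range(1, max_rows + 1):
            value = values[row - 1] if row <= len(values) else ""
            if value is None or value == "":
                return "{}{}".format(letter, row)
    return None  # nothing empty within max_rows


# Made-up column contents; dicts keep insertion order, so A is searched before B
columns = {"A": ["x", "y", "", "z"], "B": ["q"]}
print(first_empty_cell(columns))
```

Like the original function, this searches column A completely before moving on to column B.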
import pygsheets
gc = pygsheets.authorize(service_file='************************.json')
ss = gc.open('enterprise_finance')
ws = ss[0]
row_count = len(ws.get_all_records()) + 2
ws.set_dataframe(raw_output, (row_count, 1), copy_index=True, copy_head=True)
ws.delete_rows(row_count , number=1)
I would like to get the maximum value from the 25th column of an Excel spreadsheet using xlrd. Here's what I did.
import xlrd
book = xlrd.open_workbook("File Location\Filename.xlsx")
sheet = book.sheet_by_index(0)
def follow_up():
    col = 24
    row = 1
    max = 1
    while row < 100:
        a = sheet.cell(row, col).value
        if a>max:
            return a
        else:
            return max
        row+=1
print(follow_up())
I run into an issue for columns with fewer than 100 values (it gives me an IndexError), and the code won't work for columns with more than 100 values. This could be fixed if I knew how to get the number of values in a column, but I was wondering if anyone knows a "cleaner" way to do this.
Try:
import xlrd
book = xlrd.open_workbook("File Location\Filename.xlsx")
sheet = book.sheet_by_index(0)
def follow_up():
    col = 24
    return max(sheet.col_values(col))
print(follow_up())
I hope this helps.
Try this:
import xlrd
book = xlrd.open_workbook("File Location\Filename.xlsx")
sheet = book.sheet_by_index(0)
def follow_up():
    col = 24
    return max(sheet.col_values(col, start_rowx=1, end_rowx=101))
    # with start_rowx and end_rowx you can define the range
    # we start with 1 so as to skip the header row
print(follow_up())
col_values() returns a list of all values in the column you mention.
Hope this helps :)
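One caveat with calling max() directly on a column: if the list mixes numbers with text (a header, or the empty strings xlrd returns for blank cells), Python 3 raises a TypeError when comparing them. A minimal sketch of a more defensive version follows, assuming the values come from something like sheet.col_values(24); max_numeric is a hypothetical helper name.

```python
def max_numeric(values):
    """Return the largest int/float in values, ignoring text and blanks; None if none."""
    numbers = [
        value for value in values
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    ]
    if not numbers:
        return None
    return max(numbers)


# Made-up column: a header, numbers as floats, and a blank cell as ''
column = ["Follow up", 3.0, "", 7.5, 2.0]
print(max_numeric(column))
```

This also removes the need to know the column length in advance, since the whole column is processed regardless of how many rows it has.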
I have run into a problem trying to read several CSVs, find a specific string within them, and run a formula on it.
The CSVs all have the following main fields (water level, flow ID and a value):
Water Level, Flow, Water Level,Flow
NEU_NEU_065,NEU_NEU_065,NEU_NEU_065,NEU_NEU_065
(274.4925,0,261.3318,-3.2)
In the example above, the ID NEU_NEU_065 appears twice under Flow, once with the value 0 and once with -3.2. At the moment I manually search the CSV for this ID and apply an absolute max formula to that column range. For this case I manually split NEU_NEU_065 so that the first occurrence becomes NEU_NEU_065a = 0 and the second becomes NEU_NEU_065b = 3.2.
I don't need to run the absolute max on all IDs within the CSV, just on the particular list of IDs I have in another sheet. Running the absolute max on everything within the CSV will not give the right result, because it treats NEU_NEU_065 as a single ID and the value comes out as 0, which is not what I am trying to achieve. I need to extract NEU_NEU_065a = 0 and NEU_NEU_065b = 3.2 (the absolute max of -3.2).
import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('BCC_R_002c_E_00005Y_0090m_5m_01_PO.csv', newline='') as ifile:
    reader = csv.reader(ifile)
    for rownum, row in enumerate(reader):
        if rownum == 0:
            # Save header row.
            header = row
        else:
            for colnum, col in enumerate(row):
                print('%-8s: %s' % (header[colnum], col))
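One way to approach the duplicate-ID problem described above is to treat each Flow column separately and label repeated IDs with a, b, c, and so on. The sketch below assumes the layout from the question: the first row holds the type (Water Level or Flow), the second row holds the ID, and the remaining rows hold numeric values without surrounding parentheses. abs_max_per_flow_column and wanted_ids are hypothetical names, and the sample text only restates the question's example.

```python
import csv
import io
import string


def abs_max_per_flow_column(rows, wanted_ids):
    """Map 'IDa', 'IDb', ... to the absolute max of each matching Flow column."""
    kinds = [kind.strip() for kind in rows[0]]
    ids = [flow_id.strip() for flow_id in rows[1]]
    seen = {}    # how many Flow columns each ID has produced so far
    result = {}
    for col, (kind, flow_id) in enumerate(zip(kinds, ids)):
        if kind != "Flow" or flow_id not in wanted_ids:
            continue
        occurrence = seen.get(flow_id, 0)
        seen[flow_id] = occurrence + 1
        # Suffix a, b, c, ...; this sketch supports at most 26 duplicates per ID
        label = flow_id + string.ascii_lowercase[occurrence]
        values = [
            abs(float(row[col]))
            for row in rows[2:]
            if col < len(row) and row[col].strip()
        ]
        result[label] = max(values) if values else None
    return result


sample = """Water Level, Flow, Water Level,Flow
NEU_NEU_065,NEU_NEU_065,NEU_NEU_065,NEU_NEU_065
274.4925,0,261.3318,-3.2
"""
rows = list(csv.reader(io.StringIO(sample)))
print(abs_max_per_flow_column(rows, {"NEU_NEU_065"}))
```

For real files, the rows list would come from csv.reader over each opened CSV, and wanted_ids from the separate sheet of IDs.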
I am trying to loop through a spreadsheet and grab the value of a cell in each row under a certain column, like so:
# Row by row, go through the originalWorkSheet and save the values from the selected columns
numberOfRowsInOriginalWorkSheet = originalWorkSheet.nrows - 1
rowCounter = 0
while rowCounter <= numberOfRowsInOriginalWorkSheet:
    row = originalWorkSheet.row(rowCounter)
    # Grab the values in certain columns, say with the
    # column name "Promotion" and save them to a variable
    rowCounter += 1
Is this possible? My Google-fu has failed me on this one.
Thank you for the help!
The simplest way:
from xlrd import open_workbook
book = open_workbook(path_to_file)
sheet = book.sheet_by_index(0)
for i in range(1, sheet.nrows):
    row = sheet.row_values(i)
    variable = row[0]  # replace 0 with the index of the column you want
Or you can loop over the row list and print each cell value:
book = open_workbook(path_to_file)
sheet = book.sheet_by_index(0)
for i in range(1, sheet.nrows):
    row = sheet.row_values(i)
    for cnt in range(len(row)):
        print(row[cnt])
Hope this helps
There are many ways to do this; take a look at the docs.
Something like this:
promotion_col_index = <promotion column index>
list_of_promotion_cells = originalWorkSheet.col(promotion_col_index)
list_of_promotion_values = [cell.value for cell in list_of_promotion_cells]
will get you a list of the values in the "Promotion" column
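If the column is known by its header text rather than its position, the index can be looked up from the first row. A minimal sketch follows, assuming the sheet has been read into a list of rows (for example [sheet.row_values(i) for i in range(sheet.nrows)]) with the header in the first row; column_values_by_header is a hypothetical helper name.

```python
def column_values_by_header(rows, header_name):
    """Return the values under header_name, excluding the header row itself."""
    if not rows:
        return []
    header = rows[0]
    if header_name not in header:
        raise KeyError("no column named {!r}".format(header_name))
    index = header.index(header_name)
    return [row[index] for row in rows[1:]]


# Made-up rows shaped like xlrd's row_values() output
rows = [
    ["Store", "Promotion", "Sales"],
    ["North", "Spring sale", 120.0],
    ["South", "", 95.0],
]
print(column_values_by_header(rows, "Promotion"))
```

Raising KeyError for a missing header makes a renamed or misspelled column fail loudly instead of silently reading the wrong data.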