ValueError: Row numbers must be between 1 and 1048576 - python

I'm using openpyxl in Python to extract specific data from one xlsx file into another. I defined a function that extracts the data I need, then ran it in a while loop that is supposed to stop when it finds an empty cell.
But for some reason it gives me this error: Row numbers must be between 1 and 1048576
Here is my code:
x=3; y=2; z=4; i=5
def line():
    c1 = ws1.cell(row = z, column = 1)
    ws2.cell(row = y, column = 1).value = c1.value
    c2 = ws1.cell(row = i, column = 2)
    ws2.cell(row = y, column = 2).value = c2.value
    c3 = ws1.cell(row = i, column = x)
    ws2.cell(row = y, column = 3).value = c3.value
while ws1.cell(row=i, column=x+2).value != "":
    line()
    y+=1
    x+=2
    i+=1
else:
    sys.exit()
What am I doing wrong?

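openpyxl returns None for a cell that holds no data; it does not return "". The condition therefore stays true on empty cells, so the while loop never stops at the end of your data and keeps going until the row number runs past the worksheet limit, which raises the error. Change...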
while ws1.cell(row=i, column=x+2).value != "":
to
while ws1.cell(row=i, column=x+2).value is not None:
...and the code would run as expected.
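As a minimal sketch of the corrected loop, the snippet below builds a small in-memory workbook so it runs without any file. The sample values and the extra `max_row` bound are assumptions added for illustration and are not part of the original code; the bound only keeps the loop from walking off the sheet if a column never becomes empty.

```python
from openpyxl import Workbook

# Build a small in-memory sheet (assumed sample data).
wb = Workbook()
ws1 = wb.active
ws1.cell(row=1, column=1, value="a")
ws1.cell(row=2, column=1, value="b")

# A cell that was never written reports None, not "".
empty_value = ws1.cell(row=3, column=1).value

# Walk down column 1 until the first empty cell.
values = []
row = 1
while row <= ws1.max_row and ws1.cell(row=row, column=1).value is not None:
    values.append(ws1.cell(row=row, column=1).value)
    row += 1

print(empty_value)
print(values)
```

If a sheet can contain cells holding an empty string as well as truly empty cells, testing `value in (None, "")` would cover both cases.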

Related

Find beginning index and last row index of consecutive values from excel in python

I have a list of Excel data, and I'm trying to find the row indexes of the first and last value in each run of consecutive identical numbers, since I need those indexes for two other columns in the same rows. Right now I get a "list index out of range" error.
import xlsxwriter
import xlrd
import sys
workbook = xlrd.open_workbook('3speedtest2.xlsx')
worksheet = workbook.sheet_by_name('Sheet1')
count = 0
rowcount = 1
firstindex = 1
lastindex = 1
"""
for i in range(0,3):
    print("pap")
"""
while count<999999999:
    print(count)
    list=[]
    firstindex = lastindex
    list.append(firstindex)
    if str(worksheet.cell(rowcount,28)) == -1:
        sys.exit()
    if str(worksheet.cell(rowcount, 28)) == str(worksheet.cell(rowcount+1,28)):
        print (str(worksheet.cell(rowcount, 28)))
    else:
        print (str(worksheet.cell(rowcount, 28)))
        print ("new number")
        lastindex=rowcount
        list.append(lastindex)
    rowcount+=1
    count+=1
    print (list)
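One possible way to approach this, sketched under assumptions: in the code above, `str(...) == -1` compares a string with an integer and can never be true, so the loop has no working exit, and `rowcount+1` probably ends up pointing past the last row, which would explain the index error. Reading the column into a plain list first (for example with xlrd's `worksheet.col_values(28)`) and grouping equal neighbours avoids both issues. The helper name `consecutive_runs` and the sample list are hypothetical.

```python
from itertools import groupby


def consecutive_runs(values):
    """Return (value, first_index, last_index) for each run of equal values."""
    runs = []
    position = 0
    for value, group in groupby(values):
        length = len(list(group))
        runs.append((value, position, position + length - 1))
        position += length
    return runs


# Hypothetical stand-in for the values read from column 28.
speeds = [1, 1, 1, 2, 2, 3, 1, 1]
print(consecutive_runs(speeds))
```

The indexes are 0-based positions in the list, so they can be used directly to look up the matching rows in the other columns.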

How to get the row value and column value (date) together of a cell in Excel using Python

I have an excel pivot table of format:
Names 2/1/2010 3/1/2010 4/1/2010
A 8
B 4 5 7
C 5 3
D 6 6
I need to get the names and date of the cells which are empty. How can I do it?
I want the output as a list: [A:3/1/2010,4/1/2010].
Assuming the format is the same as above, check this code snippet. You can use a different Python module to read the Excel sheet; note that xlrd 2.0 and later read only .xls files, so an .xlsx file needs an older xlrd or another module.
import xlrd
def get_list_vals() :
    res = []
    path="C:/File_PATH.xlsx"
    wb=xlrd.open_workbook(path)
    sheet=wb.sheet_by_index(0)
    # Get rows from 2nd line
    for row in range(1, sheet.nrows) :
        temp = []
        for column in range (sheet.ncols) :
            val = sheet.cell_value(row,column)
            # get first column values like(A, B, C)
            if column == 0:
                temp.append(val)
                continue
            # if not first column, get the date from the header row (index 0)
            elif val=="" :
                date_val = sheet.cell_value(0,column)
                temp.append(date_val)
        res.append(temp)
    return res
If you want a specific format like [A : date1, date2], make temp a string instead of a list and append to it:
temp = [] -->> temp = ""
temp.append(val) -->> temp += str(val) + ":"
temp.append(date_val) -->> temp += str(date_val) + ","
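The same idea can be sketched without any Excel library, working on rows that have already been read into lists. This is only an illustration: the function name `empty_cells_by_name` is hypothetical, and the sample table is an assumption (only the layout of row A is certain from the question).

```python
def empty_cells_by_name(rows):
    """Map each name to the header dates whose cell is empty.

    `rows` is a list of lists: the first row is the header and the
    first column holds the names.
    """
    header = rows[0]
    result = {}
    for row in rows[1:]:
        name = row[0]
        missing = [
            header[col]
            for col in range(1, len(header))
            if col >= len(row) or row[col] in ("", None)
        ]
        if missing:
            result[name] = missing
    return result


# Assumed sample data in the shape of the pivot table.
table = [
    ["Names", "2/1/2010", "3/1/2010", "4/1/2010"],
    ["A", 8, "", ""],
    ["B", 4, 5, 7],
]
for name, dates in empty_cells_by_name(table).items():
    print(name + ":" + ",".join(dates))
```

Joining the dates with `",".join(...)` avoids the trailing comma that repeated string concatenation would leave behind.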

Shift cells up if entire row is empty in Openpyxl

I want the entire row to be removed(shift cells up) if there are no values in the entire row. I'm using Openpyxl.
My code:
for row in range(1, ws1.max_row):
    flag = 0
    for col in range(1, 50):
        if ws1.cell(row, col).value is not None:
            flag = 1
    if flag == 0:
        ws1.delete_rows(row, 1)
The rows are not getting deleted in the above case.
I tried using iter_rows function to do the same and it gives me:
TypeError: '>' not supported between instances of 'tuple' and 'int'
for row in ws1.iter_rows(min_row = 1, max_col=50, max_row = ws1.max_row):
    flag = 0
    for cell in row:
        if cell.value is not None:
            flag = 1
    if flag == 0:
        ws1.delete_rows(row, 1)
Help is appreciated!
The following is a generic approach to finding and then deleting empty rows.
empty_rows = []
for idx, row in enumerate(ws.iter_rows(max_col=50), start=1):
    empty = not any((cell.value for cell in row))
    if empty:
        empty_rows.append(idx)
for row_idx in reversed(empty_rows):
    ws.delete_rows(row_idx, 1)
Thanks to Charlie Clark for the help, here is a working solution I came up with, let me know if I can make any improvements to it:
i = 1
emptyRows = []
for row in ws1.iter_rows(min_row = 1, max_col=50, max_row = ws1.max_row):
    flag = 0
    for cell in row:
        if cell.value is not None:
            flag = 1
    if flag == 0:
        emptyRows.append(i)
    i += 1
for x in emptyRows:
    ws1.delete_rows(x, 1)
    emptyRows[:] = [y - 1 for y in emptyRows]
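Both solutions above have to cope with the same effect: deleting a row shifts every later row up by one, so row numbers collected earlier go stale. The sketch below illustrates this on a plain Python list rather than a worksheet (an analogy only; the function names are hypothetical and the positions are 0-based, unlike openpyxl's 1-based rows).

```python
def delete_rows_forward(rows, indices):
    """Naive deletion in ascending order: later indices go stale."""
    rows = list(rows)
    for idx in indices:
        del rows[idx]
    return rows


def delete_rows_reversed(rows, indices):
    """Delete from the bottom up so the remaining indices stay valid."""
    rows = list(rows)
    for idx in sorted(indices, reverse=True):
        del rows[idx]
    return rows


# None marks an empty row; positions 1 and 2 are empty.
sheet = ["a", None, None, "b", "c"]
empty = [i for i, row in enumerate(sheet) if row is None]

print(delete_rows_forward(sheet, empty))
print(delete_rows_reversed(sheet, empty))
```

The forward version removes a row that holds data and leaves an empty one behind, which is the same kind of skipping the first loop in the question runs into. Deleting in reverse, or decrementing the remaining indexes after each deletion as in the second solution, keeps the bookkeeping correct.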

How do I put all my looped output in a variable (for generating an output file)? (CSV related)

I am quite new to working with Python, so I hope you can help me out here. I have to write a program that opens a CSV file, reads it, and lets you select the columns you want by entering their numbers. Those columns have to be written to a new file. The problem is: after I enter the columns I want and type "X" to start the main part, it generates exactly what I want, but only by printing inside a loop, not by printing a variable that contains the result. For the csv writer I need a variable containing it. Any ideas? Here is my code; feel free to ask questions. The CSV file looks like this:
john, smith, 37, blue, michigan
tom, miller, 25, orange, new york
jack, o'neill, 40, green, Colorado Springs
...etc
Code is:
import csv
with open("test.csv","r") as t:
    t_read = csv.reader(t, delimiter=",")
    t_list = []
    max_row = 0
    for row in t_read:
        if len(row) != 0:
            if max_row < len(row):
                max_row = len(row)
            t_list = t_list + [row]
        print([row], sep = "\n")
    twrite = csv.writer(t, delimiter = ",")
    tout = []
    counter = 0
    matrix = []
    for i in range(len(t_list)):
        matrix.append([])
    print(len(t_list), max_row, len(matrix), "Rows / Columns / Matrix Dimension")
    eingabe = input("Enter column number you need or X to start generating output: ")
    nr = int(eingabe)
    while type(nr) == int:
        colNr = nr-1
        if max_row > colNr and colNr >= 0:
            nr = int(nr)
            # print (type(nr))
            for i in range(len(t_list)):
                row_A=t_list[i]
                matrix[i].append(row_A[int(colNr)])
                print(row_A[int(colNr)])
            counter = counter +1
            matrix.append([])
        else:
            print("ERROR")
        nr = input("Enter column number you need or X to start generating output: ")
        if nr == "x":
            print("\n"+"Generating Output... " + "\n")
            for row in matrix:
                # Loop over columns.
                for column in row:
                    print(column + " ", end="")
                print(end="\n")
        else:
            nr = int(nr)
            print("\n")
t.close()
Well, you have everything you need in matrix, apart from one erroneous line that adds an unneeded row:
    counter = counter +1
    matrix.append([]) # <= remove this line
else:
    print("ERROR")
You can then simply do:
if nr == "x":
    print("\n"+"Generating Output... " + "\n")
    # newline="" stops the csv module from writing blank lines on Windows
    with open("testout.csv", "w", newline="") as out:
        wr = csv.writer(out, delimiter=",")
        wr.writerows(matrix)
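To show the whole flow without interactive input, here is a small sketch that keeps the selected columns in a variable and hands it to `csv.writer`. It uses in-memory buffers instead of real files so it can be run as is; the helper name `select_columns` and the chosen column numbers are assumptions for illustration.

```python
import csv
import io


def select_columns(rows, column_numbers):
    """Keep only the requested 1-based columns from every row."""
    return [[row[n - 1] for n in column_numbers] for row in rows]


# Sample data in the shape shown in the question.
source = io.StringIO(
    "john, smith, 37, blue, michigan\n"
    "tom, miller, 25, orange, new york\n"
)
# skipinitialspace drops the blank after each comma; empty rows are skipped.
rows = [row for row in csv.reader(source, skipinitialspace=True) if row]

matrix = select_columns(rows, [1, 3])

# Write to an in-memory buffer; a real file would be opened with newline="".
out = io.StringIO()
csv.writer(out).writerows(matrix)
print(out.getvalue())
```

Because `matrix` is an ordinary list of lists, the same variable can be printed, inspected, or written with `writerows` in one call.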

Using index in Python to iterate through columns

I am trying to iterate through columns in a data file, to perform my task and then save the output to a file. I have almost 200 columns and unfortunately so far I can only get the required output by changing the column index manually (where ###). I have managed to get the index numbers that I want to use from my row names into a list (called x). I've been playing around with this but I am stuck as to how to make it iterate through these indices in the correct places. Below is what I have so far:
with open('matrix.txt', 'r') as file:
    motif = file.readline().split()
    x = [i for i, j in enumerate(motif)]
    print x ### list of indices I want to use
    for column in (raw.strip().split() for raw in file):
        chr = column[0].split("_")
        coordinates = "\t".join(chr)
        name = motif[1] ### using column index
        print name
        for value in column[1]: ### using column index
            if value == "1":
                print coordinates
                out = open("%s.bed" %name, "a")
                out.write(str(coordinates)+"\n")
            elif value == "0":
                pass
When I return x I get:
x = [0, 1, 2, 3, 4,...]
Using motif[x[1]] returns the correct names and columns, however this is the same as me putting the index in manually. Any help is appreciated!
Instead of:
name = motif[1] ### using column index
print name
for value in column[1]: ### using column index
    if value == "1":
        print coordinates
        out = open("%s.bed" %name, "a")
        out.write(str(coordinates)+"\n")
    elif value == "0":
        pass
you can iterate through x since x is a list of the column indices:
for index in x:
    name = motif[index]
    print name
    for value in column[index]:
        if value == "1":
            print coordinates
            out = open("%s.bed" %name, "a")
            out.write(str(coordinates)+"\n")
        elif value == "0":
            pass
You can read more about for loops here.
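For comparison, here is a Python 3 sketch of the same loop that collects the coordinates per motif in a dictionary before anything is written. It rests on assumptions not stated in the question: the header line has one label per column (including the first), fields are whitespace separated, and the flags are the strings "0" and "1". The function name `coordinates_by_motif` is hypothetical. The loop starts at index 1 because column 0 holds the row name, and iterating over its characters could match a "1" by accident.

```python
import io
from collections import defaultdict


def coordinates_by_motif(lines):
    """Collect, per motif column, the coordinates of rows flagged "1"."""
    lines = iter(lines)
    motif = next(lines).split()  # header: one name per column
    hits = defaultdict(list)
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        coordinates = "\t".join(fields[0].split("_"))
        # Start at 1: column 0 holds the row name, not a 0/1 flag.
        for index in range(1, len(fields)):
            if fields[index] == "1":
                hits[motif[index]].append(coordinates)
    return dict(hits)


# Assumed file layout, held in memory instead of matrix.txt.
sample = io.StringIO(
    "region m1 m2\n"
    "chr1_100_200 1 0\n"
    "chr2_300_400 1 1\n"
)
result = coordinates_by_motif(sample)
print(result)
```

Each entry of `result` can then be written to its own .bed file inside a `with open(...)` block, which also closes the file handles that the loop above leaves open.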
