Removing Line From Python List - All Lines Containing Certain Number

I am looking to remove lines from a list that have a 0 in the 4th position. When I write out the file now, it is not eliminating all the zero lines.
counter = 0
for j in all_decisions:
    if all_decisions[counter][4] == 0:
        all_decisions.remove(j)
    counter += 1
ofile = open("non_zero_decisions1.csv","a")
writer = csv.writer(ofile, delimiter=',')
for each in all_decisions:
    writer.writerow(each)
ofile.close()

Use a list comprehension.
all_decisions = [x for x in all_decisions if x[4] != 0]
Or, use filter (in Python 3 it returns an iterator, so wrap it in list() if you need a list).
all_decisions = list(filter(lambda x: x[4] != 0, all_decisions))
The way you're doing this is not a good idea because you're modifying all_decisions while you're iterating over it. If you wanted to do it in a loop, I would suggest something like:
temp = []
for x in all_decisions:
    if x[4] != 0:
        temp.append(x)
all_decisions = temp
But this is basically just a more verbose equivalent of the list comprehension and filter approaches I showed above.
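To make the "modifying while iterating" problem concrete, here is a small sketch. The sample rows are made up for illustration; only the index 4 check comes from the question.

```python
# Hypothetical sample data: each row carries a flag at index 4.
rows = [
    ["a", 1, 2, 3, 0],
    ["b", 1, 2, 3, 0],
    ["c", 1, 2, 3, 5],
    ["d", 1, 2, 3, 0],
]

# Buggy: removing from the list being iterated shifts later items left,
# so the item that slides into the freed slot is never visited.
buggy = [list(r) for r in rows]
for row in buggy:
    if row[4] == 0:
        buggy.remove(row)

# Safe: build a new list instead of mutating the one being iterated.
kept = [row for row in rows if row[4] != 0]

print([r[0] for r in buggy])  # ['b', 'c']  (row "b" has a 0 but survived)
print([r[0] for r in kept])   # ['c']
```

Row "b" is skipped because it moved into position 0 right after "a" was removed, while the loop had already advanced to position 1.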

I think the problem is in your loop which eliminates the lines:
counter = 0
for j in all_decisions:
    if all_decisions[counter][4] == 0:
        all_decisions.remove(j)
    counter += 1
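If you remove an element, the elements after it shift down by one, but the loop still moves on to the next position. The consequence of that is that you're skipping the line right after each removed one, so you might miss lines to be removed. If you want to remove in place, index the list yourself and only bump the counter if you didn't remove an element, i.e.
counter = 0
while counter < len(all_decisions):
    if all_decisions[counter][4] == 0:
        del all_decisions[counter]
    else:
        counter += 1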
That being said, a more concise way to do what you want would be
with open("non_zero_decisions1.csv","a") as ofile:
    writer = csv.writer(ofile, delimiter=',')
    writer.writerows(d for d in all_decisions if d[4] != 0)
The with statement takes care of calling close on ofile after executing the block, even if an exception is thrown. Also, csv.writer has a writerows method which takes an iterable of rows. Thirdly, you can use a generator expression (d for d in all_decisions if d[4] != 0) to replace your filtering loop.
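One thing the question does not show is how all_decisions was built. If the rows came from csv.reader, every field is a string, and comparing a string with the integer 0 never matches, so nothing would be filtered. The sketch below assumes that situation; the sample data and the in-memory buffer are made up for illustration.

```python
import csv
import io

# Hypothetical input; csv.reader yields every field as a string.
raw = "a,1,2,3,0\nb,1,2,3,7\nc,1,2,3,0\n"
all_decisions = list(csv.reader(io.StringIO(raw)))

# Comparing a string to the integer 0 never matches, so nothing is dropped.
unfiltered = [d for d in all_decisions if d[4] != 0]

# Comparing against the string "0" (or using int(d[4]) != 0) filters as intended.
filtered = [d for d in all_decisions if d[4] != "0"]

# Written to an in-memory buffer here; a real file opened with newline=""
# can be passed to csv.writer in the same way.
out = io.StringIO()
csv.writer(out).writerows(filtered)
result = out.getvalue()

print(len(unfiltered), len(filtered))  # 3 1
```

If the values in all_decisions are already integers, the original comparison with 0 is fine and this caveat does not apply.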

Related

Splitting a special type of list data and saving data into two separate dataframe using condition in python

I want to separate list data into two parts based on a condition. If the value is less than "H1000", it should go into a first dataframe (output for list 1), and if it is greater than or equal to "H1000", it should go into a second dataframe (output for list 2). The first column holds values that start with H followed by four digits.
Here is my python code:
with open(fn) as f:
    text = f.read().strip()
print(text)
lines = [[(Path(fn.name), line_no + 1, col_no + 1, cell) for col_no, cell in enumerate(
    re.split('\t', l.strip())) if cell != ''] for line_no, l in enumerate(re.split(r'[\r\n]+', text))]
print(lines)
if (lines[:][:][3] == "H1000"):
    list1
    list2
I am not able to write the Python logic to divide the list data into two parts.

So basically you want to check whether the number after the H is below 1000, right? If so, you can do it like this:
with open(fn) as f:
    text = f.read().strip()
print(text)
lines = [[(Path(fn.name), line_no + 1, col_no + 1, cell) for col_no, cell in enumerate(
    re.split('\t', l.strip())) if cell != ''] for line_no, l in enumerate(re.split(r'[\r\n]+', text))]
print(lines)
list1, list2 = [], []
for row in lines:
    if not row:
        continue  # skip lines with no non-empty cells
    value = row[0][3]  # cell text of the first column, e.g. "H0002"
    if value[1:].isdigit():
        if int(value[1:]) < 1000:
            list1.append(row)
        else:
            list2.append(row)
Note that lines[:][:][3] is just lines[3], so each row has to be visited in a loop. We simply slice off the numerical part of the "Hxxxx" code in the first column, convert it to an integer, and compare it with 1000.

with open(fn) as f:
    text = f.read().strip()
lines = text.split('\n')
list1 = []
list2 = []
for i in lines:
    if int(i.split(' ')[0].replace("H", "")) >= 1000:
        list2.append(i)
    else:
        list1.append(i)
print(list1)
print("***************************************")
print(list2)

I'm not sure exactly where the problem lies. Assuming you read the above text file line by line, you can simply compare the first five characters of each line as strings (fixed-width, zero-padded codes sort lexicographically in numeric order), e.g.
lines = """
H0002 Version 3
H0003 Date_generated 5-Aug-81
H0004 Reporting_period_end_date 09-Jun-99
H0005 State WAA
H0999 Tene_no/Combined_rept_no E79/38975
H1001 Tene_holder Magnetic Resources NL
""".strip().split("\n")
# Or
# with open(fn) as f: lines = f.readlines()
list_1, list_2 = [], []
for line in lines:
    # Codes below "H1000" go to the first list, everything else to the second.
    if line[:5] < "H1000":
        list_1.append(line)
    else:
        list_2.append(line)
print(list_1, list_2, sep="\n")
# ['H0002 Version 3', 'H0003 Date_generated 5-Aug-81', 'H0004 Reporting_period_end_date 09-Jun-99', 'H0005 State WAA', 'H0999 Tene_no/Combined_rept_no E79/38975']
# ['H1001 Tene_holder Magnetic Resources NL']
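If the codes are not guaranteed to be zero-padded to a fixed width, a numeric comparison is more robust than a string comparison. The following is a minimal sketch; split_by_code is a hypothetical helper name, the sample lines are partly made up (the H1000 line in particular), and it assumes the first whitespace-separated field is "H" followed by digits.

```python
def split_by_code(lines, threshold=1000):
    """Split lines by the number after 'H' in the first field (hypothetical helper)."""
    below, at_or_above = [], []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        code = fields[0]
        if not (code.startswith("H") and code[1:].isdigit()):
            continue  # skip lines that do not match the assumed format
        if int(code[1:]) < threshold:
            below.append(line)
        else:
            at_or_above.append(line)
    return below, at_or_above


sample = [
    "H0002 Version 3",
    "H0999 Tene_no/Combined_rept_no E79/38975",
    "H1000 Example_field example",  # made-up line to show the boundary case
    "H1001 Tene_holder Magnetic Resources NL",
]
list_1, list_2 = split_by_code(sample)
print(list_1)
print(list_2)
```

Each of the two lists can then be turned into its own dataframe if that is the desired output.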

Do not add "," after last field in CSV

f=open('students.csv', 'r')
a=f.readline()
length=len(a.split(","))
fw=open('output.csv', 'w')
lst = []
while a:
    lst.append(a)
    a=f.readline()
for counter in range(length):
    for item in lst:
        x = len(item.split(","))
        if x == length:
            x = item.split(",")
            #here i want if condition to check whether it is the last element of row and add","?
            fw.write(x[counter].split("\n")[0]+",")
            #elif the condition that it is the last element of each row to not add ","?
    fw.write("\n")
fw.close()
f.close()

join will be your friend here, if you cannot use the csv module:
for counter in range(length):
    fw.write(','.join(x[counter] for x in (item.split(',') for item in lst)))
    fw.write('\n')
But you should first strip the end of line characters:
a=f.readline().strip()
length=len(a.split(","))
fw=open('output.csv', 'w')
lst = []
while a:
    lst.append(a)
    a=f.readline().strip()
But your code is neither Pythonic nor efficient.
You split the same string in every iteration of counter, when you could have split it once at read time. Next, the Pythonic way to iterate over the lines of a text file is to iterate over the file object. And finally, with ensures that the files will be properly closed at the end of the block. Your code could become:
with open('students.csv', 'r') as f, open('output.csv', 'w') as fw:
    lst = [a.strip().split(',') for a in f]
    length = len(lst[0])
    for counter in range(length):
        fw.write(','.join(x[counter] for x in lst))
        fw.write('\n')
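If the csv module is allowed after all, the same column-by-column output can be sketched with zip, and the writer takes care of placing separators only between fields. The file contents below are made up, and in-memory buffers stand in for students.csv and output.csv.

```python
import csv
import io

# Hypothetical contents standing in for students.csv.
source = io.StringIO("name,age\nann,20\nbob,21\n")
rows = list(csv.reader(source))

# zip(*rows) pairs up the first fields, then the second fields, and so on.
# It stops at the shortest row, so ragged input is silently truncated.
columns = list(zip(*rows))

# Stand-in for output.csv; no trailing comma is written after the last field.
target = io.StringIO()
csv.writer(target, lineterminator="\n").writerows(columns)
transposed = target.getvalue()
print(transposed)
```

With the sample above, the output has one line per original column: name,ann,bob and age,20,21.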

Python looping and appending arrays

I'm creating an application to test how many 'steps' it takes for a number to reach 1 using the Collatz conjecture. Here is the code:
import sys
import csv
def findSteps(mode):
    count = 0
    while mode != 1:
        count = count + 1
        if mode%2 == 0:
            mode = mode/2
        else:
            mode = (3*mode)+1
    return count
numbers = []
counts = []
for n in range(1, 100):
    important = findSteps(n)
    numbers[n] = n
    counts[n] = important
with open('colatz conjecture table.csv', 'wb') as csvfile:
    myWriter = csv.writer(csvfile, delimiter=' ', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    myWriter.writerow(numbers)
    myWriter.writerow(counts)
Unfortunately, whenever I run it, it gives me "IndexError: list assignment index out of range".

In addition to the list.append() variant, you can also use
numbers = range(1, 100)
counts = [findSteps(n) for n in numbers]
Or, if you'd like to keep it functional:
numbers = range(1, 100)
counts = map(findSteps, numbers)

Both numbers and counts are lists, based on how you defined them in your code, and both start out empty. Assigning to numbers[n] fails because that index does not exist yet, so use the append method to add the data to both of them. Just remember that the indexes are zero-based.
Moreover, I agree with @Evert's comment: it looks like a dictionary is better suited to your needs.
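A minimal sketch of the append variant and the dictionary variant, written for Python 3. The function name find_steps and the in-memory buffer are choices made for this example, not part of the original code.

```python
import csv
import io


def find_steps(n):
    """Count Collatz steps until n reaches 1."""
    count = 0
    while n != 1:
        # Integer division keeps n an int under Python 3.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        count += 1
    return count


numbers = []
counts = []
for n in range(1, 100):
    # append grows the list; numbers[n] = n fails on an empty list.
    numbers.append(n)
    counts.append(find_steps(n))

# A dict maps each number to its step count directly.
steps_by_number = dict(zip(numbers, counts))

# In-memory buffer for the sketch; with a real file under Python 3,
# open it with mode "w" and newline="" rather than "wb".
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(numbers)
writer.writerow(counts)

print(steps_by_number[6])  # 8
```

The dictionary makes lookups by number direct, while the two parallel lists match the two-row CSV layout from the question.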

Reading coordinates from text file in python

I have a text file (coordinates.txt):
30.154852145,-85.264584254
15.2685169645,58.59854265854
...
I have a python script and there is a while loop inside it:
count = 0
while True:
    count += 1
    c1 =
    c2 =
For every run of the above loop, I need to read each line (count) and set c1,c2 to the numbers of each line (separated by the comma). Can someone please tell me the easiest way to do this?
============================
import csv
count = 0
while True:
    count += 1
    print 'val:',count
    for line in open('coords.txt'):
        c1, c2 = map(float, line.split(','))
        break
    print 'c1:',c1
    if count == 2: break

The best way to do this would be, as I commented above:
import csv
with open('coordinates.txt') as f:
    reader = csv.reader(f)
    for count, (c1, c2) in enumerate(reader):
        # The csv module yields strings, so cast them to floats.
        c1, c2 = float(c1), float(c2)
        # Do what you want with the variables.
I have also included a better way to use the count variable using enumerate, as pointed out by @abarnert.

f=open('coordinates.txt','r')
count =0
for x in f:
    x=x.strip()
    c1,c2 = x.split(',')
    count +=1
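If the while loop from the question has to stay, one option is to pull a single row from the reader on each pass with next(). This is only a sketch: the in-memory buffer stands in for coordinates.txt, and the points list is added just to show the parsed values.

```python
import csv
import io

# Hypothetical stand-in for coordinates.txt.
source = io.StringIO("30.154852145,-85.264584254\n15.2685169645,58.59854265854\n")
reader = csv.reader(source)

count = 0
points = []
while True:
    # next() with a default returns None once the input is exhausted.
    row = next(reader, None)
    if row is None:
        break
    count += 1
    # Each row is a list of two strings; convert both to floats.
    c1, c2 = map(float, row)
    points.append((c1, c2))

print(count, points)
```

Blank lines in the file would produce empty rows and break the unpacking, so real input may need a check for that.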

Python trouble exiting a 'while' loop for simple script

I wrote a script to reformat a tab-delimited matrix (with header) into a "long format". See example below. It performs the task correctly but it seems to get stuck in an endless loop...
Example of input:
WHO THING1 THING2
me me1 me2
you you1 you2
Desired output:
me THING1 me1
me THING2 me2
you THING1 you1
you THING2 you2
Here is the code:
import csv
matrix_file = open('path')
matrix_reader = csv.reader(matrix_file, delimiter="\t")
j = 1
while j:
    matrix_file.seek(0)
    rownum = 0
    for i in matrix_reader:
        rownum+=1
        if j == int(len(i)):
            j = False
        elif rownum ==1:
            header = i[j]
        else:
            print i[0], "\t",header, "\t",i[j]
j +=1
I think it has to do with my exit command (j = False). Any ideas?
Edit: Thanks for the suggestions. I think a typo in my initial posting led to some confusion, sorry about that. For now I have employed a simple solution:
valid = True
while valid:
    matrix_file.seek(0)
    rownum = 0
    for i in matrix_reader:
        rownum+=1
        if j == int(len(i)):
            valid = False
        etc, etc, etc...

Your j += 1 is outside the while loop, so j never increases. If len(i) is never less than 2, then you'll have an infinite loop.
But as has been observed, there are other problems with this code. Here's a working version based on your idiom. I would do a lot of things differently, but perhaps you'll find it useful to see how your code could have worked:
j = 1
while j:
    matrix_file.seek(0)
    rownum = 0
    for i in matrix_reader:
        rownum += 1
        if j == len(i) or j == -1:
            j = -1
        elif rownum == 1:
            header = i[j]
        else:
            print i[0], "\t", header, "\t", i[j]
    j += 1
It doesn't print the rows in the order you wanted, but it gets the basics right.
Here's how I would do it instead. I see that this is similar to what Ashwini Chaudhary posted, but a bit more generalized:
import csv
matrix_file = open('path')
matrix_reader = csv.reader(matrix_file, delimiter="\t")
headers = next(matrix_reader, '')
for row in matrix_reader:
    for header, value in zip(headers[1:], row[1:]):
        print row[0], header, value
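The code in this thread uses Python 2 print statements. Under Python 3, the same zip-based reshaping could be sketched as a small generator; wide_to_long is a hypothetical name, and the in-memory sample mirrors the example input from the question.

```python
import csv
import io


def wide_to_long(lines, delimiter="\t"):
    """Yield (row_label, header, value) triples from a matrix with a header row."""
    reader = csv.reader(lines, delimiter=delimiter)
    headers = next(reader, [])
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Pair every column header (after the first) with the matching cell.
        for header, value in zip(headers[1:], row[1:]):
            yield row[0], header, value


# Stand-in for the tab-delimited input file.
sample = io.StringIO("WHO\tTHING1\tTHING2\nme\tme1\tme2\nyou\tyou1\tyou2\n")
for triple in wide_to_long(sample):
    print("\t".join(triple))
```

Because the reader is consumed in a single pass, no seek(0) or outer while loop is needed, and the rows come out in the order shown in the desired output.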

j += 1 is outside the while loop, as senderle's answer says.
Other improvements:
int(len(i)): just use len(i). len() always returns an int, so there is no need for int() around it.
Use for rownum, i in enumerate(matrix_reader): so there is no need to maintain a separate rownum variable; it is incremented automatically.
EDIT: A working version of your code. I don't think there's a need for while here; the for loop is sufficient.
import csv
matrix_file = open('data1.csv')
matrix_reader = csv.reader(matrix_file, delimiter="\t")
header=matrix_reader.next()[0].split() #now header is ['WHO', 'THING1', 'THING2']
for i in matrix_reader:
    line=i[0].split()
    print line[0], "\t",header[1], "\t",line[1]
    print line[0], "\t",header[2], "\t",line[2]
