Read file and separate to different lists - python

Below is my CSV dataset:
a,d,g
b,e,h
c,f,i
I would like to separate these 3 columns as row[0], row[1], and row[2], and make them into 3 different lists.
Here is my code:
import csv
file1 = open('....my_path.csv')
Row_0 = [row[0].upper() for row in csv.reader(file1)]
Row_1 = [row[1].upper() for row in csv.reader(file1)]
Row_2 = [row[2].upper() for row in csv.reader(file1)]
print Row_0
print Row_1
print Row_2
However, I can only see the Row_0 result in the console; Row_1 and Row_2 always show [ ]. In other words, I only get the first list, not the second and third:
['A', 'B', 'C']
[]
[]
Can anyone help me with this "simple" issue?

open returns a file object, which is an iterator that becomes exhausted (no longer usable) once you have iterated through it. Each csv.reader(file1) call wraps that same file object, so the first list comprehension reads it to the end, and every later csv.reader(file1) starts from an already exhausted iterator and yields nothing.
To do what you want, you would need something like:
import csv
file1 = open('....my_path.csv')
# zip(*rows) transposes the rows into columns (each column comes back as a tuple)
Row_0, Row_1, Row_2 = zip(*([cell.upper() for cell in row]
                            for row in csv.reader(file1)))
print Row_0
print Row_1
print Row_2
Now, we get all of the data in one read and only iterate over the iterator once.
Also, in case you are wondering, the code I used is known as a generator expression. You could also use a list comprehension:
Row_0, Row_1, Row_2 = zip(*[[cell.upper() for cell in row]
                            for row in csv.reader(file1)])
but I don't see a point in building a list just to throw it away.
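To make the exhaustion visible, here is a minimal Python 3 sketch. It assumes an in-memory io.StringIO as a stand-in for the real file (the actual path is not shown in the question), and the names first_pass, second_pass, and col_0 to col_2 are only illustrative.

```python
import csv
import io

CSV_TEXT = "a,d,g\nb,e,h\nc,f,i\n"  # same data as in the question

# Hypothetical in-memory stand-in for open('....my_path.csv').
file1 = io.StringIO(CSV_TEXT)

# The first comprehension consumes the whole file object...
first_pass = [row[0].upper() for row in csv.reader(file1)]
# ...so a second reader over the same object has nothing left to read.
second_pass = [row[1].upper() for row in csv.reader(file1)]

# Single pass over a fresh object: read every row once, then transpose.
fresh = io.StringIO(CSV_TEXT)
rows = [[cell.upper() for cell in row] for row in csv.reader(fresh)]
col_0, col_1, col_2 = (list(col) for col in zip(*rows))

print(first_pass, second_pass)
print(col_0, col_1, col_2)
```

The list(col) conversion is only there because zip returns tuples; drop it if tuples are fine for your use.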

You've read through the whole file! Check the file position after your first comprehension.
You could call seek(0) between comprehensions, iterate only once, or reopen the file, as @kdopen stated.
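A rough Python 3 sketch of the seek(0) option, again using an io.StringIO object as an assumed stand-in for the opened file; position_after_first_pass is just an illustrative name for what tell() reports.

```python
import csv
import io

data = "a,d,g\nb,e,h\nc,f,i\n"
file1 = io.StringIO(data)  # hypothetical stand-in for open('....my_path.csv')

Row_0 = [row[0].upper() for row in csv.reader(file1)]
position_after_first_pass = file1.tell()  # now at the end of the data

file1.seek(0)  # rewind before the next pass
Row_1 = [row[1].upper() for row in csv.reader(file1)]

file1.seek(0)  # and again before the last pass
Row_2 = [row[2].upper() for row in csv.reader(file1)]

print(Row_0, Row_1, Row_2)
```

This reads the file three times, so for large files the single-pass approach is probably the better choice.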

Related

Function not executing through all the rows

Need help understanding why this is only returning the results for the first row and not the remaining rows inside a csv file. Thank you
import csv

data = []
with open('customerData.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for rows in reader:
        data.append(rows)

print(data[0])
print(data[1]["Name"])
print(data[2]["Zip"])
print(data[3]["Gender"])
print(data[3]["Favorite Radio Station"])
When referencing items in a list, you can use the [] notation to indicate which item in the list you want. The first item in the list is some_list[0], the second is some_list[1], and so on. You can also go from the end of the list, where some_list[-1] is the last item, and some_list[-2] is the second to last, and so on.
However, if you want to print every row, you need to iterate through the list like so:
for item in data:
    print(item)
From here you can reference the keys of the item in the list directly, in a similar manner to how you did originally in your code:
for item in data:
    print(item["Name"])
    print(item["Zip"])
    print(item["Gender"])
Hope this helps!
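For a self-contained version of the loop, here is a small Python 3 sketch. The sample rows are made up (the real customerData.csv is not shown in the question), and io.StringIO stands in for the file.

```python
import csv
import io

# Hypothetical sample data using the column names from the question.
sample = io.StringIO(
    "Name,Zip,Gender,Favorite Radio Station\n"
    "Ann,10001,F,Station A\n"
    "Bob,20002,M,Station B\n"
)

# DictReader yields one dict per data row; list() collects them all.
data = list(csv.DictReader(sample))

names = []
for item in data:
    names.append(item["Name"])
    print(item["Name"], item["Zip"], item["Gender"])

# Negative indexes count from the end of the list.
last_zip = data[-1]["Zip"]
```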

Alternative to "for" LINQ equivalent of Where?

I have the contents of a CSV file in a variable raw_data. I only need certain rows, depending on whether the first element of the row (row_split[0]) matches a price_number. The code below works fine; however, coming from a C# background, I know there is a LINQ equivalent. Is there anything in Python like Where or Any that can also include the row.split(',')?
for row in raw_data:
    row_split = row.split(',')
    if str(row_split[0]) == price_number:
        filtered_data.append(row)
I think you can do it using a list comprehension:
filtered_data = [row for row in raw_data if str(row.split(',')[0]) == price_number]
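If you want something that reads closer to LINQ, the built-ins filter() and any() play roughly the roles of Where and Any. A minimal sketch, assuming raw_data is a list of comma-separated strings (the sample rows here are invented):

```python
# Hypothetical sample rows; the real raw_data is not shown in the question.
raw_data = ["101,apple,1.20", "102,pear,0.80", "101,plum,2.50"]
price_number = "101"

# Rough counterpart of LINQ Where: keep rows whose first field matches.
filtered_data = list(filter(lambda row: row.split(",")[0] == price_number, raw_data))

# Rough counterpart of LINQ Any: True as soon as one row matches.
has_match = any(row.split(",")[0] == price_number for row in raw_data)

print(filtered_data, has_match)
```

The list comprehension is usually considered the more idiomatic of the two in Python, so this is mostly a matter of taste.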

Python - Get item from a list under a list

I have a list like below.
list = [[Name,ID,Age,mark,subject],[karan,2344,23,87,Bio],[karan,2344,23,87,Mat],[karan,2344,23,87,Eng]]
I need to get only the name 'Karan' as output.
How can I get that?
This is a 2D list,
list[i][j]
will give you the 'i'th list within your list and the 'j'th item within that list.
So to get karan you want list[1][0]
I upvoted Lio Elbammalf, but decided to provide an answer that made a couple of assumptions that should have been clarified in the question:
1. The first item of the list is the header row; the headers really are in the list (not just added for the question), and they are included because there is no guarantee that the columns will always be in the same order.
2. This is probably a CSV file.
Ignoring 2 for the moment, what you would want to do is remove the "headers" from the list (so that the rest of the list is uniform), and then find the index of "Name" (your desired output).
myinput = [["Name","ID","Age","mark","subject"],
           ["karan",2344,23,87,"Bio"],
           ["karan",2344,23,87,"Mat"],
           ["karan",2344,23,87,"Eng"]]
## Remove the headers from the list to simplify everything
headers = myinput.pop(0)
## Figure out where to find the person's Name
nameindex = headers.index("Name")
## Build a list of the Name in each row
names = [stats[nameindex] for stats in myinput]
If the name is guaranteed to be the same in each row, then you can just use myinput[0][nameindex], as suggested in the other answer.
Now, if 2 is true, I'm assuming you're using the csv module, in which case load the file using the DictReader class and then just access each row using the 'Name' key:
import csv

def loadfile(myfile):
    with open(myfile) as f:
        reader = csv.DictReader(f)
        return list(reader)

def getname(rows):
    ## This is the same return as above, and again you can just
    ## return rows[0]['Name'] if you know you only need the first one
    return [row['Name'] for row in rows]
In Python 3 you can do this
_, [x, _, _, _, _], *_ = ls
Now x will be karan.
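A slightly shorter variant of the same unpacking idea, as a sketch: it assumes ls holds the list from the question (with the values quoted as strings) and uses a nested starred target so the inner row length does not have to be spelled out.

```python
# Assumed contents of ls, based on the list in the question.
ls = [["Name", "ID", "Age", "mark", "subject"],
      ["karan", 2344, 23, 87, "Bio"],
      ["karan", 2344, 23, 87, "Mat"],
      ["karan", 2344, 23, 87, "Eng"]]

# Skip the header row, take the first field of the second row, ignore the rest.
_, (x, *_), *_ = ls

print(x)
```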

Python looping a dictionary multiple times

My issue is from a much larger program but I shrunk and dramatically simplified the specific problem for the purpose of this question.
I've used csv.DictReader to create a dictionary from a CSV file. I want to loop through the dictionary printing out its contents, which I can, but I want to do this multiple times.
The content of test.csv is simply one column with the numbers 1-3 and a header row called Number.
GetData is a class I wrote, with a method called create_dict() that creates and returns a dictionary from test.csv using csv.DictReader.
My code is as follows:
import csv

class GetData:
    def __init__(self, file):
        self._name = file

    def create_dict(self):
        data = csv.DictReader(open(self._name, 'r'), delimiter=",")
        return data

dictionary = GetData('test.csv').create_dict()
for i in range(5):
    print("outer loop")
    for row in dictionary:
        print(row['Number'])
The output is as follows:
outer loop
1
2
3
outer loop
outer loop
outer loop
outer loop
My desired output is:
outer loop
1
2
3
outer loop
1
2
3
outer loop
1
2
3
outer loop
1
2
3
outer loop
1
2
3
Does anyone know why this happens in Python?
Since you're using a file object, it's reading from the cursor position. This isn't a problem the first time through because the position is at the beginning of the file. After that, it's reading from the end of the file to the, well, end of the file.
I'm not sure how GetData works, but see if it has a seek command in which case:
for i in range(5):
    print('outer loop')
    dictionary.seek(0)
    for row in dictionary:
        print(row['Number'])
As g.d.d.c points out in a comment, it may also be a generator instead of a file object, in which case this approach is flawed. A generator will only run once, so you may have to copy its items into a list with list(). It all depends on how GetData.create_dict works!
As per your comment that GetData.create_dict gives you a csv.DictReader, your options are somewhat limited. A DictReader is an iterator that yields one dict per row, so you can turn it into a list of dicts:
list_of_dicts = [row for row in dictionary]
Then you can iterate through list_of_dicts:
for i in range(5):
    print('outer loop')
    for row in list_of_dicts:
        print(row['Number'])
csv.DictReader is an iterator for the associated open file. After one loop over the file, you're at the end (EOF).
To loop over it again, simply seek to the beginning of the file: your_filehandle.seek(0)
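Here is a small Python 3 sketch of the "read once, loop many times" approach. It assumes an io.StringIO stand-in for test.csv with the contents described in the question; rows, leftover, and seen are illustrative names.

```python
import csv
import io

# Hypothetical stand-in for test.csv: a "Number" header and the values 1-3.
source = io.StringIO("Number\n1\n2\n3\n")

reader = csv.DictReader(source)
rows = list(reader)      # consume the reader exactly once
leftover = list(reader)  # a second pass over the reader finds nothing

seen = []
for i in range(5):
    print("outer loop")
    for row in rows:     # a list can be iterated as often as needed
        print(row["Number"])
        seen.append(row["Number"])
```

If you go the seek(0) route with a DictReader instead, it is worth checking whether the header line comes back as a data row, since the reader has already taken its field names from it.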

Python: writing int and list in a csv row

Maybe this question is too naive, but it is giving me a hard time! I want to write 2 float values and a list of ints to a row in a CSV file, in a loop. The file may or may not exist before an attempt is made to write to it. If it does not, a new file should be created. This is what I am doing:
import csv

f = open('stat.csv','a')
try:
    writer=csv.writer(f,delimiter=' ',quoting=csv.QUOTE_MINIMAL)
    writer.writerow((some_float1,some_float2,alist))
finally:
    f.close()
where alist = [2,3,4,5]. I am getting the following output:
some_float1 some_float2 "[2,3,4,5]"
What I want is this:
some_float1 some_float2 2 3 4 5
i.e. I would like to get rid of the "" and the square brackets, and make the delimiter consistent throughout. Any suggestions?
How about:
writer.writerow([some_float1, some_float2] + alist)
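To see what that produces, here is a minimal Python 3 sketch that writes to an in-memory buffer rather than stat.csv. The float values are made up for illustration, and io.StringIO is an assumed stand-in for the opened file.

```python
import csv
import io

some_float1, some_float2 = 1.5, 2.5  # hypothetical values
alist = [2, 3, 4, 5]

buffer = io.StringIO()  # stands in for open('stat.csv', 'a')
writer = csv.writer(buffer, delimiter=' ', quoting=csv.QUOTE_MINIMAL)

# Concatenating the lists gives one flat row, so every item gets its own field.
writer.writerow([some_float1, some_float2] + alist)

line = buffer.getvalue()
print(repr(line))
```

Because the list is flattened into the row, the writer no longer sees a single "[2, 3, 4, 5]" string that needs quoting.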
