I have a csv file with 2 rows and multiple lines.
import csv
with open('data.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    next(csv_reader)
    for row in csv_reader:
        print(row[0])
The output is:
row0line0
row0line1
row0line2
...
Is there a way I could further separate the rows into a list of individual cells?
Thanks
As I understand it, your CSV file looks like this:
row0line0
row1line1
...
If it's possible, I would recommend changing it to:
row0 line0
row1 line1
...
(Add a space between the rows and the lines.)
Then you can update your code to the code below, which prints only the rows and creates two lists: one that contains the rows and another that contains the lines:
import csv
with open('data.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    next(csv_reader)
    rows = []
    lines = []
    for item in csv_reader:
        temp = item[0].split(" ")
        rows.append(temp[0])
        lines.append(temp[1])
        print(temp[0])
If you mean that each row should become an element of a list, it goes like this:
import csv
with open('data.csv', 'r') as csv_file:
    reader = csv.reader(csv_file)
    data_list = [row[0] for row in reader]
Otherwise, if you want to build a list of the first elements of each line with an explicit loop, you can do this:
with open('data.csv', 'r') as csv_file:
    reader = csv.reader(csv_file)
    row0_list = []
    for row in reader:
        row0_list.append(row[0])
I hope this explanation solves the problem.
My understanding is that you are asking how to output all the data fields. csv_reader is already separating your rows into lists of individual cells!
Your current script reads the file one line at a time and prints the first item in each row with these lines:
for row in csv_reader:
    print(row[0])
Instead of printing row[0], which only prints the first field in the csv row, you can just print the row:
for row in csv_reader:
    print(row)
That will output the field lists (from my sample csv):
['r0v0', 'r0v1', 'r0v2']
['r1v0', 'r1v1', 'r1v2']
If you want to print in a nicer format, you can use join:
for row in csv_reader:
    print(", ".join(row))
Output:
r0v0, r0v1, r0v2
r1v0, r1v1, r1v2
My csv:
r0v0,r0v1,r0v2
r1v0,r1v1,r1v2
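To make the point above concrete, here is a minimal, self-contained sketch. It reads the sample CSV from an in-memory string rather than from data.csv (the io.StringIO wrapper and the names sample, rows and columns are only for illustration), and shows that every row produced by csv.reader is already a list of cells that can be indexed.

```python
import csv
import io

# Sample data from the answer above, held in memory instead of a file
# (an assumption made so the snippet runs without anything on disk).
sample = "r0v0,r0v1,r0v2\nr1v0,r1v1,r1v2\n"

# csv.reader accepts any iterable of lines, including a StringIO object.
rows = list(csv.reader(io.StringIO(sample)))

for row in rows:
    # Each row is a list, so individual cells are reached by index.
    print(row[0], row[1], row[2])

# Transpose the rows to get one tuple per column, if columns are needed.
columns = list(zip(*rows))
```

With a real file, replacing io.StringIO(sample) with the opened file object should behave the same way.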
Hey intelligent community,
I need a little bit of help because I think I can't see the wood for the trees.
I have two CSV files that look like this:
Name,Number
AAC;2.2.3
AAF;2.4.4
ZCX;3.5.2
Name,Number
AAC;2.2.3
AAF;2.4.4
ZCX;3.5.5
I would like to compare both files and then write any changes like this:
Name,Number,Changes
AAC;2.2.3
AAF;2.4.4
ZCX;3.5.5;change: 3.5.2
So on every line where there is a difference in the number, I want to add it as a new column at the end of the line.
The files are formatted the same way but sometimes have a new row, which is why I think I have to map the keys.
I have come this far, but now I am lost in my thoughts:
Python 3.10.9
import csv
# Read the first CSV and build the mapping
with open('test1.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    rows = list(reader)
    file1_dict = {row[1]: row[0] for row in rows}
# Read the second CSV and build the mapping
with open('test2.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    rows = list(reader)
    file2_dict = {row[1]: row[0] for row in rows}
# Compare the keys and find the differences
for k in file1_dict:
    if file1_dict[k] != file2_dict[k]:
        file1_dict[k] = file2_dict[k]
        for row in rows:
            if row[1] == k:
                row.append(file2_dict[k])
# Write the CSV (not sure how to add the word "change:")
with open('test1.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(rows)
If I try this, I don't get a new column; it just "updates" the CSV file with the same columns.
For example, this code gives me the differing rows, but I'm not able to add them to the existing file and row.
with open('test1.csv') as fin1:
    with open('test2.csv') as fin2:
        read1 = csv.reader(fin1)
        read2 = csv.reader(fin2)
        diff_rows = (row1 for row1, row2 in zip(read1, read2) if row1 != row2)
        with open('test3.csv', 'w') as fout:
            writer = csv.writer(fout)
            writer.writerows(diff_rows)
Does anyone have any tips or help for my problem? I have read many answers on here but can't figure it out.
Thanks a lot.
@bigkeefer
Thanks for your answer. I tried to change it for the ; delimiter, but it gives a "list index out of range" error.
with open('test3.csv', 'r') as file1:
    reader = csv.reader(file1, delimiter=';')
    rows = list(reader)[1:]
    file1_dict = {row[0]: row[1] for row in rows}
with open('test4.csv', 'r') as file2:
    reader = csv.reader(file2, delimiter=';')
    rows = list(reader)[1:]
    file2_dict = {row[0]: row[1] for row in rows}
new_file = ["Name;Number;Changes\n"]
with open('output.csv', 'w') as nf:
    for key, value in file1_dict.items():
        if value != file2_dict[key]:
            new_file.append(f"{key};{file2_dict[key]};change: {value}\n")
        else:
            new_file.append(f"{key};{value}\n")
    nf.writelines(new_file)
You will need to adapt this to overwrite your first file etcetera, as you mentioned above, but I've left it like this for your testing purposes. Hopefully this will help you in some way.
I've assumed you've actually got the headers above in each file. If not, remove the slicing on the list creations, and change the new_file variable assignment to an empty list ([]).
import csv
with open('f1.csv', 'r') as file1:
    reader = csv.reader(file1, delimiter=";")
    rows = list(reader)[1:]
    # "if row" skips blank lines, which would otherwise raise an IndexError
    file1_dict = {row[0]: row[1] for row in rows if row}
with open('f2.csv', 'r') as file2:
    reader = csv.reader(file2, delimiter=";")
    rows = list(reader)[1:]
    file2_dict = {row[0]: row[1] for row in rows if row}
new_file = ["Name,Number,Changes\n"]
for key, value in file1_dict.items():
    if value != file2_dict[key]:
        new_file.append(f"{key};{file2_dict[key]};change: {value}\n")
    else:
        new_file.append(f"{key};{value}\n")
with open('new.csv', 'w') as nf:
    nf.writelines(new_file)
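The question mentions that the second file sometimes contains a new row, and in that situation a plain file2_dict[key] lookup, or iterating only over the first file, could raise a KeyError or silently drop the new row. The following is only a sketch of one way to handle that, not the answer author's method: it iterates over the second file and looks names up in the first with dict.get. The helper names read_pairs and write_changes and the "new" marker are assumptions introduced here for illustration.

```python
import csv


def read_pairs(path):
    """Return {name: number} from a ';'-separated file, skipping the header and blank rows."""
    with open(path, newline='') as f:
        rows = list(csv.reader(f, delimiter=';'))[1:]
    return {row[0]: row[1] for row in rows if len(row) >= 2}


def write_changes(old_path, new_path, out_path):
    """Write every row of the new file, annotating numbers that differ from the old file."""
    old = read_pairs(old_path)
    new = read_pairs(new_path)
    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f, delimiter=';')
        writer.writerow(['Name', 'Number', 'Changes'])
        for name, number in new.items():
            previous = old.get(name)  # None when the name only exists in the new file
            if previous is None:
                writer.writerow([name, number, 'new'])  # hypothetical marker for added rows
            elif previous != number:
                writer.writerow([name, number, f'change: {previous}'])
            else:
                writer.writerow([name, number])
```

Calling write_changes('f1.csv', 'f2.csv', 'new.csv') would mirror the file names used above. Rows that exist only in the first file are not written by this sketch; whether they should be kept is a design decision the question leaves open.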
I open a file and read it with csv.DictReader. I iterate over it twice, but the second time nothing is printed. Why is this, and how can I make it work?
with open('MySpreadsheet.csv', 'rU') as wb:
    reader = csv.DictReader(wb, dialect=csv.excel)
    for row in reader:
        print row
    for row in reader:
        print 'XXXXX'
    # XXXXX is not printed
You read the entire file the first time you iterated, so there is nothing left to read the second time. Since you don't appear to be using the csv data the second time, it would be simpler to count the number of rows and just iterate over that range the second time.
import csv
with open('MySpreadsheet.csv', 'r') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    row_count = 0
    for row in reader:
        row_count += 1  # count the rows while reading them
        print(row)
for i in range(row_count):
    print('Stack Overflow')
If you need to iterate over the raw csv data again, it's simple to open the file again. Most likely, you should be iterating over some data you stored the first time, rather than reading the file again.
with open('MySpreadsheet.csv', 'r') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    for row in reader:
        print(row)
with open('MySpreadsheet.csv', 'r') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    for row in reader:
        print('Stack Overflow')
If you don't want to open the file again, you can seek to the beginning, skip the header, and iterate again.
with open('MySpreadsheet.csv', 'r') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    for row in reader:
        print(row)
    f.seek(0)     # go back to the start of the file
    next(reader)  # skip the header row
    for row in reader:
        print('Stack Overflow')
You can create a list of dictionaries, each dictionary representing a row in your file, and then count the length of the list, or use list indexing to print each dictionary item.
Something like:
import csv
with open('YourCsv.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    rowslist = list(reader)
for i in range(len(rowslist)):
    print(rowslist[i])
Add a wb.seek(0) (goes back to the start of the file) and a next(reader) (skips the header row) before your second loop.
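The behaviour discussed in this thread can be reproduced without a file on disk. The sketch below is only an illustration: it uses io.StringIO with made-up sample data (the name and qty columns are hypothetical) to show that a second pass over an exhausted reader yields nothing, and that rewinding the underlying file object and skipping the header makes a further pass work.

```python
import csv
import io

# In-memory stand-in for the spreadsheet (hypothetical sample data).
f = io.StringIO("name,qty\napple,1\npear,2\n")
reader = csv.DictReader(f)

first_pass = [row["name"] for row in reader]   # consumes the whole file
second_pass = [row["name"] for row in reader]  # nothing left to read

f.seek(0)      # rewind the underlying file object
next(reader)   # skip the header line, which the reader now returns as a data row
third_pass = [row["name"] for row in reader]

print(first_pass, second_pass, third_pass)
```

If the data fits in memory, storing the rows in a list once is usually simpler than rewinding the file.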
You can try storing the dicts in a list and then iterating over that list as often as you like:
input_csv = []
with open('YourCsv.csv', 'r', encoding='UTF-8') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        input_csv.append(row)
for row in input_csv:
    print(row)
for row in input_csv:
    print(row)
I want to print only the second row of my CSV file. I have two rows but I want to fetch only the second one. Please help me.
Use the code below to print only the second row of a CSV file named f.csv, where the values in each row are separated by commas:
import csv
with open('f.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 1:
            print(row)
            break
        line_count += 1
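The counter-based loop above can also be written with itertools.islice, which skips a given number of items from any iterator. This is only an alternative sketch; the function name second_row is made up for illustration, and returning None for short files is an assumption about what the caller wants.

```python
import csv
from itertools import islice


def second_row(path):
    """Return the second row of the CSV at `path`, or None if it has fewer than two rows."""
    with open(path, newline='') as csv_file:
        reader = csv.reader(csv_file, delimiter=',')
        # islice(reader, 1, 2) skips one row and yields at most one more.
        return next(islice(reader, 1, 2), None)
```

For the file in the answer above this would be used as print(second_row('f.csv')).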
I have a list of approximately 500 strings that I want to check against a CSV file containing 25,000 rows. What I currently have seems to be getting stuck looping. I basically want to skip the row if it contains any of the strings in my string list and then extract other data.
stringList = []  # strings look like "AAA", "AAB", "AAC", etc.
with open('BadStrings.csv', 'r') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    for row in filereader:
        stringToExclude = row[0]
        stringList.append(stringToExclude)
with open('OtherData.csv', 'r') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    next(filereader, None)  # Skip header row
    for row in filereader:
        for s in stringList:
            if s not in row:
                data1 = row[1]
Edit: Not an infinite loop, but looping is taking too long.
Following Niels's suggestion, I would change the second loop to iterate over the row itself and check whether the current row entry is in the "bad" list:
for row in filereader:
    for s in row:
        if s not in stringList:
            data1 = row[0]
I also don't know what you want to do with data1, but you rebind it every time an item is not in stringList.
You could make data1 a list and add the items to it with data1.append(item).
You could try something like this.
stringList = []  # strings look like "AAA", "AAB", "AAC", etc.
with open('BadStrings.csv', 'r') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    for row in filereader:
        stringToExclude = row[0]
        stringList.append(stringToExclude)
# Right now you are overwriting data1 every time. I don't know what you want
# to do with it, but you could, for example, collect every row[1] in a list.
data1 = []
with open('OtherData.csv', 'r') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    next(filereader, None)  # Skip header row
    for row in filereader:
        found_s = False
        for s in stringList:
            if s in row:
                found_s = True
                break
        if not found_s:
            data1.append(row[1])  # Add row[1] to the list if no element of stringList is found in row.
Still probably not a huge performance improvement, but at least the for loop for s in stringList: will now stop after s is found.
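Because `s in row` tests whether a string equals one of the cells in the row, the exclusion strings can be kept in a set, so that each row needs one membership check per cell instead of one scan per bad string. The sketch below illustrates that idea under the same file layout as above; the function names load_bad_strings and collect_clean_values are hypothetical, and skipping rows with fewer than two cells is an added assumption.

```python
import csv


def load_bad_strings(path):
    """Collect the first column of the exclusion file into a set."""
    with open(path, newline='') as f:
        return {row[0] for row in csv.reader(f, delimiter=',') if row}


def collect_clean_values(path, bad_strings):
    """Return row[1] for every data row that contains none of the bad strings."""
    values = []
    with open(path, newline='') as f:
        reader = csv.reader(f, delimiter=',')
        next(reader, None)  # skip header row
        for row in reader:
            # isdisjoint is True when no cell of the row is in the set.
            if len(row) > 1 and bad_strings.isdisjoint(row):
                values.append(row[1])
    return values
```

Usage would look like data1 = collect_clean_values('OtherData.csv', load_bad_strings('BadStrings.csv')). With about 500 strings and 25,000 rows this should avoid most of the inner-loop work, though the actual speed-up has not been measured here.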
I have a CSV file with around 500 lines. I want to insert multiple empty rows after each row, i.e. add 5 empty rows after each row, so I tried this:
import csv
with open("new.csv", 'r') as infile:
    read = csv.reader(infile, delimiter=',')
    with open("output1.csv", 'wt') as output:
        outwriter = csv.writer(output, delimiter=',')
        i = 0
        for row in read:
            outwriter.writerow(row)
            i += 1
            outwriter.writerow([])
This creates 3 empty rows rather than 5, and I am not sure how to add 5 rows after each row. What am I missing here?
Update:
CSV File sample
No,First,Sec,Thir,Fourth
1,A,B,C,D
2,A,B,C,D
3,A,B,C,D
4,A,B,C,D
5,A,B,C,D
6,A,B,C,D
7,A,B,C,D
8,A,B,C,D
Adding the output csv file for answer code
Your code actually only adds one blank line. Use a loop to add as many as you want:
import csv
with open("new.csv", 'r') as infile:
    read = csv.reader(infile, delimiter=',')
    with open("output1.csv", 'wt') as output:
        outwriter = csv.writer(output, delimiter=',')
        for row in read:
            outwriter.writerow(row)
            for i in range(5):
                outwriter.writerow([])
The following code builds on the answer given by John Anderson. Adding a newline='' parameter to the open call for the output file gives exactly the number of empty rows specified in range:
import csv
with open("new.csv", 'r') as infile:
    read = csv.reader(infile, delimiter=',')
    with open("output1.csv", 'wt', newline='') as output:
        outwriter = csv.writer(output, delimiter=',')
        for row in read:
            outwriter.writerow(row)
            for i in range(5):
                outwriter.writerow([])
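The reason newline='' matters is that csv.writer ends every row with '\r\n' by default. If the file is opened in text mode without newline='', platforms that translate '\n' to '\r\n' on write (Windows in particular) turn that into '\r\r\n', which shows up as extra blank lines. Below is a small sketch that wraps the loop above in a reusable function; the name insert_blank_rows and its blank_count parameter are made up for illustration.

```python
import csv


def insert_blank_rows(in_path, out_path, blank_count=5):
    """Copy a CSV file, writing `blank_count` empty rows after every input row."""
    # newline='' leaves line endings to the csv module on both files.
    with open(in_path, newline='') as infile, \
         open(out_path, 'w', newline='') as outfile:
        reader = csv.reader(infile, delimiter=',')
        writer = csv.writer(outfile, delimiter=',')
        for row in reader:
            writer.writerow(row)
            for _ in range(blank_count):
                writer.writerow([])  # an empty list is written as a bare line terminator
```

For the files in the question this would be called as insert_blank_rows('new.csv', 'output1.csv', 5).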