I'm reading a two-column .csv file of variable length. My code should read everything except the first line into an xData array and a yData array. Here is the code:
def csvReader(filename):
    with open(filename) as csvFile:
        csvReader = csv.reader(csvFile, delimiter = ',')
        rowCount = sum(1 for row in csvReader)
        xData = np.zeros(rowCount)
        yData = np.zeros(rowCount)
        line_count = 0
        firstLine = True
        for row in csvReader:
            print(row)
            if firstLine:
                firstLine = False
                continue
            xData[line_count] = row[0]
            yData[line_count] = row[1]
            line_count += 1
    return xData,yData
It outputs an array of zeros, and the console never shows any printed output, which seems to imply that the entire for loop is getting skipped. Any help on this issue would be appreciated.
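You're exhausting the iterator when you do
    rowCount = sum(1 for row in csvReader)
That generator expression consumes every row, so the later for loop has nothing left to read. You need to rewind the file before that loop to read it again:
    csvFile.seek(0)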
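A minimal sketch of the rewind fix, assuming the file has one header line followed by numeric rows. The name read_xy and the use of plain lists instead of NumPy arrays are my own choices for illustration. The lists are sized one less than the row count, so the skipped header does not leave a trailing zero.

```python
import csv
import os
import tempfile


def read_xy(filename):
    """Return the x and y columns of a two-column CSV, skipping the header."""
    with open(filename, newline="") as csv_file:
        reader = csv.reader(csv_file, delimiter=",")
        row_count = sum(1 for _ in reader)   # this pass consumes the reader
        csv_file.seek(0)                     # rewind so the rows can be read again
        next(reader, None)                   # skip the header line
        size = max(row_count - 1, 0)         # the header is not a data row
        x_data = [0.0] * size
        y_data = [0.0] * size
        for i, row in enumerate(reader):
            x_data[i] = float(row[0])
            y_data[i] = float(row[1])
    return x_data, y_data


# Demo on a throwaway file (hypothetical data).
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("x,y\n1,2.5\n3,4.5\n")
xs, ys = read_xy(tmp.name)
os.remove(tmp.name)
print(xs, ys)
```

If the row count is not needed up front, appending to lists in a single pass avoids the rewind altogether, and np.asarray(x_data) can convert the result afterwards.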
Hey intelligent community,
I need a little help because I think I can't see the wood for the trees.
I have two CSV files that look like this:
Name,Number
AAC;2.2.3
AAF;2.4.4
ZCX;3.5.2
Name,Number
AAC;2.2.3
AAF;2.4.4
ZCX;3.5.5
I would like to compare both files and then write any changes like this:
Name,Number,Changes
AAC;2.2.3
AAF;2.4.4
ZCX;3.5.5;change: 3.5.2
So on every line where the number differs, I want to add the difference as a new column at the end of the line.
The files are formatted the same way but sometimes have a new row, which is why I think I have to map the keys.
I have come this far but now I am lost in my thoughts:
Python 3.10.9
import csv
# Reading the first csv and set mapping
with open('test1.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    rows = list(reader)
    file1_dict = {row[1]: row[0] for row in rows}
# Reading the second csv and set mapping
with open('test2.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    rows = list(reader)
    file2_dict = {row[1]: row[0] for row in rows}
# comparing the keys and find the diff
for k in file1_dict:
    if file1_dict[k] != file2_dict[k]:
        file1_dict[k] = file2_dict[k]
        for row in rows:
            if row[1] == k:
                row.append(file2_dict[k])
#write the csv (not sure how to add the word "change:")
with open('test1.csv', 'w', newline ='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(rows)
If I try this, I don't get a new column; it just "updates" the csv file with the same columns.
For example, this code gives me the differing rows, but I'm not able to add them to the existing file and row.
with open('test1.csv') as fin1:
    with open('test2.csv') as fin2:
        read1 = csv.reader(fin1)
        read2 = csv.reader(fin2)
        diff_rows = (row1 for row1, row2 in zip(read1, read2) if row1 != row2)
        with open('test3.csv', 'w') as fout:
            writer = csv.writer(fout)
            writer.writerows(diff_rows)
Does anyone have any tips or help for my problem? I have read many answers on here but can't figure it out.
Thanks a lot.
@bigkeefer
Thanks for your answer. I tried changing the delimiter to ; but it gives a "list index out of range" error.
with open('test3.csv', 'r') as file1:
    reader = csv.reader(file1, delimiter=';')
    rows = list(reader)[1:]
    file1_dict = {row[0]: row[1] for row in rows}
with open('test4.csv', 'r') as file2:
    reader = csv.reader(file2, delimiter=';')
    rows = list(reader)[1:]
    file2_dict = {row[0]: row[1] for row in rows}
new_file = ["Name;Number;Changes\n"]
with open('output.csv', 'w') as nf:
    for key, value in file1_dict.items():
        if value != file2_dict[key]:
            new_file.append(f"{key};{file2_dict[key]};change: {value}\n")
        else:
            new_file.append(f"{key};{value}\n")
    nf.writelines(new_file)
You will need to adapt this to overwrite your first file etcetera, as you mentioned above, but I've left it like this for your testing purposes. Hopefully this will help you in some way.
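I've assumed you've actually got the headers above in each file. If not, remove the slicing on the list creations, and change the new_file variable assignment to an empty list ([]). The "if row" filter in the dictionary comprehensions skips blank lines, which csv.reader returns as empty lists and which are a common cause of a "list index out of range" error.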
with open('f1.csv', 'r') as file1:
    reader = csv.reader(file1, delimiter=";")
    rows = list(reader)[1:]
    file1_dict = {row[0]: row[1] for row in rows if row}
with open('f2.csv', 'r') as file2:
    reader = csv.reader(file2, delimiter=";")
    rows = list(reader)[1:]
    file2_dict = {row[0]: row[1] for row in rows if row}
new_file = ["Name,Number,Changes\n"]
for key, value in file1_dict.items():
    if value != file2_dict[key]:
        new_file.append(f"{key};{file2_dict[key]};change: {value}\n")
    else:
        new_file.append(f"{key};{value}\n")
with open('new.csv', 'w') as nf:
    nf.writelines(new_file)
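Since the question mentions that the files sometimes gain a new row, looking up file2_dict[key] directly can raise a KeyError when a name exists in only one file. The following is a rough sketch of one way to handle that, not the answer author's method. The helper names load_mapping and diff_rows and the "added" / "removed" labels are my own assumptions, and the in-memory sample text stands in for the real files.

```python
import csv
import io


def load_mapping(lines, delimiter=";"):
    """Map Name -> Number, skipping the header line and any blank rows."""
    reader = csv.reader(lines, delimiter=delimiter)
    next(reader, None)  # skip the header line
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def diff_rows(old, new):
    """Yield output rows; the 'added' and 'removed' labels are hypothetical."""
    for name, number in new.items():
        if name not in old:
            yield [name, number, "added"]
        elif old[name] != number:
            yield [name, number, f"change: {old[name]}"]
        else:
            yield [name, number]
    for name, number in old.items():
        if name not in new:
            yield [name, number, "removed"]


# Sample data modelled on the question, plus one extra row in the second file.
old_text = "Name,Number\nAAC;2.2.3\nAAF;2.4.4\nZCX;3.5.2\n"
new_text = "Name,Number\nAAC;2.2.3\nAAF;2.4.4\nZCX;3.5.5\nZZZ;1.0.0\n"
old = load_mapping(io.StringIO(old_text))
new = load_mapping(io.StringIO(new_text))
result = list(diff_rows(old, new))

# Write the rows with csv.writer so the delimiter is applied consistently.
out = io.StringIO()
csv.writer(out, delimiter=";", lineterminator="\n").writerows(result)
print(out.getvalue())
```

To work on real files, pass open file objects (opened with newline="") in place of the io.StringIO buffers.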
I want to print only the second row of my csv file. It has two rows, but I want to fetch only the second one. Please help me.
Use the code below to print only the second row of a CSV file named f.csv, where the values in each row are separated by commas:
import csv
with open('f.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 1:
            print(row)
            break
        line_count += 1
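The same result can be had without a manual counter. This is a small sketch using itertools.islice from the standard library; the helper name second_row and the sample data are made up for illustration.

```python
import csv
import io
from itertools import islice


def second_row(lines):
    """Return the second row of the CSV, or None if there are fewer than two."""
    reader = csv.reader(lines, delimiter=",")
    # islice(reader, 1, 2) skips one row and yields at most the next one.
    return next(islice(reader, 1, 2), None)


sample = io.StringIO("a,b\n1,2\n3,4\n")
row = second_row(sample)
print(row)
```

The None default means a file with only one row does not raise StopIteration.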
I need to find the profit/loss from two different lines in a csv file. I can't find a way to hold a value from one row so that it is still available for comparison once I move on to another row.
I have already tried the next() function but have had no luck.
import csv
symbolCode = input("Please enter a symbol code: ")
with open("prices.csv", "r") as f:
    reader = csv.reader(f, delimiter=",")
    with open(symbolCode + ".csv", "w") as d:
        writer = csv.writer(d)
        for row in reader:
            item = 0
            item2 = 0
            if symbolCode == row[1]:
                print(row)
                writer.writerow(row)
        d.close()
I expect the output to be a single number: the difference between the two values taken from those rows.
Are you looking for something like this?
import csv
symbolCode = input("Please enter a symbol code: ")
with open("prices.csv", "r") as f:
    reader = csv.reader(f, delimiter=",")
    with open(symbolCode + ".csv", "w") as d:
        writer = csv.writer(d)
        previous_row = None # <--- initialize with special (empty/none) value
        for row in reader:
            item = 0
            item2 = 0
            if symbolCode == row[1]:
                print(row)
                writer.writerow(row)
                if previous_row is not None: # <-- if we're not processing the very first row.
                    if previous_row[7] < row[7]: # <-- do your comparison with previous row
                        print("7th value is bigger now") # <-- do something
                previous_row = row # <-- store this row to be the previous row in the next loop iteration
Note that I've left out the d.close() line. It's not needed when you open a file in a with statement. Other than that, I only added lines to your example, and marked these lines with # <-- comments.
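To turn the previous-row idea into an actual profit/loss figure, the stored value has to be converted to a number first, because csv.reader yields strings and comparing them with < compares text. The sketch below assumes the symbol is in column 1 (as in the question) and the price in column 2; the price column, the function name price_changes and the sample rows are hypothetical.

```python
import csv
import io


def price_changes(lines, symbol, symbol_col=1, price_col=2):
    """Return the difference between consecutive prices for one symbol."""
    changes = []
    previous_price = None  # nothing to compare against on the first match
    for row in csv.reader(lines, delimiter=","):
        if len(row) <= max(symbol_col, price_col) or row[symbol_col] != symbol:
            continue
        price = float(row[price_col])  # convert the text field to a number
        if previous_price is not None:
            changes.append(price - previous_price)  # positive means profit
        previous_price = price
    return changes


sample = io.StringIO(
    "2020-01-01,ABC,10.0\n"
    "2020-01-01,XYZ,5.0\n"
    "2020-01-02,ABC,12.5\n"
    "2020-01-03,ABC,11.0\n"
)
changes = price_changes(sample, "ABC")
print(changes)
```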
I have a list of approximately 500 strings that I want to check against a CSV file containing 25,000 rows. What I currently have seems to be getting stuck looping. I basically want to skip the row if it contains any of the strings in my string list and then extract other data.
stringList = [] #strings look like "AAA", "AAB", "AAC", etc.
with open('BadStrings.csv', 'r') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    for row in filereader:
        stringToExclude = row[0]
        stringList.append(stringToExclude)
with open('OtherData.csv', 'r') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    next(filereader, None) #Skip header row
    for row in filereader:
        for s in stringList:
            if s not in row:
                data1 = row[1]
Edit: Not an infinite loop, but looping is taking too long.
Following Niels's suggestion, I would change the second loop to iterate over the row itself and check whether each entry is in the "bad" list:
for row in filereader:
    for s in row:
        if s not in stringList:
            data1 = row[0]
I also don't know what you want to do with data1, but you rebind it every time an item is not in stringList.
You could make data1 a list and collect the items with data1.append(item).
You could try something like this.
import csv
stringList = [] #strings look like "AAA", "AAB", "AAC", etc.
with open('BadStrings.csv', 'r') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    for row in filereader:
        stringToExclude = row[0]
        stringList.append(stringToExclude)
data1 = [] # Right now you are overwriting your data1 every time. I don't know what you want to do with it, but you could for example add all row[1] to a list data1
with open('OtherData.csv', 'r') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',')
    next(filereader, None) #Skip header row
    for row in filereader:
        found_s = False
        for s in stringList:
            if s in row:
                found_s = True
                break
        if not found_s:
            data1.append(row[1]) # Add row[1] to the list if no element of stringList is found in row.
Still probably not a huge performance improvement, but at least the for loop for s in stringList: will now stop after s is found.
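For a larger speed-up, the list of bad strings can be replaced with a set, so each row is checked with a few hash lookups instead of a scan over all 500 strings. This is a sketch under the same reading as the code above, where "s in row" means a field is exactly equal to a bad string. The function names and the sample data are my own.

```python
import csv
import io


def load_bad_strings(lines):
    """Collect the first field of every non-empty row into a set."""
    return {row[0] for row in csv.reader(lines, delimiter=",") if row}


def keep_clean_values(lines, bad_strings, value_col=1):
    """Return row[value_col] for rows that contain none of the bad strings."""
    reader = csv.reader(lines, delimiter=",")
    next(reader, None)  # skip header row
    return [
        row[value_col]
        for row in reader
        # isdisjoint is True when no field of the row is in the set
        if len(row) > value_col and bad_strings.isdisjoint(row)
    ]


bad = load_bad_strings(io.StringIO("AAA\nAAB\n"))
other = io.StringIO("id,value,code\n1,x,AAA\n2,y,ZZZ\n3,z,AAB\n4,w,QQQ\n")
data1 = keep_clean_values(other, bad)
print(data1)
```

If the bad strings may appear as substrings of a field rather than as whole fields, the set lookup no longer applies and a different approach is needed.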
I have been given the task of writing a script to check the MX records of the data in a CSV file. I started by trying to check it with a regex, and before that I am trying to read the CSV file. I would also like to log the progress, so I print the row number the script is on, but whenever I use the csv_reader object to calculate the row count, I never get inside the for loop.
import csv
with open('test_list.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    data = list(csv_reader)
    row_count = len(data)
    for row in csv_reader:
        print({row[2]})
        line_count += 1
        print('Checking '+ str(line_count) +' of '+ str(row_count))
    print('Processed lines :'+str(row_count))
I only get the result as
Processed lines : 40
I'm new to Python scripting. Please help.
My test_list.csv looks like this:
fname, lname, email
bhanu2, singh2, bhanudoesnotexist@doesnotexit.com
bhanu2, singh2, bhanudoesnotexist@doesnotexit.com
bhanu2, singh2, bhanudoesnotexist@doesnotexit.com
bhanu2, singh2, bhanudoesnotexist@doesnotexit.com
(and so on, 40 rows in total)
First, the CSV data itself has nothing to do with this problem.
Solution:
import csv
input_file = open("test_list.csv", "r").readlines()
print(len(input_file))
csv_reader = csv.reader(input_file)
line_count = 0
# data = list(csv_reader)
# row_count = len(data)
for row in csv_reader:
    print({row[2]})
    line_count += 1
    print('Checking ' + str(line_count) + ' of ' + str(len(input_file)))
print('Processed lines :' + str(len(input_file)))
Problem Recognition:
with open('test_list.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    data = list(csv_reader)
    row_count = len(data)
In your code, the line data = list(csv_reader) exhausts the reader, so the for loop that follows has nothing left to iterate over.
To avoid that, you can read the csv file like this:
input_file = open("test_list.csv", "r").readlines()
print(len(input_file))
and then pass input_file to csv.reader().
csv.reader returns an iterable, and when you use list(csv_reader) to read all the rows of the CSV, you have already exhausted the iterable, so when you want to iterate through csv_reader again with a for loop, it has nothing left to iterate.
Since you have a complete list of rows materialized in the variable data, you can simply iterate over it instead.
Change:
for row in csv_reader:
to:
for row in data:
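Put together, the change might look like the sketch below: the rows are materialized once, the count comes from the list, and the loop runs over the list rather than the spent reader. The generator name iter_with_progress and the example.com address are placeholders. skipinitialspace=True is used because the sample file has a space after each comma.

```python
import csv
import io


def iter_with_progress(lines):
    """Yield (row_number, total_rows, row) for every row of the CSV."""
    data = list(csv.reader(lines, delimiter=",", skipinitialspace=True))
    total = len(data)  # known up front because the rows are in a list
    for number, row in enumerate(data, start=1):
        yield number, total, row


sample = io.StringIO(
    "fname, lname, email\n"
    "bhanu2, singh2, user@example.com\n"
    "bhanu2, singh2, user@example.com\n"
)
progress = []
for number, total, row in iter_with_progress(sample):
    print(f"Checking {number} of {total}: {row[2]}")
    progress.append((number, total, row[2]))
```

Note that the header line is counted as a row here, as in the original code; slice data[1:] if it should be skipped.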