Not able to use a for loop after using csv reader - python

I have been given the task of writing a script to check the MX records of the data in a CSV file. I started by trying to check it with a regex, and before that I am trying to read the CSV file. I would also like to log progress, so I print the row number being processed, but whenever I use the csv_reader object to calculate the number of rows, I am unable to get inside the for loop.
import csv
with open('test_list.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    data = list(csv_reader)
    row_count = len(data)
    for row in csv_reader:
        print({row[2]})
        line_count += 1
        print('Checking '+ str(line_count) +' of '+ str(row_count))
    print('Processed lines :'+str(row_count))
I only get the result as
Processed lines : 40
I am new to Python scripting. Please help.
My test_list.csv looks like this:
fname, lname, email
bhanu2, singh2, bhanudoesnotexist#doesnotexit.com
bhanu2, singh2, bhanudoesnotexist#doesnotexit.com
bhanu2, singh2, bhanudoesnotexist#doesnotexit.com
bhanu2, singh2, bhanudoesnotexist#doesnotexit.com
(and so on, 40 rows in total)

First of all, the CSV data itself has nothing to do with this problem.
Solution:
import csv
input_file = open("test_list.csv", "r").readlines()
print(len(input_file))
csv_reader = csv.reader(input_file)
line_count = 0
# data = list(csv_reader)
# row_count = len(data)
for row in csv_reader:
    print({row[2]})
    line_count += 1
    print('Checking ' + str(line_count) + ' of ' + str(len(input_file)))
print('Processed lines :' + str(len(input_file)))
Problem Recognition:
with open('test_list.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    data = list(csv_reader)
    row_count = len(data)
In your code, the line data = list(csv_reader) consumes every row from the reader, so csv_reader is already exhausted and the for loop that follows has nothing left to iterate over.
To avoid that, you can read the CSV file like this:
input_file = open("test_list.csv", "r").readlines()
print(len(input_file))
and then pass input_file to csv.reader().

csv.reader returns an iterator (a reader object that can be consumed only once). When you use list(csv_reader) to read all the rows of the CSV, you exhaust that iterator, so when you try to iterate through csv_reader again with a for loop, it has nothing left to yield.
Since you already have a complete list of rows materialized in the variable data, you can simply iterate over it instead.
Change:
for row in csv_reader:
to:
for row in data:
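As a minimal sketch of iterating over the materialized list: the snippet below uses an in-memory io.StringIO as a stand-in for test_list.csv so that it runs on its own. The sample rows and the helper name check_rows are made up for illustration and are not part of the original question.

```python
import csv
import io

# Hypothetical stand-in for test_list.csv so the sketch is self-contained.
SAMPLE = (
    "fname,lname,email\n"
    "bhanu1,singh1,first@example.com\n"
    "bhanu2,singh2,second@example.com\n"
)

def check_rows(csv_file):
    """Read all rows once, then loop over the list instead of the reader."""
    csv_reader = csv.reader(csv_file, delimiter=',')
    data = list(csv_reader)   # consumes the reader; the rows now live in data
    row_count = len(data)
    messages = []
    for line_count, row in enumerate(data, start=1):
        messages.append('Checking ' + str(line_count) + ' of ' + str(row_count))
    return messages

messages = check_rows(io.StringIO(SAMPLE))
print('\n'.join(messages))
print('Processed lines: ' + str(len(messages)))
```

Using enumerate(data, start=1) replaces the manual line_count counter; the header row is counted like any other row here, as in the original code.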

Related

Compare two CSV files and write difference in the same file as an extra column in python

Hey intelligent community,
I need a little bit of help because I think I can't see the wood for the trees.
I have two CSV files that look like this:
Name,Number
AAC;2.2.3
AAF;2.4.4
ZCX;3.5.2
Name,Number
AAC;2.2.3
AAF;2.4.4
ZCX;3.5.5
I would like to compare both files and then write any changes like this:
Name,Number,Changes
AAC;2.2.3
AAF;2.4.4
ZCX;3.5.5;change: 3.5.2
So on every line where there is a difference in the number, I want to add this as a new column at the end of the line.
The files are formatted the same but sometimes have a new row, so that's why I think I have to map the keys.
I have come this far, but now I am lost in my thoughts:
Python 3.10.9
import csv
# Reading the first csv and set mapping
with open('test1.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    rows = list(reader)
    file1_dict = {row[1]: row[0] for row in rows}
# Reading the second csv and set mapping
with open('test2.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    rows = list(reader)
    file2_dict = {row[1]: row[0] for row in rows}
# comparing the keys and find the diff
for k in file1_dict:
    if file1_dict[k] != file2_dict[k]:
        file1_dict[k] = file2_dict[k]
        for row in rows:
            if row[1] == k:
                row.append(file2_dict[k])
# write the csv (not sure how to add the word "change:")
with open('test1.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(rows)
If I try this, I don't get a new column; it just "updates" the CSV file with the same columns.
For example, this code gives me the differing rows, but I am not able to add them to the existing file and row.
with open('test1.csv') as fin1:
    with open('test2.csv') as fin2:
        read1 = csv.reader(fin1)
        read2 = csv.reader(fin2)
        diff_rows = (row1 for row1, row2 in zip(read1, read2) if row1 != row2)
        with open('test3.csv', 'w') as fout:
            writer = csv.writer(fout)
            writer.writerows(diff_rows)
Does anyone have any tips or help for my problem? I have read many answers on here but can't figure it out.
Thanks a lot.
@bigkeefer
Thanks for your answer. I tried to change it to use the delimiter ;, but it gives a "list index out of range" error.
with open('test3.csv', 'r') as file1:
    reader = csv.reader(file1, delimiter=';')
    rows = list(reader)[1:]
    file1_dict = {row[0]: row[1] for row in rows}
with open('test4.csv', 'r') as file2:
    reader = csv.reader(file2, delimiter=';')
    rows = list(reader)[1:]
    file2_dict = {row[0]: row[1] for row in rows}
new_file = ["Name;Number;Changes\n"]
with open('output.csv', 'w') as nf:
    for key, value in file1_dict.items():
        if value != file2_dict[key]:
            new_file.append(f"{key};{file2_dict[key]};change: {value}\n")
        else:
            new_file.append(f"{key};{value}\n")
    nf.writelines(new_file)
You will need to adapt this to overwrite your first file etcetera, as you mentioned above, but I've left it like this for your testing purposes. Hopefully this will help you in some way.
I've assumed you've actually got the headers above in each file. If not, remove the slicing on the list creations, and change the new_file variable assignment to an empty list ([]).
with open('f1.csv', 'r') as file1:
    reader = csv.reader(file1, delimiter=";")
    rows = list(reader)[1:]
    file1_dict = {row[0]: row[1] for row in rows if row}
with open('f2.csv', 'r') as file2:
    reader = csv.reader(file2, delimiter=";")
    rows = list(reader)[1:]
    file2_dict = {row[0]: row[1] for row in rows if row}
new_file = ["Name,Number,Changes\n"]
for key, value in file1_dict.items():
    if value != file2_dict[key]:
        new_file.append(f"{key};{file2_dict[key]};change: {value}\n")
    else:
        new_file.append(f"{key};{value}\n")
with open('new.csv', 'w') as nf:
    nf.writelines(new_file)
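The question mentions that one file sometimes contains a row the other lacks, in which case looking up file2_dict[key] directly would raise a KeyError. The following is only a sketch of one way to tolerate that, assuming the second file is the one whose rows should end up in the output. The helper names load_mapping and diff_rows, the extra NEW row, and the in-memory files are all hypothetical.

```python
import csv
import io

def load_mapping(csv_file):
    """Map Name -> Number, skipping the header row and any blank lines."""
    rows = list(csv.reader(csv_file, delimiter=';'))[1:]
    return {row[0]: row[1] for row in rows if len(row) >= 2}

def diff_rows(old, new):
    """Build output rows from the new mapping, noting the old value when it changed."""
    out = [["Name", "Number", "Changes"]]
    for key, value in new.items():
        previous = old.get(key)   # None when the row only exists in the new file
        if previous is not None and previous != value:
            out.append([key, value, "change: " + previous])
        else:
            out.append([key, value])
    return out

# Hypothetical in-memory stand-ins for the two files.
file1 = io.StringIO("Name;Number\nAAC;2.2.3\nAAF;2.4.4\nZCX;3.5.2\n")
file2 = io.StringIO("Name;Number\nAAC;2.2.3\nAAF;2.4.4\nZCX;3.5.5\nNEW;1.0.0\n")

result = diff_rows(load_mapping(file1), load_mapping(file2))

buffer = io.StringIO()
csv.writer(buffer, delimiter=';', lineterminator='\n').writerows(result)
print(buffer.getvalue())
```

Rows that exist only in the first file are dropped in this sketch. If they should be kept, the loop would need to walk the union of both key sets instead.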

Check if a cell in csv only contains a value

I have a csv of 2 columns A and B. A contains words and B contains the word type. I want to append a count that increases when the cell contains either "." or "?" or "!". However they must only contain one "." or one "?" or one "!". It shouldn't increase when the cell contains "..." or "!!!???!"
I have created the code:
from csv import writer
from csv import reader
sentence_number = 1
with open('input.csv', 'r') as read_obj,\
        open('output.csv', 'w', newline='') as write_obj:
    csv_reader = reader(read_obj)
    csv_writer = writer(write_obj)
    for row in csv_reader:
        if str(row[0])== "." or str(row[0])=="?" or str(row[0]) == "!":
            sentence_number = sentence_number + 1
        row.append(sentence_number)
        csv_writer.writerow(row)
Edit: The original csv file is
This;Adverb
flower;Noun
is;Verb
pretty;Adjective
.;Punctuation
I;Pronoun
like;Verb
flowers;Noun
!;Punctuation
However it gives rows as
This;Adverb,1
flower;Noun,1
is;Verb,1
pretty;Adjective,1
.;Punctuation,1
I;Pronoun,1
like;Verb,1
flowers;Noun,1
!;Punctuation,1
The expected CSV output is:
This;Adverb;1
flower;Noun;1
is;Verb;1
pretty;Adjective;1
.;Punctuation;1
I;Pronoun;2
like;Verb;2
flowers;Noun;2
!;Punctuation;2
Basically I want to recognize which sentence a word belongs to, i.e. "This" belongs to sentence 1. How can I achieve this?
Thank you in advance :)
Once you have determined the file you want to read, you read it with this line:
csv_reader = reader(read_obj)
However, reader doesn't return a string, but an object of this type:
<_csv.reader object at 0x000002145A5B71C0>
The problem occurs because you expect this line:
for row in csv_reader:
to iterate over the object, storing in "row" a string with the content of each row. But because the reader splits on commas by default and your lines are separated by semicolons, what it actually stores is a list with the whole line as a single string inside, such as:
["This;Adverb"]
To solve this, you simply need to add another [0] when checking for the punctuation signs, so that row[0][0] is the first character of that string.
Besides that, I noticed another error: the number was joined with a "," instead of a ";" because of row.append(sentence_number), so I replaced it with row += ";" + str(sentence_number).
Here's the code with the changes; I hope it helps:
from csv import writer
from csv import reader
sentence_number = 1
with open('a.txt', 'r') as read_obj, \
        open('output.csv', 'w', newline='') as write_obj:
    csv_reader = reader(read_obj)
    csv_writer = writer(write_obj)
    for row in csv_reader:
        row = row[0]
        row += ";" + str(sentence_number)
        if row[0] == "." or row[0] == "?" or row[0] == "!":
            sentence_number = sentence_number + 1
        csv_writer.writerow([row])
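A different approach, shown here only as a sketch, is to tell both the reader and the writer that the delimiter is ";". Each row then arrives as a two-item list, and the first cell can be compared for exact equality, so that "..." or "!!!???!" does not count as a sentence ending. The function name number_sentences and the in-memory sample are assumptions made for illustration.

```python
import csv
import io

SENTENCE_ENDINGS = {".", "?", "!"}

def number_sentences(read_obj, write_obj):
    """Append the sentence number to each row; bump it after a lone . ? or !"""
    csv_reader = csv.reader(read_obj, delimiter=';')
    csv_writer = csv.writer(write_obj, delimiter=';', lineterminator='\n')
    sentence_number = 1
    for row in csv_reader:
        if not row:               # skip blank lines
            continue
        csv_writer.writerow(row + [sentence_number])
        # Exact match on the whole cell, so "..." does not end a sentence.
        if row[0] in SENTENCE_ENDINGS:
            sentence_number += 1

# Hypothetical sample in the same shape as the question's file.
source = io.StringIO(
    "This;Adverb\nflower;Noun\n.;Punctuation\n"
    "...;Punctuation\nI;Pronoun\n!;Punctuation\n"
)
target = io.StringIO()
number_sentences(source, target)
print(target.getvalue())
```

The punctuation row itself is written with the current sentence number before the counter is incremented, which matches the expected output in the question.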

I want to print only second row of my csv file

I want to print only the second row of my CSV file. It has two rows, but I want to fetch only the second one. Please help me.
Use the code below to print only the second row of a CSV file named f.csv, in which the values in each row are separated by commas:
import csv
with open('f.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 1:
            print(row)
            break
        line_count += 1
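The same idea can be written without a manual counter. This is a small sketch using itertools.islice from the standard library; the function name second_row and the in-memory sample are made up for illustration.

```python
import csv
import io
from itertools import islice

def second_row(csv_file):
    """Return the second row of the CSV, or None if there are fewer than two rows."""
    csv_reader = csv.reader(csv_file, delimiter=',')
    # islice(reader, 1, 2) skips one row and yields at most one more.
    return next(islice(csv_reader, 1, 2), None)

# Hypothetical two-row file.
sample = io.StringIO("a,b,c\n1,2,3\n")
row = second_row(sample)
print(row)
```

Passing None as the default to next() avoids a StopIteration error when the file has only one row.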

Beginner deleting columns from CSV (no pandas)

I've just started coding and I'm trying to remove certain columns from a CSV for a project; we aren't supposed to use pandas. For instance, one of the fields I have to delete is called DwTm, but there are about 15 columns I have to get rid of; I only want the first few. Here's what I've got so far:
import csv
FTemp = "D:/tempfile.csv"
FOut = "D:/NewFile.csv"
with open(FTemp, 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    with open(FOut, 'w') as new_file:
        fieldnames = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm']
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames)
        for line in csv_reader:
            del line['DwTm']
            csv_writer.writerow(line)
When I run this, I get the error
del line['DwTm']
TypeError: list indices must be integers or slices, not str
This is the only method I've found to almost work without using pandas. Any ideas?
The easiest way around this is to use a DictReader to read the file. Like DictWriter, which you are using to write the file, DictReader uses dictionaries for rows, so your approach of deleting keys from the old row then writing to the new file will work as you expect.
import csv
FTemp = "D:/tempfile.csv"
FOut = "D:/NewFile.csv"
with open(FTemp, 'r', newline='') as csv_file:
    # Adjust the list so the field names are in the same order as in the file
    old_fieldnames = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm', 'DwTm']
    csv_reader = csv.DictReader(csv_file, fieldnames=old_fieldnames)
    # newline='' stops the csv module from writing extra blank lines on Windows
    with open(FOut, 'w', newline='') as new_file:
        fieldnames = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm']
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames)
        for line in csv_reader:
            del line['DwTm']
            csv_writer.writerow(line)
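With about 15 columns to drop, deleting each key by hand gets tedious. A possible variation, sketched here under the assumption that the first line of the file is a header row, is to let DictReader take the field names from that header and pass extrasaction='ignore' to DictWriter so that any key not in fieldnames is skipped. The function name keep_columns and the sample station row are invented for illustration.

```python
import csv
import io

KEEP = ['Stn_Name', 'Lat', 'Long', 'Prov', 'Tm']

def keep_columns(csv_file, new_file, fieldnames):
    """Copy only the named columns from csv_file to new_file."""
    csv_reader = csv.DictReader(csv_file)   # the header row supplies the keys
    csv_writer = csv.DictWriter(
        new_file,
        fieldnames=fieldnames,
        extrasaction='ignore',              # drop keys not listed in fieldnames
        lineterminator='\n',
    )
    csv_writer.writeheader()
    for line in csv_reader:
        csv_writer.writerow(line)

# Hypothetical sample data; the real file has more columns.
source = io.StringIO(
    "Stn_Name,Lat,Long,Prov,Tm,DwTm\n"
    "EXAMPLE,48.9,-123.7,BC,8.2,0\n"
)
target = io.StringIO()
keep_columns(source, target, KEEP)
print(target.getvalue())
```

This way only the columns to keep have to be listed, and the unwanted ones never need to be named.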
Below is an approach that deletes the unwanted columns by index:
import csv
# We only want to read the 'department' field
# We are not interested in 'name' and 'birthday month'
# Make sure the list items are in descending order, so that deleting
# one column does not shift the indices of the ones still to be deleted
NON_INTERESTING_FIELDS_IDX = [2,0]
rows = []
with open('example.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for row in csv_reader:
        for idx in NON_INTERESTING_FIELDS_IDX:
            del row[idx]
        rows.append(','.join(row))
with open('example_out.csv','w') as out:
    for row in rows:
        out.write(row + '\n')
example.csv
name,department,birthday month
John Smith,Accounting,November
Erica Meyers,IT,March
example_out.csv
department
Accounting
IT
It's possible to simultaneously open the file to read from and the file to write to. Let's say you know the indices of the columns you want to keep, say, 0,2, and 4:
good_cols = (0,2,4)
with open(FTemp, 'r') as fin, open(FOut, 'w') as fout:
    for line in fin:
        line = line.rstrip() #clean up newlines
        temp = line.split(',') #make a list from the line
        data = [temp[x] for x in range(len(temp)) if x in good_cols]
        fout.write(','.join(data) + '\n')
The list comprehension (data) pulls only the columns you want to keep out of each row, and the result is immediately written line by line to your new file using the join method (plus a newline tacked on for each row).
If you only know the names of the fields you want to keep/remove, it's a bit more involved: you have to extract the indices from the first line of the CSV file, but it's not much more difficult.
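Looking the indices up by name could be sketched as follows, assuming the first line of the file is a header row. The function name keep_named_columns is hypothetical, and the sample data is the example.csv content from the previous answer.

```python
import csv
import io

def keep_named_columns(fin, fout, keep_names):
    """Copy only the columns whose header name is in keep_names."""
    csv_reader = csv.reader(fin)
    csv_writer = csv.writer(fout, lineterminator='\n')
    header = next(csv_reader)   # assumes the file has at least a header line
    # Translate the wanted names into column indices once, up front.
    good_cols = [i for i, name in enumerate(header) if name in keep_names]
    csv_writer.writerow([header[i] for i in good_cols])
    for row in csv_reader:
        csv_writer.writerow([row[i] for i in good_cols])

source = io.StringIO(
    "name,department,birthday month\n"
    "John Smith,Accounting,November\n"
    "Erica Meyers,IT,March\n"
)
target = io.StringIO()
keep_named_columns(source, target, {"department"})
print(target.getvalue())
```

Using csv.reader rather than line.split(',') also copes with quoted fields that contain commas.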

How to fix "row in csv_reader" not working?

I'm reading a two column .csv file of variable length. I have some code written that should be able to read the data bar the first line into an x Data column and a y Data column. Here is the code:
def csvReader(filename):
    with open(filename) as csvFile:
        csvReader = csv.reader(csvFile, delimiter = ',')
        rowCount = sum(1 for row in csvReader)
        xData = np.zeros(rowCount)
        yData = np.zeros(rowCount)
        line_count = 0
        firstLine = True
        for row in csvReader:
            print(row)
            if firstLine:
                firstLine = False
                continue
            xData[line_count] = row[0]
            yData[line_count] = row[1]
            line_count += 1
    return xData,yData
It outputs an array of zeros, and the console never shows any printed output, which seems to imply that the entire for loop is getting skipped. Any help on this issue would be appreciated.
You're exhausting the iterator when you do
rowCount = sum(1 for row in csvReader)
You need to rewind the file to read it again.
csvFile.seek(0)
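To show where the rewind goes, here is a minimal sketch that uses plain lists instead of NumPy arrays and an in-memory io.StringIO in place of a real file. The function name read_xy and the sample values are assumptions made for illustration.

```python
import csv
import io

def read_xy(csv_file):
    """Count the rows, rewind, then read the two columns (skipping the header)."""
    csv_reader = csv.reader(csv_file, delimiter=',')
    row_count = sum(1 for row in csv_reader)   # this pass exhausts the reader
    csv_file.seek(0)                           # rewind the underlying file
    next(csv_reader)                           # skip the header line
    x_data, y_data = [], []
    for row in csv_reader:
        x_data.append(float(row[0]))
        y_data.append(float(row[1]))
    return row_count, x_data, y_data

# Hypothetical two-column file with a header line.
sample = io.StringIO("x,y\n1,10\n2,20\n")
row_count, x_data, y_data = read_xy(sample)
print(row_count, x_data, y_data)
```

Note that the counted rows include the header, so arrays sized with rowCount as in the question would hold one more element than there are data rows.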
