Data in .csv file is being replaced instead of being appended - python

I am working on an attendance system using face recognition code. I want to save the face recognition output (the names of the recognized people) in a .csv file.
So, I tried this:
def Attendance(name):
    moment=time.strftime("%Y-%b-%d",time.localtime())
    open('Attendance'+moment+'.csv','w')
    with open ('Attendance'+moment+'.csv','a+',newline="\n") as f:
        DataList = f.readlines()
        knownNames = []
        for data in DataList:
            ent = data.split(',')
            knownNames.append(ent[0])
        if name not in knownNames:
            curr=_datetime.date.today()
            dt=curr.strftime('%d%m%Y_%H:%M:%S')
            f.writelines(f'\n{name}, {dt}, P'+'\n')
It creates a .csv file by date.
But the issue is that this function makes new data replace the older data in the .csv file instead of appending it on the following lines.
I need to append new data and eliminate re-entry of already existing data.
Kindly help!
Regards,
Vishwesh V Bhat

Use the opening mode 'a' for append:
with open("filename.csv", "a") as f:
    ...

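You are opening the file in write mode ('w') before the with block. That call truncates the file every time the function runs, so remove that line. Two more things to watch: 'a+' opens the file positioned at its end, so call f.seek(0) before reading the existing names, and datetime.date.today() has no time of day, so %H:%M:%S would always print 00:00:00; use datetime.datetime.now() instead.
Fixed Code:
import datetime
import time

def Attendance(name):
    moment=time.strftime("%Y-%b-%d",time.localtime())
    with open ('Attendance'+moment+'.csv','a+',newline="\n") as f:
        f.seek(0)  # 'a+' starts at the end of the file; rewind to read existing rows
        DataList = f.readlines()
        knownNames = []
        for data in DataList:
            ent = data.split(',')
            knownNames.append(ent[0])
        if name not in knownNames:
            curr=datetime.datetime.now()  # includes the time of day
            dt=curr.strftime('%d%m%Y_%H:%M:%S')
            f.write(f'{name}, {dt}, P\n')  # in append mode this always lands at the end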
Also, make sure to follow PEP 8. An f-string can also help make your code more readable.
Fixed code that follows PEP 8 more closely and is cleaner:
def Attendance(name):
    moment = time.strftime("%Y-%b-%d", time.localtime())
    with open(f'Attendance{moment}.csv', 'a+') as f:
        f.seek(0)  # rewind: 'a+' opens positioned at the end of the file
        known_names = [data.split(',')[0] for data in f.readlines()]
        if name not in known_names:
            dt = datetime.datetime.now().strftime('%d%m%Y_%H:%M:%S')
            print(f'{name}, {dt}, P', file=f)
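For comparison, here is a minimal sketch of the same idea built on the csv module. The function name mark_attendance, the directory parameter and the Attendance_<date>.csv file name are my own assumptions for illustration, not part of the original code. The point it shows is that 'a+' never truncates the file but starts at the end, so a seek(0) is needed before reading.

```python
import csv
import os
from datetime import datetime


def mark_attendance(name, directory="."):
    """Append `name` to today's attendance file unless it is already listed.

    Returns True if a row was written, False if the name was already present.
    (Hypothetical helper; file naming is an assumption.)
    """
    now = datetime.now()
    path = os.path.join(directory, f"Attendance_{now:%Y-%b-%d}.csv")
    # 'a+' creates the file if needed and never truncates it;
    # newline="" is what the csv module expects for its file objects.
    with open(path, "a+", newline="") as f:
        f.seek(0)  # 'a+' starts at the end of the file, so rewind before reading
        known_names = {row[0] for row in csv.reader(f) if row}
        if name in known_names:
            return False
        # In append mode, writes always go to the end of the file.
        csv.writer(f).writerow([name, now.strftime("%d%m%Y_%H:%M:%S"), "P"])
        return True
```

Calling mark_attendance("Alice") twice on the same day should write one row and then return False on the second call.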

Issue solved!
The mistake I was making was opening the .csv in w mode. So every time I ran the code, even if the .csv for that date already existed, the file was truncated and the new data ended up in the first row.
So I used if os.path.exists('Attend'+moment+'.csv'):
This solved the issue.
Solution:
def Attendance(name):
    moment=time.strftime("%Y-%b-%d",time.localtime())
    if os.path.exists('Attend'+moment+'.csv'):
        with open('Attend'+moment+'.csv','r+',newline="\n") as f:
            DataList = f.readlines()
            knownNames = []
            for data in DataList:
                ent = data.split(',')
                knownNames.append(ent[0])
        with open('Attend'+moment+'.csv','a',newline="\n") as f:
            if name not in knownNames:
                curr=_datetime.date.today()
                dt=curr.strftime('%d%m%Y_%H:%M:%S')
                f.writelines(f'\n{name}, {dt}, P')
    else:
        open('Attend'+moment+'.csv','w')

Related

How to find size of a csv and either be able to iterate on the reader object [duplicate]

I am probably making a stupid mistake, but I can't find where it is. I want to count the number of lines in my csv file. I wrote this, and obviously isn't working: I have row_count = 0 while it should be 400. Cheers.
f = open(adresse,"r")
reader = csv.reader(f,delimiter = ",")
data = [l for l in reader]
row_count = sum(1 for row in reader)
print row_count
with open(adresse,"r") as f:
    reader = csv.reader(f,delimiter = ",")
    data = list(reader)
    row_count = len(data)
You are trying to read the file twice, when the file pointer has already reached the end of file after saving the data list.
First, open the file with open:
input_file = open("nameOfFile.csv","r+")
Then pass it to csv.reader to parse the CSV:
reader_file = csv.reader(input_file)
Finally, get the number of rows with len:
value = len(list(reader_file))
The complete code is:
input_file = open("nameOfFile.csv","r+")
reader_file = csv.reader(input_file)
value = len(list(reader_file))
Remember that if you want to reuse the csv file, you have to call input_file.seek(0), because building a list from reader_file reads the whole file and leaves the file position at the end.
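To make the rewind step concrete, here is a small sketch. The function name count_then_read is just an illustrative choice. It counts the rows, rewinds with seek(0), and then reads the same rows again from the same file handle.

```python
import csv


def count_then_read(path):
    """Count the rows of a CSV file, then rewind and read them from the same handle."""
    with open(path, newline="") as f:
        # Iterating the reader consumes the file up to its end.
        row_count = sum(1 for _ in csv.reader(f))
        # Move the file position back to the start of the file.
        f.seek(0)
        # A fresh reader on the rewound handle sees every row again.
        rows = list(csv.reader(f))
    return row_count, rows
```

Without the seek(0) call, the second reader would start at the end of the file and return no rows, which is exactly the row_count = 0 symptom in the question.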
If you are working with python3 and have pandas library installed you can go with
import pandas as pd
results = pd.read_csv('f.csv')
print(len(results))
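I would consider using a generator. It does the job without loading the whole file into memory, which protects you from a MemoryError on large files.
def generator_count_file_rows(input_file):
    with open(input_file,'r') as f:  # the file is closed once iteration finishes
        for row in f:
            yield row
And then
count = 0  # initialize the counter before the loop
for row in generator_count_file_rows('very_large_set.csv'):
    count+=1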
The important part is hidden in the comments section of the solution that is marked correct.
Re-sharing Erdős-Bacon's solution here for better visibility.
Why? Because it saves a lot of memory by not creating a list.
So I think it is better to do it this way:
def read_raw_csv(file_name):
    with open(file_name, 'r') as file:
        csvreader = csv.reader(file)
        # count number of rows
        entry_count = sum(1 for row in csvreader)
    print(entry_count-1) # -1 is for discarding header row.
Checkout this link for more info
# with built-in libraries
from csv import reader
with open('f.csv', newline='') as opened_file:  # closes the file automatically
    read_file = reader(opened_file)
    apps_data = list(read_file)
rowcount = len(apps_data)  # which includes the header row
print("Total rows including header: " + str(rowcount))
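Simply open the csv file in Notepad++. It shows the total line count in a jiffy. :)
Or
in the Windows cmd prompt, provide the file path and key in the command
find /c /v "some meaningless string" Filename.csv
(/v selects the lines that do not contain the string and /c prints how many there are.)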

Reading Two Files and Writing To One File Using Python3

I'm currently using Python 3 on Ubuntu 18.04. I'm not a programmer by any means and I'm not asking for a code review, however, I'm having an issue that I can't seem to resolve.
I have 1 text file named content.txt that I'm reading lines from.
I have 1 text file named standard.txt that I'm reading lines from.
I have 1 text file named outfile.txt that I'm writing to.
content = open("content.txt", "r").readlines()
standard = open("standard.txt", "r").readlines()
outfile = "outfile.txt"
outfile_set = set()
with open(outfile, "w") as f:
    for line in content:
        if line not in standard:
            outfile_set.add(line)
    f.writelines(sorted(outfile_set))
I'm not sure where to put the following line though. My for loop nesting may all be off:
f.write("\nNo New Content")
Any code examples to make this work would be most appreciated. Thank you.
If I understand correctly, you want to write outfile_set to the outfile if it is not empty, and otherwise write the string "\nNo New Content".
Replace the line
f.writelines(sorted(outfile_set))
with
if outfile_set:
    f.writelines(sorted(outfile_set))
else:
    f.write("\nNo New Content")
I'm assuming that you want to write "No new content" to the file if every line in content is in standard. So you might do something like:
with open(outfile, "w") as f:
    for line in content:
        if line not in standard:
            outfile_set.add(line)
    if len(outfile_set) > 0:
        f.writelines(sorted(outfile_set))
    else:
        f.write("\nNo New Content")
Your original code was almost there!
You can reduce your runtime a lot by using set/frozenset:
with open("content.txt", "r") as f:
    content = frozenset(f.readlines()) # only get distinct values from file
with open("standard.txt", "r") as f:
    standard = frozenset(f.readlines()) # only get distinct values from file
# only keep what's in content but not in standard
outfile_set = sorted(content-standard) # set difference, no loops or tests needed
with open ("outfile.txt","w") as outfile:
    if outfile_set:
        outfile.writelines(outfile_set) # already sorted above
    else:
        outfile.write("\nNo New Content")
You can read more about it here:
set operator list (written for Python 2 but valid for 3; I can't find this overview in the Python 3 docs)
set difference
Demo:
# Create files
with open("content.txt", "w") as f:
    for n in map(str,range(1,10)): # use range(1,10,2) for no changes
        f.writelines(n+"\n")
with open("standard.txt", "w") as f:
    for n in map(str,range(1,10,2)):
        f.writelines(n+"\n")
# Process files:
with open("content.txt", "r") as f:
    content = set(f.readlines())
with open("standard.txt", "r") as f:
    standard = set(f.readlines())
# only keep whats in content but not in standard
outfile_set = sorted(content-standard)
with open ("outfile.txt","w") as outfile:
    if outfile_set:
        outfile.writelines(sorted(outfile_set))
    else:
        outfile.write("\nNo New Content")
with open ("outfile.txt") as f:
    print(f.read())
Output:
2
4
6
8
or
No New Content
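One caveat with the set-based approach: it sorts lines as strings and loses the order in which they appeared in content.txt. If that order matters, a variant like the following might help. This is only a sketch; the function name new_lines is hypothetical. It still uses a set for fast membership tests but keeps the first occurrence of each new line in its original position.

```python
def new_lines(content_lines, standard_lines):
    """Return the lines of content_lines that are missing from standard_lines.

    The original order is preserved and duplicates are dropped.
    """
    standard = set(standard_lines)  # set lookup is O(1) on average
    seen = set()
    result = []
    for line in content_lines:
        if line not in standard and line not in seen:
            seen.add(line)
            result.append(line)
    return result
```

The returned list can be passed to writelines as before, with the same "No New Content" fallback when it is empty.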

Putting items into array

I'm working on a Python project in Visual Studio. I want to process a longer text file, this is a simplified version:
David Tubb
Eduardo Cordero
Sumeeth Chandrashekar
So for reading this file I use this code:
with open("data.txt", "r") as f:
    f_contents = f.read()
    print(f_contents)
I want to put these items into a new array that looks like that:
['David Tubb','Eduardo Cordero','Sumeeth Chandrashekar']
Is that possible?
Yes, the following code will work for this:
output = [] # the output list
nameFile = open('data.txt', 'r')
for name in nameFile:
    # get rid of new line character and add it to your list
    output.append(name.rstrip('\n'))
print(output)
# don't forget to close the file!
nameFile.close()
result = []
with open("data.txt", "r") as f:
    result = f.read().splitlines()
print(result)
Output:
['David Tubb', 'Eduardo Cordero', 'Sumeeth Chandrashekar']
The recommended way to open a file in Python is a "with open" block; the context manager guarantees the file is closed when the block exits, even if an error occurs.
PEP 343 on python.org
dalist = list()
with open('data.txt', 'r') as infile:
    for line in infile.readlines():
        dalist.append(line.rstrip('\n'))  # drop the trailing newline
Additional resource for context handling: https://docs.python.org/3/library/contextlib.html

Parsing a text file with line breaks in python

I have a text file with about 20 entries. They look like this:
~
England
Link: http://imgur.com/foobar.jpg
Capital: London
~
Iceland
Link: http://imgur.com/foobar2.jpg
Capital: Reykjavik
...
etc.
I would like to take these entries and turn them into a CSV.
There is a '~' separating each entry. I'm scratching my head trying to figure out how to go thru line by line and create the CSV values for each country. Can anyone give me a clue on how to go about this?
Use the libraries luke :)
I'm assuming your data is well formatted. Most real world data isn't that way. So, here goes a solution.
>>> content.split('~')
['\nEngland\nLink: http://imgur.com/foobar.jpg\nCapital: London\n', '\nIceland\nLink: http://imgur.com/foobar2.jpg\nCapital: Reykjavik\n', '\nEngland\nLink: http://imgur.com/foobar.jpg\nCapital: London\n', '\nIceland\nLink: http://imgur.com/foobar2.jpg\nCapital: Reykjavik\n']
For writing the CSV, Python has standard library functions.
>>> import csv
>>> entries = [e for e in content.split('~') if e.strip()]  # skip empty chunks
>>> csvfile = open('foo.csv', 'w', newline='')  # Python 3: text mode with newline=''
>>> fieldnames = ['Country', 'Link', 'Capital']
>>> writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
>>> for entry in entries:
...     cols = entry.strip().splitlines()
...     writer.writerow({'Country': cols[0], 'Link': cols[1].split(': ', 1)[1], 'Capital': cols[2].split(': ', 1)[1]})
...
>>> csvfile.close()
If your data is more semi structured or badly formatted, consider using a library like PyParsing.
Edit:
Second column contains URLs, so we need to handle the splits well.
>>> cols[1]
'Link: http://imgur.com/foobar2.jpg'
>>> cols[1].split(':')[1]
' http'
>>> cols[1].split(': ')[1]
'http://imgur.com/foobar2.jpg'
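Putting the pieces together, a self-contained Python 3 sketch could look like this. The function names parse_entries and to_csv are my own, and the code assumes every entry has the country on its first line followed by "Key: value" lines, as in the sample. It uses str.partition so that only the first ": " splits a line and the URL stays intact.

```python
import csv
import io


def parse_entries(text):
    """Turn '~'-separated records into dicts with Country, Link and Capital keys."""
    rows = []
    for chunk in text.split("~"):
        lines = [ln.strip() for ln in chunk.strip().splitlines() if ln.strip()]
        if not lines:
            continue  # empty chunk before the first '~' or after the last one
        row = {"Country": lines[0]}
        for line in lines[1:]:
            # partition splits at the first ': ' only, so the URL's own ':' survives
            key, sep, value = line.partition(": ")
            if sep:
                row[key] = value
        rows.append(row)
    return rows


def to_csv(rows, fieldnames=("Country", "Link", "Capital")):
    """Render the rows as CSV text, ignoring keys that are not in fieldnames."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

To write a file instead of building a string, pass a file opened with open('foo.csv', 'w', newline='') to csv.DictWriter in place of the StringIO buffer.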
The way that I would do that would be to use the open() function using the syntax of:
f = open('NameOfFile.extensionType', 'a+')
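Here "a+" opens the file for appending and reading. The file will not be overwritten and new data can be appended. Because the position starts at the end of the file, call f.seek(0) before reading. You could also use "r" to open the file read-only, but you would lose the ability to edit; "r+" allows both reading and writing but requires the file to exist. The "+" after a letter means the file is opened for updating (both reading and writing); it does not control whether a missing file is created. "a" and "w" create a missing file with or without the "+".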
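After that I would use a for loop like this:
f.seek(0)  # "a+" starts at the end of the file, so rewind before reading
data = []
tmp = []
for line in f:
    line = line.strip()  # strip() returns a new string; the result must be assigned
    if line == '~':
        if tmp:  # skip the empty group before the first '~'
            data.append(tmp)
        tmp = []
    else:
        tmp.append(line)
if tmp:  # the last entry has no '~' after it
    data.append(tmp)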
Now you have all of the data stored in a list, but you could also reformat it as a class object using a slightly different algorithm.
I have never edited CSV files using python, but I believe you can use a loop like this to add the data:
f2 = open('CSVfileName.csv', 'w') #Can change "w" for other needs i.e "a+"
for entry in data:
    for subentry in entry:
        f2.write(str(subentry) + '\n') #Use '\n' to create a new line
From my knowledge of CSV that loop would create a single column of all of the data. At the end remember to close the files in order to save the changes:
f.close()
f2.close()
You could combine the two loops into one in order to save space, but for the sake of explanation I have not.
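Since the open modes are easy to mix up, here is a small experiment that checks them against a temporary file. The function name demo_modes and the file names are purely illustrative. It shows that plain "a" creates a missing file, "a+" can read after a seek(0), "w" truncates, and "r+" refuses to create a file.

```python
import os
import tempfile


def demo_modes():
    """Show how 'a', 'a+', 'w' and 'r+' treat an existing or missing file."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")

        with open(path, "a") as f:       # plain 'a' creates a missing file
            f.write("first\n")
        with open(path, "a+") as f:      # 'a+' appends and can also read
            f.write("second\n")
            f.seek(0)                    # rewind; the position starts at the end
            results["after_append"] = f.read()

        with open(path, "w") as f:       # 'w' truncates the existing content
            f.write("only\n")
        with open(path) as f:
            results["after_write"] = f.read()

        missing = os.path.join(tmp, "missing.txt")
        try:
            with open(missing, "r+"):    # 'r+' requires the file to exist
                results["r_plus_creates"] = True
        except FileNotFoundError:
            results["r_plus_creates"] = False
    return results
```

Printing demo_modes() is a quick way to confirm the behaviour on your own platform.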

Python csv reader returns formula instead of value

I have a txt file that contains some Excel formulas, and I converted it to a csv file using Python's csv reader/writer. Now I want to read the values from the csv file and do some calculations, but when I access a particular column of the .csv file, I still get the Excel formula instead of the actual value, even though the formulas appear converted to values when I open the csv file.
Any ideas?
Here is the code
Code to convert txt to csv
def parseFile(filepath):
    file = open(filepath,'r')
    content = file.read()
    file.close()
    lines = content.split('\n')
    csv_filepath = filepath[:(len(filepath)-4)]+'_Results.csv'
    csv_out = csv.writer(open(csv_filepath, 'a'), delimiter=',' , lineterminator='\n')
    for line in lines:
        data = line.split('\t')
        csv_out.writerow(data)
    return csv_filepath
Code to do some calculation in csv file
def csv_cal (csv_filepath):
    r = csv.reader(open(csv_filepath))
    lines = [l for l in r]
    counter =[0]*(len(lines[4])+6)
    if lines[4][4] == 'Last Test Pass?' :
        print ' i am here'
        for i in range(0,3):
            print lines[6] [4] ### RETURNS FORMULA ??
    return 0
I am new to python, any help would be appreciated!
Thanks,
You can use Paste Special in Excel with the Values only option selected. You could select all, paste into another sheet, and save. This would save you from having to implement some kind of parser in Python. Or you could evaluate simple arithmetic with eval.
edit:
I've heard of xlrd, which can be downloaded from PyPI. It loads .xls files.
It sounded like you just wanted the final data, which Paste Special can give you.
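To see why the formula comes back, note that a CSV file is plain text and the csv module returns every field as the string stored in the file; it does not evaluate anything. A spreadsheet program computes the formula only when it displays the file. The sketch below uses made-up sample data and hypothetical helper names (read_cells, formula_cells) to list the cells that still hold formula text.

```python
import csv
import io


def read_cells(csv_text):
    """Parse CSV text into a list of rows; every cell is returned as a string."""
    return list(csv.reader(io.StringIO(csv_text)))


def formula_cells(rows):
    """Return (row index, column index, text) for cells that look like formulas."""
    return [
        (r, c, cell)
        for r, row in enumerate(rows)
        for c, cell in enumerate(row)
        if cell.startswith("=")
    ]
```

For the sample text "a,b,total\n1,2,=A2+B2\n", the last cell is read back as the literal string "=A2+B2", so the value has to be computed before export or by your own code.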
