I have a class that is supposed to read data from .csv files. In the class's __init__ I read the file and store it on the instance as self.csv_table. The problem is that when I try to access this variable in another method I get a ValueError: I/O operation on closed file. How can I avoid this error and instead print the file?
import csv

class CsvFile(object):
    """
    A class that allows the user to read data from a csv file. Can read columns, rows, specific fields
    """
    def __init__(self, file, delimiter="'", quotechar='"'):
        """
        file: A string. The full path to the file and the file. /home/user/Documents/table.csv
        delimiter & quotechar: Strings that define how the table's rows and columns are constructed
        return: the file in a way usable to other functions
        Initializes the csv file
        """
        with open(file, 'r') as csv_file:
            self.csv_table = csv.reader(csv_file, delimiter=delimiter, quotechar=quotechar)  # local copy of csv file

    def read_csv(self):
        """
        Prints the csv file in a simple manner. Not much can be done with this.
        """
        for row in self.csv_table:
            print(', '.join(row))

my_file = CsvFile(file)
my_file.read_csv()  # this one causes an I/O error
Here, your problem is that self.csv_table holds the csv.reader, which wraps the file object itself, not the file's content. Once you're out of the with statement, the file is closed, and the reader can no longer pull rows from it.
Since you care about the content, you need to store the content in csv_table by iterating over the csv reader. For instance, in your __init__ you can do something like this:
def __init__(self, file, delimiter="'", quotechar='"'):
    """
    file: A string. The full path to the file and the file. /home/user/Documents/table.csv
    delimiter & quotechar: Strings that define how the table's rows and columns are constructed
    return: the file in a way usable to other functions
    Initializes the csv file
    """
    self.csv_table = []
    with open(file, 'r') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=delimiter, quotechar=quotechar)
        for data_entry in csv_reader:
            self.csv_table.append(data_entry)  # store the parsed rows, not the reader
Then you'll be able to access the content in self.csv_table as a list of lists.
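For example, assuming a small table.csv (file name and contents invented for illustration), the stored rows can then be indexed like any nested list:

my_file = CsvFile("table.csv", delimiter=",")
print(my_file.csv_table)        # e.g. [['name', 'age'], ['alice', '30']]
print(my_file.csv_table[1][0])  # a single field: 'alice'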
Or, if you really care about the file itself, you need to reopen it any time you want to access it. Replace self.csv_table with self.csv_filename, and in your read_csv function reopen the file and create the reader every time you need it:
import csv

class CsvFile(object):
    """
    A class that allows the user to read data from a csv file. Can read columns, rows, specific fields
    """
    def __init__(self, filename, delimiter="'", quotechar='"'):
        """
        filename: A string. The full path to the file and the file. /home/user/Documents/table.csv
        delimiter & quotechar: Strings that define how the table's rows and columns are constructed
        return: the file in a way usable to other functions
        Initializes the csv file
        """
        self.filename = filename
        self.delimiter = delimiter
        self.quotechar = quotechar

    def read_csv(self):
        """
        Prints the csv file in a simple manner. Not much can be done with this.
        """
        with open(self.filename, 'r') as csv_file:
            csv_table = csv.reader(csv_file, delimiter=self.delimiter, quotechar=self.quotechar)
            for row in csv_table:
                print(', '.join(row))

my_file = CsvFile(file)
my_file.read_csv()  # no more I/O error: the file is reopened on each call
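A side benefit of the reopen approach: read_csv can be called any number of times, whereas the stored reader from the first version is an iterator that is exhausted after a single pass. A quick sketch, with an invented path:

my_file = CsvFile("/home/user/Documents/table.csv", delimiter=",")
my_file.read_csv()  # prints the whole table
my_file.read_csv()  # prints it again; each call reopens the file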
I'm having trouble with the Python csv module: I'm trying to write a new line to a csv file. Is there any reason why it would not work?
The code, starting with the csv writing function:
def write_response_csv(name, games, mins):
    with open("sport_team.csv", 'w', newline='', encoding='utf-8') as csv_file:
        fieldnames = ['Vardas', 'Žaidimai', 'Minutės']
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerow({'Vardas': name, 'Žaidimai': games, 'Minutės': mins})

with requests.get(url, headers=headers) as page:
    content = soup(page.content, 'html.parser')
    content = content.findAll('table', class_='table01 tablesorter')
    names = find_name(content)
    times = 0
    for name in names:
        matches = find_matches(content, times)
        min_in_matches = find_min(content, times)
        times += 1
        csv_file = write_response_csv(name, matches, min_in_matches)
        try:
            print(name, matches, min_in_matches)
        except:
            pass
When you call your write_response_csv function, it reopens the file in write mode and starts at line 1 of the csv file again, so each new row of data you pass to the function overwrites the previous one. What you could try is creating the csv file outside of the scope of your writer function and setting your writer function to append mode instead of write mode. This ensures that it writes the data on the next empty csv line instead of starting at line 1.
import csv

# Outside of function scope
fieldnames = ['Vardas', 'Žaidimai', 'Minutės']

# Create sport_team.csv file w/ headers
with open('sport_team.csv', 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames)
    writer.writeheader()

# Write response function: 'a' appends instead of overwriting
def write_response_csv(name, games, mins):
    with open('sport_team.csv', 'a', newline='', encoding='utf-8') as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writerow({'Vardas': name, 'Žaidimai': games, 'Minutės': mins})
Note:
You will run into the same issue if you reuse this script to keep adding new rows to the same file, because on each run the code that creates the csv file will recreate a blank sport_team.csv containing only the headers. If you want to reuse the code to continuously add new data, look into os.path and use it to check whether sport_team.csv already exists; if it does, skip the code that writes the header.
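A minimal sketch of that check, assuming the same file name and fieldnames as above:

import csv
import os.path

fieldnames = ['Vardas', 'Žaidimai', 'Minutės']

# Only write the header when the file doesn't exist yet
if not os.path.exists('sport_team.csv'):
    with open('sport_team.csv', 'w', newline='', encoding='utf-8') as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()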
Try using metabob; it finds code errors for you. I've been using it as a Python beginner and have been pretty successful with it.
It's a strange issue. I have a method that reads a csv file, so I created a unit test for it. It's something as simple as this:
def test_csv(self):
    with open(self.csv_file_path, 'rb') as csv_file:
        response = csv_parser_method(csv_file)
        # assert on response here
If I add a pdb breakpoint there and check the content of the self.csv_file_path file, it appears empty:
(Pdb) import csv
(Pdb) reader = csv.reader(csv_file, delimiter=str(','))
(Pdb) [row for row in reader]
[]
That's strange: if I open it in a normal shell it has content, and of course the file has content...
Your csv_parser_method already reads the entire CSV file, so by the time you build your own reader, the csv_file file object's position is at the end of the file; csv.reader gets nothing because there is no more content after that position.
You can use the seek method to reset the file position back to the beginning of the file so that csv.reader can read it:
csv_file.seek(0)
reader = csv.reader(csv_file, delimiter=str(','))
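A self-contained illustration of the behavior (file name invented): the first pass consumes the file, a second reader on the same object gets nothing, and seek(0) makes the content readable again:

import csv

with open('data.csv', newline='') as csv_file:
    first = list(csv.reader(csv_file))   # reads every row, position now at EOF
    second = list(csv.reader(csv_file))  # [] -- nothing left to read
    csv_file.seek(0)                     # rewind to the start of the file
    third = list(csv.reader(csv_file))   # same rows as first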
The scenario is: I need to convert a dictionary object to JSON and write it to a file. A new dictionary object is sent on every write_to_file() method call, and I have to append the JSON to the file. Following is the code:
def write_to_file(self, dict=None):
    f = open("/Users/xyz/Desktop/file.json", "w+")
    if json.load(f) != None:
        data = json.load(f)
        data.update(dict)
        f = open("/Users/xyz/Desktop/file.json", "w+")
        f.write(json.dumps(data))
    else:
        f = open("/Users/xyz/Desktop/file.json", "w+")
        f.write(json.dumps(dict))
I'm getting the error "No JSON object could be decoded" and the JSON is not written to the file. Can anyone help?
This looks overcomplex and highly buggy. Opening the file several times, in w+ mode, and reading it twice won't get you anywhere: it just creates an empty file that json won't be able to read.
I would test if the file exists; if so, read it (else start from an empty dict).
The default None argument makes no sense: you have to pass a dictionary or the update method won't work. Still, we can skip the update if the object is "falsy".
Don't use dict as a variable name; it shadows the built-in type.
In the end, overwrite the file with a new version of your data (w+ and r+ should be reserved for fixed-size/binary files, not text/json/xml files).
Like this:
import json
import os

def write_to_file(self, new_data=None):
    # define filename to avoid copy/paste
    filename = "/Users/xyz/Desktop/file.json"
    data = {}  # in case the file doesn't exist yet
    if os.path.exists(filename):
        with open(filename) as f:
            data = json.load(f)
    # update data with new_data if non-None/empty
    if new_data:
        data.update(new_data)
    # write the updated dictionary, create file if it didn't exist
    with open(filename, "w") as f:
        json.dump(data, f)
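A quick usage sketch, with a hypothetical JsonFileWriter class holding the method, showing that successive calls merge keys into one JSON object rather than appending separate documents:

writer = JsonFileWriter()       # hypothetical class containing write_to_file
writer.write_to_file({"a": 1})  # file.json: {"a": 1}
writer.write_to_file({"b": 2})  # file.json: {"a": 1, "b": 2}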
The following method:
def generateCSVfile(fileName, fileDescription, fileLocation, md5Hash):
    with open('deploymentTemplate.csv', 'w') as csvfile:
        createRow = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL)
This generates my CSV file, but since I am calling it in a loop it just overwrites itself.
generateCSVfile(name, fileDescription, filePath+"/"+name, md5Hash)
I am trying to find a way to generate the file, leave it open, call the above method, and have all the text written to it without the file overwriting itself.
Use open('deploymentTemplate.csv', 'a') to append values.
Syntax: open(<file_name> [,<mode>])
The different modes are:
'r': the file will only be read
'w': writing only (an existing file with the same name will be erased)
'a': opens the file for appending; any data written to the file is automatically added to the end
'r+': opens the file for both reading and writing
The mode argument is optional; 'r' is assumed if it's omitted.
E.g.:
with open("test.txt", "a") as myfile:
    myfile.write("appended text")
If the file needs to be emptied once per program run, but appended multiple times within a run, you could always just use a global (or class member state) to ensure it's only opened once.
import atexit
import csv

csvfile = None

def generateCSVfile(fileName, fileDescription, fileLocation, md5Hash):
    global csvfile
    if csvfile is None:
        # Lazily open file on first call; newline='' belongs on open(), not csv.writer
        csvfile = open('deploymentTemplate.csv', 'w', newline='')
        atexit.register(csvfile.close)  # Close cleanly on program exit
    try:
        csvwriter = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL)
        # do whatever writing you need to csvwriter
    finally:
        csvfile.flush()  # Match behavior of repeated with/open, force predictable flush
If there might be multiple CSV files involved, you might use a class with instance state and a method to do the writing, so each file can be independently cleared once and appended to many times. In this case, because of limits on the number of open file handles, reopening for append on each use is slower but safer than opening once and leaving the file open. You can use caching so the class is a singleton for any given file name too:
import csv
import os
import threading

class CSVGenerator:
    CACHE = {}
    CACHELOCK = threading.Lock()

    def __new__(cls, csvfilename):
        canonicalname = os.path.realpath(csvfilename)
        newself = super().__new__(cls)
        with cls.CACHELOCK:
            self = cls.CACHE.setdefault(canonicalname, newself)
            if newself is self:
                # First time we opened this file: clear it and initialize the instance
                with open(canonicalname, 'w') as f:
                    pass
                self.csvfilename = canonicalname
                self.appendlock = threading.Lock()
        return self

    def generateCSVfile(self, fileName, fileDescription, fileLocation, md5Hash):
        with self.appendlock, open(self.csvfilename, 'a', newline='') as csvfile:
            createRow = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL)
            # Perform writes to file
Usage of the class can be either:
CSVGenerator(somecsvfilename).generateCSVfile(...args...)
which acquires an instance briefly (creating it if needed) then writes once, or it can create and store an instance and reuse it (saves the overhead of cache lookup, but functionally identical).
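For instance, the stored-instance form looks like this (keeping the elided arguments as in the original):

generator = CSVGenerator(somecsvfilename)
generator.generateCSVfile(...args...)  # appends
generator.generateCSVfile(...args...)  # appends again, no cache lookup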
I am using Blair's Python script, which modifies a CSV file to add the filename as the last column (script appended below). However, instead of adding the file name alone, I get the full path together with the file name in the last column.
I run the script in Windows 7 cmd with the following command:
python C:\data\set1\subseta\add_filename.py C:\data\set1\subseta\20100815.csv
The resulting ID field is populated with C:\data\set1\subseta\20100815.csv, although all I need is 20100815.csv.
I'm new to python so any suggestion is appreciated!
import csv
import sys

def process_file(filename):
    # Read the contents of the file into a list of lines.
    f = open(filename, 'r')
    contents = f.readlines()
    f.close()
    # Use a CSV reader to parse the contents.
    reader = csv.reader(contents)
    # Open the output and create a CSV writer for it.
    f = open(filename, 'wb')
    writer = csv.writer(f)
    # Process the header.
    header = reader.next()
    header.append('ID')
    writer.writerow(header)
    # Process each row of the body.
    for row in reader:
        row.append(filename)
        writer.writerow(row)
    # Close the file and we're done.
    f.close()

# Run the function on all command-line arguments. Note that this does no
# checking for things such as file existence or permissions.
map(process_file, sys.argv[1:])
Use os.path.basename(filename). See http://docs.python.org/library/os.path.html for more details.
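Applied to the script above, only the line that appends the filename needs to change:

import os.path

# inside process_file's row loop, append just the file name, not the full path
row.append(os.path.basename(filename))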