CSV file empty when running Django tests - Python

It's a strange issue: I have a method where I read a CSV file, so I created a unit test for it. It's something as simple as this:
def test_csv(self):
    with open(self.csv_file_path, 'rb') as csv_file:
        response = csv_parser_method(csv_file)
        # assert response here
If I add a pdb breakpoint there and check the content of the self.csv_file_path file, it's empty:
(Pdb) import csv
(Pdb) reader = csv.reader(csv_file, delimiter=str(','))
(Pdb) [row for row in reader]
[]
That's strange: if I open a normal shell the file has content, and of course the file on disk has content...

Your csv_parser_method already reads the entire CSV file, so the csv_file file object's pointer is left positioned at the end of the file. When you then use csv.reader to try to read it, it gets nothing, since there is no content after the end-of-file position.
You can use the seek method to reset the file pointer back to the beginning of the file so that csv.reader can read the file:
csv_file.seek(0)
reader = csv.reader(csv_file, delimiter=str(','))
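A minimal self-contained sketch of the effect, using io.StringIO to stand in for the uploaded test file and a plain read() to play the role of csv_parser_method consuming the stream:

```python
import csv
import io

# io.StringIO stands in for the opened CSV file from the test.
csv_file = io.StringIO("a,b,c\n1,2,3\n")
csv_file.read()  # the parser has already consumed the stream

print(list(csv.reader(csv_file)))  # [] -- the pointer sits at end-of-file

csv_file.seek(0)  # rewind to the beginning
print(list(csv.reader(csv_file)))  # [['a', 'b', 'c'], ['1', '2', '3']]
```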

Related

having problems with python csv

I'm having trouble with the Python csv module. I'm trying to write a new line (row) to a CSV file; is there any reason why it would not work?
Code:
CSV writing function:
def write_response_csv(name,games,mins):
    with open("sport_team.csv",'w',newline='',encoding='utf-8') as csv_file:
        fieldnames=['Vardas','Žaidimai','Minutės']
        writer = csv.DictWriter(csv_file,fieldnames=fieldnames)
        writer.writeheader()
        writer.writerow({'Vardas':name,'Žaidimai':games,"Minutės":mins})

with requests.get(url,headers=headers) as page:
    content = soup(page.content,'html.parser')
    content = content.findAll('table',class_='table01 tablesorter')
    names = find_name(content)
    times = 0
    for name in names:
        matches = find_matches(content,times)
        min_in_matches = find_min(content,times)
        times += 1
        csv_file = write_response_csv(name,matches,min_in_matches)
        try:
            print(name,matches,min_in_matches)
        except:
            pass
When you call your write_response_csv function, it reopens the file in write mode and starts at line 1 again, so each new row of data you pass to the function overwrites the previous one. What you could try is creating the CSV file (with its headers) outside of the scope of your writer function, and opening the file in append mode inside the function instead of write mode. This ensures each row is written on the next empty line instead of starting over at line 1.
# Outside of function scope
fieldnames = ['Vardas', 'Žaidimai', 'Minutės']

# Create sport_team.csv file w/ headers
with open('sport_team.csv', 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()

# Write response function
def write_response_csv(name, games, mins):
    with open('sport_team.csv', 'a', newline='', encoding='utf-8') as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writerow({'Vardas': name, 'Žaidimai': games, 'Minutės': mins})
Note:
You will run into the same issue if you reuse this script to keep adding new rows to the same file, because on each run the file-creation code will recreate a blank sport_team.csv containing only the headers. If you want to reuse the code to keep appending data, look into os.path and use it to check whether sport_team.csv already exists; if it does, skip the file-creation block.
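A sketch of that idea, folding the existence check into the function itself (the path parameter and the os.path.isfile check are additions for illustration, not part of the original script):

```python
import csv
import os

fieldnames = ['Vardas', 'Žaidimai', 'Minutės']

def write_response_csv(name, games, mins, path='sport_team.csv'):
    # Append mode; the header is written only when the file does not exist yet.
    write_header = not os.path.isfile(path)
    with open(path, 'a', newline='', encoding='utf-8') as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerow({'Vardas': name, 'Žaidimai': games, 'Minutės': mins})
```

Repeated calls (even across separate runs of the script) then keep appending rows instead of wiping the file.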

Converting a .csv.gz to .csv in Python 2.7

I have read the documentation and a few additional posts on SO and other various places, but I can't quite figure out this concept:
When you call csvFilename = gzip.open(filename, 'rb') then reader = csv.reader(open(csvFilename)), is that reader not a valid csv file?
I am trying to solve the problem outlined below, and am getting a coercing to Unicode: need string or buffer, GzipFile found error on lines 41 and 7 (highlighted below), leading me to believe that gzip.open and csv.reader do not work as I had previously thought.
Problem I am trying to solve
I am trying to take a results.csv.gz and convert it to a results.csv so that I can turn the results.csv into a python dictionary and then combine it with another python dictionary.
File 1:
alertFile = payload.get('results_file')
alertDataCSV = rh.dataToDict(alertFile) # LINE 41
alertDataTotal = rh.mergeTwoDicts(splunkParams, alertDataCSV)
Calls File 2:
import gzip
import csv

def dataToDict(filename):
    csvFilename = gzip.open(filename, 'rb')
    reader = csv.reader(open(csvFilename)) # LINE 7
    alertData = {}
    for row in reader:
        alertData[row[0]] = row[1:]
    return alertData

def mergeTwoDicts(dictA, dictB):
    dictC = dictA.copy()
    dictC.update(dictB)
    return dictC
*edit: also forgive my non-PEP style of naming in Python
gzip.open returns a file-like object (the same kind of object plain open returns), not the name of a decompressed file. Simply pass the result directly to csv.reader and it will work: the csv.reader will receive the decompressed lines. csv does expect text, though, so on Python 3 you need to open the file in text mode (on Python 2 'rb' is fine; the gzip module doesn't deal with encodings, but then, neither does the csv module). Simply change:
csvFilename = gzip.open(filename, 'rb')
reader = csv.reader(open(csvFilename))
to:
# Python 2
csvFile = gzip.open(filename, 'rb')
reader = csv.reader(csvFile) # No reopening involved
# Python 3
csvFile = gzip.open(filename, 'rt', newline='') # Open in text mode, not binary, no line ending translation
reader = csv.reader(csvFile) # No reopening involved
The following worked for me for python==3.7.9:
import gzip

my_filename = 'my_compressed_file.csv.gz'
with gzip.open(my_filename, 'rt') as gz_file:
    data = gz_file.read()  # read decompressed data
with open(my_filename[:-3], 'wt') as out_file:
    out_file.write(data)  # write decompressed data
my_filename[:-3] strips the .gz suffix to recover the actual filename, so the output doesn't get a random filename.
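For large files, a variation worth considering streams the data with shutil.copyfileobj instead of reading it all with read(), so the whole file never has to fit in memory. A self-contained sketch (the sample results.csv.gz is created in a temp directory here purely for illustration):

```python
import gzip
import os
import shutil
import tempfile

# Hypothetical input: create a small results.csv.gz so the sketch runs as-is.
workdir = tempfile.mkdtemp()
gz_path = os.path.join(workdir, 'results.csv.gz')
with gzip.open(gz_path, 'wt', newline='') as f:
    f.write('alert,count\ndisk_full,3\n')

# Stream-decompress in chunks; [:-3] strips '.gz' as in the answer above.
with gzip.open(gz_path, 'rb') as gz_file, open(gz_path[:-3], 'wb') as out_file:
    shutil.copyfileobj(gz_file, out_file)
```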

Open file has data but reports back length 0 in python

I must be missing something very simple here, but I've been hitting my head against the wall for a while and don't understand where the error is. I am trying to open a csv file and read the data. I am detecting the delimiter, then reading in the data with this code:
with open(filepath, 'r') as csvfile:
    dialect = csv.Sniffer().sniff(csvfile.read())
    delimiter = repr(dialect.delimiter)[1:-1]
    csvdata = [line.split(delimiter) for line in csvfile.readlines()]
However, my csvfile is being read as having no length. If I run:
print(sum(1 for line in csvfile))
The result is zero. If I run:
print(sum(1 for line in open(filepath, 'r')))
Then I get five lines, as expected. I've checked for name clashes by changing csvfile to other random names, but this does not change the result. Am I missing a step somewhere?
You need to move the file pointer back to the start of the file after sniffing it. You don't need to read the whole file in to do that, just enough to include a few rows:
import csv

with open(filepath, 'r') as f_input:
    dialect = csv.Sniffer().sniff(f_input.read(2048))
    f_input.seek(0)
    csv_input = csv.reader(f_input, dialect)
    csv_data = list(csv_input)
Also, the csv.reader() will do the splitting for you.
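To make the sniff-then-rewind pattern concrete, here is a self-contained sketch using an in-memory sample with a semicolon delimiter (io.StringIO stands in for the opened file):

```python
import csv
import io

# A small sample with a non-default delimiter for the Sniffer to detect.
f_input = io.StringIO("name;score\nalice;10\nbob;7\n")

dialect = csv.Sniffer().sniff(f_input.read(2048))
f_input.seek(0)  # rewind: sniffing consumed (part of) the stream

rows = list(csv.reader(f_input, dialect))
print(dialect.delimiter, rows)
```

Without the seek(0), the reader would start at wherever read(2048) stopped and silently return nothing for a small file.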

How to store file locally to a class?

I have a class that is supposed to be able to read data from .csv files. In the __init__ of the class I read the file and store it locally to the class as self.csv_table. The problem is that when I try to access this variable in another function I get a ValueError: I/O operation on closed file. How can I avoid this error and instead print the file?
import csv

class CsvFile(object):
    """
    A class that allows the user to read data from a csv file. Can read columns, rows, specific fields
    """
    def __init__(self, file, delimiter="'", quotechar='"'):
        """
        file: A string. The full path to the file and the file. /home/user/Documents/table.csv
        delimiter & quotechar: Strings that define how the table's rows and columns are constructed
        return: the file in a way use-able to other functions

        Initializes the csv file
        """
        with open(file, 'r') as csv_file:
            self.csv_table = csv.reader(csv_file, delimiter=delimiter, quotechar=quotechar)  # local copy of csv file

    def read_csv(self):
        """
        Prints the csv file in a simple manner. Not much can be done with this.
        """
        for row in self.csv_table:
            print(', '.join(row))

my_file = CsvFile(file)
my_file.read_csv()  # this one causes an I/O error
Here, your problem is that self.csv_table contains the file reference itself, not the file content. Once you're out of the "with" statement, the file is closed, and you can no longer access it.
Since you care about the content, you need to store your content in the csv_table by iterating the csv_reader, for instance in your __init__ function, you can do something like this:
def __init__(self, file, delimiter="'", quotechar='"'):
    """
    file: A string. The full path to the file and the file. /home/user/Documents/table.csv
    delimiter & quotechar: Strings that define how the table's rows and columns are constructed
    return: the file in a way use-able to other functions

    Initializes the csv file
    """
    self.csv_table = []
    with open(file, 'r') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=delimiter, quotechar=quotechar)
        for data_entry in csv_reader:
            self.csv_table.append(data_entry)
Then you'll be able to access the content in self.csv_table as a list of lists.
Or, if you really care about the file itself, you need to reopen it any time you want to access it: change self.csv_table to self.csv_filename, and in your read_csv function reopen the file and create the reader each time you need it =>
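A stripped-down demonstration of that point, without the class wrapper, showing that the copied rows outlive the file handle (the sample table.csv is created in a temp directory so the sketch runs as-is):

```python
import csv
import os
import tempfile

# Hypothetical sample file, created here so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'table.csv')
with open(path, 'w', newline='') as f:
    f.write('x,y\n1,2\n')

csv_table = []
with open(path, 'r', newline='') as csv_file:
    for data_entry in csv.reader(csv_file):
        csv_table.append(data_entry)

# The file handle is closed here, but the copied rows are still usable.
print(csv_table)  # [['x', 'y'], ['1', '2']]
```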
import csv

class CsvFile(object):
    """
    A class that allows the user to read data from a csv file. Can read columns, rows, specific fields
    """
    def __init__(self, filename, delimiter="'", quotechar='"'):
        """
        filename: A string. The full path to the file and the file. /home/user/Documents/table.csv
        delimiter & quotechar: Strings that define how the table's rows and columns are constructed
        return: the file in a way use-able to other functions

        Initializes the csv file
        """
        self.filename = filename
        self.delimiter = delimiter
        self.quotechar = quotechar

    def read_csv(self):
        """
        Prints the csv file in a simple manner. Not much can be done with this.
        """
        with open(self.filename, 'r') as csv_file:
            csv_table = csv.reader(csv_file, delimiter=self.delimiter, quotechar=self.quotechar)
            for row in csv_table:
                print(', '.join(row))

my_file = CsvFile(file)
my_file.read_csv()  # no more I/O error: the file is reopened on each call

add file name without file path to csv in python

I am using Blair's Python script which modifies a CSV file to add the filename as the last column (script appended below). However, instead of adding the file name alone, I also get the Path and File name in the last column.
I run the below script in windows 7 cmd with the following command:
python C:\data\set1\subseta\add_filename.py C:\data\set1\subseta\20100815.csv
The resulting ID field is populated with C:\data\set1\subseta\20100815.csv, although all I need is 20100815.csv.
I'm new to python so any suggestion is appreciated!
import csv
import sys

def process_file(filename):
    # Read the contents of the file into a list of lines.
    f = open(filename, 'r')
    contents = f.readlines()
    f.close()

    # Use a CSV reader to parse the contents.
    reader = csv.reader(contents)

    # Open the output and create a CSV writer for it.
    f = open(filename, 'wb')
    writer = csv.writer(f)

    # Process the header.
    header = reader.next()
    header.append('ID')
    writer.writerow(header)

    # Process each row of the body.
    for row in reader:
        row.append(filename)
        writer.writerow(row)

    # Close the file and we're done.
    f.close()

# Run the function on all command-line arguments. Note that this does no
# checking for things such as file existence or permissions.
map(process_file, sys.argv[1:])
Use os.path.basename(filename). See http://docs.python.org/library/os.path.html for more details.
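In the script, that means changing row.append(filename) to row.append(os.path.basename(filename)). A tiny sketch of what basename does with the path from the question (ntpath is used here so the Windows-style path splits correctly even when the demo runs on another OS; on Windows itself, os.path is ntpath):

```python
import ntpath  # Windows path rules, so this demo splits the path on any OS

# The full path the script receives on the command line (from the question).
filename = r'C:\data\set1\subseta\20100815.csv'
print(ntpath.basename(filename))  # 20100815.csv
```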
