Reading CSV - Beginner - Python

I have been trying to read a csv file from my desktop and have not been successful. I checked my current working directory and it is pointed to my desktop, so that doesn't seem to be the issue. Below is the module I used and the error output that I received. I am using Python 3.2.3
import csv
reader = csv.reader(open(name.csv, mode = 'r'))
for row in reader:
    print (row)
Here is my result
Traceback (most recent call last):
  File "C:/Users/User Name/Desktop/FileName.py", line 2, in <module>
    reader = csv.reader(open(name.csv, mode = 'r'))
NameError: name 'Beta' is not defined
Help? Thanks!

Try this...
import csv
with open('name.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    for row in reader:
        print(row)
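For context on the question's error: `NameError: name 'Beta' is not defined` is what Python raises when the filename is passed without quotes, since `open(Beta.csv, ...)` is read as an attribute lookup on a variable named `Beta`. A minimal sketch (it writes its own sample file, since the original `name.csv` isn't available):

```python
import csv
import os
import tempfile

# Create a sample CSV to read (standing in for the file on the desktop)
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'name.csv')
with open(path, 'w', newline='') as f:
    f.write('a,b\n1,2\n')

# open(name.csv, ...) would raise NameError: Python looks for a variable
# called `name`. The filename must be a quoted string.
with open(path, 'r', newline='') as csvfile:
    rows = list(csv.reader(csvfile))
print(rows)  # [['a', 'b'], ['1', '2']]
```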

Related

Adding csv filename to a column in python (200 files)

I have 200 files with dates in the file names. I would like to add the date from each file name as a new column in that file.
I created a macro in Python:
import pandas as pd
import os
import openpyxl
import csv
os.chdir(r'\\\\\\\')
for file_name in os.listdir(r'\\\\\\'):
    with open(file_name,'r') as csvinput:
        reader = csv.reader(csvinput)
        all = []
        row = next(reader)
        row.append('FileName')
        all.append(row)
        for row in reader:
            row.append(file_name)
            all.append(row)
with open(file_name, 'w') as csvoutput:
    writer = csv.writer(csvoutput, lineterminator='\n')
    writer.writerows(all)
if file_name.endswith('.csv'):
    workbook = openpyxl.load_workbook(file_name)
    workbook.save(file_name)
csv_filename = pd.read_csv(r'\\\\\\')
csv_data= pd.read_csv(csv_filename, header = 0)
csv_data['filename'] = csv_filename
Right now I see "InvalidFileException: File is not a zip file" and only the first file has the added column with the file name.
Can you please advise what I am doing wrong? BTW, I'm using Python 3.4.
Many thanks,
Lukasz
First problem, this section:
with open(file_name, 'w') as csvoutput:
    writer = csv.writer(csvoutput, lineterminator='\n')
    writer.writerows(all)
should be indented, to be included in the for loop. Now it is only executed once after the loop. This is why you only get one output file.
Second problem, the exception is probably caused by openpyxl.load_workbook(file_name). Presumably openpyxl can only open actual Excel files (which are .zip files with a different extension), not CSV files. Why do you want to open and save them at all? I think you can just remove those three lines.
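Putting both fixes together, the loop might look like the sketch below. It is self-contained for illustration only: a temporary directory stands in for the question's redacted network path, and the sample filename is made up.

```python
import csv
import os
import tempfile

# A temporary directory stands in for the question's network path
os.chdir(tempfile.mkdtemp())

# One made-up sample file with a date in its name
with open('report_2016-01-31.csv', 'w', newline='') as f:
    csv.writer(f).writerows([['a', 'b'], ['1', '2']])

for file_name in os.listdir('.'):
    if not file_name.endswith('.csv'):
        continue
    with open(file_name, 'r', newline='') as csvinput:
        reader = csv.reader(csvinput)
        rows = [next(reader) + ['FileName']]           # header row
        rows += [row + [file_name] for row in reader]  # data rows
    # The write now happens INSIDE the loop, once per file
    with open(file_name, 'w', newline='') as csvoutput:
        csv.writer(csvoutput, lineterminator='\n').writerows(rows)

with open('report_2016-01-31.csv', newline='') as f:
    print(list(csv.reader(f)))
```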

Unable to access csv.DictReader object

For a CSV file:
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11,12
While the code below works fine to output the rows of the CSV:
import csv
import sys
database = {}
with open(sys.argv[1], mode='r') as csv_file:
    database = csv.DictReader(csv_file)
    for row in database:
        print(row)
the following does not.
import csv
import sys
database = {}
with open(sys.argv[1], mode='r') as csv_file:
    database = csv.DictReader(csv_file)
for row in database:
    print(row)
with error
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    for row in database:
  File "/usr/local/lib/python3.7/csv.py", line 111, in __next__
    self.fieldnames
  File "/usr/local/lib/python3.7/csv.py", line 98, in fieldnames
    self._fieldnames = next(self.reader)
ValueError: I/O operation on closed file.
The csv.DictReader object appears to exist but I cannot iterate over it in the 2nd snippet.
Checking various comments, they seem to say that DictReader returns an iterator, but I do not understand whether this is the reason for the error and what to change to gain access to database.
Appreciate any help. Thanks in advance!
with open is a context manager which closes the file when execution leaves the with block. As the file is closed, you can't read from it.
Use the original indentation.
import csv
import sys
database = {}
with open(sys.argv[1], mode='r') as csv_file:
    database = csv.DictReader(csv_file)
    for row in database:
        print(row)
You could also do the following:
import csv
import sys
with open(sys.argv[1], mode='r') as csv_file:
    rows = list(csv.DictReader(csv_file))
for row in rows:
    print(row)
The second way will pull all the data into memory.

Using relative paths in Python with AWS Cloud9 [duplicate]

This question already has answers here:
Reading a file using a relative path in a Python project
(6 answers)
Closed 2 years ago.
Using AWS Cloud9 for a Python 3.x application. I am trying to open a file (using with open) in the same directory as the python file, however, it only works if I define the absolute path.
Relative Path
import csv
with open("test.csv", newline='') as f:
    reader = csv.reader(f, delimiter=' ', quotechar='|')
    for row in reader:
        print(', '.join(row))
Error in Terminal
Traceback (most recent call last):
File "/home/ec2-user/environment/test/test.py", line 3, in <module>
with open("test.csv", newline='') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test.csv'
Absolute Path
import csv
with open("/home/ec2-user/environment/test/test.csv", newline='') as f:
    reader = csv.reader(f, delimiter=' ', quotechar='|')
    for row in reader:
        print(', '.join(row))
No Errors
Found a similar question and posted an answer below that works: Reading a file using a relative path in a Python project
import csv
from pathlib import Path
path = Path(__file__).parent / "test.csv"
with path.open() as f:
    reader = list(csv.reader(f, delimiter=' ', quotechar='|'))
    for row in reader:
        print(', '.join(row))
I can't comment so I am going to answer here and hope it helps.
AWS Cloud9 runs on Linux. You can make the relative path explicit by prepending ./ to the file name, though it still resolves against the current working directory, e.g. in your case:
import csv
with open("./test.csv", newline='') as f:
    reader = csv.reader(f, delimiter=' ', quotechar='|')
    for row in reader:
        print(', '.join(row))
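One caveat on the `./` prefix: `test.csv` and `./test.csv` resolve against the process's current working directory, not the script's directory, so the prefix alone doesn't fix the FileNotFoundError unless the script is run from that directory. A quick sketch to see which directory is being used:

```python
import os
from pathlib import Path

# Relative paths resolve against the current working directory,
# so "test.csv" and "./test.csv" name exactly the same file.
cwd = Path.cwd()
print("relative paths resolve against:", cwd)
print("'test.csv' would open:", cwd / "test.csv")
```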

CSV Should Return Strings, Not Bytes Error

I am trying to read CSV files from a directory that is not in the same directory as my Python script.
Additionally, the CSV files are stored in ZIP archives that have the exact same names (the only difference being that one ends with .zip and the other with .csv).
Currently I am using Python's zipfile and csv libraries to open and get the data from the files, however I am getting the error:
Traceback (most recent call last):
  File "write_pricing_data.py", line 13, in <module>
    for row in reader:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
My code:
import os, csv
from zipfile import *
folder = r'D:/MarketData/forex'
localFiles = os.listdir(folder)
for file in localFiles:
    zipArchive = ZipFile(folder + '/' + file)
    with zipArchive.open(file[:-4] + '.csv') as csvFile:
        reader = csv.reader(csvFile, delimiter=',')
        for row in reader:
            print(row[0])
How can I resolve this error?
It's a bit of a kludge and I'm sure there's a better way (that just happens to elude me right now). If you don't have embedded new lines, then you can use:
import zipfile, csv
zf = zipfile.ZipFile('testing.csv.zip')
with zf.open('testing.csv', 'r') as fin:
    # Create a generator of decoded lines for input to csv.reader
    # (the csv module is only really happy with ASCII input anyway...)
    lines = (line.decode('ascii') for line in fin)
    for row in csv.reader(lines):
        print(row)
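An alternative that is less of a kludge in Python 3: wrap the binary stream in `io.TextIOWrapper`, which hands `csv.reader` decoded text and supports any encoding, not just ASCII. A self-contained sketch (it builds its own small archive first; the file names are made up):

```python
import csv
import io
import os
import tempfile
import zipfile

# Build a small zip containing a CSV so the example is runnable
zip_path = os.path.join(tempfile.mkdtemp(), 'testing.csv.zip')
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('testing.csv', 'a,b\n1,2\n')

# TextIOWrapper decodes the binary stream, so csv.reader gets strings
with zipfile.ZipFile(zip_path) as zf:
    with zf.open('testing.csv') as fin:
        text = io.TextIOWrapper(fin, encoding='utf-8', newline='')
        rows = list(csv.reader(text))
print(rows)  # [['a', 'b'], ['1', '2']]
```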

Erroneous line added while adding new columns in Python

I am trying to add extra columns to a csv file after processing an input csv file. But I am getting an extra new line added after each line in the output.
What's missing or wrong in my code below?
import csv
with open('test.csv', 'r') as infile:
    with open('test_out.csv', 'w') as outfile:
        reader = csv.reader(infile, delimiter=',')
        writer = csv.writer(outfile, delimiter=',')
        for row in reader:
            colad = row[5].rstrip('0123456789./ ')
            if colad == row[5]:
                col2ad = row[11]
            else:
                col2ad = row[5].split(' ')[-1]
            writer.writerow([row[0],colad,col2ad] +row[1:])
I am processing a huge csv file, so I would like to get rid of those extra lines.
I had the same problem on Windows (your OS as well, I presume?). On Windows, the \r\n line terminator that the csv writer emits gets an extra \r from text-mode newline translation, so each line ends with \r\r\n (a double newline).
You need to open the output file in binary mode:
with open('test_out.csv', 'wb') as outfile:
For other answers, see:
Python's CSV writer produces wrong line terminator
CSV in Python adding an extra carriage return
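Note that the binary-mode fix above is Python 2 advice; in Python 3 the csv module requires text mode, and the documented fix is to open the output file with newline='' instead. A minimal sketch:

```python
import csv
import os
import tempfile

out_path = os.path.join(tempfile.mkdtemp(), 'test_out.csv')

# newline='' stops text-mode newline translation from turning the
# \r\n that csv.writer emits into \r\r\n on Windows
with open(out_path, 'w', newline='') as outfile:
    writer = csv.writer(outfile, delimiter=',')
    writer.writerow(['a', 'b', 'c'])
    writer.writerow(['1', '2', '3'])

with open(out_path, 'rb') as f:
    print(f.read())  # b'a,b,c\r\n1,2,3\r\n' -- one terminator per row
```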
