How to write an output file in CSV format in Python?

I tried to write an output file as a CSV file, but I am getting either an error or not the expected result. I am using Python 3.5.2 and also 2.7.
In Python 3.5 I get this error:
wr.writerow(var)
TypeError: a bytes-like object is required, not 'str'
and
In Python 2.7, I am getting all the column results in one column.
Expected Result:
An output file same format as the input file.
Code:
import csv
f1 = open("input_1.csv", "r")
resultFile = open("out.csv", "wb")
wr = csv.writer(resultFile, quotechar=',')
def sort_duplicates(f1):
    for i in range(0, len(f1)):
        f1.insert(f1.index(f1[i])+1, f1[i])
        f1.pop(i+1)

for var in f1:
    #print (var)
    wr.writerow([var])
If I use resultFile = open("out.csv", "w"), I get one extra row in the output file.
If I use the code above, I get one extra row and one extra column.

On Python 3, csv requires that you open the file in text mode, not binary mode. Drop the b from your file mode. You should really use newline='' too:
resultFile = open("out.csv", "w", newline='')
Better still, use the file object as a context manager to ensure it is closed automatically:
with open("input_1.csv", "r") as f1, \
open("out.csv", "w", newline='') as resultFile:
wr = csv.writer(resultFile, dialect='excel')
for var in f1:
wr.writerow([var.rstrip('\n')])
I've also stripped the lines from f1 (just to remove the newline) and put the line in a list; csv.writer.writerow wants a sequence with columns, not a single string.
Quoting the csv.writer() documentation:
If csvfile is a file object, it should be opened with newline='' [1]. [...] All other non-string data are stringified with str() before being written.
[1] If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
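To see why the wrapping list matters, here is a minimal sketch (the file name demo.csv is just an illustration) comparing a bare string with a one-element list:
import csv

with open('demo.csv', 'w', newline='') as f:
    wr = csv.writer(f)
    wr.writerow('abc')    # a string is treated as a sequence of characters -> a,b,c
    wr.writerow(['abc'])  # a one-element list -> abc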

Others have answered that you should open the output file in text mode when using Python 3, i.e.
with open('out.csv', 'w', newline='') as resultFile:
    ...
But you also need to parse the incoming CSV data. As it is, your code reads each line of the input CSV file as a single string. Then, without splitting that line into its constituent fields, it passes the string to the CSV writer. As a result, csv.writer will treat the string as a sequence and output each character, including any terminating newline character, as a separate field. For example, if your input CSV file contains:
1,2,3,4
Your output file would be written like this:
1,",",2,",",3,",",4,"
"
You should change the for loop to this:
for row in csv.reader(f1):
    # process the row
    wr.writerow(row)
Now the input CSV file will be parsed into fields and row will contain a list of strings - one for each field. For the previous example, row would be:
for row in csv.reader(f1):
    print(row)
['1', '2', '3', '4']
And when that list is passed to the csv.writer the output to the file will be:
1,2,3,4
Putting all of that together you get this code:
import csv
with open('input_1.csv') as f1, open('out.csv', 'w', newline='') as resultFile:
    wr = csv.writer(resultFile, dialect='excel')
    for row in csv.reader(f1):
        wr.writerow(row)

Open the file without the b mode.
The b mode opens your file as binary.
You can open the file with w:
open_file = open("filename.csv", "w")

You are opening the input file in normal read mode, but the output file is opened in binary mode. The correct way is:
resultFile = open("out.csv", "w")
As shown above, if you replace "wb" with "w" it will work.


How to read a csv file to get output not as a list but as a normal string?

I would like to ask how to change the code to get this output:
Nigel;UK;19
John;US;22
Carol;Germany;26
and not this (is it a list?):
['Nigel', 'UK', '19']
['John', 'US', '22']
['Carol', 'Germany', '26']
code:
import csv
with open('friends.csv', 'r') as source:
    reader = csv.reader(source)
    for line in reader:
        print(line)
I think there are instructions for this somewhere, but I did not find them.
Thank you
You get the output as a list because that's what csv.reader() does! It reads each line of a csv file, converts it to a list taking into account commas and escaping rules, and returns that list.
If you don't care about each individual element of the csv file, just read the file as a regular file:
with open("filename.csv", "r") as f:
for line in f:
print(line, end="")
the end="" argument prevents print() from adding its own newline characters, since the line from the file already includes the newline character
Alternatively, you can still read it as a csv file, but str.join() the resulting list before printing it. Use this if you are using the rows of the csv file as lists somewhere else, but you just want to print the file here:
with open('friends.csv', 'r') as source:
    reader = csv.reader(source)
    for line in reader:
        print(";".join(line))
Try unpacking the list elements via the * operator, and pass a custom separator to print:
import csv
with open('friends.csv', 'r') as source:
    reader = csv.reader(source)
    for line in reader:
        print(*line, sep=';')
Result:
Nigel;UK;19
John;US;22
Carol;Germany;26
An easier (and more efficient) approach would be to read in the file contents as a string and then replace all commas with semicolons:
with open('friends.csv', 'r') as source:
    print(source.read().replace(',', ';'))
Note, this assumes the contents of your friends.csv is as follows:
Nigel,UK,19
John,US,22
Carol,Germany,26

Want to write data to a csv file using the csv module, but the data is being written on alternate rows [duplicate]

import csv
with open('thefile.csv', 'rb') as f:
    data = list(csv.reader(f))

import collections
counter = collections.defaultdict(int)
for row in data:
    counter[row[10]] += 1

with open('/pythonwork/thefile_subset11.csv', 'w') as outfile:
    writer = csv.writer(outfile)
    for row in data:
        if counter[row[10]] >= 504:
            writer.writerow(row)
This code reads thefile.csv, makes changes, and writes results to thefile_subset1.
However, when I open the resulting csv in Microsoft Excel, there is an extra blank line after each record!
Is there a way to make it not put an extra blank line?
csv.writer controls line endings itself and writes \r\n into the file directly. In Python 3 the file must be opened in untranslated text mode with the parameters 'w', newline='' (empty string), or it will write \r\r\n on Windows, where the default text mode translates each \n into \r\n.
#!python3
with open('/pythonwork/thefile_subset11.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
In Python 2, open outfile in binary mode with 'wb' instead of 'w' to prevent Windows newline translation. Python 2 also has problems with Unicode and requires other workarounds to write non-ASCII text. See the Python 2 link below and the UnicodeReader and UnicodeWriter examples at the end of that page if you have to deal with writing Unicode strings to CSVs in Python 2, or look into the third-party unicodecsv module:
#!python2
with open('/pythonwork/thefile_subset11.csv', 'wb') as outfile:
    writer = csv.writer(outfile)
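For reference, a minimal sketch of the unicodecsv route mentioned above, assuming the third-party unicodecsv package is installed (its interface mirrors csv but takes an encoding argument); the row contents here are illustrative:
#!python2
import unicodecsv

with open('/pythonwork/thefile_subset11.csv', 'wb') as outfile:
    # unicodecsv encodes unicode strings for you; utf-8 is assumed here
    writer = unicodecsv.writer(outfile, encoding='utf-8')
    writer.writerow([u'caf\xe9', u'na\xefve'])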
Documentation Links
https://docs.python.org/3/library/csv.html#csv.writer
https://docs.python.org/2/library/csv.html#csv.writer
Opening the file in binary mode "wb" will not work in Python 3+. Or rather, you'd have to convert your data to binary before writing it. That's just a hassle.
Instead, you should keep it in text mode, but override the newline as empty. Like so:
with open('/pythonwork/thefile_subset11.csv', 'w', newline='') as outfile:
Note: It seems this is not the preferred solution because of how the extra line was being added on a Windows system. As stated in the Python documentation:
If csvfile is a file object, it must be opened with the ‘b’ flag on platforms where that makes a difference.
Windows is one such platform where that makes a difference. While changing the line terminator, as I described below, may have fixed the problem, the problem could be avoided altogether by opening the file in binary mode. One might say this solution is more "elegant": fiddling with the line terminator would likely have produced code that is not portable between systems, whereas opening a file in binary mode has no effect on a Unix system, i.e. it results in cross-system compatible code.
From Python Docs:
On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it’ll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files. On Unix, it doesn’t hurt to append a 'b' to the mode, so you can use it platform-independently for all binary files.
Original:
As part of the optional parameters for csv.writer, if you are getting extra blank lines you may have to change the lineterminator (info here). The example below is adapted from the csv docs on the Python site. Change it from '\n' to whatever it should be. As this is just a stab in the dark at the problem, this may or may not work, but it's my best guess.
>>> import csv
>>> spamWriter = csv.writer(open('eggs.csv', 'w'), lineterminator='\n')
>>> spamWriter.writerow(['Spam'] * 5 + ['Baked Beans'])
>>> spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
The simple answer is that csv files should always be opened in binary mode whether for input or output, as otherwise on Windows there are problems with the line ending. Specifically on output the csv module will write \r\n (the standard CSV row terminator) and then (in text mode) the runtime will replace the \n by \r\n (the Windows standard line terminator) giving a result of \r\r\n.
Fiddling with the lineterminator is NOT the solution.
A lot of the other answers have become out of date in the ten years since the original question. For Python 3, the answer is right in the documentation:
If csvfile is a file object, it should be opened with newline=''
The footnote explains in more detail:
If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
Use the method defined below to write data to the CSV file:
open('outputFile.csv', 'a', newline='')
Just add an additional newline='' parameter inside the open() call:
def writePhoneSpecsToCSV():
    rowData = ["field1", "field2"]
    with open('outputFile.csv', 'a', newline='') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(rowData)
This will write CSV rows without creating additional rows!
I'm writing this answer with respect to Python 3, as I initially had the same problem.
I was supposed to get data from an Arduino using PySerial and write it to a .csv file. Each reading in my case ended with '\r\n', so a newline was always separating each line.
In my case, the newline='' option didn't work, because it showed an error like:
with open('op.csv', 'a',newline=' ') as csv_file:
ValueError: illegal newline value: ''
So it seemed that the newline value was not accepted there.
Following one of the answers here, I specified a line terminator in the writer object instead, like this:
writer = csv.writer(csv_file, delimiter=' ',lineterminator='\r')
and that worked for me for skipping the extra newlines.
with open(destPath + '\\' + csvXML, 'a+') as csvFile:
    writer = csv.writer(csvFile, delimiter=';', lineterminator='\r')
    writer.writerows(xmlList)
Setting lineterminator='\r' moves the writer to the next row without an empty row in between.
Borrowing from this answer, it seems like the cleanest solution is to use io.TextIOWrapper. I managed to solve this problem for myself as follows:
from io import TextIOWrapper
...
with open(filename, 'wb') as csvfile, TextIOWrapper(csvfile, encoding='utf-8', newline='') as wrapper:
    csvwriter = csv.writer(wrapper)
    for data_row in data:
        csvwriter.writerow(data_row)
The above answer is not compatible with Python 2. To have compatibility, I suppose one would simply need to wrap all the writing logic in an if block:
if sys.version_info < (3,):
    # Python 2 way of handling CSVs
else:
    # The above logic
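A minimal runnable sketch of that version check, with a helper name (write_rows), filename handling, and row format of my own choosing rather than anything from the original post:
import csv
import sys
from io import TextIOWrapper

def write_rows(filename, rows):
    if sys.version_info < (3,):
        # Python 2: binary mode keeps the csv module's \r\n from being translated again
        with open(filename, 'wb') as csvfile:
            csv.writer(csvfile).writerows(rows)
    else:
        # Python 3: wrap the binary handle in a text layer with newline=''
        with open(filename, 'wb') as csvfile, \
                TextIOWrapper(csvfile, encoding='utf-8', newline='') as wrapper:
            csv.writer(wrapper).writerows(rows)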
I used writerow:
import csv
from itertools import permutations

def write_csv(writer, var1, var2, var3, var4):
    """
    write four variables into a csv file
    """
    writer.writerow([var1, var2, var3, var4])

numbers = set([1, 2, 3, 4, 5, 6, 7, 2, 4, 6, 8, 10, 12, 14, 16])
rules = list(permutations(numbers, 4))
#print(rules)
selection = []
with open("count.csv", 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for rule in rules:
        number1, number2, number3, number4 = rule
        if (number1 + number2 + number3 + number4) % 5 == 0:
            #print(rule)
            selection.append(rule)
            write_csv(writer, number1, number2, number3, number4)
When using Python 3, the empty lines can be avoided by using the codecs module. As stated in the documentation, files are opened in binary mode, so no change of the newline kwarg is necessary. I ran into the same issue recently, and this worked for me:
with codecs.open(csv_file, mode='w', encoding='utf-8') as out_csv:
    # DictWriter also needs a fieldnames list (the column names) passed in
    csv_out_file = csv.DictWriter(out_csv, fieldnames=fieldnames)

Write to a csv file multiple times? [duplicate]

I am trying to add a new row to my old CSV file. Basically, it gets updated each time I run the Python script.
Right now I am storing the old CSV rows' values in a list and then deleting the CSV file and creating it again with the new list values.
I wanted to know whether there are any better ways of doing this.
with open('document.csv', 'a') as fd:
    fd.write(myCsvRow)
Opening a file with the 'a' parameter allows you to append to the end of the file instead of simply overwriting the existing content. Try that.
I prefer this solution using the csv module from the standard library and the with statement to avoid leaving the file open.
The key point is using 'a' for appending when you open the file.
import csv

fields = ['first', 'second', 'third']
with open(r'name', 'a') as f:
    writer = csv.writer(f)
    writer.writerow(fields)
If you are using Python 2.7 you may experience superfluous new lines on Windows. You can try to avoid them by using 'ab' instead of 'a'; this will, however, cause TypeError: a bytes-like object is required, not 'str' in Python 3.6. Adding newline='', as Natacha suggests, will cause a backward incompatibility between Python 2 and 3.
Based on G M's answer and paying attention to John La Rooy's warning, I was able to append a new row by opening the file in 'a' mode.
Even on Windows, in order to avoid the newline problem, you must declare it as newline=''.
Now you can open the file in 'a' mode (without the b).
import csv

with open(r'names.csv', 'a', newline='') as csvfile:
    fieldnames = ['This', 'aNew']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writerow({'This': 'is', 'aNew': 'Row'})
I didn't try with the regular writer (without the Dict), but I think it will be fine too; a sketch is below.
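A minimal sketch with the plain writer under the same 'a' plus newline='' settings; the filename and row values here are just illustrative:
import csv

with open(r'names.csv', 'a', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['is', 'Row'])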
If you use pandas, you can append your dataframes to an existing CSV file this way:
df.to_csv('log.csv', mode='a', index=False, header=False)
With mode='a' we ensure that we append, rather than overwrite, and with header=False we ensure that we append only the values of df rows, rather than header + values.
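For example, a small sketch of appending one new row this way; the column count and values are made up and must match the existing log.csv layout:
import pandas as pd

# One new row; header=False keeps the column names from being rewritten on each append.
df = pd.DataFrame([[42, 'ok', 3.14]])
df.to_csv('log.csv', mode='a', index=False, header=False)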
Are you opening the file with mode of 'a' instead of 'w'?
See Reading and Writing Files in the python docs
7.2. Reading and Writing Files
open() returns a file object, and is most commonly used with two arguments: open(filename, mode).
>>> f = open('workfile', 'w')
>>> print f
<open file 'workfile', mode 'w' at 80a0960>
The first argument is a string containing the filename. The second argument is another string containing a few characters describing the way in which the file will be used. mode can be 'r' when the file will only be read, 'w' for only writing (an existing file with the same name will be erased), and 'a' opens the file for appending; any data written to the file is automatically added to the end. 'r+' opens the file for both reading and writing. The mode argument is optional; 'r' will be assumed if it’s omitted.
On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it’ll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files. On Unix, it doesn’t hurt to append a 'b' to the mode, so you can use it platform-independently for all binary files.
If the file exists and contains data, it is possible to generate the fieldnames parameter for csv.DictWriter automatically:
# read header automatically
with open(myFile, "r") as f:
    reader = csv.reader(f)
    for header in reader:
        break

# add row to CSV file
with open(myFile, "a", newline='') as f:
    writer = csv.DictWriter(f, fieldnames=header)
    writer.writerow(myDict)
I use the following approach to append a new line in a .csv file:
pose_x = 1
pose_y = 2

with open('path-to-your-csv-file.csv', mode='a') as file_:
    file_.write("{},{}".format(pose_x, pose_y))
    file_.write("\n")  # Next line.
[NOTE]:
mode='a' is append mode.
# I like using the codecs opening in a with
import codecs
import csv

field_names = ['latitude', 'longitude', 'date', 'user', 'text']
with codecs.open(filename, "ab", encoding='utf-8') as logfile:
    logger = csv.DictWriter(logfile, fieldnames=field_names)
    logger.writeheader()
    # some more code stuff
    for video in aList:
        video_result = {}
        video_result['date'] = video['snippet']['publishedAt']
        video_result['user'] = video['id']
        video_result['text'] = video['snippet']['description'].encode('utf8')
        logger.writerow(video_result)

Python changing Comma Delimitation CSV

Newbie using Python (2.7.9) here. When I export a gzipped file to a csv using:
myData = gzip.open('file.gz.DONE', 'rb')
myFile = open('output.csv', 'wb')
with myFile:
    writer = csv.writer(myFile)
    writer.writerows(myData)
    print("Writing complete")
It is printing to the csv with a comma delimiting every character, e.g.:
S,V,R,","2,1,4,0,",",2,0,1,6,1,1,3,8,0,4,",",5,0,5,0,1,3,4,2,0,6,4,7,3,6,4,",",",",2,0,0,0,5,6,5,9,2,9,6,7,4,",",2,0,0,7,2,4,5,2,3,5,",",0,0,0,2,","
I,V,E,",",",",",",E,N,",",N,/,A,",",0,4,2,1,4,4,9,3,7,0,",":,I,R,_,",",N,/,A,",",U,N,A,N,S,W,",",",",",",",","
"
S,V,R,",",4,7,3,3,5,5,",",2,0,5,7,",",5,0,5,0,1,4,5,0,1,6,4,8,6,3,7,",",",",2,0,0,0,5,5,3,9,2,9,2,8,0,",",2,0,4,4,1,0,8,3,7,8,",",0,0,0,2,","
I,V,E,",",",",",",E,N,",",N,/,A,",",0,4,4,7,3,3,5,4,5,5,",",,:,I,R,_,",",N,/,A,",",U,N,A,N,S,W,",",",",",",",","
How do I get rid of the commas so that it is exported with the correct fields? E.g.:
SVR,2144370,20161804,50501342364,,565929674,2007245235,0002,1,PPDAP,PPLUS,DEACTIVE,,,EN,N/A,214370,:IR_,N/A,,,,,
SVR,473455,208082557,14501648637,,2000553929280,2044108378,0002,1,3G,CODAP,INACTIVE,,,EN,N/A,35455,:IR_,N/A,,,,,
You are only opening the gzip file. I think you are expecting the opened file to act automatically like an iterator, which it does. However, each line is a text string. writerows expects an iterator whose items are arrays of values to write with comma separation. Thus, given an iterator whose items are strings, and given that a string is a sequence of characters, you get the result you found.
Since you didn't mention what the gzip data lines really contain, I can't guess how to parse the lines into an array of reasonable chunks. But assuming a function called split_line appropriate to that data, you could do:
with gzip.open('file.gz.Done', 'rb') as gzip_f:
    data = [split_line(l) for l in gzip_f]

with open('output.csv', 'wb') as myFile:
    writer = csv.writer(myFile)
    writer.writerows(data)
    print("Writing complete")
Of course, at this point it makes sense to process row by row and combine the with statements.
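For instance, if each gzipped line is itself plain comma-separated text (as a later follow-up in this thread indicates), split_line could be as simple as the sketch below; adjust it if the real data uses a different separator or contains quoted fields:
def split_line(line):
    # Strip the trailing newline, then split on commas.
    return line.rstrip('\r\n').split(',')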
See https://docs.python.org/2/library/csv.html
I think it's simply because gzip.open() gives you a file-like object, but csv.writer.writerows() needs a list of lists of strings to do its work.
But I don't understand why you want to use the csv module. It looks like you only want to extract the content of the gzip file and save it uncompressed in an output file. You could do that like this:
import gzip
input_file_name = 'file.gz.DONE'
output_file_name = 'output.csv'
with gzip.open(input_file_name, 'rt') as input_file:
    with open(output_file_name, 'wt') as output_file:
        for line in input_file:
            output_file.write(line)
print("Writing complete")
If you want to use the csv module because you're not sure your input data is properly formatted (and you want an error message right away) you could then do:
import gzip
import csv
input_file_name = 'file.gz.DONE'
output_file_name = 'output.csv'
with gzip.open(input_file_name, 'rt', newline='') as input_file:
    reader_csv = csv.reader(input_file)
    with open(output_file_name, 'wt', newline='') as output_file:
        writer_csv = csv.writer(output_file)
        writer_csv.writerows(reader_csv)
print("Writing complete")
Is that what you were trying to do? It's difficult to guess because we don't have the input file.
If it's not what you want, could you clarify what you want?
Since I now have the information that the gzipped file is itself comma-separated values, it simplifies to this:
with gzip.open('file.gz.DONE', 'rb') as gzip_f, open('output.csv', 'wb') as myFile:
    myFile.write(gzip_f.read())
In other words, it is just a roundabout gunzip to another file.

Excel disregards decimal separators when working with Python generated CSV file

I am currently trying to write a csv file in Python. The format is as follows:
1; 2.51; 12
123; 2.414; 142
EDIT: I already get the above format in my CSV, so the Python code seems ok. It appears to be an Excel issue, which is solved by changing the settings as @chucksmash mentioned.
However, when I try to open the generated csv file with Excel, it doesn't recognize decimal separators. 2.414 is treated as 2414 in Excel.
csvfile = open('C:/Users/SUUSER/JRITraffic/Data/data.csv', 'wb')
writer = csv.writer(csvfile, delimiter=";")
writer.writerow(some_array_with_floats)
Did you check that the csv file is generated correctly, as you want? Also, try to specify the delimiter character you're using for the csv file when you import/open the file. In this case, it is a semicolon.
For Python 3, I think your code above will also run into a TypeError, which may be part of the problem.
I just changed your open call to use 'w' instead of 'wb', since the array has floats and not binary data. This seemed to generate the result that you were looking for.
csvfile = open('C:/Users/SUUSER/JRITraffic/Data/data.csv', 'w')
An ugly solution, if you really want to use ; as the separator:
import csv
import os
with open('a.csv', 'wb') as csvfile:
    csvfile.write('sep=;' + os.linesep)  # new line
    writer = csv.writer(csvfile, delimiter=";")
    writer.writerow([1, 2.51, 12])
    writer.writerow([123, 2.414, 142])
This will produce:
sep=;
1;2.51;12
123;2.414;142
which is recognized fine by Excel.
I personally would go with , as the separator, in which case you do not need the first line, so you can basically do:
import csv
with open('a.csv', 'wb') as csvfile:
    writer = csv.writer(csvfile)  # default delimiter is `,`
    writer.writerow([1, 2.51, 12])
    writer.writerow([123, 2.414, 142])
And Excel will recognize what is going on.
A way to do this is to specify dialect=csv.excel in the writer. For example:
a = [[1, 2.51, 12],[123, 2.414, 142]]
csvfile = open('data.csv', 'wb')
writer = csv.writer(csvfile, delimiter=";", dialect=csv.excel)
writer.writerows(a)
csvfile.close()
Unless Excel is already configured to use semicolon as its default delimiter, it will be necessary to import data.csv using Data/FromText and specify semicolon as the delimiter in the Text Import Wizard step 2 screen.
Very little documentation is provided for the Dialect class at csv.Dialect. More information about it is at Dialects in the PyMOTW's "csv – Comma-separated value files" article on the Python csv module. More information about csv.writer() is available at https://docs.python.org/2/library/csv.html#csv.writer.
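As an aside, a small sketch of defining a reusable semicolon dialect with csv.register_dialect; the dialect name 'excel-semicolon' is arbitrary, and the open call is shown in Python 3 style:
import csv

# Register a named dialect based on the built-in excel dialect, but with ';' as delimiter.
csv.register_dialect('excel-semicolon', csv.excel, delimiter=';')

with open('data.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, dialect='excel-semicolon')
    writer.writerows([[1, 2.51, 12], [123, 2.414, 142]])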
