csv writing alternate whitespace in lines within a loop python - python

I have a code for generating a csv file shown below.
import csv
csvData = [["HAI", "Hello"]]
while(1):
with open('test.csv', 'a') as csvFile:
writer = csv.writer(csvFile, delimiter=',', quoting=csv.QUOTE_ALL)
writer.writerows(csvData)
csvFile.close()
When i run this code it will generate a csv file like this
1."HAI","Hello"
3."HAI","Hello"
That is it will generate on alternative lines,that is first write takes palce on line 1 ,second write takes on line 3.I want to write it on line 2. Anybody please help me.

If you look at the documentation for CSV formatting parameters, you will see that there is the argument lineterminator, which defaults to \r\n.
You can get your desired output by simply passing a regular \n like so:
writer = csv.writer(csvFile, delimiter=",", quoting=csv.QUOTE_ALL, lineterminator="\n")

Related

Python changing Comma Delimitation CSV

NEWBIE USING PYTHON (2.7.9)- When I export a gzipped file to a csv using:
myData = gzip.open('file.gz.DONE', 'rb')
myFile = open('output.csv', 'wb') with myFile:
writer = csv.writer(myFile)
writer.writerows(myData)
print("Writing complete")
It is printing in the csv with a comma deliminated in every character. eg.
S,V,R,","2,1,4,0,",",2,0,1,6,1,1,3,8,0,4,",",5,0,5,0,1,3,4,2,0,6,4,7,3,6,4,",",",",2,0,0,0,5,6,5,9,2,9,6,7,4,",",2,0,0,7,2,4,5,2,3,5,",",0,0,0,2,","
I,V,E,",",",",",",E,N,",",N,/,A,",",0,4,2,1,4,4,9,3,7,0,",":,I,R,_,",",N,/,A,",",U,N,A,N,S,W,",",",",",",",","
"
S,V,R,",",4,7,3,3,5,5,",",2,0,5,7,",",5,0,5,0,1,4,5,0,1,6,4,8,6,3,7,",",",",2,0,0,0,5,5,3,9,2,9,2,8,0,",",2,0,4,4,1,0,8,3,7,8,",",0,0,0,2,","
I,V,E,",",",",",",E,N,",",N,/,A,",",0,4,4,7,3,3,5,4,5,5,",",,:,I,R,_,",",N,/,A,",",U,N,A,N,S,W,",",",",",",",","
How do I get rid of the comma so that it is exported with the correct fields? eg.
SVR,2144370,20161804,50501342364,,565929674,2007245235,0002,1,PPDAP,PPLUS,DEACTIVE,,,EN,N/A,214370,:IR_,N/A,,,,,
SVR,473455,208082557,14501648637,,2000553929280,2044108378,0002,1,3G,CODAP,INACTIVE,,,EN,N/A,35455,:IR_,N/A,,,,,
You are only opening the gzip file. I think you are expecting the opened file to act automatically like an iterator. Which it does. However each line is a text string. The writerows expects an iterator with each item being an array of values to write with comma separation. Thus given an iterator with each item being a sting, and given that a string is an array of characters you get the result you found.
Since you didn't mention what the gzip data lines really contain I can't guess how to parse the lines into an array of reasonable chunks. But assuming a function called 'split_line' appropriate to that data you could do
with gzip.open('file.gz.Done', 'rb') as gzip_f:
data = [split_line(l) for l in gzip_f]
with open('output.csv', 'wb') as myFile:
writer = csv.writer(myFile)
writer.writerows(data)
print("Writing complete")
Of course at this point doing row by row and putting the with lines together makes sense.
See https://docs.python.org/2/library/csv.html
I think it's simply because gzip.open() will give you a file-like object but csvwriter.writerows() needs a list of lists of strings to do its work.
But I don't understand why you want to use the csv module. You look like you only want to extract the content of the gzip file and save it in a output file uncompressed. You could do that like that:
import gzip
input_file_name = 'file.gz.DONE'
output_file_name = 'output.csv'
with gzip.open(input_file_name, 'rt') as input_file:
with open('output.csv', 'wt') as output_file:
for line in input_file:
output_file.write(line)
print("Writing complete")
If you want to use the csv module because you're not sure your input data is properly formatted (and you want an error message right away) you could then do:
import gzip
import csv
input_file_name = 'file.gz.DONE'
output_file_name = 'output.csv'
with gzip.open(input_file_name, 'rt', newline='') as input_file:
reader_csv = csv.reader(input_file)
with open('output.csv', 'wt', newline='') as output_file:
writer_csv = csv.writer(output_file)
writer_csv.writerows(reader_csv)
print("Writing complete")
Is that what you were trying to do ? It's difficult to guess because we don't have the input file to understand.
If it's not what you want, could you care to clarify what you want?
Since I now have information the gzipped file is itself comma, separated values it simplifies thus..
with gzip.open('file.gz.DONE', 'rb') as gzip_f, open('output.csv', 'wb') as myFile:
myfile.write(gzip_f.read())
In other words it is just a round about gunzip to another file.

How to write output file in CSV format in python?

I tried to write output file as a CSV file but getting either an error or not the expected result. I am using Python 3.5.2 and 2.7 also.
Getting error in Python 3.5:
wr.writerow(var)
TypeError: a bytes-like object is required, not 'str'
and
In Python 2.7, I am getting all column result in one column.
Expected Result:
An output file same format as the input file.
Code:
import csv
f1 = open("input_1.csv", "r")
resultFile = open("out.csv", "wb")
wr = csv.writer(resultFile, quotechar=',')
def sort_duplicates(f1):
for i in range(0, len(f1)):
f1.insert(f1.index(f1[i])+1, f1[i])
f1.pop(i+1)
for var in f1:
#print (var)
wr.writerow([var])
If I am using resultFile = open("out.csv", "w"), I get one row extra in the output file.
If I am using above code, getting one row and column extra.
On Python 3, csv requires that you open the file in text mode, not binary mode. Drop the b from your file mode. You should really use newline='' too:
resultFile = open("out.csv", "w", newline='')
Better still, use the file object as a context manager to ensure it is closed automatically:
with open("input_1.csv", "r") as f1, \
open("out.csv", "w", newline='') as resultFile:
wr = csv.writer(resultFile, dialect='excel')
for var in f1:
wr.writerow([var.rstrip('\n')])
I've also stripped the lines from f1 (just to remove the newline) and put the line in a list; csv.writer.writerow wants a sequence with columns, not a single string.
Quoting the csv.writer() documentation:
If csvfile is a file object, it should be opened with newline='' [1]. [...] All other non-string data are stringified with str() before being written.
[1] If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n linendings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
Others have answered that you should open the output file in text mode when using Python 3, i.e.
with open('out.csv', 'w', newline='') as resultFile:
...
But you also need to parse the incoming CSV data. As it is your code reads each line of the input CSV file as a single string. Then, without splitting that line into its constituent fields, it passes the string to the CSV writer. As a result, the csv.writer will treat the string as a sequence and output each character , including any terminating new line character, as a separate field. For example, if your input CSV file contains:
1,2,3,4
Your output file would be written like this:
1,",",2,",",3,",",4,"
"
You should change the for loop to this:
for row in csv.reader(f1):
# process the row
wr.writerow(row)
Now the input CSV file will be parsed into fields and row will contain a list of strings - one for each field. For the previous example, row would be:
for row in csv.reader(f1):
print(row)
['1', '2', '3', '4']
And when that list is passed to the csv.writer the output to the file will be:
1,2,3,4
Putting all of that together you get this code:
import csv
with open('input_1.csv') as f1, open('out.csv', 'w', newline='') as resultFile:
wr = csv.writer(resultFile, dialect='excel')
for row in csv.reader(f1):
wr.writerow(row)
open file without b mode
b mode open your file as binary
you can open file as w
open_file = open("filename.csv", "w")
You are opening the input file in normal read mode but the output file is opened in binary mode, correct way
resultFile = open("out.csv", "w")
As shown above if you replace "wb" with "w" it will work.

Python CSV module with unix dialect writes different tab character when using QUOTE_NONE

I'm writing a tab-delimited csv file on Windows but want it to be a Linux-friendly format. However, the values should not be quoted. When I use the csv module with the below options, the tab appears as taking only a single space in the output.
with open('myFile.csv', 'w', newline='') as f:
writer = csv.writer(f, delimiter='\t', dialect='unix', quoting=csv.QUOTE_NONE)
writer.writerows(myData)
output with QUOTE_NONE (screenshot from Notepad++)
If I remove QUOTE_NONE, the tab character displays as the right size, but then the file has quoted values, which I don't want.
with open('myFile.csv', 'w', newline='') as f:
writer = csv.writer(f, delimiter='\t', dialect='unix')
writer.writerows(myData)
output with quotes left on
What's going on here?

Excel disregards decimal separators when working with Python generated CSV file

I am currently trying to write a csv file in python. The format is as following:
1; 2.51; 12
123; 2.414; 142
EDIT: I already get the above format in my CSV, so the python code seems ok. It appears to be an excel issue which is olved by changing the settigs as #chucksmash mentioned.
However, when I try to open the generated csv file with excel, it doesn't recognize decimal separators. 2.414 is treated as 2414 in excel.
csvfile = open('C:/Users/SUUSER/JRITraffic/Data/data.csv', 'wb')
writer = csv.writer(csvfile, delimiter=";")
writer.writerow(some_array_with_floats)
Did you check that the csv file is generated correctly as you want? Also, try to specify the delimeter character that your using for the csv file when you import/open your file. In this case, it is a semicolon.
For python 3, I think your above code will also run into a TypeError, which may be part of the problem.
I just made a modification with your open method to be 'w' instead of 'wb' since the array has float and not binary data. This seemed to generate the result that you were looking for.
csvfile = open('C:/Users/SUUSER/JRITraffic/Data/data.csv', 'w')
An ugly solution, if you really want to use ; as the separator:
import csv
import os
with open('a.csv', 'wb') as csvfile:
csvfile.write('sep=;'+ os.linesep) # new line
writer = csv.writer(csvfile, delimiter=";")
writer.writerow([1, 2.51, 12])
writer.writerow([123, 2.414, 142])
This will produce:
sep=;
1;2.51;12
123;2.414;142
which is recognized fine by Excel.
I personally would go with , as the separator in which case you do not need the first line, so you can basically:
import csv
with open('a.csv', 'wb') as csvfile:
writer = csv.writer(csvfile) # default delimiter is `,`
writer.writerow([1, 2.51, 12])
writer.writerow([123, 2.414, 142])
And excel will recognize what is going on.
A way to do this is to specify dialect=csv.excel in the writer. For example:
a = [[1, 2.51, 12],[123, 2.414, 142]]
csvfile = open('data.csv', 'wb')
writer = csv.writer(csvfile, delimiter=";", dialect=csv.excel)
writer.writerows(a)
csvfile.close()
Unless Excel is already configured to use semicolon as its default delimiter, it will be necessary to import data.csv using Data/FromText and specify semicolon as the delimiter in the Text Import Wizard step 2 screen.
Very little documentation is provided for the Dialect class at csv.Dialect. More information about it is at Dialects in the PyMOTW's "csv – Comma-separated value files" article on the Python csv module. More information about csv.writer() is available at https://docs.python.org/2/library/csv.html#csv.writer.

Python csv: Optimizing header output

I am writing a huge (160k+) set of data from SQL to a CSV. My script functions exactly as intended, but I am sure there has to be a more efficient way of including a header in the output. I cobbled together the following from reading writing header in csv python with DictWriter but feel like it lacks elegance.
Here's my code:
f = open(outfile,'w')
wf = csv.DictWriter(f, fieldnames, restval='OOPS')
wf.writer.writerow(wf.fieldnames)
f.close()
f = open(outfile,'a')
wf = csv.writer(f)
wf.writerows(rows)
f.close()
fieldnames is defined explicitly (10 custom column names), rows contains the fetchall() from my query.
Untested, but I don't see why this shouldn't do the job:
import csv
with open(outfile, "wb") as outf:
outcsv = csv.writer(outf)
outcsv.writerow(fieldnames)
outcsv.writerows(rows)

Categories

Resources