How to write string to csv that contain escape chars? - python

I am trying to write a list of strings to csv using csv.writer.
writer = csv.writer(f)
writer.writerow(some_text)
However, some of the strings contain stray escape characters, which seems to cause the following error: _csv.Error: need to escape, but no escapechar set
I've tried using the escapechar option of csv.writer, like this:
writer = csv.writer(f, escapechar='\\')
but this seems to be only a partial solution, since the newline characters (\n) are then not recognized.
How would I solve this problem? An example of a problematic string would be the following:
problem_string = "this \n sentence \% is \n problematic \g"

What format do you want to achieve in the end? Writing this to a csv seems to be leading to some odd outcomes anyway.
In any case, both of the following snippets run without errors for me, though they give slightly different results with respect to escape characters.
With normal string:
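import csv
with open('test2.csv', 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    problem_string = "this \n sentence \% is \n problematic \g"
    # writerow expects a sequence of fields; a bare string would be split into one field per character
    csvwriter.writerow([problem_string])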
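With a raw string:
import csv
with open('test2.csv', 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    problem_string = r"this \n sentence \% is \n problematic \g"
    # the raw string keeps the backslashes literally, so \n here is two characters, not a newline
    csvwriter.writerow([problem_string])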
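For reference, here is a minimal sketch of how the default writer treats such a string when it is passed as a single field. It writes to an in-memory buffer instead of test2.csv (an assumption made only to keep the example self-contained), and the backslashes are doubled so the literal is unambiguous. The embedded newlines end up inside a quoted field and should come back unchanged when the text is read again.

```python
import csv
import io

# Same text as problem_string above, with the backslashes written explicitly.
problem_string = "this \n sentence \\% is \n problematic \\g"

buffer = io.StringIO()
writer = csv.writer(buffer)        # default dialect: QUOTE_MINIMAL, no escapechar
writer.writerow([problem_string])  # one field, wrapped in a list

# Read the text back; newline="" leaves the line endings untouched for the reader.
rows = list(csv.reader(io.StringIO(buffer.getvalue(), newline="")))
round_trip_ok = rows == [[problem_string]]
print(round_trip_ok)
```

If the file must not contain real line breaks inside a field, the raw-string variant (or replacing "\n" with a placeholder before writing) is the alternative; which one is right depends on the format you want in the end.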

Related

Python CSV Parsing, Escaped Quote Character

I am trying to parse a CSV file using the csv.reader, my data is separated by commas and each value starts and ends with quotation marks. Example:
"This is some data", "New data", "More \"data\" here", "test"
My problem is with the third value: the quotation marks inside the data are preceded by an escape character to show that they are part of the data. The Python CSV reader does not use this escape character by default, so the value is parsed incorrectly.
I tried code like below:
with open(filepath) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',', quotechar='\\"')
But I get an error complaining the quotechar is not 1 character.
My current solution is to replace all \" sequences with a single quote ' before parsing with csv.reader. However, I would like to know if there is a better way that does not modify the original data.
The issue here is that you need to define an escapechar, so that the csv reader knows to treat \" as ".
csv.reader(csv_file, quotechar='"', delimiter=',', escapechar='\\')
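As a small sketch of that suggestion, the snippet below parses one line shaped like the sample from the question. It feeds the reader a list containing a single string instead of an open file, and it adds skipinitialspace=True; both are assumptions made for this example (the sample has a space after each comma, which would otherwise prevent the following field from being treated as quoted).

```python
import csv

# One raw line shaped like the sample above; real backslashes precede the inner quotes.
line = r'"This is some data", "New data", "More \"data\" here", "test"'

reader = csv.reader(
    [line],
    delimiter=",",
    quotechar='"',
    escapechar="\\",
    skipinitialspace=True,  # assumption: the data has a space after each comma
)
fields = next(reader)
print(fields)
```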

Escape commas when writing string to CSV

I need to prepend a comma-containing string to a CSV file using Python. Some say that enclosing the string in double quotes escapes the commas within it, but this does not work for me. How do I write this string without the commas being recognized as separators?
string = "WORD;WORD 45,90;WORD 45,90;END;"
with open('doc.csv') as f:
    prepended = string + '\n' + f.read()
with open('doc.csv', 'w') as f:
    f.write(prepended)
As you point out, you can typically quote the string as shown below. Is the system that reads these files not recognizing that syntax? If you use Python's csv module, it will handle the proper escaping:
import csv
with open('output.csv', 'w', newline='') as f:
    # quoting is an option of the writer, not of writerows()
    writer = csv.writer(f, quoting=csv.QUOTE_ALL)
    writer.writerows(myIterable)
The quoted strings would look like:
"string1","string 2, with, commas"
Note if you have a quote character within your string it will be written as "" (two quote chars in a row):
"string1","string 2, with, commas, and "" a quote"

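To make the quoting behavior concrete, here is a minimal sketch that writes the string from the question as a single field and reads it back. The in-memory buffer and the second field "plain" are assumptions added for illustration; they are not part of the original question.

```python
import csv
import io

string = "WORD;WORD 45,90;WORD 45,90;END;"

buffer = io.StringIO()
writer = csv.writer(buffer)         # comma delimiter, QUOTE_MINIMAL by default
writer.writerow([string, "plain"])  # "plain" is a made-up second field
written = buffer.getvalue()
print(written)

# Reading the line back yields the original two fields, commas intact.
fields = next(csv.reader(io.StringIO(written, newline="")))
print(fields)
```

If the file is actually semicolon-delimited, passing delimiter=';' to both csv.writer and csv.reader would be the analogous setup, and the semicolon-containing fields would be the ones that get quoted.
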
Python CSV Reader splitting on comma inside of quotes

import csv
csv_reader_results = csv.reader(["办公室弥漫着\"女红\"缝扣子.编蝴蝶结..手绣花...呵呵..原来做 些也会有幸福的感觉,,,,用心做东西感觉真好!!!"],
                                escapechar='\\',
                                quotechar='"',
                                delimiter=',',
                                quoting=csv.QUOTE_ALL,
                                skipinitialspace=True)
for result in csv_reader_results:
    print(result[0])
What I'm expecting is:
办公室弥漫着"女红"缝扣子.编蝴蝶结..手绣花...呵呵..原来做 些也会有幸福的感觉,,,,用心做东西感觉真好!!!
But what I'm getting is:
办公室弥漫着"女红"缝扣子.编蝴蝶结..手绣花...呵呵..原来做 些也会有幸福的感觉
Because it splits on the four commas inside the sentence.
I'm escaping the quotes inside of the sentence. I've set the quotechar and escapechar for csv.reader. What am I doing wrong here?
Edit:
I used the answer by j6m8 https://stackoverflow.com/a/19881343/3945463 as a workaround. But it would be preferable to learn the correct way to do this with csv reader.
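One possible reading of this behavior, offered as a sketch and not as a confirmed answer: inside a normal Python string literal, \" produces a bare quote, so no backslash ever reaches the reader, and the field itself is not wrapped in quotes, so its commas are treated as delimiters. The example below assumes the raw CSV text really contains a quoted field with backslash-escaped inner quotes; the shortened English sample text is made up for illustration.

```python
import csv

# Hypothetical raw line: the whole field is quoted and the inner quotes are
# preceded by real backslashes (note the raw string literal).
raw_line = r'"office \"needlework\" buttons,,,,made with care!!!"'

rows = list(csv.reader([raw_line], escapechar="\\", quotechar='"', delimiter=","))
field = rows[0][0]
print(field)
```

With the field quoted and the backslashes actually present in the data, the reader should return a single field that still contains the four commas.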

csv module automatically writing unwanted carriage returns

When using Python's csv module to create a CSV file, it automatically puts carriage return characters at the end of strings that contain a comma, e.g.:
['this one will have a carriage return, at the end','this one wont']
In an Excel sheet this turns out like:
| |this on|
because of the extra carriage return. It also surrounds the string containing the comma with double quotes, as expected.
The code I am using is:
with open(oldfile, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for row in data:
        writer.writerow(row)
How do I create a CSV file from the same data that won't have carriage returns when a string contains a comma? I don't mind the strings being surrounded by double quotes, though.
Here's a link to the diagnosis of the problem with the output .csv:
Excel showing empty cells when importing file created with csv module
It's the accepted answer.
I have changed my code to:
with open(oldfile, 'w', newline='', quoting=csv.QUOTE_MINIMAL) as csvfile:
    writer = csv.writer(csvfile)
    for row in data:
        writer.writerow(row)
I am now getting the error:
TypeError: 'quoting' is an invalid keyword argument for this function
Python's built-in csv module has the option csv.QUOTE_MINIMAL. When this option is passed as an argument to the writer (not to open()), it adds quotation marks when the delimiter appears in the given string: "your text, with comma", "other field". This will eliminate the need for carriage returns.
The code is:
with open(oldfile, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL)
    for row in data:
        writer.writerow(row)
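A small sketch of that arrangement, with quoting passed to csv.writer and newline='' passed to open(). The temporary directory, the file name example.csv, and the sample row are assumptions for illustration. Reading the file back in binary mode makes it possible to check which bytes were actually written.

```python
import csv
import os
import tempfile

data = [["this one has a comma, inside", "this one wont"]]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.csv")  # hypothetical file name
    # quoting belongs to csv.writer; newline="" belongs to open()
    with open(path, "w", newline="") as csvfile:
        writer = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL)
        for row in data:
            writer.writerow(row)
    # Inspect the raw bytes to see exactly which line endings were written.
    with open(path, "rb") as raw:
        raw_bytes = raw.read()

print(raw_bytes)
```

The only carriage return here is part of the \r\n row terminator, which is the writer's default lineterminator; a different terminator can be requested with the lineterminator argument of csv.writer.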

Python CSV writer to quote strings with extra spaces

My data looks something like this:
data = [
[" trailing space", 19, 100],
[" ", 19, 100],
]
writer = csv.writer(csv_filename, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
Output
trailing space,19,100
,19,100
What I want
" trailing space",19,100
" ",19,100
Python's default CSV writer has the QUOTE_MINIMAL option, but it does not quote strings just because they contain extra spaces. In my case those spaces are critical, but without quoting, a reader such as LibreOffice strips them.
Are there any built-in options, or a quick, cheap way, to tell the writer to quote strings that are empty apart from spaces or that have extra spaces?
Also, QUOTE_NONNUMERIC quotes too much. The actual data is huge (a few hundred megabytes, 60% to 70% of it strings). It may sound silly, but I'm trying to reduce the CSV size by minimizing the quotes.
It's a bit of a hack, but one way of achieving this could be
df.to_csv(quoting=csv.QUOTE_MINIMAL, escapechar=' ')
It's not documented, but QUOTE_MINIMAL seems to quote fields containing the escapechar, although the escapechar itself has no effect here (as quoting is not QUOTE_NONE and doublequote is True by default). Because this is undocumented behavior, it may differ between Python versions.
Why not just use QUOTE_NONNUMERIC? That'll quote all strings, not just those with spaces, but it'll certainly quote those too.
with open("quote.csv", "w", newline="") as fp:
    writer = csv.writer(fp, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerows(data)
gives me
(3.5.1) dsm#notebook:~/coding$ cat quote.csv
" leading space",19,100
" ",19,100
Have you tried the csv writer in Python with custom quoting?
Just make sure you know what you are quoting, and take care to manually escape things.
Try removing the quoting altogether. That will keep all quote characters as required.
writer = csv.writer(csv_filename, delimiter=',', quoting=csv.QUOTE_NONE)
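None of the built-in quoting modes mentioned above quotes only the whitespace-sensitive strings. As a rough sketch of a hand-rolled alternative (this is not a feature of the csv module; format_field and format_row are hypothetical helper names, and the third row is made up), each field can be quoted when it has leading or trailing whitespace or contains a character that QUOTE_MINIMAL would also quote.

```python
import csv
import io


def format_field(value, delimiter=",", quotechar='"'):
    """Quote a field only if it has edge whitespace or special characters."""
    text = "" if value is None else str(value)
    needs_quotes = text != text.strip() or any(
        ch in text for ch in (delimiter, quotechar, "\r", "\n")
    )
    if needs_quotes:
        # Double any embedded quote characters, as the csv module does by default.
        text = quotechar + text.replace(quotechar, quotechar * 2) + quotechar
    return text


def format_row(row, delimiter=","):
    """Join the formatted fields and terminate the row with CRLF."""
    return delimiter.join(format_field(v, delimiter) for v in row) + "\r\n"


data = [[" leading space", 19, 100], [" ", 19, 100], ["plain", 19, 100]]
output = "".join(format_row(row) for row in data)
print(output)

# The standard reader should still parse the result and keep the spaces.
parsed = list(csv.reader(io.StringIO(output, newline="")))
```

This keeps the quote count low, at the cost of bypassing csv.writer entirely, so any other dialect options would have to be handled by hand.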
