Write newline to csv in Python - python

I want to end each iteration of a for loop by writing a new line of content (including a newline) to a csv file. I have this:
# Set up an output csv file with column headers
with open('outfile.csv','w') as f:
    f.write("title; post")
    f.write("\n")
This does not appear to write an actual \n (newline) to the file. Further:
# Concatenate into a row to write to the output csv file
csv_line = topic_title + ";" + thread_post
with open('outfile.csv','w') as outfile:
    outfile.write(csv_line + "\n")
This, also, does not move the cursor in the outfile to the next line. Each new line, with every iteration of the loop, just overwrites the most recent one.
I also tried outfile.write(os.linesep), but that did not work either.

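Change 'w' to 'a'. Mode 'w' truncates the file every time it is opened, so each iteration overwrites the previous line; mode 'a' appends to the end instead:
with open('outfile.csv','a') as outfile:
    outfile.write(csv_line + "\n")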
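The difference between the two modes can be seen in a small sketch. The helper name write_rows, the file names and the sample rows are made up for illustration, and the files are written to a temporary directory rather than next to your script.

```python
import os
import tempfile

def write_rows(path, rows, mode):
    """Reopen `path` for every row with the given mode, as the question does."""
    for title, post in rows:
        with open(path, mode) as f:
            f.write(title + ";" + post + "\n")

rows = [("t1", "first post"), ("t2", "second post")]  # hypothetical data

with tempfile.TemporaryDirectory() as tmp:
    overwritten = os.path.join(tmp, "w_mode.csv")
    appended = os.path.join(tmp, "a_mode.csv")

    write_rows(overwritten, rows, "w")  # every open() truncates the file
    write_rows(appended, rows, "a")     # every open() adds to the end

    with open(overwritten) as f:
        w_result = f.read()
    with open(appended) as f:
        a_result = f.read()

print(repr(w_result))  # only the last row survives
print(repr(a_result))  # both rows are present
```

The newline was being written all along; it was the repeated open(..., 'w') that discarded the earlier rows.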

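import csv
with open('outfile.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(...)
Alternatively, set the line terminator explicitly (csv.writer takes an open file object, not a file name):
writer = csv.writer(f, lineterminator='\n')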

I ran into the same problem; this was all I needed (f is the open file object):
writer = csv.writer(f, lineterminator='\n')

If you're using Python 2, then use:
with open('xxx.csv', 'a') as f:
    writer = csv.writer(f)
    writer.writerow(fields)
If you are using Python 3, also pass newline='' to open().

Please try with:
with open('output_file_name.csv', 'a+', newline='') as f:
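Putting these suggestions together for Python 3, a minimal sketch might open the file once, outside the loop, and let csv.writer take care of the separator and the line ending. The function name write_posts and the sample data are hypothetical; the ';' delimiter and the column headers come from the question.

```python
import csv
import os
import tempfile

def write_posts(path, posts):
    """Write a header and one row per (title, post) pair, separated by ';'."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=";", lineterminator="\n")
        writer.writerow(["title", "post"])
        for title, post in posts:
            writer.writerow([title, post])

posts = [("Topic A", "hello"), ("Topic B", "semi;colon")]  # hypothetical data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "outfile.csv")
    write_posts(path, posts)
    with open(path, newline="") as f:
        content = f.read()

print(content)
```

A side benefit of going through csv.writer is that a post containing the delimiter is quoted automatically instead of silently producing an extra column.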

Related

CSV Writer (Python) with CRLF instead of LF

Hi, I am trying to use the csv library to convert my CSV file into a new one.
The code that I wrote is the following:
import csv
import re
file_read=r'C:\Users\Comarch\Desktop\Test.csv'
file_write=r'C:\Users\Comarch\Desktop\Test_new.csv'
def find_txt_in_parentheses(cell_txt):
    pattern = r'\(.+\)'
    return set(re.findall(pattern, cell_txt))
with open(file_write, 'w', encoding='utf-8-sig') as file_w:
    csv_writer = csv.writer(file_w, lineterminator="\n")
    with open(file_read, 'r',encoding='utf-8-sig') as file_r:
        csv_reader = csv.reader(file_r)
        for row in csv_reader:
            cell_txt = row[0]
            txt_in_parentheses = find_txt_in_parentheses(cell_txt)
            if len(txt_in_parentheses) == 1:
                txt_in_parentheses = txt_in_parentheses.pop()
                cell_txt_new = cell_txt.replace(' ' + txt_in_parentheses,'')
                cell_txt_new = txt_in_parentheses + '\n' + cell_txt_new
                row[0] = cell_txt_new
            csv_writer.writerow(row)
The only problem is that in the resulting file (Test_new.csv file), I have CRLF instead of LF.
[Image: the read file on the left, the write file on the right]
As a result, when I copy the csv column into a Google Docs spreadsheet, I get a blank line after each row with CRLF.
Is it possible to write my code using the csv library so that LF, rather than CRLF, is kept inside a cell?
From the documentation of csv.reader:
If csvfile is a file object, it should be opened with newline=''. [1]
[...]
Footnotes
[1] If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
This is precisely the issue you're seeing. So...
with open(file_read, 'r', encoding='utf-8-sig', newline='') as file_r, \
     open(file_write, 'w', encoding='utf-8-sig', newline='') as file_w:
    csv_reader = csv.reader(file_r, dialect='excel')
    csv_writer = csv.writer(file_w, dialect='excel')
    # ...
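You are on Windows, and you open the file with mode 'w', which gives you Windows-style line endings. In Python 2, using mode 'wb' should give you the preferred behaviour; in Python 3, where the csv module needs a text-mode file, pass newline='' to open() instead.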
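To check the effect yourself, a small sketch can write one row with an embedded LF and then inspect the raw bytes. The row contents are invented for illustration, and the file goes to a temporary directory; plain utf-8 is used here so that no BOM shows up in the byte string.

```python
import csv
import os
import tempfile

row = ["(note)\ncell text", "second column"]  # first cell holds an embedded LF

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Test_new.csv")
    # newline='' stops the text layer from translating '\n' into '\r\n' on Windows
    with open(path, "w", encoding="utf-8", newline="") as file_w:
        csv.writer(file_w, lineterminator="\n").writerow(row)
    with open(path, "rb") as f:
        raw = f.read()

print(raw)
```

With newline='' the embedded LF reaches the file untouched, and the row terminator is exactly what lineterminator asks for.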

Write CSV Python

I want to write a csv file (outfile) from another csv file (infile). The data in the infile looks like this: OF0A0C,00,D0,0F11AFCB. I want the outfile to contain the same as the infile, but I get this instead: "\r \n 0,F,0,A,0,C,","0,0,","D,0,","0,F,1,1,A,F,C,B \r \n
My code looks like this:
with open ("from_baryon.csv", "r") as inFile:
    with open (self.filename, "a") as outFile:
        for line in inFile:
            OutFile = csv.writer (outFile)
            OutFile.writerow (line)
After writing, I want to save the data of every row to a list like this: Data = [[length_of_all_data],[length_data_row_1,datarow1],[length_data_row_2,datarow1datarow2],[length_data_row_3,datarow1datarow3]]
I am confused about how to save the data in a list like that. Thank you.
A few issues:
You should read the input csv file using the csv module's csv.reader() instead of iterating over its lines. When you iterate over the file (for line in inFile:), each line comes back as a string, and OutFile.writerow(line) then treats that string as a sequence of fields, so it writes each character into a separate column.
You do not need to create a separate OutFile = csv.writer(outFile) for every line.
Example code:
with open ("from_baryon.csv", "r") as inFile:
    with open (self.filename, "a") as outFile:
        out_file = csv.writer (outFile)
        in_reader = csv.reader(inFile)
        for row in in_reader:
            out_file.writerow(row)
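The character-per-column effect is easy to reproduce in memory. This is only a sketch: the sample line is taken from the question, and io.StringIO stands in for the real output file.

```python
import csv
import io

line = "OF0A0C,00,D0,0F11AFCB\n"  # one raw line, as yielded by `for line in inFile`

# Passing the string itself: every character becomes its own column.
buf_wrong = io.StringIO()
csv.writer(buf_wrong, lineterminator="\n").writerow(line)

# Parsing the line first: each field stays intact.
buf_right = io.StringIO()
row = next(csv.reader([line]))
csv.writer(buf_right, lineterminator="\n").writerow(row)

print(repr(buf_wrong.getvalue()))
print(repr(buf_right.getvalue()))
```

writerow() accepts any iterable of fields, and a string is an iterable of single characters, which is why no error is raised in the first case.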
EDIT: For the second issue that is updated, you can create a list and a counter to keep track of the complete length. Example -
with open ("from_baryon.csv", "r") as inFile:
    with open (self.filename, "a") as outFile:
        out_file = csv.writer (outFile)
        in_reader = csv.reader(inFile)
        data = []
        lencount = 0
        for row in in_reader:
            out_file.writerow(row)
            tlen = len(''.join(row))
            data.append([tlen] + row)
            lencount += tlen
        data.insert(0,[lencount])
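For a concrete picture of the resulting list, here is the same bookkeeping as a standalone sketch. The function name summarize_rows and the second sample line are hypothetical, and "length" is assumed to mean the number of characters in a row with the separators excluded, as in the code above.

```python
import csv

def summarize_rows(rows):
    """Return [[total_length], [row_length, *fields], ...] for parsed CSV rows."""
    data = []
    total = 0
    for row in rows:
        row_length = len("".join(row))  # characters in the row, separators excluded
        data.append([row_length] + row)
        total += row_length
    data.insert(0, [total])
    return data

sample = ["OF0A0C,00,D0,0F11AFCB", "AA,BB"]  # hypothetical input lines
result = summarize_rows(csv.reader(sample))
print(result)
```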

Pipe delimiter file, but no pipe inside data

Problem
I need to re-format a text from comma (,) separated values to pipe (|) separated values. Pipe characters within the values of the original (comma separated) text shall be replaced by a space for representation in the (pipe separated) result text.
The pipe separated result text shall be written back to the same file from which the original comma separated text has been read.
I am using python 2.6
Possible Solution
I could read the file first, replace all pipes in it with spaces, and then replace (,) with (|).
Is there a better way to achieve this?
Don't reinvent the value-separated file parsing wheel. Use the csv module to do the parsing and the writing for you.
The csv module will add "..." quotes around values that contain the separator, so in principle you don't need to replace the | pipe symbols in the values. To replace the original file, write to a new (temporary) outputfile then move that back into place.
import csv
import os
outputfile = inputfile + '.tmp'
with open(inputfile, 'rb') as inf, open(outputfile, 'wb') as outf:
    reader = csv.reader(inf)
    writer = csv.writer(outf, delimiter='|')
    writer.writerows(reader)
os.remove(inputfile)
os.rename(outputfile, inputfile)
For an input file containing:
foo,bar|baz,spam
this produces
foo|"bar|baz"|spam
Note that the middle column is wrapped in quotes.
If you do need to replace the | characters in the values, you can do so as you copy the rows:
outputfile = inputfile + '.tmp'
with open(inputfile, 'rb') as inf, open(outputfile, 'wb') as outf:
    reader = csv.reader(inf)
    writer = csv.writer(outf, delimiter='|')
    for row in reader:
        writer.writerow([col.replace('|', ' ') for col in row])
os.remove(inputfile)
os.rename(outputfile, inputfile)
Now the output for my example becomes:
foo|bar baz|spam
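The code above targets Python 2, as in the question. For readers on Python 3, a rough equivalent might look like the sketch below; the function name commas_to_pipes and the '.tmp' suffix are assumptions, and the files are opened in text mode with newline='' rather than in binary mode.

```python
import csv
import os
import tempfile

def commas_to_pipes(path):
    """Rewrite `path` in place as pipe-separated, replacing '|' in values by ' '."""
    tmp_path = path + ".tmp"
    with open(path, newline="") as inf, open(tmp_path, "w", newline="") as outf:
        writer = csv.writer(outf, delimiter="|")
        for row in csv.reader(inf):
            writer.writerow([col.replace("|", " ") for col in row])
    os.replace(tmp_path, path)  # moves the new file over the original

# Demonstration on a throwaway file holding the example line from above.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.csv")
    with open(path, "w", newline="") as f:
        f.write("foo,bar|baz,spam\r\n")
    commas_to_pipes(path)
    with open(path, newline="") as f:
        converted = f.read()

print(repr(converted))
```

os.replace() overwrites an existing destination, so the separate os.remove() step is not needed in this variant.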
Sounds like you're trying to work with a variation of CSV. In that case, Python's csv library may well be what you need. You can use it with custom delimiters and it will handle quoting for you automatically (this example was taken from the manual and modified):
import csv
with open('eggs.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter='|')
    spamwriter.writerow(['One', 'Two', 'Three'])
There are also ways to modify quoting and escaping and other options. Reading works similarly.
You can create a temporary file from the original that has the pipe characters replaced, and then replace the original file with it when the processing is done:
import csv
import tempfile
import os
filepath = 'C:/Path/InputFile.csv'
with open(filepath, 'rb') as fin:
    reader = csv.DictReader(fin)
    # Create the temporary file next to the original so the rename stays on one drive
    fout = tempfile.NamedTemporaryFile(dir=os.path.dirname(filepath),
                                       delete=False)
    temp_filepath = fout.name
    writer = csv.DictWriter(fout, reader.fieldnames, delimiter='|')
    # writer.writeheader() # requires Python 2.7
    header = dict(zip(reader.fieldnames, reader.fieldnames))
    writer.writerow(header)
    for row in reader:
        for k,v in row.items():
            row[k] = v.replace('|', ' ')
        writer.writerow(row)
    fout.close()
os.remove(filepath)
os.rename(temp_filepath, filepath)

Using multiple re.sub() calls in one file with Python

I have a file with a large number of random strings contained within it. There are certain patterns that I want to remove, so I decided to use regex to check for them. So far, this code does exactly what I want it to:
#!/usr/bin/python
import csv
import re
import sys
import pdb
f=open('output.csv', 'w')
with open('retweet.csv', 'rb') as inputfile:
    read=csv.reader(inputfile, delimiter=',')
    for row in read:
        f.write(re.sub(r'#\s\w+', ' ', row[0]))
        f.write("\n")
f.close()
f=open('output2.csv', 'w')
with open('output.csv', 'rb') as inputfile2:
    read2=csv.reader(inputfile2, delimiter='\n')
    for row in read2:
        a= re.sub('[^a-zA-Z0-9]', ' ', row[0])
        b= str.split(a)
        c= "+".join(b)
        f.write("http://www.google.com/webhp#q="+c+"&btnI\n")
f.close()
The problem is that I would like to avoid opening and closing a file for every pattern, as this can get messy if I need to check for more patterns. How can I perform multiple re.sub() calls on the same file and write it out to a new file with all substitutions applied?
Thanks for any help!
Apply all your substitutions in one go on the current line:
with open('retweet.csv', 'rb') as inputfile:
    read=csv.reader(inputfile, delimiter=',')
    for row in read:
        text = row[0]
        text = re.sub(r'#\s\w+', ' ', text)
        text = re.sub(another_expression, another_replacement, text)
        # etc.
        f.write(text + '\n')
Note that opening a file with csv.reader(..., delimiter='\n') sounds very much as if you are treating that file as a sequence of lines; you could just loop over the file:
with open('output.csv', 'rb') as inputfile2:
    for line in inputfile2:
        # ...
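If the list of patterns keeps growing, one option is to keep them in a table and loop over it. The sketch below is one way to arrange that, not the only one: the names SUBSTITUTIONS, clean and to_query_url are made up, while the two patterns and the URL format are taken from the question.

```python
import re

# (pattern, replacement) pairs, applied in order; both come from the question.
SUBSTITUTIONS = [
    (re.compile(r"#\s\w+"), " "),
    (re.compile(r"[^a-zA-Z0-9]"), " "),
]

def clean(text, substitutions=SUBSTITUTIONS):
    """Run every substitution over `text` and return the result."""
    for pattern, replacement in substitutions:
        text = pattern.sub(replacement, text)
    return text

def to_query_url(text):
    """Build the search URL from the question out of the cleaned text."""
    return "http://www.google.com/webhp#q=" + "+".join(clean(text).split()) + "&btnI"

print(to_query_url("hello # world, foo!"))
```

Adding another pattern then means adding one line to the table, and the intermediate output.csv file is no longer needed.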

Removing specific text from every line

I have a txt file with this format:
something text1 pm,bla1,bla1
something text2 pm,bla2,bla2
something text3 am,bla3,bla3
something text4 pm,bla4,bla4
and in a new file I want to hold:
bla1,bla1
bla2,bla2
bla3,bla3
bla4,bla4
I have this, which keeps, for example, the first 10 characters of every line. Can I adapt it, or is there a better approach?
with open('example1.txt', 'r') as input_handle:
    with open('example2.txt', 'w') as output_handle:
        for line in input_handle:
            output_handle.write(line[:10] + '\n')
This is what the csv module was made for.
import csv
reader = csv.reader(open('file.csv'))
for row in reader: print(row[1])
You can then just redirect the output of the script to the new file using your shell, or you can do something like this instead of the last line (the file is opened once, before the loop, so that later rows do not overwrite earlier ones):
with open('out.csv','w+') as f:
    for row in reader:
        f.write(row[1]+'\n')
To remove the first ","-separated column from the file:
first, sep, rest = line.partition(",")
if rest: # don't write lines with less than 2 columns
    output_handle.write(rest)
If the format is fixed:
with open('example1.txt', 'r') as input_handle:
    with open('example2.txt', 'w') as output_handle:
        for line in input_handle:
            line = line.rstrip('\n') # drop the newline so it is not written twice
            if line: # and maybe some other format check
                od = line.split(',', 1)
                output_handle.write(od[1] + "\n")
Here is how I would write it.
Python 2.7
import csv
with open('example1.txt', 'rb') as f_in, open('example2.txt', 'wb') as f_out:
    writer = csv.writer(f_out)
    for row in csv.reader(f_in):
        writer.writerow(row[-2:]) # keeps the last two columns
Python 3.x (note the differences in arguments to open)
import csv
with open('example1.txt', 'r', newline='') as f_in:
    with open('example2.txt', 'w', newline='') as f_out:
        writer = csv.writer(f_out)
        for row in csv.reader(f_in):
            writer.writerow(row[-2:]) # keeps the last two columns
Try:
output_handle.write(line.split(",", 1)[1])
From the docs:
str.split([sep[, maxsplit]])
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements).
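As a quick check of the split/partition approach against the sample data from the question, here is a small sketch. The generator name drop_first_column is made up, and the third sample line is added only to show what happens when a line has no comma.

```python
def drop_first_column(lines):
    """Yield each line without its first comma-separated column."""
    for line in lines:
        first, sep, rest = line.partition(",")
        if sep:  # skip lines that contain no comma at all
            yield rest

sample = [
    "something text1 pm,bla1,bla1\n",
    "something text2 pm,bla2,bla2\n",
    "no comma here\n",  # hypothetical malformed line
]
result = list(drop_first_column(sample))
print(result)
```

Because the trailing newline stays attached to rest, the yielded strings can be passed straight to output_handle.write() without adding one.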
