Python unwanted empty lines after string strip - python

I have a csv
"AA","AB","AC"
"BA","BB","BC"
"CA","CB","CC"
After removing a character, say ", the CSV gains an empty line after every row:
AA,AB,AC

BA,BB,BC

CA,CB,CC

What should I do to avoid the unwanted lines?
import fileinput
for line in fileinput.FileInput("test.csv", inplace=1):
    line = line.replace('"', '')
    print(line)

Looks like you're printing it, looks like Python 3, and looks like your file content already includes the necessary newlines. Therefore, you need to tell the print() function not to add its own newlines:
print(line, end='')

When read, each line includes its terminating newline character. print() also adds a newline of its own, so you end up with two newlines.
But you are not using strip() as suggested by your question's title.
To get around that you can use rstrip() to remove any whitespace at the end of each line:
import fileinput
for line in fileinput.FileInput("test.csv", inplace=1):
    line = line.replace('"', '').rstrip()
    print(line)
That will get rid of the extra new line characters, but note that it will also remove other whitespace at the end of the line.
An alternative is to prevent print() adding its own new line:
Python 2:
print(line), # comma prevents new line
Python 3:
print(line, end='')
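
A minimal sketch of the end='' approach for Python 3, wrapped in a function so it can be tried on a scratch file. The function name strip_quotes_in_place and the temporary sample file are assumptions made for illustration; they are not part of the question.

```python
import fileinput
import os
import tempfile


def strip_quotes_in_place(path):
    """Remove every double quote from the file at `path`, editing it in place."""
    # With inplace=True, fileinput redirects print() output into the file.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            # `line` still ends with its own '\n', so print() must not add one.
            print(line.replace('"', ''), end='')


# Try it on a throwaway copy of the sample data.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "test.csv")
    with open(sample, "w") as f:
        f.write('"AA","AB","AC"\n"BA","BB","BC"\n"CA","CB","CC"\n')
    strip_quotes_in_place(sample)
    with open(sample) as f:
        result = f.read()

print(result, end='')
```

Unlike rstrip(), this leaves any other trailing whitespace on the line untouched.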

Why are you doing this by hand? You should use the csv module; it handles both the commas and the quotes for you. Example:
import csv
import fileinput
with fileinput.FileInput('test.csv', inplace=1) as f:
    reader = csv.reader(f)
    for row in reader:
        print(','.join(row))
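Example/Demo (without inplace, so the rows are printed to the console instead of being written back to the file):
>>> import csv
>>> import fileinput
>>> with fileinput.FileInput('test.csv') as f:
...     reader = csv.reader(f)
...     for row in reader:
...         print(','.join(row))
...
AA,AB,AC
BA,BB,BC
CA,CB,CC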
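
One reason to prefer the csv module over replace('"', '') is that a quoted field may contain a comma, in which case the quotes are needed. The sketch below re-emits CSV text through csv.writer, which keeps quotes only where a field requires them. The function name unquote_csv and the "B,B" field are made up for illustration.

```python
import csv
import io


def unquote_csv(text):
    """Re-emit CSV `text` with quoting applied only where a field needs it."""
    reader = csv.reader(io.StringIO(text, newline=''))
    out = io.StringIO(newline='')
    # QUOTE_MINIMAL (the default) quotes a field only if it contains a
    # delimiter, a quote character or a line break.
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator='\n')
    writer.writerows(reader)
    return out.getvalue()


print(unquote_csv('"AA","AB","AC"\n"BA","B,B","BC"\n'), end='')
```

Here the second row keeps the quotes around B,B, so the file still has three columns per row.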

You are seeing extra lines because the lines read from the file end with '\n' and the print(line) statement appends an extra newline.
You can use rstrip() to strip out the trailing newline:
import fileinput
for line in fileinput.FileInput("test.csv", inplace=1):
    line = line.rstrip().replace('"', '')
    print(line)

Related

How to remove spaces from file to make one long line?

I want to read a file and remove the spaces. I swear I've done this multiple times, but for some reason the method I used to use doesn't seem to be working. I must be making some small mistake somewhere, so I decided to make a small practice file (because the files I actually need to use are EXTREMELY LARGE) to find out.
the original file says:
abcdefg
(new line)
hijklmn
but I want it to say:
abcdefghijklmn
file = open('please work.txt', 'r')
for line in file:
    lines = line.strip()
    print(lines)
file.close()
However, it just says:
abcdefg
(new line)
hijklmn
and when I use line.strip('\n') it says:
abcdefg
(big new line)
hijklmn
Any help will be greatly appreciated, because this was the first thing I learned and suddenly I can't remember how to use it!
If what you want to do is to concatenate each line into a single line, you could utilize rstrip and concatenate to a result variable:
with open('test.txt', 'r') as fin:
    lines = ''
    for line in fin:
        stripped_line = line.rstrip()
        lines += stripped_line
    print(lines)
From a text file looking like this:
abcdefg hijklmnop
this is a line
The result would be abcdefg hijklmnopthis is a line. If you also want to remove the spaces, you could call lines = lines.replace(' ', '') after the loop, which would result in abcdefghijklmnopthisisaline.
The (new line) in your output comes from print, which appends a \n after each call. You can use print(lines, end='') to suppress it.
strip() only removes leading and trailing whitespace.
You can use string.replace(' ', '') to remove all spaces.
'abcdefg (new line) hijklmn'.replace(' ', '')
If your file contains tabs, newlines or other kinds of whitespace, the above will not work, and you will need a regex to remove every kind of whitespace in the file.
import re
string = '''this is a \n
test \t\t\t
\r
\v
'''
re.sub(r'\s', '', string)
#'thisisatest'
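
As an alternative to the regex, here is a small sketch that works line by line, which may suit large files better than loading everything into one string first. The helper name join_without_whitespace is hypothetical, and io.StringIO stands in for a real file.

```python
import io


def join_without_whitespace(lines):
    """Concatenate an iterable of lines, dropping every whitespace character."""
    # str.split() with no argument splits on any run of whitespace
    # (spaces, tabs, '\n', '\r', ...), so joining the pieces removes it all.
    return ''.join(''.join(line.split()) for line in lines)


# A file object is an iterable of lines; io.StringIO stands in for one here.
fake_file = io.StringIO('abcdefg\n\nhijklmn\n')
print(join_without_whitespace(fake_file))
```

For a very large file, the cleaned pieces could be written to an output file one at a time instead of being joined in memory.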

How do I remove the "\n" characters when reading my file but deleting one variable then replacing it

I am trying to make a program that reads from a file and deletes one specific line inside of it and then puts all the data stored back to the file separated with a new line. The file uses this format:
Jones|20|20|00
bob|30|19|90
James|40|19|80
So I want to delete (backup contains this and is the line I want to delete)
bob|30|19|90
However, the code I am using strips the newline without putting it back, and when I add \n myself the file no longer reads correctly because it ends up with two "\n"s:
Jones|20|20|00

James|40|19|80
I am using this code below:
def deleteccsaver(backup):
    lockaccount = ""
    lockaccount = lockaccount.strip("\n")
    with open('accounts_project.txt', 'r+') as f:
        newline = []
        for line in f.readlines():
            newline.append(line.replace(backup, lockaccount).strip("\n"))
    with open('accounts_project.txt', 'w+') as f:
        for line in newline:
            f.writelines(line + "\n")
        f.close()
    resetlogin()
Please help, as I don't know how to add the \n back without it appearing as "\n\n".
Without the "\n" it appears as:
Jones|20|20|00James|40|19|80
Any suggestions?
Here I read the entire file at once; please don't do this if the file is very big. After reading the contents, I build a list from them using "\n" as the delimiter (see the split function in Python for details). In that list I replace backup with lockaccount, as your code does; these are your variable names, and I hope I have not mixed them up. The list is then joined with a newline between elements, that is, between the lines of the original file, and written back. The result has the same contents as before, minus what you wanted to remove.
Note that lockaccount is itself an empty string, so putting it in the list leaves an empty line in the file. If you don't want lockaccount to take the place of backup, remove backup from the list with contents.remove(backup) instead of contents[contents.index(backup)] = lockaccount, keeping the rest of the code the same. Hope this explains it better.
def deleteccsaver(backup):
    lockaccount = ""
    lockaccount = lockaccount.strip("\n")
    with open('accounts_project.txt', 'r+') as f:
        contents = f.read().split("\n")
    if backup in contents:
        contents[contents.index(backup)] = lockaccount
    new_contents = "\n".join(contents)
    with open('accounts_project.txt', 'w+') as f:
        f.write(new_contents)
    resetlogin()
You are printing a newline character after each element in the list. So if you replace a line with the empty string, you get an empty line.
Try simply skipping the line you want to delete:
if line == backup:
    continue
else:
    lines.append(...)
PS. There is room for improvement in the code above, but I'm on my phone; I will get back with an edit later if nobody gets ahead of me.
You can try to add newline = '\n'.join(newline) after your first for loop and then just write it into the accounts_project.txt file without a loop.
The code should then look like:
def deleteccsaver(backup):
    lockaccount = ""
    lockaccount = lockaccount.strip("\n")
    with open('accounts_project.txt', 'r+') as f:
        newline = []
        for line in f.readlines():
            newline.append(line.replace(backup, lockaccount).strip("\n"))
    newline = '\n'.join(newline)
    with open('accounts_project.txt', 'w+') as f:
        f.write(newline)
        f.close()  # you don't necessarily need it inside a with statement
    resetlogin()
Edit:
The code above still results in
Jones|20|20|00

James|40|19|80
as output.
That's because during the replacement loop an empty string will be appended to newline (like newline: ['Jones|20|20|00','','James|40|19|80']) and newline = '\n'.join(newline) will then result in 'Jones|20|20|00\n\nJames|40|19|80'.
A possible fix can be to replace:
for line in f.readlines():
    newline.append(line.replace(backup, lockaccount).strip("\n"))
with
for line in f.readlines():
    line = line.strip('\n')
    if line != backup:
        newline.append(line)
def deleteccsaver(backup):
    lockaccount = ""
    lockaccount = lockaccount.strip("\n")
    with open('accounts_project.txt', 'r+') as f:
        contents = f.read().split("\n")
    if backup in contents:
        contents.remove(backup)
    new_contents = "\n".join(contents)
    with open('accounts_project.txt', 'w+') as f:
        f.write(new_contents)
    resetlogin()
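
The "skip the line instead of replacing it" idea from the answers can be sketched as follows. It compares each line with its newline removed but keeps the original line, newline included, so nothing has to be added back when writing. The function name delete_record and the temporary file are assumptions for illustration; resetlogin() from the question is left out because its behavior is not shown.

```python
import os
import tempfile


def delete_record(path, record):
    """Rewrite the file at `path` without any line equal to `record`."""
    with open(path) as f:
        # Compare with the newline removed, but keep the original line
        # (newline included) so nothing has to be added back on write.
        kept = [line for line in f if line.rstrip('\n') != record]
    with open(path, 'w') as f:
        f.writelines(kept)


with tempfile.TemporaryDirectory() as tmp:
    accounts = os.path.join(tmp, 'accounts_project.txt')
    with open(accounts, 'w') as f:
        f.write('Jones|20|20|00\nbob|30|19|90\nJames|40|19|80\n')
    delete_record(accounts, 'bob|30|19|90')
    with open(accounts) as f:
        remaining = f.read()

print(remaining, end='')
```

Because the kept lines are written back unchanged, no empty line and no doubled "\n" can appear.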

Removing lines from a csv with Python also adds an extra line

This code, borrowed from another place on stackoverflow, removes all of the places that the csv has "None" written in it. However, it also adds an extra line to the csv. How can I change this code to remove that extra line? I think the problem is caused by inplace, but when I take inplace away the file is no longer altered by running the code.
def cleanOutputFile(filename):
    for line in fileinput.FileInput(filename, inplace=1):
        line = line.replace('None', "")
        print line
Thanks!
If you want to replace all the None's:
with open(filename) as f:
    lines = f.read().replace("None", "")
with open(filename, "w") as f1:
    f1.write(lines)
Using rstrip with fileinput should also work:
import fileinput
for line in fileinput.FileInput(filename, inplace=1):
    print line.replace('None', "").rstrip()  # remove newline character to avoid adding extra lines in the output
The problem here has nothing to do with fileinput, or with the replace.
Lines read from a file always end in a newline.
print adds a newline, even if the thing you're printing already ends with a newline.
You can see this without even a file involved:
>>> a = 'abc'
>>> print a
abc
>>> a = 'abc\n'
>>> print a
abc

>>>
The solution is any of the following:
rstrip the newlines off the input: print line.rstrip('\n') (or do the strip earlier in your processing)
Use the "magic comma" to prevent print from adding the newline: print line,
Use Python 3-style print with from __future__ import print_function, so you can use the more flexible keyword arguments: print(line, end='')
Use sys.stdout.write instead of print.
Completely reorganize your code so you're no longer writing to stdout at all, but instead writing directly to a temporary file, or reading the whole file into memory and then writing it back out, etc.
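
The options listed above can be compared side by side. The sketch below uses Python 3 syntax (the print function rather than the Python 2 print statement) and writes into an in-memory buffer so the exact characters produced are visible. The function name emit and the strategy labels are made up for this illustration.

```python
import io


def emit(line, how):
    """Return exactly what would be written for `line` using strategy `how`."""
    out = io.StringIO()
    if how == 'plain':
        print(line, file=out)               # adds a second newline
    elif how == 'rstrip':
        print(line.rstrip('\n'), file=out)  # strip first, let print add one
    elif how == 'end':
        print(line, end='', file=out)       # tell print not to add one
    elif how == 'write':
        out.write(line)                     # write() never adds anything
    return out.getvalue()


for how in ('plain', 'rstrip', 'end', 'write'):
    print(how, repr(emit('abc\n', how)))
```

Only the 'plain' strategy yields 'abc\n\n'; the other three yield 'abc\n'.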

How can I remove carriage return from a text file with Python?

The things I've googled haven't worked, so I'm turning to experts!
I have some text in a tab-delimited text file that has some sort of carriage return in it (when I open it in Notepad++ and use "show all characters", I see [CR][LF] at the end of the line). I need to remove this carriage return (or whatever it is), but I can't seem to figure it out. Here's a snippet of the text file showing a line with the carriage return:
firstcolumn secondcolumn third fourth fifth sixth seventh
moreoftheseventh 8th 9th 10th 11th 12th 13th
Here's the code I'm trying to use to replace it, but it's not finding the return:
with open(infile, "r") as f:
    for line in f:
        if "\n" in line:
            line = line.replace("\n", " ")
My script just doesn't find the carriage return. Am I doing something wrong or making an incorrect assumption about this carriage return? I could just remove it manually in a text editor, but there are about 5000 records in the text file that may also contain this issue.
Further information:
The goal here is select two columns from the text file, so I split on \t characters and refer to the values as parts of an array. It works on any line without the returns, but fails on the lines with the returns because, for example, there is no element 9 in those lines.
vals = line.split("\t")
print(vals[0] + " " + vals[9])
So, for the line of text above, this code fails because there is no index 9 in that particular array. For lines of text that don't have the [CR][LF], it works as expected.
Depending on the type of file (and the OS it comes from, etc.), your line ending might be '\r', '\n', or '\r\n'. The best way to get rid of them regardless of which one they are is to use line.rstrip().
with open(infile, "r") as f:
    for line in f:
        line = line.rstrip()  # strip out all trailing whitespace
If you want to get rid of ONLY the line-ending characters and not any other whitespace that might be at the end, you can supply the optional argument to rstrip:
with open(infile, "r") as f:
    for line in f:
        line = line.rstrip('\r\n')  # strip only trailing '\r' and '\n' characters
Hope this helps
Here's how to remove carriage returns without using a temporary file:
with open(file_name, 'r') as file:
    content = file.read()
with open(file_name, 'w', newline='\n') as file:
    file.write(content)
Python opens files in so-called universal newline mode, so newlines are always \n.
Python is usually built with universal newlines support; supplying 'U'
opens the file as a text file, but lines may be terminated by any of
the following: the Unix end-of-line convention '\n', the Macintosh
convention '\r', or the Windows convention '\r\n'. All of these
external representations are seen as '\n' by the Python program.
You iterate through the file line by line and replace \n in each line. With universal newlines, each line you get ends with a single \n (any \r\n has already been translated), but the replacement only rebinds the local variable line, so the change is lost unless you store or use line afterwards.
You can instead read the whole file with f.read() and then replace \n in the result.
with open(infile, "r") as f:
    content = f.read()
    content = content.replace('\n', ' ')
    # do something with content
Technically, there is an answer!
with open(filetoread, "rb") as inf:
    with open(filetowrite, "w") as fixed:
        for line in inf:
            fixed.write(line)
The b in open(filetoread, "rb") apparently opens the file in such a way that I can access those line breaks and remove them. This answer actually came from Stack Overflow user Kenneth Reitz off the site.
Thanks everyone!
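I've written code to do it, and it works:
end1 = r'C:\...\file1.txt'  # raw strings, so backslashes are not treated as escapes
end2 = r'C:\...\file2.txt'
# Python 2 code; in Python 3, "rb" yields bytes, so use b"\n" and b"\r" and open end2 with "wb".
with open(end1, "rb") as inf:
    with open(end2, "w") as fixed:
        for line in inf:
            line = line.replace("\n", "")
            line = line.replace("\r", "")
            fixed.write(line)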
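
In Python 3, one way to actually see the [CR][LF] pair is to open the file with newline='', which turns off newline translation. The sketch below shows that and then strips only the line-ending characters. The function names and the sample file contents are assumptions for illustration.

```python
import os
import tempfile


def read_lines_raw(path):
    """Return the lines of `path` with their original line endings intact."""
    # newline='' turns off newline translation, so '\r\n' stays visible.
    with open(path, newline='') as f:
        return list(f)


def strip_line_endings(lines):
    """Remove trailing '\\r' and '\\n' characters, keeping other whitespace."""
    return [line.rstrip('\r\n') for line in lines]


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'sample.txt')
    # Write bytes so the CR LF pair lands in the file on every platform.
    with open(sample, 'wb') as f:
        f.write(b'first\tsecond \r\nthird\tfourth\r\n')
    raw = read_lines_raw(sample)

print(raw)
print(strip_line_endings(raw))
```

The trailing space after "second" survives, which is the difference between rstrip('\r\n') and a bare rstrip().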

Writing to file with unwanted empty lines

I have a piece of code that's removing some unwanted lines from a text file and writing the results to a new one:
f = open('messyParamsList.txt')
g = open('cleanerParamsList.txt', 'w')
for line in f:
    if not line.startswith('W'):
        g.write('%s\n' % line)
The original file is single-spaced, but the new file has an empty line between each line of text. How can I lose the empty lines?
You're not removing the newline from the input lines, so you shouldn't be adding one (\n) on output.
Either strip the newlines off the lines you read or don't add new ones as you write it out.
Just do:
f = open('messyParamsList.txt')
g = open('cleanerParamsList.txt', 'w')
for line in f:
    if not line.startswith('W'):
        g.write(line)
Every line that you read from original file has \n (new line) character at the end, so do not add another one (right now you are adding one, which means you actually introduce empty lines).
My guess is that the variable "line" already has a newline in it, but you're writing an additional newline with g.write('%s\n' % line).
line has a newline at the end.
Remove the \n from your write, or rstrip line.
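
A small sketch of the "write the line unchanged" fix, using in-memory buffers in place of the two files so it can be run anywhere. The function name copy_without_prefix and the sample lines are hypothetical.

```python
import io


def copy_without_prefix(src, dst, prefix='W'):
    """Copy lines from `src` to `dst`, skipping lines that start with `prefix`."""
    for line in src:
        if not line.startswith(prefix):
            # `line` already ends with '\n'; write it unchanged.
            dst.write(line)


source = io.StringIO('alpha\nWarning: skip me\nbeta\n')
target = io.StringIO()
copy_without_prefix(source, target)
print(target.getvalue(), end='')
```

With real files, the same function could be called inside a with block that opens both files, which also takes care of closing them.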
