I want to collapse runs of whitespace to a single space but preserve one empty line as a separator in a file. I have tried the following code and it seems to work.
How can I do this without writing to the file twice?
I want to collect all my substitutions, maybe in a text file, and write them all at once.
import re

i = open('inputfile.txt', 'r')
infile = i.readlines()
o = open('outputfile.txt', 'w')
for line in infile:
    if line == '\n':
        o.write('\n\n')
    else:
        o.write(re.sub(r'\s+', ' ', line))
o.close()
i.close()
See my answer in this question here: Python save file to csv
I think the re.sub() replacement is tripping you up with the '\s' value. Just replace ' ' instead.
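i = open('inputfile.txt', 'r')
infile = i.readlines()
i.close()
o = open('outputfile.txt', 'w')
newoutputfile = ""
for line in infile:
    if line == '\n':
        newoutputfile += '\n\n'
    else:
        # split()/join() collapses whitespace; the trailing space keeps
        # consecutive lines from running together, as the re.sub() version did
        newoutputfile += ' '.join(line.split()) + ' '
o.write(newoutputfile)
o.close()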
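If the goal is a single write, one option is to build the result in memory and write it once at the end. The following is only a sketch under that assumption; the names collapse_whitespace and rewrite_file are made up for illustration and the behaviour mirrors the re.sub() loop from the question (a blank line becomes '\n\n', every other run of whitespace becomes one space).

```python
import re


def collapse_whitespace(lines):
    """Return one string with whitespace runs collapsed, blank lines kept."""
    parts = []
    for line in lines:
        if line == "\n":
            # an empty input line becomes the paragraph separator
            parts.append("\n\n")
        else:
            # every run of whitespace (including the newline) becomes one space
            parts.append(re.sub(r"\s+", " ", line))
    return "".join(parts)


def rewrite_file(src, dst):
    """Hypothetical helper: read src, transform in memory, write dst once."""
    with open(src) as f:
        result = collapse_whitespace(f)
    with open(dst, "w") as f:
        f.write(result)
```

Collecting the pieces in a list and joining them once avoids repeated string concatenation, and the output file is only touched by a single write() call.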
Related
I've written a script that appends a given character onto every line of a text file (I'm adding ',' to the end of IP addresses, one per row).
I want to prevent accidentally running the script multiple times and adding the same character to the end of each line more than once, i.e. adding one , is what I want; accidentally adding ten ,'s is annoying and I'll need to undo what I've done.
I'm trying to update the code to identify if the last character in a line is the same as the character that's trying to be added and if it is, not to add it.
This code adds char to the end of each line.
file = 'test.txt'  # file to append text to, keep the ''
char = ','
newf = ""
with open(file, 'r') as f:
    for line in f:
        newf += line.strip() + char + '\n'
with open(file, 'w') as f:
    f.write(newf)
And I've written this code to check what the last character of each line is; it successfully returns a ',' for each line. What I can't figure out is how to combine the two so that nothing is appended when char and check have the same value.
f = open("test.txt", "r")
check = ","
for line in f:
    l = line.strip()
    if l[-1:].isascii():
        check = l[-1:]
    else:
        check = 0
    print(check)
f.close()
Use the endswith() method to check whether the line already ends with ','.
check = ","
newf = ""
with open(file) as f:
    for line in f:
        line = line.strip()
        if not line.endswith(check):
            line += check
        newf += line + "\n"
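To show that the endswith() check makes the script safe to run repeatedly, here is a small sketch that works on a list of lines rather than a file. The function name append_once is hypothetical, and skipping blank lines is my own assumption, not something the question asked for.

```python
def append_once(lines, char=","):
    """Append char to each non-empty line unless it is already there."""
    out = []
    for line in lines:
        stripped = line.strip()
        # assumption: blank lines are left alone instead of becoming just ","
        if stripped and not stripped.endswith(char):
            stripped += char
        out.append(stripped + "\n")
    return "".join(out)


first = append_once(["10.0.0.1\n", "10.0.0.2,\n"])
# running the function again on its own output changes nothing
second = append_once(first.splitlines(keepends=True))
```

Reading the file into a list, calling a function like this, and writing the result back would give the same effect as the loop above.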
I have a file file.md that I want to read and get it as a string.
Then I want to take that string and save it in another file, but as a string with quotes (and all). The reason is I want to transfer the content of my markdown file to a markdown string so that I can include it in html using the javascript marked library.
How can I do that using a python script?
Here's what I have tried so far:
with open('file.md', 'r') as md:
    text = ""
    lines = md.readlines()
    for line in lines:
        line = "'" + line + "'" + '+'
        text = text + line
with open('file.txt', 'w') as txt:
    txt.write(text)
Input file.md
This is one line of markdown
This is another line of markdown
This is another one
Desired output: file.txt
"This is one line of markdown" +
"This is another line of markdown" +
(what should come here by the way to encode an empty line?)
"This is another one"
There are two things you need to pay attention to here.
First, it is clearer not to reassign your loop variable line while it is running through lines. Instead, assign the result to a new string variable (I call it new_line).
Second, if you add more characters at the end of each line, it will be placed after the end-of-line character and thus be moved into the next line when you write it to a new file. Instead, skip the last character of each line and add the line break manually.
If I understand you right, this should give you the wanted output:
with open('file.md', 'r') as md:
    text = ""
    lines = md.readlines()
    for line in lines:
        if line[-1] == "\n":
            text += "'" + line[:-1] + "'+\n"
        else:
            text += "'" + line + "'+"
with open('file.txt', 'w') as txt:
    txt.write(text)
Note how the last line is treated differently from the others (it has no end-of-line character).
text += ... adds more characters to the existing string.
This also works and might be a bit nicer, because it avoids the if-statement. You can remove the newline character while reading the content from file.md. At the end you skip the last two characters of your content, which are the + and the \n.
with open('file.md', 'r') as md:
    text = ""
    lines = [line.rstrip('\n') for line in md]
    for line in lines:
        text += "'" + line + "' +\n"
with open('file.txt', 'w') as txt:
    txt.write(text[:-2])
...and with using a formatter:
text += "'{}' +\n".format(line)
...checking for empty lines as you asked in the comments:
for line in lines:
    if line == '':
        text += '\n'
    else:
        text += "'{}' +\n".format(line)
This works:
>>> a = '''This is one line of markdown
... This is another line of markdown
...
... This is another one'''
>>> lines = a.split('\n')
>>> lines = [ '"' + i + '" +' if len(i) else i for i in lines]
>>> lines[-1] = lines[-1][:-2] # drop the '+' at the end of the last line
>>> print('\n'.join(lines))
"This is one line of markdown" +
"This is another line of markdown" +

"This is another one"
You may add reading/writing to files yourself.
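On the question of how to encode an empty line: one possibility is to keep the newline inside each quoted piece as an escaped \n, so an empty Markdown line becomes the string "\n". The sketch below leans on json.dumps() to do the quoting and escaping, on the assumption that a JSON string literal is acceptable as a JavaScript string literal in your page. The function name to_js_concat is made up for this example.

```python
import json


def to_js_concat(markdown_text):
    """Turn Markdown text into a JavaScript string concatenation expression."""
    # json.dumps() adds the double quotes and escapes quotes, backslashes
    # and the trailing newline of every line
    quoted = [json.dumps(line + "\n") for line in markdown_text.splitlines()]
    return " +\n".join(quoted)


sample = "This is one line\n\nThis has \"quotes\"\n"
js_expression = to_js_concat(sample)
```

Because the escaping is delegated to json.dumps(), quotes inside the Markdown do not break the generated string, which the manual "'" + line + "'" approach does not handle.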
I've learned that we can easily remove blank lines in a file or remove blanks from a single string, but how about removing all blanks at the end of each line in a file?
One way would be to process the file line by line, like:
lines = []
with open(file) as f:
    for line in f:
        lines.append(line.strip())  # store the stripped line
Is this the only way to complete the task?
Possibly the ugliest implementation possible, but here's what I just scratched up:
def strip_str(string):
    last_ind = 0
    split_string = string.split(' ')
    for ind, word in enumerate(split_string):
        if word == '\n':
            return ''.join([split_string[0]] + [' {} '.format(x) for x in split_string[1:last_ind]])
        last_ind += 1
Don't know if these count as different ways of accomplishing the task. The first is really just a variation on what you have. The second does the whole file at once, rather than line-by-line.
Map that calls the 'rstrip' method on each line of the file:
import operator
with open(filename) as f:
    # basically the same as (line.rstrip() for line in f)
    for line in map(operator.methodcaller('rstrip'), f):
        print(line)  # do something with the line
Read the whole file and use re.sub():
import re
with open(filename) as f:
    text = f.read()
text = re.sub(r"\s+(?=\n)", "", text)
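One thing to watch for with a whole-file regex: \s also matches the newline character, so a pattern built on \s+ can swallow empty lines together with the trailing blanks. If the empty lines should survive, a possible variant is to match only spaces and tabs at the end of each line. This is a sketch; strip_trailing_blanks is a made-up name, and treating only ' ' and '\t' as blanks is an assumption.

```python
import re


def strip_trailing_blanks(text):
    """Remove spaces and tabs at the end of every line, keeping empty lines."""
    # with re.MULTILINE, $ matches before each newline and at the end of text
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)


with_blank_line = "a  \n\nb \t\nc  "
greedy = re.sub(r"\s+(?=\n)", "", with_blank_line)   # also eats the empty line
careful = strip_trailing_blanks(with_blank_line)     # keeps the empty line
```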
If you just want to remove spaces, another solution would be:
line.replace(" ", "")
Note that this removes every space in the line, not only the trailing ones.
When opening a file and concatenating the 5th character from each string, I'm getting duplicates of each character in the new string. How can I fix this?
def fifthchar(filename):
    l=""
    fin=open(filename, "r")
    for line in fin:
        line=line.strip()
        line=str(line)
        for i in line:
            if len(line)>=5:
                a=line[4]
                l+=a
    fin.close()
    return l
# Python 2 code: it relies on the print statement and on str.decode()
def fifthchar(filename):
    l=''
    lines = []
    fin=open(filename, 'r')
    all_lines = fin.read().decode("utf-8-sig").encode("utf-8")
    lines = all_lines.splitlines()
    line =''
    for line in lines:
        line=str(line)
        line=line.strip()
        print line
        if len(line)>=5:
            a=line[4]
            l+=a
    fin.close()
    return l
if __name__ == '__main__':
    print fifthchar("read_lines.txt")
If you want to remove the whitespace from the beginning and the end, use
line = line.strip()
If you want to remove all spaces from the string, use
line = line.replace(" ","")
This line automatically removes the BOM, if one is present:
all_lines = fin.read().decode("utf-8-sig").encode("utf-8")
hope this will help.
Just remove this unnecessary line and dedent its body accordingly:
for i in line:
Because of that inner loop, the concatenation ran once for every character in the line.
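Putting both answers together, a Python 3 version might look like the sketch below. It assumes the file is UTF-8, possibly with a BOM, and lets open() strip the BOM via encoding="utf-8-sig". The name fifth_chars is made up, and the temporary file only exists to make the example self-contained.

```python
import os
import tempfile


def fifth_chars(filename):
    """Collect the 5th character of every line that has at least 5 characters."""
    chars = []
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if there is one
    with open(filename, encoding="utf-8-sig") as fin:
        for line in fin:
            line = line.strip()
            if len(line) >= 5:      # no inner loop over the characters
                chars.append(line[4])
    return "".join(chars)


# small demonstration with a temporary file that starts with a BOM
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w", encoding="utf-8-sig") as f:
    f.write("hello\nhi\nworld!\n")
result = fifth_chars(path)
os.remove(path)
```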
I want to replace a line in a file but my code doesn't do what I want. The code doesn't change that line. It seems that the problem is the space between ALS and 4277 in input.txt. I need to keep that space in the file. How can I fix my code?
A part of input.txt:
ALS 4277
Related part of the code:
for lines in fileinput.input('input.txt', inplace=True):
    print(lines.rstrip().replace("ALS"+str(4277), "KLM" + str(4945)))
Desired output:
KLM 4945
Using the same idea that other users have already pointed out, you could also reproduce the same spacing by first matching the spacing and saving it in a variable (spacing in my code):
import re
with open('input.txt') as f:
    lines = f.read()
# re.search (unlike re.match) also finds the pattern away from the start of the text
match = re.search(r'ALS(\s+)4277', lines)
if match is not None:
    spacing = match.group(1)
    lines = re.sub(r'ALS\s+4277', 'KLM%s4945' % spacing, lines.rstrip())
print(lines)
As the spaces vary you will need to use regex to account for the spaces.
import re
lines = "ALS 4277 "
line = re.sub(r"(ALS\s+4277)", "KLM 4945", lines.rstrip())
print(line)
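If the original spacing has to be kept exactly, another possibility is to capture it in a group and refer back to it in the replacement, so each match keeps its own whitespace. This is only a sketch; replace_keep_spacing is a hypothetical name. The \g<1> form is used because a plain \1 followed by digits would be read as a different group number.

```python
import re


def replace_keep_spacing(text):
    """Replace 'ALS<spaces>4277' with 'KLM<same spaces>4945'."""
    # (\s+) captures the whitespace; \g<1> re-inserts it unchanged
    return re.sub(r"ALS(\s+)4277", r"KLM\g<1>4945", text)


single = replace_keep_spacing("ALS 4277")
wide = replace_keep_spacing("ALS     4277")
```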
Try:
with open('input.txt') as f:
    for line in f:
        a, b = line.strip().split()
        if a == 'ALS' and b == '4277':
            line = line.replace(a, 'KLM').replace(b, '4945')
        print(line, end='')  # as line has '\n'