I want to loop over the contents of a text file and do a search and replace on some lines and write the result back to the file. I could first load the whole file in memory and then write it back, but that probably is not the best way to do it.
What is the best way to do this, within the following code?
f = open(file)
for line in f:
    if 'foo' in line:
        newline = line.replace('foo', 'bar')
        # how to write this newline back to the file
The shortest way would probably be to use the fileinput module. For example, the following adds line numbers to a file, in-place:
import fileinput
for line in fileinput.input("test.txt", inplace=True):
    print('{} {}'.format(fileinput.filelineno(), line), end='') # for Python 3
    # print "%d: %s" % (fileinput.filelineno(), line), # for Python 2
What happens here is:
The original file is moved to a backup file
The standard output is redirected to the original file within the loop
Thus any print statements write back into the original file
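The three steps above can be made visible by asking fileinput to keep the backup file. This is a minimal sketch, not part of the original answer; the function name number_lines_in_place and the ".bak" extension are illustrative assumptions:

```python
import fileinput


def number_lines_in_place(path, backup_ext=".bak"):
    """Prefix every line of `path` with its line number, keeping a backup copy."""
    # With inplace=True the original is moved to path + backup_ext and
    # standard output is redirected to a new file at the original path.
    with fileinput.input(path, inplace=True, backup=backup_ext) as lines:
        for line in lines:
            # `line` already ends with a newline, so suppress print's own.
            print("{} {}".format(lines.filelineno(), line), end="")
```

Without the backup argument the backup file is deleted once the output file is closed; passing an extension keeps it around, which is handy while testing.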
fileinput has more bells and whistles. For example, it can be used to automatically operate on all files in sys.argv[1:], without your having to iterate over them explicitly. Starting with Python 3.2 it also provides a convenient context manager for use in a with statement.
While fileinput is great for throwaway scripts, I would be wary of using it in real code because admittedly it's not very readable or familiar. In real (production) code it's worthwhile to spend just a few more lines of code to make the process explicit and thus make the code readable.
There are two options:
The file is not overly large, and you can just read it wholly to memory. Then close the file, reopen it in writing mode and write the modified contents back.
The file is too large to be stored in memory; you can move it over to a temporary file and open that, reading it line by line, writing back into the original file. Note that this requires twice the storage.
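The first option can be sketched as follows, assuming the file fits comfortably in memory. The function name replace_in_memory and the UTF-8 default are assumptions made for illustration:

```python
def replace_in_memory(path, old, new, encoding="utf-8"):
    """Replace every occurrence of `old` with `new` in the file at `path`."""
    # Read the whole file; the with-block closes it before it is reopened.
    with open(path, "r", encoding=encoding) as f:
        contents = f.read()
    # Mode "w" truncates the file, so the new contents fully replace the old.
    with open(path, "w", encoding=encoding) as f:
        f.write(contents.replace(old, new))
```

If the program is interrupted between the truncation and the end of the write, the file is left incomplete, which is one reason the temporary-file variant is often preferred for important data.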
I guess something like this should do it. It basically writes the content to a new file and replaces the old file with the new file:
from tempfile import mkstemp
from shutil import move, copymode
from os import fdopen, remove
def replace(file_path, pattern, subst):
    #Create temp file
    fh, abs_path = mkstemp()
    with fdopen(fh,'w') as new_file:
        with open(file_path) as old_file:
            for line in old_file:
                new_file.write(line.replace(pattern, subst))
    #Copy the file permissions from the old file to the new file
    copymode(file_path, abs_path)
    #Remove original file
    remove(file_path)
    #Move new file
    move(abs_path, file_path)
Here's another example that was tested. It does plain substring search and replace, not regular-expression matching:
import fileinput
import sys
def replaceAll(file, searchExp, replaceExp):
    for line in fileinput.input(file, inplace=1):
        if searchExp in line:
            line = line.replace(searchExp, replaceExp)
        sys.stdout.write(line)
Example use:
replaceAll("/fooBar.txt", "Hello World!", "Goodbye World.")
This should work (in-place editing):
import fileinput
# Processes a list of files (files is a file name or a list of file names), and
# redirects STDOUT to the file in question
for line in fileinput.input(files, inplace=1):
    print(line.replace("foo", "bar"), end="")
Based on the answer by Thomas Watnedal. However, this does not exactly answer the line-by-line part of the original question, although the function can still replace on a line-by-line basis.
This implementation replaces the file contents without using temporary files; as a consequence, file permissions remain unchanged.
Also, using re.sub instead of replace allows regex replacement instead of plain-text replacement only.
Reading the file as a single string instead of line by line allows for multiline match and replacement.
import re
def replace(file, pattern, subst):
    # Read contents from file as a single string
    file_handle = open(file, 'r')
    file_string = file_handle.read()
    file_handle.close()
    # Use RE package to allow for replacement (also allowing for (multiline) REGEX)
    file_string = (re.sub(pattern, subst, file_string))
    # Write contents to file.
    # Using mode 'w' truncates the file.
    file_handle = open(file, 'w')
    file_handle.write(file_string)
    file_handle.close()
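To illustrate the multiline point, here is a small sketch of what becomes possible once the file is held as a single string. The sample text and variable names are made up for the example:

```python
import re

# Stand-in for the contents of a file read with file_handle.read().
text = "start\nfoo\nbar\nend\n"

# A pattern can span a line break because the newline is part of the string.
joined = re.sub(r"foo\nbar", "baz", text)

# With re.DOTALL, "." also matches newlines, so ".*?" can cross lines.
collapsed = re.sub(r"start.*?end", "start ... end", text, flags=re.DOTALL)
```

Neither replacement would be possible in a loop that sees one line at a time.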
As lassevk suggests, write out the new file as you go. Here is some example code:
fin = open("a.txt")
fout = open("b.txt", "wt")
for line in fin:
    fout.write( line.replace('foo', 'bar') )
fin.close()
fout.close()
If you want a generic function that replaces any text with some other text, this is likely the best way to go, particularly if you're a fan of regexes. Note that re.escape makes text match literally; flags such as re.IGNORECASE still apply:
import re
def replace( filePath, text, subs, flags=0 ):
    with open( filePath, "r+" ) as file:
        fileContents = file.read()
        textPattern = re.compile( re.escape( text ), flags )
        fileContents = textPattern.sub( subs, fileContents )
        file.seek( 0 )
        file.truncate()
        file.write( fileContents )
A more Pythonic way would be to use context managers, as in the code below:
from tempfile import mkstemp
from shutil import move
from os import fdopen, remove
def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    # fdopen wraps the descriptor returned by mkstemp, so it is closed as well
    with fdopen(fh, 'w') as target_file:
        with open(source_file_path, 'r') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)
fileinput is quite straightforward, as mentioned in previous answers:
import fileinput
def replace_in_file(file_path, search_text, new_text):
    with fileinput.input(file_path, inplace=True) as file:
        for line in file:
            new_line = line.replace(search_text, new_text)
            print(new_line, end='')
Explanation:
fileinput can accept multiple files, but I prefer to close each file as soon as it has been processed, so a single file_path is passed in the with statement.
print does not print anything to the console when inplace=True, because STDOUT is redirected to the original file.
end='' in the print call avoids adding extra blank lines, since each line already ends with a newline.
You can use it as follows:
file_path = '/path/to/my/file'
replace_in_file(file_path, 'old-text', 'new-text')
Create a new file, copy lines from the old to the new, and do the replacing before you write the lines to the new file.
Expanding on @Kiran's answer, which I agree is more succinct and Pythonic, this adds codecs to support the reading and writing of UTF-8:
import codecs
from tempfile import mkstemp
from shutil import move
from os import close, remove
def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    close(fh)  # the temp file is reopened by name below
    with codecs.open(target_file_path, 'w', 'utf-8') as target_file:
        with codecs.open(source_file_path, 'r', 'utf-8') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)
Using hamishmcn's answer as a template, I was able to search for lines in a file that match my regex and replace the matches with an empty string.
import re
p = re.compile('[-][0-9]*[.][0-9]*[,]|[-][0-9]*[,]') # pattern, compiled once
fin = open("in.txt", 'r') # in file
fout = open("out.txt", 'w') # out file
for line in fin:
    newline = p.sub('', line) # replace matching strings with empty string
    print(newline, end='')
    fout.write(newline)
fin.close()
fout.close()
If you remove the indent at the line below, it will search and replace in multiple lines.
See below for example.
from os import close, remove
from shutil import move
from tempfile import mkstemp
def replace(file, pattern, subst):
    #Create temp file
    fh, abs_path = mkstemp()
    print(fh, abs_path)
    new_file = open(abs_path,'w')
    old_file = open(file)
    for line in old_file:
        new_file.write(line.replace(pattern, subst))
    #close temp file
    new_file.close()
    close(fh)
    old_file.close()
    #Remove original file
    remove(file)
    #Move new file
    move(abs_path, file)
I am new to Python, so please excuse the basic question.
I am trying to use the string.replace method in Python and am getting weird behavior. Here is what I am doing:
# passing through command line a file name
with open(sys.argv[2], 'r+') as source:
    content = source.readlines()
    for line in content:
        line = line.replace(placeholerPattern1Replace,placeholerPattern1)
        #if I am printing the line here, I am getting the correct value
        source.write(line.replace(placeholerPattern1Replace,placeholerPattern1))
try:
    target = open('baf_boot_flash_range_test_'+subStr +'.gpj', 'w')
    for line in content:
        if placeholerPattern3 in line:
            print line
        target.write(line.replace(placeholerPattern1, <variable>))
    target.close()
When I check the values in the new file, they have not been replaced. I can see that the source file is also unchanged, although the content had changed. What am I doing wrong here?
Rather, do something like this:
contentList = []
with open('somefile.txt', 'r') as source:
    for line in source:
        contentList.append(line)
with open('somefile.txt','w') as w:
    for line in contentList:
        line = line.replace(stringToReplace,stringToReplaceWith)
        w.write(line)
In your code, line = line.replace(...) only rebinds the loop variable; the strings stored in content stay unchanged, so the second loop still sees the original text. Reading everything first, then reopening the file in 'w' mode and writing each replaced line, avoids that.
You are reading from the file source and also writing to it. Don't do that. Instead, you should write to a NamedTemporaryFile and then rename it over the original file after you finish writing and close it.
Try this:
import os
import sys
# Read the file into memory
with open(sys.argv[2], 'r') as source:
    content = source.readlines()
# Fix each line
new_content = list()
for line in content:
    new_content.append(line.replace(placeholerPattern1Replace, placeholerPattern1))
# Write the data to a temporary file name
with open(sys.argv[2] + '.tmp', 'w') as dest:
    for line in new_content:
        dest.write(line)
# Rename the temporary file to the input file name
os.rename(sys.argv[2] + '.tmp', sys.argv[2])
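The NamedTemporaryFile approach mentioned above could look roughly like this. It is a sketch under assumptions: the function name rewrite_file is hypothetical, and os.replace (Python 3.3+) is swapped in for os.rename because it also overwrites an existing destination on Windows:

```python
import os
import tempfile


def rewrite_file(path, old, new):
    """Replace `old` with `new` in `path` via a temporary file and a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays on
    # one filesystem. delete=False keeps it after the with-block closes it.
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        with open(path) as source:
            for line in source:
                tmp.write(line.replace(old, new))
    # Swap the finished temp file into place over the original.
    os.replace(tmp.name, path)
```

The original file is never opened for writing, so a crash midway leaves it untouched. Note that the new file gets the temp file's permissions rather than the original's.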
I was wondering if there is a way to edit those lines of a file that contain certain characters.
Something like this:
file.readlines()
for line in file:
    if 'characters' in line:
        file[line] = 'edited line'
If it matters: I'm using Python 3.5
I think what you want is something like:
lines = file.readlines()
for index, line in enumerate(lines):
    if 'characters' in line:
        lines[index] = 'edited line'
You can't edit the file directly, but you can write out the modified lines over the original (or, safer, write to a temporary file and rename it once you've validated it).
You can use tempfile.NamedTemporaryFile to create a temporary file object and write your lines to it, then use the shutil module to replace your original file with the temp file.
from tempfile import NamedTemporaryFile
import shutil
# mode='w' opens the temp file in text mode (the default is binary)
temp_file = NamedTemporaryFile(mode='w', delete=False)
with open(file_name) as infile, temp_file:
    for line in infile:
        if 'characters' in line:
            temp_file.write('edited line\n')
        else:
            temp_file.write(line)
shutil.move(temp_file.name, file_name)
I have a file called usernames.py that may contain a list or may not exist at all:
usernames.py
['user1', 'user2', 'user3']
In Python I now want to read this file if it exists and append a new user to the list, or otherwise create a list with just that user, i.e. ['user3'].
This is what I have tried:
with open(path + 'usernames.py', 'w+') as file:
    file_string = host_file.read()
    file_string.append(instance)
    file.write(file_string)
This gives me an error: unresolved 'append'. How can I achieve this? Python does not know it is a list, and if the file does not exist it is even worse, as I have nothing to convert to a list.
Try this:
import os
filename = 'data'
if os.path.isfile(filename):
    with open(filename, 'r') as f:
        l = eval(f.readline())
else:
    l = []
l.append(instance)
with open(filename, 'w') as f:
    f.write(str(l))
BUT this is quite unsafe if you don't know where the file is from, as it could include any code to do anything!
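One way to reduce that risk, offered here as an alternative rather than as the original answer's method, is ast.literal_eval from the standard library, which only accepts Python literals. The function name append_user is an assumption for illustration:

```python
import ast
import os


def append_user(filename, user):
    """Append `user` to the list literal stored in `filename`."""
    if os.path.isfile(filename):
        with open(filename) as f:
            # literal_eval accepts only literals (lists, strings, numbers, ...)
            # and raises ValueError or SyntaxError for anything else,
            # including an empty file.
            users = ast.literal_eval(f.read())
    else:
        users = []
    users.append(user)
    with open(filename, "w") as f:
        f.write(repr(users))
    return users
```

A call such as append_user('usernames.py', 'user3') creates the file on first use and extends the stored list afterwards.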
It would be better not to use a python file for persistence -- what happens if someone slips you a usernames.py that has exploit code in it? Consider a csv file or a pickle, or just a text file with one user per line.
That said, if you don't open it as a python file, something like this should work:
from os.path import join
with open(join(path, 'usernames.py'), 'r+') as file:
    file_string = file.read()
    file_string = file_string.strip().strip('[').strip(']')
    file_data = [name.strip().strip('"').strip("'") for name in file_string.split(',')]
    file_data.append(instance)
    file.seek(0)  # rewind before overwriting
    file.truncate()
    file.write(str(file_data))
If usernames contain commas or end in quotes, you have to be more careful.
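The "one user per line" suggestion made earlier in this answer avoids the parsing problem altogether. A minimal sketch, with hypothetical function names and assuming usernames never contain newlines:

```python
import os


def add_username(path, user):
    """Append `user` to a one-name-per-line text file, creating it if needed."""
    # Mode "a" creates the file when it is missing and always writes at the end.
    with open(path, "a") as f:
        f.write(user + "\n")


def load_usernames(path):
    """Return the stored usernames as a list (empty if the file is missing)."""
    if not os.path.isfile(path):
        return []
    with open(path) as f:
        # Skip blank lines and drop surrounding whitespace.
        return [line.strip() for line in f if line.strip()]
```

Commas and quotes in names need no special handling here, since the newline is the only separator.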
I should preface this by saying that I am a complete Python newbie.
I'm trying to create a script that will loop through a directory and its subdirectories looking for text files. When it encounters a text file, it will parse the file, convert it to NITF XML, and upload it to an FTP directory.
At this point I am still working on reading the text file into variables so that they can be inserted into the XML document in the right places. An example of the text file is as follows.
Headline
Subhead
By A person
Paragraph text.
And here is the code I have so far:
with open("path/to/textFile.txt") as f:
    #content = f.readlines()
    head,sub,auth = [f.readline().strip() for i in range(3)]
    data=f.read()
pth = os.getcwd()
print head,sub,auth,data,pth
My question is: how do I iterate through the body of the text file(data) and wrap each line in HTML P tags? For example;
<P>line of text in file </P> <P>Next line in text file</p>.
Something like
output_format = '<p>{}</p>\n'.format
with open('input') as fin, open('output', 'w') as fout:
    fout.writelines( output_format(line.strip()) for line in fin )
This assumes that you want to write the new content back to the original file:
with open('path/to/textFile.txt') as f:
    content = f.readlines()
with open('path/to/textFile.txt', 'w') as f:
    for line in content:
        f.write('<p>' + line.strip() + '</p>\n')
import shutil
with open('infile') as fin, open('outfile', 'w') as fout:
    for line in fin:
        fout.write('<P>{0}</P>\n'.format(line[:-1])) # slice off the newline, much like `line.rstrip('\n')`
# Only do this once you're sure the script works :)
shutil.move('outfile', 'infile') # Need to replace the input file with the output file
In your case, you should probably replace
data=f.read()
with:
data = '\n'.join("<p>%s</p>" % l.strip() for l in f)
Use data=f.readlines() here,
and then iterate over data and try something like this:
for line in data:
    line="<p>"+line.strip()+"</p>"
    #write line+'\n' to a file or do something else
Add <p> and </p> around each line,
e.g.:
data_new=[]
data=f.readlines()
for line in data:
    data_new.append("<p>%s</p>\n" % line.strip())
You could use the fileinput module to modify one or more files in-place, with optional backup file creation if desired (see its documentation for details). Here it is being used to process one file.
import fileinput
for line in fileinput.input('testinput.txt', inplace=1):
    print('<P>' + line[:-1] + '</P>')
The 'testinput.txt' argument could also be a sequence of two or more file names instead of just a single one, which could be useful especially if you're using os.walk() to generate the list of files in the directory and its subdirectories to process (as you probably should be doing).
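Combining os.walk() with fileinput as suggested might look like the sketch below. The function names and the ".txt" filter are assumptions for illustration, not part of the original answer:

```python
import fileinput
import os


def find_text_files(root):
    """Return the paths of all .txt files under `root`, including subdirectories."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".txt"):
                paths.append(os.path.join(dirpath, name))
    return paths


def wrap_lines_in_p(paths):
    """Wrap every line of every file in `paths` in <P> tags, in place."""
    if not paths:
        # An empty sequence would make fileinput read standard input instead.
        return
    with fileinput.input(paths, inplace=True) as lines:
        for line in lines:
            # Strip the trailing newline; print adds one back after </P>.
            print("<P>" + line.rstrip("\n") + "</P>")
```

Calling wrap_lines_in_p(find_text_files('some/dir')) rewrites every matching file, so it is worth trying on a copy of the directory first.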