How to modify a text file? - python
I'm using Python, and would like to insert a string into a text file without deleting or copying the file. How can I do that?
Unfortunately there is no way to insert into the middle of a file without re-writing it. As previous posters have indicated, you can append to a file or overwrite part of it using seek but if you want to add stuff at the beginning or the middle, you'll have to rewrite it.
This is an operating system thing, not a Python thing. It is the same in all languages.
What I usually do is read from the file, make the modifications and write it out to a new file called myfile.txt.tmp or something like that. This is better than reading the whole file into memory because the file may be too large for that. Once the temporary file is completed, I rename it the same as the original file.
This is a good, safe way to do it because if the file write crashes or aborts for any reason, you still have your untouched original file.
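A minimal sketch of the temporary-file approach described above, assuming Python 3. The function name `rewrite_with_temp` and the `transform` callback are made up for illustration; `os.replace` does the final rename over the original.

```python
import os
import tempfile


def rewrite_with_temp(path, transform):
    """Stream `path` through `transform` line by line into a temporary
    file, then rename the temporary file over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as dst, open(path) as src:
            for line in src:
                dst.write(transform(line))
        os.replace(tmp_path, path)  # the original is untouched until here
    except BaseException:
        os.remove(tmp_path)  # clean up the half-written temp file
        raise
```

For example, `rewrite_with_temp("myfile.txt", lambda line: line.upper())` rewrites the file in upper case. Note that the new file gets the permissions of the temporary file, so copy the original's mode first if that matters.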
Depends on what you want to do. To append you can open it with "a":
with open("foo.txt", "a") as f:
    f.write("new line\n")
If you want to prepend something you have to read from the file first:
with open("foo.txt", "r+") as f:
    old = f.read()  # read everything in the file
    f.seek(0)  # rewind
    f.write("new line\n" + old)  # write the new line before
The fileinput module of the Python standard library will rewrite a file inplace if you use the inplace=1 parameter:
import sys
import fileinput
# replace all occurrences of 'sit' with 'SIT' and insert a line after the 5th
for i, line in enumerate(fileinput.input('lorem_ipsum.txt', inplace=1)):
    sys.stdout.write(line.replace('sit', 'SIT'))  # replace 'sit' and write
    if i == 4: sys.stdout.write('\n')  # write a blank line after the 5th line
Rewriting a file in place is often done by saving the old copy with a modified name. Unix folks add a ~ to mark the old one. Windows folks do all kinds of things -- add .bak or .old -- or rename the file entirely or put the ~ on the front of the name.
import shutil
shutil.move(aFile, aFile + "~")
destination = open(aFile, "w")
source = open(aFile + "~", "r")
for line in source:
    destination.write(line)
    if <some condition>:
        destination.write(<some additional line> + "\n")
source.close()
destination.close()
Instead of shutil, you can use the following.
import os
os.rename(aFile, aFile + "~")
Python's mmap module will allow you to insert into a file. The following sample shows how it can be done in Unix (Windows mmap may be different). Note that this does not handle all error conditions and you might corrupt or lose the original file. Also, this won't handle unicode strings.
import os
from mmap import mmap
def insert(filename, data, pos):
    # data must be bytes; the file is opened in binary mode
    if len(data) < 1:
        # nothing to insert
        return
    f = open(filename, 'r+b')
    m = mmap(f.fileno(), os.path.getsize(filename))
    origSize = m.size()
    # or this could be an error
    if pos > origSize:
        pos = origSize
    elif pos < 0:
        pos = 0
    m.resize(origSize + len(data))
    m[pos+len(data):] = m[pos:origSize]
    m[pos:pos+len(data)] = data
    m.close()
    f.close()
It is also possible to do this without mmap with files opened in 'r+' mode, but it is less convenient and less efficient as you'd have to read and temporarily store the contents of the file from the insertion position to EOF - which might be huge.
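A rough sketch of that 'r+' alternative, assuming Python 3 and binary data. The name `insert_bytes` is made up for illustration. As noted above, everything from the insertion point to EOF is held in memory while the file is rewritten.

```python
import os


def insert_bytes(filename, data, pos):
    """Insert `data` (bytes) at byte offset `pos`, shifting the rest of the file."""
    with open(filename, "r+b") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        pos = max(0, min(pos, size))  # clamp the position to the file bounds
        f.seek(pos)
        tail = f.read()  # everything from pos to EOF, kept in memory
        f.seek(pos)
        f.write(data + tail)  # write the new data followed by the old tail
```

Like the mmap version, this is not crash-safe: if the write is interrupted, the file is left partly rewritten.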
As mentioned by Adam, you have to take your system's limitations into consideration before deciding on an approach: do you have enough memory to read the whole file into memory, replace parts of it, and rewrite it?
If you're dealing with a small file or have no memory issues this might help:
Option 1)
Read entire file into memory, do a regex substitution on the entire or part of the line and replace it with that line plus the extra line. You will need to make sure that the 'middle line' is unique in the file or if you have timestamps on each line this should be pretty reliable.
import re
# open file with r+b (allow write and binary mode)
f = open("file.log", 'r+b')
# read entire content of file into memory
f_content = f.read()
# basically match middle line and replace it with itself and the extra line
# (the file is opened in binary mode, so the patterns are bytes)
f_content = re.sub(rb'(middle line)', rb'\1\nnew line', f_content)
# return pointer to top of file so we can re-write the content with replaced string
f.seek(0)
# clear file content
f.truncate()
# re-write the content with the updated content
f.write(f_content)
# close file
f.close()
Option 2)
Figure out middle line, and replace it with that line plus the extra line.
# open file with r+b (allow write and binary mode)
f = open("file.log", 'r+b')
# get array of lines
f_content = f.readlines()
# get middle line (integer division, so the result can be used as an index)
middle_line = len(f_content) // 2
# add the new line after the middle line (bytes, because of binary mode)
f_content[middle_line] += b"new line\n"
# return pointer to top of file so we can re-write the content with replaced string
f.seek(0)
# clear file content
f.truncate()
# re-write the content with the updated content
f.write(b''.join(f_content))
# close file
f.close()
Wrote a small class for doing this cleanly.
import tempfile

class FileModifierError(Exception):
    pass

class FileModifier(object):
    def __init__(self, fname):
        self.__write_dict = {}
        self.__filename = fname
        # text mode, so the lines can be written back to a text file
        self.__tempfile = tempfile.TemporaryFile(mode='w+')
        with open(fname, 'r') as fp:
            for line in fp:
                self.__tempfile.write(line)
        self.__tempfile.seek(0)

    def write(self, s, line_number='END'):
        if line_number != 'END' and not isinstance(line_number, (int, float)):
            raise FileModifierError("Line number %s is not a valid number" % line_number)
        try:
            self.__write_dict[line_number].append(s)
        except KeyError:
            self.__write_dict[line_number] = [s]

    def writeline(self, s, line_number='END'):
        self.write('%s\n' % s, line_number)

    def writelines(self, s, line_number='END'):
        for ln in s:
            self.writeline(ln, line_number)

    def __popline(self, index, fp):
        try:
            ilines = self.__write_dict.pop(index)
            for line in ilines:
                fp.write(line)
        except KeyError:
            pass

    def close(self):
        self.__exit__(None, None, None)

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        with open(self.__filename, 'w') as fp:
            for index, line in enumerate(self.__tempfile.readlines()):
                self.__popline(index, fp)
                fp.write(line)
            # 'END' and integer keys cannot be sorted together in Python 3,
            # so take the 'END' entries out first and write them last
            end_lines = self.__write_dict.pop('END', [])
            for index in sorted(self.__write_dict):
                for line in self.__write_dict[index]:
                    fp.write(line)
            for line in end_lines:
                fp.write(line)
        self.__tempfile.close()
Then you can use it this way:
with FileModifier(filename) as fp:
    fp.writeline("String 1", 0)
    fp.writeline("String 2", 20)
    fp.writeline("String 3")  # To write at the end of the file
If you know some unix you could try the following:
Notes: $ means the command prompt
Say you have a file my_data.txt with content as such:
$ cat my_data.txt
This is a data file
with all of my data in it.
Then using the os module you can use the usual sed commands:
import os
# Identifiers used are:
my_data_file = "my_data.txt"
command = "sed -i 's/all/none/' " + my_data_file
# Execute the command
os.system(command)
If you aren't aware of sed, check it out, it is extremely useful.
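If sed is not available, a rough pure-Python counterpart of the substitution above can be sketched with the standard fileinput module. The helper name `sed_substitute` is made up for illustration, and it matches plain strings rather than sed's regular expressions.

```python
import fileinput


def sed_substitute(path, old, new):
    """Replace the first occurrence of `old` on every line of `path`,
    in place, similar to `sed -i 's/old/new/' path` for plain strings."""
    with fileinput.input(path, inplace=True) as lines:
        for line in lines:
            # With inplace=True, print() writes into the file, not the console.
            print(line.replace(old, new, 1), end="")
```

Calling `sed_substitute("my_data.txt", "all", "none")` would turn the second line of the sample file into "with none of my data in it."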
Related
How to find first and last characters in a file using python?
I am stuck on this revision exercise, which asks me to copy an input file to an output file and return the first and last letters.
def copy_file(filename):
    input_file = open(filename, "r")
    content = input_file.read()
    content[0]
    content[1]
    return content[0] + content[-1]
    input_file.close()
Why do I get an error message when I try to get the first and last letters? And how would I copy the file to the output file? Here is the test:
input_f = "FreeAdvice.txt"
first_last_chars = copy_file(input_f)
print(first_last_chars)
print_content('cure737.txt')
Error Message:
FileNotFoundError: [Errno 2] No such file or directory: 'hjac737(my username).txt'
All the code after a return statement is never executed; a proper code editor would highlight this for you, so I recommend you use one. As a result, the file was never closed. A good practice is to use a context manager for that: it will automatically call close for you, even in case of an exception, when you exit the scope (indentation level).
The code you provided also fails to write the file content, which may be causing the error you reported.
I explicitly used the "rt" (and "wt") modes for the files (although they are the defaults), because we want the first and last characters of the file, and text mode supports Unicode (any character, not just ASCII).
def copy_file(filename):
    with open(filename, "rt") as input_file:
        content = input_file.read()
    print(input_file.closed)  # True
    my_username = "LENORMJU"
    output_file_name = my_username + ".txt"
    with open(output_file_name, "wt") as output_file:
        output_file.write(content)
    print(output_file.closed)  # True
    # last: return the result
    return content[0] + content[-1]

print(copy_file("so67730842.py"))
When I run this script (on itself), the file is copied and I get the output d) which is correct.
How to print ASCII white space from file
I searched on Google but found nothing useful. I use PyCharm and I want to print the lines of a file with the ASCII whitespace characters shown as escapes such as "\n", "\t", "\r", etc. All I get is the normal string, as I would see it in any text editor, but I want to see those characters so that I know how to reproduce the text file for some personal projects. I do not want to write them myself, so I want to show them on the console and just copy-paste them. Thanks in advance.
# Local variable
fileName = r"C:\Users\...\textFile.txt"

# Create .mcmeta file
def mcmetaMaker(fileName=fileName):
    try:
        with open(file=fileName, mode="r", encoding="utf-8") as f:
            line = f.readline()
            while line:
                print("\b{}".format(line))
                line = f.readline()
    finally:
        f.close()
# Local variable
fileName = r"C:\Users\...\textFile.txt"

# Create .mcmeta file
def mcmetaMaker(fileName):
    # the with statement closes the file, so no try/finally is needed
    with open(fileName, mode='rb') as f:
        line = f.readline()
        while line:
            betterline = str(line)
            print(betterline[2:len(betterline)-1])
            line = f.readline()
This worked for me. The 'b' in 'rb' stands for binary mode, which doesn't change the original bytes of the string. print(line) produces b'asdf\r\n', so I converted it to a string and sliced off the leading b' and the closing quote. Hope it doesn't run too slow this way.
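Another way to make the whitespace visible is the built-in repr, which renders control characters as escape sequences. A small sketch, assuming Python 3 and a UTF-8 text file; the function name `show_whitespace` is made up for illustration.

```python
def show_whitespace(path):
    """Return each line of the file with control characters shown as escapes."""
    # newline="" turns off newline translation, so "\r\n" is not
    # collapsed to "\n" while reading.
    with open(path, encoding="utf-8", newline="") as f:
        # repr() adds surrounding quotes; [1:-1] strips them again.
        return [repr(line)[1:-1] for line in f]
```

Printing each returned item with `print(item)` shows text such as `a\tb\r\n` on the console, ready to copy and paste.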
Replacing a line in an already opened file python [duplicate]
I want to loop over the contents of a text file, do a search and replace on some lines, and write the result back to the file. I could first load the whole file in memory and then write it back, but that probably is not the best way to do it. What is the best way to do this, within the following code?
f = open(file)
for line in f:
    if 'foo' in line:
        newline = line.replace('foo', 'bar')
        # how to write this newline back to the file
The shortest way would probably be to use the fileinput module. For example, the following adds line numbers to a file, in-place:
import fileinput

for line in fileinput.input("test.txt", inplace=True):
    print('{} {}'.format(fileinput.filelineno(), line), end='')  # for Python 3
    # print "%d: %s" % (fileinput.filelineno(), line),  # for Python 2
What happens here is:
1. The original file is moved to a backup file
2. The standard output is redirected to the original file within the loop
3. Thus any print statements write back into the original file
fileinput has more bells and whistles. For example, it can be used to automatically operate on all files in sys.argv[1:], without your having to iterate over them explicitly. Starting with Python 3.2 it also provides a convenient context manager for use in a with statement.
While fileinput is great for throwaway scripts, I would be wary of using it in real code because admittedly it's not very readable or familiar. In real (production) code it's worthwhile to spend just a few more lines of code to make the process explicit and thus make the code readable. There are two options:
1. The file is not overly large, and you can just read it wholly to memory. Then close the file, reopen it in writing mode and write the modified contents back.
2. The file is too large to be stored in memory; you can move it over to a temporary file and open that, reading it line by line, writing back into the original file. Note that this requires twice the storage.
I guess something like this should do it. It basically writes the content to a new file and replaces the old file with the new file:
from tempfile import mkstemp
from shutil import move, copymode
from os import fdopen, remove

def replace(file_path, pattern, subst):
    # Create temp file
    fh, abs_path = mkstemp()
    with fdopen(fh, 'w') as new_file:
        with open(file_path) as old_file:
            for line in old_file:
                new_file.write(line.replace(pattern, subst))
    # Copy the file permissions from the old file to the new file
    copymode(file_path, abs_path)
    # Remove original file
    remove(file_path)
    # Move new file
    move(abs_path, file_path)
Here's another example that was tested; it searches each line for a string and replaces it:
import fileinput
import sys

def replaceAll(file, searchExp, replaceExp):
    for line in fileinput.input(file, inplace=1):
        if searchExp in line:
            line = line.replace(searchExp, replaceExp)
        sys.stdout.write(line)
Example use (the arguments are matched as plain substrings, not as regular expressions):
replaceAll("/fooBar.txt", "Hello World!", "Goodbye World.")
This should work (in-place editing):
import fileinput

# Processes a list of files (files is assumed to be defined), and
# redirects STDOUT to the file in question
for line in fileinput.input(files, inplace=1):
    print(line.replace("foo", "bar"), end='')
Based on the answer by Thomas Watnedal. However, this does not answer the line-to-line part of the original question exactly. The function can still replace on a line-to-line basis.
This implementation replaces the file contents without using temporary files; as a consequence, file permissions remain unchanged.
Also, using re.sub instead of replace allows regex replacement instead of plain-text replacement only.
Reading the file as a single string instead of line by line allows for multiline match and replacement.
import re

def replace(file, pattern, subst):
    # Read contents from file as a single string
    file_handle = open(file, 'r')
    file_string = file_handle.read()
    file_handle.close()

    # Use RE package to allow for replacement (also allowing for (multiline) REGEX)
    file_string = re.sub(pattern, subst, file_string)

    # Write contents to file.
    # Using mode 'w' truncates the file.
    file_handle = open(file, 'w')
    file_handle.write(file_string)
    file_handle.close()
As lassevk suggests, write out the new file as you go. Here is some example code:
fin = open("a.txt")
fout = open("b.txt", "wt")
for line in fin:
    fout.write(line.replace('foo', 'bar'))
fin.close()
fout.close()
If you want a generic function that replaces any text with some other text, this is likely the best way to go, particularly if you're a fan of regexes:
import re

def replace(filePath, text, subs, flags=0):
    with open(filePath, "r+") as file:
        fileContents = file.read()
        textPattern = re.compile(re.escape(text), flags)
        fileContents = textPattern.sub(subs, fileContents)
        file.seek(0)
        file.truncate()
        file.write(fileContents)
A more Pythonic way would be to use context managers, as in the code below:
from tempfile import mkstemp
from shutil import move
from os import fdopen, remove

def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    # wrap the descriptor returned by mkstemp so that it gets closed
    with fdopen(fh, 'w') as target_file:
        with open(source_file_path, 'r') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)
fileinput is quite straightforward, as mentioned in previous answers:
import fileinput

def replace_in_file(file_path, search_text, new_text):
    with fileinput.input(file_path, inplace=True) as file:
        for line in file:
            new_line = line.replace(search_text, new_text)
            print(new_line, end='')
Explanation:
- fileinput can accept multiple files, but I prefer to close each file as soon as it has been processed, so I pass a single file_path in the with statement.
- The print statement does not print anything to the console when inplace=True, because STDOUT is redirected to the original file.
- end='' in the print statement eliminates intermediate blank lines.
You can use it as follows:
file_path = '/path/to/my/file'
replace_in_file(file_path, 'old-text', 'new-text')
Create a new file, copy lines from the old to the new, and do the replacing before you write the lines to the new file.
Expanding on @Kiran's answer, which I agree is more succinct and Pythonic, this adds codecs to support the reading and writing of UTF-8:
import codecs
from tempfile import mkstemp
from shutil import move
from os import close, remove

def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    close(fh)  # the file is reopened by path below, so release the descriptor
    with codecs.open(target_file_path, 'w', 'utf-8') as target_file:
        with codecs.open(source_file_path, 'r', 'utf-8') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)
Using hamishmcn's answer as a template, I was able to search for lines in a file that match my regex and replace the matches with an empty string.
import re

fin = open("in.txt", 'r')  # in file
fout = open("out.txt", 'w')  # out file
p = re.compile('[-][0-9]*[.][0-9]*[,]|[-][0-9]*[,]')  # pattern
for line in fin:
    newline = p.sub('', line)  # replace matching strings with empty string
    print(newline)
    fout.write(newline)
fin.close()
fout.close()
If you remove the indent as shown below, it will search and replace across multiple lines. See below for an example.
from tempfile import mkstemp
from shutil import move
from os import close, remove

def replace(file, pattern, subst):
    # Create temp file
    fh, abs_path = mkstemp()
    print(fh, abs_path)
    new_file = open(abs_path, 'w')
    old_file = open(file)
    for line in old_file:
        new_file.write(line.replace(pattern, subst))
    # close temp file
    new_file.close()
    close(fh)
    old_file.close()
    # Remove original file
    remove(file)
    # Move new file
    move(abs_path, file)
Not able to clear file Contents
I have a file which has the data below.
edit 48
set dst 192.168.4.0 255.255.255.0
set device "Tague-VPN"
set comment "Yeshtel"
edit 180
set dst 64.219.107.45 255.255.255.255
set device "Austin-Backup"
set comment "images.gsmc.org"
I want to copy the commands under edit only if set device is Austin-Backup.
string = 'set device'
word = '"Austin-Backup"'
with open('test.txt') as oldfile, open('script.txt', 'w') as newfile:
    for line in oldfile:
        newfile.write(line)
        newfile.write('\n')
        if string not in line:
            pass
        elif string in line:
            if word not in line:
                a = open('script.txt', 'w')
                a.close()
            else:
                pass
I am trying to write the test file content to a new file (script), and if the command "set comment "Yeshtel"" is found I want to delete the contents of the new file. I tried to delete them but it's not happening. I am new to Python; can you please tell me what the problem is? I learned that reopening the same file in write mode will clear the contents.
I suspect the issue is that you have the same file open twice, once as newfile and a second time as a. While it should be truncated when you open it as a and then close it, the writes you made on newfile may still appear if they were still buffered when the truncated version was written.
I suggest only opening the file once. When you need to clear it, seek back to the start position and call the truncate method on it:
if word not in line:
    newfile.seek(0)
    newfile.truncate()
Called without an argument, truncate cuts the file at the current position, so the seek is what makes it discard everything written so far. It also means that anything you write afterwards starts at the beginning of the file.
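A small self-contained sketch of clearing a file through a single handle, assuming Python 3. The scratch path and the sample lines are made up for illustration. It relies on `truncate()` with no argument cutting the file at the current position, which is why the `seek(0)` comes first.

```python
import os
import tempfile

# hypothetical scratch file, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "script.txt")

with open(path, "w") as newfile:
    newfile.write("edit 48\n")
    newfile.write('set device "Tague-VPN"\n')
    # Discard everything written so far, using the same handle.
    newfile.seek(0)
    newfile.truncate()
    newfile.write("edit 180\n")

with open(path) as f:
    result = f.read()
print(repr(result))
```

Only the line written after the truncate is left in the file.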
It should be something like this:
keyword = "Austin-Backup"
temp_lines = []  # lines of the current "edit" block
found_keyword = False
with open('test.txt') as oldfile, open('script.txt', 'w') as newfile:
    for line in oldfile:
        if line.lstrip().startswith("edit"):
            # a new block starts: write out the previous one if it matched
            if found_keyword:
                newfile.writelines(temp_lines)
            temp_lines = []
            found_keyword = False
        if keyword in line:
            found_keyword = True
        temp_lines.append(line)
    # write out the last block
    if found_keyword:
        newfile.writelines(temp_lines)
Please note that you should not open the file twice. Just use a temporary variable and write only what has to be written.
I'm tailing a file in python for any changes and it is not picking up a change to the file
Here is my script:
import time

def tail(file, delay=0.5):
    f = open(file, 'r')
    f.seek(0, 2)
    while True:
        line = f.readline()
        print('line: ' + line)
        if not line:
            time.sleep(delay)
        else:
            print('line found!')
When I open the file and add some lines to it, this script does not pick them up. I am doing this on Linux.
Use open('filename', 'a') instead of open('filename', 'r') for adding lines to the file ... I think you actually want to append to the file rather than read it.
The code looks fine so there is likely a buffering issue. Try using f.read(100) instead of readline so that you read whatever is available rather than searching for line endings.
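For comparison, here is a rough polling sketch written as a generator, assuming Python 3. The name `follow` and the `max_idle` parameter are made up for illustration. It only sees data appended to the same file; one possible cause of the behaviour in the question (not confirmed there) is an editor that saves by writing a new file and renaming it, which leaves the open handle pointing at the old one.

```python
import time


def follow(f, delay=0.5, max_idle=None):
    """Yield complete lines as they are appended to the open file `f`.

    `max_idle` (hypothetical) stops after that many empty polls in a row;
    None means poll forever.
    """
    idle = 0
    pending = ""  # holds a partial line until its newline arrives
    while max_idle is None or idle < max_idle:
        chunk = f.readline()
        if not chunk:
            idle += 1
            time.sleep(delay)
            continue
        idle = 0
        pending += chunk
        if pending.endswith("\n"):
            yield pending
            pending = ""
```

To tail a file, open it, call `f.seek(0, 2)` to jump to the end, and then loop with `for line in follow(f):`.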