I have to edit some text files to include new information, but I will need to insert that information at specific locations in the file based on the surrounding text.
This doesn't work the way I need it to:
with open(full_filename, "r+") as f:
    lines = f.readlines()
    for line in lines:
        if 'identifying text' in line:
            offset = f.tell()
            f.seek(offset)
            f.write('Inserted text')
...in that it adds the text to the end of the file. How would I write it to the next line following the identifying text?
(AFAICT, this is not a duplicate of similar questions, since none of those were able to provide this answer)
If you don't need to work in place, then maybe something like:
with open("old.txt") as f_old, open("new.txt", "w") as f_new:
    for line in f_old:
        f_new.write(line)
        if 'identifier' in line:
            f_new.write("extra stuff\n")
(or, to be Python-2.5 compatible):
f_old = open("old.txt")
f_new = open("new.txt", "w")
for line in f_old:
    f_new.write(line)
    if 'identifier' in line:
        f_new.write("extra stuff\n")
f_old.close()
f_new.close()
which turns
>>> !cat old.txt
a
b
c
d identifier
e
into
>>> !cat new.txt
a
b
c
d identifier
extra stuff
e
(Usual warning about using 'string1' in 'string2': 'name' in 'enamel' is True, 'hello' in 'Othello' is True, etc., but obviously you can make the condition arbitrarily complicated.)
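If the file does have to end up modified under its original name, one option is to read it fully, build the new list of lines in memory, and write it back. The attempt in the question appends at the end because readlines() has already consumed the whole file, so f.tell() returns the end-of-file position; in addition, writing in the middle of a file overwrites bytes rather than inserting them. A minimal sketch, assuming the file fits in memory (insert_after_match and the demo file name are made up for illustration):

```python
import os
import tempfile


def insert_after_match(path, marker, new_line):
    """Rewrite `path` so that `new_line` follows every line containing `marker`."""
    with open(path) as f:
        lines = f.readlines()

    out = []
    for line in lines:
        out.append(line)
        if marker in line:
            # Make sure the matching line is terminated before adding a new one.
            if not line.endswith("\n"):
                out.append("\n")
            out.append(new_line + "\n")

    # Reopen in write mode, which truncates the file, then write everything back.
    with open(path, "w") as f:
        f.writelines(out)


# Demo on a throwaway file (hypothetical content matching the example above).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "old.txt")
    with open(demo_path, "w") as f:
        f.write("a\nb\nc\nd identifier\ne\n")
    insert_after_match(demo_path, "identifier", "extra stuff")
    with open(demo_path) as f:
        result = f.read()

print(result)
```

Writing to a separate new file, as shown above, remains the safer choice when the original must survive a crash halfway through the write.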
You could use a regex and then replace the text.
import re
c = "This is a file's contents, apparently you want to insert text"
c = re.sub('text', 'text here', c)  # re.sub returns a new string; it does not modify c in place
print(c)
prints "This is a file's contents, apparently you want to insert text here"
Not sure if it'll work for your use case, but it's nice and simple if it fits.
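Applied to the original question (new text on the line after the identifying text), the regex idea could look roughly like this. It is only a sketch: the sample text and the word "identifier" are placeholders, and the whole file content is assumed to be held in one string.

```python
import re

text = "a\nb\nc\nd identifier\ne\n"

# With re.MULTILINE, ^ and $ match at the start and end of each line.
# "." does not match a newline, so the pattern stays within one line.
# \g<0> in the replacement stands for the whole matched line.
result = re.sub(r"^.*identifier.*$", r"\g<0>\nextra stuff", text, flags=re.MULTILINE)

print(result)
```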
This looks for the search string anywhere in the file (it does not have to be at the start of a line, and it may be spread over multiple lines).
Typically you can follow this algorithm:
look up the string in the file and capture its "location"
then split the file at this "location" and create a new file as follows:
write the start-to-loc content to the new file
next, write your "NEW TEXT" to the new file
next, write the loc-to-end content to the new file
Let us see the code:
#!/usr/bin/python
import os
SEARCH_WORD = 'search_text_here'
file_name = 'sample.txt'
add_text = 'my_new_text_here'
final_loc=-1
with open(file_name, 'rb') as file:
    fsize = os.path.getsize(file_name)
    bsize = fsize
    word_len = len(SEARCH_WORD)
    while True:
        found = 0
        pr = file.read(bsize)
        pf = pr.find(SEARCH_WORD)
        if pf > -1:
            found = 1
            pos_dec = file.tell() - (bsize - pf)
            file.seek(pos_dec + word_len)
            bsize = fsize - file.tell()
        if file.tell() < fsize:
            seek = file.tell() - word_len + 1
            file.seek(seek)
            if 1==found:
                final_loc = seek
                print "loc: "+str(final_loc)
        else:
            break
# create file with doxygen comments
f_old = open(file_name,'r+')
f_new = open("new.txt", "w")
f_old.seek(0)
fStr = str(f_old.read())
f_new.write(fStr[:final_loc-1]);
f_new.write(add_text);
f_new.write(fStr[final_loc-1:])
f_new.close()
f_old.close()
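When the file is small enough to hold in memory, the same find / split / write steps can be sketched with str.find and slicing. This is a simplified illustration, not a drop-in replacement for the script above: insert_at_match is a hypothetical helper, and it inserts right after the first occurrence (use rfind for the last occurrence, or slice at loc to insert before the match).

```python
def insert_at_match(text, search, addition):
    """Return `text` with `addition` inserted right after the first `search` match."""
    loc = text.find(search)
    if loc == -1:
        # Search string not present: leave the text unchanged.
        return text
    split = loc + len(search)
    # start-to-loc content + new text + loc-to-end content
    return text[:split] + addition + text[split:]


result = insert_at_match("a search_text_here b", "search_text_here", " my_new_text_here")
unchanged = insert_at_match("nothing to see", "search_text_here", " my_new_text_here")
print(result)
```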
I have two text files. I can open both with Python successfully.
I open the first file and read a data element into a variable using the for l in file construct.
I open the second file and read a data element into a variable using the for l in file construct.
If both variables match I write data to a text file. For the first line read it works perfectly but subsequent lines do not. The FIN variable never changes even though it finds a new line that starts with D further along. Is there a way to loop through two files like this? Am I missing something obvious?
File2Split = 'c:\\temp\\datafile\\comparionIP.txt'
GetResident = 'c:\\temp\\datafile\\NPINumbers.txt'
writefile = open('c:\\temp\\datafile\\comparionIPmod.txt','w')
openfile = open(File2Split,'r')
openfileNPI = open(GetResident,'r')
FIN = ''
FirstChar = ''
FIN2 = ''
for l in openfile:
    FirstChar = (l[0:1])
    if FirstChar =='D':
        FIN = (l[21:31])
        #print (FIN)
        if FIN.startswith('1'):
            writefile.write(l)
    elif FirstChar in ['F','G','C','R']:
        writefile.write(l)
    elif FirstChar =='N':
        for l2 in openfileNPI:
            FIN2 = (l2[0:10])
            NPI = ('N' + (l2[11:21]))
            if FIN2 == FIN:
                writefile.write(NPI + '\n')
openfileNPI.close()
openfile.close()
writefile.close()
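One likely cause of the behaviour described: a file object is an iterator, so once the inner "for l2 in openfileNPI" loop has run to the end of the file, every later pass over it yields nothing. A common way around this is to load the second file into a dictionary once and look values up from there. The sketch below assumes the same column positions as the code above; the function names and the sample lines are invented for illustration.

```python
from collections import defaultdict


def build_lookup(npi_lines):
    """Map the first 10 characters of each line to the 'N' records built from it."""
    lookup = defaultdict(list)
    for l2 in npi_lines:
        # Same slices as the original code; strip a trailing newline if present.
        lookup[l2[0:10]].append('N' + l2[11:21].rstrip('\n'))
    return lookup


def process(data_lines, lookup):
    """Return the lines to write, mirroring the branches of the original loop."""
    out = []
    fin = ''
    for l in data_lines:
        first_char = l[0:1]
        if first_char == 'D':
            fin = l[21:31]
            if fin.startswith('1'):
                out.append(l)
        elif first_char in ('F', 'G', 'C', 'R'):
            out.append(l)
        elif first_char == 'N':
            # Dictionary lookup instead of re-reading the second file.
            for npi in lookup.get(fin, []):
                out.append(npi + '\n')
    return out


# Made-up sample data laid out to fit the slices above.
data = ["D" + "x" * 20 + "1234567890\n", "N\n",
        "D" + "x" * 20 + "1111111111\n", "N\n"]
npi = ["1234567890 9999999999\n", "1111111111 8888888888\n"]
result = process(data, build_lookup(npi))
print(result)
```

With real files, both functions accept open file objects directly, since those iterate line by line.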
I made a script that works, but only if I know how many lines locations.txt will have.
Is there any way to keep the script working regardless of the amount of lines? (I am expecting 1 to 12 lines maximum).
locline1 = content[1]
locline2 = content[2]
Works great when there are two lines in the list of locations. If there is only one location then I need to change the script to;
locline1 = content[1]
locline2 = content[1]
That workaround avoids any errors and lets it run, providing duplicate results which is better than nothing.
I tried to make another script that will check how many lines there are and make appropriate replacements to the first script. It ran without errors but the replacements weren't made, I can investigate that later if that is a decent route to take.
In main script
import varilen
exec(open("varilen.py").read())
Varilen.py below;
file = open("locations.txt","r")
Counter = 1
Content = file.read()
CoList = Content.split("\n")
for i in CoList:
    if i:
        Counter += 1
if Counter < 2:
    cr1test = open('amazing727p6.py', mode='w', encoding='UTF-8')
    cr1 = (re.sub(r'content[2]', 'content[1]', cr1test, count=0, flags = re.DOTALL))
More detail about the main script (just in case it makes a difference);
The script reads a base file, finds the names of locations in there and writes one copy of each of those on the lines of a new file locations.txt.
Later the script reads the lines of locations.txt and assigns these to those lines.
locline1 = content[1]
locline2 = content[2]
Make a replacement to take out everything except for the location name on each line.
Make a new file saved as that (location name + _results.txt), then search the base file for lines containing that location name and write those lines to that location's results.txt
For example;
Search this base file
abc!New york aaaaaa a
gg3aa!New York aa bbbb
g44!Chicago au4s a3e
Make locations.txt
"New York
"Chicago
Make "New York results.txt" copy matching lines from base file and format columns.
abc! New york aaaaaa a
gg3!aa New York aa bbbb
It's all working but if locline12 = content[12] and there isn't a 12th line in locations.txt then it doesn't work at all.
Not asking for a specific answer although that's always welcome. I am just asking how should I try to go about fixing/improving the script in regards to this issue.
**
EDIT:
**
This will provide more detail,
This is the code before the line throwing the error;
rmuser4 = Path('lpgetandremoveuserresults1.txt', encoding='UTF-8').read_text()
rmuser5 = open('locations.txt', mode='w+')
rmuser6 = (re.sub(r'\[H.+', '', rmuser4))
print(rmuser6, file= rmuser5)
rmuser5.close()
lines_seen = set()
with open("locations.txt", "r+", encoding='UTF-8') as fqqw:
    dqqw = fqqw.readlines()
    fqqw.seek(0)
    for iqqw in dqqw:
        if iqqw not in lines_seen:
            fqqw.write(iqqw)
            lines_seen.add(iqqw)
    fqqw.truncate()
fqqw.close()
fileyu = open('locations.txt', encoding='UTF-8')
content = fileyu.readlines()
locline1 = content[1]
locline2 = content[2]
locline3 = content[3]
locline4 = content[4]
locline5 = content[5]
locline6 = content[6]
locline7 = content[7]
locline8 = content[8]
locline9 = content[9]
locline10 = content[10]
locline11 = content[11]
locline12 = content[12]
So it makes the locations.txt document, depending on which base file is being used it might have only 1 line or 12 lines.
Below is part of the code that comes after, only the code using locline5 / content[5] is shown in this post, but the script has nearly identical code for each 1 through 12, not just 5.
filenameline5 = (re.sub(r'"(.+?Events).+', '\\1', locline5, count=0, flags = re.DOTALL))
loc5new = open('%s results.txt' % filenameline5, mode='w+', encoding='UTF-8')
print(filenameline5, file= loc5new)
loc5new.close()
filei5 = Path('%s results.txt' % filenameline5, mode='w+').read_text()
addq5rep = (re.sub(r'(.+?) .+', '"\\1', filei5, count=1, flags = re.A))
addq5base = open('search for %s.txt' % filenameline5, mode='w+', encoding='UTF-8')
print(addq5rep, file= addq5base)
addq5base.close()
list_file = open('search for %s.txt' % filenameline5)
search_words = []
for word in list_file:
    search_words.append(word.strip())
list_file.close()
matches = []
master_file = open('readable_export.txt', encoding='UTF-8')
for line in master_file:
    current_line = line.split()
    for search_word in search_words:
        if search_word in current_line:
            matches.append(line)
            break
master_file.close()
new_file = open('%s results.txt' % filenameline5, 'w+')
for line in matches:
    new_file.write(line)
new_file.close()
So the problem is that if locations.txt is created with only 3 lines, the later code trying to use locline4 through locline12 will ruin everything. But changing the code like this would fix it (if locations.txt has 3 lines).
locline1 = content[1]
locline2 = content[2]
locline3 = content[3]
locline4 = content[1]
locline5 = content[1]
locline6 = content[1]
locline7 = content[1]
locline8 = content[1]
locline9 = content[1]
locline10 = content[1]
locline11 = content[1]
locline12 = content[1]
I'm trying to fix it so that I don't need to manually make these edits depending on how many lines I am expecting locations to have.
Thank you
As a small improvement, len(CoList) will give you the number of lines in locations.txt.
You can also use an if condition inside your for loop to skip lines that contain only "\n", like this:
if line != "\n":
    line_count += 1
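Going one step further, the twelve locline variables can probably be dropped altogether by looping over whatever lines locations.txt actually contains, so the script no longer depends on the line count. The following is only a sketch: location_names and process_location are hypothetical names, and process_location stands in for the block that is currently repeated for locline1 to locline12. Note that Python lists are zero-indexed, so content[0] is the first line; whether the original skips it on purpose is not clear from the post.

```python
def location_names(lines):
    """Return the non-empty lines, de-duplicated, in their original order."""
    seen = set()
    names = []
    for line in lines:
        name = line.strip()
        if name and name not in seen:
            seen.add(name)
            names.append(name)
    return names


def process_location(name):
    # Placeholder for the per-location work (the code repeated for each locline).
    return '%s results.txt' % name


# Stand-in for fileyu.readlines(); works the same for 1 line or 12.
content = ['"New York\n', '"Chicago\n', '\n', '"New York\n']
results = [process_location(n.lstrip('"')) for n in location_names(content)]
print(results)
```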
Most people think this is a duplicate, but it is not. What I'm trying to achieve is this: say there is a master string like the one below that mentions a couple of files. We need to open those files and check whether any other files are included in them; if so, we need to copy their content into the line where we found that particular file name.
Master String:
Welcome
How are you
file.txt
everything alright
signature.txt
Thanks
file.txt:
ABCD
EFGH
tele.txt
tele.txt:
IJKL
signature.txt:
SAK
Output:
Welcome
How are you
ABCD
EFGH
IJKL
everything alright
SAK
Thanks
for msplit in [stext.split('\n')]:
    for num, items in enumerate(msplit, 1):
        if items.strip().startswith("here is") and items.strip().endswith(".txt"):
            gmsf = open(os.path.join(os.getcwd() + "\\txt", items[8:]), "r")
            gmsfstr = gmsf.read()
            newline = items.replace(items, gmsfstr)
How do I join these replaced items back into a string in the same format?
Also, any idea how to repeat the same function until no ".txt" entries are left? Once the join is done, there might be other ".txt" names inside a ".txt" file.
Thanks for your help in advance.
A recursive approach that works with any level of file name nesting:
from os import linesep
def get_text_from_file(file_path):
    with open(file_path) as f:
        text = f.read()
    return SAK_replace(text)
def SAK_replace(s):
    lines = s.splitlines()
    for index, l in enumerate(lines):
        if l.endswith('.txt'):
            lines[index] = get_text_from_file(l)
    return linesep.join(lines)
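To see the recursion end to end, here is a self-contained variant that builds the sample files in a temporary directory and expands them. The file contents are assumed from the expected output in the question, master.txt is a made-up name for the master string, and expand is a hypothetical helper. There is no guard against a file that includes itself, which would recurse until Python raises RecursionError.

```python
import os
import tempfile


def expand(name, base_dir):
    """Return the text of `name`, with each line naming a .txt file replaced by that file's expanded text."""
    with open(os.path.join(base_dir, name)) as f:
        lines = f.read().splitlines()
    out = []
    for line in lines:
        candidate = line.strip()
        if candidate.endswith('.txt'):
            # Recurse: the included file may itself mention other files.
            out.append(expand(candidate, base_dir))
        else:
            out.append(line)
    return '\n'.join(out)


# Sample files (contents assumed from the question's expected output).
files = {
    'master.txt': 'Welcome\nHow are you\nfile.txt\neverything alright\nsignature.txt\nThanks\n',
    'file.txt': 'ABCD\nEFGH\ntele.txt\n',
    'tele.txt': 'IJKL\n',
    'signature.txt': 'SAK\n',
}

with tempfile.TemporaryDirectory() as tmp_dir:
    for file_name, body in files.items():
        with open(os.path.join(tmp_dir, file_name), 'w') as f:
            f.write(body)
    result = expand('master.txt', tmp_dir)

print(result)
```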
You can try:
s = """Welcome
How are you
here is file.txt
everything alright
here is signature.txt
Thanks"""
data = s.split("\n")
match = ['.txt']
all_matches = [s for s in data if any(xs in s for xs in match)]
for index, item in enumerate(data):
    if item in all_matches:
        data[index] ="XYZ"
data = "\n".join(data)
print(data)
Output:
Welcome
How are you
XYZ
everything alright
XYZ
Thanks
Added new requirement:
def file_obj(filename):
    with open(filename, "r") as fo:
        # read() returns one string; readlines() returns a list, which has no split()
        s = fo.read()
    data = s.split("\n")
    match = ['.txt']
    all_matches = [s for s in data if any(xs in s for xs in match)]
    for index, item in enumerate(data):
        if item in all_matches:
            file_obj(item)
            data[index] ="XYZ"
    data = "\n".join(data)
    print(data)
file_obj("first_filename")
We can create a temporary file object, keep the replaced lines in it, and once every line has been processed, write the new content back to the original file. The temporary file is deleted automatically when execution leaves the 'with' statement.
import tempfile
import re
file_pattern = re.compile(r'(((\w+)\.txt))')  # the ur'' prefix is a syntax error in Python 3
original_content_file_name = 'sample.txt'
"""
sample.txt should have this content.
Welcome
How are you
here is file.txt
everything alright
here is signature.txt
Thanks
"""
replaced_file_str = None
def replace_file_content():
    """
    replace the file content using temporary file object.
    """
    def read_content(file_name):
        # matched file name is read and returned back for replacing.
        content = ""
        with open(file_name) as fileObj:
            content = fileObj.read()
        return content
    # read the file and keep the replaced text in temporary file object(tempfile object will be deleted automatically).
    # mode='w+' opens the temporary file in text mode so str lines can be written to it
    with open(original_content_file_name, 'r') as file_obj, tempfile.NamedTemporaryFile(mode='w+') as tmp_file:
        for line in file_obj.readlines():
            if line.strip().startswith("here is") and line.strip().endswith(".txt"):
                file_path = re.search(file_pattern, line).group()
                line = read_content(file_path) + '\n'
            tmp_file.write(line)
        tmp_file.seek(0)
        # assign the replaced value to this variable
        replaced_file_str = tmp_file.read()
    # replace with new content to the original file
    with open(original_content_file_name, 'w+') as file_obj:
        file_obj.write(replaced_file_str)
replace_file_content()
output_filename = r"C:\Users\guage\Output.txt"
RRA:
GREQ-299684_6j
GREQ-299684_6k
CZM:
V-GREQ-299684_6k
V-GREQ-299524_9
F_65624_1
R-GREQ-299680_5
DUN:
FB_71125_1
FR:
VQ-299659_18
VR-GREQ-299659_19
VEQ-299659_28
VR-GREQ-299659_31
VR-GREQ-299659_32
VEQ-299576_1
GED:
VEQ-299622_2
VR-GREQ-299618_13
VR-GREQ-299559_1
VR-GREQ-299524_14
FB_65624_1
VR-GREQ-299645_1
MNT:
FB_71125_1
FB_71125_2
VR-534_4
The above is the content of the .txt file. How can I read each section of it separately? For example:
RRA:VR-GREQ-299684_6j VR-GREQ-299684_6k VR-GREQ-299606_3 VR-GREQ-299606_4 VR-GREQ-299606_5 VR-GREQ-299606_7
and save it in a variable or something similar. Later I want to read CZM separately, and so on. I did the following:
with open(output_filename, 'r') as f:
    excel = f.read()
But how do I read the sections separately? Can someone tell me how to do it?
Something like this:
def read_file_with_custom_record_separator(file_path, delimiter='\n'):
    data = ""
    with open(file_path) as fh:
        for line in fh:
            # a line ending with the delimiter starts a new record
            if line.strip().endswith(delimiter) and data != "":
                print("VARIABLE:\n<", data, ">\n")
                data = line
            else:
                data += line
    print("LAST VARIABLE:\n<", data, ">\n")
And then:
read_file_with_custom_record_separator("input.txt", ":")
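If the goal is to keep each section in a variable rather than print it, a dictionary keyed by the header name is one option. A minimal sketch, assuming every section starts with a line ending in ":" as in the sample (read_sections is a hypothetical name, and the sample lines are taken from the top of the file shown in the question):

```python
def read_sections(lines):
    """Group lines under the most recent header line (a line ending with ':')."""
    sections = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.endswith(':'):
            current = line[:-1]       # header name without the colon
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


sample = ["RRA:\n", "GREQ-299684_6j\n", "GREQ-299684_6k\n", "CZM:\n", "V-GREQ-299684_6k\n"]
sections = read_sections(sample)
print(sections["RRA"])
```

An open file object can be passed in place of the sample list, after which sections["CZM"] and the others can be read whenever they are needed.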
You can use the trailing ":" in the file text as an indicator to create a new file, like this:
savefilename = ""
with open(filename, 'r') as f:
    for line in f:
        line = line.strip() # get rid of the unnecessary white chars
        lastchar = line[-1:] # get the last char
        if lastchar == ":": # if the last char is ":"
            savefilename = line[0:-1] # get file name from line (except the ":")
            sf = open(savefilename + ".txt", 'w') # create a new file
        else:
            sf.write(line + "\n") # write the data to the opened file
Then you should get collection of files:
RRA.txt
CZM.txt
DUN.txt
# etc
which contains all the appropriate data:
RRA.txt
VR-GREQ-299684_6j
VR-GREQ-299684_6k
VR-GREQ-299606_3
VR-GREQ-299606_4
VR-GREQ-299606_5
VR-GREQ-299606_7
CZM.txt
VR-GREQ-299684_6k
VR-GREQ-299606_6
VR-GREQ-299606_8
VR-GREQ-299640_1
VR-GREQ-299640_5
VR-GREQ-299524_9
FB_65624_1
VR-GREQ-299680_5
DUN.txt
FB_71125_1
# and so on
You can replace the sf = open and the sf.write with whatever way you feel is best to separate the data. Here, I use files...
You can iterate over the file and use the lines and indices to your advantage; something like this:
with open(output_filename, 'r') as f:
    for index, line in enumerate(f):
        # here you have access to each line and its index
        # so you can save any number of lines you wish
        pass  # placeholder so the loop body is valid Python
What about reading it into a list and then processing its elements as you prefer?
>>> f = open('myfile.txt', 'r').readlines()
>>> len(f)
46
>>> f[0]
RRA:
>>> f[-1]
VR-GREQ-299534_4
>>> f[:3]
['RRA:\n', 'VR-GREQ-299684_6j \n', 'VR-GREQ-299684_6k \n']
>>>
>>> [l for l in f if l.startswith('FB_')]
['FB_65624_1 \n', 'FB_71125_1 \n', 'FB_69228_1 \n', 'FB_65624_1 \n', 'FB_71125_1 \n', 'FB_71125_2 \n']
>>>
Assume:
self.base_version = 1000
self.target_version = 2000
I have a file as follows:
some text...
<tsr_args> \"upgrade_test test_mode=upgrade base_sw=1000 target_sw=2000 system_profile=eth\"</tsr_args>
some text...
<tsr_args> \"upgrade_test test_mode=rollback base_sw=2000 target_sw=1000 system_profile=eth manufacture_type=no-manufacture\"</tsr_args>
some text...
<tsr_args> \"upgrade_test test_mode=downgrade base_sw=2000 target_sw=1000 system_profile=eth no_boot_next_enable_flag=True\"</tsr_args>
I need the base and target version values to be placed as specified above (Note that on the 2nd and 3rd entry, the base and target are opposite).
I tried to do it as follows, but it does not work:
base_regex = re.compile('.*test_mode.*base_sw=(.*)')
target_regex = re.compile('.*test_mode.*target_sw=(.*)')
o = open(file,'a')
for line in open(file):
    if 'test_mode' in line:
        if 'upgrade' in line:
            new_line = (re.sub(base_regex, self.base_version, line))
            new_line = (re.sub(target_regex, self.target_version, line))
            o.write(new_line)
        elif 'rollback' in line or 'downgrade' in line:
            new_line = (re.sub(base_regex, self.target_version, line))
            new_line = (re.sub(target_regex, self.base_version, line))
            o.write(new_line)
o.close()
Assume the above code runs properly without any syntax errors.
The file is not modified at all.
The complete line is modified instead of just the captured group. How can I make re.sub to substitute only the captured group?
You are opening the file with 'a' (append), so your changes end up at the end of the file. You should write to a new file and replace the old one with it at the end of your script.
The only way I know to replace several matched groups is this: first find the current value with the regexp, then replace it as a plain string without a regexp.
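On the second part of the question: re.sub always replaces the whole match, not just a captured group. One way to change only the value is to capture the part that should stay (here the "base_sw=" prefix) and put it back in the replacement. The sketch below is an illustration under assumptions: set_versions is a hypothetical helper, the version value is taken to be a run of non-whitespace characters, and a callable is used as the replacement because a template such as r'\1' followed directly by digits would be read as a different group number.

```python
import re


def set_versions(line, base, target):
    """Replace only the values after base_sw= and target_sw=, keeping the rest of the line."""
    # group(1) is the literal prefix; the old value (\S+) is dropped and the new one appended.
    line = re.sub(r'(base_sw=)\S+', lambda m: m.group(1) + str(base), line)
    line = re.sub(r'(target_sw=)\S+', lambda m: m.group(1) + str(target), line)
    return line


line = 'upgrade_test test_mode=rollback base_sw=OLD target_sw=OLD system_profile=eth'
result = set_versions(line, 2000, 1000)
print(result)
```

For the rollback and downgrade lines, the caller would pass the target version as base and the base version as target, as in the question.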
Thanks Jimilan for your remarks. I fixed my code, and now it's working:
base_regex = re.compile(r'.*test_mode.*base_sw=(\S*)')
target_regex = re.compile(r'.*test_mode.*target_sw=(\S*)')
for file in self.upgrade_cases_files_list:
    file_handle = open(file, 'r')
    file_string = file_handle.read()
    file_handle.close()
    base_version_result = base_regex.search(file_string)
    target_version_result = target_regex.search(file_string)
    if base_version_result is not None:
        current_base_version = base_version_result.group(1)
    else:
        raise Exception("Could not detect base version in the following file: -> %s \n" % (file))
    if target_version_result is not None:
        current_target_version = target_version_result.group(1)
    else:
        raise Exception("Could not detect target version in the following file: -> %s \n" % (file))
    file_string = file_string.replace(current_base_version, self.base_version)
    file_string = file_string.replace(current_target_version, self.target_version)
    file_handle = open(file, 'w')
    file_handle.write(file_string)
    file_handle.close()