I have a file like this:
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
a b c invalid # separated by tabs
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
I need to change "a b c invalid" to "a b reviewed rd" (tab-separated).
Basically, for any line that ends with invalid, I need to replace the third and fourth fields with reviewed and rd (tab-separated) while keeping the first and second words on that line.
I have started with something like this, but it doesn't do exactly what I want:
f1 = open('fileInput', 'r')
f2 = open('fileInput'+".tmp", 'w')
for line in f1:
    f2.write(line.replace('invalid', 'reviewed\trd'))
f1.close()
f2.close()
A regex could be an option, but I'm not that good with them yet. Can someone help?
P.S. a, b and c are variables; I can't do an exact search on 'a', 'b', 'c'.
f1 = open('fileInput', 'r')
f2 = open('fileInput' + '.tmp', 'w')
for line in f1:
    if line.rstrip('\n').endswith("invalid"):
        # keep the first two fields, replace the rest
        f2.write("\t".join(line.split("\t")[:2] + ["reviewed", "rd"]) + "\n")
    else:
        f2.write(line)
f1.close()
f2.close()
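The same idea can be pulled into a small function that works on one line at a time, which makes it easy to try without touching any files. This is only a sketch: fix_line is a name made up for illustration, and it assumes the fields are separated by single tabs.

```python
def fix_line(line):
    """Return the line with its 3rd and 4th fields replaced if it ends with 'invalid'.

    fix_line is a hypothetical helper; it assumes tab-separated fields.
    """
    body = line.rstrip("\n")
    if not body.endswith("invalid"):
        return line
    fields = body.split("\t")
    # Keep the first two fields and replace everything after them.
    return "\t".join(fields[:2] + ["reviewed", "rd"]) + "\n"


sample = ["xxxxxxxxxxxxxxx\n", "a\tb\tc\tinvalid\n"]
fixed = [fix_line(line) for line in sample]
```

Lines that do not end with invalid come back unchanged, so the function can be applied to every line of the file.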
import re
pattern = re.compile(r'\t\S+\tinvalid$')
with open('data') as fin:
    with open('output', 'w') as fout:
        for line in fin:
            fout.write(pattern.sub('\treviewed\trd', line))
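To see what that pattern does, it can be tried on single strings first. A small sketch using the sample line from the question (the variable names changed and unchanged are just for illustration). Without re.MULTILINE, $ also matches just before a trailing newline, so the newline at the end of each line survives.

```python
import re

# A tab, one field without whitespace, a tab, then "invalid" at the end of the line.
pattern = re.compile(r'\t\S+\tinvalid$')

# A line that ends with "invalid": the last two fields are replaced.
changed = pattern.sub('\treviewed\trd', 'a\tb\tc\tinvalid\n')

# A line that does not match is returned as it is.
unchanged = pattern.sub('\treviewed\trd', 'xxxxxxxxxxxxxxx\n')
```

Note that \S+ only matches a third field with no whitespace in it; if that field can contain spaces, something like [^\t]+ may be a better fit.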
import csv
with open('input.tab', newline='') as fin, open('output.tab', 'w', newline='') as fout:
    tabin = csv.reader(fin, delimiter='\t')
    tabout = csv.writer(fout, delimiter='\t', lineterminator='\n')
    for row in tabin:
        if len(row) == 4 and row[-1] == 'invalid':
            tabout.writerow(row[:2] + ['reviewed', 'rd'])
        else:
            tabout.writerow(row)  # copy every other row unchanged
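All of the answers above write to a second file. If the goal is to end up with the original file changed, one option is to write a temporary file and then swap it over the original with os.replace. A rough sketch, assuming tab-separated fields; rewrite_in_place and mark_reviewed are hypothetical names, and the demo runs on a throwaway file in a temporary directory.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Write transformed lines to path + '.tmp', then swap it over the original.

    rewrite_in_place and transform are hypothetical names used for illustration.
    """
    tmp_path = path + ".tmp"
    with open(path) as fin, open(tmp_path, "w") as fout:
        for line in fin:
            fout.write(transform(line))
    os.replace(tmp_path, path)  # the temporary file takes the original's place


def mark_reviewed(line):
    """Replace the 3rd and 4th tab-separated fields when the last one is 'invalid'."""
    fields = line.rstrip("\n").split("\t")
    if fields[-1] == "invalid":
        fields = fields[:2] + ["reviewed", "rd"]
    return "\t".join(fields) + "\n"


# Try it on a throwaway file rather than on real data.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "fileInput")
    with open(demo_path, "w") as f:
        f.write("xxxxxxxxxxxxxxx\na\tb\tc\tinvalid\n")
    rewrite_in_place(demo_path, mark_reviewed)
    with open(demo_path) as f:
        result = f.read()
```

Both files are closed before os.replace is called, which matters on platforms that refuse to replace an open file.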
Related
Extracting a certain part of a text file
In this example I want to extract the lines from d to f.
input.txt contains:
a
d
c
b
e
f
g
a
a
output.txt should contain the lines from d to f,
but this program copies everything from d to the last line of input.txt:
f = open('input.txt')
f1 = open('output.txt', 'a')
intermediate_variable=False
for line in f:
    if 'd' in line:
        intermediate_variable=True
    if intermediate_variable==True:
        f1.write(line)
f1.close()
f.close()
I think this should do it:
with open('input.txt') as f:
    contents = f.read()
with open('output.txt', 'a') as f1:
    f1.write(contents[contents.index("d"):contents.index("f")])  # stops just before "f"
There are more convenient ways to read and write files. This version uses a generator and the 'with' keyword (a context manager), which closes the file for you automatically. Generators (functions that use 'yield') are nice because they hand you the file one line at a time, although if you call next() on them yourself you have to catch StopIteration in a try/except block.
def reader(filename):
    with open(filename, 'r') as fin:
        for line in fin:
            yield line
def writer(filename, data):
    with open(filename, 'w') as fout:  # change 'w' to 'a' to append ('w' overwrites)
        fout.write(data)
if __name__ == "__main__":
    a = reader("input.txt")
    selected = []
    copying = False
    while True:
        try:
            temp = next(a)
        except StopIteration:
            break
        if 'd' in temp:
            copying = True
        if copying:
            selected.append(temp)
            # this version of the above answer captures the 'f' line as well
            if 'f' in temp:
                break
    writer("output.txt", "".join(selected))
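When a generator is consumed with a for loop, the loop deals with StopIteration itself, so no try/except is needed. A sketch of the same extraction written that way; lines_between is a hypothetical name, the list stands in for input.txt, and the markers are compared to whole lines rather than searched for as substrings.

```python
def lines_between(lines, start, end):
    """Yield lines from the first one equal to start through the next one equal to end."""
    copying = False
    for line in lines:
        if line.strip() == start:
            copying = True
        if copying:
            yield line
            if line.strip() == end:
                return  # stop right after the end marker


# In-memory stand-in for the lines of input.txt.
sample = ["a\n", "d\n", "c\n", "b\n", "e\n", "f\n", "g\n", "a\n", "a\n"]
selected = list(lines_between(sample, "d", "f"))
```

Because it accepts any iterable of lines, an open file object can be passed in directly and the result written out with writelines.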
Straightforward:
### load all the data at once, fine for small files:
with open('input.txt', 'r') as f:
    content = f.read().splitlines()  ## use f.readlines() to have the newline chars included
### slice what you need from content:
selection = content[content.index("d"):content.index("f")]
## use content.index("f")+1 to include "f" in the output.
### write selection to output:
with open('output.txt', 'a') as f:
    f.write('\n'.join(selection))
    ## or:
    # for line in selection:
    #     f.write(line + '\n')
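One thing to watch with the slicing approach: list.index raises ValueError when a marker is missing, and it searches from the beginning, so an "f" that appears before "d" would give an empty slice. A sketch that guards against both; slice_between is a made-up name, and returning an empty list for a missing marker is just one possible choice.

```python
def slice_between(content, start, end):
    """Return the items from start through end (inclusive), or [] if either is missing."""
    try:
        first = content.index(start)
        last = content.index(end, first)  # look for end only from start onwards
    except ValueError:
        return []
    return content[first:last + 1]


# The same lines as in the question's input.txt, without newlines.
content = "a d c b e f g a a".split()
selection = slice_between(content, "d", "f")
```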
I have keywords to search for in one file, say abc.txt, and my data is in another file, def.txt.
I want Python code that looks for the keywords from abc.txt in def.txt and, where one is present, writes those lines to a new file.
Thank you.
I tried writing the code, but it didn't work.
Here is the code I wrote:
f = open('/home/vivek/Documents/abc.txt')
f1 = open('output.txt', 'a')
f2 = open('/home/vivek/Documents/def.txt', 'r')
# doIHaveToCopyTheLine=False
for line in f.readlines():
    if f2 in line:
        f1.write(line)
f1.close()
f.close()
f2.close()
Load the keywords into a set; then you can check the other file line by line and write to the output file whenever a keyword appears in the line.
with open('/path/to/keywords.txt') as f:
    # assuming one keyword per line; blank lines are skipped because '' would match every line
    keywords = set(line.strip() for line in f if line.strip())
with open('/path/to/search_me.txt') as f, open('/path/to/outfile.txt', 'w') as outfile:
    for line in f:
        if any(kw in line for kw in keywords):
            outfile.write(line)
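Keep in mind that kw in line is a substring test, so the keyword cat also matches a line containing concatenate. If whole-word matches are wanted, one possibility is a single compiled regex with word boundaries. A sketch under that assumption; build_matcher is a hypothetical helper and the keyword set is expected to be non-empty.

```python
import re


def build_matcher(keywords):
    """Compile one regex that matches any keyword as a whole word.

    build_matcher is a hypothetical helper; keywords are escaped so they are taken literally.
    """
    # Longest keywords first, so a longer phrase is tried before a shorter prefix of it.
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(kw) for kw in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b")


matcher = build_matcher({"cat", "dog"})
lines = ["my cat sleeps\n", "concatenate strings\n", "hot dog\n"]
matches = [line for line in lines if matcher.search(line)]
```

The word boundary \b only behaves as expected for keywords that start and end with a letter, digit or underscore.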
You should collect all the words from abc.txt in a set and then search for them in def.txt:
word_set = set()
with open('/home/vivek/Documents/abc.txt') as f:
    for line in f:
        word_set.add(line.strip())
f1 = open('output.txt', 'a')
with open('/home/vivek/Documents/def.txt') as f:
    for line in f:
        find = False
        for word in word_set:
            if word in line:
                find = True
                break
        if find:
            f1.write(line)
f1.close()
You can try this code:
with open("keyword.txt", "r") as keyword_file:
    keywords = keyword_file.read().split()
with open("data.txt", "r") as data_file, open("output.txt", "w") as output_file:
    for line in data_file:
        line = line.strip()
        for word in keywords:
            if word in line:
                print(line)
                output_file.write(line + '\n')
                break
In addition to sytech's answer, you may try this:
with open('abc.txt') as kw_obj, open('def.txt') as in_obj:
    keywords = set(kw_obj.read().split())
    in_lines = in_obj.readlines()
# keep each matching line once, in file order
match_lines = [line for line in in_lines if any(keyword in line for keyword in keywords)]
if match_lines:
    with open('out.txt', 'w') as out:
        out.write(''.join(match_lines))
I'm trying to replace a value in a specific line in a text file.
My text file contains the search term, the count for that search term, and the date and time.
Text file:
MemTotal,5,2016-07-30 12:02:33,781
model name,3,2016-07-30 13:37:59,074
model,3,2016-07-30 15:39:59,075
How can I replace, for example, the count for the search term on line 2 (model name,3,2016-07-30 13:37:59,074)?
This is what I have so far:
f = open('file.log','r')
filedata = f.read()
f.close()
newdata = filedata.replace("2", "3")
f = open('file.log', 'w')
f.write(newdata)
f.close()
It replaces every 2 in the file.
You have to change three things in your code to get the job done:
Read the file using readlines.
filedata = f.readlines()
Modify the line you want to change (keep in mind that Python indices start at 0 and don't forget to add a newline character \n at the end of the string):
filedata[1] = 'new count,new search term,new date and time\n'
Save the file using a for loop:
for line in filedata:
    f.write(line)
Here is the full code (notice I used the with context manager to open/close the file):
with open('file.log', 'r') as f:
    filedata = f.readlines()
filedata[1] = 'new count,new search term,new date and time\n'
with open('file.log', 'w') as f:
    for line in filedata:
        f.write(line)
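If only the count should change and the rest of the line should stay as it is, the line can be split into its fields first. A sketch that assumes the layout shown in the question (search term, count, then the timestamp); set_count is a hypothetical name and the list stands in for the result of readlines().

```python
def set_count(lines, term, new_count):
    """Return a copy of lines with the count replaced on lines whose first field is term.

    set_count is a hypothetical helper; it assumes the layout 'term,count,timestamp'.
    """
    updated = []
    for line in lines:
        # Split into at most three parts so the comma inside the timestamp is left alone.
        parts = line.split(",", 2)
        if len(parts) == 3 and parts[0] == term:
            parts[1] = str(new_count)
            line = ",".join(parts)
        updated.append(line)
    return updated


log_lines = [
    "MemTotal,5,2016-07-30 12:02:33,781\n",
    "model name,3,2016-07-30 13:37:59,074\n",
    "model,3,2016-07-30 15:39:59,075\n",
]
new_lines = set_count(log_lines, "model name", 4)
```

Comparing the whole first field, rather than using in, keeps the line for model from being touched when the term is model name, and the other way round.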
My solution:
count = 0
line_number = 0
replace = ""
f = open('examen.log','r')
term = "MemTotal"
for line in f.read().split('\n'):
    if term in line:
        replace = line.replace("5", "25", 1)
        line_number = count
    count = count + 1
print(line_number)
f.close()
f = open('examen.log','r')
filedata = f.readlines()
f.close()
filedata[line_number] = replace + '\n'
print(filedata[line_number])
print(filedata)
f = open('examen.log','w')
for line in filedata:
    f.write(line)
f.close()
You only need to define the search term and the replacement value.
I'm trying to remove the strings "a" and "b" from a document. Here's what I am doing, but it's not working because a list doesn't have a replace method.
def filter_ab(filename):
    fileRef=open(filename)
    file_list=fileRef.readlines()
    filter="ab"
    for k in file_list:
        for j in k:
            if j in filter:
                file_list=file_list.replace(j,"")
You can use something like this:
f1 = open('file1.txt', 'r')
f2 = open('file2.txt', 'w')
for line in f1:
    f2.write(line.replace('a', '').replace('b', ''))
f1.close()
f2.close()
Just use this:
def filter_ab(filename):
    lines = []
    with open(filename, "r") as fh:
        for line in fh.readlines():
            line = line.replace("a", "")
            line = line.replace("b", "")
            lines.append(line)
    with open(filename, "w") as fh:
        for line in lines:
            fh.write(line)
str.translate is actually quite convenient for something like this:
with open('file1.txt', 'r') as f1, open('file2.txt', 'w') as f2:
    table = str.maketrans('', '', 'ab')  # Python 3; in Python 2 use line.translate(None, 'ab')
    for line in f1:
        f2.write(line.translate(table))
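Since the original difficulty was that a list has no replace method, it may help to see the removal applied to each string in a list. A sketch for Python 3; strip_chars is a made-up name and the sample strings are only for illustration.

```python
def strip_chars(lines, chars):
    """Return new strings with every character in chars removed (Python 3)."""
    table = str.maketrans("", "", chars)  # the third argument lists characters to delete
    return [line.translate(table) for line in lines]


# Stand-in for the list returned by readlines().
file_list = ["abacus\n", "cab\n", "xyz\n"]
cleaned = strip_chars(file_list, "ab")

# The same result with chained replace calls, one per character.
cleaned_with_replace = [line.replace("a", "").replace("b", "") for line in file_list]
```

Either way the work is done on each string, not on the list itself; the list comprehension just collects the results.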
The problem I am having at the moment (being new to Python) is writing strings to a text file. Either the strings have no line breaks between them, or there is a line break after every character. Code to follow:
import string, io
FileName = input("Arb file name (.txt): ")
MyFile = open(FileName, 'r')
TempFile = open('TempFile.txt', 'w', encoding='UTF-8')
for m_line in MyFile:
    m_line = m_line.strip()
    m_line = m_line.split(": ", 1)
    if len(m_line) > 1:
        del m_line[0]
    #print(m_line)
    MyString = str(m_line)
    MyString = MyString.strip("'[]")
    TempFile.write(MyString)
MyFile.close()
TempFile.close()
My input looks like this:
1: Jargon
2: Python
3: Yada Yada
4: Stuck
My output when I do this is:
JargonPythonYada YadaStuck
I then modify the source code to this:
import string, io
FileName = input("Arb File Name (.txt): ")
MyFile = open(FileName, 'r')
TempFile = open('TempFile.txt', 'w', encoding='UTF-8')
for m_line in MyFile:
    m_line = m_line.strip()
    m_line = m_line.split(": ", 1)
    if len(m_line) > 1:
        del m_line[0]
    #print(m_line)
    MyString = str(m_line)
    MyString = MyString.strip("'[]")
    #print(MyString)
    TempFile.write('\n'.join(MyString))
MyFile.close()
TempFile.close()
Same input and my output looks like this:
J
a
r
g
o
nP
y
t
h
o
nY
a
d
a
Y
a
d
aS
t
u
c
k
Ideally, I would like each of the words to appear on a separate line without the numbers in front of them.
Thanks,
MarleyH
You have to write a '\n' after each line, since you're stripping the original '\n'.
Your idea of using '\n'.join() doesn't work because it uses \n to join the string, inserting it between each character of the string. You need a single \n after each name instead.
import string, io
FileName = input("Arb file name (.txt): ")
with open(FileName, 'r') as MyFile:
    with open('TempFile.txt', 'w', encoding='UTF-8') as TempFile:
        for line in MyFile:
            line = line.strip().split(": ", 1)
            TempFile.write(line[1] + '\n')
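The difference between joining a string and joining a list of strings can be seen directly in a couple of lines. A small sketch; the variable names are only for illustration.

```python
# Joining a string treats it as a sequence of characters,
# so the separator ends up between every pair of letters.
per_character = "\n".join("Jargon")

# Joining a list of strings puts the separator between whole items.
words = ["Jargon", "Python", "Yada Yada", "Stuck"]
per_word = "\n".join(words)
```

In neither case is a newline added after the last item, which is why the question's second attempt ran the last letter of one word into the first letter of the next.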
fileName = input("Arb file name (.txt): ")
tempName = 'TempFile.txt'
with open(fileName) as inf, open(tempName, 'w', encoding='UTF-8') as outf:
    for line in inf:
        line = line.strip().split(": ", 1)[-1]
        #print(line)
        outf.write(line + '\n')
Problems:
the result of str.split() is a list (this is why, when you cast it to str, you get ['my item']).
write does not add a newline; if you want one, you have to add it explicitly.
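Both points can be checked without touching a file. A sketch that assumes input lines of the form "1: Jargon" (inferred from the split(": ", 1) call in the question) and uses io.StringIO as an in-memory stand-in for the output file.

```python
import io

# Assumed input format: a number, a colon and a space, then the word.
parts = "1: Jargon".split(": ", 1)   # split() returns a list of two strings
as_text = str(parts)                 # str() of a list keeps the brackets and quotes
word = parts[-1]                     # indexing gives the item itself

buffer = io.StringIO()               # in-memory stand-in for the output file
for item in ["Jargon", "Python"]:
    buffer.write(item + "\n")        # write() adds nothing, so append the newline
written = buffer.getvalue()
```

Taking the item out of the list with an index avoids the str() and strip("'[]") round trip altogether.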