So I want to write a function annotate() which takes a file name as a parameter and prints it to a new file out_annotated.txt with:
the original text
row number
the total amount of words up to and including that row.
Let's say my .txt file is as follows:
hello you
the sun is warm
I like dogs
I want the output to be:
hello you 1 2
the sun is warm 2 6
I like dogs 3 9
The code I used before was
def main():
    length = count_rows("file.txt")
    print(length)

def count_rows(fname):
    with open(fname) as f:
        for i, l in enumerate(f):
            pass
    return i + 1

if __name__ == "__main__":
    main()
But how do I progress into making a new .txt file with the output including row numbers and the total amount of words?
You can create the name of your output file using os.path:
base, ext = os.path.splitext(fname)
out_path = base + "_annotated" + ext
Now you can open them both, one for reading and one for writing, while keeping a running total of words. Using enumerate as you did is a good way to keep track of line numbers, but according to your example you want to start from 1, so pass start=1. We will split each line to count its words:
total_words = 0
with open(fname) as f_in, open(out_path, 'w') as f_out:
    for line_num, line in enumerate(f_in, start=1):
        total_words += len(line.split())
Lastly, because you want to append to the end of each line, you need to drop the trailing '\n' first. Strip each line, then write it followed by the row number and the running word count (still inside the loop):
        f_out.write("{} {} {}\n".format(line.strip(), line_num, total_words))
All together we have:
import os

def count_rows(fname):
    base, ext = os.path.splitext(fname)
    out_path = base + "_annotated" + ext
    total_words = 0
    with open(fname) as f_in, open(out_path, 'w') as f_out:
        for line_num, line in enumerate(f_in, start=1):
            total_words += len(line.split())
            f_out.write("{} {} {}\n".format(line.strip(), line_num, total_words))
Calling count_rows("file.txt") on a file with the contents from your example produces a file called file_annotated.txt containing:
hello you 1 2
the sun is warm 2 6
I like dogs 3 9
Something like this will work:
def AppendNumbers(input_file, output_file):
    # initialize variables:
    total_number_of_words = 0
    with open(input_file, 'r') as in_file, open(output_file, 'w') as out_file:
        # number the rows starting from 1:
        for row_number, line in enumerate(in_file, start=1):
            # add this line's word count to the total
            # (split() without arguments ignores repeated spaces and the trailing newline):
            total_number_of_words += len(line.split())
            # strip the newline, then append the row number and the running total:
            new_line = line.rstrip('\n') + ' ' + str(row_number) + ' ' + str(total_number_of_words) + '\n'
            # write new line to outfile:
            out_file.write(new_line)

if __name__ == "__main__":
    input_file = 'file.txt'
    output_file = 'out_file.txt'
    AppendNumbers(input_file, output_file)
So, I'm trying to create a program that will automatically edit a specific set of characters in a file (it will read and replace them). No other data can be moved in the file otherwise it might become corrupted so I need to replace the text in the exact same place as before. I have looked around and found nothing useful but here is my code so far:
l = 3
w = 0
with open("InidCrd000.crd") as myfile:
    hexWord = myfile.readlines()[l].split()[w]
codeA = hexWord[58]
codeB = hexWord[59]
print("Current value: ", codeA, codeB)
codeA = " "
codeB = "Ð"
print("New value: ", codeA, codeB)
EDIT - I now have this code (credit - Ilayaraja), which works but then it breaks the file up into lines and places random data in incorrect positions (although the inputted data is in the correct position):
def replace(filepath, lineno, position, newchar):
    with open(filepath, "r") as reader:
        lines = reader.readlines()
    l = lines[lineno-1]
    l = l[0:position] + newchar + l[position+1:]
    lines[lineno-1] = l
    with open(filepath, "w") as writer:
        writer.writelines(lines)
replace("InidCrd000.crd", 4, 57, "")
replace("InidCrd000.crd", 4, 58, "Ð")
If you want the file for testing, here it is: 1drv.ms/u/s!AqRsP9xMA0g1iqMl-ZQbXUqX2WY8aA (It's a onedrive file)
First, find the code you want to change using this (the with block closes the file for you, so no explicit close() is needed):
l = 3
w = 0
with open("InidCrd000.crd") as myfile:
    hexWord = myfile.readlines()[l].split()[w]
codeA = hexWord[58]
codeB = hexWord[59]
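Then change it like this. Note that this replaces every occurrence of codeA on each line, not just the one at that position:
import fileinput
# fileToSearch is the path of the file; textToReplace is the replacement text
with fileinput.FileInput(fileToSearch, inplace=True, backup='.bak') as file:
    for line in file:
        # with inplace=True, whatever is printed is written back to the file
        print(line.replace(codeA, textToReplace), end='')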
Define a function that takes the path of the file (filepath), the line number (lineno, 1 to N), the position of the character in the line (position, 0 to N), and the new character to write there (newchar):
def replace(filepath, lineno, position, newchar):
    with open(filepath, "r") as reader:
        lines = reader.readlines()
    l = lines[lineno-1]
    l = l[0:position] + newchar + l[position+1:]
    lines[lineno-1] = l
    with open(filepath, "w") as writer:
        writer.writelines(lines)
You can call the function as follows to replace the characters:
replace("InidCrd000.crd", 3, 58, " ")
replace("InidCrd000.crd", 3, 59, "Ð")
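The corruption described in the question is probably a side effect of treating the .crd file as text: text mode decodes and re-encodes every byte and may translate line endings, so data you never touched can still change. If the file is really binary, a safer route is to overwrite the bytes in place without reading lines at all. Here is a minimal sketch; patch_bytes is a made-up helper name, and it assumes you already know the absolute byte offset you want to change (the line/column numbers from the question do not map to it directly).

```python
import os


def patch_bytes(filepath, offset, new_bytes):
    """Overwrite len(new_bytes) bytes starting at byte `offset`, in place.

    Hypothetical helper: `offset` is an absolute byte position in the file,
    which is an assumption, not something given in the question.
    """
    size = os.path.getsize(filepath)
    # Refuse to write past the end, which would grow the file.
    if offset < 0 or offset + len(new_bytes) > size:
        raise ValueError("patch would fall outside the file")
    # "r+b" opens for reading and writing in binary mode without truncating.
    with open(filepath, "r+b") as f:
        f.seek(offset)
        f.write(new_bytes)
```

For example, patch_bytes("InidCrd000.crd", some_offset, b"\xd0") would write the single byte 0xD0 (the Latin-1 code for "Ð") and leave every other byte where it was. Work on a copy of the file first.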
I'm currently working on a program that I have been assigned that is just getting the word and line count of a sonnet. The first bit of code here works and is the proper output my professor is looking for, even though it includes the first 2 lines of the sonnet.
import string
def main():
    ifName = input("What file would you like to analyze? ")
    ofName = input("What file should the results be written to? ")
    infile = open(ifName, "r")
    outfile = open(ofName, "w")
    lineCount = 0
    wordCount = 0
    for line in infile:
        lineCount +=1
        wordLine = line.split()
        L = len(wordLine)
        wordCount += L
    print("The file", ifName, "had:", file= outfile)
    print("words =", wordCount, file= outfile)
    print("lines =", lineCount, file= outfile)
    print("The results have been printed to:", outfile)
    infile.close
    outfile.close
main()
However, the next part of the assignment is to get the same results using a second function, "countNum" with the parameter of "line". So countNum(line).
Here is the code I have been messing around with to see if I can get it to work.
import string
def countNum(line):
    wordCount = 0
    wordLine = line.split()
    L = len(wordLine)
    wordCount +=L
    print(wordCount)
def main():
    ifName = input("What file would you like to analyze? ")
    ofName = input("What file should the results be written to? ")
    infile = open(ifName, "r")
    outfile = open(ofName, "w")
    lineCount = 0
    wordCount = 0
    for line in infile:
        lineCount +=1
        wordTotal += countNum(line)
        ##wordLine = line.split()
        ##L = len(wordLine)
        ##wordCount += L
##    print("The file", ifName, "had:", file= outfile)
##    print("words =", wordCount, file= outfile)
##    print("lines =", lineCount, file= outfile)
##    print("The results have been printed to:", outfile)
    infile.close
    outfile.close
main()
If you were wondering, this is the sonnet.txt file:
Shakespeare’s Sonnet 18
Shall I compare thee to a summer's day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer's lease hath all too short a date:
Sometime too hot the eye of heaven shines,
And often is his gold complexion dimm'd;
And every fair from fair sometime declines,
By chance or nature's changing course untrimm'd;
But thy eternal summer shall not fade
Nor lose possession of that fair thou owest;
Nor shall Death brag thou wander'st in his shade,
When in eternal lines to time thou growest:
So long as men can breathe or eyes can see,
So long lives this, and this gives life to thee.
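Your countNum is printing the result instead of returning it, so wordTotal += countNum(line) tries to add None. Note also that main() initialises wordCount but then adds to wordTotal; pick one name. Return the count instead:
def countNum(line):
    return len(line.split())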
Also, your close methods need to have () after them. They aren't actually executing:
infile.close
outfile.close
to
infile.close()
outfile.close()
def count(line):
    return len(line.split())

def main():
    infile = open(input("File to analyse: "), "r")
    file = infile.read().splitlines()
    infile.close()
    lineCount = len(file)
    wordTotal = 0
    for line in file:
        words = count(line)
        wordTotal += words
    print("Lines:", lineCount)
    print("Words:", wordTotal)

main()
You can use len to get the length of a list. The list of rows is given by f.readlines() and the list of words in a line is given by line.split(). The builtin sum then adds up the per-line word counts, which is the idiomatic way to total a list:
https://docs.python.org/3/library/functions.html#sum
You should use Python's automatic closing by using the with keyword. https://docs.python.org/3/tutorial/inputoutput.html
so we'd have:
with open(infilename, "r") as infile, open(outfilename, "w") as outfile:
    lines = infile.readlines()
    linecount = len(lines)
    wordcount = sum([len(line.split()) for line in lines])
    print(linecount)
    print(wordcount)
See https://docs.python.org/3.6/tutorial/datastructures.html#list-comprehensions for list comprehensions
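Putting the points from the answers above together (return instead of print, with instead of manual close, and actually writing the results to the output file), the assignment could look roughly like this. This is only a sketch: count_num and count_file are names chosen here for illustration, and the file names are passed in as arguments instead of being read with input() so the function is easier to try out.

```python
def count_num(line):
    """Return the number of whitespace-separated words in one line."""
    return len(line.split())


def count_file(in_name, out_name):
    """Count lines and words in in_name and write a short report to out_name."""
    line_count = 0
    word_count = 0
    # Both files are closed automatically when the with block ends.
    with open(in_name, "r") as infile, open(out_name, "w") as outfile:
        for line in infile:
            line_count += 1
            word_count += count_num(line)
        print("The file", in_name, "had:", file=outfile)
        print("words =", word_count, file=outfile)
        print("lines =", line_count, file=outfile)
    return line_count, word_count
```

Calling count_file("sonnet.txt", "results.txt") would then write the report to results.txt and also return the two counts.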
I'm trying to replace a value in a specific line in a text file.
Each line of my text file contains the search term, its count, and the date and time.
Text file:
MemTotal,5,2016-07-30 12:02:33,781
model name,3,2016-07-30 13:37:59,074
model,3,2016-07-30 15:39:59,075
How can I replace for example the count of the searchterm for line 2 (model name,3,2016-07-30 13:37:59,074)?
This is what I have already:
f = open('file.log','r')
filedata = f.read()
f.close()
newdata = filedata.replace("2", "3")
f = open('file.log', 'w')
f.write(newdata)
f.close()
It replaces every 2 in the file.
You have to change three things in your code to get the job done:
Read the file using readlines.
filedata = f.readlines()
Modify the line you want to change (keep in mind that Python indices start at 0 and don't forget to add a newline character \n at the end of the string):
filedata[1] = 'new count,new search term,new date and time\n'
Save the file using a for loop:
for line in filedata:
    f.write(line)
Here is the full code (notice I used the with context manager to open/close the file):
with open('file.log', 'r') as f:
    filedata = f.readlines()
filedata[1] = 'new count,new search term,new date and time\n'
with open('file.log', 'w') as f:
    for line in filedata:
        f.write(line)
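If you only want to change the count and keep the rest of the line, you can split the line instead of replacing it wholesale. A rough sketch, assuming every line looks like the sample (term, count, timestamp, with the timestamp itself containing a comma); set_count is a name made up for this example:

```python
def set_count(path, lineno, new_count):
    """Replace the count field on line `lineno` (1-based) of the log file."""
    with open(path, "r") as f:
        lines = f.readlines()
    # Split on the first two commas only, so the comma inside the
    # timestamp ("13:37:59,074") stays part of the last field.
    term, _old_count, rest = lines[lineno - 1].split(",", 2)
    # `rest` still carries the trailing newline from readlines().
    lines[lineno - 1] = "{},{},{}".format(term, new_count, rest)
    with open(path, "w") as f:
        f.writelines(lines)
```

With the sample file, set_count('file.log', 2, 4) would turn the second line into model name,4,2016-07-30 13:37:59,074 and leave the other lines untouched.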
My solution:
count = 0
line_number = 0
replace = ""
f = open('examen.log','r')
term = "MemTotal"
for line in f.read().split('\n'):
    if term in line:
        # replace only the first "5" on the matching line
        replace = line.replace("5", "25", 1)
        line_number = count
    count = count + 1
print(line_number)
f.close()
f = open('examen.log','r')
filedata = f.readlines()
f.close()
filedata[line_number] = replace + '\n'
print(filedata[line_number])
print(filedata)
f = open('examen.log','w')
for line in filedata:
    f.write(line)
f.close()
You only need to define the searchterm & the replace value
output_filename = r"C:\Users\guage\Output.txt"
RRA:
GREQ-299684_6j
GREQ-299684_6k
CZM:
V-GREQ-299684_6k
V-GREQ-299524_9
F_65624_1
R-GREQ-299680_5
DUN:
FB_71125_1
FR:
VQ-299659_18
VR-GREQ-299659_19
VEQ-299659_28
VR-GREQ-299659_31
VR-GREQ-299659_32
VEQ-299576_1
GED:
VEQ-299622_2
VR-GREQ-299618_13
VR-GREQ-299559_1
VR-GREQ-299524_14
FB_65624_1
VR-GREQ-299645_1
MNT:
FB_71125_1
FB_71125_2
VR-534_4
The above is the content of the .txt file. How can I read each section of it separately? For example:
RRA:VR-GREQ-299684_6j VR-GREQ-299684_6k VR-GREQ-299606_3 VR-GREQ-299606_4 VR-GREQ-299606_5 VR-GREQ-299606_7
and save it in a variable or something similar. Later I want to read CZM separately, and so on. This is what I have so far:
with open(output_filename, 'r') as f:
    excel = f.read()
But how do I read the sections separately? Can someone tell me how to do it?
Something like this:
def read_file_with_custom_record_separator(file_path, delimiter=':'):
    data = ""
    with open(file_path) as fh:
        for line in fh:
            # a line ending in the delimiter (e.g. "RRA:") starts a new record
            if line.strip().endswith(delimiter) and data != "":
                print("VARIABLE:\n<", data, ">\n")
                data = line
            else:
                data += line
    print("LAST VARIABLE:\n<", data, ">\n")
And then:
read_file_with_custom_record_separator("input.txt", ":")
You can use the trailing : in the file as an indicator to start a new file, like this:
sf = None
with open(filename, 'r') as f:
    for line in f:
        line = line.strip()                     # get rid of the unnecessary white chars
        if line.endswith(":"):                  # if the last char is ":"
            if sf:
                sf.close()                      # finish the previous section's file
            sf = open(line[:-1] + ".txt", 'w')  # create a new file named after the line (except the ":")
        elif line and sf:
            sf.write(line + "\n")               # write the data to the opened file
if sf:
    sf.close()
Then you should get collection of files:
RRA.txt
CZM.txt
DUN.txt
# etc
which contains all the appropriate data:
RRA.txt
VR-GREQ-299684_6j
VR-GREQ-299684_6k
VR-GREQ-299606_3
VR-GREQ-299606_4
VR-GREQ-299606_5
VR-GREQ-299606_7
CZM.txt
VR-GREQ-299684_6k
VR-GREQ-299606_6
VR-GREQ-299606_8
VR-GREQ-299640_1
VR-GREQ-299640_5
VR-GREQ-299524_9
FB_65624_1
VR-GREQ-299680_5
DUN.txt
FB_71125_1
# and so on
You can replace the sf = open and the sf.write with whatever way you feel is best to separate the data. Here, I use files.
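If you would rather keep everything in memory than write one file per section, the same ":" test can fill a dictionary that maps each header to its list of entries. A minimal sketch, assuming every header line ends with ":" and every other non-empty line belongs to the most recent header; read_sections is a hypothetical name:

```python
def read_sections(path):
    """Return a dict such as {"RRA": [...], "CZM": [...]} from the file."""
    sections = {}
    current = None
    with open(path, "r") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue                        # skip blank lines
            if line.endswith(":"):
                current = line[:-1]             # header name without the ":"
                sections[current] = []
            elif current is not None:
                sections[current].append(line)  # entry under the current header
    return sections
```

After sections = read_sections(output_filename), sections["RRA"] is the list of RRA entries, and " ".join(sections["CZM"]) gives the CZM entries on one line.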
You can iterate over the file and use the lines and indices to your advantage; something like this:
with open(output_filename, 'r') as f:
    for index, line in enumerate(f):
        # here you have access to each line and its index
        # so you can save any number of lines you wish
        pass
What about reading it into a list and then processing its elements as you prefer?
>>> f = open('myfile.txt', 'r').readlines()
>>> len(f)
46
>>> f[0]
RRA:
>>> f[-1]
VR-GREQ-299534_4
>>> f[:3]
['RRA:\n', 'VR-GREQ-299684_6j \n', 'VR-GREQ-299684_6k \n']
>>>
>>> [l for l in f if l.startswith('FB_')]
['FB_65624_1 \n', 'FB_71125_1 \n', 'FB_69228_1 \n', 'FB_65624_1 \n', 'FB_71125_1 \n', 'FB_71125_2 \n']
>>>
i have this piece of code:
asm = open(infile)
asmw = open(outfile, "w")
shutil.copyfile(infile, outfile)
for x in range(0, 8):
    xorreg.append("xor " + reg[x] + ", " + reg[x])
for line in asm:
    if any(s in line for s in xorreg):
        found += line.count(xorreg[x])
        print line
I want to write some text lines into the file right before "line" (the one printed).
How can I do that?
Thanks
This script inserts a new line, The greatest wizard of all times was:, before every line containing the string Gandalf:
# show what's in the file
with open("some_file.txt", 'r') as f:
    print(f.read())
new_content = []
with open("some_file.txt", "r") as asmr:
    for line in asmr.readlines():
        if "Gandalf" in line:
            # we have a match: add the extra line before it
            new_content.append("The greatest wizard of all times was:\n")
        # keep every original line
        new_content.append(line)
# write the file with the new content
with open("some_file.txt", "w") as asmw:
    asmw.writelines(new_content)
# show what's in the file now
with open("some_file.txt", 'r') as f:
    print(f.read())