With the code below I want to write each full sentence in which a word from my word lists appears to a .txt file, with the word count printed underneath each word. I got this working in the terminal, but I am struggling to get it into the .txt file. At the moment only the word count is written to the .txt file, while the sentences still print to the terminal. Does anybody know where I may be going wrong? Sorry for my lack of knowledge, I'm a beginner learning Python. Thanks
import re, os
pathWordLists = "E:\\Python\\WordLists"
searchfilesLists = os.listdir(pathWordLists)
pathWordbooks = "E:\\Python\\Books"
searchfilesbooks = os.listdir(pathWordbooks)
lush = open("WorkWork.txt", "w")
def searchDocs(word):
    for document in searchfilesbooks:
        file = os.path.join(pathWordbooks, document)
        text = open(file, "r")
        hit_count = 0
        for line in text:
            if re.findall(word, line):
                hit_count = hit_count +1
                print(document + " |" + line, end="")
        print(document + " => " + word + "=> "+ str(hit_count), file=lush)
        text.close()
    lush.flush()
    return
def searchWord():
    for document in searchfilesLists:
        file = os.path.join(pathWordLists, document)
        text = open(file, "r")
        for line in text:
            #print(line)
            searchDocs(line.strip())
        text.close()
    print("Finish")
searchWord()
The call that prints the sentences, print(document + " |" + line, end=""), is missing the file parameter, so its output goes to the terminal. Adding it should fix the problem:
print(document + " |" + line, end="", file=lush)
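To see the effect of the file parameter in isolation, here is a minimal sketch. The function name report_hits and the sample lines are made up for illustration, io.StringIO stands in for the open WorkWork.txt file, and a plain "in" test replaces re.findall to keep it short:

```python
import io

def report_hits(document, lines, word, out):
    """Write matching lines and the hit count for `word` to the stream `out`."""
    hit_count = 0
    for line in lines:
        if word in line:
            hit_count += 1
            # file=out sends this line to the stream instead of the terminal
            print(document + " |" + line.rstrip("\n"), file=out)
    print(document + " => " + word + " => " + str(hit_count), file=out)
    return hit_count

buffer = io.StringIO()  # stands in for an open text file
report_hits("book.txt", ["the cat sat\n", "a dog ran\n", "cat nap\n"], "cat", buffer)
print(buffer.getvalue(), end="")
```

Nothing reaches the terminal until the last line prints the buffer, which shows that both the sentences and the count ended up in the same stream.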
Try storing the result in a variable and then writing the variable to the file. Something like this:
def searchDocs(word):
    results = []
    for document in searchfilesbooks:
        file = os.path.join(pathWordbooks, document)
        with open(file, "r") as text:
            lines = text.readlines()
        hit_count = 0
        for line in lines:
            if re.findall(word, line):
                hit_count += 1
                # strip the line's own newline; join() adds one between entries
                results.append(document + " |" + line.rstrip("\n"))
        results.append(document + " => " + word + "=> "+ str(hit_count))
    # "a" appends; "w" would erase the previous word's results on every call
    with open("WorkWork.txt", "a") as f:
        f.write('\n'.join(results) + '\n')
I've been writing my Python code, but it always writes the output file with a blank line at the end. Is there a way to change my code so it doesn't write that last blank line?
def write_concordance(self, filename):
    """ Write the concordance entries to the output file(filename)
    See sample output files for format."""
    try:
        file_out = open(filename, "w")
    except FileNotFoundError:
        raise FileNotFoundError("File Not Found")
    word_lst = self.concordance_table.get_all_keys() #gets a list of all the words
    word_lst.sort() #orders it
    for i in word_lst:
        ln_num = self.concordance_table.get_value(i) #line number list
        ln_str = "" #string that will be written to file
        for c in ln_num:
            ln_str += " " + str(c) #loads line numbers as a string
        file_out.write(i + ":" + ln_str + "\n")
    file_out.close()
Line 13 in the screenshot of the output file (the trailing blank line) is what I need gone.
Put in a check so that the new line is not added for the last element of the list:
def write_concordance(self, filename):
    """ Write the concordance entries to the output file(filename)
    See sample output files for format."""
    try:
        file_out = open(filename, "w")
    except FileNotFoundError:
        raise FileNotFoundError("File Not Found")
    word_lst = self.concordance_table.get_all_keys() #gets a list of all the words
    word_lst.sort() #orders it
    for i in word_lst:
        ln_num = self.concordance_table.get_value(i) #line number list
        ln_str = "" #string that will be written to file
        for c in ln_num:
            ln_str += " " + str(c) #loads line numbers as a string
        file_out.write(i + ":" + ln_str)
        if i != word_lst[-1]:
            file_out.write("\n")
    file_out.close()
The issue is here:
file_out.write(i + ":" + ln_str + "\n")
The \n adds a new line.
The way to fix this is to rewrite it slightly:
ln_strs = []
for i in word_lst:
    ln_num = self.concordance_table.get_value(i) #line number list
    ln_str = " ".join(str(c) for c in ln_num) #str() is needed if the line numbers are ints
    ln_strs.append(f"{i}: {ln_str}") #same "word: 1 2 3" format as the original
file_out.write('\n'.join(ln_strs))
By the way, you should not use file_out = open() and file_out.close() but with open() as file_out:. That way the file is always closed, and an exception won't leave it hanging open.
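Putting the join approach and the with statement together could look roughly like the sketch below. It assumes the concordance is a plain dict that maps each word to a list of line numbers, which is only a stand-in for the concordance_table class in the question (its real API is not shown here), and it writes to a temporary folder:

```python
import os
import tempfile

def write_concordance(table, filename):
    """Write 'word: n1 n2 ...' lines, sorted by word, with no trailing newline."""
    entries = []
    for word in sorted(table):
        numbers = " ".join(str(n) for n in table[word])
        entries.append(word + ": " + numbers)
    # The with block closes the file even if write() raises
    with open(filename, "w") as file_out:
        file_out.write("\n".join(entries))

sample = {"pear": [2], "apple": [1, 3]}  # hypothetical stand-in for concordance_table
path = os.path.join(tempfile.mkdtemp(), "concordance.txt")
write_concordance(sample, path)
with open(path) as f:
    written = f.read()
print(repr(written))
```

Because join only puts "\n" between entries, the last entry is not followed by a newline, so no special case for the final word is needed.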
The below code is supposed to count instances of a particular word in a text file, though it seems to only work for individual letters. Using a string of two letters or more always returns a count of 0. I have checked, and the input I have been using should definitely not return a count of 0 for the given files.
Any ideas?
def count_of_word(filename, word_to_count):
    """Counts instances of a particular word in a file"""
    try:
        with open(filename) as file_object:
            contents = file_object.read()
    except FileNotFoundError:
        print("File " + filename + " not found")
    else:
        word_count = contents.lower().count(word_to_count)
        print("The count of the word '" + word_to_count + "' in " + filename + " is " + str(word_count))
You lower-case only the file contents, not the word you are searching for. Try:
word_count = contents.lower().count(word_to_count.lower())
That works for me - I get 1026 for the count of 'and' in the file you refer to.
EDIT: I suspected an encoding issue and suggested specifying the encoding, which worked:
open(filename, encoding='utf_8')
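One thing worth knowing when comparing counts: str.count counts substrings, so 'the' is also found inside 'other' and 'then'. A small sketch of the difference follows. The function names and the sample sentence are made up for illustration, and the file-reading step is left out so it runs on its own:

```python
import re

def count_substring(text, word):
    """Case-insensitive count of every occurrence, including inside longer words."""
    return text.lower().count(word.lower())

def count_whole_word(text, word):
    """Case-insensitive count of `word` only where it stands alone."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

sample = "The cat and the other cats. And then the end."
print(count_substring(sample, "the"))   # also counts "other" and "then"
print(count_whole_word(sample, "the"))  # only the standalone word
```

Which of the two you want depends on whether "word" in your task means a whole word or any run of letters.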
I did not change a single line of your code, and it works. I wonder whether this has to do with how you are passing 'the' or 'and' into the function; the call should be count_of_word('alice.txt', 'the').
def count_of_word(filename, word_to_count):
    """Counts instances of a particular word in a file"""
    try:
        with open(filename) as file_object:
            contents = file_object.read()
    except FileNotFoundError:
        print("File " + filename + " not found")
    else:
        word_count = contents.lower().count(word_to_count)
        print("The count of the word '" + word_to_count + "' in " + filename + " is " + str(word_count))
count_of_word('alice.txt', 'the')
count_of_word('alice.txt', 'a')
~/python/stack/sept/twenty_2$ python3.7 alice.py
The count of the word 'and' in alice.txt is 2505
The count of the word 'a' in alice.txt is 9804
import os
searchquery = 'word'
with open('Y:/Documents/result.txt', 'w') as f:
    for filename in os.listdir('Y:/Documents/scripts/script files'):
        with open('Y:/Documents/scripts/script files/' + filename) as currentFile:
            for line in currentFile:
                if searchquery in line:
                    start = line.find(searchquery)
                    end = line.find("R")
                    result = line[start:end]
                    print(result)
                    f.write(result + ' ' +filename[:-4] + '\n')
Now this works well to search for "word" and prints everything after "word" up until an "R", provided that the "R" is on the same line. However, if the "R" is on the next line, it won't print the text before it.
eg:
this should not be printed!
this should also not be printed! "word" = 12345
6789 "R" After this R should not be printed either!
In the case above, the 6789 on line 3 will not be printed with my current code. However, I want it to be. How do I make Python keep going over multiple lines until it reaches the "R"?
Thanks for any help!
It is normal that it does not print the content on the next line, because you are searching one line at a time. A better solution would be as follows.
import os
searchquery = 'word'
with open('Y:/Documents/result.txt', 'w') as f:
    for filename in os.listdir('Y:/Documents/scripts/script files'):
        with open('Y:/Documents/scripts/script files/' + filename) as currentFile:
            content = currentFile.read()
        start = content.find(searchquery)
        if start == -1:
            continue #searchquery is not in this file
        end = content.find("R", start) #first "R" after the match
        result = content[start:end].replace("\n", "")
        print(result)
        f.write(result + ' ' +filename[:-4] + '\n')
Please be advised, this will work only for a single occurrence. You will need to break it up further to print multiple occurrences.
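For multiple occurrences, one possible approach (a sketch, not the only way) is a non-greedy regular expression with re.DOTALL, so that a match can run across line breaks. The helper name extract_spans is made up, and the sample text is the example from the question:

```python
import re

def extract_spans(content, start_word, end_marker):
    """Return every stretch from start_word up to (not including) the next end_marker."""
    pattern = re.escape(start_word) + r".*?(?=" + re.escape(end_marker) + r")"
    # re.DOTALL lets "." match newlines, so a span may cross line boundaries
    matches = re.findall(pattern, content, flags=re.DOTALL)
    return [m.replace("\n", "") for m in matches]

sample = (
    "this should not be printed!\n"
    'this should also not be printed! "word" = 12345\n'
    '6789 "R" After this R should not be printed either!\n'
)
print(extract_spans(sample, "word", "R"))
```

Each returned string could then be written to the result file together with the file name, as in the loop above. A start word with no end marker after it produces no match.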
I wrote a script that opens my text file, searches for a certain word, selects the line that contains this word and splits it into three parts. It then takes the part that is a number and adds 1 to it, so every time I run the script the number goes up by one. Here is the script:
#!/usr/bin/env python
inputFile = open('CMakeLists.txt', 'r')
version = None
saved = ""
for line in inputFile:
    if "_PATCH " in line:
        print("inside: ", line)
        version = line
    else:
        saved += line
inputFile.close()
inputFile = open('CMakeLists.txt', 'w')
x = version.split('"')
print("x: ", x)
a = x[0]
b = int(x[1]) + 1
c = x[2]
new_version = str(a) + '"' + str(b) + '"' + str(c)
print("new_version: ", new_version)
inputFile.write(str(saved))
inputFile.write(str(new_version))
inputFile.close()
But my problem is that the new number is written at the end of the file; I want it to stay in its original place. Any ideas?
Thanks
The problem is that you write the new version number after the original file (without the version line):
inputFile.write(str(saved))
inputFile.write(str(new_version))
You could fix it by saving the lines before and after the line that contains the version separately and then save them in the right order:
#!/usr/bin/env python
inputFile = open('CMakeLists.txt', 'r')
version = None
savedBefore = ""
savedAfter = ""
for line in inputFile:
    if "_PATCH " in line:
        print("inside: ", line)
        version = line
    elif version is None:
        savedBefore += line
    else:
        savedAfter += line
inputFile.close()
inputFile = open('CMakeLists.txt', 'w')
x = version.split('"')
print("x: ", x)
a = x[0]
b = int(x[1]) + 1
c = x[2]
new_version = str(a) + '"' + str(b) + '"' + str(c)
print("new_version: ", new_version)
inputFile.write(savedBefore)
inputFile.write(str(new_version))
inputFile.write(savedAfter)
inputFile.close()
Note: you might need to add some extra text with the version line to make it have the same format as the original (such as adding "_PATCH").
There is a lot to say about your code.
Your mistake is that you write your "saved" lines first and your modified version line afterwards. Hence, the modified line ends up at the end of the file.
Moreover, I advise you to use with statements.
lines = []
with open('CMakeLists.txt', 'r') as _fd:
    for line in _fd:
        if '_PATCH ' in line:
            a, b, c = line.split('"')
            b = int(b) + 1
            # rebuild the line without adding spaces around the quotes
            line = '{}"{}"{}'.format(a, b, c)
        lines.append(line)
with open('CMakeLists.txt', 'w') as _fd:
    for line in lines:
        _fd.write(line)
This code is untested and may contain some errors. Also, if your input file is huge, putting every line in a list can be a bad idea.
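The replace-in-place idea can be tried without touching a real CMakeLists.txt by working on a string first. The sketch below assumes the version line has exactly one quoted number, as in the question. The function name bump_patch and the sample CMake lines are made up for illustration:

```python
def bump_patch(text, marker="_PATCH "):
    """Return `text` with the quoted number on the first marker line increased by 1."""
    lines = text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if marker in line:
            before, number, after = line.split('"')
            lines[index] = before + '"' + str(int(number) + 1) + '"' + after
            break  # only the first matching line is changed
    return "".join(lines)

sample = (
    "project(demo)\n"
    'set(DEMO_VERSION_PATCH "7")\n'   # hypothetical CMake version line
    "add_executable(demo main.cpp)\n"
)
print(bump_patch(sample), end="")
```

Because the line is replaced at its own index in the list, it stays in its original position. Reading the file, calling the function and writing the result back would be the remaining steps.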
#!/usr/bin/env python
inputFile = open('CMakeLists.txt', 'r')
version = None
saved = ""
for line in inputFile:
    if "_PATCH " in line:
        print("inside: ", line)
        version = line
        x = version.split('"')
        print("x: ", x)
        a = x[0]
        b = int(x[1]) + 1
        c = x[2]
        new_version = str(a) + '"' + str(b) + '"' + str(c)
        saved += new_version
    else:
        saved += line
inputFile.close()
inputFile = open('CMakeLists.txt', 'w')
inputFile.write(str(saved))
inputFile.close()
If a certain line is found, update its content and add it to saved. Once the for loop ends, just write saved to the file.
I want to make a simple function that writes two words to a file each on a new line.
But if I run this code it only writes "tist - tost" to the file.
Code:
def write_words(word1, word2):
    w = open("output.txt", "w")
    w.write(word1 + " - " + word2 + '\n')
    w.close()
write_words("test", "tast")
write_words("tist", "tost")
Output:
tist - tost
How can I write the two phrases to the file?
You need to open your file in append mode. Also, as a more Pythonic way of opening a file, you can use a with statement, which closes the file at the end of the block:
def write_words(word1, word2):
    with open("output.txt", "a") as f:
        f.write(word1 + " - " + word2 + '\n')
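To compare the two modes side by side, here is a small sketch that runs the same two calls once with "w" and once with "a". The extra path and mode parameters are added only for this demonstration, and the files go to a temporary folder:

```python
import os
import tempfile

def write_words(path, word1, word2, mode):
    """Write 'word1 - word2' as one line, opening the file with the given mode."""
    with open(path, mode) as f:
        f.write(word1 + " - " + word2 + "\n")

folder = tempfile.mkdtemp()
for mode in ("w", "a"):
    path = os.path.join(folder, "output_" + mode + ".txt")
    write_words(path, "test", "tast", mode)
    write_words(path, "tist", "tost", mode)
    with open(path) as f:
        print(mode, repr(f.read()))
```

Keep in mind that "a" also keeps whatever is left in the file from earlier runs of the script, so you may want to delete or truncate the file once at the start.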