Checking if a word exists in a text file in Python

I'm working with Python, and I'm trying to find out if a word is in a text file. I am using this code, but it always prints "word not found". I think there is a logical error in the condition. Can anyone please correct this code?
file = open("search.txt")
print(file.read())
search_word = input("enter a word you want to search in file: ")
if(search_word == file):
    print("word found")
else:
    print("word not found")

It's better to get into the habit of using with when you open a file, so that it is closed automatically when you're done with it. But the main thing is to use in to search for a string within another string.
with open('search.txt') as file:
    contents = file.read()
search_word = input("enter a word you want to search in file: ")
if search_word in contents:
    print('word found')
else:
    print('word not found')

Alternatively, you can search while reading the file itself:
search_word = input("enter a word you want to search in file: ")
if search_word in open('search.txt').read():
    print("word found")
else:
    print("word not found")
To alleviate possible memory problems with large files, use mmap.mmap(), as described in an answer to a related question.
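A minimal sketch of the mmap approach mentioned above. The helper name word_in_file and the UTF-8 default are assumptions made for illustration; they do not come from the linked answer. The file is mapped read-only and searched as bytes, so the whole text is never copied into a Python string.

```python
import mmap
import os


def word_in_file(path, word, encoding="utf-8"):
    """Return True if `word` occurs anywhere in the file at `path`.

    Hypothetical helper: the name and the encoding default are assumptions.
    """
    # mmap cannot map an empty file, so handle that case first.
    if os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as fh:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # find() works on bytes and returns -1 when nothing matches.
            return mm.find(word.encode(encoding)) != -1
```

Like the in operator, this is a substring search, so "cat" is also found inside "category".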

Previously, you were comparing the search word with the file variable, which holds the file object returned by open("search.txt"), not the text of the file. A string is never equal to a file object, so you always got "word not found".
The == also asks for an exact match, and print(file.read()) had already consumed the file's contents. Don't use ==; read the contents into a variable and use "in" instead. Try:
file = open("search.txt")
strings = file.read()
print(strings)
search_word = input("enter a word you want to search in file: ")
if(search_word in strings):
    print("word found")
else:
    print("word not found")

Related

why is it not accepting the hidden word?

I created a function that takes in a word and checks it against a file containing all the words in the dictionary. It accepts the word if it is found; otherwise it prints an error message and asks for the word again.
def getHiddenWord():
    file = open('dictionary.txt')
    found = False
    while found == False:
        hiddenWord = input('Enter the hidden word')
        for word in file.readlines():
            if word.strip().lower() == hiddenWord.lower():
                found = True
                return hiddenWord.lower()
                break
            else:
                continue
        print('I don\'t have this word in my dictionary please try another word')
If I enter a correct word at the first prompt, it works perfectly. On any attempt after that, it keeps looping as intended but never accepts the input, even for a word that would have been accepted had I entered it first.
file.readlines() can be used only once per open file: the first call reads to the end of the file, so calling it again on the same file object returns an empty list (unless you rewind with file.seek(0)).
Solution: before the loop, read the lines and save them in a variable:
def getHiddenWord():
    file = open('dictionary.txt')
    lines = file.readlines() # <-- here
    file.close() # <-- here
    found = False
    while found == False:
        hiddenWord = input('Enter the hidden word')
        for word in lines: # <-- and here
            if word.strip().lower() == hiddenWord.lower():
                found = True
                print(hiddenWord.lower() + ' found!') # <-- here
                break
        else:
            print('I don\'t have this word in my dictionary please try another word')
Further, as Óscar López mentioned in his (now deleted) answer: if you want the game to continue after a word is found, you shouldn't return; just print "success" and break.
A better way would be to convert the file into a set once and then just use in to check whether the input is there:
def get_hidden_word():
    with open('dictionary.txt') as fp:
        words = set(w.strip().lower() for w in fp)
    while True:
        guess = input('Enter the hidden word').strip().lower()
        if guess in words:
            return guess
        print("I don't have this word in my dictionary please try another word")
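The readlines() behaviour described in this thread can be checked with a short experiment. This is only a sketch: the temporary file and its two-word content are made up so that the example runs on its own.

```python
import os
import tempfile

# Create a small throwaway word list (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("apple\nbanana\n")
    path = tmp.name

with open(path) as fh:
    first = fh.readlines()   # reads everything; the position is now at end of file
    second = fh.readlines()  # nothing left to read, so this is an empty list
    fh.seek(0)               # rewind to the start of the file
    third = fh.readlines()   # reads everything again

os.remove(path)
print(first)   # ['apple\n', 'banana\n']
print(second)  # []
print(third)   # ['apple\n', 'banana\n']
```

Rewinding with seek(0) would also have fixed the original loop, but reading the words once into a list or set avoids re-reading the file on every guess.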

Counting a desired word in a text file

I have to count the number of times a given word appears in a given text file, in this case the Gettysburg Address. For some reason, it is not counting my input of 'nation', so the output looks like this:
'nation' is found 0 times in the file gettysburg.txt
Here is the code I have currently. Could someone point out what I am doing incorrectly?
fname = input("Enter a file name to process:")
find = input("Enter a word to search for:")
text = open(fname, 'r').read()
def processone():
    if text is not None:
        words = text.lower().split()
        return words
    else:
        return None
def count_word(tokens, token):
    count = 0
    for element in tokens:
        word = element.replace(",", " ")
        word = word.replace("."," ")
        if word == token:
            count += 1
        return count
words = processone()
word = find
frequency = count_word(words, word)
print("'"+find+"'", "is found", str(frequency), "times in the file", fname)
My first function splits the file's text into a list of words and turns all the letters lower case. The second one removes the punctuation and is supposed to count the word given in the input.
I'm taking my first coding class, so if you see more flaws in my code or improvements that could be made, as well as the solution to my problem, feel free to point them out.
In the for loop in the count_word() function, you have a return statement at the end of the loop body, which exits the function immediately, after only one loop iteration.
You probably want to move the return statement outside of the for loop.
As a start, I would suggest adding print statements to see what your variables contain; that helps break the problem down. For example, printing word shows only the first word from the file, which explains the problem in your code.
def count_word(tokens, token):
    count = 0
    for element in tokens:
        word = element.replace(",", " ")
        word = word.replace("."," ")
        print (word)
        if word == token:
            count += 1
        return count
Enter a file name to process:gettysburg.txt
Enter a word to search for:nation
fourscore
'nation' is found 0 times in the file gettysburg.txt
Use the code below:
fname = input("Enter a file name to process:")
find = input("Enter a word to search for:")
text = open(fname, 'r').read()
def processone():
    if text is not None:
        words = text.lower().split()
        return words
    else:
        return None
def count_word(tokens, token):
    count = 0
    for element in tokens:
        # Remove punctuation instead of replacing it with a space,
        # otherwise "nation," becomes "nation " and never matches.
        word = element.replace(",", "")
        word = word.replace(".", "")
        if word == token:
            count += 1
    return count  # outside the for loop, so every token is checked
words = processone()
word = find
frequency = count_word(words, word)
print("'"+find+"'", "is found", str(frequency), "times in the file", fname)
The return statement goes outside the for loop.
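As an alternative to stripping punctuation by hand, the counting can be sketched with re.findall and collections.Counter. This is not the approach from the answers above, and the sample sentence is made up for illustration; only the function name count_word is borrowed from the question.

```python
import re
from collections import Counter


def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of `word` in `text`."""
    # \w+ extracts runs of letters, digits and underscores, so punctuation
    # attached to a word (as in "nation,") is dropped.
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)[word.lower()]


# Hypothetical sample text, not the real file contents.
sample = "A new nation, conceived in Liberty. That nation, or any nation so conceived."
print(count_word(sample, "nation"))  # 3
```

One caveat of \w+ is that it splits on apostrophes, so "can't" is counted as "can" and "t".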

How do i find if a whole word is in a text file?

My code looks like this:
file = open('names.txt', 'r')
fileread = file.read()
loop = True
while loop is True:
    with open('names.txt', 'r') as f:
        user_input = input('Enter a name: ')
        for line in f:
            if user_input in line:
                print('That name exists!')
            else:
                print('Couldn\'t find the name.')
The code asks the user for a name. If the name exists in the text file, it says so; if it doesn't, it says it couldn't find it.
The only problem is that if you enter just part of a name, it tells you the whole name exists. For example, the names in my text file are Anya, Albert and Clemont, each on a separate line. If I enter 'a' when prompted for user_input, the code still says the name is present and asks for another name. I understand why it does this, because 'a' is technically in the line, but how do I make it say the name exists only if the user enters the whole thing? By "the whole thing" I mean, for example, 'Anya' rather than 'a'. Thanks.
Short solution using the re.search() function:
import re
with open('names.txt', 'r') as fh:
    contents = fh.read()
loop = True
while loop:
    user_input = input('Enter a name: ').strip()
    if (re.search(r'\b'+ re.escape(user_input) + r'\b', contents, re.MULTILINE)):
        print("That name exists!")
    else:
        print("Couldn't find the name.")
Test cases:
Enter a name: Any
Couldn't find the name.
Enter a name: Anya
That name exists!
Enter a name: ...
To answer the question: just do an equality comparison. Also note that you have an infinite loop; is that expected? I changed the code to stop checking further lines as soon as a matching name is found in the file.
file = open('names.txt', 'r')
fileread = file.read()
loop = True
while loop is True:
    with open('names.txt', 'r') as f:
        user_input = input('Enter a name: ')  # use raw_input on Python 2
        for line in f:
            if user_input == line.strip():
                print('That name exists!')
                break
                #loop =False
            else:
                print('Couldn\'t find the name.')
Input
Anya
Albert
Clemont
Output
Enter a name: an
Couldn't find the name.
Couldn't find the name.
Couldn't find the name.
Enter a name: An
Couldn't find the name.
Couldn't find the name.
Couldn't find the name.
Enter a name: Any
Couldn't find the name.
Couldn't find the name.
Couldn't find the name.
Enter a name: Anya
That name exists!
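The equality approach can also be sketched with a set, which avoids printing "Couldn't find the name." once per line. The helper names load_names and name_exists are made up for this example, and the comparison is case-sensitive, as in the answers above.

```python
def load_names(path):
    """Read one name per line into a set, dropping surrounding whitespace."""
    with open(path) as fh:
        # Skip blank lines so an empty input never counts as a name.
        return {line.strip() for line in fh if line.strip()}


def name_exists(names, candidate):
    """True only if the whole name matches an entry exactly."""
    return candidate.strip() in names
```

With names.txt containing Anya, Albert and Clemont, name_exists(load_names('names.txt'), 'Anya') is True, while 'a' and 'Any' are rejected.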

Searching for keyword in text file

I have a list of passwords in a text file called "SortedUniqueMasterList.txt". I'm writing a program that takes user input and checks to see if the inputted password is in the list.
Here is my code:
Passwords = []
with open("SortedUniqueMasterList.txt", "r", encoding = "latin-1") as infile:
    print("File opened.")
    for line in infile:
        Passwords.append(line)
    print("Lines loaded.")
while True:
    InputPassword = input("Enter a password.")
    print("Searching for password.")
    if InputPassword in Passwords:
        print("Found")
    else:
        print("Not found.")
However, every password that I enter returns "Not found.", even ones that I know for sure are in the list.
Where am I going wrong?
After reading the lines of your file, each entry in your Passwords list ends with a newline character, '\n', which you need to strip off before checking for a match. str.strip() lets the passwords be found, like this:
for line in infile:
    Passwords.append(line.strip())
Something closer to this runs. When you read a file, you can read each line in as a list element, but the trailing newline still has to be removed, which splitlines() does. Try this and let me know how it goes:
with open("SortedUniqueMasterList.txt", "r", encoding = "latin-1") as infile:
    print("File opened.")
    Passwords = infile.read().splitlines()  # readlines() would keep the '\n'
    print("Lines loaded.")
while True:
    InputPassword = input("Enter a password.")
    InputPassword = str(InputPassword)
    print("Searching for password.")
    if InputPassword in Passwords:
        print("Found")
    else:
        print("Not found.")
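The issue is that in on a Python list compares against whole elements; it does not look for a substring inside them. Each stored line still ends with '\n', so the typed password never equals an element. You can loop through the list, check whether the input is contained in any of the elements, and then report whether it was found. Note that this substring test also matches partial passwords, so stripping the newline and comparing whole entries is stricter.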
Passwords = []
with open("SortedUniqueMasterList.txt", "r", encoding = "latin-1") as infile:
    print("File opened.")
    for line in infile:
        Passwords.append(line)
    print("Lines loaded.")
while True:
    InputPassword = input("Enter a password.")
    print("Searching for password.")
    found = False
    for i in Passwords:
        if InputPassword in i:
            found = True
            break
    if found:
        print("Found.")
    else:
        print("Not found.")
I think this is a problem I've had myself. It is called a trailing newline. When you press Enter in a text document, a special character is inserted to mark the new line, and it is read back as part of each line. To get rid of it, call .strip() on each line you read. One option is the linecache module, which saves you from opening and closing the file yourself: linecache.getline(filename, lineno).strip().
Hope this helps!
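Putting the newline fix together with a set gives exact-match lookups that stay fast on a large list. This is a sketch under assumptions: the helper name load_passwords is made up, and rstrip("\n") is used instead of strip() on the assumption that leading or trailing spaces may be part of a password.

```python
def load_passwords(path, encoding="latin-1"):
    """Return the lines of `path` as a set, without their line endings."""
    with open(path, encoding=encoding) as infile:
        # rstrip("\n") removes only the newline, keeping any other whitespace.
        return {line.rstrip("\n") for line in infile}


# Why the original check failed: the stored entry still carried the newline.
print("hunter2" in ["hunter2\n"])                 # False
print("hunter2" in {"hunter2\n".rstrip("\n")})    # True
```

A lookup such as InputPassword in load_passwords("SortedUniqueMasterList.txt") then tests whole entries only, so partial passwords are not reported as found.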

How to ignore case of a word while searching for it in a text file and copying into another

I am trying to write a program in python which searches for user specified words in a txt file and copies the selected lines containing that word into another file.
Also the user will have an option to exclude any word.
(e.g Suppose the user searches for the word "exception" and want to exclude the word "abc", then the code will only copy the lines which has "exception" in it but not "abc").
Now all the work will be done from the command prompt.
The input would be:
file.py test.txt(input file) test_mod.txt(output file) -e abc(exclude word denoted by -e)-s exception(search word denoted by -s)
Now the user will have an option to enter multiple exclude words and multiple search words.
I have done the program using the argparse module and it runs.
My problem is that it only matches the exact case I type. That is, if I type "exception" as the search word, it does not find "Exception" or "EXCEPTION". How do I solve this problem? I want to ignore case for both search and exclude words.
Here's my code as of now:
import sys
import os
import argparse
import tempfile
import re
def main(): #main method
    try:
        parser = argparse.ArgumentParser(description='Copies selected lines from files') #Defining the parser
        parser.add_argument('input_file') #Adds the command line arguments to be given
        parser.add_argument('output_file')
        parser.add_argument('-e',action="append")
        parser.add_argument('-s',action="append")
        args = parser.parse_args() #Parses the Arguments
        user_input1 = (args.e) #takes the word which is to be excluded.
        user_input2 = (args.s) #takes the word which is to be included.
        def include_exclude(input_file, output_file, exclusion_list=[], inclusion_list=[]): #Function which actually does the file writing and also handles exceptions
            if input_file == output_file:
                sys.exit("ERROR! Two file names cannot be the same.")
            else:
                try:
                    found_s = False #These 3 boolean variables will be used later to handle different exceptions.
                    found_e = False
                    found_e1 = True
                    with open(output_file, 'w') as fo: #opens the output file
                        with open(input_file, 'r') as fi: #opens the input file
                            for line in fi: #reads all the line in the input file
                                if user_input2 != None:
                                    inclusion_words_in_line = map(lambda x: x in line, inclusion_list)#Mapping the inclusion and the exclusion list in a new list in the namespace
                                if user_input1 != None and user_input2 != None: #This list is defined as a single variable as condition operators cannot be applied to lists
                                    exclusion_words_in_line = map(lambda x: x in line, exclusion_list)
                                    if any(inclusion_words_in_line) and not any(exclusion_words_in_line): #Main argument which includes the search word and excludes the exclusion words
                                        fo.write(line) #writes in the output file
                                        found_s = True
                                elif user_input1 == None and user_input2 != None: #This portion executes if no exclude word is given,only the search word
                                    if any(inclusion_words_in_line):
                                        fo.write(line)
                                        found_e = True
                                        found_s = True
                                        found_e1 = False
                    if user_input2 == None and user_input1 != None: #No search word entered
                        print("No search word entered.")
                    if not found_s and found_e: #If the search word is not found
                        print("The search word couldn't be found.")
                        fo.close()
                        os.remove(output_file)
                    elif not found_e and not found_s: #If both are not found
                        print("\nNOTE: \nCopy error.")
                        fo.close()
                        os.remove(output_file)
                    elif not found_e1: #If only the search word is entered
                        print("\nNOTE: \nThe exclusion word was not entered! \nWriting only the lines containing search words")
                except IOError:
                    print("IO error or wrong file name.")
                    fo.close()
                    os.remove(output_file)
        if user_input1 != user_input2 : #this part prevents the output file creation if someone inputs 2 same words creating an anomaly.
            include_exclude(args.input_file, args.output_file, user_input1, user_input2)
        if user_input1 == user_input2 : #This part prevents the program from running further if both of the words are same
            sys.exit('\nERROR!!\nThe word to be excluded and the word to be included cannot be the same.')
    except SystemExit as e: #Exception handles sys.exit()
        sys.exit(e)
if __name__ == '__main__':
    main()
The typical way to do this is to pick one case and make all comparisons in it:
if word.lower() == "exception":
In your case, lowercase both the search word and the line, so that neither side's case matters:
inclusion_words_in_line = map(lambda x: x.lower() in line.lower(),
                              inclusion_list)
The same change applies to exclusion_list.
This looks like an attempt to build a search engine. You can achieve this with a library like PyLucene.
You would then be able to run queries like:
+include -exclude
and many more. It may be worth the learning curve.
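The case-insensitive include/exclude test from this thread can be sketched as a small standalone function. The name keep_line and the sample lines are made up for illustration, and str.casefold() is used here as a stricter variant of the lower() call suggested above.

```python
def keep_line(line, include, exclude=()):
    """True if `line` contains any include word and no exclude word, ignoring case."""
    lowered = line.casefold()
    has_include = any(word.casefold() in lowered for word in include)
    has_exclude = any(word.casefold() in lowered for word in exclude)
    return has_include and not has_exclude


# Hypothetical input lines standing in for the contents of test.txt.
lines = ["Exception raised\n", "EXCEPTION in abc\n", "all fine\n"]
kept = [line for line in lines if keep_line(line, ["exception"], ["ABC"])]
print(kept)  # ['Exception raised\n']
```

The original line is written to the output unchanged; only the comparison uses the case-folded copy.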
