Anagram not matching up (string to list), Python - python

I'm trying to write a script that takes an anagram of any word and looks it up in a dictionary file to see if there's a match
(e.g. estt returns: unjumble words: test).
If there are two or more matches it should write
(e.g. estt returns: there are multiple matches: test, sett (assuming sett is a word, lol)).
I couldn't even get one match going: it keeps returning "no match" even though I can see the words in the list I built from the dictionary.
Here's the code I've written so far:
def anagrams(s):
    if s =="":
        return [s]
    else:
        ans = []
        for w in anagrams(s[1:]):
            for pos in range(len(w)+1):
                ans.append(w[:pos]+s[0]+w[pos:])
            return ans
dic_list = []
def dictionary(filename):
    openfile = open(filename,"r")
    read_file = openfile.read()
    lowercase = read_file.lower()
    split_words = lowercase.split()
    for words in split_words:
        dic_list.append(words)
def main():
    dictionary("words.txt")
    anagramsinput = anagrams(input("unjumble words here: "))
    for anagram in anagramsinput:
        if anagram in dic_list:
            print(anagram)
        else:
            print("no match")
            break
It's as if anagram isn't in dic_list. What's happening?

You break out of the loop after the first failed check. Remove the break so that every anagram is tested:
def main():
    dictionary("words.txt")
    anagramsinput = anagrams(input("unjumble words here: "))
    for anagram in anagramsinput:
        if anagram in dic_list: # don't break, loop over every possibility
            print(anagram)
If you don't want to print "no match" for every miss, just remove that branch. Also, if you want all possible permutations of the letters, use itertools.permutations:
from itertools import permutations
def anagrams(s):
    return ("".join(p) for p in permutations(s))
Output:
unjumble words here: onaacir
aaronic
In your anagrams function you return before the outer loop has finished, so you miss many permutations:
def anagrams(s):
    if s =="":
        return [s]
    else:
        ans = []
        for w in anagrams(s[1:]):
            for pos in range(len(w)+1):
                ans.append(w[:pos]+s[0]+w[pos:])
        return ans # only return when both loops are done
After both changes your code will work.
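The single-match and multiple-match messages described in the question could be produced along these lines. This is only a sketch: find_matches and describe are hypothetical helper names, the word list is passed in directly rather than read from words.txt, and a set is assumed for the dictionary so that each lookup is fast.

```python
from itertools import permutations

def find_matches(jumble, word_list):
    """Return the dictionary words that are rearrangements of jumble."""
    known = {w.lower() for w in word_list}
    # a set removes the duplicate permutations caused by repeated letters
    candidates = {"".join(p) for p in permutations(jumble.lower())}
    return sorted(candidates & known)

def describe(matches):
    """Build the message the question asks for (wording taken from the question)."""
    if not matches:
        return "no match"
    if len(matches) == 1:
        return "unjumble words: " + matches[0]
    return "there are multiple matches: " + ", ".join(matches)

print(describe(find_matches("estt", ["test", "sett", "tree"])))
```

Collecting all matches first means "no match" is reported once, after every permutation has been checked, instead of once per failed lookup.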

Related

Python Question Relating to Finding Anagram from Dictionary

I am struggling with this project that I am working on.
Edit: I want the program to find 2 words from the dictionary that are the anagram of the input word(s). The way I wanted to approach this program is by using counter(input()) and then looping through the dictionary content twice (finding first word anagram then the next). The loop would take every word from the dictionary, counter(that word) and see if it is <= counter(input word). Once the program finds first anagram, it adds that word to candidate and proceeds to second loop to find the second word.
To put it simply: if I input a word (or a phrase), I would like the program to run through a dictionary text file (which I have saved) and find two words from the dictionary that together form an anagram of my input. For instance, if I input "dormitory" the program output should be "dirty room", and if I input "a gentleman", the output should be "elegant man". Here is what I have done so far:
from pathlib import Path
from collections import Counter
my_dictionary = open(Path.home() / 'dictionary.txt')
my_words = my_dictionary.read().strip().split('\n')
my_dictionary.close()
letter_number = 0
my_word = []
print('Please type in your phrase:')
word = input()
word = word.replace(" ","")
word_map = Counter(word.lower())
for a_word in my_words:
    test = ''
    candidate = ''
    test_word = Counter(a_word.lower())
    for letter in test_word:
        if test_word[letter] <= word_map[letter]:
            test += letter
    if Counter(test) == test_word:
        candidate += a_word.lower()
        for a_word in my_words:
            test = ''
            test_word = Counter(a_word.lower())
            for letter in test_word:
                if test_word[letter] <= word_map[letter]:
                    test += letter
            if Counter(test) == test_word:
                candidate += a_word.lower()
            if Counter(candidate) == word_map:
                my_word.append(candidate)
print(my_word)
For some reason I get no result at all after I enter my input.
I also tried using del to remove the first word's letters from the counter before looking for a second word in the dictionary, but that didn't work either.
In summary, there must be a mistake somewhere in the code that stops the program from producing any output.
Please help me figure out my mistake.
Thanks in advance.
The code can be simplified as follows:
# script.py
from pathlib import Path
from collections import Counter
filename = 'dictionary.txt'
my_words = Path.home().joinpath(filename).read_text().strip().splitlines()
word = input('Please type in your phrase:\n').replace(" ","")
word_counter = Counter(word.lower())
def parse(my_words=my_words):
    # keep every word whose letters all fit inside the input phrase
    matches = []
    for a_word in my_words:
        a_word_counter = Counter(a_word.lower())
        # w is a letter, c is how many times it occurs in a_word
        if all(c <= word_counter[w] for w, c in a_word_counter.items()):
            matches.append(a_word)
    return matches
def exactly_parse(my_words=my_words):
    # keep only the words that use exactly the letters of the input phrase
    return [w for w in my_words if Counter(w) == word_counter]
my_word = parse()
print(my_word)
Let's say the content of dictionary.txt is:
$ cat dictionary.txt
how
are
you
fine
thanks
If the input word is how, the expected output is how:
$ python script.py
Please type in your phrase:
how
['how']
$ python script.py
Please type in your phrase:
thanksyou
['you', 'thanks']
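The run above only lists the words that fit inside the phrase. Pairing them up into two-word anagrams, as the question asks, could be sketched like this. The function name two_word_anagrams and the in-memory word list are assumptions made for illustration; the idea is to subtract the first word's Counter from the phrase's Counter and look up what is left by its sorted letters.

```python
from collections import Counter

def two_word_anagrams(phrase, words):
    """Return (first, second) pairs of words that together use exactly the letters of phrase."""
    target = Counter(phrase.replace(" ", "").lower())
    # Counter subtraction drops non-positive counts, so an empty result
    # means every letter of the word is available in the phrase
    candidates = [w.lower() for w in words if w and not Counter(w.lower()) - target]
    # index the candidates by their sorted letters for a direct lookup
    by_letters = {}
    for w in candidates:
        by_letters.setdefault("".join(sorted(w)), []).append(w)
    pairs = []
    for first in candidates:
        remaining = target - Counter(first)
        key = "".join(sorted(remaining.elements()))
        for second in by_letters.get(key, []):
            if first <= second:  # report each pair once, in alphabetical order
                pairs.append((first, second))
    return pairs

sample_words = ["dirty", "room", "elegant", "man", "how"]
print(two_word_anagrams("dormitory", sample_words))
print(two_word_anagrams("a gentleman", sample_words))
```

With a real dictionary file the same function can be called with the list of lines read from it; the sample list here is only a stand-in.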

Is there a way to check if a word in a generated list is inside a text file?

I want to check if a word in a generated list is inside a text file.
I made an anagram generator, and I want to see if the list created by that generator has a real word by checking if it's inside an English dictionary.
#input
word = input("Word: ")
#function to generate anagrams of the word
def make_anagram(word):
    if len(word) <= 1:
        yield word
    else:
        for let in make_anagram(word[1:]):
            for i in range(len(word)):
                yield let[:i] + word[0:1] + let[i:]
#function to check anagrams
def check_if_anagram(word):
    #this is the file with the dictionary
    file = open("english-words-master\words.txt")
    words = file.read()
    #here's where I'm having trouble
    anagram = list(make_anagram(word))
    if str(anagram) in words and str(anagram) != word:
        print(str(anagram) + " is a real anagram.")
    else:
        print("there is no real anagram for " + word)
    file.close()
The second function always takes the else branch.
I'm still a beginner, so I don't understand how lists work very well. What's wrong in the check_if_anagram function?
You are checking whether the string form of the whole list is in the file. For example, if anagram = ['abc', 'acb'], you are literally checking:
if "['abc', 'acb']" in words
which will most likely never be true.
What you want to do is check each anagram from the list:
anagrams = list(make_anagram(word))
for anagram in anagrams:
    if anagram in words and anagram != word:
        print(anagram + " is a real anagram.")
        break
else:
    print("there is no real anagram for " + word)
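One caveat with the loop above: words is the whole file as a single string, so anagram in words is a substring test and also succeeds on fragments of longer words. A sketch of a whole-word check, assuming one word per line in the dictionary; load_word_set and real_anagrams are hypothetical helper names, and the inline text stands in for the contents of words.txt.

```python
def load_word_set(text):
    """Split dictionary text into a set of lowercase words, one per line."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}

def real_anagrams(word, candidates, word_set):
    """Return the candidates that are whole dictionary words other than word itself."""
    return sorted({c for c in candidates if c in word_set and c != word})

dictionary_text = "listen\nsilent\nenlist\n"  # stand-in for the file contents
word_set = load_word_set(dictionary_text)

print("ten" in dictionary_text)  # True: "ten" is a substring of "listen"
print("ten" in word_set)         # False: "ten" is not a whole word in the set
print(real_anagrams("listen", ["listen", "silent", "tinsel", "enlist"], word_set))
```

Set membership is also much faster than scanning the full text for every generated anagram.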

Python strings with anagrams

At the moment this code takes a string from the user and compares it to a text file in which many words are stored. It then outputs all the words that are an exact match for the string (e.g. "otp" = opt, top, pot). Currently, when I input the string, it only matches words made of EXACTLY the same letters in a rearranged order.
My question is: how do I go about typing in excess letters while still getting all the words that are contained? For example, type in "orkignwer" and the program will output "working" even though there are extra letters.
words = []
def isAnAnagram(word, user):
    wordList= list(word)
    wordList.sort()
    inputList= list(user)
    inputList.sort()
    return (wordList == inputList)
def getAnagrams(user):
    lister = [word for word in words if len(word) == len(user) ]
    for item in lister:
        if isAnAnagram(item, user):
            yield item
with open('Dictionary.txt', 'r') as f:
    allwords = f.readlines()
f.close()
for x in allwords:
    x = x.rstrip()
    words.append(x)
inp = 1
while inp != "99":
    inp = input("enter word:")
    result = getAnagrams(inp)
    print(list(result))
You have to edit both the isAnAnagram and the getAnagrams functions. First, getAnagrams should also include words that are shorter than the input in the lister list:
def getAnagrams(user):
    lister = [word for word in words if len(word) <= len(user) ]
    for item in lister:
        if isAnAnagram(item, user):
            yield item
Then you would need to edit the isAnAnagram function. As Alexander Huszagh pointed out, you can use the Counter from the collections package:
from collections import Counter
def isAnAnagram(word, user):
    word_counter = Counter(word)
    input_counter = Counter(user)
    return all(count <= input_counter[key] for key, count in word_counter.items())
The all(count <= input_counter[key] for key, count in word_counter.items()) expression checks that every letter of word appears in user at least as many times as it does in word.
P.S. If you want a more optimized solution, you might want to check out tries (e.g. MARISA-trie, python-trie or PyTrie).
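As a quick illustration of the Counter check with the input from the question, here is a small self-contained sketch. The name fits_inside and the sample word list are made up for the example; a real run would use the words read from Dictionary.txt.

```python
from collections import Counter

def fits_inside(word, letters):
    """True if word can be spelled using only the characters in letters."""
    needed = Counter(word)
    available = Counter(letters)
    # a missing letter counts as 0 in a Counter, so no KeyError is raised
    return all(count <= available[letter] for letter, count in needed.items())

sample_words = ["working", "wrong", "king", "queen", "error"]
found = [w for w in sample_words if fits_inside(w, "orkignwer")]
print(found)  # ['working', 'wrong', 'king']
```

"error" is rejected because it needs three r's while the input only supplies two, which is exactly the case a plain set comparison of letters would get wrong.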

Find anagrams of a given word in a file

Alright so for class we have this problem where we need to be able to input a word and from a given text file (wordlist.txt) a list will be made using any anagrams of that word found in the file.
My code so far looks like this:
def find_anagrams1(string):
    """Takes a string and returns a list of anagrams for that string from the wordlist.txt file.
    string -> list"""
    anagrams = []
    file = open("wordlist.txt")
    next = file.readline()
    while next != "":
        isit = is_anagram(string, next)
        if isit is True:
            anagrams.append(next)
        next = file.readline()
    file.close()
    return anagrams
Every time I try to run the program it just returns an empty list, despite the fact that I know there are anagrams present. Any ideas on what's wrong?
P.S. The is_anagram function looks like this:
def is_anagram(string1, string2):
    """Takes two strings and returns True if the strings are anagrams of each other.
    list,list -> string"""
    a = sorted(string1)
    b = sorted(string2)
    if a == b:
        return True
    else:
        return False
I am using Python 3.4
The problem is that you are using the readline function. From the documentation:
file.readline = readline(...)
readline([size]) -> next line from the file, as a string.
Retain newline. A non-negative size argument limits the maximum
number of bytes to return (an incomplete line may be returned then).
Return an empty string at EOF.
The key information here is "Retain newline". That means that if you have a file containing a list of words, one per line, each word is going to be returned with a terminal newline. So when you call:
next = file.readline()
You're not getting example, you're getting example\n, so this will never match your input string.
A simple solution is to call the strip() method on the lines read from the file (note that a blank line in the middle of the file would then end the loop early):
next = file.readline().strip()
while next != "":
    isit = is_anagram(string, next)
    if isit is True:
        anagrams.append(next)
    next = file.readline().strip()
file.close()
However, there are several problems with this code. To start with, file is a poor name for a variable, because in Python 2 it shadows the built-in file type. The same goes for next, which shadows the built-in next() function in Python 3 as well.
Rather than repeatedly calling readline(), you're better off taking advantage of the fact that an open file is an iterator which yields the lines of the file:
words = open('wordlist.txt')
for word in words:
    word = word.strip()
    isit = is_anagram(string, word)
    if isit:
        anagrams.append(word)
words.close()
Note also that since is_anagram returns True or False, you don't need to compare the result to True or False (e.g., if isit is True). You can simply use the return value on its own.
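Putting those points together, the whole function might look like the sketch below. It assumes one word per line, takes the file path as a parameter (a hypothetical addition so the function is easy to try on other files), and uses a with block so the file is closed even if an error occurs.

```python
def is_anagram(string1, string2):
    """Return True if the two strings contain exactly the same characters."""
    return sorted(string1) == sorted(string2)

def find_anagrams(string, path="wordlist.txt"):
    """Return the words in the file at path that are anagrams of string."""
    anagrams = []
    with open(path) as word_file:
        for line in word_file:
            word = line.strip()  # drop the trailing newline kept by file iteration
            if word and is_anagram(string, word):
                anagrams.append(word)
    return anagrams

# the retained newline is what made the original comparison fail
print(is_anagram("example", "example\n"))  # False
```

Blank lines are skipped rather than treated as the end of the file, which avoids the early exit of the readline().strip() loop.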
Yikes, don't use for loops:
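import collections
def find_anagrams(x):
    # x is a list of words; build each word's sorted-letter signature
    anagrams = [''.join(sorted(list(i))) for i in x]
    # keep the signatures that are shared by more than one word
    anagrams_counts = [item for item, count in collections.Counter(anagrams).items() if count > 1]
    # return the words whose signature is shared with another word
    return [i for i in x if ''.join(sorted(list(i))) in anagrams_counts]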
Here's another solution, that I think is quite elegant. This runs in O(n * m) where n is the number of words and m is number of letters (or average number of letters/word).
# anagrams.py
from collections import Counter
import urllib.request
def word_hash(word):
    return frozenset(Counter(word).items())
def download_word_file():
    url = 'https://raw.githubusercontent.com/first20hours/google-10000-english/master/google-10000-english-no-swears.txt'
    urllib.request.urlretrieve(url, 'words.txt')
def read_word_file():
    with open('words.txt') as f:
        words = f.read().splitlines()
    return words
if __name__ == "__main__":
    # downloads a file to your working directory
    download_word_file()
    # reads file into memory
    words = read_word_file()
    d = {}
    for word in words:
        k = word_hash(word)
        if k in d:
            d[k].append(word)
        else:
            d[k] = [word]
    # Prints the filtered results to only words with anagrams
    print([x for x in d.values() if len(x) > 1])

Counting abecedarian words in a list: Python

Working on a very common problem to identify whether word is abecedarian (all letters in alphabetical order). I can do one word in several ways as discovered in "Think Python"; but, would like to be able to iterate through a list of words determining which are abecedarian and counting those that are.
def start():
    lines= []
    words= []
    for line in open('word_test1.txt'):
        lines.append(line.strip())
    numlines=len(lines)
    count = 0
    for word in lines[:]:
        i = 0
        while i < len(word)-1:
            if word[i+1] < word[i]:
                return
            i = i+1
        print (word)
        count= count + 1
    print (count)
start()
I think my problem lies with the "return" in the "while i" loop. In the list I'm using there are at least three abecedarian words. The above code reads the first two (which are the first entries), prints them, counts them but on the following non-abecedarian word breaks out of the loop and ends the program.
I'm new at programming and this has taken me several hours over a couple of days.
No need for low level programming on this one :-)
def is_abcedarian(s):
    'Determine whether the characters are in alphabetical order'
    return list(s) == sorted(s)
Then use filter to run over a list of words:
>>> list(filter(is_abcedarian, ['apple', 'bee', 'amp', 'sun']))
['bee', 'amp']
The return statement is breaking out of the entire start() function. There are many possible ways to solve this, but the clearest might be to break your code into two functions like this:
def is_abcedarian(word):
    i = 0
    while i < len(word)-1:
        if word[i+1] < word[i]:
            return False
        i = i+1
    return True
def start():
    lines= []
    words= []
    for line in open('word_test1.txt'):
        lines.append(line.strip())
    numlines=len(lines)
    count = 0
    for word in lines[:]:
        if is_abcedarian(word):
            print (word)
            count= count + 1
    print (count)
In this example, the return statements in is_abcedarian() return only from that function, and the return value is then tested by the if statement inside the for loop.
Once you have split apart your program in this way, you have the added benefit of being able to use your is_abcedarian() function from other places (in future related code you might write).
I believe you intended to break out of the while loop when you find that the letters are not in order, but instead you issued a return statement, which returns from the function start.
There are a couple of ways to do this.
You can raise a StopIteration exception and catch it outside the while loop:
for word in lines[:]:
    try:
        i = 0
        while i < len(word)-1:
            if word[i+1] < word[i]:
                raise StopIteration
            i = i+1
        print (word)
    except StopIteration:
        pass
You can also try setting a flag found and then use it later to test for printing the word
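The flag idea is not spelled out in the answer, so here is one way it might look. The function name count_abecedarian and the flag name in_order are illustrative choices, and the words are passed in as a list instead of being read from word_test1.txt.

```python
def count_abecedarian(words):
    """Print each abecedarian word and return how many there are."""
    count = 0
    for word in words:
        in_order = True  # flag: assume the word is fine until proven otherwise
        for i in range(len(word) - 1):
            if word[i + 1] < word[i]:
                in_order = False
                break  # leaves only the inner loop, not the function
        if in_order:
            print(word)
            count += 1
    return count

print(count_abecedarian(["abc", "bee", "sun", "amp"]))
```

Unlike return, break only ends the inner loop, so the outer loop carries on with the next word.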
A slightly reorganized approach:
def is_abcedarian(word):
    return sorted(word)==list(word)
def main():
    # read input file
    with open('word_test1.txt') as inf:
        words = [line.strip() for line in inf]
    # figure out which words are "good"
    good_words = [word for word in words if is_abcedarian(word)]
    # print the "good" words
    print("\n".join(good_words))
    print(len(good_words))
if __name__=="__main__":
    main()
I like itertools:
from itertools import tee
def pairwise(iterable):
    # yield consecutive pairs: (s0, s1), (s1, s2), ...
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)
def is_abcdarien(word):
    return all(c < d for c, d in pairwise(word))
words = 'asdf', 'qwer', 'fghi', 'klmn', 'aabcd', 'abcd'
print(list(filter(is_abcdarien, words)))
print(len(list(filter(is_abcdarien, words))))
Result:
['fghi', 'klmn', 'abcd']
3
Change c < d to c <= d if you want non-strict ordering, so that "aabcd" is also abecedarian.
I have this solution for you. I found it in the same place as you did, and I hope it still helps.
def is_abecedarian(word):
    word = word.lower()  # str.lower() returns a new string, so keep the result
    letter_value=0
    for letter in word:
        if ord(letter) < letter_value:
            return False
        else:
            letter_value = ord(letter)
    return True
fin = open('words.txt')
words_no = 0
for line in fin:
    word = line.strip()
    if is_abecedarian(word):
        words_no = words_no + 1
print(words_no)
