Selecting randomly from a list with a specific character - python

So, I have a text file (full of words) that I put into a list. I want Python 2.7 to select a word from the list randomly, but the word has to start with a specific character.
list code:
d=[]
with open("dic.txt", "r") as x:
    d=[line.strip() for line in x]
It's for a game called Shiritori. The user starts by saying any word in the English language, e.g. dog. The program then has to pick another word starting with the last character, in this case 'g'.
code for the game:
game_user='-1'
game_user=raw_input("em, lets go with ")
a1=len(game_user)
I need a program that will randomly select a word beginning with that character.

Because your game relies specifically upon a random word with a fixed starting letter, I suggest first sorting all your words into a dictionary with the starting letter as the key. Then, you can randomly lookup any word starting with a given letter:
d=[]
lookup = {}
with open("dic.txt", "r") as x:
    d=[line.strip() for line in x]
for word in d:
    if word[0] in lookup:
        lookup[word[0]].append(word)
    else:
        lookup[word[0]] = [ word ]
Now you have a dict 'lookup' that has all your words grouped by starting letter.
When you need a word that starts with the last letter of the previous word, you can randomly pick an element in your list:
import random
random_word = random.choice(lookup[ game_user[-1] ])
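A minimal sketch of the same lookup idea, assuming the dictionary file may contain blank lines and that some letters may have no words at all. The helper names build_lookup and next_word are made up for illustration, and the short word list stands in for dic.txt:

```python
import random
from collections import defaultdict

def build_lookup(words):
    """Group words by their first character (hypothetical helper)."""
    lookup = defaultdict(list)
    for word in words:
        if word:  # skip empty strings left over from blank lines
            lookup[word[0]].append(word)
    return lookup

def next_word(lookup, previous_word):
    """Pick a random word starting with the last character of previous_word."""
    # .get() does not create a new key, so unknown letters stay absent
    candidates = lookup.get(previous_word[-1], [])
    return random.choice(candidates) if candidates else None

# Stand-in for the contents of dic.txt
lookup = build_lookup(["dog", "goat", "tiger", "grape", ""])
reply = next_word(lookup, "dog")  # either "goat" or "grape"
print(reply)
```

Returning None when no word fits lets the game decide what to do (for example, declare the user the winner) instead of crashing with a KeyError.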

In order to get a new list of all the values that start with the last letter of the user input:
choices = [x for x in d if x[0] == game_user[-1]]
Then, you can select a word by:
newWord = random.choice(choices)

>>> import random
>>> with open('/usr/share/dict/words') as f:
...     words = f.read().splitlines()
...
>>> c = 'a'
>>> random.choice([w for w in words if w.startswith(c)])
'anthologizing'
Obviously, you need to replace c = 'a' with the last character of the user's input, e.g. c = raw_input("em, lets go with ")[-1]

You might get better use out of more advanced data structures, but here's a shot:
words_dict = {}
for row in d:
    # Gives us the first letter
    myletter = row[0]
    if myletter not in words_dict:
        words_dict[myletter] = []
    words_dict[myletter].append(row)
After creating a dictionary of all letters and their corresponding words, you can then access any particular set of words like so:
words_dict['a']
Which will give you all the words that start with a in a list. Then you can take:
# This could be any letter here..
someletter = 'a'
newword = words_dict[someletter][random.randint(0,len(words_dict[someletter])-1)]
Let me know if that makes sense?

Related

Anagram algorithm in python

This program works like an anagram finder. The segment below is a small algorithm that goes through the words stored in a list named word_list and compares each one to a choice word entered by the user.
The first loop iterates over every item in the list and assigns it to word. It then sets shared_letters (a counter used to decide whether the letters of word can be found within choice) to zero before going through the letters shared by the two words, so that the second loop starts from a clean count.
The second loop iterates x over the length of word, which is stored in word_length. Inside it, an if statement checks whether the letter at index x of sliced_word (which is just list(word)) is found within sliced_choice (list(choice)). If it is, the counter shared_letters goes up by 1; otherwise it breaks out of the second loop and goes back to the first to get a new word.
This looping pattern has worked fine for me before, but for some reason in this segment of code the second loop no longer runs at all. I've tried putting in print statements to check the route the program was taking, and it always skipped the nested for loop. Even when I turned it into a function, the program just refused to go through that function.
choice = input("Enter a word: ") # User enters a word
# Algorithm below compares the entered word with all the words found in the dictionary, then saves any words found into "sushi" list
for i in range(num_words): # Word loop, gives iterated word
    word = word_list[i] # Picks a loop
    shared_letters = 0 # Resets # of shared letters
    for x in range(word_length): # Goes through the letters of iterated word
        if sliced_word[x] in sliced_choice:
            shared_letters = x + 1
        elif sliced_word[x] not in sliced_choice:
            break
Here is the complete program in case you want a better idea of it. Sorry if the code looks jumbled; I've been trying a lot with this program and never seem to reach a good solution.
word_list = ["race","care","car","rake","caring","scar"]
sushi = []
word = ""
sliced_word = list(word)
word_length = len(sliced_word)
choice_word = ""
sliced_choice = list(choice_word)
choice_length = len(sliced_choice)
shared_letters = 0
num_words = len(word_list)
next_word = False
choice = input("Enter a word: ") # User enters a word
# Algorithm below compares the entered word with all the words found in the dictionary, then saves any words found into "sushi" list
for i in range(num_words): # Word loop, gives iterated word
    word = word_list[i] # Picks a loop
    shared_letters = 0 # Resets # of shared letters
    for x in range(word_length): # Goes through the letters of iterated word
        if sliced_word[x] in sliced_choice:
            # Checks if the letters of the iterated word can be found in the choice word
            shared_letters = x + 1
        elif sliced_word[x] not in sliced_choice:
            break # If any of the letters within the iterated word are not within the choice word, it moves onto the next word
    if shared_letters == word_length:
        sushi.append(word_list[i])
        # If all of the letters within the iterated word are found in the choice word, it appends the iterated word into the "sushi" list. Then moves onto the next word in the word_list.
You have a number of issues, but I think the biggest is that this search does not account for anagrams that contain the same letter more than once. The easiest way to determine whether two words are anagrams is to check that they have the same count for each letter.
There is a builtin helper class called Counter from the collections module that can help with this.
>>> from collections import Counter
>>> Counter("care")
Counter({'c': 1, 'a': 1, 'r': 1, 'e': 1})
>>> Counter("race")
Counter({'r': 1, 'a': 1, 'c': 1, 'e': 1})
>>> Counter("care") == Counter("race")
True
Working this into your example, you could refactor like this:
word_list = ["race","care","car","rake","caring","scar"]
sushi = []
for word in word_list:
    if Counter(choice) == Counter(word):
        sushi.append(word)
Now this is kind of slow if we have to make the Counter objects over and over again, so you could choose to store them in a dictionary:
word_list = ["race","care","car","rake","caring","scar"]
word_counters = {word: Counter(word) for word in word_list}
sushi = []
for word, counter in word_counters.items():
    if Counter(choice) == counter:
        sushi.append(word)
If you want to find an imperfect match, say one word is contained in the other, you can use the - operator and test if the lefthand side has any letters left over afterwards:
>>> not (Counter("race") - Counter("racecar"))
True
>>> not (Counter("race") - Counter("bob"))
False
Working that back into the example:
word_list = ["race","care","car","rake","caring","scar"]
word_counters = {word: Counter(word) for word in word_list}
sushi = []
for word, counter in word_counters.items():
    if not (Counter(choice) - counter):
        sushi.append(word)
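As a side note on the symptom in the question: in the posted program, word_length is computed once, before the loop, while word is still the empty string, so range(word_length) is range(0) and the inner loop body never executes. The sketch below illustrates that, and then does the containment test with Counter in the direction the question describes (every letter of the list word must be available in the choice word). The function name buildable_from and the sample input "racer" are assumptions for illustration:

```python
from collections import Counter

word_list = ["race", "care", "car", "rake", "caring", "scar"]

# What the posted program does before its loop starts:
word = ""
word_length = len(list(word))                # 0, because word is still empty
inner_iterations = list(range(word_length))  # empty, so the nested loop is skipped

def buildable_from(choice, words):
    """Return the words whose letters (with multiplicity) all occur in choice."""
    available = Counter(choice)
    # Counter subtraction drops non-positive counts, so an empty result
    # means no letter of w is missing from choice
    return [w for w in words if not (Counter(w) - available)]

found = buildable_from("racer", word_list)
print(found)  # ['race', 'care', 'car']
```

Swapping the operands of the subtraction gives the opposite test, i.e. whether the choice word is contained in each list word.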

compare specific string to a word python

Say I have a certain string and a list of strings.
I would like to append to a new list all the words from the list (of strings)
that exactly fit the pattern.
For example:
list of strings = ['string1','string2'...]
pattern =__letter__letter_ ('_c__ye_' for instance)
I need to add all strings that have the same letters in the same places as the pattern, and have the same length.
So for instance:
new_list = ['aczxyep','zcisyef'...]
I have tried this:
def pattern_word_equality(words,pattern):
    list1 = []
    for word in words:
        for letter in word:
            if letter in pattern:
                list1.append(word)
    return list1
help will be much appreciated :)
If your pattern is as simple as _c__ye_, then you can look for the characters in the specific positions:
words = ['aczxyep', 'cxxye', 'zcisyef', 'abcdefg']
result1 = list(filter(lambda w: w[1] == 'c' and w[4:6] == 'ye', words))
If your pattern is getting more complex, then you can start using regular expressions:
import re
pat = re.compile("^.c..ye.$")
result2 = list(filter(lambda w: pat.match(w), words))
Output:
print(result1) # ['aczxyep', 'zcisyef']
print(result2) # ['aczxyep', 'zcisyef']
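If the pattern arrives as a string such as '_c__ye_' rather than being hard-coded, one possible approach is to translate it into a regular expression. This is only a sketch: it assumes '_' is the only wildcard and stands for exactly one character, and the names pattern_to_regex and matching_words are hypothetical:

```python
import re

def pattern_to_regex(pattern, wildcard="_"):
    """Turn a pattern like '_c__ye_' into a compiled regex ('.c..ye.')."""
    # Escape literal characters so that e.g. '.' or '*' in a pattern stay literal
    parts = ["." if ch == wildcard else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))

def matching_words(words, pattern):
    """Keep the words that match the pattern over their whole length."""
    regex = pattern_to_regex(pattern)
    # fullmatch enforces the same length without needing ^ and $
    return [w for w in words if regex.fullmatch(w)]

words = ['aczxyep', 'cxxye', 'zcisyef', 'abcdefg']
matches = matching_words(words, "_c__ye_")
print(matches)  # ['aczxyep', 'zcisyef']
```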
This works:
words = ['aczxyep', 'cxxye', 'zcisyef', 'abcdefg']
pattern = []
for i in range(len(words)):
    if (words[i])[1].lower() == 'c' and (words[i])[4:6].lower() == 'ye':
        pattern.append(words[i])
print(pattern)
You start by defining the words and pattern lists. Then you loop once per item in words by using len(words). For each index i, you check whether the word follows the pattern by seeing if its second letter is c and its 5th and 6th letters are y and e. If so, the word is appended to pattern, and all the matches are printed at the end.

Creating subset list, in a list based on length of words

I have this code here, which takes user input and adds it to a list until the user enters the stop word. At that point, it sorts the list items based on length.
What I am trying to do is put each word the user enters into a list of words with the same length. For example, 2-letter words are put into one list and 3-letter words into another.
When complete, I'm trying to return a list containing all of the individual word lists that were created.
So far all I have achieved is sorting them and then adding them to another list, which outputs the list as many times as words were entered.
def wordsList():
    stop = "stop"
    sentence = []
    sentence2 = []
    while True:
        word = input("Enter a word: ")
        if word == stop:
            # exit the loop
            break
        sentence.append(word)
    sentence.sort(key=len)
    cat= len(sentence)
    for sublist in sentence:
        sentence.append(sentence)
        # sublist.insert(sentence)
    # sentence2=[sentence]
    # print(" ".join(sentence)) # Goes through the list and finds a smaller word and convert the list of words into a single string, each word separated by a space.
    print(sentence)
wordsList()
Welcome to StackOverflow,
I'd recommend using a dictionary where the keys are the length of the words and the values are the list of words that have that length. Here is a potential implementation:
import collections
STOP = "stop"
def wordsList():
    words = collections.defaultdict(list)
    while True:
        word = input("Enter a word: ")
        if word == STOP:
            break
        length = len(word)
        words[length].append(word)
    return list(words.values())
This is what itertools.groupby is made for. First you'll need a function that gets the length of one word, which is built-in as len. Then you can sort using that keyfunction, and group using it too.
import itertools
some_lst = [...] # user list of various length words
some_lst.sort(key=len)
groups = itertools.groupby(some_lst, key=len)
groups is now an iterator of tuples. Each tuple has the format:
(length, iterator_over_words_with_that_length)
You're only really interested in the second element of each tuple, so pull that out.
result = [list(sublst) for _, sublst in groups]
You could also build this yourself with collections.defaultdict
import collections
result_dict = collections.defaultdict(list)
for word in some_lst:
    result_dict[len(word)].append(word)
result_dict is now a dictionary of lists. To get a sorted list of lists you can run it through a list comprehension with sorted and dict.items
result = [v for _,v in sorted(result_dict.items())]
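Putting both approaches side by side as functions makes them easy to compare. This is a sketch under the assumption that the words are already collected in a list; the function names and the sample words are invented for the example:

```python
import collections
import itertools

def group_by_length_groupby(words):
    """Group with itertools.groupby; the input must be sorted by the same key."""
    ordered = sorted(words, key=len)  # groupby only merges adjacent items
    return [list(group) for _, group in itertools.groupby(ordered, key=len)]

def group_by_length_dict(words):
    """Group with a defaultdict, then emit the buckets in order of length."""
    buckets = collections.defaultdict(list)
    for word in words:
        buckets[len(word)].append(word)
    return [buckets[length] for length in sorted(buckets)]

sample = ["to", "cat", "be", "dog", "tree"]
by_groupby = group_by_length_groupby(sample)
by_dict = group_by_length_dict(sample)
print(by_groupby)  # [['to', 'be'], ['cat', 'dog'], ['tree']]
```

The groupby version needs the sort, while the defaultdict version makes a single pass and only sorts the handful of distinct lengths at the end.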

How to check if letters of one string are in another

I have a list L of words whose letters are in alphabetical order, e.g. hello = ehllo.
How do I check these words for "blanagrams", which are words that share all their letters except one? For example, orchestra turns into orchestre. I've only been able to think it through up to this part. My idea is to test every letter of the current word and see whether it corresponds to the other word; if it does, it is a blanagram and goes into a dictionary. But I'm struggling to put it into code.
L = [list of alphabetically ordered strings]
for word in L:
    for letter in word:
        #confused at this part
from collections import Counter
def same_except1(s1, s2):
    ct1, ct2 = Counter(s1), Counter(s2)
    return sum((ct1 - ct2).values()) == 1 and sum((ct2 - ct1).values()) == 1
Examples:
>>> same_except1('hello', 'hella')
True
>>> same_except1('hello', 'heela')
False
>>> same_except1('hello', 'hello')
False
>>> same_except1('hello', 'helloa')
False
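To apply this to the whole list L, one option is to test every pair. That is quadratic in the number of words, so treat it as a sketch for small lists only. The helper blanagram_pairs and the sample words are assumptions for illustration:

```python
from collections import Counter
from itertools import combinations

def same_except1(s1, s2):
    """True when the two words differ by exactly one letter, ignoring order."""
    ct1, ct2 = Counter(s1), Counter(s2)
    return sum((ct1 - ct2).values()) == 1 and sum((ct2 - ct1).values()) == 1

def blanagram_pairs(words):
    """Return every pair of words that are blanagrams of each other."""
    return [(a, b) for a, b in combinations(words, 2) if same_except1(a, b)]

# "carthorse" is a true anagram of "orchestra", so that pair is not reported
sample = ["orchestra", "orchestre", "carthorse", "hello"]
pairs = blanagram_pairs(sample)
print(pairs)  # [('orchestra', 'orchestre'), ('orchestre', 'carthorse')]
```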
Steven Rumbalski's answer got me thinking and there's also another way you can do this with a Counter (+1 for use of collections and thank you for sparking my interest)
from collections import Counter
def diff_one(w,z):
    c=Counter(sorted(w+z)).values()
    # build a list so that len() also works on Python 3, where filter() is lazy
    c=[x for x in c if x%2!=0]
    return len(c)==2
Basically all matched letters will have a counter value that will be even. So you filter those out and get left with the unmatched ones. If you have more than 2 unmatched then you have a problem.
Assuming this is similar to the word ladder game commonly used in AI, and that you are trying to create a graph with possible words as adjacent nodes rather than a dictionary, you can bucket the words like this:
d = {}
# create buckets of words that differ by one letter
for word in L:
    for i in range(len(word)):
        bucket = word[:i] + '_' + word[i+1:]
        if bucket in d:
            d[bucket].append(word)
        else:
            d[bucket] = [word]
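A possible way to turn those buckets into a graph of neighbouring words, assuming no word contains the '_' placeholder itself. The function name one_letter_neighbours and the sample words are made up for the example:

```python
from collections import defaultdict

def one_letter_neighbours(words):
    """Map each word to the set of words that differ from it in one position."""
    buckets = defaultdict(list)
    for word in words:
        for i in range(len(word)):
            # Words sharing a bucket differ only at position i
            buckets[word[:i] + "_" + word[i + 1:]].append(word)

    neighbours = defaultdict(set)
    for group in buckets.values():
        for a in group:
            for b in group:
                if a != b:
                    neighbours[a].add(b)
    return neighbours

graph = one_letter_neighbours(["cold", "cord", "card", "ward"])
print(sorted(graph["cord"]))  # ['card', 'cold']
```

Note that this compares letters position by position, so unlike the Counter-based checks it does not ignore letter order.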

How do I use Enumerate to make each word into a number, not each character

So I need to make a program that gets the user to enter a sentence, and then the code turns that sentence into numbers corresponding to each word's position in the list. I came across the command Enumerate here: Python using enumerate inside list comprehension, but this gets every character, not every word. This is my code so far; can anyone help me fix it?
list = []
lists = ""
sentence= input("Enter a sentence").lower()
print(sentence)
list.append(lists)
print(lists)
for i,j in enumerate(sentence):
    print (i,j)
Your sentence is a string, so enumerate walks over it one character at a time. You should split it into words first:
for i,j in enumerate(sentence.split(' ')):
    print(i, j)
You can also try this:
>>> sentence = 'I like Moive'
>>> sentence = sentence.lower()
>>> sentence = sentence.split()
>>> for i, j in enumerate(sentence):
...     print(i, j)
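If the goal is to replace each word by the position of its first occurrence in the sentence (that reading of the question is an assumption, as are the sample sentence and the variable names), a sketch could look like this:

```python
sentence = "the cat sat on the mat"
words = sentence.lower().split()

# Remember the 1-based position at which each word first appears
positions = {}
for index, word in enumerate(words, start=1):
    positions.setdefault(word, index)

# Replace every word by the position of its first occurrence
numbers = [positions[word] for word in words]
print(numbers)  # [1, 2, 3, 4, 1, 6]
```

Calling split() with no argument also collapses repeated spaces, which split(' ') does not.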
