How can I improve this guessing algorithm? - python

I am trying to write a program that guesses the word the user is thinking of, but right now the program is based only on elimination. Does anyone have an idea of how to make it better?
Here is a brief explanation of how it works now:
I have a list of words stored in "palavras.txt"; these words are loaded into a regular Python list.
The first question is: "How many letters does your word have?". Based on the answer, the program eliminates all the words that do not have that number of letters. After that it builds, for a given position, a list of the letters ordered by how often they appear in that position.
Then comes the second question: "Is the letter "x" the first letter of your word?". If the answer is "no", it deletes all the words that contain that letter in that position, then moves on to the second most used letter in that position, and so on. If the answer is "yes", it deletes all the words that don't contain that letter in that position and goes on to the next letter of the word. This repeats until the word is finished.
It works every time, but sometimes it takes quite a long time. Is there a better way to do it? AI? Machine learning, maybe?
The code is not important since I'm just looking for ideas, but if anyone is curious, here is how I did it:
import os
from unicodedata import normalize
import random
import string

# Helpers that strip punctuation and accents from words
def remover_pont(txt):
    return txt.translate(str.maketrans('', '', string.punctuation))

def remover_acentos(txt):
    return normalize('NFKD', txt).encode('ASCII', 'ignore').decode('ASCII')

# Returns a list with the letters most used at that position, in order
def letramusada(lista, pletra):
    pletraordem = []
    pletraordem2 = []
    pl = []
    for n in lista:
        try:
            pl.append(n[pletra - 1])
        except:
            pass
    dict = {}
    for k in pl:
        if k in dict:
            dict[k] += 1
        else:
            dict[k] = 1
    pletraordem2 = (sorted(dict.items(), key=lambda t: t[1], reverse=True))
    for c in pletraordem2:
        pletraordem.append(c[0])
    return pletraordem

# Reads the "database" that contains the words and stores them in the variable "palavras"
file = open('palavras.txt')
palavras = file.read().split("\n")
# Stores the number of letters in the word the user thought of
nletras = int(input('Digite o número de letras da palavra (considerando hífen, caso haja) que você pensou, com máximo de 8: '))
# Declares lists that are used below
npalavras = []
palavras2 = []
palavras3 = []
# Stores every word with the chosen number of letters in the new list "npalavras", without accents
for n in palavras:
    if nletras == len(n):
        npalavras.append(remover_acentos(n).lower())
c = 0
n = 0
for k in range(1, nletras + 1):
    ordem = letramusada(npalavras, k)
    cond = 0
    try:
        while cond == 0:
            if len(npalavras) < 20 and c == 0:
                print("\nHmmm, estou chegando perto!\n")
                c += 1
            if len(npalavras) < 3:
                break
            for c in ordem:
                if c != 0:
                    r = str(input("A {} letra da sua palavra é a letra \"{}\"? [S/N] ".format(k, c))).lower()
                    r = r[0]
                    if r == "s":
                        # Keep only the words with this letter at position k
                        for n in npalavras:
                            if n[k-1] == c:
                                palavras2.append(n)
                        npalavras.clear()
                        npalavras = palavras2[:]
                        palavras2.clear()
                        ordem.clear()
                        cond += 1
                        break
                    else:
                        # Drop the words with this letter at position k
                        for n in npalavras:
                            if n[k-1] != c:
                                palavras2.append(n)
                        npalavras.clear()
                        npalavras = palavras2[:]
                        palavras2.clear()
                        r = 0
                        pass
    except:
        n = 1
        print("\nDesculpe, não achei nenhuma palavra :(")
escolha = random.choice(npalavras)
if n != 0:
    print("\nA palavra que você pensou é: \"{}\"".format(escolha))

Brute Force Is Your Friend
People may think "machine learning" is a silver bullet, but what is there to learn, especially when so little information is provided? What can you optimize? Your description sounds like pure brute-force, dictionary-based password cracking, and attackers today use the power of GPUs for that.
This may be a little off topic, but even with a GPU the search can be hard. If you are not constrained to a specific language or platform, hashcat is useful. The famous 133 MB dictionary can be enumerated in 5 minutes on a MacBook Pro, which is far faster than guessing in Python.
The Search Space And Word Patterns
The average length of an English word is about 8, so this situation is really similar to a typical password, i.e. your search space is large: the upper bound is 26^8 = 208827064576 words, except that the player can only use a limited word list in the game.
The actual search space can be a little smaller, since there are patterns in English words (some letters, such as e, are far more frequent than others, and pairs like ae or as appear more often than az), but you are using a dictionary, so I don't think this helps.
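To put the numbers above in perspective, here is a quick calculation. The dictionary size of 100,000 words is an assumed figure for illustration only; it does not come from the question.

```python
import math

# Upper bound from the text: every 8-letter string over a 26-letter alphabet.
all_strings = 26 ** 8
print(all_strings)  # 208827064576

# Assumed dictionary size, purely for illustration.
dictionary_size = 100_000

# Each yes/no answer can at best halve the set of candidates, so this is the
# minimum number of questions needed in the worst case.
questions_all = math.ceil(math.log2(all_strings))
questions_dict = math.ceil(math.log2(dictionary_size))
print(questions_all)   # 38
print(questions_dict)  # 17
```

So restricting the game to a word list is what keeps the number of questions manageable.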
The Non Dictionary Approach
Another idea is that the process is quite close to recovering a DNA sequence, which also has patterns, although the given information may vary. Think of it as word suggestion. Bioinformatics uses the probabilistic patterns in DNA sequences for imputation.
This method helps when you can guess the word or sequence progressively. Otherwise, you can only use a brute-force approach (as when the word can only be recovered from a hash).
A classic method used for search engines, input methods and DNA imputation is the hidden Markov model. It guesses the next character based on your previous input, using probabilities pre-calculated from real words.
This can be combined with a dictionary to sort your suggestions (guesses) and provide more accurate guessing.
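As a rough sketch of that last idea, the snippet below ranks dictionary candidates by how "word-like" they are. Note that it uses a plain first-order Markov chain over letters (bigram counts), which is simpler than a full hidden Markov model. The function names and the tiny training list are made up for illustration.

```python
import math
from collections import Counter

def train_bigrams(words):
    """Count letter-to-letter transitions; '^' marks the start of a word."""
    pair_counts = Counter()
    prev_counts = Counter()
    for w in words:
        prev = "^"
        for ch in w:
            pair_counts[(prev, ch)] += 1
            prev_counts[prev] += 1
            prev = ch
    return pair_counts, prev_counts

def log_score(word, pair_counts, prev_counts, alphabet_size=26):
    """Log-probability of a word under the bigram model (add-one smoothing)."""
    score = 0.0
    prev = "^"
    for ch in word:
        numerator = pair_counts[(prev, ch)] + 1
        denominator = prev_counts[prev] + alphabet_size
        score += math.log(numerator / denominator)
        prev = ch
    return score

def rank_candidates(candidates, pair_counts, prev_counts):
    """Most probable candidates first."""
    return sorted(candidates,
                  key=lambda w: log_score(w, pair_counts, prev_counts),
                  reverse=True)

# Hypothetical training data; in practice this would be the full word list.
training = ["carro", "casa", "cama", "cara", "caro"]
pairs, prevs = train_bigrams(training)
ranked = rank_candidates(["cxzq", "casa"], pairs, prevs)
print(ranked)  # ['casa', 'cxzq']
```

The program could then ask about the highest-ranked candidates first instead of going strictly position by position.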

You could store the words that have already been used. Say the first user thought of the word 'carro'; you could add that to a file. After a few letters, the program could check that list for previously used words that match the description given so far (e.g. "has a c as first letter") and ask the next user whether "carro" is their word. You could improve this further by adding a counter to each word, so that words used more often are tried before words used less often.
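A minimal sketch of that idea, assuming the usage counts are kept in a Counter (saving them to a file between games is left out). The names matches and best_guess are hypothetical.

```python
from collections import Counter

def matches(word, known):
    """known maps 0-based positions to confirmed letters, e.g. {0: 'c'}."""
    return all(pos < len(word) and word[pos] == letter
               for pos, letter in known.items())

def best_guess(usage, length, known):
    """Return the most frequently used past word that fits, or None."""
    fitting = [(count, word) for word, count in usage.items()
               if len(word) == length and matches(word, known)]
    if not fitting:
        return None
    return max(fitting)[1]

# Hypothetical history of words that earlier players thought of.
usage = Counter({"carro": 3, "casas": 1, "livro": 5})

print(best_guess(usage, 5, {0: "c"}))  # carro

# After a game ends, record the word that was actually chosen.
usage["carro"] += 1
```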

There's another post that discusses word-suggestion algorithms; it even has Python code for one.
Here's the link: What algorithm gives suggestions in a spell checker?

Related

Find alphabet neighbour letters in string (2 types: aABClg = ABC, aAbcd = aA)

EDIT: The answer lies in using the enumerate function to access the index of each item in an iterable, but I'm still working on applying it properly.
I was asked to generate a random word of length N, and to print the runs of neighbouring uppercase letters that are consecutive in the alphabet, and the lowercase-uppercase pairs of the same letter. I really don't know how to put this better.
The example is in the title. Here is my code so far; I think it mostly works, and I just need to fix the error caused by the index lookup in the ascii_uppercase list variable.
Also, please excuse the messy do-while loop at the beginning.
import string
import random
letters = list(string.ascii_uppercase + string.ascii_lowercase)
enter = int(input('N: '))
def randomised(signs = string.ascii_uppercase + string.ascii_lowercase, X = enter):
    return( ''.join(random.choice(signs) for _ in range(X)))
if enter == 1:
    print('END')
while enter > 1:
    enter = int(input('N: '))
    word1 = randomised()
    word2 = list(word1)
    neighbour = ''
    same = ''
    for j in word2:
        if word2[j] and word2[j+1] in string.ascii_uppercase and letters.index(j) == word2.index(j) and letters.index(j+1) == word2.index(j+1):
            same += j
            same += j+1
    for i in word2:
        if word2[i] == i.upper and word2[i+1] == (i+1).upper:
            neighbour += i
            neighbour += i+1
    print('Created: {}, Neighbour uppercase letters: {}, Neighbour same letters: {}' .format(word1,same,neighbour))
expected behaviour:
N: 7
Created: aaBCDdD, Neighbour uppercase letters: BCD, Neighbour same letters: dD
N: 1
N: END
I am not sure, but I think your problem might stem from using "i+1" and "j+1" without stopping the iteration before the last element.
The next thing is that you should write this code in English so that people can provide better answers. I am not a native English speaker, but the core of my code is in English so that it is understandable worldwide. There are other general improvements that could be made, but those are learned while coding.
I hope your assignment goes great.
Another recommendation: you can use "a".islower() to check whether a character is lowercase and isupper() to check whether it is uppercase, so you don't need to import the whole alphabet. There are many built-in functions for common scenarios, and they tend to be more efficient than what most people would write without them.
Edit: the error occurs because you are doing string + number (j is a character, so j+1 fails).
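To illustrate the points above, here is a small sketch that walks the word by index with enumerate and stops one character before the end, so that the "next" character always exists. It only covers the lowercase-uppercase pairs; the function name lower_upper_pairs is made up for this example.

```python
def lower_upper_pairs(word):
    """Collect pairs such as 'dD': a lowercase letter followed by its uppercase form."""
    found = ""
    # word[:-1] stops one short of the end, so word[i + 1] always exists.
    for i, ch in enumerate(word[:-1]):
        nxt = word[i + 1]
        if ch.islower() and nxt.isupper() and ch.upper() == nxt:
            found += ch + nxt
    return found

print(lower_upper_pairs("aaBCDdD"))  # dD
```

Here i is an integer index, so i + 1 is valid, while ch and nxt are the characters themselves.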
Here is a simple working example (without formatting of the output). I used a simple pairwise iteration to compare each character with the next one.
import random
from string import ascii_letters

# generation of random word
N = 50
word = ''.join(random.choices(ascii_letters, k=N))
# forcing word for testing
word = 'aaBCDdD'
# test of conditions
cond1 = ''
cond2 = ''
flag1 = False # flag to add last character of stretch
for a,b in zip('_'+word, word+'_'):
    if a.isupper() and b.isupper() and ord(a) == ord(b)-1:
        cond1 += a
        flag1 = True
    elif flag1:
        flag1 = False
        cond1 += a
    if a.islower() and b.isupper() and a.lower() == b.lower():
        cond2 += a+b
print(cond1, cond2, sep='\n')
# BCD
# dD
NB. If the conditions are met several times in the word, the identified patterns are simply concatenated.
Example on a random word of 500 characters:
IJRSOPHILMMNVW
kKdDyY

Convert string with accent into numbers (RSA encryption)

My math teacher asked us to program the RSA encryption/decryption process in Python. So I've created the following functions:
lettre_chiffre(T), which converts each character in the string into a number with the ord() function
chiffre_lettre(T), which does the opposite with chr()
Since these functions create 4-digit blocks, I need to encrypt with RSA using 5-digit blocks to prevent frequency analysis.
The problem is that the ord() function doesn't work well with French accented characters such as "é" and "à".
Therefore, I was interested in using bytearray, but I have no idea how to use it.
How can I make this program work with accents? Encrypting and decrypting as bytes with bytearray is not working for "é" and "à", for example.
def lettre_chiffre(T):
    Message_chiffre = str('')
    for lettre in T:
        if ord(lettre) < 10000:
            nombre = str(ord(lettre))
            while len(nombre) != 4:
                nombre = str('0') + nombre
            Message_chiffre += nombre
        else:
            print("erreur lettre : ",lettre)
    while len(Message_chiffre)%4 != 0:
        Message_chiffre = str('0') + Message_chiffre
    return str(Message_chiffre)

def chiffre_lettre(T):
    Message_lettre = str('')
    A =T
    for i in range(int(len(str(A))/4)):
        nombre = str(A)[4*i:4*i+4]
        if int(nombre) < 10000:
            Message_lettre += str(chr(int(nombre)))
    return Message_lettre
Refer to this post: https://stackoverflow.com/a/2788599
What you need is (Python 2 syntax):
>>> import unicodedata as ucd
>>> '\xc3\xa9'.decode('utf8')
u'\xe9'
>>> u = '\xc3\xa9'.decode('utf8')
>>> u
u'\xe9'
>>> ucd.name(u)
'LATIN SMALL LETTER E WITH ACUTE'
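In Python 3, strings are already Unicode, so one possible approach is to encode the text as UTF-8 and work on the bytes. The sketch below writes every byte as three decimal digits (a byte is at most 255), so accented characters simply take two bytes. The function names text_to_digits and digits_to_text are made up for this example, and cutting the digit string into blocks for RSA is a separate step not shown here.

```python
def text_to_digits(text):
    """Encode text as UTF-8 and write each byte as three decimal digits."""
    return "".join(f"{byte:03d}" for byte in text.encode("utf-8"))

def digits_to_text(digits):
    """Inverse of text_to_digits: read three digits per byte, then decode."""
    values = [int(digits[i:i + 3]) for i in range(0, len(digits), 3)]
    return bytes(values).decode("utf-8")

encoded = text_to_digits("é à")
print(encoded)                  # 195169032195160
print(digits_to_text(encoded))  # é à
```

Because each byte always occupies exactly three digits, decoding does not depend on which characters were in the message.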

printing each letter of a word + another letter - python

I am doing this Python problem where I have to get a word from the user, reverse it, and print the reversed word letter by letter, adding one letter each time. I've managed to reverse the word. I know I can use the
for c in word
statement, but I'm unsure how to add a letter each time.
Below are the instructions and my code.
The children's song Bingo is from 1780!
In the song, each verse spells the name "Bingo", removing one letter from the name each time.
When writing this program, you'll need to work out a few things:
You need a way to reverse the dog's name. You need a loop to build up the dog's name letter by letter. Each time you go through the loop you add another letter to the reversed name.
An example:
Name: bingo
o
og
ogn
ogni
ognib
And ognib was their name-o
Code I have:
name = input("Name: ")
reversed_text = ''
last_index = len(name) - 1
for i in range(last_index, -1, -1):
    reversed_text += name[i]
print(reversed_text)
Thanks
Your code was pretty much on point. All you needed to do was indent the last line, so that it prints reversed_text each time a letter is added to it.
name = input("Name: ")
reversed_text = ''
last_index = len(name) - 1
for i in range(last_index, -1, -1):
    reversed_text += name[i]
    print(reversed_text)
Python's reverse list slicing can help you here.
name = input()
for i in range(2,len(name)+2):
    print(name[-1:-i:-1])
Output:
o
og
ogn
ogni
ogniB
Here is one way to do what you want. Note that I changed your method of reversing the string; my way uses just one line. The slice name[::-1] steps through the string backwards, because the -1 in the slice moves the index back by 1 each time.
The partial names are printed in a loop, with each iteration printing a slice of the reversed name. Let me know if you have any questions.
name = input("Name: ")
reversed_text = name[::-1]
for i in range(1, len(name) + 1):
    print(reversed_text[:i])
print('And', reversed_text, 'was their name-o')
This prints:
o
og
ogn
ogni
ognib
And ognib was their name-o
I am not sure I understand all your requirements, but this may be useful to you:
word = 'Bingo'
drow = ''
for c in reversed(word):
    drow += c
    print(drow)
Output:
o
og
ogn
ogni
ogniB

append item to list at a random index of initial list

I have a list of answers for a quiz I am making. I would like to make it multiple choice, and in multiple-choice quizzes the correct answer should not always be at the bottom or at the same index. However, this is what my code does:
Answers = ["bogota", "carracas", "brasilia", "santiago", "london"]
Questions = ["colombia", "venezuela", "brasil", "chile", "england"]
q = [Questions[i] for i in sorted(random.sample(range(len(Questions)), 3))]
tryindex = [i for i, x in enumerate(QuestionsT) if x in q]
Ca = [Answers[i] for i in tryindex]
for x in q:
    Pa = [i for i in random.sample(Answers, 3) if i !=q.index(x)]
    Pa.append(Ca[q.index(x)])
    print("what is the capital of:" + x + "?")
    print("\n".join(Pa))
    a = input("\n""Answer")
    for i in range(0,3):
        if a == Ca[i]:
            score +=1
This returns, e.g. for one iteration:
what is the capital of: colombia?
london
carracas
brasilia
santiago
bogota
Notice that bogota is at the bottom because of the .append(Ca[q.index(x)]).
What I would like is for the correct answer in this case to be inserted into Pa (the possible answers) at a random position. Is there a way to do this?
Answers is the general list of all possible answers.
Questions is the general list of all possible questions.
In both of the above lists each element is referenced by its index, so by finding the index of an element in Questions it is possible to find the value held in Answers at the same index.
q is the list of questions selected randomly for the quiz.
Ca is the list of correct answers for the questions in q.
Pa is the list of possible answers, randomly drawn from the general list Answers.
Here is a clean solution. Take a look, change what you want, and ask about anything you don't get.
# -*- coding: utf-8 -*-
# Imports
import random

# Parameters
data = {'Brasil': 'Brasilia',
        'Chile': 'Santiago',
        'Colombia': 'Bogota',
        'England': 'London',
        'Venezuela': 'Carracas'}
nbr_questions = 3
score = 0
former_questions = ['']

# Script
for i in range(nbr_questions):
    # Grab the country / capital pair from the data dictionary
    capital = ''
    while capital in former_questions:
        country, capital = random.choice(list(data.items()))
    # Create the proposition display list
    proposition_display = list()
    proposition_display.append(capital)
    i = 0
    while i < 2:
        cap = random.choice(list(data.values()))
        if cap not in proposition_display:
            proposition_display.append(cap)
            i += 1
    # Shuffle so the correct answer is not always listed first
    random.shuffle(proposition_display)
    # Display
    print ('What is the capital of {} ?'.format(country))
    print ('\n'.join(proposition_display))
    answer = input ('Answer: ')
    if answer.lower().strip(' ') == capital.lower().strip(' '):
        print ('Correct!')
        score += 1
    else:
        print ('Wrong. Answer was {}.'.format(capital))
    print ('-----------------------------')
    # Add this capital to the list former_questions to avoid repetition
    former_questions.append(capital)
print ('Your score is: {}'.format(score))
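If you only want the specific thing asked in the question (putting the correct answer at a random index instead of appending it), list.insert combined with random.randrange is one way to do it. This is a minimal sketch; the helper name build_options and its parameters are made up for illustration.

```python
import random

def build_options(correct, all_answers, n_options=3):
    """Return n_options choices with the correct answer at a random index."""
    wrong = [a for a in all_answers if a != correct]
    options = random.sample(wrong, n_options - 1)
    # randrange(len(options) + 1) also allows the last slot,
    # so every position is possible.
    options.insert(random.randrange(len(options) + 1), correct)
    return options

answers = ["bogota", "carracas", "brasilia", "santiago", "london"]
print(build_options("bogota", answers))
```

Calling random.shuffle on the finished list would work just as well; insert is shown here because it matches the wording of the question.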

Multiple if statements under one code, with multiple conditons [duplicate]

This question already has answers here:
How do boolean operators work in 'if' conditions?
(3 answers)
"or" condition causing problems with "if"
(4 answers)
Closed 9 years ago.
In French, country names are feminine when they end with the letter e, and masculine otherwise. There are 6 exceptions (belize, cambodge, mexique, mozambique, zaire, zimbabwe). I am to write a program that takes an input and adds le or la in front, depending on whether it is masculine or feminine.
Also, if the country name starts with a vowel, it needs to print l' in front instead of le or la.
One more condition: if the input is one of these two plural countries (etats-unis, pays-bas), it is to print les in front.
Here is my current code:
vowels=("aeiouAEIOU")
word=input("Enter a french country :")
if word==("belize")or("cambodge")or("mexique")or("mozambique")or("zaire")or("zimbabe"):
    print("le",word)
elif word==("etats-unis")or("pays-bays"):
    print("les",word)
elif word.endswith("e"):
    print("le",word)
else:
    print("la",word)
if word.startswith(vowels):
    print("l'",word)
The problem I'm having is that no matter what input I use, it always prints le in front.
For example: input Canada; output le Canada.
Why is it not testing the other conditions?
It's because:
if word == "A" or "B":
isn't the same as
if word == "A" or word == "B":
The first evaluates (word == "A") or ("B"), and a non-empty string such as "B" is always truthy.
So the first version always evaluates to true. Here's an example:
>>> X = "asdf"
>>> if(X):
... print("hurray")
...
hurray
>>>
Give this a shot:
exceptions = set("belize cambodge mexique mozambique zaire zimbabwe".split())
vowels = set('aeiou')
plurals = set("etats-unis pays-bas".split())
word, sentinel = "", "quit"
while word != sentinel:
    word = input("Enter the name of a country: ")
    if word == sentinel:
        continue
    # Masculine: one of the exceptions, or a name that does not end in "e"
    male = word in exceptions or word[-1].lower() != 'e'
    plurality = word in plurals
    apo = word[0].lower() in vowels
    # Check plurals first, since "etats-unis" also starts with a vowel
    if plurality:
        print("les", word)
    elif apo:
        print("l'%s" %word)
    else:
        print("le" if male else "la", word)
Use:
if word in ["belize", "cambodge", "mexique", "mozambique", "zaire", "zimbabwe"]:
    print("le",word)
The problem here is that word==("belize")or("cambodge")or("mexique") is not doing what you think. There are lots of explanations of this around, but to get it to work, you either need to do what I have above or something like:
if word=="belize" or word=="cambodge" or word=="mexique": # etc
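To see what the original condition actually evaluates to, you can print it. This short sketch uses a made-up input value ("canada") and only two of the country names to keep it brief.

```python
word = "canada"

# `==` binds tighter than `or`, so this is (word == "belize") or "cambodge".
# The comparison is False, so `or` returns its right operand, a non-empty string.
result = word == "belize" or "cambodge"
print(result)        # cambodge
print(bool(result))  # True

# Comparing against every option, or using `in`, gives the intended answer.
explicit = word == "belize" or word == "cambodge"
membership = word in ("belize", "cambodge")
print(explicit)      # False
print(membership)    # False
```

This is why the first branch of the if statement runs for every input.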
