Comparing the Nth letter to Nth letters of multiple strings in python

I can't quite figure this one out.
I have several five-letter strings and I want to compare each of them, letter by letter, to a single string, to find out whether any of their Nth letters is equal to the Nth letter of the string I'm comparing them to, like this:
string_1 = 'ghost'
string_2 = 'media'
string_3 = 'blind'
the_word = 'shine'
if the_word[0] == string_1[0] or the_word[0] == string_2[0] or the_word[0] == string_3[0] or the_word[1] == string_1[1] or the_word[1] == string_2[1]... and so on...
    print('The Nth letter of some of the strings is equal to the Nth letter of the_word')
else:
    print('None of the letters positions correspond')
If there are multiple strings I want to compare, the if statement gets very long, so there must be a better way of doing this.
I would also like to know what the corresponding letters are (in this case they would be H (string_1[1] == the_word[1]), I (string_3[2] == the_word[2]) and N (string_3[3] == the_word[3])).
If there is more than one corresponding letter, I would like the result to be a list containing all of the letters.
Also, I don't need to know the position of a corresponding letter in the word, only whether there are any corresponding letters (and what they are).
I find this kind of hard to explain so sorry for possible confusion, will be happy to elaborate.
Thank you!

IIUC, you can get to what you want using zip -
base_strings = zip(string_1, string_2, string_3)
for cmp_pair in zip(the_word, base_strings):
    if (cmp_pair[0] in cmp_pair[1]):
        print(cmp_pair[0])
Output
h
i
n
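Since the question asks for a list of the matching letters rather than printed output, the same zip idea can be wrapped in a function. This is only a sketch: the name matching_letters and the choice to drop repeated letters are assumptions, not part of the answer above.

```python
def matching_letters(the_word, strings):
    """Collect letters of the_word that sit at the same index in any string."""
    matches = []
    # zip(*strings) yields one tuple per position: all first letters,
    # then all second letters, and so on.
    for letter, column in zip(the_word, zip(*strings)):
        if letter in column and letter not in matches:
            matches.append(letter)
    return matches


print(matching_letters('shine', ['ghost', 'media', 'blind']))  # ['h', 'i', 'n']
```

An empty list means no positions correspond, so the result can be used directly in an if statement.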

You can extract the logic to a dedicated function and call it over each character of the string to be checked:
string_1 = 'ghost'
string_2 = 'media'
string_3 = 'blind'
the_word = 'shine'
def check_letter(l, i, words):
    match = []
    for w in words:
        if w[i] == l:
            match.append(w)
    return match
for i in range(len(the_word)):
    l = the_word[i]
    print("checking letter: {}".format(l))
    match = check_letter(l, i, [string_1, string_2, string_3])
    if (len(match) > 0):
        print("found in: {}".format(match))
    else:
        print("found in: -")
The above code results in:
$ python3 test.py
checking letter: s
found in: -
checking letter: h
found in: ['ghost']
checking letter: i
found in: ['blind']
checking letter: n
found in: ['blind']
checking letter: e
found in: -

Maybe this answers your question:
strings = ['ghost', 'media', 'blind']
the_word = 'shine'
for s in strings:
    check = []
    lett = []
    for i in range(len(s)):
        if s[i] == the_word[i]:
            check.append(i)
            lett.append(s[i])
    if check:
        print('The letters {0} (position {1}) of the string {2} match to the word {3}'.format(lett,check,s,the_word))
    else:
        print('No match between {0} and {1}'.format(s,the_word))

Well, one straightforward way would be the following:
string_1 = 'ghost'
string_2 = 'media'
string_3 = 'blind'
string_4 = 'trenn'
the_word = 'shine'
string_list = [string_1, string_2, string_3]
duplicate_letters_list = []
for string in string_list:
    for i in range(5):
        if the_word[i] == string[i]:
            print(f'{i}th letter is in {string} is a duplicate')
            if the_word[i] not in duplicate_letters_list:
                duplicate_letters_list.append(the_word[i])
print(duplicate_letters_list)
Output
1th letter is in ghost is a duplicate
2th letter is in blind is a duplicate
3th letter is in blind is a duplicate
['h', 'i', 'n']

Related

split a string to have chunks containing the maximum number of possible characters

e.g. string = 'bananaban'
=> ['ban', 'anab', 'an']
My attempt:
def apart(string):
    letters = []
    for i in string:
        while i not in letters:
            letters.append(i)
    print("The letters are:" +str(letters))
    x = []
    result = []
    return result
string = str(input("Enter string: "))
print(apart(string))
Basically, once I know all the letters that are in the word/string, I want to add them to x until x contains all the letters. Then I want to add x to result.
In my example "bananaban" it would mean [ban] is one x, because "ban" contains the letters "b", "a" and "n". The same goes for [anab]. [an] only contains "a" and "n" because it is the end of the word.
Would be cool if somebody could help me ^^

IIUC, you want to split after all characters are in the current chunk.
You could use a set to keep track of the seen characters:
s = 'bananaban'
seen = set()
letters = set(s)
out = ['']
for c in s:
    if seen != letters:
        out[-1] += c
        seen.add(c)
    else:
        seen = set(c)
        out.append(c)
output: ['ban', 'anab', 'an']

The logical way seems to be to first create a set with all the letters in your string, then go over the original string, collecting each character and starting a new collection each time the set of letters in the collection matches the original set.
def apart(string):
    target = set(string)
    result = []
    component = ""
    for char in string:
        component += char
        if set(component) == target:
            result.append(component)
            component = ""
    if component:
        result.append(component)
    return result

Using a set of the characters in the string, you can loop through the string and add or extend the last group in your resulting list:
S = "bananaban"
chars = set(S) # distinct characters of string
groups = [""] # start with an empty group
for c in S:
    if chars.issubset(groups[-1]): # group contains all characters
        groups.append(c) # start a new group
    else:
        groups[-1] += c # append character to last group
print(groups)
['ban', 'anab', 'an']

How to replace the specified dash with the letter

I wish to write a hangman program, and in order to do so I have to replace the dash ('-') character(s) with the user's guessed letter (guess). But when I run the code, it replaces all the dashes with the user's guessed letter.
The code seems okay but I don't get the desired result.
words is a list of words I have written before the function.
def word_guess():
    random.shuffle(words)
    word = words[0]
    words.pop(0)
    print(word)
    l_count = 0
    for letter in word:
        l_count += 1
    # the hidden words are shown a '-'
    blank = '-' * l_count
    print(blank)
    guess = input("please guess a letter ")
    if guess in word:
        # a list of the position of all the specified letters in the word
        a = [i for i, letter in enumerate(word) if letter == guess]
        for num in a:
            blank_reformed = blank.replace(blank[num], guess)
        print(blank_reformed)
word_guess()
E.g., when the word is 'funny' and guess is 'n', the output is 'nnnnn'.
How should I replace only the desired dash with the guessed letter?

it replaces all the dashes
This is exactly what blank.replace is supposed to do, though.
What you should do is replace that single character of the string. Since strings are immutable, you can't really do this. However, lists of strings are mutable, so you could do blank = ['-'] * l_count, which would be a list of dashes, and then modify blank[num]:
for num in a:
    blank[num] = guess
print(blank)
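To show how the list-of-dashes idea carries across several guesses, here is a minimal sketch. The helper name reveal and the use of ''.join for display are assumptions added for illustration; they do not come from the question's code.

```python
def reveal(word, blank, guess):
    """Return a copy of blank with guess filled in wherever word contains it."""
    updated = list(blank)  # copy, so the caller's list is not modified
    for i, letter in enumerate(word):
        if letter == guess:
            updated[i] = guess
    return updated


blank = ['-'] * len('funny')
blank = reveal('funny', blank, 'n')
print(''.join(blank))  # --nn-
blank = reveal('funny', blank, 'f')
print(''.join(blank))  # f-nn-
```

Keeping blank as a list and joining it only for printing means earlier correct guesses stay visible on later turns.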

A couple things to note:
inefficient/un-pythonic pop operation (see this)
l_count is just len(word)
un-pythonic, unreadable replacement
Instead, here's a better implementation:
def word_guess() -> str:
    random.shuffle(words)
    word = words.pop()
    guess = input()
    out = ''
    for char in word:
        if char == guess:
            # strings have no append method, so build the result with +=
            out += char
        else:
            out += '-'
    return out

If you don't plan to use the locations of the correct guess later on, then you can simplify the last section of code:
word = 'hangman'
blank = '-------'
guess = 'a'
if guess in word:
    blank_reformed = ''.join(guess if word[i] == guess else blank[i] for i in range(len(word)))
blank_reformed
'-a---a-'
(You still have some work to do to make the overall game work...)

python given query string find a set of strings with same beginning

Edit: I appreciate all the answers but could anyone tell me why my solution is not working? I wanted to try to do this without the .startswith() thank you!
I am trying to complete this exercise:
Implement an autocomplete system. That is, given a query string and a set of all possible query strings,
return all strings in the set that have s as a prefix.
For example, given the query string de and the set of strings [dog, deer, deal], return [deer, deal].
Hint: Try preprocessing the dictionary into a more efficient data structure to speed up queries.
But I get an empty list. What could I be doing wrong? I thought this would give me [deer, deal].
def autocomplete(string,set):
    string_letters = []
    letter_counter = 0
    list_to_return = []
    for letter in string:
        string_letters.append(letter)
    for words in set:
        for letter in words:
            if letter_counter == len(string):
                list_to_return.append(words)
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                break
    return list_to_return
print(autocomplete("de", ["dog","deer","deal"]))
output:
[]

Here is how I would accomplish what you are trying to do:
import re
strings = ['dog', 'deer', 'deal']
search = 'de'
pattern = re.compile('^' + search)
[x for x in strings if pattern.match(x)]
RESULT: ['deer', 'deal']
However, in most cases like this one, you will want to ignore the case of both the search string and the strings being searched.
import re
strings = ['dog', 'Deer', 'deal']
search = 'De'
pattern = re.compile('^' + search, re.IGNORECASE)
[x for x in strings if pattern.match(x)]
RESULT: ['Deer', 'deal']
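One caveat with building the pattern by string concatenation: if the search text contains regex metacharacters such as . or *, they are interpreted as pattern syntax. A hedged sketch that guards against this with re.escape follows; the function name prefix_matches and its ignore_case parameter are illustrative assumptions.

```python
import re


def prefix_matches(search, strings, ignore_case=False):
    """Return the strings that begin with search, treated as literal text."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape neutralises metacharacters; pattern.match already anchors
    # at the start of the string, so no leading '^' is needed.
    pattern = re.compile(re.escape(search), flags)
    return [s for s in strings if pattern.match(s)]


print(prefix_matches('de', ['dog', 'deer', 'deal']))
print(prefix_matches('De', ['dog', 'Deer', 'deal'], ignore_case=True))
```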
To answer the part of why your code does not work, it helps to add some verbosity to the code:
def autocomplete(string,set):
    string_letters = []
    letter_counter = 0
    list_to_return = []
    for letter in string:
        string_letters.append(letter)
    for word in set:
        print(word)
        for letter in word:
            print(letter, letter_counter, len(string))
            if letter_counter == len(string):
                list_to_return.append(word)
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                print('hit break')
                break
    return list_to_return
print(autocomplete("de", ["dog","deer","deal"]))
Output:
dog
('d', 0, 2)
('o', 1, 2)
hit break
deer
('d', 1, 2)
hit break
deal
('d', 1, 2)
hit break
[]
As you can see in the output, for dog the 'd' matched but the 'o' did not, which left letter_counter at 1. When deer comes up, its 'd' is compared against the 'e' of the query, so the loop breaks, and the same thing happens for every later word because the counter is never reset. To fix this you need to reset letter_counter inside the for loop over the words, and add break points so that the index never runs past the end of the query.
def autocomplete(string,set):
    string_letters = []
    list_to_return = []
    for letter in string:
        string_letters.append(letter)
    for word in set:
        # Reset letter_counter as it is only relevant to this word.
        letter_counter = 0
        print(word)
        for letter in word:
            print(letter, letter_counter, len(string))
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                # We did not match break early
                break
            if letter_counter == len(string):
                # We matched for all letters append and break.
                list_to_return.append(word)
                break
    return list_to_return
print(autocomplete("de", ["dog","deer","deal"]))

I notice the hint, but it's not stated as a requirement, so:
def autocomplete(string,set):
    return [s for s in set if s.startswith(string)]
print(autocomplete("de", ["dog","deer","deal"]))
str.startswith(n) will return a boolean value, True if the str starts with n, otherwise, False.
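For the exercise's hint about preprocessing, one possible approach (an assumption here, not something any answer proposes) is to sort the words once and use binary search to jump to the first candidate. The names build_index and autocomplete_sorted are hypothetical.

```python
from bisect import bisect_left


def build_index(words):
    """Preprocess once: in sorted order, words sharing a prefix are adjacent."""
    return sorted(words)


def autocomplete_sorted(prefix, index):
    """Return every word in the sorted index that starts with prefix."""
    start = bisect_left(index, prefix)  # first word that is >= prefix
    results = []
    for word in index[start:]:
        if not word.startswith(prefix):
            break  # past the block of matching words
        results.append(word)
    return results


index = build_index(["dog", "deer", "deal"])
print(autocomplete_sorted("de", index))  # ['deal', 'deer']
```

Results come back in sorted order rather than input order. Each query costs a binary search plus the number of matches, instead of a scan over every word.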

You can just use the startswith string function and avoid all those counters, like this:
def autocomplete(string, set):
    list_to_return = []
    for word in set:
        if word.startswith(string):
            list_to_return.append(word)
    return list_to_return
print(autocomplete("de", ["dog","deer","deal"]))

Simplify.
def autocomplete(string, set):
    back = []
    for elem in set:
        # compare against the whole query, not just its first letter
        if elem.startswith(string):
            back.append(elem)
    return back
print(autocomplete("de", ["dog","deer","deal","not","this","one","dasd"]))

Alternatives to index()

So for my project I have to allow the user to input a sentence and then input a word, find all the occurrences of the word, and print their positions. Here's what I have:
found = 0
sen = input("Enter the sentence you would like to break down!")
sen1 = sen.upper()
list = sen1.split()
search=input("Enter the word you want to search")
search1 = search.upper()
for search1 in list:
    found = found + 1
    position=list.index(search1)
    if position == 0:
        print("First word in the sentence")
    if position == 1:
        print("Second word in the sentence")
    if position == 2:
        print("Third word in the sentence")
    if position == 3:
        print("Fourth word in the sentence")
    if position == 4:
        print("Fifth word in the sentence")
    if position == 5:
        print("6th word in the sentence")
    else:
        position1 = position + 1
        print(position1, "th word in the sentence")
But it only prints the first occurrence of the word and rarely works. Any solutions?

Rename the variable list to a_list so that it no longer shadows the built-in.
To get a list of the positions where search1 occurs:
positions = [idx for idx, el in enumerate(a_list) if el == search1]

You have a great alternative which is re.finditer:
import re
sen = input("Enter the sentence you would like to break down!")
search = input("Enter the word you want to search")
for match in re.finditer(search, sen):
    print(match.start())
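Note that re.finditer reports character offsets into the sentence, not word numbers, and that a bare pattern also matches inside longer words (searching for "his" would hit "this"). A minimal sketch that restricts matches to whole words is below; the function name word_offsets and the case-insensitive flag are assumptions chosen to mirror the question's use of upper().

```python
import re


def word_offsets(sentence, word):
    """Return the character offset of each whole-word occurrence of word."""
    # \b keeps matches on word boundaries; re.escape treats word literally.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)
    return [m.start() for m in pattern.finditer(sentence)]


print(word_offsets('this has this like this', 'This'))  # [0, 9, 19]
```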

Several comments have mentioned the danger of using list as a variable name. It's not actually a reserved word, but it is the name of a built-in type, and shadowing it by using it as a variable name can lead to mysterious bugs if you later wish to use this type to construct a list or test the type of an object.
A major problem with the code you posted is here:
search1 = search.upper()
for search1 in list:
The first line saves the upper-case version of the string search to the name search1. But the next line simply clobbers that with the words in list; it does not perform any searching operation. At the end of the for loop, search1 will be equal to the last item in list, and that's why your code isn't doing what you expect it to when it executes position=list.index(search1): you're telling it to find the position of the last word in list.
You could use .index to do what you want. To find multiple occurrences you need to use a loop and pass .index a starting position. E.g.,
def find_all(wordlist, word):
    result = []
    i = 0
    while True:
        try:
            i = wordlist.index(word, i) + 1
            result.append(i)
        except ValueError:
            return result
However, there's really not much benefit in using .index here. .index performs its scan at C speed, so it's faster than scanning in a Python loop, but you probably won't notice much of a speed difference unless the list you're scanning is large.
The simpler approach is as given in Tomasz's answer. Here's a variation I wrote while Tomasz was writing his answer.
def ordinal(n):
    k = n % 10
    return "%d%s" % (n, "tsnrhtdd"[(n // 10 % 10 != 1) * (k < 4) * k::4])
def find_all(wordlist, word):
    return [i for i, s in enumerate(wordlist, 1) if s == word]
sen = 'this has this like this'
wordlist = sen.upper().split()
words = 'this has that like'
for word in words.split():
    pos = find_all(wordlist, word.upper())
    if pos:
        pos = ', '.join([ordinal(u) for u in pos])
    else:
        pos = 'Not found'
    print('{0}: {1}'.format(word, pos))
output
this: 1st, 3rd, 5th
has: 2nd
that: Not found
like: 4th
The code for ordinal was "borrowed" from this answer.

Python wordlist

I would like to compare the input letters (a dictionary) with the list (a text file of words) and print the words matching the input letters. What have I done wrong? (I know that at the moment I only print YES or NO if it finds a matching word. What's the best way to write this function, by the way?)
def ordlista(list):
    fil = open("ord.txt", "r")
    words = fil.readlines()
    list = []
    for w in words:
        w = w.strip()
        list.append(w)
    return list
chars = {}
word = raw_input("Write 9 letters: ")
for w in word:
    w = w.lower()
    if w not in chars:
        chars[w] = 1
    else:
        chars[w] += 1
if chars.keys() in ordlista(list):
    print "YES"
else:
    print "NO"

chars.keys() is a list, so
chars.keys() in ordlista(list):
will never be True. What you want is match the letter counts against each word in your list. So I'd suggest
charsum = sum(chars.values())
for word in wordlist:
    if len(word) == charsum and all([(word.count(c) == chars[c]) for c in chars]):
        print "YES for word '%s'" % word
EDIT: If you want those words to match which have at least the letter counts (i.e. a word with 3 a's will match an input of two a's), then you'll have to change the == to a >=.
EDIT2: Since you want exact matches, the easiest solution would be to count the number of chars and make sure the word has that length.
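The letter-count comparison can also be sketched with collections.Counter. Note that this is Python 3, unlike the Python 2 snippets in this thread, and that the function name matching_words and the exact flag are assumptions. With exact=False it uses a "word can be built from the given letters" rule, which is one possible reading and differs from the >= variant mentioned above.

```python
from collections import Counter


def matching_words(letters, wordlist, exact=True):
    """Return words whose letter counts fit the given letters."""
    available = Counter(letters.lower())
    matches = []
    for word in wordlist:
        needed = Counter(word.lower())
        if exact:
            ok = needed == available  # same letters, same counts
        else:
            # Counter subtraction drops non-positive counts, so an empty
            # result means the word needs nothing beyond the given letters.
            ok = not (needed - available)
        if ok:
            matches.append(word)
    return matches


print(matching_words('tac', ['cat', 'act', 'dog', 'ca']))
print(matching_words('tac', ['cat', 'act', 'dog', 'ca'], exact=False))
```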

You are checking for the presence of the entire list of keys in your character list, rather than checking for each key individually. You must iterate over your keys individually, and then check for their presence.
for k in chars:
    if k in ordlista(list):
        print "YES"
    else:
        print "NO"
If you want to print the words which consist solely of the letters in your character list, you may use the following approach.
for word in ordlista(list):
    if not filter(lambda char: char not in chars, word):
        print word

Use sets:
chars = set(raw_input("Write 9 letters: "))
for word in ordlista(None):
    if(set(word) == chars):
        print "YES for '%s'" % word
BTW, the argument list to ordlista is unnecessary, as it is not used. I would also advise against using the name list in general, because it hides the built-in <type 'list'>.
Update: I have read your comment on jellybean's post. If every letter can only be used once, you can obviously not use sets!
