Iterating through list, ignoring duplicates - python

I've written a program that attempts to find a series of letters (toBeFound - these letters represent a word) in a list of letters (letterList), however it refuses to acknowledge the current series of 3 letters as it counts the 'I' in the first list twice, adding it to the duplicate list.
Currently this code returns "incorrect", when it should return "correct".
letterList= ['F','I', 'I', 'X', 'O', 'R', 'E']
toBeFound = ['F', 'I', 'X']
List = []
for i in toBeFound[:]:
    for l in letterList[:]:
        if l== i:
            letterList.remove(l)
            List.append(i)
if List == toBeFound:
    print("Correct.")
else:
    print("Incorrect.")
letterList and toBeFound are sample values, the letters in each can be anything. I can't manage to iterate through the code and successfully ensure that duplicates are ignored. Any help would be greatly appreciated!

Basically, you're looking to see if toBeFound is a subset of letterList, right?
That is a hint to use sets:
In [1]: letters = set(['F','I', 'I', 'X', 'O', 'R', 'E'])
In [2]: find = set(['F', 'I', 'X'])
In [3]: find.issubset(letters)
Out[3]: True
In [4]: find <= letters
Out[4]: True
(BTW, [3] and [4] are different notations for the same operator.)

I think this would solve your problem. Please try it and let me know
letterList= ['F','I', 'I', 'X', 'O', 'R', 'E']
toBeFound = ['F', 'I', 'X']
found_list = [i for i in toBeFound if i in letterList]
print("Correct" if toBeFound == found_list else "Incorrect")

You could make the initial list a set, but if you want to look up a word like 'hello' it won't work, because you need both l's and a set keeps only one.
One way to solve this is to count the letters in dictionaries and check how many of each we have collected so far.
letterList = ['H', 'E', 'L', 'X', 'L', 'I', 'O']
toBeFound = ['H', 'E', 'L', 'L', 'O']
# build dictionary to hold our desired letters and their counts
toBeFoundDict = {}
for i in toBeFound:
    if i in toBeFoundDict:
        toBeFoundDict[i] += 1
    else:
        toBeFoundDict[i] = 1
letterListDict = {} # dictionary that holds values from input
output_list = [] # don't call it list, that would shadow the built-in
for letter in letterList:
    if letter in letterListDict: # already in dictionary
        # if we don't have too many of the letter, add it
        # (.get avoids a KeyError for repeated letters we are not looking for)
        if letterListDict[letter] < toBeFoundDict.get(letter, 0):
            output_list.append(letter)
        # update the dictionary
        letterListDict[letter] += 1
    else: # not in dictionary so let's add it
        letterListDict[letter] = 1
        if letter in toBeFoundDict:
            output_list.append(letter)
if output_list == toBeFound:
    print('Success')
else:
    print('fail')
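For comparison, the same counting idea can be sketched with collections.Counter from the standard library. The function name can_form is made up for this example, and it assumes that order does not matter, only that letterList holds enough copies of each letter.

```python
from collections import Counter

def can_form(letter_list, to_be_found):
    # Counter subtraction keeps only positive counts, so the result is
    # empty exactly when letter_list has enough copies of every letter.
    missing = Counter(to_be_found) - Counter(letter_list)
    return not missing

print(can_form(['H', 'E', 'L', 'X', 'L', 'I', 'O'], list('HELLO')))  # True
print(can_form(['H', 'E', 'L', 'X', 'I', 'O'], list('HELLO')))       # False
```

Unlike the loop above, this does not check the order of the letters.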

How about this: (I tested in python3.6)
import collections
letterList= ['F','I', 'I', 'X', 'O', 'R', 'E']
toBeFound = ['F', 'I', 'X']
a=collections.Counter(letterList) # print(a) does not show order
# but a.keys() has order preserved
final = [i for i in a.keys() if i in toBeFound]
if final == toBeFound:
    print("Correct")
else:
    print("Incorrect")

If you're looking to check if letterList has the letters of toBeFound in the specified order and ignoring repeating letters, this would be a simple variation on the old "file match" algorithm. You could implement it in a non-destructive function like this:
def letterMatch(letterList,toBeFound):
    i= 0
    for letter in letterList:
        if letter == toBeFound[i] : i += 1
        elif i > 0 and letter != toBeFound[i-1] : break
        if i == len(toBeFound) : return True
    return False
letterMatch(['F','I', 'I', 'X', 'O', 'R', 'E'],['F', 'I', 'X'])
# returns True
On the other hand, if what you're looking for is testing if letterList has all the letters needed to form toBeFound (in any order), then the logic is much simpler as you only need to "check out" the letters of toBeFound using the ones in letterList:
def lettermatch(letterList,toBeFound):
    missing = toBeFound.copy()
    for letter in letterList:
        if letter in missing : missing.remove(letter)
    return not missing
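Going back to the ordered case: if the stricter "stop at any other letter" rule is not needed, a looser check is whether toBeFound appears in letterList as a subsequence (same order, anything allowed in between). This is a sketch of that variation, not a rewrite of letterMatch, and is_subsequence is a name invented here.

```python
def is_subsequence(to_be_found, letter_list):
    it = iter(letter_list)
    # "letter in it" advances the iterator until it finds the letter,
    # so each later letter is searched for only after the previous match.
    return all(letter in it for letter in to_be_found)

letters = ['F', 'I', 'I', 'X', 'O', 'R', 'E']
print(is_subsequence(['F', 'I', 'X'], letters))  # True
print(is_subsequence(['X', 'I', 'F'], letters))  # False
```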

As requested.
letterList= ['F','I', 'I', 'X', 'O', 'R', 'E']
toBeFound = ['F', 'I', 'X']
List = []
for i in toBeFound[:]:
    for l in set(letterList):
        if l== i:
            List.append(i)
if List == toBeFound:
    print("Correct.")
else:
    print("Incorrect.")
This prints correct. I made the letterList a set! Hope it helps.

One simple way is to just iterate through toBeFound, and look for each element in letterList.
letterList= ['F','I', 'I', 'X', 'O', 'R', 'E']
toBeFound = ['F', 'I', 'X']
found = True
for x in toBeFound:
    # stop as soon as one wanted letter is missing from letterList
    if x not in letterList:
        found = False
        break
if found:
    print("Correct.")
else:
    print("Incorrect.")

Related

Pangram detection

beginner here--
Given a string, my code must detect whether or not it is a pangram: return True if it is, False if not. It should ignore numbers and punctuation.
When given "ABCD45EFGH,IJK,LMNOPQR56STUVW3XYZ" it returns None, and when given "This isn't a pangram! is not a pangram." it returns True when the answer should be False.
What am I not seeing?
import string
def is_pangram(s):
    singlechar = set(s)
    list = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
    for index, item in enumerate(singlechar):
        if item in list:
            list.remove(item)
            if list:
                return True
                break
            if not list:
                return False
Sets are a great way to check whether something belongs in two collections with their intersection or doesn't belong in one of the two with their difference.
In your case, if the intersection between the set of the letters in your phrase and the letters a-z is of length 26, it is a pangram.
from string import ascii_lowercase
def is_pangram(s):
    return len(set(s.lower()).intersection(ascii_lowercase)) == 26
You could also have kept using sets and their .difference method: the input is a pangram when no letter of the alphabet is missing from it. Before that, strip punctuation and whitespace from the string and make it lowercase (using the .lower, .translate and str.maketrans methods):
import string
def is_pangram(s):
    input_set = set(s.lower().translate(
        str.maketrans('', '', f'{string.punctuation} ')))
    check_set = set(string.ascii_lowercase)
    return not check_set.difference(input_set)
value1 = 'The quick brown fox jumps over a lazy dog!'
print(is_pangram(value1))
# True
value2 = 'This isn\'t a pangram! is not a pangram'
print(is_pangram(value2))
# False
If you want to still do it with a list:
def is_pangram(s):
    input_set = set(s.lower().translate(
        str.maketrans('', '', f'{string.punctuation} ')))
    lst = list(string.ascii_lowercase)
    for item in input_set:
        if item in lst:
            lst.remove(item)
            if not lst:
                return True
    return False
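As a shorter variant of the set-based checks, the test can be sketched as a subset comparison. Stripping punctuation first is not required in this form, since extra characters in the input set do not affect the result. This assumes only the 26 ASCII letters count.

```python
from string import ascii_lowercase

def is_pangram(s):
    # Every lowercase ASCII letter must appear in the lowercased input;
    # digits, punctuation and spaces are just extra set members.
    return set(ascii_lowercase) <= set(s.lower())

print(is_pangram("ABCD45EFGH,IJK,LMNOPQR56STUVW3XYZ"))        # True
print(is_pangram("This isn't a pangram! is not a pangram."))  # False
```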

How do I return a value within a loop within a function?

I have this almost figured out, but there is one thing. Basically I want to return a string without its vowels (a common challenge, I guess). This is similar to other challenges on CodeWars that I still haven't completed because of this. I have a for loop within a function, and I call the function to return a value.
For some reason I'm getting nothing back, or rather None, yet I get the result I want by printing on the same line, at the same indentation.
This is for a CodeWars challenge, so I need to return values instead of printing or logging (I know). I asked a friend and spent hours researching, but nothing helped.
def disemvowel(string):
    #aeiou
    vowel = ['a', 'e', 'i', 'o', 'u', 'A', 'E', 'I', 'O', 'U']
    aList = list(string) #'' to [...]
    for x in aList:
        for y in vowel:
            if x == y:
                #print(x)
                aList.remove(x)
    print(''.join(aList)) # "Ths wbst s fr lsrs LL!"
    return(''.join(aList)) # Nothing shows up here...
I expect the output of "Ths wbst s fr lsrs LL!" by returning but I get None.
https://www.codewars.com/kata/52fba66badcd10859f00097e/train/python
Source ^
To remove vowels from strings, the quickest way would be to use str.replace.
def remove_vowels(msg):
    vowels = ['a', 'e', 'i', 'o', 'u', 'A', 'E', 'I', 'O', 'U']
    for vowel in vowels:
        msg = msg.replace(vowel, '')
    return msg
Or use a generator expression with str.join:
def remove_vowels(msg):
    return ''.join(c for c in msg if c.lower() not in {'a', 'e', 'i', 'o', 'u'})
Examples:
>>> remove_vowels("Lorem ipsum dolor sit amet.")
'Lrm psm dlr st mt.'
>>> remove_vowels("This is it")
'Ths s t'
>>> remove_vowels("This website is for losers LOL!")
'Ths wbst s fr lsrs LL!'
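Another option, sketched here as an alternative to the answers above, is str.translate with a deletion table built by str.maketrans. The module-level name _NO_VOWELS is made up for this example.

```python
# Translation table that deletes every vowel
# (the third argument lists the characters to drop).
_NO_VOWELS = str.maketrans('', '', 'aeiouAEIOU')

def remove_vowels(msg):
    return msg.translate(_NO_VOWELS)

print(remove_vowels("This website is for losers LOL!"))  # Ths wbst s fr lsrs LL!
```

The table is built once and reused, so each call is a single pass over the string.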

How can method which evaluates a list to determine if it contains specific consecutive items be improved?

I have a nested list of tens of millions of lists (I can use tuples also). Each list is 2-7 items long. Each item in a list is a string of 1-5 characters and occurs no more than once per list. (I use single char items in my example below for simplicity)
#Example nestedList:
nestedList = [
    ['a', 'e', 'O', 'I', 'g', 's'],
    ['w', 'I', 'u', 'O', 's', 'g'],
    ['e', 'z', 's', 'I', 'O', 'g']
]
I need to find which lists in my nested list contain a pair of items so I can do stuff to these lists while ignoring the rest. This needs to be as efficient as possible.
I am using the following function but it seems pretty slow and I just know there has to be a smarter way to do this.
def isBadInList(bad, checkThisList):
    numChecks = len(checkThisList) - 1
    for x in range(numChecks):
        if checkThisList[x] == bad[0] and checkThisList[x + 1] == bad[1]:
            return True
        elif checkThisList[x] == bad[1] and checkThisList[x + 1] == bad[0]:
            return True
    return False
I will do this,
bad = ['O', 'I']
for checkThisList in nestedList:
    result = isBadInList(bad, checkThisList)
    if result:
        doStuffToList(checkThisList)
#The function isBadInList() only returns true for the first and third list in nestedList and false for all else.
I need a way to do this faster if possible. I can use tuples instead of lists, or whatever it takes.
nestedList = [
    ['a', 'e', 'O', 'I', 'g', 's'],
    ['w', 'I', 'u', 'O', 's', 'g'],
    ['e', 'z', 's', 'I', 'O', 'g']
]
#first create a map
pairdict = dict()
for i in range(len(nestedList)):
    for j in range(len(nestedList[i])-1):
        pair1 = (nestedList[i][j],nestedList[i][j+1])
        if pair1 in pairdict:
            pairdict[pair1].append(i+1)
        else:
            pairdict[pair1] = [i+1]
        pair2 = (nestedList[i][j+1],nestedList[i][j])
        if pair2 in pairdict:
            pairdict[pair2].append(i+1)
        else:
            pairdict[pair2] = [i+1]
del nestedList
print(pairdict.get(('e','z'),None))
This builds every adjacent pair and stores it in a map: the key is the pair, and the value is the list of (1-based) indexes of the lists that contain it. You can then delete your original list (it may take too much memory),
take advantage of the dict for lookups, and print the indexes where a pair appears.
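The same index can be sketched more compactly by keying on a frozenset, so each unordered pair is stored once instead of in both orders. The names nested_list and pair_index are made up for this example, and it uses 0-based indexes rather than the 1-based ones above.

```python
from collections import defaultdict

nested_list = [
    ['a', 'e', 'O', 'I', 'g', 's'],
    ['w', 'I', 'u', 'O', 's', 'g'],
    ['e', 'z', 's', 'I', 'O', 'g'],
]

# Map each unordered adjacent pair to the indexes of the lists containing it.
pair_index = defaultdict(set)
for i, items in enumerate(nested_list):
    for a, b in zip(items, items[1:]):
        pair_index[frozenset((a, b))].add(i)

# .get avoids creating an empty entry when the pair is absent.
print(sorted(pair_index.get(frozenset(('O', 'I')), set())))  # [0, 2]
```

Building the index costs one pass over all items. Whether that pays off depends on how many different pairs you need to look up afterwards.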
I think you could use a regex here to speed this up, although it is still a sequential operation: you have to iterate through every list and every item in each sublist, so the best case is linear in the total number of items.
import re
p = re.compile('OI|IO') # match only OI or IO
def is_bad(pattern, to_check):
    for item in to_check:
        # joining assumes single-character items, as in the example
        maybe_found = pattern.search(''.join(item))
        if maybe_found:
            yield True
        else:
            yield False
l = list(is_bad(p, nestedList))
print(l)
# [True, False, True]
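A plain-Python alternative that avoids joining strings, and therefore also works for items longer than one character, is to compare neighbouring items directly. This is only a sketch: has_adjacent_pair is a made-up name, it assumes bad holds two distinct items, and whether it beats the original loop on tens of millions of lists would need to be measured.

```python
def has_adjacent_pair(bad, items):
    # Compare each neighbouring pair as an unordered set, so
    # ('O', 'I') and ('I', 'O') both count as a match.
    wanted = set(bad)
    return any({a, b} == wanted for a, b in zip(items, items[1:]))

nested_list = [
    ['a', 'e', 'O', 'I', 'g', 's'],
    ['w', 'I', 'u', 'O', 's', 'g'],
    ['e', 'z', 's', 'I', 'O', 'g'],
]
print([has_adjacent_pair(['O', 'I'], items) for items in nested_list])
# [True, False, True]
```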

When to break up a function

I am very new to Python programming. I came across a small project task: write a program that takes all of the vowels out of a statement, so I decided to try it. My program only removes the vowels sometimes, which I find very weird, and I would like some help solving it.
def anti_vowel(text):
    list = ['a', 'e', 'i', 'o', 'u']
    big_list = ['A', 'E', 'I', 'O', 'U']
    list_word = []
    for f in text:
        list_word.append(f)
    for vowel in list:
        for letter in list_word:
            if vowel == letter:
                list_word.remove(vowel)
    for vowel in big_list:
        for letter in list_word:
            if vowel == letter:
                list_word.remove(vowel)
    new_word = ''.join(list_word)
    return new_word
print(anti_vowel("uuuUUUUUIIIIiiiIiIoOuuooouuUOUUuooouU"))
This statement as it sits prints out 'IiIuUUuoouU', but if I add more iterations over the lists using more for statements it decreases the amount of shown letters. Can someone tell me why this might be?
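On the "why": list_word.remove() is called while a for loop is still iterating over list_word. Each removal shifts the remaining items one position to the left, so the loop skips whatever slides into the current slot. A small sketch of the effect (the function names are made up for the demo):

```python
def remove_while_iterating(text, target):
    letters = list(text)
    for letter in letters:        # iterating over the list being modified
        if letter == target:
            letters.remove(target)
    return letters

def remove_from_copy(text, target):
    letters = list(text)
    for letter in letters[:]:     # iterate over a copy, modify the original
        if letter == target:
            letters.remove(target)
    return letters

print(remove_while_iterating("uuuu", "u"))  # ['u', 'u']: two items were skipped
print(remove_from_copy("uuuu", "u"))        # []
```

That is also why adding more passes over the list removes a few more letters each time.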
A little improvement:
list = ['a', 'e', 'i', 'o', 'u', 'A', 'E', 'I', 'O', 'U']
def anti_vowel(text):
    return ''.join([x for x in text if x not in list])
print(anti_vowel("uuuUUUUUIIIIiiiIiIoOuuooouuUOUUuooouU"))
Abend's solution with little edits that make it correct
vowels = ['a','e','i','o','u','A','E','I','O','U']
def no_vowels(str1):
    return ''.join([char for char in list(str1) if char not in vowels])
print(no_vowels('finished'))
Gives
fnshd
Process finished with exit code 0
This code is the correct implementation
def anti_vowel(c):
    newstr = c
    vowels = ('a', 'e', 'i', 'o', 'u')
    for x in c:
        # compare case-insensitively, but remove the character as it appears
        if x.lower() in vowels:
            newstr = newstr.replace(x,"")
    return newstr

How to properly set fst rules

I have started working with transducers in Python, so I use the default FST library. For example, I have a list ['a','b','c'] and I need to replace 'b' if it is followed by 'c'. I wrote the following rules, but they work only if 'b' is between 'a' and 'c', and only for an array of this length.
from fst import fst
list = ['a','b','c']
t = fst.FST('example')
for i in range(0,len(list)):
    t.add_state(str(i))
t.initial_state = '0'
t.add_arc('0','0',('a'),('a'))
t.add_arc('0','1',('b'),('d'))
t.add_arc('1','1',('c'),('c'))
t.set_final('1')
print t.transduce(list)
I got ['a','d','c']
I need to be able to replace 'b' with 'd' wherever it is.
e.g. replace 'b' when followed by 'l'
['m','r','b','l'] => ['m','r','o','l']
['m','b','l'] => ['m','o','l']
['b','l','o'] => ['o','l','o']
Please help me, thanks!
Consider this function...
lists = [['m','r','b','l'],
         ['m','b','l'],
         ['b','l','o'],
         ['b','m','o']]
def change(list_, find_this, followed_by, replace_to):
    return_list = list_.copy()
    # look at every position that has a successor, so all occurrences
    # are handled and a missing or trailing find_this raises no error
    for idx in range(len(list_) - 1):
        if list_[idx] == find_this and list_[idx+1] == followed_by:
            return_list[idx] = replace_to
    return return_list
for lst in lists:
    print(change(lst, 'b', 'l', 'o'))
''' output:
['m', 'r', 'o', 'l']
['m', 'o', 'l']
['o', 'l', 'o']
['b', 'm', 'o']
'''
You should add other pertinent validations, though.
