Dictionary comprehension produces a different result from a loop - python

The loop below, which checks whether the hand contains the letters in the word, works as expected.
hand = {'h': 1, 'e': 1, 'l': 2, 'o': 1}
word = 'hello'
extra_hand = hand.copy()
for letter in word:
    extra_hand[letter] -= 1
>> extra_hand
{'h': 0, 'e': 0, 'l': 0, 'o': 0}
Then I tried to convert it to a dictionary comprehension. It looks like this.
hand = {'h': 1, 'e': 1, 'l': 2, 'o': 1}
word = 'hello'
extra_hand = {letter:hand[letter] - 1 for letter in word}
>>extra_hand
{'h': 0, 'e': 0, 'l': 1, 'o': 0}
As you can see, the result is different: l is 1, which is incorrect. I suspect that the value for 'l' is read from the hand dictionary each time without any mutation, so it just computes 2 - 1 twice and gives 1, rather than 2 - 1 followed by 1 - 1.
What should I do to fix the dictionary comprehension?

A dictionary comprehension cannot be used in this cumulative manner: it cannot repeatedly update an item as word is iterated.
Another way to think of this is that the keys and values of your dictionary are not available for manipulation until the entire comprehension is complete.
You can consider the dictionary comprehension to be replicating the for loop below. As with the for loop, you will be setting values rather than adding to the value previously assigned to the key.
for letter in word:
    extra_hand[letter] = hand[letter] - 1
Your loop is perfectly fine and there is no need to use a dictionary comprehension.
As an alternative, if you only need the positive counts (Counter subtraction drops zero and negative results), you can use collections.Counter:
from collections import Counter
hand = {'h': 1, 'e': 1, 'l': 2, 'o': 1}
word = 'hello'
res = Counter(hand) - Counter(word)
# Counter()
hand = {'h': 1, 'e': 2, 'l': 2, 'o': 1}
word = 'hello'
res = Counter(hand) - Counter(word)
# Counter({'e': 1})
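When the zero counts matter, as they do in the question's expected output, Counter.subtract may be a closer fit than the - operator, because it updates the counts in place and keeps zero and negative values. A minimal sketch (the name remaining is only illustrative):

```python
from collections import Counter

hand = {'h': 1, 'e': 1, 'l': 2, 'o': 1}
word = 'hello'

remaining = Counter(hand)
remaining.subtract(word)  # subtracts one for each letter in word; zeros are kept

print(dict(remaining))  # {'h': 0, 'e': 0, 'l': 0, 'o': 0}
```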

The two methods do not mean the same thing. If the dictionary comprehension were translated into a loop, you would get
hand = {'h': 1, 'e': 1, 'l': 2, 'o': 1}
word = 'hello'
extra_hand = {}
for letter in word:
    extra_hand[letter] = hand[letter] - 1
So, hand['l'] is never changed and therefore, it's still 2 when the loop reaches the second l. That's why you get the value 1 both times.
In my opinion, the loop variant is perfectly fine.
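If a comprehension is still wanted, one possible sketch is to iterate over the items of hand rather than over the letters of word, and subtract each letter's total count in one step. This assumes every letter of word is present in hand; letters missing from hand are silently ignored here, whereas the loop would raise a KeyError.

```python
hand = {'h': 1, 'e': 1, 'l': 2, 'o': 1}
word = 'hello'

# Each key is visited once, so the full count for that letter is subtracted at once.
extra_hand = {letter: count - word.count(letter) for letter, count in hand.items()}

print(extra_hand)  # {'h': 0, 'e': 0, 'l': 0, 'o': 0}
```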

extra_hand = {letter:hand[letter] - 1 for letter in word}
is equivalent to:
for letter in word:
    extra_hand[letter] = hand[letter] - 1
And not:
for letter in word:
    extra_hand[letter] -= 1
In the first case, extra_hand['l'] equals 1, while in the second case you subtract 1 twice (which gives 0).
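The overwrite behaviour can be made visible with a small sketch (the enumerate index is only an illustration, not part of the original code): when a key appears more than once, a comprehension simply keeps the value computed for its last occurrence.

```python
word = 'hello'

# 'l' occurs at index 2 and 3; the second assignment replaces the first.
last_index = {letter: index for index, letter in enumerate(word)}

print(last_index)  # {'h': 0, 'e': 1, 'l': 3, 'o': 4}
```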

Related

How could one reduce the usage of helper functions in lambda expressions?

In this example I'm taking letters from a set and adding them to a dictionary, where each letter becomes a key and the literal 1 becomes the value of each pair.
from functools import reduce  # reduce is not a built-in in Python 3

def base_dict_from_set(s):
    return reduce(lambda d,e : addvalue(1, e, d), s, dict())
def addvalue(value, key, d):
    d[key] = value
    return d
>>> base_dict_from_set(set("Hello World!".lower()))
{'o': 1, '!': 1, 'l': 1, 'd': 1, 'w': 1, ' ': 1, 'r': 1, 'e': 1, 'h': 1}
I was wondering whether I could somehow be rid of the 'addvalue' helper function and add the element and reference the modified dictionary within the lambda function itself.
The routine within addvalue itself seems very simple to me, so I would prefer something that looks like this:
def base_dict_from_set(s):
    return reduce(lambda d,e : d[e] = 1, s, dict())
I don't have a lot of experience in Python and I come from a functional programming perspective. My goal is to understand Python's functional capabilities, but I am too inexperienced to properly phrase and google what I am looking for.
What you are trying to do is why dict.fromkeys exists: create a dict that maps each key to the same constant value.
>>> dict.fromkeys("Hello World!".lower(), 1)
{'h': 1, 'e': 1, 'l': 1, 'o': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1, '!': 1}
There's no need to convert the string to a set first, since any duplicates will just be overwritten by the following occurrences.
(If the constant value is mutable, you should use the dict comprehension to ensure that each key gets its own mutable value, rather than every key sharing a reference to the same mutable value.)
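The caveat about mutable values can be sketched as follows. The names shared and separate are illustrative only.

```python
# dict.fromkeys reuses the very same list object for every key.
shared = dict.fromkeys("ab", [])
shared["a"].append(1)
print(shared)  # {'a': [1], 'b': [1]}

# The comprehension evaluates [] once per key, so each key gets its own list.
separate = {key: [] for key in "ab"}
separate["a"].append(1)
print(separate)  # {'a': [1], 'b': []}
```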
You can use a dict comprehension for the same result:
{l: 1 for l in set("Hello World!".lower())}
To answer exactly the question asked, yes you can get rid of the addvalue by replacing addvalue(1, e, d) with {**d, e:1}.
Nevertheless, your code is still faulty. It does not count the occurrences: it creates a dict of key: 1 for every letter in the string, when it should create a dict of key: number_of_occurrences. To achieve this, replace addvalue(1, e, d) with {**d, e: 1 + (d[e] if e in d else 0)}, and do not convert the string to a set, as that eliminates duplicates.
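As a rough sketch of that suggestion, assuming counting is what is wanted: count_letters is a hypothetical name, and d.get(e, 0) is used as a shorthand for (d[e] if e in d else 0). Each step copies the dict, so this is best suited to short inputs.

```python
from functools import reduce

def count_letters(s):
    # Each step builds a new dict with the count for e increased by one.
    return reduce(lambda d, e: {**d, e: d.get(e, 0) + 1}, s, {})

print(count_letters("hello"))  # {'h': 1, 'e': 1, 'l': 2, 'o': 1}
```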
I'm a bit surprised that you tried to use reduce when your goal is to transform each item in an input collection (the letters in a string) to an output collection (a key/value pair where the key is the letter and the value is a constant number), independently of each other.
In my view, reduce is for when an operation needs to be done to items in a sequence and taking all previous items into account (for instance, when calculating a sum of values).
So in a functional style, using map here would be more appropriate than reduce, in my opinion. Python supports this:
def quant_dict_from_set(s):
    return dict(map(lambda c: (c, 1), s.lower()))
Here map converts the string to key/value pairs, and the dict constructor collects these pairs in a dictionary, eliminating duplicate keys at the same time.
But more idiomatic approaches would be to use a dictionary comprehension or the dict.fromkeys constructor.
Hacky and hard to read, but closest to the lambda you were trying to write, and hopefully educational:
>>> f = lambda d, e: d.__setitem__(e, 1) or d
>>> d = {}
>>> output = f(d, 42)
>>> output
{42: 1}
>>> output is d
True
Using __setitem__ avoids the = assignment.
__setitem__ returns None, so the expression d.__setitem__(e, 1) or d always evaluates to d, which is returned by the lambda.
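Plugged back into the reduce call from the question, the trick might look like the sketch below. It is shown for illustration only; dict.fromkeys or a comprehension remains the more readable choice.

```python
from functools import reduce

def base_dict_from_set(s):
    # __setitem__ returns None, so "or d" hands the mutated dict to the next step.
    return reduce(lambda d, e: d.__setitem__(e, 1) or d, s, {})

print(base_dict_from_set(set("hello")) == dict.fromkeys("hello", 1))  # True
```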
You can use collections.Counter, a subclass of dict specifically for counting occurrences of elements.
>>> import collections
>>> collections.Counter('Hello, World!'.lower())
Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ',': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1, '!': 1})
>>> collections.Counter(set('Hello, World!'.lower()))
Counter({'w': 1, 'l': 1, 'r': 1, ',': 1, 'h': 1, 'd': 1, 'o': 1, 'e': 1, ' ': 1, '!': 1})
Note that Counter is appropriate if you want to count the elements, or if you want to initialize the values to the constant 1. If you want to initialize the values to another constant, then Counter will not be the solution and you should use a dictionary comprehension or the dict.fromkeys constructor.

Why is this Python character sequence code giving an unexpected result?

I am writing a Python program to find the character sequence in a word, but the program gives an unexpected result.
I have found a similar program that works perfectly.
To me the two programs look quite similar, but I don't know why one of them does not work.
The program that is not working:
# Display the character sequence in a word
dict={}
string=input("Enter the string:").strip().lower()
for letter in string:
    if letter !=dict.keys():
        dict[letter]=1
    else:
        dict[letter]=dict[letter]+1
print(dict)
The program that is working:
def char_frequency(str1):
    dict = {}
    for n in str1:
        keys = dict.keys()
        if n in keys:
            dict[n] += 1
        else:
            dict[n] = 1
    return dict
print(char_frequency('google.com'))
The output of the first program is:
Enter the string:google.com
{'g': 1, 'c': 1, 'm': 1, 'o': 1, 'l': 1, '.': 1, 'e': 1}
The output of the second program is:
{'c': 1, 'e': 1, 'o': 3, 'g': 2, '.': 1, 'm': 1, 'l': 1}
The above is the correct output.
Now the questions on my mind:
i. Why is the first program not working correctly?
ii. Is the idea behind these two programs different?
Actually, there's a little mistake in the if statement you have used. Just have a look at the modified program below.
Note: Also make sure not to use built-in type names like dict as variable names. I have changed that to d here.
>>> d = {}
>>>
>>> string=input("Enter the string:").strip().lower()
Enter the string:google.com
>>>
>>> for letter in string:
...     if letter not in d.keys():
...         d[letter] = 1
...     else:
...         d[letter] = d[letter] + 1
...
>>> print(d)
{'g': 2, 'o': 3, 'l': 1, 'e': 1, '.': 1, 'c': 1, 'm': 1}
>>>
You can also have a look at the statements below, executed in the terminal.
Comparing a key with d.keys() using == will always return False, because the key is a string here and d.keys() is always an object of type dict_keys (Python 3) or a list (Python 2).
>>> d = {"k1": "v1", "k3": "v2", "k4": "Rishi"}
>>>
>>> d.keys()
dict_keys(['k1', 'k3', 'k4'])
>>>
>>> "k1" in d
True
>>>
>>> not "k1" in d
False
>>>
>>> "k1" == d.keys()
False
>>>
>>> "k1" not in d
False
>>>
Answers to your 2 questions:
i. Because the statement letter != dict.keys() is always True, the else branch never runs and no count is ever incremented. Just change it to letter not in dict.keys(). It is also better to use d in place of dict, so that the statement becomes letter not in d.keys().
ii. The logic of both programs is the same, i.e. iterating over the string and checking whether each character already exists as a key in the dictionary. If it does not exist, create a new key with count 1; otherwise increment its count by 1.
Thank you v. much.
This line is nonsensical:
if letter !=dict.keys():
letter is a length-one str, while dict.keys() returns a key view object, which is guaranteed never to be equal to a str of any kind. Your != check is therefore always true, so the else branch never runs. The correct logic would be:
if letter not in dict:
(you could add .keys() if you really want to, but it's wasteful and pointless; membership testing on a dict is checking its keys implicitly).
Side-note: You're going to confuse the crap out of yourself by naming a variable dict, because you're name-shadowing the dict constructor; if you ever need to use it, it won't be available in that scope. Don't shadow built-in names if at all possible.
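Putting both points together, a version of the first program might look like the sketch below. It uses counts as the variable name (any name that does not shadow a built-in would do), a fixed string instead of input() so that it runs unattended, and dict.get as an optional shorthand for the membership test.

```python
text = "google.com"  # stands in for input("Enter the string:").strip().lower()

counts = {}
for letter in text:
    if letter not in counts:  # membership test against the keys
        counts[letter] = 1
    else:
        counts[letter] += 1

# Same result, with dict.get supplying 0 for letters not seen yet.
counts_with_get = {}
for letter in text:
    counts_with_get[letter] = counts_with_get.get(letter, 0) + 1

print(counts)  # {'g': 2, 'o': 3, 'l': 1, 'e': 1, '.': 1, 'c': 1, 'm': 1}
```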

How to test list of inputs with PyTest in a sentinel while loop

I've been trying to test a list of inputs in Pytest using this function
def test_play_hand():
    word_list = load_words()
    hand = {'e': 2, 'u': 1, 'g': 1, 'm': 1, 'b': 1, 't': 1}
    inputs = ['gum', 'beet', '.']
    with mock.patch('builtins.input', return_value= next(iter(inputs))):
        assert play_hand(hand, word_list) == 0
The function play_hand runs a sentinel-based while loop that takes a dictionary, then asks the user for a string input.
If the input is a ., the loop ends.
Otherwise, if the loop gets a string, it checks the hand for the available characters and removes the characters used in the string from the hand.
The test works when mock.patch gets only one input.
How can you implement the test using a list or multiple inputs for testing?
Without iter() it gives an error saying that inputs is not an iterator, and with iter() it just freezes.
I appreciate any input.
Edit: Forgot to mention that play_hand returns an int
Found the answer.
def test_play_hand():
    word_list = load_words()
    hand = {'e': 2, 'u': 1, 'g': 1, 'm': 1, 'b': 1, 't': 1}
    inputs = ['gum', 'beet', '.']
    with mock.patch('builtins.input', side_effect = inputs) :
        assert play_hand(hand, word_list) == 12
I had to replace return_value with side_effect to run all the inputs in the test.
Link: https://docs.python.org/3/library/unittest.mock.html
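A self-contained sketch of the same idea follows. Since play_hand and load_words are not shown, read_words is a hypothetical stand-in that only collects inputs until the sentinel arrives. With return_value=next(iter(inputs)) the expression is evaluated once, so every call to input() would return 'gum' and the sentinel would most likely never be seen; with side_effect set to a list, each call returns the next item.

```python
from unittest import mock

def read_words():
    """Hypothetical stand-in for play_hand: collects inputs until '.' is entered."""
    words = []
    while True:
        word = input("Enter word, or a '.' to finish: ")
        if word == ".":
            return words
        words.append(word)

def test_read_words():
    inputs = ["gum", "beet", "."]
    # Each call to input() returns the next element of the list.
    with mock.patch("builtins.input", side_effect=inputs):
        assert read_words() == ["gum", "beet"]

test_read_words()
```

If the code under test calls input() more often than the list has items, the mock raises StopIteration, which makes a missing sentinel easy to spot.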

Character frequency in python 2.7

I'm so stuck on this task. I need to write a program in Python 2.7 that prompts the user to input a string and then reports the number of times each letter in that string occurs. For example, the word "google.com" must return 'o': 3, 'g': 2, '.': 1, 'e': 1, 'l': 1, 'm': 1, 'c': 1
I know I need to use the list() function, but all I have so far is:
string = raw_input("Enter a string: ")
newString = list(string)
and then I get stuck from there because I don't know how to make the program count the number of times the letters occur. I know there must be a for loop involved, but I'm not sure how to use it in this case.
NB: We haven't been introduced to dictionaries or imports yet, so please keep it as simple as possible. Basically the most roundabout method will work best.
You can handle this problem directly with the help of the count function.
You can start with an empty dictionary and add each character of the entered string and its count to the dictionary.
This can be done like this:
string = raw_input("Enter a string: ")
count_dict = {}
for x in string:
    count_dict[x] = string.count(x)
print count_dict
#input : google.com
# output : {'c': 1, 'e': 1, 'g': 2, 'm': 1, 'l': 1, 'o': 3, '.': 1}
Update:
Since you haven't been introduced to dictionaries and imports, you can use the solution below.
for i in set(string):
    # In Python 2.7 a trailing comma keeps print on the same line.
    print "'{}': {},".format(i, string.count(i)),
Use Counter:
from collections import Counter
string = "google.com"
print(Counter(string))
Another way is to create a dictionary and add characters while looping through your string.
dicta = {}
for i in string:
    if i not in dicta:
        dicta[i] = 1
    else:
        dicta[i] += 1
print(dicta)
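For a version that avoids dictionaries, sets, and imports altogether, as the question asks, a plain list can remember which characters have already been reported. This is only a sketch; the name seen is illustrative, a fixed string stands in for raw_input, and the print call is written so that it should behave the same under Python 2.7 and Python 3.

```python
string = "google.com"  # stands in for raw_input("Enter a string: ")

seen = []  # characters that have already been reported
for ch in string:
    if ch not in seen:
        seen.append(ch)
        # str.count scans the whole string for this character
        print("%s: %d" % (ch, string.count(ch)))
```

The characters are reported in order of first appearance, at the cost of rescanning the string once per distinct character.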

python list.count always returns 0

I have a lengthy Python list and would like to count the number of occurrences of a single character. For example, how many total times does 'o' occur? I want N=4.
lexicon = ['yuo', 'want', 'to', 'sioo', 'D6', 'bUk', 'lUk'], etc.
list.count() is the obvious solution. However, it consistently returns 0. It doesn't matter which character I look for. I have double checked my file - the characters I am searching for are definitely there. I happen to be calculating count() in a for loop:
for i in range(100):
    # random sample 500 words
    sample = list(set(random.sample(lexicon, 500)))
    C1 = ['k']
    total = sum(len(i) for i in sample) # total words
    sample_count_C1 = sample.count(C1) / total
But it returns 0 outside of the for loop, over the list 'lexicon' as well. I don't want a list of overall counts so I don't think Counter will work.
Ideas?
If we take your list (the shortened version you supplied):
lexicon = ['yu', 'want', 'to', 'si', 'D6', 'bUk', 'lUk']
then we can get the count using sum() and a generator-expression:
count = sum(s.count(c) for s in lexicon)
so if c were, say, 'k', this would give 2, as there are two occurrences of k.
This will work in a for-loop or not, so you should be able to incorporate this into your wider code by yourself.
With your latest edit, I can confirm that this produces a count of 4 for 'o' in your modified list.
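The difference between the two kinds of count can be checked directly on the list from the question. This is a small sketch, not part of the original answer.

```python
lexicon = ['yuo', 'want', 'to', 'sioo', 'D6', 'bUk', 'lUk']

# list.count compares whole elements: no element equals 'o', so the result is 0.
print(lexicon.count('o'))   # 0
print(lexicon.count('to'))  # 1

# str.count looks inside each word; summing over the list gives the total.
print(sum(word.count('o') for word in lexicon))  # 4
```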
If I understand your question correctly, you would like to count the number of occurrences of each character for each word in the list. This is known as a frequency distribution.
Here is a simple implementation using Counter
from collections import Counter
lexicon = ['yu', 'want', 'to', 'si', 'D6', 'bUk', 'lUk']
chars = [char for word in lexicon for char in word]
freq_dist = Counter(chars)
Counter({'t': 2, 'U': 2, 'k': 2, 'a': 1, 'u': 1, 'l': 1, 'i': 1, 'y': 1, 'D': 1, '6': 1, 'b': 1, 's': 1, 'w': 1, 'n': 1, 'o': 1})
Using freq_dist, you can return the number of occurrences for a character.
freq_dist.get('a')
1
# get() method returns None if character is not in dict
freq_dist.get('4')
None
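As a side note, indexing a Counter directly returns 0 for a missing key instead of None, which can be more convenient when the count feeds into arithmetic. A short sketch using the same list (the relative-frequency step is an assumption about what the question is after):

```python
from collections import Counter

lexicon = ['yu', 'want', 'to', 'si', 'D6', 'bUk', 'lUk']
freq_dist = Counter(char for word in lexicon for char in word)

print(freq_dist['k'])  # 2
print(freq_dist['4'])  # 0, because Counter returns 0 for missing keys when indexed

# Relative frequency of 'k' among all characters in the list
total = sum(freq_dist.values())
print(freq_dist['k'] / total)
```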
It gives zero because sample.count(C1) compares C1 with each whole element of the list. It does not look inside the words, so it will not count the k in bUk or lUk.
If you want to calculate the frequency of a character, do it like this (note that C1 must be a string, not a list, for str.count to work):
for i in range(100):
    # random sample 500 words
    sample = list(set(random.sample(lexicon, 500)))
    C1 = 'k'
    total = sum(len(w) for w in sample)  # total characters
    sample_count = sum(x.count(C1) for x in sample)
    sample_count_C1 = sample_count / total
