function and dictionary format understanding - python

Hi, I need to understand this line
freq[x] = freq.get(x,0) + 1
in the code below and how it works. I know the function returns a dictionary, but I need to know exactly how this line works.
def get_frequency_dict(sequence):
    """
    Returns a dictionary where the keys are elements of the sequence
    and the values are integer counts, for the number of times that
    an element is repeated in the sequence.
    sequence: string or list
    return: dictionary
    """
    # freqs: dictionary (element_type -> int)
    freq = {}
    for x in sequence:
        freq[x] = freq.get(x,0) + 1
    return freq

The line uses the dict.get() method, which returns either the value for the given key, or a default value.
So the line
freq[x] = freq.get(x,0) + 1
stores either 1 if x was not found in the dictionary (so freq.get(x, 0) returned 0) or it increments an already existing value. In effect, this counts all the values in sequence, only creating keys for any value when it first encounters that value. This saves you having to pre-set all possible values as keys with the value 0.
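To make the mechanics concrete, here is a small trace of the same loop. The sample string "abca" and the variable name previous are only illustrative and do not come from the question.

```python
# Trace how freq.get(x, 0) + 1 builds the counts one element at a time.
freq = {}
for x in "abca":
    previous = freq.get(x, 0)   # 0 if x has not been seen yet
    freq[x] = previous + 1      # create the key, or bump the existing count
    print(x, previous, freq)

# a 0 {'a': 1}
# b 0 {'a': 1, 'b': 1}
# c 0 {'a': 1, 'b': 1, 'c': 1}
# a 1 {'a': 2, 'b': 1, 'c': 1}
```

Only the last step finds an existing key, so it is the only one where get returns something other than the default 0.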
The whole function could be trivially replaced by a collections.Counter() instance:
from collections import Counter
def get_frequency_dict(sequence):
    """
    Returns a dictionary where the keys are elements of the sequence
    and the values are integer counts, for the number of times that
    an element is repeated in the sequence.
    sequence: string or list
    return: dictionary
    """
    return Counter(sequence)
Since Counter is a subclass of dict, the invariants stated in the documentation would still be satisfied.
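As a quick check of that claim, the sketch below compares the hand-written loop with Counter on a sample string ("hello" is just an illustrative input). One difference worth knowing: a Counter returns 0 for a missing key instead of raising KeyError.

```python
from collections import Counter

def get_frequency_dict(sequence):
    # The original loop-based version, kept here for comparison.
    freq = {}
    for x in sequence:
        freq[x] = freq.get(x, 0) + 1
    return freq

counts = Counter("hello")
print(isinstance(counts, dict))                      # True
print(dict(counts) == get_frequency_dict("hello"))   # True
print(counts["z"])                                   # 0, no KeyError for a missing key
```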

As described in the documentation, the get method of dictionaries returns the value associated with a key (first parameter) if the key is defined in the dict. Otherwise, it returns the default value (second parameter). It behaves roughly like this function:
def get(d, key, default=None):
    if key in d:
        return d[key]
    return default
In your case, the function counts the number of occurrences of each element in the sequence.

Related

why key is used in the tuple ? and what's syntax here

def LongestWord(sen):
    nw = ""
    for letter in sen:
        if letter.isalpha() or letter.isnumeric():
            nw += letter
        else:
            nw += " "
    return max(nw.split(),key=len)
print(LongestWord("Hello world"))
What does key=len mean? key is used in dicts, right? I can't understand the syntax max(nw.split(), key=len).
You're right that dictionaries contain mappings from keys to values. In this particular case though, key is just one of the parameters of the max function. It allows the caller to specify a one-argument function that is applied to each element to produce the value used for comparison. For more information, see https://docs.python.org/3/library/functions.html#max.
max means maximum, but the maximum of which metric are you trying to find? That's where the key comes in. Here, the key is len (length), that is, you are trying to find the element with the greatest length. Comparing words directly with greater than or less than orders them alphabetically, not by length, so you need to specify a key that determines the ordering. For example:
>>> words = ['this','is','an','example']
>>> max(words, key=len)
'example'
You can think of it like the keys of a dictionary: since the key here is len, the dict would be like:
{4: 'this', 2: 'an', 7: 'example'}
So it will return the value of the highest key (7), that is example.
You can also define custom keys:
>>> def vowels(word):
...     '''this returns number of vowels
...     in a word'''
...     v = 'aeiou'
...     ctr = 0
...     for char in word:
...         if char in v:
...             ctr += 1
...     return ctr
>>> words = ['standing','in','a','queue']
>>> max(words, key = vowels)
'queue'
The dictionary analogy would be:
{2: 'standing', 1: 'a', 4: 'queue'}
So the answer will be queue, which has 4 vowels.
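The dictionary picture is only an analogy. A closer sketch of what max does with key is a loop that scores each item and remembers the best one. The helper name max_by_key is hypothetical and handles only non-empty lists; the real max is a built-in and also accepts any iterable.

```python
def max_by_key(items, key):
    """Hypothetical helper that mimics max(items, key=key) for a non-empty list."""
    best = items[0]
    best_score = key(best)
    for item in items[1:]:
        score = key(item)
        if score > best_score:    # strict '>' keeps the first item on ties
            best, best_score = item, score
    return best

words = ['this', 'is', 'an', 'example']
print(max_by_key(words, len))                          # example
print(max_by_key(words, len) == max(words, key=len))   # True
print(max(['is', 'an'], key=len))                      # is
```

The last line shows where the analogy breaks down: 'is' and 'an' have the same length, and max simply returns the first of the tied items rather than overwriting one with the other.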
max(nw.split(),key=len)
Here the max function has the form max(iterable, *, key=None) or max(iterable, *, default, key=None). The first argument is an iterable such as a list, tuple, or dictionary.
default is an optional keyword-only argument: the value to return if the iterable is empty.
key is also an optional keyword-only argument, given as key=function. You pass a function that takes one parameter; each value in the iterable is passed to this function, and max() picks its result based on the return values of that function.
Here key is a parameter of the function max(), not a dictionary key.
Since this function is to find the longest word in the string, you are trying to find the word with max length, hence key = len
Example:
max(111,222,333,444,555,999)  # 999
max(111,222,333,444,555,999, key = lambda x:x%3 )  # 111, since every value gives x%3 == 0 and the first of the tied items is returned
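In the built-in max, both key and default are keyword-only, so they have to be passed by name. The short sketch below shows them on the question's input; the variable names longest and fallback are illustrative.

```python
words = "Hello world".split()

# key=len compares the words by length; both have length 5, so the first wins.
longest = max(words, key=len)
print(longest)                        # Hello

# default is returned when the iterable is empty.
fallback = max([], key=len, default="")
print(repr(fallback))                 # ''

# Without default, an empty iterable raises ValueError.
try:
    max([], key=len)
except ValueError as exc:
    print("ValueError:", exc)
```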

Variable not reassigned when changed in for loop

The goal of this code is to count the word that appears the most within the given list. I planned to do this by looping through the dictionary. If a word appeared a greater number of times than the value stored in the variable rep_num, it was reassigned. Currently, the variable rep_num remains 0 and is not reassigned to the number of times a word appears in the list. I believe this has something to do with trying to reassign it within a for loop, but I am not sure how to fix the issue.
def rep_words(novel_list):
    rep_num=0
    for i in range(len(novel_list)):
        if novel_list.count(i)>rep_num:
            rep_num=novel_list.count(i)
    return rep_num
novel_list =['this','is','the','story','in','which','the','hero','was','guilty']
In the given code, 2 should be returned, but 0 is returned instead.
In your for loop you are iterating over index numbers and not the list elements themselves. Iterate over the list directly:
def rep_words(novel_list):
    rep_num=0
    for i in novel_list:
        if novel_list.count(i)>rep_num:
            rep_num=novel_list.count(i)
    return rep_num
You're iterating over a numeric range, and counting the integer i, none of which values exist in the list at all. Try this instead, which returns the maximum frequency, and optionally a list of words which occur that many times.
novel_list =['this','is','the','story','in','which','the','hero','was','guilty']
def rep_words(novel_list, include_words=False):
    counts = {word:novel_list.count(word) for word in set(novel_list)}
    rep = max(counts.values())
    word = [k for k,v in counts.items() if v == rep]
    return (rep, word) if include_words else rep
>>> rep_words(novel_list)
2
>>> rep_words(novel_list, True)
(2, ['the'])
>>> rep_words('this list of words has many words in this list of words and in this list of words is this'.split(' '), True)
(4, ['words', 'this'])
You've an error in your function (you're counting the index, not the value), write like this:
def rep_words(novel_list):
    rep_num=0
    for i in novel_list:
        if novel_list.count(i)>rep_num: #you want to count the value, not the index
            rep_num=novel_list.count(i)
    return rep_num
Or you may try this too:
def rep_words(novel_list):
    rep_num=0
    for i in range(len(novel_list)):
        if novel_list.count(novel_list[i])>rep_num:
            rep_num=novel_list.count(novel_list[i])
    return rep_num
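The versions above call list.count once per element, which rescans the whole list each time. As an alternative sketch (not what the answers above used), collections.Counter counts everything in a single pass. Returning 0 for an empty list is an assumption made here, chosen to match the original rep_num=0 starting value.

```python
from collections import Counter

def rep_words(novel_list):
    """Return the highest number of times any word occurs (0 for an empty list)."""
    if not novel_list:
        return 0
    # most_common(1) returns a list holding one (word, count) pair
    word, count = Counter(novel_list).most_common(1)[0]
    return count

novel_list = ['this', 'is', 'the', 'story', 'in', 'which', 'the', 'hero', 'was', 'guilty']
print(rep_words(novel_list))   # 2
```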

How to index without triggering 'IndexError: invalid index to scalar variable.' error

I am trying to find the maximum value(s) in a dictionary. My approach is to iterate through it, check for the greatest value, and build a list of the greatest value/key pair plus anything equivalent to it. I have the following...
def foo(class_sum, input_object): # Predicts the class x is a member of given probability dictionary
    probabilities = calc_class_probs(class_sum, input_object) # Returns a dictionary with probability elements
    best_label, best_perc = list(), list()
    best_perc.append(-1)
    print(best_perc[0])
    for val in probabilities:
        print(probabilities[val])
        if probabilities[val] > best_perc[0]:
            del best_label
            del best_perc # Empty list in case unwanted elements are present
            best_label, best_perc = val, probabilities[val] # update with new best
        elif probabilities[val] == best_perc:
            best_label.append(val)
            best_perc.append(probabilities[val])
    return best_label, best_perc
So from this I expect that if probabilities[val] > best_perc[0]: will evaluate to 7.591391076586993e-36 > -1, at least for the first iteration. However I get the following output from the print statements...
-1
7.591391076586993e-36
EDIT: it appears to fail on the second iteration, could this be due to the del statements?
0 7.591391076586993e-36
1 7.297754873023128e-36
...
IndexError: invalid index to scalar variable.
and this error...
if probabilities[val] > best_perc[0]:
IndexError: invalid index to scalar variable.
Why can it print the values but not index them here? Please note probabilities is a dictionary with keys that have single probability values like the below.
{'0': 7.59139e-36, '1': 7.2977e-36,...}
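I believe your problem is here:
best_label, best_perc = val, probabilities[val]
After the first 'if' branch executes, best_label and best_perc are no longer lists (best_perc is a single scalar value), so you can't access best_perc[0] on the next iteration.
You can replace it with
best_label, best_perc = [val], [probabilities[val]]
Note also that the elif compares against best_perc (the whole list) rather than best_perc[0], so it should read elif probabilities[val] == best_perc[0]:
Notice that when you iterate over a dictionary, you iterate through keys, not values.
You can also iterate over keys and values together:
for key, val in dictionary.items():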
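Putting those points together, one possible rewrite keeps a list of labels plus a single best value, so there is nothing to index. This is only a sketch: the function name best_classes is hypothetical, it takes the probability dictionary directly instead of calling calc_class_probs, and the '2' entry in the sample data is invented to demonstrate a tie.

```python
def best_classes(probabilities):
    """Return (labels, value) for every label tied for the highest probability."""
    best_labels, best_value = [], float("-inf")
    for label, prob in probabilities.items():
        if prob > best_value:
            best_labels, best_value = [label], prob   # new best: restart the list
        elif prob == best_value:
            best_labels.append(label)                 # tie with the current best
    return best_labels, best_value

# Sample data in the shape shown in the question; '2' is an invented tie.
probs = {'0': 7.59139e-36, '1': 7.2977e-36, '2': 7.59139e-36}
print(best_classes(probs))   # (['0', '2'], 7.59139e-36)
```

Rebinding best_labels to a fresh one-element list replaces the del statements, which were not needed in the first place.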

Python user defined sort

I am trying to implement a user defined sort function, similar to the python List sort as in list.sort(cmp = None, key = None, reverse = False) for example.
Here is my code so far
from operator import itemgetter
class Sort:
    def __init__(self, sList, key = itemgetter(0), reverse = False):
        self._sList = sList
        self._key = key
        self._reverse = reverse
        self.sort()
    def sort(self):
        for index1 in range(len(self._sList) - 1):
            for index2 in range(index1, len(self._sList)):
                if self._reverse == True:
                    if self._sList[index1] < self._sList[index2]:
                        self._sList[index1], self._sList[index2] = self._sList[index2], self._sList[index1]
                else:
                    if self._sList[index1] > self._sList[index2]:
                        self._sList[index1], self._sList[index2] = self._sList[index2], self._sList[index1]
List = [[1 ,2],[3, 5],[5, 1]]
Sort(List, reverse = True)
print List
I have a really bad time when it comes to the key parameter.
More specifically, I would like to know if there is a way to code a list with optional indexes (similar to foo(*parameters) ).
I really hope you understand my question.
key is a function to convert the item to a criterion used for comparison.
Called with the item as the sole parameter, it returns a comparable value of your choice.
One classical key example for integers stored as string is:
lambda x : int(x)
so strings are sorted numerically.
In your algorithm, you would have to replace
self._sList[index1] < self._sList[index2]
by
self._key(self._sList[index1]) < self._key(self._sList[index2])
so the values computed from items are compared, rather than the items themselves.
Note that Python 3 dropped the cmp parameter and kept only the key parameter.
Also note that in your case, using itemgetter(0) as the key function works for subscriptable items such as list (sorting by first item only) or str (sorting by first character only).
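As a minimal sketch of that change, here is the same exchange-style loop written as a plain function that compares key(item) values. The name exchange_sort and the key=None default (meaning "compare the items themselves") are choices made for this example, not part of the question's code.

```python
from operator import itemgetter

def exchange_sort(items, key=None, reverse=False):
    """Sort items in place, comparing key(item) instead of the item itself."""
    if key is None:
        key = lambda item: item          # identity: compare the items directly
    for i in range(len(items) - 1):
        for j in range(i + 1, len(items)):
            a, b = key(items[i]), key(items[j])
            out_of_order = a < b if reverse else a > b
            if out_of_order:
                items[i], items[j] = items[j], items[i]

data = [[1, 2], [3, 5], [5, 1]]
exchange_sort(data, key=itemgetter(1))                 # sort by second element
print(data)    # [[5, 1], [1, 2], [3, 5]]
exchange_sort(data, key=itemgetter(0), reverse=True)   # sort by first element, descending
print(data)    # [[5, 1], [3, 5], [1, 2]]
```

For the "optional indexes" part of the question, itemgetter accepts several indexes at once: itemgetter(1, 0) returns a tuple, so items are compared by their second element first and by their first element on ties.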

Python result of a strange function, that I can't reproduce in another language

My python function :
def searchMAX(Dict):
    v=list(Dict.values())
    return list(Dict.keys())[v.index(max(v))]
I can't reproduce it in java to understand what's its output
If I do :
myDico ={0:0.0}
myDico.update({1:1.2})
myDico.update({2:11.2})
myDico.update({3:17.2})
myMax = searchMAX(myDico)
print(*myMax, sep='\n')
I have this error :
TypeError: print() argument after * must be an iterable, not int
With print(myMax, sep='\n') it only returns 3, not a list :( ?
Assuming that Dict is a Python dict then:
v = list(Dict.values())
makes a list of the values of Dict and assigns it to v.
Then
return list(Dict.keys())[v.index(max(v))]
makes a list of the keys of Dict and returns the key that has the maximum value associated with it, by finding the index of the maximum value (v.index(max(v))) and using that index on the list of keys.
Thus searchMAX returns a single key, which in your case is always an integer, and you cannot unpack that with * when passing it to print(). You should do:
print(myMax, sep='\n')
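For comparison, the same lookup can be written with max and a key function. If a list of keys is what you wanted, a comprehension can collect every key that shares the maximum value. This is a sketch using the question's data; the name top is introduced here for illustration.

```python
def searchMAX(Dict):
    # Original approach: find the position of the largest value, then index the keys.
    v = list(Dict.values())
    return list(Dict.keys())[v.index(max(v))]

myDico = {0: 0.0, 1: 1.2, 2: 11.2, 3: 17.2}

print(searchMAX(myDico))               # 3
print(max(myDico, key=myDico.get))     # 3, same result in one call

# All keys that share the maximum value, as a list.
top = max(myDico.values())
print([k for k, val in myDico.items() if val == top])   # [3]
```

Iterating over a dict yields its keys, so max(myDico, key=myDico.get) compares the keys by their associated values and returns the winning key.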
