Merging two Python dictionaries but remaining the order [duplicate] - python

I'm just starting to play around with Python (VBA background). Why does this dictionary get created out of order? Shouldn't it be a:1, b:2...etc.?
class Card:
    def county(self):
        c = 0
        l = 0
        groupL = {} # groupL for Loop
        for n in range(0,13):
            c += 1
            l = chr(n+97)
            groupL.setdefault(l,c)
        return groupL
pick_card = Card()
group = pick_card.county()
print group
Here's the output:
{'a': 1, 'c': 3, 'b': 2, 'e': 5, 'd': 4, 'g': 7, 'f': 6, 'i': 9, 'h': 8, 'k': 11, 'j': 10, 'm': 13, 'l': 12}
Or does it just get printed out of order?

In Python versions before 3.7, dictionaries have no guaranteed order. In other words, when you iterate over a dictionary, the order in which the keys/items are "yielded" is not necessarily the order in which you put them into the dictionary. (Try your code on a different version of Python and you're likely to get differently ordered output; from Python 3.7 on, plain dicts preserve insertion order.) If you want a dictionary that is ordered on those older versions, you need a collections.OrderedDict, which wasn't introduced until Python 2.7. You can find equivalent recipes on ActiveState if you're using an older version of Python. However, often it's good enough to just sort the items (e.g. sorted(mydict.items())).
EDIT as requested, an OrderedDict example:
from collections import OrderedDict
groupL = OrderedDict() # groupL for Loop
c = 0
for n in range(0,13):
    c += 1
    l = chr(n+97)
    groupL.setdefault(l,c)
print(groupL)
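The sorting alternative mentioned in the answer can be sketched roughly as follows. This is only a minimal illustration, assuming the same letter-to-number mapping as in the question; the names group and ordered_pairs are made up for the example.

```python
# Build the same mapping as in the question: 'a' -> 1 ... 'm' -> 13.
group = {}
for n in range(13):
    group.setdefault(chr(n + 97), n + 1)

# sorted() returns a list of (key, value) pairs ordered by key,
# regardless of the order in which the dict itself iterates.
ordered_pairs = sorted(group.items())
for letter, number in ordered_pairs:
    print(letter, number)
```

Sorting produces a new list each time, so it suits one-off display; if the order has to be kept while the mapping changes, an OrderedDict is probably the better fit.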

Related

Does collections' Counter keep data sorted?

I was reading about Counter from Python's collections module. It shows the following:
>>> from collections import Counter
>>> Counter({'z': 9,'a':4, 'c':2, 'b':8, 'y':2, 'v':2})
Counter({'z': 9, 'b': 8, 'a': 4, 'c': 2, 'y': 2, 'v': 2})
Somehow these values are printed in descending order (9 > 8 > 4 > 2). Why is that? Does Counter store its values sorted?
PS: I am on Python 3.7.7.
In terms of the data stored in a Counter object: The data is insertion-ordered as of Python 3.7, because Counter is a subclass of the built-in dict. Prior to Python 3.7, there was no guaranteed order of the data.
However, the behavior you are seeing is coming from Counter.__repr__. We can see from the source code that it will first try to display using the Counter.most_common method, which sorts by value in descending order. If that fails because the values are not sortable, it will fall back to the dict representation, which, again, is insertion-ordered.
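The difference between the stored order and the displayed order can be checked with a short sketch like the one below. It assumes Python 3.7 or later; the variable names are illustrative only.

```python
from collections import Counter

c = Counter({'z': 9, 'a': 4, 'c': 2, 'b': 8, 'y': 2, 'v': 2})

stored_order = list(c.items())   # insertion order on Python 3.7+
display_order = c.most_common()  # sorted by count, descending

print(stored_order)
print(display_order)

# Values that cannot be compared make most_common() raise TypeError,
# so repr() falls back to the plain dict representation.
mixed = Counter({'a': 1, 'b': 'x'})
print(repr(mixed))
```

Only the display goes through most_common(); iterating over the Counter itself follows the stored order.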
The order depends on the Python version.
For Python < 3.7, there is no guaranteed order. From Python 3.7 on, the order is that of insertion.
Changed in version 3.7: As a dict subclass, Counter inherited the
capability to remember insertion order. Math operations on Counter
objects also preserve order. Results are ordered according to when an
element is first encountered in the left operand and then by the order
encountered in the right operand.
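As a small sketch of the ordering rule quoted above for math operations (assuming Python 3.7 or later; the operand names left and right are made up for the example):

```python
from collections import Counter

left = Counter({'b': 1, 'a': 2})
right = Counter({'c': 1, 'a': 1})

# Keys from the left operand come first, in their insertion order,
# followed by keys that appear only in the right operand.
total = left + right
print(list(total.items()))
```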
Example on python 3.8 (3.8.10 [GCC 9.4.0]):
from collections import Counter
Counter({'z': 9,'a':4, 'c':2, 'b':8, 'y':2, 'v':2})
Output:
Counter({'z': 9, 'a': 4, 'c': 2, 'b': 8, 'y': 2, 'v': 2})
How to check that Counter doesn't sort by count
As Counter's __repr__ (which print and str() also use) displays the items via most_common(), the printed form is not a reliable way to check the stored order.
Convert the Counter to a dict; the dict's printed representation reflects the actual order.
c = Counter({'z': 9,'a':4, 'c':2, 'b':8, 'y':2, 'v':2})
print(dict(c))
# {'z': 9, 'a': 4, 'c': 2, 'b': 8, 'y': 2, 'v': 2}

How does this for loop work on the following string?

I have been working through Automate the Boring Stuff by Al Sweigart. I'm struggling to understand the code below:
INPUT
message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}
for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1
print(count)
OUTPUT
{'I': 1, 't': 6, ' ': 13, 'w': 2, 'a': 4, 's': 3, 'b': 1, 'r': 5, 'i': 6, 'g': 2, 'h': 3, 'c': 3, 'o': 2, 'l': 3, 'd': 3, 'y': 1, 'n': 4, 'A': 1, 'p': 1, ',': 1, 'e': 5, 'k': 2, '.': 1}
QUESTION
Since it does not matter what the variable in a for loop is called (i.e. character can be changed to x, pie, etc.), how does the code know to run the loop through each character in the string?
It's not about the variable's name, it's about the object this variable points to. The implementation of the loop in the Python virtual machine knows how to iterate over objects based on their types.
Iterating over something is implemented as iterating over iter(something), which in turn is the same as iterating over something.__iter__(). Different classes implement their own versions of __iter__, so that loops work correctly.
str.__iter__ iterates over the individual characters of a string, list.__iter__ iterates over the list's elements, and so on.
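What the for loop does with iter() can be sketched by hand, roughly like this. It is a simplified illustration rather than the interpreter's actual implementation, and the names iterator and seen are made up for the example.

```python
message = "abc"

iterator = iter(message)  # what the for loop asks for behind the scenes
seen = []
while True:
    try:
        character = next(iterator)  # fetch the next item
    except StopIteration:
        break  # the iterator is exhausted, so the loop ends
    seen.append(character)

print(seen)
```

The name character plays no role in deciding what is iterated; it is just the variable that receives each item.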
You could create your own object and iterate over it:
class MyClass:
    def __iter__(self):
        return iter([1,2,3,4])
my_object = MyClass()
for x in my_object:
    print(x)
This will print the numbers from 1 to 4.
A string is a sequence in Python, so when you loop over a string, you loop over each character; in your case, each character that is read is assigned to character.
Then, setdefault maps character to 0 if character is not yet in the dict. The rest looks quite straightforward.
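The effect of setdefault can be seen in isolation with a tiny sketch (the names first and second are only for illustration):

```python
count = {}

first = count.setdefault('t', 0)   # 't' is missing, so it is stored with 0
count['t'] = count['t'] + 1        # now count['t'] is 1

second = count.setdefault('t', 0)  # 't' exists, so the stored 1 is kept

print(first, second, count)
```

In other words, setdefault only writes the default when the key is absent, which is why the running count is never reset inside the loop.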
Strings in Python are sequences of characters: https://docs.python.org/3/library/stdtypes.html#textseq. Therefore, the for c in m: line iterates over every element of the sequence m, i.e. over every character of the string.

Assign integers to alphabets and add those integers

I am trying to assign the numbers 1-26 to the letters a-z and add up those numbers for any given string, without any success. For example: a=1, b=2, c=3. So, if the given string is "abc", the output should be 1+2+3=6.
Programming background - Novice, self-learning.
I have only learned up to strings, lists and their corresponding methods in Python. I haven't studied functions and classes yet, so please make your answers as simple as possible.
So far I've tried
Name = "abc"
a,b,c = [1,2,3]
Sum_of_name = ""
for alphabet in Name:
    Sum_of_name = Sum_of_name + alphabet
print(Sum_of_name)
This prints out the same abc.
I realise that when I iterate over the string "abc", the characters in the string are different from the variables a, b and c. Thus, the integers aren't assigned to the characters and can't be added up.
Any suggestions on how I can work through this with my current level of knowledge?
This is one approach.
Demo:
from string import ascii_lowercase
d = {v: i for i,v in enumerate(ascii_lowercase, 1)}
Name = "abc"
print( sum(d[i] for i in Name) )
Output:
6
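For readers not yet comfortable with comprehensions, the same lookup can be built with explicit loops. This is just one possible sketch; the names values, position and total are chosen for illustration.

```python
from string import ascii_lowercase

# Build the letter -> number lookup one entry at a time.
values = {}
position = 1
for letter in ascii_lowercase:
    values[letter] = position
    position += 1

# Add up the numbers for each letter of the name.
name = "abc"
total = 0
for letter in name:
    total += values[letter]

print(total)
```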
First get a string containing the letters
>>> from string import ascii_lowercase as alphabet
>>> alphabet
'abcdefghijklmnopqrstuvwxyz'
Then make a lookup of letter to value (there are other ways to do this)
>>> values = {letter: value for value, letter in enumerate(alphabet, 1)}
>>> values
{'d': 4, 'f': 6, 'o': 15, 'b': 2, 's': 19, 'c': 3, 'w': 23, 'q': 17, 'v': 22, 'p': 16, 'i': 9, 'e': 5, 'l': 12, 't': 20, 'y': 25, 'n': 14, 'a': 1, 'r': 18, 'j': 10, 'x': 24, 'g': 7, 'm': 13, 'k': 11, 'h': 8, 'z': 26, 'u': 21}
Then use that to sum values
def sum_letters(word):
    return sum(values[letter] for letter in word)
>>> sum_letters('abc')
6
If the letters have a fixed order (a=1 through z=26), you can use ord():
a="Name"
s=0
for i in a.lower():
    s+=ord(i)-96
print(s)
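The constant 96 works because ord('a') is 97, so subtracting 96 maps 'a' to 1. A slightly more defensive variant, sketched below, spells that out and skips anything that is not a letter from a to z; skipping such characters is an assumption made for this example, not part of the answer above.

```python
name = "Name"
total = 0
for ch in name.lower():
    # Only count plain letters; spaces, digits, etc. are ignored here.
    if "a" <= ch <= "z":
        total += ord(ch) - ord("a") + 1

print(total)
```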
To get the characters in the alphabet, you can use the string module:
>>> import string
>>> letters = string.ascii_lowercase
>>> letters
'abcdefghijklmnopqrstuvwxyz'
We can then turn that into a dictionary to make getting the numeric (positional) value of a letter easy:
letter_map = dict(zip(list(letters), range(1, len(letters) + 1)))
So your function will perform a simple dict lookup for each letter input:
def string_sum(string_input):
    return sum(letter_map[char] for char in string_input)
Several test cases:
>>> assert string_sum('abc') == 6
>>> assert string_sum('') == 0 # because it's empty

How to remove the least frequent element from a Counter in Python the fastest way?

I'd like to implement a Counter which drops the least frequent element when the counter's size goes beyond some threshold. For that I need to remove the least frequent element.
What is the fastest way to do that in Python?
I know counter.most_common()[-1], but it creates a whole list and seems slow when done extensively. Is there a better command (or maybe a different data structure)?
You may implement least_common by borrowing the implementation of most_common and making the necessary changes.
Refer to collections source in Py2.7:
def most_common(self, n=None):
    '''List the n most common elements and their counts from the most
    common to the least. If n is None, then list all element counts.
    >>> Counter('abcdeabcdabcaba').most_common(3)
    [('a', 5), ('b', 4), ('c', 3)]
    '''
    # Emulate Bag.sortedByCount from Smalltalk
    if n is None:
        return sorted(self.iteritems(), key=_itemgetter(1), reverse=True)
    return _heapq.nlargest(n, self.iteritems(), key=_itemgetter(1))
To change it in order to retrieve least common we need just a few adjustments.
import collections
from operator import itemgetter as _itemgetter
import heapq as _heapq
class MyCounter(collections.Counter):
    def least_common(self, n=None):
        # items() is used instead of iteritems() so this also runs on Python 3
        if n is None:
            return sorted(self.items(), key=_itemgetter(1), reverse=False) # was: reverse=True
        return _heapq.nsmallest(n, self.items(), key=_itemgetter(1)) # was _heapq.nlargest
Tests:
c = MyCounter("abbcccddddeeeee")
assert c.most_common() == c.least_common()[::-1]
assert c.most_common()[-1:] == c.least_common(1)
Since your stated goal is to remove items in the counter below a threshold, just invert the counter (so that each count maps to a list of the keys with that count) and then remove the keys whose count is at or below the threshold.
Example:
>>> c=Counter("aaaabccadddefeghizkdxxx")
>>> c
Counter({'a': 5, 'd': 4, 'x': 3, 'c': 2, 'e': 2, 'b': 1, 'g': 1, 'f': 1, 'i': 1, 'h': 1, 'k': 1, 'z': 1})
counts={}
for k, v in c.items():
    counts.setdefault(v, []).append(k)
tol=2
for k, v in counts.items():
    if k<=tol:
        c=c-Counter({}.fromkeys(v, k))
>>> c
Counter({'a': 5, 'd': 4, 'x': 3})
In this example, all counts less than or equal to 2 are removed.
Or, just recreate the counter with a comparison to your threshold value:
>>> c
Counter({'a': 5, 'd': 4, 'x': 3, 'c': 2, 'e': 2, 'b': 1, 'g': 1, 'f': 1, 'i': 1, 'h': 1, 'k': 1, 'z': 1})
>>> Counter({k:v for k,v in c.items() if v>tol})
Counter({'a': 5, 'd': 4, 'x': 3})
If you only want to get the least common value, then the most efficient way to handle this is to simply get the minimum value from the counter (dictionary).
Since you can only tell whether a value is the lowest after comparing it with all the others, you need to look at every item, so a time complexity of O(n) is the best we can get. However, we do not need linear space, as we only need to remember the lowest value seen so far, not all of them. So a solution that works like most_common() in reverse does more than we need.
We can simply use min() with a custom key function:
>>> c = Counter('foobarbazbar')
>>> c
Counter({'a': 3, 'b': 3, 'o': 2, 'r': 2, 'f': 1, 'z': 1})
>>> k = min(c, key=lambda x: c[x])
>>> del c[k]
>>> c
Counter({'a': 3, 'b': 3, 'o': 2, 'r': 2, 'z': 1})
Of course, you have little control over which key is removed when several keys share the lowest count: min() returns the first such key in iteration order, which is arbitrary before Python 3.7 and is insertion order from Python 3.7 on.
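Putting the min() approach together with the size threshold from the question might look roughly like the sketch below. The helper add_with_limit and its max_size parameter are hypothetical names invented for this example, not part of collections.

```python
from collections import Counter

def add_with_limit(counter, item, max_size):
    """Count item, then evict the least frequent key if the counter is too large."""
    counter[item] += 1
    if len(counter) > max_size:
        # O(n) scan for the key with the smallest count; no list is built.
        victim = min(counter, key=counter.get)
        del counter[victim]

c = Counter()
for ch in "aaabbc":
    add_with_limit(c, ch, max_size=2)

print(c)
```

One caveat of this simple policy is that a newly added key starts with a count of 1, so it is often the one evicted straight away; whether that is acceptable depends on the use case.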

