How to iterate over the elements of a map in python - python

Given a string s, I want to know how many times each character at the string occurs. Here is the code:
def main() :
while True :
try :
line=raw_input('Enter a string: ')
except EOFError :
break;
mp={};
for i in range(len(line)) :
if line[i] in mp :
mp[line[i]] += 1;
else :
mp[line[i]] = 1;
for i in range(len(line)) :
print line[i],': ',mp[line[i]];
if __name__ == '__main__' :
main();
When I run this code and I enter abbba, I get:
a : 2
b : 3
b : 3
b : 3
a : 2
I would like to get only:
a : 2
b : 3
I understand why this is happening, but as I'm new to python, I don't know any other ways to iterate over the elements of a map. Could anyone tell me how to do this? Thanks in advance.

You could try a Counter (Python 2.7 and above; see below for a pre-2.7 option):
>>> from collections import Counter
>>> Counter('abbba')
Counter({'b': 3, 'a': 2})
You can then access the elements just like a dictionary:
>>> counts = Counter('abbba')
>>> counts['a']
2
>>> counts['b']
3
And to iterate, you can use #BurhanKhalid's suggestion (the Counter behaves as a dictionary, where you can iterate over the key/value pairs):
>>> for k, v in Counter('abbba').iteritems():
... print k, v
...
a 2
b 3
If you're using a pre-2.7 version of Python, you can use a defaultdict to simplify your code a bit (process is still the same - only difference is that now you don't have to check for the key first - it will 'default' to 0 if a matching key isn't found). Counter has other features built into it, but if you simply want counts (and don't care about most_common, or being able to subtract, for instance), this should be fine and can be treated just as any other dictionary:
>>> from collections import defaultdict
>>> counts = defaultdict(int)
>>> for c in 'abbba':
... counts[c] += 1
...
>>> counts
defaultdict(<type 'int'>, {'a': 2, 'b': 3})
When you use iteritems() on a dictionary (or the Counter/defaultdict here), a key and a value are returned for each iteration (in this case, the key being the letter and the value being the number of occurrences). One thing to note about using dictionaries is that they are inherently unordered, so you won't necessarily get 'a', 'b', ... while iterating. One basic way to iterate through a dictionary in a sorted manner would be to iterate through a sorted list of the keys (here alphabetical, but sorted can be manipulated to handle a variety of options), and return the dictionary value for that key (there are other ways, but this will hopefully be somewhat informative):
>>> mapping = {'some': 2, 'example': 3, 'words': 5}
>>> mapping
{'some': 2, 'example': 3, 'words': 5}
>>> for key in sorted(mapping.keys()):
... print key, mapping[key]
...
example 3
some 2
words 5

Iterating over a mapping yields keys.
>>> d = {'foo': 42, 'bar': 'quux'}
>>> for k in d:
... print k, d[k]
...
foo 42
bar quux

You need to look up the help for dict(). It's all there -- 'for k in mp' iterates over keys, 'for v in mp.values()' iterates over values, 'for k,v in mp.items()' iterates over key, value pairs.
Also, you don't need those semicolons. While they are legal in Python, nobody uses them, there's pretty much no reason to.

Python 2.5 and above
dDIct = collections.defaultdict(int)
[(d[i]+=1) for i in line]
print dDict

Related

Bast way to update a value in a dict

I have a dict that contains parameters. I want to consider that any unspecified parameter should be read as a zero. I see several ways of doing it and wonder which one is recommended:
parameters = {
"apples": 2
}
def gain_fruits0 (quantity, fruit):
if not fruit in parameters :
parameters[fruit] = 0
parameters[fruit] += quantity
def gain_fruits1 (quantity, fruits):
parameters[fruit] = quantity + parameters.get(fruit,0)
parameters is actually way bigger than that, if that is important to know for optimization purposes.
So, what would be the best way? gain_fruits0, gain_fruits1, or something else?
This is a typical use of defaultdict, which works exactly like a regular dictionary except that it has the functionality you're after built in:
>>> from collections import defaultdict
>>> d = defaultdict(int) # specify default value `int()`, which is 0
>>> d['apples'] += 1
>>> d
defaultdict(int, {'apples': 1})
>>> d['apples'] # can index normally
1
>>> d['oranges'] # missing keys map to the default value
0
>>> dict(d) # can also cast to regular dict
{'apples': 1, 'oranges': 0}

Combine 2 dictionaries by key-value in python

I have not found a solution to my question, yet I hope it's trivial.
I have two dictionaries:
dictA:
contains the order number of a word in a text as key: word as value
e.g.
{0:'Roses',1:'are',2:'red'...12:'blue'}
dictB:
contains counts of those words in the text
e.g.
{'Roses':2,'are':4,'blue':1}
I want to replace the values in dictA by values in dictB via keys in dictB, checking for nones, replacing by 0.
So output should look like:
{0:2,1:4,2:0...12:1}
Is there a way for doing it, preferentially without introducing own functions?
Use a dictionary comprehension and apply the get method of dict B to return 0 for items that are not found in B:
>>> A = {0:'Roses',1:'are',2:'red', 12:'blue'}
>>> B = {'Roses':2,'are':4,'blue':1}
>>> {k: B.get(v, 0) for k, v in A.items()}
{0: 2, 1: 4, 2: 0, 12: 1}

Find count of characters within the string in Python

I am trying to create a dictionary of word and number of times it is repeating in string. Say suppose if string is like below
str1 = "aabbaba"
I want to create a dictionary like this
word_count = {'a':4,'b':3}
I am trying to use dictionary comprehension to do this.
I did
dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}
This ends up giving an error saying
File "<stdin>", line 1
dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}
^
SyntaxError: invalid syntax
Can anybody tell me what's wrong with the syntax? Also,How can I create such a dictionary using dictionary comprehension?
As others have said, this is best done with a Counter.
You can also do:
>>> {e:str1.count(e) for e in set(str1)}
{'a': 4, 'b': 3}
But that traverses the string 1+n times for each unique character (once to create the set, and once for each unique letter to count the number of times it appears. i.e., This has quadratic runtime complexity.). Bad result if you have a lot of unique characters in a long string... A Counter only traverses the string once.
If you want no import version that is more efficient than using .count, you can use .setdefault to make a counter:
>>> count={}
>>> for c in str1:
... count[c]=count.setdefault(c, 0)+1
...
>>> count
{'a': 4, 'b': 3}
That only traverses the string once no matter how long or how many unique characters.
You can also use defaultdict if you prefer:
>>> from collections import defaultdict
>>> count=defaultdict(int)
>>> for c in str1:
... count[c]+=1
...
>>> count
defaultdict(<type 'int'>, {'a': 4, 'b': 3})
>>> dict(count)
{'a': 4, 'b': 3}
But if you are going to import collections -- Use a Counter!
Ideal way to do this is via using collections.Counter:
>>> from collections import Counter
>>> str1 = "aabbaba"
>>> Counter(str1)
Counter({'a': 4, 'b': 3})
You can not achieve this via simple dict comprehension expression as you will require reference to your previous value of count of element. As mentioned in Dawg's answer, as a work around you may use list.count(e) in order to find count of each element from the set of string within you dict comprehension expression. But time complexity will be n*m as it will traverse the complete string for each unique element (where m are uniques elements), where as with counter it will be n.
This is a nice case for collections.Counter:
>>> from collections import Counter
>>> Counter(str1)
Counter({'a': 4, 'b': 3})
It's dict subclass so you can work with the object similarly to standard dictionary:
>>> c = Counter(str1)
>>> c['a']
4
You can do this without use of Counter class as well. The simple and efficient python code for this would be:
>>> d = {}
>>> for x in str1:
... d[x] = d.get(x, 0) + 1
...
>>> d
{'a': 4, 'b': 3}
Note that this is not the correct way to do it since it won't count repeated characters more than once (apart from losing other characters from the original dict) but this answers the original question of whether if-else is possible in comprehensions and demonstrates how it can be done.
To answer your question, yes it's possible but the approach is like this:
dic = {x: (dic[x] + 1 if x in dic else 1) for x in str1}
The condition is applied on the value only not on the key:value mapping.
The above can be made clearer using dict.get:
dic = {x: dic.get(x, 0) + 1 for x in str1}
0 is returned if x is not in dic.
Demo:
In [78]: s = "abcde"
In [79]: dic = {}
In [80]: dic = {x: (dic[x] + 1 if x in dic else 1) for x in s}
In [81]: dic
Out[81]: {'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1}
In [82]: s = "abfg"
In [83]: dic = {x: dic.get(x, 0) + 1 for x in s}
In [84]: dic
Out[84]: {'a': 2, 'b': 2, 'f': 1, 'g': 1}

Ordering a nested dictionary by the frequency of the nested value

I have this list made from a csv which is massive.
For every item in list, I have broken it into it's id and details. id is always between 0-3 characters max length and details is variable.
I created an empty dictionary, D...(rest of code below):
D={}
for v in list:
id = v[0:3]
details = v[3:]
if id not in D:
D[id] = {}
if details not in D[id]:
D[id][details] = 0
D[id][details] += 1
aside: Can you help me understand what the two if statements are doing? Very new to python and programming.
Anyway, it produces something like this:
{'KEY1_1': {'key2_1' : value2_1, 'key2_2' : value2_2, 'key2_3' : value2_3},
'KEY1_2': {'key2_1' : value2_1, 'key2_2' : value2_2, 'key2_3' : value2_3},
and many more KEY1's with variable numbers of key2's
Each 'KEY1' is unique but each 'key2' isn't necessarily. The value2_
s are all different.
Ok so, right now I found a way to sort by the first KEY
for k, v in sorted(D.items()):
print k, ':', v
I have done enough research to know that dictionaries can't really be sorted but I don't care about sorting, I care about ordering or more specifically frequencies of occurrence. In my code value2_x is the number of times its corresponding key2_x occurs for that particular KEY1_x. I am starting to think I should have used better variable names.
Question: How do I order the top-level/overall dictionary by the number in value2_x which is in the nested dictionary? I want to do some statistics to those numbers like...
How many times does the most frequent KEY1_x:key2_x pair show up?
What are the 10, 20, 30 most frequent KEY1_x:key2_x pairs?
Can I only do that by each KEY1 or can I do it overall? Bonus: If I could order it that way for presentation/sharing that would be very helpful because it is such a large data set. So much thanks in advance and I hope I've made my question and intent clear.
You could use Counter to order the key pairs based on their frequency. It also provides an easy way to get x most frequent items:
from collections import Counter
d = {
'KEY1': {
'key2_1': 5,
'key2_2': 1,
'key2_3': 3
},
'KEY2': {
'key2_1': 2,
'key2_2': 3,
'key2_3': 4
}
}
c = Counter()
for k, v in d.iteritems():
c.update({(k, k1): v1 for k1, v1 in v.iteritems()})
print c.most_common(3)
Output:
[(('KEY1', 'key2_1'), 5), (('KEY2', 'key2_3'), 4), (('KEY2', 'key2_2'), 3)]
If you only care about the most common key pairs and have no other reason to build nested dictionary you could just use the following code:
from collections import Counter
l = ['foobar', 'foofoo', 'foobar', 'barfoo']
D = Counter((v[:3], v[3:]) for v in l)
print D.most_common() # [(('foo', 'bar'), 2), (('foo', 'foo'), 1), (('bar', 'foo'), 1)]
Short explanation: ((v[:3], v[3:]) for v in l) is a generator expression that will generate tuples where first item is the same as top level key in your original dict and second item is the same as key in nested dict.
>>> x = list((v[:3], v[3:]) for v in l)
>>> x
[('foo', 'bar'), ('foo', 'foo'), ('foo', 'bar'), ('bar', 'foo')]
Counter is a subclass of dict. It accepts an iterable as an argument and each unique element in iterable will be used as key and value is the count of element in the iterable.
>>> c = Counter(x)
>>> c
Counter({('foo', 'bar'): 2, ('foo', 'foo'): 1, ('bar', 'foo'): 1})
Since generator expression is an iterable there's no need to convert it to list in between so construction can simply be done with Counter((v[:3], v[3:]) for v in l).
The if statements you asked about are checking if the key exists in dict:
>>> d = {1: 'foo'}
>>> 1 in d
True
>>> 2 in d
False
So the following code will check if key with value of id exists in dict D and if it doesn't it will assign empty dict there.
if id not in D:
D[id] = {}
The second if does exactly the same for nested dictionaries.

Destructuring-bind dictionary contents

I am trying to 'destructure' a dictionary and associate values with variables names after its keys. Something like
params = {'a':1,'b':2}
a,b = params.values()
But since dictionaries are not ordered, there is no guarantee that params.values() will return values in the order of (a, b). Is there a nice way to do this?
from operator import itemgetter
params = {'a': 1, 'b': 2}
a, b = itemgetter('a', 'b')(params)
Instead of elaborate lambda functions or dictionary comprehension, may as well use a built in library.
One way to do this with less repetition than Jochen's suggestion is with a helper function. This gives the flexibility to list your variable names in any order and only destructure a subset of what is in the dict:
pluck = lambda dict, *args: (dict[arg] for arg in args)
things = {'blah': 'bleh', 'foo': 'bar'}
foo, blah = pluck(things, 'foo', 'blah')
Also, instead of joaquin's OrderedDict you could sort the keys and get the values. The only catches are you need to specify your variable names in alphabetical order and destructure everything in the dict:
sorted_vals = lambda dict: (t[1] for t in sorted(dict.items()))
things = {'foo': 'bar', 'blah': 'bleh'}
blah, foo = sorted_vals(things)
How come nobody posted the simplest approach?
params = {'a':1,'b':2}
a, b = params['a'], params['b']
Python is only able to "destructure" sequences, not dictionaries. So, to write what you want, you will have to map the needed entries to a proper sequence. As of myself, the closest match I could find is the (not very sexy):
a,b = [d[k] for k in ('a','b')]
This works with generators too:
a,b = (d[k] for k in ('a','b'))
Here is a full example:
>>> d = dict(a=1,b=2,c=3)
>>> d
{'a': 1, 'c': 3, 'b': 2}
>>> a, b = [d[k] for k in ('a','b')]
>>> a
1
>>> b
2
>>> a, b = (d[k] for k in ('a','b'))
>>> a
1
>>> b
2
Here's another way to do it similarly to how a destructuring assignment works in JS:
params = {'b': 2, 'a': 1}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
What we did was to unpack the params dictionary into key values (using **) (like in Jochen's answer), then we've taken those values in the lambda signature and assigned them according to the key name - and here's a bonus - we also get a dictionary of whatever is not in the lambda's signature so if you had:
params = {'b': 2, 'a': 1, 'c': 3}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
After the lambda has been applied, the rest variable will now contain:
{'c': 3}
Useful for omitting unneeded keys from a dictionary.
Hope this helps.
Maybe you really want to do something like this?
def some_func(a, b):
print a,b
params = {'a':1,'b':2}
some_func(**params) # equiv to some_func(a=1, b=2)
If you are afraid of the issues involved in the use of the locals dictionary and you prefer to follow your original strategy, Ordered Dictionaries from python 2.7 and 3.1 collections.OrderedDicts allows you to recover you dictionary items in the order in which they were first inserted
(Ab)using the import system
The from ... import statement lets us desctructure and bind attribute names of an object. Of course, it only works for objects in the sys.modules dictionary, so one could use a hack like this:
import sys, types
mydict = {'a':1,'b':2}
sys.modules["mydict"] = types.SimpleNamespace(**mydict)
from mydict import a, b
A somewhat more serious hack would be to write a context manager to load and unload the module:
with obj_as_module(mydict, "mydict_module"):
from mydict_module import a, b
By pointing the __getattr__ method of the module directly to the __getitem__ method of the dict, the context manager can also avoid using SimpleNamespace(**mydict).
See this answer for an implementation and some extensions of the idea.
One can also temporarily replace the entire sys.modules dict with the dict of interest, and do import a, b without from.
Warning 1: as stated in the docs, this is not guaranteed to work on all Python implementations:
CPython implementation detail: This function relies on Python stack frame support
in the interpreter, which isn’t guaranteed to exist in all implementations
of Python. If running in an implementation without Python stack frame support
this function returns None.
Warning 2: this function does make the code shorter, but it probably contradicts the Python philosophy of being as explicit as you can. Moreover, it doesn't address the issues pointed out by John Christopher Jones in the comments, although you could make a similar function that works with attributes instead of keys. This is just a demonstration that you can do that if you really want to!
def destructure(dict_):
if not isinstance(dict_, dict):
raise TypeError(f"{dict_} is not a dict")
# the parent frame will contain the information about
# the current line
parent_frame = inspect.currentframe().f_back
# so we extract that line (by default the code context
# only contains the current line)
(line,) = inspect.getframeinfo(parent_frame).code_context
# "hello, key = destructure(my_dict)"
# -> ("hello, key ", "=", " destructure(my_dict)")
lvalues, _equals, _rvalue = line.strip().partition("=")
# -> ["hello", "key"]
keys = [s.strip() for s in lvalues.split(",") if s.strip()]
if missing := [key for key in keys if key not in dict_]:
raise KeyError(*missing)
for key in keys:
yield dict_[key]
In [5]: my_dict = {"hello": "world", "123": "456", "key": "value"}
In [6]: hello, key = destructure(my_dict)
In [7]: hello
Out[7]: 'world'
In [8]: key
Out[8]: 'value'
This solution allows you to pick some of the keys, not all, like in JavaScript. It's also safe for user-provided dictionaries
With Python 3.10, you can do:
d = {"a": 1, "b": 2}
match d:
case {"a": a, "b": b}:
print(f"A is {a} and b is {b}")
but it adds two extra levels of indentation, and you still have to repeat the key names.
Look for other answers as this won't cater to the unexpected order in the dictionary. will update this with a correct version sometime soon.
try this
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
keys = data.keys()
a,b,c = [data[k] for k in keys]
result:
a == 'Apple'
b == 'Banana'
c == 'Carrot'
Well, if you want these in a class you can always do this:
class AttributeDict(dict):
def __init__(self, *args, **kwargs):
super(AttributeDict, self).__init__(*args, **kwargs)
self.__dict__.update(self)
d = AttributeDict(a=1, b=2)
Based on #ShawnFumo answer I came up with this:
def destruct(dict): return (t[1] for t in sorted(dict.items()))
d = {'b': 'Banana', 'c': 'Carrot', 'a': 'Apple' }
a, b, c = destruct(d)
(Notice the order of items in dict)
An old topic, but I found this to be a useful method:
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
for key in data.keys():
locals()[key] = data[key]
This method loops over every key in your dictionary and sets a variable to that name and then assigns the value from the associated key to this new variable.
Testing:
print(a)
print(b)
print(c)
Output
Apple
Banana
Carrot
An easy and simple way to destruct dict in python:
params = {"a": 1, "b": 2}
a, b = [params[key] for key in ("a", "b")]
print(a, b)
# Output:
# 1 2
I don't know whether it's good style, but
locals().update(params)
will do the trick. You then have a, b and whatever was in your params dict available as corresponding local variables.
Since dictionaries are guaranteed to keep their insertion order in Python >= 3.7, that means that it's complete safe and idiomatic to just do this nowadays:
params = {'a': 1, 'b': 2}
a, b = params.values()
print(a)
print(b)
Output:
1
2

Categories

Resources