So here is my problem: I have a dictionary with the following key -> value pairs:
6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002:политичка -> 2
6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002:државата -> 2
6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002:енергично -> 1
1caa60ebf9459d9cd406f1a03e1719b675dcfaad78292edc7e4a56be:полициска -> 1
I have this code to show the keys needed:
for key, value in count_db.iteritems():
    print "%s -> %s" % (key[:56], value)
So now I have:
6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002 -> 2
6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002 -> 2
6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002 -> 1
1caa60ebf9459d9cd406f1a03e1719b675dcfaad78292edc7e4a56be -> 1
I need to merge them into:
6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002 -> 5
1caa60ebf9459d9cd406f1a03e1719b675dcfaad78292edc7e4a56be -> 1
I wrote this, but I have not managed to get it working correctly:
length_dic=len(count_db.keys())
for key, value in count_db.iteritems():
    count_element=key[:56]
    #print "%s => %s" % (key[:56], value) #value[:56]
    for i in range(length_dic):
        i+=1
        if count_element == key[:56]:
            itr+=int(value)
    print i
    length_dic=length_dic-1
Any hints?
A trivial approach would be:
result = {}
for key, value in count_db.iteritems():
    result[key[:56]] = result.get(key[:56], 0) + value
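You could also do this with reduce if you want it on one line. A lambda cannot contain an assignment such as +=, so the function below updates the accumulator with update() and then returns it:
import collections
result = reduce(lambda acc, kv: acc.update({kv[0][:56]: acc[kv[0][:56]] + kv[1]}) or acc, count_db.iteritems(), collections.defaultdict(int))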
Given your dictionary as
>>> spam = {"6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002:AAAA": 2,
...         "6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002:BBBB": 2,
...         "6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002:CCCC": 1,
...         "1caa60ebf9459d9cd406f1a03e1719b675dcfaad78292edc7e4a56be:DDDD": 1
... }
you can do something like the following:
>>> import collections
>>> bacon = collections.defaultdict(int)
>>> for k, v in spam.iteritems():
...     bacon[k[:56]] += v
...
>>> bacon
defaultdict(<type 'int'>, {'6bc51fb21fd9eefef4ec97a241733cd59b71e8e14ad70e9068d32002': 5, '1caa60ebf9459d9cd406f1a03e1719b675dcfaad78292edc7e4a56be': 1})
>>>
This is exactly what the Counter object (in version 2.7+) is for:
import collections
c = collections.Counter()
for key, value in count_db.iteritems():
    c[key[:56]] += value
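As a self-contained illustration of the Counter approach, here is a minimal sketch. The short hashes in count_db are made up for readability, and splitting on the first ":" instead of slicing 56 characters is an assumption about the key layout, not something the question states. It uses .items() so that it runs on both Python 2.7 and Python 3.

```python
from collections import Counter

# Made-up sample data shaped like the question's dictionary: "<hash>:<word>" -> count.
count_db = {
    "aaa111:first": 2,
    "aaa111:second": 2,
    "aaa111:third": 1,
    "bbb222:fourth": 1,
}

totals = Counter()
for key, value in count_db.items():
    # Assumption: everything before the first ":" is the hash prefix.
    prefix = key.split(":", 1)[0]
    totals[prefix] += value

print(dict(totals))
```

If the hash length is fixed and known, key[:56] as in the answers above does the same job without relying on the separator.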
I didn't understand why you did all that in your code. I think this would do the job:
tmp_dict = {}
for key, value in count_db.iteritems():
    count_element=key[:56]
    if count_element in tmp_dict:
        tmp_dict[count_element] += value
    else:
        tmp_dict[count_element] = value
Related
Say I have a dictionary in Python {1:'a', 100:'b', 1024:'c'}
I want to build a function that can look up not only the exact value of the key, but also approximate values. For instance, the function should return b if the input is 99 or 101.
Could you suggest some approaches?
If you want to keep the speed advantage of a dict, you could bin your keys, e.g. by rounding them to the nearest multiple of 10:
>>> data = {1:'a', 100:'b', 1024:'c'}
>>> fuzzy = { ((k + 5) // 10) * 10:v for k,v in data.items() }
>>> fuzzy
{0: 'a', 100: 'b', 1020: 'c'}
When you want to check whether a value is close to a key in data, you simply apply the same transformation:
>>> fuzzy.get(((98+5)//10)*10)
'b'
>>> fuzzy.get(((97+5)//10)*10)
'b'
>>> fuzzy.get(((100+5)//10)*10)
'b'
>>> fuzzy.get(((101+5)//10)*10)
'b'
>>> fuzzy.get(((1022+5)//10)*10)
'c'
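The binning idea above can be wrapped in a pair of small helpers. This is only a sketch: bin_key, fuzzy_get and the width parameter are hypothetical names introduced here, not part of the original answer.

```python
def bin_key(key, width=10):
    """Round key to the nearest multiple of width (halves round up)."""
    return ((key + width // 2) // width) * width

data = {1: 'a', 100: 'b', 1024: 'c'}

# Store every value under its binned key. Two keys that land in the
# same bin would overwrite each other, so this assumes well-spaced keys.
fuzzy = {bin_key(k): v for k, v in data.items()}

def fuzzy_get(key, default=None):
    """Look up key after applying the same binning used to build fuzzy."""
    return fuzzy.get(bin_key(key), default)

print(fuzzy_get(98))
print(fuzzy_get(1022))
```

Note that the bins are centred on multiples of 10, not on the original keys. 1024 is stored under 1020, so fuzzy_get(1025) rounds to 1030 and misses even though 1025 is only 1 away from 1024.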
If the keys fall into a finite set of ranges that is known in advance, you can index the dictionary with (low, high) tuples, something like this:
>>> d={(0,2):'a', (99,101):'b', (1023,1025):'c'}
To find the value of a key:
Find 1024.01:
>>> d={(0,2):'a', (99,101):'b', (1023,1025):'c'}
>>> next(v for (k,v) in d.iteritems() if k[0]<=1024.01<=k[1])
'c'
Find 1025.01:
>>> next(v for (k,v) in d.iteritems() if k[0]<=1025.01<=k[1])
# raises StopIteration because no range contains the key
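If an exception for a missing key is not wanted, next() accepts a default as its second argument. A minimal sketch, where find_in_ranges is a hypothetical helper name and .items() is used so that it runs on both Python 2.7 and Python 3:

```python
ranges = {(0, 2): 'a', (99, 101): 'b', (1023, 1025): 'c'}

def find_in_ranges(x, ranges, default=None):
    """Return the value whose (low, high) range contains x, else default."""
    return next((v for (low, high), v in ranges.items() if low <= x <= high),
                default)

print(find_in_ranges(1024.01, ranges))
print(find_in_ranges(1025.01, ranges))
```

This is still a linear scan over all ranges, so it trades the dict's constant-time lookup for flexibility.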
You can make your own lookup function as follows:
import sys
def lookup(value, mapping):
    # Track the smallest distance seen so far and the value stored under that key.
    nearest = sys.maxint
    result = ""
    for k, v in mapping.iteritems():
        if abs(value - k) < nearest:
            nearest = abs(value - k)
            result = v
    return result
print lookup(101, {1:'a', 100:'b', 1024:'c'})
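The same nearest-key search can be written with min() and a key function. The sketch below also adds an optional tolerance so that far-away inputs do not match. Both lookup_nearest and tolerance are names introduced here for illustration.

```python
def lookup_nearest(value, mapping, tolerance=None):
    """Return the value stored under the key closest to `value`.

    Returns None for an empty mapping, or when the closest key is
    further away than `tolerance` (if a tolerance is given).
    """
    if not mapping:
        return None
    nearest = min(mapping, key=lambda k: abs(k - value))
    if tolerance is not None and abs(nearest - value) > tolerance:
        return None
    return mapping[nearest]

data = {1: 'a', 100: 'b', 1024: 'c'}
print(lookup_nearest(101, data))
print(lookup_nearest(500, data, tolerance=5))
```

Without a tolerance, any input returns something as long as the mapping is non-empty, which may or may not be what "approximate" is meant to cover.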
You can search for values within a 2% range (configurable) with something like this:
data = {1:'a', 100:'b', 1024:'c'}
def get_approx(data, key):
    return [elem[1] for elem in data.iteritems() if elem[0]*0.98 <= key <= elem[0]*1.02]
get_approx(data, 99) # outputs ['b']
I'm running Python 2.7 in PowerShell on Windows 10. I want to print the key and the value of a dictionary with every pair enumerated.
I do the following:
my_dict = {'key_one': 1, 'key_two': 2, 'key_three': 3}
for k, v in enumerate(my_dict.iteritems(), start = 1):
    print k, v
which in turn gives:
1 ('key_one', 1)
2 ('key_two', 2)
3 ('key_three', 3)
How do I print the entries without the parentheses?
Example - I want to put a = sign in between my key-value pairs.
If you want to keep the indices (from enumerate), then you're going to have to unpack the key and value from the dict items separately. Right now what you're calling k is actually an index, and what you're calling v is actually a key-value pair. Try something like this:
for i, (k, v) in enumerate(my_dict.iteritems(), start=1):
    print i, k, v
That results in something like:
1 key_two 2
2 key_one 1
3 key_three 3
To get them formatted with an equals sign, you'd have to change the print statement to print i, "{}={}".format(k, v), which would result in something like:
1 key_two=2
2 key_one=1
3 key_three=3
If you need to retrieve the keys in a consistent order, use sorted(), like this:
for i, (k, v) in enumerate(sorted(my_dict.iteritems()), start=1):
    ...
Or, if you want to sort by values first instead of the keys first, you could specify a key function for the sorted() call. That would look like: sorted(my_dict.iteritems(), key=lambda (x, y): (y, x)). That would give you an output of
1 key_one=1
2 key_two=2
3 key_three=3
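The lambda (x, y): ... form above relies on tuple parameter unpacking, which only exists in Python 2. Here is a sketch of the same value-first sort that runs on both Python 2.7 and Python 3. The numbered_lines name is made up for this example.

```python
my_dict = {'key_one': 1, 'key_two': 2, 'key_three': 3}

def numbered_lines(d):
    """Return 'index key=value' strings, sorted by value and then by key."""
    by_value = sorted(d.items(), key=lambda kv: (kv[1], kv[0]))
    return ["{} {}={}".format(i, k, v)
            for i, (k, v) in enumerate(by_value, start=1)]

for line in numbered_lines(my_dict):
    print(line)
```

Building the strings first and printing them afterwards keeps the formatting separate from the output, which makes the function easy to reuse or test.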
You don't need enumerate if you just want to print the existing key and values in your dictionary. Just use format():
for k, v in my_dict.items():
    print '{} = {}'.format(k, v)
This would give:
key_one = 1
key_two = 2
key_three = 3
This works
my_dict = {'key_one': 1, 'key_two': 2, 'key_three': 3}
for key,value in my_dict.iteritems():
    print key,value
Like this?
>>> for k, v in my_dict.iteritems():
...     print k, v
...
key_two 2
key_one 1
key_three 3
or
>>> for i, (k, v) in enumerate(my_dict.iteritems(), start=1):
...     print i, k, v
...
1 key_two 2
2 key_one 1
3 key_three 3
Simple one-line solution(for Python 2.7):
print '\n'.join([k+'='+ str(my_dict[k]) for k in my_dict.keys()])
The output:
key_two=2
key_one=1
key_three=3
You do not need enumerate() here. It is used when you need the index along with each item. You do not even need str.format() for this. Simply place an '=' string between your key and value and you will get what you want. For example:
>>> my_dict = {'key_one': 1, 'key_two': 2, 'key_three': 3}
>>> for key, value in my_dict.items():
...     print key, '=', value
...
key_two = 2
key_one = 1
key_three = 3
Edit: Based on the comment at user3030010's answer
Note: dicts in Python 2.7 are unordered. If you want to maintain the insertion order, use collections.OrderedDict() instead. It preserves the order regardless of platform and Python version. For example, if you created the dict like this:
>>> from collections import OrderedDict
>>> my_dict = OrderedDict()
>>> my_dict['key_one'] = 1
>>> my_dict['key_two'] = 2
>>> my_dict['key_three'] = 3
On iterating over it, you will always get the same output:
>>> for key, value in my_dict.items():
...     print key, '=', value
...
key_one = 1
key_two = 2
key_three = 3
I've got two sets of key-value pairs that look like this:
tom = {'coffee': 2, 'hotdog': 1}
and another like this:
namcat = {'hotdog stand':[hotdog, foodstand], 'cafe':[breakfast, coffee]}
I'd like to check whenever a key associated with 'tom' is the same as a value in 'namcat', and if so add 1 to a running total. I think it's iterating over key-value pairs with lists that is causing me issues.
running_total = 0
for k, v in namcat.items():
    for item in v:
        for key, value in tom.items():
            if value == item:
                running_total += 1
Demo:
>>> hotdog = 1
>>> coffee = 2
>>> foodstand = 6
>>> breakfast = 10
>>> tom = {'coffee': 2, 'hotdog': 1}
>>> namcat = {'hotdog stand':[hotdog, foodstand], 'cafe':[breakfast, coffee]}
>>> running_total = 0
>>> for k, v in namcat.items():
...     for item in v:
...         for key, value in tom.items():
...             if value == item:
...                 running_total += 1
...
>>> running_total
2
This should do it. Hope it helps!
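The three nested loops can also be collapsed into a single sum(). This is just a sketch that reuses the numbers from the demo above (hotdog = 1, foodstand = 6, breakfast = 10, coffee = 2) as literals. The tom_values name is introduced here.

```python
tom = {'coffee': 2, 'hotdog': 1}
# Same lists as in the demo, with the variables replaced by their values.
namcat = {'hotdog stand': [1, 6], 'cafe': [10, 2]}

tom_values = list(tom.values())

# For every item in every namcat list, count how many of tom's values equal it.
running_total = sum(
    tom_values.count(item)
    for items in namcat.values()
    for item in items
)

print(running_total)
```

Like the loop version, this compares the values stored in tom (not its keys) against the list items.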
In Python it's annoying to have to check whether a key is in the dictionary first before incrementing it:
if key in my_dict:
    my_dict[key] += num
else:
    my_dict[key] = num
Is there a shorter substitute for the four lines above?
An alternative is:
my_dict[key] = my_dict.get(key, 0) + num
You have quite a few options. I like using Counter:
>>> from collections import Counter
>>> d = Counter()
>>> d[12] += 3
>>> d
Counter({12: 3})
Or defaultdict:
>>> from collections import defaultdict
>>> d = defaultdict(int) # int() == 0, so the default value for each key is 0
>>> d[12] += 3
>>> d
defaultdict(<type 'int'>, {12: 3})
What you want is called a defaultdict
See http://docs.python.org/library/collections.html#collections.defaultdict
transform:
if key in my_dict:
    my_dict[key] += num
else:
    my_dict[key] = num
into the following using setdefault:
my_dict[key] = my_dict.setdefault(key, 0) + num
There is also a slightly different way to use setdefault:
my_dict.setdefault(key, 0)
my_dict[key] += num
Which may have some advantages if combined with other logic.
The condition can be shortened with dict.get, which returns the default when the key is missing:
d = {}
d['1'] = 10
d['1'] = d.get('1', 0) + 1
print(d)
Either .get or .setdefault can be used:
.get() returns the default value passed to it if the key is missing:
my_dict[key] = my_dict.get(key, 0) + num
.setdefault() inserts the key with the default value passed to it if the key is missing, then returns the key's value:
my_dict[key] = my_dict.setdefault(key, 0) + num
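To see the options from this thread side by side, the sketch below runs the same made-up (key, amount) pairs through .get, .setdefault, defaultdict and Counter. The events list and the with_* names are invented for the example.

```python
from collections import Counter, defaultdict

events = [("a", 2), ("b", 1), ("a", 3)]  # made-up (key, amount) pairs

with_get = {}
with_setdefault = {}
with_defaultdict = defaultdict(int)
with_counter = Counter()

for key, num in events:
    with_get[key] = with_get.get(key, 0) + num   # plain dict, explicit default
    with_setdefault.setdefault(key, 0)           # insert 0 if the key is missing
    with_setdefault[key] += num
    with_defaultdict[key] += num                 # a missing key starts at int() == 0
    with_counter[key] += num                     # a missing key counts as 0

print(with_get)
```

All four end up with the same totals. One difference worth knowing: merely reading a missing key from a defaultdict inserts it, while a Counter returns 0 without adding the key.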
I receive a dictionary as input, and want to return a list of keys for which the dictionary values are unique in the scope of that dictionary.
I will clarify with an example. Say my input is dictionary a, constructed as follows:
a = dict()
a['cat'] = 1
a['fish'] = 1
a['dog'] = 2 # <-- unique
a['bat'] = 3
a['aardvark'] = 3
a['snake'] = 4 # <-- unique
a['wallaby'] = 5
a['badger'] = 5
The result I expect is ['dog', 'snake'].
There are obvious brute force ways to achieve this, however I wondered if there's a neat Pythonian way to get the job done.
I think an efficient way, if the dict is large, would be:
countMap = {}
for v in a.itervalues():
    countMap[v] = countMap.get(v,0) + 1
uni = [ k for k, v in a.iteritems() if countMap[v] == 1]
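The counting step can also be delegated to collections.Counter (available from Python 2.7). A minimal sketch using the dictionary from the question. The names value_counts and unique_keys are introduced here, and the result is sorted only to make it deterministic.

```python
from collections import Counter

a = {'cat': 1, 'fish': 1, 'dog': 2, 'bat': 3,
     'aardvark': 3, 'snake': 4, 'wallaby': 5, 'badger': 5}

# Count how often each value occurs, then keep the keys whose value occurs once.
value_counts = Counter(a.values())
unique_keys = sorted(k for k, v in a.items() if value_counts[v] == 1)

print(unique_keys)
```

This makes two passes over the dictionary and requires the values to be hashable.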
Here is a solution that only requires traversing the dict once:
def unique_values(d):
    seen = {} # dict (value, key)
    result = set() # keys with unique values
    for k,v in d.iteritems():
        if v in seen:
            result.discard(seen[v])
        else:
            seen[v] = k
            result.add(k)
    return list(result)
Note that this is actually a brute-force approach:
l = a.values()
b = [x for x in a if l.count(a[x]) == 1]
>>> b = []
>>> import collections
>>> bag = collections.defaultdict(lambda: 0)
>>> for v in a.itervalues():
...     bag[v] += 1
...
>>> b = [k for (k, v) in a.iteritems() if bag[v] == 1]
>>> b.sort() # optional
>>> print b
['dog', 'snake']
>>>
A little more verbose, but it needs only one pass over a:
revDict = {}
for k, v in a.iteritems():
    if v in revDict:
        revDict[v] = None
    else:
        revDict[v] = k
[ x for x in revDict.itervalues() if x is not None ]
(I hope it works, since I can't test it here.)
What about subclassing?
class UniqueValuesDict(dict):
    def __init__(self, *args):
        dict.__init__(self, *args)
        self._inverse = {}
    def __setitem__(self, key, value):
        if value in self.values():
            if value in self._inverse:
                del self._inverse[value]
        else:
            self._inverse[value] = key
        dict.__setitem__(self, key, value)
    def unique_values(self):
        return self._inverse.values()
a = UniqueValuesDict()
a['cat'] = 1
a['fish'] = 1
a[None] = 1
a['duck'] = 1
a['dog'] = 2 # <-- unique
a['bat'] = 3
a['aardvark'] = 3
a['snake'] = 4 # <-- unique
a['wallaby'] = 5
a['badger'] = 5
assert sorted(a.unique_values()) == ['dog', 'snake']
Here's another variation.
>>> import collections
>>> inverse= collections.defaultdict(list)
>>> for k,v in a.items():
...     inverse[v].append(k)
...
>>> [ v[0] for v in inverse.values() if len(v) == 1 ]
['dog', 'snake']
I'm partial to this because the inverted dictionary is such a common design pattern.
You could do something like this (just count the number of occurrences for each value):
def unique(a):
    from collections import defaultdict
    count = defaultdict(lambda: 0)
    for k, v in a.iteritems():
        count[v] += 1
    # Yield the keys (not the values) whose value occurs exactly once.
    for k, v in a.iteritems():
        if count[v] == 1:
            yield k
Use nested list comprehensions!
print [v[0] for v in
       dict([(v, [k for k in a.keys() if a[k] == v])
             for v in set(a.values())]).values()
       if len(v) == 1]