pop is a great little function that, when used on dictionaries (given a known key) removes the item with that key from the dictionary and also returns the corresponding value. But what if I want the key as well?
Obviously, in simple cases I could probably just do something like this:
pair = (key, some_dict.pop(key))
But if, say, I wanted to pop the key-value pair with the lowest value, following the above idea I would have to do this...
pair = (min(some_dict, key=some_dict.get), some_dict.pop(min(some_dict, key=some_dict.get)))
... which is hideous as I have to do the operation twice (obviously I could store the output from min in a variable, but I'm still not completely happy with that). So my question is: Is there an elegant way to do this? Am I missing an obvious trick here?
You can define your own dictionary type using Python's ABCs (abstract base classes), which provide the infrastructure for building dict-like containers, and then override the pop method to suit your needs:
from collections.abc import Mapping  # plain "collections.Mapping" was removed in Python 3.10

class MyDict(Mapping):
    def __init__(self, *args, **kwargs):
        self.update(dict(*args, **kwargs))
    def __setitem__(self, key, item):
        self.__dict__[key] = item
    def __getitem__(self, key):
        return self.__dict__[key]
    def __delitem__(self, key):
        del self.__dict__[key]
    def pop(self, k, d=None):
        # return the key together with the popped value
        return k, self.__dict__.pop(k, d)
    def update(self, *args, **kwargs):
        return self.__dict__.update(*args, **kwargs)
    def __iter__(self):
        return iter(self.__dict__)
    def __len__(self):
        return len(self.__dict__)
    def __repr__(self):
        return repr(self.__dict__)
Demo:
d = MyDict()
d['a'] = 1
d['b'] = 5
d['c'] = 8
print(d)
# {'a': 1, 'b': 5, 'c': 8}
print(d.pop(min(d, key=d.get)))
# ('a', 1)
print(d)
# {'b': 5, 'c': 8}
Note: as chepner suggested in a comment, a better choice is to override popitem, which already returns a key/value pair.
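A minimal sketch of that approach (note the optional key argument is an extension here; the built-in dict.popitem() takes no arguments and pops the last item):

class PairDict(dict):
    def popitem(self, key=None):
        # deviates from dict.popitem() by accepting an optional key
        if key is None:
            return super().popitem()  # default behaviour: pop the last item
        return key, self.pop(key)

d = PairDict(a=1, b=5, c=8)
print(d.popitem(min(d, key=d.get)))
# ('a', 1)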
A heap supports the pop-min operation you describe. You'll need to create a heap from your dictionary first, though.
import heapq
# Must be two steps; heapify modifies its argument in-place.
# Reversing the key and the value because the value will actually be
# the "key" in the heap. (Or rather, tuples are compared
# lexicographically, so put the value in the first position.)
heap = [(v, k) for k, v in some_dict.items()]
heapq.heapify(heap)
# Get the smallest item from the heap
value, key = heapq.heappop(heap)
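If the dictionary itself should also shrink, one way to tie the two together is to delete the popped key from the dict as well (a sketch, assuming the dict hasn't been modified since the heap was built):

import heapq

some_dict = {'a': 1, 'b': 5, 'c': 8}
heap = [(v, k) for k, v in some_dict.items()]
heapq.heapify(heap)

value, key = heapq.heappop(heap)
del some_dict[key]  # keep the dict in sync with the heap
print((key, value))
# ('a', 1)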
Here is a simpler implementation:
class CustomDict(dict):
    def pop_item(self, key):
        popped = {key: self[key]}  # snapshot of the pair before popping
        self.pop(key)
        return popped
a = CustomDict()
b = {"hello":"wassup", "lol":"meh"}
a.update(b)
print(a.pop_item("lol"))
print(a)
Here we create a custom dict that pops the item you want and returns the key-value pair as a single-item dictionary.
Related
There's a common problem where I need to keep track of a bunch of collections in a dictionary. Let's say I want to keep track of which items I borrowed from my friends. The defaultdict class is quite useful to do this:
from collections import defaultdict
d = defaultdict(set)
d['Peter'].add('salt')
d['Eric'].add('car')
d['Eric'].add('jacket')
# defaultdict(<class 'set'>, {'Peter': {'salt'}, 'Eric': {'jacket', 'car'}})
This allows me to add items to the respective sets without worrying about whether the key is already in the dictionary. Now suppose I return the salt to Peter. I then owe him nothing, and he can be removed from the dictionary. Doing this is slightly more cumbersome:
d['Peter'].remove('salt')
if not d['Peter']:
    del d['Peter']
I know I could put this in some function, but for readability I would like a class that removes the key automatically if the corresponding set is empty. Is there some way to do this?
Edit
Okay, I realize a pretty major problem with this idea when trying to solve it using inheritance and changing the index function: when calling d[index], the value is obviously returned before .remove(something) is called, which makes it impossible for the dictionary to know that it has just been emptied. I'm guessing there's not really a way around using something different.
The problem with using a defaultdict to do what you want is that even accessing a key sets that key using the factory function. Consider:
from collections import defaultdict
d = defaultdict(set)
if d["Peter"]:
print("I owe something to Peter")
print(d)
# defaultdict(set, {'Peter': set()})
Also, the problem with creating a subclass is that, as you've realized, the __getitem__() method is called before the set is ever emptied, so you'd have to call another function that checks whether the set is empty and removes it.
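For instance, a minimal sketch of that idea (the remove_item method name is made up here):

from collections import defaultdict

class CleaningDefaultDict(defaultdict):
    def remove_item(self, key, item):
        # remove an item from the set at `key`, then drop the key
        # entirely if its set is now empty
        self[key].remove(item)
        if not self[key]:
            del self[key]

d = CleaningDefaultDict(set)
d['Peter'].add('salt')
d.remove_item('Peter', 'salt')
print('Peter' in d)
# False -- the key is gone once its set empties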
A better idea might be to just not include keys with empty sets when you're creating the string representation.
class NewDefaultDict(defaultdict):
def __repr__(self):
return (f"NewDefaultDict({repr(self.default_factory)}, {{" +
", ".join(f"{repr(k)}: {repr(v)}" for k, v in self.items() if v) +
"})")
nd = NewDefaultDict(set)
nd["Peter"].add("salt")
nd["Paul"].add("pepper")
nd["Paul"].remove("pepper")
print(nd)
# NewDefaultDict(<class 'set'>, {'Peter': {'salt'}})
You would also need to redefine __contains__() to check if the value is empty, so that e.g. "Paul" in nd returns False:
def __contains__(self, key):
return defaultdict.__contains__(self, key) and self[key]
To make it compatible with for ... in nd constructs and dict-unpacking, you can redefine __iter__():
def __iter__(self):
for key in defaultdict.__iter__(self):
if self[key]: yield key
Then,
for k in nd:
print(k)
gives:
Peter
A dictionary comprehension might be useful.
from collections import defaultdict
d = defaultdict(set)
d['Peter'].add('salt')
d['Eric'].add('car')
d['Eric'].add('jacket')
d['Peter'].remove('salt')
d2 = {k: v for k, v in d.items() if len(v) > 0}
The d2 dictionary is now:
{'Eric': {'car', 'jacket'}}
Alternatively, use the fact that an empty set is considered false in Python:
d2 = {k: v for k, v in d.items() if v}
Defining a class to implement this logic, similar to the other answer, we can simply ignore keys/values where the value meets a criterion. A function is passed using the ignore parameter to define that criterion.
from collections import defaultdict
class default_ignore_dict(defaultdict):
def __init__(self, factory, ignore, *args, **kwargs):
defaultdict.__init__(self, factory, *args, **kwargs)
self.ignore = ignore
def __contains__(self, key):
return defaultdict.__contains__(self, key) and not self.ignore(self[key])
def items(self):
return ((k, v) for k, v in defaultdict.items(self) if not self.ignore(v))
def keys(self):
return (k for k, _ in self.items())
def values(self):
return (v for _, v in self.items())
Testing this:
>>> d = default_ignore_dict(set, lambda s: not s)
>>> d['Peter'].add('salt')
>>> d['Peter'].remove('salt')
>>> d['Eric'].add('car')
>>> d['Eric'].add('jacket')
>>>
>>> 'Peter' in d
False
>>> list(d.items())
[('Eric', {'car', 'jacket'})]
>>>
Is there a way to make a defaultdict also be the default for the defaultdict? (i.e. infinite-level recursive defaultdict?)
I want to be able to do:
x = defaultdict(...stuff...)
x[0][1][0]
{}
So, I can do x = defaultdict(defaultdict), but that's only a second level:
x[0]
{}
x[0][0]
KeyError: 0
There are recipes that can do this. But can it be done simply just using the normal defaultdict arguments?
Note this is asking how to do an infinite-level recursive defaultdict, so it's distinct from Python: defaultdict of defaultdict?, which was how to do a two-level defaultdict.
I'll probably just end up using the bunch pattern, but when I realized I didn't know how to do this, it got me interested.
The other answers here tell you how to create a defaultdict which contains "infinitely many" defaultdicts, but they fail to address what I think may have been your initial need, which was simply to have a two-depth defaultdict.
You may have been looking for:
defaultdict(lambda: defaultdict(dict))
The reasons why you might prefer this construct are:
It is more explicit than the recursive solution, and therefore likely more understandable to the reader.
This enables the "leaf" of the defaultdict to be something other than a dictionary, e.g. defaultdict(lambda: defaultdict(list)) or defaultdict(lambda: defaultdict(set)).
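For instance, a quick sketch of the two-level version with list leaves (the keys here are made up for illustration):

from collections import defaultdict

d = defaultdict(lambda: defaultdict(list))
d['fruits']['red'].append('apple')
d['fruits']['red'].append('cherry')
print(d['fruits']['red'])
# ['apple', 'cherry']
print(d['veggies']['green'])
# [] -- created on first access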
For an arbitrary number of levels:
def rec_dd():
return defaultdict(rec_dd)
>>> x = rec_dd()
>>> x['a']['b']['c']['d']
defaultdict(<function rec_dd at 0x7f0dcef81500>, {})
>>> import json
>>> print(json.dumps(x))
{"a": {"b": {"c": {"d": {}}}}}
Of course you could also do this with a lambda, but I find lambdas to be less readable. In any case it would look like this:
rec_dd = lambda: defaultdict(rec_dd)
There is a nifty trick for doing that:
tree = lambda: defaultdict(tree)
Then you can create your x with x = tree().
Similar to BrenBarn's solution, but it doesn't contain the name of the variable tree twice, so it works even after the variable is rebound:
tree = (lambda f: f(f))(lambda a: (lambda: defaultdict(a(a))))
Then you can create each new x with x = tree().
For the def version, we can use function closure scope to protect the data structure from the flaw where existing instances stop working if the tree name is rebound. It looks like this:
from collections import defaultdict
def tree():
def the_tree():
return defaultdict(the_tree)
return the_tree()
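A quick check of the closure-based version (sketch):

t = tree()
t['a']['b']['c'] = 1
tree = None      # rebinding the outer name...
t['x']['y'] = 2  # ...no longer breaks existing trees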
I would also propose a more OOP-style implementation, which supports infinite nesting as well as a properly formatted repr.
class NestedDefaultDict(defaultdict):
def __init__(self, *args, **kwargs):
super(NestedDefaultDict, self).__init__(NestedDefaultDict, *args, **kwargs)
def __repr__(self):
return repr(dict(self))
Usage:
my_dict = NestedDefaultDict()
my_dict['a']['b'] = 1
my_dict['a']['c']['d'] = 2
my_dict['b']
print(my_dict) # {'a': {'b': 1, 'c': {'d': 2}}, 'b': {}}
I based this off Andrew's answer here.
If you are looking to load data from JSON or an existing dict into the nested defaultdict, see this example:
from collections import defaultdict

def nested_defaultdict(existing=None, **kwargs):
if existing is None:
existing = {}
if not isinstance(existing, dict):
return existing
existing = {key: nested_defaultdict(val) for key, val in existing.items()}
return defaultdict(nested_defaultdict, existing, **kwargs)
https://gist.github.com/nucklehead/2d29628bb49115f3c30e78c071207775
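For example (a sketch with made-up data):

data = {'a': {'b': 1}}
nd = nested_defaultdict(data)
print(nd['a']['b'])
# 1 -- existing values survive
nd['x']['y']['z'] = 2  # new levels are created on demand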
Here is a function for an arbitrary base defaultdict for an arbitrary depth of nesting.
(cross posting from Can't pickle defaultdict)
def wrap_defaultdict(instance, times=1):
"""Wrap an instance an arbitrary number of `times` to create nested defaultdict.
Parameters
----------
instance - list, dict, int, collections.Counter
times - the number of nested keys above `instance`; if `times=3` dd[one][two][three] = instance
Notes
-----
using `x.copy` allows pickling (loading to ipyparallel cluster or pkldump)
- thanks https://stackoverflow.com/questions/16439301/cant-pickle-defaultdict
"""
from collections import defaultdict
def _dd(x):
return defaultdict(x.copy)
dd = defaultdict(instance)
for i in range(times-1):
dd = _dd(dd)
return dd
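Usage might look like this (a sketch):

dd = wrap_defaultdict(int, times=3)
dd['one']['two']['three'] += 1
print(dd['one']['two']['three'])
# 1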
Based on Chris W's answer; however, to address the type-annotation concern, you could make it a factory function that declares the detailed types. For example, this is the final solution to my problem when I was researching this question:
from collections import defaultdict

def frequency_map_factory() -> dict[str, dict[str, int]]:
    """
    Provides a recorder of: per X:str, frequency of Y:str occurrences.
    """
    return defaultdict(lambda: defaultdict(int))
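For example (hypothetical keys):

freq = frequency_map_factory()
freq['colour']['red'] += 1
freq['colour']['red'] += 1
print(freq['colour']['red'])
# 2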
Here is a recursive function to convert a recursive defaultdict to a normal dict:
from collections import defaultdict

def defdict_to_dict(defdict, finaldict):
    # pass in an empty dict for finaldict
for k, v in defdict.items():
if isinstance(v, defaultdict):
# new level created and that is the new value
finaldict[k] = defdict_to_dict(v, {})
else:
finaldict[k] = v
return finaldict
defdict_to_dict(my_rec_default_dict, {})
nucklehead's response can be extended to handle arrays in JSON as well:
from collections import defaultdict

def nested_dict(existing=None, **kwargs):
if existing is None:
existing = defaultdict()
if isinstance(existing, list):
existing = [nested_dict(val) for val in existing]
if not isinstance(existing, dict):
return existing
existing = {key: nested_dict(val) for key, val in existing.items()}
return defaultdict(nested_dict, existing, **kwargs)
Here's a solution similar to Stanislav's answer that works with multiprocessing and also allows for termination of the nesting:
from collections import defaultdict
from functools import partial
class NestedDD(defaultdict):
def __init__(self, n, *args, **kwargs):
self.n = n
factory = partial(build_nested_dd, n=n - 1) if n > 1 else int
super().__init__(factory, *args, **kwargs)
def __repr__(self):
return repr(dict(self))
def build_nested_dd(n):
return NestedDD(n)
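Usage (a sketch; the leaf type here is int, per the factory above):

d = NestedDD(3)
d['a']['b']['c'] += 1  # three levels deep, ending in an int leaf
print(d)
# {'a': {'b': {'c': 1}}}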
I need a function to change one item in composite dictionary.
I've tried something like..
def SetItem(keys, value):
item = self.dict
for key in keys:
item = item[key]
item = value
and
SetItem(['key1', 'key2'], 86)
It should be equivalent to self.dict['key1']['key2'] = 86, but this function has no effect.
Almost. You actually want to do something like:
def set_keys(d, keys, value):
item = d
for key in keys[:-1]:
item = item[key]
item[keys[-1]] = value
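For example (sketch):

d = {'key1': {'key2': 0}}
set_keys(d, ['key1', 'key2'], 86)
print(d)
# {'key1': {'key2': 86}}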
Or recursively like this:
def set_key(d, keys, value):
if len(keys) == 1:
d[keys[0]] = value
else:
set_key(d[keys[0]], keys[1:], value)
Marcin's right though. You would really want to incorporate something more rigorous, with some error handling for missing keys/missing dicts.
setItem = lambda self, names, value: map(lambda name: setattr(self, name, value), names)
# (note: in Python 3, map() is lazy, so the setattr calls won't run unless the result is consumed)
You don't have a self parameter in that function, by the way. Just use the line of working code you have.
If you insist, here's a way:
from functools import reduce  # in Python 3, reduce lives in functools

def setitem(self, keys, value):
    reduce(dict.get,  # = lambda dictionary, key: dictionary[key]
           keys[:-1], self.dictionary)[keys[-1]] = value
Obviously, this will break if the list of keys hits a non-dict value. You'll want to handle that. In fact, an explicit loop would probably be better for that reason, but you get the idea.
An idea involving recursion and EAFP, both of which I always like:
def set_item(d, keys, value):
key = keys.pop(0)
try:
set_item(d[key], keys, value)
# IndexError happens when the pop fails (empty list), KeyError happens when it's not a dict.
# Assume both mean we should finish recursing
except (IndexError, KeyError):
d[key] = value
Example:
>>> d = {'a': {'aa':1, 'ab':2}, 'b':{'ba':1, 'bb':2}}
>>> set_item(d, ['a', 'ab'], 50)
>>> print(d)
{'a': {'aa': 1, 'ab': 50}, 'b': {'ba': 1, 'bb': 2}}
Edit: As Marcin points out below, this will not work for arbitrarily nested dicts since Python has a recursion limit. It's also not for highly performance-sensitive situations (recursion in Python generally isn't). Nonetheless, outside of these two situations I find this to be somewhat more explicit than something involving reduce or lambda.
Fairly new to Python, still struggling with so much information.
All the documentation I've seen about dictionaries explain various ways of getting a value via a key - but I'm looking for a pythonic way to do the opposite - get a key via a value.
I know I can loop through the keys and inspect their values until I find the value I'm looking for and then grab the key, but I'm looking for a direct route.
There is no direct route. It's pretty easy with list comprehensions, though:
[k for k, v in d.iteritems() if v == desired_value]
If you need to do this occasionally and don't think it's worth while indexing it the other way as well, you could do something like:
class bidict(dict):
    def key_with_value(self, value, default=None):
        for k, v in self.items():  # .iteritems() in Python 2
            if v == value:
                return k  # return the key, not the value
        return default
    def keys_with_value(self, value):
        return [k for k, v in self.items() if v == value]
Then d.key_with_value would behave rather like d.get, except the other way round.
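For example (sketch):

d = bidict({'a': 1, 'b': 2})
print(d.key_with_value(2))
# b
print(d.key_with_value(9, 'missing'))
# missing -- like dict.get's default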
You could also make a class which indexed it both ways automatically. Key and value would both need to be hashable, then. Here are three ways it could be implemented:
In two separate dicts, with the class exposing some dict-like methods; you could perhaps do foo.by_key[key] or foo.by_value[value]. (No code given as it's more complicated and I'm lazy and I think this is suboptimal anyway.)
In a different structure, so that you could do d[key] and d.inverse[value]:
class bidict(dict):
def __init__(self, *args, **kwargs):
self.inverse = {}
        super(bidict, self).__init__(*args, **kwargs)
def __setitem__(self, key, value):
super(bidict, self).__setitem__(key, value)
self.inverse[value] = key
def __delitem__(self, key):
del self.inverse[self[key]]
super(bidict, self).__delitem__(key)
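A quick demonstration of the second approach (sketch):

d = bidict()
d['a'] = 1
print(d.inverse[1])
# a
del d['a']
print(d.inverse)
# {}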
In the same structure, so that you could do d[key] and d[value]:
class bidict(dict):
def __setitem__(self, key, value):
super(bidict, self).__setitem__(key, value)
super(bidict, self).__setitem__(value, key)
def __delitem__(self, key):
super(bidict, self).__delitem__(self[key])
super(bidict, self).__delitem__(key)
(Notably absent from these implementations of a bidict is the update method, which will be slightly more complex (but help(dict.update) will indicate what you'd need to cover). Without update, bidict({1:2}) wouldn't do what it was intended to, nor would d.update({1:2}).)
Also consider whether some other data structure would be more appropriate.
Since your dictionary can contain duplicate values (i.e. {'a': 'A', 'b': 'A'}), the only way to find a key from a value is to iterate over the dictionary as you describe.
Or... build the opposite dictionary. You would have to recreate it after each modification of the original dictionary.
Or... write a class that maintains a both-ways dictionary. You would have to manage situations where duplicate values appear.
The first solution, with the list comprehension, is good.
But a small fix for Python 3.x: instead of .iteritems() it should be just .items():
[k for k, v in d.items() if v == desired_value]
Building an opposite dictionary is not a good approach when one or more keys share the same value: inverting it forces you into a value: [key1, ...] structure, which leads you to another problem.
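A sketch of that inverted structure using defaultdict (assuming the values are hashable):

from collections import defaultdict

d = {'a': 'A', 'b': 'A', 'c': 'B'}
inverse = defaultdict(list)
for k, v in d.items():
    inverse[v].append(k)
print(inverse['A'])
# ['a', 'b'] -- every key that maps to 'A'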