How can I convert defaultdict(Set) to defaultdict(list)? - python

I have a defaultdict(Set):
from sets import Set
from collections import defaultdict
values = defaultdict(Set)
I want the Set functionality while building it up, in order to remove duplicates. As a next step I want to store this as JSON. Since JSON doesn't support this data structure, I would like to convert it into a defaultdict(list), but when I try:
defaultdict(list)(values)
I get TypeError: 'collections.defaultdict' object is not callable. How should I do the conversion?

You can use the following:
>>> values = defaultdict(Set)
>>> values['a'].add(1)
>>> defaultdict(list, ((k, list(v)) for k, v in values.items()))
defaultdict(<type 'list'>, {'a': [1]})
The defaultdict constructor takes default_factory as its first argument, which can be followed by the same arguments as for a normal dict. In this case the second argument is a generator expression that yields (key, value) tuples.
Note that if you only need to store it as JSON, a normal dict will do just fine:
>>> {k: list(v) for k, v in values.items()}
{'a': [1]}

defaultdict(list, values)
The defaultdict constructor works like the dict constructor with a mandatory default_factory argument in front. However, this won't convert any existing values from Sets to lists. If you want to do that, you need to do it manually:
defaultdict(list, ((k, list(v)) for k, v in values.viewitems()))
You might not even want a defaultdict at all at that point, though:
{k: list(v) for k, v in values.viewitems()}
(viewitems() is Python 2; in Python 3 use items() instead.)

Say that a = set() and you have already populated it with unique values. You can then convert it to a list with list(a). Note that defaultdict(list(a)) would not work, because defaultdict's first argument must be a callable default factory, and a list is not callable.
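Putting the pieces together, here is a minimal end-to-end sketch of the whole workflow, using Python 3's built-in set (the sets module and its Set class were removed in Python 3):

```python
import json
from collections import defaultdict

# Build up values with sets so duplicates are dropped on insert.
values = defaultdict(set)
values['a'].add(1)
values['a'].add(1)  # duplicate, silently ignored by the set
values['a'].add(2)

# Convert set values to lists so the structure is JSON-serializable.
# sorted() gives a deterministic element order; plain list() works too.
as_lists = {k: sorted(v) for k, v in values.items()}
print(json.dumps(as_lists))  # {"a": [1, 2]}
```

A plain dict is enough here; a defaultdict is only needed while building the structure up.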

Related

How to create a reverse dictionary that takes in account repeated values?

I am trying to create a function that takes in a dictionary and returns a reverse of it while taking care of repeated values. That is, if the original dictionary would be
original_dict = {'first': ['a'], 'second': ['b', 'c'], 'third': ['d'], 'fourth': ['d']}
the function should return
{'a': ['first'], 'b': ['second'], 'c': ['second'], 'd': ['third', 'fourth']}
I've written
def reversed_dict(d):
    new_dict = {}
    for keys, values in d.items():
        new_dict[values] = keys
but when I try it out with the original dictionary, I get the error "unhashable type: 'list'". Any hints on what might be causing it?
You have to iterate over the values in the list as well:
def reversed_dict(d):
    new_dict = {}
    for keys, values in d.items():
        for val in values:
            new_dict.setdefault(val, []).append(keys)
    return new_dict
You have to iterate over the values and add them as keys. You also have to take into account the possibility that you may have already added a value as a key.
def reversed_dict(d):
    new_dict = {}
    for keys, values in d.items():
        for v in values:
            if v in new_dict:
                new_dict[v].append(keys)
            else:
                new_dict[v] = [keys]
    return new_dict
Use collections.defaultdict:
from collections import defaultdict

def reversed_dict(d):
    new_dict = defaultdict(list)
    for key, values in d.items():
        for value in values:
            new_dict[value].append(key)
    return new_dict
The problem with your approach is you're using the entire list as the key of the dictionary. Instead you need to iterate over the list (i.e. for value in values: in the code above.)
defaultdict just makes it simpler to read.
You are getting this error because some of your original_dict values are mutable types, which are, as the error suggests, unhashable and therefore not valid candidates for keys in the reversed_dict.
You can work around this problem by type-checking and casting mutable types to an immutable equivalent, e.g. a list to a tuple.
(I also find a dict comprehension a more elegant and concise approach:)
def reversed_dict(d):
    return {v if not isinstance(v, list) else tuple(v): k for k, v in d.items()}
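As a quick check, the defaultdict-based answer can be exercised against the example from the question (a self-contained sketch):

```python
from collections import defaultdict

def reversed_dict(d):
    # Map each value in every list back to the list of keys it appears under.
    new_dict = defaultdict(list)
    for key, values in d.items():
        for value in values:
            new_dict[value].append(key)
    return new_dict

original_dict = {'first': ['a'], 'second': ['b', 'c'], 'third': ['d'], 'fourth': ['d']}
print(dict(reversed_dict(original_dict)))
# {'a': ['first'], 'b': ['second'], 'c': ['second'], 'd': ['third', 'fourth']}
```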

Converting ordered dictionaries to dictionary

I have a dataset that might have n levels of ordered dictionaries of ordered dictionaries, which might again be inside lists of tuples, tuples, or just lists. I need to convert all of them into normal dictionaries. Is there an easier method than a recursive search and conversion?
from collections import OrderedDict

def ordered_to_regular_dict(d):
    if isinstance(d, OrderedDict):
        d = {k: ordered_to_regular_dict(v) for k, v in d.items()}
    return d
I found an answer on Stack Overflow that handles ordered dictionaries of ordered dictionaries, but not dictionaries inside a list of tuples, or an ordered dictionary inside a list or a tuple.
Why not just write an if for every possibility (tuple, list, dict), like this:
from collections import OrderedDict

def ordered_to_regular_dict(d):
    if isinstance(d, dict):  # also covers OrderedDict
        d = {k: ordered_to_regular_dict(v) for k, v in d.items()}
    elif isinstance(d, list):
        d = [ordered_to_regular_dict(v) for v in d]
    elif isinstance(d, tuple):
        d = tuple(ordered_to_regular_dict(v) for v in d)
    return d
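For reference, a self-contained version of this type-dispatching approach; note that the list and tuple branches must iterate the sequence itself (a plain list has no .items()), and the tuple branch needs an explicit tuple() call so it doesn't return a generator:

```python
from collections import OrderedDict

def ordered_to_regular_dict(d):
    # Recurse into dicts, lists, and tuples; leave everything else alone.
    if isinstance(d, dict):  # covers OrderedDict as well
        return {k: ordered_to_regular_dict(v) for k, v in d.items()}
    elif isinstance(d, list):
        return [ordered_to_regular_dict(v) for v in d]
    elif isinstance(d, tuple):
        return tuple(ordered_to_regular_dict(v) for v in d)
    return d

nested = OrderedDict([('a', [OrderedDict([('b', 1)])]), ('c', (OrderedDict([('d', 2)]),))])
plain = ordered_to_regular_dict(nested)
print(plain)  # {'a': [{'b': 1}], 'c': ({'d': 2},)}
```

The dict comprehension produces a plain dict, so the OrderedDicts disappear at every level.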
You should leverage Python's builtin copy mechanism.
You can override copying behavior for OrderedDict via Python's copyreg module (also used by pickle). Then you can use Python's builtin copy.deepcopy() function to perform the conversion.
import copy
import copyreg
from collections import OrderedDict

def convert_nested_ordered_dict(x):
    """
    Perform a deep copy of the given object, but convert
    all internal OrderedDicts to plain dicts along the way.

    Args:
        x: Any pickleable object

    Returns:
        A copy of the input, in which all OrderedDicts contained
        anywhere in the input (as iterable items or attributes, etc.)
        have been converted to plain dicts.
    """
    # Temporarily install a custom pickling function
    # (used by deepcopy) to convert OrderedDict to dict.
    orig_pickler = copyreg.dispatch_table.get(OrderedDict, None)
    copyreg.pickle(
        OrderedDict,
        lambda d: (dict, ([*d.items()],))
    )
    try:
        return copy.deepcopy(x)
    finally:
        # Restore the original OrderedDict pickling function (if any)
        del copyreg.dispatch_table[OrderedDict]
        if orig_pickler:
            copyreg.dispatch_table[OrderedDict] = orig_pickler
Merely by using Python's builtin copying infrastructure, this solution has several nice properties:
Works for more than just JSON data.
Works for arbitrary data hierarchies.
Does not require you to implement special logic for each possible element type (e.g. list, tuple, etc.)
deepcopy() will properly handle duplicate objects within the collection:
x = [1,2,3]
d = {'a': x, 'b': x}
assert d['a'] is d['b']
d2 = copy.deepcopy(d)
assert d2['a'] is d2['b']
Since our solution is based on deepcopy() we'll have the same advantage.
This solution also converts attributes that happen to be OrderedDict, not only collection elements:
class C:
    def __init__(self, a):
        self.a = a

    def __repr__(self):
        return f"C(a={self.a})"

c = C(OrderedDict([(1, 'one'), (2, 'two')]))
print("original: ", c)
print("converted:", convert_nested_ordered_dict(c))
original: C(a=OrderedDict([(1, 'one'), (2, 'two')]))
converted: C(a={1: 'one', 2: 'two'})

How can I convert nested dictionary to defaultdict?

How can I convert nested dictionary to nested defaultdict?
from collections import defaultdict

dic = {"a": {"aa": "xxx"}}
default = defaultdict(lambda: None, dic)
print(default["dummy_key"])       # returns None
print(default["a"]["dummy_key"])  # KeyError
You need to either loop or recurse over the nested dictionary, through all of its levels.
Unless it's potentially ridiculously deep (as in hundreds of levels), or so wide that small performance factors make a difference, recursion is probably simplest here:
def defaultify(d):
    if not isinstance(d, dict):
        return d
    return defaultdict(lambda: None, {k: defaultify(v) for k, v in d.items()})
Or if you want it to work with all mappings, not just dicts, you could use collections.abc.Mapping instead of dict in your isinstance check.
Of course this is assuming you have a pure nested dict. If you've got, say, something you parsed from a typical JSON response, where there might be dicts with list values with dict elements, you have to handle the other possibilities too:
def defaultify(d):
    if isinstance(d, dict):
        return defaultdict(lambda: None, {k: defaultify(v) for k, v in d.items()})
    elif isinstance(d, list):
        return [defaultify(e) for e in d]
    else:
        return d
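As a quick check, the dict-only version can be exercised against the question's example (a self-contained sketch):

```python
from collections import defaultdict

def defaultify(d):
    # Recursively wrap every nested dict in a defaultdict that returns None.
    if not isinstance(d, dict):
        return d
    return defaultdict(lambda: None, {k: defaultify(v) for k, v in d.items()})

dic = {"a": {"aa": "xxx"}}
default = defaultify(dic)
print(default["dummy_key"])       # None
print(default["a"]["dummy_key"])  # None -- no more KeyError
```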
But if this actually is coming from JSON, it's probably better to just use your defaultdict as an object_pairs_hook while the JSON is being parsed, rather than parsing it to a dict and then converting it to a defaultdict later.
There's an example in the docs of using an OrderedDict in place of dict, but that won't quite work for us: unlike OrderedDict and dict, defaultdict can't just take an iterable of pairs as its only argument; it needs the default-value factory first. So we can bind that in, using functools.partial:
d = json.loads(jsonstring, object_pairs_hook=partial(defaultdict, lambda: None))
And so on.
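A runnable sketch of the parse-time approach (the keyword is object_pairs_hook):

```python
import json
from collections import defaultdict
from functools import partial

# Build defaultdicts directly while parsing: every JSON object becomes a
# defaultdict whose missing keys return None instead of raising KeyError.
hook = partial(defaultdict, lambda: None)
d = json.loads('{"a": {"aa": "xxx"}}', object_pairs_hook=hook)

print(d['a']['aa'])         # xxx
print(d['dummy_key'])       # None
print(d['a']['dummy_key'])  # None
```

object_pairs_hook is called with the list of (key, value) pairs for every decoded object, which defaultdict accepts as its second argument.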

Combining indefinite number of dictionaries for python

I'm a novice python programmer and I'm stuck on a homework problem.
I want to combine dictionaries (tried using **dict) without using the update() method because I want to keep any duplicate keys. I'm fine with having some keys with multiple values.
Could someone point me in the right direction?
Also, I'm doing this in python 3.3
A dict maps a key to a value. Not multiple values. Thus, you need to make each value in the combined dict be a combination of all the values from the input dicts. The easiest way is to use a collections.defaultdict(list):
import collections

input_dicts = [{1: 0}, {1: 1}, {1: 2}]
output_dict = collections.defaultdict(list)
for d in input_dicts:
    for key in d:
        output_dict[key].append(d[key])
A collections.defaultdict calls a function you specify to generate a default value for any key that you try to access that doesn't already have a value. A collections.defaultdict(list) is thus a dict with default values of lists for all keys. This code will produce an output dict mapping keys to lists of all values from the input dicts.
You can't have duplicate keys in a dictionary. The keys must be unique, but I think what you're looking for is a defaultdict
from collections import defaultdict

d = defaultdict(list)
d1 = {1: 'hi', 2: 'hey', 3: 'hai'}
d2 = {1: 'hello', 2: 'cabbage', 3: 'greetings'}
for k, v in d1.items():
    d[k].append(v)
for k1, v1 in d2.items():
    d[k1].append(v1)
print(d)
Prints:
defaultdict(<class 'list'>, {1: ['hi', 'hello'], 2: ['hey', 'cabbage'], 3: ['hai', 'greetings']})
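The two loops can also be collapsed into a single pass by flattening the items of all input dicts with itertools.chain, a sketch of the same idea:

```python
from collections import defaultdict
from itertools import chain

d1 = {1: 'hi', 2: 'hey', 3: 'hai'}
d2 = {1: 'hello', 2: 'cabbage', 3: 'greetings'}

merged = defaultdict(list)
# chain flattens the items of all dicts into one (key, value) stream,
# so any number of input dicts can be handled the same way.
for k, v in chain(d1.items(), d2.items()):
    merged[k].append(v)

print(dict(merged))  # {1: ['hi', 'hello'], 2: ['hey', 'cabbage'], 3: ['hai', 'greetings']}
```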

Python a List of lists Merge

Say I have a list of lists like this (suppose you have no idea how many lists are in it):
list=[['food','fish'],['food','meat'],['food','veg'],['sports','football']..]
how can I merge the items in the list like the following:
list=[['food','fish','meat','veg'],['sports','football','basketball']....]
i.e. merge all the nested lists into the same list if they share the same first item.
Use defaultdict to make a dictionary that maps a type to values and then get the items:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> items = [['food','fish'],['food','meat'],['food','veg'],['sports','football']]
>>> for key, value in items:
... d[key].append(value)
...
>>> [[key] + values for key, values in d.items()]
[['food', 'fish', 'meat', 'veg'], ['sports', 'football']]
The "obligatory" alternative to defaultdict, which can work better for data that's already in order of the key and if you don't want to build data structures on it (i.e., just work on groups), is itertools.groupby:
data = [['food','fish'],['food','meat'],['food','veg'],['sports','football']]
from itertools import groupby
print([[k] + [i[1] for i in v] for k, v in groupby(data, lambda L: L[0])])
But defaultdict is more flexible and easier to understand, so go with @Blender's answer.
