It would be convenient if a defaultdict could be initialized along the following lines
d = defaultdict(list, (('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2),
('b', 3)))
to produce
defaultdict(<type 'list'>, {'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]})
Instead, I get
defaultdict(<type 'list'>, {'a': 2, 'c': 3, 'b': 3, 'd': 4})
To get what I need, I end up having to do this:
d = defaultdict(list)
for x, y in (('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)):
    d[x].append(y)
This is IMO one step more than should be necessary, am I missing something here?
What you're apparently missing is that defaultdict is a straightforward (not especially "magical") subclass of dict. All the first argument does is provide a factory function for missing keys. When you initialize a defaultdict, you're initializing a dict.
If you want to produce
defaultdict(<type 'list'>, {'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]})
you should be initializing it the way you would initialize any other dict whose values are lists:
d = defaultdict(list, (('a', [1, 2]), ('b', [2, 3]), ('c', [3]), ('d', [4])))
If your initial data has to be in the form of tuples whose 2nd element is always an integer, then just go with the for loop. You call it one extra step; I call it the clear and obvious way to do it.
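To see why the initial pairs are not grouped, here is a minimal sketch (the variable names are only illustrative). It shows that the positional data goes straight to the dict constructor, while the factory runs only when a missing key is looked up:

```python
from collections import defaultdict

pairs = (('a', 1), ('b', 2), ('a', 2))

# The pairs are handed to the dict constructor unchanged,
# so the later ('a', 2) simply overwrites the earlier ('a', 1).
d = defaultdict(list, pairs)
print(dict(d))        # {'a': 2, 'b': 2}

# The factory is consulted only for keys that are not present yet.
print(d['z'])         # []
print(isinstance(d, dict))  # True
```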
The behavior you describe would not be consistent with defaultdict's other behaviors. It seems that what you want is a FooDict such that
>>> f = FooDict()
>>> f['a'] = 1
>>> f['a'] = 2
>>> f['a']
[1, 2]
We can do that, but not with defaultdict; let's call it AppendDict:
import collections
# collections.MutableMapping was removed in Python 3.10; use collections.abc
from collections.abc import MutableMapping
class AppendDict(MutableMapping):
    def __init__(self, container=list, append=None, pairs=()):
        # every missing key starts out as an empty container
        self.container = collections.defaultdict(container)
        self.append = append or list.append
        for key, value in pairs:
            self[key] = value
    def __setitem__(self, key, value):
        # assignment appends to the container instead of replacing it
        self.append(self.container[key], value)
    def __getitem__(self, key): return self.container[key]
    def __delitem__(self, key): del self.container[key]
    def __iter__(self): return iter(self.container)
    def __len__(self): return len(self.container)
Sorting and itertools.groupby go a long way:
>>> L = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)]
>>> L.sort(key=lambda t:t[0])
>>> d = defaultdict(list, [(tup[0], [t[1] for t in tup[1]]) for tup in itertools.groupby(L, key=lambda t: t[0])])
>>> d
defaultdict(<type 'list'>, {'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]})
To make this more of a one-liner:
L = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)]
d = defaultdict(list, [(tup[0], [t[1] for t in tup[1]]) for tup in itertools.groupby(sorted(L, key=operator.itemgetter(0)), key=lambda t: t[0])])
Hope this helps
I think most of this is a lot of smoke and mirrors to avoid a simple for loop:
di={}
for k,v in [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2),('b', 3)]:
    di.setdefault(k,[]).append(v)
# di={'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]}
If your goal is one line, and you want abusive syntax that I cannot at all endorse or support, you can use a side-effect comprehension:
>>> li=[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2),('b', 3)]
>>> di={};{di.setdefault(k[0],[]).append(k[1]) for k in li}
set([None])
>>> di
{'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]}
If you really want to go overboard into the unreadable:
>>> {k1:[e for _,e in v1] for k1,v1 in {k:filter(lambda x: x[0]==k,li) for k,v in li}.items()}
{'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]}
You don't want to do that. Use the for loop Luke!
>>> kvs = [(1,2), (2,3), (1,3)]
>>> reduce(
... lambda d,(k,v): d[k].append(v) or d,
... kvs,
... defaultdict(list))
defaultdict(<type 'list'>, {1: [2, 3], 2: [3]})
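The snippet above relies on Python 2 features: a lambda with a tuple parameter and the builtin reduce. A rough Python 3 equivalent, assuming a small helper (add_pair is a name made up here) in place of the lambda, might look like this:

```python
from collections import defaultdict
from functools import reduce  # reduce is no longer a builtin in Python 3

def add_pair(d, kv):
    # hypothetical helper: unpack the pair explicitly, since Python 3
    # does not allow tuple parameters in function signatures
    k, v = kv
    d[k].append(v)
    return d

kvs = [(1, 2), (2, 3), (1, 3)]
grouped = reduce(add_pair, kvs, defaultdict(list))
print(dict(grouped))  # {1: [2, 3], 2: [3]}
```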
dicts support several initialization/update arguments:
d = dict([('a',1), ('b',2)]) # list of (key,value)
d = dict({'a':1, 'b':2}) # copy of dict
d = dict(a=1, b=2) #keywords
How do I write the signature and handle the arguments of a function that handles the same set of cases?
I'm writing a class that I want to support the same set of arguments, but just store the list of (key,value) pairs in order to allow for multiple items with the same key. I'd like to be able to add to the list of pairs from dicts, or lists of pairs or keywords in a way that is the same as for dict.
The C implementation of update isn't very helpful to me as I'm implementing this in python.
Here is a function that should behave the same as dict (except for the requirement to support multiple keys):
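def func(*args, **kwargs):
    result = []
    if len(args) > 1:
        raise TypeError('expected at most 1 argument, got %d' % len(args))
    elif args:
        # mapping-like objects only need keys() and __getitem__
        if all(hasattr(args[0], a) for a in ('keys', '__getitem__')):
            result.extend(dict(args[0]).items())
        else:
            # otherwise expect an iterable of (key, value) pairs
            for k, v in args[0]:
                hash(k)  # raises TypeError for unhashable keys, like dict
                result.append((k, v))
    result.extend(kwargs.items())
    return result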
A few notes:
The collections.abc.Mapping class isn't used to test for mapping objects, because dict has lower requirements than that.
The signature of the function could be defined more precisely using positional-only syntax, like this: def func(x=None, /, **kwargs). However, that requires Python >= 3.8, so a more backwards-compatible solution has been preferred.
The function should raise the same exceptions as dict, but the messages won't always be exactly the same.
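For readers on Python >= 3.8, the positional-only variant mentioned in the notes could be sketched roughly as follows. The names func_posonly and _MISSING are made up for this illustration, and the sentinel default is an assumption (it replaces the x=None default so that an explicit None is rejected the same way dict(None) is):

```python
_MISSING = object()  # hypothetical sentinel: "no positional argument given"

def func_posonly(x=_MISSING, /, **kwargs):
    result = []
    if x is not _MISSING:
        if hasattr(x, 'keys') and hasattr(x, '__getitem__'):
            # mapping-like: read each value through __getitem__
            result.extend((k, x[k]) for k in x.keys())
        else:
            # iterable of (key, value) pairs; duplicate keys are kept
            for k, v in x:
                hash(k)  # reject unhashable keys, as dict would
                result.append((k, v))
    result.extend(kwargs.items())
    return result

print(func_posonly({'a': 1}, b=2))         # [('a', 1), ('b', 2)]
print(func_posonly([('a', 1), ('a', 2)]))  # [('a', 1), ('a', 2)]
print(func_posonly(x=1))                   # [('x', 1)]
```

Because x is positional-only, a keyword argument that happens to be named x ends up in kwargs, which matches how dict(x=1) behaves.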
Below are some simple tests that compare the behaviour of the function with the dict constructor (although it does not attempt to cover every possibility):
def test(fn):
    d1 = {'a': 1, 'b': 2, 'c': 3}
    d2 = {'d': 4, 'e': 5, 'f': 6}
    class Maplike():
        def __getitem__(self, k):
            return d1[k]
        def keys(self):
            return d1.keys()
    print('test %s\n=========\n' % fn.__name__)
    print('dict:', fn(d1))
    print('maplike:', fn(Maplike()))
    print('seq:', fn(tuple(d1.items())))
    print('it:', fn(iter(d1.items())))
    print('gen:', fn(i for i in d1.items()))
    print('set:', fn(set(d2.items())))
    print('str:', fn(['fu', 'ba', 'r!']))
    print('kwargs:', fn(**d1))
    print('arg+kwargs:', fn(d1, **d2))
    print('dup-keys:', fn(d1, **d1))
    print('empty:', fn())
    print()
    try:
        fn(d1, d2)
        print('ERROR')
    except Exception as e:
        print('len-args: %s' % e)
    try:
        fn([(1, 2, 3)])
        print('ERROR')
    except Exception as e:
        print('pairs: %s' % e)
    try:
        fn([([], 3)])
        print('ERROR')
    except Exception as e:
        print('hashable: %s' % e)
    print()
test(func)
test(dict)
Output:
test func
=========
dict: [('a', 1), ('b', 2), ('c', 3)]
maplike: [('a', 1), ('b', 2), ('c', 3)]
seq: [('a', 1), ('b', 2), ('c', 3)]
it: [('a', 1), ('b', 2), ('c', 3)]
gen: [('a', 1), ('b', 2), ('c', 3)]
set: [('d', 4), ('e', 5), ('f', 6)]
str: [('f', 'u'), ('b', 'a'), ('r', '!')]
kwargs: [('a', 1), ('b', 2), ('c', 3)]
arg+kwargs: [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5), ('f', 6)]
dup-keys: [('a', 1), ('b', 2), ('c', 3), ('a', 1), ('b', 2), ('c', 3)]
empty: []
len-args: expected at most 1 argument, got 2
pairs: too many values to unpack (expected 2)
hashable: unhashable type: 'list'
test dict
=========
dict: {'a': 1, 'b': 2, 'c': 3}
maplike: {'a': 1, 'b': 2, 'c': 3}
seq: {'a': 1, 'b': 2, 'c': 3}
it: {'a': 1, 'b': 2, 'c': 3}
gen: {'a': 1, 'b': 2, 'c': 3}
set: {'d': 4, 'e': 5, 'f': 6}
str: {'f': 'u', 'b': 'a', 'r': '!'}
kwargs: {'a': 1, 'b': 2, 'c': 3}
arg+kwargs: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}
dup-keys: {'a': 1, 'b': 2, 'c': 3}
empty: {}
len-args: dict expected at most 1 argument, got 2
pairs: dictionary update sequence element #0 has length 3; 2 is required
hashable: unhashable type: 'list'
This should do the trick:
class NewDict:
    def __init__(self, *args, **kwargs):
        self.items = []
        if kwargs:
            self.items = list(kwargs.items())
        elif args:
            if type(args[0]) == list:
                self.items = list(args[0])
            elif type(args[0]) == dict:
                self.items = list(args[0].items())
    def __repr__(self):
        s = "NewDict({"
        for i, (k,v) in enumerate(self.items):
            s += repr(k) + ": " + repr(v)
            if i < len(self.items) - 1:
                s += ", "
        s += "})"
        return s
d1 = NewDict()
d2 = NewDict([('a',1), ('b',2), ('a',2)])
d3 = NewDict({'a':1, 'b':2})
d4 = NewDict(a=1, b=2)
print(d1, d2, d3, d4, sep="\n")
print(d2.items)
Output:
NewDict({})
NewDict({'a': 1, 'b': 2, 'a': 2})
NewDict({'a': 1, 'b': 2})
NewDict({'a': 1, 'b': 2})
[('a', 1), ('b', 2), ('a', 2)]
I'm not saying this is a safe or great way to do this, but how about ...
def getPairs(*args, **kwargs):
    pairs = []
    for a in args:
        if type(a) == dict:
            pairs.extend([[key, val] for key, val in a.items()])
        elif type(a) == list:
            for a2 in a:
                if len(a2) > 1:
                    pairs.append([a2[0], a2[1:]])
    pairs.extend([[key, val] for key, val in kwargs.items()])
    return pairs
please tell me.
Description
I'd like to update the value of a variable of type OrderedDict by using the update method of dict.
However, after calling the update method, the nested value in the updated variable is no longer an OrderedDict, so the output is not what I expect.
Question points:
Is it a bug that the OrderedDict type is lost?
Is there another way to update dict while keeping the type of OrderedDict?
Below is an example of the problem.
from collections import OrderedDict
dic = OrderedDict()
dic['a'] = 1
dic['b'] = OrderedDict()
dic['b']['b1'] = 2
dic['b']['b2'] = 3
dic['b']['b3'] = 4
print(dic)
> OrderedDict([('a', 1), ('b', OrderedDict([('b1', 2), ('b2', 3), ('b3', 4)]))]) # ok
new_dic = {'a': 2, 'b': {'b1': 3, 'b2': 4, 'b3': 5}}
print(new_dic)
> {'a': 2, 'b': {'b1': 3, 'b2': 4, 'b3': 5}}
dic.update(new_dic)
print(dic)
> OrderedDict([('a', 2), ('b', {'b1': 3, 'b2': 4, 'b3': 5})]) # NG: Type has been lost
An update has the effect of a rebinding of the affected keys. What you are doing in short, is:
# ...
dic['b'] = OrderedDict()
# ...
dic['b'] = {'b1': 3, 'b2': 4, 'b3': 5}
# ...
The new value of key 'b' in dic is now a plain dict. You are trying to do a nested update, which is not provided out of the box. You could implement it yourself along the lines of:
def update(d1, d2):
    for k, v in d2.items():
        if k in d1 and isinstance(v, dict) and isinstance(d1[k], dict):
            update(d1[k], v)
        else:
            d1[k] = v
Now you can apply it to your case:
update(dic, new_dic)
# OrderedDict([('a', 2), ('b', OrderedDict([('b1', 3), ('b2', 4), ('b3', 5)]))])
Change this line
new_dic = {'a': 2, 'b': {'b1': 3, 'b2': 4, 'b3': 5}}
to
new_dic = {'a': 2, 'b': OrderedDict([('b1', 3), ('b2', 4), ('b3', 5)])}
It'd be okay!
from collections import OrderedDict
dic = OrderedDict()
dic['a'] = 1
dic['b'] = OrderedDict()
dic['b']['b1'] = 2
dic['b']['b2'] = 3
dic['b']['b3'] = 4
print(dic)
#> OrderedDict([('a', 1), ('b', OrderedDict([('b1', 2), ('b2', 3), ('b3', 4)]))]) # ok
new_dic = {'a': 2, 'b': {'b1': 3, 'b2': 4, 'b3': 5}}
new_dic['b'] = OrderedDict(new_dic['b'])
print(new_dic)
#> {'a': 2, 'b': OrderedDict([('b1', 3), ('b2', 4), ('b3', 5)])}
dic.update(new_dic)
print(dic)
#> OrderedDict([('a', 2), ('b', OrderedDict([('b1', 3), ('b2', 4), ('b3', 5)]))])
I am trying to count occurrences of various items based on a condition. What I have so far is a function that, given pairs of items, increments a counter like this:
given [('a', 'a'), ('a', 'b'), ('b', 'a')], it will output defaultdict(<class 'collections.Counter'>, {'a': Counter({'a': 1, 'b': 1}), 'b': Counter({'a': 1})})
The function can be seen below:
def freq(samples=None):
    out = defaultdict(Counter)
    if samples:
        for (c, s) in samples:
            out[c][s] += 1
    return out
It is limited, though, to pairs, while I would like it to be more generic and work with tuples of any length. For example, [('a', 'a', 'b'), ('a', 'b', 'c'), ('b', 'a', 'a')] should still work, and I should be able to query the result with, say, res['a']['b'] and get the count for 'c', which is one.
What would be the best way to do this in Python?
Assuming all tuples in the list have the same length:
from collections import Counter
from itertools import groupby
from operator import itemgetter
def freq(samples=[]):
    sorted_samples = sorted(samples)
    if sorted_samples and len(sorted_samples[0]) > 2:
        return {key: freq(value[1:] for value in values) for key, values in groupby(sorted_samples, itemgetter(0))}
    else:
        return {key: Counter(value[1] for value in values) for key, values in groupby(sorted_samples, itemgetter(0))}
That gives:
freq([('a', 'a'), ('a', 'b'), ('b', 'a'), ('a', 'c')])
>>> {'a': Counter({'a': 1, 'b': 1, 'c': 1}), 'b': Counter({'a': 1})}
freq([('a', 'a', 'a'), ('a', 'b', 'c'), ('b', 'a', 'a'), ('a', 'c', 'c')])
>>> {'a': {'a': Counter({'a': 1}), 'b': Counter({'c': 1}), 'c': Counter({'c': 1})}, 'b': {'a': Counter({'a': 1})}}
One option is to use the full tuples as keys
def freq(samples=[]):
    out = Counter()
    for sample in samples:
        out[sample] += 1
    return out
which would then return things as
Counter({('a', 'a', 'b'): 1, ('a', 'b', 'c'): 1, ('b', 'a', 'a'): 1})
You could convert the tuples to strings to select certain slices, e.g. "('a', 'b',". For example in a new dictionary {k: v for k,v in out.items() if str(k)[:10] == "('a', 'b',"}.
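As an alternative to the string conversion, the tuple keys can also be compared directly by slicing them. This is only a sketch; the sample data and the prefix variable are assumptions made for illustration:

```python
from collections import Counter

samples = [('a', 'a', 'b'), ('a', 'b', 'c'), ('b', 'a', 'a'), ('a', 'b', 'c')]
out = Counter(samples)

# Keep only the entries whose key starts with the wanted prefix.
prefix = ('a', 'b')
selected = {k: v for k, v in out.items() if k[:len(prefix)] == prefix}
print(selected)  # {('a', 'b', 'c'): 2}
```

Slicing the tuple avoids depending on the exact repr of the keys, which the string comparison does.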
If the groups are indeed either 2 or 3 long, but never both, you can change to:
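from collections import defaultdict
def freq(samples):
    l = len(samples[0])
    if l == 2:
        # two levels of keys: out[a][b] is an integer count
        out = defaultdict(lambda: defaultdict(int))
        for a, b in samples:
            out[a][b] += 1
    elif l == 3:
        # three levels of keys: out[a][b][c] is an integer count
        out = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
        for a, b, c in samples:
            out[a][b][c] += 1
    return out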
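If the tuple length is not known in advance, the same idea could be generalized with a recursive factory. The following is a sketch under the assumption that all tuples in one call have the same length; freq_nested and nested_counter are names invented here:

```python
from collections import Counter, defaultdict

def nested_counter(levels):
    # levels == 0 -> a Counter holding the final counts;
    # otherwise a defaultdict whose values are one level shallower
    if levels == 0:
        return Counter()
    return defaultdict(lambda: nested_counter(levels - 1))

def freq_nested(samples):
    samples = list(samples)
    if not samples:
        return {}
    # every element except the last one is a key on the way down
    out = nested_counter(len(samples[0]) - 1)
    for *path, last in samples:
        node = out
        for key in path:
            node = node[key]
        node[last] += 1
    return out

res = freq_nested([('a', 'a', 'b'), ('a', 'b', 'c'), ('b', 'a', 'a')])
print(res['a']['b']['c'])  # 1
```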
For example, in a race, I have a list of runners and their names in a list ordered from their places, such as ['Bob', 'Charlie', 'Sarah', 'Alex', 'Bob']
I want to create a dictionary with this list such as
{'Bob': [0, 4], 'Charlie': [1], 'Sarah': [2], 'Alex': [3]}
If you only need to create a dictionary with the list elements as keys and the positions of those elements in the list as values, how would you do so?
[A, B, C, A] -> {A: [0, 3] B: [1], C:[2]}
(I'm having trouble figuring this out.)
Thank you. Sorry for the changed output. Thank you very much!
You can use enumerate(). This will iterate through the list, providing you with both the current element and that element's index.
my_list = ['Bob', 'Charlie', 'Sarah']
my_dict = {}
for index, name in enumerate(my_list):
    my_dict[name] = index
EDIT: Since the OP has changed.
To get exactly what you requested, you could use a defaultdict. This will create a dict and you specify what you want the default values to be. So if you go to access a key that does not yet exist, an empty list will automatically be added as the value. This way you can do the following:
from collections import defaultdict
my_list = ['Bob', 'Charlie', 'Sarah', 'Bob']
my_dict = defaultdict(list)
for index, name in enumerate(my_list):
    my_dict[name].append(index)
You can use enumerate() and itertools.groupby() (together with operator.itemgetter):
>>> your_list=['A','B','C','C','A','A','B','D']
>>> l=[(j,i) for i,j in enumerate(your_list,1)]
>>> l
[('A', 1), ('B', 2), ('C', 3), ('C', 4), ('A', 5), ('A', 6), ('B', 7), ('D', 8)]
>>> g=[list(g) for k, g in groupby(sorted(l),itemgetter(0))]
>>> g
[[('A', 1), ('A', 5), ('A', 6)], [('B', 2), ('B', 7)], [('C', 3), ('C', 4)], [('D', 8)]]
>>> z=[zip(*i) for i in g]
>>> z
[[('A', 'A', 'A'), (1, 5, 6)], [('B', 'B'), (2, 7)], [('C', 'C'), (3, 4)], [('D',), (8,)]]
>>> {i[0]:j for i,j in z}
{'A': (1, 5, 6), 'C': (3, 4), 'B': (2, 7), 'D': (8,)}
How about a simple loop to get the desired result:
x = ['Bob', 'Charlie', 'Sarah', 'Alex', 'Bob']
y = {}
for i, name in enumerate(x):
    if name in y:
        y[name].append(i)
    else:
        y[name] = [i]
I would like to create a dictionary from list
>>> list=['a',1,'b',2,'c',3,'d',4]
>>> print list
['a', 1, 'b', 2, 'c', 3, 'd', 4]
I use dict() to produce dictionary from list
but the result is not in sequence as expected.
>>> d = dict(list[i:i+2] for i in range(0, len(list),2))
>>> print d
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
I expect the result to be in sequence as the list.
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
Can you guys please help advise?
Before Python 3.7, dictionaries did not guarantee any order; use collections.OrderedDict if you want the order to be preserved. And instead of using indices, use an iterator.
>>> from collections import OrderedDict
>>> lis = ['a', 1, 'b', 2, 'c', 3, 'd', 4]
>>> it = iter(lis)
>>> OrderedDict((k, next(it)) for k in it)
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
A dictionary is an unordered data structure (insertion order is only guaranteed from Python 3.7 on). To preserve order, use collections.OrderedDict:
>>> lst = ['a',1,'b',2,'c',3,'d',4]
>>> from collections import OrderedDict
>>> OrderedDict(lst[i:i+2] for i in range(0, len(lst),2))
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
You could use the grouper recipe: zip(*[iterable]*n) to collect the items into groups of n:
In [5]: items = ['a',1,'b',2,'c',3,'d',4]
In [6]: items = iter(items)
In [7]: dict(zip(*[items]*2))
Out[7]: {'a': 1, 'b': 2, 'c': 3, 'd': 4}
PS. Never name a variable list, since it shadows the builtin (type) of the same name.
The grouper recipe is easy to use, but a little harder to explain.
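One way to explain it is to split the one-liner into steps. The sketch below (variable names are illustrative) shows that [items]*2 holds the same iterator twice, so zip consumes two consecutive elements for every pair it builds:

```python
items = ['a', 1, 'b', 2, 'c', 3, 'd', 4]
it = iter(items)

# [it] * 2 is a list containing the SAME iterator object twice.
same = [it] * 2
print(same[0] is same[1])  # True

# zip takes one value from each argument in turn; since both arguments
# are the same iterator, each tuple gets two consecutive elements.
pairs = list(zip(*same))
print(pairs)  # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
```

This only works with an iterator. Passing the list itself twice would pair each element with itself, because every zip argument would then be iterated independently.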
Items in a dict are unordered. So if you want the dict items in a certain order, use a collections.OrderedDict (as falsetru already pointed out):
In [13]: collections.OrderedDict(zip(*[items]*2))
Out[13]: OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])