create dictionary from list - in sequence - python

I would like to create a dictionary from list
>>> list=['a',1,'b',2,'c',3,'d',4]
>>> print list
['a', 1, 'b', 2, 'c', 3, 'd', 4]
I use dict() to produce dictionary from list
but the result is not in sequence as expected.
>>> d = dict(list[i:i+2] for i in range(0, len(list),2))
>>> print d
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
I expect the result to be in sequence as the list.
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
Can you guys please help advise?

Dictionaries don't have any guaranteed order (before Python 3.7); use collections.OrderedDict if you want the order to be preserved. And instead of using indices, use an iterator.
>>> from collections import OrderedDict
>>> lis = ['a', 1, 'b', 2, 'c', 3, 'd', 4]
>>> it = iter(lis)
>>> OrderedDict((k, next(it)) for k in it)
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])

A dictionary is an unordered data structure. To preserve order, use collections.OrderedDict:
>>> lst = ['a',1,'b',2,'c',3,'d',4]
>>> from collections import OrderedDict
>>> OrderedDict(lst[i:i+2] for i in range(0, len(lst),2))
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])

You could use the grouper recipe, zip(*[iter(iterable)]*n), to collect the items into groups of n:
In [5]: items = ['a',1,'b',2,'c',3,'d',4]
In [6]: items = iter(items)
In [7]: dict(zip(*[items]*2))
Out[7]: {'a': 1, 'b': 2, 'c': 3, 'd': 4}
PS. Never name a variable list, since it shadows the builtin (type) of the same name.
The grouper recipe is easy to use, but a little harder to explain.
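One way to see why the recipe works: [it] * n is a list holding the same iterator object n times, so each tuple that zip builds pulls n consecutive items from that single iterator. The sketch below is illustrative only; the helper name grouper and the sample data are assumptions, not part of the original answer.

```python
def grouper(iterable, n):
    """Collect items into tuples of length n; leftover items are dropped."""
    it = iter(iterable)       # one shared iterator
    return zip(*[it] * n)     # zip advances that same iterator n times per tuple

flat = ['a', 1, 'b', 2, 'c', 3, 'd', 4]
pairs = list(grouper(flat, 2))
triples = list(grouper(range(7), 3))  # the trailing 6 does not fill a group
```

Because zip stops at the shortest input, an incomplete final group is silently discarded.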
Items in a dict are unordered. So if you want the dict items in a certain order, use a collections.OrderedDict (as falsetru already pointed out):
In [13]: collections.OrderedDict(zip(*[items]*2))
Out[13]: OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
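For readers on Python 3.7 or later, where a plain dict keeps insertion order as a language guarantee, a slicing variant may be enough. This is a minimal sketch under that version assumption; the variable names are illustrative.

```python
flat = ['a', 1, 'b', 2, 'c', 3, 'd', 4]

# Even positions are the keys, odd positions are the values.
result = dict(zip(flat[::2], flat[1::2]))

# On Python 3.7+ the items come back in the order they were inserted.
ordered_items = list(result.items())
```

On older interpreters the same expression still builds the right mapping, but the iteration order is not guaranteed, so OrderedDict remains the safer choice there.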

Related

How to merge 2 ordered dictionaries in python?

I have 2 ordered dictionaries, like:
a = collections.OrderedDict()
b = collections.OrderedDict()
And they have stuff in them. How do I merge these 2? I tried:
mergeDict = dict(a.items() + b.items())
but doing this it's not a ordered dictionary anymore.
What I am looking for: if a = {1, 2, 5, 6} and b = {0, 7, 3, 9} then mergeDict = {1, 2, 5, 6, 0, 7, 3, 9}
Two ways (assuming Python 3.6):
Use "update method". Suppose there are two dictionaries:
>>> d1 = collections.OrderedDict([('a', 1), ('b', 2)])
>>> d2 = {'c': 3, 'd': 4}
>>> d1.update(d2)
>>> d1
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
Second method using 'concatenation operator (+)'
>>> d1 = collections.OrderedDict([('a', 1), ('b', 2)])
>>> d2 = {'c': 3, 'd': 4}
>>> d3 = collections.OrderedDict(list(d1.items()) + list(d2.items()))
>>> d3
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
from itertools import chain
from collections import OrderedDict
OrderedDict(chain(a.items(), b.items()))
CPython 3.6 and any 3.7+ interpreter already preserve dictionary key ordering.
This allows you to use the {**a, **b} syntax.
For example:
>>> a = {1: "AA", 2: "BB", 5: "CC", 6: "DD"}
>>> b = {0: "EE", 7: "FF", 3: "GG", 9: "HH"}
>>> {**a, **b}
{1: 'AA', 2: 'BB', 5: 'CC', 6: 'DD', 0: 'EE', 7: 'FF', 3: 'GG', 9: 'HH'}
Instead of dict, use OrderedDict for mergeDict:
mergeDict = collections.OrderedDict(a.items() + b.items())
Remark: this only works in Python 2.x. In Python 3, wrap each dict.items() call in list(), because dict.items() no longer returns a list that supports the + operation.
Or use a.update(b), as @alfasin mentions in a comment.
I tried both methods with a simple example and they work well for me.
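One detail the answers above do not show is what happens when both dictionaries share a key. The sketch below is an illustration with made-up data and a hypothetical helper name (merge_ordered): it copies the first OrderedDict so the inputs stay untouched, then updates the copy.

```python
from collections import OrderedDict

def merge_ordered(first, second):
    """Return a new OrderedDict; neither input is modified."""
    merged = OrderedDict(first)   # shallow copy keeps first's key order
    merged.update(second)         # new keys are appended, shared keys are overwritten in place
    return merged

a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 20), ('z', 3)])
merged = merge_ordered(a, b)
```

A shared key keeps the position it had in the first dictionary but takes its value from the second.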

Most Pythonic way for creating a defaultdictionary counter

I am trying to count occurrences of various items based on condition. What I have until now is this function that given two items will increase the counter like this:
Given [('a', 'a'), ('a', 'b'), ('b', 'a')], it will output defaultdict(<class 'collections.Counter'>, {'a': Counter({'a': 1, 'b': 1}), 'b': Counter({'a': 1})}).
The function can be seen below:
from collections import Counter, defaultdict
def freq(samples=None):
    out = defaultdict(Counter)
    if samples:
        for (c, s) in samples:
            out[c][s] += 1
    return out
It is limited, though, to only work with pairs, while I would like it to be more generic and work with tuples of any length, e.g., [('a', 'a', 'b'), ('a', 'b', 'c'), ('b', 'a', 'a')] would still work, and I would be able to query the result for, let's say, res['a']['b'] and get the count for 'c', which is one.
What would be the best way to do this in Python?
Assuming all tuples in the list have the same length:
from collections import Counter
from itertools import groupby
from operator import itemgetter
def freq(samples=[]):
    sorted_samples = sorted(samples)
    if sorted_samples and len(sorted_samples[0]) > 2:
        return {key: freq(value[1:] for value in values) for key, values in groupby(sorted_samples, itemgetter(0))}
    else:
        return {key: Counter(value[1] for value in values) for key, values in groupby(sorted_samples, itemgetter(0))}
That gives:
freq([('a', 'a'), ('a', 'b'), ('b', 'a'), ('a', 'c')])
>>> {'a': Counter({'a': 1, 'b': 1, 'c': 1}), 'b': Counter({'a': 1})}
freq([('a', 'a', 'a'), ('a', 'b', 'c'), ('b', 'a', 'a'), ('a', 'c', 'c')])
>>> {'a': {'a': Counter({'a': 1}), 'b': Counter({'c': 1}), 'c': Counter({'c': 1})}, 'b': {'a': Counter({'a': 1})}}
One option is to use the full tuples as keys
def freq(samples=[]):
    out = Counter()
    for sample in samples:
        out[sample] += 1
    return out
which would then return things as
Counter({('a', 'a', 'b'): 1, ('a', 'b', 'c'): 1, ('b', 'a', 'a'): 1})
You could convert the tuples to strings to select certain slices, e.g. "('a', 'b',". For example in a new dictionary {k: v for k,v in out.items() if str(k)[:10] == "('a', 'b',"}.
If the groups are indeed either 2 or 3 long, but never both, you can change to:
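def freq(samples):
    l = len(samples[0])
    if l == 2:
        # Two levels: out[a][b] is an int counter.
        out = defaultdict(lambda: defaultdict(lambda: 0))
        for a, b in samples:
            out[a][b] += 1
    elif l == 3:
        # Three levels: out[a][b][c] is an int counter.
        out = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: 0)))
        for a, b, c in samples:
            out[a][b][c] += 1
    return out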
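The fixed-length branches above can be generalized to tuples of any length by building the nesting recursively. This is only a sketch under the same assumption that all tuples have equal length (at least 2); the names nested_counter and freq_nested are hypothetical.

```python
from collections import Counter, defaultdict

def nested_counter(levels):
    """Build `levels` nested mappings; the innermost one is a Counter."""
    if levels <= 1:
        return Counter()
    return defaultdict(lambda: nested_counter(levels - 1))

def freq_nested(samples):
    samples = list(samples)
    if not samples:
        return Counter()
    out = nested_counter(len(samples[0]))
    for sample in samples:
        node = out
        for key in sample[:-1]:   # walk down through every element but the last
            node = node[key]
        node[sample[-1]] += 1     # the last element is counted in the innermost Counter
    return out

res3 = freq_nested([('a', 'a', 'b'), ('a', 'b', 'c'), ('b', 'a', 'a')])
res2 = freq_nested([('a', 'a'), ('a', 'b'), ('b', 'a')])
```

With this layout, res3['a']['b']['c'] gives the count for that exact path, and the two-element case keeps the same shape as the original defaultdict(Counter).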

How to sort keys of dict by values?

I have a dict {'a': 2, 'b': 0, 'c': 1}.
Need to sort keys by values so that I can get a list ['b', 'c', 'a']
Is there any easy way to do this?
sorted_keys = sorted(my_dict, key=my_dict.get)
>>> d={'a': 2, 'b': 0, 'c': 1}
>>> [i[0] for i in sorted(d.items(), key=lambda x:x[1])]
['b', 'c', 'a']
try this:
import operator
lst1 = sorted(lst.items(), key=operator.itemgetter(1))
There's a simple way to do it.
You can use .items() to get key-value and use sorted to sort them accordingly.
dictionary = sorted(dictionary.items(),key=lambda x:x[1])
>>> d = {'a':2, 'b':0, 'c':1}
>>> sor = sorted(d.items(), key=lambda x: x[1])
>>> sor
[('b', 0), ('c', 1), ('a', 2)]
>>> for i in sor:
...     print(i[0])
...
b
c
a
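A short sketch of how the key=d.get approach behaves with tied values and with descending order. The data is made up, and the tie order shown relies on Python 3.7+ dictionaries iterating in insertion order.

```python
scores = {'a': 2, 'b': 0, 'c': 1, 'd': 0}

# sorted() is stable, so keys with equal values keep their dictionary order.
ascending = sorted(scores, key=scores.get)

# reverse=True flips the value order but still keeps tied keys in their original order.
descending = sorted(scores, key=scores.get, reverse=True)
```

Iterating a dict yields its keys, which is why no call to .items() is needed when only the keys are wanted.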

How to return a list as a dictionary, with the placement of the lists' variables as the dictionary's values?

For example, in a race, I have a list of runners and their names in a list ordered from their places, such as ['Bob', 'Charlie', 'Sarah', 'Alex', 'Bob']
I want to create a dictionary with this list such as
{'Bob': [0, 4], 'Charlie': [1], 'Sarah': [2], 'Alex': [3]}
If you only need to create a dictionary with the list variables as the dictionary keys and the positions of the lists' variables as the dictionary values, how would you do so?
[A, B, C, A] -> {A: [0, 3], B: [1], C: [2]}
(I'm having trouble figuring this out.)
Thank you. Sorry for the changed output. Thank you very much!
You can use enumerate(). This will iterate through the list, providing you with both the current element and that element's index.
my_list = ['Bob', 'Charlie', 'Sarah']
my_dict = {}
for index, name in enumerate(my_list):
    my_dict[name] = index
EDIT: Since the OP has changed.
To get exactly what you requested, you could use a defaultdict. This will create a dict and you specify what you want the default values to be. So if you go to access a key that does not yet exist, an empty list will automatically be added as the value. This way you can do the following:
from collections import defaultdict
my_list = ['Bob', 'Charlie', 'Sarah', 'Bob']
my_dict = defaultdict(list)
for index, name in enumerate(my_list):
    my_dict[name].append(index)
You can use enumerate() and itertools.groupby():
>>> from itertools import groupby
>>> from operator import itemgetter
>>> your_list=['A','B','C','C','A','A','B','D']
>>> l=[(j,i) for i,j in enumerate(your_list,1)]
>>> l
[('A', 1), ('B', 2), ('C', 3), ('C', 4), ('A', 5), ('A', 6), ('B', 7), ('D', 8)]
>>> g=[list(g) for k, g in groupby(sorted(l),itemgetter(0))]
>>> g
[[('A', 1), ('A', 5), ('A', 6)], [('B', 2), ('B', 7)], [('C', 3), ('C', 4)], [('D', 8)]]
>>> z=[zip(*i) for i in g]
>>> z
[[('A', 'A', 'A'), (1, 5, 6)], [('B', 'B'), (2, 7)], [('C', 'C'), (3, 4)], [('D',), (8,)]]
>>> {i[0]:j for i,j in z}
{'A': (1, 5, 6), 'C': (3, 4), 'B': (2, 7), 'D': (8,)}
How about a simple loop to get the desired result:
x = ['Bob', 'Charlie', 'Sarah', 'Alex', 'Bob']
y = {}
for i, name in enumerate(x):
    if name in y:
        y[name].append(i)
    else:
        y[name] = [i]

One-step initialization of defaultdict that appends to list?

It would be convenient if a defaultdict could be initialized along the following lines
d = defaultdict(list, (('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2),
('b', 3)))
to produce
defaultdict(<type 'list'>, {'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]})
Instead, I get
defaultdict(<type 'list'>, {'a': 2, 'c': 3, 'b': 3, 'd': 4})
To get what I need, I end up having to do this:
d = defaultdict(list)
for x, y in (('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)):
    d[x].append(y)
This is IMO one step more than should be necessary, am I missing something here?
What you're apparently missing is that defaultdict is a straightforward (not especially "magical") subclass of dict. All the first argument does is provide a factory function for missing keys. When you initialize a defaultdict, you're initializing a dict.
If you want to produce
defaultdict(<type 'list'>, {'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]})
you should be initializing it the way you would initialize any other dict whose values are lists:
d = defaultdict(list, (('a', [1, 2]), ('b', [2, 3]), ('c', [3]), ('d', [4])))
If your initial data has to be in the form of tuples whose 2nd element is always an integer, then just go with the for loop. You call it one extra step; I call it the clear and obvious way to do it.
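To illustrate the point that the extra constructor arguments are handled exactly as dict would handle them, here is a small sketch with made-up pairs. The factory only comes into play when a missing key is looked up.

```python
from collections import defaultdict

pairs = [('a', 1), ('b', 2), ('a', 2)]

# The pairs go straight to dict's initializer: a repeated key keeps the last value.
d = defaultdict(list, pairs)
last_a = d['a']

# The factory (list) is only called for keys that are not present yet.
missing = d['not_there']

# Grouping therefore needs the explicit loop.
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)
```

Note that looking up 'not_there' also inserts it into d with an empty list as its value.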
The behavior you describe would not be consistent with defaultdict's other behaviors. It seems like what you want is a FooDict such that
>>> f = FooDict()
>>> f['a'] = 1
>>> f['a'] = 2
>>> f['a']
[1, 2]
We can do that, but not with defaultdict; let's call it AppendDict:
from collections import defaultdict
from collections.abc import MutableMapping  # collections.MutableMapping was removed in Python 3.10
class AppendDict(MutableMapping):
    def __init__(self, container=list, append=None, pairs=()):
        # Every missing key starts out as an empty container.
        self.container = defaultdict(container)
        self.append = append or list.append
        for key, value in pairs:
            self[key] = value
    def __setitem__(self, key, value):
        # Assignment appends to the key's container instead of overwriting it.
        self.append(self.container[key], value)
    def __getitem__(self, key): return self.container[key]
    def __delitem__(self, key): del self.container[key]
    def __iter__(self): return iter(self.container)
    def __len__(self): return len(self.container)
Sorting and itertools.groupby go a long way:
>>> L = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)]
>>> L.sort(key=lambda t:t[0])
>>> d = defaultdict(list, [(tup[0], [t[1] for t in tup[1]]) for tup in itertools.groupby(L, key=lambda t: t[0])])
>>> d
defaultdict(<type 'list'>, {'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]})
To make this more of a one-liner:
L = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)]
d = defaultdict(list, [(tup[0], [t[1] for t in tup[1]]) for tup in itertools.groupby(sorted(L, key=operator.itemgetter(0)), key=lambda t: t[0])])
Hope this helps
I think most of this is a lot of smoke and mirrors to avoid a simple for loop:
di={}
for k,v in [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2),('b', 3)]:
    di.setdefault(k,[]).append(v)
# di={'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]}
If your goal is one line and you want abusive syntax that I cannot at all endorse or support you can use a side effect comprehension:
>>> li=[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2),('b', 3)]
>>> di={};{di.setdefault(k[0],[]).append(k[1]) for k in li}
set([None])
>>> di
{'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]}
If you really want to go overboard into the unreadable:
>>> {k1:[e for _,e in v1] for k1,v1 in {k:filter(lambda x: x[0]==k,li) for k,v in li}.items()}
{'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]}
You don't want to do that. Use the for loop Luke!
>>> kvs = [(1,2), (2,3), (1,3)]
>>> reduce(
... lambda d,(k,v): d[k].append(v) or d,
... kvs,
... defaultdict(list))
defaultdict(<type 'list'>, {1: [2, 3], 2: [3]})
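The reduce snippet above relies on tuple parameters in a lambda, which Python 3 no longer accepts, and reduce itself moved to functools. A possible Python 3 equivalent, offered as a sketch, is below; the helper name collect is hypothetical.

```python
from collections import defaultdict
from functools import reduce

def collect(acc, pair):
    """Append the pair's value under its key and hand the accumulator back."""
    key, value = pair
    acc[key].append(value)
    return acc

kvs = [(1, 2), (2, 3), (1, 3)]
grouped = reduce(collect, kvs, defaultdict(list))
```

A named function replaces the lambda here because the unpacking has to happen in the body rather than in the parameter list.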
