Convert dictionary into list with length based on values - python

I have a dictionary
d = {1: 3, 5: 6, 10: 2}
I want to convert it to a list that holds the keys of the dictionary. Each key should be repeated as many times as its associated value.
I've written this code that does the job:
d = {1: 3, 5: 6, 10: 2}
l = []
for i in d:
    for j in range(d[i]):
        l.append(i)
l.sort()
print(l)
Output:
[1, 1, 1, 5, 5, 5, 5, 5, 5, 10, 10]
But I would like it to be a list comprehension. How can this be done?

You can do it using a list comprehension:
[i for i in d for j in range(d[i])]
yields:
[1, 1, 1, 10, 10, 5, 5, 5, 5, 5, 5]
You can sort it again to get the list you were looking for.

[k for k,v in d.items() for _ in range(v)]
... I guess...
if you want it sorted you can do
[k for k,v in sorted(d.items()) for _ in range(v)]
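As a rough sketch, the nested comprehension can be wrapped in a small helper. The name expand_keys and the sort_keys flag are made up for illustration and do not come from the answers above.

```python
def expand_keys(counts, sort_keys=True):
    """Repeat each key of `counts` as many times as its value (hypothetical helper)."""
    items = sorted(counts.items()) if sort_keys else counts.items()
    # The outer loop walks the (key, count) pairs; the inner loop repeats the key.
    return [key for key, count in items for _ in range(count)]


d = {1: 3, 5: 6, 10: 2}
result = expand_keys(d)
print(result)
```

Sorting the items before expanding is cheaper than sorting the expanded list, because there are usually far fewer keys than output elements.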

One approach is to use itertools.chain to glue sublists together
>>> list(itertools.chain(*[[k]*v for k, v in d.items()]))
[1, 1, 1, 10, 10, 5, 5, 5, 5, 5, 5]
Or if you are dealing with a very large dictionary, then you could avoid constructing the sub lists with itertools.chain.from_iterable and itertools.repeat
>>> list(itertools.chain.from_iterable(itertools.repeat(k, v) for k, v in d.items()))
[1, 1, 1, 10, 10, 5, 5, 5, 5, 5, 5]
Comparative timings on a larger dictionary, against the list comprehension that uses two loops:
>>> d = {i: i for i in range(100)}
>>> %timeit list(itertools.chain.from_iterable(itertools.repeat(k, v) for k, v in d.items()))
10000 loops, best of 3: 55.6 µs per loop
>>> %timeit [k for k, v in d.items() for _ in range(v)]
10000 loops, best of 3: 119 µs per loop
If you want the output sorted, as your example code does with l.sort(), simply presort d.items():
# same as previous examples, but we sort d.items()
list(itertools.chain(*[[k]*v for k, v in sorted(d.items())]))
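For completeness, here is a self-contained sketch of the lazy variant with sorted items. It only combines the calls already shown above; the variable names lazy and result are arbitrary.

```python
import itertools

d = {1: 3, 5: 6, 10: 2}

# repeat(k, v) lazily yields k exactly v times; chain.from_iterable
# concatenates those runs without building intermediate sublists.
lazy = itertools.chain.from_iterable(
    itertools.repeat(k, v) for k, v in sorted(d.items())
)
result = list(lazy)
print(result)
```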

The Counter.elements() method does exactly this:
from collections import Counter
d = {1: 3, 5: 6, 10: 2}
c = Counter(d)
result = list(c.elements())
print(result)
# [1, 1, 1, 5, 5, 5, 5, 5, 5, 10, 10]
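One caveat worth knowing: elements() ignores keys whose count is less than one. The sketch below uses a made-up dictionary with a zero and a negative value to show the effect.

```python
from collections import Counter

# Hypothetical input containing a zero and a negative count.
counts = Counter({1: 3, 5: 0, 10: -2})

# elements() silently skips keys whose count is less than one.
result = list(counts.elements())
print(result)  # [1, 1, 1]
```

With range(v) in the list comprehension the behaviour is the same, since range of a zero or negative number is empty.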

Related

How do I find each duplicate's index in a List in Python?

For example, let's say I have a list:
lst = [1, 2, 3, 3, 4, 3, 5, 6]
Is there any function that returns all the indexes of the 3s? (In this case it would return [2, 3, 5].)
I've changed the list in order to build a dictionary of all items that appeared more than once.
from collections import Counter
lst = [1, 2, 3, 3, 4, 3, 5, 2, 6]
# only values that appear more than once
c = Counter(lst) - Counter(set(lst))
res = {}
for i, elem in enumerate(lst):
    if elem in c:
        item = res.get(elem)
        if item:
            item.append(i)
        else:
            res[elem] = [i]
print(res)
output :
{2: [1, 7], 3: [2, 3, 5]}
A better approach is to use defaultdict :
from collections import defaultdict
lst = [1, 2, 3, 3, 4, 3, 5, 2, 6]
d = defaultdict(list)
for i, elem in enumerate(lst):
    d[elem].append(i)
print({k: v for k, v in d.items() if len(v) > 1})
output :
{2: [1, 7], 3: [2, 3, 5]}
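The defaultdict approach could be packaged as a reusable function along these lines. The name duplicate_indexes is an assumption chosen for this sketch.

```python
from collections import defaultdict


def duplicate_indexes(items):
    """Map each value that occurs more than once to the list of its indexes."""
    positions = defaultdict(list)
    for index, item in enumerate(items):
        positions[item].append(index)
    # Keep only the values seen at two or more positions.
    return {item: idxs for item, idxs in positions.items() if len(idxs) > 1}


result = duplicate_indexes([1, 2, 3, 3, 4, 3, 5, 2, 6])
print(result)
```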
You can simply use enumerate(lst), which yields each element's index together with its value.
For example, lst[0] = 1, so the index is 0 and the value is 1.
You can therefore use enumerate with an if condition to collect the index whenever the value is 3:
lst = [1, 2, 3, 3, 4, 3, 5, 6]
lst_result = [i for i, v in enumerate(lst) if v == 3]
print(lst_result)
The output will be
[2, 3, 5]
I was trying to figure out a solution for a similar problem and this is what I came up with.
def locate_index(item_name, list_name):
    """Determine the indexes of an item if in a list."""
    locations = []
    for i in range(len(list_name)):
        if list_name[i] == item_name:
            # i is already the matching position, so no index() lookup is needed
            locations.append(i)
    print(locations)
Example:
lst = [1, 2, 3, 3, 4, 3, 5, 6]
list2 = [1, 2, 3, 3, 4, 3, 5, 6, 6, 6, 6]
locate_index(3, lst)    # a)
locate_index(6, list2)  # b)
Output
a)
[2, 3, 5]
b)
[7, 8, 9, 10]
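A variant that returns the indexes instead of printing them, and lets list.index jump from one match to the next, might look like the sketch below. The function name locate_indexes is hypothetical.

```python
def locate_indexes(item, items):
    """Return all indexes of `item`, jumping between matches with list.index."""
    locations = []
    start = 0
    while True:
        try:
            # index() raises ValueError once no match remains at or after `start`.
            found = items.index(item, start)
        except ValueError:
            return locations
        locations.append(found)
        start = found + 1


lst = [1, 2, 3, 3, 4, 3, 5, 6]
list2 = [1, 2, 3, 3, 4, 3, 5, 6, 6, 6, 6]
print(locate_indexes(3, lst))
print(locate_indexes(6, list2))
```

Returning the list rather than printing it makes the result usable by the caller.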

Mapping two lists without looping

I have two lists of equal length. The first list l1 contains data.
l1 = [2, 3, 5, 7, 8, 10, ... , 23]
The second list l2 contains the category the data in l1 belongs to:
l2 = [1, 1, 2, 1, 3, 4, ... , 3]
How can I partition the first list based on the categories (1, 2, 3, 4, and so on) given at the same positions in the second list, using a list comprehension or lambda function? For example, 2, 3 and 7 from the first list belong to the same partition because their corresponding values in the second list are all 1.
The number of partitions is known at the beginning.
You can use a dictionary:
>>> l1 = [2, 3, 5, 7, 8, 10, 23]
>>> l2 = [1, 1, 2, 1, 3, 4, 3]
>>> d = {}
>>> for i, j in zip(l1, l2):
...     d.setdefault(j, []).append(i)
...
>>>
>>> d
{1: [2, 3, 7], 2: [5], 3: [8, 23], 4: [10]}
If a dict is fine, I suggest using a defaultdict:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for number, category in zip(l1, l2):
...     d[category].append(number)
...
>>> d
defaultdict(<type 'list'>, {1: [2, 3, 7], 2: [5], 3: [8, 23], 4: [10]})
Consider using itertools.izip for memory efficiency if you are using Python 2.
This is basically the same solution as Kasramvd's, but I think the defaultdict makes it a little easier to read.
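On Python 3 the same idea needs no izip, since zip is already lazy. A minimal sketch follows; the function name partition_by_category is made up for illustration.

```python
from collections import defaultdict


def partition_by_category(data, categories):
    """Group each item of `data` under the category at the same position."""
    groups = defaultdict(list)
    # zip() pairs the two lists element by element and is lazy in Python 3.
    for number, category in zip(data, categories):
        groups[category].append(number)
    return dict(groups)


l1 = [2, 3, 5, 7, 8, 10, 23]
l2 = [1, 1, 2, 1, 3, 4, 3]
partitions = partition_by_category(l1, l2)
# If a list of partitions is wanted, read the groups out in category order.
as_lists = [partitions[c] for c in sorted(partitions)]
print(partitions)
print(as_lists)
```

This makes a single pass over the data, whereas a comprehension that rescans l1 once per category does more work as the number of categories grows.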
This will give a list of partitions using list comprehension :
>>> l1 = [2, 3, 5, 7, 8, 10, 23]
>>> l2 = [1, 1, 2, 1, 3, 4, 3]
>>> [[value for i, value in enumerate(l1) if j == l2[i]] for j in set(l2)]
[[2, 3, 7], [5], [8, 23], [10]]
A nested list comprehension :
[ [ l1[j] for j in range(len(l1)) if l2[j] == i ] for i in range(1, max(l2)+1 )]
If it is reasonable to have your data stored in numpy's ndarrays you can use extended indexing
{i:l1[l2==i] for i in set(l2)}
to construct a dictionary of ndarrays indexed by category code.
There is an overhead associated with l2==i (i.e., building a new Boolean array for each category) that grows with the number of categories, so that you may want to check which alternative, either numpy or defaultdict, is faster with your data.
I tested with n=200000, nc=20, and numpy was faster than defaultdict + izip (124 vs 165 ms), but with nc=10000 numpy was (much) slower (11300 vs 251 ms).
Using some itertools and operator goodies and a sort you can do this in a one liner:
>>> l1 = [2, 3, 5, 7, 8, 10, 23]
>>> l2 = [1, 1, 2, 1, 3, 4, 3]
>>> itertools.groupby(sorted(zip(l2, l1)), operator.itemgetter(0))
The result of this is an itertools.groupby object that can be iterated over:
>>> for g, li in itertools.groupby(sorted(zip(l2, l1)), operator.itemgetter(0)):
...     print(g, list(map(operator.itemgetter(1), li)))
...
1 [2, 3, 7]
2 [5]
3 [8, 23]
4 [10]
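If a dictionary is preferred over printing, the groupby result can be collected as in this sketch. The variable names pairs and grouped are arbitrary.

```python
import itertools
import operator

l1 = [2, 3, 5, 7, 8, 10, 23]
l2 = [1, 1, 2, 1, 3, 4, 3]

# groupby only merges adjacent equal keys, hence the sort on (category, value) pairs.
pairs = sorted(zip(l2, l1))
grouped = {
    category: [value for _, value in group]
    for category, group in itertools.groupby(pairs, key=operator.itemgetter(0))
}
print(grouped)
```

Note that sorting the pairs also sorts the values inside each category, which may differ from their original order in l1.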
This is not a list comprehension but a dictionary comprehension. It resembles @cromod's solution but preserves the "categories" from l2:
{k:[val for i, val in enumerate(l1) if k == l2[i]] for k in set(l2)}
Output:
>>> l1
[2, 3, 5, 7, 8, 10, 23]
>>> l2
[1, 1, 2, 1, 3, 4, 3]
>>> {k:[val for i, val in enumerate(l1) if k == l2[i]] for k in set(l2)}
{1: [2, 3, 7], 2: [5], 3: [8, 23], 4: [10]}
>>>

Create dictionary where keys are from a list and values are the sum of corresponding elements in another list

I have two lists L1 and L2. Each unique element in L1 is a key which has a value in the second list L2. I want to create a dictionary where the values are the sum of elements in L2 that are associated to the same key in L1.
I did the following, but I am not very proud of this code. Is there a simpler, more Pythonic way to do it?
L = [2, 3, 7, 3, 4, 5, 2, 7, 7, 8, 9, 4] # as L1
W = range(len(L)) # as L2
d = { l:[] for l in L }
for l,w in zip(L,W): d[l].append(w)
d = {l:sum(v) for l,v in d.items()}
EDIT:
Q: How do I know which elements of L2 are associated to a given key element of L1?
A: If they have the same index. For example, if the element 7 is repeated 3 times in L1 (e.g. L1[2] == L1[7] == L1[8] == 7), then I want the value of the key 7 to be L2[2]+L2[7]+L2[8].
You can use enumerate() to access each item's index while looping over the list, together with collections.defaultdict (passing int as its default factory, which yields 0 the first time a key is seen) to accumulate the values whenever a key is repeated:
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> for i,j in enumerate(L):
...     d[j] += i
...
>>> d
defaultdict(<type 'int'>, {2: 6, 3: 4, 4: 15, 5: 5, 7: 17, 8: 9, 9: 10})
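The enumerate version works because L2 in the question happens to be the index range. For an arbitrary second list, a sketch using zip could look like this; the function name sum_by_key is hypothetical.

```python
from collections import defaultdict


def sum_by_key(keys, values):
    """Sum the entries of `values` that share the same key at the same position."""
    totals = defaultdict(int)
    for key, value in zip(keys, values):
        totals[key] += value  # missing keys start at int() == 0
    return dict(totals)


L = [2, 3, 7, 3, 4, 5, 2, 7, 7, 8, 9, 4]  # as L1
W = range(len(L))  # as L2
totals = sum_by_key(L, W)
print(totals)
```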
If you don't need the intermediate dict of lists you can use the collections.Counter:
import collections
L = [2, 3, 7, 3, 4, 5, 2, 7, 7, 8, 9, 4] # as L1
W = range(len(L)) # as L2
d2 = collections.Counter()
for i, value in enumerate(L):
    d2[value] += i
which behaves like a normal dict (its repr lists the entries with the largest counts first):
Counter({7: 17, 4: 15, 9: 10, 8: 9, 2: 6, 5: 5, 3: 4})
Hope this may help you.
L = [2, 3, 7, 3, 4, 5, 2, 7, 7, 8, 9, 4] # as L1
dict_a = dict.fromkeys(set(L),0)
for l,w in enumerate(L):
    dict_a[w] = int(dict_a[w]) + l

Pythonic way to convert list of tuples to dict of lists? [duplicate]

I'm looking for a way to convert a list of tuples like this:
[(1,4),(2,4),(3,4),(4,15),(5,15),(6,23),(7,23),(8,23),(9,15),(10,23),(11,15),(12,15)]
into a dictionary like this:
{4:[1,2,3] ,15:[4,5,9,11,12], 23:[6,7,8,10]}
The second element from each tuple becomes a dictionary key, and all the first tuple elements associated with that key are stored in a value list.
Can you show me how that can be done?
>>> from collections import defaultdict
>>> l= [(1,4),(2,4),(3,4),(4,15),(5,15),(6,23),(7,23),(8,23),(9,15),(10,23),(11,15),(12,15)]
>>> d= defaultdict( list )
>>> for v, k in l:
...     d[k].append(v)
...
>>> d
defaultdict(<type 'list'>, {23: [6, 7, 8, 10], 4: [1, 2, 3], 15: [4, 5, 9, 11, 12]})
>>> [ {k:d[k]} for k in sorted(d) ]
[{4: [1, 2, 3]}, {15: [4, 5, 9, 11, 12]}, {23: [6, 7, 8, 10]}]
>>> a = [(1,4),(2,4),(3,4),(4,15),(5,15),(6,23),(7,23),(8,23),(9,15),(10,23),(11,15),(12,15)]
>>> b = {}
>>> for i, j in a:
...     b.setdefault(j, []).append(i)
...
>>> b
{23: [6, 7, 8, 10], 4: [1, 2, 3], 15: [4, 5, 9, 11, 12]}
>>>
tuples = [(1,4),(2,4),(3,4),(4,15),(5,15),(6,23),(7,23),(8,23),(9,15),(10,23),(11,15),(12,15)]
dicts = {}
for elem in tuples:
    try:
        dicts[elem[1]].append(elem[0])
    except KeyError:
        dicts[elem[1]] = [elem[0],]
l = [(1,4),(2,4),(3,4),(4,15),(5,15),(6,23),(7,23),(8,23),(9,15),(10,23),(11,15),(12,15)]
d = {}
for v, k in l:
    d.setdefault(k, []).append(v)
This will do:
from collections import defaultdict
def to_list_of_dicts(list_of_tuples):
    d = defaultdict(list)
    for x, y in list_of_tuples:
        d[y].append(x)
    # dicts cannot be compared in Python 3, so sort the keys rather than the dicts
    return [{k: d[k]} for k in sorted(d)]
It's not fancy but it is simple
l = [(1,4),(2,4),(3,4),(4,15),(5,15),(6,23),(7,23),(8,23),(9,15),(10,23),(11,15),(12,15)]
d = dict((k, [i[0] for i in l if i[1] == k]) for k in frozenset(j[1] for j in l))
Huzzah!
d = {}
# the second tuple element is the key, as in the question
for value, key in tuples:
    if d.get(key):
        d[key].append(value)
        continue
    d[key] = [value]
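To get a plain dict whose keys come out in ascending order, as in the desired output of the question, something like the following sketch could be used. The function name group_by_second is an assumption, and the key ordering relies on dicts preserving insertion order (Python 3.7+).

```python
from collections import defaultdict


def group_by_second(pairs):
    """Group the first element of each pair under its second element."""
    grouped = defaultdict(list)
    for first, second in pairs:
        grouped[second].append(first)
    # Rebuild as a plain dict with keys inserted in ascending order.
    return {key: grouped[key] for key in sorted(grouped)}


tuples = [(1, 4), (2, 4), (3, 4), (4, 15), (5, 15), (6, 23), (7, 23),
          (8, 23), (9, 15), (10, 23), (11, 15), (12, 15)]
result = group_by_second(tuples)
print(result)
```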

Slice list to ordered chunks

I have a dictionary like:
item_count_per_section = {1: 3, 2: 5, 3: 2, 4: 2}
And total count of items retrieved from this dictionary:
total_items = range(sum(item_count_per_section.values()))
Now I want to transform total_items according to the values of the dictionary, in the following way:
items_no_per_section = {1: [0,1,2], 2: [3,4,5,6,7], 3:[8,9], 4:[10,11] }
I.e. slice total_items sequentially into sublists, where each sublist starts at the index where the previous one ended and its length is the corresponding value from the initial dictionary.
You don't need to find total_items at all. You can straightaway use itertools.count, itertools.islice and dictionary comprehension, like this
from itertools import count, islice
item_count_per_section, counter = {1: 3, 2: 5, 3: 2, 4: 2}, count()
print({k: list(islice(counter, v)) for k, v in item_count_per_section.items()})
Output
{1: [0, 1, 2], 2: [3, 4, 5, 6, 7], 3: [8, 9], 4: [10, 11]}
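The comprehension hands out numbers in whatever order the dictionary is iterated. If the sections must be numbered in key order regardless of how the dictionary was built, one option is to sort the items first, as in this sketch (the out-of-order input is made up for illustration):

```python
from itertools import count, islice

# Hypothetical input whose keys were inserted out of order.
item_count_per_section = {3: 2, 1: 3, 4: 2, 2: 5}

counter = count()
# Each islice call consumes the next `size` numbers from the shared counter.
items_no_per_section = {
    section: list(islice(counter, size))
    for section, size in sorted(item_count_per_section.items())
}
print(items_no_per_section)
```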
A dict comprehension that applies itertools.islice to an iterator over total_items:
from itertools import islice
item_count_per_section = {1: 3, 2: 5, 3: 2, 4: 2}
total_items = range(sum(item_count_per_section.values()))
i = iter(total_items)
{key: list(islice(i, value)) for key, value in item_count_per_section.items()}
Outputs:
{1: [0, 1, 2], 2: [3, 4, 5, 6, 7], 3: [8, 9], 4: [10, 11]}
Note: this works for any total_items, not just range(sum(values)), assuming that was just your sample to keep the question generic. If you do just want the numbers, go with @thefourtheye's answer.
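To illustrate the note, here is a sketch with a non-numeric total_items. The letter data is invented for the example; everything else follows the answer above.

```python
from itertools import islice

item_count_per_section = {1: 3, 2: 5, 3: 2, 4: 2}
total_items = list("abcdefghijkl")  # hypothetical data: 12 items, matching sum(values)

it = iter(total_items)
# Every islice call resumes where the previous one stopped on the shared iterator.
chunks = {key: list(islice(it, value)) for key, value in item_count_per_section.items()}
print(chunks)
```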
