pythonic way to reverse a dict where values are lists? - python

I have a dictionary that looks something like this:
letters_by_number = {
1: ['a', 'b', 'c', 'd'],
2: ['b', 'd'],
3: ['a', 'c'],
4: ['a', 'd'],
5: ['b', 'c']
}
I want to reverse it to look something like this:
numbers_by_letter = {
'a': [1, 3, 4],
'b': [1, 2, 5],
'c': [1, 3, 5],
'd': [1, 2, 4]
}
I know that I could do this by looping over the (key, value) pairs of letters_by_number, looping over each value (which is a list), and appending the key to a list in the new dictionary under each element. This is cumbersome, and I feel like there must be a more "pythonic" way to do this. Any suggestions?

This is well-suited for collections.defaultdict:
>>> from collections import defaultdict
>>> numbers_by_letter = defaultdict(list)
>>> for k, seq in letters_by_number.items():
...     for letter in seq:
...         numbers_by_letter[letter].append(k)
...
>>> dict(numbers_by_letter)
{'a': [1, 3, 4], 'b': [1, 2, 5], 'c': [1, 3, 5], 'd': [1, 2, 4]}
Note that you don't really need the final dict() call (a defaultdict will already give you the behavior you probably want), but I included it here because the result from your question is type dict.
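To illustrate the note about the final dict() call, here is a minimal sketch. The helper name invert and the shortened input are made up for illustration. A defaultdict inserts an entry whenever a missing key is looked up with square brackets, which a plain dict does not do, so the conversion can matter if later code probes for absent keys.

```python
from collections import defaultdict

def invert(mapping):
    """Hypothetical helper: map each list element back to the keys whose lists contain it."""
    inverted = defaultdict(list)
    for key, items in mapping.items():
        for item in items:
            inverted[item].append(key)
    return inverted

letters_by_number = {1: ['a', 'b'], 2: ['b']}  # shortened example input
as_defaultdict = invert(letters_by_number)
as_dict = dict(as_defaultdict)  # plain-dict snapshot taken before any stray lookups

# Looking up a missing key on the defaultdict silently inserts an empty list...
as_defaultdict['z']
# ...while the plain dict copy is unaffected and would raise KeyError for as_dict['z'].
print('z' in as_defaultdict, 'z' in as_dict)
```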

Use setdefault:
letters_by_number = {
1: ['a', 'b', 'c', 'd'],
2: ['b', 'd'],
3: ['a', 'c'],
4: ['a', 'd'],
5: ['b', 'c']
}
inv = {}
for k, vs in letters_by_number.items():
    for v in vs:
        inv.setdefault(v, []).append(k)
print(inv)
Output
{'a': [1, 3, 4], 'b': [1, 2, 5], 'c': [1, 3, 5], 'd': [1, 2, 4]}

A (trivial) subclass of dict would make this very easy:
class ListDict(dict):
    def __missing__(self, key):
        value = self[key] = []
        return value
letters_by_number = {
1: ['a', 'b', 'c', 'd'],
2: ['b', 'd'],
3: ['a', 'c'],
4: ['a', 'd'],
5: ['b', 'c']
}
numbers_by_letter = ListDict()
for key, values in letters_by_number.items():
    for value in values:
        numbers_by_letter[value].append(key)
from pprint import pprint
pprint(numbers_by_letter, width=40)
Output:
{'a': [1, 3, 4],
 'b': [1, 2, 5],
 'c': [1, 3, 5],
 'd': [1, 2, 4]}
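One detail worth knowing about the subclass approach: __missing__ is only called for square-bracket lookups, so get() and membership tests do not create entries. A small sketch, reusing the ListDict class from this answer with made-up keys:

```python
class ListDict(dict):
    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent
        value = self[key] = []
        return value

d = ListDict()
d['a'].append(1)      # d['a'] triggers __missing__, which stores a new list
probe = d.get('b')    # get() bypasses __missing__ and returns None
print(d, probe, 'b' in d)
```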

Here's a solution using a dict comprehension, without adding list elements in a loop. Build a set of keys by joining all the lists together, then build each list using a list comprehension. To be more efficient, I've first built another dictionary containing sets instead of lists, so that k in v is an O(1) operation.
from itertools import chain
def invert_dict_of_lists(d):
    d = {i: set(v) for i, v in d.items()}
    return {
        k: [i for i, v in d.items() if k in v]
        for k in set(chain.from_iterable(d.values()))
    }
Strictly, dictionaries in modern versions of Python 3 retain the order that keys are inserted in. Here the keys are inserted in the iteration order of a set, which is arbitrary, so they will not necessarily come out in alphabetical order like in your example. If you do want the keys in sorted order, change for k in set(...) to for k in sorted(set(...)).
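For reference, a sketch of the sorted variant applied to the data from the question. The function name invert_dict_of_lists_sorted is hypothetical; the logic is the same as in the function above, with sorted() added.

```python
from itertools import chain

def invert_dict_of_lists_sorted(d):
    # Hypothetical variant: same logic as invert_dict_of_lists, keys emitted in sorted order
    as_sets = {i: set(v) for i, v in d.items()}
    return {
        k: [i for i, v in as_sets.items() if k in v]
        for k in sorted(set(chain.from_iterable(as_sets.values())))
    }

letters_by_number = {
    1: ['a', 'b', 'c', 'd'],
    2: ['b', 'd'],
    3: ['a', 'c'],
    4: ['a', 'd'],
    5: ['b', 'c'],
}
numbers_by_letter = invert_dict_of_lists_sorted(letters_by_number)
print(numbers_by_letter)
```

Note that this approach scans every input list once per distinct letter, so the loop-based answers are likely to scale better on large inputs.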

Related

Create multiple dictionaries from lists partitions

So I have a list:
[ 1, 2, 3, 4, 5 ]
And two lists of the form
['A', 'B', 'C'] [ 'D', 'E']
whose lengths sum to the length of the original list (i.e. they partition it). How can I obtain the following dictionaries in Python:
{'A': 1, 'B': 2, 'C': 3 } {'D': 4, 'E': 5}
Thanks
You can use next with iter:
values = [ 1, 2, 3, 4, 5 ]
lists = [['A', 'B', 'C'], ['D', 'E']]
itr = iter(values)
result = [{key: next(itr) for key in lst} for lst in lists]
Output:
[{'A': 1, 'B': 2, 'C': 3}, {'D': 4, 'E': 5}]
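The trick above works because every comprehension pulls from the same iterator, so each list of keys picks up where the previous one stopped. An alternative sketch (not the answerer's method) that makes the chunk sizes explicit with itertools.islice:

```python
from itertools import islice

values = [1, 2, 3, 4, 5]
lists = [['A', 'B', 'C'], ['D', 'E']]

itr = iter(values)  # one shared iterator, consumed chunk by chunk
# For each key list, take exactly len(keys) values from the shared iterator
result = [dict(zip(keys, islice(itr, len(keys)))) for keys in lists]
print(result)
```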

Creating a dictionary of only the max common pairing of groups

I would like to create a dictionary of the max common pairings, i.e. an "agreement" table. Is it possible to shorten the code a bit when finding the agreement? Right now I don't like having to find the max count and then match on that count to find the "agreement".
import pandas as pd
from collections import defaultdict
df = pd.DataFrame({
'id': ['A', 'A', 'B', 'B', 'B', 'B'],
'value': [1, 1, 2, 2, 1, 2]})
df = df.groupby(["id","value"]).size().reset_index().rename(columns={0: "count"})
df["max_rank"] = df.groupby(["id"])["count"].transform("max")==df["count"]
df = df.loc[(df["max_rank"]==True)]
d = defaultdict(list)
for idx, row in df.iterrows():
    d[row['id']].append(row['value'])
d = [{k: v} for k, v in d.items()]
d
output:
[{'A': [1]}, {'B': [2]}]
You can build a dict that maps each id to a list of values, and then use the collections.Counter.most_common method to obtain the most common value for each id:
from collections import Counter
d = {'id': ['A', 'A', 'B', 'B', 'B', 'B'], 'value': [1, 1, 2, 2, 1, 2]}
mapping = {}
for k, v in zip(d['id'], d['value']):
mapping.setdefault(k, []).append(v)
print({k: Counter(l).most_common(1)[0][0] for k, l in mapping.items()})
This outputs:
{'A': 1, 'B': 2}
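The question's own output keeps a list per id, which suggests ties may matter, whereas most_common(1) returns a single value. A hedged sketch that keeps every value tied for the top count; the 'C' rows are made-up data added only to show a tie.

```python
from collections import Counter, defaultdict

ids = ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C']
values = [1, 1, 2, 2, 1, 2, 7, 8]  # the 'C' pairs are invented to demonstrate a tie

# One Counter per id, counting how often each value occurs
counts = defaultdict(Counter)
for key, value in zip(ids, values):
    counts[key][value] += 1

# Keep every value whose count equals the maximum for that id
agreement = {}
for key, counter in counts.items():
    top = max(counter.values())
    agreement[key] = [v for v, n in counter.items() if n == top]
print(agreement)
```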

how to identify relationship/mapping between the two list in python?

I have created two lists.
list1 = ['a', 'b', 'c', 'a', 'd']
list2 = [1, 2, 3, 4, 5]
I want to find the relationship between these two lists based on index position, i.e.:
In list1, a is repeated twice, at indexes 0 and 3. In list2, the values at indexes 0 and 3 are 1 and 4, so the relation is one-to-many: a: {1, 4}.
Next, b is not repeated in list1; its index is 1, and the value at index 1 of list2 is 2, so the relation is one-to-one: b: {2}.
My expected output is {a: {1, 4}, b: {2}, c: {3}, d: {5}}
I'd use a defaultdict:
from collections import defaultdict
list1 = ['a', 'b', 'c', 'a', 'd']
list2 = [1, 2, 3, 4, 5]
result = defaultdict(set)
for value1, value2 in zip(list1, list2):
    result[value1].add(value2)
print(dict(result))
outputs
{'a': {1, 4}, 'b': {2}, 'c': {3}, 'd': {5}}
You can use a combination of dictionary and list comprehension to do this:
{x: [list2[i] for i, j in enumerate(list1) if j == x] for x in list1}
output:
{'a': [1, 4], 'b': [2], 'c': [3], 'd': [5]}
a = ['a', 'b', 'c', 'a', 'd']
b = [1, 2, 3, 4, 5]
ret = {}
for idx, _a in enumerate(a):
    # setdefault returns the existing list, or stores and returns a new one
    ret.setdefault(_a, []).append(b[idx])
ret will then hold the output.
Another option is to zip the two lists:
L = list(zip(list1, list2))
Result:
[('a', 1), ('b', 2), ('c', 3), ('a', 4), ('d', 5)]
Use it to create a dictionary with sets as values:
D = {}
for key in L:
    if key[0] not in D:
        D[key[0]] = {key[1]}
    else:
        D[key[0]].add(key[1])
I would not do it this way in real code, but this approach is mildly entertaining and perhaps educational.
from itertools import groupby
from operator import itemgetter
xs = ['a', 'b', 'c', 'a', 'd']
ys = [1, 2, 3, 4, 5]
d = {
    x: set(y for _, y in group)
    for x, group in groupby(sorted(zip(xs, ys)), key=itemgetter(0))
}
print(d) # {'a': {1, 4}, 'b': {2}, 'c': {3}, 'd': {5}}
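The sorted() call in that snippet is doing real work: itertools.groupby only merges adjacent items with equal keys. A short sketch of the difference, using the same xs and ys:

```python
from itertools import groupby
from operator import itemgetter

xs = ['a', 'b', 'c', 'a', 'd']
ys = [1, 2, 3, 4, 5]

# Without sorting, the two 'a' pairs are not adjacent, so 'a' shows up as two groups
unsorted_keys = [k for k, _ in groupby(zip(xs, ys), key=itemgetter(0))]
# After sorting, equal keys sit next to each other and form a single group
sorted_keys = [k for k, _ in groupby(sorted(zip(xs, ys)), key=itemgetter(0))]
print(unsorted_keys, sorted_keys)
```

In a dict comprehension the later 'a' group would overwrite the earlier one, which is why skipping the sort would quietly lose data.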
This isn't pure Python, but since the question is tagged with pandas, I tried it this way (list1 and list2 are the lists from the question).
Option-1
import pandas as pd
df = pd.DataFrame({'l1': list1, 'l2': list2})
res1 = df.groupby('l1').apply(lambda x: x.l2.values.tolist()).to_dict()
Option-2
print(df.groupby('l1')['l2'].unique().to_dict())
Output:
{'a': [1, 4], 'c': [3], 'b': [2], 'd': [5]}

Python count in a sublist in a nest list

x = [['a', 'b', 'c'], ['a', 'c', 'd'], ['e', 'f', 'f']]
Let's say we have a list of sublists containing random letters as strings.
How can I create a function that tells me how many times the letter 'a' occurs, which in this case is 2? Or any other letter: 'b' occurs once, 'f' occurs twice, etc.
Thank you!
You could flatten the list and use collections.Counter:
>>> import collections
>>> x = [['a', 'b', 'c'], ['a', 'c', 'd'], ['e', 'f', 'f']]
>>> d = collections.Counter(e for sublist in x for e in sublist)
>>> d
Counter({'a': 2, 'c': 2, 'f': 2, 'b': 1, 'e': 1, 'd': 1})
>>> d['a']
2
import itertools, collections
result = collections.defaultdict(int)
for i in itertools.chain(*x):
    result[i] += 1
This will create result as a dictionary with the characters as keys and their counts as values.
Just FYI, you can use sum() to flatten a single nested list.
>>> from collections import Counter
>>>
>>> x = [['a', 'b', 'c'], ['a', 'c', 'd'], ['e', 'f', 'f']]
>>> c = Counter(sum(x, []))
>>> c
Counter({'a': 2, 'c': 2, 'f': 2, 'b': 1, 'e': 1, 'd': 1})
But, as Blender and John Clements have addressed, itertools.chain.from_iterable() may be more clear.
>>> from itertools import chain
>>> c = Counter(chain.from_iterable(x))
>>> c
Counter({'a': 2, 'c': 2, 'f': 2, 'b': 1, 'e': 1, 'd': 1})
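Since the question asks for a function, the Counter approach could be wrapped like this. The name count_items is made up for illustration.

```python
from collections import Counter
from itertools import chain

def count_items(nested):
    """Hypothetical helper: count every element across all sublists."""
    return Counter(chain.from_iterable(nested))

x = [['a', 'b', 'c'], ['a', 'c', 'd'], ['e', 'f', 'f']]
counts = count_items(x)
print(counts['a'], counts['b'], counts['f'], counts['z'])
```

A Counter returns 0 for elements it has never seen, so looking up a letter that does not occur needs no special handling.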

Dictionaries and listing positions of word in a list

I need to take a list and use a dictionary to catalogue where a particular item occurs in a list, as an example:
L = ['a', 'b', 'c', 'b', 'c', 'a', 'e']
the dictionary needs to contain the following:
D = {'a': [0, 5], 'b': [1, 3], 'c': [2, 4], 'e': [6]}
However if I use what I wrote:
for i in range(len(word_list)):
    if D.has_key('word_list[i]') == False:
        D['word_list[i]'] = i
    else:
        D[word_list[i]] += i
Then I get a KeyError for a certain word and I don't understand why I should be getting an error.
if D.has_key('word_list[i]') == False:
Uh, what?
At the very least, you should drop the quotes:
if D.has_key(word_list[i]) == False:
But you're also misusing a number of Python structures:
Why are you summing up the indices?
Why are you comparing to False?
Shouldn't you be using setdefault?
Like this:
for i in range(len(word_list)):
    D.setdefault(word_list[i], []).append(i)
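To spell out where the KeyError most likely comes from, here is a sketch of the question's loop ported to Python 3. It uses in rather than has_key, which no longer exists in Python 3, and the try/except is added only to capture the error. The quoted key means the first branch tests and stores the literal string 'word_list[i]', so the else branch then looks up a word that was never stored.

```python
word_list = ['a', 'b', 'c', 'b', 'c', 'a', 'e']

D = {}
error = None
try:
    for i in range(len(word_list)):
        if 'word_list[i]' not in D:   # tests the literal string, not the word
            D['word_list[i]'] = i     # only ever runs once, for i == 0
        else:
            D[word_list[i]] += i      # 'b' was never stored, so this raises KeyError
except KeyError as exc:
    error = exc

# The setdefault version stores a list of positions under the actual word
positions = {}
for i, word in enumerate(word_list):
    positions.setdefault(word, []).append(i)
print(D, repr(error), positions)
```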
I modified your solution a bit to make it work:
word_list = ['a', 'b', 'c', 'b', 'c', 'a', 'e']
positions = {}  # avoid naming this "dict", which would shadow the built-in
for i in range(len(word_list)):
    if word_list[i] not in positions:
        positions[word_list[i]] = [i]
    else:
        positions[word_list[i]].append(i)
Result
{'a': [0, 5], 'c': [2, 4], 'b': [1, 3], 'e': [6]}
I think this would be the shortest solution for your problem:
>>> from collections import defaultdict
>>> D = defaultdict(list)
>>> for i, el in enumerate(L):
...     D[el].append(i)
...
>>> D
defaultdict(<type 'list'>, {'a': [0, 5], 'c': [2, 4], 'b': [1, 3], 'e': [6]})
If you want to stick with dict, correcting your code I would come up with:
>>> D = {}
>>> for i, el in enumerate(L):
...     if el not in D:
...         D[el] = [i]  # create a new list
...     else:
...         D[el].append(i)  # append to the existing list
...
>>> D
{'a': [0, 5], 'c': [2, 4], 'b': [1, 3], 'e': [6]}
Also, there is a setdefault method in dict which can be used:
>>> D = {}
>>> for i, el in enumerate(L):
...     D.setdefault(el, []).append(i)
...
>>> D
{'a': [0, 5], 'c': [2, 4], 'b': [1, 3], 'e': [6]}
But I prefer to use defaultdict from collections.
