dictionary operation using python - python

I have a dictionary.
dict = {'A':['a', 'b'], 'B':['c', 'b', 'a'], 'C':['d', 'c'], }
What is an easy way to find the values that two keys of the dictionary have in common?
Expected output:
A&B : 'a', 'b'
A&C : None
B&C : 'c'
How can this be achieved?

In [1]: dct = {'A':['a', 'b'], 'B':['c', 'b', 'a'], 'C':['d', 'c'], }
In [2]: set(dct['A']).intersection(dct['B'])
Out[2]: {'a', 'b'}
In [3]: set(dct['A']).intersection(dct['C'])
Out[3]: set()
In [4]: set(dct['B']).intersection(dct['C'])
Out[4]: {'c'}

Use the set & other_set operator (or set.intersection) together with itertools.combinations:
>>> import itertools
>>>
>>> d = {'A':['a', 'b'], 'B':['c', 'b', 'a'], 'C':['d', 'c'], }
>>> for a, b in itertools.combinations(d, 2):
...     common = set(d[a]) & set(d[b])
...     print('{}&{}: {}'.format(a, b, common))
...
A&C: set()
A&B: {'b', 'a'}
C&B: {'c'}
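The loop above could be wrapped in a small helper. This is only a sketch: the name `common_values` is hypothetical, and it sorts both the keys and the shared values so the result does not depend on dict or set ordering.

```python
from itertools import combinations

def common_values(d):
    """Map each pair of keys to the sorted list of values they share."""
    result = {}
    # sorted(d) fixes the order of the pairs regardless of dict ordering
    for a, b in combinations(sorted(d), 2):
        result[(a, b)] = sorted(set(d[a]) & set(d[b]))
    return result

d = {'A': ['a', 'b'], 'B': ['c', 'b', 'a'], 'C': ['d', 'c']}
pairs = common_values(d)
for (a, b), common in pairs.items():
    # an empty list is shown as None, as in the question
    print('{}&{}: {}'.format(a, b, common or None))
```

Returning a dict keyed by the pair keeps the result reusable; the formatting is left to the caller.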

Related

Sorting list based on dictionary keys in python

Is there a short way to sort a list based on the order of another dictionary's keys?
Suppose I have:
lst = ['b', 'c', 'a']
dic = { 'a': "hello" , 'b': "bar" , 'c': "foo" }
I want to sort the list to be ['a','b','c'] based on the order of dic keys.
You can create a lookup of keys versus their insertion order in dic. To do so you can write:
>>> lst = ['b', 'c', 'a']
>>> dic = {"a": "hello", "b": "bar", "c": "foo"}
>>> order = {k: i for i, k in enumerate(dic)}
>>> order
{'a': 0, 'b': 1, 'c': 2}
Using this you can write a simple lookup for the key argument of sorted to rank items based on order.
>>> sorted(lst, key=order.get)
['a', 'b', 'c']
If there are values in lst that are not found in dic you should call get using a lambda so you can provide a default index. You'll have to choose if you want to rank unknown items at the start or end.
Default to the start:
>>> lst = ['d', 'b', 'c', 'a']
>>> sorted(lst, key=lambda k: order.get(k, -1))
['d', 'a', 'b', 'c']
Default to the end:
>>> lst = ['d', 'b', 'c', 'a']
>>> sorted(lst, key=lambda k: order.get(k, len(order)))
['a', 'b', 'c', 'd']

is there a simpler way to group and count with python?

I am grouping and counting a set of data.
df = pd.DataFrame({'key': ['A', 'B', 'A'],
                   'data': np.ones(3,)})
df.groupby('key').count()
outputs
data
key
A 2
B 1
The code above works, but I wonder if there is a simpler way.
'data': np.ones(3,) seems to be a mere placeholder, yet it is indispensable:
pd.DataFrame(['A', 'B', 'A']).groupby(0).count()
outputs
A
B
My question is: is there a simpler way to produce the counts of 'A' and 'B' respectively, without something like 'data': np.ones(3,)?
It doesn't have to be a pandas method; a NumPy or native Python function would also be appreciated.
Use a Series instead.
>>> import pandas as pd
>>>
>>> data = ['A', 'A', 'A', 'B', 'C', 'C', 'D', 'D', 'D', 'D', 'D']
>>>
>>> pd.Series(data).value_counts()
D 5
A 3
C 2
B 1
dtype: int64
Use a defaultdict:
from collections import defaultdict
data = ['A', 'A', 'B', 'A', 'C', 'C', 'A']
d = defaultdict(int)
for element in data:
    d[element] += 1
d # output: defaultdict(int, {'A': 4, 'B': 1, 'C': 2})
There isn't any grouping here, just counting, so you can use collections.Counter:
from collections import Counter
Counter(['A', 'B', 'A'])
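As a slightly fuller sketch of the Counter approach (the variable name `counts` is only illustrative): a Counter behaves like a dict of counts, returns 0 for missing keys, and `most_common()` lists the pairs from most to least frequent.

```python
from collections import Counter

counts = Counter(['A', 'B', 'A'])
print(counts['A'])           # 2
print(counts['Z'])           # 0, missing keys do not raise KeyError
print(counts.most_common())  # [('A', 2), ('B', 1)]
```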

Divide list to multiple lists based on elements value

I have the following list:
initial_list = [['B', 'D', 'A', 'C', 'E']]
On each element of the list I apply a function and put the results in a dictionary:
for state in initial_list:
    next_dict[state] = move([state], alphabet)
This gives the following result:
next_dict = {'D': ['E'], 'B': ['D'], 'A': ['C'], 'C': ['C'], 'E': ['D']}
What I would like to do is separate the keys from initial_list based on their values in the next_dict dictionary, that is, group the elements of the first list with the elements that have the same value in next_dict:
new_list = [['A', 'C'], ['B', 'E'], ['D']]
'A' and 'C' will stay in the same group because they have the same value 'C', 'B' and 'E' will also share the same group because their value is 'D', and then 'D' will be in its own group.
How can I achieve this result?
You need itertools.groupby, after sorting your list by the next_dict values. As the documentation puts it:
"It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function)."
from itertools import groupby
initial_list = ['B', 'D', 'A', 'C', 'E']
def move(letter):
    return {'A': 'C', 'C': 'C', 'D': 'E', 'E': 'D', 'B': 'D'}.get(letter)
sorted_list = sorted(initial_list, key=move)
print([list(v) for k, v in groupby(sorted_list, key=move)])
#=> [['A', 'C'], ['B', 'E'], ['D']]
The simplest way to achieve this is to use itertools.groupby with dict.get as the key:
>>> from itertools import groupby
>>> next_dict = {'D': ['E'], 'B': ['D'], 'A': ['C'], 'C': ['C'], 'E': ['D']}
>>> initial_list = ['B', 'D', 'A', 'C', 'E']
>>> [list(i) for _, i in groupby(sorted(initial_list, key=next_dict.get), next_dict.get)]
[['A', 'C'], ['B', 'E'], ['D']]
I'm not exactly sure that's what you want but you can group the values based on their values in the next_dict:
>>> next_dict = {'D': 'E', 'B': 'D', 'A': 'C', 'C': 'C', 'E': 'D'}
>>> # external library but one can also use a defaultdict.
>>> from iteration_utilities import groupedby
>>> groupings = groupedby(['B', 'D', 'A', 'C', 'E'], key=next_dict.__getitem__)
>>> groupings
{'C': ['A', 'C'], 'D': ['B', 'E'], 'E': ['D']}
and then convert that to a list of their values:
>>> list(groupings.values())
[['A', 'C'], ['D'], ['B', 'E']]
Combine everything into a one-liner (not really recommended but a lot of people prefer that):
>>> list(groupedby(['B', 'D', 'A', 'C', 'E'], key=next_dict.__getitem__).values())
[['A', 'C'], ['D'], ['B', 'E']]
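The same grouping can be done with the standard library alone, as the comment about defaultdict above hints. A minimal sketch, assuming the flat next_dict from this answer (plain strings as values rather than one-element lists) and Python 3.7+ for insertion-ordered dicts:

```python
from collections import defaultdict

next_dict = {'D': 'E', 'B': 'D', 'A': 'C', 'C': 'C', 'E': 'D'}
initial_list = ['B', 'D', 'A', 'C', 'E']

groups = defaultdict(list)
for state in initial_list:
    # states that map to the same value end up in the same bucket
    groups[next_dict[state]].append(state)

new_list = list(groups.values())
print(new_list)  # [['B', 'E'], ['D'], ['A', 'C']]
```

Unlike the groupby variants, this needs no sorting; groups appear in the order their value is first seen, so sort new_list afterwards if a specific order matters.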
Try this:
next_next_dict = {}
for key in next_dict:
    # next_dict values are one-element lists, so group by their first item
    target = next_dict[key][0]
    if target in next_next_dict:
        next_next_dict[target].append(key)
    else:
        next_next_dict[target] = [key]
new_list = list(next_next_dict.values())
Or this:
new_list = []
for value in next_dict.values():
    new_value = [key for key in next_dict.keys() if next_dict[key] == value]
    if new_value not in new_list:
        new_list.append(new_value)
We can sort your list with your dictionary mapping, and then use itertools.groupby to form the groups. The only amendment I made here is making your initial list an actual flat list.
>>> from itertools import groupby
>>> initial_list = ['B', 'D', 'A', 'C', 'E']
>>> next_dict = {'D': ['E'], 'B': ['D'], 'A': ['C'], 'C': ['C'], 'E': ['D']}
>>> s_key = lambda x: next_dict[x]
>>> [list(v) for k, v in groupby(sorted(initial_list, key=s_key), key=s_key)]
[['A', 'C'], ['B', 'E'], ['D']]

Duplicate elements in a list [duplicate]

I have a list in Python:
l = ['a', 'c', 'e', 'b']
I want to duplicate each element immediately next to the original.
ll = ['a', 'a', 'c', 'c', 'e', 'e', 'b', 'b']
The order of the elements should be preserved.
>>> l = ['a', 'c', 'e', 'b']
>>> [x for pair in zip(l,l) for x in pair]
['a', 'a', 'c', 'c', 'e', 'e', 'b', 'b']
Or
>>> from itertools import repeat
>>> [x for item in l for x in repeat(item, 2)]
['a', 'a', 'c', 'c', 'e', 'e', 'b', 'b']
This is old but I can't see the straightforward option here (IMO):
[ item for item in l for repetitions in range(2) ]
So for the specific case:
>>> l = ['a', 'c', 'e', 'b']
>>> [ i for i in l for r in range(2) ]
['a', 'a', 'c', 'c', 'e', 'e', 'b', 'b']
And generalizing:
[ item for item in l for _ in range(r) ]
Where r is the quantity of repetitions you want.
This has O(n·r) space and time complexity, is short, has no dependencies, and is idiomatic.
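The generalized comprehension could be wrapped in a function. A small sketch; the name `repeat_each` is hypothetical.

```python
def repeat_each(items, r):
    """Return a new list with every item repeated r times, order preserved."""
    return [item for item in items for _ in range(r)]

doubled = repeat_each(['a', 'c', 'e', 'b'], 2)
tripled = repeat_each(['a', 'b'], 3)
print(doubled)  # ['a', 'a', 'c', 'c', 'e', 'e', 'b', 'b']
print(tripled)  # ['a', 'a', 'a', 'b', 'b', 'b']
```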
import itertools
ll = list(itertools.chain.from_iterable((e, e) for e in l))
At work:
>>> import itertools
>>> l = ['a', 'c', 'e', 'b']
>>> ll = list(itertools.chain.from_iterable((e, e) for e in l))
>>> ll
['a', 'a', 'c', 'c', 'e', 'e', 'b', 'b']
As Lattyware pointed out, in case you want to do more than just double each element:
from itertools import chain, repeat
ll = list(chain.from_iterable(repeat(e, 2) for e in l))
Try this:
ll = []
for i in l:
    ll.append(i)
    ll.append(i)
This does the job, but it is not the most efficient way of doing it. Use the answer posted by @Steven Rumbalski instead.
Here's a pretty easy way:
sum(zip(l, l), tuple())
It duplicates each item, and adds them to a tuple. If you don't want a tuple (as I suspect), you can call list on the tuple:
list(sum(zip(l, l), tuple()))
A few other versions (that yield lists):
list(sum(zip(l, l), ()))
sum([list(i) for i in zip(l, l)], [])
sum(map(list, zip(l, l)), [])
Pandas gives a method for duplicated elements:
import pandas as pd
l = pd.Series([2, 1, 3, 1])
print(l.duplicated())
# 0    False
# 1    False
# 2    False
# 3     True
# dtype: bool
print('Has list duplicated ? :', any(l.duplicated()))
# Has list duplicated ? : True

Python count in a sublist in a nest list

x = [['a', 'b', 'c'], ['a', 'c', 'd'], ['e', 'f', 'f']]
Let's say we have a nested list of random letters (strings).
How can I create a function that tells me how many times a given letter occurs? In this case 'a' occurs twice, 'b' occurs once, 'f' occurs twice, and so on.
Thank you!
You could flatten the list and use collections.Counter:
>>> import collections
>>> x = [['a', 'b', 'c'], ['a', 'c', 'd'], ['e', 'f', 'f']]
>>> d = collections.Counter(e for sublist in x for e in sublist)
>>> d
Counter({'a': 2, 'c': 2, 'f': 2, 'b': 1, 'e': 1, 'd': 1})
>>> d['a']
2
import itertools, collections
result = collections.defaultdict(int)
for i in itertools.chain(*x):
    result[i] += 1
This will create result as a dictionary with the characters as keys and their counts as values.
Just FYI, you can use sum() to flatten a single nested list.
>>> from collections import Counter
>>>
>>> x = [['a', 'b', 'c'], ['a', 'c', 'd'], ['e', 'f', 'f']]
>>> c = Counter(sum(x, []))
>>> c
Counter({'a': 2, 'c': 2, 'f': 2, 'b': 1, 'e': 1, 'd': 1})
But, as Blender and John Clements have addressed, itertools.chain.from_iterable() may be more clear.
>>> from itertools import chain
>>> c = Counter(chain.from_iterable(x))
>>> c
Counter({'a': 2, 'c': 2, 'f': 2, 'b': 1, 'e': 1, 'd': 1})
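Since the question asks for a function, the flatten-and-count idea could be packaged as follows. A minimal sketch; the name `count_letter` is hypothetical, and it assumes a list nested exactly one level deep.

```python
from collections import Counter
from itertools import chain

def count_letter(nested, letter):
    """Count how often letter occurs across all sublists of nested."""
    # chain.from_iterable flattens one level; Counter returns 0 for absent keys
    return Counter(chain.from_iterable(nested))[letter]

x = [['a', 'b', 'c'], ['a', 'c', 'd'], ['e', 'f', 'f']]
print(count_letter(x, 'a'))  # 2
print(count_letter(x, 'b'))  # 1
print(count_letter(x, 'f'))  # 2
```

If many letters need to be looked up, building the Counter once and indexing it repeatedly avoids re-scanning the list on every call.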
