Merging Dictionary values by key names - python

I have a dictionary of lists that includes keys with very similar names that I need to merge together; for example,
new_dict = {
'a0':['hello', 'how'],
'a1':['are'],
'a2':['you'],
'b0':['fine'],
'b1':['thanks']
}
And I want something like this:
desired = {
'a':['hello', 'how', 'are', 'you'],
'b':['fine', 'thanks']
}
I thought that I could change the key as if it were a list element, like this:
for key in new_dict.keys():
    if 'a' in key:
        key == 'a'
But this obviously doesn't work. What's the best way to do it? Thanks.

This is one way:
from collections import defaultdict
d = defaultdict(list)
for k, v in new_dict.items():
    d[k[0]].extend(v)
# defaultdict(list,
# {'a': ['hello', 'how', 'are', 'you'], 'b': ['fine', 'thanks']})

You can use a defaultdict:
from collections import defaultdict
desired = defaultdict(list)
for key, val in new_dict.items():
    desired[key[0]] += val
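The k[0] approach assumes every prefix is a single character. As a minimal sketch for longer prefixes, assuming each key is a name followed only by trailing digits, the digits can be stripped with str.rstrip. The 'item10' entry is a hypothetical addition for illustration and is not part of the original data.

```python
from collections import defaultdict

new_dict = {
    'a0': ['hello', 'how'],
    'a1': ['are'],
    'a2': ['you'],
    'b0': ['fine'],
    'b1': ['thanks'],
    'item10': ['x'],  # hypothetical key with a multi-character prefix
}

merged = defaultdict(list)
for key, values in new_dict.items():
    # Drop trailing digits: 'a0' -> 'a', 'item10' -> 'item'
    prefix = key.rstrip('0123456789')
    merged[prefix].extend(values)

merged = dict(merged)  # convert back to a plain dict if preferred
print(merged)
```

The word order inside each merged list follows the dictionary's iteration order, which is insertion order on Python 3.7 and later.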

As pointed out in the other answers, defaultdict eliminates the need for the if/else statements. Without it:
new_new_dict = {}
for k, v in new_dict.items():
    k1 = k[0]
    if k1 in new_new_dict:
        new_new_dict[k1].extend(v)
    else:
        # Copy the list so that later extend() calls do not modify new_dict
        new_new_dict[k1] = list(v)

You can use itertools.groupby:
import itertools
import re
new_dict = {
'a0':['hello', 'how'],
'a1':['are'],
'a2':['you'],
'b0':['fine'],
'b1':['thanks']
}
final_data = {a:[i for b in map(lambda x:x[-1], list(b)) for i in b] for a, b in itertools.groupby(sorted(new_dict.items(), key=lambda x:re.findall('^[a-zA-Z]+', x[0])[0]), key=lambda x:re.findall('^[a-zA-Z]+', x[0])[0])}
Output:
{'a': ['are', 'hello', 'how', 'you'], 'b': ['fine', 'thanks']}
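The one-liner is dense, so here is a more readable sketch of the same sort-then-group idea. The helper name prefix is made up for this example, and it assumes every key starts with at least one letter (re.match would return None otherwise).

```python
import re
from itertools import groupby

new_dict = {
    'a0': ['hello', 'how'],
    'a1': ['are'],
    'a2': ['you'],
    'b0': ['fine'],
    'b1': ['thanks'],
}

def prefix(item):
    """Return the leading letters of the key in a (key, value) pair."""
    key, _ = item
    return re.match(r'[A-Za-z]+', key).group()

final_data = {}
# groupby only merges adjacent items, so sort by the same key function first
for name, group in groupby(sorted(new_dict.items(), key=prefix), key=prefix):
    final_data[name] = [word for _, words in group for word in words]

print(final_data)
```

Because sorted is stable and dictionaries keep insertion order on Python 3.7 and later, the words within each group stay in their original order there.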

Related

Search string value from a list with dictionary list values

Below I have a list and a dict. Each key in the dict is associated with a list of values.
I want to look up each string from the list in the dict's value lists and, for each match, capture the corresponding key, building a new list of those keys.
list = ['man', 'men', 'boy', 'buoy', 'cat','caat']
dict={'man':['man', 'men', 'mun'], 'boy':['boy','buoy','bay'], 'cat':['cat','caat','cut']}
Expected output for above case is:
Outputlist=['man','man','boy','boy','cat','cat']
When I tried the same I am getting only one item to match as below.
lis = ['man', 'men', 'boy', 'buoy', 'cat','caat']
dic={'man':['man', 'men', 'mun'], 'boy':['boy','buoy','bay'], 'cat':['cat','caat','cut']}
for key,value in dic.items():
    if value in lis:
        output.append(key)
print(output)
You can use a nested list comprehension. Note that you should not use list and dict as variable names, because they shadow the built-ins:
output = [key for key, value in dict_.items() for item in list_ if item in value]
print(output) # ['man', 'man', 'boy', 'boy', 'cat', 'cat']
The first thing to note is that to achieve the same length input and output lists, you should be looping over the list, not the dictionary.
ys = [k for x in xs for k, v in d.items() if x in v]
Another way is to construct a reverse mapping. This should make things asymptotically faster:
lookup = {x: k for k, v in d.items() for x in v}
>>> lookup
{
'bay': 'boy',
'boy': 'boy',
'buoy': 'boy',
'caat': 'cat',
'cat': 'cat',
'cut': 'cat',
'man': 'man',
'men': 'man',
'mun': 'man',
}
Then, simply:
ys = [lookup[x] for x in xs]
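Put together, the reverse-mapping approach might look like the sketch below. The 'dog' entry is a hypothetical unmatched word added for illustration; the membership check skips it, whereas a bare lookup[x] would raise KeyError.

```python
d = {
    'man': ['man', 'men', 'mun'],
    'boy': ['boy', 'buoy', 'bay'],
    'cat': ['cat', 'caat', 'cut'],
}
xs = ['man', 'men', 'boy', 'buoy', 'cat', 'caat', 'dog']  # 'dog' matches nothing

# Build the reverse mapping once: each variant points back to its key
lookup = {x: k for k, v in d.items() for x in v}

# Each lookup is then a single dict access instead of a scan over every list
ys = [lookup[x] for x in xs if x in lookup]
print(ys)
```

If the same variant appears under two keys, the later key silently wins in the reverse mapping, so this assumes the value lists do not overlap.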
I think you did it the wrong way around. First, you want to iterate over lis, and then find the corresponding key for each item:
output = []
lis = ['man', 'men', 'boy', 'buoy', 'cat','caat']
dic={'man':['man', 'men', 'mun'], 'boy':['boy','buoy','bay'], 'cat':['cat','caat','cut']}
for l in lis:
    for key, value in dic.items():
        if l in value:
            output.append(key)
print(output)
As the others mention, you can use a list comprehension to do it directly:
output = [next(k for k, v in dic.items() if l in v) for l in lis]
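One caveat with next: called with a single argument, it raises StopIteration when no key matches. A small sketch, assuming you would rather get a placeholder for unmatched words, passes a default as the second argument. The word 'dog' is a hypothetical unmatched input.

```python
dic = {
    'man': ['man', 'men', 'mun'],
    'boy': ['boy', 'buoy', 'bay'],
    'cat': ['cat', 'caat', 'cut'],
}
lis = ['man', 'buoy', 'dog']  # 'dog' is not in any value list

# The generator needs its own parentheses once next() gets a second argument
output = [
    next((k for k, v in dic.items() if word in v), None)
    for word in lis
]
print(output)
```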

Splitting a list of strings into sub lists based on their length

If for instance I have a list
['My', 'Is', 'Name', 'Hello', 'William']
How can I manipulate it such that I can create a new list
[['My', 'Is'], ['Name'], ['Hello'], ['William']]
You could use itertools.groupby:
>>> from itertools import groupby
>>> l = ['My', 'Is', 'Name', 'Hello', 'William']
>>> [list(g) for k, g in groupby(l, key=len)]
[['My', 'Is'], ['Name'], ['Hello'], ['William']]
If, however, the list is not already sorted by length, you will need to sort it first, as @recnac mentions in the comments below:
>>> l2 = ['My', 'Name', 'Hello', 'Is', 'William']
>>> [list(g) for k, g in groupby(sorted(l2, key=len), key=len)]
[['My', 'Is'], ['Name'], ['Hello'], ['William']]
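To see why the sort matters: groupby only merges adjacent items that share a key. A short sketch comparing both calls on the unsorted list (the variable names are chosen for this example):

```python
from itertools import groupby

words = ['My', 'Name', 'Hello', 'Is', 'William']

# Without sorting, 'My' and 'Is' are not adjacent, so they land in separate groups
unsorted_groups = [list(g) for _, g in groupby(words, key=len)]

# Sorting by length first brings equal-length words together
sorted_groups = [list(g) for _, g in groupby(sorted(words, key=len), key=len)]

print(unsorted_groups)
print(sorted_groups)
```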
You can build a dict that maps word lengths to a list of matching words, and then get the list of the dict's values:
l = ['My', 'Is', 'Name', 'Hello', 'William']
d = {}
for w in l:
    d.setdefault(len(w), []).append(w)
print(list(d.values()))
This outputs:
[['My', 'Is'], ['Name'], ['Hello'], ['William']]
I have also found a solution. It is not the most concise, but I thought it would be worth sharing:
data = ['My', 'Is', 'Name', 'Hello', 'William']
dict0 = {}
for word in data:
    if len(word) not in dict0:
        dict0[len(word)] = [word]
    else:
        dict0[len(word)] += [word]
list0 = []
for length in dict0:
    list0.append(dict0[length])
print(list0)
You can use a dict to group the strings by length; defaultdict is used here for convenience.
from collections import defaultdict
str_list = ['My', 'Is', 'Name', 'Hello', 'William']
group_by_len = defaultdict(list)
for s in str_list:
    group_by_len[len(s)].append(s)
result = list(group_by_len.values())
output:
[['My', 'Is'], ['Name'], ['Hello'], ['William']]
Hope that will help you, and comment if you have further questions. : )

Create a list of unique keys in Python

I have a list of single-key dicts:
[{"1":"value"},{"1":"second_value"},{"2":"third_value"},{"2":"fourth_value"},{"3":"fifth_value"}]
and want to convert it into:
[{"1":"value","2":"third_value","3":"fifth_value"},{"1":"second_value","2":"fourth_value"}]
There is probably a cleaner way of doing this, input is appreciated:
d = [{"1":"value"},{"1":"second_value"},{"2":"third_value"},{"2":"fourth_value"},{"3":"fifth_value"}]
results = [{}]
for item in d:
    # Assumes each input dict contains exactly one key-value pair
    j, k = next(iter(item.items()))
    for result in results:
        if j not in result:
            result[j] = k
            break
    else:
        # Every existing dict already has this key, so start a new one
        results.append({j: k})
Result:
[{'1': 'value', '2': 'third_value', '3': 'fifth_value'}, {'1': 'second_value', '2': 'fourth_value'}]
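An alternative sketch that avoids scanning the result dicts: count how many times each key has been seen so far, and put the n-th occurrence of a key into the n-th result dict. The names seen and results are chosen for this example.

```python
from collections import defaultdict

d = [
    {"1": "value"},
    {"1": "second_value"},
    {"2": "third_value"},
    {"2": "fourth_value"},
    {"3": "fifth_value"},
]

seen = defaultdict(int)  # how many times each key has appeared so far
results = []
for item in d:
    for key, value in item.items():
        n = seen[key]
        if n == len(results):
            # First key to reach its n-th repeat: open a new result dict
            results.append({})
        results[n][key] = value
        seen[key] += 1

print(results)
```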
You can use collections.defaultdict:
>>> import collections
>>> result = collections.defaultdict(list)
>>> for item in d:
...     for key, value in item.items():
...         result[value].append(key)
...
>>> [{key: value for key in keys} for value, keys in result.items()]
[{'1': 'second_value', '2': 'second_value'}, {'1': 'value', '3': 'value', '2': 'value'}]
Note that second_value comes before value in this output because the ordering is arbitrary: unless you explicitly specify that value should be ordered before second_value, you get whatever order the dictionary returns.
You can use collections.defaultdict here. Iterate over the list, use the values as keys and collect all the keys related to a value in a list.
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for dic in lis:
...     for k, v in dic.items():
...         d[v].append(k)
...
Now d becomes:
>>> d
defaultdict(<type 'list'>,
{'second_value': ['1', '2'],
'value': ['1', '2', '3']})
Now iterate over d to get the desired result:
>>> [{v1:k for v1 in v} for k, v in d.items()]
[{'1': 'second_value', '2': 'second_value'}, {'1': 'value', '3': 'value', '2': 'value'}]

Group a list by word length

For example, I have a list, say
list = ['sight', 'first', 'love', 'was', 'at', 'It']
I want to group this list by word length, say
newlist = [['sight', 'first'],['love'], ['was'], ['at', 'It']]
Please help me on it.
Appreciation!
Use itertools.groupby:
>>> from itertools import groupby
>>> lis = ['sight', 'first', 'love', 'was', 'at', 'It']
>>> [list(g) for k, g in groupby(lis, key=len)]
[['sight', 'first'], ['love'], ['was'], ['at', 'It']]
Note that for itertools.groupby to work properly, the items must already be sorted by length. Otherwise, either use collections.defaultdict, which is O(N), or sort the list first and then apply itertools.groupby, which is O(N log N):
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> lis = ['sight', 'first', 'foo', 'love', 'at', 'was', 'at', 'It']
>>> for x in lis:
...     d[len(x)].append(x)
...
>>> d.values()
[['at', 'at', 'It'], ['foo', 'was'], ['love'], ['sight', 'first']]
If you want the final output list to be sorted too then better sort the list items by length and apply itertools.groupby to it.
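A short sketch of that sort-then-group variant, assuming the longest words should come first as in the question's expected output:

```python
from itertools import groupby

lis = ['sight', 'first', 'foo', 'love', 'at', 'was', 'at', 'It']

# sorted() is stable, so words of equal length keep their original order
by_length = sorted(lis, key=len, reverse=True)
groups = [list(g) for _, g in groupby(by_length, key=len)]

print(groups)
```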
You can use a temp dictionary then sort by length:
li = ['sight', 'first', 'love', 'was', 'at', 'It']
d = {}
for word in li:
    d.setdefault(len(word), []).append(word)
result = [d[n] for n in sorted(d, reverse=True)]
print(result)
# [['sight', 'first'], ['love'], ['was'], ['at', 'It']]
You can use defaultdict:
from collections import defaultdict
d = defaultdict(list)
for word in li:
    d[len(word)].append(word)
result = [d[n] for n in sorted(d, reverse=True)]
print(result)
or use __missing__ like so:
class Dicto(dict):
    def __missing__(self, key):
        self[key] = []
        return self[key]
d = Dicto()
for word in li:
    d[len(word)].append(word)
result = [d[n] for n in sorted(d, reverse=True)]
print(result)
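For reference, a brief sketch of how the __missing__ hook behaves: it runs only for failed d[key] lookups, not for get() or membership tests. The class is the Dicto from the answer; the sample keys are made up.

```python
class Dicto(dict):
    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent
        self[key] = []
        return self[key]

d = Dicto()
d[3].append('was')       # d[3] is missing, so __missing__ creates the list

missing_via_get = d.get(5)   # get() does not trigger __missing__
has_five = 5 in d            # neither does a membership test

print(d, missing_via_get, has_five)
```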
Since the groupby solution was already taken ;-)
from collections import defaultdict
lt = ['sight', 'first', 'love', 'was', 'at', 'It']
d = defaultdict(list)
for x in lt:
    d[len(x)].append(x)
d.values()
[['at', 'It'], ['was'], ['love'], ['sight', 'first']]

convert to list comprehension

this is my python code
mylist = ['a', 'f', 'z']
old_d = {'a': 'aaa', 'b': 'bbb', 'c': 'ccc', 'f': 'fff', 'g':'ggg', 'z':'zzz'}
new_d = {}
for key in mylist:
    new_d[key] = old_d[key]
Can we write the above code using list comprehensions or something similar like
new_d[key] = old_d[key] for key in mylist
In Python 2.7 and above you can use a dict comprehension:
new_d = {key: old_d[key] for key in mylist}
In Python 2.6 and below there are no dict comprehensions, so you must use dict with a generator expression or list comprehension:
new_d = dict((key, old_d[key]) for key in mylist)
new_d = {key: old_d[key] for key in mylist}
You can use dict comprehension on Python 2.7 and newer:
new_d = {k: old_d[k] for k in mylist}
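If mylist might contain keys that are absent from old_d, the comprehensions above raise KeyError. A minimal sketch, assuming you want to skip such keys, adds a filter clause. The 'q' entry is a hypothetical missing key added for illustration.

```python
mylist = ['a', 'f', 'z', 'q']  # 'q' is not a key of old_d
old_d = {'a': 'aaa', 'b': 'bbb', 'c': 'ccc', 'f': 'fff', 'g': 'ggg', 'z': 'zzz'}

# Keep only the keys that actually exist in old_d
new_d = {key: old_d[key] for key in mylist if key in old_d}
print(new_d)
```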
