list1 = ['fire', 'cats', 'mats', 'wats', 'mire', 'tire']
I would like to divide these words up based on the 3 last letters of each word, and save it into a dictionary.
Is this possible?
Create a defaultdict with list as default item, and append items, computing key with string slice, like this:
import collections
list1 = ['fire', 'cats', 'mats', 'wats', 'mire', 'tire']
d=collections.defaultdict(list)
for i in list1:
    d[i[-3:]].append(i)
print(dict(d)) # copy in a dict just for clean display (no defaultdict prefix)
result:
{'ire': ['fire', 'mire', 'tire'], 'ats': ['cats', 'mats', 'wats']}
It can also be done with a one-liner, though it is less efficient because the inner loop scans the whole list once per key (one-liners are trendy, but sometimes just not the best solution):
d = {k:[v for v in list1 if v.endswith(k)] for k in set(x[-3:] for x in list1)}
result:
{'ire': ['fire', 'mire', 'tire'], 'ats': ['cats', 'mats', 'wats']}
If you want to filter out keys that don't have enough associated words, you can do this (after computing d in the first pass):
d = {k:v for k,v in d.items() if len(v)>2}
That creates a new dictionary that keeps only the keys whose list has more than 2 elements.
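Putting the grouping and the filtering together, here is a minimal sketch. The helper name group_by_suffix and its parameters n and min_size are made up for illustration; they are not part of the answer above.

```python
import collections


def group_by_suffix(words, n=3, min_size=1):
    """Group words by their last n characters (hypothetical helper)."""
    groups = collections.defaultdict(list)
    for word in words:
        groups[word[-n:]].append(word)
    # Keep only the suffixes shared by at least min_size words.
    return {k: v for k, v in groups.items() if len(v) >= min_size}


list1 = ['fire', 'cats', 'mats', 'wats', 'mire', 'tire']
grouped = group_by_suffix(list1)
# A suffix that occurs only once ('ogs') is dropped when min_size=2.
filtered = group_by_suffix(list1 + ['dogs'], min_size=2)
print(grouped)
print(filtered)
```

Note that min_size is an inclusive threshold here, so min_size=3 corresponds to the len(v)>2 test above.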
I have a list of dictionaries. Each dictionary has words as keys and, as values, the number of times each word appears in a particular document. How can I find how many dictionaries a particular word appears in?
Suppose I have the following dictionaries:
dict1 = {'Association':5, 'Rule':2, 'Mining':3}
dict2 = {'Rule':4, 'Mining':1}
dict3 = {'Association':4, 'Mining':3}
Result after counting how many dictionaries a word appears in:
result_dict = {'Association':2, 'Rule':2, 'Mining':3}
Counter is a dict subclass that can be useful here:
from collections import Counter
dicts = [dict1, dict2, dict3]
key_counters = [Counter(dictionary.keys()) for dictionary in dicts]
start_counter = Counter()
result_dict = sum(key_counters, start_counter)
assert result_dict == {'Association': 2, 'Rule': 2, 'Mining': 3}
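A single Counter can probably do the whole job as well, since iterating over a dict yields its keys. This is only a sketch of the same idea, using the example dictionaries from the question; the name doc_counts is mine:

```python
from collections import Counter

dict1 = {'Association': 5, 'Rule': 2, 'Mining': 3}
dict2 = {'Rule': 4, 'Mining': 1}
dict3 = {'Association': 4, 'Mining': 3}
dicts = [dict1, dict2, dict3]

# Each dictionary contributes each of its keys exactly once,
# so the count is the number of dictionaries containing the word.
doc_counts = Counter(word for d in dicts for word in d)
print(dict(doc_counts))
```

This avoids building one intermediate Counter per dictionary.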
This can easily be done with a dict comprehension.
First, make a list out of your dicts:
dict1 = {'Association':5,'Rule':2,'Mining':3}
dict2 = {'Rule':4,'Mining':1}
dict3 = {'Association':4,'Mining':3}
dicts = [dict1, dict2, dict3]
Then, make a set of all the words in the dictionaries with a union (might be a cleaner way to do this, but this worked):
all_words = set().union(*[d.keys() for d in dicts])
Then, count how many dictionaries each word appears in:
{k: sum([1 for d in dicts if k in d.keys()]) for k in all_words}
This returned the desired output from your example.
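For what it's worth, the same comprehension can be written a bit more compactly. This is a sketch that relies on two facts: membership tests work directly on a dict (no .keys() needed), and True counts as 1 when summed.

```python
dict1 = {'Association': 5, 'Rule': 2, 'Mining': 3}
dict2 = {'Rule': 4, 'Mining': 1}
dict3 = {'Association': 4, 'Mining': 3}
dicts = [dict1, dict2, dict3]

# set.union accepts any iterables; iterating a dict yields its keys.
all_words = set().union(*dicts)

# `k in d` is a bool, and sum() treats True as 1.
result = {k: sum(k in d for d in dicts) for k in all_words}
print(result)
```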
I have 2 lists, both of which aren't fixed in len.
list1 = ["John", "bruce", "William"]
list2 = ["lindt", "reese", "snickers", "chocolate", "Milkyway", "Cadbury", "Candy"]
I want to distribute the candy amongst the members in list1 so that end result would look something like
John: "lindt","chocolate","Candy"
bruce: "reese","Milkyway"
William: "snickers","Cadbury"
I tried using cycle and zip from itertools but all I am getting is a tuple with something like
from itertools import cycle
list1 = ["John","bruce","William"]
list2 = ["lindt","reese","snickers","chocolate","Milkyway","Cadbury","Candy"]
for i in zip(list2,cycle(list1)):
    print(i)
Output
('lindt', 'John')
('reese', 'bruce')
('snickers', 'William')
('chocolate', 'John')
('Milkyway', 'bruce')
('Cadbury', 'William')
('Candy', 'John')
Let's use a dictionary:
kids = {}
Every member of list1 gets an empty list in the dictionary:
for i in list1: kids[i] = []
Then use zip and cycle (from itertools) to assign each kid their candy:
for i in zip(list2,cycle(list1)): kids[i[1]].append(i[0])
# kids[i[1]] looks up the kid's list, since the name is at index 1 in the tuple
# i[0] is the candy associated with that kid.
Result:
>>> for i in kids: print(i,':', kids[i])
John : ['lindt', 'chocolate', 'Candy']
bruce : ['reese', 'Milkyway']
William : ['snickers', 'Cadbury']
You were pretty close but you need a data structure to store these results. Instead of creating a dictionary containing the keys and empty lists you can also use dict.setdefault:
from itertools import cycle
d = dict()
for name, item in zip(cycle(list1), list2):
    d.setdefault(name, []).append(item)
print(d)
# {'John': ['lindt', 'chocolate', 'Candy'],
# 'William': ['snickers', 'Cadbury'],
# 'bruce': ['reese', 'Milkyway']}
Instead of setdefault you can also use collections.defaultdict:
from itertools import cycle
from collections import defaultdict
d = defaultdict(list)
for name, item in zip(cycle(list1), list2):
    d[name].append(item)
print(d)
# defaultdict(list,
# {'John': ['lindt', 'chocolate', 'Candy'],
# 'William': ['snickers', 'Cadbury'],
# 'bruce': ['reese', 'Milkyway']})
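Another option, which is not what the answers above use, is plain slicing with a step. The sketch below assumes the names in list1 are unique; the name shares is only for illustration.

```python
list1 = ["John", "bruce", "William"]
list2 = ["lindt", "reese", "snickers", "chocolate", "Milkyway", "Cadbury", "Candy"]

step = len(list1)
# The i-th person gets every step-th candy, starting at index i.
shares = {name: list2[i::step] for i, name in enumerate(list1)}

for name, candies in shares.items():
    print(name, ':', candies)
```

One difference from the setdefault and defaultdict versions: if there are more people than candies, the people left out still appear here, with an empty list.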
I have a list that contains a few dictionaries.
[{u'TEXT242.txt': u'work'},{u'TEXT242.txt': u'go to work'},{u'TEXT1007.txt': u'report'},{u'TEXT797.txt': u'study'}]
How can I combine the dictionaries that share the same key? For example,
u'work' and u'go to work' are both under the key 'TEXT242.txt', and I want to merge them so that the duplicated key is removed:
[{u'TEXT242.txt': [u'work', u'go to work']},{u'TEXT1007.txt': u'report'},{u'TEXT797.txt': u'study'}]
The setdefault method of dictionaries is handy here... it can create an empty list when a dictionary key doesn't exist, so that you can always append the value.
dictlist = [{u'TEXT242.txt': u'work'},{u'TEXT242.txt': u'go to work'},{u'TEXT1007.txt': u'report'},{u'TEXT797.txt': u'study'}]
newdict = {}
for d in dictlist:
    for k in d:
        newdict.setdefault(k, []).append(d[k])
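To get back to a list of single-key dictionaries like the one in the question, something along these lines should work. It is a sketch assuming Python 3.7+ (so dict order follows insertion order); the name merged is mine.

```python
dictlist = [{'TEXT242.txt': 'work'}, {'TEXT242.txt': 'go to work'},
            {'TEXT1007.txt': 'report'}, {'TEXT797.txt': 'study'}]

newdict = {}
for d in dictlist:
    for k, v in d.items():
        # Create the list on first sight of a key, then always append.
        newdict.setdefault(k, []).append(v)

# One single-key dictionary per distinct key.
merged = [{k: v} for k, v in newdict.items()]
print(merged)
```

Note that every value ends up as a list here, including keys that occur only once, whereas the desired output in the question keeps single values as plain strings.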
from collections import defaultdict
before = [{u'TEXT242.txt': u'work'},{u'TEXT242.txt': u'go to work'},{u'TEXT1007.txt': u'report'},{u'TEXT797.txt': u'study'}]
after = defaultdict(list)
for i in before:
    for k, v in i.items():
        after[k].append(v)
out:
defaultdict(list,
{'TEXT1007.txt': ['report'],
'TEXT242.txt': ['work', 'go to work'],
'TEXT797.txt': ['study']})
This technique is simpler and faster than an equivalent technique using dict.setdefault().
I have the following Python list:
list1 = ['EW:G:B<<LADHFSSFAFFF', 'CB:E:OWTOWTW', 'PP:E:A,A<F<AF', 'GR:A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7', 'SX:F:-111', 'DS:f:115.5', 'MW:AA:0', 'MA:A:0XT:i:0', 'EY:EE:KJERWEWERKJWE']
I would like to take the entries of this list and create a dictionary of key-values pairs that looks like
dictionary_list1 = {'EW':'G:B<<LADHFSSFAFFF', 'CB':'E:OWTOWTW', 'PP':'E:A,A<F<AF', 'GR':'A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7', 'SX':'F:-111', 'DS':'f:115.5', 'MW':'AA:0', 'MA':'A:0XT:i:0', 'EW':'EE:KJERWEWERKJWE'}
How does one parse/split the list above list1 to do this? My first instinct was to try try1 = list1.split(":"), but then I think it is impossible to retrieve the "key" for this list, as there are multiple colons :
What is the most pythonic way to do this?
You can specify a maximum number of times to split with the second argument to split.
list1 = ['EW:G:B<<LADHFSSFAFFF', 'CB:E:OWTOWTW', 'PP:E:A,A<F<AF', 'GR:A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7', 'SX:F:-111', 'DS:f:115.5', 'MW:AA:0', 'MA:A:0XT:i:0', 'EW:EE:KJERWEWERKJWE']
d = dict(item.split(':', 1) for item in list1)
Result:
>>> import pprint
>>> pprint.pprint(d)
{'CB': 'E:OWTOWTW',
'DS': 'f:115.5',
'EW': 'EE:KJERWEWERKJWE',
'GR': 'A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7',
'MA': 'A:0XT:i:0',
'MW': 'AA:0',
'PP': 'E:A,A<F<AF',
'SX': 'F:-111'}
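To see what the second argument changes, here is a small sketch on one of the entries. The 'nocolon' item is a made-up input that shows what happens when an entry has no separator at all.

```python
item = 'GR:A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7'

every_colon = item.split(':')     # splits at every colon
first_only = item.split(':', 1)   # splits at the first colon only
print(len(every_colon))
print(first_only)

# An entry without a colon yields a one-element list, which dict() rejects.
try:
    dict(x.split(':', 1) for x in ['nocolon'])
    failed = False
except ValueError:
    failed = True
print(failed)
```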
If you'd like to keep track of values for non-unique keys, like 'EW:G:B<<LADHFSSFAFFF' and 'EW:EE:KJERWEWERKJWE', you could add keys to a collections.defaultdict:
import collections
d = collections.defaultdict(list)
for item in list1:
    k,v = item.split(':', 1)
    d[k].append(v)
Result:
>>> pprint.pprint(d)
{'CB': ['E:OWTOWTW'],
'DS': ['f:115.5'],
'EW': ['G:B<<LADHFSSFAFFF', 'EE:KJERWEWERKJWE'],
'GR': ['A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7'],
'MA': ['A:0XT:i:0'],
'MW': ['AA:0'],
'PP': ['E:A,A<F<AF'],
'SX': ['F:-111']}
You can also use str.partition:
list1 = ['EW:G:B<<LADHFSSFAFFF', 'CB:E:OWTOWTW', 'PP:E:A,A<F<AF', 'GR:A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7', 'SX:F:-111', 'DS:f:115.5', 'MW:AA:0', 'MA:A:0XT:i:0', 'EW:EE:KJERWEWERKJWE']
d = dict([t for t in x.partition(':') if t!=':'] for x in list1)
# or more simply as TigerhawkT3 mentioned in the comment
d = dict(x.partition(':')[::2] for x in list1)
for k, v in d.items():
    print('{}: {}'.format(k, v))
Output:
MW: AA:0
CB: E:OWTOWTW
GR: A:OUO-1-XXX-EGD:forthyFive:1:HMJeCXX:7
PP: E:A,A<F<AF
EW: EE:KJERWEWERKJWE
SX: F:-111
DS: f:115.5
MA: A:0XT:i:0
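One practical difference from split is that partition always returns a 3-tuple, so an entry without a colon does not raise an error. A quick sketch, where 'nocolon' is a made-up entry for illustration:

```python
with_colon = 'EW:G:B<<LADHFSSFAFFF'.partition(':')
without_colon = 'nocolon'.partition(':')
print(with_colon)     # (head, separator, tail)
print(without_colon)  # separator and tail are empty strings

# [::2] keeps the head and the tail and skips the separator.
d = dict(x.partition(':')[::2] for x in ['EW:G:B', 'nocolon'])
print(d)
```

Whether an empty-string value is preferable to an exception depends on how you want to handle malformed entries.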
Say I have a list of lists like this (suppose you have no idea how many lists it contains):
list=[['food','fish'],['food','meat'],['food','veg'],['sports','football']..]
How can I merge the items in the list like the following?
list=[['food','fish','meat','veg'],['sports','football','basketball']....]
That is, merge all the nested lists that share an item into the same list.
Use defaultdict to make a dictionary that maps a type to values and then get the items:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> items = [['food','fish'],['food','meat'],['food','veg'],['sports','football']]
>>> for key, value in items:
...     d[key].append(value)
...
>>> [[key] + values for key, values in d.items()]
[['food', 'fish', 'meat', 'veg'], ['sports', 'football']]
The "compulsory" alternative to defaultdict is itertools.groupby, which can work better for data that is already ordered by key and when you don't want to build a data structure from it (i.e., you just want to work on the groups):
data = [['food','fish'],['food','meat'],['food','veg'],['sports','football']]
from itertools import groupby
print([[k] + [i[1] for i in v] for k, v in groupby(data, lambda L: L[0])])
But defaultdict is more flexible and easier to understand, so go with @Blender's answer.
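The "already in order of the key" condition matters because groupby only merges consecutive items with the same key. A small sketch with deliberately shuffled data (my own reordering of the example) shows the difference:

```python
from itertools import groupby
from operator import itemgetter

# Same pairs as above, but 'sports' now sits between the 'food' entries.
data = [['food', 'fish'], ['sports', 'football'], ['food', 'meat'], ['food', 'veg']]


def merge(pairs):
    """Merge consecutive pairs that share the same first item."""
    return [[k] + [p[1] for p in group] for k, group in groupby(pairs, key=itemgetter(0))]


unsorted_groups = merge(data)
# Sorting by key first brings equal keys together; sorted() is stable,
# so the values keep their original relative order.
sorted_groups = merge(sorted(data, key=itemgetter(0)))
print(unsorted_groups)
print(sorted_groups)
```

On the unsorted input, 'food' shows up as two separate groups; after sorting, it is merged into one.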