How to join values between sublists - python

I have a list with sublists, for example:
LA=[[1,2],[2,7],[4,5],[1,9],[6,5],[4,3],[2,1],[2,2]]
If the first element in each sublist is an ID, how do I join the ones with the same ID to get something like this:
LR=[[1,11],[2,10],[4,8],[6,5]]
I've tried using a for loop, but it's too long and not efficient.

You can use itertools.groupby:
import itertools
LA=[[1,2],[2,7],[4,5],[1,9],[6,5],[4,3],[2,1],[2,2]]
new_d = [[a, sum(i[-1] for i in list(b))] for a, b in itertools.groupby(sorted(LA), key=lambda x:x[0])]
Output:
[[1, 11], [2, 10], [4, 8], [6, 5]]
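One detail worth spelling out: itertools.groupby only merges consecutive items that share a key, which is why the answer sorts LA first. A minimal sketch of the difference (the helper name sum_by_id is made up for illustration):

```python
import itertools

LA = [[1, 2], [2, 7], [4, 5], [1, 9], [6, 5], [4, 3], [2, 1], [2, 2]]

def sum_by_id(pairs):
    # Hypothetical helper: group consecutive pairs by their first element
    # and sum the second elements of each group.
    return [[key, sum(value for _, value in group)]
            for key, group in itertools.groupby(pairs, key=lambda pair: pair[0])]

unsorted_result = sum_by_id(LA)        # IDs repeat: every run becomes its own group
sorted_result = sum_by_id(sorted(LA))  # equal IDs are adjacent, so each ID appears once

print(unsorted_result)
print(sorted_result)
```

Without the sort, an ID such as 1 shows up in more than one group, so the sums are split.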

LA=[[1,2],[2,7],[4,5],[1,9],[6,5],[4,3],[2,1],[2,2]]
new_dict = {}
for (key, value) in LA:
    if key in new_dict:
        new_dict[key].append(value)
    else:
        new_dict[key] = [value]
for key, value in new_dict.items():
    new_dict[key] = (sum(value))
dictlist = []
for key, value in new_dict.items():
    temp = [key,value]
    dictlist.append(temp)
print(dictlist)
This will do the job too.

You can do it just using list comprehensions:
LR = [[i,sum([L[1] for L in LA if L[0]==i])] for i in set([L[0] for L in LA])]
Gives the desired result.
To break this down a bit: set([L[0] for L in LA]) gives a set (with no repeats) of all of the IDs. We then simply iterate over that set and, for each ID, sum the values of the sublists that have that ID.
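The same one-liner, unrolled into steps as a rough sketch. The names ids and LR_sorted are only illustrative, and the final sort is an addition, since a set does not promise any particular iteration order:

```python
LA = [[1, 2], [2, 7], [4, 5], [1, 9], [6, 5], [4, 3], [2, 1], [2, 2]]

# Step 1: collect the distinct IDs (first element of each sublist).
ids = {pair[0] for pair in LA}

# Step 2: for each ID, rescan LA and sum the values that carry that ID.
LR = [[i, sum(pair[1] for pair in LA if pair[0] == i)] for i in ids]

# Sets are unordered, so sort by ID if the order of the result matters.
LR_sorted = sorted(LR)
print(LR_sorted)
```

Because LA is rescanned once per ID, this does more work than a single-pass dictionary approach when there are many distinct IDs.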

Grouping with collections.defaultdict() is always straightforward:
from collections import defaultdict
LA = [[1,2],[2,7],[4,5],[1,9],[6,5],[4,3],[2,1],[2,2]]
# create defaultdict of list values
d = defaultdict(list)
# loop over each list
for sublist in LA:
    # group by first element of each list
    key = sublist[0]
    # add to dictionary, each key will have a list of sublists
    d[key].append(sublist)
# sum the second element of every sublist stored under each key
result = [[key, sum(x[1] for x in value)] for (key, value) in sorted(d.items())]
print(result)
Output:
[[1, 11], [2, 10], [4, 8], [6, 5]]
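If only the sums are needed, the intermediate lists can be skipped. A minimal sketch, assuming it is acceptable to accumulate directly into a defaultdict(int) (the name totals is illustrative):

```python
from collections import defaultdict

LA = [[1, 2], [2, 7], [4, 5], [1, 9], [6, 5], [4, 3], [2, 1], [2, 2]]

# Missing keys start at 0, so each value can be added straight away.
totals = defaultdict(int)
for key, value in LA:
    totals[key] += value

# Convert back to a list of [ID, sum] pairs, ordered by ID.
result = [[key, total] for key, total in sorted(totals.items())]
print(result)
```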

Related

Sum list of list by first index

I want to "sum" two lists of lists by first index.
To give an example, I have L1 and L2:
L1=[[2,"word1"],[1,"word2"],[3,"word3"]]
L2=[[7,"word4"],[6,"word1"],[3,"word5"],[6,"word3"]]
and I want this as output:
L3=[[8,"word1"],[1,"word2"],[9,"word3"],[7,"word4"],[3,"word5"]]
Of course I know how to code this but not in a very elegant way, with some while loop, and I'm wondering if there is no simpler solution...
My code so far:
def sum_list(L1,L2):
    word1=[x[1] for x in L1]
    word2=[x[1] for x in L2]
    score1=[x[0] for x in L1]
    score2=[x[0] for x in L2]
    word3=[]
    L3=[]
    i=0
    while i<len(word1):
        word3.append(word1[i])
        if word1[i] not in word2:
            L3.append([score1[i],word1[i]])
        else:
            L3.append([score1[i]+score2[word2.index(word1[i])],word1[i]])
        i=i+1
    i=0
    while i<len(word2):
        if word2[i] not in word3:
            L3.append([score2[i],word2[i]])
        i=i+1
    return L3
Thanks
# Accumulate counts in a dict.
d = dict()
list(map(lambda p: d.__setitem__(p[1],d.setdefault(p[1], 0)+p[0]), L1))
list(map(lambda p: d.__setitem__(p[1],d.setdefault(p[1], 0)+p[0]), L2))
# Unpack the dict.
L3 = [[v, k] for k, v in d.items()]  # [count, word], matching the requested layout
print(L3)
A more verbose approach:
def accumulateInDict(d, l):
    for pair in l:
        key = pair[1]
        count = d.get(key, 0)
        d[key] = count + pair[0]
# Accumulate counts in a dict.
d = dict()
accumulateInDict(d, L1)
accumulateInDict(d, L2)
# Unpack the dict into [count, word] pairs.
L3 = [[v, k] for k, v in d.items()]
print(L3)
You could use a Counter for that:
from collections import Counter
L1 = [[2, "word1"], [1, "word2"], [3, "word3"]]
L2 = [[7, "word4"], [6, "word1"], [3, "word5"], [6, "word3"]]
L1_reversed = [l[::-1] for l in L1]
L2_reversed = [l[::-1] for l in L2]
L1_counter = Counter(dict(L1_reversed))
L2_counter = Counter(dict(L2_reversed))
L3_counter = L1_counter + L2_counter
print(L3_counter)
Gives:
Counter({'word3': 9, 'word1': 8, 'word4': 7, 'word5': 3, 'word2': 1})
And if you want to have your list of lists back:
L3 = [[value, key] for key, value in L3_counter.items()]
print(L3)
Which gives:
[[8, 'word1'], [1, 'word2'], [9, 'word3'], [7, 'word4'], [3, 'word5']]
The code is a bit convoluted because the data first has to be reshaped into the structure Counter needs. You may want to rethink how you store the data anyway: a list of lists is not necessarily the best structure for it. A dict would be a better representation here (no duplicates, and directly indexable by key).
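As a sketch of that suggestion, assuming the scores can be stored as word-to-score mappings from the start (scores1 and scores2 are hypothetical names for that reshaped data):

```python
from collections import Counter

# Hypothetical reshaping of L1 and L2: word -> score instead of [score, word].
scores1 = {"word1": 2, "word2": 1, "word3": 3}
scores2 = {"word4": 7, "word1": 6, "word5": 3, "word3": 6}

# Counter.update adds counts for existing keys instead of replacing them.
total = Counter(scores1)
total.update(scores2)

print(total["word1"])
print(dict(total))
```

With this layout no reversing step is needed, and a single score can be looked up by word directly.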

Create a key:value pair in the first loop and append more values in subsequent loops

How can I create a key:value pair in a first loop and then just append values in subsequent loops?
For example:
a = [1,2,3]
b = [8,9,10]
c = [4,6,5]
myList= [a,b,c]
positions= ['first_position', 'second_position', 'third_position']
I would like to create a dictionary which records the position values for each letter so:
mydict = {'first_position':[1,8,4], 'second_position':[2,9,6], 'third_position':[3,10,5]}
Imagine that instead of 3 letters with 3 values each, I had millions. How could I loop through each letter and:
In the first loop create the key:value pair 'first_position':[1]
In subsequent loops append values to the corresponding key: 'first_position':[1,8,4]
Thanks!
Try this code:
mydict = {}
for i in range(len(positions)):
    mydict[positions[i]] = [each[i] for each in myList]
Output:
{'first_position': [1, 8, 4],
'second_position': [2, 9, 6],
'third_position': [3, 10, 5]}
dictionary.get('key') will return None if the key doesn't exist. So, you can check if the value is None and then append it if it isn't.
mydict = {}  # avoid naming variables dict or list: that shadows the built-ins
for sublist in myList:
    for position, val in enumerate(sublist):
        this_position = positions[position]
        if mydict.get(this_position) is not None:
            mydict[this_position].append(val)
        else:
            mydict[this_position] = [val]
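The membership check can also be folded into dict.setdefault, which creates the list on the first pass and returns the existing one afterwards. A small sketch using the question's data:

```python
a = [1, 2, 3]
b = [8, 9, 10]
c = [4, 6, 5]
myList = [a, b, c]
positions = ['first_position', 'second_position', 'third_position']

mydict = {}
for sublist in myList:
    for position, val in zip(positions, sublist):
        # First time: store and return a new empty list; later: return the stored list.
        mydict.setdefault(position, []).append(val)

print(mydict)
```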
The zip function will iterate the i'th values of positions, a, b and c in order. So,
a = [1,2,3]
b = [8,9,10]
c = [4,6,5]
positions= ['first_position', 'second_position', 'third_position']
sources = [positions, a, b, c]
mydict = {vals[0]:vals[1:] for vals in zip(*sources)}
print(mydict)
This creates tuples as the values, which is usually fine if they are read-only. Otherwise, do:
mydict = {vals[0]:list(vals[1:]) for vals in zip(*sources)}
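An equivalent sketch that keeps the keys apart from the data, assuming positions and myList as defined in the question (the name columns is illustrative):

```python
a = [1, 2, 3]
b = [8, 9, 10]
c = [4, 6, 5]
myList = [a, b, c]
positions = ['first_position', 'second_position', 'third_position']

# zip(*myList) transposes the rows into columns: (1, 8, 4), (2, 9, 6), (3, 10, 5).
columns = [list(column) for column in zip(*myList)]

# Pair each position name with its column.
mydict = dict(zip(positions, columns))
print(mydict)
```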

How to match values from one list to unique values in dictionary of lists in Python

So my input values are as follows:
temp_dict1 = {'A': [1,2,3,4], 'B':[5,5,5], 'C':[6,6,7,8]}
temp_dict2 = {}
val = [5]
The list val may contain more values, but for now, it only contains one. My desired outcome is:
>>>temp_dict2
{'B':[5]}
The final dictionary needs to only have the keys for the lists that contain the item in the list val, and only unique instances of that value in the list. I've tried iterating through the two objects as follows:
for i in temp_dict1:
    for j in temp_dict1[i]:
        for k in val:
            if k in j:
                temp_dict2.setdefault(i, []).append(k)
But that just raises the error "argument of type 'int' is not iterable". Any ideas?
Changed your dictionary to cover some more cases:
temp_dict1 = {'A': [1,2,3,4], 'B':[5,5,6], 'C':[6,6,7,8]}
temp_dict2 = {}
val = [5, 6]
for item in val:
    # use a different name for the dict values so the outer list val is not rebound
    for key, values in temp_dict1.items():
        if item in values:
            temp_dict2.setdefault(key, []).append(item)
print(temp_dict2)
# {'B': [5, 6], 'C': [6]}
Or, the same using a list comprehension (harder to read and used only for its side effects, so not recommended):
temp_dict2 = {}
[temp_dict2.setdefault(key, []).append(item) for item in val for key, values in temp_dict1.items() if item in values]
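If a one-expression form is wanted, a dict comprehension avoids using a list comprehension purely for its side effects. This is only a sketch; the names wanted and values are illustrative, and it assumes keys with no matches should be left out:

```python
temp_dict1 = {'A': [1, 2, 3, 4], 'B': [5, 5, 6], 'C': [6, 6, 7, 8]}
wanted = [5, 6]

temp_dict2 = {
    key: [item for item in wanted if item in values]
    for key, values in temp_dict1.items()
    # Skip keys whose list contains none of the wanted items.
    if any(item in values for item in wanted)
}
print(temp_dict2)
```

Each wanted item appears at most once per key, no matter how often it occurs in the source list.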
For comparison with @KeyurPotdar's solution, this can also be achieved via collections.defaultdict:
from collections import defaultdict
temp_dict1 = {'A': [1,2,3,4], 'B':[5,5,6], 'C':[6,6,7,8]}
temp_dict2 = defaultdict(list)
val = [5, 6]
for i in val:
    for k, v in temp_dict1.items():
        if i in v:
            temp_dict2[k].append(i)
# defaultdict(list, {'B': [5, 6], 'C': [6]})
You can try this approach:
temp_dict1 = {'A': [1,2,3,4,5,6], 'B':[5,5,5], 'C':[6,6,7,8]}
val = [5,6]
def values(dict_,val_):
    default_dict={}
    for i in val_:
        for k,m in dict_.items():
            if i in m:
                if k not in default_dict:
                    default_dict[k]=[i]
                else:
                    default_dict[k].append(i)
    return default_dict
print(values(temp_dict1,val))
output:
{'B': [5], 'C': [6], 'A': [5, 6]}

Python. Adding multiple items to keys in a dict

I am trying to build a dict from a set of unique values to serve as the keys and a zipped list of tuples to provide the items.
set = ("a","b","c")
lst1 = ("a","a","b","b","c","d","d")
lst2 = (1,2,3,3,4,5,6)
zip = [("a",1),("a",2),("b",3),("b",3),("c",4),("d",5),("d",6)]
dct = {"a":[1,2], "b":[3,3], "c":[4], "d":[5,6]}
But I am getting:
dct = {"a":1,"b":3,"c":4,"d":5}
Here is my code so far:
# make two lists
rtList = ["EVT","EVT","EVT","EVT","EVT","EVT","EVT","HIL"]
raList = ["C64G","C64R","C64O","C32G","C96G","C96R","C96O","RA96O"]
# make a set of unique codes in the first list
routes = set()
for r in rtList:
    routes.add(r)
# zip the lists
RtRaList = zip(rtList,raList)
# print(RtRaList)
# make a dictionary with list one as the keys and list two as the values
SrvCodeDct = {}
for key, item in RtRaList:
    for r in routes:
        if r == key:
            SrvCodeDct[r] = item
for key, item in SrvCodeDct.items():
    print(key, item)
You don't need any of that. Just use a collections.defaultdict.
import collections
rtList = ["EVT","EVT","EVT","EVT","EVT","EVT","EVT","HIL"]
raList = ["C64G","C64R","C64O","C32G","C96G","C96R","C96O","RA96O"]
d = collections.defaultdict(list)
for k,v in zip(rtList, raList):
    d[k].append(v)
You can achieve this using the dict.setdefault method (l1 and l2 are the two input sequences):
my_dict = {}
for i, j in zip(l1, l2):
    my_dict.setdefault(i, []).append(j)
which will return value of my_dict as:
>>> my_dict
{'a': [1, 2], 'c': [4], 'b': [3, 3], 'd': [5, 6]}
OR, use collections.defaultdict as mentioned by TigerhawkT3.
Issue with your code: you never check whether the key already exists. Every time you execute SrvCodeDct[r] = item, you overwrite the previous value stored under r with item. To fix this, add an if condition:
l1 = ("a","a","b","b","c","d","d")
l2 = (1,2,3,3,4,5,6,)
my_dict = {}
for i, j in zip(l1, l2):
    if i in my_dict: # your `if` check
        my_dict[i].append(j) # append value to existing list
    else:
        my_dict[i] = [j]
>>> my_dict
{'a': [1, 2], 'c': [4], 'b': [3, 3], 'd': [5, 6]}
However this code can be simplified using collections.defaultdict (as mentioned by TigerhawkT3), OR using dict.setdefault method as:
my_dict = {}
for i, j in zip(l1, l2):
    my_dict.setdefault(i, []).append(j)
In dicts, all keys are unique, and each key can have only one value.
The easiest way to solve this is to make each value in the dictionary a list, emulating what is called a multimap. The list holds all the elements that the key maps to.
EDIT:
You might want to check out this PyPI package: https://pypi.python.org/pypi/multidict
Under the hood, however, it probably works as described above.
AFAIK, there is nothing built-in that supports what you are after.
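A minimal sketch of that dict-of-lists multimap, using only the standard library (the names multimap and add are hypothetical, and the data is taken from the question):

```python
multimap = {}

def add(mapping, key, value):
    # Hypothetical helper: append value to the list stored under key,
    # creating that list the first time the key is seen.
    mapping.setdefault(key, []).append(value)

for key, value in zip(("a", "a", "b", "b", "c", "d", "d"), (1, 2, 3, 3, 4, 5, 6)):
    add(multimap, key, value)

print(multimap)
```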

Adding and combining values with dictionary comprehensions?

Let's say I have a list:
a_list = [["Bob", 2], ["Bill", 1], ["Bob", 2]]
I want to add these to a dictionary and combining the values to the corresponding key. So, in this case, I want a dictionary that looks like this:
{"Bob" : 4, "Bill" : 1}
How can I do that with dictionary comprehensions?
This is what I have:
d1 = {group[0]: int(group[1]) for group in a_list}
To do what you want with a dictionary comprehension, you'd need an external extra dictionary to track values per name so far:
memory = {}
{name: memory[name] for name, count in a_list if not memory.__setitem__(name, count + memory.setdefault(name, 0))}
but this produces two dictionaries with the sums:
>>> a_list = [["Bob", 2], ["Bill", 1], ["Bob", 2]]
>>> memory = {}
>>> {name: memory[name] for name, count in a_list if not memory.__setitem__(name, count + memory.setdefault(name, 0))}
{'Bob': 4, 'Bill': 1}
>>> memory
{'Bob': 4, 'Bill': 1}
That's because without the memory dictionary you cannot access the running sum per name.
At that point you may as well just use a dictionary and a regular loop:
result = {}
for name, count in a_list:
    result[name] = result.get(name, 0) + count
or a collections.defaultdict() object:
from collections import defaultdict
result = defaultdict(int)
for name, count in a_list:
    result[name] += count
or even a collections.Counter() object, giving you additional multi-set functionality for later:
from collections import Counter
result = Counter()
for name, count in a_list:
    result[name] += count
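As a rough illustration of that multi-set functionality, using only documented Counter features (the extra list is made-up data and tally is a hypothetical helper):

```python
from collections import Counter

a_list = [["Bob", 2], ["Bill", 1], ["Bob", 2]]
extra = [["Bill", 5], ["Ann", 3]]  # made-up second batch

def tally(pairs):
    # Hypothetical helper: sum the counts per name into a Counter.
    result = Counter()
    for name, count in pairs:
        result[name] += count
    return result

first = tally(a_list)
combined = first + tally(extra)  # adding Counters sums the counts per key

print(combined.most_common(1))
```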
The other, less efficient option is to sort your a_list first and then use itertools.groupby():
from itertools import groupby
from operator import itemgetter
key = itemgetter(0) # sort by name
{name: sum(v[1] for v in group)
 for name, group in groupby(sorted(a_list, key=key), key)}
This is an O(N log N) approach vs. the straightforward O(N) approach of a loop without a sort.
