Convert a list of str and lists to dict - python

In Python, how do I convert a list that contains strings and two-element lists into a dictionary in which each string is a key and its value is the list of those inner lists whose first element equals the key?
For example, the current list I have is:
['A', ['A', 1], 'B', ['B',1], ['B',2], 'C', ['C', 1], ['C',2], ['C',3]]
and I want the dictionary:
{'A': [['A', 1]], 'B': [['B',1], ['B',2]], 'C': [['C',1], ['C',2], ['C',3]]}
Thank you.
EDIT: The number of lists that follow a string is arbitrary.

With this, no matter the order of the list, it selects exactly what you're looking for.
def new(list_):
    # For every string x, collect the inner lists whose first element is x
    new_dic = {x: [y for y in list_ if isinstance(y, list) and y[0] == x] for x in list_ if isinstance(x, str)}
    print(new_dic)

new(['A', ['A', 1], ['A',2], 'B', ['B',1], ['B',2], 'C', ['C', 1], ['C',2]])

d = {l: [] for l in mylist if type(l) is str}
for l in mylist:
    if type(l) is list:
        d[l[0]].append(l)

You can try defaultdict:
from collections import defaultdict
my_dict = defaultdict(list)
my_list = ['A', ['A', 1], ['A',2], 'B', ['B',1], ['B',2], 'C', ['C', 1], ['C',2]]
for item in my_list:
    # Test the type rather than len(item) > 1, which would also match multi-character strings
    if isinstance(item, list):
        my_dict[item[0]].append(item)

The string values in your list don't actually matter here. Given the structure of the list and the required output, you only need to look at the inner lists, and with a defaultdict you can build the dictionary directly:
from collections import defaultdict
l = ['A', ['A', 1], 'B', ['B',1], ['B',2], 'C', ['C', 1], ['C',2], ['C',3]]
d = defaultdict(list)
for data in l:
    if type(data) is list:
        d[data[0]].append(data)
Output:
defaultdict(<class 'list'>, {'A': [['A', 1]], 'C': [['C', 1], ['C', 2], ['C', 3]], 'B': [['B', 1], ['B', 2]]})
Here the defaultdict uses list as its default factory, so a key that is accessed for the first time starts out with an empty list. As you iterate over the input, check the type of each element. When you find a list, use its first value as the key and append the whole list to that key's value. That should give you the output you are looking for.
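One thing the loop above leaves open is a string that has no lists after it: it never becomes a key. Here is a minimal sketch that handles that case and returns a plain dict. The function name `group_by_first` and the trailing `'D'` in the sample data are my own additions for illustration, not part of the question.

```python
from collections import defaultdict

def group_by_first(items):
    """Group the nested lists in `items` by their first element."""
    grouped = defaultdict(list)
    for item in items:
        if isinstance(item, list):
            grouped[item[0]].append(item)
        else:
            # Accessing the key registers a bare string even if no lists follow it.
            grouped[item]
    return dict(grouped)

# 'D' is a hypothetical extra entry with no lists after it.
data = ['A', ['A', 1], 'B', ['B', 1], ['B', 2], 'C', ['C', 1], ['C', 2], ['C', 3], 'D']
result = group_by_first(data)
print(result)
```

Converting back with dict() at the end also gives a cleaner printout than the defaultdict repr.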

Related

Creating a dictionary using indices of list

What is the most efficient way to create a dictionary from a string/list? For example, if I have a list ['a', 'b', 'c', 'd'], how would I create the dictionary in which the elements of the list are the keys and their indices are the values? For the list above, the result would be: {'a': 0, 'b': 1, 'c': 2, 'd': 3}
enumerate() yields each index together with its element, so you can use it in a dictionary comprehension:
l = ['a', 'b', 'c', 'd']
d = {value: index for index, value in enumerate(l)}
You can use this:
lista = ['a', 'b', 'c', 'd']
dictionary = {}
n = 0
for el in lista:
    dictionary[el] = n
    n += 1
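Both approaches above behave the same way when the list contains duplicates: a later occurrence overwrites the index of an earlier one. If you want the first index instead, something like the following sketch should work. The helper name `first_index_map` and the sample list with a repeated 'a' are assumptions for illustration.

```python
def first_index_map(items):
    """Map each element to the index of its first occurrence."""
    mapping = {}
    for index, value in enumerate(items):
        # setdefault stores the index only if the key is not present yet.
        mapping.setdefault(value, index)
    return mapping

letters = ['a', 'b', 'a', 'c']  # hypothetical input with a duplicate
last_wins = {value: index for index, value in enumerate(letters)}
first_wins = first_index_map(letters)
print(last_wins)   # {'a': 2, 'b': 1, 'c': 3}
print(first_wins)  # {'a': 0, 'b': 1, 'c': 3}
```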

Creating two lists from one to get IDs and numerical values

I have a list e.g. below
Final_list = ['A', 'B', 'C', 'D', 'E', 'B_-1', 'C_-1', 'D_-1']
and I would like to create two lists to get IDs and then numerical values.
To do that, I split each item on "_":
j = []
for i in Final_list:
    timelags = i.split("_")
    j.append(timelags)
print(j)
and the result is
List_2 = [['A'], ['B'], ['C'], ['D'], ['E'], ['B', '-1'], ['C', '-1'], ['D', '-1']]
But I would like to create two lists e.g. see below;
ID = ['A','B','C','D','E']
Timelag = [[0],[-1,0],[-1,0],[-1,0],[0]]
You can see that there are no duplicates in the ID list, and that the Timelag list contains 0 where the original item has no "_", e.g. for A and E.
P.S: the order needs to be the same for both lists.
If the items in the two lists must correspond to each other, you could map each element to a value:
Final_list = ['A', 'B', 'C', 'D', 'E', 'B_-1', 'C_-1', 'D_-1']
mapping = {}
for elt in Final_list:
    if len(elt) == 1:
        mapping[elt] = [0]
    else:
        # Assumes the bare ID (e.g. 'B') was already seen before 'B_-1'
        mapping[elt[0]] = [int(elt[2:])] + mapping[elt[0]]
mapping
{'A': [0], 'B': [-1, 0], 'C': [-1, 0], 'D': [-1, 0], 'E': [0]}
On Python versions before 3.7 a plain dict does not guarantee the order of the elements, but the pairing of each element with its value remains intact.
If lists are needed, they can be extracted. They keep the pairing, and on Python 3.7+ also the insertion order:
list(mapping.keys()), list(mapping.values())
(['A', 'B', 'C', 'D', 'E'], [[0], [-1, 0], [-1, 0], [-1, 0], [0]])
Further, an ordered dictionary could be used to maintain both pairing, and order of the lists, depending on the importance this has to your use case.
Caveat: the tokenization of the input data is rather crude and assumes that the length and the values of the data are constrained (single-character IDs, and the bare ID appearing before its lagged form). Refinements can be added depending on your needs.
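As one possible refinement of the caveat above, here is a sketch that tokenizes with str.partition, so IDs may be longer than one character and may appear in any order. The function name `split_ids_and_lags` is made up for this example, and sorting each lag list in ascending order is my assumption based on the [-1, 0] shown in the question. It relies on dicts keeping insertion order (Python 3.7+).

```python
def split_ids_and_lags(labels):
    """Return (ids, lags) with IDs in first-seen order; a bare ID counts as lag 0."""
    mapping = {}
    for label in labels:
        # partition splits on the first "_" only; lag_text is "" when there is none.
        name, _, lag_text = label.partition("_")
        lag = int(lag_text) if lag_text else 0
        mapping.setdefault(name, []).append(lag)
    ids = list(mapping)
    # Assumption: lags are wanted in ascending order, as in the question's example.
    lags = [sorted(values) for values in mapping.values()]
    return ids, lags

Final_list = ['A', 'B', 'C', 'D', 'E', 'B_-1', 'C_-1', 'D_-1']
ID, Timelag = split_ids_and_lags(Final_list)
print(ID)       # ['A', 'B', 'C', 'D', 'E']
print(Timelag)  # [[0], [-1, 0], [-1, 0], [-1, 0], [0]]
```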
A solution using groupby:
from itertools import groupby
l = [list(g) for k, g in groupby(sorted(Final_list), lambda x: x[0])]
# int() turns the lag into a number, matching the requested Timelag values
d = [(i[0], [int(i[1].split('_')[1]), 0] if len(i) == 2 else [0]) for i in l]
list(zip(*d))
Output:
[('A', 'B', 'C', 'D', 'E'), ([0], [-1, 0], [-1, 0], [-1, 0], [0])]
Since order is to be preserved, use OrderedDict. Store each key together with its time lag and use that to build your lists.
>>> from collections import OrderedDict
>>> d = OrderedDict()
>>> for x in Final_list:
...     if len(x) <= 1:
...         d[x] = 0
...     else:
...         a, b = x.split('_')
...         d[a] = b
...
>>> d
=> OrderedDict([('A', 0), ('B', '-1'), ('C', '-1'), ('D', '-1'), ('E', 0)])
#convert into the format OP wants
>>> [ [int(v)] for v in d.values() ]
=> [[0], [-1], [-1], [-1], [0]]
NOTE: since the 0 in [-1,0] is a bit ambiguous, I have not included it.

Converting list to dictionary with list elements as index - Python

From Python: create dict from list and auto-gen/increment the keys (list is the actual key values)?, it's possible to create a dict from a list using enumerate to generate tuples made up of incremental keys and the elements in the list, i.e.:
>>> x = ['a', 'b', 'c']
>>> list(enumerate(x))
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> dict(enumerate(x))
{0: 'a', 1: 'b', 2: 'c'}
It is also possible to reverse the keys and values by iterating through every item in the dict (assuming that there is a one-to-one mapping between keys and values):
>>> x = ['a', 'b', 'c']
>>> d = dict(enumerate(x))
>>> {v:k for k,v in d.items()}
{'a': 0, 'c': 2, 'b': 1}
Given the input list ['a', 'b', 'c'], how can I get the dictionary with the elements as keys and the incremental indices as values, without looping an additional time to reverse the dictionary?
How about simply:
>>> x = ['a', 'b', 'c']
>>> {j:i for i,j in enumerate(x)}
{'a': 0, 'c': 2, 'b': 1}

Identify duplicates in a list of lists and sum up their last items

I have a list of lists from which I would like to remove duplicates and sum up duplicates' last elements. An item is a duplicate if its first 2 elements are the same. This is better illustrated with an example:
input = [['a', 'b', 2], ['a', 'c', 1], ['a', 'b', 1]]
# Desired output
output = [['a', 'b', 3], ['a', 'c', 1]]
There are similar questions here on SO but I haven't found one which would deal with list of lists and summing up list items at the same time.
I tried several approaches but couldn't make it work:
create a copy of input list, make a nested loop, if second duplicate is found, add its last item to original --> this got too confusing with too much nesting
I looked into collections Counter but it doesn't seem to work with list of lists
itertools
Could you give me any pointers on how to approach this problem?
I don't think lists are the best data structure for this. I would use a dictionary with tuple keys. If you really need a list, you can create one afterwards:
from collections import defaultdict
data = [['a', 'b', 2], ['a', 'c', 1], ['a', 'b', 1]]
result = defaultdict(int)  # new keys are auto-added and initialized as 0
for item in data:
    a, b, value = item
    result[(a, b)] += value
print(result)
# defaultdict(<class 'int'>, {('a', 'b'): 3, ('a', 'c'): 1})
print(dict(result))
# {('a', 'b'): 3, ('a', 'c'): 1}
print([[a, b, total] for (a, b), total in result.items()])
# [['a', 'b', 3], ['a', 'c', 1]]
You could use Counter; someone's already given a manual defaultdict solution; so here's an itertools.groupby one, just for variety:
>>> from itertools import groupby
>>> inp = [['a', 'b', 2], ['a', 'c', 1], ['a', 'b', 1]]
>>> [k[:2] + [sum(v[2] for v in g)] for k,g in groupby(sorted(inp), key=lambda x: x[:2])]
[['a', 'b', 3], ['a', 'c', 1]]
but I second @m.wasowski's view that a dictionary (or dict subclass like defaultdict or Counter) is probably a better data structure.
It'd also be somewhat more general to use [:-1] and [-1] instead of [:2] and [2], but I'm too lazy to make the change. :-)
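For what it's worth, the more general [:-1] / [-1] variant mentioned above might look like the sketch below. The function name `sum_last_by_prefix` is only illustrative. Note that the key has to be applied both when sorting and when grouping, since groupby only merges adjacent rows.

```python
from itertools import groupby

def sum_last_by_prefix(rows):
    """Merge rows that share everything but the last item, summing the last item."""
    key = lambda row: row[:-1]
    # groupby only merges adjacent rows, so sort by the same key first.
    return [prefix + [sum(row[-1] for row in group)]
            for prefix, group in groupby(sorted(rows, key=key), key=key)]

inp = [['a', 'b', 2], ['a', 'c', 1], ['a', 'b', 1]]
merged = sum_last_by_prefix(inp)
print(merged)  # [['a', 'b', 3], ['a', 'c', 1]]
```

Because the slices are taken relative to the end of each row, the same function works for rows with more than two leading fields.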
I prefer this approach:
>>> from collections import Counter
>>> # `input` is the list of lists from the question
>>> sum((Counter({tuple(i[:-1]): i[-1]}) for i in input), Counter())
Counter({('a', 'b'): 3, ('a', 'c'): 1})
(Thanks to @DSM for pointing out an improvement to my original answer.)
If you want it in list form:
>>> [[a, b, n] for (a,b),n in _.items()]
[['a', 'b', 3], ['a', 'c', 1]]
>>> t = [['a', 'b', 2], ['a', 'c', 1], ['a', 'b', 1]]
>>> sums = {}
>>> for i in t:
...     sums[tuple(i[:-1])] = sums.get(tuple(i[:-1]), 0) + i[-1]
...
>>> output = [[a,b,sums[(a,b)]] for a,b in sums]
>>> output
[['a', 'b', 3], ['a', 'c', 1]]
inp = [['a', 'b', 2], ['a', 'c', 1], ['a', 'b', 1], ['a', 'c', 2], ['a', 'b', 4]]
lst = []
seen = []  # indices already merged into an earlier item
for i, first in enumerate(inp):
    if i in seen:
        continue
    found = False
    count = first[-1]
    for j, second in enumerate(inp[i + 1:]):
        if first[:2] == second[:2]:
            count += second[-1]
            found = True
            seen.append(i + j + 1)
    if found:
        lst.append(first[:-1] + [count])
    else:
        lst.append(first)
print(lst)
# [['a', 'b', 7], ['a', 'c', 3]]

Sort List in Python by two other lists

My question is very similar to these two links 1 and 2:
I have three different lists. I want to sort List1 based on List2 (in ascending order). However, I have repeats in List2. I then want to sort these repeats by List3 (in descending order). Confusing enough?
What I have:
List1 = ['a', 'b', 'c', 'd', 'e']
List2 = [4, 2, 3, 2, 4]
List3 = [0.1, 0.8, 0.3, 0.6, 0.4]
What I want:
new_List1 = ['b', 'd', 'c', 'e', 'a']
'b' comes before 'd' since 0.8 > 0.6. 'e' comes before 'a' since 0.4 > 0.1.
I think you should be able to do this by:
paired_sorted = sorted(zip(List2,List3,List1),key = lambda x: (x[0],-x[1]))
l2,l3,l1 = zip(*paired_sorted)
In action:
>>> List1 = ['a', 'b', 'c', 'd', 'e']
>>> List2 = [4, 2, 3, 2, 4]
>>> List3 = [0.1, 0.8, 0.3, 0.6, 0.4]
>>> paired_sorted = sorted(zip(List2,List3,List1),key = lambda x: (x[0],-x[1]))
>>> l2,l3,l1 = zip(*paired_sorted)
>>> print(l1)
('b', 'd', 'c', 'e', 'a')
Here's how it works. First we match corresponding elements from your lists using zip. We then sort those elements based on the items from List2 first and (negated) List3 second. Then we just need to pull off the List1 elements again using zip and argument unpacking -- Although you could do it easily with a list-comprehension if you wanted to make sure you had a list at the end of the day instead of a tuple.
This gets a little tougher if you can't easily negate the values in List3 -- e.g. if they're strings. You need to do the sorting in 2 passes:
paired = zip(List2,List3,List1)
rev_sorted = sorted(paired,reverse=True,key=lambda x: x[1]) #"minor" sort first
paired_sorted = sorted(rev_sorted,key=lambda x:x[0]) #"major" sort last
l2,l3,l1 = zip(*paired_sorted)
(You could use operator.itemgetter(1) in place of lambda x: x[1] above if you prefer.) This works because Python's sort is stable: it does not reorder elements that compare equal.
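To make the two-pass idea concrete, here is a small sketch where the minor key is a string and so cannot be negated. The `minor` values are invented for this example (chosen so the ties break the same way as List3 in the question), and the variable names are my own.

```python
from operator import itemgetter

names = ['a', 'b', 'c', 'd', 'e']
major = [4, 2, 3, 2, 4]            # sort ascending
minor = ['k', 'z', 'm', 'q', 'p']  # hypothetical string keys, sort descending within ties

paired = list(zip(major, minor, names))
# Pass 1: order by the minor key, descending.
by_minor = sorted(paired, key=itemgetter(1), reverse=True)
# Pass 2: order by the major key; stability keeps the pass-1 order among ties.
by_major = sorted(by_minor, key=itemgetter(0))
result = [name for _, _, name in by_major]
print(result)  # ['b', 'd', 'c', 'e', 'a']
```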
This requires a decorate-sort-undecorate step:
decorated = list(zip(List1, List2, List3))  # list() is needed on Python 3, where zip returns an iterator
decorated.sort(key=lambda v: (v[1], -v[2]))
new_list1 = [v[0] for v in decorated]
or, combined into one line:
new_list1 = [v[0] for v in sorted(zip(List1, List2, List3), key=lambda v: (v[1], -v[2]))]
Output:
>>> List1 = ['a', 'b', 'c', 'd', 'e']
>>> List2 = [4, 2, 3, 2, 4]
>>> List3 = [0.1, 0.8, 0.3, 0.6, 0.4]
>>> new_list1 = [v[0] for v in sorted(zip(List1, List2, List3), key=lambda v: (v[1], -v[2]))]
>>> new_list1
['b', 'd', 'c', 'e', 'a']
>>> [v for i, v in sorted(enumerate(List1), key=lambda i_v: (List2[i_v[0]], -List3[i_v[0]]))]
['b', 'd', 'c', 'e', 'a']
This sorts the index/value pairs by using the indices to get the corresponding values from the other lists to use in the key function used for ordering by sorted(), and then extracts just the values using a list comprehension.
Here is a shorter alternative that sorts just the indices and then uses those indices to grab the values from List1:
>>> [List1[i] for i in sorted(range(len(List1)), key=lambda i: (List2[i], -List3[i]))]
['b', 'd', 'c', 'e', 'a']
