I have two lists, one of which is nested:
lst1 = [1, 2, 3]
lst2 = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
How can I build a single list of dictionaries from them, like this:
dct = [{1:'a', 2:'b', 3:'c'}, {1:'d', 2:'e', 3:'f'}, {1:'g', 2:'h', 3:'i'}]
Here is one way, using a dict comprehension inside a list comprehension:
lst1 = [1, 2, 3]
lst2 = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
list_of_dicts = [{lst1[i]:x for i,x in enumerate(lst_tmp)} for lst_tmp in lst2 ]
list_of_dicts
>>> [{1: 'a', 2: 'b', 3: 'c'}, {1: 'd', 2: 'e', 3: 'f'}, {1: 'g', 2: 'h', 3: 'i'}]
Or, if you intentionally added the outer curly braces and meant a set of dictionaries: that is not possible, because dictionaries are not hashable (although there are some creative workarounds, as in here).
Alternatively, you can also zip it:
list_of_dicts = [dict(zip(lst1,lst_tmp)) for lst_tmp in lst2 ]
list_of_dicts
>>> [{1: 'a', 2: 'b', 3: 'c'}, {1: 'd', 2: 'e', 3: 'f'}, {1: 'g', 2: 'h', 3: 'i'}]
The output that you want is not a dictionary, but if you mean a list of dictionaries then you can use this compact form:
lst1 = [1, 2, 3]
lst2 = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
dict_list = [dict(zip(lst1, lst2[i])) for i in range(len(lst2))]
You can do this with zip, which pairs up the elements of two lists; the pairs can then be passed to dict. A list comprehension does this in one line.
first = [1, 2, 3]
second = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
print([dict(zip(first, item)) for item in second])
# [{1: 'a', 2: 'b', 3: 'c'}, {1: 'd', 2: 'e', 3: 'f'}, {1: 'g', 2: 'h', 3: 'i'}]
Without list comprehension:
first = [1, 2, 3]
second = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
result = []
for item in second:
    result.append(dict(zip(first, item)))
print(result)
# [{1: 'a', 2: 'b', 3: 'c'}, {1: 'd', 2: 'e', 3: 'f'}, {1: 'g', 2: 'h', 3: 'i'}]
And a little bit more code, using enumerate:
first = [1, 2, 3]
second = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
result = []
for sub in second:
    d = {}
    for index, item in enumerate(first):
        d[item] = sub[index]
    result.append(d)
print(result)
# [{1: 'a', 2: 'b', 3: 'c'}, {1: 'd', 2: 'e', 3: 'f'}, {1: 'g', 2: 'h', 3: 'i'}]
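One thing the versions above do not show: zip stops at the shorter input, so a row with too few or too many items is silently truncated. A minimal sketch of a helper that checks the lengths first (rows_to_dicts is a hypothetical name, not something from the answers above):

```python
def rows_to_dicts(keys, rows):
    """Pair `keys` with each row; raise if a row has a different length."""
    result = []
    for row in rows:
        if len(row) != len(keys):
            # zip() would silently drop the extra items, so fail loudly instead
            raise ValueError(f"expected {len(keys)} items, got {len(row)}: {row!r}")
        result.append(dict(zip(keys, row)))
    return result


lst1 = [1, 2, 3]
lst2 = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
print(rows_to_dicts(lst1, lst2))
```

Whether a mismatch should raise or be tolerated depends on your data; raising is just one reasonable choice.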
I am trying to create a dictionary with a repeating pattern like
{0:"A",
1:"B",
2:"C",
3:"D",
4:"A",
5:"B",
6:"C",
7:"D",}
and so on. How would I do that? I have tried using for loops, but couldn't figure it out.
I'm not even sure this is the right approach to my problem. I am solving a simulation numerous times with the same output, only changing 1 input for every loop of the simulation.
Basically I end up with a DataFrame that collects the output (4 different series) for every simulation with columns
[0, 1, 2, 3, 4, 5, 6, 7, 8, ...]
which I would like to rename
["A", "B", "C", "D", "A", "B", "C", "D",...]
Alternatively, is there some sort of datatype in Python, which can provide 2 levels of categorizing like
[Simulation 1: ["A", "B", "C", "D"],
Simulation 2: ["A", "B", "C", "D"],
Simulation 3: ["A", "B", "C", "D"],
Simulation 4: ["A", "B", "C", "D"],
Simulation 5: ["A", "B", "C", "D"],
and so on...]
where "A", "B", "C" and "D" each contains a column of data output, that is different for every simulation?
You can achieve this neatly with itertools.cycle:
In [1]: import itertools
In [2]: cols = [0, 1, 2, 3, 4, 5, 6, 7]
In [3]: dict(zip(cols, itertools.cycle('ABCD')))
Out[3]: {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'A', 5: 'B', 6: 'C', 7: 'D'}
If you'd rather not import modules you could use dictionary comprehension with a modulus operator (%)
print({i:'ABCD'[i%4] for i in range(12)})
{0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'A', 5: 'B', 6: 'C', 7: 'D', 8: 'A', 9: 'B', 10: 'C', 11: 'D'}
You could also use the modulo operator along with string.ascii_uppercase in a dict comprehension:
>>> from string import ascii_uppercase
>>> n = 8
>>> repeat_every = 4
>>> d = {i: ascii_uppercase[i % repeat_every] for i in range(n)}
>>> d
{0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'A', 5: 'B', 6: 'C', 7: 'D'}
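If the labels are not single letters, the same idea can be wrapped in a small function. This is only a sketch; repeating_labels is a made-up name, and it combines the cycle approach above with islice so that it works for any label list:

```python
from itertools import cycle, islice


def repeating_labels(n, labels):
    """Map 0..n-1 to labels, wrapping around after the last label."""
    return dict(enumerate(islice(cycle(labels), n)))


print(repeating_labels(8, ["A", "B", "C", "D"]))
# For renaming columns, a plain list of names may be all that is needed:
print(list(islice(cycle(["A", "B", "C", "D"]), 8)))
```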
Alternatively, is there some sort of datatype in Python, which can
provide 2 levels of categorizing like...
You could use itertools.permutations inside a dict comprehension:
>>> from itertools import permutations
>>> from string import ascii_uppercase
>>>
>>> def pretty_print_simple_dict(d):
...     print("{")
...     for k, v in d.items():
...         print(f"\t{k}: {v}")
...     print("}")
...
>>> repeat_every = 4
>>> d = {
...     f"Simulation {i + 1}": list(p)
...     for i, p in enumerate(permutations(ascii_uppercase[:repeat_every]))
... }
>>>
>>> pretty_print_simple_dict(d)
{
Simulation 1: ['A', 'B', 'C', 'D']
Simulation 2: ['A', 'B', 'D', 'C']
Simulation 3: ['A', 'C', 'B', 'D']
Simulation 4: ['A', 'C', 'D', 'B']
Simulation 5: ['A', 'D', 'B', 'C']
Simulation 6: ['A', 'D', 'C', 'B']
Simulation 7: ['B', 'A', 'C', 'D']
Simulation 8: ['B', 'A', 'D', 'C']
Simulation 9: ['B', 'C', 'A', 'D']
Simulation 10: ['B', 'C', 'D', 'A']
Simulation 11: ['B', 'D', 'A', 'C']
Simulation 12: ['B', 'D', 'C', 'A']
Simulation 13: ['C', 'A', 'B', 'D']
Simulation 14: ['C', 'A', 'D', 'B']
Simulation 15: ['C', 'B', 'A', 'D']
Simulation 16: ['C', 'B', 'D', 'A']
Simulation 17: ['C', 'D', 'A', 'B']
Simulation 18: ['C', 'D', 'B', 'A']
Simulation 19: ['D', 'A', 'B', 'C']
Simulation 20: ['D', 'A', 'C', 'B']
Simulation 21: ['D', 'B', 'A', 'C']
Simulation 22: ['D', 'B', 'C', 'A']
Simulation 23: ['D', 'C', 'A', 'B']
Simulation 24: ['D', 'C', 'B', 'A']
}
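If what you are after is two levels of categorising (simulation first, then series) rather than permutations, a plain nested dict may already be enough. A rough sketch, assuming each simulation returns one column of data per label; run_simulation is a hypothetical stand-in for your own code and the numbers are dummy values:

```python
labels = ["A", "B", "C", "D"]
n_simulations = 3


def run_simulation(i):
    # Hypothetical stand-in: returns one list of output values per label
    return {label: [i * 10 + j] for j, label in enumerate(labels)}


# Outer key: simulation name; inner key: series label
results = {f"Simulation {i + 1}": run_simulation(i) for i in range(n_simulations)}

print(results["Simulation 2"]["C"])  # the "C" column of the second simulation
```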
Given list that looks like:
list = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
How do I return
final_list = [["A"], ["B"], ["A", "B"], ["A", "B", "C"]]
Note that I treat ["A","B"] to be same as ["B","A"]
and ["A","B","C"] same as ["B", "A", "C"].
Try this :
list_ = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
l = list(map(list, set(map(tuple, map(set, list_)))))
Output :
[['A', 'B'], ['B'], ['A', 'B', 'C'], ['A']]
The process works like this:
First convert each sub-list into a set. Thus ['A', 'B'] and ['B', 'A'] are both converted to {'A', 'B'}.
Now convert each set to a tuple. This is needed because sets are not hashable, so a set cannot contain other sets, whereas tuples are hashable.
Calling set() on the tuples then leaves only the unique ones.
Finally convert each tuple back into a list.
Note that this relies on equal sets yielding their elements in the same order, which is not guaranteed in general; tuple(sorted(s)) or frozenset(s) avoids that assumption.
This is equivalent to :
list_ = [['A'], ['B'], ['A', 'B'], ['B', 'A'], ['A', 'B', 'C'], ['B', 'A', 'C']]
l0 = [set(i) for i in list_]
# l0 = [{'A'}, {'B'}, {'A', 'B'}, {'A', 'B'}, {'A', 'B', 'C'}, {'A', 'B', 'C'}]
l1 = [tuple(i) for i in l0]
# l1 = [('A',), ('B',), ('A', 'B'), ('A', 'B'), ('A', 'B', 'C'), ('A', 'B', 'C')]
l2 = set(l1)
# l2 = {('A', 'B'), ('A',), ('B',), ('A', 'B', 'C')}
l = [list(i) for i in l2]
# l = [['A', 'B'], ['A'], ['B'], ['A', 'B', 'C']]
l = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
[list(i) for i in {tuple(sorted(i)) for i in l}]
One possible solution:
lst = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
print([
    list(i)
    for i in sorted(
        set(
            tuple(sorted(i))
            for i in lst
        ),
        key=lambda k: (len(k), k)
    )
])
Prints:
[['A'], ['B'], ['A', 'B'], ['A', 'B', 'C']]
When the data you want to handle has to be both unique and unordered, set and frozenset are a better choice of data structure.
A set is an unordered container of unique values.
A frozenset is a set that cannot be mutated; it is therefore hashable, which allows it to be an element of another set.
Example
lst = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
data = {frozenset(el) for el in lst}
print(data)
Output
{frozenset({'B'}), frozenset({'A', 'B'}), frozenset({'A', 'C', 'B'}), frozenset({'A'})}
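If you still need lists back, in the order they first appeared (as in the desired output of the question), one option is to use the frozenset only as a lookup key. A minimal sketch; unique_unordered is a made-up name:

```python
def unique_unordered(lists):
    """Keep the first sub-list seen for each distinct set of elements."""
    seen = set()
    result = []
    for sub in lists:
        key = frozenset(sub)  # order-insensitive and hashable
        if key not in seen:
            seen.add(key)
            result.append(sub)
    return result


lst = [["A"], ["B"], ["A", "B"], ["B", "A"], ["A", "B", "C"], ["B", "A", "C"]]
print(unique_unordered(lst))
```

Be aware that a frozenset also ignores repeated elements, so ["A", "A"] would count as a duplicate of ["A"]. If that matters, tuple(sorted(sub)) could be used as the key instead.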
The following is an equality partition. It works on a list of any type that has equality defined for it. It is slower than a hash partition, as it takes quadratic time.
def partition(L, key=None):
    if key is None:
        key = lambda x: x
    parts = []
    for item in L:
        for part in parts:
            if key(item) == key(part[0]):
                part.append(item)
                break
        else:
            parts.append([item])
    return parts
def unique(L, key=None):
    return [p[0] for p in partition(L, key=key)]
alist = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
unique(alist)
# results in [['A'], ['B'], ['A', 'B'], ['B', 'A'], ['A', 'B', 'C'], ['B', 'A', 'C']]
unique(alist, key=lambda v: tuple(sorted(v)))
# results in [['A'], ['B'], ['A', 'B'], ['A', 'B', 'C']]
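For comparison, here is roughly what the hash partition mentioned above could look like. It is only a sketch (hash_partition and hash_unique are names I made up), and it assumes the key function returns something hashable, which the equality version does not need:

```python
def hash_partition(items, key):
    """Group items by key(item) using a dict, in roughly linear time."""
    parts = {}
    for item in items:
        # dicts keep insertion order (Python 3.7+), so groups stay in first-seen order
        parts.setdefault(key(item), []).append(item)
    return list(parts.values())


def hash_unique(items, key):
    return [part[0] for part in hash_partition(items, key)]


alist = [["A"], ["B"], ["A", "B"], ["B", "A"], ["A", "B", "C"], ["B", "A", "C"]]
print(hash_unique(alist, key=lambda v: tuple(sorted(v))))
```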
I have a list of column names, and a list of lists that I would like to turn into a nested dictionary, where each inner list contains the column names as keys. By applying the code below, I encounter the same problem as my real data - I get the right key:value pairs, but only for the very last list.
I thought the way I was trying was a pretty simple approach (too simple?). I'm open to any way to do this, preferably without the use of third party packages, but in the interest of learning would like to know why this doesn't work.
keys = [1, 2, 3]
list_of_lists = [['A', 'B', 'C'], ['D', 'E', 'F']]
for x in list_of_lists:
    test = dict(zip(keys, x))
print(test)
Desired output:
{{1: 'A', 2: 'B', 3: 'C'}, {1: 'D', 2: 'E', 3: 'F'}}
Actual output:
{1: 'D', 2: 'E', 3: 'F'}
If what you want is indeed a list of dicts, a very simple one-liner:
keys = [1, 2, 3]
list_of_lists = [['A', 'B', 'C'], ['D', 'E', 'F']]
print([dict(zip(keys, values)) for values in list_of_lists])
# [{1: 'A', 2: 'B', 3: 'C'}, {1: 'D', 2: 'E', 3: 'F'}]
keys = [1, 2, 3]
list_of_lists = [['A', 'B', 'C'], ['D', 'E', 'F']]
test = []
for x in list_of_lists:
    test.append(dict(zip(keys, x)))
print(test)
This gives the list of dictionaries.
Output:
[{1: 'A', 2: 'B', 3: 'C'}, {1: 'D', 2: 'E', 3: 'F'}]
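As for why the original loop only shows the last pair of lists: each pass through the loop assigns a brand new dict to the name test, so the dict from the previous pass is simply dropped. A small side-by-side sketch (the history list is only there for illustration):

```python
keys = [1, 2, 3]
list_of_lists = [['A', 'B', 'C'], ['D', 'E', 'F']]

test = None
history = []
for x in list_of_lists:
    test = dict(zip(keys, x))  # rebinds `test`; the previous dict is no longer referenced
    history.append(test)       # appending keeps every dict around

print(test)     # only the dict built in the final iteration
print(history)  # one dict per inner list
```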
Nested dictionaries would require you to have key for each inner element. In the below example, I'm using count as key.
keys = [1, 2, 3]
list_of_lists = [['A', 'B', 'C'], ['D', 'E', 'F']]
test = {}
count = 0
for x in list_of_lists:
    test[count] = dict(zip(keys, x))
    count = count + 1
print(test)
Output:
{0: {1: 'A', 2: 'B', 3: 'C'}, 1: {1: 'D', 2: 'E', 3: 'F'}}
Unfortunately, what your desired output shows is a set of dicts, and since a dict is unhashable this will not work.
Alternatively you could make a list or tuple of dicts:
test = [{k:v for k, v in zip(keys, l)} for l in list_of_lists]
#[{1: 'A', 2: 'B', 3: 'C'}, {1: 'D', 2: 'E', 3: 'F'}]
Or a dict of dicts, the keys for the outer dict being an enumeration of the outer list
test = {i: {k:v for k, v in zip(keys, l)} for i, l in enumerate(list_of_lists)}
#{0: {1: 'A', 2: 'B', 3: 'C'}, 1: {1: 'D', 2: 'E', 3: 'F'}}
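To see the hashability problem for yourself, and one possible workaround if you really want set-like behaviour (de-duplicating the dicts), here is a rough sketch. It assumes all the values in the dicts are themselves hashable; the sample rows are made up:

```python
rows = [{1: 'A', 2: 'B'}, {1: 'A', 2: 'B'}, {1: 'D', 2: 'E'}]

try:
    {rows[0], rows[1]}  # a set of dicts
except TypeError as exc:
    error = str(exc)    # dicts are unhashable; exact wording varies by version
    print("TypeError:", error)

# Workaround: a frozenset of (key, value) pairs is hashable
unique_keys = {frozenset(d.items()) for d in rows}
unique_rows = [dict(k) for k in unique_keys]
print(unique_rows)      # order is not guaranteed
```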
You are better off placing your dictionaries in a list.
[dict(zip(keys, x)) for x in list_of_lists]
I am not sure, but if you also care about the position of each inner list, you might want to key them by index:
{i:x for i, x in zip(range(len(list_of_lists)), list_of_lists)}
Hope this helps
Edit:
Changed the dictionary response code
I can club two lists into a dictionary as below -
list1 = [1,2,3,4]
list2 = ['a','b','c','d']
dct = dict(zip(list1, list2))
print(dct)
Result,
{1: 'a', 2: 'b', 3: 'c', 4: 'd'}
However with duplicates as below,
list3 = [1,2,3,3,4,4]
list4 = ['a','b','c','d','e','f']
dct_ = dict(zip(list3, list4))
print(dct_)
I get,
{1: 'a', 2: 'b', 3: 'd', 4: 'f'}
What should I do to treat the duplicates in my list as individual keys in the resulting dictionary?
I am expecting results as below -
{1: 'a', 2: 'b', 3: 'c', 3: 'd', 4: 'e', 4: 'f'}
Instead, you can create a dictionary whose values are lists:
from collections import defaultdict
d = defaultdict(list)
for k,v in zip(list3, list4):
    d[k].append(v)
defaultdict(list, {1: ['a'], 2: ['b'], 3: ['c', 'd'], 4: ['e', 'f']})
You can't have duplicate keys in a dictionary. However, you can have multiple values (a list) mapped to each key.
An easy way to do this is with dict.setdefault():
list3 = [1,2,3,3,4,4]
list4 = ['a','b','c','d','e','f']
d = {}
for x, y in zip(list3, list4):
    d.setdefault(x, []).append(y)
print(d)
# {1: ['a'], 2: ['b'], 3: ['c', 'd'], 4: ['e', 'f']}
The other option is to use a collections.defaultdict(), as shown in @YOLO's answer.
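If you need every (key, value) pair kept separately, duplicates included, it may be simpler not to use a dict at all and keep a list of tuples. A small sketch using the lists from the question:

```python
list3 = [1, 2, 3, 3, 4, 4]
list4 = ['a', 'b', 'c', 'd', 'e', 'f']

# A list of pairs can hold the same "key" more than once
pairs = list(zip(list3, list4))
print(pairs)

# Looking up all values for one key is then a linear scan
values_for_3 = [v for k, v in pairs if k == 3]
print(values_for_3)
```

The trade-off is that lookups are no longer constant time, so this mostly makes sense for small data or when you only iterate over the pairs.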
For example, list to_be consists of: 3 of "a", 4 of "b", 3 of "c", 5 of "d"...
to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d", ...]
Now I want it to be like this:
done = ["a", "b", "c", "d", ... , "a", "b", "c", "d", ... , "b", "d", ...] (notice: some items are more than others as in amounts, but they need to be still in a pre-defined order, alphabetically for example)
What's the fastest way to do this?
Presuming I am understanding what you want, it can be done relatively easily by combining itertools.zip_longest, itertools.groupby and itertools.chain.from_iterable():
We first group the items (the "a"s, the "b"s, etc.), then zip the groups up to get the items in the order you want (one from each group), use chain to produce a single list, and finally remove the None values introduced by the zipping.
>>> import itertools
>>> [item for item in itertools.chain.from_iterable(itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
You may want to separate out some of the list comprehensions to make it a bit more readable, however:
>>> groups = itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])
>>> [item for item in itertools.chain.from_iterable(groups) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
(The given version is for 3.x, for 2.x, you will want izip_longest().)
As always, if you expect falsy items (empty strings, 0, etc.), you will want to test if item is not None instead, and if you need to keep None values intact, create a sentinel object and check for identity against that.
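The sentinel variant could look something like this. It is a sketch of the same zip_longest approach; the names interleave and _MISSING are my own:

```python
import itertools

_MISSING = object()  # unique sentinel; cannot collide with real data, even None


def interleave(items):
    """Take one item from each run of equal items in turn until all are used."""
    groups = [list(g) for _, g in itertools.groupby(items)]
    columns = itertools.zip_longest(*groups, fillvalue=_MISSING)
    return [x for x in itertools.chain.from_iterable(columns) if x is not _MISSING]


to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
print(interleave(to_be))
```

Because the filter is an identity check against the sentinel, items such as 0, "" and None pass through untouched.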
You could also use the roundrobin() recipe given in the docs, as an alternative to zipping, which makes it as simple as:
>>> list(roundrobin(*[list(x) for _, x in itertools.groupby(to_be)]))
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
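Since roundrobin() is not part of the itertools module itself, it has to be defined before the line above will run. The following is written along the lines of the recipe in the itertools documentation as I remember it; check the docs for the current version:

```python
from itertools import cycle, groupby, islice


def roundrobin(*iterables):
    """roundrobin('ABC', 'D', 'EF') --> A D E B F C"""
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next_item in nexts:
                yield next_item()
        except StopIteration:
            # Drop the iterator that just ran out and keep cycling over the rest
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))


to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
print(list(roundrobin(*[list(x) for _, x in groupby(to_be)])))
```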
As a final note, the observant might notice that I build lists from the groupby() iterators, which may seem wasteful. The reason comes from the docs:
The returned group is itself an iterator that shares the underlying
iterable with groupby(). Because the source is shared, when the
groupby() object is advanced, the previous group is no longer visible.
So, if that data is needed later, it should be stored as a list.
import collections
to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
counts = collections.Counter(to_be)
answer = []
while counts:
    # one of each remaining item, in sorted order
    answer.extend(sorted(counts))
    for k in counts:
        counts[k] -= 1
    # drop exhausted items (items() on 3.x; use iteritems() on 2.x)
    counts = {k:v for k,v in counts.items() if v>0}
Now, answer looks like this:
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
I'm not sure if this is fastest, but here's my stab at it:
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> def sort_key(a):
...     d[a] += 1
...     return d[a],a
...
>>> sorted(to_be,key=sort_key)
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
wrapped up in a function:
from collections import defaultdict
def weird_sort(x):
    d = defaultdict(int)
    def sort_key(a):
        # key is (how many times a has been seen so far, a)
        d[a] += 1
        return (d[a],a)
    return sorted(x,key=sort_key)
Of course, this requires that the elements in your iterable be hashable.
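To make it a bit clearer why this works, here is a sketch that builds the same keys explicitly instead of inside sorted(). The function name occurrence_keys is made up for illustration; it assumes, as the version above does, that keys are computed once per item in list order:

```python
from collections import defaultdict


def occurrence_keys(items):
    """Pair each item with how many times it has been seen so far."""
    seen = defaultdict(int)
    keys = []
    for item in items:
        seen[item] += 1
        keys.append((seen[item], item))
    return keys


to_be = ["a", "a", "b", "b", "b", "c"]
keys = occurrence_keys(to_be)
print(keys)
# Sorting the pairs puts all first occurrences first, then all second ones, etc.
print([item for _, item in sorted(keys)])
```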
A bit less elegant than Lattyware's:
import collections
def rearrange(l):
    counts = collections.Counter(l)
    output = []
    while (sum([v for k,v in counts.items()]) > 0):
        # one of each item that still has occurrences left, in sorted order
        output.extend(sorted([k for k, v in counts.items() if v > 0]))
        for k in counts:
            counts[k] = counts[k] - 1 if counts[k] > 0 else 0
    return output
Doing it "by hand" with a state machine should be way more efficient,
but for relatively small lists (<5000) you should have no problem taking advantage of
Python goodies and doing this:
to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d","e", "e"]
def do_it(lst):
    lst = lst[:]
    result = []
    while True:
        group = set(lst)
        result.extend(sorted(group))
        for element in group:
            del lst[lst.index(element)]
        if not lst:
            break
    return result
done = do_it(to_be)
The "big O" complexity of the function above should be really BIG. I have not even tried to figure it out.