How to re-arrange list like this (python)? - python

For example, list to_be consists of: 3 of "a", 4 of "b", 3 of "c", 5 of "d"...
to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d", ...]
Now I want it to be like this:
done = ["a", "b", "c", "d", ... , "a", "b", "c", "d", ... , "b", "d", ...] (notice: some items are more than others as in amounts, but they need to be still in a pre-defined order, alphabetically for example)
What's the fastest way to do this?

Presuming I am understanding what you want, it can be done relatively easily by combining itertools.zip_longest, itertools.groupby and itertools.chain.from_iterable():
We first group the items into sets (the "a"s, the "b"s, etc...), we zip them up to get them in the order your want (one from each set), use chain to produce a single list, and then remove the None values introduced by the zipping.
>>> [item for item in itertools.chain.from_iterable(itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
You may want to separate out some of the list comprehensions to make it a bit more readable, however:
>>> groups = itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])
>>> [item for item in itertools.chain.from_iterable(groups) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
(The given version is for 3.x, for 2.x, you will want izip_longest().)
As always, if you expect empty strings, 0, etc... then you will want to do if item is not None, and if you need to keep None values in tact, create a sentinel object and check for identity against that.
You could also use the roundrobin() recipe given in the docs, as an alternative to zipping, which makes it as simple as:
>>> list(roundrobin(*[list(x) for _, x in itertools.groupby(to_be)]))
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
As a final note, the observant might note me making lists from the groupby() generators, which may seem wasteful, the reason comes from the docs:
The returned group is itself an iterator that shares the underlying
iterable with groupby(). Because the source is shared, when the
groupby() object is advanced, the previous group is no longer visible.
So, if that data is needed later, it should be stored as a list.

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
counts = collections.Counter(to_be)
answer = []
while counts:
answer.extend(sorted(counts))
for k in counts:
counts[k] -= 1
counts = {k:v for k,v in counts.iteritems() if v>0}
Now, answer looks like this:
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

I'm not sure if this is fastest, but here's my stab at it:
>>> d = defaultdict(int)
>>> def sort_key(a):
... d[a] += 1
... return d[a],a
...
>>> sorted(to_be,key=sort_key)
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
wrapped up in a function:
def weird_sort(x):
d = defaultdict(int)
def sort_key(a):
d[a] += 1
return (d[a],a)
return sorted(x,key=sort_key)
Of course, this requires that the elements in your iterable be hashable.

A bit less elegant than Lattyware's:
import collections
def rearrange(l):
counts = collections.Counter(l)
output = []
while (sum([v for k,v in counts.items()]) > 0):
output.extend(sorted([k for k, v in counts.items() if v > 0))
for k in counts:
counts[k] = counts[k] - 1 if counts[k] > 0 else 0
return counts

Doing it "by hand and state machinne" should be way more efficient -
but for relatively small lists (<5000), you should have no problem taking vantage of
Python goodies doing this:
to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d","e", "e"]
def do_it(lst):
lst = lst[:]
result = []
while True:
group = set(lst)
result.extend(sorted(group))
for element in group:
del lst[lst.index(element)]
if not lst:
break
return result
done = do_it(to_be)
The "big O" complexity of the function above should be really BIG. I had not event ried to figure it out.

Related

Generating distinct partly-unordered permutations in Python

Given a list of length 2n, say ["a", "b", "c", "d", "e", "f"] or ["a", "a", "b", "b", "c", "d"] (elements in the list don't necessarily have to be unique), I'd like to generate all the possible distinct permutations of that list while taking into account that the order in which the element 2k and the element 2k+1 appear doesn't matter. That means that
["a", "b", "c", "d", "e", "f"] and ["b", "a", "c", "d", "e", "f"] are the same permutation, but
["a", "b", "c", "d", "e", "f"] and ["a", "c", "b", "d", "e", "f"] are not.
For example, from the list ["a", "b", "c", "d"], the code I need would generate this sequence:
["a", "b", "c", "d"], ["a", "c", "b", "d"], ["a", "d", "b", "c"], ["b", "c", "a", "d"], ["b", "d", "a", "c"], ["c", "d", "a", "b"]
I know it's possible to do that by generating all the permutations and keeping one of each of those that are equivalent to each other, but that's not a very efficient way to proceed, especially with larger sets. Is there a more efficient way to do that?
I wrote this code, but it's highly inefficient (keep in mind that I need to use lists with a length of up to 14):
from itertools import permutations
list_letters = ["a", "b", "c", "d"]
n = int(len(list_letters) / 2)
set_distinct_perm = set()
for perm in permutations(list_letters):
perm = list(perm)
for i in range(n):
perm[2*i:2*i + 2] = sorted(perm[2*i:2*i+2])
perm = tuple(perm)
set_distinct_perm.add(perm)
print(set_distinct_perm)
How about the following, where we take permutations of each [2k, 2k+1] (inclusive) subsequence and then take the product of those permutations:
from itertools import product, permutations
def equivalents(lst):
perms = product(*({*permutations(lst[i:i+2])} for i in range(0, len(lst), 2)))
return [[x for tupl in perm for x in tupl] for perm in perms] # flattening inner part
print(*equivalents('abcdef'))
# ['a', 'b', 'c', 'd', 'f', 'e'] ['b', 'a', 'c', 'd', 'f', 'e']
# ['a', 'b', 'c', 'd', 'e', 'f'] ['a', 'b', 'd', 'c', 'f', 'e']
# ['a', 'b', 'd', 'c', 'e', 'f'] ['b', 'a', 'd', 'c', 'e', 'f']
# ['b', 'a', 'd', 'c', 'f', 'e'] ['b', 'a', 'c', 'd', 'e', 'f']
print(*equivalents('aabbef'))
# ['a', 'a', 'b', 'b', 'f', 'e'] ['a', 'a', 'b', 'b', 'e', 'f']

Python: Remove duplicate lists in list of lists [duplicate]

This question already has answers here:
Removing duplicates from a list of lists
(16 answers)
How to remove duplicate lists in a list of list? [duplicate]
(2 answers)
Removing duplicates from list of lists in Python
(5 answers)
Python and remove duplicates in list of lists regardless of order within lists
(2 answers)
Remove duplicated lists in list of lists in Python
(4 answers)
Closed 3 years ago.
Given list that looks like:
list = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
How do I return
final_list = [["A"], ["B"], ["A", "B"], ["A", "B", "C"]]
Note that I treat ["A","B"] to be same as ["B","A"]
and ["A","B","C"] same as ["B", "A", "C"].
Try this :
list_ = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
l = list(map(list, set(map(tuple, map(set, list_)))))
Output :
[['A', 'B'], ['B'], ['A', 'B', 'C'], ['A']]
This process goes through like :
First convert each sub-list into a set. Thus ['A', 'B'] and ['B', 'A'] both are converted to {'A', 'B'}.
Now convert each of them to a tuple for removing duplicate items as set() operation can not be done with set sub-items in the list.
With set() operation make a list of unique tuples.
Now convert each tuple items in the list into list type.
This is equivalent to :
list_ = [['A'], ['B'], ['A', 'B'], ['B', 'A'], ['A', 'B', 'C'], ['B', 'A', 'C']]
l0 = [set(i) for i in list_]
# l0 = [{'A'}, {'B'}, {'A', 'B'}, {'A', 'B'}, {'A', 'B', 'C'}, {'A', 'B', 'C'}]
l1 = [tuple(i) for i in l0]
# l1 = [('A',), ('B',), ('A', 'B'), ('A', 'B'), ('A', 'B', 'C'), ('A', 'B', 'C')]
l2 = set(l1)
# l2 = {('A', 'B'), ('A',), ('B',), ('A', 'B', 'C')}
l = [list(i) for i in l2]
# l = [['A', 'B'], ['A'], ['B'], ['A', 'B', 'C']]
l = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
[list(i) for i in {tuple(sorted(i)) for i in l}]
One possible solution:
lst = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
print([
list(i)
for i in sorted(
set(
tuple(sorted(i))
for i in lst
),
key=lambda k: (len(k), k)
)
])
Prints:
[['A'], ['B'], ['A', 'B'], ['A', 'B', 'C']]
When the data you want to handle has to be both unique and unordered, a better choice of data structure are set and frozenset.
A set is an unordered container of unique values.
A frozenset is a set which cannot be mutated, it is thus hashable which allows it to be contained into another set.
Example
lst = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
data = {frozenset(el) for el in lst}
print(data)
Output
{frozenset({'B'}), frozenset({'A', 'B'}), frozenset({'A', 'C', 'B'}), frozenset({'A'})}
The following is a equality partition. It works on any list of any type that has equality defined for it. This is worse than a hash partition as it is quadratic time.
def partition(L, key=None):
if key is None:
key = lambda x: x
parts = []
for item in L:
for part in parts:
if key(item) == key(part[0]):
part.append(item)
break
else:
parts.append([item])
return parts
def unique(L, key=None):
return [p[0] for p in partition(L, key=key)]
alist = [["A"], ["B"], ["A","B"], ["B","A"], ["A","B","C"], ["B", "A", "C"]]
unique(alist)
# results in [['A'], ['B'], ['A', 'B'], ['B', 'A'], ['A', 'B', 'C'], ['B', 'A', 'C']]
unique(alist, key=lambda v: tuple(sorted(v)))
# results in [['A'], ['B'], ['A', 'B'], ['A', 'B', 'C']]

How to compare elements of a list in a Python map's value, and check to see if at least n number of elements match?

I want to iterate over a map's values and compare elements of a list to see if at least 3 elements match in the same order, and then have a list returned with the keys that match the condition.
prefs = {
's1': ["a", "b", "c", "d", "e"],
's2': ["c", "d", "e", "a", "b"],
's3': ["a", "b", "c", "d", "e"],
's4': ["c", "d", "e", "b", "e"],
's5': ["c", "d", "e", "a", "b"]
}
Here is a sample map. In this example keys s1, and s3 have at least three elements in the list value that match "a", "b", "c". So s1, and s3 should be returned like this s1 -- s3. Similarly s2 and s4 match so that should also be returned, but s2 has multiple matches because it matches with s5 as well so s2 -- s5 should be returned. I want to return all possible matches for each key-value pair in a list.
The return output should be something like:
[[s1--s3], [s2--s4], [s2--s5], [s4--s5]]
I'm unable to figure out how I can iterate over each value in the map, but here is a snippet of element-wise comparison. I'm wondering if I can set a counter, and check to see if match_cnt > 3 and then return the keys in a list.
a = ["a", "b", "c", "d", "e"]
b = ["a", "c", "b", "d", "e"]
match_cnt = 0
if len(a) == len(b):
for i in range(len(a)):
if a[i] == b[i]:
print(a[i], b[i])
Also, want some knowledge on the runtime of this algorithm.
Complete code solution would be appreciated.
I had been advised to open a new question here
You can make use .items() to iterate over the map, then it's just matching the first 3 list items using a slice:
prefs = {
's1': ["a", "b", "c", "d", "e"],
's2': ["c", "d", "e", "a", "b"],
's3': ["a", "b", "c", "d", "e"],
's4': ["c", "d", "e", "b", "e"],
's5': ["c", "d", "e", "a", "b"]
}
results = []
for ki, vi in prefs.items():
for kj, vj in prefs.items():
if ki == kj: # skip checking same values on same keys !
continue
if vi[:3] == vj[:3]: # slice the lists to test first 3 characters
match = tuple(sorted([ki, kj])) # sort results to eliminate duplicates
results.append(match)
print (set(results)) # print a unique set
Returns:
set([('s1', 's3'), ('s4', 's5'), ('s2', 's5'), ('s2', 's4')])
Edit:
To check all possible combinations, you can use combinations() from itertools. iCombinations/jCombinations are preserving order with a length of 3 list items:
from itertools import combinations
prefs = {
's1': ["a", "b", "c", "d", "e"],
's2': ["c", "d", "e", "a", "b"],
's3': ["a", "b", "c", "d", "e"],
's4': ["c", "d", "e", "b", "e"],
's5': ["c", "d", "e", "a", "b"]
}
results = []
for ki, vi in prefs.items():
for kj, vj in prefs.items():
if ki == kj: # skip checking same values on same keys !
continue
# match pairs from start
iCombinations = [vi[n:n+3] for n in range(len(vi)-2)]
jCombinations = [vj[n:n+3] for n in range(len(vj)-2)]
# match all possible combinations
import itertools
iCombinations = itertools.combinations(vi, 3)
jCombinations = itertools.combinations(vj, 3)
if any([ic in jCombinations for ic in iCombinations]): # checking all combinations
match = tuple(sorted([ki, kj]))
results.append(match)
print (set(results)) # print a unique set
This returns:
set([('s1', 's3'), ('s2', 's5'), ('s3', 's5'), ('s2', 's3'), ('s2', 's4'), ('s1', 's4'), ('s1', 's5'), ('s3', 's4'), ('s4', 's5'), ('s1', 's2')])
I've tried to be as detailed as possible. This should be an example how you can often work your way through such a problem by inserting a lot of print messages to create a log of what's going on.
prefs = {
's1': ["a", "b", "c", "d", "e"],
's2': ["c", "d", "e", "a", "b"],
's3': ["a", "b", "c", "d", "e"],
's4': ["c", "d", "e", "b", "e"],
's5': ["c", "d", "e", "a", "b"]
}
# Get all items of prefs and sort them by key. (Sorting might not be
# necessary, that's something you'll have to decide.)
items_a = sorted(prefs.items(), key=lambda item: item[0])
# Make a copy of the items where we can delete the processed items.
items_b = items_a.copy()
# Set the length for each compared slice.
slice_length = 3
# Calculate how many comparisons will be necessary per item.
max_shift = len(items_a[0][1]) - slice_length
# Create an empty result list for all matches.
matches = []
# Loop all items
print("Comparisons:")
for key_a, value_a in items_a:
# We don't want to check items against themselves, so we have to
# delete the first item of items_b every loop pass (which would be
# the same as key_a, value_a).
del items_b[0]
# Loop remaining other items
for key_b, value_b in items_b:
print("- Compare {} to {}".format(key_a, key_b))
# We have to shift the compared slice
for shift in range(max_shift + 1):
# Start the slice at 0, then shift it
start = 0 + shift
# End the slice at slice_length, then shift it
end = slice_length + shift
# Create the slices
slice_a = value_a[start:end]
slice_b = value_b[start:end]
print(" - Compare {} to {}".format(slice_a, slice_b), end="")
if slice_a == slice_b:
print(" -> Match!", end="")
matches += [(key_a, key_b, shift)]
print("")
print("Matches:")
for key_a, key_b, shift in matches:
print("- At positions {} to {} ({} elements), {} matches with {}".format(
shift + 1, shift + slice_length, slice_length, key_a, key_b))
Which prints:
Comparisons:
- Compare s1 to s2
- Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
- Compare ['b', 'c', 'd'] to ['d', 'e', 'a']
- Compare ['c', 'd', 'e'] to ['e', 'a', 'b']
- Compare s1 to s3
- Compare ['a', 'b', 'c'] to ['a', 'b', 'c'] -> Match!
- Compare ['b', 'c', 'd'] to ['b', 'c', 'd'] -> Match!
- Compare ['c', 'd', 'e'] to ['c', 'd', 'e'] -> Match!
- Compare s1 to s4
- Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
- Compare ['b', 'c', 'd'] to ['d', 'e', 'b']
- Compare ['c', 'd', 'e'] to ['e', 'b', 'e']
- Compare s1 to s5
- Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
- Compare ['b', 'c', 'd'] to ['d', 'e', 'a']
- Compare ['c', 'd', 'e'] to ['e', 'a', 'b']
- Compare s2 to s3
- Compare ['c', 'd', 'e'] to ['a', 'b', 'c']
- Compare ['d', 'e', 'a'] to ['b', 'c', 'd']
- Compare ['e', 'a', 'b'] to ['c', 'd', 'e']
- Compare s2 to s4
- Compare ['c', 'd', 'e'] to ['c', 'd', 'e'] -> Match!
- Compare ['d', 'e', 'a'] to ['d', 'e', 'b']
- Compare ['e', 'a', 'b'] to ['e', 'b', 'e']
- Compare s2 to s5
- Compare ['c', 'd', 'e'] to ['c', 'd', 'e'] -> Match!
- Compare ['d', 'e', 'a'] to ['d', 'e', 'a'] -> Match!
- Compare ['e', 'a', 'b'] to ['e', 'a', 'b'] -> Match!
- Compare s3 to s4
- Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
- Compare ['b', 'c', 'd'] to ['d', 'e', 'b']
- Compare ['c', 'd', 'e'] to ['e', 'b', 'e']
- Compare s3 to s5
- Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
- Compare ['b', 'c', 'd'] to ['d', 'e', 'a']
- Compare ['c', 'd', 'e'] to ['e', 'a', 'b']
- Compare s4 to s5
- Compare ['c', 'd', 'e'] to ['c', 'd', 'e'] -> Match!
- Compare ['d', 'e', 'b'] to ['d', 'e', 'a']
- Compare ['e', 'b', 'e'] to ['e', 'a', 'b']
Matches:
- At positions 1 to 3 (3 elements), s1 matches with s3
- At positions 2 to 4 (3 elements), s1 matches with s3
- At positions 3 to 5 (3 elements), s1 matches with s3
- At positions 1 to 3 (3 elements), s2 matches with s4
- At positions 1 to 3 (3 elements), s2 matches with s5
- At positions 2 to 4 (3 elements), s2 matches with s5
- At positions 3 to 5 (3 elements), s2 matches with s5
- At positions 1 to 3 (3 elements), s4 matches with s5
It's still unclear, what your output really should be. However, I think you'll have no problems in converting the above code to your needs.

how to combine or leave strings in the lists depending on the condition in python?

I have three lists:
li1 = ["a", "a", "a", "a", "b", "b", "a", "a", "b"]
li2 = ["a", "a", "a", "b", "a,", "b", "a", "a"]
li3 = ["b", "b", "a", "a", "b"]
I want to "slice and paste" elements by "b"
The result is supposed to look like this:
li1 = ["aaaa", "b", "b", "aa", "b"]
li2 = ["aaa", "b", "a", "b", "aa"]
li3 = ["b", "b", "aa", "b"]
But I don't know how to approach this... please help me!
Use itertools.groupby.
If you want to join groups not belonging to a certain key
from itertools import groupby
def join_except_key(iterable, key='b'):
groups = groupby(iterable)
for k, group in groups:
if k != key:
yield ''.join(group) # more general: ''.join(map(str, group))
else:
yield from group
Demo:
>>> li1 = ["a", "a", "a", "a", "b", "b", "a", "a", "b", "c", "c", "b", "c", "c"]
>>> list(join_except_key(li1))
['aaaa', 'b', 'b', 'aa', 'b', 'cc', 'b', 'cc']
If you want to join groups belonging to a certain key
from itertools import groupby
def join_by_key(iterable, key='a'):
groups = groupby(iterable)
for k, group in groups:
if k == key:
yield ''.join(group) # more general: ''.join(map(str, group))
else:
yield from group
Demo:
>>> li1 = ["a", "a", "a", "a", "b", "b", "a", "a", "b", "c", "c", "b", "c", "c"]
>>> list(join_by_key(li1))
['aaaa', 'b', 'b', 'aa', 'b', 'c', 'c', 'b', 'c', 'c']
Details on what groupby produces (non generator approach for join_except_key)
>>> li1 = ["a", "a", "a", "a", "b", "b", "a", "a", "b", "c", "c", "b", "c", "c"]
>>> groups = [(k, list(group)) for k, group in groupby(li1)]
>>> groups
[('a', ['a', 'a', 'a', 'a']),
('b', ['b', 'b']),
('a', ['a', 'a']),
('b', ['b']),
('c', ['c', 'c']),
('b', ['b']),
('c', ['c', 'c'])]
>>>
>>> result = []
>>> for k, group in groups:
...: if k != 'b':
...: result.append(''.join(group))
...: else:
...: result.extend(group)
...:
>>> result
['aaaa', 'b', 'b', 'aa', 'b', 'cc', 'b', 'cc']
The list comprehension groups = [... in the second line was only needed for inspecting the elements of the grouping operation, it works fine with just groups = groupby(li1).
You can use itertools.groupby, dividing logic into 3 parts:
Group by equality to your separator string.
Construct an iterable of lists depending on the condition defined in groupby key.
Use itertools.chain.from_iterable to flatten your iterable of lists.
Here's a demonstration.
from itertools import chain, groupby
def sep_by_val(L, k='b'):
grouper = groupby(L, key=lambda x: x==k)
gen_of_lst = ([''.join(j)] if not i else list(j) for i, j in grouper)
return list(chain.from_iterable(gen_of_lst))
sep_by_val(li1) # ['aaaa', 'b', 'b', 'aa', 'b']
sep_by_val(li2) # ['aaa', 'b', 'a,', 'b', 'aa']
sep_by_val(li3) # ['b', 'b', 'aa', 'b']
Itertools and Yield from are great python constructs but challenging to master. Something simpler would go like so involving string shifting and splitting.
result = []
while len(li1) > 0:
split = ''.join(li1).partition('b')
before, part, after = split
if before:
result.extend( before.split() )
if part:
result.append(part)
li1 = after.split()
print(result)
Here is a function I wrote to perform this:
def Conbine(Li):
li=[]
li.append(Li[0])
Prev=Li[0]
for i in Li[1:]:
if not"b"in(i,Prev):li[-1]+=i
else:
Prev=i
li.append(i)
return li
Here is the result:
>>> Conbine(["a", "a", "a", "a", "b", "b", "a", "a", "b"])
['aaaa', 'b', 'b', 'aa', 'b']
>>> Conbine(["a", "a", "a", "b", "a,", "b", "a", "a"])
['aaa', 'b', 'a,', 'b', 'aa']
>>> Conbine(["b", "b", "a", "a", "b"])
['b', 'b', 'aa', 'b']
There are a lot of answers here already, but I hope this helped.
I don't get why all the answers look complicated for this. Did I miss something ?
li1 = ['a', 'a', 'a', 'b', 'b', 'a', 'a', 'b']
result = []
for e in li1:
if result and e != 'b' != result[-1]:
result[-1] += e
else:
result.append(e)
print(result)
Prints
['aaa', 'b', 'b', 'aa', 'b']
Keep it simple and stupid. Readability matters.
I'm late, but this is another option:
def join_in(lst, s):
res, append = [lst[0]], True
for i, e in enumerate(lst[1:]):
if res[-1][0] == s and e == s:
res[-1] += e
append = False
else: append = True
if append: res.append(e)
return res
Calling on the OP lists:
print (join_in(li1, 'a')) #=> ["aaaa", "b", "b", "aa", "b"]
print (join_in(li2, 'a')) #=> ["aaa", "b", "a", "b", "aa"]
print (join_in(li3, 'a')) #=> ["b", "b", "aa", "b"]
It is possible to call it on 'b':
print (join_in(join_in(li3, 'a'), 'b')) #=> ['bb', 'aa', 'b']

Turning a sublist of a list into a dictionary

So I was wondering whether it's possible to get a sublist of a list into a dictionary.
For example, a list contains:
cy = [["a", "b", "a"], ["a", "b", "c", "d", "a"], ["c", "b", "a", "c"], ["d, "b", "a", "d"]]
would be stored into a dictionary according to the first letter of the sublist
{a : ["a", "b", "a"], c : ["c", "b", "a", "c"], d: ["d, "b", "a", "d"] }
Notice that it only stores the first sublist of the key that starts with the "a" and not the next.
My code is as follows:
def syn(graph,start):
empty = []
cy = [["a", "b", "a"], ["a", "b", "c", "d", "a"], ["c", "b", "a", "c"], ["d, "b", "a", "d"]]
lei = dict()
for items in cy:
if items[0] in lei:
lei[items[0]] += items
else:
lei[items[0]] = items
return lei
But all I get is
{a : ["a", "b", "a"]}
Is there any way to fix this?
With simple dict comprehension:
cy = [["a", "b", "a"], ["a", "b", "c", "d", "a"], ["c", "b", "a", "c"], ["d", "b", "a", "d"]]
result = {l[0]: l[:] for l in cy[::-1]}
print(result)
The output:
{'a': ['a', 'b', 'a'], 'c': ['c', 'b', 'a', 'c'], 'd': ['d', 'b', 'a', 'd']}
cy[::-1] - processing the input list in reversed order to prevent overlapping by same next "keys"
While you can very cleverly iterate backwards, as #juanpa.arrivillaga's and #RomanPerekhrest solutions demonstrate, another approach is to simple iterate over the list and user the any function:
cy = [["a", "b", "a"], ["a", "b", "c", "d", "a"], ["c", "b", "a", "c"], ["d", "b", "a", "d"]]
new_list = []
for i in cy:
if not any(i[0] == b[0] for b in new_list):
new_list.append(i)
final_dict = {i[0]:i for i in new_list}
Output:
{'a': ['a', 'b', 'a'], 'c': ['c', 'b', 'a', 'c'], 'd': ['d', 'b', 'a', 'd']}
The most straightforward way is to iterate over your list backwards, adding the list to the dict based on the first element. Since you go backwards, this guarantees that the first sublist will be the list contained in your dict:
>>> result = {}
>>> for sub in reversed(cy):
... result[sub[0]] = sub
...
>>> result
{'d': ['d', 'b', 'a', 'd'], 'a': ['a', 'b', 'a'], 'c': ['c', 'b', 'a', 'c']}
>>>
So, you would keep the last encountered value, but the last value of list in reverse order is the first value in forward order!
Using pandas:
import pandas as pd
cy = [["a", "b", "a"], ["a", "b", "c", "d", "a"], ["c", "b", "a", "c"], ["d", "b", "a", "d"]]
# create a df
df = pd.DataFrame([[i] for i in cy],[i[0] for i in cy])
# export it in reverse order
df.iloc[::-1].T.to_dict(orient='rows')[0]
Returns:
{'a': ['a', 'b', 'a'], 'c': ['c', 'b', 'a', 'c'], 'd': ['d', 'b', 'a', 'd']}

Categories

Resources