Related
Let's say I have a list like that:
[[5, [2, 1], 3], [3, [8, 7], [2, 1, 3], 2], [[3, 2, 1], 3, -1]]
(each list can store unspecified number of integers and lists in any order).
Rules for sorting:
integers (if any) should be before any list
integers are sorted normally (from lower to higher value)
list is "smaller" than another when the first element is lower, if the same we consider the second element, the third, and so on.
If the list has the same values as another but less, it's smaller. [1, 2] < [1, 2, 3]
So the final result should be:
[[-1, 3, [1, 2, 3]], [2, 3, [1, 2, 3], [7, 8]], [3, 5, [1, 2]], ]
How to implement that kind of sorting? I've tried to pass recursive function as sort key in sorted, but mostly ended with TypeError (< not supported for int and list).
Another example:
[1, 4, 3, [[5, 7, 1, 2], 2, 5, 10, 2], 8, [2, 5, [5, 3, 3]], [2, 5, 10, 2, [1, 2, 5]]]
After sorting :
[1, 3, 4, 8, [2, 2, 5, 10, [1, 2, 5]], [2, 2, 5, 10, [1, 2, 5, 7]], [2, 5, [3, 3, 5]]]
One way to compare sibling structures (i.e., deeply nested values inside the same list) correctly, safely and efficiently, is to decorate everything and keep it decorated until all the sorting is done. Then undecorate everything at the end. This doesn't modify the given structure but builds a sorted new one:
def deepsorted(x):
def decosorted(x):
if isinstance(x, int):
return 0, x
return 1, sorted(map(decosorted, x))
def undecorated(x):
islist, x = x
if islist:
return list(map(undecorated, x))
return x
return undecorated(decosorted(x))
Another way is to not rely on natural comparisons but use an old-style cmp function instead. This sorts in-place:
from functools import cmp_to_key
def deepsort(x):
def cmp(a, b):
if isinstance(a, int):
if isinstance(b, int):
return a - b
return -1
if isinstance(b, int):
return 1
diffs = filter(None, map(cmp, a, b))
return next(diffs, len(a) - len(b))
key = cmp_to_key(cmp)
def sort(x):
if isinstance(x, list):
for y in x:
sort(y)
x.sort(key=key)
sort(x)
Testing code (Try it online!):
from copy import deepcopy
tests = [
([[5, [2, 1], 3], [3, [8, 7], [2, 1, 3], 2], [[3, 2, 1], 3, -1]],
[[-1, 3, [1, 2, 3]], [2, 3, [1, 2, 3], [7, 8]], [3, 5, [1, 2]], ]),
([1, 4, 3, [[5, 7, 1, 2], 2, 5, 10, 2], 8, [2, 5, [5, 3, 3]], [2, 5, 10, 2, [1, 2, 5]]],
[1, 3, 4, 8, [2, 2, 5, 10, [1, 2, 5]], [2, 2, 5, 10, [1, 2, 5, 7]], [2, 5, [3, 3, 5]]]),
([[1, 2], [1, [2]]],
[[1, 2], [1, [2]]]),
([[[[[[[[[[[[[[[[[[[[[]]]]]]]]]]]]]]]]]]]]],
[[[[[[[[[[[[[[[[[[[[[]]]]]]]]]]]]]]]]]]]]]),
([[[1, 2]], [[1, [2]]]],
[[[1, 2]], [[1, [2]]]]),
([[[1, [2]]], [[1, 2]]],
[[[1, 2]], [[1, [2]]]]),
([[1, 3], [1, 2]],
[[1, 2], [1, 3]]),
]
for given, wanted in tests:
copy = deepcopy(given)
deepsort(copy)
print(deepsorted(given) == wanted,
copy == wanted)
Quick draft:
def big_brain_sort(li: list) -> list:
list_with_int = []
list_with_list = []
for el in li:
if isinstance(el, int):
list_with_int.append(el)
elif isinstance(el, list):
list_with_list.append(big_brain_sort(el))
sorted_list_int = sorted(list_with_int)
for lista in sorted(list_with_list, key=lambda x: (big_brain_sort(x), len(x))):
sorted_list_int.append(lista)
return sorted_list_int
Suppose I have two NumPy arrays
x = [[1, 2, 8],
[2, 9, 1],
[3, 8, 9],
[4, 3, 5],
[5, 2, 3],
[6, 4, 7],
[7, 2, 3],
[8, 2, 2],
[9, 5, 3],
[10, 2, 3],
[11, 2, 4]]
y = [0, 0, 1, 0, 1, 1, 2, 2, 2, 0, 0]
Note:
(values in x are not sorted in any way. I chose this example to better illustrate the example)
(These are just two examples of x and y. values of x and y can be arbitrarily many different numbers and y can have arbitrarily different numbers, but there are always as many values in x as there are in y)
I want to efficiently split the array x into sub-arrays according to the values in y.
My desired outputs would be
z_0 = [[1, 2, 8],
[2, 9, 1],
[4, 3, 5],
[10, 2, 3],
[11, 2, 4]]
z_1 = [[3, 8, 9],
[5, 2, 3],
[6, 4, 7],]
z_2 = [[7, 2, 3],
[8, 2, 2],
[9, 5, 3]]
Assuming that y starts with zero and is not sorted but grouped, what is the most efficient way to do this?
Note: This question is the unsorted version of this question:
Split a NumPy array into subarrays according to the values (sorted in ascending order) of another array
One way to solve this is to build up a list of filter indexes for each y value and then simply select those elements of x. For example:
z_0 = x[[i for i, v in enumerate(y) if v == 0]]
z_1 = x[[i for i, v in enumerate(y) if v == 1]]
z_2 = x[[i for i, v in enumerate(y) if v == 2]]
Output
array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]])
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]])
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])
If you want to be more generic and support different sets of numbers in y, you could use a comprehension to produce a list of arrays e.g.
z = [x[[i for i, v in enumerate(y) if v == m]] for m in set(y)]
Output:
[array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]]),
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]),
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])]
If y is also an np.array and the same length as x you can simplify this to use boolean indexing:
z = [x[y==m] for m in set(y)]
Output is the same as above.
Just use list comprehension and boolean indexing
x = np.array(x)
y = np.array(y)
z = [x[y == i] for i in range(y.max() + 1)]
z
Out[]:
[array([[ 1, 2, 8],
[ 2, 9, 1],
[ 4, 3, 5],
[10, 2, 3],
[11, 2, 4]]),
array([[3, 8, 9],
[5, 2, 3],
[6, 4, 7]]),
array([[7, 2, 3],
[8, 2, 2],
[9, 5, 3]])]
Slight variation.
from operator import itemgetter
label = itemgetter(1)
Associate the implied information with the label ... (index,label)
y1 = [thing for thing in enumerate(y)]
Sort on the label
y1.sort(key=label)
Group by label and construct the results
import itertools
d = {}
for key,group in itertools.groupby(y1,label):
d[f'z{key}'] = [x[i] for i,k in group]
Pandas solution:
>>> import pandas as pd
>>> >>> df = pd.DataFrame({'points':[thing for thing in x],'cat':y})
>>> z = df.groupby('cat').agg(list)
>>> z
points
cat
0 [[1, 2, 8], [2, 9, 1], [4, 3, 5], [10, 2, 3], ...
1 [[3, 8, 9], [5, 2, 3], [6, 4, 7]]
2 [[7, 2, 3], [8, 2, 2], [9, 5, 3]]
For example:
t=[[1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [3, 5], [4, 5, 6],[4,5,6],[6,7], [6], [1]]
I want to delete the short lists if the items are included in a long one, even the items are not continuous. So, I expect the result to be:
[[1, 2, 3, 4, 5, 6],[6,7]]
I might figure out this by myself, but my way is not smart enough. Could anyone help me here?
Since all the elements in a list is unique, AND I like using sets
here's my code. Haven't checked it's efficiency but it looks cleaner :D
t = [[1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [3, 5], [4, 5, 6],[4,5,6],[6,7], [6], [1]]
t = [set(l) for l in t]
t = [list(x) for x in t if not any([x.issubset(y) for y in t if x != y])]
Sort from small to large, make them sets then pop them off the list to reduce the list size for every computation.
t=[[1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [3, 5], [4, 5, 6],[4,5,6],[6,7], [6], [1]]
t = sorted(t, key=lambda x: len(x))
t = [set(x) for x in t]
for i in range(len(t)):
a = t.pop(0)
if not any([a.issubset(x) for x in t]):
print(a)
My approach is very simple
I check the last element is already present in our longer list. If we present then we don't need to add to the longer list if it is not the case then we will add to the longerlists
sorted_lists=[[1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [3, 5], [4, 5, 6],[4,5,6],[6,7], [6], [1]]
sorted_big_lists =[]
for sorted_list in sorted_lists:
for test_list in sorted_big_lists:
if sorted_list[-1] in test_list:
break
else:
sorted_big_lists.append(sorted_list)
print(sorted_big_lists)
I have two lists a and b. Then I create all combinations taking all two-length combinations of a plus one each of b:
import itertools as it
a = [1,2,3,4]
b = [5,6]
for i in it.product(it.combinations(a, 2), b):
print (i)
# output:
((1, 2), 5)
((1, 2), 6)
((1, 3), 5)
...
# expected output:
[1, 2, 5]
[1, 2, 6]
[1, 3, 5]
...
How can the tuples be transformed at the stage of the loop operation into lists?
The following comprehensions will work:
>>> [[*x, y] for x, y in it.product(it.combinations(a, 2), b)] # Py3
>>> [list(x) + [y] for x, y in it.product(it.combinations(a, 2), b)] # all Py versions
[[1, 2, 5],
[1, 2, 6],
[1, 3, 5],
[1, 3, 6],
[1, 4, 5],
[1, 4, 6],
[2, 3, 5],
[2, 3, 6],
[2, 4, 5],
[2, 4, 6],
[3, 4, 5],
[3, 4, 6]]
Simplified approach:
a = [1,2,3,4]
b = [5,6]
l = len(a)
print(sorted([a[i], a[i_n], j] for i in range(l) for j in b
for i_n in range(i+1, l) if i < l-1))
The output:
[[1, 2, 5], [1, 2, 6], [1, 3, 5], [1, 3, 6], [1, 4, 5], [1, 4, 6], [2, 3, 5], [2, 3, 6], [2, 4, 5], [2, 4, 6], [3, 4, 5], [3, 4, 6]]
I have a list of lists like this:
i = [[1, 2, 3], [2, 4, 5], [1, 2, 3], [2, 4, 5]]
I would like to get a list containing "unique" lists (based on their elements) like:
o = [[1, 2, 3], [2, 4, 5]]
I cannot use set() as there are non-hashable elements in the list. Instead, I am doing this:
o = []
for e in i:
if e not in o:
o.append(e)
Is there an easier way to do this?
You can create a set of tuples, a set of lists will not be possible because of non hashable elements as you mentioned.
>>> l = [[1, 2, 3], [2, 4, 5], [1, 2, 3], [2, 4, 5]]
>>> set(tuple(i) for i in l)
{(1, 2, 3), (2, 4, 5)}
i = [[1, 2, 3], [2, 4, 5], [1, 2, 3], [2, 4, 5]]
print([ele for ind, ele in enumerate(i) if ele not in i[:ind]])
[[1, 2, 3], [2, 4, 5]]
If you consider [2, 4, 5] to be equal to [2, 5, 4] then you will need to do further checks
You can convert each element to a tuple and then insert it in a set.
Here's some code with your example:
tmp = set()
a = [[1, 2, 3], [2, 4, 5], [1, 2, 3], [2, 4, 5]]
for i in a:
tmp.add(tuple(i))
tmp will be like this:
{(1, 2, 3), (2, 4, 5)}
Here's another way to do it:
I = [[1, 2, 3], [2, 4, 5], [1, 2, 3], [2, 4, 5]]
mySet = set()
for j in range(len(I)):
mySet = mySet | set([tuple(I[j])])
print(mySet)