I have a nested list
elements = [['A'],['B','C'],['D','E','F']]
and I have a list of indexes
index = [1,2,3,4,5,6]
I want to group the indexes so that they follow the shape of the nested list, meaning the output would look like this:
out = [[1],[2,3],[4,5,6]]
You can slice (with itertools.islice) relying on the length of each sublist:
from itertools import islice
vals = [['A'], ['B','C'], ['D','E','F']]
indices = iter([1,2,3,4,5,6])
slices = [list(islice(indices, len(sub_l))) for sub_l in vals]
print(slices)
[[1], [2, 3], [4, 5, 6]]
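The same idea can be wrapped in a small helper. The following is a minimal sketch; the name `group_like` and the length check are assumptions added for illustration, not part of the answer above.

```python
from itertools import islice

def group_like(shape, flat):
    """Split `flat` into sublists whose lengths match the sublists of `shape`."""
    it = iter(flat)
    groups = [list(islice(it, len(sub))) for sub in shape]
    # islice silently returns short slices when `flat` runs out,
    # so verify that every group got as many items as its template.
    if any(len(g) != len(sub) for g, sub in zip(groups, shape)):
        raise ValueError("flat has fewer items than shape requires")
    return groups

print(group_like([['A'], ['B', 'C'], ['D', 'E', 'F']], [1, 2, 3, 4, 5, 6]))
# [[1], [2, 3], [4, 5, 6]]
```

Raising on a short input is a design choice; dropping the check gives back the plain `islice` behaviour.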
A simple approach would be to use iter to turn index into an iterator, then loop through the elements list and take the next value from the iterator for each item.
iter_index = iter(index)
result = [[next(iter_index) for _ in group] for group in elements]
[[1], [2,3], [4,5,6]]
You could do it in a comprehension using an internal iterator:
result = [ [next(i) for _ in e] for i in [iter(index)] for e in elements ]
print(result)
[[1], [2, 3], [4, 5, 6]]
Alternatively, you could compute the starting position of each sub-list in the index list and use that in a comprehension to get the corresponding subset of indexes:
startPos = [0]
startPos.extend(len(e)+startPos[-1] for e in elements)
result = [ index[p:p+len(e)] for p,e in zip(startPos,elements) ]
print(result)
# [[1], [2, 3], [4, 5, 6]]
Using the same logic but based on the ending positions, you could use accumulate from itertools to compute the positions:
from itertools import accumulate
endPos = accumulate(map(len,elements))
result = [index[p-len(e):p] for p,e in zip(endPos,elements)]
print(result)
# [[1], [2, 3], [4, 5, 6]]
If it's ok to "destroy" the content of the index list in the process, you could write it like this:
result = [[index.pop(0) for _ in e] for e in elements]
print(result)
# [[1], [2, 3], [4, 5, 6]]
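Each `index.pop(0)` shifts the remaining items of the list, so the cost adds up for long inputs. Here is a minimal sketch of the same consuming approach with `collections.deque`, whose `popleft()` runs in constant time. The variable name `queue` is an assumption for illustration, and because the deque is a copy, the original `index` list is left intact here.

```python
from collections import deque

elements = [['A'], ['B', 'C'], ['D', 'E', 'F']]
index = [1, 2, 3, 4, 5, 6]

queue = deque(index)  # copy of index; popleft() is O(1), list.pop(0) is O(n)
result = [[queue.popleft() for _ in e] for e in elements]
print(result)
# [[1], [2, 3], [4, 5, 6]]
```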
I have a nested list in the form of [[1,2,3], [3,4,5], [8,6,2,5,6], [7,2,9]]
I would like to extract every first item into a new list, every second item into a new list and the rest into a new nested list:
a = [1,3,8,7], b = [2,4,6,2], c = [[3], [5], [2,5,6], [9]]
Is it possible to avoid a for loop, since the real nested list is quite large? Any help would be appreciated.
Ultimately, whatever solution you choose, there is going to be a for loop somewhere in your code, so my advice would be to make it as clean and as readable as possible.
That being said, here's what I would propose:
arr = [[1,2,3], [3,4,5], [8,6,2,5,6], [7,2,9]]
first_arr, second_arr, third_arr = [], [], []
for nested in arr:
    first_arr.append(nested[0])
    second_arr.append(nested[1])
    third_arr.append(nested[2:])
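The loop body can also be written with extended unpacking, which names the three parts directly. This is a sketch of the same single-loop idea, assuming every sublist has at least two items (a shorter one raises `ValueError`).

```python
arr = [[1, 2, 3], [3, 4, 5], [8, 6, 2, 5, 6], [7, 2, 9]]
first_arr, second_arr, third_arr = [], [], []

# `*rest` collects everything after the first two items into a new list
for first, second, *rest in arr:
    first_arr.append(first)
    second_arr.append(second)
    third_arr.append(rest)

print(first_arr, second_arr, third_arr)
# [1, 3, 8, 7] [2, 4, 6, 2] [[3], [5], [2, 5, 6], [9]]
```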
This is a naive, simple looped solution using list comprehensions, but see if it is fast enough for you.
l = [[1,2,3], [3,4,5], [8,6,2,5,6], [7,2,9]]
a = [i[0] for i in l]
b = [i[1] for i in l]
c = [i[2:] for i in l]
which returns:
>>> a
[1, 3, 8, 7]
>>> b
[2, 4, 6, 2]
>>> c
[[3], [5], [2, 5, 6], [9]]
At the moment I cannot think of a solution without for loops; I hope to be able to update my answer later.
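One way to hide the explicit loop is `map` with `operator.itemgetter`. The iteration still happens internally, so this is a sketch of an alternative spelling rather than a guaranteed speed-up.

```python
from operator import itemgetter

l = [[1, 2, 3], [3, 4, 5], [8, 6, 2, 5, 6], [7, 2, 9]]

a = list(map(itemgetter(0), l))               # first item of each sublist
b = list(map(itemgetter(1), l))               # second item of each sublist
c = list(map(itemgetter(slice(2, None)), l))  # equivalent to sub[2:] for each sublist

print(a)
# [1, 3, 8, 7]
print(b)
# [2, 4, 6, 2]
print(c)
# [[3], [5], [2, 5, 6], [9]]
```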
Here's a solution using for loops:
data = [[1,2,3], [3,4,5], [8,6,2,5,6], [7,2,9]]
list1 = []
list2 = []
list3 = []
for item in data:
    else_list = []
    for index, value in enumerate(item):
        if index == 0:
            list1.append(value)
        elif index == 1:
            list2.append(value)
        else:
            else_list.append(value)
    list3.append(else_list)
print(list1)
print(list2)
print(list3)
Output
[1, 3, 8, 7]
[2, 4, 6, 2]
[[3], [5], [2, 5, 6], [9]]
Just for fun, here is a performance comparison as well. Great job using just one for loop, Meysam!
import timeit
# a = [1,3,8,7] b = [2,4,6,2], c = [[3], [5], [2,5,6],[9]]
def solution_1():
    data = [[1, 2, 3], [3, 4, 5], [8, 6, 2, 5, 6], [7, 2, 9]]
    list1 = []
    list2 = []
    list3 = []
    for item in data:
        else_list = []
        for index, value in enumerate(item):
            if index == 0:
                list1.append(value)
            elif index == 1:
                list2.append(value)
            else:
                else_list.append(value)
        list3.append(else_list)
def solution_2():
    arr = [[1, 2, 3], [3, 4, 5], [8, 6, 2, 5, 6], [7, 2, 9]]
    first_arr, second_arr, third_arr = [], [], []
    for nested in arr:
        first_arr.append(nested[0])
        second_arr.append(nested[1])
        third_arr.append(nested[2:])
def solution_3():
    l = [[1, 2, 3], [3, 4, 5], [8, 6, 2, 5, 6], [7, 2, 9]]
    a = [i[0] for i in l]
    b = [i[1] for i in l]
    c = [i[2:] for i in l]
if __name__ == "__main__":
    print("solution_1 performance:")
    print(timeit.timeit("solution_1()", "from __main__ import solution_1", number=10))
    print("solution_2 performance:")
    print(timeit.timeit("solution_2()", "from __main__ import solution_2", number=10))
    print("solution_3 performance:")
    print(timeit.timeit("solution_3()", "from __main__ import solution_3", number=10))
Output
solution_1 performance:
9.580000000000005e-05
solution_2 performance:
1.7200000000001936e-05
solution_3 performance:
1.7499999999996685e-05
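With only 10 runs over a four-item list, figures this small are likely dominated by call overhead and noise. Below is a sketch of a comparison on a larger input. The synthetic data and the helper names `one_loop` and `three_comprehensions` are assumptions for illustration, and no timings are quoted because they depend on the machine.

```python
import timeit

# Synthetic input (an assumption for illustration): 10,000 sublists of length 3 to 5.
data = [list(range(i, i + 3 + i % 3)) for i in range(10_000)]

def one_loop(rows):
    """Single pass that fills all three result lists."""
    a, b, c = [], [], []
    for row in rows:
        a.append(row[0])
        b.append(row[1])
        c.append(row[2:])
    return a, b, c

def three_comprehensions(rows):
    """Three separate passes, one list comprehension each."""
    return [r[0] for r in rows], [r[1] for r in rows], [r[2:] for r in rows]

if __name__ == "__main__":
    for func in (one_loop, three_comprehensions):
        # repeat() returns one total per repetition; the minimum is usually the least noisy.
        best = min(timeit.repeat(lambda: func(data), number=20, repeat=5))
        print(f"{func.__name__}: {best:.4f} s for 20 calls")
```

Passing the data in as an argument keeps list construction out of the timed region, which the original functions include in every call.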
Suppose the nested list has unknown depth; then we'd have to use recursion:
def get_elements(l):
    ret = []
    for elem in l:
        # isinstance also accepts subclasses of list
        if isinstance(elem, list):
            ret.extend(get_elements(elem))
        else:
            ret.append(elem)
    return ret
l = [1,2,[3,4],[[5],[6]]]
print(get_elements(l))
# Output: [1, 2, 3, 4, 5, 6]
Though it is not quite recommended to use unknown-depth nested lists in the first place.
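Very deep nesting can hit Python's recursion limit. A minimal sketch of an iterative variant that keeps an explicit stack of iterators instead; the name `flatten_iterative` is an assumption for illustration.

```python
def flatten_iterative(nested):
    """Flatten arbitrarily nested lists without recursion."""
    flat = []
    stack = [iter(nested)]  # iterators that are still being consumed
    while stack:
        for elem in stack[-1]:
            if isinstance(elem, list):
                stack.append(iter(elem))  # descend into the sublist
                break
            flat.append(elem)
        else:
            stack.pop()  # the current iterator is exhausted

    return flat

print(flatten_iterative([1, 2, [3, 4], [[5], [6]]]))
# [1, 2, 3, 4, 5, 6]
```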
I'd like to remove the duplicate lists from a list, so that I end up with len(a) = 5 and a = [[1,2,3],[2,3,4], [0,1,2],[2,4,6],[3,6,9]].
How can I get that?
a1 = [[1,2,3],[2,3,4]]
a2 = [[0,1,2],[2,4,6]]
a3=[[1,2,3],[0,1,2],[3,6,9]]
a = a1+a2+a3
a = [tuple(l) for l in a]
print(set(a))
print(len(a))
a=[list(ele) for ele in a]
print(a)
print(len(a))
You cannot make a set of lists because lists are not hashable. You could first convert them to tuples and then create a set:
a1 = [[1,2,3],[2,3,4]]
a2 = [[0,1,2],[2,4,6]]
a3=[[1,2,3],[0,1,2],[3,6,9]]
a = a1+a2+a3
a = [list(x) for x in set([tuple(L) for L in a])]
output (a set is unordered, so the order may differ):
[[0, 1, 2], [2, 4, 6], [1, 2, 3], [2, 3, 4], [3, 6, 9]]
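If the first-seen order should be kept, as in the order asked for in the question, a sketch using `dict.fromkeys` is one option. It relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on; the name `unique` is an assumption for illustration.

```python
a1 = [[1, 2, 3], [2, 3, 4]]
a2 = [[0, 1, 2], [2, 4, 6]]
a3 = [[1, 2, 3], [0, 1, 2], [3, 6, 9]]
a = a1 + a2 + a3

# Tuples are hashable, so they can serve as dict keys; duplicates collapse onto one key.
unique = [list(t) for t in dict.fromkeys(map(tuple, a))]
print(unique)
# [[1, 2, 3], [2, 3, 4], [0, 1, 2], [2, 4, 6], [3, 6, 9]]
print(len(unique))
# 5
```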
I have this nested list:
list_1 = [[1,2,3], [1,2,3,4,5,6], [1,2,3,4,5,6,7,8,9]]
The number of elements in each sublist is always a multiple of 3. I want to have 3 elements in each sublist. Desired output:
list_1 = [[1,2,3], [1,2,3], [4,5,6],[1,2,3], [4,5,6], [7,8,9]]
I can achieve this, but first I have to flatten the list and then create the nested list. My code:
list_1 = [values for sub_list in list_1 for values in sub_list] # flatten it first
list_1 = [list_1[i:i+3] for i in range(0, len(list_1), 3)]
Is there a way to skip the flatten step and get the desired result?
You can use a nested list comprehension:
list_1 = [[1,2,3], [1,2,3,4,5,6], [1,2,3,4,5,6,7,8,9]]
result = [i[j:j+3] for i in list_1 for j in range(0, len(i), 3)]
Output:
[[1, 2, 3], [1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6], [7, 8, 9]]
Here is how you can use nested list comprehensions:
list_1 = [[1,2,3],[1,2,3,4,5,6],[1,2,3,4,5,6,7,8,9]]
list_1 = [a for b in list_1 for a in b]
list_1 = [list_1[i:i+3] for i in range(0,len(list_1),3)]
print(list_1)
Output:
[[1, 2, 3], [1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6], [7, 8, 9]]
To put my two cents in, you could use two generator functions: one that flattens the list (even an arbitrarily nested one) and one that yields chunks of n values:
def recursive_yield(my_list):
    for item in my_list:
        if isinstance(item, list):
            yield from recursive_yield(item)
        else:
            yield item
def taken(gen, number = 3):
    buffer = []
    for item in gen:
        if len(buffer) < number:
            buffer.append(item)
        else:
            yield buffer
            buffer = []
            buffer.append(item)
    if buffer:
        yield buffer
result = [x for x in taken(recursive_yield(list_1))]
Here are some examples of the in- / outputs:
list_1 = [[1,2,3], [1,2,3,4,5,6], [1,2,3,4,5,6,7,8,9]]
# -> [[1, 2, 3], [1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6], [7, 8, 9]]
list_1 = [1,2,3,4,5,6]
# -> [[1, 2, 3], [4, 5, 6]]
list_1 = [1,2,[[1,2,4,5], [[[[1,10,9]]]]]]
# -> number = 5
# -> [[1, 2, 1, 2, 4], [5, 1, 10, 9]]
Thus, the solution is much more flexible than slicing alone.
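The buffering step can also be expressed with `itertools.islice`. This is a sketch, not the answer's own code: `chunked` is a hypothetical name, and `flatten` repeats the recursive generator idea under a shorter name so the example is self-contained.

```python
from itertools import islice

def flatten(nested):
    """Yield the leaf values of an arbitrarily nested list."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

def chunked(iterable, size=3):
    """Yield lists of up to `size` items; the last one may be shorter."""
    it = iter(iterable)
    # iter(callable, sentinel) keeps calling until an empty list comes back
    return iter(lambda: list(islice(it, size)), [])

list_1 = [[1, 2, 3], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6, 7, 8, 9]]
print(list(chunked(flatten(list_1))))
# [[1, 2, 3], [1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(list(chunked(flatten([1, 2, [[1, 2, 4, 5], [[[[1, 10, 9]]]]]]), size=5)))
# [[1, 2, 1, 2, 4], [5, 1, 10, 9]]
```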
What is an efficient way to remove sublists that are subsets of other sublists? I only want to keep the largest sets in the list. For example:
b = [[1,2,3], [1,2], [3,5], [2,3,4], [2,3,4], [3,4,5], [1,2,4,6,7]]
and I want the following output:
result = [[1,2,3], [2,3,4], [3,4,5], [1,2,4,6,7]]
[1,2] is a subset of [1,2,3] and of [1,2,4,6,7], [3,5] is a subset of [3,4,5], and [2,3,4] appears twice but should be counted only once in the final result. I want to filter the data based on this subset logic.
I could only come up with a two-loop solution, and I wonder whether there is a more efficient way.
This is what I tried (after some optimisation: I added a break, and entries already marked as subsets are not checked against again):
b = [[1,2,3], [1,2], [3,5], [2,3,4], [2,3,4], [3,4,5], [1,2,4,6,7]]
i = 0
record = []
subset_status = False
for index, re in enumerate(b):
    while i <= (len(b)-1):
        if i != index:
            if i not in record:
                if set(re).issubset(b[i]):
                    subset_status = True
                    break
        i += 1
    i = 0
    if subset_status:
        record.append(index)
        subset_status = False
print(record)
>>[1, 2, 3]
So the entries at indexes [1, 2, 3] are the dirty data.
Thanks.
Filter your list on a condition:
b = [[1,2,3], [1,2], [3,5], [2,3,4],[3,4,5]]
print(list(filter(lambda x: len(x) == 3, b)))
# [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
A conditional list comprehension is a pythonic, flexible and performant approach. It is usually faster and less error prone to assemble the clean list from scratch than to repeatedly remove elements:
b = [[1, 2, 3], [1, 2], [3, 5], [2, 3, 4],[3, 4, 5]]
cleaned = [x for x in b if clean(x)] # where clean is your condition
# e.g.
cleaned = [x for x in b if len(x) == 3]
# [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
If you need to mutate the original list object, use slice assignment:
b[:] = [x for x in b if clean(x)]
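A short sketch of the difference between slice assignment and plain rebinding, assuming a second name `alias` (hypothetical) that refers to the same list object:

```python
b = [[1, 2, 3], [1, 2], [3, 5], [2, 3, 4], [3, 4, 5]]
alias = b  # second reference to the same list object

# Slice assignment replaces the contents of the existing list in place,
# so every reference to it sees the change.
b[:] = [x for x in b if len(x) == 3]
print(alias)
# [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(alias is b)
# True

# Plain assignment rebinds the name b to a new list; alias keeps the old one.
b = [x for x in b if x[0] > 1]
print(b)
# [[2, 3, 4], [3, 4, 5]]
print(alias)
# [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```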
One way to do this is to process the lists in b in order of length, from longest to shortest.
b = [[1,2,3], [1,2], [3,5], [2,3,4], [2,3,4], [3,4,5], [1,2,4,6,7]]
result = []
for u in sorted(map(set, b), key=len, reverse=True):
    if not any(u <= v for v in result):
        result.append(u)
print(result)
output
[{1, 2, 4, 6, 7}, {1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
If you need to keep the inner lists as actual lists, and you also need to preserve the order, then we can do that with an additional pass over the data. But instead of using a list for result I'll use a set to make the tests more efficient. And that means turning the sublists into frozensets: plain sets won't work because only hashable objects can be put into a set.
b = [[1,2,3], [1,2], [3,5], [2,3,4], [2,3,4], [3,4,5], [1,2,4,6,7]]
temp = set()
for u in sorted(map(frozenset, b), key=len, reverse=True):
    if not any(u <= v for v in temp):
        temp.add(u)
newb = []
for u in b:
    if set(u) in temp and u not in newb:
        newb.append(u)
print(newb)
output
[[1, 2, 3], [2, 3, 4], [3, 4, 5], [1, 2, 4, 6, 7]]
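This is not very elegant, but it works (b is the list from the question):
result = []
# First pass: keep a sublist only if no already-kept sublist contains all of its items.
for i in b:
    for j in result:
        if all(c in j for c in i):
            break
    else:
        result.append(i)
# Second pass: drop kept sublists that are contained in a sublist that was kept later.
# Iterate over a copy so that removing items does not skip elements.
for i in result[:]:
    for j in result:
        if j is not i and all(c in j for c in i):
            result.remove(i)
            break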
You can use tuples and product to detect whether every element of one item occurs in another item, then construct a new list excluding those sublists.
List comprehension:
from itertools import product
b = [[1,2,3], [1,2], [3,5], [2,3,4], [3,4,5], [1,2,4,6,7]]
dirty = [i for i in b for j in b if i != j if tuple(i) in product(j, repeat = len(i))]
clean = [i for i in b if i not in dirty]
Expanded explanation:
dirty = []
for i in b:
    for j in b:
        if i != j:
            if tuple(i) in product(j, repeat = len(i)):
                dirty.append(i)
clean = [i for i in b if i not in dirty]
[[1, 2, 3], [2, 3, 4], [3, 4, 5], [1, 2, 4, 6, 7]]
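The `product` membership test may enumerate up to `len(j) ** len(i)` tuples per pair, which grows quickly with the sublist length. Below is a sketch of the same dirty/clean split using set comparison instead. It is an alternative technique rather than the answer's method, and the name `sets` is an assumption for illustration.

```python
b = [[1, 2, 3], [1, 2], [3, 5], [2, 3, 4], [3, 4, 5], [1, 2, 4, 6, 7]]

# Convert each sublist to a set once so the subset test (<=) is cheap.
sets = [set(x) for x in b]

# A sublist is dirty if all of its elements occur in some other, different sublist.
dirty = [
    i for i, si in zip(b, sets)
    if any(i != j and si <= sj for j, sj in zip(b, sets))
]
clean = [i for i in b if i not in dirty]
print(clean)
# [[1, 2, 3], [2, 3, 4], [3, 4, 5], [1, 2, 4, 6, 7]]
```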