Remove arrays from nested arrays based on first element of each array - python

I have two nested arrays say
a=[[1,2,3],[2,4,7],[4,2,8],[3,5,7],[6,1,2]]
b=[[1,6,7],[2,4,9],[4,3,5],[3,10,2],[5,3,2],[7,2,1]]
I want to only keep those arrays in b whose first element is not common to the first elements of the arrays in a, so for these two we should get
c=[[5,3,2],[7,2,1]]
Is there a way to do this in python?

You can do it like this:
>>> a=[[1,2,3],[2,4,7],[4,2,8],[3,5,7],[6,1,2]]
>>> b=[[1,6,7],[2,4,9],[4,3,5],[3,10,2],[5,3,2],[7,2,1]]
>>> [i for i in b if i[0] not in [j[0] for j in a]]
[[5, 3, 2], [7, 2, 1]]
>>>
or
>>> k = [j[0] for j in a]
>>> [i for i in b if i[0] not in k]
[[5, 3, 2], [7, 2, 1]]

To make this faster and more efficient, use a set.
Code:
list1 = [[1,2,3], [2,4,7], [4,2,8], [3,5,7], [6,1,2]]
list2 = [[1,6,7], [2,4,9], [4,3,5], [3,10,2], [5,3,2], [7,2,1]]
check_set=set(val[0] for val in list1 )
print([val for val in list2 if val[0] not in check_set])
Output:
[[5, 3, 2], [7, 2, 1]]
Notes:
First, we create a set that stores the unique first values of list1.
The set removes duplicate values, and a membership check in a set is O(1) on average (e.g. 0 in set([0,1,2])), whereas a membership check in a list is O(n) in the worst case.
Finally, we build a list by iterating over list2 and keeping each sublist whose first element is not present in the set.
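A rough way to observe the difference described in the notes is to time both variants with timeit. This is only a sketch: the larger inputs and the function names filter_with_list and filter_with_set are made up for illustration, and the actual timings will vary by machine.

```python
import timeit

# Hypothetical larger inputs, only to make the difference measurable.
list1 = [[n, 0, 0] for n in range(1000)]
list2 = [[n, 1, 1] for n in range(500, 1500)]

def filter_with_list(a, b):
    firsts = [row[0] for row in a]   # each membership test scans this list
    return [row for row in b if row[0] not in firsts]

def filter_with_set(a, b):
    firsts = {row[0] for row in a}   # each membership test is a hash lookup
    return [row for row in b if row[0] not in firsts]

list_time = timeit.timeit(lambda: filter_with_list(list1, list2), number=5)
set_time = timeit.timeit(lambda: filter_with_set(list1, list2), number=5)
print("list:", list_time, "set:", set_time)
```

Both functions return the same result; only the cost of the membership test differs.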

This sounds like a homework problem, but I'm going to trust that this isn't one.
You can do this easily in two steps:
Store all first elements from a in a set.
Keep only the lists in b whose first elements do not exist in the set.
def remove_common(a, b):
    """remove lists from b whose first element is the first element of a list in a"""
    first_elements = set(sublist[0] for sublist in a)
    return list(filter(lambda sublist: sublist[0] not in first_elements, b))

Using dictionaries gives a richer data structure and guarantees linear complexity, assuming all l[0] values are distinct:
source,target={l[0]:l for l in a},{l[0]:l for l in b}
[target[key] for key in target.keys()-source.keys()]
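One caveat with the dictionary approach: subtracting two keys views produces a set, so the order of the result is not guaranteed to follow the order of b. A minimal sketch that keeps the order of b, under the same assumption that first elements are distinct (the name source_keys is introduced here for illustration):

```python
a = [[1, 2, 3], [2, 4, 7], [4, 2, 8], [3, 5, 7], [6, 1, 2]]
b = [[1, 6, 7], [2, 4, 9], [4, 3, 5], [3, 10, 2], [5, 3, 2], [7, 2, 1]]

source_keys = {l[0] for l in a}   # first elements of a
target = {l[0]: l for l in b}     # dicts preserve insertion order (Python 3.7+)

# Iterate over target in insertion order instead of over a set difference.
c = [lst for key, lst in target.items() if key not in source_keys]
print(c)
```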

You could assign c with a nested list comprehension.
c = [each_b for each_b in b if each_b[0] not in [each_a[0] for each_a in a]]
print(c)
>>> [[5, 3, 2], [7, 2, 1]]

Related

Extract indices of certain values from a list in python

Suppose, I have a list [0.5,1,1.5,2,2.5,3,3.5,4,4.5], now I would like to extract the indices of a list [1.5,2.5,3.5,4.5], which is a subset of that list.
You can use the built-in list.index method to find the index of each element of the second list in the first list.
With that in hand, you can use a list comprehension to get what you want:
>>> list1 = [0.5,1,1.5,2,2.5,3,3.5,4,4.5]
>>> list2 = [1.5,2.5,3.5,4.5]
>>> [list1.index(elem) for elem in list2]
[2, 4, 6, 8]
Another option is to use the enumerate function:
a = [0.5,1,1.5,2,2.5,3,3.5,4,4.5]
b = [1.5,2.5,3.5,4.5]
indexes = []
for id, i in enumerate(a):
    if i in b:
        indexes.append(id)
print(indexes)
The output is going to be [2, 4, 6, 8].
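Both approaches above scan a list for every element (list.index and the in test on a list are linear). For long lists, one possible alternative is to build a value-to-index dictionary once. This is a sketch; the helper name indices_of is hypothetical, and it raises KeyError if a wanted value is missing from the first list.

```python
def indices_of(values, wanted):
    """Return the index in values of each element of wanted."""
    first_index = {}
    for idx, value in enumerate(values):
        # setdefault keeps the first occurrence, matching list.index behaviour
        first_index.setdefault(value, idx)
    return [first_index[w] for w in wanted]

list1 = [0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5]
list2 = [1.5, 2.5, 3.5, 4.5]
print(indices_of(list1, list2))
```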

How to parse these operations through lists?

Program description:
The program accepts a list l containing other lists. It outputs l where lists with length greater than 3 are changed as follows: the third element becomes the sum of the removed elements (from the third to the end).
My solution:
l = [[1,2], [3,4,4,3,1], [4,1,4,5]]
s = 0
for i in range(len(l)-1):
    if len(l[i]) > 3:
        for j in range(3,len(l[i])-1):
            s += l[i][j]
            l[i].remove(l[i][j])
        l[i].insert(len(l[i]),s)
l
Test:
Input: [[1,2], [3,4,4,3,1], [4,1,4,5]]
Expected Output: [[1, 2], [3, 4, 8], [4, 1, 9]]
Program run:
Input: [[1,2], [3,4,4,3,1], [4,1,4,5]]
Output: [[1, 2], [4, 4, 3, 1, 3], [4, 1, 4, 5]]
Question: I don't understand what the source of the problem is here. Why does it add extra numbers to the end instead of the sum? I would appreciate any help.
remove is the wrong function. You should use del instead. Read the documentation to understand why.
And another bug you have is that you do not reset s. It should be set to 0 in the outer for loop.
But you're making it too complicated. I think it's better to show how you can do it much more simply.
for e in l: # No need for range. Just iterate over each element
    if len(e) > 3:
        e[2]=sum(e[2:]) # Sum all the elements
        del(e[3:]) # And remove
Or if you want it as a list comprehension that creates a new list and does not alter the old:
[e[0:2] + [sum(e[2:])] if len(e)>3 else e for e in l]
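For comparison, here is a sketch of how the original loop might look with the two fixes mentioned above (del instead of remove, and s reset for each sublist). It also assumes, based on the expected output, that the ranges should cover every sublist and start summing at index 2; that reading is an inference, not something stated in the question.

```python
l = [[1, 2], [3, 4, 4, 3, 1], [4, 1, 4, 5]]

for i in range(len(l)):              # visit every sublist, including the last
    if len(l[i]) > 3:
        s = 0                        # reset the running total per sublist
        for j in range(2, len(l[i])):
            s += l[i][j]             # sum from the third element to the end
        del l[i][2:]                 # delete by position, not by value
        l[i].append(s)

print(l)
```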
First of all, remove() is the wrong method, as it deletes by value, not index:
Python list method remove() searches for the given element in the list
and removes the first matching element.
You'd want to use del or pop().
Second of all, you're not slicing all of the elements from the end of the list, but only one value.
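A small sketch of the difference between removing by value and removing by index, using the second sublist from the question (the variable names are only for illustration):

```python
row = [3, 4, 4, 3, 1]

by_value = list(row)
by_value.remove(by_value[3])   # by_value[3] is 3, so the FIRST 3 (index 0) goes
print(by_value)                # [4, 4, 3, 1]

by_index = list(row)
del by_index[3]                # removes whatever sits at index 3
print(by_index)                # [3, 4, 4, 1]

by_pop = list(row)
popped = by_pop.pop(3)         # like del, but also returns the removed item
print(popped, by_pop)          # 3 [3, 4, 4, 1]
```

This is why the question's output starts with 4: remove deleted the leading 3 rather than the element at position 3.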
remove is the reason why your code is not working (as mentioned by Mat-KH in the other answer).
You can use a list comprehension and a lambda function to make it a two-liner.
func = lambda x: x if len(x) < 3 else x[:2] + [sum(x[2:])]
l = [func(x) for x in l]

Python how do you get the index of a list in a list, by knowing the first element of the list in lists?

Imagine the list [[0, 1],[2, 3],[4, 5]],
how can I get the index of the list [2,3] (which is supposed to be 1), by knowing the number 2. In code something like:
in a list of lists, find the index of the list where list[0] == 2.
This should return 1.
You can do this using a for loop. So for example:
nums = [[0, 1],[2, 3],[4, 5]]
for index, num_list in enumerate(nums):
    if num_list[0] == 2:
        print(index)
You could use the next function on an enumeration of the list, which returns the index of the first matching item:
aList = [[0, 1],[2, 3],[4, 5]]
index = next(i for i,s in enumerate(aList) if s[0]==2)
print(index) # 1
or, if you're not concerned with performance, using a more compact way by building a list of the first elements of each sublist and using the index() method on that:
index = [*zip(*aList)][0].index(2)
or
index = [i for i,*_ in aList].index(2)
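Note that next raises StopIteration when nothing matches, and .index raises ValueError. If a missing value is possible, next accepts a default as its second argument. A minimal sketch, where the helper name find_index and the choice of None as the default are assumptions made for illustration:

```python
def find_index(lists, first_value, default=None):
    """Index of the first sublist whose first element equals first_value."""
    return next(
        # "s and ..." skips empty sublists instead of raising IndexError
        (i for i, s in enumerate(lists) if s and s[0] == first_value),
        default,
    )

aList = [[0, 1], [2, 3], [4, 5]]
print(find_index(aList, 2))   # 1
print(find_index(aList, 9))   # None
```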
Iterate through the list and check the value of the first element of each sublist, then look up the index of the matching sublist.
for i in lists: # lists is your list of lists; naming it list would shadow the built-in
    if i[0]==2:
        print(lists.index(i))
Check Kaggle python course for such practical exercises
https://www.kaggle.com/learn/python
Since index method only supports exact value comparison, you need to iterate. See this question:
Python: return the index of the first element of a list which makes a passed function true
array = [[0, 1],[2, 3],[4, 5]]
def get_index(array):
    i = 0
    for element in array:
        if(element[0]==2):
            break
        i+=1
    return i
print(str(get_index(array)))
If you have lots of such queries to make, it might be more efficient to build a dict first, with the first value of each sublist as key and the sublist as value:
data = [[0, 1], [2, 3], [4, 5]]
data_by_first_value = {lst[0]: lst for lst in data}
This dict will look like:
print(data_by_first_value)
# {0: [0, 1], 2: [2, 3], 4: [4, 5]}
It is then an O(1) operation to get the sublist you're looking for:
print(data_by_first_value[2])
# [2, 3]
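Since the question asks for the index rather than the sublist, the same idea can map each first value to its position instead. A sketch, assuming the name index_by_first_value and that the first match should win when first values repeat:

```python
data = [[0, 1], [2, 3], [4, 5]]

index_by_first_value = {}
for idx, lst in enumerate(data):
    # setdefault keeps the earliest index if a first value appears twice
    index_by_first_value.setdefault(lst[0], idx)

print(index_by_first_value[2])   # 1
```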

Remove all duplicates in a list of list (not including the original)

I have a list of lists and I want to remove all the duplicates so that the similar lists will not appear at all in the new list.
k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]
output == [[5,6,2], [3]]
So for example, [1,2] have a duplicate so it should not appear in the final list at all. This is different from what others have asked as they wanted to keep one copy of the duplicate in the final list.
You can use count to count the number of occurrences of an element.
Approach 1 Time complexity O(n^2)
final=[]
for i in k:
    if k.count(i)==1:
        final.append(i)
print(final)
[[5, 6, 2], [3]]
A more Pythonic way to write this is:
final=[i for i in k if k.count(i)==1]
Approach 2 Time complexity O(n^2)
Or you can check whether the i-th element is present in the rest of the list. If it is, don't add it to final.
final=[]
for i,lst in enumerate(k):
    if lst not in k[:i]+k[i+1:]:
        final.append(lst)
print(final)
output
[[5, 6, 2], [3]]
A more Pythonic way to write this:
final=[lst for i,lst in enumerate(k) if lst not in k[:i]+k[i+1:]]
Approach 3 Time complexity is O(n)
You can achieve this in O(n) by using dictionaries.
dic={}
k=list(map(tuple,k)) # Dictionary keys must be hashable, so convert each list to a tuple.
for i in k:
    dic[i]=dic.setdefault(i,0)+1
final=[]
for key in dic: # a separate name, so the list k is not overwritten
    if dic[key]==1:
        final.append(list(key))
For the general problem of dropping duplicates, you could use collections.Counter, although it requires hashable values.
import collections
def drop_all_duplicates(values):
    # As of CPython 3.7, Counter remembers insertion order.
    value_counts = collections.Counter(values)
    return (value for value, count in value_counts.items() if count == 1)
>>> list(drop_all_duplicates([1, 1, 2, 3, 3, 4, 5, 5, 6]))
[2, 4, 6]
This could be expanded to cover non-hashable values by accepting a function that converts them to hashable (e.g. a function that converts lists to tuples).
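One way that extension might look is sketched below. The key parameter and its identity default are assumptions made for illustration; this version returns a list of the original values rather than a generator of the converted keys.

```python
import collections

def drop_all_duplicates(values, key=lambda value: value):
    """Keep only the values whose key occurs exactly once."""
    values = list(values)   # allow iterating twice, even over a generator
    counts = collections.Counter(key(value) for value in values)
    return [value for value in values if counts[key(value)] == 1]

k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]
print(drop_all_duplicates(k, key=tuple))   # lists become hashable tuples
```

Because the function iterates over the original values, the sublists come back as lists and in their original order.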

Python applying dynamic list comprehension (list append)

vals= [1]
for j in xrange(i):
    vals.append([k for k in f(vals[j])])
This loop appends values to itself over a loop. If I compress this into a list comprehension, it doesn't work because it doesn't "dynamically" extend vals using itself on each iteration -- it processes vals as it is originally framed.
Is there a way to do a one line list comprehension that dynamically appends like this? Based on my research, it looks like maybe I am looking for a reduce function? (the equivalent of a fold)
You can indeed use reduce for this, using the initial list as the third parameter.
>>> def f(lst):
...     return [x+1 for x in lst] + [len(lst)]
>>> reduce(lambda lst, i: lst + [f(lst[i])], range(5), [[1]])
[[1], [2, 1], [3, 2, 2], [4, 3, 3, 3], [5, 4, 4, 4, 4], [6, 5, 5, 5, 5, 5]]
(Note that the initial list should probably be [[1]], not [1], otherwise you are passing a number to f in the first iteration, but a list in all following iterations.)
Also note that concerning performance your original loop is probably a bit faster, as the reduce basically has to create two new lists in each iteration, while you just have to append to a list. Personally, I would go with a variation of the loop, removing the (probably useless) inner list comprehension and using [-1] to make clear that you are always using the previous result.
vals = [[1]]
for _ in xrange(n):
    vals.append(f(vals[-1]))
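The snippets above are written for Python 2 (xrange, built-in reduce). A sketch of the same two variants for Python 3, where reduce lives in functools and range replaces xrange; f is the example function from this answer, and n = 5 is chosen only to match the output shown earlier:

```python
from functools import reduce

def f(lst):
    return [x + 1 for x in lst] + [len(lst)]

n = 5

# reduce variant: the accumulator is the growing list of results
reduced = reduce(lambda acc, i: acc + [f(acc[i])], range(n), [[1]])

# loop variant: always apply f to the most recent result
vals = [[1]]
for _ in range(n):
    vals.append(f(vals[-1]))

print(reduced == vals)   # True
```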
