Python: Delete all list indices meeting a certain condition

To get right down to it, I'm trying to iterate through a list of coordinate pairs in Python and delete every pair in which one of the coordinates is negative. For example, in the array:
map = [[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
I want to remove all the pairs in which either coordinate is < 0, leaving:
map = [[2, 3], [7, 1]]
My problem is that python lists cannot have any gaps, so if I loop like this:
i = 0
for pair in map:
    for coord in pair:
        if coord < 0:
            del map[i]
    i += 1
All the indices shift when the element is deleted, messing up the iteration and causing all sorts of problems. I've tried storing the indices of the bad elements in another list and then looping through and deleting those elements, but I have the same problem: once one is gone, the whole list shifts and indices are no longer accurate.
Is there something I'm missing?
Thanks.

If the list is not large, then the easiest way is to create a new list:
In [7]: old_map = [[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
In [8]: new_map = [[x, y] for x, y in old_map if not (x < 0 or y < 0)]
In [9]: new_map
Out[9]: [[2, 3], [7, 1]]
You can follow this up with old_map = new_map if you want to discard the other pairs.
If the list is so large creating a new list of comparable size is a problem, then you can delete elements from a list in-place -- the trick is to delete them from the tail-end first:
the_map = [[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
for i in range(len(the_map)-1, -1, -1):
    pair = the_map[i]
    for coord in pair:
        if coord < 0:
            del the_map[i]
            break  # delete once per pair, even if both coordinates are negative
print(the_map)
yields
[[2, 3], [7, 1]]
PS. map is a very useful built-in Python function. It is best not to name a variable map, since doing so shadows the built-in.
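To see why the direction of the loop matters, here is a small sketch that runs both approaches on the question's data. The function names delete_forward and delete_backward are made up for this illustration, and both work on a copy so the input is left alone.

```python
def delete_forward(pairs):
    """Buggy on purpose: deleting while iterating forward skips elements."""
    pairs = list(pairs)  # work on a copy so the caller's list is untouched
    for i, pair in enumerate(pairs):
        if pair[0] < 0 or pair[1] < 0:
            del pairs[i]  # everything after position i shifts left by one
    return pairs


def delete_backward(pairs):
    """Deleting from the tail only shifts elements that were already checked."""
    pairs = list(pairs)
    for i in range(len(pairs) - 1, -1, -1):
        if pairs[i][0] < 0 or pairs[i][1] < 0:
            del pairs[i]
    return pairs


data = [[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
print(delete_forward(data))   # [5, -3] survives because it was skipped
print(delete_backward(data))
```

In the forward version, [5, -3] slides into index 0 right after the first deletion, and the loop has already moved on to index 1, so it is never tested.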

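You can use a list comprehension for this (using >= so that pairs containing 0 are kept, since only negative coordinates should be removed):
>>> mymap = [[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
>>> mymap = [m for m in mymap if m[0] >= 0 and m[1] >= 0]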
>>> mymap
[[2, 3], [7, 1]]

If you do not have any other references to the map list, a list comprehension works best:
map = [[a,b] for (a,b) in map if a >= 0 and b >= 0]
If you do have other references and need to actually remove elements from the list referenced by map, you have to iterate over a copy of map:
for coord in map[:]:
    if coord[0] < 0 or coord[1] < 0:
        map.remove(coord)
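Another option when other references exist is slice assignment, which swaps the contents of the existing list object instead of rebinding the name. A minimal sketch follows; the names coords and alias are only for illustration, and note that it still builds a temporary filtered list before copying it in.

```python
coords = [[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
alias = coords  # a second reference to the same list object

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the filtered result.
coords[:] = [pair for pair in coords if pair[0] >= 0 and pair[1] >= 0]

print(alias)
print(alias is coords)
```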

Personally, I prefer in-place modification:
li = [[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
print(li)
N = len(li)
for i, (a, b) in enumerate(li[::-1], start=1):
    if a < 0 or b < 0:
        del li[N-i]
print(li)
->
[[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
[[2, 3], [7, 1]]

If you wish to do this in place, without creating a new list, simply use a for loop with the index running from len(map)-1 down to 0 (hasNegativeCoord here stands for your own test function):
for index in range(len(map)-1, -1, -1):
    if hasNegativeCoord(map[index]):
        del map[index]
Not very Pythonic, I admit.

If the list is small enough, it's more efficient to make a copy containing just the elements you need, as detailed in the other answers.
However, if the list is too large, or for some other reason you need to remove the elements from the list object in place, I've found the following little helper function quite useful:
def filter_in_place(func, target, invert=False):
    "remove all elements of target where func(elem) is false"
    pos = len(target)-1
    while pos >= 0:
        if (not func(target[pos])) ^ invert:
            del target[pos]
        pos -= 1
In your example, this could be applied as follows:
>>> data = [[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
>>> def is_good(elem):
...     return elem[0] >= 0 and elem[1] >= 0
...
>>> filter_in_place(is_good, data)
>>> data
[[2, 3], [7, 1]]
(This is just a list-oriented version of filter_in_place; one that supports all base Python datatypes is a bit more complex.)
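Each del in the middle of a list has to shift the tail, so on a long list with many removals the repeated deletions can add up. A possible variant, sketched here under the hypothetical name compact_in_place, copies the kept elements forward and truncates once at the end:

```python
def compact_in_place(keep, target):
    """Keep only the elements of target for which keep(elem) is true."""
    write = 0  # next slot to fill with a kept element
    for elem in target:
        if keep(elem):
            target[write] = elem  # write never runs ahead of the read position
            write += 1
    del target[write:]  # drop the leftover tail in one step


data = [[-1, 2], [5, -3], [2, 3], [1, -1], [7, 1]]
compact_in_place(lambda p: p[0] >= 0 and p[1] >= 0, data)
print(data)
```

This makes a single pass and one deletion, at the cost of being a little less obvious than the reverse loop.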

itertools.ifilter()/ifilterfalse() (Python 2; in Python 3 use the built-in filter() and itertools.filterfalse()) exist to do exactly this: filter an iterable by a predicate (not in-place, obviously).
Better still, avoid creating and allocating the entire filtered list object if at all possible, and just iterate over it:
import itertools
l = [(4,-5), (-8,2), (-2,-3), (4,7)]
# Option 1: create a new filtered list
l_filtered = list(itertools.ifilter(lambda p: p[0] >= 0 and p[1] >= 0, l))
# Option 2: iterate lazily without building a list
for p in itertools.ifilter(lambda p: p[0] >= 0 and p[1] >= 0, l):
    pass  # <subsequent code on each filtered pair>
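For Python 3, where itertools.ifilter no longer exists, a rough equivalent uses the built-in filter (which is lazy there) and itertools.filterfalse. The helper name is_non_negative is just for this sketch.

```python
from itertools import filterfalse

points = [(4, -5), (-8, 2), (-2, -3), (4, 7)]


def is_non_negative(p):
    """True when neither coordinate of the pair is negative."""
    return p[0] >= 0 and p[1] >= 0


# filter() yields the pairs that pass the test, filterfalse() the rest.
kept = list(filter(is_non_negative, points))
dropped = list(filterfalse(is_non_negative, points))

print(kept)
print(dropped)
```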

You probably want del pair instead.

Related

Sorting an array with respect to values of an other array one after the other

Given the following arrays (arr, indices), I need to sort arr by the (i[0])th index, in ascending order if i[1] equals 0 and in descending order if i[1] equals 1, where i refers to each element of the indices array.
Constraints
1<= len(indices) <=10
1<= len(arr) <=10^4
Example
arr=[[1,2,3],[3,2,1],[4,2,1],[6,4,3]]
indices=[[2,0],[0,1]]
required output
[[4,2,1],[3,2,1],[6,4,3],[1,2,3]]
Explanation
first arr gets sorted with respect to 2nd index as (indices[0][0]=2) in ascending order as (indices[0][1]=0)
[[3,2,1],[ 4,2,1],[1,2,3],[6,4,3]]
then it gets sorted with 0th index as (indices[1][0]=0) in descending order as (indices[1][1]=1)
[[4,2,1],[3,2,1],[6,4,3],[1,2,3]]
Note
arr and indices need to be taken as input, so it is not possible for me to hard-code arr.sort(key=lambda x: (x[2],-x[0]))
My Approach
I have tried the following but it is not giving the correct output
arr.sort(key=lambda x:next(x[i[0]] if i[1]==0 else -x[i[0]] for i in indices))
My output
[[3,2,1],[4,2,1],[1,2,3],[6,4,3]]
Expected output
[[4,2,1],[3,2,1],[6,4,3],[1,2,3]]
This one requires a very complex key. It looks to me like you have many different layers of sorting, here, and earlier elements of indices take precedence over later elements, when sort order would be affected.
I think what the sorting key needs to do is return a tuple/iterable, where the first element to sort by is whatever the first element of indices says to do, and the second element to sort by (in case of a tie in the first) is whatever the second element of indices says to do, and so on.
In which case you'd want something like this (a nested comprehension inside the key lambda, to generate that tuple (or, list, in this case)):
arr=[[1,2,3],[3,2,1],[4,2,1],[6,4,3]]
indices=[[2,0],[0,1]]
out = sorted(arr, key=lambda a: [
    (-1 if d else 1) * a[i]
    for (i, d) in indices
])
# [[4, 2, 1], [3, 2, 1], [6, 4, 3], [1, 2, 3]]
For sorting numbers only, you can use the quick hack of multiplying by -1 to sort descending instead of ascending, which is what I did here.
You could use the stability of Python's sort, applying the keys from last to first:
from operator import itemgetter
for i, r in reversed(indices):
    arr.sort(key=itemgetter(i), reverse=r)
This doesn't use the negation trick, so it also works for data other than numbers.
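As a quick check of that claim, here is the same multi-pass idea on rows that contain strings, where negating the key would not work. The rows and spec values are made-up sample data in the question's [column, direction] format.

```python
from operator import itemgetter

rows = [["pear", 3], ["apple", 3], ["fig", 1], ["kiwi", 2]]
spec = [[1, 0], [0, 1]]  # column 1 ascending, ties broken by column 0 descending

# Sort by the least significant key first; each later (more significant)
# pass is stable, so it keeps the earlier ordering among equal keys.
for col, descending in reversed(spec):
    rows.sort(key=itemgetter(col), reverse=bool(descending))

print(rows)
```

The two rows with the value 3 end up ordered pear before apple, i.e. descending by name, without ever negating a string.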
Check this out:
>>> a
[[6, 4, 3], [4, 2, 1], [3, 2, 1], [1, 2, 3]]
>>> i
[[2, 0], [0, 1]]
>>> for j in enumerate(i):
...     a.sort(key=lambda x:x[j[1][0]],reverse=False if j[1][1]==0 else True)
...     print(a)
...
[[3, 2, 1], [4, 2, 1], [1, 2, 3], [6, 4, 3]]
[[6, 4, 3], [4, 2, 1], [3, 2, 1], [1, 2, 3]]
Is this what you want?
I think there is a small mistake in your second example. It should be
[[6, 4, 3], [4, 2, 1], [3, 2, 1], [1, 2, 3]]
rather than
[[4,2,1],[3,2,1],[6,4,3],[1,2,3]]
This is under the step
then it gets sorted with 0th index as (indices[1][0]=0) in descending order as (indices[1][1]=1)

Convert nested iterables to list

Is there an easy way in python (using itertools, or otherwise) to convert a nested iterable f into its corresponding list or tuple? I'd like to save f so I can iterate over it multiple times, which means that if some nested elements of f are generators, I'll be in trouble.
I'll give an example input/output.
>>> g = iter(range(2))
>>> my_input = [1, [2, 3], ((4), 5), [6, g]]
>>> magical_function(my_input)
[1, [2, 3], [[4], 5], [6, [0, 1]]]
It would be fine if the output consisted of tuples, too. The issue is that iterating over g "consumes" it, so it can't be used again.
This seems like it would be best to do by checking if each element is iterable, and calling a recursive function over it if it is iterable. Just as a quick draw-up, I would try something like:
# collections.Iterable was removed in Python 3.10; use collections.abc instead
from collections.abc import Iterable
g = iter(range(2))
my_input = [1, [2, 3], ((4), 5), [6, g]]
def unfold(iterable):
    ret = []
    for element in iterable:
        if isinstance(element, Iterable):
            ret.append(unfold(element))
        else:
            ret.append(element)
    return ret
n = unfold(my_input)
print(n)
print(n)
which returns
$ python3 so.py
[1, [2, 3], [4, 5], [6, [0, 1]]]
[1, [2, 3], [4, 5], [6, [0, 1]]]
It's not the prettiest way, and you can find ways to improve it (it puts everything in a list instead of preserving tuples), but here is the general idea I would use.
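One possible refinement along those lines, sketched under the made-up name materialize: keep tuples as tuples, and treat strings as leaves. The second point matters because a one-character string is itself an iterable that yields the same string, so recursing into strings never bottoms out.

```python
from collections.abc import Iterable


def materialize(obj):
    """Recursively turn nested iterables into lists, keeping tuples as tuples."""
    if isinstance(obj, (str, bytes)) or not isinstance(obj, Iterable):
        return obj  # leaves: scalars and strings are returned unchanged
    items = [materialize(item) for item in obj]
    return tuple(items) if isinstance(obj, tuple) else items


g = iter(range(2))
result = materialize([1, [2, 3], ((4), 5), [6, g], "ab"])
print(result)
print(result)  # safe to reuse: the generator was consumed only once
```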

Sort lists by items at given index

Sorry if I get terminology wrong - I've only just started learning Python, and I'm receiving instruction from friends instead of being on an actual course.
I want to search a list containing lots of arrays containing multiple elements, and find arrays with some elements matching, but some different.
In less confusing terms e.g. I have a list of arrays that each contain 2 elements (I think this is called a 2D array?) so:
list = [[1, 2], [2, 2], [3, 5], [4, 1], [5, 2], ...]
In my specific example, the first elements in each sub array just ascend linearly, but the second elements are almost random. I want to find or sort the arrays only by the second number. I could just remove the first number from each array:
list = [2, 2, 5, 1, 2 ...]
And then use something like "if list[x] == 1" to find '1' etc.
(side note: I'm not sure how to find ALL the values if one value is repeated - I can't remember quite what I wrote but it would only ever find the first instance where the value matched, so e.g. it would detect the first '2' but not the second or third)
But I want to keep the first values in each array. My friend told me that you could use a dictionary with values and keys, which would work for my example, but I want to know what the more general method would be.
So in my example, I hoped that if I wrote this:
if list[[?, x]] == [?, 1]
Then it would find the array where the second value of the array was 1, (i.e. [4, 1] in my example) and not care about the first value. Obviously it didn't work because '?' isn't Python syntax as far as I'm aware, but hopefully you can see what I'm trying to do?
So for a more general case, if I had a list of 5 dimensional arrays and I wanted to find the second and fourth values of each array, I would write:
if list[[?, x, ?, y, ?]] == [?, a, ?, b, ?]
And it would match any array where the value of the second element was 'a', and the value of the fourth was 'b'.
e.g. [3, a, 4, b, 7], [20, a, 1, b, 9], ['cat', a, 'dog', b, 'fish'] etc. would all be possible results found by the command.
So I want to know if there's any similar way to my method of using a question mark (but that actually works) to denote that an element in an array can have any value.
To sort on the second element for a list containing lists (or tuples):
from operator import itemgetter
mylist = [[1, 2], [2, 2], [3, 5], [4, 1], [5, 2]]
sortedlist = sorted(mylist, key=itemgetter(1))
See the Python sorting howto.
Use sorted if you want to keep the original list unaffected:
lst = [[1, 2], [2, 2], [3, 5], [4, 1], [5, 2]]
In [103]: sorted(lst, key=lambda x: x[1])
Out[103]: [[4, 1], [1, 2], [2, 2], [5, 2], [3, 5]]
Otherwise, use list.sort to sort the list in place:
In [106]: lst.sort(key=lambda x: x[1])
In [107]: lst
Out[107]: [[4, 1], [1, 2], [2, 2], [5, 2], [3, 5]]
or use operator.itemgetter
from operator import itemgetter
In [108]: sorted(lst, key=itemgetter(1))
Out[108]: [[4, 1], [1, 2], [2, 2], [5, 2], [3, 5]]
You could use a list comprehension to build a list of all the desired items:
In [16]: seq = [[1, 2], [2, 2], [3, 5], [4, 1], [5, 2]]
To find all items where the second element is 1:
In [17]: [pair for pair in seq if pair[1] == 1]
Out[17]: [[4, 1]]
This finds all items where the second element is 2:
In [18]: [pair for pair in seq if pair[1] == 2]
Out[18]: [[1, 2], [2, 2], [5, 2]]
Instead of
if list[[?, x, ?, y, ?]] == [?, a, ?, b, ?]
you could use
[item for item in seq if item[1] == 'a' and item[3] == 'b']
Note, however, that each time you use a list comprehension, Python has to loop
through all the elements of seq. If you are doing this search multiple times,
you might be better off building a dict:
import collections
seq = [[1, 2], [2, 2], [3, 5], [4, 1], [5, 2]]
dct = collections.defaultdict(list)
for item in seq:
    key = item[1]
    dct[key].append(item)
And then you could access the items like this:
In [22]: dct[1]
Out[22]: [[4, 1]]
In [23]: dct[2]
Out[23]: [[1, 2], [2, 2], [5, 2]]
The list comprehension
[pair for pair in seq if pair[1] == 1]
is roughly equivalent to
result = list()
for pair in seq:
    if pair[1] == 1:
        result.append(pair)
in the sense that result would then equal the list comprehension.
The list comprehension is just a syntactically prettier way to express the same
thing.
The list comprehension above has three parts:
[expression for-loop conditional]
The expression is pair, the for-loop is for pair in seq, and the conditional is if pair[1] == 1.
Most, but not all, list comprehensions share this syntax. The full list comprehension grammar is given in the Python language reference.
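If you want something closer to the question's "?" notation, one way is a sentinel object that means "any value here". This is only a sketch; ANY and matches are names invented for it, not anything built into Python.

```python
ANY = object()  # sentinel standing in for the question's "?"


def matches(item, pattern):
    """True if item has the pattern's length and agrees at every non-ANY slot."""
    return len(item) == len(pattern) and all(
        want is ANY or have == want for have, want in zip(item, pattern)
    )


seq = [[3, "a", 4, "b", 7], [20, "a", 1, "c", 9], ["cat", "a", "dog", "b", "fish"]]
pattern = [ANY, "a", ANY, "b", ANY]
found = [item for item in seq if matches(item, pattern)]
print(found)
```

The list comprehension at the end is the same filtering idea as before; the helper just lets you write the pattern as data instead of as a chain of index comparisons.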

Remove duplicated lists in list of lists in Python

I've seen some closely related questions here, but their answers don't work for me. I have a list of lists where some sublists are repeated, but their elements may be in a different order. For example
g = [[1, 2, 3], [3, 2, 1], [1, 3, 2], [9, 0, 1], [4, 3, 2]]
The output should be, naturally according to my question:
g = [[1,2,3],[9,0,1],[4,3,2]]
I've tried using set, but it only removes lists that are exactly equal (I thought it should work because sets are, by definition, unordered). The other questions I visited only have examples with exactly duplicated lists, like this one: Python : How to remove duplicate lists in a list of list?. For now, the order of the output (for the list and the sublists) is not a problem.
A version that (ab)uses side effects in a list comprehension:
seen = set()
[x for x in g if frozenset(x) not in seen and not seen.add(frozenset(x))]
Out[4]: [[1, 2, 3], [9, 0, 1], [4, 3, 2]]
For those (unlike myself) who don't like using side-effects in this manner:
res = []
seen = set()
for x in g:
    x_set = frozenset(x)
    if x_set not in seen:
        res.append(x)
        seen.add(x_set)
The reason that you add frozensets to the set is that you can only add hashable objects to a set, and vanilla sets are not hashable.
If you don't care about the order for lists and sublists (and all items in sublists are unique):
result = set(map(frozenset, g))
If a sublist may contain duplicates, e.g. [1, 2, 1, 3], then you could use tuple(sorted(sublist)) instead of frozenset(sublist), since frozenset removes duplicates from a sublist.
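A small sketch of the difference between the two keys, on made-up data where one sublist repeats an element:

```python
g = [[1, 2, 1, 3], [1, 2, 3], [3, 2, 1]]

by_set = {frozenset(sub) for sub in g}
by_sorted = {tuple(sorted(sub)) for sub in g}

print(len(by_set))     # frozenset ignores the repeated 1
print(len(by_sorted))  # sorted tuples keep [1, 2, 1, 3] distinct
```

Which key is right depends on whether [1, 2, 1, 3] and [1, 2, 3] should count as the same sublist for your purposes.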
If you want to preserve the order of sublists:
def del_dups(seq, key=frozenset):
    seen = {}
    pos = 0
    for item in seq:
        if key(item) not in seen:
            seen[key(item)] = True
            seq[pos] = item
            pos += 1
    del seq[pos:]
Example:
del_dups(g, key=lambda x: tuple(sorted(x)))
See In Python, what is the fastest algorithm for removing duplicates from a list so that all elements are unique while preserving order?
What about using frozenset, as roippi mentioned, this way:
>>> [list(x) for x in set(frozenset(i) for i in g)]
[[0, 9, 1], [1, 2, 3], [2, 3, 4]]
I would convert each element in the list to a frozenset (which is hashable), then create a set out of it to remove duplicates:
>>> g = [[1, 2, 3], [3, 2, 1], [1, 3, 2], [9, 0, 1], [4, 3, 2]]
>>> set(map(frozenset, g))
set([frozenset([0, 9, 1]), frozenset([1, 2, 3]), frozenset([2, 3, 4])])
If you need to convert the elements back to lists:
>>> map(list, set(map(frozenset, g)))
[[0, 9, 1], [1, 2, 3], [2, 3, 4]]
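The output above is from Python 2. In Python 3, map returns a lazy iterator, so a list comprehension is probably the simplest way to get lists back. A minimal sketch (the name unique is just for illustration, and the order of the result is not guaranteed):

```python
g = [[1, 2, 3], [3, 2, 1], [1, 3, 2], [9, 0, 1], [4, 3, 2]]

# Deduplicate via frozensets, then convert each one back to a list.
unique = [list(s) for s in set(map(frozenset, g))]
print(unique)
```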

Removing commutative pairs in a list in Python

I got a list as follows:
list_1 = [[3, 0], [0, 3], [3, 4]]
I'm trying to filter out the commutative elements in this. For example, [3,0] and [0,3] are the same and I need to keep only one of them. I tried converting this into a set, and it didn't help. I also tried iterating, but it's causing real overhead. Is there any Pythonic way to do this?
Thanks.
For example, you can use a dict comprehension:
>>> {tuple(sorted(t)): t for t in list_1}.values()
[[0, 3], [3, 4]]
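Note that the comprehension keeps the last pair seen for each key, which is why [0, 3] shows up rather than [3, 0]. If you would rather keep the first one, a sketch using dict.setdefault could look like this (it assumes Python 3.7+, where dicts preserve insertion order; first_seen is a made-up name):

```python
list_1 = [[3, 0], [0, 3], [3, 4]]

first_seen = {}
for pair in list_1:
    # setdefault stores the pair only if its sorted key is not present yet
    first_seen.setdefault(tuple(sorted(pair)), pair)

filtered = list(first_seen.values())
print(filtered)
```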
You can use a set of frozensets for the filtering.
If order does not matter:
>>> map(list, set(frozenset(t) for t in list_1))
[[3, 4], [0, 3]]
To retain order:
list_1 = [[3, 0], [0, 3], [3, 4]]
seen = set()
filtered = []
for item in list_1:
    item_set = frozenset(item)
    if item_set not in seen:
        filtered.append(item)
        seen.add(item_set)
Result:
>>> filtered
[[3, 0], [3, 4]]
