What is the name of this topic? - python

I have seen code that puts a condition inside the index of a NumPy array.
For example, given the array a = np.zeros((10,10)),
people write something like
a[ a == 255 ] = 0
I have seen people do quite complex things with this simple construct.
What is this concept called?

I don't think it has a name beyond indexing. Each type can define what such indexing means though, so the common example of a list being an integer-indexed container is not the only thing you can do.
A numpy.ndarray first overloads its __eq__ operator so that an expression like a == 255 doesn't return a single Boolean value. Instead, it returns an entire array of Boolean values, where the ith element of the result is True if a[i] == 255. That is, for a one-dimensional array, result = a == 255 is similar to
result = [a[i] == 255 for i in range(len(a))]
Then, __getitem__ and __setitem__ are overloaded to handle the case where you index an array with another array. In this case, the index "selects" each element of the array where the corresponding Boolean value is true. The assignment
a[a == 255] = 0
then is roughly equivalent to
for x in range(len(a)):
    if (a == 255)[x]:
        a[x] = 0
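A minimal sketch of the two steps described above, using a small array whose values are chosen purely for illustration: the comparison builds a Boolean mask, and indexing with that mask selects (or assigns to) the matching elements.

```python
import numpy as np

# Illustrative data; the values are arbitrary.
a = np.array([[0, 255, 3],
              [255, 7, 255]])

mask = a == 255      # element-wise comparison -> Boolean array of the same shape
print(mask)

selected = a[mask]   # indexing with the mask returns the matching elements as a 1-D copy
print(selected)

a[mask] = 0          # assignment through the mask overwrites only those elements
print(a)
```

Note that the mask works on arrays of any dimension, whereas the explicit loop shown earlier is only a rough picture of the one-dimensional case.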

Related

Check if numpy's array_like is an empty array

Suppose a is an array_like and we want to check if it is empty. Two possible ways to accomplish this are:
if not a:
    pass
if numpy.array(a).size == 0:
    pass
The first solution would also evaluate to True if a=None. However, I would like to only check for an empty array_like.
The second solution seems good enough for that. I was just wondering if there is a numpy built-in function for that, or a better solution than checking the size?
If you want to check whether the size is zero, you can use the numpy.size function to get more concise code:
import numpy
a = []
b = [1,2]
c = [[1,2],[3,4]]
print(numpy.size(a) == 0) # True
print(numpy.size(b) == 0) # False
print(numpy.size(c) == 0) # False
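As a rough sketch of how this differs from `if not a`, the helper `is_empty` below (a hypothetical name, made up for illustration) treats None as non-empty, because numpy.size sees None as a single scalar object:

```python
import numpy as np

def is_empty(array_like):
    """Return True only when the array_like holds zero elements (hypothetical helper)."""
    return np.size(array_like) == 0

print(is_empty([]))        # empty list: no elements
print(is_empty([[], []]))  # shape (2, 0): still no elements
print(is_empty(None))      # None counts as one scalar object, so it is not empty
print(is_empty([1, 2]))
```

Whether None should count as "not empty" or be rejected outright depends on the caller; an explicit `a is None` check in front is an option if None needs separate handling.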

Python: Using Equality Operator Inside of Numpy Array Assignment

I saw this code in some examples online and am trying to understand and modify it:
c = a[b == 1]
Why does this work? It appears b == 1 returns true for each element of b that satisfies the equality. I don't understand how something like a[True] ends up evaluating to something like "For all values in a for which the same indexed value in b is equal to 1, copy them to c"
a, b, and c are all NumPy arrays of the same length containing some data.
I've searched around quite a bit but don't even know what to call this sort of thing.
If I want to add a second condition, for example:
c = a[b == 1 and d == 1]
I get
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I know this happens because that combination of equality operations is ambiguous for reasons explained here, but I am unsure of how to add a.any() or a.all() into that expression in just one line.
EDIT:
For question 2, c = a[(b == 1) & (d == 1)] works. Any input on my first question about how/why this works?
Why wouldn't your example in point (1) work? This is Boolean indexing. If the arrays were different shapes then it may be a different matter, but:
c = a[b == 1]
Is indistinguishable from:
c = a[a == 1]
When you don't know the actual arrays. Nothing specific to a is going on here; a == 1 is just setting up a boolean mask, that you then re-apply to a in a[mask_here]. Doesn't matter what generated the mask.
You just need to put each condition in its own parentheses and combine them with & instead of and. Try this:
c = a[(b == 1) & (d == 1)]
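To illustrate why `and` fails while `&` works, here is a short sketch with made-up arrays a, b and d. `and` asks Python for a single truth value of each whole array, whereas `&` combines the two masks element by element; the parentheses are needed because `&` binds more tightly than `==`.

```python
import numpy as np

# Made-up sample data of equal length.
a = np.array([10, 20, 30, 40])
b = np.array([1, 0, 1, 1])
d = np.array([1, 1, 0, 1])

mask = (b == 1) & (d == 1)   # element-wise AND of two Boolean masks
c = a[mask]
print(c)

# np.logical_and is an equivalent spelling that avoids the precedence issue.
c2 = a[np.logical_and(b == 1, d == 1)]
print(c2)

# `and` needs one truth value per operand, which a multi-element array cannot provide.
try:
    a[b == 1 and d == 1]
except ValueError as exc:
    print("ValueError:", exc)
```

The same pattern applies to `|` (or) and `~` (not) when building more elaborate masks.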

What does x[x < 2] = 0 mean in Python?

I came across some code with a line similar to
x[x<2]=0
Playing around with variations, I am still stuck on what this syntax does.
Examples:
>>> x = [1,2,3,4,5]
>>> x[x<2]
1
>>> x[x<3]
1
>>> x[x>2]
2
>>> x[x<2]=0
>>> x
[0, 2, 3, 4, 5]
This only makes sense with NumPy arrays. The behavior with lists is useless, and specific to Python 2 (not Python 3). You may want to double-check if the original object was indeed a NumPy array (see further below) and not a list.
But in your code here, x is a simple list.
Since
x < 2
is False, i.e. 0,
x[x<2] is x[0], so
x[0] gets changed.
Conversely, x[x>2] is x[True], i.e. x[1],
so x[1] would get changed.
Why does this happen?
The rules for comparison are:
When you order two strings or two numeric types the ordering is done in the expected way (lexicographic ordering for string, numeric ordering for integers).
When you order a numeric and a non-numeric type, the numeric type comes first.
When you order two incompatible types where neither is numeric, they are ordered by the alphabetical order of their typenames.
So, we have the following order:
numeric < list < string < tuple
See the accepted answer for How does Python compare string and int?.
If x is a NumPy array, then the syntax makes more sense because of boolean array indexing. In that case, x < 2 isn't a boolean at all; it's an array of booleans representing whether each element of x was less than 2. x[x < 2] = 0 then selects the elements of x that were less than 2 and sets those cells to 0. See Indexing.
>>> x = np.array([1., -1., -2., 3])
>>> x < 0
array([False, True, True, False], dtype=bool)
>>> x[x < 0] += 20 # All elements < 0 get increased by 20
>>> x
array([ 1., 19., 18., 3.]) # Only elements < 0 are affected
>>> x = [1,2,3,4,5]
>>> x<2
False
>>> x[False]
1
>>> x[True]
2
The bool is simply converted to an integer. The index is either 0 or 1.
The original code in your question works only in Python 2. If x is a list in Python 2, the comparison x < y is False if y is an integer. Comparing a list with an integer has no meaningful answer, but in Python 2, when the operands are not comparable, CPython falls back to the alphabetical ordering of the names of the types; additionally, all numbers come first in mixed-type comparisons. This is not even spelled out in the documentation of CPython 2, and different Python 2 implementations could give different results. That is, [1, 2, 3, 4, 5] < 2 evaluates to False because 2 is a number and thus "smaller" than a list in CPython. This mixed comparison was eventually deemed to be too obscure a feature, and was removed in Python 3.0.
Now, the result of < is a bool; and bool is a subclass of int:
>>> isinstance(False, int)
True
>>> isinstance(True, int)
True
>>> False == 0
True
>>> True == 1
True
>>> False + 5
5
>>> True + 5
6
So basically you're taking the element 0 or 1 depending on whether the comparison is true or false.
If you try the code above in Python 3, you will get TypeError: unorderable types: list() < int() (newer releases word it as TypeError: '<' not supported between instances of 'list' and 'int') due to a change in Python 3.0:
Ordering Comparisons
Python 3.0 has simplified the rules for ordering comparisons:
The ordering comparison operators (<, <=, >=, >) raise a TypeError exception when the operands don’t have a meaningful natural ordering. Thus, expressions like 1 < '', 0 > None or len <= len are no longer valid, and e.g. None < None raises TypeError instead of returning False. A corollary is that sorting a heterogeneous list no longer makes sense – all the elements must be comparable to each other. Note that this does not apply to the == and != operators: objects of different incomparable types always compare unequal to each other.
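A quick way to see this rule on your own interpreter is the sketch below. It assumes Python 3; the exact wording of the error message varies between 3.x releases, so it is printed rather than hard-coded.

```python
x = [1, 2, 3, 4, 5]

try:
    result = x < 2        # list vs. int has no natural ordering in Python 3
except TypeError as exc:
    result = None
    print("TypeError:", exc)

# Equality is still allowed between unrelated types; it is simply False.
print(x == 2)
```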
There are many datatypes that overload the comparison operators to do something different (dataframes from pandas, numpy's arrays). If the code that you were using did something else, it was because x was not a list, but an instance of some other class with operator < overridden to return a value that is not a bool; and this value was then handled specially by x[] (aka __getitem__/__setitem__)
This has one more use: code golf. Code golf is the art of writing programs that solve some problem in as few source code bytes as possible.
return(a,b)[c<d]
is roughly equivalent to
if c < d:
    return b
else:
    return a
except that both a and b are evaluated in the first version, but not in the second version.
c<d evaluates to True or False.
(a, b) is a tuple.
Indexing on a tuple works like indexing on a list: (3,5)[1] == 5.
True is equal to 1 and False is equal to 0.
(a,b)[c<d]
(a,b)[True]
(a,b)[1]
b
or for False:
(a,b)[c<d]
(a,b)[False]
(a,b)[0]
a
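The caveat about evaluation can be checked with a small sketch. The helper functions `left` and `right` are hypothetical and exist only to record that they were called:

```python
calls = []

def left():
    calls.append("left")
    return "a"

def right():
    calls.append("right")
    return "b"

c, d = 1, 2

# Tuple indexing: both elements are built before the index is applied.
golfed = (left(), right())[c < d]
calls_golfed = list(calls)

calls.clear()

# Conditional expression: only the selected branch runs.
plain = right() if c < d else left()
calls_plain = list(calls)

print(golfed, calls_golfed)
print(plain, calls_plain)
```

Both forms pick the same value, but the tuple form calls both helpers, which matters whenever the operands are expensive or have side effects.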
There's a good list on the Stack Exchange network of many nasty things you can do to Python in order to save a few bytes: https://codegolf.stackexchange.com/questions/54/tips-for-golfing-in-python
In normal code this should never be used, though. In your case it would mean that x acts both as something that can be compared to an integer and as a container that supports indexing, which is a very unusual combination. It's probably NumPy code, as others have pointed out.
In general it could mean anything. It was already explained what it means if x is a list or numpy.ndarray but in general it only depends on how the comparison operators (<, >, ...) and also how the get/set-item ([...]-syntax) are implemented.
x.__getitem__(x.__lt__(2)) # this is what x[x < 2] means!
x.__setitem__(x.__lt__(2), 0) # this is what x[x < 2] = 0 means!
Because:
x < value is equivalent to x.__lt__(value)
x[value] is (roughly) equivalent to x.__getitem__(value)
x[value] = othervalue is (also roughly) equivalent to x.__setitem__(value, othervalue).
This can be customized to do anything you want. Just as an example (it loosely mimics NumPy's boolean indexing):
class Test:
    def __init__(self, value):
        self.value = value
    def __lt__(self, other):
        # You could do anything in here. For example create a new list indicating if that
        # element is less than the other value
        res = [item < other for item in self.value]
        return self.__class__(res)
    def __repr__(self):
        return '{0} ({1})'.format(self.__class__.__name__, self.value)
    def __getitem__(self, item):
        # If you index with an instance of this class use "boolean-indexing"
        if isinstance(item, Test):
            res = self.__class__([i for i, index in zip(self.value, item) if index])
            return res
        # Something else was given just try to use it on the value
        return self.value[item]
    def __setitem__(self, item, value):
        if isinstance(item, Test):
            self.value = [i if not index else value for i, index in zip(self.value, item)]
        else:
            self.value[item] = value
So now let's see what happens if you use it:
>>> a = Test([1,2,3])
>>> a
Test ([1, 2, 3])
>>> a < 2 # calls __lt__
Test ([True, False, False])
>>> a[Test([True, False, False])] # calls __getitem__
Test ([1])
>>> a[a < 2] # or short form
Test ([1])
>>> a[a < 2] = 0 # calls __setitem__
>>> a
Test ([0, 2, 3])
Notice this is just one possibility. You are free to implement almost everything you want.

How to search a list of arrays

Consider the following list of two arrays:
from numpy import array
a = array([0, 1])
b = array([1, 0])
l = [a,b]
Then finding the index of a correctly gives
>>> l.index(a)
0
while this does not work for b:
l.index(b)
ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()
It seems to me, that calling a list's .index function is not working for lists of numpy arrays.
Does anybody know an explanation?
Up to now, I always solved this problem kind of daggy by converting the arrays to strings. Does someone know a more elegant and fast solution?
The real question is in fact how l.index(a) can return a correct value, because numpy arrays treat equality in a special manner: l[1] == b returns an array and not a boolean, by comparing individual values. Here it gives array([ True, True], dtype=bool), which cannot be directly converted to a boolean, hence the error.
In fact, Python uses rich comparison, specifically PyObject_RichCompareBool, to compare the searched value to every element of the list in sequence. That means it first tests identity (a is b) and then equality (a == b). So for the first element, as a is l[0], identity is true and index 0 is returned.
But when searching for any other element, the identity test against the first element is false, and the equality test causes the error. (Thanks to Ashwini Chaudhary for the nice explanation in a comment.)
You can confirm it by testing a new copy of an array containing same elements as l[0]:
d = array([0,1])
l.index(d)
It gives the same error, because identity is false and the equality test raises the error.
It means that you cannot rely on any list method using comparison (index, in, remove) and must use custom functions such as the one proposed by @orestiss. Alternatively, as a list of numpy arrays seems hard to use, you should consider wrapping the arrays:
>>> class NArray(object):
...     def __init__(self, arr):
...         self.arr = arr
...     def array(self):
...         return self.arr
...     def __eq__(self, other):
...         if (other.arr is self.arr):
...             return True
...         return (self.arr == other.arr).all()
...     def __ne__(self, other):
...         return not (self == other)
...
>>> a = array([0, 1])
>>> b = array([1, 0])
>>> l = [ NArray(a), NArray(b) ]
>>> l.index(NArray(a))
0
>>> l.index(NArray(b))
1
This error comes from the way numpy treats comparison between array elements (see: link).
So I am guessing that since the first element is the very instance you are searching for, you get the index for it, but when the first element is compared with the second you get this error.
I think you could use something like:
[i for i, temp in enumerate(l) if (temp == b).all()]
to get a list with the indices of equal arrays but since I am no expert in python there could be a better solution (it seems to work...)
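Building on the list comprehension above, one possible helper is sketched below. The function name `index_of_array` is an assumption made for illustration; it relies on numpy.array_equal, which returns a single bool and also copes with arrays of different shapes.

```python
import numpy as np

def index_of_array(arrays, target):
    """Return the index of the first array equal to target (hypothetical helper)."""
    for i, candidate in enumerate(arrays):
        if np.array_equal(candidate, target):   # one bool, never an array
            return i
    raise ValueError("target array is not in the list")

a = np.array([0, 1])
b = np.array([1, 0])
l = [a, b]

print(index_of_array(l, b))
print(index_of_array(l, np.array([0, 1])))   # a fresh copy with the same values also matches
```

Raising ValueError for a missing array mirrors what list.index does, so the helper can be swapped in without changing the surrounding error handling.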

Python for loop doesn't have expected results

I got a bit stuck with a for loop - what I can see it is doing appears correct but isn't exactly what I'm trying to accomplish with it. I've come from a C background but any advice here would be beneficial.
def deal(player_num, cards):
    a = 0
    z = 0
    i = 0
    b = 0
    c = player_num
    hand = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
    for a in range(player_num):
        hand[a] = cards[i] + cards[i+b+c]
        b == b+1
        i == i+1
        z == z+1
    return hand
So the for a in range(player_num) seems to be working (appends a++) but hand[0], hand[1], etc. gets the same hand. I guess it loops a but not the other variables, so I need to use more than 1 nested loop to get i++, b++ and c++?
b == b+1 is a logical expression (returning False every time), not an assignment. I'm guessing you want something like: b += 1
== is the equality operator in Python. = is the assignment operator.
== checks whether its left operand and its right operand are equal and returns True or False accordingly. b and b+1 will never be equal to each other, and either way it does not make sense to perform an operation without side effects (like comparing two values for equality) and then do nothing with its result.
If you want to change the values of your variables, use the assignment operator = instead of ==.
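The difference is easy to see in a tiny loop. This is only an illustration of the two operators, not a corrected version of deal:

```python
b = 0
for _ in range(3):
    b == b + 1      # comparison: evaluates to False and the result is discarded
unchanged = b

for _ in range(3):
    b += 1          # augmented assignment: b really changes
incremented = b

print(unchanged, incremented)
```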
