Disable silent conversions in numpy - python

Is there a way to disable silent conversions in numpy?
import numpy as np
a = np.empty(10, int)
a[2] = 4 # OK
a[3] = 4.9 # Will silently convert to 4, but I would prefer a TypeError
a[4] = 4j # TypeError: can't convert complex to long
Can numpy.ndarray objects be configured to raise a TypeError when they are assigned any value that is not an instance of the array's element type?
If not, would the best alternative be to subclass numpy.ndarray (and override __setattr__ or __setitem__)?

Unfortunately, numpy doesn't offer this feature in array creation. You can only control which casts are allowed when you convert an array (see the casting argument in the documentation for numpy.ndarray.astype).
You could use that feature, or subclass numpy.ndarray, but also consider using the array module from Python's standard library to create a typed array:
from array import array
a = array('i', [0] * 10)
a[2] = 4 # OK
a[3] = 4.9 # TypeError: integer argument expected, got float
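If you would rather stay with NumPy, the subclassing route mentioned above could look roughly like the sketch below. StrictArray is a made-up name, not a NumPy class, and the check relies on astype(..., casting="safe") to reject lossy conversions before the value is stored. Treat it as an illustration under those assumptions, not a complete solution.

```python
import numpy as np


class StrictArray(np.ndarray):
    """Hypothetical ndarray subclass that refuses lossy assignments."""

    def __setitem__(self, key, value):
        # astype with casting="safe" raises TypeError when the value's dtype
        # cannot be converted to the array's dtype without losing information.
        checked = np.asarray(value).astype(self.dtype, casting="safe")
        super().__setitem__(key, checked)


a = np.zeros(10, dtype=int).view(StrictArray)
a[2] = 4            # OK: integer into an integer array
try:
    a[3] = 4.9      # float64 -> integer is not a "safe" cast
except TypeError as exc:
    print("rejected:", exc)
```

The check is based on dtypes rather than values, so a[3] = 4.0 is rejected as well. In-place operations and functions that write through other paths are not covered by this override.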

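Just an idea: pass each value through a helper that turns anything with a decimal point into a string, so that assigning it to the integer array fails instead of truncating.
# Python 2.7.3
>>> def test(value):
...     if '.' in str(value):
...         return str(value)
...     else:
...         return value
...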
>>> a[3]=test(4.0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for long() with base 10: '4.0'

Related

Why does ctypes.c_int completely change its behaviour when put into ctypes Structure?

When I create a variable of type ctypes.c_int, it reports that type and does not allow any math operations:
In [107]: x = c_int(1)
In [108]: x
Out[108]: c_int(1)
In [109]: x+=1
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
----> 1 x+=1
TypeError: unsupported operand type(s) for +=: 'c_int' and 'int'
In [110]: x+=x
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
----> 1 x+=x
TypeError: unsupported operand type(s) for +=: 'c_int' and 'c_int'
In [111]: type(x)
Out[111]: ctypes.c_int
On the other hand, when I make a structure with a c_int field, the field is reported as int and allows math operations, yet it still seems to be stored as a 32-bit C integer: it wraps correctly at 32 bits and honors the sign bit at position 31.
In [112]: class REC(ctypes.Structure): _fields_=[('x',ctypes.c_int),('y',ctypes.c_int)]
In [113]: rec = REC()
In [114]: rec.x
Out[114]: 0
In [114]: type(rec.x)
Out[114]: int # why not ctypes.c_int ???
In [116]: rec.x+=0x7FFFFFFF # += works, so it is regular python int ?
In [117]: rec.x
Out[117]: 2147483647
In [118]: rec.x+=1
In [119]: rec.x
Out[119]: -2147483648 # but it honors sign bit at position 31...
In [122]: rec.x=0xFFFFFFFF
In [123]: rec.x
Out[123]: -1
In [124]: rec.x+=1
In [125]: rec.x
Out[125]: 0 # ...and it wraps on 32 bits, so it is NOT python int!
Can someone explain this behavior? Is there any logic behind this?
The bare c_int has to have two identities: the C object whose addressof may be taken, and the Python object x. The former is an integer, but the latter is not. (Recall that x=2 would just rebind x and would not update the C integer.)
When you put the variable in a structure, ctypes can provide, as a convenience, an attribute-based interface that converts between the C and Python representations. This has its own surprises: store a suitably large value and you’ll see that then rec.x is not rec.x. The manufactured objects are real Python objects, but of course they don’t follow Python rules since they don’t own any data.
The same applies to the bare integer’s value attribute.
Oddly enough, it’s hard to get the equivalent of a bare integer from a structure, so you can’t easily pass a structure member to a function to fill it in.
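The difference can be seen side by side in a short sketch. It reuses the REC structure from the question, assumes that c_int is 32 bits wide (true on mainstream platforms), and the identity check at the end reflects CPython behaviour rather than a language guarantee.

```python
import ctypes


class REC(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


# Bare c_int: do the arithmetic on the Python int exposed by .value.
x = ctypes.c_int(1)
x.value += 1
print(x)

# Structure field: each read builds a Python int, each write truncates to c_int.
rec = REC()
rec.x = 0x7FFFFFFF
rec.x += 1          # the addition happens in Python, the store wraps to 32 bits
print(rec.x)

# Every read manufactures a new int object, so identity is not preserved
# for values outside CPython's small-int cache.
rec.y = 1000000
print(rec.y is rec.y)
```

So x.value plays the same role for the bare integer that attribute access plays for the structure field: both convert to and from a plain Python int on every access.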

ValueError when checking if variable is None or numpy.array

I'd like to check whether a variable is None or a numpy.array. I've implemented a check_a function to do this.
def check_a(a):
    if not a:
        print "please initialize a"
a = None
check_a(a)
a = np.array([1,2])
check_a(a)
But this code raises a ValueError. What is the straightforward way to do this?
ValueError Traceback (most recent call last)
<ipython-input-41-0201c81c185e> in <module>()
6 check_a(a)
7 a = np.array([1,2])
----> 8 check_a(a)
<ipython-input-41-0201c81c185e> in check_a(a)
1 def check_a(a):
----> 2 if not a:
3 print "please initialize a"
4
5 a = None
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Using not a to test whether a is None assumes that the other possible values of a have a truth value of True. However, most NumPy arrays don't have a truth value at all, and not cannot be applied to them.
If you want to test whether an object is None, the most general, reliable way is to literally use an is check against None:
if a is None:
    ...
else:
    ...
This doesn't depend on objects having a truth value, so it works with NumPy arrays.
Note that the test has to be is, not ==. is is an object identity test. == is whatever the arguments say it is, and NumPy arrays say it's a broadcasted elementwise equality comparison, producing a boolean array:
>>> a = numpy.arange(5)
>>> a == None
array([False, False, False, False, False])
>>> if a == None:
...     pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()
On the other side of things, if you want to test whether an object is a NumPy array, you can test its type:
# Careful - the type is np.ndarray, not np.array. np.array is a factory function.
if type(a) is np.ndarray:
    ...
else:
    ...
You can also use isinstance, which will also return True for subclasses of that type (if that is what you want). Considering how terrible and incompatible np.matrix is, you may not actually want this:
# Again, ndarray, not array, because array is a factory function.
if isinstance(a, np.ndarray):
    ...
else:
    ...
If you want to keep using ==, you can compare the types instead of the values, which avoids the array's elementwise comparison:
type(a) == type(None)
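Putting the two checks together, the function from the question could be rewritten along these lines. This is a sketch in Python 3 syntax. The returned strings are illustrative, and returning a message instead of printing it is a choice made here to keep the example easy to verify.

```python
import numpy as np


def check_a(a):
    """Describe a without ever asking an array for its truth value."""
    if a is None:
        return "please initialize a"
    if isinstance(a, np.ndarray):
        return "array with shape {}".format(a.shape)
    return "unexpected type {}".format(type(a).__name__)


print(check_a(None))
print(check_a(np.array([1, 2])))
```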

Writing a python function that takes mean of array

I am trying to answer the questions below but I don't understand the error code when I run it (Required argument 'object' (pos 1) not found). Any help will be appreciated.
Write a python function that takes in two arrays and returns:
a) the mean of the first array
def first_mean(a,b):
    a = np.array()
    b = np.array()
    return np.mean(a)
first_mean([2,3,4],[4,5,6])
b) the mean of the second array
def second_mean(a,b):
    a = np.array()
    b = np.array()
    return np.mean(b)
second_mean([2,3,4],[4,5,6])
c) the Mann-Whitney U-statistic and associated p-value of the two arrays?
def mantest(a,b):
    a = np.array()
    b = np.array()
    return scipy.stats.mannwhitneyu(a,b)
mantest([2,3,4],[4,5,6])
You are trying to create new, empty arrays in your functions for no reason, and np.array() cannot be called without an argument. You are also rebinding the names of your input parameters, which would discard your original input arrays.
What you are doing boils down to
>>> np.mean(np.array())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Required argument 'object' (pos 1) not found
All you need to do is delete the useless lines
a = np.array()
b = np.array()
from your functions.
Demo:
>>> def first_mean_nobody_knows_why_this_has_two_arguments(a, b):
...     return np.mean(a)
...
>>> a = np.array([1,2,3])
>>> b = np.array([4,5,6])
>>> first_mean_nobody_knows_why_this_has_two_arguments(a, b)
2.0
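If the intent behind a = np.array() was to convert list input into arrays, the input has to be passed to the conversion call. A minimal sketch, keeping the two-argument signatures from the exercise and assuming list input as in the question:

```python
import numpy as np


def first_mean(a, b):
    # Convert the input instead of calling np.array() with no argument.
    # b is unused and only kept to match the exercise's signature.
    return np.mean(np.asarray(a))


def second_mean(a, b):
    return np.mean(np.asarray(b))


print(first_mean([2, 3, 4], [4, 5, 6]))
print(second_mean([2, 3, 4], [4, 5, 6]))
```

np.mean accepts plain lists as well, so the explicit conversion is optional here. The same pattern would apply to the inputs of the Mann-Whitney function.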

Fraction object doesn't have __int__ but int(Fraction(...)) still works

In Python, when you have an object you can convert it to an integer using the int function.
For example int(1.3) will return 1. This works internally by using the __int__ magic method of the object, in this particular case float.__int__.
In Python Fraction objects can be used to construct exact fractions.
from fractions import Fraction
x = Fraction(4, 3)
Fraction objects lack an __int__ method, but you can still call int() on them and get a sensible integer back. I was wondering how this was possible with no __int__ method being defined.
In [38]: x = Fraction(4, 3)
In [39]: int(x)
Out[39]: 1
The __trunc__ method is used.
>>> class X(object):
        def __trunc__(self):
            return 2
>>> int(X())
2
__float__ does not work
>>> class X(object):
        def __float__(self):
            return 2.
>>> int(X())
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    int(X())
TypeError: int() argument must be a string, a bytes-like object or a number, not 'X'
The CPython source shows when __trunc__ is used.
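This behaviour depends on the Python version: the delegation from int() to __trunc__ has been deprecated since Python 3.11, and from that version on Fraction defines __int__ itself. The sketch below therefore sticks to calls that behave the same either way. The Half class is hypothetical and defines both hooks explicitly.

```python
import math
from fractions import Fraction

x = Fraction(4, 3)
print(int(x))                        # 1
print(math.trunc(x))                 # 1, math.trunc calls Fraction.__trunc__
print(math.trunc(Fraction(-4, 3)))   # -1, truncation goes toward zero


class Half:
    """Hypothetical value of 1/2 that supports both conversions."""

    def __trunc__(self):    # used by math.trunc()
        return 0

    def __int__(self):      # used by int()
        return 0


print(int(Half()), math.trunc(Half()))
```

Defining __int__ (or __index__) is the safer choice for new classes that should work with int().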

TypeError: return arrays must be of ArrayType for a function that uses only floats

This one really stumps me. I have a function that calculates the weight of a word. I've confirmed that both local variables a and b are of type float:
def word_weight(term):
    a = term_freq(term)
    print a, type(a)
    b = idf(term)
    print b, type(b)
    return a*log(b,2)
Running word_weight("the") prints:
0.0208837518791 <type 'float'>
6.04987801572 <type 'float'>
Traceback (most recent call last):
File "summary.py", line 59, in <module>
print word_weight("the")
File "summary.py", line 43, in word_weight
return a*log(b,2)
TypeError: return arrays must be of ArrayType
why?
You are using the numpy.log function here. Its second argument is not the base but the out array:
>>> import numpy as np
>>> np.log(1.1, 2)
Traceback (most recent call last):
File "<ipython-input-5-4d17df635b06>", line 1, in <module>
np.log(1.1, 2)
TypeError: return arrays must be of ArrayType
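You can use either numpy.math.log or Python's math.log. Note that numpy.math was only an alias for the standard math module and was removed in NumPy 2.0, so on current versions use math.log: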
>>> np.math.log(1.1, 2)
0.13750352374993502
>>> import math
>>> math.log(1.1, 2) #This will return a float object not Numpy's scalar value
0.13750352374993502
Or, if you're dealing only with base 2, you can use numpy.log2, as @WarrenWeckesser suggested.
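For comparison, here is a small sketch of the base-2 options side by side. The value of b is the idf figure printed in the question. The change-of-base line is an extra alternative added here for illustration, not part of the original answer.

```python
import math
import numpy as np

b = 6.04987801572   # the idf value printed in the question

print(math.log(b, 2))            # math.log takes the base as its second argument
print(np.log2(b))                # NumPy's dedicated base-2 logarithm
print(np.log(b) / np.log(2))     # change of base, usable for any base
```

All three agree up to floating-point rounding. The NumPy variants return NumPy scalars and also work elementwise on arrays.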
