Python function to accept numpy ndarray or sequence as arguments

I have seen some Python functions that generally take an (n, 2) shaped numpy ndarray as an argument, but can also "automagically" accept a (2, n) array or even a length-2 sequence (tuple or list).
How is this achieved pythonically? Is there a unified good practice for checking and handling these cases (for example, in the numpy and scipy modules), or does each developer just implement whatever they think best?
I'd just like to avoid chains of (possibly nested) ifs/elifs, in case there is a well-known better way.
Thanks for any help.

You can use the numpy.asarray function to convert any sequence-like input to an array:
>>> import numpy
>>> numpy.asarray([1,2,3])
array([1, 2, 3])
>>> numpy.asarray(numpy.array([2,3]))
array([2, 3])
>>> numpy.asarray(1)
array(1)
>>> numpy.asarray((2,3))
array([2, 3])
>>> numpy.asarray({1:3,2:4})
array({1: 3, 2: 4}, dtype=object)
It's important to note that, as the documentation says, "No copy is performed if the input is already an ndarray." This is really nice, since you can pass an existing array in and it just returns the same array.
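For instance, a quick check of that no-copy behavior (illustrative only):
>>> a = numpy.array([1, 2, 3])
>>> numpy.asarray(a) is a
True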
Once you convert it to a numpy array, just check the length if that's a requirement. Something like:
>>> def f(x):
...     x = numpy.asarray(x)
...     if len(x) != 2:
...         raise Exception("invalid argument")
...
>>> f([1,2])
>>> f([1,2,3])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 4, in f
Exception: invalid argument
Update:
Since you asked, here's a "magic" function that will also accept *args as an array:
>>> def f(*args):
...     args = numpy.asarray(args[0]) if len(args) == 1 else numpy.asarray(args)
...     return args
...
>>> f(7,3,5)
array([7, 3, 5])
>>> f([1,2,3])
array([1, 2, 3])
>>> f((2,3,4))
array([2, 3, 4])
>>> f(numpy.array([1,2,3]))
array([1, 2, 3])
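To address the original shape question ((n, 2) vs (2, n) vs a plain length-2 pair), one common pattern is to convert with numpy.asarray and then normalize the shape explicitly. This is only a sketch of one possible convention, not a numpy/scipy standard (note that a 2x2 input is inherently ambiguous):
import numpy as np

def as_points(data):
    # Return an (n, 2) float array from an (n, 2) or (2, n) array,
    # or from a flat length-2 sequence such as (x, y).
    a = np.asarray(data, dtype=float)
    if a.ndim == 1 and a.size == 2:      # a single (x, y) pair
        return a.reshape(1, 2)
    if a.ndim == 2 and a.shape[1] == 2:  # already (n, 2); a 2x2 input lands here
        return a
    if a.ndim == 2 and a.shape[0] == 2:  # (2, n) -> transpose
        return a.T
    raise ValueError("expected (n, 2), (2, n) or a length-2 sequence")

as_points((3, 4))                    # array([[3., 4.]])
as_points([[1, 2], [3, 4], [5, 6]])  # returned as a (3, 2) float array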

Related

Creating a numpy array from a set

I noticed the following behaviour exhibited by numpy arrays:
>>> import numpy as np
>>> s = {1,2,3}
>>> l = [1,2,3]
>>> np.array(l)
array([1, 2, 3])
>>> np.array(s)
array({1, 2, 3}, dtype=object)
>>> np.array(l, dtype='int')
array([1, 2, 3])
>>> np.array(l, dtype='int').dtype
dtype('int64')
>>> np.array(s, dtype='int')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string, a bytes-like object or a number, not 'set'
There are two things to notice:
- Creating an array from a set results in the array dtype being object.
- Trying to specify a dtype results in an error, which suggests that the set is being treated as a single element rather than as an iterable.
What am I missing? I don't fully understand which bit of Python I'm overlooking. A set is a mutable object, much like a list is.
EDIT: tuples work fine:
>>> t = (1,2,3)
>>> np.array(t)
array([1, 2, 3])
>>> np.array(t).dtype
dtype('int64')
The array factory works best with sequence objects, which a set is not. If you do not care about the order of the elements and know they are all ints or convertible to int, then you can use np.fromiter:
np.fromiter({1,2,3},int,3)
# array([1, 2, 3])
The second (dtype) argument is mandatory. The last (count) argument is optional; providing it can improve performance.
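For example (a small sketch; the element order follows the set's iteration order):
>>> import numpy as np
>>> s = {1, 2, 3}
>>> np.fromiter(s, dtype=int, count=len(s))   # count lets numpy preallocate the output
array([1, 2, 3])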
As you can see from the syntax of the curly brackets, a set is more closely related to a dict than to a list. You can solve it very simply by turning the set into a list or tuple before converting to an array:
>>> import numpy as np
>>> s = {1,2,3}
>>> np.array(s)
array({1, 2, 3}, dtype=object)
>>> np.array(list(s))
array([1, 2, 3])
>>> np.array(tuple(s))
array([1, 2, 3])
However, this might be too inefficient for large sets, because the list or tuple functions have to run through the whole set before the creation of the array can even start. A better method is to use the set as an iterator:
>>> np.fromiter(s, int)
array([1, 2, 3])
The np.array documentation says that the object argument must be "an array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence" (emphasis added).
A set is not a sequence. Specifically, sets are unordered and do not support the __getitem__ method. Hence you cannot create an array from a set the way you are trying to with the list.
Numpy expects the argument to be a list; it doesn't understand the set type, so it creates an object array (the same would happen if you passed any other non-sequence object). You can create a numpy array from a set by first converting the set to a list: numpy.array(list(my_set)). Hope this helps.
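If the iteration order matters for reproducibility (sets are unordered), one option, just as a sketch, is to sort before converting:
>>> import numpy as np
>>> s = {3, 1, 2}
>>> np.array(sorted(s))   # sorted() returns a list, so np.array sees a sequence
array([1, 2, 3])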

Mock: assert mock_calls with a numpy array as argument raises ValueError and np.testing.assert_array_equal does not work

I have a mocked object whose calls I would like to check using mock_calls; it is called with numpy arrays. The problem is that the comparison raises a ValueError, as shown in the following simple toy example.
>>> mocked_model_called_with_np_array = mock.Mock()
>>> mocked_model_called_with_np_array(np.array([1, 2]))
>>> mocked_model_called_with_np_array.mock_calls
[call(array([1, 2]))]
Now I set the expected calls:
>>> expected_call_with_numpy = [mock.call(np.array([1, 2]))]
Now if I check it as shown below, it raises an error:
>>> assert expected_call_with_numpy == mocked_model_called_with_np_array.mock_calls
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-61-9806e62badf5> in <module>
----> 1 assert expected_call_with_numpy == mocked_model_called_with_np_array.mock_calls
c:\..\python\python36\lib\unittest\mock.py in __eq__(self, other)
2053
2054 # this order is important for ANY to work!
-> 2055 return (other_args, other_kwargs) == (self_args, self_kwargs)
2056
2057
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
My search on Stack Overflow and the solutions I found:
HERE it is suggested to use np.testing.assert_array_equal when you have numpy arrays, but this also does not solve my problem, as shown below.
>>> np.testing.assert_array_equal(expected_call_with_numpy, mocked_model_called_with_np_array.mock_calls)
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-57-4a0373c94354> in <module>
----> 1 np.testing.assert_array_equal(expected_call_with_numpy, mocked_model_called_with_np_array.mock_calls)
c:\...\python\python36\lib\site-packages\numpy\testing\utils.py in assert_array_equal(x, y, err_msg, verbose)
852 __tracebackhide__ = True # Hide traceback for py.test
853 assert_array_compare(operator.__eq__, x, y, err_msg=err_msg,
--> 854 verbose=verbose, header='Arrays are not equal')
855
856
c:\...\python\python36\lib\site-packages\numpy\testing\utils.py in assert_array_compare(comparison, x, y, err_msg, verbose, header, precision, equal_nan, equal_inf)
776 names=('x', 'y'), precision=precision)
777 if not cond:
--> 778 raise AssertionError(msg)
779 except ValueError:
780 import traceback
AssertionError:
Arrays are not equal
(mismatch 100.0%)
x: array([['', (array([1, 2]),), {}]], dtype=object)
y: array([['', (array([1, 2]),), {}]], dtype=object)
Note that the arrays are the same, but it still produces an error!
Can anyone comment on how to use mock_calls for a mocked object called with a numpy array, and then check whether mock_calls matches the expected calls? E.g., something like below:
assert expected_call_with_numpy == mocked_model_called_with_np_array.mock_calls
Ran into the same problem trying to check whether mocked calls contained a specific numpy array.
Turns out that the call object is indexable so you can also do stuff like this:
>>> import numpy as np
>>> from unittest import mock
>>> mock_object = mock.MagicMock()
>>> mock_object(1, np.array([1, 2, 3]), a=3)
>>> mock_object(10, np.array([10, 20, 30]), b=30)
>>> calls = mock_object.call_args_list
>>> calls
[call(1, array([1, 2, 3]), a=3), call(10, array([10, 20, 30]), b=30)]
>>> calls[0]
call(1, array([1, 2, 3]), a=3)
>>> calls[0][0]
(1, array([1, 2, 3]))
>>> calls[0][1]
{'a': 3}
So instead of asserting that the calls are equal, you can check that the calls were made and that the arguments passed were correct individually. Ugly, but it works:
>>> assert mock_object.call_count == 2
>>> assert calls[0][0][0] == 1
>>> np.testing.assert_array_equal(calls[0][0][1], np.array([1, 2, 3]))
>>> assert calls[0][1]['a'] == 3
...
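If you'd rather keep an assertion that looks like the one in the question, a small helper can walk call_args_list and compare each argument with np.array_equal. This is just a sketch; the name assert_np_calls_equal is made up here:
import numpy as np
from unittest import mock

def assert_np_calls_equal(mock_obj, expected_calls):
    # Compare call_args_list against expected mock.call(...) entries,
    # using np.array_equal so numpy arguments compare by value.
    actual_calls = mock_obj.call_args_list
    assert len(actual_calls) == len(expected_calls)
    for actual, expected in zip(actual_calls, expected_calls):
        a_args, a_kwargs = actual[0], actual[1]
        e_args, e_kwargs = expected[0], expected[1]
        assert len(a_args) == len(e_args)
        for a, e in zip(a_args, e_args):
            assert np.array_equal(a, e)
        assert set(a_kwargs) == set(e_kwargs)
        for key in e_kwargs:
            assert np.array_equal(a_kwargs[key], e_kwargs[key])

mocked = mock.Mock()
mocked(np.array([1, 2]))
assert_np_calls_equal(mocked, [mock.call(np.array([1, 2]))])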
One can track the numpy arguments passed to any method of a class, simply by creating a fake (mock) class. For example, if I want to check the numpy calls to method bar of an object of class Foo, I can do as follows:
class MockFoo():
    called_by = []
    def bar(self, *args):
        self.called_by.extend([*args])
Now, we have:
>>> a = MockFoo()
>>> a.bar(numpy.array([1, 2]))
>>> a.bar(numpy.array([100, 200]))
>>> a.bar(numpy.array([10000, 20000]))
Now we can simply check the calls to bar as below:
>>> a.called_by
[array([1, 2]), array([100, 200]), array([10000, 20000])]
The main problem comes from numpy overriding the == operator to return an array instead of a single boolean value.
There is a way around that, though. If you use the callee library to check call arguments, you can use callee.general.Matching and provide a lambda to check equality.
Here's how it works with np.allclose:
mocked_fn.assert_called_with(
    callee.general.Matching(lambda x: np.allclose(x, my_array))
)
Note: I'm not associated with callee library in any way.

How to `np.loads()` an `np.save()`d array?

To wit:
>>> foo = np.array([1, 2, 3])
>>> np.save('zomg.npy', foo)
>>> np.load('zomg.npy')
array([1, 2, 3])
All good. What about loads?
>>> np.loads(open('zomg.npy', 'rb').read())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
_pickle.UnpicklingError: STACK_GLOBAL requires str
Nope. Shouldn't this work? np.load() succeeds, so I know the data is not corrupted.
I'd suggest sticking with np.save and np.load unless there is some extra functionality of pickle that you need. In that case it might be less confusing to use pickle directly rather than via one of the np synonyms.
============
There is an undocumented np.loads; just another name for the pickle.loads.
In [573]: np.loads
Out[573]: <function _pickle.loads>
In [574]: np.loads??
Signature: np.loads(data, *, fix_imports=True, encoding='ASCII', errors='strict')
np.ma.loads has more docs, but is just:
def loads(strg):
    ...
    return pickle.loads(strg)
np.load will use pickle for things that aren't regular arrays, but it performs its own load for the np.save format. See what its docs say about pickled objects. And to add to the confusion, pickle.dump of an array uses np.save. That is, the pickle format for an ndarray is the save format.
So there is a relationship between np.load and np.loads, but it isn't quite the same as that between pickle.load and pickle.loads.
================
There isn't an np.dumps, but there is an np.ma.dumps:
In [584]: d=np.ma.dumps(foo)
In [585]: d
Out[585]: b'\x80\x03cnumpy.core.multiarray\n_reconstruct\nq\x00cnumpy\nndarray\nq\x01K\x00\x85q\x02C\x01bq\x03\x87q\x04Rq\x05(K\x01K\x03\x85q\x06cnumpy\ndtype\nq\x07X\x02\x00\x00\x00i4q\x08K\x00K\x01\x87q\tRq\n(K\x03X\x01\x00\x00\x00<q\x0bNNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00tq\x0cb\x89C\x0c\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00q\rtq\x0eb.'
In [586]: np.loads(d)
Out[586]: array([1, 2, 3])
In [587]: np.ma.loads(d)
Out[587]: array([1, 2, 3])
In [588]: import pickle
In [589]: pickle.loads(d)
Out[589]: array([1, 2, 3])
Using the pickle interface to save and load an array:
In [594]: np.ma.dump(foo,open('test.pkl','wb'))
In [595]: np.load('test.pkl')
Out[595]: array([1, 2, 3])
In [600]: pickle.load(open('test.pkl','rb'))
Out[600]: array([1, 2, 3])
This works as a work-around for now:
>>> np.load(io.BytesIO(open('zomg.npy', 'rb').read()))
array([1, 2, 3])
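If the goal is simply in-memory (de)serialization without pickle, a sketch using a BytesIO buffer with np.save/np.load works as well:
>>> import io
>>> import numpy as np
>>> buf = io.BytesIO()
>>> np.save(buf, np.array([1, 2, 3]))    # write in the .npy format, no pickle involved
>>> np.load(io.BytesIO(buf.getvalue()))  # read back from the raw bytes
array([1, 2, 3])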

simple but weird vstack/concatenate problems (python)

I've been reading over the documentation on numpy arrays and some of it is not making sense.
For instance, the answer given here suggest to use np.vstack or np.concatenate to combine arrays, as does many other places on the internet.
However, when I try to do this with lists converted to np.arrays, it doesn't work:
>>> some_list = [1,2,3,4,5]
>>> np.array(some_list)
array([1, 2, 3, 4, 5])
>>> some_Y_list = [2,1,5,6,3]
>>> np.array(some_Y_list)
array([2, 1, 5, 6, 3])
>>> dydx = np.diff(some_Y_list)/np.diff(some_list)
>>> np.vstack([dydx, dydx[-1]])
Traceback (most recent call last):
File "<pyshell#5>", line 1, in <module>
np.vstack([dydx, dydx[-1]])
File "C:\Python27\lib\site-packages\numpy\core\shape_base.py", line 226, in vstack
return _nx.concatenate(map(atleast_2d,tup),0)
ValueError: array dimensions must agree except for d_0
Any way that I can do this?
All I need this for, in this instance, is to make the derivatives of any order the same shape as the X array given by the user, so I can do further processing.
Thanks for any help.
The following won't work except in some very limited circumstances:
np.vstack([dydx, dydx[-1]])
Here, dydx is an array and dydx[-1] is a scalar.
It's unclear what you're trying to achieve, but did you perhaps mean to stack them horizontally?
np.hstack([dydx, dydx[-1]])
In [38]: np.hstack([dydx, dydx[-1]])
Out[38]: array([-1, 4, 1, -3, -3])
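If the goal is just to pad the derivative so it matches the length of the X data, a small sketch (assuming repeating the last value is acceptable) is:
import numpy as np

some_list = [1, 2, 3, 4, 5]
some_Y_list = [2, 1, 5, 6, 3]

dydx = np.diff(some_Y_list) / np.diff(some_list)  # np.diff shortens the array by one
dydx_padded = np.concatenate([dydx, [dydx[-1]]])  # repeat the last value
dydx_padded.shape                                 # (5,), same length as some_list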

How can I convert a list of strings into numerical values?

I want to find out how to convert a list of strings into a list of numbers.
I have a php form through which user enters values for x and y like this:
X: [1,3,4]
Y: [2,4,5]
These values are stored in the database as varchars. From there, they are read by a Python program which is supposed to use them as numerical (numpy) arrays. However, they come back as plain strings, which means that calculations cannot be performed on them. Is there a way to convert them into numerical arrays before processing, or is there something else that is wrong?
You can use a list comprehension along with the strip() and split() functions to turn this into numeric values:
x = '[1,3,4]'
new_x = [int(i) for i in x.strip('[]').split(',')]
new_x
[1, 3, 4]
Use this list of ints as you see fit, e.g., by passing it on to numpy:
from numpy import array
a = array(new_x)
a
array([1, 3, 4])
a * 4
array([ 4, 12, 16])
Here's one way:
>>> import numpy
>>> block = "[1,3,4]"
>>> block = block.strip("[]")
>>> a = numpy.fromstring(block, sep=",", dtype=int)
>>> a
array([1, 3, 4])
>>> a*2
array([2, 6, 8])
If I understand your question correctly, you can use the eval() and compile() built-in functions to achieve your aim:
>>> lst_str = '[1,2,3]'
>>> lst_obj = compile(lst_str, '<string>', 'eval')
>>> eval(lst_obj)
[1, 2, 3]
Keep in mind, however, that using eval() in this manner is potentially unsafe unless you can validate the input.
import ast
import numpy as np

def parse_array(s):
    return np.array(ast.literal_eval(s))

s = '[1,2,3]'
data = parse_array(s)  # -> np.array([1, 2, 3])
