Numpy convert scalars to arrays - python

I am evaluating arbitrary expressions in terms of an x array, such as 3*x**2 + 4. This normally results in an array with x's shape. However if the expression is just a constant, it returns a scalar. What is the best way to ensure it has x's shape without explicitly checking the shape? Multiplying by numpy.ones(x.shape) works, but I think that uses unnecessary computations.
Edit:
To be clear, I don't just want it to be an array of size one; I want it to be the same shape and size as x.
I'm evaluating a string using NumExpr which can contain an arbitrary function of x:
x = numpy.linspace(min, max, num)
y = numexpr.evaluate(expr, {'x': x}, {})
I want to get an array of y-values that could be plotted against x through matplotlib. Currently I am doing this, which works fine:
y = numpy.ones(x.size) * y
But I'm worried that this is wasteful for large sizes. Is there a better way?
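For illustration, here is a minimal sketch of the underlying issue without the numexpr dependency (using plain eval on the expression string), together with the ones-multiplication workaround described above:
import numpy as np

x = np.linspace(0.0, 10.0, 100)

y1 = eval("3*x**2 + 4")   # an array with x's shape
y2 = eval("4.0")          # just a constant -- no shape at all

print(np.shape(y1))       # (100,)
print(np.shape(y2))       # ()

# the workaround from the question: force the constant up to x's shape
y2 = np.ones(x.size) * y2
print(y2.shape)           # (100,)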

See atleast_1d:
Convert inputs to arrays with at least one dimension.
>>> import numpy as np
>>> x = 42 # x is a scalar
>>> np.atleast_1d(x)
array([42])
>>> x_is_array = np.array(42) # A zero dim array
>>> np.atleast_1d(x_is_array)
array([42])
>>> x_is_another_array = np.array([42]) # A 1d array
>>> np.atleast_1d(x_is_another_array)
array([42])
>>> np.atleast_1d(np.ones((3, 3))) # Any other numpy array
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

When I'm unsure whether x will be a scalar, list/tuple or array, I've been using:
x = np.asarray(x).reshape(1, -1)[0,:]
Alternatively by (ab)using the broadcasting rules, you could equally write:
x = np.asarray(x) * np.ones(1)
Perhaps a slightly more streamlined syntax is to make use of the extra arguments on the array constructor:
x = np.array(x, ndmin=1, copy=False)
Which will ensure that the array has at least one dimension.
But this is one of those things that seems a bit clumsy in numpy.
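For reference, a quick demonstration of these one-liners on a bare scalar (np.array's copy=False is omitted here since its meaning changed in NumPy 2.0):
import numpy as np

x = 5.0  # a bare Python scalar

print(np.asarray(x).reshape(1, -1)[0, :])   # [5.]
print(np.asarray(x) * np.ones(1))           # [5.]
print(np.array(x, ndmin=1))                 # [5.]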

You can use reshape: np.reshape(x, (1,1))
Here's a demonstration:
>>> x = 4
>>> a = np.reshape(x, (1,1))
>>> a[0]
array([4])
>>> a[0][0]
4

lin_reg.predict(np.array(6.5).reshape(1,-1))

Related

Vectorize list returning python function into numpy nd-array [duplicate]

numpy.vectorize takes a function f:a->b and turns it into g:a[]->b[].
This works fine when a and b are scalars, but I can't think of a reason why it wouldn't work with b as an ndarray or list, i.e. f:a->b[] and g:a[]->b[][]
For example:
import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
g = np.vectorize(f, otypes=[np.ndarray])
a = np.arange(4)
print(g(a))
This yields:
array([[ 0. 0. 0. 0. 0.],
       [ 1. 1. 1. 1. 1.],
       [ 2. 2. 2. 2. 2.],
       [ 3. 3. 3. 3. 3.]], dtype=object)
Ok, so that gives the right values, but the wrong dtype. And even worse:
g(a).shape
yields:
(4,)
So this array is pretty much useless. I know I can convert it doing:
np.array(list(map(list, g(a))), dtype=np.float32)
to give me what I want:
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.]], dtype=float32)
but that is neither efficient nor pythonic. Can any of you guys find a cleaner way to do this?
np.vectorize is just a convenience function. It doesn't actually make code run any faster. If it isn't convenient to use np.vectorize, simply write your own function that works as you wish.
The purpose of np.vectorize is to transform functions which are not numpy-aware (e.g. take floats as input and return floats as output) into functions that can operate on (and return) numpy arrays.
Your function f is already numpy-aware -- it uses a numpy array in its definition and returns a numpy array. So np.vectorize is not a good fit for your use case.
The solution therefore is just to roll your own function f that works the way you desire.
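As one possible sketch of what "rolling your own" could look like here (assuming the goal is to turn each input scalar into a row of five, as in the question), plain broadcasting suffices:
import numpy as np

def f(x):
    # numpy-aware version: broadcast x against a row of five ones,
    # so scalars and arrays of any shape both work
    x = np.asarray(x)
    return x[..., np.newaxis] * np.ones(5, dtype=np.float32)

print(f(2.0))                  # [2. 2. 2. 2. 2.]
print(f(np.arange(4)).shape)   # (4, 5)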
A new parameter, signature, added in NumPy 1.12.0, does exactly what you want.
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
g = np.vectorize(f, signature='()->(n)')
Then g(np.arange(4)).shape will give (4, 5).
Here the signature of f is specified: (n) is the shape of the return value, and () is the shape of the parameter, which is a scalar. The parameters can be arrays too. For more complex signatures, see the Generalized Universal Function API.
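To illustrate the "parameters can be arrays too" part, here is a small made-up example with a signature that consumes a 1-D vector per call and returns a scalar:
import numpy as np

def pairwise_dot(u, v):
    # consumes one pair of 1-D vectors per call and returns a scalar
    return (u * v).sum()

h = np.vectorize(pairwise_dot, signature='(n),(n)->()')

a = np.arange(12).reshape(3, 4)
b = np.ones((3, 4))
print(h(a, b))   # row-wise dot products: [ 6. 22. 38.]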
import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
g = np.vectorize(f, otypes=[np.ndarray])
a = np.arange(4)
b = g(a)
b = np.array(b.tolist())
print(b)  # b.shape = (4, 5)
c = np.ones((2,3,4))
d = g(c)
d = np.array(d.tolist())
print(d)  # d.shape = (2, 3, 4, 5)
This should fix the problem, and it will work regardless of the size of your input. map only works for one-dimensional inputs. Using .tolist() and creating a new ndarray solves the problem more completely and nicely (I believe). Hope this helps.
You want to vectorize the function
import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
Assuming that you want to get np.float32 arrays as the result, you have to specify this as the otype. In your question, however, you specified otypes=[np.ndarray], which means you want every element to be an np.ndarray; thus, you correctly get a result of dtype=object.
The correct call would be
np.vectorize(f, signature='()->(n)', otypes=[np.float32])
For such a simple function, however, it is better to leverage NumPy's ufuncs; np.vectorize just loops over the input in Python. So in your case just rewrite your function as
def f(x):
    return np.multiply.outer(x, np.array([1,1,1,1,1], dtype=np.float32))
This is faster and produces less obscure errors (note, however, that the result's dtype depends on x: if you pass a complex or quad-precision number, so will be the result).
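A quick usage check of the outer-product version (the function is repeated so the snippet is self-contained; expected shapes are noted in comments):
import numpy as np

def f(x):
    return np.multiply.outer(x, np.array([1, 1, 1, 1, 1], dtype=np.float32))

print(f(np.arange(4)).shape)   # (4, 5)
print(f(2.0))                  # a bare scalar works too: [2. 2. 2. 2. 2.]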
I've written a function that seems to fit your need.
def amap(func, *args):
    '''array version of built-in map

    amap(function, sequence[, sequence, ...]) -> array

    Examples
    --------
    >>> amap(lambda x: x**2, 1)
    array(1)
    >>> amap(lambda x: x**2, [1, 2])
    array([1, 4])
    >>> amap(lambda x,y: y**2 + x**2, 1, [1, 2])
    array([2, 5])
    >>> amap(lambda x: (x, x), 1)
    array([1, 1])
    >>> amap(lambda x,y: [x**2, y**2], [1,2], [3,4])
    array([[1, 9], [4, 16]])
    '''
    args = np.broadcast(None, *args)
    res = np.array([func(*arg[1:]) for arg in args])
    shape = args.shape + res.shape[1:]
    return res.reshape(shape)
Let's try
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
amap(f, np.arange(4))
Outputs
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.]], dtype=float32)
You may also wrap it with lambda or partial for convenience
g = lambda x:amap(f, x)
g(np.arange(4))
Note the docstring of vectorize says
The vectorize function is provided primarily for convenience, not for
performance. The implementation is essentially a for loop.
Thus we would expect amap here to have similar performance to vectorize. I didn't check it; any performance tests are welcome.
If performance is really important, you should consider something else, e.g. direct array calculation with reshaping and broadcasting to avoid the loop in pure Python (both vectorize and amap are the latter case).
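For example, a rough sketch of that direct broadcast calculation for the f in question (my own illustration, not part of the answer above):
import numpy as np

a = np.arange(4)
row = np.array([1, 1, 1, 1, 1], dtype=np.float32)

# no Python-level loop: make a a column and let broadcasting do the rest
result = a[:, np.newaxis] * row
print(result.shape)   # (4, 5)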
The best way to solve this would be to use a 2-D NumPy array (in this case a column array) as an input to the original function, which will then generate a 2-D output with the results I believe you were expecting.
Here is what it might look like in code:
import numpy as np
def f(x):
    return x*np.array([1, 1, 1, 1, 1], dtype=np.float32)
a = np.arange(4).reshape((4, 1))
b = f(a)
# b is a 2-D array with shape (4, 5)
print(b)
This is a much simpler and less error-prone way to complete the operation. Rather than trying to transform the function with numpy.vectorize, this method relies on NumPy's natural ability to broadcast arrays. The trick is to shape the input as a column so that, when it is multiplied by the length-5 row, broadcasting produces the full 2-D result.

Why is len(a[0]) different from a.shape[1]

I have a matrix "a" that has the following properties:
>>> a.shape
(3, 220)
>>> a.shape[1]
220
>>> len(a)
3
>>> len(a[0])
1
>>> a[0].shape
(1, 220)
I don't get why len(a[0]) is different from a.shape[1]. It seems like I can never access the subarray a[0]. Please help me to understand why that is the case. Thanks!
Note: the NumPy documentation recommends that np.matrix should not be used; instead, just use regular arrays:
It is no longer recommended to use this class, even for linear algebra. Instead use regular arrays. The class may be removed in the future.
If you check out what a[0] is, you'll see the problem. Let's implement this in a smaller size so that it's easier to visualize:
>>> import numpy as np
>>> # I'm using all zeros here for simplicity
>>> y = np.matrix(np.zeros((5, 10)))
>>> y.shape
(5, 10)
>>> y[0]
matrix([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
y[0] is a matrix consisting of 1 row and 10 columns:
>>> y[0].shape
(1, 10)
If you use np.array, you avoid this problem altogether
>>> x = np.zeros((5, 10))
>>> x.shape
(5, 10)
>>> len(x[0])
10
>>> x[0].shape
(10,)
As user2357112 pointed out, the problem appears to be that you are using numpy.matrix instead of numpy.ndarray (via numpy.array).
The Numpy documentation says the following about matrix:
It is no longer recommended to use this class, even for linear algebra. Instead use regular arrays. The class may be removed in the future.
A regular NumPy array is very similar to a matrix, but it can have any number of dimensions, and it uses the @ operator instead of * for matrix multiplication.
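For instance, a small sketch of the same shapes with regular arrays, plus the @ operator:
import numpy as np

a = np.zeros((3, 220))
print(a.shape[1])     # 220
print(len(a[0]))      # 220 -- a[0] is a 1-D row of length 220
print(a[0].shape)     # (220,)

# matrix multiplication with regular arrays uses @
m = np.ones((3, 4))
n = np.ones((4, 2))
print((m @ n).shape)  # (3, 2)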

How to replace elements of preallocated np.array in Python, matlab style

I am new to Python, coming from MATLAB. In MATLAB, when I want to save a vector into a preallocated matrix, I do this (MATLAB code):
a = zeros(5, 2)
b = zeros(5, 1)
% save elements of b in the first column of a
a(:, 1) = b
Now I am using NumPy in Python. I do not really know how to describe this problem. What I am doing here is essentially this
a = np.zeros([5, 2])
b = np.ones([5, 1])
a[:, 0] = np.reshape(b, a[:, 0].shape)
because the following solution is not working:
a[:, 0] = b # Not working
Can anyone point out other ways of doing it, closer to the MATLAB style?
Simple way would be -
a[:,[0]] = b
Sample run -
In [217]: a = np.zeros([5, 2])
...: b = np.ones([5, 1])
...:
In [218]: a[:,[0]] = b
In [219]: a
Out[219]:
array([[ 1.,  0.],
       [ 1.,  0.],
       [ 1.,  0.],
       [ 1.,  0.],
       [ 1.,  0.]])
Basically, when slicing with a scalar index, as in a[:,0], the number of dimensions is reduced (the dimension along which the scalar is used is removed) for the assignment. When we specify a list of indices, as in a[:,[0]], the dimensions are preserved, i.e. the slice stays 2D, and that allows us to assign b, which is also 2D. Let's test that out -
In [225]: a[:,0].shape
Out[225]: (5,) # 1D array
In [226]: a[:,[0]].shape
Out[226]: (5, 1) # 2D array
In [227]: b.shape
Out[227]: (5, 1) # 2D array
For reference, here's a link to the slicing scheme. Quoting the relevant part from it -
An integer, i, returns the same values as i:i+1 except the
dimensionality of the returned object is reduced by 1.
In particular, a selection tuple with the p-th element an integer (and all other
entries :) returns the corresponding sub-array with dimension N - 1.
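Another option, sketched here for comparison (not part of the answer above), is to keep the scalar index and flatten b so both sides of the assignment are 1D:
import numpy as np

a = np.zeros((5, 2))
b = np.ones((5, 1))

a[:, 0] = b.ravel()   # both sides now have shape (5,)
print(a[:, 0])        # [1. 1. 1. 1. 1.]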

How to assign a 1D numpy array of length x to an element of length y of a 2D Numpy Array?

I'm looking for a way to assign a 1D numpy-array consisting of x elements to a 2D numpy Array of shape (y,z).
Example:
A=np.array([[0],[0],[0]])
A[2]=np.array([0,2])
Which should result in
A=[[0],[0],[0,2]]
This works perfectly fine using a python list, but has been causing me huge trouble when trying to do it in numpy, usually resulting in the error message:
could not broadcast input array from shape (z) into shape (x)
This seems to occur as a result of the fact that numpy copies everything instead of modifying the array in place. I have only recently begun using numpy and would really be grateful if someone could help find a way to do this efficiently.
Actually the issue is that Numpy refuses to perform implicit copies or reshapes. For instance:
>>> A=np.array([[0],[0],[0]])
>>> A[2]=np.array([0,2])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (2) into shape (1)
Here A[2] is a subarray of A, of shape (1,). Two cells can't fit in one, so we get a shape error. The reverse situation is possible and is known as broadcasting:
>>> A[0:2]=5
>>> A
array([[5],
       [5],
       [0]])
Here a single scalar has been broadcast to update the entire subarray. We can resize A to be able to fit the shape 2 entry:
>>> A.shape
(3, 1)
>>> A.resize((3,2))
>>> A.shape
(3, 2)
>>> A[2]=np.array([0,2])
>>> A
array([[5, 5],
       [0, 0],
       [0, 2]])
We can see that the resizing actually reorganized our cells. It still starts with 5 5 0 but the cells are no longer along a single column. This is because numpy doesn't copy unless asked to, either; all of our multicell slices in fact refer into the same original array. We can make a second matrix and copy the original into a single column there:
>>> B=np.zeros((A.shape[0]+1,A.shape[1]))
>>> B[:,0]=A.transpose()
>>> B
array([[ 5.,  0.],
       [ 5.,  0.],
       [ 0.,  0.]])
The transpose is because the slice of B is a 1-dimensional shape (3 long) rather than a 2-dimensional shape like A (which is 1 wide and 3 high). NumPy considers a 1-dimensional array to be a horizontal shape, so a matrix 3 wide and 1 high will fit. You could think of it like copying a range of cells in a spreadsheet.
Notably, the numbers thus placed in B are copies of what was in A, because B is a separate array that we modified. Views can be used to manipulate sections of a matrix (including seeing it in another shape, as transpose() does), for instance:
>>> C=B[::-1,1]
>>> C
array([ 0., 0., 0.])
>>> C[:]=[1,2,3]
>>> B
array([[ 5.,  3.],
       [ 5.,  2.],
       [ 0.,  1.]])

Create a dynamic 2D numpy array on the fly

I am having a hard time creating a numpy 2D array on the fly.
So basically I have a for loop something like this.
for ele in huge_list_of_lists:
    instance = np.array(ele)
This creates a 1D numpy array of the list, and now I want to append it to a numpy array, basically converting a list of lists to an array of arrays?
I have checked the manual and the np.append() method, but that doesn't work on its own: for np.append() to work, it needs two arguments to append together.
Any clues?
Create the 2D array up front, and fill the rows while looping:
my_array = numpy.empty((len(huge_list_of_lists), row_length))
for i, x in enumerate(huge_list_of_lists):
    my_array[i] = create_row(x)
where create_row() returns a list or 1D NumPy array of length row_length.
Depending on what create_row() does, there might be even better approaches that avoid the Python loop altogether.
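For concreteness, a self-contained sketch of that pattern (create_row here is a made-up stand-in for whatever per-row computation you need):
import numpy as np

huge_list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
row_length = 3

def create_row(x):
    # hypothetical per-row computation; replace with whatever builds a row
    return np.asarray(x, dtype=float) * 2

my_array = np.empty((len(huge_list_of_lists), row_length))
for i, x in enumerate(huge_list_of_lists):
    my_array[i] = create_row(x)

print(my_array)
print(my_array.shape)   # (3, 3)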
Just pass the list of lists to numpy.array. Keep in mind that numpy arrays are ndarrays, so the concept of a list of lists doesn't translate to an array of arrays; it translates to a 2D array.
>>> import numpy as np
>>> a = [[1., 2., 3.], [4., 5., 6.]]
>>> b = np.array(a)
>>> b
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
>>> b.shape
(2, 3)
Also ndarrays have nd-indexing so [1][1] becomes [1, 1] in numpy:
>>> a[1][1]
5.0
>>> b[1, 1]
5.0
Did I misunderstand your question?
You definitely don't want to use numpy.append for something like this. Keep in mind that numpy.append has O(n) run time, so if you call it n times, once for each row of your array, you end up with an O(n^2) algorithm. If you need to create the array before you know what all the content is going to be, but you know the final size, it's best to create an array using numpy.zeros(shape, dtype) and fill it in later. Similar to Sven's answer.
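To make the contrast concrete, here is a rough sketch of the two patterns (my own illustration): the append version reallocates and copies the whole accumulated array on every iteration, while the preallocated version fills rows in place.
import numpy as np

rows = [np.arange(5, dtype=float) for _ in range(1000)]

# O(n^2): numpy.append copies the whole accumulated array on every call
out = np.empty((0, 5))
for r in rows:
    out = np.append(out, r[np.newaxis, :], axis=0)

# O(n): preallocate once, then fill rows in place
out2 = np.zeros((len(rows), 5))
for i, r in enumerate(rows):
    out2[i] = r

print(np.array_equal(out, out2))   # True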
import numpy as np
ss = np.ndarray(shape=(3, 3), dtype=int)
ss
array([[              0, 139911262763080, 139911320845424],
       [       10771584,        10771584, 139911271110728],
       [139911320994680, 139911206874808,              80]])  # random, uninitialized memory
The numpy.ndarray constructor achieves this; see the numpy.ndarray documentation.
