Why Numpy array reshape takes two brackets? - python

While learning the Python numpy library I saw the array reshape function.
a = [1,2,3]
b = [5,6,7]
c = a + b
d = np.array(c)
e = d.reshape((2,3))
print(e)
In the above code why is it that reshape takes two brackets?
Why I can't write it like this?
e = d.reshape(2,3)
I saw this question anywhere else.

As mentioned in numpy doc for numpy.rehsape:
numpy.reshape(a, newshape, order='C')
newshape must be an int or tuple of ints.
And in the doc for ndarray.reshape:
ndarray.reshape(shape, order='C')
"Unlike the free function numpy.reshape, this method on ndarray allows the elements of the shape parameter to be passed in as separate arguments. For example, a.reshape(10, 11) is equivalent to a.reshape((10, 11))."

reshape method get shape parameter which it should be tuple as (2, 3)
def reshape(self, shape, *shapes, order='C'): # known case of numpy.core.multiarray.ndarray.reshape
"""
a.reshape(shape, order='C')
Returns an array containing the same data with a new shape.
Refer to `numpy.reshape` for full documentation.
See Also
--------
numpy.reshape : equivalent function
Notes
-----
Unlike the free function `numpy.reshape`, this method on `ndarray` allows
the elements of the shape parameter to be passed in as separate arguments.
For example, ``a.reshape(10, 11)`` is equivalent to
``a.reshape((10, 11))``.
"""
pass
```

Related

Creating a 3x4 random integers from 0-100 array

It is an exercise from a crash course in numpy in Python from Google. I've looked at the solution, but I want to know if there's a way to solve my error, and complete the exercise the way I tried at first.
I have tried this:
rand_0_100 = np.random.randint(low=0, high=101, size=(12))
my_data = np.array(
[rand_0_100[0], rand_0_100[1], rand_0_100[2], rand_0_100[3]],
[rand_0_100[4], rand_0_100[5], rand_0_100[6], rand_0_100[7]],
[rand_0_100[8]],
[rand_0_100[9], rand_0_100[10], rand_0_100[11]]
)
... and I get this error:
TypeError: array() takes from 1 to 2 positional arguments but 4 were given
Looking at the solution, I now know I can make the random number from 0-100 an array, by modifying the size argument to (3, 4), but I'd like to know if it is possible to make the array the way I tried.
Yes, that is possible, but you will need to pass a single 2-dimensional list to np.array, not four 1-dimensional lists:
import numpy as np
rand_0_100 = np.random.randint(low=0, high=101, size=(12))
out = np.array([ # <<< Note this opening [
[rand_0_100[0], rand_0_100[1], rand_0_100[2], rand_0_100[3] ],
[rand_0_100[4], rand_0_100[5], rand_0_100[6], rand_0_100[7] ],
[rand_0_100[8], rand_0_100[9], rand_0_100[10], rand_0_100[11]]
]) # <<< and this closing ]
Numpy also provides a clean way to do this for you using reshape:
out = rand_0_100.reshape(3, 4)
But as you mention, the best way is to simply pass size=(3, 4) to randint.
by the way for the original problem you can have a matrix of 3*4 by this approach :
rand_0_100 =np.random.randint(low=0, high=101, size=(12))
my_data =rand_0_100 .reshape(3,4)
np.array get the arguments as below:
numpy.array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)
object : is the array you pass to it.
dtype : The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence.
copy :If true (default), then the object is copied. Otherwise, a copy will only be made if array returns a copy
order: Specify the memory layout of the array.
in your code you gave 4 list as argument to the function. that's why your program face the error
TypeError: array() takes from 1 to 2 positional arguments but 4 were given
hope it helps.

Why the length of the array appended in loop is more than the number of iteration?

I ran this code and expected an array size of 10000 as time is a numpy array of length of 10000.
freq=np.empty([])
for i,t in enumerate(time):
freq=np.append(freq,np.sin(t))
print(time.shape)
print(freq.shape)
But this is the output I got
(10000,)
(10001,)
Can someone explain why I am getting this disparity?
It turns out that the function np.empty() returns an uninitialized array of a given shape. Hence, when you do np.empty([]), it returns an uninitialized array as array(0.14112001). It's like having a value "ready to be used", but without having the actual value. You can check this out by printing the variable freq before the loop starts.
So, when you loop over freq = np.append(freq,np.sin(t)) this actually initializes the array and append a second value to it.
Also, if you just need to create an empty array just do x = np.array([]) or x = [].
You can read more about this numpy.empty function here:
https://numpy.org/doc/1.18/reference/generated/numpy.empty.html
And more about initializing arrays here:
https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.3/com.ibm.xlc1313.aix.doc/language_ref/aryin.html
I'm not sure if I was clear enough. It's not a straight forward concept. So please let me know.
You should fill np.empty(0).
I look for source code of numpy numpy/core.py
def empty(shape, dtype=None, order='C'):
"""Return a new matrix of given shape and type, without initializing entries.
Parameters
----------
shape : int or tuple of int
Shape of the empty matrix.
dtype : data-type, optional
Desired output data-type.
order : {'C', 'F'}, optional
Whether to store multi-dimensional data in row-major
(C-style) or column-major (Fortran-style) order in
memory.
See Also
--------
empty_like, zeros
Notes
-----
`empty`, unlike `zeros`, does not set the matrix values to zero,
and may therefore be marginally faster. On the other hand, it requires
the user to manually set all the values in the array, and should be
used with caution.
Examples
--------
>>> import numpy.matlib
>>> np.matlib.empty((2, 2)) # filled with random data
matrix([[ 6.76425276e-320, 9.79033856e-307], # random
[ 7.39337286e-309, 3.22135945e-309]])
>>> np.matlib.empty((2, 2), dtype=int)
matrix([[ 6600475, 0], # random
[ 6586976, 22740995]])
"""
return ndarray.__new__(matrix, shape, dtype, order=order)
It will input first arg shape into ndarray, so it will init a new array as [].
And you can print np.empty(0) and freq=np.empty([]) to see what are their differences.
I think you are trying to replicate a list operation:
freq=[]
for i,t in enumerate(time):
freq.append(np.sin(t))
But neither np.empty or np.append are exact clones; the names are similar but the differences are significant.
First:
In [75]: np.empty([])
Out[75]: array(1.)
In [77]: np.empty([]).shape
Out[77]: ()
This is a 1 element, 0d array.
If you look at the code for np.append you'll see that if the 1st argument is not 1d (and axis is not provided) it flattens it (that's documented as well):
In [78]: np.append??
In [82]: np.empty([]).ravel()
Out[82]: array([1.])
In [83]: np.empty([]).ravel().shape
Out[83]: (1,)
It is not a 1d, 1 element array. Append that with another array:
In [84]: np.append(np.empty([]), np.sin(2))
Out[84]: array([1. , 0.90929743])
The result is 2d. Repeat that 1000 times and you end up with 1001 values.
np.empty despite its name does not produce a [] list equivalent. As others show np.array([]) sort of does, as would np.empty(0).
np.append is not a list append clone. It is just a cover function to np.concatenate. It's ok for adding an element to a longer array, but beyond that it has too many pitfalls to be useful. It's especially bad in a loop like this. Getting a correct start array is tricky. And it is slow (compared to list append). Actually these problems apply to all uses of concatenate and stack... in a loop.

array of unknown size

I am pulling data from a .CSV into an array, as follows:
my_data = genfromtxt('nice.csv', delimiter='')
a = np.array(my_data)
I then attempt to establish the size and shape of the array, thus:
size_array=np.size(a)
shape_array=np.shape(a)
Now, I want to generate an array of identical shape and size, and then carry out some multiplications. The trouble I am having is generating the correctly sized array. I have tried this:
D = np.empty([shape_array,])
I receive the error:
"tuple' object cannot be interpreted as an index".
After investigation my array has a shape of (248L,). Please...how do I get this array in a sensible format?
Thanks.
The line shape_array=np.shape(a) creates a shape tuple, which is the expected input to np.empty.
The expression [shape_array,] is that tuple, wrapped in a list, which seems superfluous. Use shape_array directly:
d = np.empty(shape_array)
On a related note, you can use the function np.empty_like to get an array of the same shape and type as the original more effectively:
d = np.empty_like(a)
If you want to use just the shape and size, there is really no need to store them in separate variables after calling np.size and np.shape. It is more idiomatic to use the corresponding properties of np.ndarray directly:
d = np.empty(a.shape)

What is the difference between ndarray and array in NumPy?

What is the difference between ndarray and array in NumPy? Where is their implementation in the NumPy source code?
numpy.array is just a convenience function to create an ndarray; it is not a class itself.
You can also create an array using numpy.ndarray, but it is not the recommended way. From the docstring of numpy.ndarray:
Arrays should be constructed using array, zeros or empty ... The parameters given here refer to a
low-level method (ndarray(...)) for instantiating an array.
Most of the meat of the implementation is in C code, here in multiarray, but you can start looking at the ndarray interfaces here:
https://github.com/numpy/numpy/blob/master/numpy/core/numeric.py
numpy.array is a function that returns a numpy.ndarray object.
There is no object of type numpy.array.
Just a few lines of example code to show the difference between numpy.array and numpy.ndarray
Warm up step: Construct a list
a = [1,2,3]
Check the type
print(type(a))
You will get
<class 'list'>
Construct an array (from a list) using np.array
a = np.array(a)
Or, you can skip the warm up step, directly have
a = np.array([1,2,3])
Check the type
print(type(a))
You will get
<class 'numpy.ndarray'>
which tells you the type of the numpy array is numpy.ndarray
You can also check the type by
isinstance(a, (np.ndarray))
and you will get
True
Either of the following two lines will give you an error message
np.ndarray(a) # should be np.array(a)
isinstance(a, (np.array)) # should be isinstance(a, (np.ndarray))
numpy.ndarray() is a class, while numpy.array() is a method / function to create ndarray.
In numpy docs if you want to create an array from ndarray class you can do it with 2 ways as quoted:
1- using array(), zeros() or empty() methods:
Arrays should be constructed using array, zeros or empty (refer to the See Also section below). The parameters given here refer to a low-level method (ndarray(…)) for instantiating an array.
2- from ndarray class directly:
There are two modes of creating an array using __new__:
If buffer is None, then only shape, dtype, and order are used.
If buffer is an object exposing the buffer interface, then all keywords are interpreted.
The example below gives a random array because we didn't assign buffer value:
np.ndarray(shape=(2,2), dtype=float, order='F', buffer=None)
array([[ -1.13698227e+002, 4.25087011e-303],
[ 2.88528414e-306, 3.27025015e-309]]) #random
another example is to assign array object to the buffer
example:
>>> np.ndarray((2,), buffer=np.array([1,2,3]),
... offset=np.int_().itemsize,
... dtype=int) # offset = 1*itemsize, i.e. skip first element
array([2, 3])
from above example we notice that we can't assign a list to "buffer" and we had to use numpy.array() to return ndarray object for the buffer
Conclusion: use numpy.array() if you want to make a numpy.ndarray() object"
I think with np.array() you can only create C like though you mention the order, when you check using np.isfortran() it says false. but with np.ndarrray() when you specify the order it creates based on the order provided.

How to construct an np.array with fromiter

I'm trying to construct an np.array by sampling from a python generator, that yields one row of the array per invocation of next. Here is some sample code:
import numpy as np
data = np.eye(9)
labels = np.array([0,0,0,1,1,1,2,2,2])
def extract_one_class(X,labels,y):
""" Take an array of data X, a column vector array of labels, and one particular label y. Return an array of all instances in X that have label y """
return X[np.nonzero(labels[:] == y)[0],:]
def generate_points(data, labels, size):
""" Generate and return 'size' pairs of points drawn from different classes """
label_alphabet = np.unique(labels)
assert(label_alphabet.size > 1)
for useless in xrange(size):
shuffle(label_alphabet)
first_class = extract_one_class(data,labels,label_alphabet[0])
second_class = extract_one_class(data,labels,label_alphabet[1])
pair = np.hstack((first_class[randint(0,first_class.shape[0]),:],second_class[randint(0,second_class.shape[0]),:]))
yield pair
points = np.fromiter(generate_points(data,labels,5),dtype = np.dtype('f8',(2*data.shape[1],1)))
The extract_one_class function returns a subset of data: all data points belonging to one class label. I would like to have points be an np.array with shape = (size,data.shape[1]). Currently the code snippet above returns an error:
ValueError: setting an array element with a sequence.
The documentation of fromiter claims to return a one-dimensional array. Yet others have used fromiter to construct record arrays in numpy before (e.g http://iam.al/post/21116450281/numpy-is-my-homeboy).
Am I off the mark in assuming I can generate an array in this fashion? Or is my numpy just not quite right?
As you've noticed, the documentation of np.fromiter explains that the function creates a 1D array. You won't be able to create a 2D array that way, and #unutbu method of returning a 1D array that you reshape afterwards is a sure go.
However, you can indeed create structured arrays using fromiter, as illustrated by:
>>> import itertools
>>> a = itertools.izip((1,2,3),(10,20,30))
>>> r = np.fromiter(a,dtype=[('',int),('',int)])
array([(1, 10), (2, 20), (3, 30)],
dtype=[('f0', '<i8'), ('f1', '<i8')])
but look, r.shape=(3,), that is, r is really nothing but 1D array of records, each record being composed of two integers. Because all the fields have the same dtype, we can take a view of r as a 2D array
>>> r.view((int,2))
array([[ 1, 10],
[ 2, 20],
[ 3, 30]])
So, yes, you could try to use np.fromiter with a dtype like [('',int)]*data.shape[1]: you'll get a 1D array of length size, that you can then view this array as ((int, data.shape[1])). You can use floats instead of ints, the important part is that all fields have the same dtype.
If you really want it, you can use some fairly complex dtype. Consider for example
r = np.fromiter(((_,) for _ in a),dtype=[('',(int,2))])
Here, you get a 1D structured array with 1 field, the field consisting of an array of 2 integers. Note the use of (_,) to make sure that each record is passed as a tuple (else np.fromiter chokes). But do you need that complexity?
Note also that as you know the length of the array beforehand (it's size), you should use the counter optional argument of np.fromiter for more efficiency.
You could modify generate_points to yield single floats instead of np.arrays, use np.fromiter to form a 1D array, and then use .reshape(size, -1) to make it a 2D array.
points = np.fromiter(
generate_points(data,labels,5)).reshape(size, -1)
Following some suggestions here, I came up with a fairly general drop-in replacement for numpy.fromiter() that satisfies the requirements of the OP:
import numpy as np
def fromiter(iterator, dtype, *shape):
"""Generalises `numpy.fromiter()` to multi-dimesional arrays.
Instead of the number of elements, the parameter `shape` has to be given,
which contains the shape of the output array. The first dimension may be
`-1`, in which case it is inferred from the iterator.
"""
res_shape = shape[1:]
if not res_shape: # Fallback to the "normal" fromiter in the 1-D case
return np.fromiter(iterator, dtype, shape[0])
# This wrapping of the iterator is necessary because when used with the
# field trick, np.fromiter does not enforce consistency of the shapes
# returned with the '_' field and silently cuts additional elements.
def shape_checker(iterator, res_shape):
for value in iterator:
if value.shape != res_shape:
raise ValueError("shape of returned object %s does not match"
" given shape %s" % (value.shape, res_shape))
yield value,
return np.fromiter(shape_checker(iterator, res_shape),
[("_", dtype, res_shape)], shape[0])["_"]

Categories

Resources