Creating a 3x4 array of random integers from 0-100 - python

This is an exercise from Google's NumPy crash course in Python. I've looked at the solution, but I want to know if there's a way to fix my error and complete the exercise the way I tried at first.
I have tried this:
rand_0_100 = np.random.randint(low=0, high=101, size=(12))
my_data = np.array(
[rand_0_100[0], rand_0_100[1], rand_0_100[2], rand_0_100[3]],
[rand_0_100[4], rand_0_100[5], rand_0_100[6], rand_0_100[7]],
[rand_0_100[8]],
[rand_0_100[9], rand_0_100[10], rand_0_100[11]]
)
... and I get this error:
TypeError: array() takes from 1 to 2 positional arguments but 4 were given
Looking at the solution, I now know I can make the random number from 0-100 an array, by modifying the size argument to (3, 4), but I'd like to know if it is possible to make the array the way I tried.

Yes, that is possible, but you will need to pass a single 2-dimensional list to np.array, not four 1-dimensional lists:
import numpy as np
rand_0_100 = np.random.randint(low=0, high=101, size=(12))
out = np.array([ # <<< Note this opening [
[rand_0_100[0], rand_0_100[1], rand_0_100[2], rand_0_100[3] ],
[rand_0_100[4], rand_0_100[5], rand_0_100[6], rand_0_100[7] ],
[rand_0_100[8], rand_0_100[9], rand_0_100[10], rand_0_100[11]]
]) # <<< and this closing ]
Numpy also provides a clean way to do this for you using reshape:
out = rand_0_100.reshape(3, 4)
But as you mention, the best way is to simply pass size=(3, 4) to randint.
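For comparison, here is a quick sketch (not from the course) showing that all three approaches produce the same 3x4 shape:

```python
import numpy as np

# Draw 12 random integers in [0, 100], then build the 3x4 array three ways.
rand_0_100 = np.random.randint(low=0, high=101, size=(12))

# 1. A single nested (2-D) list -- the corrected version of the question's attempt
manual = np.array([
    [rand_0_100[0], rand_0_100[1], rand_0_100[2],  rand_0_100[3]],
    [rand_0_100[4], rand_0_100[5], rand_0_100[6],  rand_0_100[7]],
    [rand_0_100[8], rand_0_100[9], rand_0_100[10], rand_0_100[11]],
])

# 2. Reshape the flat array
reshaped = rand_0_100.reshape(3, 4)

# 3. Ask randint for the shape directly
direct = np.random.randint(low=0, high=101, size=(3, 4))

print(manual.shape, reshaped.shape, direct.shape)  # (3, 4) (3, 4) (3, 4)
```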

By the way, for the original problem you can get a 3x4 matrix with this approach:
rand_0_100 = np.random.randint(low=0, high=101, size=(12))
my_data = rand_0_100.reshape(3, 4)
np.array takes its arguments as below:
numpy.array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)
object: the array(-like) you pass to it.
dtype: the desired data-type for the array. If not given, the type will be determined as the minimum type required to hold the objects in the sequence.
copy: if True (the default), then the object is copied. Otherwise, a copy is only made when necessary.
order: specifies the memory layout of the array.
In your code you gave four lists as positional arguments to the function; that's why your program faces the error
TypeError: array() takes from 1 to 2 positional arguments but 4 were given
Hope it helps.

Why does a numpy array return the original array when passing an array as an index?

I find this behaviour utterly nonsensical. It happens only with numpy arrays; ordinary Python lists just throw an error.
Let's create two arrays:
randomNumMatrix = np.random.randint(0, 20, (3,3,3), dtype=int)
randRow = np.array([0,1,2], dtype=int)
If we pass an array as index to get something from another array, an original array is returned.
randomNumMatrix[randRow]
The code above returns an equivalent of randomNumMatrix. I find this unintuitive. I would expect it not to work, or at least to return an equivalent of
randomNumMatrix[randRow[0]][randRow[1]][randRow[2]].
Additional observations:
A)
The code below does not work, it throws this error: IndexError: index 3 is out of bounds for axis 0 with size 3
randRow = np.array([0, 1, 3], dtype=int)
B)
To my surprise, the code below works:
randRow = np.array([0, 1, 2, 2, 0, 1, 2], dtype=int)
Can somebody please explain what are the advantages of this feature?
In my opinion it only creates much confusion.
What is
randomNumMatrix[randRow[0]][randRow[1]][randRow[2]]
supposed to be? That chain of scalar indexes is not how numpy interprets an array used as an index.
In numpy there is a difference between
arr[(x,y,z)] # equivalent to arr[x,y,z]
and
arr[np.array([x,y,z])] # equivalent to arr[np.array([x,y,z]),:,:]
The tuple provides a scalar index for each dimension. The array (or list) provides multiple indices for one dimension.
You may need to study the numpy docs on indexing, especially advanced indexing.
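A small sketch of the difference (the example array is mine, chosen so every element equals its own flat index):

```python
import numpy as np

arr = np.arange(27).reshape(3, 3, 3)

# A tuple supplies one scalar index per dimension:
assert arr[(0, 1, 2)] == arr[0, 1, 2]
assert arr[0, 1, 2] == 5             # element at row 0, col 1, depth 2

# An array (or list) supplies multiple indices for the FIRST dimension only:
rows = np.array([0, 1, 2])
picked = arr[rows]                   # same as arr[[0, 1, 2], :, :]
assert picked.shape == (3, 3, 3)
assert (picked == arr).all()         # rows 0, 1, 2 in order: the whole array back

# Repeats (and any length) are fine, which is why [0, 1, 2, 2, 0, 1, 2] "works":
assert arr[np.array([0, 0, 2])].shape == (3, 3, 3)
```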

Why the length of the array appended in loop is more than the number of iteration?

I ran this code and expected an array of size 10000, as time is a numpy array of length 10000.
freq = np.empty([])
for i, t in enumerate(time):
    freq = np.append(freq, np.sin(t))
print(time.shape)
print(freq.shape)
But this is the output I got
(10000,)
(10001,)
Can someone explain why I am getting this disparity?
It turns out that the function np.empty() returns an uninitialized array of a given shape. np.empty([]) is the 0-d case: it has shape (), but it already holds one garbage value, e.g. array(0.14112001). You can check this out by printing the variable freq before the loop starts.
So, when you loop over freq = np.append(freq, np.sin(t)), that leftover value stays at the front and each sine value is appended after it, which is why you end up with 10001 elements instead of 10000.
Also, if you just need to create an empty array, use x = np.array([]) or np.empty(0) (or a plain list, x = []).
You can read more about this numpy.empty function here:
https://numpy.org/doc/1.18/reference/generated/numpy.empty.html
And more about initializing arrays here:
https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.3/com.ibm.xlc1313.aix.doc/language_ref/aryin.html
I'm not sure if I was clear enough; it's not a straightforward concept, so please let me know.
You should start from np.empty(0) instead.
I looked at the NumPy source code (this docstring is from the matlib variant of empty, but np.empty behaves the same way):
def empty(shape, dtype=None, order='C'):
"""Return a new matrix of given shape and type, without initializing entries.
Parameters
----------
shape : int or tuple of int
Shape of the empty matrix.
dtype : data-type, optional
Desired output data-type.
order : {'C', 'F'}, optional
Whether to store multi-dimensional data in row-major
(C-style) or column-major (Fortran-style) order in
memory.
See Also
--------
empty_like, zeros
Notes
-----
`empty`, unlike `zeros`, does not set the matrix values to zero,
and may therefore be marginally faster. On the other hand, it requires
the user to manually set all the values in the array, and should be
used with caution.
Examples
--------
>>> import numpy.matlib
>>> np.matlib.empty((2, 2)) # filled with random data
matrix([[ 6.76425276e-320, 9.79033856e-307], # random
[ 7.39337286e-309, 3.22135945e-309]])
>>> np.matlib.empty((2, 2), dtype=int)
matrix([[ 6600475, 0], # random
[ 6586976, 22740995]])
"""
return ndarray.__new__(matrix, shape, dtype, order=order)
It passes the first argument, shape, straight through to ndarray, so np.empty(0) initializes a new, genuinely empty array like [].
Print np.empty(0) and np.empty([]) to see the difference between them.
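A sketch of the difference between the two starting points (the shapes are the key):

```python
import numpy as np

a = np.empty([])   # 0-d array: shape (), already holds ONE uninitialized value
b = np.empty(0)    # 1-d array: shape (0,), holds nothing

print(a.shape, a.size)  # () 1
print(b.shape, b.size)  # (0,) 0

# Appending 3 values shows why the question's loop over-counts by one:
for x in (1.0, 2.0, 3.0):
    a = np.append(a, x)
    b = np.append(b, x)

print(a.shape)  # (4,)  <- the garbage start value is still in there
print(b.shape)  # (3,)
```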
I think you are trying to replicate a list operation:
freq = []
for i, t in enumerate(time):
    freq.append(np.sin(t))
But neither np.empty nor np.append is an exact clone; the names are similar, but the differences are significant.
First:
In [75]: np.empty([])
Out[75]: array(1.)
In [77]: np.empty([]).shape
Out[77]: ()
This is a 1 element, 0d array.
If you look at the code for np.append you'll see that if the 1st argument is not 1d (and axis is not provided) it flattens it (that's documented as well):
In [78]: np.append??
In [82]: np.empty([]).ravel()
Out[82]: array([1.])
In [83]: np.empty([]).ravel().shape
Out[83]: (1,)
It is now a 1d, 1-element array. Append another array to that:
In [84]: np.append(np.empty([]), np.sin(2))
Out[84]: array([1. , 0.90929743])
The result is 1d with 2 values. Repeat that 10000 times and you end up with 10001 values.
np.empty, despite its name, does not produce a [] list equivalent. As others show, np.array([]) sort of does, as would np.empty(0).
np.append is not a list-append clone. It is just a cover function for np.concatenate. It's OK for adding an element to a longer array, but beyond that it has too many pitfalls to be useful. It's especially bad in a loop like this: getting a correct start array is tricky, and it is slow (compared to list append). Actually these problems apply to all uses of concatenate and stack... in a loop.
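To make the comparison concrete, here is a sketch of the three patterns side by side (the time values are mine):

```python
import numpy as np

time = np.linspace(0, 2 * np.pi, 10000)

# 1. List append, convert once at the end: fast, and the length comes out right.
vals = []
for t in time:
    vals.append(np.sin(t))
freq_list = np.array(vals)

# 2. np.append with a correct empty start: right length, but every call
#    reallocates and copies the whole array, so the loop is O(n^2) overall.
freq_np = np.empty(0)
for t in time:
    freq_np = np.append(freq_np, np.sin(t))

# 3. Vectorized: no loop at all, and by far the fastest.
freq_vec = np.sin(time)

print(freq_list.shape, freq_np.shape, freq_vec.shape)  # (10000,) (10000,) (10000,)
```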

Pass numpy array of list of integers in Cython method from Python

I would like to pass the following array of lists of integers (i.e., it's not a two-dimensional array) to a Cython method from Python code.
Python Sample Code:
import numpy as np
import result
a = np.array([[1], [2,3]], dtype=object)
process_result(a)
The output of a is array([list([1]), list([2, 3])], dtype=object)
Cython Sample Code:
def process_result(int[:,:] a):
    pass
The above code gives the following error:
ValueError: Buffer has wrong number of dimensions (expected 2, got 1)
I tried to pass a simple array instead of numpy I got the following error
a = [[1], [2,3]]
process_result(a)
TypeError: a bytes-like object is required, not 'list'
Kindly assist me with how to pass the value of a into the Cython method process_result, and with what exact datatype I need to use to receive this value in the Cython method.
I think you're using the wrong data-type. Instead of a numpy array of lists, you should be using a list of numpy arrays. There is very little benefit in using numpy arrays of Python objects (such as lists) - unlike numeric types they aren't stored particularly efficiently, they aren't quick to do calculations on, and you can't accelerate them in Cython. Therefore the outermost level may as well be a normal Python list.
However, the inner levels all look to be homogenous arrays of integers, and so would be ideal candidates for Numpy arrays (especially if you want to process them in Cython).
Therefore, build your list as:
a = [np.array([1], dtype=int), np.array([2, 3], dtype=int)]
(Or use tolist on a numpy array)
For your function you can define it like:
def process_result(list a):
    cdef int[:] item
    for item in a:
        # operations on the inner arrays are fast!
        pass
Here I've assumed that you most likely want to iterate over the list. Note that there's little benefit in typing a as list, so you could just leave it untyped (to accept any Python object); then you could pass it other iterables too, like your original numpy array.
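As a plain-Python stand-in (no Cython needed to see the data layout), here is a hypothetical version of process_result that sums each inner array; the summing is just my illustrative choice, not from the question:

```python
import numpy as np

def process_result(a):
    # Each item is a homogeneous 1-d int array -- the part that would be
    # typed as int[:] (a fast typed memoryview) in the Cython version.
    totals = []
    for item in a:
        totals.append(int(item.sum()))
    return totals

# The recommended structure: a plain list of homogeneous numpy arrays.
a = [np.array([1], dtype=int), np.array([2, 3], dtype=int)]
print(process_result(a))  # [1, 5]
```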
Alternatively, convert the array of lists of integers to a list of objects (i.e., a list of lists of integers - it's not a two-dimensional array):
Python Code:
import numpy as np
import result
a = np.array([[1], [2,3]], dtype=object).tolist()
process_result(a)
The output of a is [[1], [2,3]]
Cython Sample Code:
def process_result(list a):
    pass
Change the int[:, :] to list and it works fine.
Note: if anyone knows a more optimal answer, kindly post it; it will be helpful.

What is the difference between ndarray and array in NumPy?

What is the difference between ndarray and array in NumPy? Where is their implementation in the NumPy source code?
numpy.array is just a convenience function to create an ndarray; it is not a class itself.
You can also create an array using numpy.ndarray, but it is not the recommended way. From the docstring of numpy.ndarray:
Arrays should be constructed using array, zeros or empty ... The parameters given here refer to a
low-level method (ndarray(...)) for instantiating an array.
Most of the meat of the implementation is in C code, here in multiarray, but you can start looking at the ndarray interfaces here:
https://github.com/numpy/numpy/blob/master/numpy/core/numeric.py
numpy.array is a function that returns a numpy.ndarray object.
There is no object of type numpy.array.
Just a few lines of example code to show the difference between numpy.array and numpy.ndarray
Warm up step: Construct a list
a = [1,2,3]
Check the type
print(type(a))
You will get
<class 'list'>
Construct an array (from a list) using np.array
a = np.array(a)
Or, you can skip the warm-up step and directly write
a = np.array([1,2,3])
Check the type
print(type(a))
You will get
<class 'numpy.ndarray'>
which tells you the type of the numpy array is numpy.ndarray
You can also check the type by
isinstance(a, (np.ndarray))
and you will get
True
Either of the following two lines will give you an error message
np.ndarray(a) # should be np.array(a)
isinstance(a, (np.array)) # should be isinstance(a, (np.ndarray))
numpy.ndarray() is a class, while numpy.array() is a method / function that creates an ndarray.
In the numpy docs, if you want to create an array from the ndarray class, you can do it in two ways, as quoted:
1- using array(), zeros() or empty() methods:
Arrays should be constructed using array, zeros or empty (refer to the See Also section below). The parameters given here refer to a low-level method (ndarray(…)) for instantiating an array.
2- from ndarray class directly:
There are two modes of creating an array using __new__:
If buffer is None, then only shape, dtype, and order are used.
If buffer is an object exposing the buffer interface, then all keywords are interpreted.
The example below gives a random array because we didn't assign a buffer value:
np.ndarray(shape=(2,2), dtype=float, order='F', buffer=None)
array([[ -1.13698227e+002, 4.25087011e-303],
[ 2.88528414e-306, 3.27025015e-309]]) #random
Another example is to assign an array object to the buffer:
>>> np.ndarray((2,), buffer=np.array([1,2,3]),
... offset=np.int_().itemsize,
... dtype=int) # offset = 1*itemsize, i.e. skip first element
array([2, 3])
From the example above we notice that we can't assign a list to buffer; we had to use numpy.array() to get an ndarray object for the buffer.
Conclusion: use numpy.array() if you want to make a numpy.ndarray object.
I think with np.array() you only get a C-ordered array: even if you mention the order, np.isfortran() says False. But with np.ndarray(), when you specify the order, it creates the array based on the order provided.
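A short sketch pulling the thread together (the buffer example mirrors the one in the ndarray docstring):

```python
import numpy as np

# np.array is a factory function; np.ndarray is the class of what it returns.
a = np.array([1, 2, 3])
assert type(a) is np.ndarray

# Calling the class directly without a buffer gives uninitialized memory:
# only the shape and dtype are predictable, not the contents.
raw = np.ndarray(shape=(2, 2), dtype=float)
assert raw.shape == (2, 2)

# With a buffer, ndarray reinterprets existing data instead of copying it:
buf = np.array([1, 2, 3])
view = np.ndarray((2,), buffer=buf, offset=buf.itemsize, dtype=buf.dtype)
print(view)  # [2 3] -- the offset skips the first element
```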

Named dtype array: Difference between a[0]['name'] and a['name'][0]?

I came across the following oddity in numpy which may or may not be a bug:
import numpy as np
dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)
type(a['tuple'][0]) # ndarray
type(a[0]['tuple']) # ndarray
a['tuple'][0] = (1,2) # ok
a[0]['tuple'] = (1,2) # ValueError: shape-mismatch on array construction
I would have expected both of these options to work.
Opinions?
I asked that on the numpy-discussion list. Travis Oliphant answered here.
Citing his answer:
The short answer is that this is not really a "normal" bug, but it could be considered a "design" bug (although the issues may not be straightforward to resolve). What that means is that it may not be changed in the short term --- and you should just use the first spelling.
Structured arrays can be a confusing area of NumPy for several of reasons. You've constructed an example that touches on several of them. You have a data-type that is a "structure" array with one member ("tuple"). That member contains a 2-vector of integers.
First of all, it is important to remember that with Python, doing
a['tuple'][0] = (1,2)
is equivalent to
b = a['tuple']; b[0] = (1,2)
In like manner,
a[0]['tuple'] = (1,2)
is equivalent to
b = a[0]; b['tuple'] = (1,2)
To understand the behavior, we need to dissect both code paths and what happens. You built a (3,) array of those elements in 'a'. When you write b = a['tuple'] you should probably be getting a (3,) array of (2,)-integers, but as there is currently no formal dtype support for (n,)-integers as a general dtype in NumPy, you get back a (3,2) array of integers which is the closest thing that NumPy can give you. Setting the [0] row of this object via
a['tuple'][0] = (1,2)
works just fine and does what you would expect.
On the other hand, when you type:
b = a[0]
you are getting back an array-scalar which is a particularly interesting kind of array scalar that can hold records. This new object is formally of type numpy.void and it holds a "scalar representation" of anything that fits under the "VOID" basic dtype.
For some reason:
b['tuple'] = [1,2]
is not working. On my system I'm getting a different error: TypeError: object of type 'int' has no len()
I think this should be filed as a bug on the issue tracker which is for the time being here: http://projects.scipy.org/numpy
The problem is ultimately the void->copyswap function being called in voidtype_setfields if someone wants to investigate. I think this behavior should work.
An explanation for this is given in a numpy bug report.
I get a different error than you do (using numpy 1.7.0.dev):
ValueError: setting an array element with a sequence.
so the explanation below may not be correct for your system (or it could even be the wrong explanation for what I see).
First, notice that indexing a row of a structured array gives you a numpy.void object (see data type docs)
import numpy as np
dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)
print(type(a[0]))  # = numpy.void
From what I understand, void is sort of like a Python list since it can hold objects of different data types, which makes sense since the columns in a structured array can be different data types.
If, instead of indexing, you slice out the first row, you get an ndarray:
print(type(a[:1]))  # = numpy.ndarray
This is analogous to how Python lists work:
b = [1, 2, 3]
print(b[0])   # 1
print(b[:1])  # [1]
Slicing returns a shortened version of the original sequence, but indexing returns an element (here, an int; above, a void type).
So when you slice into the rows of the structured array, you should expect it to behave just like your original array (only with fewer rows). Continuing with your example, you can now assign to the 'tuple' columns of the first row:
a[:1]['tuple'] = (1, 2)
So,... why doesn't a[0]['tuple'] = (1, 2) work?
Well, recall that a[0] returns a void object. So, when you call
a[0]['tuple'] = (1, 2) # this line fails
you're assigning a tuple to the 'tuple' element of that void object. Note: despite the fact you've called this index 'tuple', it was stored as an ndarray:
print(type(a[0]['tuple']))  # = numpy.ndarray
So, this means the tuple needs to be cast into an ndarray. But, the void object can't cast assignments (this is just a guess) because it can contain arbitrary data types so it doesn't know what type to cast to. To get around this you can cast the input yourself:
a[0]['tuple'] = np.array((1, 2))
The fact that we get different errors suggests that the above line might not work for you since casting addresses the error I received---not the one you received.
Addendum:
So why does the following work?
a[0]['tuple'][:] = (1, 2)
Here, you're indexing into the array when you add [:], but without that, you're indexing into the void object. In other words, a[0]['tuple'][:] says "replace the elements of the stored array" (which is handled by the array), a[0]['tuple'] says "replace the stored array" (which is handled by void).
Epilogue:
Strangely enough, accessing the row (i.e. indexing with 0) seems to drop the base array, but it still allows you to assign to the base array.
print(a['tuple'].base is a)  # = True
print(a[0].base is a)        # = False
a[0] = ((1, 2),)             # `a` is changed
Maybe void is not really an array so it doesn't have a base array,... but then why does it have a base attribute?
This was an upstream bug, fixed as of NumPy PR #5947, with a fix in 1.9.3.
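On a NumPy with the fix (1.9.3 or later), both spellings should now write through to the base array; a quick sketch using the question's dtype:

```python
import numpy as np

dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)

a['tuple'][0] = (1, 2)   # always worked: assign into the field view
a[0]['tuple'] = (3, 4)   # failed before the fix; works on fixed versions

print(a['tuple'][0].tolist())  # [3, 4] -- the second assignment overwrote the first
```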
