I am trying to recursively define a numpy array of N dimensions. After researching for several hours, I have come across a couple of ways this might work (np.append and np.concatenate), but neither has given me the desired output. I've been getting either:
[-0.6778734 -0.73517866 -0.73517866 0.6778734 ] (1-d array) or
[array([-0.6778734 , -0.73517866]), array([-0.73517866, 0.6778734 ])] (a list of arrays)
My Input:
[(1.2840277121727839, array([-0.6778734, -0.73517866])),
(0.049083398938327472, array([-0.73517866, 0.6778734 ]))]
Desired output:
array([-0.6778734, -0.73517866], [-0.73517866, 0.6778734])
Is it possible to create a numpy array from arrays, because converting them to lists and back to arrays seems computationally inefficient?
Thanks in advance!
Your input is a list of tuples, each tuple consisting of a number and an array. For some reason you want to throw away the number, and just combine the arrays into a larger array - is that right?
In [1067]: x = [(1.2840277121727839, np.array([-0.6778734, -0.73517866])),
      ...:      (0.049083398938327472, np.array([-0.73517866, 0.6778734 ]))]
In [1068]: x
Out[1068]:
[(1.2840277121727839, array([-0.6778734 , -0.73517866])),
(0.04908339893832747, array([-0.73517866, 0.6778734 ]))]
A list comprehension does a nice job of extracting the desired elements from the tuples:
In [1069]: [y[1] for y in x]
Out[1069]: [array([-0.6778734 , -0.73517866]), array([-0.73517866, 0.6778734 ])]
and vstack is great for combining arrays into a larger one.
In [1070]: np.vstack([y[1] for y in x])
Out[1070]:
array([[-0.6778734 , -0.73517866],
[-0.73517866, 0.6778734 ]])
vstack is just concatenate with an added step that ensures the inputs are 2d.
np.array([y[1] for y in x]) also works, since you are adding a dimension.
I'm assuming that array([-0.6778734, -0.73517866], [-0.73517866, 0.6778734]) has a typo - that it is missing a set of []. The 2nd parameter to np.array is the dtype, not another list.
Note that both np.array and np.concatenate take a list. It can be a list of lists or a list of arrays; it doesn't make much difference. And at this stage, don't worry about computational efficiency. Any time you combine the data from 2 or more arrays there will be copying. Arrays have a fixed size and can't 'grow' without making a new copy.
In [1074]: np.concatenate([y[1] for y in x]).reshape(2,2)
Out[1074]:
array([[-0.6778734 , -0.73517866],
[-0.73517866, 0.6778734 ]])
The input arrays are 1d, so np.concatenate joins them on that dimension, producing a 4-element 1d array; reshape corrects that. vstack makes them both (1,2) and does a concatenate on the 1st dimension.
Another expression that joins the arrays on a new dimension:
np.concatenate([y[1][None,...] for y in x], axis=0)
The [None,...] adds a new dimension at the start.
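np.stack (added in numpy 1.10) wraps exactly that pattern: it adds a new axis to each input and then concatenates along it. A quick illustration, continuing the session above:
In [1075]: np.stack([y[1] for y in x])
Out[1075]:
array([[-0.6778734 , -0.73517866],
       [-0.73517866,  0.6778734 ]])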
Try this:
import numpy as np
a = np.array([1, 2])
b = np.array([3, 4])
print(repr(np.vstack((a, b))))
Gives:
array([[1, 2],
[3, 4]])
You can form the desired 2D array given a list input_data of the form
input_data = [(1.2840277121727839, np.array([-0.6778734, -0.73517866])),
(0.049083398938327472, np.array([-0.73517866, 0.6778734 ]))]
via
nparr = np.array(list(row[1] for row in input_data))
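Since the list() call only exists to realize the generator, a plain list comprehension does the same thing (same assumed input_data as above):
nparr = np.array([row[1] for row in input_data])
nparr.shape   # (2, 2)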
I'm trying to convert bytes to a numpy array of fixed-size tuples (2 or 3 doubles), and it must be a 1d array.
What I managed to get is:
values = np.fromstring(data, (np.double, (n,))) - this gives me a 2d array with shape (105107, 2):
array([[0.03171165, 0.03171165],
[0.03171165, 0.03171165],
[0.03020949, 0.03020949],
...,
[0.05559354, 0.16173067],
[0.12667986, 0.04522982],
[0.14062567, 0.11422881]])
values = np.fromstring(data, [('dt', np.double, (n,))]) - this gives me a 1d array with shape (105107,), but each element is a tuple containing an array of two doubles:
array([([0.03171165, 0.03171165],), ([0.03171165, 0.03171165],),
([0.03020949, 0.03020949],), ..., ([0.05559354, 0.16173067],),
([0.12667986, 0.04522982],), ([0.14062567, 0.11422881],)],
dtype=[('dt', '<f8', (2,))])
Is there any efficient way to achieve a 1d array like this?
array([(0.03171165, 0.03171165),
(0.03171165, 0.03171165),
(0.03020949, 0.03020949),
...,
(0.05559354, 0.16173067),
(0.12667986, 0.04522982),
(0.14062567, 0.11422881)])
No, I don't know an efficient way, but as nobody has so far posted any answer at all, here is a way that at least gets you the desired output. However, efficient it is not.
values = np.fromstring(data, (np.double, (n,)))
x = np.empty(values.shape[0], dtype=object)  # note: np.object is deprecated; plain object works
for i, a in enumerate(values):
    x[i] = tuple(a)
I would add that an array of objects negates the benefits of numpy's vectorisation to such an extent that you might as well just use a list instead:
values = np.fromstring(data, (np.double, (n,)))
x = [tuple(a) for a in values]
A possible alternative approach to generating the array of tuples - I'm not sure if it is any faster - would be to go via such a list and convert it back into an array, deliberately breaking the conversion to a nice ordinary 2d array that numpy would otherwise do (the appended None makes the input ragged, so numpy falls back to a 1d object array):
values = np.fromstring(data, (np.double, (n,)))
x = [tuple(a) for a in values]
x.append(None)
y = np.array(x)[:-1]
I already solved the problem using this code:
names = ['d{i}'.format(i=i) for i in range(n)]
value = np.fromstring(data, {
'names': names,
'formats': [np.double] * n
})
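Here is a minimal runnable sketch of that dtype trick, using np.frombuffer (the non-deprecated counterpart of np.fromstring for binary data) and a synthetic buffer, since the real data isn't shown:
import numpy as np

n = 2
data = np.arange(6, dtype=np.double).tobytes()  # hypothetical stand-in for the real bytes
names = ['d{i}'.format(i=i) for i in range(n)]
values = np.frombuffer(data, {'names': names, 'formats': [np.double] * n})
values.shape    # (3,) - one record per pair of doubles
values['d0']    # array([0., 2., 4.]) - per-field access
The result is 1d, and each element behaves like a tuple of n doubles.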
I'm new to the NumPy library for Python and I'm not sure what I'm doing wrong here; could you please help me with this?
So, I initialize my ndarray like this.
A = np.array([])
And then I'm trying to append to this array A a new array X, which has a shape like (1000,32,32), if that has any importance.
np.insert(A, X)
The problem is that when I check the ndarray A afterwards, it's empty, even though the ndarray X has elements inside.
Could you explain what exactly I'm doing wrong, please?
Make sure to write the result back to A if you use np.append, as in A = np.append(A, X). The top-level numpy functions like np.insert and np.append don't modify their arguments in place; they return a new array, and it's your job to store it. Also note that np.append flattens its inputs to 1d unless you pass an axis argument. Honestly, I think you just want a regular list for A; list.append mutates the list in place, so there's no need to write anything back:
>>> A = []
>>> X = np.ndarray((1000,32,32))
>>> A.append(X)
>>> print(A)
[array([[[1.43351171e-316, 4.32573840e-317, 4.58492919e-320, ...,
1.14551501e-259, 6.01347002e-154, 1.39804329e-076],
[1.39803697e-076, 1.39804328e-076, 1.39642638e-076, ...,
1.18295070e-076, 7.06474122e-096, 6.01347002e-154],
[1.39804328e-076, 1.39642638e-076, 1.39804065e-076, ...,
1.05118732e-153, 6.01334510e-154, 3.24245662e-086],
...
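To see the flattening behaviour of np.append mentioned above - with no axis argument, both inputs are raveled before joining:
>>> np.append(np.array([]), np.ones((2, 3)))
array([1., 1., 1., 1., 1., 1.])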
In [10]: A = np.array([])
In [11]: A.shape
Out[11]: (0,)
In [13]: np.concatenate([A, np.ones((2,3))])
---------------------------------------------------------------------------
...
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s)
So one of the first things you need to learn about numpy arrays is that they have a shape and a number of dimensions. Hopefully that error message is clear.
Concatenate with another 1d array does work:
In [14]: np.concatenate([A, np.arange(3)])
Out[14]: array([0., 1., 2.])
But that is just np.arange(3). The concatenate does nothing for us. OK, you might imagine starting a loop like this. But don't. This is not efficient.
You could easily concatenate a list of arrays, as long as the dimensions obey the rules specified in the docs. Those rules are logical, as long as you take the dimensions of the arrays seriously.
In [15]: X = np.ones((1000,32,32))
In [16]: np.concatenate([X,X,X], axis=1).shape
Out[16]: (1000, 96, 32)
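If you do need to build A up piece by piece, the usual pattern is to collect the pieces in a plain list and join once at the end (a sketch, assuming 3d pieces shaped like your X):
In [17]: chunks = []
In [18]: for _ in range(3):
    ...:     chunks.append(np.ones((1000, 32, 32)))
In [19]: np.concatenate(chunks, axis=0).shape
Out[19]: (3000, 32, 32)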
I have a function predictions like:
def predictions(degree):
    # some magic
    return result  # an np.ndarray of 100 values
I want to call this function for a few values of degree and use it to populate a larger 2d np.ndarray, filling each row with the outcome of predictions. It seems like a simple task, but somehow I can't get it working. I tried
for deg in [1,2,4,8,10]:
    np.append(result, predictions(deg), axis=1)
with result being an np.empty(100). But that failed with "Singleton array array(1) cannot be considered a valid collection."
I could not get fromfunction to work; it only operates on a coordinate grid, and the irregular list of degrees is not covered in the docs.
Don't use np.ndarray until you are older and wiser! I couldn't even use it without rereading the docs.
arr1d = np.array([1,2,3,4,5])
is the correct way to construct a 1d array from a list of numbers.
Also don't use np.append. I won't even add the 'older and wiser' qualification. It doesn't work in place, and it is slow when used in a loop.
A good way of building a 2d array from 1d arrays is:
alist = []
for i in ....:
    alist.append(<a list or 1d array>)
arr = np.array(alist)
provided all the sublists have the same size, arr should be a 2d array.
This is equivalent to building a 2d array from
np.array([[1,2,3], [4,5,6]])
that is, a list of lists.
Or a list comprehension:
np.array([predictions(i) for i in range(10)])
Again, predictions must all return the same length arrays or lists.
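A quick check of that comprehension with a stand-in predictions (hypothetical - your real function can return anything of a fixed length):
def predictions(degree):
    return np.full(100, degree, dtype=float)  # stand-in for the real computation

result = np.array([predictions(i) for i in [1, 2, 4, 8, 10]])
result.shape   # (5, 100)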
append is in the boring section of numpy. Here you know the shape in advance, so you can preallocate:
len_predictions = 100

def predictions(degree):
    return np.ones((len_predictions,))

degrees = [1, 2, 4, 8, 10]
result = np.empty((len(degrees), len_predictions))
for i, deg in enumerate(degrees):
    result[i] = predictions(deg)
If you want to store the degree as well, you can use a custom dtype.
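A sketch of that idea (the field names are made up), reusing len_predictions, degrees, and predictions from above:
dt = np.dtype([('degree', np.int64), ('prediction', np.double, (len_predictions,))])
result = np.empty(len(degrees), dtype=dt)
for i, deg in enumerate(degrees):
    result[i] = (deg, predictions(deg))

result['degree']             # array([ 1,  2,  4,  8, 10])
result['prediction'].shape   # (5, 100)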
For example, I want a 2 row matrix, with a first row of length 1, and second row of length 2. I could do,
list1 = np.array([1])
list2 = np.array([2,3])
matrix = []
matrix.append(list1)
matrix.append(list2)
matrix = np.array(matrix)
I wonder if I could declare a matrix of this shape directly in the beginning of a program without going through the above procedure?
A matrix is by definition a rectangular array of numbers, and NumPy does not support arrays that do not have a rectangular shape. What your code actually produces is a 1d object array containing the two arrays:
array([array([1]), array([2, 3])], dtype=object)
I don't really see what the purpose of this shape could be, and would advise you to simply use nested lists for whatever you are doing with it. Should you have found some use for this structure with NumPy, however, you can produce it much more idiomatically like this:
>>> np.array([list1,list2])
array([array([1]), array([2, 3])], dtype=object)
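One caveat: in newer NumPy releases (the ragged-array shortcut was deprecated in 1.20 and removed in 1.24), you must request the object dtype explicitly or the call raises a ValueError:
>>> np.array([list1, list2], dtype=object)
array([array([1]), array([2, 3])], dtype=object)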
I have 4 arrays (all the same length) which I am trying to stack together to create a new array, with each of the 4 arrays being a row.
My first thought was this:
B = -np.array([[x1[i]],[x2[j]],[y1[i]],[y2[j]]])
However the shape of that is (4,1,20).
To get the 2D output I expected I resorted to this:
B = -np.vstack((np.vstack((np.vstack(([x1[i]],[x2[j]])),[y1[i]])),[y2[j]]))
Where the shape is (4,20).
Is there a better way to do this? And why would the first method not work?
Edit
For clarity, the shapes of x1[i], x2[j], y1[i], y2[j] are all (20,).
The problem is with the extra brackets:
B = -np.array([[x1[i]],[x2[j]],[y1[i]],[y2[j]]]) # (4,1,20)
B = -np.array([x1[i],x2[j],y1[i],y2[j]]) # (4,20)
[x1[i]] is (1,20) in shape.
In [26]: np.array([np.ones((20,)),np.zeros((20,))]).shape
Out[26]: (2, 20)
vstack works, but np.array does just as well. It's concatenate that needs the extra brackets:
In [27]: np.vstack([np.ones((20,)),np.zeros((20,))]).shape
Out[27]: (2, 20)
In [28]: np.concatenate([np.ones((20,)),np.zeros((20,))]).shape
Out[28]: (40,)
In [29]: np.concatenate([[np.ones((20,))],[np.zeros((20,))]]).shape
Out[29]: (2, 20)
vstack doesn't need the extra dimensions because it first passes the arrays through [atleast_2d(_m) for _m in tup].
np.vstack takes a sequence of arrays and stacks them one on top of the other, as long as they have compatible shapes. So in your case, a tuple of the one-dimensional arrays:
np.vstack((x1[i], x2[j], y1[i], y2[j]))
would do what you want. If this statement is part of a loop building many such 4x20 arrays, however, that may be a different matter.
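If it is, one option (a sketch with made-up data, since the shapes behind the i and j loops aren't shown) is to build each 4x20 block in a list and stack once at the end:
import numpy as np

rng = np.random.default_rng(0)
x1, x2, y1, y2 = (rng.random((5, 20)) for _ in range(4))  # hypothetical stand-ins

blocks = [-np.vstack((x1[i], x2[j], y1[i], y2[j]))
          for i, j in zip(range(5), range(5))]
B = np.stack(blocks)   # one allocation instead of repeated vstack calls
B.shape                # (5, 4, 20)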