I have a large list of files that contain 2D numpy arrays saved with numpy.save. I am trying to read the first column of each file and create a new 2D array.
I currently read each column using numpy.load with mmap_mode. The 1D arrays are now in a list.
col_list = []
for f in file_list:
    Temp = np.load(f, mmap_mode='r')
    col_list.append(Temp[:, 0])
How can I convert this into a 2D array?
You can use
numpy.stack(arrays, axis=0)
if you have a list (or any other sequence) of equally shaped arrays. Use the axis argument if you want each input to become a column rather than a row.
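As a rough sketch of what the axis argument changes, here is np.stack applied to three made-up 1-D arrays (the name cols and the values are only for illustration, standing in for col_list):

```python
import numpy as np

# Three hypothetical 1-D arrays of length 4, standing in for col_list.
cols = [np.array([1, 2, 3, 4]),
        np.array([5, 6, 7, 8]),
        np.array([9, 10, 11, 12])]

rows_stacked = np.stack(cols, axis=0)  # each input becomes a row
cols_stacked = np.stack(cols, axis=1)  # each input becomes a column

print(rows_stacked.shape)  # (3, 4)
print(cols_stacked.shape)  # (4, 3)
```

The two results are transposes of each other, so which axis to pick depends only on the layout you want.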
The array may be recreated:
a = np.array(a.tolist())
You can just call np.array on the list of 1D arrays.
>>> import numpy as np
>>> arrs = [np.array([1,2,3]), np.array([4,5,6]), np.array([7,8,9])]
>>> arrs
[array([1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
>>> arr2d = np.array(arrs)
>>> arr2d.shape
(3, 3)
>>> arr2d
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
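For the file-based case in the question, a minimal end-to-end sketch could look like the following. The temporary files, their names and their contents are assumptions made up so the example runs on its own. Only np.save, np.load with mmap_mode and np.array come from the thread.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    # Create three small hypothetical .npy files to stand in for file_list.
    file_list = []
    for i in range(3):
        path = os.path.join(tmp, f"arr_{i}.npy")
        np.save(path, np.arange(12).reshape(4, 3) + 100 * i)
        file_list.append(path)

    col_list = []
    for f in file_list:
        temp = np.load(f, mmap_mode="r")
        # np.array copies the first column out of the memory map.
        col_list.append(np.array(temp[:, 0]))
        del temp  # drop the reference to the memory-mapped file

    # One row per file; use result.T for one column per file.
    result = np.array(col_list)

print(result.shape)  # (3, 4)
```

Copying each column out of the memory map keeps only the needed data in RAM and means the result no longer depends on the files staying open.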
Related
I have three 2D np.array objects that mathematically are 8 x 1550 matrices, and I want to turn each of them into a 1D np.array of 12400 numbers (8 x 1550 = 12400) so that I can create a DataFrame later with this code:
Exported_Data = pd.DataFrame({"UD": UD_Data, "NS": NS_Data, "EW": EW_Data})
Exported_Data.to_csv("EXCEL.csv")
To put a simpler example, if I have this:
A = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
And I want to obtain this from that:
B = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
What is the best way to do it?
I would suggest using reshape. It returns a view whenever possible and is therefore more efficient, whereas A.flatten() always creates a copy:
B = A.reshape(-1)
-1 implicitly takes care of required dimension size.
You can use A.flatten() to convert a 2D array to a 1D array.
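To see the view-versus-copy difference between the two answers, here is a small check using np.shares_memory on the 3 x 3 example from the question (the names B_view and B_copy are just for illustration):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

B_view = A.reshape(-1)  # a view here, because A is contiguous in memory
B_copy = A.flatten()    # always an independent copy

print(np.shares_memory(A, B_view))  # True
print(np.shares_memory(A, B_copy))  # False

# Writing through the view changes A; the copy is unaffected.
B_view[0] = 100
print(A[0, 0])    # 100
print(B_copy[0])  # 1
```

If you plan to modify the 1D result without touching the original, flatten() is the safer choice. If you only read from it, reshape(-1) avoids the extra allocation.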
I have an n x m numpy array. The values of this array range between 0 and 10, which coincides with the indices of a 1D numpy array (length of the 1D array = 11, so indices 0-10). This 1D array stores arbitrary values. I now want to replace each index value in my 2D array with the corresponding value from the 1D array.
How can I do this ?
Cheers
Norksen
It's straightforward with numpy: result = values_1d[arr_idxs_2d]
See:
>>> import numpy as np
>>> imin, imax = 0, 9
>>> values_1d = np.random.randn(imax+1)
>>> values_1d
array([-0.80355548, 1.00645905, -1.17436992, -1.52008027, 0.73544379,
-0.72175806, -0.69933584, -0.50195895, -0.04588092, 0.86542796])
>>> arr_idxs_2d = np.random.randint(imin, imax, (4, 5))
>>> arr_idxs_2d
array([[6, 7, 4, 3, 1],
[8, 8, 3, 7, 7],
[1, 1, 5, 3, 2],
[6, 2, 3, 6, 8]])
>>> values_1d[arr_idxs_2d]
array([[-0.69933584, -0.50195895, 0.73544379, -1.52008027, 1.00645905],
[-0.04588092, -0.04588092, -1.52008027, -0.50195895, -0.50195895],
[ 1.00645905, 1.00645905, -0.72175806, -1.52008027, -1.17436992],
[-0.69933584, -1.17436992, -1.52008027, -0.69933584, -0.04588092]])
I have a Pandas Series containing 1D arrays/lists. I want to extract it to a 2D NumPy array.
s=pd.Series([[1,2,3,4],[5,6,7,8]])
With to_numpy() I get a 1D array looking like this
array([list([1, 2, 3, 4]), list([5, 6, 7, 8])], dtype=object)
However, I want something like array([[1,2,3,4],[5,6,7,8]]).
Convert first to lists and then to array:
arr = np.array(s.tolist())
print (arr)
[[1 2 3 4]
[5 6 7 8]]
I don't understand array as index in Python Numpy.
For example, I have a 2d array A in Numpy
[[1,2,3]
[4,5,6]
[7,8,9]
[10,11,12]]
What does A[[1,3], [0,1]] mean?
Just test it for yourself!
A = np.arange(12).reshape(4,3)
print(A)
>>> array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
Indexing the array the way you did (see the docs on indexing) pairs the two lists element by element, so you get the element at row 1, column 0 and the element at row 3, column 1 (counting from zero).
A[[1,3], [0,1]]
>>> array([ 3, 10])
I'd highly encourage you to play around with that a bit and have a look at the documentation and the examples.
You are creating a new array:
import numpy as np
A = [[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]]
A = np.array(A)
print(A[[1, 3], [0, 1]])
# [ 4 11]
See Indexing, Slicing and Iterating in the tutorial.
Multidimensional arrays can have one index per axis. These indices are given in a tuple separated by commas
Quoting the doc:
def f(x, y):
    return 10*x + y
b = np.fromfunction(f, (5, 4), dtype=int)
print(b[2, 3])
# -> 23
You can also use a NumPy array as index of an array. See Index arrays in the doc.
NumPy arrays may be indexed with other arrays (or any other sequence-like object that can be converted to an array, such as lists, with the exception of tuples; see the end of this document for why this is). The use of index arrays ranges from simple, straightforward cases to complex, hard-to-understand cases. For all cases of index arrays, what is returned is a copy of the original data, not a view as one gets for slices.
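A short sketch of both points, using the 4 x 3 matrix from the question: index arrays are paired element by element, and the result is a copy, while a plain slice is a view. The variable names are only illustrative.

```python
import numpy as np

A = np.arange(1, 13).reshape(4, 3)  # the matrix from the question

# Index arrays are paired up: (row 1, col 0) and (row 3, col 1).
picked = A[[1, 3], [0, 1]]
print(picked)  # [ 4 11]

# Index arrays return a copy: writing to it leaves A untouched.
fancy = A[[1, 3], [0, 1]]
fancy[0] = -1
value_after_fancy_write = int(A[1, 0])
print(value_after_fancy_write)  # 4

# A basic slice returns a view: writing to it changes A.
sliced = A[1:3, 0]
sliced[0] = -1
print(A[1, 0])  # -1
```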
I tried the following:
>>> a = np.array([1,2,3])
>>> b = np.array([4,5,6])
>>> np.concatenate((a,b), axis=0)
array([1, 2, 3, 4, 5, 6])
>>> np.concatenate((a,b), axis=1)
array([1, 2, 3, 4, 5, 6])
However, I'd expect at least that one result looks like this
array([[1, 2, 3],
[4, 5, 6]])
Why is it not concatenated vertically?
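Because both a and b have only one axis, as their shape is (3,), and the axis parameter refers to an existing axis of the arrays being concatenated. With a single axis there is nothing to join along except axis 0. (Recent NumPy versions raise an AxisError for axis=1 on 1-D inputs instead of silently ignoring it.)
This example should clarify what concatenate does with axis. Take two arrays with two axes, each of shape (2, 3):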
a = np.array([[1,5,9], [2,6,10]])
b = np.array([[3,7,11], [4,8,12]])
Concatenating along the 1st axis (rows of the 1st, then rows of the 2nd):
np.concatenate((a,b), axis=0)
array([[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11],
[ 4, 8, 12]])
Concatenating along the 2nd axis (columns of the 1st, then columns of the 2nd):
np.concatenate((a, b), axis=1)
array([[ 1, 5, 9, 3, 7, 11],
[ 2, 6, 10, 4, 8, 12]])
To obtain the output you presented, you can use vstack:
a = np.array([1,2,3])
b = np.array([4,5,6])
np.vstack((a, b))
array([[1, 2, 3],
[4, 5, 6]])
You can still do it with concatenate, but you need to reshape them first:
np.concatenate((a.reshape(1,3), b.reshape(1,3)))
array([[1, 2, 3],
[4, 5, 6]])
Finally, as proposed in the comments, one way to reshape them is to use newaxis:
np.concatenate((a[np.newaxis,:], b[np.newaxis,:]))
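As a quick sanity check, the three approaches above (vstack, reshape and newaxis) can be compared on the same inputs. This is only a sketch, and the result names are made up:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Both inputs are 1-D: a single axis of length 3.
print(a.shape, a.ndim)  # (3,) 1

via_vstack = np.vstack((a, b))
via_reshape = np.concatenate((a.reshape(1, 3), b.reshape(1, 3)))
via_newaxis = np.concatenate((a[np.newaxis, :], b[np.newaxis, :]))

# All three give the same (2, 3) array.
print(via_vstack.shape)                         # (2, 3)
print(np.array_equal(via_vstack, via_reshape))  # True
print(np.array_equal(via_vstack, via_newaxis))  # True
```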
If the actual problem at hand is to combine two 1-D arrays into a 2-D array, and we are not fixated on using concatenate to perform this operation, I would suggest np.column_stack. Note that it places each 1-D array in its own column, so the result is the transpose of what vstack gives:
In []: a = np.array([1,2,3])
In []: b = np.array([4,5,6])
In []: np.column_stack((a, b))
array([[1, 4],
[2, 5],
[3, 6]])
A lesser-known feature of numpy is r_. This is a simple way to build up arrays quickly:
import numpy as np
a = np.array([1,2,3])
b = np.array([4,5,6])
c = np.r_[a[None,:],b[None,:]]
print(c)
#[[1 2 3]
# [4 5 6]]
The purpose of a[None,:] is to add an axis to array a.
a = np.array([1,2,3])
b = np.array([4,5,6])
np.array((a,b))
works just as well as
np.array([[1,2,3], [4,5,6]])
Regardless of whether it is a list of lists or a list of 1d arrays, np.array tries to create a 2d array.
But it's also a good idea to understand how np.concatenate and its family of stack functions work. In this context concatenate needs a list of 2d arrays (or anything that np.array will turn into a 2d array) as inputs.
np.vstack first loops through the inputs making sure they are at least 2d, then does concatenate. Functionally it's the same as expanding the dimensions of the arrays yourself.
np.stack is a newer function that joins the arrays along a new dimension. With the default axis=0 it behaves just like np.array on a list of equally shaped arrays.
Look at the code for these functions. If written in Python you can learn quite a bit. For vstack:
return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
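The same idea can be written with public functions only. This is a sketch of what the quoted line does, not the library's actual source, and the variable names are made up:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Promote each input to at least 2-D, then concatenate along axis 0.
manual_vstack = np.concatenate([np.atleast_2d(m) for m in (a, b)], 0)
print(np.array_equal(manual_vstack, np.vstack((a, b))))  # True

# np.stack with its default axis gives the same result as np.array here.
stacked = np.stack((a, b))
as_array = np.array((a, b))
print(np.array_equal(stacked, as_array))  # True
```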
Suppose you have 3 two-dimensional NumPy arrays (A, B, C) with the same number of columns. You can concatenate these arrays vertically like this (axis=1 would join them side by side instead):
import numpy as np
np.concatenate((A, B, C), axis=0)
np.shape
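For 2-D inputs, axis=0 joins the arrays vertically (more rows) and axis=1 joins them side by side (more columns). A small sketch with three made-up 2 x 3 arrays shows the resulting shapes:

```python
import numpy as np

# Hypothetical 2 x 3 arrays filled with 0, 1 and 2 respectively.
A = np.zeros((2, 3))
B = np.ones((2, 3))
C = np.full((2, 3), 2.0)

vertical = np.concatenate((A, B, C), axis=0)    # rows are appended
horizontal = np.concatenate((A, B, C), axis=1)  # columns are appended

print(vertical.shape)    # (6, 3)
print(horizontal.shape)  # (2, 9)
```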