Series with 1d array to 2d numpy array - python

I have a Pandas Series containing 1D arrays/lists. I want to extract it to a 2D NumPy array.
s=pd.Series([[1,2,3,4],[5,6,7,8]])
With to_numpy() I get a 1D object array that looks like this:
array([list([1, 2, 3, 4]), list([5, 6, 7, 8])], dtype=object)
However, I want something like array([[1,2,3,4],[5,6,7,8]]).

Convert first to lists and then to array:
arr = np.array(s.tolist())
print(arr)
[[1 2 3 4]
 [5 6 7 8]]
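As a rough sketch of why this works, assuming every element of the Series has the same length (otherwise NumPy cannot build a rectangular array). The names obj_arr and stacked are only illustrative:

```python
import numpy as np
import pandas as pd

s = pd.Series([[1, 2, 3, 4], [5, 6, 7, 8]])

# to_numpy() keeps each list as one Python object, so the result is 1D
obj_arr = s.to_numpy()
print(obj_arr.shape, obj_arr.dtype)  # (2,) object

# A list of equal-length lists lets NumPy infer the second dimension
arr = np.array(s.tolist())
print(arr.shape)  # (2, 4)

# np.stack is an alternative that joins the elements along a new first axis
stacked = np.stack(s.to_numpy())
print(np.array_equal(arr, stacked))  # True
```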

Related

Numpy: Swap value in 2D numpy array

I have a 2D numpy array:
arr = np.array(([[6,1,2],
[3,4,5],
[0,7,8]]))
I also use another 1D numpy array:
value = np.asarray([9,8,7,6,5,4,3,2,1])
I would like to replace each value in my 2D array with the element of my 1D array found at that index.
For example:
In my 2D array at position (0,0), I have the value 6. I must therefore replace the value at (0,0) with the value at index 6 of my 1D array, which is 3.
So far I have this code:
value = np.asarray([9,8,7,6,5,4,3,2,1])
arr = np.array(([[6,1,2],[3,4,5],[0,7,8]]))
print(arr)
#[[6 1 2]
#[3 4 5]
#[0 7 8]]
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        arr[i,j] = value[arr[i,j]]
print(arr)
#[[3 8 7]
#[6 5 4]
#[9 2 1]]
The problem is that this code is slow on large arrays (about 10 seconds for an array of size 4096 × 4096).
Is there an efficient way to solve this problem?
This is very simple: you just need a single indexing expression. Indexing value with the integer array arr looks up every element at once, so NumPy takes care of the vectorization.
arr = value[arr]
Here is an example with the data you provided:
>>> value[arr]
array([[3, 8, 7],
[6, 5, 4],
[9, 2, 1]])
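To check that the indexing expression matches the loop from the question, here is a small sketch. It assumes every entry of arr is a valid index into value; np.take is shown as an equivalent spelling of the same lookup:

```python
import numpy as np

value = np.asarray([9, 8, 7, 6, 5, 4, 3, 2, 1])
arr = np.array([[6, 1, 2], [3, 4, 5], [0, 7, 8]])

# Reference result from the explicit double loop
expected = arr.copy()
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        expected[i, j] = value[arr[i, j]]

# Integer-array indexing: the result has the shape of the index array
looked_up = value[arr]

# np.take performs the same element-wise lookup
taken = np.take(value, arr)

print(np.array_equal(looked_up, expected))  # True
print(np.array_equal(taken, expected))      # True
```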

Transforming a 2D array to 1D array to create Data Frame

I have three 2D np.arrays that are mathematically 8 × 1550 matrices, and I want to convert each of them into a 1D np.array of 12400 numbers (8 x 1550 = 12400) so that I can create a DataFrame later with this code:
Exported_Data = pd.DataFrame({"UD": UD_Data, "NS": NS_Data, "EW": EW_Data})
Exported_Data.to_csv("EXCEL.csv")
To put a simpler example, if I have this:
A = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
And I want to obtain this from that:
B = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
What is the best way to do it?
I would suggest using reshape. It returns a view whenever the memory layout allows it, which is more efficient, whereas A.flatten() always creates a copy:
B = A.reshape(-1)
The -1 tells NumPy to infer the required dimension size from the number of elements.
You can use A.flatten() to convert a 2D array to a 1D array.
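The difference between the two can be seen in a short sketch. It assumes A is a freshly created (C-contiguous) array, in which case reshape(-1) returns a view; the names B_view and B_copy are only illustrative:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

B_view = A.reshape(-1)  # a view here, because A is contiguous in memory
B_copy = A.flatten()    # always an independent copy

print(np.shares_memory(A, B_view))  # True
print(np.shares_memory(A, B_copy))  # False

# Writing through the view changes A; the copy is unaffected
B_view[0] = 100
print(A[0, 0])    # 100
print(B_copy[0])  # 1
```

If the 1D result will be modified later and A must stay intact, flatten() (or reshape(-1).copy()) is the safer choice.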

How to split a 2D array, creating arrays from "row to row" values

I want to split a 2D array this way:
Example.
From this 4x4 2D array:
np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
Create these four 2x2 2D arrays:
np.array([[1,2],[3,4]])
np.array([[5,6],[7,8]])
np.array([[9,10],[11,12]])
np.array([[13,14],[15,16]])
In the general case, from an NxN 2D array (square arrays), create as many 2D arrays of shape KxK as possible.
To be more precise: an output array is not necessarily made of all the values from one row.
Example:
From a 2D 8x8 array with values from 1 to 64, if I want to split this array into 2D 2x2 arrays, the first row of the 8x8 array holds the values 1 to 8, so the first output 2x2 array will be np.array([[1,2],[3,4]]) and the second will be np.array([[5,6],[7,8]]). This continues until the last output array, which will be np.array([[61,62],[63,64]]). Note that each 2x2 array is not filled with all the values from the row (this is the intended behaviour).
Is there a NumPy method that does this?
You're probably looking for something like numpy.reshape.
In your example:
numpy.array([[1,2,3,4], [5,6,7,8]]).reshape(4,2)
>>>array([[1,2], [3,4], [5,6], [7,8]])
Or, as suggested by @MSeifert, using -1 as the final dimension will let numpy do the division by itself:
numpy.array([[1,2,3,4], [5,6,7,8]]).reshape(4,-1)
>>>array([[1,2], [3,4], [5,6], [7,8]])
To get your desired output, you need to reshape to a 3D array and then unpack the first dimension:
>>> inp = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
>>> list(inp.reshape(-1, 2, 2))
[array([[1, 2],
[3, 4]]),
array([[5, 6],
[7, 8]]),
array([[ 9, 10],
[11, 12]]),
array([[13, 14],
[15, 16]])]
You can also unpack using = if you want to store the arrays in different variables instead of in one list of arrays:
>>> out1, out2, out3, out4 = inp.reshape(-1, 2, 2)
>>> out1
array([[1, 2],
[3, 4]])
If you're okay with a 3D array containing your 2D 2x2 arrays you don't need unpacking or the list() call:
>>> inp.reshape(-1, 2, 2)
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]],
[[13, 14],
[15, 16]]])
The -1 is a special value for reshape. As the documentation states:
One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.
If you want it more general, use the side length K of the output blocks as the last two arguments for reshape (in the 4x4 example above, K = 2 happens to be the square root of the row length):
>>> inp = np.ones((8, 8)) # 8x8 array
>>> square_shape = 2
>>> inp.reshape(-1, square_shape, square_shape) # 16 2x2 arrays
>>> square_shape = 4
>>> inp.reshape(-1, square_shape, square_shape) # 4 4x4 arrays
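As a sketch, the general case could be wrapped in a small helper. split_into_blocks is a hypothetical name, not a NumPy function, and it assumes the total number of elements is divisible by k*k:

```python
import numpy as np

def split_into_blocks(arr, k):
    """Hypothetical helper: return consecutive k x k blocks in row-major order."""
    if arr.size % (k * k) != 0:
        raise ValueError("array size must be divisible by k*k")
    # -1 lets NumPy work out how many blocks there are
    return arr.reshape(-1, k, k)

inp = np.arange(1, 65).reshape(8, 8)  # 8x8 array with values 1..64
blocks = split_into_blocks(inp, 2)

print(blocks.shape)  # (16, 2, 2)
print(blocks[0])     # [[1 2]
                     #  [3 4]]
print(blocks[-1])    # [[61 62]
                     #  [63 64]]
```

Note that this takes values in reading order, as the question asks. It does not cut the array into spatial tiles.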
If you want to split it row-wise, you may do np.reshape(arr,(2,2), order='C')
If you want to split it column-wise, you may do np.reshape(arr,(2,2), order='F')
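A minimal sketch of what the two order values do, assuming a 4-element input so that the (2,2) target shape is valid:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# order='C' fills the result row by row
row_wise = np.reshape(arr, (2, 2), order='C')
print(row_wise)  # [[1 2]
                 #  [3 4]]

# order='F' fills the result column by column
col_wise = np.reshape(arr, (2, 2), order='F')
print(col_wise)  # [[1 3]
                 #  [2 4]]
```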

Visualising a numpy array/matrix

I have a function that returns a list. I think I use np.append to add this list as a new row in an array. My intention is as follows:
list = 4 5 6
b = 1 2 3
b = np.append(b, list)
output;
1 2 3
4 5 6
This isn't the code I use (there's a lot of messing around in between). But the output I get is this:
2016-06-01 PRINT [ 99.86 99.928 99.9 99.875 99.8 89.7933
97.60018333 98.903 99.928 0.2801201 98.95 98.93
98.87 98.94 99.05 89.097 97.6712 98.87
99.59 0.23538903 99.711 99.732 99.725 99.724
99.769 89.777 98.12053333 99.68 99.88
0.30333219 99.805 99.79 99.743 99.71 99.69
89.7728 98.06653333 99.617 99.82 0.28981292
99.882 99.879 99.865 99.84 99.9 89.9206
98.29823333 99.82 100.08 0.31420778]
Is this a 10 column by 5 row array/matrix or is this a 50 column/row array? I feel like I'm missing something here - or is it just that the output doesn't really show the shape of the array?
True list append:
In [701]: alist = [4,5,6]
In [702]: b=[1,2,3]
In [703]: b.append(alist)
In [704]: b
Out[704]: [1, 2, 3, [4, 5, 6]]
bad array operation:
In [705]: anArray=np.array([4,5,6])
In [706]: b=np.array([1,2,3])
In [707]: b=np.append(b,anArray)
In [708]: b
Out[708]: array([1, 2, 3, 4, 5, 6])
In [709]: b.shape
Out[709]: (6,)
Here I just concatenated anArray onto b, making a longer array.
I've said this before - np.append is not a good function. It looks too much like the list append, and people end up misusing it. Either they miss the fact that it returns a new array instead of modifying the original in place, or they call it repeatedly in a loop.
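Both pitfalls show up in a short sketch (the variable names are only illustrative). Without an axis argument np.append flattens its inputs, and with axis=0 both inputs must already be 2D:

```python
import numpy as np

b = np.array([1, 2, 3])
row = np.array([4, 5, 6])

# np.append returns a new array; b itself is unchanged
result = np.append(b, row)
print(b.shape)       # (3,)
print(result.shape)  # (6,)

# To get rows, both inputs need two dimensions and an explicit axis
two_d = np.append(b[np.newaxis, :], row[np.newaxis, :], axis=0)
print(two_d.shape)   # (2, 3)
```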
Here's the preferred way of collecting lists or arrays and joining them into one
In [710]: alist = []
In [711]: b=np.array([1,2,3]) # could be b=[1,2,3]
In [712]: alist.append(b)
In [713]: b=np.array([4,5,6]) # b=[4,5,6]
In [714]: alist.append(b)
In [715]: alist
Out[715]: [array([1, 2, 3]), array([4, 5, 6])]
In [716]: np.array(alist)
Out[716]:
array([[1, 2, 3],
[4, 5, 6]])
In [717]: _.shape
Out[717]: (2, 3)
The result is a 2d array. List append is much faster than array append (which is really array concatenation and copies the data each time). Build the list and then make the array.
The most common way of defining a 2d array is with a list of lists:
In [718]: np.array([[1,2,3],[4,5,6]])
Out[718]:
array([[1, 2, 3],
[4, 5, 6]])
np.concatenate is another option for joining arrays and lists. It gives more control over how they are joined, but you have to pay attention to the dimensions of the inputs (you should pay attention to those anyway).
There are several 'stack' functions which streamline the dimension handling a bit, stack, hstack, vstack and yes, append. It's worth looking at their code.
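A small sketch of why the input dimensions matter with np.concatenate (the variable names are only illustrative). Joining 1D arrays along their only axis gives a longer 1D array; to get rows, the inputs have to be made 2D first:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1D inputs: joined end to end along axis 0
flat = np.concatenate([a, b])
print(flat.shape)  # (6,)

# Add a leading axis so each input has shape (1, 3), then join as rows
rows = np.concatenate([a[np.newaxis, :], b[np.newaxis, :]], axis=0)
print(rows.shape)  # (2, 3)

# 1D arrays have no axis 1, so this raises an error
try:
    np.concatenate([a, b], axis=1)
except ValueError as exc:  # AxisError is a subclass of ValueError
    print("error:", exc)
```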
You should use hstack or vstack:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.vstack((a,b))
gives
array([[1, 2, 3],
[4, 5, 6]])
or
np.hstack((a,b))
gives
array([1, 2, 3, 4, 5, 6])

Numpy list of 1D Arrays to 2D Array

I have a large list of files that contain 2D numpy arrays saved through numpy.save. I am trying to read the first column of each file and create a new 2D array.
I currently read each column using numpy.load with mmap_mode. The 1D arrays are now in a list.
col_list = []
for f in file_list:
    Temp = np.load(f,mmap_mode='r')
    col_list.append(Temp[:,0])
How can I convert this into a 2D array?
You can use
numpy.stack(arrays, axis=0)
if you have an array of arrays. You can specify the axis in case you want to stack columns and not rows.
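A short sketch of the effect of the axis argument. Here col_list is a stand-in for the first columns loaded from the files, and all 1D arrays are assumed to have the same length:

```python
import numpy as np

# Stand-in for the columns read from the files
col_list = [np.array([1, 2, 3]), np.array([4, 5, 6])]

# axis=0: each 1D array becomes a row
as_rows = np.stack(col_list, axis=0)
print(as_rows.shape)  # (2, 3)

# axis=1: each 1D array becomes a column
as_cols = np.stack(col_list, axis=1)
print(as_cols.shape)  # (3, 2)
```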
The array may be recreated:
a = np.array(a.tolist())
You can just call np.array on the list of 1D arrays.
>>> import numpy as np
>>> arrs = [np.array([1,2,3]), np.array([4,5,6]), np.array([7,8,9])]
>>> arrs
[array([1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
>>> arr2d = np.array(arrs)
>>> arr2d.shape
(3, 3)
>>> arr2d
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
