I have a list of numbers to which I want to add a second column, so that the array becomes 2D, as in the example below:
a = [1,1,1,1,1]
b = [2,2,2,2,2]
should become:
c = [[1,2],[1,2],[1,2],[1,2],[1,2]]
I am not sure how to do this using numpy?
I would just stack them and then transpose the resulting array with .T:
import numpy as np
a = np.array([1, 1, 1, 1, 1])
b = np.array([2, 2, 2, 2, 2])
c = np.stack((a, b)).T
Use numpy built-in functions:
import numpy as np
c = np.vstack((np.array(a),np.array(b))).T.tolist()
np.vstack stacks arrays vertically. .T transposes the array and tolist() converts it back to a list.
Another similar way to do it is to add a dimension using [:,None]; you can then stack the arrays horizontally without needing to transpose:
c = np.hstack((np.array(a)[:,None],np.array(b)[:,None])).tolist()
output:
[[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
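As a possible shortcut for the stack-then-transpose pattern above, np.column_stack does both steps in one call. This is only a minimal sketch using the lists from the question:

```python
import numpy as np

a = [1, 1, 1, 1, 1]
b = [2, 2, 2, 2, 2]

# column_stack treats each 1-D input as one column of the 2-D result
c = np.column_stack((a, b))

print(c.shape)     # (5, 2)
print(c.tolist())  # [[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
```

Call .tolist() only if a nested Python list is really needed; otherwise the array can be used directly.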
I have a numpy array with shape (100,9,17,2). I need to transform it to (100,15,17,2) adding zeros in missing cells. I created a zeros(100,6,17,2) but I can't merge them. Can you help me?
Use numpy's concatenate function.
result = np.concatenate([input_array, np.zeros((100, 6, 17, 2), dtype=input_array.dtype)], axis=1)
The axis argument selects the dimension along which the arrays are joined; the arrays must have the same size in every other dimension.
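To avoid hard-coding the shape of the zero block, the number of missing entries can be computed from the target size. The sketch below assumes stand-in data of ones and a hypothetical target_len variable; neither comes from the question:

```python
import numpy as np

# Stand-in data with the shape from the question (values are arbitrary)
input_array = np.ones((100, 9, 17, 2))
target_len = 15  # hypothetical name: desired size along axis 1

# Build a zero block that matches input_array in every axis except axis 1
pad_shape = list(input_array.shape)
pad_shape[1] = target_len - input_array.shape[1]
zeros = np.zeros(pad_shape, dtype=input_array.dtype)

result = np.concatenate([input_array, zeros], axis=1)
print(result.shape)  # (100, 15, 17, 2)
```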
import numpy as np
a = np.array([[1, 2], [2, 2]])
b = np.array([[1, 2], [2, 2], [1, 2]])
print(a.shape, b.shape)
output: (2, 2) (3, 2)
Now, to concatenate both arrays, specify the axis as well:
c = np.concatenate((a, b), axis=0)
output: array([[1, 2],
       [2, 2],
       [1, 2],
       [2, 2],
       [1, 2]])
Very simple: create an array of zeros with the target shape:
c = np.zeros((100, 15, 17, 2))
then copy the original array a, of shape (100, 9, 17, 2), into the matching slice:
c[:a.shape[0], :a.shape[1], :a.shape[2], :a.shape[3]] = a
The remaining cells of c stay zero, so c is already the merged result and nothing needs to be added afterwards.
You can use numpy.apply_along_axis() together with numpy.pad():
new_array = np.apply_along_axis(lambda x: np.pad(x,(0,6)), 1, input_array)
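np.pad can probably also be applied to the whole array at once, which avoids calling a Python lambda for every 1-D slice. A minimal sketch, assuming stand-in data of ones with the shape from the question:

```python
import numpy as np

# Stand-in data with the shape from the question (values are arbitrary)
input_array = np.ones((100, 9, 17, 2))

# One (before, after) pair per axis; only axis 1 gets 6 trailing zeros.
# The default mode is "constant" with a fill value of 0.
padded = np.pad(input_array, ((0, 0), (0, 6), (0, 0), (0, 0)))

print(padded.shape)  # (100, 15, 17, 2)
```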
I have a NumPy array with each row representing some (x, y, z) coordinate like so:
a = array([[0, 0, 1],
[1, 1, 2],
[4, 5, 1],
[4, 5, 2]])
I also have another NumPy array with unique values of the z-coordinates of that array like so:
b = array([1, 2])
How can I apply a function, let's call it "f", to each of the groups of rows in a which correspond to the values in b? For example, the first value of b is 1 so I would get all rows of a which have a 1 in the z-coordinate. Then, I apply a function to all those values.
In the end, the output would be an array the same shape as b.
I'm trying to vectorize this to make it as fast as possible. Thanks!
Example of an expected output (assuming that f is count()):
c = array([2, 2])
because there are 2 rows in array a which have a z value of 1 in array b and also 2 rows in array a which have a z value of 2 in array b.
A trivial solution would be to iterate over array b like so:
c = []
for val in b:
    rows = a[a[:, 2] == val]  # rows of a whose z-coordinate equals val
    c.append(f(rows))         # apply the function to that group
My attempt:
I tried doing something like this, but it just returns an empty array.
func(a[a[:, 2]==b])
The problem is that the groups of rows with the same z can have different sizes, so you cannot stack them into one 3D NumPy array, which would make it easy to apply a function along the third dimension. One solution is to use a for loop; another is to use np.split:
a = np.array([[0, 0, 1],
[1, 1, 2],
[4, 5, 1],
[4, 5, 2],
[4, 3, 1]])
a_sorted = a[a[:,2].argsort()]
inds = np.unique(a_sorted[:,2], return_index=True)[1]
a_split = np.split(a_sorted, inds)[1:]
# [array([[0, 0, 1],
# [4, 5, 1],
# [4, 3, 1]]),
# array([[1, 1, 2],
# [4, 5, 2]])]
f = np.sum # example of a function
result = list(map(f, a_split))
# [19, 15]
But imho the best solution is to use pandas and groupby as suggested by FBruzzesi. You can then convert the result to a numpy array.
EDIT: For completeness, here are the other two solutions
List comprehension:
b = np.unique(a[:,2])
result = [f(a[a[:,2] == z]) for z in b]
Pandas:
import pandas as pd
df = pd.DataFrame(a, columns=list('XYZ'))
result = df.groupby(['Z']).apply(lambda x: f(x.values)).tolist()
This is the performance plot I got for a = np.random.randint(0, 100, (n, 3)):
As you can see, approximately up to n = 10^5 the "split solution" is the fastest, but after that the pandas solution performs better.
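When f is a plain sum, as in the np.sum example above, the sort-and-split idea can be written without any Python-level loop by using np.add.reduceat. This is only a sketch for that special case, not a general replacement for an arbitrary f; the variable names are illustrative:

```python
import numpy as np

a = np.array([[0, 0, 1],
              [1, 1, 2],
              [4, 5, 1],
              [4, 5, 2],
              [4, 3, 1]])

# Sort rows by z so that each group occupies a contiguous block
order = a[:, 2].argsort(kind="stable")
a_sorted = a[order]

# keys: the distinct z values; starts: index where each block begins
keys, starts = np.unique(a_sorted[:, 2], return_index=True)

# Sum each row first, then sum the row totals within each block
row_sums = a_sorted.sum(axis=1)
group_sums = np.add.reduceat(row_sums, starts)

print(keys.tolist(), group_sums.tolist())  # [1, 2] [19, 15]
```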
If you are allowed to use pandas:
import pandas as pd
df=pd.DataFrame(a, columns=['x','y','z'])
df.groupby('z').agg(f)
Here f can be any custom function working on grouped data.
Numeric example:
a = np.array([[0, 0, 1],
[1, 1, 2],
[4, 5, 1],
[4, 5, 2]])
df=pd.DataFrame(a, columns=['x','y','z'])
df.groupby('z').size()
z
1    2
2    2
dtype: int64
Note that .size() is the way to count the number of rows per group.
To keep it in pure NumPy, maybe this can suit your case:
tmp = np.array([a[a[:,2]==i] for i in b])
tmp
array([[[0, 0, 1],
[4, 5, 1]],
[[1, 1, 2],
[4, 5, 2]]])
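which is a 3D array holding one group of rows per value in b. Note that this only produces a regular array when every group has the same number of rows.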
c = np.array([])
for x in np.nditer(b):
    c = np.append(c, np.where((a[:,2] == x))[0].shape[0])
Output:
[2. 2.]
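For the counting case specifically, the loop above can likely be replaced by a single broadcast comparison. A minimal sketch using the arrays from the question; the name mask is illustrative:

```python
import numpy as np

a = np.array([[0, 0, 1],
              [1, 1, 2],
              [4, 5, 1],
              [4, 5, 2]])
b = np.array([1, 2])

# mask[i, j] is True when row i of a has a z value equal to b[j]
mask = a[:, 2][:, None] == b

# Counting the True entries per column gives one count per value in b
c = mask.sum(axis=0)

print(c)  # [2 2]
```

The mask has shape (len(a), len(b)), so this trades memory for speed and may not suit very large inputs.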
I am trying to convert a numpy array
np.array([1,3,2])
to
np.array([[1,0,0],[0,0,1],[0,1,0]])
Any idea of how to do this efficiently?
Thanks!
Create a bool array, and then fill it:
import numpy as np
a = np.array([1, 2, 3, 0, 3, 2, 1])
b = np.zeros((len(a), a.max() + 1), bool)
b[np.arange(len(a)), a] = 1
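The snippet above uses 0-based values. For the 1-based input from the question, the same idea might be adapted by shifting the column index by one. A sketch under that assumption, using an integer array so the result matches the requested output:

```python
import numpy as np

a = np.array([1, 3, 2])

# One row per element of a, one column per possible value 1..a.max()
b = np.zeros((len(a), a.max()), dtype=int)

# Row i gets a 1 in column a[i] - 1 (shifted because the values start at 1)
b[np.arange(len(a)), a - 1] = 1

print(b.tolist())  # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```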
It is also possible to just select the right values from np.eye or the identity matrix:
a = np.array([1,3,2])
b = np.eye(max(a))[a-1]
This would probably be the most straightforward.
You can compare to [1, 2, 3] like so:
>>> a = np.array([1,3,2])
>>> np.equal.outer(a, np.arange(1, 4)).view(np.int8)
array([[1, 0, 0],
[0, 0, 1],
[0, 1, 0]], dtype=int8)
or the equivalent but slightly faster:
>>> (a[:, None] == np.arange(1, 4)).view(np.int8)
Try pandas's get_dummies function:
import pandas as pd
import numpy as np
arr = np.array([1 ,3, 2])
df = pd.get_dummies(arr)
If what you need is a NumPy array object, do:
arr2 = df.values
Suppose I have an array and a matrix of indices, and I want to build a matrix by looking up each index in the array.
import numpy as np
arr = np.array([1,5])
mtxidx = np.array([[0,1,0],[0,1,1],[0,0,0]])
How can I get a matrix [[1,5,1],[1,5,5],[1,1,1]] ?
My initial thought was simply to write
arr(mtxidx)
but that doesn't work.
Is there a function or method that does this elegantly?
"Fancy" indexing works for me (NB in your question you are trying to call the array object (round brackets) but NumPy "ndarray" objects are not callable):
In [61]: arr[mtxidx]
Out[61]:
array([[1, 5, 1],
[1, 5, 5],
[1, 1, 1]])
Your initial thought was pretty close; simply replacing the parentheses with [] makes it work.
arr[mtxidx]
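To illustrate why this works: with an integer index array, the result takes the shape of the index array, and each entry is replaced by the element of arr at that position. The sketch below also shows np.take, which should behave the same way for a 1-D arr:

```python
import numpy as np

arr = np.array([1, 5])
mtxidx = np.array([[0, 1, 0],
                   [0, 1, 1],
                   [0, 0, 0]])

# Integer-array indexing: the output has the shape of mtxidx
result = arr[mtxidx]
print(result.shape)     # (3, 3)
print(result.tolist())  # [[1, 5, 1], [1, 5, 5], [1, 1, 1]]

# np.take performs the same lookup as a function call
same = np.take(arr, mtxidx)
print(np.array_equal(result, same))  # True
```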
A list comprehension would work as well.
>>> np.array([arr[row] for row in mtxidx])
array([[1, 5, 1],
[1, 5, 5],
[1, 1, 1]])
I upvote the fancy indexing proposed by #xnx, but if you need something similar that also involves an operation on each element (or anything else), you can also try np.vectorize:
arr = np.array([1,5])
mtxidx = np.array([[0,1,0],[0,1,1],[0,0,0]])
def func(v):
    return arr[v]
vfunc = np.vectorize(func)
vfunc(mtxidx)
# array([[1, 5, 1],
# [1, 5, 5],
# [1, 1, 1]])
I have an array of 3D points to which I would like to append a corresponding value from a flat array.
points = [[1,2,3],[4,5,6]]
info = [1,2]
Is there a built-in way to append elements from the second array to the corresponding positions in the first?
output = [[1,2,3,1],[4,5,6,2]]
Use np.hstack:
points = np.array([[1,2,3],[4,5,6]])
info = np.array([1, 2])
output = np.hstack([points, info.reshape(2,1)])
Output:
array([[1, 2, 3, 1],
[4, 5, 6, 2]])
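The reshape(2,1) above hard-codes the number of points. Two variants that should work for any number of rows are sketched here: reshape(-1, 1), which lets NumPy infer the row count, and np.column_stack, which turns the 1-D array into a column by itself:

```python
import numpy as np

points = np.array([[1, 2, 3], [4, 5, 6]])
info = np.array([1, 2])

# -1 lets NumPy infer the number of rows from the length of info
out_hstack = np.hstack([points, info.reshape(-1, 1)])

# column_stack converts the 1-D info array into a column automatically
out_colstack = np.column_stack((points, info))

print(out_hstack.tolist())    # [[1, 2, 3, 1], [4, 5, 6, 2]]
print(out_colstack.tolist())  # [[1, 2, 3, 1], [4, 5, 6, 2]]
```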