Is there an effective way in which to compare all three numpy arrays at once?
For example, if the given value to check is 5, then wherever any one array is not 5, all three arrays must be not 5 (in other words, the 5s must sit at the same positions in all three arrays).
The only approach I've thought of is checking that the number of positions where arr1 != 5 & arr2 == 5 is 0. However, this only checks one direction between two arrays, and I would still need to incorporate arr3. This seems inefficient and likely to leave a logical hole.
This should pass:
arr1 = numpy.array([[1, 7, 3],
[4, 5, 6],
[4, 5, 2]])
arr2 = numpy.array([[1, 2, 3],
[4, 5, 6],
[8, 5, 6]])
arr3 = numpy.array([[1, 1, 3],
[4, 5, 6],
[9, 5, 6]])
However, this should fail because arr2 has a 3 where the other arrays have 5s:
arr1 = numpy.array([[1, 2, 3],
[8, 5, 6],
[4, 5, 6]])
arr2 = numpy.array([[1, 2, 3],
[2, 3, 1],
[2, 5, 6]])
arr3 = numpy.array([[1, 2, 3],
[4, 5, 6],
[4, 5, 3]])
There is a general solution that works regardless of the number of arrays, and it's quite educational:
import numpy as np #a recommended way of import
arr = np.array([arr1, arr2, arr3])
is_valid = np.all(arr==5, axis=0) == np.any(arr==5, axis=0) #introduce axis
out = np.all(is_valid)
#True for the first case, False for the second one
Is this a valid solution?
numpy.logical_and(((arr1==5)==(arr2==5)).all(), ((arr2==5)==(arr3==5)).all())
You could AND all comparisons to 5 and compare to any one of the comparisons:
A = (arr1==5)
(A==(A&(arr2==5)&(arr3==5))).all()
Output: True for the first example, False for the second
NB. This works for any number of arrays
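The answers above can be folded into one helper for an arbitrary number of arrays. This is a minimal sketch, assuming all arrays share the same shape; the name same_positions is made up for illustration and is not a NumPy function.

```python
import numpy as np

def same_positions(value, *arrays):
    """Return True if `value` occupies identical positions in every array.

    Hypothetical helper; assumes all arrays have the same shape.
    """
    # Stack the boolean masks along a new leading axis: shape (n_arrays, ...).
    masks = np.stack([np.asarray(a) == value for a in arrays])
    # Every mask must match the first one element for element.
    return bool((masks == masks[0]).all())

arr1 = np.array([[1, 7, 3], [4, 5, 6], [4, 5, 2]])
arr2 = np.array([[1, 2, 3], [4, 5, 6], [8, 5, 6]])
arr3 = np.array([[1, 1, 3], [4, 5, 6], [9, 5, 6]])
print(same_positions(5, arr1, arr2, arr3))
```

Comparing every mask against the first one checks both directions at once, which is the hole the question was worried about.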
Suppose I have a numpy array as below
a = np.asarray([[1,2,3],[1,4,3],[2,5,4],[2,7,5]])
array([[1, 2, 3],
[1, 4, 3],
[2, 5, 4],
[2, 7, 5]])
How can I flatten column 2 and 3 for each unique element in column 1 like below:
array([[1, 2, 3, 4, 3],
[2, 5, 4, 7, 5],])
Thank you for your help.
Another option using list comprehension:
np.array([np.insert(a[a[:,0] == k, 1:].flatten(), 0, k) for k in np.unique(a[:,0])])
# array([[1, 2, 3, 4, 3],
# [2, 5, 4, 7, 5]])
import numpy as np
a = np.asarray([[1,2,3],[1,4,3],[2,5,4],[2,7,5]])
d = {}
for row in a:
    d[row[0]] = np.concatenate( (d.get(row[0], []), row[1:]) )
r = np.array([np.concatenate(([key], d[key])) for key in d])
print(r)
This prints:
[[ 1. 2. 3. 4. 3.]
[ 2. 5. 4. 7. 5.]]
As noted in the comments, each unique element in column 0 has a fixed number of rows, which I take to mean the same number of rows per element, so a vectorized approach works. We sort the rows by column 0 and look for the first change along it, which marks a group boundary and so gives the number of rows per unique element. Let's call it L. Finally, we slice the sorted array to select columns 1 and 2 and group L rows together by reshaping. The implementation would be -
sa = a[a[:,0].argsort()]
L = np.unique(sa[:,0],return_index=True)[1][1]
out = np.column_stack((sa[::L,0],sa[:,1:].reshape(-1,2*L)))
For a further performance boost, we can use np.diff to compute L, like so -
L = np.where(np.diff(sa[:,0])>0)[0][0]+1
Sample run -
In [103]: a
Out[103]:
array([[1, 2, 3],
[3, 7, 8],
[1, 4, 3],
[2, 5, 4],
[3, 8, 2],
[2, 7, 5]])
In [104]: sa = a[a[:,0].argsort()]
...: L = np.unique(sa[:,0],return_index=True)[1][1]
...: out = np.column_stack((sa[::L,0],sa[:,1:].reshape(-1,2*L)))
...:
In [105]: out
Out[105]:
array([[1, 2, 3, 4, 3],
[2, 5, 4, 7, 5],
[3, 7, 8, 8, 2]])
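The vectorized approach relies on every key having the same number of rows. The sketch below wraps it in a function that checks that assumption first; the name flatten_groups and the stable sort are my additions, not part of the original answer.

```python
import numpy as np

def flatten_groups(a):
    """Flatten the non-key columns of each group of rows sharing a[:, 0].

    Hypothetical helper; assumes every key appears the same number of times.
    """
    a = np.asarray(a)
    # A stable sort keeps rows of the same key in their original order.
    sa = a[a[:, 0].argsort(kind="stable")]
    keys, counts = np.unique(sa[:, 0], return_counts=True)
    if not (counts == counts[0]).all():
        raise ValueError("each key must appear the same number of times")
    # One output row per key: the key followed by its flattened values.
    return np.column_stack((keys, sa[:, 1:].reshape(len(keys), -1)))

a = np.array([[1, 2, 3], [3, 7, 8], [1, 4, 3],
              [2, 5, 4], [3, 8, 2], [2, 7, 5]])
print(flatten_groups(a))
```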
I have two arrays of the same size. In general the dtype of these arrays is object (dtype = 'O'). What is the best way to access elements with the same indices from both arrays?
Possibility 1:
remove_indices = [i for i in range(len(array1)) if value in array1[i]]
array1 = np.delete(array1, remove_indices, 0)
array2 = np.delete(array2, remove_indices, 0)
Possibility 2:
array3 = np.array([[array1[i], array2[i]] for i in range(len(array1))
if value not in array1[i]])
array1 = array3[:,0]
array2 = array3[:,1]
Note that Possibility 2 is faster. Is there any other solution with similar execution time (or faster)? How could I make Possibility 2 more readable?
I'm not sure I understand your examples, but going by "What is the best way to access elements with the same indices from both arrays", my first thought is zip. Since you are using numpy, though, why not use transpose?
Like:
>>> array1 = numpy.array([0, 1, 2, 3, 4])
>>> array2 = numpy.array([5, 6, 7, 8, 9])
>>> numpy.array([array1, array2])
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
>>> numpy.array([array1, array2]).T
array([[0, 5],
[1, 6],
[2, 7],
[3, 8],
[4, 9]])
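For the filtering in the question itself, one readable option is to build a single boolean mask and apply it to both arrays. This is only a sketch: the sample data (strings in array1, integers in array2) and the value "c" are invented for illustration, since the question does not show its real data.

```python
import numpy as np

# Invented sample data; the question's real arrays are not shown.
array1 = np.array(["ab", "cd", "ef", "cg"], dtype=object)
array2 = np.array([10, 20, 30, 40], dtype=object)
value = "c"

# One pass over array1 decides which positions to keep.
keep = np.array([value not in item for item in array1], dtype=bool)

# The same mask selects matching positions from both arrays.
filtered1 = array1[keep]
filtered2 = array2[keep]
print(filtered1, filtered2)
```

The membership test still runs in Python because the elements are arbitrary objects, but the mask is computed once and no intermediate paired array is needed.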
Need to take the values from one array, put them through a function and put them in another array. It is meant to be done using a pair of nested for loops. Please help. Complete beginner here.
EDIT: Ok to clarify, I have a 2-d array with various values in it. I want to apply a function to all of these values and have a 2-d array returned with the values after they have gone through the function. I am working in python. Thanks for the quick responses and any help you can give!
EDIT3: Example code:
import numpy as N
def makeGrid(dim):
    ''' Function to return a grid of distances from the centre of an array.
    This version uses loops to fill the array and is thus slow.'''
    tabx = N.arange(dim) - float(dim/2.0) + 0.5
    taby = N.arange(dim) - float(dim/2.0) + 0.5
    grid = N.zeros((dim,dim), dtype='float')
    for y in range(dim):
        for x in range(dim):
            grid[y,x] = N.sqrt(tabx[x]**2 + taby[y]**2)
    return grid
import math
def BigGrid(dim):
    l= float(raw_input('Enter a value for lambda: '))
    p= float(raw_input('Enter a value for phi: '))
    a = makeGrid
    b= N.zeros ((10,10),dtype=float) #Create an array to take the returned values
    for i in range(10):
        for j in range (10):
            b[i,j] = a[i][j]*2
if __name__ == "__main__":
    ''' Module test code '''
    size = 10 #Dimension of the array
    newGrid = BigGrid(size)
    newGrid = N.round(newGrid, decimals=2)
    print newGrid
def map_row(row):
    return map(some_function,row)
map(map_row,my_2d_list)
Is probably how I would do it...
Based on your question, it appears you're using Numpy. If you're not too concerned about speed, you can simply call the function with a numpy array; the function will operate on the entire array for you.
There's no need to write the iteration explicitly, though if you can find a way to take advantage of numpy's special features, that will be faster than using a function designed to operate on one element at a time. Unless you're working with a very large dataset, though, this should be fine:
>>> import numpy as np
>>> g = np.array( [ [1,2,3], [ 4,5,6] ] )
>>> g
array([[1, 2, 3],
[4, 5, 6]])
>>> def myfunc( myarray ):
... return 2 * myarray
...
>>> myfunc(g)
array([[ 2, 4, 6],
[ 8, 10, 12]])
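Calling the function on the whole array works here because 2 * myarray is itself an array operation. If the function only makes sense for one scalar at a time (for example, it contains an if), np.vectorize can wrap it. A minimal sketch follows; scalar_func is a made-up example, and np.vectorize is a convenience that still loops in Python rather than a speed-up.

```python
import numpy as np

def scalar_func(x):
    # Made-up scalar-only function: the `if` would fail on a whole array.
    if x > 3:
        return x * 2
    return x

# np.vectorize wraps the scalar function so it accepts arrays.
vfunc = np.vectorize(scalar_func)

g = np.array([[1, 2, 3], [4, 5, 6]])
result = vfunc(g)
print(result)
```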
First, you have a bug in your code in the following line:
a = makeGrid
You are setting a to be a function, not an array. You should have the following:
a = makeGrid(dim)
That is why you had the TypeError when you tried the answer by @abought.
Now, to apply an operation element-wise in numpy there are many possibilities. If you want to perform the same operation for every element in the array, the simplest way is to use array operations:
b = a * 2
(Note that you don't need to declare b beforehand. And you also don't need any loops.) Numpy has also many C-optimised functions that perform the same operation on each element of an array. These are called ufuncs. You can combine ufuncs to get complex expressions evaluated element-wise. For example:
b = N.sin(a**2) + N.log(N.abs(a))
Your a array from makeGrid() can also be much more efficiently created using array operations and numpy's mgrid:
grid = N.mgrid[0:dim, 0:dim] - dim/2.0 + 0.5
grid = N.sqrt(grid[0]**2 + grid[1]**2)
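To check that a loop-free construction really reproduces makeGrid(), the sketch below compares the two. It uses broadcasting on a single offsets vector instead of mgrid, which is my substitution; the function names are illustrative.

```python
import numpy as np

def make_grid_loops(dim):
    """Loop version, following makeGrid() from the question."""
    tab = np.arange(dim) - dim / 2.0 + 0.5
    grid = np.zeros((dim, dim), dtype=float)
    for y in range(dim):
        for x in range(dim):
            grid[y, x] = np.sqrt(tab[x] ** 2 + tab[y] ** 2)
    return grid

def make_grid_vectorized(dim):
    """Same distances, computed with broadcasting instead of loops."""
    tab = np.arange(dim) - dim / 2.0 + 0.5
    # A row vector and a column vector broadcast to a (dim, dim) grid.
    return np.sqrt(tab[np.newaxis, :] ** 2 + tab[:, np.newaxis] ** 2)

print(np.allclose(make_grid_loops(10), make_grid_vectorized(10)))
```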
If you want to perform different operations on each array element, things get more complicated and it may not be possible to avoid loops. For these cases, numpy has a way to flatten the loops over an n-dimensional array into a single loop, using ndenumerate or ndindex. Your example with ndenumerate:
for index, x in N.ndenumerate(a):
b[index] = x * 2
This avoids writing multiple nested loops, but it still iterates in Python, so array operations should be used whenever possible.
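ndindex works the same way but yields only the indices, which is handy when the operation depends on the position. A small sketch; the rule (double at even index sums, negate at odd ones) is invented for illustration.

```python
import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)
b = np.empty_like(a)

# np.ndindex yields every index tuple of the given shape in one loop.
for index in np.ndindex(a.shape):
    if sum(index) % 2 == 0:
        b[index] = a[index] * 2   # even index sum: double the value
    else:
        b[index] = -a[index]      # odd index sum: negate the value

print(b)
```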
From what I can get in terms of context from the question and what a 2d-array typically means it looks like you are trying to do the following:
>>> array2d = [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
>>> def add_two( v ):
... return v + 2
...
>>> [ [ add_two( v ) for v in row ] for row in array2d ]
[[2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6], [2, 3, 4, 5, 6]]
The above uses a list comprehension, which is equivalent to two nested for loops. In this case it is more readable and involves less direct use of the list methods, because you describe what the list is rather than building it step by step.
Here is a one-liner with a double map:
map(lambda x:map(func, x), l)
Example:
l=[[1,2,3],[4,3,1]]
map(lambda x:map(lambda x:x*10,x),l)
[[10, 20, 30], [40, 30, 10]]
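Note that the output above is what Python 2 prints. In Python 3, map returns a lazy iterator, so each map has to be materialized to get lists back. A minimal sketch; apply_nested is a name chosen here for illustration.

```python
def apply_nested(func, rows):
    """Apply `func` to every element of a list of lists (Python 3)."""
    # list(...) forces the lazy map object into a concrete list per row.
    return [list(map(func, row)) for row in rows]

l = [[1, 2, 3], [4, 3, 1]]
print(apply_nested(lambda x: x * 10, l))
```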
Easy to do it with a nested loop:
def my_function(n): # n will become y from the next part
    new_num = n + 1 # do whatever you want with it
    return new_num
my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] # just an example
new_list, final_list = [], [] # multiple assignment
for x in my_list:
    print x
    new_list = []
    for y in x:
        # y is now the first value of the first value of my_list--- 1.
        my_num = my_function(y)
        new_list.append(my_num)
    final_list.append(new_list)
print final_list
That should do it.
Returns: [[2, 3, 4], [5, 6, 7], [8, 9, 10]].
for (int i = 0; i < x; i++)
    for (int j = 0; j < y; j++)
        array2[i][j] = func(array2[i][j]);
Something like that?
For example, I have a ndarray that is:
a = np.array([1, 3, 5, 7, 2, 4, 6, 8])
Now I want to split a into two parts, one is all numbers <5 and the other is all >=5:
[array([1,3,2,4]), array([5,7,6,8])]
Certainly I can traverse a and create two new arrays. But I want to know whether numpy provides a better way.
Similarly, for multidimensional array, e.g.
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[2, 4, 7]])
I want to split it according to whether the first column is <3 or >=3, which should give:
[array([[1, 2, 3],
[2, 4, 7]]),
array([[4, 5, 6],
[7, 8, 9]])]
Is there a better way than traversing it? Thanks.
import numpy as np
def split(arr, cond):
    return [arr[cond], arr[~cond]]
a = np.array([1,3,5,7,2,4,6,8])
print split(a, a<5)
a = np.array([[1,2,3],[4,5,6],[7,8,9],[2,4,7]])
print split(a, a[:,0]<3)
This produces the following output:
[array([1, 3, 2, 4]), array([5, 7, 6, 8])]
[array([[1, 2, 3],
[2, 4, 7]]), array([[4, 5, 6],
[7, 8, 9]])]
It might be a quick solution
a = np.array([1,3,5,7])
b = a >= 3 # variable with condition
a[b] # to slice the array
len(a[b]) # count the elements in sliced array
1d array:
a = numpy.array([2,3,4,...])
a_new = a[(a < 5)] # to get elements less than 5
2d array, filtered on a column (keeping rows where the value in column i is less than 5):
a = numpy.array([[1,2],[5,6],...])
a = a[(a[:,i] < 5)]
If your condition depends on several columns, you can build a new array by applying the conditions to those columns. Then you can compare the new array with the value 5 (according to my assumption) to get the mask and proceed as in the code above.
Note that whatever I have written in () returns a boolean mask array, which is then used as the index.
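For a condition spanning several columns, the per-column masks can be combined with & (and) or | (or) before indexing. A minimal sketch with invented sample data and the same threshold of 5:

```python
import numpy as np

# Invented sample data for illustration.
a = np.array([[1, 2], [5, 6], [3, 9], [7, 1]])

# Parentheses are required: & and | bind tighter than the comparisons.
both_small = (a[:, 0] < 5) & (a[:, 1] < 5)
either_small = (a[:, 0] < 5) | (a[:, 1] < 5)

print(a[both_small])
print(a[either_small])
```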