find index of certain array that is inside array - python

Is there a way to find the index of an array inside another array without converting them to lists or using a for loop?
I have a huge data set and I don't want to add another loop that makes it slower.
arr = np.array([[11, 19, 18], [14, 15, 11], [19, 21, 46], [29, 21, 19]])
find_this_array = np.array([14, 15, 11])
# I want to avoid this
a = arr.tolist()
val = find_this_array.tolist()
a.index(val)
output:
1

You can try this:
np.where((arr == find_this_array).all(axis=1))[0][0]
output:
1
You can find more details about Numpy where from their documentation:
https://numpy.org/doc/stable/reference/generated/numpy.where.html
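One caveat with the one-liner above: if no row matches, the trailing [0][0] raises an IndexError. Here is a minimal sketch of a wrapper that returns -1 instead. The name find_row_index and the -1 convention are my own assumptions for illustration, not part of the original answer.

```python
import numpy as np

def find_row_index(arr, row):
    """Return the index of the first row of `arr` equal to `row`, or -1 if none matches."""
    # Broadcasting compares `row` against every row; all(axis=1) keeps full matches only.
    matches = np.flatnonzero((arr == row).all(axis=1))
    return int(matches[0]) if matches.size else -1

arr = np.array([[11, 19, 18], [14, 15, 11], [19, 21, 46], [29, 21, 19]])
print(find_row_index(arr, np.array([14, 15, 11])))  # 1
print(find_row_index(arr, np.array([0, 0, 0])))     # -1
```

The comparison still touches every element, but the work is done inside NumPy rather than in a Python-level loop.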

Related

How to extract non-zero values of a numpy array

I have a numpy array and want to extract some values out of it. This is my array:
arr= array([[0, 0, 23, 28],
[0, 19, 24, 29],
[0, 20, 25, 30],
[17, 21, 26, 31],
[18, 22, 27, 32]])
I want to first sort the non-zero part of it and change it into:
arr= array([[0, 0, 27, 32],
[0, 22, 26, 31],
[0, 21, 25, 30],
[18, 20, 24, 29],
[17, 19, 23, 28]])
Then, from the first column, I want to extract the last two rows (18 and 17). In the second column I have four non-zero rows, which is two more than in the previous column, so I want the two upper rows. In the third column I see five non-zero rows, which is one more than in the second column, so I want the one upper row. In the last column, the number of non-zero rows is the same as in the previous column, so I do not want any row from it. Finally, I want to have these extracted numbers as a list or numpy array:
result= [17, 18, 21, 22, 27]
I tried the following but it was not successful at all:
result=[]
for m in range (len (arr[0,:])):
    for i in range (len (arr[:,m])):
        if arr[i,m]==0 and arr[i+1,m]!=0:
            b= arr[i+1:,m]
            result.append (b)
I appreciate any help in advance.
Let's try:
mask = arr != 0
# mask the 0 with infinity and sort
new_arr = np.sort(np.where(mask, arr, np.inf), axis=0)
# replace back:
arr[:] = np.where(mask, new_arr[::-1], 0)
# extract the result
result = arr[np.arange(arr.shape[0]),mask.argmax(axis=1)]
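For reference, here is the snippet above as a self-contained run on the array from the question. This is only a sketch: it relies on the staircase layout of the zeros in the example, and the final np.sort is my addition so that the values come out in the order shown in the question.

```python
import numpy as np

arr = np.array([[0, 0, 23, 28],
                [0, 19, 24, 29],
                [0, 20, 25, 30],
                [17, 21, 26, 31],
                [18, 22, 27, 32]])

mask = arr != 0
# Zeros become +inf so they sink to the bottom of each column when sorting.
new_arr = np.sort(np.where(mask, arr, np.inf), axis=0)
# Reverse the rows so the non-zero block is descending, then restore the zeros.
arr[:] = np.where(mask, new_arr[::-1], 0)
# Pick the first non-zero entry of every row.
first_nonzero = arr[np.arange(arr.shape[0]), mask.argmax(axis=1)]
# Added step: sort ascending to match the order requested in the question.
result = np.sort(first_nonzero)
print(result.tolist())  # [17, 18, 21, 22, 27]
```

Without the last sort, the values come out in row order, that is 27, 22, 21, 18, 17.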

Reshaping 2D array to 3D array for tiff conversion

I have a 2D array like the one below, where 1, 2, 3, ..., 6 are the rows (6) and the letters are the columns (4).
1 - A,B,C,D
2 - E,F,G,H
3 - I,J,K,L
4 - M,N,O,P
5 - Q,R,S,T
6 - U,V,W,X
I need to reshape this array into a 2x3 array as shown below, so that accessing "1" returns the letters A,B,C,D:
1, 2, 3
4, 5, 6
After reshaping, I will save the array as a multi-band TIFF.
I know it's a very simple task and I could do it by creating an empty array and filling it with for loops, but I want to do it with the reshape function or some other simple method.
Kindly help me, guys.
Your array format is a bit misleading. However, here's a minimal example I could prepare:
# example array
arr = np.random.randint(1, 30, 30).reshape(-1, 5)  # use numpy directly; the pd.np alias is not available in recent pandas
array([[25, 13, 24, 10, 14],
[13, 11, 2, 24, 20],
[16, 28, 5, 12, 24],
[ 2, 21, 24, 29, 21],
[21, 5, 18, 23, 23],
[22, 9, 10, 29, 9]])
# reshape the array by taking first value from each row
np.apply_along_axis(lambda x: x[0], 1, arr).reshape(-1, 3)
array([[25, 13, 16],
[ 2, 21, 22]])
I found the answer in this question:
how to save an array representing an image with 40 band to a .tif file
I was saving the TIFF file with skimage's imsave, and skimage could handle only 4-channel data. The following command solved my issue:
tifffile.imsave("y.tif", x, planarconfig='contig')
That is, the band dimension comes last for 'contig'.
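For the reshaping part of the question, a plain reshape may already be enough. This is a minimal sketch, assuming the data is a 6x4 NumPy array whose rows are the positions 1 to 6. The letters are stand-ins for the real band values, and the names letters and cube are made up for illustration.

```python
import numpy as np

# 6 rows (positions 1..6) by 4 columns (the values stored at each position)
letters = np.array(list("ABCDEFGHIJKLMNOPQRSTUVWX")).reshape(6, 4)

# Lay the 6 positions out as a 2x3 grid; the 4 values stay together on the last axis.
cube = letters.reshape(2, 3, 4)

print(cube[0, 0].tolist())  # position "1" -> ['A', 'B', 'C', 'D']
print(cube[1, 2].tolist())  # position "6" -> ['U', 'V', 'W', 'X']
```

The band dimension ends up last here, which is the layout the 'contig' remark above refers to.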

Search a number in a sorted 2D array

I'm trying to find a given number in a 2D list. However, the list has to be sorted before searching.
Searching for a number in the 2D array seems to work fine. The problem is sorting the 2D array in a way that keeps the search working. Let's assume I want to sort a 3x3 2D array like this one:
[[8, 27, 6],
[1, 0, 11],
[10, 9, 3]]
Then I will look for a number in the sorted 2D array using binary search. My mid value will be in the middle of the range being searched.
This is just an example of what I want to accomplish: fill the array with randomized numbers and then sort the rows and columns. I'm using random.randint() from Python's random module to generate the numbers. Then I try to sort the 2D array, but it isn't actually sorted before the search continues.
import random
n = 5
m = 5
def findnum_arr(array, num):
    low = 0
    high = n * m - 1
    while (high >= low):
        mid = (low + high) // 2
        i = mid // m
        j = mid % m
        if (num == array[i][j]):
            return True
        if (num < array[i][j]):
            high = mid - 1
        else:
            low = mid + 1
    return False
if __name__ == '__main__':
    multi_array = [[random.randint(0, 20) for x in range(n)] for y in range(m)]
    sorted(multi_array)  # the result is discarded, so multi_array is left unsorted
Sorted:
[[0, 1, 3],
[6, 8, 9],
[10, 11, 27]]
That is what the sorted 2D array should look like. Is it possible to sort both the rows and the columns this way with the sorted function?
Calling sorted on a nested list only reorders the inner lists, comparing them by their first element (with later elements breaking ties); it does not sort the values inside each list.
Example:
arr = [[8, 27, 6],[1, 0, 11],[10, 15, 3], [16, 12, 14], [4, 9, 13]]
is going to return
[[1, 0, 11], [4, 9, 13], [8, 27, 6], [10, 15, 3], [16, 12, 14]]
To sort it the way you want, you have to flatten the list, sort it, and then reshape it.
For that, I would try introducing numpy.
import numpy as np
a = np.array(sorted(sum(arr, [])))
#sorted(sum(arr, [])) flattens the list
b = np.reshape(a, (-1,3)).tolist()
EDITED FOR CLARITY: You can use your m and n as parameters in np.reshape. The first parameter (m) gives the number of rows (inner arrays), while the second (n) gives the number of elements in each row.
Using -1 for either parameter tells NumPy to infer that dimension from the array's size and the other parameter.
b would return
[[0, 1, 3], [4, 6, 8], [9, 10, 11], [12, 13, 14], [15, 16, 27]]
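As a possible shortcut, np.sort can flatten for you: with axis=None the array is flattened before sorting, so the sum(arr, []) step is not needed. A small sketch on the same example list:

```python
import numpy as np

arr = [[8, 27, 6], [1, 0, 11], [10, 15, 3], [16, 12, 14], [4, 9, 13]]

# axis=None flattens the array before sorting; reshape(-1, 3) rebuilds rows of 3.
b = np.sort(np.array(arr), axis=None).reshape(-1, 3).tolist()
print(b)  # [[0, 1, 3], [4, 6, 8], [9, 10, 11], [12, 13, 14], [15, 16, 27]]
```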
I finally found a proper solution that uses neither numpy nor the sum() function.
if __name__ == '__main__':
    x = 7
    multi_array = [[random.randint(0, 200) for x in range(n)] for y in range(m)]
    # one_array = sorted(list(itertools.chain.from_iterable(multi_array))) Another way if you are using itertools
    one_array = sorted([x for row in multi_array for x in row])
    # each row holds n values, so both the slice width and the step are n
    sorted_2d = [one_array[i:i+n] for i in range(0, len(one_array), n)]
    print("multi_array list is: \n{0}\n".format(multi_array))
    print("sorted 2D array: \n{0}\n".format(sorted_2d))
    if not findnum_arr(sorted_2d, x):
        print("Not Found")
    else:
        print("Found")
output:
multi_array list is:
[[40, 107, 23, 27, 42], [150, 84, 108, 191, 172], [154, 22, 161, 26, 31], [18, 150, 197, 77, 191], [96, 124, 81, 125, 186]]
sorted 2D array:
[[18, 22, 23, 26, 27], [31, 40, 42, 77, 81], [84, 96, 107, 108, 124], [125, 150, 150, 154, 161], [172, 186, 191, 191, 197]]
Not Found
I wanted to find a standard library module that could flatten the 2D array into 1D and sort it. Then I would use a list comprehension to build the sorted 1D list back into a 2D array. This sounds like a lot of work, but it seems to work fine. Let me know if there is a better and faster way to do it without numpy :)
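A standard-library sketch of the same idea, in case the grid is not square. The helper names sort_2d and find_in_sorted_grid are hypothetical; they take the row length from the grid itself instead of the globals n and m, so the flat-index arithmetic also holds when rows and columns differ.

```python
from itertools import chain

def sort_2d(grid):
    """Return a new grid with the same row length whose values are sorted row by row."""
    n_cols = len(grid[0])
    flat = sorted(chain.from_iterable(grid))
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]

def find_in_sorted_grid(grid, num):
    """Binary search over a row-major sorted grid, treated as one flat sequence."""
    n_cols = len(grid[0])
    low, high = 0, len(grid) * n_cols - 1
    while low <= high:
        mid = (low + high) // 2
        value = grid[mid // n_cols][mid % n_cols]  # flat position -> (row, column)
        if value == num:
            return True
        if num < value:
            high = mid - 1
        else:
            low = mid + 1
    return False

grid = sort_2d([[8, 27, 6], [1, 0, 11], [10, 9, 3]])
print(grid)                          # [[0, 1, 3], [6, 8, 9], [10, 11, 27]]
print(find_in_sorted_grid(grid, 9))  # True
print(find_in_sorted_grid(grid, 7))  # False
```

Dividing by the number of columns (not the number of rows) is what turns a flat position into a row and column.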

What is the easiest way in NumPy to index vectors of a matrix excluding one index in each row? [duplicate]

Suppose I have a NumPy ndarray M with the following content at M[0,:]:
[2, 3.9, 7, 9, 0, 1, 8.1, 3.2]
and I am given an integer, k, at runtime between 0 and 7. I want to produce the vector consisting of all items in this row except at column k. (Example: if k=3, then the desired vector is [2,3.9,7,0,1,8.1,3.2])
Is there an easy way to do this?
What if I have a vector of indices k, one for each row of M, representing the column I want to exclude from the row?
I'm kind of lost, other than a non-vectorized loop that mutates a result matrix:
nrows = M.shape[0]
result = np.zeros((nrows, M.shape[1]-1))
for irow in range(nrows):
    result[irow,:k[irow]] = M[irow,:k[irow]] # content before the split point
    result[irow,k[irow]:] = M[irow,k[irow]+1:] # content after the split point
One approach would be with masking/boolean-indexing -
mask = np.ones(M.shape,dtype=bool)
mask[np.arange(len(k)),k] = 0
out = M[mask].reshape(len(M),-1)
Alternatively, we could use broadcasting to get that mask -
np.not_equal.outer(k,np.arange(M.shape[1]))
# or k[:,None]!=np.arange(M.shape[1])
Thus, giving us a one-liner/compact version -
out = M[k[:,None]!=np.arange(M.shape[1])].reshape(len(M),-1)
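A small runnable check of the broadcasting version, plus np.delete for the single-row case from the question. The 3x4 matrix and the values in k are made-up sample data; k is assumed to be a NumPy integer array with one column index per row.

```python
import numpy as np

# Single row, single index: np.delete returns a copy without position 3.
row = np.array([2, 3.9, 7, 9, 0, 1, 8.1, 3.2])
print(np.delete(row, 3).tolist())  # [2.0, 3.9, 7.0, 0.0, 1.0, 8.1, 3.2]

# One excluded column per row, via the broadcast mask.
M = np.arange(12).reshape(3, 4)
k = np.array([0, 2, 3])
mask = k[:, None] != np.arange(M.shape[1])  # True wherever the column is kept
out = M[mask].reshape(len(M), -1)
print(out.tolist())  # [[1, 2, 3], [4, 5, 7], [8, 9, 10]]
```

The reshape works because every row loses exactly one element, so all rows keep the same length.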
To exclude multiple ones per row, edit the advanced-indexing part for the first method -
def exclude_multiple(M,*klist):
    k = np.stack(klist).T
    mask = np.ones(M.shape,dtype=bool)
    mask[np.arange(len(k))[:,None],k] = 0
    out = M[mask].reshape(len(M),-1)
    return out
Sample run -
In [185]: M = np.arange(40).reshape(4,10)
In [186]: exclude_multiple(M,[1,3,2,0],[4,5,8,1])
Out[186]:
array([[ 0, 2, 3, 5, 6, 7, 8, 9],
[10, 11, 12, 14, 16, 17, 18, 19],
[20, 21, 23, 24, 25, 26, 27, 29],
[32, 33, 34, 35, 36, 37, 38, 39]])
Improvement on @Divakar's answer to extend this to zero or more excluded indices per row:
def excluding(A, *klist):
    """
    excludes column k from each row of A, for each k in klist
    (make sure the index vectors have no common elements)
    """
    mask = np.ones(A.shape,dtype=bool)
    for k in klist:
        mask[np.arange(len(k)),k] = 0
    return A[mask].reshape(len(A),-1)
Test:
M = np.arange(40).reshape(4,10)
excluding(M,[1,3,2,0],[4,5,8,1])
returns
array([[ 0, 2, 3, 5, 6, 7, 8, 9],
[10, 11, 12, 14, 16, 17, 18, 19],
[20, 21, 23, 24, 25, 26, 27, 29],
[32, 33, 34, 35, 36, 37, 38, 39]])

sum every element of first array with all the elements of second array

I have two arrays:
array1 = [1,2,3]
array2 = [10,20,30]
I want the following sum:
array3 = [10+1,10+2,10+3,20+1,20+2,20+3,30+1,30+2,30+3]
How can I do that?
(I know that it can be done with two for loops but I want something more efficient if possible)
Note: those two arrays are contained in a dataframe (pandas)
I do not think pandas is necessary here
[x+y for x in array2 for y in array1]
Out[293]: [11, 12, 13, 21, 22, 23, 31, 32, 33]
If they are in the dataframe
df=pd.DataFrame({'a':array1,'b':array2})
df
Out[296]:
a b
0 1 10
1 2 20
2 3 30
df.a.values+df.b.values[:,None]
Out[297]:
array([[11, 12, 13],
[21, 22, 23],
[31, 32, 33]], dtype=int64)
Update
(df.a.values+df.b.values[:,None]).ravel()
Out[308]: array([11, 12, 13, 21, 22, 23, 31, 32, 33], dtype=int64)
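The same broadcasting idea can also be written with np.add.outer, which builds the full table of pairwise sums directly. A minimal sketch on the lists from the question:

```python
import numpy as np

array1 = [1, 2, 3]
array2 = [10, 20, 30]

# Rows follow array2 and columns follow array1; ravel() flattens row by row.
array3 = np.add.outer(array2, array1).ravel()
print(array3.tolist())  # [11, 12, 13, 21, 22, 23, 31, 32, 33]
```

With a DataFrame, the two lists can be replaced by df.b.values and df.a.values.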
I would recommend itertools.product here. The itertools documentation, https://docs.python.org/3/library/itertools.html, includes many other recipes that let you write clearer code.
from itertools import product
array1 = [1,2,3]
array2 = [10,20,30]
# array2 goes first so the sums come out in the requested order
[x+y for x,y in product(array2,array1)]
# fp style
[*map(sum, product(array2,array1))]
