Get a subarray from a numpy array based on index - python

I have a numpy array vector, and I want to get a subset based on the indexes:
import numpy as np
input=np.array([1,2,3,4,5,6,7,8,9,10])
index=np.array([0,1,0,0,0,0,1,0,0,1])
What is a Pythonic way to get output=[2,7,10]?

output = input[index.astype(bool)]  # plain bool works on every NumPy version; np.bool is missing in NumPy 1.24 to 1.26
or
output = input[np.where(index)[0]]
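A minimal sketch comparing the two approaches on the data from the question. The names `values`, `mask`, `by_mask` and `by_position` are introduced here only for illustration, and `np.flatnonzero` is used as an equivalent of `np.where(index)[0]` for 1-D input.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
index = np.array([0, 1, 0, 0, 0, 0, 1, 0, 0, 1])

# Boolean mask: keeps the elements where index is nonzero.
mask = index.astype(bool)
by_mask = values[mask]

# Integer positions of the nonzero entries, then integer indexing.
by_position = values[np.flatnonzero(index)]

print(by_mask)      # [ 2  7 10]
print(by_position)  # [ 2  7 10]
```

Note that `values[index]` without the conversion treats the 0s and 1s as positions, so it returns only copies of the first two elements rather than the masked subset.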

Related

Error: 3D Matlab array to 0 dimensional np array

I'm having an issue converting a 3-dimensional MATLAB array into a 3-dimensional NumPy array in Python. When I index it after reading it in, an error message says it is a 0-dimensional array.
This is the code I am using:
import scipy.io
import numpy as np
mat = scipy.io.loadmat('2021.01.25.FC.mat')
matrix = np.array(mat)
However, when I index the array like this:
x=matrix[2,2,2]
I receive this error:
IndexError: too many indices for array: array is 0-dimensional, but 3 were indexed
Does anyone know why this array is being read in as a 0-dimensional array in NumPy, or how to correct this?
Thanks!
I think this happens because 'mat' is a dictionary, as the scipy.io.loadmat documentation describes.
The dictionary stores all the variables present in your '2021.01.25.FC.mat' file, keyed by variable name, and np.array applied to a dictionary wraps it in a 0-dimensional object array. If the matrix you are interested in is named "MyMatrix" in your MATLAB file, then a quick fix could be:
import scipy.io
import numpy as np
mat = scipy.io.loadmat('2021.01.25.FC.mat')['MyMatrix']
matrix = np.array(mat)
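To see why the original code produced a 0-dimensional array, here is a small sketch that does not need the .mat file. It uses a hand-built dictionary as a stand-in for what loadmat returns; the variable name "MyMatrix" and the 3x3x3 shape are assumptions for illustration only.

```python
import numpy as np

# Hand-built stand-in for the dict that loadmat returns (assumption:
# the variable is called "MyMatrix" and has shape 3x3x3).
mat = {"MyMatrix": np.arange(27).reshape(3, 3, 3)}

# np.array on a dict wraps the whole dict in a single object element.
wrapped = np.array(mat)
print(wrapped.ndim, wrapped.dtype)  # 0 object

# Selecting the variable by name first gives the real 3-D array.
matrix = mat["MyMatrix"]
print(matrix.ndim)  # 3
x = matrix[2, 2, 2]
print(x)            # 26
```

On the real file, printing `mat.keys()` shows which variable names are available; loadmat also adds metadata keys such as `__header__`, `__version__` and `__globals__`.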

Concatenate columns while maintaining rows

I have a NumPy array whose columns I would like to concatenate into a single string per row. Below is what I have tried so far.
import numpy as np
randoma=np.random.choice(list('ACTG'),(5,21),replace=True)  # create a 5x21 random matrix with A,C,T,G
randoma=np.concatenate(randoma, axis=None)
The expected result is something like:
randoma = ['AAGCCGCACACAGACCCTGAG',
'AAGCTGCACGCAGACCCTGAG',
'AGGCTGCACGCAGACCCTGAG',
'AAGCTGCACGTGGACCCTGAG',
'AGGCTGCACGTGGACCCTGAG',
'AGGCTGCACGTGGACCCTGAG',
'AAGCTGCATGTGGACCCTGAG']
import numpy as np
randoma = np.random.choice(list('ACTG'),(5,21),replace=True)  # create a 5x21 random matrix with A,C,T,G
new_list = [''.join(x) for x in randoma.tolist()]
new_list
['CGGGACGCACTTCCTGTGCAG',
'TGTAGCGGCTTGGTGTCCAAG',
'GAAAGTTTAGGATTGCGTCGG',
'AGTATTGTGATTCTATCTGAC',
'TTAGTAAGAGTGTCTCACTAT']
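The same join can be checked on a small fixed input, which makes the result reproducible. The sketch below also shows one possible alternative that reinterprets each row as a single string with `view`; the names `letters`, `joined` and `viewed` are illustrative, and the `view` variant assumes a C-contiguous array of 1-character strings.

```python
import numpy as np

# Small fixed input so the result is reproducible.
letters = np.array([list("ACGT"), list("TTGA"), list("CCAG")])

# Join the characters of each row into one Python string.
joined = ["".join(row) for row in letters]
print(joined)  # ['ACGT', 'TTGA', 'CCAG']

# Alternative: reinterpret each row of 1-character strings as one
# string whose length equals the number of columns.
width = letters.shape[1]
viewed = letters.view(f"U{width}").ravel()
print(viewed.tolist())  # ['ACGT', 'TTGA', 'CCAG']
```

The list comprehension returns a plain Python list, while the `view` variant stays a NumPy array without copying the data.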

Numpy nonzero function not working/minimum of a numpy array without zeros

I'm trying to get maximum and minimum values out of a numpy array. In order to have a good overview of the array, I used pandas. Based on this resulting array, I wanted to get a column of maximum and minimum values.
import pandas as pd
import numpy as np
TEST = np.load('NPY TEST.npy')
input_array = pd.DataFrame(TEST)
print(input_array)
inputs_max = np.max(input_array, axis=0)
print(inputs_max)
inputs_min = np.min(input_array[np.nonzero(input_array)], axis=0)
print(inputs_min)
The problem is that if I use
np.min(input_array, axis=0)
the resulting column only consists of zeros, although there is not one 0 in my numpy array. So I tried to use the np.nonzero command, which led to many errors:
AttributeError: 'DataFrame' object has no attribute 'nonzero'
Could anyone help me? Thanks in advance.
I can only guess what your data looks like, but I'll give it a try. Masking the zeros turns them into NaN, which min skips by default:
inputs_min = input_array[input_array != 0.].min(axis=0)
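A small sketch of how that masking behaves, using made-up data since the contents of the .npy file are unknown. The name `masked` is introduced here only to show the intermediate step.

```python
import pandas as pd

# Made-up data with one zero in every column.
input_array = pd.DataFrame([[0.0, 2.0, 3.0],
                            [4.0, 0.0, 1.0],
                            [5.0, 6.0, 0.0]])

# Cells that fail the condition are replaced with NaN.
masked = input_array[input_array != 0.0]

# DataFrame.min skips NaN by default, so zeros no longer count.
inputs_min = masked.min(axis=0)
print(inputs_min.tolist())  # [4.0, 2.0, 1.0]
```

The original error came from passing a DataFrame to `np.nonzero`; filtering with a boolean condition on the DataFrame itself avoids that call entirely.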

Convert two numpy array to dataframe

I want to convert two NumPy arrays into one DataFrame containing two columns.
The first NumPy array 'images' has shape (1020, 1024).
The second NumPy array 'label' has shape (1020,).
My core code is:
images=np.array(images)
label=np.array(label)
l=np.array([images,label])
dataset=pd.DataFrame(l)
But this raises an error:
ValueError: could not broadcast input array from shape (1020,1024) into shape (1020)
What should I do to convert these two NumPy arrays into two columns in one DataFrame?
You can't simply stack them if you want them as separate columns, because a 2D array can't be inserted into a single DataFrame column. You need to convert it to something else first, for example a list of 1D arrays.
So something like this would work:
import pandas as pd
import numpy as np
images = np.array(images)
label = np.array(label)
dataset = pd.DataFrame({'label': label, 'images': list(images)}, columns=['label', 'images'])
This will create a DataFrame with 1020 rows and 2 columns, where each item in the second column is a 1D array of length 1024.
Coming from engineering, I like the visual side of creating matrices. Note that images has to be transposed so that vstack can place label on top of it:
matrix_aux = np.vstack([label, images.T])
matrix = np.transpose(matrix_aux)
df_lab_img = pd.DataFrame(matrix)
This takes a little more code but leaves you with the NumPy array too.
You can also use hstack, which gives one column per image value plus a final label column:
import pandas as pd
import numpy as np
dataset = pd.DataFrame(np.hstack((images, label.reshape(-1, 1))))
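The difference between the two-column layout and the flat hstack layout is easier to see on small stand-in data. In this sketch the shapes (4 images of 3 values each) and the names `two_cols` and `flat` are assumptions for illustration, not the asker's real data.

```python
import numpy as np
import pandas as pd

# Stand-ins for the real data: 4 "images" of 3 values each, 4 labels.
images = np.arange(12).reshape(4, 3)
label = np.array([0, 1, 0, 1])

# Two columns: each cell of 'images' holds one 1D array.
two_cols = pd.DataFrame({"label": label, "images": list(images)})
print(two_cols.shape)  # (4, 2)

# Flat layout: one column per image value, label in the last column.
flat = pd.DataFrame(np.hstack((images, label.reshape(-1, 1))))
print(flat.shape)  # (4, 4)
```

The two-column form matches what the question asked for, while the flat form is usually more convenient when the values are fed to numerical code.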

How to find positions in one NumPy array and get the values at those positions from a second NumPy array?

I have two raster files which I have converted into NumPy arrays (arcpy.RasterToNumpyArray) to work with the values in the raster cells with Python.
One of the rasters has two values, True and False. The other raster has values in the range 0 to 1000. Both rasters have exactly the same extent, so both NumPy arrays are laid out identically (columns and rows), except for the values.
My aim is to identify all positions in NumPy array A which have the value True. These positions should then be used to get the values at the same positions from NumPy array B.
Do you have any idea how I can implement this?
If I understand your description right, you should just be able to do B[A].
You can use the array with True and False values to simply index into the other. Here's a sample:
import numpy as np
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = np.array([[True,False,False],[False,True,False],[False,False,True]])
a[b]  # gives array([1, 5, 9])
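Using the naming from the question (A is the True/False raster, B holds the values), a sketch might look like the following. The small arrays are made up, and `np.argwhere` and `np.where` are shown as optional extras in case the positions themselves or the original raster shape are needed.

```python
import numpy as np

A = np.array([[True, False, False],
              [False, True, True]])
B = np.array([[10, 20, 30],
              [40, 50, 60]])

# Values of B at the cells where A is True (returned as a 1-D array).
values = B[A]
print(values)  # [10 50 60]

# (row, column) pairs of the True cells.
positions = np.argwhere(A)
print(positions.tolist())  # [[0, 0], [1, 1], [1, 2]]

# Same shape as the rasters: keep B where A is True, 0 elsewhere.
kept = np.where(A, B, 0)
print(kept.tolist())  # [[10, 0, 0], [0, 50, 60]]
```

This assumes A has a boolean dtype; if the raster was read as 0/1 integers, it would need `A.astype(bool)` before being used as a mask.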
