delete row from numpy array based on partial string in python

I have a very large numpy array that looks similar to my example. The partial string that I'm trying to detect is "F_H" and it's usually on column 0 of my array.
a = np.array([['#define', 'bad_stringF_H', 'some_value'],
['#define', 'good_string', 'some_value2'],
['#define', 'good_string_2', 'some_value3'],
['#define', 'bad_string2F_H', 'some_value4']])
I just want to delete the whole row if that partial string is detected in it, so the desired output would look like this:
[['#define' 'good_string' 'some_value2']
['#define' 'good_string_2' 'some_value3']]

You can use NumPy's Boolean indexing to create a new array that only includes the rows that do not contain the string 'F_H':
import numpy as np
a = np.array([['#define', 'bad_stringF_H', 'some_value'],
['#define', 'good_string', 'some_value2'],
['#define', 'good_string_2', 'some_value3'],
['#define', 'bad_string2F_H', 'some_value4']])
mask = np.array(['F_H' not in x[1] for x in a])  # True for the rows to keep
print(mask)
new_a = a[mask]
print(new_a)
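If the partial string can show up in any column rather than a fixed one, a vectorized variant may be more convenient. The following is a minimal sketch, assuming the array holds only strings; it relies on np.char.find, which returns -1 for elements that do not contain the substring. The names found and keep are illustrative only.

```python
import numpy as np

a = np.array([['#define', 'bad_stringF_H', 'some_value'],
              ['#define', 'good_string', 'some_value2'],
              ['#define', 'good_string_2', 'some_value3'],
              ['#define', 'bad_string2F_H', 'some_value4']])

# Element-wise substring search: index of 'F_H' in each element, or -1 if absent
found = np.char.find(a, 'F_H') >= 0

# Keep only the rows in which no column contains the substring
keep = ~found.any(axis=1)
new_a = a[keep]
print(new_a)
```

Because the search covers every column, this also drops rows where 'F_H' appears somewhere other than column 1.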

Related

Numpy array apply a function only to some elements

I have a numpy array let's say that has a shape (10,10) for example.
Now I want to apply np.exp() to this array, but only to the elements that satisfy a condition. For example, I want to apply np.exp to all the elements that are not 0 or 1. Is there a way to do that without a for loop that iterates over each element of the array?

This is achievable with basic NumPy operations. Here is one way to do it:
import numpy as np
A = np.random.randint(0,5,size=(10,10)).astype(float) # data
goods = (A!=0) & (A!=1) # 10 x 10 boolean array
A[goods] = np.exp(A[goods]) # boolean indexing
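An alternative that avoids the two fancy-indexing steps is to pass the mask to the ufunc itself. This is a sketch under the assumption that in-place modification of A is acceptable; the fixed seed and the name original are only there to make the example repeatable and checkable.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable
A = rng.integers(0, 5, size=(10, 10)).astype(float)
original = A.copy()  # kept only to compare against afterwards

goods = (A != 0) & (A != 1)  # 10 x 10 boolean array

# Apply exp only where the mask is True and write the result back into A;
# elements where the mask is False keep their existing values
np.exp(A, out=A, where=goods)
```

Passing out=A matters here: without an explicit out array, the positions excluded by where would be left uninitialized in the result.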

Python Numpy: replace values in one array with corresponding values in another array

I am using Python Numpy arrays (rasters converted to 2D arrays, specifically) and what I want to do is take one array that has arbitrary dummy values of -999 representing "no data" and I want to replace those values with the corresponding "real" values from a different array of the same size and shape in the correct location. I couldn't find a very similar question to this but note that I am a novice with Python and Numpy.
But what I want to do is this:
array_a =
([[0.564,-999,-999],
[0.234,-999,0.898],
[-999,0.124,0.687],
[0.478,0.786,-999]])
array_b =
([[0.324,0.254,0.204],
[0.469,0.381,0.292],
[0.550,0.453,0.349],
[0.605,0.582,0.551]])
use the values of array_b to fill in the -999 values in array_a and create a new array:
new_array_a =
([[0.564,0.254,0.204],
[0.234,0.381,0.898],
[0.550,0.124,0.687],
[0.478,0.786,0.551]])
I don't really want to change the shape or dimensions of the array because I am going to convert back out into a raster afterwards so I need the correct values in the correct locations.
What is the best way to do this?

Just do boolean masking:
mask = (array_a == -999)
new_array = np.copy(array_a)
new_array[mask] = array_b[mask]
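The same replacement can also be written as a single expression with np.where, which builds a new array and leaves array_a untouched. A minimal sketch using the data from the question; NODATA is a hypothetical name for the -999 sentinel.

```python
import numpy as np

array_a = np.array([[0.564, -999, -999],
                    [0.234, -999, 0.898],
                    [-999, 0.124, 0.687],
                    [0.478, 0.786, -999]])
array_b = np.array([[0.324, 0.254, 0.204],
                    [0.469, 0.381, 0.292],
                    [0.550, 0.453, 0.349],
                    [0.605, 0.582, 0.551]])

NODATA = -999  # hypothetical name for the "no data" sentinel

# Take the value from array_b where array_a is the sentinel, else keep array_a
new_array_a = np.where(array_a == NODATA, array_b, array_a)
print(new_array_a)
```

The result has the same shape as the inputs, so it can be written back out as a raster without further reshaping.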

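All you need to do is:
array_a[array_a==-999]=array_b[array_a==-999]
The boolean condition selects the elements of array_a that equal -999, and those elements are replaced by the values of array_b at the same positions. Note that this modifies array_a in place.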
import numpy as np
array_a =np.array([[0.564,-999,-999],
[0.234,-999,0.898],
[-999,0.124,0.687],
[0.478,0.786,-999]])
array_b =np.array([[0.324,0.254,0.204],
[0.469,0.381,0.292],
[0.550,0.453,0.349],
[0.605,0.582,0.551]])
array_a[array_a==-999]=array_b[array_a==-999]
Running this snippet updates array_a in place.

How to index numpy array on subset of array of bools that is smaller than numpy array's dimensions?

My question is inspired by another one: Intersection of 2d and 1d Numpy array. I am looking for a succinct solution that does not use in1d.
The setup is this. I have a numpy array of bools telling me which values of numpy array A I should set equal to 0, called listed_array. However, I want to ignore the information in the first 3 columns of listed_array and only set A to zero as indicated in the other columns of listed_array.
I know the following is incorrect:
A[listed_array[:, 3:]] = 0
I also know I can pad this subset of listed_array with a call to hstack, and this will yield correct output, but is there something more succinct?

If I understand the question, this should do it:
A[:, 3:][listed_array[:, 3:]] = 0
which is a concise version of
mask3 = listed_array[:, 3:]
A3 = A[:, 3:] # This slice is a *view* of A, so changing A3 changes A.
A3[mask3] = 0
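To see that the assignment really writes through the slice into A, here is a small sketch with made-up data. The 3 x 5 shape and the "even values" mask are assumptions chosen only for illustration.

```python
import numpy as np

A = np.arange(1, 16).reshape(3, 5)  # made-up data: values 1..15
listed_array = (A % 2 == 0)         # made-up mask: flag the even values

# A[:, 3:] is a basic slice, hence a view; assigning through it changes A.
# Only the mask columns from index 3 onward are consulted.
A[:, 3:][listed_array[:, 3:]] = 0
print(A)
```

The even values in the first three columns (2, 6, 8, 12) survive, while the flagged values in the last two columns (4, 10, 14) are set to 0.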

Use specific columns and lines numpy array

I have a matrix where the first column and the first row are strings and the rest of it is floats:
[["City","Score1","Score2","Score3"],
["Berkley",23,432,321],
["Ohio",3,432,54],
["Columbia",123,432,53]]
I just need to make another matrix to store the floats.
It would look like this:
[[23,432,321],
[3,432,54],
[123,432,53]]

Using numpy:
import numpy as np
arr = np.array([["City","Score1","Score2","Score3"],
["Berkley",23,432,321],
["Ohio",3,432,54],
["Columbia",123,432,53]])
new_arr = arr[1:, 1:].astype(float)
NOTE: In your example those are ints not floats, but I've still used floats here
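It may help to see why the astype call is needed: when strings and numbers are mixed, NumPy stores every element as a string. The sketch below checks this and also keeps the labels; the names header, cities and scores are hypothetical.

```python
import numpy as np

arr = np.array([["City", "Score1", "Score2", "Score3"],
                ["Berkley", 23, 432, 321],
                ["Ohio", 3, 432, 54],
                ["Columbia", 123, 432, 53]])

# Mixing strings and numbers yields a Unicode string array
print(arr.dtype.kind)  # 'U'

header = arr[0, 1:]                 # column labels
cities = arr[1:, 0]                 # row labels
scores = arr[1:, 1:].astype(float)  # numeric block, converted from strings
print(scores)
```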

Copying row element in a numpy array

I have an array X of type <class 'scipy.sparse.csr.csr_matrix'> with shape (44, 4095).
I would now like to create a new numpy array, say X_train = np.empty([44, 4095]), and copy the rows into it in a different order. Say I want the 5th row of X in the 1st row of X_train.
How do I do this (copying an entire row into a new numpy array), similar to Matlab?

Define the new row order as a list of indices, then define X_train using integer indexing:
row_order = [4, ...]
X_train = X[row_order]
Note that unlike Matlab, Python uses 0-based indexing, so the 5th row has index 4.
Also note that integer indexing (due to its ability to select values in arbitrary order) returns a copy of the original NumPy array.
This works equally well for sparse matrices and NumPy arrays.
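A small dense example of the reordering, as a sketch only: the 4 x 3 array and the particular row_order are made up for illustration, and a plain NumPy array stands in for the sparse matrix.

```python
import numpy as np

X = np.arange(12).reshape(4, 3)  # made-up dense stand-in for the data
row_order = [3, 0, 2, 1]         # hypothetical new order (0-based indices)

# Integer (fancy) indexing picks the rows in the given order and returns a copy
X_train = X[row_order]

# Changing the copy does not affect the original
X_train[0, 0] = -1
print(X)
print(X_train)
```

Row 0 of X_train started out as row 3 of X, and X itself is unchanged after the assignment.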

Python generally works by reference, which is something you should keep in mind. What you need to do is make a copy and then swap. I have written a demo function which swaps rows.
import numpy as np # import numpy
# Function which swaps rowA with rowB in place
def swapRows(myArray, rowA, rowB):
    temp = myArray[rowA,:].copy() # copy rowA, because a plain slice is only a view
    myArray[rowA,:] = myArray[rowB,:].copy()
    myArray[rowB,:] = temp
a = np.arange(30) # generate demo data
a = a.reshape(6,5) # reshape the data into a 6x5 matrix
print(a) # print the matrix before the swap
swapRows(a,0,1) # swap the rows
print(a) # print the matrix after the swap
To answer your question, one solution would be to use
X_train = np.empty([44, 4095])
X_train[0,:] = X[4,:].toarray().ravel() # store the 5th row of X in the 1st row; X is sparse, so convert the row to a dense 1-D array first
unutbu's answer seems to be the most logical.
