I'm trying to do a sorting algorithm and I need to do something like this:
arr = numpy.array([2,4,5,6])
#some function here or something
array([5,2,4,6])
#element "5" moved from position 3 to 1, and all of the others moved up one position
What I mean is I want to change an element's position (index) and move all the other elements up one position. Is this possible?
You could use numpy.roll() with a subset assignment:
arr = numpy.array([2,4,5,6])
arr[:3] = numpy.roll(arr[:3],1)
print(arr)
[5 2 4 6]
If you know the position/index of the element to be shifted, then you could do:
indx = 2
np.r_[arr[indx], np.delete(arr, indx)]
Out[]: array([5, 2, 4, 6])
You could do this in-place without the need of creating first an intermediate array of n-1 elements and then creating another final array by concatenation. Instead, you could try this:
idx = 2
tmp = arr[idx]
arr[1:idx+1] = arr[0:idx]
arr[0] = tmp
So there are more than one ways of doing this, and the choice depends upon your algorithm's constraints.
I have a NumPy array with size [49152, 3], I want to append another array with same size [49152, 3] as a new row to become [2, 49152, 3]. May I know the way to do this?
All of these are RGB image pixel values from 192x256 resolution images.
You can use the insert function of numpy in Python.
x = np.array([49152, 3])
X = np.insert(x, 0, 2)
x is the original array to append to
0 is the index where you need to append
2 is what needs to be appended.
You most likely need to use numpy.stack:
>>> a = np.random.rand(49152, 3)
>>> b = np.random.rand(49152, 3)
>>> np.stack([a, b]).shape
(2, 49152, 3)
You can provide a list (or tuple) of arrays of same shape, and np.stack will stack them on a newly created axis (by default on the first: axis=0).
There is a part of the following code that I don't quite understand.
Here is the code:
import numpy as np
medalNames = np.array(['none', 'bronze', 'silver', 'gold'])
ageGroupCategories = np.array(['B','P','G','T'])
allLowerThresholds = np.array([[-1,0,5,10], [0,5,10,15], [0,11,14,17], [0,15,17,19]])
ageGroupIndex = np.where(ageGroup[0] == ageGroupCategories)[0][0]
In the last line, what does the [0][0] do, why doesn't the code work without it?
A few things:
Use embedded Code boxes
Your code isn't working at all because the variable ageGroup doesn't exist
Now to your question:
since it is an array the [0][0] calls on the first row and first column of the result of the array np.where().
Your question is general and related to the numpy.where function.
Let's take a simple example as follows:
A=np.array([[3,2,1],[4,5,1]])
# array([[3, 2, 1],
# [4, 5, 1]])
print(np.where(A==1))
# (array([0, 1]), array([2, 2]))
As you can see the np.where function returns a tuple. The first element (it's a numpy array) of the tuple is the row/line index, and the second element (it's again a numpy array), is the column index.
np.where(A==1)[0] # this is the first element of the tuple thus,
# the numpy array containing all the row/line
# indices where the value is = 1.
#array([0, 1])
The above tells you that there is a value = 1 in the first (0) and second (1) row of the matrix A.
Next:
np.where(A==1)[0][0]
0
returns the index of the first line that contains a value = 1. 0 here is the first line of matrix A
I have an array y_filtered that contains some masked values. I want to replace these values by some value I calculate based on their neighbouring values. I can get the indices of the masked values by using masked_slices = ma.clump_masked(y_filtered). This returns a list of slices, e.g. [slice(194, 196, None)].
I can easily get the values from my masked array, by using y_filtered[masked_slices], and even loop over them. However, I need to access the index of the values as well, so i can calculate its new value based on its neighbours. Enumerate (logically) returns 0, 1, etc. instead of the indices I need.
Here's the solution I came up with.
# get indices of masked data
masked_slices = ma.clump_masked(y_filtered)
y_enum = [(i, y_i) for i, y_i in zip(range(len(y_filtered)), y_filtered)]
for sl in masked_slices:
for i, y_i in y_enum[sl]:
# simplified example calculation
y_filtered[i] = np.average(y_filtered[i-2:i+2])
It is very ugly method i.m.o. and I think there has to be a better way to do this. Any suggestions?
Thanks!
EDIT:
I figured out a better way to achieve what I think you want to do. This code picks every window of 5 elements and compute its (masked) average, then uses those values to fill the gaps in the original array. If some index does not have any unmasked value close enough it will just leave it as masked:
import numpy as np
from numpy.lib.stride_tricks import as_strided
SMOOTH_MARGIN = 2
x = np.ma.array(data=[1, 2, 3, 4, 5, 6, 8, 9, 10],
mask=[0, 1, 0, 0, 1, 1, 1, 1, 0])
print(x)
# [1 -- 3 4 -- -- -- -- 10]
pad_data = np.pad(x.data, (SMOOTH_MARGIN, SMOOTH_MARGIN), mode='constant')
pad_mask = np.pad(x.mask, (SMOOTH_MARGIN, SMOOTH_MARGIN), mode='constant',
constant_values=True)
k = 2 * SMOOTH_MARGIN + 1
isize = x.dtype.itemsize
msize = x.mask.dtype.itemsize
x_pad = np.ma.array(
data=as_strided(pad_data, (len(x), k), (isize, isize), writeable=False),
mask=as_strided(pad_mask, (len(x), k), (msize, msize), writeable=False))
x_avg = np.ma.average(x_pad, axis=1).astype(x_pad.dtype)
fill_mask = ~x_avg.mask & x.mask
result = x.copy()
result[fill_mask] = x_avg[fill_mask]
print(result)
# [1 2 3 4 3 4 10 10 10]
(note all the values are integers here because x was originally of integer type)
The original posted code has a few errors, firstly it both reads and writes values from y_filtered in the loop, so the results of later indices are affected by the previous iterations, this could be fixed with a copy of the original y_filtered. Second, [i-2:i+2] should probably be [max(i-2, 0):i+3], in order to have a symmetric window starting at zero or later always.
You could do this:
from itertools import chain
# get indices of masked data
masked_slices = ma.clump_masked(y_filtered)
for idx in chain.from_iterable(range(s.start, s.stop) for s in masked_slices):
y_filtered[idx] = np.average(y_filtered[max(idx - 2, 0):idx + 3])
Is there a numpy method which is equivalent to the builtin pop for python lists?
Popping obviously doesn't work on numpy arrays, and I want to avoid a list conversion.
There is no pop method for NumPy arrays, but you could just use basic slicing (which would be efficient since it returns a view, not a copy):
In [104]: y = np.arange(5); y
Out[105]: array([0, 1, 2, 3, 4])
In [106]: last, y = y[-1], y[:-1]
In [107]: last, y
Out[107]: (4, array([0, 1, 2, 3]))
If there were a pop method it would return the last value in y and modify y.
Above,
last, y = y[-1], y[:-1]
assigns the last value to the variable last and modifies y.
Here is one example using numpy.delete():
import numpy as np
arr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(arr)
# array([[ 1, 2, 3, 4],
# [ 5, 6, 7, 8],
# [ 9, 10, 11, 12]])
arr = np.delete(arr, 1, 0)
print(arr)
# array([[ 1, 2, 3, 4],
# [ 9, 10, 11, 12]])
Pop doesn't exist for NumPy arrays, but you can use NumPy indexing in combination with array restructuring, for example hstack/vstack or numpy.delete(), to emulate popping.
Here are some example functions I can think of (which apparently don't work when the index is -1, but you can fix this with a simple conditional):
def poprow(my_array,pr):
""" row popping in numpy arrays
Input: my_array - NumPy array, pr: row index to pop out
Output: [new_array,popped_row] """
i = pr
pop = my_array[i]
new_array = np.vstack((my_array[:i],my_array[i+1:]))
return [new_array,pop]
def popcol(my_array,pc):
""" column popping in numpy arrays
Input: my_array: NumPy array, pc: column index to pop out
Output: [new_array,popped_col] """
i = pc
pop = my_array[:,i]
new_array = np.hstack((my_array[:,:i],my_array[:,i+1:]))
return [new_array,pop]
This returns the array without the popped row/column, as well as the popped row/column separately:
>>> A = np.array([[1,2,3],[4,5,6]])
>>> [A,poparow] = poprow(A,0)
>>> poparow
array([1, 2, 3])
>>> A = np.array([[1,2,3],[4,5,6]])
>>> [A,popacol] = popcol(A,2)
>>> popacol
array([3, 6])
There isn't any pop() method for numpy arrays unlike List, Here're some alternatives you can try out-
Using Basic Slicing
>>> x = np.array([1,2,3,4,5])
>>> x = x[:-1]; x
>>> [1,2,3,4]
Or, By Using delete()
Syntax - np.delete(arr, obj, axis=None)
arr: Input array
obj: Row or column number to delete
axis: Axis to delete
>>> x = np.array([1,2,3,4,5])
>>> x = x = np.delete(x, len(x)-1, 0)
>>> [1,2,3,4]
The important thing is that it takes one from the original array and deletes it.
If you don't m
ind the superficial implementation of a single method to complete the process, the following code will do what you want.
import numpy as np
a = np.arange(0, 3)
i = 0
selected, others = a[i], np.delete(a, i)
print(selected)
print(others)
# result:
# 0
# [1 2]
The most 'elegant' solution for retrieving and removing a random item in Numpy is this:
import numpy as np
import random
arr = np.array([1, 3, 5, 2, 8, 7])
element = random.choice(arr)
elementIndex = np.where(arr == element)[0][0]
arr = np.delete(arr, elementIndex)
For curious coders:
The np.where() method returns two lists. The first returns the row indexes of the matching elements and the second the column indexes. This is useful when searching for elements in a 2d array. In our case, the first element of the first returned list is interesting.
To add, If you want to implement pop for a row or column from a numpy 2D array you could do like:
col = arr[:, -1] # gets the last column
np.delete(arr, -1, 1) # deletes the last column
and for row:
row = arr[-1, :] # gets the last row
np.delete(arr, -1, 0) # deletes the last row
unutbu had a simple answer for this, but pop() can also take an index as a parameter. This is how you replicate it with numpy:
pop_index = 4
pop = y[pop_index]
y = np.concatenate([y[:pop_index],y[pop_index+1:]])
OK, since I didn't see a good answer that RETURNS the 1st element and REMOVES it from the original array, I wrote a simple (if kludgy) function utilizing global for a 1d array (modification required for multidims):
tmp_array_for_popfunc = 1d_array
def array_pop():
global tmp_array_for_popfunc
r = tmp_array_for_popfunc[0]
tmp_array_for_popfunc = np.delete(tmp_array_for_popfunc, 0)
return r
check it by using-
print(len(tmp_array_for_popfunc)) # confirm initial size of tmp_array_for_popfunc
print(array_pop()) #prints return value at tmp_array_for_popfunc[0]
print(len(tmp_array_for_popfunc)) # now size is 1 smaller
I made a function as follow, doing almost the same. This function has 2 arguments: np_array and index, and return the value of the given index of the array.
def np_pop(np_array, index=-1):
'''
Pop the "index" from np_array and return the value.
Default value for index is the last element.
'''
# add this to make sure 'numpy' is imported
import numpy as np
# read the value of the given array at the given index
value = np_array[index]
# remove value from array
np.delete(np_array, index, 0)
# return the value
return value
Remember you can add a condition to make sure the given index is exist in the array and return -1 if anything goes wrong.
Now you can use it like this:
import numpy as np
i = 2 # let's assume we want to pop index number 2
y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]) # assume 'y' is our numpy array
poped_val = np_pop(y, i) # value of the piped index