I am new to Python. I have a NumPy array, and I am trying to insert an element at each position listed in index:
for x in index:
    output_result[x:x] = [300]
But the element is not getting added; index holds the positions where I want to insert it. Can anyone help me with this?
Are you maybe looking for something like this?
import numpy as np
a = np.zeros(10) # create numpy array with ten zeros
a = np.where(a == 0, 300, a) # substitute 300 where there are zeros in the array - I assume this is what you need
print(a) # print generated array
print(type(a)) # print data type to show a numpy array was generated
Or do you want to "append" a new element?
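If appending is what's needed, a minimal sketch with np.append (note it, too, returns a new array rather than growing the existing one):
import numpy as np

a = np.arange(4)       # array([0, 1, 2, 3])
a = np.append(a, 300)  # returns a new, longer array
print(a)               # [  0   1   2   3 300]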
With a Python list, you can insert a value with:
In [104]: alist = [0,1,2,3]
In [105]: alist[1:1]=[300]
In [106]: alist
Out[106]: [0, 300, 1, 2, 3]
But this does not work with an ndarray: the array size is fixed. The best you can do is create a new array containing the original values plus the new one(s).
np.insert is a function that can do that. Since the operation is not particularly efficient, it's best to do a whole set of inserts at once, rather than do it iteratively.
In [108]: np.insert(np.arange(4),1,300)
Out[108]: array([ 0, 300, 1, 2, 3])
In [109]: np.insert(np.arange(4),[1,2],[300,400])
Out[109]: array([ 0, 300, 1, 400, 2, 3])
(Even with a list, iterative insertion can be tricky, since each insertion changes the size of the list. The insertion point has to take that into account (unless you iterate from the end).)
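For instance, a minimal sketch of that bookkeeping with a plain list, assuming the insertion positions in index are sorted ascending:
alist = [0, 1, 2, 3]
index = [1, 3]                       # positions in the *original* list
for offset, x in enumerate(index):   # each earlier insert shifts later positions by one
    alist[x + offset:x + offset] = [300]
print(alist)                         # [0, 300, 1, 2, 300, 3]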
Let's say we have a simple 1D ndarray. That is:
import numpy as np
a = np.array([1,2,3,4,5,6,7,8,9,10])
I want to get the first 3 and the last 2 values, so that the output would be [ 1 2 3 9 10].
I have already solved this by slicing and concatenating the slices, as follows:
b = a[:3]
c = a[-2:]
a = np.concatenate([b, c])
However, I would like to know if there is a more direct way to achieve this using slices, such as a[:2 and -2:], for instance. As an alternative, I already tried this:
a = a[np.r_[:2, -2:]]
but it does not seem to be working. It returns only the first 2 values, that is [1 2].
Thanks in advance!
A single slice of a NumPy array can only be one regular start:stop:step range, AFAIK, so you cannot combine two disjoint ranges in one slice. The np.r_[-2:] does not work because it does not know how big the array a is. You could do np.r_[:2, len(a)-2:len(a)], but this will still copy the data, since you are indexing with another array.
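For example, a quick check with the array from the question (using :3 for the desired first three values):
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(a[np.r_[:3, len(a) - 2:len(a)]])   # [ 1  2  3  9 10]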
If you want to avoid copying data or doing any concatenation operation you could use np.lib.stride_tricks.as_strided:
ds = a.dtype.itemsize  # size of one element in bytes
np.lib.stride_tricks.as_strided(a, shape=(2,2), strides=(ds * 8, ds)).ravel()  # the two "rows" start at elements 0 and 8
Output:
array([ 1, 2, 9, 10])
But since you want the first 3 and the last 2 values, the two groups are not the same length, so a single rectangular view cannot pick out exactly those elements. This is a bit trickier, but I suppose you could do:
np.lib.stride_tricks.as_strided(a, shape=(2,3), strides=(ds * 8, ds)).ravel()[:-1]
Output:
array([ 1, 2, 3, 9, 10])
Although, this is a potentially dangerous operation, because the last element of the view is read from outside the allocated memory.
On reflection, I cannot find a way to do this operation without copying the data somehow. The ravel() in the code snippets above is forced to make a copy of the data. If you can live with the shapes (2,2) or (2,3) it might work in some cases, but you should only read from such a strided view, and this should be enforced by setting the keyword writeable=False.
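A minimal sketch of that read-only variant, keeping the (2, 2) shape so no copy is needed:
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
ds = a.dtype.itemsize
view = np.lib.stride_tricks.as_strided(a, shape=(2, 2), strides=(ds * 8, ds), writeable=False)
print(view)        # [[ 1  2]
                   #  [ 9 10]]
# view[0, 0] = 0   # would raise ValueError: assignment destination is read-only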
You could try to access the elements with a list of indices.
import numpy as np
a = np.array([1,2,3,4,5,6,7,8,9,10])
b = a[[0,1,2,8,9]] # b should now be array([ 1, 2, 3, 9, 10])
Obviously, if your array is too long, you would not want to type out all the indices.
Thus, you could build the index list programmatically, for example with list comprehensions. Something like this:
index_list = [i for i in range(3)] + [i for i in range(8, 10)]
b = a[index_list] # b should now be array([ 1, 2, 3, 9, 10])
Therefore, as long as you know where your desired elements are, you can access them individually.
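A small variation on that, using negative indices so the last two positions do not depend on the array length:
index_list = list(range(3)) + [-2, -1]
b = a[index_list]  # array([ 1,  2,  3,  9, 10])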
I want to make a function which takes an n-dimensional array, an axis, and a column index, and returns the (n-1)-dimensional array obtained by removing all the other columns along that axis.
Here is the code I am using now:
a = np.arange(6).reshape((2, 3)) # the n-dimensional array
axisApplied = 1
colToKeep = 0
colsToDelete = np.delete(np.arange(a.shape[axisApplied]), colToKeep)
a = np.squeeze(np.delete(a, colsToDelete, axisApplied), axis=axisApplied)
print(a)
# [0, 3]
Note that I have to manually compute the complement of the column index (all the other indices along that axis) in order to use np.delete(), and I am wondering whether there is a more convenient way to achieve my goal, e.g. specifying which column to keep directly.
Thank you for reading; any suggestions are welcome.
In [1]: arr = np.arange(6).reshape(2,3)
In [2]: arr
Out[2]:
array([[0, 1, 2],
[3, 4, 5]])
Simple indexing:
In [3]: arr[:,0]
Out[3]: array([0, 3])
Or if you need to use the general axis parameter, try np.take:
In [4]: np.take(arr,0,axis=1)
Out[4]: array([0, 3])
Picking one element, or a list of elements, along an axis is a lot easier than deleting some. Look at the code for np.delete.
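For instance, a quick sketch of the difference between taking a scalar index (which drops the axis, as above) and a list of indices (which keeps it):
np.take(arr, 0, axis=1)       # array([0, 3])            -> axis dropped
np.take(arr, [0], axis=1)     # array([[0], [3]])        -> axis kept
np.take(arr, [0, 2], axis=1)  # array([[0, 2], [3, 5]])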
I'm trying to create a function that will calculate the lattice distance (number of horizontal and vertical steps) between elements in a multi-dimensional numpy array. For this I need to retrieve the actual numbers from the indexes of each element as I iterate through the array. I want to store those values as numbers that I can run through a distance formula.
For the example array A
A=np.array([[1,2,3],[4,5,6],[7,8,9]])
I'd like to create a loop that iterates through each element and for the first element 1 it would retrieve a=0, b=0 since 1 is at A[0,0], then a=0, b=1 for element 2 as it is located at A[0,1], and so on...
My envisioned output is two numbers (corresponding to the two index values for that element) for each element in the array. So in the example above, it would be the two values that I am assigning to be a and b. I only will need to retrieve these two numbers within the loop (rather than save separately as another data object).
Any thoughts on how to do this would be greatly appreciated!
As I've become more familiar with the NumPy and pandas ecosystem, it's become clearer to me that iteration is usually outright wrong due to how slow it is in comparison, and writing a vectorized operation is best whenever possible. Though the style is not as obvious/Pythonic at first, I've (anecdotally) gained ridiculous speedups with vectorized operations; more than 1000x in one case, by swapping out row iteration with .apply(lambda) for a vectorized alternative.
MSeifert's answer below provides this and will be significantly more performant on a dataset of any real size.
There is also a more general answer by cs95 covering and comparing alternatives to iteration in pandas.
Original Answer
You can iterate through the values in your array with numpy.ndenumerate, which yields the index of each element along with its value.
Using the example from the documentation:
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
for index, values in np.ndenumerate(A):
    print(index, values)  # operate here
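To tie this back to the lattice-distance use case in the question, a minimal sketch (the reference point (2, 2) is just a hypothetical example):
ref_a, ref_b = 2, 2                          # hypothetical reference cell
for (a, b), value in np.ndenumerate(A):
    dist = abs(a - ref_a) + abs(b - ref_b)   # horizontal + vertical steps
    print(value, (a, b), dist)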
You can do it using np.ndenumerate but generally you don't need to iterate over an array.
You can simply create a meshgrid (or open grid) to get all indices at once and you can then process them (vectorized) much faster.
For example
>>> x, y = np.mgrid[slice(A.shape[0]), slice(A.shape[1])]
>>> x
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2]])
>>> y
array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
and these can be processed like any other array. So if your function that needs the indices can be vectorized you shouldn't do the manual loop!
For example to calculate the lattice distance for each point to a point say (2, 3):
>>> abs(x - 2) + abs(y - 3)
array([[5, 4, 3],
[4, 3, 2],
[3, 2, 1]])
For distances an ogrid would be faster. Just replace np.mgrid with np.ogrid:
>>> x, y = np.ogrid[slice(A.shape[0]), slice(A.shape[1])]
>>> np.hypot(x - 2, y - 3) # Euclidean distance this time! :-)
array([[ 3.60555128, 2.82842712, 2.23606798],
[ 3.16227766, 2.23606798, 1.41421356],
[ 3. , 2. , 1. ]])
Another possible solution:
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
for _, val in np.ndenumerate(A):
    ind = np.argwhere(A == val)
    print(val, ind)
In this case you will obtain an array of all the indices at which a value appears, which is useful if a value occurs in the array more than once.
I'm in need of a more customized way to extract given elements from a numpy array than the general indexing seems to allow me. In particular, I want to get a number of arbitrary, predefined elements, then every n-th, starting at a given point.
Say, e.g., I want the second (as in index number 2) and fourth element of an array, and then, every third element, beginning from the sixth one. So far, I'm doing:
newArray = np.concatenate((myArray[[2, 4]], myArray[6::3]))
Is there a more convenient way to achieve this?
It's effectively identical to what you're doing, but you might find it a bit more convenient to do:
new_array = my_array[np.r_[2, 4, 6:len(my_array):3]]
np.r_ is basically concatenation + arange-like slicing.
For example:
In [1]: import numpy as np
In [2]: np.r_[np.arange(5), np.arange(1, 4)]
Out[2]: array([0, 1, 2, 3, 4, 1, 2, 3])
In [3]: np.r_[1, 2, :5]
Out[3]: array([1, 2, 0, 1, 2, 3, 4])
In [4]: np.r_[:5]
Out[4]: array([0, 1, 2, 3, 4])
The downside to this approach is that you're building up a (potentially very large) additional indexing array. In either case, you're going to wind up creating a copy, but if my_array is very large, your original approach is more efficient.
np.r_ is a bit unreadable (meant for interactive use), but it can be a very handy way of building up arbitrary indexing arrays.
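For instance, a small sketch of the suggested indexing on a hypothetical my_array of 20 elements:
import numpy as np

my_array = np.arange(20)
new_array = my_array[np.r_[2, 4, 6:len(my_array):3]]
print(new_array)  # [ 2  4  6  9 12 15 18]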
Consider the following NumPy array:
a = np.array([[1, 4], [2, 1], [3, 10], [4, 8]])
This gives an array that looks like the following:
array([[ 1, 4],
[ 2, 1],
[ 3, 10],
[ 4, 8]])
What I'm trying to do is find the minimum value of the second column (which in this case is 1), and then report the other value of that pair (in this case 2). I've tried using something like argmin, but that gets tripped up by the 1 in the first column.
Is there a way to do this easily? I've also considered sorting the array, but I can't seem to get that to work in a way that keeps the pairs together. The data is being generated by a loop like the following, so if there's an easier way to do this that isn't a numpy array, I'd take that as an answer too:
results = np.zeros((100, 2))
# Loop over search range, change kappa each time
for i in range(100):
    results[i, 0] = function1(x)
    results[i, 1] = function2(y)
How about
a[np.argmin(a[:, 1]), 0]
Break-down
a. Grab the second column
>>> a[:, 1]
array([ 4, 1, 10, 8])
b. Get the index of the minimum element in the second column
>>> np.argmin(a[:, 1])
1
c. Index a with that to get the corresponding row
>>> a[np.argmin(a[:, 1])]
array([2, 1])
d. And take the first element
>>> a[np.argmin(a[:, 1]), 0]
2
Using np.argmin is probably the best way to tackle this. To do it in pure python, you could use:
min(tuple(r[::-1]) for r in a)[::-1]
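This works because Python compares tuples element by element, so reversing each row makes min sort by the second column first; the final [::-1] flips the winning pair back. A quick check with the array from the question:
>>> min(tuple(r[::-1]) for r in a)[::-1]
(2, 1)
>>> min(tuple(r[::-1]) for r in a)[::-1][0]   # just the first value of that pair
2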