Assign large value to np.array - python

I am trying to replace every 0 in the array with 1.0/875713, but my code does not work. Is this due to a type size limitation, and how can I solve it?
import numpy as np
value = 1.0/875713
print(value)
arr = np.array([1,2,3,0,3,0,0,0,2,3,4,5])
arr[arr == 0] = value
print(arr)
1.14192663578e-06
[1 2 3 0 3 0 0 0 2 3 4 5]
Expected result:
[1 2 3 1.14192663578e-06 3 1.14192663578e-06 1.14192663578e-06 1.14192663578e-06 2 3 4 5]

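A NumPy array has a single dtype shared by all of its elements. You can learn more in the NumPy documentation on data types.
In your code, arr.dtype is an integer type (dtype('int32') here; on many platforms it is dtype('int64')), so assigning 1.0/875713 into the array truncates the value to 0.
To reach your goal, convert the array to a floating-point type first: run arr = arr.astype('float32') before running arr[arr == 0] = value, and you will get the expected output. Using 'float64' instead keeps the full precision of value.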
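As a minimal sketch of the point above, the following compares the assignment on an integer array with the same assignment after converting to a floating-point dtype. The names int_arr and float_arr are introduced here for illustration and are not in the original post, and float64 is used instead of float32 as a choice made here to keep the full precision of value:

```python
import numpy as np

value = 1.0 / 875713
arr = np.array([1, 2, 3, 0, 3, 0, 0, 0, 2, 3, 4, 5])

# Integer array: the float is cast to the array's integer dtype on assignment,
# so the tiny value is truncated to 0 and the array looks unchanged.
int_arr = arr.copy()
int_arr[int_arr == 0] = value
print(int_arr.dtype, int_arr)

# Float array: convert first, then assign (illustrative choice of float64).
float_arr = arr.astype(np.float64)
float_arr[float_arr == 0] = value
print(float_arr.dtype, float_arr)
```

Note that astype returns a new array, so the original arr keeps its integer dtype.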

Related

Odd behavior of np.argsort with Pandas

Here is np.argsort applied in four different ways.
import numpy as np
import pandas as pd
print(np.argsort([1,np.nan,3,np.nan, 4]))
print(np.argsort(pd.DataFrame([[1,np.nan,3,np.nan, 4]])).values)
print(np.argsort(pd.Series([1,np.nan,3,np.nan, 4]).values)) # same as first
print(np.argsort(pd.Series([1,np.nan,3,np.nan, 4])).values)
Output:
[0 2 4 1 3]
[[0 2 4 1 3]]
[0 2 4 1 3]
[ 0 -1 1 -1 2]
This is very unexpected behavior. The NumPy documentation does not mention it (understandably, since it does not cover pandas).
The pandas documentation for Series.argsort says:
Returns: Series[np.intp]
Positions of values within the sort order with -1 indicating nan values.
Why? What would be a place where we would want this kind of behavior?

Stacked numpy arrays

I am fairly new to Python. I want to stack a few 1-D arrays to create a 2-D array. The arrays I want to stack are returned from another function. A minimal reproducible example is shown below:
def fun1(i):  # this function returns the array
    return array  # this array is a function of i
h0 = np.empty(5, dtype=object)
arrays = [h0]
for i in range(4):
    arr = fun1(i)
    arrays.append(arr)
h = np.vstack(arrays)
print(h)
The desired output is of the form:
[[1 1 1 1 1]
[2 2 2 2 2]
[3 3 3 3 3]
[4 4 4 4 4]]
But I get:
[[None None None None None]
[1 1 1 1 1]
[2 2 2 2 2]
[3 3 3 3 3]
[4 4 4 4 4]]
I understand that I get the above output because an empty array of dtype=object has all elements set to None, but I have not been able to solve the problem. Any help would be highly appreciated.
You are getting None because you are creating an empty array with dtype object:
h0 = np.empty(5,dtype=object)
This line creates an array with None elements.
You can remove this line so that it works as you expect:
def fun1(i):  # this function returns the array
    return array  # this array is a function of i
arrays = []
for i in range(4):
    arr = fun1(i)
    arrays.append(arr)
h = np.vstack(arrays)
print(h)
if np.where(~a.any(axis=1))[0].size:
    pass  # do nothing
else:
    pass  # print as you like
Try this:
h = np.vstack([
    fun1(i)
    for i in range(4)
])
print(h)
This assumes that fun1() is some proprietary function returning arrays of the same length.
But if you want a 2-D array in which every entry of a row is equal and the value increases from one row to the next, like
[[1 1 1 1 1]
[2 2 2 2 2]
[3 3 3 3 3]
[4 4 4 4 4]]
you can also try a faster method:
x = np.arange(1,5).reshape([-1,1])
h = np.repeat(x, 5, axis= 1)
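To check that the two approaches above agree, here is a small sketch. The body of fun1 is a hypothetical stand-in (the real function is not shown in the question); it is assumed to return a length-5 array filled with i + 1, which matches the desired output:

```python
import numpy as np

def fun1(i):
    # Hypothetical stand-in for the asker's function: a length-5 row of i + 1.
    return np.full(5, i + 1)

# Approach 1: collect the rows in a list (no placeholder row) and stack them.
h_stacked = np.vstack([fun1(i) for i in range(4)])

# Approach 2: build a column vector 1..4 and repeat it 5 times along axis 1.
x = np.arange(1, 5).reshape(-1, 1)
h_repeated = np.repeat(x, 5, axis=1)

print(h_stacked)
print(np.array_equal(h_stacked, h_repeated))
```

The second approach only applies when the rows follow a simple pattern like this one; for arbitrary rows returned by a function, stacking a list is the general route.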

finding where 2d list overlaps by value

One numpy 2d-array looks like this:
[[0 1 2]
[1 5 0]]
Another NumPy 2-D array looks like this:
[[0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2]
[0 1 3 4 8 0 1 3 6 7 8 0 1 2 3 6 8]]
I want to get just the places where they "overlap", that is, the columns that appear in both arrays:
[[0 2]
[1 0]]
I want to do this without using a for loop.
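You can use np.intersect1d.
Below, n1 is the first array and n2 the second one.
Note that intersect1d flattens both inputs and returns the sorted unique values common to both, so the result is not exactly what you expected, but I believe it's correct.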
intersection = np.intersect1d(n1, n2)
print(intersection)
[0 1 2]
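If the goal is the columns common to both arrays, as the expected output in the question suggests, np.intersect1d on its own does not give that because it works on flattened values. One possible sketch, assuming each column is treated as a (row 0, row 1) pair, uses broadcasting instead of a Python loop. The names mask and common are introduced here for illustration:

```python
import numpy as np

n1 = np.array([[0, 1, 2],
               [1, 5, 0]])
n2 = np.array([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
               [0, 1, 3, 4, 8, 0, 1, 3, 6, 7, 8, 0, 1, 2, 3, 6, 8]])

# Compare every column of n1 with every column of n2.
# The comparison has shape (2, n1_cols, n2_cols); a pair of columns matches
# only if both rows are equal, hence all(axis=0).
pairwise_equal = (n1[:, :, np.newaxis] == n2[:, np.newaxis, :]).all(axis=0)

# A column of n1 is kept if it matches at least one column of n2.
mask = pairwise_equal.any(axis=1)
common = n1[:, mask]
print(common)
```

The intermediate comparison array grows with the product of the two column counts, so this is only a reasonable choice for moderately sized inputs.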

quickly calculate randomized 3D numpy array from 2D numpy array

I have a 2-dimensional array of integers, we'll call it "A".
I want to create a 3-dimensional array "B" of all 1s and 0s such that:
for any fixed (i,j), sum(B[i,j,:]) == A[i,j]; that is, B[i,j,:] contains A[i,j] 1s in it
the 1s are randomly placed in the 3rd dimension.
I know how I would do this using standard python indexing but this turns out to be very slow.
I am looking for a way to do this that takes advantage of the features that can make Numpy fast.
Here is how I would do it using standard indexing:
X, Y = A.shape  # A is the 2-D integer array; Z must be at least A.max()
B = np.zeros((X, Y, Z))
indexoptions = range(Z)
for i in range(X):
    for j in range(Y):
        replacedindices = np.random.choice(indexoptions, size=A[i, j], replace=False)
        B[i, j, replacedindices] = 1
Can someone please explain how I can do this in a faster way?
Edit: Here is an example "A":
A=np.array([[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4]])
in this case X=Y=5 and Z>=5
Essentially the same idea as @JohnZwinck and @DSM, but with a shuffle function for shuffling a given axis:
import numpy as np
def shuffle(a, axis=-1):
    """
    Shuffle `a` in-place along the given axis.
    Apply numpy.random.shuffle to the given axis of `a`.
    Each one-dimensional slice is shuffled independently.
    """
    b = a.swapaxes(axis, -1)
    # Shuffle `b` in-place along the last axis. `b` is a view of `a`,
    # so `a` is shuffled in place, too.
    shp = b.shape[:-1]
    for ndx in np.ndindex(shp):
        np.random.shuffle(b[ndx])
    return
def random_bits(a, n):
    b = (a[..., np.newaxis] > np.arange(n)).astype(int)
    shuffle(b)
    return b
if __name__ == "__main__":
    np.random.seed(12345)
    A = np.random.randint(0, 5, size=(3,4))
    Z = 6
    B = random_bits(A, Z)
    print("A:")
    print(A)
    print("B:")
    print(B)
Output:
A:
[[2 1 4 1]
[2 1 1 3]
[1 3 0 2]]
B:
[[[1 0 0 0 0 1]
[0 1 0 0 0 0]
[0 1 1 1 1 0]
[0 0 0 1 0 0]]
[[0 1 0 1 0 0]
[0 0 0 1 0 0]
[0 0 1 0 0 0]
[1 0 1 0 1 0]]
[[0 0 0 0 0 1]
[0 0 1 1 1 0]
[0 0 0 0 0 0]
[0 0 1 0 1 0]]]
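The shuffle above still loops over every (i, j) slice in Python. A different technique, not the one used in the answer, avoids that loop: sorting random keys with argsort gives an independent random permutation of 0..n-1 along the last axis, and comparing that permutation with A selects exactly A[i,j] positions. This is only a sketch; the function name random_bits_vectorized and its rng parameter are made up here, and it assumes NumPy 1.17 or later for np.random.default_rng:

```python
import numpy as np

def random_bits_vectorized(a, n, rng=None):
    """Return a 0/1 array of shape a.shape + (n,) whose last axis sums to `a`."""
    rng = np.random.default_rng() if rng is None else rng
    # argsort of random keys yields a random permutation of 0..n-1 per slice.
    perm = rng.random(a.shape + (n,)).argsort(axis=-1)
    # Exactly a[i, j] entries of each permutation are smaller than a[i, j].
    return (perm < a[..., np.newaxis]).astype(int)

A = np.array([[0, 1, 2, 3, 4]] * 5)
B = random_bits_vectorized(A, 6, rng=np.random.default_rng(0))
print(B.sum(axis=-1))  # should reproduce A
```

As with the original, every entry of A must be at most n for the counts to come out right.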

Numpy extract values on the diagonal from a matrix

My question is similar to (an expanded version of) this post: Numpy extract row, column and value from a matrix. In that post, I extracted the elements greater than zero from the input matrix; now I also want to extract the elements on the diagonal. So in this case,
import numpy as np
matrix = np.array([[0, 2, 4], [4, 0, 0], [5, 4, 0]])
indices = np.where(matrix > 0)
# np.where returns (row indices, column indices); the names below are
# swapped relative to that order, which the outputs shown here reflect.
index_col, index_row = indices
dist = matrix[indices]
we get:
index_row = [1 2 0 0 1]
index_col = [0 0 1 2 2]
dist = [2 4 4 5 4]
and now this is what I want,
index_row = [0 1 2 0 1 0 1 2]
index_col = [0 0 0 1 1 2 2 2]
dist = [0 2 4 4 0 5 4 0]
I tried changing the np.where line in the original code to this,
indices=np.where(matrix>0 & matrix.diagonal)
but got an error.
How can I get the result I want? Any suggestions would be appreciated, thanks!
You can use the following method:
1. Build a boolean mask of the elements greater than zero.
2. Set the diagonal of the mask to True.
3. Select the elements where the mask is True.
Here is the code:
m=np.array([[0,2,4],[4,0,0],[5,4,0]])
mask = m > 0
np.fill_diagonal(mask, True)
m[mask]
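The attempt in the question fails for two reasons: & binds more tightly than >, so 0 & matrix.diagonal is evaluated first, and matrix.diagonal is a method that is never called. The sketch below extends the mask approach to also recover the indices, keeping the question's naming, in which the first array returned by np.where is called index_col:

```python
import numpy as np

m = np.array([[0, 2, 4], [4, 0, 0], [5, 4, 0]])

# Mask of positive elements, then force the diagonal to be selected as well.
mask = m > 0
np.fill_diagonal(mask, True)

# np.where returns (row indices, column indices); the question assigns them
# to index_col and index_row in that order, which is mirrored here.
index_col, index_row = np.where(mask)
dist = m[mask]

print(index_row)
print(index_col)
print(dist)
```

Boolean indexing and np.where both walk the mask in row-major order, so dist lines up element by element with the two index arrays.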
