How to change chunks of data in a numpy array - python

I have a large numpy 1 dimensional array of data in Python and want entries x (500) to y (520) to be changed to equal 1. I could use a for loop but is there a neater, faster numpy way of doing this?
for x in range(500,520):
    numpyArray[x] = 1.
Here is the for loop that could be used, but it seems like there could be a function in numpy that I'm missing. I'd rather not use the masked arrays that numpy offers.

You can use [] to access a range of elements:
import numpy as np
a = np.ones((10))
print(a) # Original array
# [ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
startindex = 2
endindex = 4
a[startindex:endindex] = 0
print(a) # modified array
# [ 1. 1. 0. 0. 1. 1. 1. 1. 1. 1.]
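Applied to the range in the question, the same slice assignment might look like the sketch below. The array name `data` and its length of 1000 are assumptions made for illustration. Note that the end of a slice is exclusive, exactly like `range(500, 520)` in the original loop.

```python
import numpy as np

# Hypothetical stand-in for the large 1-D array from the question
data = np.zeros(1000)

# Same effect as: for x in range(500, 520): data[x] = 1.
# The end index is exclusive, so this touches indices 500..519.
data[500:520] = 1.0

print(data[498:522])
```

If index 520 itself should also be set, the slice would be `data[500:521]`.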


Calculating the difference between each element against other randomly generated elements in python

I am calculating the difference of each element in a numpy array. My code is
import numpy as np
M = 10
x = np.random.uniform(0,1,M)
y = np.array([x])
# Calculate the difference
z = np.array(y[:,None]-y)
When I run my code I get [[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]]. I don't get a 10 by 10 array.
Where do I go wrong?
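You should read the broadcasting rules for numpy. Here y has shape (1, 10), so y[:,None] has shape (1, 1, 10), and subtracting y pairs each element with itself, which is why every entry is 0. Subtract a row from a column instead: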
y.T - x
Another way:
np.subtract.outer(x, x)
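As a rough sketch of what both suggestions compute, the snippet below builds the pairwise difference matrix by broadcasting a column against a row and checks it against `np.subtract.outer`. The seeded generator is only an assumption to make the run repeatable; it is not part of the original question.

```python
import numpy as np

M = 10
rng = np.random.default_rng(0)  # seeded only so the example is repeatable
x = rng.uniform(0, 1, M)

# x[:, None] has shape (10, 1); subtracting x (shape (10,)) broadcasts
# to (10, 10), with z[i, j] == x[i] - x[j].
z = x[:, None] - x

print(z.shape)
print(np.array_equal(z, np.subtract.outer(x, x)))
```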
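You are not getting a 10 by 10 array because the value of M is 10, so x has only 10 elements. You could make x itself 10 by 10 with:
M = (10,10)
but its entries would then be independent random draws, not pairwise differences of the original 10 values, so use one of the approaches above if differences are what you need.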

Numpy array: efficiently assign values

I have an array and I want to loop through its values to update it as follows:
import numpy as np
arr=np.ones((5,7))
for i in range(1,arr.shape[0]-1):
    for j in range(1,arr.shape[1]-1):
        arr[i,j]=arr[i+1,j]+arr[i,j+1]
The result is, as desired,
[[1. 1. 1. 1. 1. 1. 1.]
[1. 2. 2. 2. 2. 2. 1.]
[1. 2. 2. 2. 2. 2. 1.]
[1. 2. 2. 2. 2. 2. 1.]
[1. 1. 1. 1. 1. 1. 1.]]
However, for-loops are quite slow and I'd like to know if there is a way to make this more efficient.
Edit: The input is not always np.ones((5,7)), it will be something more heterogeneous in general.
If you draw a box around the "inner" elements, your code is setting the new value of those elements to be the sum of (a) that box "shifted one row down", and (b) that box "shifted one column to the right".
For example:
----- ----- -----
-XXX- ----- --XXX
-XXX- = -XXX- + --XXX
-XXX- -XXX- --XXX
----- -XXX- -----
And you can do that without loops as follows:
arr[1:-1,1:-1] = arr[2:,1:-1] + arr[1:-1,2:]
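Here is code for the question, checked against the original loop.
import numpy as np
a = np.random.randn(5, 7)
a1 = a.copy()  # updated with slicing (copy, so a itself is left untouched)
a2 = a.copy()  # updated with the loop
# inner box = box shifted one row down + box shifted one column to the right
mid_mat = a[2:, 1:-1] + a[1:-1, 2:]
a1[1:-1, 1:-1] = mid_mat
# Assert Code
for i in range(1, a.shape[0]-1):
    for j in range(1, a.shape[1]-1):
        a2[i,j] = a2[i+1,j] + a2[i,j+1]
np.testing.assert_array_equal(a1, a2)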

Reading 2d arrays into a 3d array in python

I searched stackoverflow but could not find an answer to this specific question. Sorry if it is a naive question, I am a newbie to python.
I have several 2d arrays (or lists) that I would like to read into a 3d array (list) in python. In Matlab, I can simply do
for i=1:N
    % read 2d array "a"
    newarray(:,:,i)=a(:,:)
end
so newarray is a 3d array with "a" being the 2d slices arranged along the 3rd dimension.
Is there a simple way to do this in python?
Edit: I am currently trying the following:
for file in files:
    img=mpimg.imread(file)
    newarray=np.array(0.289*cropimg[:,:,0]+0.5870*cropimg[:,:,1]+0.1140*cropimg[:,:,2])
    i=i+1
I tried newarray[:,:,i] and it gives me an error
NameError: name 'newarray' is not defined
Seems like I have to define newarray as a numpy array? Not sure.
Thanks!
If you're familiar with MATLAB, translating that into using NumPy is fairly straightforward.
Let's say you have a couple of arrays
a = np.eye(3)
b = np.arange(9).reshape((3, 3))
print(a)
# [[ 1. 0. 0.]
# [ 0. 1. 0.]
# [ 0. 0. 1.]]
print(b)
# [[0 1 2]
# [3 4 5]
# [6 7 8]]
If you simply want to put them into another dimension, pass them both to the array constructor in an iterable (e.g. a list) like so:
x = np.array([a, b])
print(x)
# [[[ 1. 0. 0.]
# [ 0. 1. 0.]
# [ 0. 0. 1.]]
#
# [[ 0. 1. 2.]
# [ 3. 4. 5.]
# [ 6. 7. 8.]]]
Numpy is smart enough to recognize the arrays are all the same size and creates a new dimension to hold it all.
print(x.shape)
# (2, 3, 3)
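Passing a list to the array constructor puts the new dimension first. If the slices should instead run along the third dimension, as in the MATLAB snippet from the question, one option is `np.stack` with `axis=-1`. This is only a sketch of that alternative; the name `x_last` is made up for illustration.

```python
import numpy as np

a = np.eye(3)
b = np.arange(9).reshape((3, 3))

# Stack along a new LAST axis, so x_last[:, :, i] is the i-th 2D slice
x_last = np.stack([a, b], axis=-1)

print(x_last.shape)
print(np.array_equal(x_last[:, :, 1], b))
```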
You can loop through it, but if you want to apply the same operations to it across some dimensions, I would strongly suggest you use broadcasting so that NumPy can vectorize the operation and it runs a whole lot faster.
For example, across one dimension, let's multiply one slice by 2 and another by 3. (If the multiplier is not a pure scalar, we need to reshape it to the same number of dimensions as the array in order to broadcast; the size along each dimension then needs to either match the array or be 1.) Note that I'm working along the 0th axis; your image is probably different. I don't have a handy image to load up and toy with.
y = x * np.array([2, 3]).reshape((2, 1, 1))
print(y)
#[[[ 2. 0. 0.]
# [ 0. 2. 0.]
# [ 0. 0. 2.]]
#
# [[ 0. 3. 6.]
# [ 9. 12. 15.]
# [ 18. 21. 24.]]]
Then we can add them up
z = np.sum(y, axis=0)
print(z)
#[[ 2. 3. 6.]
# [ 9. 14. 15.]
# [ 18. 21. 26.]]
If you're using NumPy arrays, you can translate almost directly from Matlab:
for i in range(1, N+1):
    # read 2d array "a"
    newarray[:, :, i] = a[:, :]
Of course you'd probably want to use range(N), because arrays use 0-based indexing. And obviously you're going to need to pre-create newarray in some way, just as you'd have to in Matlab, but you can translate that pretty directly too. (Look up the zeros function if you're not sure how.)
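A minimal sketch of that pre-creation step, assuming every 2D slice has the same shape. The function `read_slice` is a hypothetical stand-in for whatever actually loads each 2D array (for example an image reader); it is not part of the question.

```python
import numpy as np

def read_slice(i):
    # Hypothetical reader: returns a 3x4 array filled with the value i
    return np.full((3, 4), float(i))

N = 5
first = read_slice(0)

# Pre-create the 3D array: the 2D slice shape plus N along the third axis
newarray = np.zeros(first.shape + (N,))

for i in range(N):  # 0-based, so range(N) rather than range(1, N+1)
    newarray[:, :, i] = read_slice(i)

print(newarray.shape)
```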
If you're using lists, you can't do this directly—but you probably don't want to anyway. A better solution would be to build up a list of 2D lists on the fly:
newarray = []
for i in range(N):
    # read 2d list of lists "a"
    newarray.append(a)
Or, more simply:
newarray = [read_next_2d_list_of_lists() for i in range(N)]
Or, even better, make that read function a generator, then just:
newarray = list(read_next_2d_list_of_lists())
If you want to transpose the order of the axes, you can use the zip function for that.
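One way the zip idea could look for nested lists is sketched below: it moves the "which 2D list" index from the first position to the last, so that `result[r][c][n] == newarray[n][r][c]`. The sample data and the name `result` are assumptions for illustration.

```python
# Two 2x2 lists of lists, indexed as newarray[n][row][col]
newarray = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

# zip(*newarray) groups matching rows across the 2D lists;
# the inner zip(*rows) then groups matching columns within those rows.
result = [[list(cell) for cell in zip(*rows)] for rows in zip(*newarray)]

print(result)
```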

Evaluating a function over a lattice of unknown dimension using meshgrid and vectorize

When you know the number of dimensions of your lattice ahead of time, it is straightforward to use meshgrid to evaluate a function over a mesh.
from pylab import *
lattice_points = linspace(0,3,4)
xs,ys = meshgrid(lattice_points,lattice_points)
zs = xs+ys # <- stand-in function, to be replaced by something more interesting
print(zs)
Produces
[[ 0. 1. 2. 3.]
[ 1. 2. 3. 4.]
[ 2. 3. 4. 5.]
[ 3. 4. 5. 6.]]
But I would like to have a version of something similar, for which the number of dimensions is determined during runtime, or is passed as a parameter.
from pylab import *
#np.vectorize
def fn(listOfVars):
    return sum(listOfVars) # <- stand-in function, to be replaced
                           # by something more interesting
n_vars = 2
lattice_points = linspace(0,3,4)
indices = meshgrid(*(n_vars*[lattice_points])) # this works fine
zs = fn(indices) # <-- this line is wrong, but I don't
# know what would work instead
print(zs)
Produces
[[[ 0. 1. 2. 3.]
[ 0. 1. 2. 3.]
[ 0. 1. 2. 3.]
[ 0. 1. 2. 3.]]
[[ 0. 0. 0. 0.]
[ 1. 1. 1. 1.]
[ 2. 2. 2. 2.]
[ 3. 3. 3. 3.]]]
But I want it to produce the same result as above.
There is probably a solution where you can find the indices of each dimension and use itertools.product to generate all of the possible combinations of indices etc. etc., but is there not a nice pythonic way of doing this?
Joe Kington and user2357112 have helped me to see the error in my ways. For those of you that would like to see a complete solution:
from pylab import *
## 2D "preknown case" (for testing / to compare output)
lattice_points = linspace(0,3,4)
xs,ys = meshgrid(lattice_points,lattice_points)
zs = xs+ys
print('2-D Case')
print(zs)
## 3D "preknown case" (for testing / to compare output)
lattice_points = linspace(0,3,4)
ws,xs,ys = meshgrid(lattice_points,lattice_points,lattice_points)
zs = ws+xs+ys
print('3-D Case')
print(zs)
## Solution, thanks to comments from Joe Kington and user2357112
def fn(listOfVars):
    return sum(listOfVars)
n_vars = 3 ## can change to 2 or 3 to compare to example cases above
lattice_points = linspace(0,3,4)
indices = meshgrid(*(n_vars*[lattice_points]))
zs = np.apply_along_axis(fn,0,indices)
print('adaptable n-D Case')
print(zs)
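For functions that can be written with array operations, a possible alternative to `apply_along_axis` is to stack the meshgrid outputs into one array and reduce over the first axis. The sketch below uses explicit `numpy` imports instead of `pylab`; the helper names `evaluate_on_lattice` and `sum_of_coords` are hypothetical, chosen only for this example.

```python
import numpy as np

def evaluate_on_lattice(fn, lattice_points, n_vars):
    # One coordinate grid per variable, stacked into shape (n_vars, ...)
    grids = np.meshgrid(*(n_vars * [lattice_points]))
    return fn(np.stack(grids, axis=0))

def sum_of_coords(coords):
    # Stand-in function: sums the coordinates at every lattice point
    return coords.sum(axis=0)

lattice_points = np.linspace(0, 3, 4)
zs_2d = evaluate_on_lattice(sum_of_coords, lattice_points, 2)
zs_3d = evaluate_on_lattice(sum_of_coords, lattice_points, 3)

print(zs_2d)
print(zs_3d.shape)
```

Because the reduction runs inside NumPy rather than calling a Python function once per lattice point, this is likely to be faster on large lattices, but it only works when `fn` accepts the stacked array.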

Simple implementation of NumPy cov (covariance) function

I was trying to implement the numpy.cov() function as given here: numpy cov (covariance) function, what exactly does it compute?, but I am getting some bizarre results. Please correct me:
import numpy as np
def my_covar(X):
    X -= X.mean(axis=0)
    N = X.shape[1]
    return np.dot(X, X.T.conj())/float(N-1)
X = np.asarray([[1.0,1.0],[2.0,2.0],[3.0,3.0]])
## Run NumPy's implementation
print(np.cov(X))
"""
NumPy's output:
[[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]]
"""
## Run my implementation
print(my_covar(X))
"""
My output:
[[ 2. 0. -2.]
[ 0. 0. 0.]
[ -2. 0. 2.]]
"""
What is going wrong?
Both your function and np.cov (by default) assume that the rows of X correspond to variables, and the columns correspond to observations.
When you center X by subtracting the mean, you need to compute the mean over observations, i.e. the columns of X rather than the rows:
X -= X.mean(axis=1)[:, None]
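Putting that fix into the function from the question might look like the sketch below. Centering into a new array instead of using `-=` is an added design choice here, so the caller's X is not modified; `keepdims=True` is just another way of writing the `[:, None]` reshape.

```python
import numpy as np

def my_covar(X):
    X = np.asarray(X)
    # Rows are variables, columns are observations:
    # subtract each row's own mean, without modifying the input array.
    centered = X - X.mean(axis=1, keepdims=True)
    n_obs = X.shape[1]
    return np.dot(centered, centered.T.conj()) / float(n_obs - 1)

X = np.asarray([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])

print(my_covar(X))
print(np.allclose(my_covar(X), np.cov(X)))
```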
