Iterating into a numpy array - python

I've been looking around but I'm still having trouble understanding the error in my code. I am trying to store the values returned by get_pos in a pos_list array of shape (numstep, N, 3). get_pos clearly returns the right values at each step, but when I print pos_list at the bottom, it only shows the last value. Is there anything I've missed?
def get_pos(p_list, N):
    sys_position = np.zeros([N, 3])
    for i in range(N):
        sys_position[i] = p_list[i].pos
    return sys_position
pos_list = np.zeros([numstep, N, 3])
for num in range(numstep):
    # Update particle position
    update_pos(p_list, dt, N)
    # Update force
    force_new = get_force(p_list, dt, N)
    # Update particle velocity by averaging
    # current and new forces
    update_vel(p_list, 0.5*(force + force_new), dt, N)
    # Re-define force value
    force = force_new
    # Increase time
    time += dt
    print(get_pos(p_list, N))
    # Append information to data lists
    pos_list = get_pos(p_list, N)
print(pos_list)
Any help would be appreciated; I've been staring at this for far too long.
EDIT: attaching get_pos, the function that builds the array of positions. Calling get_pos gives me the array of positions at time t for each step in range(0, numstep).

From the looks of it, pos_list is getting overwritten by whatever the get_pos function returns, and is therefore no longer the preallocated 3D array after the first iteration of the loop.
EDIT:
pos_list = get_pos(p_list, N) overwrites the entire pos_list 3D numpy array with the 2D numpy array returned by get_pos(p_list, N).
To store the 2D numpy array from get_pos in the 3D array at the desired index, use pos_list[num, ...] = get_pos(p_list, N). The returned 2D array will be recorded in the row pointed to by num, provided the 2D array has the same shape as the 3D array excluding its first axis.
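A sketch of the corrected loop, assuming the update functions and the initial force from the question:
pos_list = np.zeros([numstep, N, 3])
for num in range(numstep):
    update_pos(p_list, dt, N)
    force_new = get_force(p_list, dt, N)
    update_vel(p_list, 0.5*(force + force_new), dt, N)
    force = force_new
    time += dt
    # write this step's (N, 3) snapshot into row num instead of
    # rebinding the name pos_list to a 2D array
    pos_list[num, ...] = get_pos(p_list, N)
print(pos_list)  # all numstep snapshots are retained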

Related

Set up sliding windows as columns (IM2COL from MATLAB) in a multi-dimensional array - Python

Currently, I have a 4d array, say,
arr = np.arange(48).reshape((2,2,3,4))
I want to apply a function that takes a 2d array as input to each 2d array sliced from arr. I have searched and read this question, which is exactly what I want.
The function I'm using is im2col_sliding_broadcasting(), which I get from here. It takes a 2d array and a list of 2 elements as input and returns a 2d array. In my case it takes a 3x4 2d array and the list [2, 2] and returns a 4x6 2d array.
I considered using apply_along_axis(), but as documented it only accepts a function that works on 1d arrays, so I can't apply the im2col function that way.
I want an output with shape 2x2x4x6. I can certainly achieve this with a for loop, but I've heard that it's too slow:
import numpy as np

def im2col_sliding_broadcasting(A, BSZ, stepsize=1):
    # source: https://stackoverflow.com/a/30110497/10666066
    # Parameters
    M, N = A.shape
    col_extent = N - BSZ[1] + 1
    row_extent = M - BSZ[0] + 1
    # Get starting block indices
    start_idx = np.arange(BSZ[0])[:, None]*N + np.arange(BSZ[1])
    # Get offsetted indices across the height and width of input array
    offset_idx = np.arange(row_extent)[:, None]*N + np.arange(col_extent)
    # Get all actual indices & index into input array for final output
    return np.take(A, start_idx.ravel()[:, None] + offset_idx.ravel()[::stepsize])
arr = np.arange(48).reshape((2, 2, 3, 4))
output = np.empty([2, 2, 4, 6])
for i in range(2):
    for j in range(2):
        temp = im2col_sliding_broadcasting(arr[i, j], [2, 2])
        output[i, j] = temp
My arr is in fact a 10000x3x64x64 array, so my question is: is there another way to do this more efficiently?
We can leverage scikit-image's view_as_windows, which is built on np.lib.stride_tricks.as_strided, to get sliding windows.
from skimage.util.shape import view_as_windows

W1, W2 = 2, 2  # window size
# create sliding windows along the last two axes
w = view_as_windows(arr, (1, 1, W1, W2))[..., 0, 0, :, :]
# merge the window axes (the last two axes) and
# merge the axes along which those windows were created (3rd and 4th axes)
outshp = arr.shape[:-2] + (W1*W2,) + ((arr.shape[-2]-W1+1)*(arr.shape[-1]-W2+1),)
out = w.transpose(0, 1, 4, 5, 2, 3).reshape(outshp)
The last step forces a copy. So, skip it if possible.
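As a quick sanity check (my addition, not part of the original answer), the windowed result can be compared against the loop-based output from the question on the small example:
ref = np.empty([2, 2, 4, 6])
for i in range(2):
    for j in range(2):
        ref[i, j] = im2col_sliding_broadcasting(arr[i, j], [2, 2])
print(np.allclose(out, ref))  # expected: True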

Efficiently setting 1D range values in a DataFrame (or an ndarray) with a boolean array

PREREQUISITE
import numpy as np
import pandas as pd
INPUT1: boolean 2d array (a sample array as below)
x = np.array(
    [[False, False, False, False, True],
     [True, False, False, False, False],
     [False, False, True, False, True],
     [False, True, True, False, False],
     [False, False, False, False, False]])
INPUT2: 1D range values (a sample as below)
y = np.array([1, 2, 3, 4])
EXPECTED OUTPUT: 2D ndarray
[[0, 0, 0, 0, 1],
 [1, 0, 0, 0, 2],
 [2, 0, 1, 0, 1],
 [3, 1, 1, 0, 2],
 [4, 2, 2, 0, 3]]
I want to set a range of values (a vertical vector) starting at each True in the 2d ndarray (INPUT1), efficiently. Are there any useful APIs or solutions for this purpose?
Unfortunately I couldn't come up with an elegant solution, so I came up with multiple inelegant ones. The two main approaches I could think of are
1. brute-force looping over each True value and assigning slices, and
2. using a single indexed assignment to replace the necessary values.
It turns out that the time complexity of these approaches is non-trivial, so depending on the size of your array either can be faster.
Using your example input:
import numpy as np

x = np.array(
    [[False, False, False, False, True],
     [True, False, False, False, False],
     [False, False, True, False, True],
     [False, True, True, False, False],
     [False, False, False, False, False]])
y = np.array([1, 2, 3, 4])
refout = np.array([[0, 0, 0, 0, 1],
                   [1, 0, 0, 0, 2],
                   [2, 0, 1, 0, 1],
                   [3, 1, 1, 0, 2],
                   [4, 2, 2, 0, 3]])
# alternative input with arbitrary size:
# N = 100; x = np.random.rand(N, N) < 0.2; y = np.arange(1, N)
def looping_clip(x, y):
    """Loop over Trues, use clipped slices"""
    nmax = x.shape[0]
    n = y.size
    # initialize output
    out = np.zeros_like(x, dtype=y.dtype)
    # loop over True values
    for i, j in zip(*x.nonzero()):
        # truncate right-hand side where necessary
        out[i:i+n, j] = y[:nmax-i]
    return out
def looping_expand(x, y):
    """Loop over Trues, use an expanded buffer"""
    n = y.size
    nmax, mmax = x.shape
    ivals, jvals = x.nonzero()
    # initialize buffed-up output
    out = np.zeros((nmax + max(n + ivals.max() - nmax, 0), mmax), dtype=y.dtype)
    # loop over True values
    for i, j in zip(ivals, jvals):
        # slice will always be complete, i.e. of length y.size
        out[i:i+n, j] = y
    return out[:nmax, :].copy()  # rather not return a view to an auxiliary array
def index_2d(x, y):
    """Assign directly with 2d indices, use an expanded buffer"""
    n = y.size
    nmax, mmax = x.shape
    ivals, jvals = x.nonzero()
    # initialize buffed-up output
    out = np.zeros((nmax + max(n + ivals.max() - nmax, 0), mmax), dtype=y.dtype)
    # now we can safely index for each "(ivals:ivals+n, jvals)" so to speak
    upped_ivals = ivals[:, None] + np.arange(n)        # shape (ntrues, n)
    upped_jvals = jvals.repeat(y.size).reshape(-1, n)  # shape (ntrues, n)
    out[upped_ivals, upped_jvals] = y  # right-hand side of shape (n,) broadcasts
    return out[:nmax, :].copy()  # rather not return a view to an auxiliary array
def index_1d(x, y):
    """Assign using linear indices, use an expanded buffer"""
    n = y.size
    nmax, mmax = x.shape
    ivals, jvals = x.nonzero()
    # initialize buffed-up output
    out = np.zeros((nmax + max(n + ivals.max() - nmax, 0), mmax), dtype=y.dtype)
    # grab linear indices corresponding to Trues in a buffed-up array
    inds = np.ravel_multi_index((ivals, jvals), out.shape)
    # now all we need to do is start stepping along rows for each item and assign y
    upped_inds = inds[:, None] + mmax*np.arange(n)  # shape (ntrues, n)
    out.flat[upped_inds] = y  # y of shape (n,) broadcasts to (ntrues, n)
    return out[:nmax, :].copy()  # rather not return a view to an auxiliary array
# check that the results are correct
print(all([np.array_equal(refout, looping_clip(x, y)),
           np.array_equal(refout, looping_expand(x, y)),
           np.array_equal(refout, index_2d(x, y)),
           np.array_equal(refout, index_1d(x, y))]))
I tried to document each function, but here's a synopsis:
looping_clip loops over every True value in the input and assigns to a corresponding slice in the output. We take care on the right-hand side to shorten the assigned array for when part of the slice would go beyond the edge of the array along the first dimension.
looping_expand loops over every True value in the input and assigns to a corresponding full slice in the output after allocating a padded output array ensuring that every slice will be full. We do more work when allocating a larger output array, but we don't have to shorten the right-hand side on assignment. We could omit the .copy() call in the last step, but I prefer not to return a nontrivially strided array (i.e. a view to an auxiliary array rather than a proper copy) as this might lead to obscure surprises for the user.
index_2d computes the 2d indices of every value to be assigned to, and assumes that duplicate indices will be handled in order. This is not guaranteed! (More on this a bit later.)
index_1d does the same using linearized indices and indexing into the flatiter of the output.
Here are the timings of the above methods using random arrays (see the commented line near the start).
What we can see is that for small and large arrays the looping versions are faster, but for linear sizes between roughly 10 and 150 the indexing versions are better. The reason I didn't go to higher sizes is that the indexing cases start to use a lot of memory, and I didn't want to have to worry about this messing with timings.
To make the above worse, note that the indexing versions assume that duplicate indices in a fancy indexing scenario are handled in order, so that when True values "lower" in the array are handled, previous values get overwritten as per your requirements. There's only one problem: this is not guaranteed:
For advanced assignments, there is in general no guarantee for the iteration order. This means that if an element is set more than once, it is not possible to predict the final result.
This doesn't sound very encouraging. While in my experiments the indices seem to be handled in order (according to C order), this can also be coincidence or an implementation detail. So if you want to use the indexing versions, make sure that on your specific version, and for your specific dimensions and shapes, this still holds true.
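To make the hazard concrete, here is a minimal illustration of my own (not from the original answer); an index that appears twice in a fancy assignment ends up with a single, formally unspecified winner:
import numpy as np

a = np.zeros(3)
# index 0 appears twice with competing values; current NumPy builds
# typically apply the assignments in order, leaving a[0] == 2.0,
# but the documentation makes no such promise
a[np.array([0, 0, 1])] = np.array([1.0, 2.0, 3.0])
print(a)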
We can make the assignment safer by getting rid of duplicate indices ourselves. For this we can make use of this answer by Divakar on a corresponding question:
def index_1d_safe(x, y):
    """Same as index_1d but use Divakar's safe solution for reducing duplicates"""
    n = y.size
    nmax, mmax = x.shape
    ivals, jvals = x.nonzero()
    # initialize buffed-up output
    out = np.zeros((nmax + max(n + ivals.max() - nmax, 0), mmax), dtype=y.dtype)
    # grab linear indices corresponding to Trues in a buffed-up array
    inds = np.ravel_multi_index((ivals, jvals), out.shape)
    # now all we need to do is start stepping along rows for each item and assign y
    upped_inds = inds[:, None] + mmax*np.arange(n)  # shape (ntrues, n)
    # now comes https://stackoverflow.com/a/44672126
    # need additional step: flatten upped_inds and corresponding y values for selection
    upped_flat_inds = upped_inds.ravel()  # shape (ntrues, n) -> (ntrues*n,)
    y_vals = np.broadcast_to(y, upped_inds.shape).ravel()  # shape (ntrues, n) -> (ntrues*n,)
    sidx = upped_flat_inds.argsort(kind='mergesort')
    sindex = upped_flat_inds[sidx]
    idx = sidx[np.r_[np.flatnonzero(sindex[1:] != sindex[:-1]), upped_flat_inds.size-1]]
    out.flat[upped_flat_inds[idx]] = y_vals[idx]
    return out[:nmax, :].copy()  # rather not return a view to an auxiliary array
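As a quick check (my addition, mirroring the correctness test above), the safe variant can be compared against the reference output:
print(np.array_equal(refout, index_1d_safe(x, y)))  # expected: True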
This still reproduces your expected output. The problem is that now the function takes much longer to finish.
Bummer. Considering how my indexing versions are only faster for intermediate array sizes, and how the faster, unsafe variants are not guaranteed to work, perhaps it's simplest to just use one of the looping versions. This is not to say, of course, that there aren't any optimal vectorized solutions that I missed.

Extracting the ith and (i+1)th entries from a random 2D numpy array

I have a numpy array consisting of
[1,3,8,6,0,2,4,5,9,7]
This array is a random array consisting of 10 numbers 0-9.
I also have a 2D numpy array, a 10x10 2D numpy array with numerical values.
I would like to use my 1D numpy array (above) to access specific entries in my 2D numpy array, by looping through the 1D array:
Loop 1: takes in 1 and 3, and finds the value at [1:3] in my 2D numpy array.
Loop 2: takes in 3 and 8, and finds the value at [3:8] in my 2D numpy array.
...
Loop 10: takes in 7 and 1, and finds the value at [7:1] in my 2D numpy array.
I would like to add up these values in my 2D numpy array.
So far I have:
array = [1,3,8,6,0,2,4,5,9,7]
values = 0
for i in range(0, len(array)):  # this is 10
    a = array2[i, array[i]+1]  # array2 is the 2D numpy array with the values
    values = values + a
This works to some degree, but how do I get it to access from the last element back to the first? i.e. find [7,1]
You can use fancy indexing to make this work.
arr = np.random.randint(0, 10, (10,10))
pos = np.array([1,3,8,6,0,2,4,5,9,7])
pos = np.append(pos, pos[0])
rows = pos[0:-1]
cols = pos[1:]
result = sum(arr[rows, cols])
You can index with each consecutive pair of values to make it work.
values = 0
for i in range(len(array)):
    a = Matrix[array[i], array[i+1]]
    values += a
Also, the array you posted has 11 elements, which means the 10th loop iteration will not be what you intended.
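For completeness, a wrap-around variant of this loop (my own sketch, not part of the original answer) takes the second index modulo the length, so the final pair wraps back to the first element without appending anything:
values = 0
for i in range(len(array)):
    # the modulo wraps the last pair around to the first element, i.e. [7, 1]
    a = Matrix[array[i], array[(i + 1) % len(array)]]
    values += a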
I'm not sure I fully understood what you were trying to achieve but...
What about something like this?
a = np.array([1,3,8,6,0,9,2,4,5,9,7])
b = np.array(range(100)).reshape(10, 10)
for i in range(len(a)):
    print(a[i % len(a)], a[(i+1) % len(a)])
    print(b[a[i % len(a)], a[(i+1) % len(a)]])
I removed 10 from the a array to avoid an index-out-of-range error.
I also took the value at [x,y] (and not the range [x:y]) from the 2D array.

How to append an array but not the memory location?

I am generating a large set of Monte Carlo data and ideally want to store it in an array of arrays.
When I use array.append(x) and then cycle over a loop that produces a new array for x, the elements of the list at the end are all the same as the last array x added. I believe this must be because I'm adding the memory location to the list and not the actual array data, hence when I add more arrays, all the other elements that point to the same location also update.
Is there any way to prevent this by setting a kwarg or something, or do I have to construct my arrays in a different way?
# test to illustrate point
import numpy as np

x = np.random.choice((-1, 1), size=(5, 5))
array_test = []
for T in range(10):
    array_test.append(x)
    print(x)
    x += 10
print(array_test)
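For reference, a minimal fix based on the diagnosis in the question (a sketch, not code from the original post): append a copy so each list entry owns its own data.
import numpy as np

x = np.random.choice((-1, 1), size=(5, 5))
array_test = []
for T in range(10):
    array_test.append(x.copy())  # snapshot the current data, not a reference
    x += 10  # the in-place update no longer affects stored entries
print(array_test)  # ten distinct arrays, each offset by 10 from the previous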

Efficient and fast function calling on a 2d array

I have a 2d numpy object array A and a numpy array of values l (an nx2 array).
Every element in the 2d numpy object array holds two values like [ax1, ay1], and similarly for l.
I'm calling my own distance function, calculating the distance between each element of A and every row of l, and then taking the minimum distance. The loop looks like this:
for r in range(A.shape[0]):
    for s in range(A.shape[1]):
        lencent = l.shape[0]
        dist = []
        for p in range(lencent):
            dist.append(distancefunction(A[r,s], A[r,s], l[p,0], l[p,1]))
        # print dist
        val = np.min(dist)
        # print val
        tempimg[r,s] = val
But this takes a huge amount of time when the numpy array A is large, or l is large, or both. Is there any pythonic way to optimize the performance of this loop?
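Assuming distancefunction computes the Euclidean distance between the two-element point stored in each A[r, s] and each row of l (an assumption on my part, since the function is not shown), the triple loop can be collapsed with numpy broadcasting; a minimal sketch:
import numpy as np

# stack the object array into a plain (rows, cols, 2) float array of coordinates
coords = np.array([[A[r, s] for s in range(A.shape[1])]
                   for r in range(A.shape[0])], dtype=float)
# pairwise differences: (rows, cols, 1, 2) - (1, 1, npoints, 2) -> (rows, cols, npoints, 2)
diff = coords[:, :, None, :] - l[None, None, :, :]
# Euclidean distance to every row of l, then the minimum over those distances
tempimg = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=-1)
This trades memory for speed: the intermediate has shape (rows, cols, npoints, 2), so for very large A and l it may be worth chunking over l to keep memory bounded.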
