I have a set of sphere coordinates in 3D that evolve over time.
They represent a stack of spheres that are continuously removed from the bottom of a box and reinserted at the top at a random location. Since this kind of simulation is essentially periodic, I would like to simulate the drainage of the box a few times (say, 5 times, so t=1 takes positions 1, ..., t=5 takes positions 5), and then come back to the first state to simulate the next steps (t=6 takes positions 1, t=10 takes positions 5, same for t=11-15, etc.).
The problem is that the coordinates of a given sphere (say, sphere 1) can be very different between the first state and the last simulated one. However, for the sake of the simulation, it is very important that the transition be as smooth as possible. If I had to quantify it, I would say that I need the distance between state 5 and state 6 for each sphere to be as low as possible.
It seems to me like an assignment problem. Is there any known solution or method for this kind of problem?
Here is an example of what I would like to have (I mostly use Python):
import numpy as np
# Mockup of the simulation positions
Nspheres = 100
Nsteps = 5 # number of simulated steps
coordinates = np.random.uniform(0,100, (Nsteps, Nspheres, 3)) # mockup x,y,z for each step
initial_positions = coordinates[0]
final_positions = coordinates[Nsteps-1]
indices_adjust_initial_positions = adjust_initial_positions(initial_positions, final_positions) # to do
adjusted_initial_positions = initial_positions[indices_adjust_initial_positions]
# Quantification of error made
mean_error = np.mean(np.abs(final_positions-adjusted_initial_positions))
max_error = np.max(np.abs(final_positions-adjusted_initial_positions))
print(mean_error, max_error)
# Assign it for each "cycle"
Ncycles = 5 # Number of times the simulation is repeated
simulation_coordinates = np.empty((Nsteps*Ncycles, Nspheres, 3))
simulation_coordinates[:Nsteps] = np.array(coordinates)
for n in range(1, Ncycles):
    new_cycle_coordinates = simulation_coordinates[Nsteps*(n-1):Nsteps*n, indices_adjust_initial_positions, :]
    simulation_coordinates[Nsteps*n:Nsteps*(n+1)] = new_cycle_coordinates
# Print result
print(simulation_coordinates)
The adjust_initial_positions function would therefore take the initial and final states and determine the ideal set of indices to apply to the initial state so that it looks as much like the final state as possible. Please note, in case it makes the problem any simpler: I do not really care if the very top spheres do not match well between the two states, but it is important to be as close as possible towards the bottom.
Would you have any suggestion?
After some research, it seems that scipy.optimize has some nice features able to do something like this. If list1 is my first step and list2 is my last simulated step, we can do something like:
import scipy.optimize

cost = np.linalg.norm(list2[:, np.newaxis, :] - list1, axis=2)
_, indexes = scipy.optimize.linear_sum_assignment(cost)
list3 = list1[indexes]
Therefore, list3 will be as close to list2 as possible thanks to the index sorting, while taking the positions of list1.
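Putting those pieces together, a minimal sketch of adjust_initial_positions could look like the one below. The z-weighting is my own assumption (taking the third column as the height) to make mismatches near the bottom cost more, since the top spheres matter less; drop it for a plain Euclidean assignment.

import numpy as np
from scipy.optimize import linear_sum_assignment

def adjust_initial_positions(initial_positions, final_positions, weight_bottom=True):
    # cost[i, j] = distance between final sphere i and initial sphere j
    cost = np.linalg.norm(final_positions[:, np.newaxis, :] - initial_positions, axis=2)
    if weight_bottom:
        # Assumption: column 2 is the height; give the low (bottom) spheres a larger weight
        z = final_positions[:, 2]
        weights = 1.0 + (z.max() - z) / (z.max() - z.min() + 1e-12)
        cost = cost * weights[:, np.newaxis]
    _, indices = linear_sum_assignment(cost)
    return indices

Used this way, initial_positions[indices] is the permutation of the initial state that best matches the final state, so the jump from state 5 to state 6 stays as small as the assignment allows.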
I have a problem where, in a grid of x*y size, I am given a single dot and need to find its nearest neighbour. In practice, I am trying to find the dot closest to the cursor in pygame that crosses a colour distance threshold, calculated as follows:
sqrt(((rgb1[0]-rgb2[0])**2)+((rgb1[1]-rgb2[1])**2)+((rgb1[2]-rgb2[2])**2))
So far I have a function that calculates the different resolutions for the grid and reduces it by a factor of two each time, while always keeping the darkest pixel. It looks as follows:
from PIL import Image
from typing import Dict
import numpy as np
#we input a pillow image object and retrieve a dictionary with every grid version of the 3 dimensional array:
def calculate_resolutions(image: Image) -> Dict[int, np.ndarray]:
    resolutions = {}
    # we start with the highest resolution image, the size of which we initially divide by 1, then 2, then 4, etc.:
    divisor = 1
    # reduce the grid over 5 iterations
    resolution_iterations = 5
    for i in range(resolution_iterations):
        pixel_lookup = image.load()  # convert image to a PixelValues object, which allows pixel lookup via an [x, y] index
        # calculate the resolution of the new grid, rounding upwards:
        resolution = (int((image.size[0] - 1) // divisor + 1), int((image.size[1] - 1) // divisor + 1))
        # generate a 3d array with the new grid resolution, fill in values that are darker than white:
        new_grid = np.full((resolution[0], resolution[1], 3), np.array([255, 255, 255]))
        for x in range(image.size[0]):
            for y in range(image.size[1]):
                if not x % divisor and not y % divisor:
                    darkest_pixel = (255, 255, 255)
                    x_range = divisor if x + divisor < image.size[0] else (0 if image.size[0] - x < 0 else image.size[0] - x)
                    y_range = divisor if y + divisor < image.size[1] else (0 if image.size[1] - y < 0 else image.size[1] - y)
                    for x_ in range(x, x + x_range):
                        for y_ in range(y, y + y_range):
                            if pixel_lookup[x_, y_][0] + pixel_lookup[x_, y_][1] + pixel_lookup[x_, y_][2] < darkest_pixel[0] + darkest_pixel[1] + darkest_pixel[2]:
                                darkest_pixel = pixel_lookup[x_, y_]
                    if darkest_pixel != (255, 255, 255):
                        new_grid[int(x / divisor)][int(y / divisor)] = np.array(darkest_pixel)
        resolutions[i] = new_grid
        divisor = divisor * 2
    return resolutions
This is the most performance efficient solution I was able to come up with. If this function is run on a grid that continually changes, like a video with x fps, it will be very performance intensive. I also considered using a kd-tree algorithm that simply adds and removes any dots that happen to change on the grid, but when it comes to finding individual nearest neighbours on a static grid this solution has the potential to be more resource efficient. I am open to any kinds of suggestions in terms of how this function could be improved in terms of performance.
Now, I am in a position where, for example, I try to find the nearest neighbour of the current cursor position in a 100x100 grid. The resulting reduced grids are 50^2, 25^2, 13^2, and 7^2. Consider a situation where, at one aggregation step, part of the grid consists of six large squares: the black one is the current cursor position and the orange dots are dots where the colour distance threshold is crossed. In that case I would not know which diagonally located neighbour to pick for the next search step; going one aggregation step down shows that the lower left would be the right choice. Depending on how many grid layers I have, this could result in a very large error in the nearest neighbour search.
Is there a good way to solve this problem? If there are multiple squares that show they contain a relevant location, do I have to search them all in the next step to be sure? And if that is the case, the further away I get, the more I would need to use maths such as the Pythagorean theorem to check whether the two positive squares overlap in terms of distance and could both potentially contain the closest neighbour, which would start to be performance intensive again if the function is called frequently. Would it still make sense to pursue this solution over a regular kd-tree? For now the grid size is still fairly small (~800x600), but if the grid gets larger the performance may start suffering again. Is there a good, scalable solution that could be applied here?
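For comparison, the kd-tree alternative I mention above would look something like this minimal sketch using scipy.spatial.cKDTree; the dot positions and the cursor are placeholder values, and the tree would have to be rebuilt (or the dots re-filtered) whenever the grid changes:

import numpy as np
from scipy.spatial import cKDTree

# Hypothetical input: (x, y) positions of the dots that already pass the colour-distance threshold
dot_positions = np.array([[12, 40], [55, 7], [80, 91]])
tree = cKDTree(dot_positions)

cursor = (50, 50)  # current cursor position
distance, index = tree.query(cursor)  # nearest neighbour among the thresholded dots
nearest_dot = dot_positions[index]

Building the tree is roughly O(n log n) and each query is roughly O(log n), so the per-frame query stays cheap; the expensive part remains deciding which dots pass the threshold when the grid changes.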
I want to do random generation of levels using 2D arrays, where 0 = emptiness and 1 = wall. How do I generate levels that have a passable route from a starting point to a finish location (a yellow circle that appears in a random free spot on the map)?
import random

pole = [[0] * 20 for i in range(20)]
for i in range(20):
    for j in range(20):
        pole[i][j] = random.randint(0, 1)
At the moment it just randomizes the ones and zeros, with no guarantee that a route exists.
The best way to do this is to generate a random map just like you do now, and then check with an algorithm whether there is an open route between start and finish. If not, just generate a new map and check again.
A "flood fill" algorithm where you start filling from start and see if the fill reaches the finish spot could work. For algorithm help, see:
Need help implementing flood-fill algorithm
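As a rough sketch (not the code from the linked question), a breadth-first flood fill over the empty cells can check reachability like this; pole is the 20x20 grid from the question, and start and finish are assumed to be (row, column) tuples:

from collections import deque

def is_reachable(pole, start, finish):
    rows, cols = len(pole), len(pole[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == finish:
            return True  # the fill reached the finish spot
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and pole[nr][nc] == 0 and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False

You would then keep regenerating pole until is_reachable(pole, start, finish) returns True.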
I have to work on a task of using hand keypoints as a pointer (a touchless mouse).
The main problem here is that the (deep learning) hand keypoints are not perfect (sometimes, under varied lighting and skin colours), so the chosen keypoint scatters instead of moving smoothly like the real mouse we use.
How can I smooth the points online (in real time)? Not the solution where we are given an array of 2D points and then smooth that array: here new points come in one by one, and we have to correct them immediately to avoid the user suffering a jittery cursor.
I'm using opencv and python. Please be nice since I'm new to Computer Vision.
Thanks
The simplest way is to use a moving average. You can compute, very efficiently, the average position of the last n steps and use that to "smooth" the trajectory:
n = 5        # the average "window size"
counter = 0  # count how many steps so far
avg = 0.     # the average
while True:
    # every time step
    val = get_keypoint_value_for_this_time_step()
    counter += 1
    coeff = 1. / min(counter, n)
    # update using moving average
    avg = coeff * val + (1. - coeff) * avg
    print(f'Current time step={val} smoothed={avg}')
More variants of moving averages can be found here.
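Since the keypoints are 2D, the same kind of update can be applied directly to an (x, y) array. A small sketch, assuming a hypothetical get_keypoint_for_this_frame() that returns the detected point each frame:

import numpy as np

alpha = 0.3      # smoothing factor in (0, 1]; smaller values smooth more but lag more
smoothed = None

while True:
    point = np.asarray(get_keypoint_for_this_frame(), dtype=float)  # hypothetical detector call
    if smoothed is None:
        smoothed = point  # initialise on the first frame
    else:
        # exponential moving average of the keypoint position
        smoothed = alpha * point + (1 - alpha) * smoothed
    # use `smoothed` wherever the cursor position is needed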
Since you want physics-like behavior, you can use a simple physics model. Note that all arrays below describe properties of the current state of your dynamics and therefore have the same shape, (1, 2).
define the force:
    a (attraction) = (k - p) * scaler
update the velocity:
    v (velocity) = v + a
update the position:
    p (current position) = p + v
where k (the new DL keypoint) is whatever your deep-learning model outputs.
You output p to the user. Note: if you want more natural motion, you can play with the scaler or add additional forces (like a) to v.
Also, to get all points, concatenate p to ps where ps.shape = (n, 2).
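A minimal sketch of that update loop in NumPy might look like the following; scaler and damping are assumptions to tune, and get_keypoint_for_this_frame() is a placeholder for whatever your model outputs:

import numpy as np

scaler = 0.1   # strength of the attraction towards the detected keypoint
damping = 0.8  # extra drag so the cursor does not oscillate

p = np.zeros((1, 2))   # current (smoothed) position shown to the user
v = np.zeros((1, 2))   # current velocity
ps = np.empty((0, 2))  # history of all smoothed positions, shape (n, 2)

while True:
    k = np.asarray(get_keypoint_for_this_frame(), dtype=float).reshape(1, 2)  # hypothetical detector call
    a = (k - p) * scaler   # attraction force towards the new keypoint
    v = (v + a) * damping  # update the velocity (damping keeps the motion stable)
    p = p + v              # update the position that is output to the user
    ps = np.concatenate([ps, p], axis=0)  # accumulate the smoothed points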
A square box of size 10,000*10,000 contains 1,000,000 particles distributed uniformly. The box is divided into cells of size 100*100, so there are 10,000 cells in total. At every time step (for a total of 2016 steps), I would like to identify the cell to which each particle belongs. Is there an efficient way to implement this in Python? My implementation is below and currently takes approximately 83 s for one run.
import numpy as np
import time
start=time.time()
# Size of the layout
Layout = np.array([0,10000])
# Total Number of particles
Population = 1000000
# Array to hold the cell number
cell_number = np.zeros((Population),dtype=np.int32)
# Limits of each cell
boundaries = np.arange(0,10100,step=100)
cell_boundaries = np.dstack((boundaries[0:100],boundaries[1:101]))
# Position of Particles
points = np.random.uniform(0,Layout[1],size = (Population,2))
# Generating a list with the x,y boundaries of each cell in the grid
x = []
limit_list = cell_boundaries
for i in range(0,Layout[1]//100):
    for j in range(0,Layout[1]//100):
        x.append([limit_list[0][i,0],limit_list[0][i,1],limit_list[0][j,0],limit_list[0][j,1]])
# Identifying the cell to which the particles belong
i=0
for y in (x):
    cell_number[(points[:,1]>y[0])&(points[:,1]<y[1])&(points[:,0]>y[2])&(points[:,0]<y[3])]=i
    i+=1
print(time.time()-start)
I am not sure about your code: you seem to be accumulating the i variable globally, while it should be accumulated on a per-cell basis, correct? Something like cell_number[???] += 1, maybe?
Anyhow, the way I see it is from a different perspective. You could start by assigning each point a cell id, then invert the resulting array with a kind of counter function. I have implemented the following in PyTorch; you will most likely find equivalent utilities in NumPy.
The conversion from 2D point coordinates to cell ids corresponds to dividing the coordinates by the cell size (100), applying floor, and then unfolding them according to the grid's width (100 cells per row):
>>> p = torch.from_numpy(points).div(100).floor()
>>> p_unfold = p[:, 0]*100 + p[:, 1]
Then you can "invert" the statistics, i.e. find out how many particles there are in each cell based on the cell ids. This can be done using PyTorch's histogram counter torch.histc:
>>> torch.histc(p_unfold, bins=10000, min=0, max=10000)
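For reference, a NumPy equivalent of the same idea (again assuming 100*100 cells on a 100-cell-wide grid) could be:

import numpy as np

cell_size = 100
grid_width = 100  # number of cells per row

# cell id for every particle, shape (Population,)
cell_ids = (points[:, 0] // cell_size).astype(np.int32) * grid_width \
         + (points[:, 1] // cell_size).astype(np.int32)

# optional: number of particles in each of the 10,000 cells
counts = np.bincount(cell_ids, minlength=grid_width * grid_width)

This is fully vectorised, so it avoids the 10,000-iteration Python loop of the original implementation.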
There's some context to this, so bear with me please.
I have a list of lists, call it nested_lists, where each list is of the form [[1,2,3,...], [4,3,1,...]] (i.e. each list contains two lists of integers). Now, in each of these lists, the two lists of integers have the same length and two integers corresponding to the same index represent a coordinate in R^2.
So for example, (1,4) would be one coordinate from the above example.
Now, my task is to draw 5 unique coordinates from nested_lists uniformly (i.e. each coordinate has the same probability of being chosen), without replacement. That is, from all of the coordinates from the lists in nested_lists, I am trying to draw 5 unique coordinates uniformly without replacement.
One very straightforward way to do this would be to: (1) create a list of ALL the unique coordinates in nested_lists, and (2) use numpy.random.choice to sample 5 elements uniformly without replacement.
The code would be something like this:
import numpy as np

coordinates = []
# Get list of all unique coordinates
for lst in nested_lists:
    l = len(lst[0])
    for i in range(0, l):
        coordinate = (lst[0][i], lst[1][i])
        if coordinate not in coordinates:
            coordinates += [coordinate]
# np.random.choice only samples from 1-D arrays, so sample indices instead
indices = np.random.choice(len(coordinates), 5, replace=False, p=[1/len(coordinates)]*len(coordinates))
draws = [coordinates[k] for k in indices]
But getting a set of all the unique coordinates can be very computationally expensive, especially if nested_lists contains millions of lists, each with thousands of coordinates in them. So I'm looking for methods to perform the same draws without having to get a list of all the coordinates first.
One method I thought of would be to sample with weighted probabilities from each list in nested_lists.
So get a list of the sizes (number of coordinates) of each list, then go through each list and draw each coordinate with probability (size/sum(sizes))*(1/sum(sizes)). Repeating the process until 5 unique coordinates are drawn should then correspond to what we wanted to draw. The code would be something like this:
no_coordinates = lambda x: len(x[0])
sizes = list(map(no_coordinates, nested_lists))
i = 0
sum_sizes = sum(sizes)
draws = []
while i != 5:  # to make sure we get 5 draws
    for lst in nested_lists:
        size = len(lst[0])
        p = size/(sum_sizes**2)
        for j in range(0, size):
            if i >= 5:  # exit the loop when we reach 5 draws
                break
            if np.random.random() < p and (lst[0][j], lst[1][j]) not in draws:
                draws.append((lst[0][j], lst[1][j]))
                i += 1
The code above seems to be more computationally efficient, but I am not sure whether it actually draws with the probability that is required overall. From my calculation, the overall probability would be sum(sizes)/sum_sizes**2, which is the same as 1/sum_sizes (our required probability), but again, I'm not sure if this is correct.
So I was wondering if there are more efficient approaches to drawing like I want, and if my approach is actually correct or not.
You can use bootstrapping. Basically, the idea is to draw some large (but fixed) number of coordinates with replacement to estimate the probability of each coordinate. Then, you can subsample from this list using the transformed densities.
import random
from collections import Counter

bootstrap_sample_size = 1000
total_lists = len(nested_lists)
list_len = len(nested_lists[0][0])  # number of coordinates per list (assumed equal for all lists)
# a set would make more sense in this example;
# I used a Counter to allow for future statistical manipulations
c = Counter()
for _ in range(bootstrap_sample_size):
    x, y = random.randrange(total_lists), random.randrange(list_len)
    random_point = nested_lists[x][0][y], nested_lists[x][1][y]
    c.update((random_point,))
# now c contains counts for 1000 points drawn with replacement;
# let's just ignore these probabilities to get a uniform sample
result = random.sample(list(c.keys()), 5)
This will not be exactly uniform, but the bootstrap provides statistical guarantees that it will be arbitrarily close to a uniform distribution as bootstrap_sample_size is increased. 1000 samples is usually enough for most real-life applications.