I have been playing around with writing my own physics engine in Python as an exercise in physics and programming. I started out by following the tutorial located here. That went well, but then I found the article "Advanced Character Physics" by Thomas Jakobsen, which covers using Verlet integration for simulations, and I found it fascinating.
I have been attempting to write my own basic physics simulator using Verlet integration, but it turns out to be slightly more difficult than I first expected. While browsing for example programs to read, I stumbled across this one written in Python, and I also found this tutorial, which uses Processing.
What impresses me about the Processing version is how fast it runs. The cloth alone has 2400 different points being simulated, and that's not including the bodies.
The Python example only uses 256 particles for the cloth, and it runs at about 30 frames per second. When I increased the number of particles to 2401 (it has to be a square number for that program to work), it ran at about 3 fps.
Both of these work by storing instances of a particle object in a list and then iterating through the list, calling each particle's "update position" method. As an example, this is the part of the code from the Processing sketch that calculates each particle's new position:
for (int i = 0; i < pointmasses.size(); i++) {
    PointMass pointmass = (PointMass) pointmasses.get(i);
    pointmass.updateInteractions();
    pointmass.updatePhysics(fixedDeltaTimeSeconds);
}
EDIT: Here is the code from the python version I linked earlier:
"""
verletCloth01.py
Eric Pavey - 2010-07-03 - www.akeric.com
Riding on the shoulders of giants.
I wanted to learn now to do 'verlet cloth' in Python\Pygame. I first ran across
this post \ source:
http://forums.overclockers.com.au/showthread.php?t=870396
http://dl.dropbox.com/u/3240460/cloth5.py
Which pointed to some good reference, that was a dead link. After some searching,
I found it here:
http://www.gpgstudy.com/gpgiki/GDC%202001%3A%20Advanced%20Character%20Physics
Which is a 2001 SIGGRAPH paper by Thomas Jakobsen called:
"GDC 2001: Advanced Characer Physics".
This code is a Python\Pygame interpretation of that 2001 Siggraph paper. I did
borrow some code from 'domlebo's source code, it was a great starting point. But
I'd like to think I put my own flavor on it.
"""
#--------------
# Imports & Initis
import sys
from math import sqrt
# Vec2D comes from here: http://pygame.org/wiki/2DVectorClass
from vec2d import Vec2d
import pygame
from pygame.locals import *
pygame.init()
#--------------
# Constants
TITLE = "verletCloth01"
WIDTH = 600
HEIGHT = 600
FRAMERATE = 60
# How many iterations to run on our constraints per frame?
# This will 'tighten' the cloth, but slow the sim.
ITERATE = 2
GRAVITY = Vec2d(0.0,0.05)
TSTEP = 2.8
# How many pixels to position between each particle?
PSTEP = int(WIDTH*.03)
# Offset in pixels from the top left of screen to position grid:
OFFSET = int(.25*WIDTH)
#-------------
# Define helper functions, classes
class Particle(object):
    """
    Stores position, previous position, and where it is in the grid.
    """
    def __init__(self, screen, currentPos, gridIndex):
        # Current Position : m_x
        self.currentPos = Vec2d(currentPos)
        # Index [x][y] of Where it lives in the grid
        self.gridIndex = gridIndex
        # Previous Position : m_oldx
        self.oldPos = Vec2d(currentPos)
        # Force accumulators : m_a
        self.forces = GRAVITY
        # Should the particle be locked at its current position?
        self.locked = False
        self.followMouse = False
        self.colorUnlocked = Color('white')
        self.colorLocked = Color('green')
        self.screen = screen
    def __str__(self):
        return "Particle <%s, %s>"%(self.gridIndex[0], self.gridIndex[1])
    def draw(self):
        # Draw a circle at the given Particle.
        screenPos = (self.currentPos[0], self.currentPos[1])
        if self.locked:
            pygame.draw.circle(self.screen, self.colorLocked, (int(screenPos[0]),
                               int(screenPos[1])), 4, 0)
        else:
            pygame.draw.circle(self.screen, self.colorUnlocked, (int(screenPos[0]),
                               int(screenPos[1])), 1, 0)
class Constraint(object):
    """
    Stores 'constraint' data between two Particle objects. Stores this data
    before the sim runs, to speed sim and draw operations.
    """
    def __init__(self, screen, particles):
        self.particles = sorted(particles)
        # Calculate restlength as the initial distance between the two particles:
        self.restLength = sqrt(abs(pow(self.particles[1].currentPos.x -
                                       self.particles[0].currentPos.x, 2) +
                                   pow(self.particles[1].currentPos.y -
                                       self.particles[0].currentPos.y, 2)))
        self.screen = screen
        self.color = Color('red')
    def __str__(self):
        return "Constraint <%s, %s>"%(self.particles[0], self.particles[1])
    def draw(self):
        # Draw line between the two particles.
        p1 = self.particles[0]
        p2 = self.particles[1]
        p1pos = (p1.currentPos[0],
                 p1.currentPos[1])
        p2pos = (p2.currentPos[0],
                 p2.currentPos[1])
        pygame.draw.aaline(self.screen, self.color,
                           (p1pos[0], p1pos[1]), (p2pos[0], p2pos[1]), 1)
class Grid(object):
    """
    Stores a grid of Particle objects. Emulates a 2d container object. Particle
    objects can be indexed by position:
    grid = Grid()
    particle = grid[2][4]
    """
    def __init__(self, screen, rows, columns, step, offset):
        self.screen = screen
        self.rows = rows
        self.columns = columns
        self.step = step
        self.offset = offset
        # Make our internal grid:
        # _grid is a list of sublists.
        # Each sublist is a 'column'.
        # Each column holds a particle object per row:
        # _grid =
        # [[p00, [p10, [etc,
        # p01, p11,
        # etc], etc], ]]
        self._grid = []
        for x in range(columns):
            self._grid.append([])
            for y in range(rows):
                currentPos = (x*self.step+self.offset, y*self.step+self.offset)
                self._grid[x].append(Particle(self.screen, currentPos, (x,y)))
    def getNeighbors(self, gridIndex):
        """
        return a list of all neighbor particles to the particle at the given gridIndex:
        gridIndex = [x,y] : The particle index we're polling
        """
        possNeighbors = []
        possNeighbors.append([gridIndex[0]-1, gridIndex[1]])
        possNeighbors.append([gridIndex[0], gridIndex[1]-1])
        possNeighbors.append([gridIndex[0]+1, gridIndex[1]])
        possNeighbors.append([gridIndex[0], gridIndex[1]+1])
        neigh = []
        for coord in possNeighbors:
            # coord[0] indexes columns (x), coord[1] indexes rows (y)
            if (coord[0] < 0) or (coord[0] > self.columns-1):
                pass
            elif (coord[1] < 0) or (coord[1] > self.rows-1):
                pass
            else:
                neigh.append(coord)
        finalNeighbors = []
        for point in neigh:
            finalNeighbors.append((point[0], point[1]))
        return finalNeighbors
    #--------------------------
    # Implement Container Type:
    def __len__(self):
        # Total number of particles (len() of an int would raise TypeError)
        return self.rows * self.columns
    def __getitem__(self, key):
        return self._grid[key]
    def __setitem__(self, key, value):
        self._grid[key] = value
    #def __delitem__(self, key):
        #del(self._grid[key])
    def __iter__(self):
        for x in self._grid:
            for y in x:
                yield y
    def __contains__(self, item):
        for x in self._grid:
            for y in x:
                if y is item:
                    return True
        return False
class ParticleSystem(Grid):
    """
    Implements the verlet particles physics on the encapsulated Grid object.
    """
    def __init__(self, screen, rows=49, columns=49, step=PSTEP, offset=OFFSET):
        super(ParticleSystem, self).__init__(screen, rows, columns, step, offset)
        # Generate our list of Constraint objects. One is generated between
        # every particle connection.
        self.constraints = []
        for p in self:
            neighborIndices = self.getNeighbors(p.gridIndex)
            for ni in neighborIndices:
                # Get the neighbor Particle from the index:
                n = self[ni[0]][ni[1]]
                # Let's not add duplicate Constraints, which would be easy to do!
                new = True
                for con in self.constraints:
                    if n in con.particles and p in con.particles:
                        new = False
                if new:
                    self.constraints.append( Constraint(self.screen, (p,n)) )
        # Lock our top left and right particles by default:
        self[0][0].locked = True
        self[1][0].locked = True
        self[-2][0].locked = True
        self[-1][0].locked = True
    def verlet(self):
        # Verlet integration step:
        for p in self:
            if not p.locked:
                # make a copy of our current position
                temp = Vec2d(p.currentPos)
                p.currentPos += p.currentPos - p.oldPos + p.forces * TSTEP**2
                p.oldPos = temp
            elif p.followMouse:
                temp = Vec2d(p.currentPos)
                p.currentPos = Vec2d(pygame.mouse.get_pos())
                p.oldPos = temp
    def satisfyConstraints(self):
        # Keep particles together:
        for c in self.constraints:
            delta = c.particles[0].currentPos - c.particles[1].currentPos
            deltaLength = sqrt(delta.dot(delta))
            try:
                # You can get a ZeroDivisionError here once, so let's catch it.
                # I think it's when particles sit on top of one another due to
                # being locked.
                diff = (deltaLength-c.restLength)/deltaLength
                if not c.particles[0].locked:
                    c.particles[0].currentPos -= delta*0.5*diff
                if not c.particles[1].locked:
                    c.particles[1].currentPos += delta*0.5*diff
            except ZeroDivisionError:
                pass
    def accumulateForces(self):
        # This doesn't do much right now, other than constantly reset the
        # particles 'forces' to be 'gravity'. But this is where you'd implement
        # other things, like drag, wind, etc.
        for p in self:
            p.forces = GRAVITY
    def timeStep(self):
        # This executes the whole shebang:
        self.accumulateForces()
        self.verlet()
        for i in range(ITERATE):
            self.satisfyConstraints()
    def draw(self):
        """
        Draw constraint connections, and particle positions:
        """
        for c in self.constraints:
            c.draw()
        #for p in self:
        #    p.draw()
    def lockParticle(self):
        """
        If the mouse LMB is pressed for the first time on a particle, the particle
        will assume the mouse motion. When it is pressed again, it will lock
        the particle in space.
        """
        mousePos = Vec2d(pygame.mouse.get_pos())
        for p in self:
            dist2mouse = sqrt(abs(pow(p.currentPos.x -
                                      mousePos.x, 2) +
                                  pow(p.currentPos.y -
                                      mousePos.y, 2)))
            if dist2mouse < 10:
                if not p.followMouse:
                    p.locked = True
                    p.followMouse = True
                    p.oldPos = Vec2d(p.currentPos)
                else:
                    p.followMouse = False
    def unlockParticle(self):
        """
        If the RMB is pressed on a particle, if the particle is currently
        locked or being moved by the mouse, it will be 'unlocked'/stop following
        the mouse.
        """
        mousePos = Vec2d(pygame.mouse.get_pos())
        for p in self:
            dist2mouse = sqrt(abs(pow(p.currentPos.x -
                                      mousePos.x, 2) +
                                  pow(p.currentPos.y -
                                      mousePos.y, 2)))
            if dist2mouse < 5:
                p.locked = False
#------------
# Main Program
def main():
    # Screen Setup
    screen = pygame.display.set_mode((WIDTH, HEIGHT))
    clock = pygame.time.Clock()
    # Create our grid of particles:
    particleSystem = ParticleSystem(screen)
    backgroundCol = Color('black')
    # main loop
    looping = True
    while looping:
        clock.tick(FRAMERATE)
        pygame.display.set_caption("%s -- www.AKEric.com -- LMB: move\lock - RMB: unlock - fps: %.2f"%(TITLE, clock.get_fps()) )
        screen.fill(backgroundCol)
        # Detect for events
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                looping = False
            elif event.type == MOUSEBUTTONDOWN:
                if event.button == 1:
                    # See if we can make a particle follow the mouse and lock
                    # its position when done.
                    particleSystem.lockParticle()
                if event.button == 3:
                    # Try to unlock the current particles position:
                    particleSystem.unlockParticle()
        # Do stuff!
        particleSystem.timeStep()
        particleSystem.draw()
        # update our display:
        pygame.display.update()
#------------
# Execution from shell\icon:
if __name__ == "__main__":
    print "Running Python version:", sys.version
    print "Running PyGame version:", pygame.ver
    print "Running %s.py"%TITLE
    sys.exit(main())
Because both programs work roughly the same way, but the Python version is SO much slower, it makes me wonder:
Is this performance difference part of the nature of Python?
What should I do differently from the above if I want to get better performance from my own Python programs? E.g store the properties of all particles inside an array instead of using individual objects, etc.
EDIT: Answered!!
@Mr E's linked PyCon talk in the comments and @A. Rosa's answer with its linked resources all helped ENORMOUSLY in better understanding how to write good, fast Python code. I am now bookmarking this page for future reference :D
There is an article by Guido van Rossum linked in the Performance Tips section of the Python Wiki. Its conclusion includes the following sentence:
If you feel the need for speed, go for built-in functions - you can't beat a loop written in C.
The essay continues with a list of guidelines for loop optimization. I recommend both resources, since they give concrete and practical advice on optimizing Python code.
There is also a well-known group of benchmarks at benchmarksgame.alioth.debian.org, where you can find comparisons among different programs and languages on different machines. As can be seen there, so many variables are in play that it is impossible to state something as broad as "Java is faster than Python". This is commonly summed up in the sentence "Languages don't have speeds; implementations do".
Your code could use more Pythonic and faster alternatives based on built-in functions. For example, there are several nested loops (some of which don't need to process the whole list) that can be rewritten using imap or list comprehensions. PyPy is another interesting option for improving performance. I'm not an expert on Python optimization, but there are lots of tips that are extremely useful (notice that "don't write Java in Python" is one of them!).
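As one concrete case, the constraint setup in ParticleSystem.__init__ rescans the whole constraint list for every candidate pair. The following is only a sketch, assuming a rectangular grid indexed as (x, y) like the posted code; the function name build_constraint_pairs is hypothetical. Linking each particle only to its right-hand and lower neighbor produces every pair exactly once, so no duplicate check is needed.

```python
def build_constraint_pairs(columns, rows):
    """Return each pair of neighboring (x, y) grid indices exactly once."""
    pairs = []
    for x in range(columns):
        for y in range(rows):
            if x + 1 < columns:      # neighbor to the right
                pairs.append(((x, y), (x + 1, y)))
            if y + 1 < rows:         # neighbor below
                pairs.append(((x, y), (x, y + 1)))
    return pairs


pairs = build_constraint_pairs(3, 3)
print(len(pairs))
```

Each resulting pair could then be turned into a Constraint object in a single pass over the list.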
Resources and other related questions on SO:
Performance differences between Python and C
Is it reasonable to integrate python with c for performance?
http://www.ibm.com/developerworks/opensource/library/os-pypy-intro/index.html?ca=drs-
http://pyevolve.sourceforge.net/wordpress/?p=1189
If you write Python like you write Java, of course it's going to be slower; idiomatic Java does not translate well to idiomatic Python.
Is this performance difference part of the nature of Python?
What should I do differently from the above if I want to get better performance from my own Python programs? E.g store the properties of all particles inside an array instead of using individual objects, etc.
Hard to say without seeing your code.
Here is an incomplete list of differences between Python and Java that may sometimes affect performance:
Processing uses an immediate-mode canvas; if you want comparable performance in Python, you also need to use an immediate-mode canvas. Canvases in most GUI frameworks (including the Tkinter canvas) are retained mode, which is easier to use but inherently slower than immediate mode. You'll need to use an immediate-mode canvas like those provided by pygame, SDL, or Pyglet.
Python is a dynamic language, which means that instance member access, module member access, and global variable access are resolved at run time; in Python each of these is really a dictionary access. In Java they are resolved at compile time and are by nature much faster. Cache frequently accessed globals, module variables, and attributes in local variables.
In Python 2.x, range() produces a concrete list. In Python, iteration done using an iterator (for item in list) is usually faster than iteration done using an index variable (for n in range(len(list))). You should almost always iterate directly using the iterator instead of iterating over range(len(...)).
Python's numbers are immutable, which means any arithmetic calculation allocates a new object. This is one reason why plain Python is not very suitable for low-level calculations; most people who want to write low-level calculations without resorting to writing a C extension typically use Cython, Psyco, or NumPy. This usually only becomes a problem when you have millions of calculations, though.
This is just a partial, very incomplete list; there are many other reasons why translating Java to Python would produce suboptimal code. Without seeing your code it's impossible to tell what you need to do differently. Optimized Python code generally looks very different from optimized Java code.
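The points about caching lookups and iterating directly can be illustrated with a small sketch. The Particle class, the one-dimensional positions, and both function names are hypothetical stand-ins rather than code from the question. The two functions compute the same result; how much faster the second one is depends on the interpreter, so measure it with timeit rather than assuming a number.

```python
class Particle(object):
    """Hypothetical stand-in holding only the fields this example needs."""
    def __init__(self, pos, old):
        self.pos = pos
        self.old = old


GRAVITY = 0.05
TSTEP = 2.8


def step_indexed(particles):
    # Index-based loop; GRAVITY and TSTEP are looked up as globals on every pass.
    for i in range(len(particles)):
        p = particles[i]
        new = 2.0 * p.pos - p.old + GRAVITY * TSTEP ** 2
        p.old = p.pos
        p.pos = new


def step_direct(particles):
    # Hoist the global lookups and the constant product out of the loop,
    # then iterate over the list directly.
    accel = GRAVITY * TSTEP ** 2
    for p in particles:
        pos = p.pos
        p.pos = 2.0 * pos - p.old + accel
        p.old = pos


a = [Particle(float(i), float(i) - 0.5) for i in range(5)]
b = [Particle(float(i), float(i) - 0.5) for i in range(5)]
step_indexed(a)
step_direct(b)
print([p.pos for p in a] == [p.pos for p in b])
```

To compare the two on your own machine, something like timeit.timeit(lambda: step_direct(b), number=1000) can be run for each function.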
I would also suggest reading about other physics engines. There are a few open source engines which use a variety of methods for calculating the "physics".
Newton Game Dynamics
Chipmunk
Bullet
Box2D
ODE (Open Dynamics Engine)
There are also ports of most of the engines:
Pymunk
PyBullet
PyBox2D
PyODE
If you read through the documentation of those engines you will often find statements saying that they are optimized for speed (30 fps to 60 fps). But if you think they can do this while calculating "real" physics, you are wrong. Most engines calculate physics only to the point where a normal user cannot visually distinguish between "real" physical behavior and "simulated" physical behavior. If you investigate the error, it is negligible if you want to write games. But if you want to do physics, all of those engines are of no use to you.
That's why I would say that if you are doing a real physical simulation, you are slower than those engines by design, and you will never outrun another physics engine.
Particle-based physics simulation translates easily into linear algebra operations, i.e. matrix operations. NumPy offers such operations, which are implemented in Fortran/C/C++ under the hood. Well-written Python/NumPy code (taking full advantage of the language and the library) can be decently fast.
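As a rough sketch of what that could look like for the cloth code in the question: positions live in (N, 2) arrays and constraints are stored as two index arrays plus rest lengths. All names here are assumptions made for illustration. Note one deliberate difference from the posted code: the constraint pass below corrects all constraints at once from the same starting positions (Jacobi-style) instead of one after another, so it may need more iterations to look equally stiff.

```python
import numpy as np


def verlet_step(pos, old_pos, accel, dt, locked):
    """Advance all unlocked particles by one Verlet step.

    Returns (new_pos, new_old_pos); the inputs are left unmodified.
    """
    new_pos = pos + (pos - old_pos) + accel * dt ** 2
    new_pos[locked] = pos[locked]          # locked particles stay put
    return new_pos, pos.copy()


def satisfy_constraints(pos, i, j, rest, locked):
    """One relaxation pass over every constraint at once (Jacobi-style)."""
    delta = pos[i] - pos[j]                          # (M, 2) offsets
    length = np.sqrt((delta ** 2).sum(axis=1))       # (M,) current lengths
    length = np.where(length == 0.0, 1.0, length)    # avoid division by zero
    diff = (length - rest) / length
    corr = 0.5 * delta * diff[:, None]
    out = pos.copy()
    # A particle appears in several constraints, so accumulate with ufunc.at.
    np.subtract.at(out, i, corr)
    np.add.at(out, j, corr)
    out[locked] = pos[locked]
    return out


# Two particles 2 units apart joined by a constraint of rest length 1.
pos = np.array([[0.0, 0.0], [2.0, 0.0]])
locked = np.array([False, False])
relaxed = satisfy_constraints(pos, np.array([0]), np.array([1]),
                              np.array([1.0]), locked)
print(relaxed)
```

The per-frame cost then sits in a handful of array operations instead of a Python-level loop over particle objects.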
Related
I currently have an array of desired position vs. time of an object in my plant. I am using an inverse dynamics controller in order to drive the object to this desired position but I'm experiencing some difficulties. Here is how I am doing this:
I created the controller system
ID_cont = InverseDynamicsController(robot=controller_plant, kp=np.array([0.5]), ki=np.array([0.3]), kd=np.array([0.4]), has_reference_acceleration=False)
ID_controller = builder.AddSystem(ID_cont)
I got the controller input and output ports
control_estimated_state_input_port = ID_controller.get_input_port(0)
control_desired_state_input_port = ID_controller.get_input_port(1)
control_output_port = ID_controller.get_output_port(0)
I added a constant state source (likely wrong to do) and a state interpolator
constant_state_source = ConstantVectorSource(np.array([0.0]))
builder.AddSystem(constant_state_source)
position_to_state = StateInterpolatorWithDiscreteDerivative(controller_plant.num_positions(),
controller_plant.time_step())
builder.AddSystem(position_to_state)
I wired the controller to the plant
builder.Connect(constant_state_source.get_output_port(), position_to_state.get_input_port())
builder.Connect(position_to_state.get_output_port(), control_desired_state_input_port)
builder.Connect(plant.get_state_output_port(model_instance_1), control_estimated_state_input_port)
builder.Connect(control_output_port, plant.get_actuation_input_port(model_instance_1))
Next, I am trying to create a while loop that advances the simulation and changes the 'constant vector source' so I can feed in my position vs. time values. I'm unsure whether this isn't working because it is the completely wrong approach, or because it is the right approach and I just have a few things wrong.
diagram_context = diagram.CreateDefaultContext()
sim_time_temp = diagram_context.get_time()
time_step = 0.1
while sim_time_temp < duration:
    ID_controller_context = diagram.GetMutableSubsystemContext(ID_controller, diagram_context)
    simulator.AdvanceTo(sim_time_temp)
    sim_time_temp = sim_time_temp + time_step
I added a constant state source (likely wrong to do) and a state interpolator
As you suspected, this is not the best way to go if you already have a desired sequence of positions and times that you want the system to track. Instead, you should use a TrajectorySource. Since you have a set of position samples, positions (a num_times x num_positions array), that you'd like the system to hit at specified times, times (a num_times x 1 array), PiecewisePolynomial.CubicShapePreserving is a reasonable choice for building the trajectory.
desired_position_trajectory = PiecewisePolynomial.CubicShapePreserving(times, positions)
desired_state_source = TrajectorySource(desired_position_trajectory,
output_derivative_order=1)
builder.AddSystem(desired_state_source)
The output_derivative_order=1 argument makes desired_state_source output a [position, velocity] vector rather than just a position vector. You can connect desired_state_source directly to the controller, without an interpolator.
With this setup, you can advance the simulation all the way to duration without the need for a while loop.
I've created an image-approximating genetic algorithm using Python 3 and OpenCV. It creates a population of individuals that draw circles of random color, size, and opacity onto a blank image. The fittest eventually saturate the population after several hundred generations.
I tried to implement multiprocessing because rendering the images takes time that grows with population size and circle size, as well as target image size (important for fineness of detail).
I used multiprocessing and Pool, with the array of individual objects as the iterable, and mapped out only the fitness and id. In effect, none of the individuals have their own canvas in the main process, whereas in the worker processes each individual renders its canvas and calculates its fitness/difference.
However, it seems that using multiprocessing makes the whole program slower. In fact, the rendering itself seems to take the same amount of time as in serial processing, but the total is slower because of the multiprocessing aspect.
class PopulationCircle:
    def renderPop(self, individual):
        individual.render()
        return [individual.index, individual.fitness]
class IndividualCircle:
    def render(self):
        self.genes.sort(key=lambda x: x[-1], reverse=False)
        self.canvas = np.zeros((self.height,self.width, 4), np.uint8)
        for i in range(self.maxCount):
            overlay=self.canvas.copy()
            cv2.circle(overlay, (self.genes[i][0], self.genes[i][1]), self.genes[i][2], (self.genes[i][3],self.genes[i][4],self.genes[i][5]), -1, lineType=cv2.LINE_AA)
            self.canvas = cv2.addWeighted(overlay, self.genes[i][6], self.canvas, 1-self.genes[i][6], 0)
        diff = np.absolute(np.array(self.target)- np.array(self.canvas))
        diffSum = np.sum(diff)
        self.fitness = diffSum
def evolution(mainPop, generationLimit):
    p = mp.Pool()
    for i in range(int(generationLimit)):
        start_time = time.time()
        result =[]
        print(f"""
-----------------------------------------
Current Generation: {mainPop.generation}
Initial Score: {mainPop.score}
-----------------------------------------
""")
        #Multiprocessing used for rendering out canvas since it takes time.
        result = p.map(mainPop.renderPop, mainPop.population)
        #returns [individual.index, individual.fitness]; results is a list of list
        result.sort(key = lambda x: x[0], reverse=False)
        #Once multiprocessing is done, we only receive fitness value and index.
        for k in mainPop.population:
            k.fitness = result[k.index][1]
        mainPop.population.sort(key = lambda x: x.fitness, reverse = True)
        if mainPop.generation == 0:
            mainPop.score = mainPop.population[-1].fitness
        """
        Things to note:
        In main process, none of the individuals have a canvas since the rendering
        is done on a different process tree.
        The only thing that changes in this main process is the individual's
        fitness.
        After calling .renderHD and .renderLD, the fittest member will have a canvas
        drawn in this process.
        """
        end_time = time.time() - start_time
        print(f"Time taken: {end_time}")
        if i%50==0:
            mainPop.population[0].renderHD()
            cv2.imwrite( f"../output/generationsPoly/generation{i}.jpg", mainPop.population[0].canvasHD)
        if i%10==0:
            mainPop.population[0].renderLD()
            cv2.imwrite( f"../output/allGenPoly/image{i}.jpg", mainPop.population[0].canvas)
        mainPop.toJSON()
        mainPop.breed()
    p.close()
    p.join()
if __name__ == "__main__":
    #Creates Population object
    #init generates self.population array which is an array of IndividualCircle objects that contain DNA and render methods
    pop = PopulationCircle(targetDIR, maxPop, circleAmount, mutationRate, mutationAmount, cutOff)
    #Starts loop
    evolution(pop, generations)
If I use a population of 600 with 800 circles:
serial: 11 s/iteration avg.
multiprocess: 18 s/iteration avg.
I'm very new to multiprocessing so any help would be appreciated.
The reason this happens is that OpenCV internally spawns a lot of threads. When you fork from the main process and run a number of worker processes, each of them creates its own bunch of OpenCV threads, resulting in a small avalanche. The problem is that they end up syncing and waiting for a lock to be released, which you can easily check by profiling your code with cProfile.
The problem is described in the joblib docs. That's also likely your solution: switch to joblib. I had a similar problem in the past; you will find it in this SO post.
[EDIT] Extra piece of evidence and solution here. In short, according to that post, it's a known problem, but since OpenCV releases the GIL, it may be possible to use multithreading instead of multiprocessing and thereby reduce the overhead.
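The multithreading route can be sketched with the standard library's thread pool. The render_and_score function and the tuple-based population below are hypothetical stand-ins for IndividualCircle.render() and the real population; threads only pay off if the real work releases the GIL, which is what the linked post reports for OpenCV.

```python
from multiprocessing.pool import ThreadPool


def render_and_score(individual):
    """Hypothetical stand-in for rendering; returns (index, fitness)."""
    index, genes = individual
    fitness = sum(genes)        # placeholder for the real image difference
    return index, fitness


# Placeholder population: (index, genes) tuples instead of real objects.
population = [(k, [k, k + 1, k + 2]) for k in range(8)]

# Threads share memory, so nothing is pickled and copied between processes.
pool = ThreadPool(4)
try:
    results = pool.map(render_and_score, population)
finally:
    pool.close()
    pool.join()

fitness_by_index = dict(results)
print(fitness_by_index[3])
```

Because the threads share the individuals with the main thread, a real version could also keep the rendered canvases instead of returning only the fitness.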
I am using the threading library to speed up calculating each point's neighborhood in a point cloud, by calling the function CalculateAllPointsNeighbors shown at the bottom of the post.
The function receives a search radius, a maximum number of neighbors, and the number of threads to split the work across. None of the points are modified, and each point stores its data in its own np.ndarray cell, accessed by its own index.
The following function times how long it takes N threads to finish calculating all the points' neighborhoods:
def TimeFuncThreads(classObj, uptothreads):
    listTimers = []
    startNum = 1
    EndNum = uptothreads + 1
    for i in range(startNum, EndNum):
        print("Current Number of Threads to Test: ", i)
        tempT = time.time()
        classObj.CalculateAllPointsNeighbors(searchRadius=0.05, maxNN=25, maxThreads=i)
        tempT = time.time() - tempT
        listTimers.append(tempT)
    PlotXY(np.arange(startNum, EndNum), listTimers)
The problem is that I've been getting very different results on each run. Here are the plots from 5 consecutive runs of the function TimeFuncThreads. The X axis is the number of threads and the Y axis is the runtime. First, they look totally random. Second, there is no significant speedup.
I'm confused now: am I using the threading library wrong, and what explains the behavior I'm getting?
The function that handles the threading and the function that is being called from each thread:
def CalculateAllPointsNeighbors(self, searchRadius=0.20, maxNN=50, maxThreads=8):
    threadsList = []
    pointsIndices = np.arange(self.numberOfPoints)
    splitIndices = np.array_split(pointsIndices, maxThreads)
    for i in range(maxThreads):
        threadsList.append(threading.Thread(target=self.GetPointsNeighborsByID,
                                            args=(splitIndices[i], searchRadius, maxNN)))
    [t.start() for t in threadsList]
    [t.join() for t in threadsList]
def GetPointsNeighborsByID(self, idx, searchRadius=0.05, maxNN=20):
    if isinstance(idx, int):
        idx = [idx]
    for currentPointIndex in idx:
        currentPoint = self.pointsOpen3D.points[currentPointIndex]
        pointNeighborhoodObject = self.GetPointNeighborsByCoordinates(currentPoint, searchRadius, maxNN)
        self.pointsNeighborsArray[currentPointIndex] = pointNeighborhoodObject
        self.__RotatePointNeighborhood(currentPointIndex)
It pains me to be the one to introduce you to the Python GIL (Global Interpreter Lock), which lets only one thread execute Python bytecode at a time. It is a very nice feature that makes parallelism using threads in Python a nightmare.
If you really want to improve your code's speed, you should be looking at the multiprocessing module.
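A minimal sketch of that approach, using only the standard library: the point indices are split into chunks and each chunk is handled by a separate process. The brute-force neighbor count is a hypothetical stand-in for the real neighborhood search, and all function names are assumptions for illustration (math.dist needs Python 3.8 or later).

```python
import math
from multiprocessing import Pool


def count_neighbors_chunk(args):
    """Worker: count the neighbors within `radius` for one chunk of indices."""
    indices, points, radius = args
    counts = []
    for i in indices:
        n = sum(1 for k, q in enumerate(points)
                if k != i and math.dist(points[i], q) <= radius)
        counts.append((i, n))
    return counts


def split_indices(n_points, n_chunks):
    """Split range(n_points) into n_chunks interleaved index lists."""
    return [list(range(k, n_points, n_chunks)) for k in range(n_chunks)]


def count_neighbors_parallel(points, radius, n_workers=4):
    """Run the worker on every chunk in its own process and merge the results."""
    tasks = [(chunk, points, radius)
             for chunk in split_indices(len(points), n_workers)]
    with Pool(n_workers) as pool:
        parts = pool.map(count_neighbors_chunk, tasks)
    return dict(pair for part in parts for pair in part)
```

Call count_neighbors_parallel from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Each task carries a pickled copy of the points, so for small inputs the copying can cost more than the parallelism saves.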
An N-body simulation is used to simulate the dynamics of a physical system involving interacting particles, or a problem reduced to some kind of particles with physical meaning. A particle could be a gas molecule or a star in a galaxy. Dask.bag provides a simple way to distribute the particles in a cluster, for example by giving dask.bag.from_sequence() a custom iterator that returns a particle object:
class ParticleGenerator():
    def __init__(self, num_of_particles, max_position, seed=time.time()):
        random.seed(seed)
        self.index = -1
        self.limit = num_of_particles
        self.max_position = max_position
    def __iter__(self):
        return self
    def __next__(self):
        self.index += 1
        if self.index < self.limit :
            return np.array([self.max_position*random.random(), self.max_position*random.random(), self.max_position*random.random()])
        else :
            raise StopIteration
b = db.from_sequence( ParticleGenerator(1000, 1, seed=123456789) )
Here, the particle object is simply a NumPy array, but it could be anything. Now, to compute the interactions between all particles, information about position, speed, and similar quantities must be shared. dask.bag.map maps a function across all elements in the collection; inside this function, the interaction between the element and all other particles is calculated to obtain the new particle state.
b = b.map(update_position, others=list(b))
b.compute()
For completeness, this is the update_position function:
def update_position(e, others=None, mass=1, dt=1e-4):
    f = np.zeros(3)
    for o in others:
        r = e - o
        r_mag = np.sqrt(r.dot(r))
        if r_mag == 0 :
            continue
        f += ( A/(r_mag**7) + B/(r_mag**13) ) * r
    return e + f * (dt**2 / mass)
A and B are some arbitrary values. dask.bag.map() could be called multiple times inside a loop to execute the simulation.
Is Dask.bag a good collection (abstraction) for dealing with this kind of problem? Maybe Dask.distributed is a better idea?
When the simulation is programmed this way, is the scheduler handling all communication, or is information about position, speed, etc. shared through inter-worker communication?
Any comments on optimizing the code? Especially about the overhead of transforming the collection into a list when calling dask.bag.map().
Generally speaking, N-body simulations require sophisticated algorithms and data structures to run efficiently. Many common solutions include the use of complex tree data structures. You might want to search for terms like kd-tree or Barnes-Hut.
Dask.bag on the other hand is one of the simplest/dumbest parallel programming abstractions you can imagine, similar to other bulk data processing systems like MapReduce and Spark. These systems are not flexible enough to give good performance on complex problems like N-Body simulations.
Something like dask.array or dask.delayed will offer more flexibility, but even these won't be the same as a finely tuned KD-Tree.
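To give an idea of the array-based direction, here is a sketch that applies the question's update_position rule to all particles at once with plain NumPy broadcasting. It is still the brute-force all-pairs method, not a tree algorithm, and the values of A and B are arbitrary placeholders, as in the question. The same array expression is the kind of thing dask.array could split into chunks.

```python
import numpy as np

A, B = 1.0, -1.0   # arbitrary placeholder coefficients, as in the question


def update_positions(pos, mass=1.0, dt=1e-4):
    """Apply the question's update rule to an (N, 3) array of positions."""
    r = pos[:, None, :] - pos[None, :, :]        # (N, N, 3) pairwise offsets
    r_mag = np.sqrt((r ** 2).sum(axis=-1))       # (N, N) pairwise distances
    r_mag[r_mag == 0.0] = np.inf                 # skip zero-distance pairs
    coeff = A / r_mag ** 7 + B / r_mag ** 13     # becomes 0 for skipped pairs
    f = (coeff[:, :, None] * r).sum(axis=1)      # (N, 3) summed interaction
    return pos + f * (dt ** 2 / mass)


positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 2.0, 0.0]])
print(update_positions(positions).shape)
```

The intermediate (N, N, 3) array grows quadratically: about 24 MB of float64 values for 1000 particles, and a hundred times that for 10000. That growth is the practical reason to move to the tree-based methods mentioned above.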
Not long ago I had my first dumb question answered here, so... here I am again, with a hopefully less dumb and more interesting headscratcher. Keep in mind that I am still taking my baby steps in scripting!
Here it is: I need to rig a feathered wing, and I already have all the feathers in place. I thought of mimicking another rig I animated recently, which had the feathers point-constrained to the arm and forearm and orient-constrained to three other controllers on the arm. Each feather was constrained to two of those controllers at a time, and the constraint weights would shift as you went down the forearm towards the wrist, so that a feather exactly midway between the elbow and the forearm would be equally constrained by both controllers... you get the picture.
My reasoning was as follows: make a loop that iterates over every feather, gets its world position, finds the distance from that feather to each of the orient controllers (through Pythagoras), normalizes that, and feeds the values into the weight attribute of an orient constraint. I could even go the extra mile and pass the normalized distance through a sine function to get a nice easing into the feathers' silhouette.
My pseudo-code is ugly and broken, but it's a try. My issues are inlined.
Second try!
It works now, but only on the active object instead of the whole selection. What could be happening?
import maya.cmds as cmds
# find world space position of targets
base_pos = cmds.xform('base',q=1,ws=1,rp=1)
tip_pos = cmds.xform('tip',q=1,ws=1,rp=1)
def relative_dist_from_pos(pos, ref):
    # vector substract to get relative pos
    pos_from_ref = [m - n for m, n in zip(pos, ref)]
    # pythagoras to get distance from vector
    dist_from_ref = (pos_from_ref[0]**2 + pos_from_ref[1]**2 + pos_from_ref[2]**2)**.5
    return dist_from_ref
def weight_from_dist(dist_from_base, dist_to_tip):
    normalize_fac = (1/(dist_from_base + dist_to_tip))
    dist_from_base *= normalize_fac
    dist_to_tip *= normalize_fac
    return dist_from_base, dist_to_tip
sel = cmds.ls(selection=True)
for obj in sel:
# find world space pos of feather
feather_pos = cmds.xform(obj, q=1, ws=1, rp=1)
# call relative_dist_from_pos
dist_from_base = relative_dist_from_pos(feather_pos, base_pos)
dist_to_tip = relative_dist_from_pos(feather_pos, tip_pos)
# normalize distances
weight_from_dist(dist_from_base, dist_to_tip)
# constrain the feather - weights are inverted
# because the smaller the distance, the stronger the constraint
cmds.orientConstraint('base', obj, w=dist_to_tip)
cmds.orientConstraint('tip', obj, w=dist_from_base)
There you are. Any pointers are appreciated.
Have a good night,
Hadriscus