Fastest screenshot library python/improve performance of mss package - python

Python 3.6, Windows 10:
I am trying to take one (partial) screenshot every 1-5 milliseconds to then run some custom OCR on it to extract some data.
My code for taking the screenshots using the mss package takes between 16 and 47ms depending upon the number of pixels I try to capture.
I have three separate questions:
1.) Is there an alternative to mss that is faster?
2.) Is there a way to speed up mss by factor 2-3?
3.) How can I tell, from the code profiling/cProfile output shown below, where performance improvements could come from? The way I read the output, a lot of time is spent in the "grab" function, but it's unclear what inside the grab function actually takes so long.
from mss import mss
import mss.tools as mss_tools
import cProfile, pstats, io

def profile(fnc):
    def inner(*args, **kwargs):
        pr = cProfile.Profile()
        pr.enable()
        retval = fnc(*args, **kwargs)
        pr.disable()
        s = io.StringIO()
        sortby = 'cumulative'
        ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
        ps.print_stats()
        print(s.getvalue())
        return retval
    return inner

@profile
def main():
    with mss() as sct:
        for i in range(100):
            monitor = sct.monitors[1]
            left = monitor["left"]
            top = monitor["top"]
            right = left + 1
            lower = top + 1
            bbox = (left, top, right, lower)
            shot = sct.grab(bbox)
            # mss_tools.to_png(shot.rgb, shot.size, output='partialscreen.png')  # no performance difference with or without this
            # sct.shot()  # takes much more time (almost factor 10 higher compared to taking a large share of the screen)

main()

I am the MSS developer :)
Completely impartially, I do not think there is anything faster than MSS. But if we can make it even faster, I am +1000 on that :)
A little improvement, not related to MSS, would be to move the invariant variables out of the for loop:
@profile
def main():
    with mss() as sct:
        monitor = sct.monitors[1]
        left = monitor["left"]
        top = monitor["top"]
        right = left + 1
        lower = top + 1
        bbox = (left, top, right, lower)
        for i in range(100):
            shot = sct.grab(bbox)
To measure what goes on inside MSS.grab(), perhaps you could add the @profile decorator onto that method in MSS. Ugly, but for testing it is OK.
In that method, there are 2 things that may take time:
BitBlt()
GetDIBits()
I am curious to know which of the two takes most of the time in that method.
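One way to apply that decorator without editing the installed package is to monkey-patch the backend class before creating any instance. A minimal sketch, assuming the Windows backend (mss.windows.MSS, since the question is on Windows 10) and the profile decorator defined in the question:

import mss.windows                     # Windows backend class lives here (assumption: Windows)
from mss import mss as mss_factory

# Wrap grab() with the question's cProfile-based decorator so every call
# prints a cumulative profile of what happens inside it.
mss.windows.MSS.grab = profile(mss.windows.MSS.grab)

with mss_factory() as sct:
    sct.grab(sct.monitors[1])          # one profiled grab of the full monitor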

Related

Why does my script to present audio stimuli have unpredictable behaviour?

I am trying to run a simple experiment using Python. I want to present two different types of audio stimuli, a higher and a lower pitch. The higher pitch has a fixed duration of 200 ms, while the lower pitch comes in pairs: the first with a fixed duration of 250 ms and the second with a variable duration that can take the following values [.4, .6, .8, 1, 1.2]. I need to know at what (machine) time the stimuli start and end, and their duration (precision is not the most important issue; I have a tolerance of ~10 ms), so I log this information.
I am using the library audiomath to create and present the stimuli, and I have created several custom functions to manage the other aspects of the task. I have 3 scripts: one in which I define the functions, one in which I set the specific parameters of the experiment for each subject (source), and one with the main().
My problem is that the main() works erratically: sometimes it works, other times it seems to enter an infinite loop and a certain sound is presented and never stops playing. This behaviour seems to be really random; the problem presents itself at different trials, or not at all, even with exactly the same parameters.
This is my code:
source file
#%%imports
from exp_funcs import tone440Hz, tone880Hz
import numpy as np
#%%global var
n_long = 10
n_short = 10
short_duration = .2
long_durations = [.4, .6, .8, 1, 1.2]
#%%calculations
n_tot = n_long + n_long
trial_types = ['short_blink'] * n_short + ['long_blink'] * n_long
sounds = [tone880Hz] * n_short + [tone440Hz] * n_long
np.random.seed(10)
durations = [short_duration] * n_short + [el for el in np.random.choice(long_durations, n_long)]
durations = [.5 if el < .2 else el for el in durations]
cue_duration = [.25] * n_tot
spacing = [1.25] * n_tot
np.random.seed(10)
iti = [el for el in (3 + np.random.normal(0, .25, n_tot))]
functions
import numpy as np
import audiomath as am
import time
import pandas as pd

TWO_PI = 2.0 * np.pi

@am.Synth(fs=22050)
def tone880Hz(fs, sampleIndices, channelIndices):
    timeInSeconds = sampleIndices / fs
    return np.sin(TWO_PI * 880 * timeInSeconds)

@am.Synth(fs=22050)
def tone440Hz(fs, sampleIndices, channelIndices):
    timeInSeconds = sampleIndices / fs
    return np.sin(TWO_PI * 440 * timeInSeconds)
def short_blink(sound, duration):
    p = am.Player(sound)
    init = time.time()
    while time.time() < init + duration:
        p.Play()
    end = time.time()
    p.Stop()
    print(f'start {init} end {end} duration {end - init}')
    return (init, end, end - init)

def long_blink(sound, duration, cue_duration, spacing):
    p = am.Player(sound)
    i_ = time.time()
    while time.time() < i_ + cue_duration:
        p.Play()
    p.Stop()
    time.sleep(spacing)
    init = time.time()
    while time.time() < init + duration:
        p.Play()
    end = time.time()
    p.Stop()
    print(f'start {init} end {end} duration {end - init}')
    return (init, end, end - init)

def run_trial(ttype, sound, duration, cue_duration, spacing):
    if ttype == 'short_blink':
        init, end, effective_duration = short_blink(sound, duration)
    else:
        init, end, effective_duration = long_blink(sound, duration,
                                                   cue_duration, spacing)
    otp_df = pd.DataFrame([[ttype, init, end, effective_duration]],
                          columns=['trial type', 'start', 'stop',
                                   'effective duration'])
    return otp_df
main
import pandas as pd
import sys
import getopt
import os
import time
import random
from exp_funcs import run_trial
from pathlib import PurePath

def main(argv):
    try:
        opts, args = getopt.getopt(argv, 'hs:o:', ['help', 'source_file=', 'output_directory='])
    except getopt.GetoptError:
        print('experiment.py -s source file -o output directory')
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print('experiment.py -s source file')
            sys.exit()
        elif opt in ("-s", "--source_file"):
            source_file = arg
        elif opt in ("-o", "--output_directory"):
            output_dir = arg
    os.chdir(os.getcwd())
    if not os.path.isfile(f'{source_file}.py'):
        raise FileNotFoundError(f'{source_file} does not exist')
    else:
        source = __import__('source')
    complete_param = list(zip(source.trial_types,
                              source.sounds,
                              source.durations,
                              source.cue_duration,
                              source.spacing,
                              source.iti))
    # shuffle_param = random.sample(complete_param, len(complete_param))
    shuffle_param = complete_param
    dfs = []
    for ttype, sound, duration, cue_duration, spacing, iti in shuffle_param:
        time.sleep(iti)
        df = run_trial(ttype, sound, duration, cue_duration, spacing)
        dfs.append(df)
    dfs = pd.concat(dfs)
    dfs.to_csv(PurePath(f'{output_dir}/{source_file}.csv'), index=False)

if __name__ == "__main__":
    main(sys.argv[1:])
The 3 files are in the same directory; I cd into that directory in the terminal and run the main as follows: python experiment.py -s source -o /whatever/output/directory.
Any help would be more than appreciated
This is too big/complex a program to hope for help on non-specific "erratic" behavior here on stackoverflow. You need to boil it down into a small reproducible example that behaves unexpectedly. If it works sometimes and not others, systematically home in on the conditions that make it fail. I did make one attempt to run the whole thing, but after fixing a few missing imports there was still the matter of the unspecified "source file" content.
So I don't know specifically what your problem is. However, from the audiomath and general real-time-performance perspectives, I can certainly identify a few things you shouldn't be doing:
Although Player instances are designed to be played, stopped or manipulated at time-critical moments, they are not (by default) designed to be created and destroyed at time-critical moments. If you want to create/destroy them fast, pre-initialize a persistent Stream() instance and pass it as the stream argument when creating the Player, as described towards the end of https://audiomath.readthedocs.io/en/release/auto/Examples.html#play-sounds
If you are using Synth instances, you could take advantage of their .duration attribute instead of checking the clock explicitly in a while loop. For example, you can set tone880Hz.duration = 0.5, and then play the sound synchronously with p.Play(wait=True). The big problem with your clock-watching while loops is that they are currently "busy-wait" loops that will thrash the CPU, likely leading to sporadic disruption to your sound (Python's multithreading is far from perfect). However, before you fix this problem you should know...
The strategy "Play(), wait, sleep, Play()" is never going to achieve precise timing of one stimulus relative to the other anyway. First, whenever you issue a command to play a sound in any software, there will unavoidably be a non-zero (and randomly varying!) latency between the command and the physical onset of the sound. Second, sleep() is unlikely to be as precise as you think it is. This applies both to the sleep() you’ve been using to create a gap, and also to the sleep() that would be used internally by Play(wait=True). Sleep implementations suspend operation for "at least" the specified amount of time but they don't guarantee an upper bound on that. This is very hardware- and OS-dependent; on some Windows systems you may even find that the granularity never gets any better than 10ms.
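If you want to get a feel for this on your own machine, here is a quick, hedged check of sleep() overshoot (my own snippet, not part of the original answer):

# Measure how much time.sleep() overshoots a 1 ms request on this OS/hardware;
# on some Windows systems the granularity can be ~10 ms or worse.
import time

requested = 0.001
overshoots = []
for _ in range(200):
    t0 = time.perf_counter()
    time.sleep(requested)
    overshoots.append((time.perf_counter() - t0 - requested) * 1000.0)
print(f"mean overshoot {sum(overshoots) / len(overshoots):.2f} ms, worst {max(overshoots):.2f} ms")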
If you really want to use the Synth approach I suppose you could program the gap procedurally into the function definitions of tone440Hz() and tone880Hz(), accessing cue_duration, duration and spacing as global variables (in fact, while you're at it, why not make frequency a global variable too, and only write one function). But I don't see any great advantage in this, either in performance or in code maintainability.
What I would do instead is pre-initialize the following (once, at the start of your program):
max_duration = 1 # length, in seconds, of the longest continuous tone we'll need
tone440Hz = am.Sound(fs=22050).GenerateWaveform(freq_hz=440, duration_msec=max_duration*1000)
tone880Hz = am.Sound(fs=22050).GenerateWaveform(freq_hz=880, duration_msec=max_duration*1000)
m = am.Stream()
Then compose each "long blink" stimulus as a static Sound using the parameters you want.
This will ensure that the tone and gap durations are precise:
s = tone440Hz[:cue_duration] % spacing % tone440Hz[:duration]
For best real-time performance, you could pre-compute a whole set of these stimuli with different parameters. Or, if it turns out that those composition operations (slicing and splicing) happen fast enough, you might decide you can get away with doing that at trial time, in your long_blink() function.
Either way, when it comes to playing the stimulus at trial time:
p = am.Player(s, stream=m) # to make Player() initialization fast, use a pre-initialized Stream() instance
p.Play(wait=True)
Finally: in implementing this, start from scratch—start simple, and test the performance of a few simple cases before compounding things.

Multiprocessing Freezing with Scipy Optimise

EDIT: The error does not lie with multiprocessing but rather with another library. I would close the question, but @Lukasz Tracewski's point about using joblib may help someone else.
I've got an issue with a task I'm trying to parallelise on Windows.
It seems to work for a while, then suddenly halts on a particular iteration (I can tell because I have the code print which iteration it's on). I've noticed in the task manager that the CPU usage, which should be about 50%, is minimal for the Python processes. When I then do a keyboard interrupt at the cmd prompt, the run suddenly jumps forward a number of iterations and the activity goes back up for a short while, only to go back down again.
I've included the bits of my code which I think are relevant. I know it can work (I don't think it's stuck on a problem) and there seems to be a degree of randomness as to when it does freeze.
I'm using the COBYLA solver with max_iter set. I'm not sure if it is relevant, but when I tried to use BFGS, I got a freezing problem.
def get_optimal_angles(G, p, coeff, objective_function, initial_gamma_range, initial_beta_range, seed):
    '''
    This performs the classical-quantum interchange, improving the values of beta and gamma by reducing the value of
    < beta, gamma | C | beta, gamma >. Returns the best angles found and the objective value this refers to.
    Bounds on the minimiser are set as the starting points.
    '''
    initial_starting_points = random_intial_values((np.array(initial_gamma_range)), (np.array(initial_beta_range)), p, seed)
    optimiser_function = minimize(objective_function, initial_starting_points, method='COBYLA', options={'maxiter': 1500})
    best_angles = optimiser_function.x
    objective_value = optimiser_function.fun
    return best_angles, objective_value

def find_best_angles_for_graph_6(graph_seed):
    print("6:On graph", graph_seed)
    #graph = gp.unweighted_erdos_graph(10,0.4,graph_seed)
    graph = gp.unweighted_erdos_graph(6, 0.4, graph_seed)
    graph_coefficients = quantum_operator_z_coefficients(graph, 'Yo')
    exact_energy = get_exact_energy(graph)
    angle_starting_seed = np.arange(1, number_of_angle_points, 1)
    objective_function = get_black_box_objective_sv(graph, p, graph_coefficients)
    list_of_results = []
    for angle_seed in angle_starting_seed:
        print("On Angle Seed", angle_seed)
        best_angles_for_seed, best_objective_value_for_seed = get_optimal_angles(graph, p, graph_coefficients, objective_function, [0, np.pi], [0, np.pi], angle_seed)
        success_prob = calculate_energy_success_prob(graph, p, exact_energy, graph_coefficients, best_angles_for_seed, angle_seed)
        angle_seed_data_list = [angle_seed, best_objective_value_for_seed, success_prob, best_angles_for_seed]
        list_of_results.append(angle_seed_data_list)
    list_of_best = get_best_results(list_of_results)
    full_results = [graph_seed, exact_energy, list_of_best, list_of_results]
    return full_results
import multiprocessing as mp

def main():
    physical_cores = 5
    pool = mp.Pool(physical_cores)
    list_of_results_every_graph_6 = []
    list_of_all_graph_results_6 = pool.map(find_best_angles_for_graph_6, range(1, number_of_graphs + 1))
    print(list_of_all_graph_results_6)
    file_name_6 = 'unweighted_erdos_graph_N_6_p_8.pkl'
    pickle_6 = open((file_name_6), 'wb')
    pickle.dump(list_of_all_graph_results_6, pickle_6)
    pickle_6.close()
    list_of_results_every_graph_10 = []
    list_of_all_graph_results_10 = pool.map(find_best_angles_for_graph_10, range(1, number_of_graphs + 1))
    print(list_of_all_graph_results_10)
    file_name_10 = 'unweighted_erdos_graph_N_9_p_8.pkl'
    pickle_10 = open((file_name_10), 'wb')
    pickle.dump(list_of_all_graph_results_10, pickle_10)
    pickle_10.close()

if __name__ == "__main__":
    main()
EDIT: Here is the full code as a Jupyter notebook. https://www.dropbox.com/sh/6xb7setjsn1c1o3/AAANfH7mEmhuuf9cxh5QWsRQa?dl=0
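For reference, a minimal sketch of the joblib route mentioned in the edit above (illustrative only; it reuses find_best_angles_for_graph_6 and number_of_graphs from the question):

# Hedged sketch of a joblib alternative to mp.Pool.map; joblib's default loky
# backend manages the worker processes for you, which is often more robust on Windows.
from joblib import Parallel, delayed

list_of_all_graph_results_6 = Parallel(n_jobs=5, verbose=10)(
    delayed(find_best_angles_for_graph_6)(graph_seed)
    for graph_seed in range(1, number_of_graphs + 1)
)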

Choppy audio from separating and then joining .wav stereo channels

I am currently working on processing .wav files with Python, using PyAudio for streaming the audio and the Python wave library for loading the file data.
I plan to later include processing of the individual stereo channels, with regard to the amplitude of the signal and the panning of the stereo signal, but for now I'm just trying to separate the two channels of the wave file and stitch them back together, hopefully ending up with data that is identical to the input data.
Below is my code.
The getRawSample method works perfectly fine, and I can stream audio through that function.
The problem is my getSample method. Somewhere along the line, where I'm separating the two channels of audio and joining them back together, the audio gets distorted. I have even commented out the part where I do amplitude and panning adjustment, so in theory it's data in -> data out.
Below is an example of my code:
class Sample(threading.Thread):
    def __init__(self, filepath, chunk):
        super(Sample, self).__init__()
        self.CHUNK = chunk
        self.filepath = filepath
        self.wave = wave.open(self.filepath, 'rb')
        self.amp = 0.5  # varies from 0 to 1
        self.pan = 0  # varies from -pi to pi
        self.WIDTH = self.wave.getsampwidth()
        self.CHANNELS = self.wave.getnchannels()
        self.RATE = self.wave.getframerate()
        self.MAXFRAMEFEEDS = self.wave.getnframes() / self.CHUNK  # maximum even number of chunks
        self.unpstr = '<{0}h'.format(self.CHUNK * self.WIDTH)  # format for unpacking the sample byte string
        self.pckstr = '<{0}h'.format(self.CHUNK * self.WIDTH)  # format for packing the sample byte string
        self.framePos = 0  # keeps track of how many chunks of data fed

    # panning and amplitude adjustment of input sample data
    def panAmp(self, data, panVal, ampVal):  # when panning, using constant power panning
        [left, right] = self.getChannels(data)
        #left = np.multiply(0.5, left)  #(np.sqrt(2)/2)*(np.cos(panVal) + np.sin(panVal))
        #right = np.multiply(0.5, right)  # (np.sqrt(2)/2)*(np.cos(panVal) - np.sin(panVal))
        outputList = self.combineChannels(left, right)
        dataResult = struct.pack(self.pckstr, *outputList)
        return dataResult

    def getChannels(self, data):
        dataPrepare = list(struct.unpack(self.unpstr, data))
        left = dataPrepare[0::self.CHANNELS]
        right = dataPrepare[1::self.CHANNELS]
        return [left, right]

    def combineChannels(self, left, right):
        stereoData = left
        for i in range(0, self.CHUNK / self.WIDTH):
            index = i * 2 + 1
            stereoData = np.insert(stereoData, index, right[i * self.WIDTH:(i + 1) * self.WIDTH])
        return stereoData

    def getSample(self, panVal, ampVal):
        data = self.wave.readframes(self.CHUNK)
        self.framePos += 1
        if self.framePos > self.MAXFRAMEFEEDS:  # if no more audio samples to process
            self.wave.rewind()
            data = self.wave.readframes(self.CHUNK)
            self.framePos = 1
        return self.panAmp(data, panVal, ampVal)

    def getRawSample(self):  # for debugging, bypasses pan and amp functions
        data = self.wave.readframes(self.CHUNK)
        self.framePos += 1
        if self.framePos > self.MAXFRAMEFEEDS:  # if no more audio samples to process
            self.wave.rewind()
            data = self.wave.readframes(self.CHUNK)
            self.framePos = 1
        return data
I suspect that the error is in the way I stitch the left and right channels back together, but I'm not sure.
I load the project with 16-bit, 44100 Hz .wav files.
Below is a link to an audio file so that you can hear the resulting audio output.
The first part is running two files (both two-channel) through the getSample method, while the next part is running those same files through the getRawSample method.
https://dl.dropboxusercontent.com/u/24215404/pythonaudiosample.wav
Based on the audio, as said earlier, it seems like the stereo file gets distorted. Looking at the waveform of the above file, it seems as though the right and left channels are exactly the same after going through the getSample method.
If needed, I can also post my code including the main function.
Hopefully my question isn't too vague, but I am grateful for any help or input!
As so often happens, I slept on it and woke up the next day with a solution.
The problem was in the combineChannels function.
The following is the working code:
def combineChannels(self, left, right):
    stereoData = left
    for i in range(0, self.CHUNK):
        index = i * 2 + 1
        stereoData = np.insert(stereoData, index, right[i:(i + 1)])
    return stereoData
The changes are:
For loop bounds: as I have 1024 items (the same as my chunk size) in the lists left and right, I of course need to iterate through every one of those.
index: the index definition remains the same.
stereoData: again, here I remember that I'm working with lists, each containing a frame of audio. The code in the question assumes that my list is stored as a byte string, but this is of course not the case. And as you see, the resulting code is much simpler.
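As a side note (my own sketch, not part of the original answer), the same interleaving can be done in a single vectorised call, which avoids growing the array with np.insert on every iteration:

# Interleave two equal-length channel sequences as L0, R0, L1, R1, ...
import numpy as np

def combine_channels_fast(left, right):
    return np.column_stack((left, right)).ravel()

print(combine_channels_fast([1, 2, 3], [10, 20, 30]))  # [ 1 10  2 20  3 30]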

Writing a faster Python physics simulator

I have been playing around with writing my own physics engine in Python as an exercise in physics and programming. I started out by following the tutorial located here. That went well, but then I found the article "Advanced Character Physics" by Thomas Jakobsen, which covers using Verlet integration for simulations, which I found fascinating.
I have been attempting to write my own basic physics simulator using Verlet integration, but it turns out to be slightly more difficult than I first expected. I was browsing for example programs to read, and stumbled across this one written in Python, and I also found this tutorial which uses Processing.
What impresses me about the Processing version is how fast it runs. The cloth alone has 2400 different points being simulated, and that's not including the bodies.
The python example only uses 256 particles for the cloth, and it runs at about 30 frames per second. I tried increasing the number of particles to 2401 (it has to be square for that program to work), it ran at about 3 fps.
Both of these work by storing instances of a particle object in a list and then iterating through the list, calling each particle's "update position" method. As an example, this is the part of the code from the Processing sketch that calculates each particle's new position:
for (int i = 0; i < pointmasses.size(); i++) {
    PointMass pointmass = (PointMass) pointmasses.get(i);
    pointmass.updateInteractions();
    pointmass.updatePhysics(fixedDeltaTimeSeconds);
}
EDIT: Here is the code from the python version I linked earlier:
"""
verletCloth01.py
Eric Pavey - 2010-07-03 - www.akeric.com
Riding on the shoulders of giants.
I wanted to learn now to do 'verlet cloth' in Python\Pygame. I first ran across
this post \ source:
http://forums.overclockers.com.au/showthread.php?t=870396
http://dl.dropbox.com/u/3240460/cloth5.py
Which pointed to some good reference, that was a dead link. After some searching,
I found it here:
http://www.gpgstudy.com/gpgiki/GDC%202001%3A%20Advanced%20Character%20Physics
Which is a 2001 SIGGRAPH paper by Thomas Jakobsen called:
"GDC 2001: Advanced Characer Physics".
This code is a Python\Pygame interpretation of that 2001 Siggraph paper. I did
borrow some code from 'domlebo's source code, it was a great starting point. But
I'd like to think I put my own flavor on it.
"""
#--------------
# Imports & Initis
import sys
from math import sqrt
# Vec2D comes from here: http://pygame.org/wiki/2DVectorClass
from vec2d import Vec2d
import pygame
from pygame.locals import *
pygame.init()
#--------------
# Constants
TITLE = "verletCloth01"
WIDTH = 600
HEIGHT = 600
FRAMERATE = 60
# How many iterations to run on our constraints per frame?
# This will 'tighten' the cloth, but slow the sim.
ITERATE = 2
GRAVITY = Vec2d(0.0,0.05)
TSTEP = 2.8
# How many pixels to position between each particle?
PSTEP = int(WIDTH*.03)
# Offset in pixels from the top left of screen to position grid:
OFFSET = int(.25*WIDTH)
#-------------
# Define helper functions, classes
class Particle(object):
    """
    Stores position, previous position, and where it is in the grid.
    """
    def __init__(self, screen, currentPos, gridIndex):
        # Current Position : m_x
        self.currentPos = Vec2d(currentPos)
        # Index [x][y] of Where it lives in the grid
        self.gridIndex = gridIndex
        # Previous Position : m_oldx
        self.oldPos = Vec2d(currentPos)
        # Force accumulators : m_a
        self.forces = GRAVITY
        # Should the particle be locked at its current position?
        self.locked = False
        self.followMouse = False
        self.colorUnlocked = Color('white')
        self.colorLocked = Color('green')
        self.screen = screen

    def __str__(self):
        return "Particle <%s, %s>" % (self.gridIndex[0], self.gridIndex[1])

    def draw(self):
        # Draw a circle at the given Particle.
        screenPos = (self.currentPos[0], self.currentPos[1])
        if self.locked:
            pygame.draw.circle(self.screen, self.colorLocked,
                               (int(screenPos[0]), int(screenPos[1])), 4, 0)
        else:
            pygame.draw.circle(self.screen, self.colorUnlocked,
                               (int(screenPos[0]), int(screenPos[1])), 1, 0)
class Constraint(object):
    """
    Stores 'constraint' data between two Particle objects. Stores this data
    before the sim runs, to speed sim and draw operations.
    """
    def __init__(self, screen, particles):
        self.particles = sorted(particles)
        # Calculate restlength as the initial distance between the two particles:
        self.restLength = sqrt(abs(pow(self.particles[1].currentPos.x -
                                       self.particles[0].currentPos.x, 2) +
                                   pow(self.particles[1].currentPos.y -
                                       self.particles[0].currentPos.y, 2)))
        self.screen = screen
        self.color = Color('red')

    def __str__(self):
        return "Constraint <%s, %s>" % (self.particles[0], self.particles[1])

    def draw(self):
        # Draw line between the two particles.
        p1 = self.particles[0]
        p2 = self.particles[1]
        p1pos = (p1.currentPos[0],
                 p1.currentPos[1])
        p2pos = (p2.currentPos[0],
                 p2.currentPos[1])
        pygame.draw.aaline(self.screen, self.color,
                           (p1pos[0], p1pos[1]), (p2pos[0], p2pos[1]), 1)
class Grid(object):
    """
    Stores a grid of Particle objects. Emulates a 2d container object. Particle
    objects can be indexed by position:
        grid = Grid()
        particle = g[2][4]
    """
    def __init__(self, screen, rows, columns, step, offset):
        self.screen = screen
        self.rows = rows
        self.columns = columns
        self.step = step
        self.offset = offset
        # Make our internal grid:
        # _grid is a list of sublists.
        #     Each sublist is a 'column'.
        #     Each column holds a particle object per row:
        # _grid =
        # [[p00, [p10, [etc,
        #   p01,  p11,
        #   etc], etc],     ]]
        self._grid = []
        for x in range(columns):
            self._grid.append([])
            for y in range(rows):
                currentPos = (x * self.step + self.offset, y * self.step + self.offset)
                self._grid[x].append(Particle(self.screen, currentPos, (x, y)))

    def getNeighbors(self, gridIndex):
        """
        return a list of all neighbor particles to the particle at the given gridIndex:
        gridIndex = [x,x] : The particle index we're polling
        """
        possNeighbors = []
        possNeighbors.append([gridIndex[0] - 1, gridIndex[1]])
        possNeighbors.append([gridIndex[0], gridIndex[1] - 1])
        possNeighbors.append([gridIndex[0] + 1, gridIndex[1]])
        possNeighbors.append([gridIndex[0], gridIndex[1] + 1])
        neigh = []
        for coord in possNeighbors:
            if (coord[0] < 0) | (coord[0] > self.rows - 1):
                pass
            elif (coord[1] < 0) | (coord[1] > self.columns - 1):
                pass
            else:
                neigh.append(coord)
        finalNeighbors = []
        for point in neigh:
            finalNeighbors.append((point[0], point[1]))
        return finalNeighbors

    #--------------------------
    # Implement Container Type:
    def __len__(self):
        return len(self.rows * self.columns)

    def __getitem__(self, key):
        return self._grid[key]

    def __setitem__(self, key, value):
        self._grid[key] = value

    #def __delitem__(self, key):
    #    del(self._grid[key])

    def __iter__(self):
        for x in self._grid:
            for y in x:
                yield y

    def __contains__(self, item):
        for x in self._grid:
            for y in x:
                if y is item:
                    return True
        return False
class ParticleSystem(Grid):
    """
    Implements the verlet particles physics on the encapsulated Grid object.
    """
    def __init__(self, screen, rows=49, columns=49, step=PSTEP, offset=OFFSET):
        super(ParticleSystem, self).__init__(screen, rows, columns, step, offset)

        # Generate our list of Constraint objects. One is generated between
        # every particle connection.
        self.constraints = []
        for p in self:
            neighborIndices = self.getNeighbors(p.gridIndex)
            for ni in neighborIndices:
                # Get the neighbor Particle from the index:
                n = self[ni[0]][ni[1]]
                # Let's not add duplicate Constraints, which would be easy to do!
                new = True
                for con in self.constraints:
                    if n in con.particles and p in con.particles:
                        new = False
                if new:
                    self.constraints.append(Constraint(self.screen, (p, n)))

        # Lock our top left and right particles by default:
        self[0][0].locked = True
        self[1][0].locked = True
        self[-2][0].locked = True
        self[-1][0].locked = True

    def verlet(self):
        # Verlet integration step:
        for p in self:
            if not p.locked:
                # make a copy of our current position
                temp = Vec2d(p.currentPos)
                p.currentPos += p.currentPos - p.oldPos + p.forces * TSTEP**2
                p.oldPos = temp
            elif p.followMouse:
                temp = Vec2d(p.currentPos)
                p.currentPos = Vec2d(pygame.mouse.get_pos())
                p.oldPos = temp

    def satisfyConstraints(self):
        # Keep particles together:
        for c in self.constraints:
            delta = c.particles[0].currentPos - c.particles[1].currentPos
            deltaLength = sqrt(delta.dot(delta))
            try:
                # You can get a ZeroDivisionError here once, so let's catch it.
                # I think it's when particles sit on top of one another due to
                # being locked.
                diff = (deltaLength - c.restLength) / deltaLength
                if not c.particles[0].locked:
                    c.particles[0].currentPos -= delta * 0.5 * diff
                if not c.particles[1].locked:
                    c.particles[1].currentPos += delta * 0.5 * diff
            except ZeroDivisionError:
                pass

    def accumulateForces(self):
        # This doesn't do much right now, other than constantly reset the
        # particles 'forces' to be 'gravity'. But this is where you'd implement
        # other things, like drag, wind, etc.
        for p in self:
            p.forces = GRAVITY

    def timeStep(self):
        # This executes the whole shebang:
        self.accumulateForces()
        self.verlet()
        for i in range(ITERATE):
            self.satisfyConstraints()

    def draw(self):
        """
        Draw constraint connections, and particle positions:
        """
        for c in self.constraints:
            c.draw()
        #for p in self:
        #    p.draw()

    def lockParticle(self):
        """
        If the mouse LMB is pressed for the first time on a particle, the particle
        will assume the mouse motion. When it is pressed again, it will lock
        the particle in space.
        """
        mousePos = Vec2d(pygame.mouse.get_pos())
        for p in self:
            dist2mouse = sqrt(abs(pow(p.currentPos.x - mousePos.x, 2) +
                                  pow(p.currentPos.y - mousePos.y, 2)))
            if dist2mouse < 10:
                if not p.followMouse:
                    p.locked = True
                    p.followMouse = True
                    p.oldPos = Vec2d(p.currentPos)
                else:
                    p.followMouse = False

    def unlockParticle(self):
        """
        If the RMB is pressed on a particle, if the particle is currently
        locked or being moved by the mouse, it will be 'unlocked'/stop following
        the mouse.
        """
        mousePos = Vec2d(pygame.mouse.get_pos())
        for p in self:
            dist2mouse = sqrt(abs(pow(p.currentPos.x - mousePos.x, 2) +
                                  pow(p.currentPos.y - mousePos.y, 2)))
            if dist2mouse < 5:
                p.locked = False
#------------
# Main Program
def main():
    # Screen Setup
    screen = pygame.display.set_mode((WIDTH, HEIGHT))
    clock = pygame.time.Clock()

    # Create our grid of particles:
    particleSystem = ParticleSystem(screen)
    backgroundCol = Color('black')

    # main loop
    looping = True
    while looping:
        clock.tick(FRAMERATE)
        pygame.display.set_caption("%s -- www.AKEric.com -- LMB: move\lock - RMB: unlock - fps: %.2f" % (TITLE, clock.get_fps()))
        screen.fill(backgroundCol)

        # Detect for events
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                looping = False
            elif event.type == MOUSEBUTTONDOWN:
                if event.button == 1:
                    # See if we can make a particle follow the mouse and lock
                    # its position when done.
                    particleSystem.lockParticle()
                if event.button == 3:
                    # Try to unlock the current particles position:
                    particleSystem.unlockParticle()

        # Do stuff!
        particleSystem.timeStep()
        particleSystem.draw()

        # update our display:
        pygame.display.update()

#------------
# Execution from shell\icon:
if __name__ == "__main__":
    print "Running Python version:", sys.version
    print "Running PyGame version:", pygame.ver
    print "Running %s.py" % TITLE
    sys.exit(main())
Because both programs work roughly the same way, but the Python version is SO much slower, it makes me wonder:
Is this performance difference part of the nature of Python?
What should I do differently from the above if I want to get better performance from my own Python programs? E.g. store the properties of all particles inside an array instead of using individual objects, etc.
EDIT: Answered!
@Mr E's linked PyCon talk in the comments and @A. Rosa's answer with the linked resources all helped ENORMOUSLY in better understanding how to write good, fast Python code. I am now bookmarking this page for future reference :D
There is an article by Guido van Rossum linked in the Performance Tips section of the Python Wiki. In its conclusion, you can read the following sentence:
If you feel the need for speed, go for built-in functions - you can't beat a loop written in C.
The essay continues with a list of guidelines for loop optimization. I recommend both resources, since they give concrete and practical advice about optimizing Python code.
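As a quick, hedged illustration of that advice (my example, not from the linked essay):

# Compare an explicit Python loop against the built-in sum(); the built-in runs
# its loop in C and is typically several times faster.
import timeit

setup = "data = list(range(10000))"
loop_stmt = "total = 0\nfor x in data:\n    total += x"
builtin_stmt = "total = sum(data)"

print("python loop :", timeit.timeit(loop_stmt, setup, number=1000))
print("built-in sum:", timeit.timeit(builtin_stmt, setup, number=1000))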
There is also a well-known group of benchmarks at benchmarksgame.alioth.debian.org, where you can find comparisons among different programs and languages on distinct machines. As can be seen, there are lots of variables in play that make it impossible to state something as broad as "Java is faster than Python". This is commonly summed up in the sentence "Languages don't have speeds; implementations do".
In your code, more Pythonic and faster alternatives can be applied using built-in functions. For example, there are several nested loops (some of them don't require processing the whole list) which can be rewritten using imap or list comprehensions. PyPy is another interesting option for improving performance. I'm not an expert on Python optimization, but there are lots of tips which are extremely useful (notice that "don't write Java in Python" is one of them!).
Resources and other related questions on SO:
Performance differences between Python and C
Is it reasonable to integrate python with c for performance?
http://www.ibm.com/developerworks/opensource/library/os-pypy-intro/index.html?ca=drs-
http://pyevolve.sourceforge.net/wordpress/?p=1189
If you write Python like you write Java, of course it's going to be slower; idiomatic Java does not translate well to idiomatic Python.
Is this performance difference part of the nature of Python?
What should I do differently from the above if I want to get better performance from my own Python programs? E.g. store the properties of all particles inside an array instead of using individual objects, etc.
Hard to say without seeing your code.
Here is an incomplete list of differences between Python and Java that may sometimes affect performance:
Processing uses an immediate-mode canvas; if you want comparable performance in Python, you also need to use an immediate-mode canvas. Canvases in most GUI frameworks (including the Tkinter canvas) are retained mode, which is easier to use but inherently slower than immediate mode. You'll need an immediate-mode canvas like those provided by pygame, SDL, or Pyglet.
Python is a dynamic language, which means instance member access, module member access, and global variable access are resolved at run time; in Python they are really dictionary accesses. In Java, they are resolved at compile time and are by nature much faster. Cache frequently accessed globals, module variables, and attributes in a local variable (see the sketch after this list).
In Python 2.x, range() produces a concrete list. Iteration done directly with an iterator (for item in list) is usually faster than iteration done with an index variable (for n in range(len(list))). You should almost always iterate directly with an iterator instead of iterating with range(len(...)).
Python's numbers are immutable, which means every arithmetic calculation allocates a new object. This is one reason why plain Python is not very suitable for low-level calculations; most people who want to write low-level calculations without resorting to a C extension typically use Cython, Psyco, or NumPy. This usually only becomes a problem when you have millions of calculations, though.
This is a partial, very incomplete list; there are many other reasons why translating Java to Python would produce suboptimal code. Without seeing your code it's impossible to tell what you need to do differently. Optimized Python code generally looks very different from optimized Java code.
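A small, hedged sketch of the attribute-caching point above (my example; the function names are illustrative):

# Caching a frequently used attribute lookup in a local variable avoids a
# dictionary lookup on every iteration; it only matters in hot loops.
import math

def norms_slow(points):
    out = []
    for x, y in points:
        out.append(math.sqrt(x * x + y * y))  # math.sqrt resolved on every pass
    return out

def norms_fast(points):
    sqrt = math.sqrt  # resolved once, then a fast local lookup
    return [sqrt(x * x + y * y) for x, y in points]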
I would also suggest reading about other physics engines. There are a few open-source engines which use a variety of methods for calculating the "physics".
Newton Game Dynamics
Chipmunk
Bullet
Box2D
ODE (Open Dynamics Engine)
There are also ports of most of the engines:
Pymunk
PyBullet
PyBox2D
PyODE
If you read through the documentation of those engines, you will often find statements saying that they are optimized for speed (30 fps - 60 fps). But if you think they can do this while calculating "real" physics, you are wrong. Most engines calculate physics to a point where a normal user cannot optically distinguish between "real" physical behavior and "simulated" physical behavior. However, if you investigate the error, it is negligible if you want to write games. But if you want to do physics, all of those engines are of no use to you.
That's why I would say: if you are doing a real physical simulation, you are slower than those engines by design, and you will never outrun another physics engine.
Particle-based physics simulation translates easily into linear algebra operations, i.e. matrix operations. NumPy offers such operations, which are implemented in Fortran/C/C++ under the hood. Well-written Python/NumPy code (taking full advantage of the language and library) allows you to write decently fast code.
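A minimal, hedged sketch of what that looks like for the Verlet step in the script above (my own illustration; the constants mirror the pygame example, everything else is assumed):

# Vectorised Verlet integration: one NumPy expression updates every particle,
# instead of a Python-level loop over Particle objects.
import numpy as np

N = 2400                                 # roughly the Processing cloth size
pos = np.random.rand(N, 2) * 600.0       # current positions
old = pos.copy()                         # previous positions
forces = np.tile([0.0, 0.05], (N, 1))    # per-particle gravity, like GRAVITY above
TSTEP = 2.8

def verlet_step(pos, old, forces, dt=TSTEP):
    new = pos + (pos - old) + forces * dt ** 2
    return new, pos                      # (new current, new previous)

pos, old = verlet_step(pos, old, forces)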

Blender performance issues with generating UV spheres

When I run this script:
import bpy, time

t0 = time.time()
for i in range(1000):
    bpy.ops.mesh.primitive_uv_sphere_add()
    if i % 100 == 0:
        print(time.time() - t0)
        t0 = time.time()
This is the output (exponential growth vs. time):
1.1920928955078125e-05
0.44658803939819336
0.46373510360717773
0.5661759376525879
0.7258329391479492
0.9994637966156006
1.381392002105713
1.8257861137390137
2.4634311199188232
3.2817111015319824
Why does this happen? Is there a better approach?
I am running this on a server with ample memory, and I know Blender can expand to use most of it (it does in rendering).
The quick answer:
bpy.ops.object.select_all(action='DESELECT')
bpy.ops.mesh.primitive_uv_sphere_add()
sphere = bpy.context.object
for i in range(1000):
    ob = sphere.copy()
    ob.data = sphere.data.copy()
    bpy.context.scene.objects.link(ob)
bpy.context.scene.update()
Explanation:
Anything in bpy.ops.* causes a scene redraw with each call. You want to avoid calling these in loops. The above script calls lower-level copy() methods, which don't redraw. If you want linked duplicates, you can remove the sphere.data.copy() line.
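To see the difference directly, here is a hedged sketch that adds the question's timing instrumentation to the copy-based loop (same API calls as the answer above; the print cadence is mine):

# Time the low-level duplication the same way the original script timed
# bpy.ops.mesh.primitive_uv_sphere_add(), for a direct comparison.
import bpy, time

bpy.ops.mesh.primitive_uv_sphere_add()
sphere = bpy.context.object

t0 = time.time()
for i in range(1000):
    ob = sphere.copy()
    ob.data = sphere.data.copy()
    bpy.context.scene.objects.link(ob)
    if i % 100 == 0:
        print(time.time() - t0)
        t0 = time.time()
bpy.context.scene.update()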
This solution is not my own. Kudos go to CoDEmanX over at BlenderArtists for this answer!
