Most of my Python program's time is spent in a method called _build_userdbs. I'm using the awesome tool SnakeViz, which helps with interpreting the results. There's a screenshot below.
So right now, in that picture, I'm in _build_userdbs. The big green circle right outside of that is a method called _append_record and, as you can see, it takes up almost all of _build_userdbs. I understand this.
But here's the confusing part. The green circle outside of the inner green circle (which takes up the vast majority of the time) is the cumulative time of _append_record minus the cumulative time of the functions called within _append_record.
Quantitatively, _append_record's cumulative time is 55970 seconds. That's the inner green circle. Its total time is 54210 seconds. That's the outer green circle.
_append_record, as you can see if you open that image in a new tab, calls a couple other functions. Those are:
json.dumps() (459 seconds cumulative)
_userdb_scratch_file_path() (161 seconds cumulative)
open (2160 seconds cumulative)
print (less than .1% of frame so not displayed)
Alright, this makes sense; because of the relatively small difference between _append_record's cumulative time and total time, it must be doing a lot of processing in its own stack frame, rather than delegating it to other functions. But here's the body of the function:
def _append_record(self, user, record):
    record = json.dumps(record)
    dest_path = self._userdb_scratch_file_path(user)
    with open(dest_path, 'at') as dest:
        print(record, file=dest)
So where is all of this processing going on? Is it function call overhead that accounts for the difference? Profiling overhead? Are these results just inaccurate? Why isn't the close() function called?
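For reference, here is a minimal way to cross-check the profiler's numbers (a sketch only; the db_builder object and the .prof filename are hypothetical): dump the stats with cProfile, inspect tottime versus cumtime with pstats, and compare against an unprofiled wall-clock run to gauge how much overhead the profiler itself adds.

import cProfile
import pstats
import time

# Profile the build and write the stats to a file SnakeViz can also open.
cProfile.run('db_builder._build_userdbs()', 'build_userdbs.prof')

# tottime is time spent in the frame itself; cumtime includes callees.
pstats.Stats('build_userdbs.prof').sort_stats('tottime').print_stats(10)

# An unprofiled wall-clock run gives a rough sense of profiling overhead.
start = time.perf_counter()
db_builder._build_userdbs()
print('unprofiled:', time.perf_counter() - start, 'seconds')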
I am using matplotlib to make many figures and save them. I have 5 or so functions that perform either simple or no computations with the data, plot the data, and then format the figure to fit a specific form (title, axes, paper size).
These 5 or so plotting functions get called one at a time from another function in between computations. (Some computations, plotting_function_1, some more computations, plotting_function_2, ...).
I start each plotting function in a new process via plotProcess1 = multiprocessing.Process(target=plot_data1, args=(data, save_directory)); plotProcess1.start() in order to continue with the computation while the plotting functions are running.
When I check the figures after the program has finished, many of them have very strange formatting errors in the titles and paper size. The weird part is that the data is always plotted exactly as it should be (the scatter data may look like some is missing, but that is just part of the dataset). Take a look at the good figures vs. the bad figures to see what I am talking about (top left is the expected output; the others show the issue).
This only started when I started using multiprocessing to make the plots in the background. The weirdest part is that it does not always do it, and it seems to do it at random. Any thoughts as to why it might be doing this and how to fix it? I would really like to keep the computations running while I make the plots due to timing. With some datasets, a few hundred plots will be made with each plotting function and the entire program might take 10s of hours to complete.
Edit: My datasets are very large spatial datasets, so each one of my plotting functions creates and saves multiple plots (around 20, but could be less or more depending on the size of the dataset). I have figured out when the issue occurs now, but still not why. The strange behavior happens when two plotting functions are running at the same time.
A typical timeline where the strange behaviour happens is: (plotting_function_1 is started --> some small computations happen --> plotting_function_2 is started --> plotting_function_1 finishes --> plotting_function_2 finishes)
This still doesn't make sense to me, because each plotting function runs in a separate process, does not change any data, and saves to a unique filename.
Edit 2: Here is a snippet of code that will create strangely formatted figures.
# Plot the raw data
if plot_TF is True:
    photon_data_copy = photon_data.copy()
    plot_segments_copy = plot_segments.copy()
    if isParallel is True:
        rawPlotProcess = multiprocessing.Process(target=plot_raw_data, args=(photon_data_copy, plot_segments_copy, run_name, plotdir))
        rawPlotProcess.start()
    else:
        plot_raw_data(photon_data_copy, plot_segments_copy, run_name, plotdir)

# Calculate signal slab and noise slab
start = time.time()
signal_mask, noise_mask = assign_slabs_by_histogram_max_bin(photon_data, pixel_dimensions, slab_thickness)
logger.info('Time elapsed assigning slabs: {}'.format(time.time() - start))
photon_signal = photon_data[signal_mask, :]
photon_noise = photon_data[noise_mask, :]

# Plot the Signal and Noise slabs
if plot_TF is True:
    photon_signal_copy = photon_signal.copy()
    photon_noise_copy = photon_noise.copy()
    if isParallel is True:
        slabPlotProcess = multiprocessing.Process(target=plot_slabs, args=(photon_signal_copy, photon_noise_copy, plot_segments_copy, run_name, plotdir))
        slabPlotProcess.start()
    else:
        plot_slabs(photon_signal_copy, photon_noise_copy, plot_segments_copy, run_name, plotdir)
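One thing that may be worth trying (a sketch under the assumption that the plotting functions can be rewritten this way, not a confirmed fix): build each figure with matplotlib's object-oriented API on the Agg backend inside the worker process, so nothing goes through pyplot's shared global state. The column indexing, figure size, and file naming below are illustrative only.

import os
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

def plot_raw_data(photon_data, plot_segments, run_name, plotdir):
    # One Figure object per plot, built entirely with the OO API: there is
    # no pyplot "current figure" for any other process to interfere with.
    fig = Figure(figsize=(11, 8.5))      # paper size fixed per figure
    FigureCanvasAgg(fig)                 # attach a non-GUI canvas
    ax = fig.add_subplot(111)
    ax.scatter(photon_data[:, 0], photon_data[:, 1], s=1)
    ax.set_title('{} raw data ({} segments)'.format(run_name, len(plot_segments)))
    fig.savefig(os.path.join(plotdir, '{}_raw.png'.format(run_name)))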
My class has been learning Python's Turtle module recently (which I gather uses tkinter), and I was wondering if there is a way to adjust the rate at which tkinter/turtle executes its code, because it doesn't seem (from my limited understanding) to be limited by the computational abilities of my computer. I say that because in Task Manager (I'm on Windows, if that affects anything), the Python shell only uses a small percentage of the CPU's capacity (~2%), and likewise for the GPU, RAM, disk, etc. Additionally, increasing its operational priority neither affects how much of my CPU is used nor increases the rate at which it executes its code.
Note that I'm not referring to the speed at which the turtle executes each action as determined by turtle.speed(); I've already got that at 0, so each action is effectively instantaneous. My problem instead lies with what seems to be the time taken between each step, which appears to be limited to about 80 actions per second (more on this later).
For example, the following code draws an approximation of a parabola, given some precision. The higher the precision, the better the approximation but the longer it takes to draw, as it's taking more, smaller steps.
import math
import turtle as t

precision = 0.1
t.penup()
t.goto(-250, 150)
t.pendown()
for n in range(int(800 * precision)):
    t.setheading(math.degrees(math.atan(0.02 * n - 8)))
    t.fd(1)
Effectively, for precisions close to or above 1, it takes far longer than I would like, and in general, drawing fine curves in Tkinter is too slow, so I want to know if there's a way to adjust this speed.
Part of my difficulty when trying to research a solution has been that I simply don't know what the relevant terminology is, so I've tried searching vaguely related terms, including some hardware-based analogues and various other things that seem analogous, e.g.:
clock speed
refresh rate
frame rate
tick speed (Minecraft ftw?)
step-through rate
execution rate
actions per second
steps per second
But all to no avail; attempting to describe the issue in Google fails too.
Additionally, I simply don't understand what the underlying bottleneck is (or even if there is a single bottleneck) that's causing it to be so slow, which makes the issue difficult to solve.
I've noticed that if a command for the turtle takes a significant amount of time to calculate (for example, by forcing it to do a ridiculous amount of calculation to work out a simple value), then it simply takes longer to execute each step, suggesting that maybe it is just a hardware limitation. However, when using Python's timeit module to time the execution, it seems to always execute precisely some number of actions per second for any function, regardless of the complexity of the individual action, up to a point beyond which complexity begins to slow it down. So it's as though there's some limit on the rate at which actions can occur. Additionally, this specific limit seems to occasionally change, suggesting that the computer's state does influence it to some degree.
Also, just in case, this is the timeit setup I used:
import timeit

mysetup = """
import math
import turtle as t

def DefaultDerivative(x):
    return 2*x - x

def GeneralEquation(precision=1, XShift=0, YShift=0, Derivative=DefaultDerivative):
    t.penup()
    t.goto(XShift, YShift)
    t.pendown()
    for n in range(0, int(800*precision)):
        t.setheading(math.degrees(math.atan(Derivative(((0.01*n) - (4*precision)) / precision))))
        t.fd(1/precision)

def equation1(x):
    return (2*(x**2)) + (2*x)

def equation2(x):
    return x**2

def equation3(x):
    return math.cos(x)

def equation4(x):
    return 2*x

t.speed(0)
"""

mycode = """
GeneralEquation(5, -350, 300, equation4)
"""

print("time: " + str(timeit.timeit(setup=mysetup, stmt=mycode, number=10)))
Anyway, this is my first question so I hope I explained myself well enough.
Thank you.
Is this quick enough for your purpose:
import timeit

mysetup = """
import turtle
from math import atan, cos

def DefaultDerivative(x):
    return 2 * x - x

def GeneralEquation(precision=1, XShift=0, YShift=0, Derivative=DefaultDerivative):
    turtle.radians()
    turtle.tracer(False)
    turtle.penup()
    turtle.goto(XShift, YShift)
    turtle.pendown()
    for n in range(0, int(800 * precision)):
        turtle.setheading(atan(Derivative((0.01 * n - 4 * precision) / precision)))
        turtle.forward(1 / precision)
    turtle.tracer(True)

def equation1(x):
    return 2 * x ** 2 + 2 * x

def equation2(x):
    return x ** 2

def equation3(x):
    return cos(x)

def equation4(x):
    return 2 * x
"""

mycode = """
GeneralEquation(5, -350, 300, equation4)
"""

print("time: " + str(timeit.timeit(setup=mysetup, stmt=mycode, number=10)))
Basically, I've turned off turtle's attempts at animation. I also threw in a command to make turtle think in radians so you don't need to call the degrees() function over and over. If you want to see some animation, you can tweak the argument to tracer(), e.g. turtle.tracer(20).
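A small related note (based on the standard turtle API, not on anything specific to the question): tracer(0) batches all drawing until you explicitly refresh, so for a completely still render you can pair it with update():

import turtle

turtle.tracer(0, 0)   # disable animation and the per-update delay entirely
# ... draw the whole curve here ...
turtle.update()       # push the finished drawing to the screen in one go
turtle.tracer(1, 10)  # restore the default update behaviour afterwards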
I need to write a functional test that asserts that one run is significantly faster than another.
Here is the code I have written so far:
def test_run5(self):
    cmd_line = ["python", self.__right_def_file_only_files]

    start = time.clock()
    with self.assertRaises(SystemExit):
        ClassName().run(cmd_line)
    end = time.clock()
    runtime1 = end - start

    start = time.clock()
    with self.assertRaises(SystemExit):
        ClassName().run(cmd_line)
    end = time.clock()
    runtime2 = end - start

    self.assertTrue(runtime2 < runtime1 * 1.4)
It works, but I don't like this approach because the 1.4 factor was chosen experimentally from my specific example of execution.
How would you test that the second execution is always faster than the first?
EDIT
I didn't think it would be necessary to explain, but in the context of my program it is not up to me to say what factor is significant for an unknown execution.
The whole program is a kind of Make, and it is the pipeline definition file that defines what the "significant difference of speed" is, not me:
If the definition file contains a lot of rules that are very fast, the difference in execution time between two consecutive executions will be very small, say 5% faster, but still significant.
If the definition file contains few rules but very long ones, the difference will be big, say 90% faster, so a difference of 5% would not be significant at all.
I found an equation, the Michaelis-Menten kinetics equation, which fits my needs. Here is the function that should do the trick:
def get_best_factor(full_exec_time, rule_count, maximum_ratio=1):
    average_rule_time = full_exec_time / rule_count
    return 1 + (maximum_ratio * average_rule_time / (1.5 + average_rule_time))
The full_exec_time parameter is runtime1, which is the maximum execution time for a given pipeline definition file.
rule_count is the number of rules in the given pipeline definition file.
maximum_ratio means that the second execution will be, at most, 100% faster than the first (impossible in practice).
The variable parameter of the Michaelis-Menten kinetics equation is the average rule execution time. I have arbitrarily chosen 1.5 seconds as the average rule execution time at which the second execution should be maximum_ratio / 2 faster; it is the actual parameter that depends on your use of this equation.
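For illustration, the computed factor could simply take the place of the hard-coded 1.4 in the test above; rule_count is assumed here to be available to the test (it is not defined in the snippets shown):

# Hypothetical wiring: rule_count comes from the parsed definition file,
# and the computed factor replaces the experimentally chosen 1.4.
factor = get_best_factor(runtime1, rule_count)
self.assertTrue(runtime2 < runtime1 * factor)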
I am experiencing something strange: I wrote a program to simulate economies. Instead of running these simulations one by one on one CPU core, I want to use multiprocessing to make things faster. So I run my code (fine), and I want to get some stats from the simulations I am doing. Then comes the surprise: all the simulations run at the same time yield the very same result! Is there some strange relationship between Pool() and random.seed()?
To be much clearer, here is what the code can be summarized as:
from multiprocessing import Pool

class Economy(object):
    def __init__(self, i):
        self.run_number = i
        self.Statistics = Statistics()
        self.process()

def run_and_return(i):
    eco = Economy(i)
    return eco

collection = []

def get_result(x):
    collection.append(x)

if __name__ == '__main__':
    pool = Pool(processes=4)
    for i in range(NRUN):
        pool.apply_async(run_and_return, (i,), callback=get_result)
    pool.close()
    pool.join()
process() is the method that goes through every step of the simulation (i steps). Basically, I simulate NRUN economies, from which I get the Statistics that I put in the list collection.
Now the strange thing is that the output of this is exactly the same for the first 4 runs: during the same "wave" of simulations, I get the very same output. Once I get to the second wave, then I get a different output for the next 4 simulations!
All these simulations run well if I use the same program with processes=1: I get different results when I only work on one core, taking simulations one by one... I have tried a few things, but can't get my head around this, hence my post...
Thank you very much for taking the time to read this long post, do not hesitate to ask for more precisions!
All the best,
If you are on Linux then each pool process is made by forking the parent process. This means the process is literally duplicated - this includes the seed any random object may be using.
The random module selects the seed for its default functions on import, meaning the seed has already been selected before you create the Pool.
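A tiny way to see this for yourself on a fork-based platform (Linux; the "spawn" start method used on Windows behaves differently) is the sketch below: each worker inherits a copy of the parent's Mersenne Twister state, so the first value drawn in each distinct worker comes out the same.

import os
import random
from multiprocessing import Pool

def first_draw(_):
    # Each forked worker starts from a copy of the parent's RNG state.
    return os.getpid(), random.random()

if __name__ == '__main__':
    with Pool(4) as pool:
        # Different pids, but the first value drawn in each worker matches.
        print(pool.map(first_draw, range(4)))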
To get around this you must use an initialiser for each pool process that sets the random seed to something unique.
A decent way to seed random would be to use the process id and the current time. The process id is bound to be unique on a single run of your program. Whilst using the time will ensure uniqueness over multiple runs in case the same process id is produced. Passing process id and time through as a string will mean that the digest of the string is also used to seed the random number generator -- meaning two similar strings will produce substantially different seeds. Alternatively, you could use the uuid module to generate seeds.
def proc_init():
    random.seed(str(os.getpid()) + str(time.time()))

pool = Pool(num_procs, initializer=proc_init)
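Put together, wiring the initializer into the original Pool setup might look something like this (a minimal, self-contained sketch: run_and_return below is a stand-in for building an Economy, not the code from the question):

import os
import random
import time
from multiprocessing import Pool

def proc_init():
    # Runs once in each worker process right after it starts,
    # giving every worker its own random seed.
    random.seed(str(os.getpid()) + str(time.time()))

def run_and_return(i):
    # Stand-in for building an Economy; returns something seed-dependent.
    return i, random.random()

if __name__ == '__main__':
    collection = []
    pool = Pool(processes=4, initializer=proc_init)
    for i in range(8):
        pool.apply_async(run_and_return, (i,), callback=collection.append)
    pool.close()
    pool.join()
    print(collection)   # the random values should now differ between workers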
I have a function that runs a tick() for all players and objects within my game server. I do this by looping through a set every .1 seconds. I need it to be a solid .1. Lots of timing and math depends on this pause being as exact as possible to .1 seconds. To achieve this, I added this to the tick thread:
start_time = time.time()

# loops and code and stuff for tick thread in here...

time_lapsed = time.time() - start_time  # get the time it took to run the above code

if 0.1 - time_lapsed > 0:
    time.sleep(0.1 - time_lapsed)
else:
    print("Server is overloaded!")
    # server lag is greater than .1, so don't sleep, and just eat it on this run.
    # the goal is to never see this.
My question is, is this the best way to do this? If the duration of my loop is 0.01, then time_lapsed == 0.01 ... and then the sleep should only be for 0.09. I ask, because it doesn't seem to be working. I started getting the overloaded server message the other day, and the server was most definitely not overloaded. Any thoughts on a good way to "dynamically" control the sleep? Maybe there's a different way to run code every tenth of a second without sleeping?
It would be better to base your "timing and math" on the amount of time actually passed since the last tick(). Depending on "very exact" timings will be fragile at the best of times.
Update: what I mean is that your tick() method would take an argument, say "t", of the elapsed time since the last call. Then, to do movement you'd store each thing's position (say in pixels) and velocity (in "pixels/second") so the magnitude of its movement for that call to tick() becomes "velocity * t".
This has the additional benefit of decoupling your physics simulation from the frame-rate.
I see pygame mentioned below: their "pygame.time.Clock.tick()" method is meant to be used this way, as it returns the number of seconds since the last time you called it.
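A rough sketch of what that looks like (names and the toy "things" structure are illustrative, not from the question): the loop measures the real elapsed time with a monotonic clock and hands it to tick(), so movement stays correct even when a frame overruns its slot.

import time

def tick(dt, things):
    # Advance the simulation by dt seconds, e.g. position += velocity * dt.
    for thing in things:
        thing['x'] += thing['vx'] * dt

things = [{'x': 0.0, 'vx': 5.0}]
TICK = 0.1
previous = time.perf_counter()
while True:
    now = time.perf_counter()
    tick(now - previous, things)
    previous = now
    # Sleep off whatever is left of the 0.1 s slot; perf_counter() is
    # monotonic, so it cannot jump backwards the way time.time() can.
    remaining = TICK - (time.perf_counter() - now)
    if remaining > 0:
        time.sleep(remaining)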
Other Python threads may run in between, leaving your thread less time. Also, time.time() is subject to system time adjustments; it can be set back.
There is a similar function Clock.tick() in pygame. Its purpose is to limit the maximum frame rate.
To avoid outside influence you could keep an independent frame/turn-based counter to measure the game time.
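One possible shape for that independent counter (a sketch, not code from the answer) is to schedule each turn against an absolute deadline, so small sleep inaccuracies don't accumulate from one tick to the next:

import time

TICK = 0.1
turn = 0
start = time.perf_counter()
while True:
    turn += 1
    # ... run tick() for every player and object here ...
    deadline = start + turn * TICK             # absolute target for this turn
    remaining = deadline - time.perf_counter()
    if remaining > 0:
        time.sleep(remaining)
    else:
        print("Server is overloaded!")         # the turn overran its slot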