I'm learning Redis and trying to measure its performance. In this setup I set 5,000,000 keys and measured the time of each individual SET operation.
I got the following plot (the x-axis is the operation number, the y-axis is the time in milliseconds):
I'm noticing that Redis performance degrades dramatically over time. I also measured the performance with 1,000,000 keys:
I can see plateaus of bad performance, although the operation time drops back down after some interval.
My question: if anybody has done similar research on Redis performance, could you discuss these results? It is important for me to understand why the server shows such "time jumps" and how I can improve its performance.
In case somebody needs my Python code:
import time
import redis
import matplotlib.pyplot as plt

redis_server = redis.Redis()          # assumes a local Redis instance

def setKeyValue(key, value):
    assert key is not None, "Please provide a key"
    redis_server.set(key, value)

def loopKeyValues(number):
    timeUse = []
    start_tot = time.time()
    for x in range(number):
        start = time.time()
        # setHashKeyValue('mydata', x, x**2)
        setKeyValue(x, x**2)
        end = time.time()
        timeUse.append(end - start)
    end_tot = time.time()
    time_total = end_tot - start_tot
    plt.plot(timeUse)
    plt.ylabel("time")
    plt.show()
    return time_total
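As a side note on the measurement itself, here is a sketch of a variant that may make the spikes easier to quantify: time.perf_counter is a monotonic, higher-resolution clock than time.time, and summary percentiles are often more robust than eyeballing a raw per-operation plot. This is not part of the original code, just an assumed alternative using the same redis_server client:

import time
import statistics

def timeSetOperations(number):
    durations = []
    for x in range(number):
        start = time.perf_counter()                    # monotonic, higher resolution than time.time()
        redis_server.set(x, x**2)
        durations.append(time.perf_counter() - start)
    durations.sort()
    print("median: %.6f s" % statistics.median(durations))
    print("p99:    %.6f s" % durations[int(0.99 * len(durations))])
    print("max:    %.6f s" % durations[-1])
    return durations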
Related
The project
I am conducting a project where I need to both detect faces (bounding boxes and landmarks) and perform face recognition (identify a face). The detection is really fast (only a few milliseconds on my laptop) but the recognition can be really slow (about 0.4 seconds on my laptop). I am using the face_recognition Python library to do so. After a few tests, I discovered that it is the embedding of the image that is slow.
Here is some example code to try it out for yourself:
# Source : https://pypi.org/project/face-recognition/
import face_recognition
known_image = face_recognition.load_image_file("biden.jpg")
biden_encoding = face_recognition.face_encodings(known_image)[0]
image = face_recognition.load_image_file("your_file.jpg")
face_locations = face_recognition.face_locations(image)
face_landmarks_list = face_recognition.face_landmarks(image)
unknown_encoding = face_recognition.face_encodings(image)[0]
results = face_recognition.compare_faces([biden_encoding], unknown_encoding)
The problem
What I need to do is process a video (30 FPS), so 0.4 s of computation per frame is unacceptable. My idea is that the recognition only needs to run a few times rather than on every frame, since from one frame to the next, if there are no cuts in the video, a given head will be close to its previous position. So the first time a head appears we run the recognition, which is very slow, but for the next X frames we won't have to, since we'll detect that the position is close to the previous one and conclude it must be the same person that moved. Of course, this approach is not perfect, but it seems like a good compromise and I would like to try it.
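To make the "close to its previous position" idea concrete, here is a rough sketch of such a proximity check. It assumes face_recognition's (top, right, bottom, left) box order, and the 50-pixel threshold is an arbitrary value I made up for illustration:

import math

def box_center(box):
    top, right, bottom, left = box                 # face_recognition box order
    return ((left + right) / 2.0, (top + bottom) / 2.0)

def same_head(prev_box, new_box, max_dist=50.0):
    # Treat two detections as the same head if their centres stay close between frames
    (x1, y1) = box_center(prev_box)
    (x2, y2) = box_center(new_box)
    return math.hypot(x2 - x1, y2 - y1) <= max_dist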
The only problem is that with this approach the video is smooth until a head appears, then it freezes while the recognition runs, and then becomes smooth again. This is where I would like to introduce multiprocessing: I want to compute the recognition in parallel with looping through the frames of the video. If I manage to do so, I will only have to process a few frames in advance, so that when a face shows up its recognition has already been computed over the preceding frames and the frame rate never visibly drops.
Simple formulation
So, here is what I have (in Python pseudo-code so that it is clearer):
def slow_function(frame):
    # This function takes a lot of time to compute and would normally slow down the loop
    return Recognize(frame)

# Loop that we need to maintain at a given speed
person_name = "unknown"
frame_index = -1
while True:
    frame_index += 1
    frame = new_frame()  # how the frame is obtained is not important here, so not detailed
    # Every ten frames, we run a heavy function
    if frame_index % 10 == 0:
        person_name = slow_function(frame)
    # Each frame we use person_name, even though we only recompute it every so often
    frame.drawText(person_name)
And I would like to do something like this:
def slow_function(frame):
    # This function takes a lot of time to compute and would normally slow down the loop
    return Recognize(frame)

# Loop that we need to maintain at a given speed
person_name = "unknown"
frame_index = -1
while True:
    frame_index += 1
    frame = new_frame()  # how the frame is obtained is not important here, so not detailed
    # Every ten frames, we run a heavy function
    if frame_index % 10 == 0:
        DO slow_function(frame) IN parallel WITH CALLBACK(person_name = result)
    # Each frame we use person_name, even though we only recompute it every so often
    frame.drawText(person_name)
The goal is to compute a slow function while several iterations of the loop go by.
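For reference, the hypothetical DO ... IN parallel WITH CALLBACK line can be expressed with the standard library. The sketch below uses concurrent.futures, with sleeps standing in for the slow recognition and the fast frame loop; all names and timings are made up, so treat it as an illustration rather than a drop-in solution:

from concurrent.futures import ProcessPoolExecutor
import time

def slow_function(frame_index):
    time.sleep(0.4)                          # stands in for the slow recognition
    return "person_%d" % frame_index

def main():
    person_name = "unknown"

    def on_done(future):
        nonlocal person_name
        person_name = future.result()        # runs in the main process when the worker finishes

    with ProcessPoolExecutor(max_workers=1) as pool:
        for frame_index in range(50):
            if frame_index % 10 == 0:
                pool.submit(slow_function, frame_index).add_done_callback(on_done)
            time.sleep(0.033)                # roughly the 30 FPS frame budget
            print(frame_index, person_name)

if __name__ == "__main__":
    main()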
What I have tried
I looked up multiprocessing and Ray, but I did not find examples of what I want to do. Most of the time I found people using multiprocessing to compute the result of a function for several different inputs at the same time. This is not what I want. I want to run, in parallel with the loop, a process that accepts data from the loop (a frame), does some computation, and returns a value to the loop without interrupting or slowing it down (or at least spreading the slowdown out, rather than having one really slow iteration and nine fast ones).
I think I found pretty much how to do what I want. Here is an example:
from multiprocessing import Pool
import time

# Busy-wait sleep; this seems to me more precise than time.sleep()
def sleep(duration, get_now=time.perf_counter):
    now = get_now()
    end = now + duration
    while now < end:
        now = get_now()

def myfunc(x):
    time.sleep(1)          # simulates the slow recognition
    return x

def mycallback(x):
    print('Callback for i = {}'.format(x))

if __name__ == '__main__':
    pool = Pool()

    # Approx. 5 s in total
    # Without parallelization, this would take about 15 s
    t0 = time.time()
    titer = time.time()
    for i in range(100):
        if i % 10 == 0:
            pool.apply_async(myfunc, (i,), callback=mycallback)
        sleep(0.05)        # 50 ms per "frame"
        print("- i =", i, "/ Time iteration:", 1000*(time.time()-titer), "ms")
        titer = time.time()
    print("\n\nTotal time:", (time.time()-t0), "s")

    t0 = time.time()
    for i in range(100):
        sleep(0.05)
    # 10 * total seconds = average milliseconds per 0.05 s sleep over the 100 calls
    print("\n\nBenchmark sleep time:", 10*(time.time()-t0), "ms")
Of course, I will need to add flags so that I do not write a value with the callback at the same time that I read it in the loop.
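Regarding that last point, here is a minimal sketch of one way to guard the shared value. Pool callbacks run in a helper thread of the parent process, so a threading.Lock around the write (in the callback) and the read (in the loop) should be enough; the names are illustrative, not part of the code above:

import threading

result_lock = threading.Lock()
person_name = "unknown"               # shared between the main loop and the callback

def mycallback(result):
    global person_name
    with result_lock:                 # writer side, runs in the pool's result-handler thread
        person_name = result

def current_name():
    with result_lock:                 # reader side, called from the main loop each frame
        return person_name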
At work I need to take a sample every 0.08 seconds over a period of 10 seconds.
I tried a while loop, but it fails.
import time

start_t = time.time()
while time.time() - start_t <= 10:
    if float(time.time() - start_t) % float(0.08) == 0:
        """do sample record"""
In the end I got no data at all; I think the condition if float(time.time() - start_t) % float(0.08) == 0: never holds.
I am confused about how to write the condition that triggers the sampling code.
The easiest way is to use time.sleep:
from time import sleep

for i in range(125):
    """do sample record"""
    sleep(0.08)
You probably get no data because you check the elapsed time only at discrete moments, and those moments are essentially never exact multiples of 0.08.
Q : "How to accurately sample in python"
At work ( Chongqing ),I have a need: to do sampling every 0.08 seconds in 10 seconds.
Given the python is to be used, the such precise sampling will need a pair of signal.signal()-handlers on the unix-systems,
import signal

#------------------------------------------------------------------
# DEFINE HANDLER, responsible for a NON-BLOCKING data-acquisition
#------------------------------------------------------------------
def aSIG_HANDLER( aSigNUM, aPythonStackFRAME ):
    # ... collect data ...
    return

#------------------------------------------------------------------
# SET THE SIGNAL->HANDLER MAPPING
#------------------------------------------------------------------
signal.signal( signal.SIGALRM, aSIG_HANDLER )

#------------------------------------------------------------------
# SET THE INTERVAL OF SIGNAL-ACTIVATIONS
#------------------------------------------------------------------
signal.setitimer( signal.ITIMER_REAL, 0.08,   # FIRST SIGALRM AFTER 80 [ms]
                                      0.08 )  # THEN FIRE EVERY 80 [ms]

#------------------------------------------------------------------
# ... more or less accurately wait for 10 seconds, doing NOP-s ...
#------------------------------------------------------------------

#----------------------------------------------------------------
# AFTER 10 [s] turn off the signal.ITIMER_REAL activated launcher
#----------------------------------------------------------------
signal.setitimer( signal.ITIMER_REAL, 0.0,    # SECONDS = 0 ...
                                      0.0 )   # ... DISARMS THE TIMER, NO MORE SIGALRM-s
or, for Windows-based systems, there is a chance to tweak (and fine-tune up to a self-correcting, i.e. non-drifting) Tkinter-based sampler, as shown in this answer.
import tkinter as tk

class App():
    def __init__( self ):
        self.root  = tk.Tk()
        self.label = tk.Label( text = "init" )
        self.label.pack()
        self.sampler_get_one()   # initial call to set a scheduled sampler
        self.root.lower()        # hide the Tk-window from GUI-layout
        self.root.mainloop()

    def sampler_get_one( self ):
        # \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
        # DEMO to show real plasticity of the Tkinter scheduler timing(s)
        # /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
        # ... review the drift of this activation + adapt the actual delay for the next .after() ...
        # SET .after() vv------------- re-calculate this value to adapt/avoid drifting
        self.root.after( 80,                    # re-instate a next scheduled call,
                         self.sampler_get_one
                         )                      # .after a given ( self-corrected ) delay in [ms]
        #--------------------------------------- NOW:
        # ... acquire data ...                  # best in a non-blocking manner & leave ASAP
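The "re-calculate this value to adapt/avoid drifting" step could, for example, be based on an absolute schedule kept with time.perf_counter. The class below is only a minimal sketch of that idea under those assumptions, not the original answer's code:

import time
import tkinter as tk

class DriftFreeSampler:
    PERIOD = 0.080                                   # 80 ms

    def __init__(self):
        self.root = tk.Tk()
        self.root.withdraw()                         # no visible window needed
        self.next_due = time.perf_counter()
        self.take_one()
        self.root.mainloop()

    def take_one(self):
        self.next_due += self.PERIOD                 # absolute schedule, so errors do not add up
        delay_ms = max(1, int((self.next_due - time.perf_counter()) * 1000))
        self.root.after(delay_ms, self.take_one)     # re-arm with the corrected delay
        # ... acquire the data here, non-blocking, and return quickly ...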
You are taking the modulo of one float by another, and time.time() returns a number with many decimal places, so you get no data because the result is always something like 0.00001234 rather than exactly zero. I think you should use round to get a 2-decimal number:
temp = time.time() - start_t
if round(temp, 2) % 0.08 == 0:
    """do sample record"""
However, this will trigger about 27,000 times in 10 seconds, because the rounded value stays at the same multiple of 0.08 over many consecutive loop iterations and each of them runs your recording code.
So I think you should go with Maximilian Janisch's solution (using the sleep function) instead; I just wanted to explain why your approach returns nothing.
Hope this is helpful!
EPILOGUE:
With all due respect, the proposed code is awfully dangerous & misleading. Just test how naive it gets: 8.00 % 0.08 yields 0.07999999999999984, which is by no means == 0, while the if-condition ought to be served & a sample taken, if it were not for the (known) trap of IEEE-754 real-number handling. To see the scope of the disaster, try sum( [ round( i * 0.08, 2 ) % 0.08 == 0 for i in range( 126 ) ] ) and compare it with the 125 samples the task above was defined to acquire. Getting 8 samples instead of 125 at a regular 12.5 [Hz] sampling rate is nowhere near a solution! – user3666197 22 hours ago
@user3666197 wow, a very clear explanation, I think I should delete this answer to avoid misleading people in the future. Thank you! – Toby 5 hours ago
Better not to remove the answer, as it documents what should never be done, which is of specific value to the Community; best to mention the rationale for not using this kind of approach in any real-life system. The overall lesson is positive: we all learned a next step towards better system designs. I wish you all the best, man! – user3666197 4 mins ago
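The trap described in that comment is easy to reproduce; the two results below are the ones quoted above:

print(8.00 % 0.08)
# 0.07999999999999984 -- by no means == 0

# how many of the 126 expected sample instants would the round()-based test accept?
print(sum(round(i * 0.08, 2) % 0.08 == 0 for i in range(126)))
# 8 (as the comment above reports), not 125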
This will probably never be exactly true, since an equality check on floats like this would need to be perfectly precise.
Try doing something like:
import time

start_t = time.time()
looped_t = start_t
while time.time() - start_t <= 10:
    if time.time() - looped_t >= 0.08:
        looped_t = time.time()
        """do sample record"""
The sleep answer from Maximilian is fine as well, except that if your sampling takes a significant amount of time (several hundredths of a second) you will not stay near the 10-second requirement.
It also depends on what you prioritize, as this method will provide at most 124 samples instead of the exact 125 you would expect (and do get with the sleep function).
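If the recording itself takes non-trivial time, one compromise (a sketch, not part of either answer above) is to sleep until absolute deadlines instead of a fixed 0.08 s, which keeps both the sample count and the 10-second window:

import time

start = time.perf_counter()
for i in range(125):
    """do sample record"""
    deadline = start + (i + 1) * 0.08          # absolute schedule: no cumulative drift
    remaining = deadline - time.perf_counter()
    if remaining > 0:
        time.sleep(remaining)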
I have written a small game about the knight's tour problem in Python.
When I finished the algorithm part of the game, I found it took about 0.01 seconds to find a successful path on an 8*8 chess board. But when I ran it on the computer in my office, it took over 10 seconds to find the same path. Then I tried it on three other computers; the results were about 0.005, 6 and 8 seconds.
Why does the execution speed of the same code differ so hugely across the five computers? The results are 0.005, 0.010, 6, 8 and 10 seconds, so the difference can be over 1000x. The hardware of the computers that took 6 s and 8 s is similar to or better than that of the 0.01 s one, and even if hardware affects the speed, it can't account for a factor of about 1000.
I have corrected my code; the first version had a mistake. I am using Python 3.6, and the test has been changed to an 8*8 board; I'm sorry that I misremembered.
The following is the code.
import sys
import time

def init_path(size):
    allow = []
    for i in range(size):
        for j in range(size):
            allow.append([i, j])
    return allow

def get_next_choice(step, hist, raws, allow):
    num = 0
    for raw in raws:
        nextstep = [raw[i]+step[i] for i in range(2)]
        if nextstep in allow and nextstep not in hist:
            num += 1
    return num

def search_next(size, pos, history, allow):
    nextsteps = {}
    raws = [[1,2], [1,-2], [2,1], [2,-1], [-1,2], [-1,-2], [-2,1], [-2,-1]]
    if len(history) == size*size:
        return True
    for raw in raws:
        nextstep = [raw[i]+pos[i] for i in range(2)]
        if nextstep in allow and nextstep not in history:
            next_choice = get_next_choice(nextstep, history, raws, allow)
            nextsteps[next_choice] = nextstep
    sorted(nextsteps.items())
    for nextstep in nextsteps.values():
        history.append(nextstep)
        back = search_next(size, nextstep, history, allow)
        if back:
            return True
        else:
            history.pop()
    else:
        return False

def search_path(size, history):
    allow = init_path(size)
    position = history[-1]
    back = search_next(size, position, history, allow)
    if back:
        return history
    else:
        return False

atime = time.time()
path = search_path(8, [[0,0]])
btime = time.time()
print(btime - atime)
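One way to compare the amount of work done on each machine, rather than just the wall-clock time, is to count how many times search_next is called. This instrumentation is not part of the original program, only a sketch; if the counts differ wildly between machines, the slowdown comes from the search exploring different branches (for example because dict iteration order differs between Python versions) rather than from raw hardware speed.

import sys

call_count = 0

def count_calls(frame, event, arg):
    global call_count
    if event == "call" and frame.f_code.co_name == "search_next":
        call_count += 1

sys.setprofile(count_calls)
path = search_path(8, [[0, 0]])
sys.setprofile(None)
print("search_next was called", call_count, "times")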
Different computers have different hardware: different clock speeds, different amounts of RAM, and other specifications that may make code run faster or slower.
That is why something called 'asymptotic notation' exists in the first place. Since we can't predict the speed or the time a piece of code will take to run, because every machine has different specifications, we use asymptotic notation as a standard way to describe the time complexity of a given piece of code.
The computer in your office may have different hardware, less memory, a lower clock speed and other hardware-related factors that cause the exact same code to run more slowly, whereas a better computer with a faster configuration will run the same code much faster.
You're performing a task that is computationally expensive and requires a lot of memory and processing speed.
In my app I want to allow the user to scroll through images by holding down an arrow key. Not surprisingly, with larger images the PC can't keep up, and it builds up a potentially large backlog of key events that carries on being processed after the key is released.
None of this is unexpected, and my normal answer is just to check the timestamp in the event against the current time and discard any events that are more than (say) 0.2 seconds old. This way the backlog can never get too large.
But tkinter timestamps events with its own arbitrary timebase, so comparing against time.time() is meaningless, and I can't find a function to get hold of tkinter's own clock. I'm sure it's in there; it's just that most of the pythonised tkinter documentation is a bit naff, and searching for "time" or "clock" isn't helping either.
def plotprev(self, p):
    if time.time() - p.time > .2:
        return
Sadly this condition is always true, so every event gets discarded. Where is tkinter's pseudo clock to be found?
Any other method will be complex in comparison.
Well, it's not very nice, but it isn't too tedious and seems to work quite well (with a little bit of monitoring as well):
def checklag(self, p):
    if self.lasteventtime is None:  # assume the first event arrives with no significant delay
        self.lasteventtime = p.time
        self.lasteventrealtime = time.time()
        self.lagok = 0
        self.lagfail = 0
        return True
    ptdiff = (p.time - self.lasteventtime) / 1000   # event timestamps are in milliseconds
    rtdiff = time.time() - self.lasteventrealtime
    lag = rtdiff - ptdiff
    if lag < .3:
        self.lagok += 1
        if self.lagok % 20 == 0:
            print("lagy? OK: %d, fail: %d" % (self.lagok, self.lagfail))
        return True
    else:
        self.lagfail += 1
        return False
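For completeness, a sketch of how this could be wired up; the attribute initialisation and the key binding below are assumptions, not part of the original code:

# assumed initialisation, e.g. in __init__:
#     self.lasteventtime = None
#     self.root.bind("<KeyPress-Left>", self.plotprev)

def plotprev(self, p):
    if not self.checklag(p):      # stale event from the key-repeat backlog: skip it
        return
    # ... load and display the previous image ...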
I'm developing a process that should read data at consistent intervals. The time it takes to read the data varies depending on the network. I thought this would be straightforward, but I can never get consistent timing. I'm looking for a more consistent and stable approach that responds well to network speed variability.
Currently I am using a model like the following:
|<--read data-->|<--post process-->|<--sleep x seconds to maintain period-->|
|<------------------------------known data rate---------------------------->|
My code does something like this:
import time

data_rate = 5    # Hz
old_start = None

while True:
    # read in data
    rd_start = time.time()
    data = getdata()        # placeholder for the network read
    rd_stop = time.time()

    # post-processing
    pp_start = time.time()
    rate = 1.0/(rd_start - old_start) if old_start else data_rate
    old_start = rd_start
    print(rate)
    post_process(data)      # placeholder for the post-processing step
    pp_stop = time.time()

    # sleep whatever is left of the period
    sleep_time = 1.0/data_rate - ((rd_stop - rd_start) + (pp_stop - pp_start))
    sleep_time = sleep_time if sleep_time > 0 else 0
    time.sleep(sleep_time)
I also have some logic that lowers the update rate (data_rate) if the network is having trouble keeping up (i.e. the sleep times are consistently negative), but that part is working correctly.
For some reason my data rate is never consistent (it runs at about 4.92 Hz when it stabilizes), and this method is pretty unstable. What is a better way to do this? threading.Timer comes to mind?
Could the consistent offset in frequency be caused by errors with time.sleep()?
How accurate is python's time.sleep()?
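One pattern that may help with the stabilized offset below the target rate is to schedule each cycle against an absolute deadline, so the time spent measuring, printing and overshooting time.sleep() is not silently added to every period. This is only a sketch, reusing the getdata/post_process placeholders from the code above:

import time

data_rate = 5                      # Hz
period = 1.0 / data_rate
next_deadline = time.time() + period

while True:
    data = getdata()               # placeholder, as above
    post_process(data)             # placeholder, as above
    sleep_time = next_deadline - time.time()
    if sleep_time > 0:
        time.sleep(sleep_time)
    next_deadline += period        # fixed schedule, so small per-cycle errors do not accumulate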