Create Two sequential tasks for images in Vizard - python

I'm new to Vizard. I'm trying to write a simple script that performs two tasks sequentially, each for a fixed duration:
A black image for 0.8 seconds
A sequence of images (from a folder), shown in random order, for 1.5 seconds.
I can perform these tasks separately, but I can't combine them. Any suggestions would be appreciated.
import viz
import vizact
import vizinfo
import viztask
import random
viz.setMultiSample(4)
viz.fov(60)
viz.go()
vizinfo.InfoPanel()
viz.clearcolor(viz.BLACK)
FRAME_RATE = 0.667 # in Hertz
r = list(range(7))
random.shuffle(r)
movieImages = viz.cycle( [ viz.addTexture('sequence_IMG/img%d.jpg' % i) for i in r ] )
screen = viz.addTexQuad()
screen.setPosition([0, 1.82, 1.5])
screen.setScale([4.0/3, 1, 1])
def executeExperiment():
    for trialNumber in range(3):
        yield Dark() #wait for doTrial to finish
        yield vizact.ontimer(1.0/FRAME_RATE, NextMovieFrame)
        print('Trial Done: ', trialNumber)
    print('done with experiment')
#Setup timer to swap texture at specified frame rate
def NextMovieFrame():
    screen.texture(movieImages.next())
def Dark():
    yield viztask.waitTime(1) #wait for 1 second
    viz.clearcolor(viz.BLACK)
vizact.ontimer(1.0/FRAME_RATE, NextMovieFrame)
viztask.schedule(executeExperiment())
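A minimal sketch of how the two phases could be sequenced inside one generator task. It is plain Python so it can be tried without Vizard: `show_dark`, `show_image`, `wait`, and `per_image` are hypothetical names introduced here, and `wait` is assumed to be `viztask.waitTime` when run under Vizard.

```python
import random

DARK_SECONDS = 0.8    # duration of the black screen, from the question
IMAGE_SECONDS = 1.5   # total duration of the image phase, from the question


def run_trials(n_trials, images, show_dark, show_image, wait, per_image=0.5):
    """Generator task: black screen, then a shuffled image sequence, per trial.

    `wait` is a hypothetical hook. Under Vizard it would be viztask.waitTime,
    so every `yield wait(...)` suspends the task for that many seconds.
    `per_image` (seconds per image) is an assumed value, not from the question.
    """
    for _ in range(n_trials):
        # Phase 1: black screen for a fixed time.
        show_dark()
        yield wait(DARK_SECONDS)

        # Phase 2: images in random order until the time budget is used up.
        order = random.sample(images, len(images))  # shuffled copy
        elapsed = 0.0
        for image in order:
            if elapsed >= IMAGE_SECONDS:
                break
            show_image(image)
            step = min(per_image, IMAGE_SECONDS - elapsed)
            yield wait(step)
            elapsed += step
```

Under Vizard this would presumably be started with something like `viztask.schedule(run_trials(3, textures, hide_screen, screen.texture, viztask.waitTime))`, where `hide_screen` is a small function of your own (for example one that hides the quad or applies a black texture). If the folder holds fewer images than the time budget needs, the image phase simply ends early.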

Related

Use multiprocessing and multithreading to print on a image

I am having trouble using multiprocessing and threading to speed up my program.
My program reads a list of points from an Excel file and creates a grayscale image from these points.
The problem is that I have a million points and processing takes around 1 minute. I am sure there is a way to speed it up.
Here is the code without threading:
import os
import math
import json
import time
import pandas as pd
from PIL import Image
# FUNCTIONS
def CreateDataFrame(path, columns):
    print('DataFrame creation ... ', end='')
    with open(path) as excel_file:
        lines = excel_file.read().splitlines()
    np_array = []
    for line in lines:
        np_array.append(list(map(float, line.split(' '))))
    print('done')
    return pd.DataFrame(np_array, columns=columns)
def GetOffsets(df):
    print('Getting offsets ... ', end='')
    dict = {}
    for c in df.columns:
        dict[c] = min(df[c])
    print('done')
    return dict
def GetMaximums(df, offsets):
    print('Getting Maximums ... ', end='')
    max_dict = {}
    for c in df.columns:
        max_dict[c] = max(df[c]) - offsets[c]
    print('done')
    return max_dict
def CreateImage(maximums, scale = 1):
    return Image.new('RGB', (int(maximums['x'] * scale) + 1, int(maximums['z'] * scale) + 1), color='black')
# MAIN
columns = ['x', 'z', 'y']
scale = 1
df = CreateDataFrame('raw data.csv', columns)
offsets = GetOffsets(df)
maximums = GetMaximums(df, offsets)
img = CreateImage(maximums, scale)
pixels = img.load()
print('Printing ... ', end='')
for i in range(len(df)):
    line = df.iloc[i]
    color = int(255 * (line['y'] - offsets['y']) / maximums['y'])
    pixels[int((maximums['x'] - (line['x'] - offsets['x'])) * scale), int((line['z'] - offsets['z']) * scale)] = (color, color, color)
print('done')
img.save(f'terrain {scale}.png')
If you are interested, this is how it works. First, I create a DataFrame from the file and assign the column names. Then I take the minimum of each column to get my offset values. Next, I do the same with the maximums (minus the offsets). With those maximums, I can create an image sized by the maximum x and z values. Finally, I iterate over the DataFrame, using x and z for the pixel position and scaling the y value to a gray level.
Now I am trying to multiprocess/thread it. To do that, I added this code:
import concurrent.futures
def Process(id, start, end):
    start_time = time.time()
    for i in range(start, end):
        line = df.iloc[i]
        color = int(255 * (line['y'] - offsets['y']) / maximums['y'])
        pixels[int((maximums['x'] - (line['x'] - offsets['x'])) * scale), int((line['z'] - offsets['z']) * scale)] = (color, color, color)
    print(f"Thread {id} ends: {time.time() - start_time}s")
nb_thread = 12
df_size = len(df)
nb_full_thread = df_size // nb_thread
thread_rest = df_size % nb_thread
with concurrent.futures.ThreadPoolExecutor(max_workers=nb_thread) as executor:
    for i in range(nb_thread):
        executor.submit(Process, i, i*nb_full_thread, (i+1) * nb_full_thread - 1)
        print(f"Process {i} launched")
img.save(f'threading terrain {scale}.png')
Some explanations: nb_thread is the number of threads I want to create. Then I get the number of lines in my DataFrame (df_size). This is used to determine how many lines each thread will handle. Once done, I create the threads so that they process the image, and then I save the image.
With ThreadPoolExecutor, the program works but takes the same amount of time as the previous version. With ProcessPoolExecutor, the for loop in the Process function does not work as expected: it stops at the first value of the range.
I don't understand these behaviours, which is why I am turning to you.
I hope this is clear enough. Do not hesitate to ask if you have a question.
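As a hedged sketch of the steps described above (offsets, maximums, pixel position from x and z, gray level from y), the per-row loop can be written as NumPy array operations instead of being split across threads. This swaps in a different technique from the one in the question; `points_to_gray` is a hypothetical helper name, and it returns a 2-D `uint8` array rather than a PIL image.

```python
import numpy as np


def points_to_gray(points, scale=1):
    """Rasterize rows of (x, z, y) into a grayscale array, without a Python loop.

    Column order follows the question: x, z, y. The function name is
    hypothetical, not from the original code.
    """
    pts = np.asarray(points, dtype=float)
    offsets = pts.min(axis=0)               # same role as GetOffsets
    maximums = pts.max(axis=0) - offsets    # same role as GetMaximums
    x, z, y = (pts - offsets).T

    cols = ((maximums[0] - x) * scale).astype(int)  # mirrored x, as in the loop
    rows = (z * scale).astype(int)
    span = maximums[2] if maximums[2] > 0 else 1.0  # avoid dividing by zero
    gray = (255 * y / span).astype(np.uint8)

    height = int(maximums[1] * scale) + 1
    width = int(maximums[0] * scale) + 1
    image = np.zeros((height, width), dtype=np.uint8)
    image[rows, cols] = gray                # one assignment for all points
    return image
```

The result can presumably be saved with `Image.fromarray(points_to_gray(df.to_numpy())).save(...)`. When several points land on the same pixel, which one wins should not be relied upon.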

How to save data to csv Cantera and error <cantera.composite.SolutionArray object at 0x7f4badca0fd0>

I have this Cantera script. I want to save data to CSV for both parts of the script: the first, which evaluates Tfinal vs. autoignition delay time, and the second, which evaluates the NTC behavior. For the first part, the example suggests uncommenting # timeHistory.to_csv("time_history.csv"), but that doesn't work. I suspect I need to create a DataFrame because the object is not one. I also saw this error: <cantera.composite.SolutionArray object at 0x7f4badca0fd0>.
How can I solve this, and how can I create the two CSV files for this script?
Thank you very much.
import pandas as pd
import numpy as np
import time
import cantera as ct
print('Running Cantera version: ' + ct.__version__)
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 18
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12
plt.rcParams['figure.autolayout'] = True
plt.style.use('ggplot')
plt.style.use('seaborn-pastel')
gas = ct.Solution('Seiser.cti')
# Define the reactor temperature and pressure
reactor_temperature = 1000 # Kelvin
reactor_pressure = 101325 # Pascals
gas.TP = reactor_temperature, reactor_pressure
# Define the fuel, oxidizer and set the stoichiometry
gas.set_equivalence_ratio(phi=1.0, fuel="nc7h16", oxidizer={"o2": 1.0, "n2": 3.76})
# Create a batch reactor object and add it to a reactor network
# In this example, the batch reactor will be the only reactor
# in the network
r = ct.IdealGasReactor(contents=gas, name="Batch Reactor")
reactor_network = ct.ReactorNet([r])
# use the above list to create a DataFrame
time_history = ct.SolutionArray(gas, extra="t")
def ignition_delay(states, species):
    """
    This function computes the ignition delay from the occurence of the
    peak in species' concentration.
    """
    i_ign = states(species).Y.argmax()
    return states.t[i_ign]
reference_species = "oh"
# Tic
t0 = time.time()
# This is a starting estimate. If you do not get an ignition within this time, increase it
estimated_ignition_delay_time = 0.1
t = 0
counter = 1
while t < estimated_ignition_delay_time:
    t = reactor_network.step()
    if not counter % 10:
        # We will save only every 10th value. Otherwise, this takes too long
        # Note that the species concentrations are mass fractions
        time_history.append(r.thermo.state, t=t)
    counter += 1
# We will use the 'oh' species to compute the ignition delay
tau = ignition_delay(time_history, reference_species)
# Toc
t1 = time.time()
print(f"Computed Ignition Delay: {tau:.3e} seconds. Took {t1-t0:3.2f}s to compute")
# If you want to save all the data - molefractions, temperature, pressure, etc
# >>>>>>>>>>>>>>>>>>>>>>>>uncomment the next line
time_history.to_csv("time_history_TEST.csv")
plt.figure()
plt.plot(time_history.t, time_history(reference_species).Y, "-o")
plt.xlabel("Time (s)")
plt.ylabel("$Y_{OH}$")
plt.xlim([0,0.05])
plt.arrow(0, 0.008, tau, 0, width=0.0001, head_width=0.0005,
          head_length=0.001, length_includes_head=True, color="r", shape="full")
plt.annotate(r"$Ignition Delay: \tau_{ign}$", xy=(0,0), xytext=(0.01, 0.0082), fontsize=16);
# Make a list of all the temperatures we would like to run simulations at
T = np.hstack((np.arange(1800, 900, -100), np.arange(975, 475, -25)))
estimated_ignition_delay_times = np.ones_like(T, dtype=float)
# Make time adjustments for the highest and lowest temperatures. This we do empirically
estimated_ignition_delay_times[:6] = 6 * [0.1]
estimated_ignition_delay_times[-4:-2] = 10
estimated_ignition_delay_times[-2:] = 100
# Now create a SolutionArray out of these
ignition_delays = ct.SolutionArray(gas, shape=T.shape, extra={"tau": estimated_ignition_delay_times})
ignition_delays.set_equivalence_ratio(1.0, fuel="nc7h16", oxidizer={"o2": 1.0, "n2": 3.76})
ignition_delays.TP = T, reactor_pressure
for i, state in enumerate(ignition_delays):
    # Setup the gas and reactor
    gas.TPX = state.TPX
    r = ct.IdealGasReactor(contents=gas, name="Batch Reactor")
    reactor_network = ct.ReactorNet([r])
    reference_species_history = []
    time_history = []
    t0 = time.time()
    t = 0
    while t < estimated_ignition_delay_times[i]:
        t = reactor_network.step()
        time_history.append(t)
        reference_species_history.append(gas[reference_species].X[0])
    i_ign = np.array(reference_species_history).argmax()
    tau = time_history[i_ign]
    t1 = time.time()
    print('Computed Ignition Delay: {:.3e} seconds for T={}K. Took {:3.2f}s to compute'.format(tau, state.T, t1-t0))
    ignition_delays.tau[i] = tau
fig = plt.figure()
ax = fig.add_subplot(111)
ax.semilogy(1000/ignition_delays.T, ignition_delays.tau, 'o-')
ax.set_ylabel('Ignition Delay (s)')
ax.set_xlabel(r'$\frac{1000}{T (K)}$', fontsize=18)
# Add a second axis on top to plot the temperature for better readability
ax2 = ax.twiny()
ticks = ax.get_xticks()
ax2.set_xticks(ticks)
ax2.set_xticklabels((1000/ticks).round(1))
ax2.set_xlim(ax.get_xlim())
ax2.set_xlabel(r'Temperature: $T(K)$');
I modified the first part of the script. I removed time_history = ct.SolutionArray(gas, extra="t") because I could not turn it into a usable DataFrame for saving the data. I now use pandas to write the CSV, but the resulting file contains only the column names and no data rows. Moreover, I see this error:
Traceback (most recent call last):
  File "test.py", line 77, in <module>
    tau = ignition_delay(tHyBatch_base, reference_species)
  File "test.py", line 50, in ignition_delay
    i_ign = states(species).Y.argmax()
TypeError: 'DataFrame' object is not callable
import pandas as pd
import numpy as np
import time
import csv
import cantera as ct
print('Running Cantera version: ' + ct.__version__)
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 18
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12
plt.rcParams['figure.autolayout'] = True
plt.style.use('ggplot')
plt.style.use('seaborn-pastel')
gas = ct.Solution('Seiser.cti')
# Define the reactor temperature and pressure
reactor_temperature = 1000 # Kelvin
reactor_pressure = 101325 # Pascals
gas.TP = reactor_temperature, reactor_pressure
# Define the fuel, oxidizer and set the stoichiometry
gas.set_equivalence_ratio(phi=1.0, fuel="nc7h16", oxidizer={"o2": 1.0, "n2": 3.76})
# Create a batch reactor object and add it to a reactor network
# In this example, the batch reactor will be the only reactor
# in the network
r = ct.IdealGasReactor(contents=gas, name="Batch Reactor")
reactor_network = ct.ReactorNet([r])
# Now compile a list of all variables for which we will store data
columnNames = [r.component_name(item) for item in range(r.n_vars)]
columnNames = ['pressure'] + columnNames
tHyBatch_base=pd.DataFrame(columns=columnNames)
tHyBatch_base.index.name = 'time'
def ignition_delay(states, species):
    """
    This function computes the ignition delay from the occurence of the
    peak in species' concentration.
    """
    i_ign = states(species).Y.argmax()
    return states.t[i_ign]
reference_species = "oh"
# Tic
t0 = time.time()
# This is a starting estimate. If you do not get an ignition within this time, increase it
estimated_ignition_delay_time = 0.1
t = 0
counter = 1
while t < estimated_ignition_delay_time:
    t = reactor_network.step()
    if not counter % 10:
        # We will save only every 10th value. Otherwise, this takes too long
        # Note that the species concentrations are mass fractions
        state = np.hstack([r.thermo.state])
        # Update the dataframe
        tHyBatch_base.append(pd.Series(state, index=tHyBatch_base.columns[:len(state)]), ignore_index=True)
    counter += 1
tHyBatch_base.to_csv("TESTCSV.csv")
# We will use the 'oh' species to compute the ignition delay
tau = ignition_delay(tHyBatch_base, reference_species)
# Toc
t1 = time.time()
print(f"Computed Ignition Delay: {tau:.3e} seconds. Took {t1-t0:3.2f}s to compute")
Can someone help? Thanks to anyone who can answer what looks like a problem with how I am using pandas.
You only need to change the call
timeHistory.to_csv("time_history.csv")
as below:
time_history.write_csv('time_history.csv')
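The answer above covers the first CSV. For the second part (the NTC sweep), a minimal sketch using only the standard library might look like the following. `write_ignition_delays` and the column names `T_K` and `tau_s` are hypothetical; the two sequences stand in for `ignition_delays.T` and `ignition_delays.tau` from the question.

```python
import csv


def write_ignition_delays(path, temperatures, delays):
    """Write one row per temperature: initial temperature and ignition delay.

    Hypothetical helper; pass ignition_delays.T and ignition_delays.tau
    (or any two equal-length sequences of numbers).
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["T_K", "tau_s"])  # assumed column names
        for temperature, tau in zip(temperatures, delays):
            writer.writerow([float(temperature), float(tau)])
```

It would be called once after the temperature loop finishes, for example `write_ignition_delays("ntc.csv", ignition_delays.T, ignition_delays.tau)`.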

Multiprocessing on chunks of an image

I have a function that has to loop through individual pixels of an image and calculate some geometry. This function takes a very long time to run (~5 hours on a 24 Megapixel image) but seems like it should be easy to run in parallel on multiple cores. However, I can't for the life of me find a well documented, well explained example of doing something like this using the Multiprocessing package. Here is the code I am running right now as a toy example:
import numpy as np
import matplotlib.pyplot as plt
from scipy import misc
from skimage import color
import multiprocessing
from multiprocessing import Process
#Some dumb stand in function for this exercise
def dumb_func(image):
    ny, nx = image.shape
    temp = np.empty_like(image)
    for y in range(ny):
        for x in range(nx):
            temp[y, x] = np.square(image[y, x])
    return temp
#Convert image to greyscale
img = color.rgb2gray(misc.ascent())
#Resize the image
ns = 2048 #Pixel size
img = misc.imresize(img, size = (ns, ns))
#Split the image into equal chunks...not sure how this works for arrays that
#are weird shapes and aren't the same size in each dimension
divs = 4
init_split = np.array_split(img, divs, axis = 0)
side = init_split[0].shape[0]
chunked = np.empty((divs, divs, side, side))
cur = 0
for i in range(divs):
    split = np.array_split(init_split[i], divs, axis = 1)
    for j in range(divs):
        chunked[i, j, :, :] = split[j]
        cur +=1
#Pull core count and divide by two to be safe
cores = int(multiprocessing.cpu_count() / 2)
result = np.empty_like(chunked)
idxs = np.array(np.meshgrid(np.arange(0, divs, 1),
                            np.arange(0, divs, 1))).T.reshape(-1, 2)
Basically this code loads in an image, converts it to greyscale, makes it bigger, and then chunks it up. The chunked array is of shape (i, j, ny, nx) where i and j are indices that identify the chunk of the image I am working with, and ny,nx describe the size in pixels of each chunk.
Additionally, I am creating an array called idxs that stores all possible indices into the chunked array to pull the chunked images out.
What I want to do is run a function (in this case the dumb_func as an example) over the chunks in parallel and store the results in the results array of the same shape. The way I imagined doing it was to loop over the idxs array and assign processes the chunks belonging to those indexes up to the number of cores, wait for those cores to finish, then feed the cores more processes until finished. I got stuck because I couldn't A) figure out how to access the return value in the function, and B) how to handle a situation where I might have 16 chunks and 5 cores leading to the last iteration only requiring a single process.
How can I go about doing this? I've spent the last 6-7 hours reading about Multiprocessing Pool, Process, Map, Starmap, etc... and can't for the life of me understand how to implement this.
Edit for Reedinationer:
This is my updated code and runs without error. However the new_data array is never updated. I filled it with a value of 100 and at the end of the routine new_data is exactly how it was initialized.
import numpy as np
import matplotlib.pyplot as plt
from scipy import misc
from multiprocessing import Process, JoinableQueue
from time import time
#Some dumb stand in function for this exercise
def dumb_func(q, new_data):
    while True:
        index, image = q.get()
        temp = image **2
        new_data[index[0], index[1], :, :] = temp
        q.task_done()
if __name__ == "__main__":
    start = time()
    q = JoinableQueue()
    img = misc.ascent()
    #Resize the image
    ns = 2048 #Pixel size
    img = misc.imresize(img, size = (ns, ns))
    #Split the image into equal chunks...not sure how this works for arrays that
    #are weird shapes and aren't the same size in each dimension
    divs = 4
    init_split = np.array_split(img, divs, axis = 0)
    side = init_split[0].shape[0]
    chunked = np.empty((divs, divs, side, side))
    cur = 0
    for i in range(divs):
        split = np.array_split(init_split[i], divs, axis = 1)
        for j in range(divs):
            chunked[i, j, :, :] = split[j]
            cur +=1
    new_data = np.full(chunked.shape, 100)
    idxs = np.array(np.meshgrid(np.arange(0, divs, 1),
                                np.arange(0, divs, 1))).T.reshape(-1, 2)
    for i in range(len(idxs)):
        q.put((idxs[i], chunked[idxs[i][0], idxs[i][1], :, :]))
    print ('starting workers')
    worker_count = len(idxs)
    processes = []
    for i in range(worker_count):
        p = Process(target=dumb_func, args=[q, new_data])
        p.daemon = True
        p.start()
    print('main thread waiting')
    q.join()
    end = time()
    print('{:.3f} seconds elapsed'.format(end - start))
I'd do something like this, starting with dependencies:
from multiprocessing import Pool
import numpy as np
from PIL import Image
# and some for testing
from random import random
from time import sleep
first I define a function to divide an image up into "chunks", sort of as you talked about:
def chunkit(ys, xs, blocksize=64):
    for y in range(0, ys, blocksize):
        yt = (y, min(ys, y + blocksize))
        for x in range(0, xs, blocksize):
            xt = (x, min(xs, x + blocksize))
            yield yt, xt
it's a lazy iterator, so this can go on for a while.
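To make the chunking concrete, here is a small self-contained illustration of what the generator yields for an image whose sides are not multiples of the block size. The 100 x 130 size is an arbitrary value chosen for the example.

```python
def chunkit(ys, xs, blocksize=64):
    """Yield ((y0, y1), (x0, x1)) bounds that tile a ys-by-xs image."""
    for y in range(0, ys, blocksize):
        yt = (y, min(ys, y + blocksize))
        for x in range(0, xs, blocksize):
            xt = (x, min(xs, x + blocksize))
            yield yt, xt


# Arbitrary example size: 100 rows by 130 columns.
blocks = list(chunkit(100, 130, blocksize=64))
for yt, xt in blocks:
    print(yt, xt)
# (0, 64) (0, 64)
# (0, 64) (64, 128)
# (0, 64) (128, 130)
# (64, 100) (0, 64)
# (64, 100) (64, 128)
# (64, 100) (128, 130)
```

Edge blocks are simply smaller, so the image sides do not need to divide evenly, and every pixel is covered exactly once.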
I then define my worker function:
def dumb_func(cc):
    (y0,y1), (x0,x1) = cc
    # convert to floats for ease of processing
    chunk = image[y0:y1,x0:x1] / 255.
    # random slow down for testing
    # sleep(random() ** 6)
    res = chunk ** 2
    # convert back to bytes for efficiency
    return cc, (res * 255).astype(np.uint8)
I make sure the source array stays as close to original format as possible for efficiency and send it back in the same format (this might take some fiddling, if you're dealing with other pixel formats obviously).
then I put it together:
if __name__ == '__main__':
    source = Image.open('tmp.jpeg')
    image = np.asarray(source)
    print("loaded", image.shape, image.dtype)
    with Pool() as pool:
        resit = pool.imap_unordered(
            dumb_func, chunkit(*image.shape[:2]))
        output = np.empty_like(image)
        for cc, res in resit:
            (y0,y1), (x0,x1) = cc
            output[y0:y1,x0:x1] = res
    im = Image.fromarray(output, 'RGB')
    im.save('out.jpeg')
this churns through a 15Mpixel image in a couple of seconds, with most of that spent loading/saving the image. it could probably be a lot more clever with array strides and cache friendliness, but hope that helps!
note: I think this code relies on CPython Unix style process forking semantics to make sure the image is shared between processes efficiently. not sure what would happen if you ran it on something else
I've been working on code for basically this same thing. Right now the goal is just to replace white pixels with transparent ones, but it seems to replace the entire image so there is a bug somewhere...It doesn't get an error within the multiprocessing module anymore though, so maybe it could serve as an example of how to load a Queue and then have your worker processes work on it!
from PIL import Image
from multiprocessing import Process, JoinableQueue
from threading import Thread
from time import time
def worker_function(q, new_data):
    while True:
        # print("Items in queue: {}".format(q.qsize()))
        index, pixel = q.get()
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)
        else:
            out_pixel = pixel
        new_data[index] = out_pixel
        q.task_done()
if __name__ == "__main__":
    start = time()
    q = JoinableQueue()
    my_image = Image.open('InputImage.jpg')
    my_image = my_image.convert('RGBA')
    datas = list(my_image.getdata())
    new_data = [0] * len(datas) # make a blank array the size of our image to fill later
    print('putting image into queue')
    for count, item in enumerate(datas):
        q.put((count, item))
    print('starting workers')
    worker_count = 50
    processes = []
    for i in range(worker_count):
        p = Process(target=worker_function, args=[q, new_data])
        p.daemon = True
        p.start()
    print('main thread waiting')
    q.join()
    my_image.putdata(new_data)
    my_image.save('output.png', "PNG")
    end = time()
    print('{:.3f} seconds elapsed'.format(end - start))
I think it's important to "protect" your code inside the if __name__ == "__main__" block otherwise the spawned processes seem to run it.
update
It looks like you need to implement a Manager() (or there are probably other ways I am ignorant of as well!). I got my code to run by altering it into:
from PIL import Image
from multiprocessing import Process, JoinableQueue, Manager
from threading import Thread
from time import time
def worker_function(q, new_data):
    while True:
        # print("Items in queue: {}".format(q.qsize()))
        index, pixel = q.get()
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)
        else:
            out_pixel = pixel
        new_data[index] = out_pixel
        q.task_done()
if __name__ == "__main__":
    start = time()
    q = JoinableQueue()
    my_image = Image.open('InputImage.jpg')
    my_image = my_image.convert('RGBA')
    datas = list(my_image.getdata())
    # new_data = [(0, 0, 0, 0)]*len(datas)
    manager = Manager()
    new_data = manager.list([(0, 0, 0, 0)]*len(datas))
    print(new_data)
    print('putting image into queue')
    for count, item in enumerate(datas):
        q.put((count, item))
    print('starting workers')
    worker_count = 50
    processes = []
    for i in range(worker_count):
        p = Process(target=worker_function, args=[q, new_data])
        p.daemon = True
        p.start()
    print('main thread waiting')
    q.join()
    print("Saving Image")
    my_image.putdata(new_data)
    my_image.save('output.png', "PNG")
    end = time()
    print('{:.3f} seconds elapsed'.format(end - start))
Although this doesn't seem like the fastest option! I'm sure there are other ways to increase speed. My code to do the same thing with Threads looks VERY similar:
from PIL import Image
from threading import Thread
from queue import Queue
import time
start = time.time()
q = Queue()
planeIm = Image.open('InputImage.jpg')
planeIm = planeIm.convert('RGBA')
datas = planeIm.getdata()
new_data = [0] * len(datas)
print('putting image into queue')
for count, item in enumerate(datas):
    q.put((count, item))
def worker_function():
    while True:
        # print("Items in queue: {}".format(q.qsize()))
        index, pixel = q.get()
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)
        else:
            out_pixel = pixel
        new_data[index] = out_pixel
        q.task_done()
print('starting workers')
worker_count = 100
for i in range(worker_count):
    t = Thread(target=worker_function)
    t.daemon = True
    t.start()
print('main thread waiting')
q.join()
print('Queue has been joined')
planeIm.putdata(new_data)
planeIm.save('output.png', "PNG")
end = time.time()
elapsed = end - start
print('{:3.3} seconds elapsed'.format(elapsed))
Yet processing my image takes ~23 seconds with threads and ~170 seconds with multiprocessing! I suspect this comes from the larger overhead needed to start Process objects, and from the fact that my per-pixel algorithm is simple for now (just the if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240: check), so I'm likely not seeing the speed improvement that a more complex pixel-processing algorithm would give. Also note what the multiprocessing documentation says about managers:
a single manager can be shared by processes on different computers over a network. They are, however, slower than using shared memory.
This leads me to believe that there are faster alternatives.
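For this particular per-pixel test, one alternative that avoids inter-process traffic altogether is a NumPy mask. This is a different technique from the queue-based code above, sketched under the assumption that the image is available as an (height, width, 4) RGBA `uint8` array; `whiten_to_transparent` is a hypothetical name.

```python
import numpy as np


def whiten_to_transparent(rgba, threshold=240):
    """Return a copy where near-white pixels become fully transparent black.

    `rgba` is assumed to have shape (height, width, 4). A pixel counts as
    near-white when R, G and B are all above `threshold`, matching the
    `> 240` test used in the worker functions.
    """
    out = np.array(rgba, dtype=np.uint8, copy=True)
    near_white = (out[..., :3] > threshold).all(axis=-1)
    out[near_white] = 0  # sets all four channels to 0, i.e. (0, 0, 0, 0)
    return out
```

With Pillow, the input would presumably come from `np.asarray(my_image)` after the `convert('RGBA')` call, and the result can be turned back into an image with `Image.fromarray(result, 'RGBA')`.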

Why is this Python script with Matplotlib so slow?

I'm trying to simulate coin tosses and profits and plot the graph in matplotlib:
from random import choice
import matplotlib.pyplot as plt
import time
start_time = time.time()
num_of_graphs = 2000
tries = 2000
coins = [150, -100]
last_loss = 0
for a in range(num_of_graphs):
    profit = 0
    line = []
    for i in range(tries):
        profit = profit + choice(coins)
        if (profit < 0 and last_loss < i):
            last_loss = i
        line.append(profit)
    plt.plot(line)
plt.show()
print("--- %s seconds ---" % (time.time() - start_time))
print("No losses after " + str(last_loss) + " iterations")
The end result is
--- 9.30498194695 seconds ---
No losses after 310 iterations
Why is it taking so long to run this script? If I change num_of_graphs to 10000, the script never finishes.
How would you optimize this?
Your measure of execution time is too rough. The following, which uses numpy, lets you measure the time needed for the simulation separately from the time needed for plotting:
import matplotlib.pyplot as plt
import numpy as np
import time
def run_sims(num_sims, num_flips):
    start = time.time()
    sims = [np.random.choice(coins, num_flips).cumsum() for _ in range(num_sims)]
    end = time.time()
    print(f"sim time = {end-start}")
    return sims
def plot_sims(sims):
    start = time.time()
    for line in sims:
        plt.plot(line)
    end = time.time()
    print(f"plotting time = {end-start}")
    plt.show()
if __name__ == '__main__':
    start_time = time.time()
    num_sims = 2000
    num_flips = 2000
    coins = np.array([150, -100])
    plot_sims(run_sims(num_sims, num_flips))
result:
sim time = 0.13962197303771973
plotting time = 6.621474981307983
As you can see, the sim time is greatly reduced (it was on the order of 7 seconds on my 2011 laptop); the plotting time is matplotlib dependent.
matplotlib is getting slower as the script progresses because it is
redrawing all of the lines that you have previously plotted - even the
ones that have scrolled off the screen.
This is the answer from a previous post answered by Simon Gibbons.
matplotlib is optimized for the quality of its graphics rather than for speed. Here are links to a few libraries that were developed for speed:
http://www.pyqtgraph.org/
http://code.google.com/p/guiqwt/
http://code.enthought.com/projects/chaco/
You can refer to the matplotlib cookbook for more about performance.
In order to better optimize your code, I would always try to replace loops by vectorization using numpy or, depending on my specific needs, other libraries that use numpy under the hood.
In this case, you could calculate and plot your profits this way:
import matplotlib.pyplot as plt
import time
import numpy as np
start_time = time.time()
num_of_graphs = 2000
tries = 2000
coins = [150, -100]
# Create a 2-D array with random choices
# rows for tries, columns for individual runs (graphs).
coin_tosses = np.random.choice(coins, (tries, num_of_graphs))
# Calculate 2-D array of profits by summing
# cumulatively over rows (trials).
profits = coin_tosses.cumsum(axis=0)
# Plot everything in one shot.
plt.plot(profits)
plt.show()
print("--- %s seconds ---" % (time.time() - start_time))
In my configuration, this code took approx. 6.3 seconds (6.2 of them plotting) to run, while your code took almost 15 seconds.
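The vectorized code no longer reports the `last_loss` value that the question prints. A hedged sketch of how it could be recovered from a `profits` array laid out as above (rows are tries, columns are runs); `last_loss_index` is a hypothetical helper name.

```python
import numpy as np


def last_loss_index(profits):
    """Latest try index at which any run has a negative cumulative profit.

    `profits` has one row per try and one column per run. Returns 0 when no
    run is ever negative, matching the initial value in the question's loop.
    """
    losing_steps = np.nonzero((profits < 0).any(axis=1))[0]
    return int(losing_steps.max()) if losing_steps.size else 0


# Tiny worked example with two runs of four tosses each.
tosses = np.array([[150, -100],
                   [-100, -100],
                   [150, 150],
                   [150, 150]])
example_profits = tosses.cumsum(axis=0)
print(last_loss_index(example_profits))  # prints 2
```

In the example, the second run is still at -50 after the third toss (index 2) and positive afterwards, so the last losing index is 2.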

Python - How to alternate between two loops?

I am curious how to alternate between two loops. The 1st loop takes a set of pictures. The second loop deletes all of them. I want to take pics, delete, take, delete infinitely (or at least for a very long period of time).
import time
import picamera
import webbrowser
import io
import os
frames = 60
deletecount = 0
def filenames():
    frame = 0
    while frame < frames:
        yield 'image%02d.jpg' % frame
        frame += 1
with picamera.PiCamera() as camera:
    camera.resolution = (1024, 768)
    camera.framerate = 60
    camera.start_preview()
    time.sleep(1)
    start = time.time()
    camera.capture_sequence(filenames(),use_video_port=True)
    finish = time.time() #takes 60 pics
while deletecount < frames:
    if os.path.exists("/home/pi/image%02d.jpg"%deletecount):
        os.remove("/home/pi/image%02d.jpg"%deletecount)
    deletecount += 1
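One possible way to alternate the two steps is to wrap each in a function and call both from an outer loop. The sketch below keeps the camera out of the picture so it can be run anywhere: `capture` is a hypothetical callable that receives the list of file paths, and `frame_names`, `delete_frames`, and `alternate` are names introduced here.

```python
import os


def frame_names(directory, frames):
    """Build the same image%02d.jpg names as the question, inside `directory`."""
    return [os.path.join(directory, 'image%02d.jpg' % i) for i in range(frames)]


def delete_frames(paths):
    """Remove every path that exists and return how many were removed."""
    removed = 0
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
            removed += 1
    return removed


def alternate(capture, directory, frames, cycles):
    """Run capture-then-delete `cycles` times.

    `capture` is a hypothetical hook: any callable that writes the given paths.
    """
    for _ in range(cycles):
        paths = frame_names(directory, frames)
        capture(paths)        # take the pictures
        delete_frames(paths)  # then delete them before the next round
```

With picamera, `capture` could presumably be `lambda paths: camera.capture_sequence(paths, use_video_port=True)` inside the `with picamera.PiCamera() as camera:` block, and replacing the `for` loop with `while True:` would repeat the cycle indefinitely.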
