Spawning Python Threads at EXACTLY the Same Time? - python

Right now some friends and I are making a program that generates music using square waves in Python (we're still very early in development). One of the roadblocks along the way was that we found that PyAudio will only play one sound at a time: if you try to play sounds over each other, e.g. to make a chord, they just overwrite each other. Our current strategy is to use threading to get around this, and it almost works, but the timing for when the threads start is very slightly off. Here is a snippet of our code that generates a C major chord:
import numpy as np
import pyaudio
import math
from scipy import signal
import multiprocessing
from time import time

def noteTest(frequency):
    l = np.linspace(0, 2, 384000, endpoint=False)
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=192000, output=True)
    # square wave scaled to 16-bit integer samples to match the paInt16 stream
    wave_data = (signal.square(2 * math.pi * frequency * l) * 32767).astype(np.int16)
    stream.write(wave_data.tobytes())

def playNotes():
    if __name__ == "__main__":
        multiprocessing.Process(target=noteTest, args=[523.25113060119]).start()
        print(time())
        multiprocessing.Process(target=noteTest, args=[659.25511382575]).start()
        print(time())
        multiprocessing.Process(target=noteTest, args=[783.99087196355]).start()
        print(time())

playNotes()
When I look at the output of the program, here are the times it gives:
1510810518.870557
1510810518.8715587
1510810518.8730626
As you can see, the threads start over a thousandth of a second apart. This is surprisingly noticeable, even for just one chord, and we fear it will become an even bigger problem if we try to make an actual song, since the tracks will drift apart and get out of time with each other. Note that all of the computers we tested this with DO have multiple physical cores. Is there any way to make the threads synchronize better, or are we better off finding an alternate solution?

One option is to add a delay in each thread before playing the sound. If you have a reasonable idea of the offset involved in starting the threads, you can pass that value in as the delay.
For example, let's say there is 1ms delay between starting threads:
0ms: Start thread 1, with 1ms delay
0ms: Thread 1 starts on new core, waits
1ms: Start thread 2, with no delay
1ms: Thread 1 starts playing after delay
1ms: Thread 2 starts on new core, no delay, and starts playing
Another option is to have each thread start up, but then wait for a signal sent from the main process to ALL the threads before they start playing.
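A minimal sketch of that second approach, using a multiprocessing.Event as the start signal. play_note here is a hypothetical stand-in for the question's noteTest; the PyAudio setup would go where the comment indicates:

import multiprocessing
import time

def play_note(frequency, start_event):
    # Do all the slow setup first (open the PyAudio stream, build the square
    # wave with scipy.signal, etc.), exactly as in the question's noteTest.
    start_event.wait()             # block until the main process says "go"
    print(frequency, time.time())  # in the real code: stream.write(wave_data)

if __name__ == "__main__":
    start_event = multiprocessing.Event()
    freqs = [523.25113060119, 659.25511382575, 783.99087196355]
    procs = [multiprocessing.Process(target=play_note, args=(f, start_event))
             for f in freqs]
    for p in procs:
        p.start()
    time.sleep(0.5)      # give every process time to finish setup and reach wait()
    start_event.set()    # release them all at (nearly) the same instant
    for p in procs:
        p.join()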

Related

In python, how can I run a function without the program waiting for its completion?

In Python, I am making a cube game (like Minecraft pre-classic) that renders chunk by chunk (16x16 blocks). It only renders blocks that are exposed (not covered on all sides). Even though this method is fast when the terrain is low (like 16x16x2, which is 512 blocks in total), once I make the terrain higher (like 16x16x64, which is 16384 blocks in total), rendering each chunk takes roughly 0.03 seconds, meaning that when I render multiple chunks at once the game freezes for about a quarter of a second. I want to render the chunks "asynchronously", meaning that the program will keep on drawing frames and calling the chunk render function multiple times, no matter how long it takes.
I tried to make another program in order to test it:
import threading

def run():
    n = 1
    for i in range(10000000):
        n += 1
    print(n)

print("Start")
threading.Thread(target=run()).start()
print("End")
I know that creating a lot of threads like this is not the best solution, but nothing else worked.
Threading, however, didn't work, as this is what the output looked like:
>>> Start
>>> 10000001
>>> End
It also took about a quarter of a second to complete, which is about how long the multiple chunk rendering takes.
Then I tried to use async:
import asyncio

async def run():
    n = 1
    for i in range(10000000):
        n += 1
    print(n)

print("Start")
asyncio.run(run())
print("End")
It did the exact same thing.
My questions are:
Can I run a function without stopping/pausing the program execution until it's complete?
Did I use the above correctly?
Yes. No. The answer is complicated.
First, your example has at least one error in it:
print("Start")
threading.Thread(target=run).start() #notice the missing parenthesis after run
print("End")
You can of course use multithreading for your game, but it comes at the cost of extra code complexity (synchronization), and you might not gain any performance because of the GIL.
asyncio is probably not the right tool for this job either: you don't need to juggle many concurrent tasks, and for CPU-bound work it has the same GIL limitation as multithreading.
The usual solution for this kind of problem is to divide your work into small batches and only process the next batch if you have time to do so on the same frame, kind of like so:
def runBatch(batch):
    for x in batch:
        print(x)

batches = [range(x, x + 200) for x in range(0, 10000, 200)]

while True:  # main loop
    # process batches only while there is time left in the current frame
    while batches and timeToNextFrame() > 15:
        runBatch(batches.pop())
    renderFrame()  # or whatever
However, in this instance, optimizing the algorithm itself could be even better than any other option. One thing that Minecraft does is it subdivides chunks into subchunks (you can mostly ignore subchunks that are full of blocks). Another is that it only considers the visible surfaces of the blocks (renders only those sides of the block that could be visible, not the whole block).
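As a rough illustration of that last point, here is a hypothetical face-culling sketch; it assumes blocks is a set of (x, y, z) positions, which is not necessarily how the question's code stores its world:

# Only faces whose neighbouring cell is empty are exposed and need drawing.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def visible_faces(blocks):
    for (x, y, z) in blocks:
        for (dx, dy, dz) in NEIGHBOURS:
            if (x + dx, y + dy, z + dz) not in blocks:
                yield (x, y, z), (dx, dy, dz)  # render just this face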
asyncio only runs code concurrently when your function is waiting on an I/O task, like a network call or disk I/O.
For non-I/O tasks to execute in the background, multithreading is the way to go: create all your threads, then wait for them to complete with the thread join() method.
from threading import Thread
import time

def draw_pixels(arg):
    time.sleep(arg)
    print(arg)

threads = []
args = [1, 2, 3, 4, 5]
for arg in args:
    t = Thread(target=draw_pixels, args=(arg,))
    t.start()
    threads.append(t)

# join all threads
for t in threads:
    t.join()

Multiprocessing - pass shared Queue and unique number for each worker

I can't quite find a solution for code where I pass each worker a shared Queue but also a unique number per worker.
My code:
The idea is to create several channels for playing songs. Each channel must be unique. So when a song arrives, I put it on whichever channel is available.
from multiprocessing import Pool, Queue
from functools import partial
import pygame

queue = Queue()

def play_song(shared_queue, chnl):
    channel = pygame.mixer.Channel(chnl)
    while True:
        sound_name = shared_queue.get()
        channel.play(pygame.mixer.Sound(sound_name))

if __name__ == "__main__":
    channels = [0, 1, 2, 3, 4]
    func = partial(play_song, queue)
    p = Pool(5, func, (channels,))
This code of course doesn't raise any error, because it's multiprocessing, but the problem is that channels is passed to play_song as the whole list instead of being mapped across the workers.
So basically, instead of each worker initializing its channel like this:
channel = pygame.mixer.Channel(0) # each worker would have number from list so 1,2,3,4
I am getting this
channel = pygame.mixer.Channel([0,1,2,3,4]) # for each worker
I tried playing with partial, but without success.
I did get pool.map to work, and while I could pass individual numbers from the channels list that way, I couldn't share the Queue among the workers.
Eventually I found a solution to my Pygame problem that does not require threads or multiprocessing.
Background to the problem:
I was working with PyAudio, and since it is quite a low-level audio API, I had problems mixing several sounds at the same time. The reasons are:
1) It is not easy (maybe impossible) to start several streams at exactly the same time, or to feed those streams at the same time (it looks like a hardware limitation).
2) Because of 1) I tried a different approach - a single stream where the audio waves from the different sounds are added together before entering the stream. That works, but it is unreliable: adding too many waves results in "sound cracking" because the summed amplitudes get too high (a sketch of scaling the mixed signal follows below).
Based on 1) and 2) I wanted to try running the streams in different processes, hence this question.
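For reference, the "sound cracking" in point 2 comes from the summed samples leaving the valid range; here is a minimal sketch of one standard workaround, assuming the waves are NumPy float arrays in [-1, 1] (not the code from the question):

import numpy as np

def mix(waves):
    # Sum the individual waveforms, then scale the result back into [-1, 1]
    # so the combined signal can no longer clip.
    mixed = np.sum(waves, axis=0)
    peak = np.max(np.abs(mixed))
    if peak > 1.0:
        mixed = mixed / peak
    return mixed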
Pygame solution (single process):
for sound_file in sound_files:
    available_channel = pygame.mixer.find_channel()  # with 8 channels, it can play 8 sounds at the same time
    available_channel.play(sound_file)
If the sound files are already loaded, this gives near-simultaneous results.
Multiprocessing solution
Thanks to Darkonaut, who pointed out the multiprocessing method, I managed to answer my initial question on multiprocessing, which I think is already answered on Stack Overflow, but I will include it here anyway.
The example is not finished, because I didn't end up using it, but it meets my initial requirement: processes sharing a queue, each started with a different parameter.
import multiprocessing as mp

shared_queue = mp.Queue()

def channel(que, channel_num):
    que.put(channel_num)

if __name__ == '__main__':
    processes = [mp.Process(target=channel, args=(shared_queue, channel_num))
                 for channel_num in range(8)]
    for p in processes:
        p.start()
    for i in range(8):
        print(shared_queue.get())
    for p in processes:
        p.join()

How to start two algorithms using different threads and only take the output from the first algorithm to finish?

I have two algorithms that are capable of solving Sudoku puzzles. The first uses a dancing links algorithm (a form of recursive backtracking); the other uses constraint propagation followed by recursive backtracking.
I would like to start both algorithms with their own thread and I only want the output from the first to finish. Because both will have the same output (or at least both will have an acceptable output), I want to take the answer from the first to finish and kill the other thread.
Sometimes one of the algorithms can take 100+ seconds to finish on puzzles crafted specifically to trip up that algorithm, but I haven't found a puzzle that trips up both algorithms.
Do I need to mark the threads as Daemon? I'm getting answers with the following code, I'm just worried it's not doing what I hope it's doing.
Edit: after further reading it's looking like I may actually want to use multiprocessing instead of threading. I'm going to test both and see how they compare.
import threading
import time
import queue

from dlx import dlx_solve
from norvig import norvig_solve

# Test grids:
grid1 = '003020600900305001001806400008102900700000008006708200002609500800203009005010300'
grid2 = '4.....8.5.3..........7......2.....6.....8.4......1.......6.3.7.5..2.....1.4......'
hard1 = '.....6....59.....82....8....45........3........6..3.54...325..6..................'

def solve(algo, grid, worker_queue, id, stop_event):
    while not stop_event.is_set():
        ans = algo(grid)
        if not stop_event.is_set():
            worker_queue.put((ans, id))
            break

# queue for workers
worker_queue = queue.Queue()
# indicator for other threads to stop
stop_event = threading.Event()

# run workers
threads = []
threads.append(threading.Thread(target=solve, args=(dlx_solve, grid1, worker_queue, 'dlx', stop_event)))
threads.append(threading.Thread(target=solve, args=(norvig_solve, grid1, worker_queue, 'norvig', stop_event)))
for thread in threads:
    thread.start()

# this will block until the first element is in the queue
first_finished = worker_queue.get()
print(first_finished)
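Following up on the edit, here is a rough sketch of the same first-to-finish idea with multiprocessing, reusing the imports and grids from the snippet above; unlike threads, the losing process can simply be terminated. This is a sketch, not tested against the actual solvers:

import multiprocessing as mp

def solve(algo, grid, result_queue, name):
    result_queue.put((algo(grid), name))

if __name__ == "__main__":
    result_queue = mp.Queue()
    procs = [mp.Process(target=solve, args=(dlx_solve, grid1, result_queue, 'dlx')),
             mp.Process(target=solve, args=(norvig_solve, grid1, result_queue, 'norvig'))]
    for p in procs:
        p.start()
    answer, winner = result_queue.get()   # blocks until the first solver finishes
    for p in procs:
        p.terminate()                     # kill the slower solver
    print(winner, answer)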

Python multiprocessing - Is it possible to introduce a fixed time delay between individual processes?

I have searched and cannot find an answer to this question elsewhere. Hopefully I haven't missed something.
I am trying to use Python multiprocessing to essentially batch run some proprietary models in parallel. I have, say, 200 simulations, and I want to batch run them ~10-20 at a time. My problem is that the proprietary software crashes if two models happen to start at the same / similar time. I need to introduce a delay between processes spawned by multiprocessing so that each new model run waits a little bit before starting.
So far, my solution has been to introduce a random time delay at the start of the child process before it fires off the model run. However, this only reduces the probability of any two runs starting at the same time, and therefore I still run into problems when trying to process a large number of models. I therefore think that the time delay needs to be built into the multiprocessing part of the code, but I haven't been able to find any documentation or examples of this.
Edit: I am using Python 2.7
This is my code so far:
from time import sleep
import numpy as np
import subprocess
import multiprocessing

def runmodels(arg):
    # interim solution: a random delay reduces (but does not eliminate) the
    # chance that any two runs start at the same time
    sleep(np.random.rand(1, 1) * 120)
    subprocess.call(arg)  # this line actually fires off the model run

if __name__ == '__main__':
    arguments = [big list of runs in here
                 ]
    count = 12
    pool = multiprocessing.Pool(processes=count)
    r = pool.imap_unordered(runmodels, arguments)
    pool.close()
    pool.join()
multiprocessing.Pool() already limits the number of processes running concurrently.
You could use a lock to separate the starting times of the processes (not tested):
import threading
import multiprocessing

def init(lock):
    global starting
    starting = lock

def run_model(arg):
    starting.acquire()  # no other process can get it until it is released
    threading.Timer(1, starting.release).start()  # release in a second
    # ... start your simulation here

if __name__ == "__main__":
    arguments = ...
    pool = multiprocessing.Pool(processes=12,
                                initializer=init, initargs=[multiprocessing.Lock()])
    for _ in pool.imap_unordered(run_model, arguments):
        pass
One way to do this is with threads and a semaphore:
from time import sleep
import subprocess
import threading

def runmodels(arg):
    subprocess.call(arg)
    sGlobal.release()  # release for next launch

if __name__ == '__main__':
    threads = []
    sGlobal = threading.Semaphore(12)  # semaphore for at most 12 threads
    arguments = [big list of runs in here
                 ]
    for arg in arguments:
        sGlobal.acquire()  # block if more than 12 threads are running
        t = threading.Thread(target=runmodels, args=(arg,))
        threads.append(t)
        t.start()
        sleep(1)
    for t in threads:
        t.join()
The answer suggested by jfs caused problems for me as a result of starting a new thread with threading.Timer. If the worker just so happens to finish before the timer does, the timer is killed and the lock is never released.
I propose an alternative route, in which each successive worker will wait until enough time has passed since the start of the previous one. This seems to have the same desired effect, but without having to rely on another child process.
import multiprocessing as mp
import time

def init(shared_val):
    global start_time
    start_time = shared_val

def run_model(arg):
    with start_time.get_lock():
        wait_time = max(0, start_time.value - time.time())
        time.sleep(wait_time)
        start_time.value = time.time() + 1.0  # specify the interval here
    # ... start your simulation here

if __name__ == "__main__":
    arguments = ...
    pool = mp.Pool(processes=12,
                   initializer=init, initargs=[mp.Value('d')])
    for _ in pool.imap_unordered(run_model, arguments):
        pass

How to reduce thread switching latency in Python

I have a Python 2.7 app with 3 producer threads and 1 consumer thread connected by a Queue.Queue. I'm using get and put, and the producer threads spend most of their time blocked in I/O (reading from serial ports with serial.read()) - basically doing nothing.
However, I seem to have what I would call a high latency between the time a producer thread puts to the queue and the time the consumer thread gets from the queue, like 25 ms (I'm running a 1 processor Beagle Bone Black (1GHz) on Angstrom Linux).
I would think that if all the processes are blocked, then the elapsed time between put and get should be really small, a few microseconds or so, not tens of milliseconds, except when the consumer thread is actually busy (which is not the case here).
I've read some things online that suggest that Python is guilty of busy spin, and that the GIL in Python is to blame. I guess I would rather not know the reason and just get something that is more responsive. I'm fine with the actual latency of serial transmission (about 1-2 ms).
The code looks basically like
q = Queue.Queue()

def a1():
    while True:
        p = read_serial_packet("/dev/ttyO1")
        p.timestamp = time.time()
        q.put(p)

def a2():
    while True:
        p = read_serial_packet("/dev/ttyO2")
        p.timestamp = time.time()
        q.put(p)

def a3():
    while True:
        p = read_serial_packet("/dev/ttyO3")
        p.timestamp = time.time()
        q.put(p)

def main():
    while True:
        p = q.get()
        d = time.time() - p.timestamp
        print str(d)
and there are 4 threads running a1, a2, a3 and main.
Here are some sample times
0.0119640827179
0.0178141593933
0.0154139995575
0.0192430019379
0.0185649394989
0.0225830078125
0.018187046051
0.0234098434448
0.0208261013031
0.0254039764404
0.0257620811462
Is this something that is "fixed" in Python 3?
As #fileoffset hinted, the answer seems to be switching from threading (where the GIL prevents Python threads from running truly in parallel) to multiprocessing, which uses several Python processes instead of threads.
The conversion from threading to multiprocessing looks like this:
useMP = True  # or False if you want threading

if useMP:
    import multiprocessing
    import multiprocessing.queues
    import Queue  # to import the Queue.Empty exception, but don't use Queue.Queue
else:
    import threading
    import Queue

...

if useMP:
    self.event_queue = multiprocessing.queues.Queue()
    t1 = multiprocessing.Process(target=self.upstream_thread)
    t2 = multiprocessing.Process(target=self.downstream_thread)
    t3 = multiprocessing.Process(target=self.scanner_thread)
else:
    self.event_queue = Queue.Queue()
    t1 = threading.Thread(target=self.upstream_thread)
    t2 = threading.Thread(target=self.downstream_thread)
    t3 = threading.Thread(target=self.scanner_thread)
The rest of the API looks the same.
There is one other important issue, though, that was not easy to migrate and is left as an exercise: catching Unix signals, such as SIGINT (Ctrl-C). Previously, the master thread caught the signal and all the other threads ignored it. Now the signal is sent to every process, so you have to be careful about catching KeyboardInterrupt and installing signal handlers. I don't think I did it the right way, so I am not going to elaborate... :)
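One common pattern (not necessarily what the author did) is to have each child process ignore SIGINT so that only the main process ever sees the KeyboardInterrupt and decides when to shut the workers down; a minimal sketch:

import multiprocessing
import signal
import time

def worker(event_queue):
    # the child ignores Ctrl-C entirely; shutdown is driven by the parent
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    while True:
        item = event_queue.get()
        print("got %s" % item)

if __name__ == "__main__":
    event_queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(event_queue,))
    p.start()
    try:
        while True:
            event_queue.put(time.time())
            time.sleep(1)
    except KeyboardInterrupt:
        p.terminate()  # only the parent handled the KeyboardInterrupt
        p.join()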
You might try playing around with the value of the "check interval"
sys.setcheckinterval(50)
A brief explanation of the general concept can be found in these slides, starting around page 10.
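For what it's worth, on Python 3.2+ the bytecode-count check interval was replaced by a time-based setting, so the equivalent knob there is:

import sys

# the switch interval is in seconds; the default is 0.005 (5 ms)
sys.setswitchinterval(0.001)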
