Python: Threads are not running in parrallel

Python: Threads are not running in parrallel - python

I'm trying to create a networking project using UDP connections. The server that I'm creating has to multithread in order to be able to receive multiple commands from multiple clients. However when trying to multithread the server, only one thread is running. Here is the code:
def action_assigner():
print('Hello Assign')
while True:
if work_queue.qsize() != 0:
data, client_address, request_number = work_queue.get()
do_actions(data, client_address, request_number)
def task_putter():
request_number = 0
print('Hello Task')
while True:
data_received = server_socket.recvfrom(1024)
request_number += 1
taskRunner(data_received, request_number)
try:
thread_task = threading.Thread(target=task_putter())
action_thread = threading.Thread(target=action_assigner())
action_thread.start()
thread_task.start()
action_thread.join()
thread_task.join()
except Exception as e:
server_socket.close()
When running the code, I only get Hello Task as the result meaning that the action_thread never started. Can someone explain how to fix this?

The problem here is that you are calling the functions that should be the "body" of each thread when creating the Threads themselves.
Upon executing the line thread_task = threading.Thread(target=task_putter()) Python will resolve first the expession inside the parentheses - it calls the function task_putter, which never returns. None of the subsequent lines on your program is ever run.
What we do when creating threads, and other calls that takes callable objects as arguments, is to pass the function itself, not calling it (which will run the function and evaluate to its return value).
Just change both lines creating the threads to not put the calling parentheses on the target= argument and you will get past this point:
...
try:
thread_task = threading.Thread(target=task_putter)
action_thread = threading.Thread(target=action_assigner)
...

Related

Python Multiprocessing JoinableQueue: clear queue and discard all unfinished tasks

I got two processes and in order to do some clean up in case of fatal errors (instead of processes keeping running), I want to remove all remaining tasks en empty the queue (in order to let join() proceed). How can I achieve that (preferably it should be code to apply in both processes, but my code allows the child process to signal the main process of its failure state and instruct main to do the clean up as well)?
I was trying to get a understand it by inspecting the source at:
https://github.com/python/cpython/blob/main/Lib/multiprocessing/queues.py
But I got a little bit lost with code like:
...
self._unfinished_tasks._semlock._is_zero():
...
def __init__(self, maxsize=0, *, ctx):
Queue.__init__(self, maxsize, ctx=ctx)
self._unfinished_tasks = ctx.Semaphore(0)
...
(also where does the _semlock property comes from?)
For example, what is ctx and it appears not be required as I did not use it in my object creation. Digging further, it may have something to do with (a little bit too mysterious or me)
mp.get_context('spawn')
or
#asynccontextmanager
async def ctx():
yield
I need something like mentioned here by V.E.O (which is quite understandable, but that is only a single process as far as I understand):
Clear all items from the queue

I came up with the following code (to be tested):
def clearAndDiscardQueue(self):
try: # cleanup, preferably in the process that is adding to the queue
while True:
self.task_queue.get_nowait()
except Empty:
pass
except ValueError: # in case of closed
pass
self.task_queue.close()
# theoretically a new item could be placed by the
# other process by the time the interpreter is on this line,
# therefore the part above should be run in the process that
# fills (put) the queue when it is in its failure state
# (when the main process fails it should communicate to
# raise an exception in the child process to run the cleanup
# so main process' join will work)
try: # could be one of the processes
while True:
self.task_queue.task_done()
except ValueError: # too many times called, do not care
# since all remaining will not be processed due to failure state
pass
Else I would need to try understanding code like the following. I think messing with the next code, analogous to calling queue.clear() as in a single process queue, would have serious consequences in terms of race conditions when clearing the buffer/pipe myself somehow.
class Queue(object):
def __init__(self, maxsize=0, *, ctx):
…
self._reader, self._writer = connection.Pipe(duplex=False)
…
def put(self, obj, block=True, timeout=None):
…
self._buffer.append(obj) # in case of close() the background thread
# will quit once it has flushed all buffered data to the pipe.
…
def get(self, block=True, timeout=None):
…
res = self._recv_bytes()
…
return _ForkingPickler.loads(res)
…
class JoinableQueue(Queue):
def __init__(self, maxsize=0, *, ctx):
…
self._unfinished_tasks = ctx.Semaphore(0)
…
def task_done(self):
…
if not self._unfinished_tasks._semlock._is_zero():
…
in which _is_zero() is somehow externally defined (see synchronize.py), like mentioned here:
Why doesn't Python's _multiprocessing.SemLock have 'name'?

How to properly encapsulate an asyncio Process

I have a program executed in a subprocess. This program runs forever, reads a line from its stdin, processes it, and outputs a result on stdout. I have encapsulated it as follows:
class BrainProcess:
def __init__(self, filepath):
# starting the program in a subprocess
self._process = asyncio.run(self.create_process(filepath))
# check if the program could not be executed
if self._process.returncode is not None:
raise BrainException(f"Could not start process {filepath}")
#staticmethod
async def create_process(filepath):
process = await sp.create_subprocess_exec(
filepath, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
return process
# destructor function
def __del__(self):
self._process.kill() # kill the program, since it never stops
# waiting for the program to terminate
# self._process.wait() is asynchronous so I use async.run() to execute it
asyncio.run(self._process.wait())
async def _send(self, msg):
b = bytes(msg + '\n', "utf-8")
self._process.stdin.write(b)
await self._process.stdin.drain()
async def _readline(self):
return await self._process.stdout.readline()
def send_start_cmd(self, size):
asyncio.run(self._send(f"START {size}"))
line = asyncio.run(self._readline())
print(line)
return line
From my understanding asyncio.run() is used to run asynchronous code in a synchronous context. That is why I use it at the following lines:
# in __init__
self._process = asyncio.run(self.create_process(filepath))
# in send_start_cmd
asyncio.run(self._send(f"START {size}"))
# ...
line = asyncio.run(self._readline())
# in __del__
asyncio.run(self._process.wait())
The first line seems to work properly (the process is created correctly), but the other throw exceptions that look like got Future <Future pending> attached to a different loop.
Code:
brain = BrainProcess("./test")
res = brain.send_start_cmd(20)
print(res)
So my questions are:
What do these errors mean ?
How do I fix them ?
Did I use asyncio.run() correctly ?
Is there a better way to encapsulate the process to send and retrieve data to/from it without making my whole application use async / await ?

asyncio.run is meant to be used for running a body of async code, and producing a well-defined result. The most typical example is running the whole program:
async def main():
# your application here
if __name__ == '__main__':
asyncio.run(main())
Of couurse, asyncio.run is not limited to that usage, it is perfectly possible to call it multiple times - but it will create a fresh event loop each time. This means you won't be able to share async-specific objects (such as futures or objects that refer to them) between invocations - which is precisely what you tried to do. If you want to completely hide the fact that you're using async, why use asyncio.subprocess in the first place, wouldn't the regular subprocess do just as well?
The simplest fix is to avoid asyncio.run and just stick to the same event loop. For example:
_loop = asyncio.get_event_loop()
class BrainProcess:
def __init__(self, filepath):
# starting the program in a subprocess
self._process = _loop.run_until_complete(self.create_process(filepath))
...
...
Is there a better way to encapsulate the process to send and retrieve data to/from it without making my whole application use async / await ?
The idea is precisely for the whole application to use async/await, otherwise you won't be able to take advantage of asyncio - e.g. you won't be able to parallelize your async code.

Unable to Kill Greenlets in Python Script with Ctrl C command

When I press Ctrl+C the call jumps into signal_handler as expected, but the greenlets are not getting killed as they continue the process.
# signal handler to process after catch ctrl+c command
def signal_handler(signum, frame):
print("Inside Signal Handler")
gevent.sleep(10)
print("Signal Handler After sleep")
gevent.joinall(maingreenlet)
gevent.killall(maingreenlet,block=True,timeout=10)
gevent.kill(block=True)
sys.exit(0)
def main():
signal.signal(signal.SIGINT, signal_handler) // Catching Ctrl+C
try:
maingreenlet = [] // Creating a list of greenlets
while True:
for key,profileval in profile.items():
maingreenlet.append(gevent.spawn(key,profileval)) # appending all grrenlets to list
gevent.sleep(0)
except (Error) as e:
log.exception(e)
raise
if __name__ == "__main__":
main()

The main reason your code is not working is because the variable maingreenlet is defined inside the main function, and is out of the scope of the signal_handler function which tries to access it. Your code should throw an error like this:
NameError: global name 'maingreenlet' is not defined
If you were to move the line maingreenlet = [] into the global scope, i.e. anywhere outside of the two def blocks, the greenlets should get killed without problem.
Of course that's after you fix the other issues in your code, like using // instead of # to start the comments, or calling the function gevent.kill with the wrong parameter. (you didn't specify your gevent version but I assume the current version 1.3.7) Actually this function call is redundant after you call gevent.killall.
Learn to use a Python debugger liker pdb or rpdb2 to help you debug your code. It'll save your precious time in the long run.

Returning value from thread in python without blocking main thread

I have got an XMLRPC server and client runs some functions on server and gets returned value. If the function executes quickly then everything is fine but I have got a function that reads from file and returns some value to user. Reading takes about minute(there is some complicated stuff) and when one client runs this function on the server then server is not able to respond for other users until the function is done.
I would like to create new thread that will read this file and return value for user. Is it possible somehow?
Are there any good solutions/patters to do not block server when one client run some long function?

Yes it is possible , this way
#starting the thread
def start_thread(self):
threading.Thread(target=self.new_thread,args=()).start()
# the thread you are running your logic
def new_thread(self, *args):
#call the function you want to retrieve data from
value_returned = partial(self.retrieved_data_func,arg0)
#the function that returns
def retrieved_data_func(self):
arg0=0
return arg0

Yes, using the threading module you can spawn new threads. See the documentation. An example would be this:
import threading
import time
def main():
print("main: 1")
thread = threading.Thread(target=threaded_function)
thread.start()
time.sleep(1)
print("main: 3")
time.sleep(6)
print("main: 5")
def threaded_function():
print("thread: 2")
time.sleep(4)
print("thread: 4")
main()
This code uses time.sleep to simulate that an action takes a certain amount of time. The output should look like this:
main: 1
thread: 2
main: 3
thread: 4
main: 5

How do I get all instances of VLC on dbus quickly?

basically the problem is, that the only way to get all instances of VLC is to search all non-named instances for the org.freedesktop.MediaPlayer identity function and call it.
(alternatively I could use the introspection API, but this wouldn't seem to solve my problem)
Unfortunately many programs upon having sent a dbus call, simply do not respond, causing a long and costly timeout.
When this happens multiple times it can add up.
Basically the builtin timeout is excessively long.
If I can decrease the dbus timeout somehow that will solve my problem, but the ideal solution would be a way.
I got the idea that I could put each call to "Identify" inside a thread and that I could kill threads that take too long, but this seems not to be suggested. Also adding multithreading greatly increases the CPU load while not increasing the speed of the program all that much.
here is the code that I am trying to get to run quickly (more or less) which is currently painfully slow.
import dbus
bus = dbus.SessionBus()
dbus_proxy = bus.get_object('org.freedesktop.DBus', '/org/freedesktop/DBus')
names = dbus_proxy.ListNames()
for name in names:
if name.startswith(':'):
try:
proxy = bus.get_object(name, '/')
ident_method = proxy.get_dbus_method("Identity",
dbus_interface="org.freedesktop.MediaPlayer")
print ident_method()
except dbus.exceptions.DBusException:
pass

Easier than spawning a bunch of threads would be to make the calls to the different services asynchronously, providing a callback handler for when a result comes back or a D-Bus error occurs. All of the calls effectively happen in parallel, and your program can proceed as soon as it gets some positive results.
Here's a quick-and-dirty program that prints a list of all the services it finds. Note how quickly it gets all the positive results without having to wait for any timeouts from anything. In a real program you'd probably assign a do-nothing function to the error handler, since your goal here is to ignore the services that don't respond, but this example waits until it's heard from everything before quitting.
#! /usr/bin/env python
import dbus
import dbus.mainloop.glib
import functools
import glib
class VlcFinder (object):
def __init__ (self, mainloop):
self.outstanding = 0
self.mainloop = mainloop
bus = dbus.SessionBus ()
dbus_proxy = bus.get_object ("org.freedesktop.DBus", "/org/freedesktop/DBus")
names = dbus_proxy.ListNames ()
for name in dbus_proxy.ListNames ():
if name.startswith (":"):
proxy = bus.get_object (name, "/")
iface = dbus.Interface (proxy, "org.freedesktop.MediaPlayer")
iface.Identity (reply_handler = functools.partial (self.reply_cb, name),
error_handler = functools.partial (self.error_cb, name))
self.outstanding += 1
def reply_cb (self, name, ver):
print "Found {0}: {1}".format (name, ver)
self.received_result ()
def error_cb (self, name, msg):
self.received_result ()
def received_result (self):
self.outstanding -= 1
if self.outstanding == 0:
self.mainloop.quit ()
if __name__ == "__main__":
dbus.mainloop.glib.DBusGMainLoop (set_as_default = True)
mainloop = glib.MainLoop ()
finder = VlcFinder (mainloop)
mainloop.run ()

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python: Threads are not running in parrallel - python

Related

Python Multiprocessing JoinableQueue: clear queue and discard all unfinished tasks

How to properly encapsulate an asyncio Process

Unable to Kill Greenlets in Python Script with Ctrl C command

Returning value from thread in python without blocking main thread

How do I get all instances of VLC on dbus quickly?

Categories

Resources