Python: ignore errors in multiprocessing Pool.map

In my main program I have:

from multiprocessing import Pool

def main_func():
    ...
    # main function
    pool = Pool(processes=10)
    res = pool.map(function_b, verylonglist)
    ...

and function_b:

def function_b(item):
    try:
        # do something
        ...
    except:
        pass

I'm new to Python, so I want to ask how I can handle the exceptions. I want the pool to ignore every mapped call that raises an error and collect only the correct results. As my code stands, the process prints some errors and then keeps running non-stop at that point. Thank you.
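A common pattern for this (a sketch, not from the question - safe_function_b and the None sentinel are my assumptions; function_b and verylonglist come from the question) is to catch the exception inside a wrapper function and return a sentinel, then filter the results afterwards:

from multiprocessing import Pool

def safe_function_b(item):
    # Catch failures per item so one bad element
    # doesn't take down the whole map.
    try:
        return function_b(item)  # function_b as defined in the question
    except Exception:
        return None              # sentinel meaning "this item failed"

if __name__ == "__main__":
    with Pool(processes=10) as pool:
        res = pool.map(safe_function_b, verylonglist)
    # keep only the successful results
    good = [r for r in res if r is not None]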

Python logging to a class attribute

I have two threads that are instances of some classes and work independently. They do their work and then wait for each other. One of the threads can raise an exception - in that case I want to stop the parallel thread and the whole program. To do this I want to pass the log message containing the exception up to the main program loop. But I want to get that message from logging - the same text that is displayed on the console and written to the log file - there is a reason for this.
Is there any supported way to do this? I saw some complicated solutions, but they were not working for me.
I can't just append a string like "Some exception!" - it has to go through logging and its formatter, so that all the logs stay consistent.
How can I make append "catch" the log message - is that possible?
self.exceptions.append( catch_somehow(log.error(message)) )
Example code below:

from threading import Thread
# other imports

class SomeThreadWrapper(Thread):
    def __init__(self):
        super().__init__()
        self.exceptions = []
        # class stuff

    def SomeFunction(self):
        try:
            # some logic
            ...
        except SomeException as e:
            # do something
            log.error("Oh no, some exception occurred!")
            self.exceptions.append(<somehow_catch_the_logging_from_line_above>)
            raise e

    def get_exception(self):
        return self.exceptions

class SomeThread(Thread):
    # class stuff
    ...

if __name__ == "__main__":
    # some logic
    thread_wrapper = SomeThreadWrapper()
    thread = SomeThread()
    thread_wrapper.start()
    thread.start()
    thread_wrapper.join()
    if len(thread_wrapper.get_exception()):
        # <Join and kill the thread to not waste time>
        # <stop the program>
        ...
    thread.join()
    # some other logic
Does the logging module's infrastructure allow for something like this?
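One way to do this (a sketch of mine, not a confirmed solution - ListHandler and target_list are invented names) is a custom logging.Handler whose emit() formats each record with the handler's formatter and appends the resulting string to a list, so the captured text matches what the console and file handlers print:

import logging

class ListHandler(logging.Handler):
    """Collects formatted log messages into a list."""
    def __init__(self, target_list):
        super().__init__()
        self.target_list = target_list

    def emit(self, record):
        # self.format() applies this handler's formatter, so the
        # stored string is consistent with the other handlers' output.
        self.target_list.append(self.format(record))

log = logging.getLogger(__name__)
exceptions = []
handler = ListHandler(exceptions)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
handler.setLevel(logging.ERROR)
log.addHandler(handler)

log.error("Oh no, some exception occurred!")
print(exceptions)  # the same formatted line the other handlers emitted

In the thread wrapper above, self.exceptions could be passed as target_list, which would make the append happen automatically on every log.error() call.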

How to make my code stoppable? (Not killing/interrupting)

I'm asking this question in a broader sense because I'm not facing this specific issue right now, but I'm wondering how to do it in the future.
Suppose I have a long-running Python script that is supposed to do something all the time (it could be an infinite loop, if that helps). The code is started by running the python main.py command in a terminal.
The code doesn't have an ending, so there will be no sys.exit().
I don't want to use KeyboardInterrupt and I don't want to kill the task, because those options are abrupt and you can't predict precisely at what point you are stopping the code.
Is there a way to 'softly' terminate the code when I eventually decide to do it? For example using another command, preparing a class or running another script?
What would be the best practice for this?
PS.: Please, bear in mind that I'm a novice coder.
EDIT:
I'm adding some generic code, in order to make my question clearer.
import time, csv
import GenericAPI

class GenericDataCollector:
    def __init__(self):
        self.generic_api = GenericAPI()

    def collect_data(self):
        while True: # Maybe this could be a var that is changed from outside of the class?
            data = self.generic_api.fetch_data() # Returns a JSON with some data
            self.write_on_csv(data)
            time.sleep(1)

    def write_on_csv(self, data):
        with open('file.csv', 'wt') as f:
            writer = csv.writer(f)
            writer.writerow(data)

def run():
    obj = GenericDataCollector()
    obj.collect_data()

if __name__ == "__main__":
    run()
In this particular case, the class is collecting data from some generic API (which returns JSON) and writing it to a CSV file, in an infinite loop. How could I code a way (a method?) to stop it on demand (when called upon, so at an unpredictable moment), without abruptly interrupting it (Ctrl+C or killing the task)?
I would recommend using the signal module. It allows you to handle signal interrupts (SIGINT) and clean up the program before you exit. Take the following code for example:
import signal

running = True

def handle(signum, frame):
    global running
    running = False

# catch the SIGINT signal and call handle() when the process
# receives it
signal.signal(signal.SIGINT, handle)

# your code here
while running:
    pass
You can still exit with a Ctrl+C, but what you put in the while loop will not be cut off half way.
Based on @Calder White's answer, how about this (not tested):
import signal
import time, csv
import GenericAPI

class GenericDataCollector:
    def __init__(self):
        self.generic_api = GenericAPI()
        self.cont = True
        # register the handler once, before the loop starts
        signal.signal(signal.SIGINT, self.handle)

    def collect_data(self):
        while self.cont:
            data = self.generic_api.fetch_data() # Returns a JSON with some data
            self.write_on_csv(data)
            time.sleep(1)

    def handle(self, signum, frame):
        # signal handlers receive (signum, frame) in addition to self
        self.cont = False

    def write_on_csv(self, data):
        with open('file.csv', 'wt') as f:
            writer = csv.writer(f)
            writer.writerow(data)

def run():
    obj = GenericDataCollector()
    obj.collect_data()

if __name__ == "__main__":
    run()
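As a further sketch (my addition, not from either answer, picking up the asker's own comment "Maybe this could be a var that is changed from outside of the class?"): a threading.Event works as a stop flag that any other thread - or a signal handler - can set:

import threading
import time

class GenericDataCollector:
    def __init__(self):
        self.stop_event = threading.Event()

    def collect_data(self):
        # Event.is_set() replaces the bare `while True:`; anything
        # holding a reference to the collector can request a stop.
        while not self.stop_event.is_set():
            # ... fetch and write data here ...
            time.sleep(1)

    def stop(self):
        self.stop_event.set()

collector = GenericDataCollector()
worker = threading.Thread(target=collector.collect_data)
worker.start()
time.sleep(5)       # let it run for a while
collector.stop()    # soft, cooperative shutdown
worker.join()       # the loop finishes its current iteration, then exits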

Unable to Kill Greenlets in Python Script with Ctrl C command

When I press Ctrl+C the call jumps into signal_handler as expected, but the greenlets are not killed; they just keep running.
# signal handler to process after catching the ctrl+c command
def signal_handler(signum, frame):
    print("Inside Signal Handler")
    gevent.sleep(10)
    print("Signal Handler After sleep")
    gevent.joinall(maingreenlet)
    gevent.killall(maingreenlet, block=True, timeout=10)
    gevent.kill(block=True)
    sys.exit(0)

def main():
    signal.signal(signal.SIGINT, signal_handler) // Catching Ctrl+C
    try:
        maingreenlet = [] // Creating a list of greenlets
        while True:
            for key, profileval in profile.items():
                maingreenlet.append(gevent.spawn(key, profileval)) # appending all greenlets to the list
            gevent.sleep(0)
    except (Error) as e:
        log.exception(e)
        raise

if __name__ == "__main__":
    main()
The main reason your code is not working is that the variable maingreenlet is defined inside the main function, so it is out of the scope of the signal_handler function that tries to access it. Your code should throw an error like this:
NameError: global name 'maingreenlet' is not defined
If you were to move the line maingreenlet = [] into the global scope, i.e. anywhere outside of the two def blocks, the greenlets should get killed without problem.
Of course, that's after you fix the other issues in your code, like using // instead of # to start comments, or calling the function gevent.kill with the wrong parameter. (You didn't specify your gevent version, but I assume the current version, 1.3.7.) Actually, that call is redundant once you call gevent.killall.
Learn to use a Python debugger like pdb or rpdb2 to help you debug your code. It'll save you precious time in the long run.
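For illustration, a minimal sketch of the fix (my reconstruction of the answer's advice, not tested - worker and the range(3) loop are placeholders): the list lives in module scope, so the handler can see it:

import signal
import sys
import gevent

maingreenlet = []  # module scope: visible to signal_handler

def signal_handler(signum, frame):
    print("Inside Signal Handler")
    # kill every spawned greenlet, then exit
    gevent.killall(maingreenlet, block=True, timeout=10)
    sys.exit(0)

def worker(n):
    while True:
        gevent.sleep(1)

def main():
    signal.signal(signal.SIGINT, signal_handler)  # catching Ctrl+C
    for i in range(3):
        maingreenlet.append(gevent.spawn(worker, i))
    gevent.joinall(maingreenlet)

if __name__ == "__main__":
    main()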

How to detect exceptions in concurrent.futures in Python3?

I have just moved to Python 3 because of its concurrent.futures module. I was wondering if I could get it to detect errors. I want to use concurrent.futures for parallel programming; if there are more efficient modules, please let me know.
I do not like multiprocessing, as it is too complicated and not much documentation is available. It would be great, however, if someone could write a Hello World - only functions, no classes - using multiprocessing for parallel computation, so that it is easy to understand.
Here is a simple script:
from concurrent.futures import ThreadPoolExecutor

def pri():
    print("Hello World!!!")

def start():
    try:
        while True:
            pri()
    except KeyboardInterrupt:
        print("YOU PRESSED CTRL+C")

with ThreadPoolExecutor(max_workers=3) as exe:
    exe.submit(start)
The above code was just a demo of how Ctrl+C will not work to trigger the print statement.
What I want is to be able to call a function if an error is present. The error detection must come from the function itself.
Another example:
import socket
import time
from concurrent.futures import ThreadPoolExecutor

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

def con():
    try:
        s.connect((x, y))  # x, y: host and port placeholders
        main()
    except socket.gaierror:
        err()

def err():
    time.sleep(1)
    con()

def main():
    s.send(b"[+] Hello")

with ThreadPoolExecutor() as exe:
    exe.submit(con)
Way too late to the party, but maybe it'll help someone else...
I'm pretty sure the original question was not really answered. Folks got hung up on the fact that user5327424 was using a keyboard interrupt to raise an exception when the point was that the exception (however it was caused) was not raised. For example:
import concurrent.futures

def main():
    numbers = range(10)
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = {executor.submit(raise_my_exception, number): number for number in numbers}

def raise_my_exception(number):
    print('Proof that this function is getting called. %s' % number)
    raise Exception('This never sees the light of day...')

main()
When the example code above is executed, you will see the text inside the print statement displayed on the screen, but you will never see the exception. This is because the results of each thread are held in the results object. You need to iterate that object to get to your exceptions. The following example shows how to access the results.
import concurrent.futures

def main():
    numbers = range(10)
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = {executor.submit(raise_my_exception, number): number for number in numbers}
        for result in results:
            # This will cause the exception to be raised (but only the first one)
            print(result.result())

def raise_my_exception(number):
    print('Proof that this function is getting called. %s' % number)
    raise Exception('This will be raised once the results are iterated.')

main()
I'm not sure whether I like this behavior or not, but it does allow the threads to fully execute, regardless of the exceptions encountered inside the individual threads.
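As a follow-up sketch (my addition, not part of the original answer), concurrent.futures.as_completed lets you see every exception rather than just the first, by wrapping each future's .result() call in try/except:

import concurrent.futures

def raise_my_exception(number):
    raise Exception('Failure in task %s' % number)

def main():
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = {executor.submit(raise_my_exception, n): n for n in range(10)}
        for future in concurrent.futures.as_completed(futures):
            number = futures[future]
            try:
                print(future.result())
            except Exception as e:
                # every task's failure is reported, not only the first
                print('Task %s failed: %s' % (number, e))

main()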
Here's a solution. I'm not sure you'll like it, but I can't think of any other. I've modified your code to make it work.
from concurrent.futures import ThreadPoolExecutor
import time

quit = False

def pri():
    print("Hello World!!!")

def start():
    while quit is not True:
        time.sleep(1)
        pri()

try:
    pool = ThreadPoolExecutor(max_workers=3)
    pool.submit(start)
    while quit is not True:
        print("hei")
        time.sleep(1)
except KeyboardInterrupt:
    quit = True
Here are the points:
When you use with ThreadPoolExecutor(max_workers=3) as exe, it waits until all tasks have been done. Have a look at the docs:
If wait is True then this method will not return until all the pending futures are done executing and the resources associated with the executor have been freed. If wait is False then this method will return immediately and the resources associated with the executor will be freed when all pending futures are done executing. Regardless of the value of wait, the entire Python program will not exit until all pending futures are done executing.
You can avoid having to call this method explicitly if you use the with statement, which will shutdown the Executor (waiting as if Executor.shutdown() were called with wait set to True)
It's like calling join() on a thread.
That's why I replaced it with:
pool = ThreadPoolExecutor(max_workers=3)
pool.submit(start)
The main thread must be doing "work" to be able to catch a Ctrl+C, so you can't just let the main thread sit there and exit; the simplest way is to run an infinite loop.
Now that you have a loop running in the main thread, when you hit Ctrl+C the program will enter the except KeyboardInterrupt block and set quit = True. Then your worker thread can exit.
Strictly speaking, this is only a workaround, but it seems to me there is no other way to do this.
Edit
I'm not sure what's bothering you, but you can catch an exception in another thread without any problem:
import socket
import time
from concurrent.futures import ThreadPoolExecutor

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

def con():
    try:
        raise socket.gaierror
        main()
    except socket.gaierror:
        print("gaierror occurred")
        err()

def err():
    print("err invoked")
    time.sleep(1)
    con()

def main():
    s.send(b"[+] Hello")

with ThreadPoolExecutor(3) as exe:
    exe.submit(con)
Output
gaierror occurred
err invoked
gaierror occurred
err invoked
gaierror occurred
err invoked
gaierror occurred
...

How to safely run unreliable piece of code?

Suppose you are working with some bodgy piece of code that you can't trust; is there a way to run it safely without losing control of your script?
An example might be a function that only works some of the time and might fail randomly/spectacularly - how could you retry until it works? I tried some hacking with the threading module, but had trouble killing a hung thread neatly.
#!/usr/bin/env python
import os
import sys
import random

def unreliable_code():
    def ok():
        return "it worked!!"
    def fail():
        return "it didn't work"
    def crash():
        1/0
    def hang():
        while True:
            pass
    def bye():
        os._exit(0)
    return random.choice([ok, fail, crash, hang, bye])()

result = None
while result != "it worked!!":
    # ???
To be safe against exceptions, use try/except (but I guess you know that).
To be safe against hanging code (an endless loop), the only way I know of is to run the code in another process. You can kill that child process from the parent process if it does not terminate soon enough.
To be safe against nasty code (doing things it shall not do), have a look at http://pypi.python.org/pypi/RestrictedPython .
You can try running it in a sandbox.
In your real application, can you switch to multiprocessing? Because it seems that what you're asking for could be done with multiprocessing + threading.Timer + try/except.
Take a look at this:
from multiprocessing import Process, Queue, queues
from threading import Timer

class SafeProcess(Process):
    def __init__(self, queue, *args, **kwargs):
        self.queue = queue
        super().__init__(*args, **kwargs)

    def run(self):
        print('Running')
        try:
            result = self._target(*self._args, **self._kwargs)
            self.queue.put_nowait(result)
        except:
            print('Exception')

result = None
while result != 'it worked!!':
    q = Queue()
    p = SafeProcess(q, target=unreliable_code)
    p.start()
    t = Timer(1, p.terminate)  # in case it should hang
    t.start()
    p.join()
    t.cancel()
    try:
        result = q.get_nowait()
    except queues.Empty:
        print('Empty')
    print(result)
That in one (lucky) case gave me:
Running
Empty
None
Running
it worked!!
In your code sample you have a 4-in-5 chance of getting an error on each call, so you might also spawn a pool or something to improve your chances of getting a correct result.
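Picking up that last suggestion, here is a sketch (my addition, assuming the unreliable_code function from the question is defined alongside it) that uses multiprocessing.Pool with a timeout on the result instead of a Timer:

from multiprocessing import Pool, TimeoutError

result = None
while result != 'it worked!!':
    with Pool(processes=1) as pool:
        async_result = pool.apply_async(unreliable_code)
        try:
            # get() re-raises the worker's exception, or raises
            # TimeoutError if the code hangs longer than a second
            result = async_result.get(timeout=1)
        except TimeoutError:
            print('hung, retrying')
        except Exception as e:
            print('crashed: %s' % e)
print(result)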
