Good evening. After several hours of looking for a solution, I still can't find any way to solve this:
I'm currently working on a Selenium script that creates X threads; each thread runs a Firefox instance that runs tests on a website. The thing is, when I use Ctrl+C or close the executable with the cross at the top right, every Firefox instance created keeps living.
I assume this is caused by the sub-threads created in the main thread not being stopped, because those processes are still running, so I decided to write a function that takes a list of drivers and closes EVERY driver in the list; the drivers are added to the list when they are created.
The issue only happens when I'm running it as an executable; the "Stop" button of my IDE (PyCharm) has no issue with it.
What I've tried :
Using the atexit module to shut down every driver (Firefox instance) on exit with a clean_threads function -> it doesn't work, because it looks like atexit only runs once every thread has shut down, so in my case the function was never called.
Running my main function in a "try - finally" structure with the clean_threads function called in the finally block -> doesn't work either; I might have used it the wrong way, but it did not work.
Running my main function in a "try - except (KeyboardInterrupt, SystemExit)" -> didn't manage to make it work either; for some unknown reason it just made Ctrl+C unable to interrupt the script at all.
I'd love some advice on the procedure to follow; I admit I'm going in circles and not finding a solution to the problem.
Any help will be appreciated, thanks in advance :) And if there is a need for more clarification, snippets or whatever, please do not hesitate to ask.
Code of my main function :
def main():
    global THREADS
    load_settings()
    try:
        # TODO clean firefox instances
        # TODO proper switch
        THREADS = [Thread(target=automation, args=(i, HEADLESS))
                   for i in range(FIRST_ACC, ACCOUNT_NUMBER + FIRST_ACC, 1)]
        for thread in THREADS:
            thread.start()
            time.sleep(30)
        for thread in THREADS:
            thread.join()
    except (KeyboardInterrupt, SystemExit):
        print("Exception caught")
        clean_threads(DRIVER_LIST)
The clean_threads function :
def clean_threads(driver_list):
    discard_list = []
    print("Test")
    for each in driver_list:
        each.quit()  # Selenium drivers are shut down with quit(), not exit()
        discard_list.append(each)
    print(len(discard_list))
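For what it's worth, one common pattern for this kind of cleanup is to install a SIGINT handler in the main thread that quits every registered driver before exiting. A minimal sketch, with the Selenium driver and the automation function stubbed out as stand-ins:

```python
import signal
import sys
import threading
import time

DRIVER_LIST = []                # drivers append themselves here when created
DRIVER_LOCK = threading.Lock()  # worker threads and the handler share the list

class FakeDriver:
    """Stand-in for a Selenium WebDriver."""
    def quit(self):
        print("driver closed")

def automation(i):
    driver = FakeDriver()
    with DRIVER_LOCK:
        DRIVER_LIST.append(driver)
    time.sleep(0.1)  # simulated work

def handle_sigint(signum, frame):
    # Runs in the main thread on Ctrl+C: close every driver, then exit.
    with DRIVER_LOCK:
        for driver in DRIVER_LIST:
            driver.quit()
    sys.exit(1)

signal.signal(signal.SIGINT, handle_sigint)
```

Signal handlers only run in the main thread, which is exactly where Ctrl+C is delivered, so this avoids relying on atexit firing while worker threads are still alive.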
Related
I have a Python script which starts multiple sub processes using these lines:
threads = []
for elm in elements:
    t = multiprocessing.Process(target=sub_process, args=[elm])
    threads.append(t)
    t.start()
for t in threads:
    t.join()
Sometimes, for some reason, the thread halts and the script never finishes.
I'm trying to use the VSCode debugger to find the problem and check where in the thread itself it gets stuck, but I'm having issues pausing these sub processes, because when I click pause in the debugger window:
It will pause the main thread and some other threads that are running properly, but it won't pause the stuck sub process.
Even when I try to pause the threads manually, one by one, using the Call Stack window, I can still pause only the working threads and not the stuck one.
Please help me figure this out. It's hard because the thing that makes the process stuck doesn't always happen, which makes it very hard to debug.
First, those are subprocesses, not threads. It's important to understand
the difference, although it doesn't answer your question.
Second, a pause (manual break) in the Python debugger will break in Python code.
It won't break in the machine code underneath that executes the Python, or in the
machine code that performs the OS services the Python code is asking for.
If you execute a pause, the break will occur in the Python code above
the machine code when (and if) the machine code returns to the Python interpreter loop.
Given a complete example:
import multiprocessing
import time

elements = ["one", "two", "three"]

def sub_process(gs, elm):
    gs.acquire()
    print("sleep", elm)
    time.sleep(60)
    print("awake", elm)
    gs.release()

def test():
    gs = multiprocessing.Semaphore()
    subprocs = []
    for elm in elements:
        p = multiprocessing.Process(target=sub_process, args=[gs, elm])
        subprocs.append(p)
        p.start()
    for p in subprocs:
        p.join()

if __name__ == '__main__':
    test()
The first subprocess will grab the semaphore and sleep for a minute,
and the second and third subprocesses will wait inside gs.acquire() until they
can move forward. A pause will not break into the debugger until the
subprocess returns from the acquire, because acquire is below the Python code.
It sounds like you have an idea where the process is getting stuck,
but you don't know why. You need to determine what questions
you are trying to answer. For example:
(Assuming) one of the processes is stuck in acquire. That means one of the other
processes didn't release the semaphore. What code in which process is
acquiring a semaphore and not releasing it?
Looking at the semaphore object itself might tell you which subprocess is holding it,
but this is a tangent: can you use the debugger to inspect the semaphore
and determine who is holding it? For example, using a machine level debugger in windows,
if these were threads and a critical section, it's possible to look at the critical section
and see which thread is still holding it. I don't know if this could be
done using processes and semaphores on your chosen platform.
Which debuggers you have access to depend on the platform you're running on.
In summary:
You can't break into machine code with the Python debugger.
You can run the Python interpreter in a machine code debugger, but this
won't show you the Python code at all, which makes life interesting.
This can be helpful if you have an idea what you're looking for -
for example, you might be able to tell that you're stuck waiting for a semaphore.
Running a machine code debugger becomes more difficult when you're running
sub-processes, because you need to know which sub-process you're interested
in, and attach to that one. This becomes simpler if you're using a single
process and multiple threads instead, since there's only one process to deal with.
"You can't get there from here, you have to go someplace else first."
You'll need to take a closer look at your code and figure out how
to answer the questions you need to answer using other means.
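One pragmatic middle ground, not mentioned above: the standard-library faulthandler module can dump the Python traceback of every thread in a process on demand or after a timeout, which often reveals where a hang sits without any debugger. A minimal sketch:

```python
import faulthandler
import sys

# Arm a watchdog: if the program is still running in 5 seconds, dump
# every thread's Python stack to stderr, and repeat every 5 seconds.
faulthandler.dump_traceback_later(5, repeat=True, file=sys.stderr)

# ... run the code that sometimes hangs ...

faulthandler.cancel_dump_traceback_later()  # made it through, disarm

# An immediate dump of every thread's stack is also available:
faulthandler.dump_traceback(file=sys.stderr)
```

Since each subprocess is its own interpreter, the watchdog has to be armed inside the subprocess (e.g. at the top of sub_process) for its stacks to be visible.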
Just an idea: why not set a timeout on your sub processes and terminate them?
TIMEOUT = 60
threads = []
for elm in elements:
    t = multiprocessing.Process(target=sub_process, args=[elm])
    t.daemon = True
    threads.append(t)
    t.start()
    t.join(TIMEOUT)
    if t.is_alive():  # still stuck after the timeout
        t.terminate()
for t in threads:
    t.join()
I was looking at how to stop a thread in Python using the threading module, and I found that this method is not provided by the module. I have seen some tricks to implement a way to stop threads, but none of them worked for me.
My program has a main window that shows every function on it, and one of these functions opens another window that runs "function2" via a button.
I want to be able to do things, and not let the window freeze, while "function2" is running, so I defined "function2" as a threading.Thread and called it using the Thread.run() method.
This works great, but when "function2" is done, I cannot re-run the function, because threads can only be started once.
I need a solution to this; if someone can help me, I would be glad.
Thanks.
Expanding on the comments. What you have is
fun2 = threading.Thread(name='funcion2', target=funcion2)
ttk.Button(loginpanel, text='Initfun2', command=fun2.start)
which basically creates one thread and tries to re-run it on every click. There is no such thing as re-running a thread, so instead you have to create a new thread on each click:
def fun2():
    threading.Thread(name='funcion2', target=funcion2).start()

ttk.Button(loginpanel, text='Initfun2', command=fun2)
While this is better, it has another drawback: what if someone starts clicking the button like mad? You want to limit the number of threads in use. For that, using a thread pool is a good option:
from concurrent.futures import ThreadPoolExecutor

THREADPOOL = ThreadPoolExecutor(10)

def fun2():
    THREADPOOL.submit(funcion2)

ttk.Button(loginpanel, text='Initfun2', command=fun2)
This code is for Python 3.x. For Python 2 I think you need an external library.
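To see the pool cap concurrency outside of tkinter, here is a self-contained sketch; funcion2 is replaced with a stand-in that just records how many copies run at once:

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time

active = 0  # how many copies of funcion2 are running right now
peak = 0    # the highest value active ever reached
lock = threading.Lock()

def funcion2():
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)  # simulated work
    with lock:
        active -= 1

pool = ThreadPoolExecutor(max_workers=2)
futures = [pool.submit(funcion2) for _ in range(10)]  # "mad clicking"
for f in futures:
    f.result()
pool.shutdown()
print(peak)  # never exceeds 2, the pool's worker limit
```

Ten submissions go in, but the executor only ever runs two at a time; the rest wait in the pool's internal queue, which is exactly the behavior you want from a trigger-happy button.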
At first I thought I had some kind of memory leak causing the issue, but I'm getting an exception that I'm not sure I fully understand; at least I've narrowed it down now.
I'm using a while True loop to keep a thread running and retrieving data. If it runs into a problem, it logs it and keeps running. It seems to work fine at first - at least the first time - and then it constantly logs a threading exception.
I narrowed it down to this section:
while True:
    # yada yada yada...
    # Works fine up to this point
    pool = ThreadPool(processes=1)
    async_result = pool.apply_async(SpawnPhantomJS, (dcap, service_args))
    Driver = async_result.get(10)
    Driver.set_window_size(1024, 768)  # optional
    Driver.set_page_load_timeout(30)
I do this because there's an issue spawning a lot of Selenium webdrivers: eventually one times out (no exception - it just hangs there), and using this gave it a timeout, so if it couldn't spawn in 10 seconds the exception would catch it and we would go again. Seemed like a great fix. But I think it's causing problems in a loop.
It works fine to start with, but then throws the same exception on every loop.
I don't understand the thread pooling well enough - maybe I shouldn't constantly be redefining it. It's a hard exception to catch in the act, so testing is a bit of a pain, but I'm thinking something like this might fix it?
pool = ThreadPool(processes=1)
async_result = pool.apply_async(SpawnPhantomJS, (dcap, service_args))
while True:
    Driver = async_result.get(10)
That looks neater to me, but I don't understand the problem well enough to say for sure it would fix it.
I'd really appreciate any suggestions.
Update:
I've tracked the problem to this section of code 100%: I put a variable named bugcounter = 1 before it and = 2 afterwards, and logged the value on an exception.
But when trying to reproduce it with just this code in a loop, it runs fine and keeps spawning webdrivers. So I've no idea.
Further update:
I can run this locally for hours. Sometimes it'll run on the (Windows) server for hours. But after a while it fails somewhere here, and I can't figure out why.
An exception could be thrown because the timeout hits and the browser doesn't spawn in time. This happens rarely, but that's why we loop back to it.
My assumption here is that I'm creating too many threads and the OS isn't having it. I have just spotted there's a .terminate() for thread pools - maybe I should terminate the pool after using it to spawn a browser?
The idea I came to in the final update solved it.
I was using a thread pool to give the browser spawn a timeout, as a workaround for the bug in the library. But I wasn't terminating that thread pool, so eventually, after x amount of loops, the OS wouldn't let it create another pool.
Adding a .terminate() once the browser had been spawned and the pool was no longer needed solved the problem.
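A sketch of what that fix might look like, with SpawnPhantomJS stubbed out as a stand-in; the pool lives only as long as one spawn attempt:

```python
from multiprocessing.pool import ThreadPool

def SpawnPhantomJS(dcap, service_args):
    # Stand-in for the real webdriver spawn code.
    return "driver"

dcap, service_args = {}, []

for _ in range(3):  # each loop iteration gets a fresh, short-lived pool
    pool = ThreadPool(processes=1)
    try:
        async_result = pool.apply_async(SpawnPhantomJS, (dcap, service_args))
        driver = async_result.get(10)  # raises TimeoutError if the spawn hangs
    finally:
        pool.terminate()  # release the pool's worker thread and resources
        pool.join()
```

The try/finally guarantees the pool is torn down whether the spawn succeeds or times out, so repeated iterations never accumulate live pools.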
This may not specifically be an IronPython question, so a Python dev out there might be able to assist.
I want to run Python scripts in my .NET desktop app using IronPython, and would like to give users the ability to forcibly terminate a script. Here's my test script (I'm new to Python, so it might not be totally correct):-
import atexit
import time
import sys

@atexit.register
def cleanup():
    print 'doing cleanup/termination code'
    sys.exit()

for i in range(100):
    print 'doing something'
    time.sleep(1)
(Note that I might want to specify an "atexit" function in some scripts, allowing them to perform any cleanup during normal or forced termination).
In my .Net code I'm using the following code to terminate the script:
_engine.Runtime.Shutdown();
This results in the script's atexit function being called, but the script doesn't actually terminate - the for loop keeps going. A couple of other SO articles (here and here) say that sys.exit() should do the trick, so what am I missing?
It seems that it's not possible to terminate a running script - at least not in a "friendly" way. One approach I've seen is to run the IronPython engine in another thread, and abort the thread if you need to stop the script.
I wasn't keen on this brute-force approach, which would risk leaving any resources used by the script (e.g. files) open.
In the end, I created a C# helper class like this:-
public class HostFunctions
{
    public bool AbortScript { get; set; }

    // Other properties and functions that I want to expose to the script...
}
When the hosting application wants to terminate the script it sets AbortScript to true. This object is passed to the running script via the scope:-
_hostFunctions = new HostFunctions();
_scriptScope = _engine.CreateScope();
_scriptScope.SetVariable("HostFunctions", _hostFunctions);
In my scripts I just need to strategically place checks to see if an abort has been requested, and deal with it appropriately, e.g.:-
for i in range(100):
    print 'doing something'
    time.sleep(1)
    if HostFunctions.AbortScript:
        cleanup()
It seems that if you are using .NET 5 or higher, aborting the thread works imperfectly.
Thread.Abort() is not supported on .NET 5 or higher and throws PlatformNotSupportedException.
You will probably find a solution using Thread.Interrupt(), but it has slightly different behavior:
If your Python script does not have any Thread.Sleep() calls, it won't stop your script;
It looks like you can't Abort a thread twice, but you can Interrupt a thread twice. So, if your Python script is using finally blocks or a context manager, you will be able to interrupt it by calling Thread.Interrupt() twice (with some delay between those calls).
I have a Python program which operates an external program and starts a timeout thread. The timeout thread should count down for 10 minutes, and if the script which operates the external program isn't finished in that time, it should kill the external program.
My thread seems to work fine at first glance: my main script and the thread run simultaneously with no issues. But if a pop-up window appears in the external program, it stops my scripts, so that even the countdown thread stops counting, totally failing at its job.
I assume the issue is that the script calls a blocking function in the API for the external program, which is blocked by the pop-up window. I understand why it blocks my main program, but I don't understand why it blocks my countdown thread. So one possible solution might be to run a separate script for the countdown, but I would like to keep things as clean as possible, and it seems really messy to start a separate script for this.
I have searched everywhere for a clue, but I didn't find much. There was a reference to the gevent library here:
background function in Python
but it seems like such a basic task that I don't want to include an external library for it.
I also found a solution which uses a Windows multimedia timer here, but I've never worked with this before and am afraid the code won't be flexible with it. The script is Windows-only, but it should work on every version of Windows from XP on.
For Unix I found signal.alarm, which seems to do exactly what I want, but it's not available on Windows. Are there any alternatives?
Any ideas on how to handle this in the most simplified manner?
This is the simplified thread I'm creating (run it in IDLE to reproduce the issue):
import threading
import time

class timeToKill():
    def __init__(self, minutesBeforeTimeout):
        self.stop = threading.Event()
        self.countdownFrom = minutesBeforeTimeout * 60

    def startCountdown(self):
        self.countdownThread = threading.Thread(target=self.countdown,
                                                args=(self.countdownFrom,))
        self.countdownThread.start()

    def stopCountdown(self):
        self.stop.set()
        self.countdownThread.join()

    def countdown(self, seconds):
        for second in range(seconds):
            if self.stop.is_set():
                break
            else:
                print(second)
                time.sleep(1)

timeout = timeToKill(1)
timeout.startCountdown()
raw_input("Blocking call, waiting for input:\n")
One possible explanation for a function call blocking another Python thread is that CPython uses a global interpreter lock (GIL) and the blocking API call doesn't release it (note: CPython releases the GIL on blocking I/O calls, therefore your raw_input() example should work as is).
If you can't make the buggy API call release the GIL, then you could use a process instead of a thread, e.g. multiprocessing.Process instead of threading.Thread (the API is the same). Different processes are not limited by the GIL.
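A countdown along those lines might look like the sketch below; the external program's pid and the blocking work are stand-ins, and the SIGTERM-based kill assumes a platform where os.kill supports it:

```python
import multiprocessing
import os
import signal
import time

def watchdog(timeout_seconds, target_pid):
    # Runs in its own process with its own GIL, so a blocked parent
    # cannot stall this countdown.
    time.sleep(timeout_seconds)
    print("timeout reached, killing external program")
    os.kill(target_pid, signal.SIGTERM)  # stand-in for killing the external program

def run_with_timeout():
    external_pid = os.getpid()  # stand-in: real code would use the external program's pid
    timer = multiprocessing.Process(target=watchdog, args=(10 * 60, external_pid))
    timer.daemon = True  # don't let the watchdog outlive the main script
    timer.start()
    # ... the blocking work driving the external program would go here ...
    timer.terminate()  # finished in time: cancel the countdown
    timer.join()

if __name__ == '__main__':
    run_with_timeout()
```

Because the watchdog is a separate OS process, a pop-up that freezes every thread in the main interpreter has no effect on it; terminate() is how the main script cancels the countdown when it finishes on time.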
For quick and dirty threading, I usually resort to the subprocess module. It is quite robust and OS-independent. It does not give as fine-grained control as the thread and queue modules, but for external calls to programs it generally does nicely. Note that shell=True must be used with caution.
import subprocess
import time

# this can be any command
p1 = subprocess.Popen(["python", "SUBSCRIPTS/TEST.py", "0"], shell=True)
# p1 will run in the background, asynchronously; if you want to kill it
# after some time, then you need to poll and terminate it yourself:

# here do some other tasks/computations
time.sleep(10)

currentStatus = p1.poll()
if currentStatus is None:  # then it is still running
    try:
        p1.kill()  # maybe try os.kill(p1.pid, 2) if p1.kill() does not work
    except OSError:
        # the process finished between poll() and kill(); nothing to do
        pass