I am working on a service where we have to store the user's data in the DB, send a notification email to the user, and return a success response, and this whole process takes some time to complete.
Since the Python code executes synchronously, I want to run part of this process asynchronously: store the user details and return the success response first, then do the mail step afterwards (after the response has been returned), so that the overall response time does not depend on the mail sending.
def userregistration(details):
    # store user details
    result = storeuserdb(details)
    print("storeuserdb result", result)
    if result["status"] == True:
        sendmailtouser()  # has to run asynchronously
    return result

def storeuserdb(details):
    # store code goes here
    pass

def sendmailtouser():
    # email code goes here
    pass
Is there any way to run it after the response has already been returned to the service? Like this:
def userregistration(details):
    # store user details
    result = storeuserdb(details)
    print("storeuserdb result", result)
    return result
    # can the part below still run after the response has been returned?
    if result["status"] == True:
        sendmailtouser()  # has to run asynchronously
You can use Threads. Here is a little example:
import threading

def worker(num):
    """thread worker function"""
    print('Worker: %s' % num)
    if num == 1:
        print('I am number 1')
        input("Press Enter to continue...")
    return

threads = []
for i in range(5):
    t = threading.Thread(target=worker, args=(i,))
    threads.append(t)
    t.start()
Output:
Worker: 3
Worker: 0
Worker: 1
Worker: 2
Worker: 4
I am number 1
Press Enter to continue...
The program does not stop while worker 1 is being handled, even though that worker is waiting for input.
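Applied to your registration flow, a minimal sketch could look like this (it assumes sendmailtouser() takes no arguments, as in your snippet, and that the service process keeps running after the response is returned):

import threading

def userregistration(details):
    # store user details
    result = storeuserdb(details)
    if result["status"] == True:
        # fire-and-forget: start the mail in a background thread and
        # return the response without waiting for it
        mail_thread = threading.Thread(target=sendmailtouser)
        mail_thread.daemon = True  # don't keep the process alive just for the mail
        mail_thread.start()
    return result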
I am making a bot that auto-posts to Instagram using instabot. The problem is that if I exceed a certain number of requests, the bot terminates the script after retrying for a few minutes.
The solution I came up with is to schedule the script to run every hour or so, and to make sure the script keeps running I used threading to restart the posting function whenever its thread dies.
This is the function responsible for posting; in this code, if the instabot instance retries sending requests for a few minutes and fails, it terminates the whole script:
def main():
create_db()
try:
os.mkdir("images")
print("[INFO] Images Directory Created")
except:
print("[INFO] Images Directory Found")
# GET A SUBMISSION FROM HOT
submissions = list(reddit.subreddit('memes').hot(limit=100))
for sub in submissions:
print("*"*68)
url = sub.url
print(f'[INFO] URL : {url}')
if "jpg" in url or "png" in url:
if not sub.stickied:
print("[INFO] Valid Post")
if check_if_exist(sub.id) is None:
id_ = sub.id
name = sub.title
link = sub.url
status = "FALSE"
print(f"""
[INFO] ID = {id_}
[INFO] NAME = {name}
[INFO] LINK = {link}
[INFO] STATUS = {status}
""")
# SAVE THE SUBMISSION TO THE DATABASE
insert_db(id_, name, link, status)
post_instagram(id_)
print(f"[INFO] Picture Uploaded, Next Upload is Scheduled in 60 min")
break
time.sleep(5 * 60)
The scheduling function:
def func_start():
schedule.every(1).hour.do(main)
while True:
schedule.run_pending()
time.sleep(10 * 60)
And last piece of code:
if __name__ == '__main__':
t = threading.Thread(target=func_start)
while True:
if not t.is_alive():
t.start()
else:
pass
So basically I want to keep running the main function every hour or so, but I am not having any success.
Looks to me like schedule and threading are overkill for your use case as your script only performs one single task, so you do not need concurrency and can run the whole thing in the main thread. You primarily just need to catch exceptions from the main function. I would go with something like this:
if __name__ == '__main__':
while True:
try:
main()
except Exception as e:
# will handle exceptions from `main` so they do not
# terminate the script
# note that it would be better to only catch the exact
# exception you know you want to ignore (rather than
# the very broad `Exception`), and let other ones
# terminate the script
print("Exception:", e)
finally:
# will sleep 10 minutes regardless whether the last
# `main` run succeeded or not, then continue running
# the infinite loop
time.sleep(10 * 60)
...unless you actually want each main run to start at precise 60-minute intervals, in which case you'll probably need either threading or schedule. Because, if running main takes, say, 3 to 5 minutes, then simply sleeping 60 minutes after each execution means you'll be launching the function every 63 to 65 minutes.
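If you do want the fixed interval but would rather stay in a single thread, one alternative (a rough sketch, not from the original answer) is to measure how long main took and sleep only for the remainder of the hour:

import time

INTERVAL = 60 * 60  # target interval in seconds

if __name__ == '__main__':
    while True:
        started = time.time()
        try:
            main()
        except Exception as e:
            print("Exception:", e)
        # sleep only for whatever is left of the hour, so each run starts
        # roughly 60 minutes after the previous one started
        time.sleep(max(0, INTERVAL - (time.time() - started)))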
I have a function that uses threading to connect to a number of network devices and run commands against them:
#!/usr/bin/python3
# Importing Netmiko modules
from netmiko import Netmiko
from netmiko.ssh_exception import (
NetMikoAuthenticationException,
NetMikoTimeoutException,
)
import signal, os, json, re
# Queuing and threading libraries
from queue import Queue
import threading
class MyFunction:
def __init__(self):
# Get the password
self.password = "password"
# Switch IP addresses from text file that has one IP per line
self.hops = ["switch1", "switch2", "switch3"]
self.hop_info = []
# Set up thread count for number of threads to spin up
self.num_threads = 8
# This sets up the queue
self.enclosure_queue = Queue()
# Set up thread lock so that only one thread prints at a time
self.print_lock = threading.Lock()
self.command = "show ip route 127.0.0.1 | json"
def deviceconnector(self, i, q):
while True:
# Loop forever, pulling the next IP address off the queue
print("{}: Waiting for IP address...".format(i))
ip = q.get()
print("{}: Acquired IP: {}".format(i, ip))
# k,v passed to net_connect
device_dict = {
"host": ip,
"username": "admin",
"password": self.password,
"device_type": "cisco_ios",
}
try:
net_connect = Netmiko(**device_dict)
except NetMikoTimeoutException:
with self.print_lock:
print("\n{}: ERROR: Connection to {} timed-out.\n".format(i, ip))
q.task_done()
continue
except NetMikoAuthenticationException:
with self.print_lock:
print(
"\n{}: ERROR: Authentication failed for {}. Stopping script.\n".format(
i, ip
)
)
q.task_done()
continue  # skip this device; without this, net_connect below would be undefined
output = net_connect.send_command(self.command)
with self.print_lock:
print("{}: Printing output...".format(i))
print(output)
# Disconnect from device
net_connect.disconnect()
q.task_done()
def main(self):
for i in range(len(self.hops)):
thread = threading.Thread(
target=self.deviceconnector, args=(i, self.enclosure_queue)
)
# Set the thread as a background daemon/job
thread.setDaemon(True)
# Start the thread
thread.start()
for hop in self.hops:
self.enclosure_queue.put(hop)
# Wait for all tasks in the queue to be marked as completed (task_done)
self.enclosure_queue.join()
print("*** Script complete")
if __name__ == "__main__":
# Calling the main function
run = MyFunction()
run.main()
This prints the output of the commands fine, but instead of printing it I want to do the threading in a function, so the returned data can be manipulated and used as part of another function. I can't seem to get the threading functionality in main() to allow this.
Update
I'm using this in an API endpoint, so I am taking out the
if __name__ == "__main__":
block and calling it from another main.py. The problem I'm having is that it hangs and doesn't return, but I can't quite figure out why. It only hangs when run from main.py, not when it is run directly with the __main__ block.
You are already writing the data to your Queue, so you just have to get it back from there:
my_data = q.get()
Did you write the code? If so, use your Queue :)
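In practice you will probably want a second queue just for the results (the results_queue below is not in your original code); a rough sketch:

# in __init__:
self.results_queue = Queue()

# in deviceconnector, instead of printing the output:
self.results_queue.put((ip, output))

# in main, after self.enclosure_queue.join() every result is in:
results = {}
while not self.results_queue.empty():
    ip, output = self.results_queue.get()
    results[ip] = output
return results  # another function can now manipulate the data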
I am writing a multiprocess program. There are four classes: Main, Worker, Request and Ack. The Main class is the entry point of the program. It creates a sub-process called Worker to do some jobs. The main process puts a Request into a JoinableQueue, and the Worker then gets the request from the queue. When the Worker has finished the request, it puts an Ack back into the queue. Part of the code is shown below:
Main:
class Main():
def __init__(self):
self.cmd_queue = JoinableQueue()
self.worker = Worker(self.cmd_queue)
def call_worker(self, cmd_code):
if self.cmd_queue.empty() is True:
request = Request(cmd_code)
self.cmd_queue.put(request)
self.cmd_queue.join()
ack = self.cmd_queue.get()
self.cmd_queue.task_done()
if ack.value == 0:
return True
else:
return False
else:
# TODO: Error Handling.
pass
def run_worker(self):
self.worker.start()
Worker:
class Worker(Process):
def __init__(self, cmd_queue):
super(Worker, self).__init__()
self.cmd_queue = cmd_queue
...
def run(self):
while True:
ack = Ack(0)
try:
request = self.cmd_queue.get()
if request.cmd_code == ReqCmd.enable_handler:
self.enable_handler()
elif request.cmd_code == ReqCmd.disable_handler:
self.disable_handler()
else:
pass
except Exception:
ack.value = -1
finally:
self.cmd_queue.task_done()
self.cmd_queue.put(ack)
self.cmd_queue.join()
It usually works fine, but sometimes the Main process gets stuck at self.cmd_queue.join(), and sometimes the Worker gets stuck at self.cmd_queue.join(). It is so weird! Does anyone have any ideas? Thanks
There's nothing weird about the above issue: you shouldn't call the queue's join() within a typical single-worker process activity, because
Queue.join()
Blocks until all items in the queue have been gotten and
processed.
Calls like that, placed where they are in your current implementation, will make the processing pipeline wait.
Usually queue.join() is called in the main (supervisor) thread after initiating/starting all threads/workers.
https://docs.python.org/3/library/queue.html#queue.Queue.join
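A common arrangement is to use the joinable queue in one direction only (requests in, task_done by the worker) and send results back over a second, plain queue, with join() called once by the supervisor. A rough sketch (not your original classes):

from multiprocessing import JoinableQueue, Queue, Process

def worker(requests, results):
    while True:
        req = requests.get()
        try:
            results.put(("done", req))  # stand-in for the real work / Ack
        finally:
            requests.task_done()        # the worker only ever calls task_done, never join

if __name__ == '__main__':
    requests, results = JoinableQueue(), Queue()
    Process(target=worker, args=(requests, results), daemon=True).start()
    requests.put("enable_handler")
    requests.put("disable_handler")
    requests.join()                     # the supervisor is the only caller of join()
    while not results.empty():
        print(results.get())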
I have a piece of multi-threaded code: 3 threads that poll data from SQS and add it to a Python queue, and 5 threads that take the messages from the Python queue, process them and send them to a back-end system.
Here is the code:
python_queue = Queue.Queue()
class GetDataFromSQS(threading.Thread):
"""Threaded Url Grab"""
def __init__(self, python_queue):
threading.Thread.__init__(self)
self.python_queue = python_queue
def run(self):
while True:
time.sleep(0.5) # sleep for a bit before querying again
try:
msgs = sqs_queue.get_messages(10)
if msgs == None:
print "sqs is empty now!"
for msg in msgs:
#place each message block from sqs into python queue for processing
self.python_queue.put(msg)
print "Adding a new message to Queue. Queue size is now %d" % self.python_queue.qsize()
#delete from sqs
sqs_queue.delete_message(msg)
except Exception as e:
print "Exception in GetDataFromSQS :: " + str(e)
class ProcessSQSMsgs(threading.Thread):
def __init__(self, python_queue):
threading.Thread.__init__(self)
self.python_queue = python_queue
self.pool_manager = PoolManager(num_pools=6)
def run(self):
while True:
#grabs the message to be parsed from sqs queue
python_queue_msg = self.python_queue.get()
try:
processMsgAndSendToBackend(python_queue_msg, self.pool_manager)
except Exception as e:
print "Error parsing:: " + str(e)
finally:
self.python_queue.task_done()
def processMsgAndSendToBackend(msg, pool_manager):
if msg != "":
###### All the code related to processing the msg
for individualValue in processedMsg:
try:
response = pool_manager.urlopen('POST', backend_endpoint, body=individualValue)
if response == None:
print "Error"
else:
response.release_conn()
except Exception as e:
print "Exception! Post data to backend: " + str(e)
def startMyPython():
#spawn a pool of threads, and pass them queue instance
for i in range(3):
sqsThread = GetDataFromSQS(python_queue)
sqsThread.start()
for j in range(5):
parseThread = ProcessSQSMsgs(python_queue)
#parseThread.setDaemon(True)
parseThread.start()
#wait on the queue until everything has been processed
python_queue.join()
# python_queue.close() -- should i do this?
startMyPython()
The problem:
3 python workers die randomly (monitored using top -p -H) once every few days, and everything is alright if I kill the process and start the script again. I suspect the workers that vanish are the 3 GetDataFromSQS threads. And because GetDataFromSQS dies, the other 5 workers, although still running, always sleep as there is no data in the python queue. I am not sure what I am doing wrong here, as I am pretty new to Python and followed this tutorial for creating the queuing logic and threads - http://www.ibm.com/developerworks/aix/library/au-threadingpython/
Thanks in advance for your help. I hope I have explained my problem clearly.
The problem with the thread hanging was related to getting a handle on the SQS queue. I used IAM for managing credentials and the boto SDK for connecting to SQS.
The root cause of this issue was that the boto package reads the auth metadata from AWS, and this was failing once in a while.
The fix is to edit the boto config (e.g. ~/.boto or /etc/boto.cfg), increasing the number of attempts made for the auth call to AWS:
[Boto]
metadata_service_num_attempts = 5
( https://groups.google.com/forum/#!topic/boto-users/1yX24WG3g1E )
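As a side note (this is not part of the original code or of the fix above), a cheap way to see why a worker vanished the next time it happens is to guard the whole run body, so the exception that actually kills the thread gets logged before it disappears from top:

import traceback

class GetDataFromSQS(threading.Thread):
    # __init__ as before ...
    def run(self):
        try:
            self.poll_forever()    # hypothetical helper holding the existing while-True loop
        except Exception:
            traceback.print_exc()  # without this, the thread just dies silently
            raise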
I am writing a small multi-threaded HTTP file downloader and would like to be able to shrink the pool of available threads as the code encounters errors.
The errors would be specific HTTP errors returned where the web server is not allowing any more connections.
E.g. if I set up a pool of 5 threads, each thread attempts to open its own connection and download a chunk of the file. The server may only allow 2 connections and will, I believe, return 503 errors; I want to detect this and shut down a thread, eventually limiting the size of the pool to, presumably, only the 2 connections that the server will allow.
Can I make a thread stop itself?
Is self._Thread__stop() sufficient?
Do I also need to join()?
Here's my worker class that does the downloading. It grabs work from the queue to process; once a chunk is downloaded, it dumps the result into resultQ to be saved to file by the main thread.
It's in here that I would like to detect an HTTP 503 and stop/kill/remove a thread from the available pool - and of course re-add the failed chunk back to the queue so the remaining threads will process it.
class Downloader(threading.Thread):
def __init__(self, queue, resultQ, file_name):
threading.Thread.__init__(self)
self.workQ = queue
self.resultQ = resultQ
self.file_name = file_name
def run(self):
while True:
block_num, url, start, length = self.workQ.get()
print 'Starting Queue #: %s' % block_num
print start
print length
#Download the file
self.download_file(url, start, length)
#Tell queue that this task is done
print 'Queue #: %s finished' % block_num
self.workQ.task_done()
def download_file(self, url, start, length):
request = urllib2.Request(url, None, headers)
if length == 0:
return None
request.add_header('Range', 'bytes=%d-%d' % (start, start + length))
while 1:
try:
data = urllib2.urlopen(request)
except urllib2.URLError, u:
print "Connection did not start with", u
else:
break
chunk = ''
block_size = 1024
remaining_blocks = length
while remaining_blocks > 0:
if remaining_blocks >= block_size:
fetch_size = block_size
else:
fetch_size = int(remaining_blocks)
try:
data_block = data.read(fetch_size)
if len(data_block) == 0:
print "Connection: [TESTING]: 0 sized block" + \
" fetched."
if len(data_block) != fetch_size:
print "Connection: len(data_block) != length" + \
", but continuing anyway."
self.run()
return
except socket.timeout, s:
print "Connection timed out with", s
self.run()
return
remaining_blocks -= fetch_size
chunk += data_block
self.resultQ.put([start, chunk])
Below is where I init the thread pool, further down I put items to the queue
# create a thread pool and give them a queue
for i in range(num_threads):
t = Downloader(workQ, resultQ, file_name)
t.setDaemon(True)
t.start()
Can I make a thread stop itself?
Don't use self._Thread__stop(). It is enough to exit the thread's run() method (you can check a flag or read a sentinel value from a queue to know when to exit).
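A sentinel-based exit for your Downloader could look roughly like this (using None as the "no more work" marker is only a convention, not something already in your code):

# inside Downloader:
def run(self):
    while True:
        item = self.workQ.get()
        if item is None:              # sentinel: leave run() and the thread ends
            self.workQ.task_done()
            return
        block_num, url, start, length = item
        self.download_file(url, start, length)
        self.workQ.task_done()

# the main thread retires a worker simply by queueing one sentinel per thread it wants to stop:
workQ.put(None)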
It's in here where I would like to detect a http 503 and stop/kill/remove a thread from the available pools - and of course re-add the failed chunk back to the queue so the remaining threads will process it
You can simplify the code by separating responsibilities:
download_file() should not try to reconnect in an infinite loop. If there is an error, let the code that calls download_file() resubmit it if necessary
the control over the number of concurrent connections can be encapsulated in a Semaphore object. The number of threads may differ from the number of concurrent connections in this case
import concurrent.futures # on Python 2.x: pip install futures
from threading import BoundedSemaphore
def download_file(args):
nconcurrent.acquire(timeout=args['timeout']) # block if too many connections
# ...
nconcurrent.release() #NOTE: don't release it on exception,
# allow the caller to handle it
# you can put it into a dictionary: server -> semaphore instead of the global
nconcurrent = BoundedSemaphore(5) # start with at most 5 concurrent connections
with concurrent.futures.ThreadPoolExecutor(max_workers=NUM_THREADS) as executor:
future_to_args = dict((executor.submit(download_file, args), args)
for args in generate_initial_download_tasks())
while future_to_args:
for future in concurrent.futures.as_completed(dict(**future_to_args)):
args = future_to_args.pop(future)
try:
result = future.result()
except Exception as e:
print('%r generated an exception: %s' % (args, e))
if getattr(e, 'code', None) != 503:
# don't decrease number of concurrent connections
nconcurrent.release()
# resubmit
args['timeout'] *= 2
future_to_args[executor.submit(download_file, args)] = args
else: # successfully downloaded `args`
print('f%r returned %r' % (args, result))
See ThreadPoolExecutor() example.
you should be using a threadpool to control the life of your threads:
http://www.inductiveload.com/posts/easy-thread-pools-in-python-with-threadpool/
Then when a thread exits, you can send a message to the main thread (that is handling the threadpool), change the size of the threadpool, and postpone new requests or failed requests in a stack that you'll empty later.
tedelanay is absolutely right about the daemon status you're giving to your threads. There is no need to set them as daemons.
Basically, you can simplify your code; you could do something like the following:
import threadpool
def process_tasks():
pool = threadpool.ThreadPool(4)
requests = threadpool.makeRequests(download_file, arguments)
for req in requests:
pool.putRequest(req)
#wait for them to finish (or you could go and do something else)
pool.wait()
if __name__ == '__main__':
process_tasks()
where arguments is up to your strategy. Either you give your threads a queue as an argument and then empty the queue, or you can process the queue in process_tasks, block while the pool is full, and open a new thread when a thread is done but the queue is not empty. It all depends on your needs and the context of your downloader.
resources:
http://chrisarndt.de/projects/threadpool/
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/203871
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/196618
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/302746
http://lethain.com/using-threadpools-in-python/
A Thread object terminates the thread simply by returning from the run method - it doesn't call stop. If you set your thread to daemon mode, there is no need to join, but otherwise the main thread needs to do it. It is common for the thread to use the resultq to report that it is exiting and for the main thread to use that info to do the join. This helps with orderly termination of your process. You can get strange errors during system exit if Python is still juggling multiple threads, and it's best to side-step that.
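As a sketch of that hand-off (it assumes the created threads are kept in a list and that every item a worker puts on resultQ is tagged with the thread's name, which the code above does not currently do):

THREAD_DONE = object()  # arbitrary marker meaning "this worker is exiting"

# inside Downloader.run(), just before returning:
#     self.resultQ.put((self.name, THREAD_DONE))

# main thread: save results as they arrive and join each worker once it reports it is done
workers = {t.name: t for t in threads}
while workers:
    name, payload = resultQ.get()
    if payload is THREAD_DONE:
        workers.pop(name).join()   # orderly shutdown instead of relying on daemon threads
    else:
        start, chunk = payload
        # ... write the chunk to the output file at offset `start` ...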