I am making a bot that auto-posts to Instagram using instabot. The problem is that if I exceed a certain number of requests, the bot terminates the script after retrying for a few minutes.
The solution I came up with is to schedule the script to run every hour or so and, to ensure that the script keeps running constantly, to use threading to restart the posting function when its thread dies.
Here is the function responsible for posting. In this code, if the bot instance from instabot retries sending requests for a few minutes and fails, it terminates the whole script:
def main():
    create_db()
    try:
        os.mkdir("images")
        print("[INFO] Images Directory Created")
    except:
        print("[INFO] Images Directory Found")
    # GET A SUBMISSION FROM HOT
    submissions = list(reddit.subreddit('memes').hot(limit=100))
    for sub in submissions:
        print("*" * 68)
        url = sub.url
        print(f'[INFO] URL : {url}')
        if "jpg" in url or "png" in url:
            if not sub.stickied:
                print("[INFO] Valid Post")
                if check_if_exist(sub.id) is None:
                    id_ = sub.id
                    name = sub.title
                    link = sub.url
                    status = "FALSE"
                    print(f"""
                    [INFO] ID = {id_}
                    [INFO] NAME = {name}
                    [INFO] LINK = {link}
                    [INFO] STATUS = {status}
                    """)
                    # SAVE THE SUBMISSION TO THE DATABASE
                    insert_db(id_, name, link, status)
                    post_instagram(id_)
                    print(f"[INFO] Picture Uploaded, Next Upload is Scheduled in 60 min")
                    break
    time.sleep(5 * 60)
The scheduling function:
def func_start():
    schedule.every(1).hour.do(main)
    while True:
        schedule.run_pending()
        time.sleep(10 * 60)
And the last piece of code:
if __name__ == '__main__':
    t = threading.Thread(target=func_start)
    while True:
        if not t.is_alive():
            t.start()
        else:
            pass
So basically I want to keep running the main function every hour or so, but I am not having any success.
Looks to me like schedule and threading are overkill for your use case: your script only performs a single task, so you do not need concurrency and can run the whole thing in the main thread. You primarily just need to catch exceptions from the main function. I would go with something like this:
if __name__ == '__main__':
    while True:
        try:
            main()
        except Exception as e:
            # will handle exceptions from `main` so they do not
            # terminate the script
            # note that it would be better to only catch the exact
            # exception you know you want to ignore (rather than
            # the very broad `Exception`), and let other ones
            # terminate the script
            print("Exception:", e)
        finally:
            # will sleep 10 minutes regardless whether the last
            # `main` run succeeded or not, then continue running
            # the infinite loop
            time.sleep(10 * 60)
...unless you actually want each main run to start at precise 60-minute intervals, in which case you'll probably need threading or schedule. Because if running main takes, say, 3 to 5 minutes, then simply sleeping 60 minutes after each execution means you'll be launching the function every 63 to 65 minutes.
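For completeness, a minimal sketch of that precise-interval variant, assuming the schedule library the question already uses; each run is launched in its own thread so a slow main doesn't push back the next scheduled start:
import threading
import time
import schedule

def launch_main():
    # run `main` in its own thread so a 3-5 minute run
    # doesn't delay the next scheduled start
    threading.Thread(target=main, daemon=True).start()

schedule.every().hour.do(launch_main)

while True:
    schedule.run_pending()
    time.sleep(1)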
Related
I would like to time out the function sftp.put(). I have tried with the signal module, but the script doesn't die if the upload takes longer than 10 s.
I use it to transfer files over SSH (paramiko).
[...]
def handler(signum, frame):
    print 'Signal handler called with signal', signum
    raise IOError("Couldn't upload the fileeeeeeeeeeee!!!!")
[...]
raspi = paramiko.SSHClient()
raspi.set_missing_host_key_policy(paramiko.AutoAddPolicy())
raspi.connect(ip, username="", password="", timeout=10)
sftp = raspi.open_sftp()
[...]
signal.signal(signal.SIGALRM, handler)
signal.alarm(10)
sftp.put(source, destination, callback=None, confirm=True)
signal.alarm(0)
raspi.close()
[...]
Update 1:
I want to abort the transfer if the server stops responding for a while. Actually, my Python script checks (in a loop) for any files in a folder and sends them to this remote server. The problem here is that I want to leave this function if the server suddenly becomes inaccessible during a transfer (server IP changes, no more internet, ...). But when I simulate a disconnection, the script stays stuck at this sftp.put function anyway...
Update 2:
When the server goes offline during a transfer, put() seems to be blocked forever. This happens even with this line:
sftp.get_channel().settimeout(xx)
What can be done when we lose the channel?
Update 3 & script goal
Ubuntu 18.04 and paramiko version 2.6.0
Hello,
To follow up on your remarks and questions, I have to give more details about my very ugly script, sorry about that :)
Actually, I don't want to have to kill a thread manually and open a new one. For my application I want the script to run totally autonomously, and if something goes wrong during the process, to keep going. For that I use Python exception handling. Everything does what I want, except when the remote server goes off during a transfer: the script stays blocked in the put() function, I think inside a loop.
Below, the script contains a total of 3 mechanisms to time this out thanks to your help, but apparently nothing can get out of this damned sftp.put()! Do you have any new ideas?
import [...]
[...]
def handler(signum, frame):
    print 'Signal handler called with signal', signum
    raise IOError("Couldn't upload the fileeeeeeeeeeee!!!!")

def check_time(size, file_size):
    global start_time
    if (time.time() - start_time) > 10:
        raise Exception

i = 0
while i == 0:
    try:
        time.sleep(1)  # CPU break
        print("go!")
        # collect ip server
        fichierIplist = open("/home/robert/Documents/iplist.txt", "r")
        file_lines = fichierIplist.readlines()
        fichierIplist.close()
        last_line = file_lines[len(file_lines) - 1]
        lastKnowip = last_line
        data = glob.glob("/home/robert/Documents/data/*")
        items = len(data)
        if items != 0:
            time.sleep(60)  # anyway
            print("some Files!:)")
            raspi = paramiko.SSHClient()
            raspi.set_missing_host_key_policy(paramiko.AutoAddPolicy())
            raspi.connect(lastKnowip, username="", password="", timeout=10)
            for source in data:  # upload file by file
                filename = os.path.basename(source)
                destination = '/home/pi/Documents/pest/' + filename
                sftp = raspi.open_sftp()
                signal.signal(signal.SIGALRM, handler)
                signal.alarm(10)
                sftp.get_channel().settimeout(10)
                start_time = time.time()
                sftp.put(source, destination, callback=check_time)
                sftp.close()
                signal.alarm(0)
            raspi.close()
        else:
            print("noFile!")
    except:
        pass
If you want to time out when the server stops responding:
set the timeout argument of SSHClient.connect (you're doing that already),
and set sftp.get_channel().settimeout as already suggested by @EOhm.
If you want to time out even when the server is responding, but slowly, implement the callback argument to abort the transfer after a certain time:
start_time = time.time()

def check_time(size, file_size):
    global start_time
    if (time.time() - start_time) > ...:
        raise Exception

sftp.put(source, destination, callback=check_time)
This won't cancel the transfer immediately. To optimize transfer performance, Paramiko queues the write requests to the server. Once you attempt to cancel the transfer, Paramiko has to wait for the responses to those requests in SFTPFile.close() to clear the queue. You might solve that by using SFTPClient.putfo() and avoiding calling the SFTPFile.close() when the transfer is cancelled. But you won't be able to use the connection afterwards. Of course, you can also not use the optimization, then you can cancel the transfer without delays. But that kind of defies the point of all this, doesn't it?
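A rough sketch of that idea, streaming the upload manually (roughly what put()/putfo() do internally) so the remote handle's close() can be skipped on cancellation; the 10-second deadline and 32 KiB chunk size are arbitrary choices, and the assumption is that the connection is discarded afterwards:
import time

def put_with_deadline(sftp, source, destination, deadline=10):
    start = time.time()
    rf = sftp.open(destination, "wb")
    rf.set_pipelined(True)  # the same write-pipelining optimization put() uses
    with open(source, "rb") as lf:
        while True:
            if time.time() - start > deadline:
                # abort *without* rf.close(): close() would block waiting
                # for the server to acknowledge all queued writes
                raise IOError("upload exceeded deadline")
            chunk = lf.read(32768)
            if not chunk:
                break
            rf.write(chunk)
    rf.close()  # only close on success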
Alternatively, you can run the transfer in a separate thread and kill the thread if it takes too long. Ugly but sure solution.
Use sftp.get_channel().settimeout(s) for that instead.
After trying a lot of things, and with your help and advice, I have found a reliable solution for what I wanted. I execute sftp.put in a separate thread and my script does what I want.
Many thanks for your help
Now if the server shuts down during a transfer, my script moves on after 60 seconds, using:
[...]
import threading
[...]
th = threading.Thread(target=sftp.put, args=(source, destination))
th.start()
th.join(60)
[...]
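One small refinement worth considering (a sketch, not part of the original solution): marking the thread as a daemon, so an abandoned transfer thread can't keep the interpreter alive at shutdown:
th = threading.Thread(target=sftp.put, args=(source, destination))
th.daemon = True  # an abandoned transfer thread won't block interpreter exit
th.start()
th.join(60)
if th.is_alive():
    print("transfer still blocked after 60 s, giving up on this file")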
I am using Python 3 modules:
requests for HTTP GET calls to a few Particle Photons which are set up as simple HTTP servers
As a client I am using a Raspberry Pi (which is also an access point) as an HTTP client, which uses multiprocessing.dummy.Pool for making HTTP GET requests to the above-mentioned Photons
The polling routine is as follows:
def pollURL(url_of_photon):
    """
    pollURL: Obtain the IP Address and create a URL for HTTP GET Request
    @param: url_of_photon: IP address of the Photon connected to A.P.
    """
    create_request = 'http://' + url_of_photon + ':80'
    while True:
        try:
            time.sleep(0.1)  # poll every 100ms
            response = requests.get(create_request)
            if response.status_code == 200:
                # if success then dump the data into a temp dump file
                with open('temp_data_dump', 'a+') as jFile:
                    json.dump(response.json(), jFile)
            else:
                # Currently just break
                break
        except KeyboardInterrupt as e:
            print('KeyboardInterrupt detected ', e)
            break
The url_of_photon values are simple IPv4 Addresses obtained from the dnsmasq.leases file available on the Pi.
The main() function:
def main():
    # obtain the IP and MAC addresses from the Lease file
    IP_addresses = []
    MAC_addresses = []
    with open('/var/lib/misc/dnsmasq.leases', 'r') as leases_file:
        # split lines and words to obtain the useful stuff.
        for lines in leases_file:
            fields = lines.strip().split()
            # use logging in future
            print('Photon with MAC: %s has IP address: %s' % (fields[1], fields[2]))
            IP_addresses.append(fields[2])
            MAC_addresses.append(fields[1])
    # Create Thread Pool
    pool = ThreadPool(len(IP_addresses))
    results = pool.map(pollURL, IP_addresses)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()
Problem
The program runs well; however, when I press CTRL + C the program does not terminate. Upon digging I found that the way to do so is using CTRL + \
How do I use this in my pollURL function for a safe way to exit the program, i.e. perform pool.join() so no leftover processes are left?
Notes:
the KeyboardInterrupt is never recognized within the function. Hence I am facing trouble trying to detect CTRL + \.
pollURL is executed in another thread. In Python, signals are handled only in the main thread. Therefore, SIGINT will raise the KeyboardInterrupt only in the main thread.
From the signal documentation:
Signals and threads
Python signal handlers are always executed in the main Python thread, even if the signal was received in another thread. This means that signals can’t be used as a means of inter-thread communication. You can use the synchronization primitives from the threading module instead.
Besides, only the main thread is allowed to set a new signal handler.
You can implement your solution in the following way (pseudocode).
event = threading.Event()

def looping_function( ... ):
    while event.is_set():
        do_your_stuff()

def main():
    try:
        event.set()
        pool = ThreadPool()
        pool.map( ... )
    except KeyboardInterrupt:
        event.clear()
    finally:
        pool.close()
        pool.join()
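Applied to the question's pollURL, that could look roughly like this (a sketch; map_async is used instead of map so the main thread stays free to receive the KeyboardInterrupt, and the response handling is elided):
import threading
import time
import requests
from multiprocessing.dummy import Pool as ThreadPool

run_event = threading.Event()

def pollURL(url_of_photon):
    create_request = 'http://' + url_of_photon + ':80'
    while run_event.is_set():  # exits once main() clears the event
        time.sleep(0.1)
        response = requests.get(create_request)
        # ... handle the response as in the question ...

def main(IP_addresses):
    pool = ThreadPool(len(IP_addresses))
    run_event.set()
    try:
        pool.map_async(pollURL, IP_addresses)
        while True:
            time.sleep(0.5)  # keep the main thread alive so it gets SIGINT
    except KeyboardInterrupt:
        run_event.clear()    # workers notice this and return
    finally:
        pool.close()
        pool.join()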
In my Django application, I want to do some work in the background when a certain view is requested. To that end, I created a multiprocessing.dummy.Pool of workers, and whenever that URL is called, I submit a new task to it. The task to be executed in the background may have to do some retries with a certain timeout between them.
Since this whole thing is executed, so to speak, not on a UI thread, I thought I'd use sleep for the timeouts. When I unit-test this arrangement, everything works fine, but when it runs in Django, the thread gets to the sleep statement and then never wakes up; when I restart the Django app, the thread gets past the sleep statement and is then immediately killed by the restart. I know I could schedule retries using Timers, but I wanted a simpler solution.
Here's a simplified version of my code:
from multiprocessing.dummy import Pool

POOL = Pool(settings.POOL_WORKERS)

def background_task(arg):
    refresh = True
    try:
        for i in range(settings.GET_RETRY_LIMIT):
            status, result = fetch(arg, refresh=refresh)  # `fetch` is a placeholder; the call's name is missing in the original
            refresh = False
            if status is Statuses.OK:
                return result
            if i < settings.GET_RETRY_LIMIT - 1:
                sleep(settings.GET_SLEEP_TIME)
    except Exception as e:
        logging.error(e)
    return []

def do_background_work(arg):
    POOL.apply_async(
        background_task,
        (arg,)
    )

def my_view(request):
    arg = get_arg_from_request(request)
    do_background_work(arg)
    return Response("Ok")
UPD: By the way, it turns out that the workers are most probably killed by Harakiri.
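For reference, the Timer-based retry scheduling mentioned in the question could look roughly like this (a sketch; fetch and handle_result are hypothetical stand-ins, since the original call's name is not shown):
import threading

def background_task(arg, attempt=0, refresh=True):
    status, result = fetch(arg, refresh=refresh)  # `fetch` is a hypothetical stand-in
    if status is Statuses.OK:
        handle_result(result)                     # hypothetical consumer of the result
    elif attempt < settings.GET_RETRY_LIMIT - 1:
        # schedule the retry instead of blocking a pool worker in sleep()
        threading.Timer(settings.GET_SLEEP_TIME, background_task,
                        args=(arg, attempt + 1, False)).start()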
The request handlers are as follows:
class TestHandler(tornado.web.RequestHandler):  # localhost:8888/test
    @tornado.web.asynchronous
    def get(self):
        t = threading.Thread(target=self.newThread)
        t.start()

    def newThread(self):
        print "new thread called, sleeping"
        time.sleep(10)
        self.write("Awake after 10 seconds!")
        self.finish()

class IndexHandler(tornado.web.RequestHandler):  # localhost:8888/
    def get(self):
        self.write("It is not blocked!")
        self.finish()
When I GET localhost:8888/test, the page loads for 10 seconds and then shows Awake after 10 seconds!; while it is loading, if I open localhost:8888/index in a new browser tab, the index page is not blocked and loads instantly. This matches my expectations.
However, while /test is loading, if I open another /test in a new browser tab, it is blocked. The second /test only starts processing after the first has finished.
What mistakes have I made here?
What you are seeing is actually a browser limitation, not an issue with your code. I added some extra logging to your TestHandler to make this clear:
class TestHandler(tornado.web.RequestHandler):  # localhost:8888/test
    @tornado.web.asynchronous
    def get(self):
        print "Thread starting %s" % time.time()
        t = threading.Thread(target=self.newThread)
        t.start()

    def newThread(self):
        print "new thread called, sleeping %s" % time.time()
        time.sleep(10)
        self.write("Awake after 10 seconds! %s" % time.time())
        self.finish()
If I open two curl sessions to localhost/test simultaneously, I get this on the server side:
Thread starting 1402236952.17
new thread called, sleeping 1402236952.17
Thread starting 1402236953.21
new thread called, sleeping 1402236953.21
And this on the client side:
Awake after 10 seconds! 1402236962.18
Awake after 10 seconds! 1402236963.22
That is exactly what you'd expect. However, in Chromium I get the same behavior as you. I think Chromium (perhaps all browsers) will only allow one connection at a time to be opened to the same URL. I confirmed this by making IndexHandler run the same code as TestHandler, except with slightly different log messages. Here's the output when opening two browser windows, one to /test and one to /index:
index Thread starting 1402237590.03
index new thread called, sleeping 1402237590.03
Thread starting 1402237592.19
new thread called, sleeping 1402237592.19
As you can see both ran concurrently without issue.
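For reference, a sketch of the modified IndexHandler described above; the answer only says it ran the same code with different log messages, so this mirrors TestHandler with an "index" prefix:
class IndexHandler(tornado.web.RequestHandler):  # localhost:8888/index
    @tornado.web.asynchronous
    def get(self):
        print "index Thread starting %s" % time.time()
        t = threading.Thread(target=self.newThread)
        t.start()

    def newThread(self):
        print "index new thread called, sleeping %s" % time.time()
        time.sleep(10)
        self.write("index Awake after 10 seconds! %s" % time.time())
        self.finish()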
I think you picked the "wrong" test for checking parallel GET requests, because you're using a blocking function in your test: time.sleep(). Its behavior doesn't really come up when you simply render an HTML page...
What happens is that def get() (which handles all GET requests) is actually blocked while you are in time.sleep: it cannot process any new GET requests, which end up in some kind of "queue".
So if you really want to test sleep(), use the Tornado non-blocking function: tornado.gen.sleep()
Example:
from tornado import gen

class TestHandler(tornado.web.RequestHandler):
    @gen.coroutine
    def get(self):
        yield self.time_wait()

    @gen.coroutine
    def time_wait(self):
        yield gen.sleep(15)
        self.write("done")
Open multiple tabs in your browser, and you'll see that all requests are processed as they arrive, without "queueing" the new requests that come in.
I've seen many topics about this particular problem but I still can't figure out why I'm not catching a SIGINT in my main thread.
Here is my code:
def connect(self, retry=100):
    tries = retry
    logging.info('connecting to %s' % self.path)
    while True:
        try:
            self.sp = serial.Serial(self.path, 115200)
            self.pileMessage = pilemessage.Pilemessage()
            self.pileData = pilemessage.Pilemessage()
            self.reception = reception.Reception(self.sp, self.pileMessage, self.pileData)
            self.reception.start()
            self.collisionlistener = collisionListener.CollisionListener(self)
            self.message = messageThread.Message(self.pileMessage, self.collisionlistener)
            self.datastreaminglistener = dataStreamingListener.DataStreamingListener(self)
            self.datastreaming = dataStreaming.Data(self.pileData, self.datastreaminglistener)
            return
        except serial.serialutil.SerialException:
            logging.info('retrying')
            if not retry:
                raise SpheroError('failed to connect after %d tries' % (tries - retry))
            retry -= 1

def disconnect(self):
    self.reception.stop()
    self.message.stop()
    self.datastreaming.stop()
    while not self.pileData.isEmpty():
        self.pileData.pop()
    self.datastreaminglistener.remove()
    while not self.pileMessage.isEmpty():
        self.pileMessage.pop()
    self.collisionlistener.remove()
    self.sp.close()

if __name__ == '__main__':
    import time
    try:
        logging.getLogger().setLevel(logging.DEBUG)
        s = Sphero("/dev/rfcomm0")
        s.connect()
        s.set_motion_timeout(65525)
        s.set_rgb(0, 255, 0)
        s.set_back_led_output(255)
        s.configure_locator(0, 0)
    except KeyboardInterrupt:
        s.disconnect()
In the main function I call connect(), which launches threads over which I don't have direct control.
When I launch this script I would like to be able to stop it by hitting Control+C, which should call the disconnect() function that stops all the other threads.
In the code I provided it doesn't work because there is no thread in the main function. I already tried putting all the instructions from main in a thread with a while loop, without success.
Is there a simple way to solve my problem?
Thanks
Your indentation is messed up, but there's enough to go on.
Your main thread isn't catching SIGINT because it's not alive. There is nothing that stops your main thread from continuing past the try block, seeing no more code, and closing up shop.
I am not familiar with Sphero. I just attempted to Google its docs and was linked to a bunch of 404 pages, so I'll tell you what you would normally do in a threaded environment: join your threads to the main thread so that the main thread can't finish execution before the worker threads.
for t in my_thread_list:
    t.join()  # main thread can't get past here until all the threads finish
If your Sphero object doesn't provide join-like functionality, you could hack something in that blocks, e.g.
raw_input('Press Enter to disconnect')
s.disconnect()