I have the following task:
- description: Task is run every 2 hours
url: /task
schedule: every 2 hours synchronized
Is there any way to know in my python code when it was run last time (date and time)?
The idea I have now is to store that in memcache when it is finished to run. But is there any other better way?
PS. Another problem I have is with the case when memcache is empty. How can I calculate when last run was?
You can't, not without storing the information of the last run somewhere. Cron, by itself, does not track that for you.
Why not add an extra bit of code to the file that is ran that creates a log, like a text file?
Related
How can I refresh my python file in every five minutes in django while running because every hour the data I'm webscraping is changing and I need to change the value of the variable?
You need is a task that runs periodically and a cron job solve that. I recomment to you take a look to django cron or Celerity, both are excelent options to create scheduled tasks.
I recommend using database like sqlite3 is better solution than restart django(web service) per hours.
You can store some datas in database and django can get them like using variables.
The real problem here is that you're fetching the data at the start of the application and keeping it in memory. I see 2 possible better methods:
move the data scraping code to your view function. This means you'll re-scrape on every call ensuring you'll always have the freshest data but at the cost of speed (the time it takes to make the request to your target url).
better yet: same as above except you cache the results locally. This could also be kept in memory (although I'd use a file or database if you're running multiple django app instances to ensure they're all using the same data). With the least amount of change to what you have already, an in memory cache could be achieved with simply adding a timestamp variable that gets the current time on each fetch. If last fetch was more than X minutes ago: refetch your data.
I have few cronjobs running with the help of django-crontab. Let us take one cronjob as an example, suppose this job A is scheduled to run every two minutes.
However, while the job is running and if it is not finished in two minutes, I do not want another instance of this job to execute.
Exploring few resources, I came across this article, but I am not sure where to fit this in.
https://bencane.com/2015/09/22/preventing-duplicate-cron-job-executions/
Did someone already came across this issue? How did you fix it?
According to the readme, you should be able to set:
CRONTAB_LOCK_JOBS = True
in your Django settings. That will prevent a new job instance from starting if a previous one is still running.
Summary: I have a python script which collects tweets using Twitter API and i have postgreSQL database in the backend which collects all the streamed tweets. I have custom code which overcomes the ratelimit issue and i made it to run 24/7 for months.
Issue: Sometimes streaming breaks and sleeps for given secs but it is not helpful. I do not want to check it manually.
def on_error(self,status)://tweepy method
self.mailMeIfError(['me <me#localhost'],'listen.py <root#localhost>','Error Occured on_error method',str(error))
time.sleep(300)
return True
Assume mailMeIfError is a method which takes care of sending me a mail.
I want a simple cron script which always checks the process and restart the python script if not running/error/breaks. I have gone through some answers from stackoverflow where they have used Process ID. In my case process ID still exists because this script sleeps if Error.
Thanks in advance.
Using Process ID is much easier and safer. Try using watchdog.
This can all be done in your one script. Cron would need to be configured to start your script periodically, say every minute. The start of your script then just needs to determine if it is the only copy of itself running on the machine. If it spots that another copy is running, it just silently terminates. Else it continues to run.
This behaviour is called a Singleton pattern. There are a number of ways to achieve this for example Python: single instance of program
I'm fairly competent with Python but I've never 'uploaded code' to a server before and have it run automatically.
I'm working on a project that would require some code to be running 24/7. At certain points of the day, if a criteria is met, a process is started. For example: a database may contain records of what time each user wants to receive a daily newsletter (for some subjective reason) - the code would at the right time of day send the newsletter to the correct person. But of course, all of this is running out on a Cloud server.
Any help would be appreciated - even correcting my entire formulation of the problem! If you know how to do this in any other language - please reply with your solutions!
Thanks!
Here are two approaches to this problem, both of which require shell access to the cloud server.
Write the program to handle the scheduling itself. For example, sleep and wake up every few miliseconds to perform the necessary checks. You would then transfer this file to the server using a tool like scp, login, and start it in the background using something like python myscript.py &.
Write the program to do a single run only, and use the scheduling tool cron to start it up every minute of the day.
Took a few days but I finally got a way to work this out. The most practical way to get this working is to use a VPS that runs the script. The confusing part of my code was that each user would activate the script at a different time for themselves. To do this, say at midnight, the VPS runs the python script (using scheduled tasking or something similar) and runs the script. the script would then pull times from a database and process the code at those times outlined.
Thanks for your time anyways!
How do I schedule a task to run once every six hours (on repeat)?
I am trying to implement a Redis queue for the first time.
I went through Heroku's tutorial : https://devcenter.heroku.com/articles/python-rq
But the tutorial did not explain how to run a task repeatedly with a timeframe (such as checking a couple of websites for info, once every six hours)
Also, since I am new to do this, if I should not be using Redis for such a task, please let me know what I should be using to check a couple of websites for info once every six hours
Thanks
You don't need Redis for this functionality at all.
Take a look at the Heroku Scheduler here: https://devcenter.heroku.com/articles/scheduler
You can set this to run your code every hour, and have your code check if the current hour is 0,5,11,17 (or whatever other interval you may need).