I have a Django app running as a microservice. It has a function that does the following:
Checks for filenames matching *.json in a directory.
For each file:
processes the JSON data and converts it to XML;
saves the XML in a target directory.
Exits the function.
Is there any reason I shouldn't do this to keep it running on a cycle:
while True:
    main_function()
The previous developer was using threading, which I think makes things far too complicated given there's no need to optimise performance.
You might be looking for this:
http://www.celeryproject.org/
Celery is easy to understand and implement and should not take a lot of time.
Let me know if it helps you.
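To make that concrete, here is a rough sketch of how the JSON-to-XML cycle could run as a scheduled Celery task. Everything in it is an assumption on my part: the broker URL, the directory paths, the naive json_to_xml converter and the Celery 4+ beat_schedule syntax; adapt it to your project.

import glob
import json
import os
import xml.etree.ElementTree as ET

from celery import Celery

# Broker URL is an assumption; point this at whatever broker you already run.
app = Celery('tasks', broker='redis://localhost:6379/0')

def json_to_xml(data):
    # Naive flat conversion: each top-level key becomes a child element.
    root = ET.Element('root')
    for key, value in data.items():
        child = ET.SubElement(root, str(key))
        child.text = str(value)
    return ET.tostring(root, encoding='unicode')

@app.task
def convert_json_files(source_dir='/path/to/json', target_dir='/path/to/xml'):
    for path in glob.glob(os.path.join(source_dir, '*.json')):
        with open(path) as f:
            data = json.load(f)
        name = os.path.splitext(os.path.basename(path))[0] + '.xml'
        with open(os.path.join(target_dir, name), 'w') as out:
            out.write(json_to_xml(data))

# Celery beat runs the task on a schedule (the task name depends on the module name).
app.conf.beat_schedule = {
    'convert-json-every-5-minutes': {
        'task': 'tasks.convert_json_files',
        'schedule': 300.0,   # seconds
    },
}

Start it with a worker plus the beat scheduler, e.g. celery -A tasks worker -B, and you get the same behaviour as the while True loop without tying up a Django process.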
Okay, so basically I am creating a website. The data I need to display on this website is delivered twice daily; I need to read the delivered data from a file and store this new data in the database (replacing the old data).
I have created the Python functions to do this. However, I would like to know what would be the best way to run this script while my Flask application is running. This may have a very simple answer, but I have seen some answers saying to incorporate the script into the website design (though those answers didn't explain how), and others saying to run it separately. The script needs to run automatically throughout the day with no monitoring or input from me.
TIA
Generally it's a really bad idea to have a web server handle such tasks, which in your case means the Flask application. There are many reasons for it, so just to name a few:
Python's Achilles heel - GIL.
Sharing system resources of the application between users and other operations.
Crashes - they happen; it may be unlikely, but it does happen, and if you are not careful the web application goes down along with them.
So with that in mind, I'd advise you to ditch this idea and use a crontab. Basically, write a script that does whatever transformations or operations it needs to do, and create a cron job that runs it at the desired times.
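For example, a bare-bones version of that script might look like the following; the two functions are just stand-ins for the loading and storing code you say you already have, and the paths are made up:

# update_data.py
def read_delivery_file(path):
    # Stand-in: replace with your real parser.
    with open(path) as f:
        return f.read().splitlines()

def replace_database_rows(records):
    # Stand-in: replace with your real database update.
    print('would replace the old data with %d new records' % len(records))

def main():
    records = read_delivery_file('/data/incoming/latest.txt')  # path is an assumption
    replace_database_rows(records)

if __name__ == '__main__':
    main()

Then run crontab -e and add an entry for the two delivery times, for example 0 6,18 * * * /usr/bin/python /path/to/update_data.py. The job then runs with no input from you and completely independently of the Flask process.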
My web app asks users 3 questions and simply writes the answers to a file as a1, a2, a3. I also have a real-time visualization of the average of the data (it reads in real time from the file).
Must I use a database to ensure that no (or minimal) information is lost? Is it possible to produce a queue of reads/writes? (Since the files are small, I am not too worried about the execution time of each call.) Does Python/Flask already take care of this?
I am quite experienced in Python itself, but not in this area (with Flask).
I see a few solutions:
Read /dev/urandom a few times, calculate the SHA-256 of the bytes and use that as the file name; a collision is extremely improbable.
Use Redis and a command like LPUSH (using it from Python is very easy); then RPOP from the right end of the list, and there's your queue.
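A quick sketch of both ideas follows; it assumes the redis-py package and a Redis server on localhost, and the queue name 'answers' is made up:

import hashlib
import json
import os

import redis

# 1) Collision-resistant file name built from random bytes.
filename = hashlib.sha256(os.urandom(32)).hexdigest() + '.txt'

# 2) A simple queue: the Flask view pushes, a separate consumer pops.
r = redis.Redis()
r.lpush('answers', json.dumps({'a1': 'x', 'a2': 'y', 'a3': 'z'}))  # producer side

item = r.rpop('answers')   # consumer side; returns None when the queue is empty
if item is not None:
    answers = json.loads(item)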
I am a postdoc and I just finished a cool little scientific application in Python and want to share it with the world. It's a really useful tool for geneticists.
I'd really like to let people run this program through a CGI form interface. Since I'm not a student anymore, I no longer have webspace with a tidy little cgi-bin subdirectory that's hooked up perfectly.
I wrote a simple CGI Python program a few years ago, and was trying to use this as a template.
Here is my question:
My program needs to create temporary files (when run from the command line it saves images to a given path).
I've read a couple of tutorials on Apache, etc. and got lots of things running, but I can't figure out how to let my program write temporary files (I also don't know where these files would live, etc.). Any time I try to write to a file (in any manner) in my Python program, the CGI script "crashes" and doesn't seem OK.
I am not extremely worried about security because the temporary files will only be outputs of the program (not the user input).
And while I'm asking (I'm assuming you're kind of a CGI ninja if you got this far and weren't bored), do you know how my CGI program can take a file argument without making a temporary file?
My previous approach to this was to simply take a list of text as an argument:
try:
    if item.file:
        # "item" is presumably a cgi.FieldStorage field; read the uploaded contents
        data = item.file.read()
        if check:
            Tools_file.main(["ExeName", "-d", "-w " + data])
        else:
            Tools_file.main(["ExeName", "-s", "-d", "-w " + data])
...
I'd like to do this the right way! Cheers in advance.
Stack overflowingly yours,
Oliver
Well, the "right" way is probably to re-work things using an existing web framework like Django. It's probably overkill in this case. Don't underestimate the security aspects here. They're probably more relevant than you think.
All that said, you probably want to use Python's tempfile module from the standard library:
http://docs.python.org/library/tempfile.html
It'll generally write stuff out to /tmp/whatever if you're on Unix. If your program crashes only when run under Apache (but runs fine when you execute it directly), check your permissions: make sure your Apache user has permission to write to wherever you've decided to store your temp files, and make sure the temp files are written with appropriate permissions (you don't want to write a file that you can't read later on).
As Paul McMillan said, use tempfile:
import os
import tempfile

# mkstemp returns an OS-level file descriptor plus the file's path
temp, temp_filename = tempfile.mkstemp(text=True)
temp_output = os.fdopen(temp, 'w')
temp_output.write(something_or_other)
temp_output.close()
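A related option, just as a sketch and not part of the original answer, is tempfile.NamedTemporaryFile, which gives you a file object and its name in one step:

import tempfile

# delete=False keeps the file on disk after the block exits, so the CGI
# script can refer to it later; remember to clean such files up eventually.
with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False) as out:
    out.write(something_or_other)
    temp_filename = out.name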
My personal opinion is that frameworks are a big time sink unless you really need the prebuilt functionality. CGI is far simpler and can probably work for your application, at least until it gets really popular.
I am using Django and am making some long-running processes that I am just interacting with through my web user interface. For example, they would be running all the time, checking a database value every few minutes and stopping only if that value has changed (a boolean true/false). So, I want to be able to use Django to interact with these, but I am unsure of the way to do this. When I used PHP I had a method for doing this; I figure it would be even easier in Python, but I am not able to find anything on this with my searches.
Basically, all I want to be able to do is execute Python code without waiting for it to finish, so it just begins executing, then goes on to do whatever else it needs to for Django, quickly returning a new page to the user.
I know that there are ways to call an external program, so I suppose that may be the only way to go? Is there a way to do this by just calling other Python code?
Thanks for any advice.
Can't vouch for it because I haven't used it yet, but "Celery" does pretty much what you're asking for and was originally built specifically for Django.
http://celeryproject.org/
Their example shows a simple task adding two numbers:
from celery.decorators import task

@task
def add(x, y):
    return x + y
You can execute the task in the background, or wait for it to finish:
>>> result = add.delay(8, 8)
>>> result.wait() # wait for and return the result
16
You'll probably need to install RabbitMQ as well to get it working, so it might be a more complicated solution than you're looking for, but it will achieve your goals.
You want an asynchronous message manager. I've got a tutorial on integrating Gearman with Django. Any pickleable Python object can be sent to Gearman, which will do all the work and post the results wherever you want; the tutorial includes examples of posting back to the Django database (it also shows how to use the ORM outside of Django).
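Not from that tutorial, but here is a rough sketch of what the split looks like; it assumes the python-gearman client package and a Gearman server on localhost, and the task name long_check is made up:

import pickle

import gearman

# In the Django view: hand the work off and return to the user immediately.
client = gearman.GearmanClient(['localhost:4730'])
client.submit_job('long_check', pickle.dumps({'record_id': 42}), background=True)

# In a separate worker process: do the actual long-running work.
def long_check(worker, job):
    params = pickle.loads(job.data)
    # ... poll the database here and stop when the boolean flips ...
    return 'done'

worker = gearman.GearmanWorker(['localhost:4730'])
worker.register_task('long_check', long_check)
worker.work()   # blocks, processing jobs as they arrive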
Question for Python 2.6
I would like to create a simple web application which, at a specified time interval, will run a script that modifies data in a database. My problem is the code for the infinite loop, or some other method of achieving this goal. The script should be started only once by the user; subsequent iterations should run automatically, even after the user leaves the application. If someone has an idea for a method of detecting when the app breaks, it would be great to show that too. I think that threads may be the best way to achieve this. Unfortunately, I just started my adventure with Python and don't yet know how to use them.
The application will also have views for showing the database and for controlling the looping script.
Any ideas?
You mentioned that you're using Google App Engine. You can schedule recurring tasks by placing a cron.yaml file in your application folder. The details are here.
Update: It sounds like you're not looking for GAE-specific solutions, so the more general advice I'd give is to use the native scheduling abilities of whatever platform you're using. Cron jobs on a *nix host, scheduled tasks on Windows, cron.yaml on GAE, etc.
In your other comments you've suggested wanting something in Python that doesn't leave your script executing, and I don't think there's any way to do this. Some process has to be responsible for kicking off whatever it is you need done, so either you do it in Python and keep a process executing (even if it's just sleeping), or you use the platform's scheduling tools. The OS is almost guaranteed to do a better job of this than your code.
I think you'd want to use cron. Write your script, and have cron run it every X minutes/hours.
If you really want to do this in Python, you can do something like this:
from time import sleep

while True:
    # <your app logic here>
    sleep(TIME_INTERVAL)
Can you use cron to schedule the job to run at certain intervals? It's usually considered better than infinite loops, and was designed to help solve this sort of problem.
There's a very primitive cron in the Python standard library: import sched. There's also threading.Timer.
But as others say, you probably should just use the real cron.
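For reference, a minimal sketch of those two standard-library options; the 600-second interval and the job body are placeholders:

import sched
import threading
import time

def job():
    print('running the database update...')   # placeholder work

# sched: run job() once, 600 seconds from now, inside the current process.
scheduler = sched.scheduler(time.time, time.sleep)
scheduler.enter(600, 1, job, ())
scheduler.run()   # blocks until the scheduled event has fired

# threading.Timer: run job() once after 600 seconds without blocking;
# re-create the timer inside job() if it should repeat.
timer = threading.Timer(600, job)
timer.start()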