What is the best way to load data into a Django model from an external source?
For example, I have a model Run, and the run data lives in an XML file that changes weekly.
Should I create a view and call that view's URL from a curl cron job (with the advantage that the data can be loaded at any time, not only when the cron job runs), or create a Python script and install that script as a cron job (with the DJANGO_SETTINGS_MODULE variable set before executing the script)?
There is an excellent way to run maintenance-like jobs in the project environment: write a custom manage.py command. It picks up all the environment configuration for you, so you can concentrate on the concrete task.
And of course, call it directly from cron.
You don't need to create a view; just run a Python script with the appropriate Django settings configured. Then use your models directly, the way you would in a view: process your data, populate the model instances, and call .save() on each one to write it to the database.
I've used cron to update my DB using both a script and a view. From cron's point of view it doesn't really matter which one you choose. As you've noted, though, it's hard to beat the simplicity of firing up a browser and hitting a URL if you ever want to update at a non-scheduled interval.
If you go the view route, it might be worth considering a view that accepts the XML file itself via an HTTP POST. If that makes sense for your data (you don't give much information about that XML file), it would still work from cron, but could also accept an upload from a browser -- potentially letting the person who produces the XML file update the DB by themselves. That's a big win if you're not the one making the XML file, which is usually the case in my experience.
"create a python script and install that script as a cron (with DJANGO_SETTINGS_MODULE variable setup before executing the script)?"
First, be sure to declare your Forms in a separate module (e.g. forms.py).
Then, you can write batch loaders that look like this. (We have a LOT of these.)
import xml.etree.ElementTree as ET

from myapp.forms import MyObjectLoadForm
from myapp.models import MyObject

def xmlToDict( element ):
    # Map one XML element to the dict of field values the form expects.
    return dict(
        field1= element.findtext('tag1'),
        field2= element.findtext('tag2'),
    )

def loadRow( aDict ):
    # Validate through the same Form the views use; invalid rows are skipped.
    f= MyObjectLoadForm( aDict )
    if f.is_valid():
        f.save()

def parseAndLoad( someFile ):
    doc= ET.parse( someFile ).getroot()
    # iter() replaces getiterator(), which was removed in Python 3.9
    for tag in doc.iter( "someTag" ):
        loadRow( xmlToDict(tag) )
Note that there is very little unique processing here -- it just uses the same Form and Model as your view functions.
We put these batch scripts in with our Django application, since they depend on the application's models.py and forms.py.
The only "interesting" part is transforming your XML row into a dictionary so that it works seamlessly with Django's forms. Other than that, this command-line program uses all the same Django components as your view.
You'll probably want to add options parsing and logging to make a complete command-line app out of this. You'll also notice that much of the logic is generic -- only the xmlToDict function is truly unique. We call these "Builders" and have a class hierarchy so that our Builders are all polymorphic mappings from our source documents to Python dictionaries.
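The "Builder" idea could look roughly like the sketch below. It is not our actual class hierarchy; the class names, the run tag, and the name and distance fields are invented for illustration. Each subclass supplies only the element-to-dictionary mapping, and the base class does the generic iteration:

```python
import xml.etree.ElementTree as ET


class Builder:
    """Base class: maps each matching source element to a dict of form fields."""

    tag = None  # element name to iterate over; set by subclasses

    def build(self, element):
        raise NotImplementedError

    def rows(self, source):
        # source may be a filename or an open file object
        root = ET.parse(source).getroot()
        for element in root.iter(self.tag):
            yield self.build(element)


class RunBuilder(Builder):
    """Hypothetical builder for <run> elements."""

    tag = "run"

    def build(self, element):
        return {
            "name": element.findtext("name"),
            "distance": element.findtext("distance"),
        }
```

A loader then only needs a loop such as for row in RunBuilder().rows(someFile): loadRow(row), and a new document type means one new subclass.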
Related
I have a Django project which parses XML feeds. The thing is that I often need to write a custom script to parse or download a new feed/s for a new client.
I don't think the best way is to modify Django code every time I need a custom parsing/downloading pipeline.
My idea is to create a standard. For example, I upload myscript.py through the Admin; it must define a class Downloader() with a method download(self), and this method will be called every time that source has to be downloaded.
So what is the most common way to do that?
The only thing that comes to mind is to upload a myscript.py that has an if __name__ == '__main__' block and call it with, for example, Popen('python filename.py')
or
from subprocess import call
call(["python", "filename.py"])
model:
class Source(..):
    custom_downloader = FileField(... # if needed
    ...
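One way the Downloader convention described above could be wired up without spawning a subprocess is to import the uploaded file with importlib. This is only a sketch: load_downloader and the module name custom_downloader are made up here, and it assumes the uploaded file has a .py extension.

```python
import importlib.util


def load_downloader(path):
    """Import the file at `path` and return an instance of its Downloader class."""
    spec = importlib.util.spec_from_file_location("custom_downloader", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the uploaded file's top-level code
    return module.Downloader()
```

With the model above, the call would be something like load_downloader(source.custom_downloader.path).download(). Either approach executes uploaded code with the server's privileges, so uploads would have to be restricted to trusted users.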
Before I ask my question I need to give some context:
I wrote a simple Python script that reads Linux's syslog file and searches for certain strings. I have other similar scripts (scripts that do file system stuff, scripts that interact with other servers, and so on). Most of these scripts simply write to stdout.
I would like to port these scripts to a web server so I could simply browse to https://server/syslog and get the same output that I would get by running the script on the command line.
According to my research, Django seems to be a great choice. I followed some Django tutorials and was able to develop some basic Django web apps.
My question is: since Django does not have a "controller", where should I place the scripts' code? My best bet is the view, but according to Django's documentation that does not make sense.
Extracted from django doc: In our interpretation of MVC, the “view” describes the data that gets presented to the user. It’s not necessarily how the data looks, but which data is presented. The view describes which data you see, not how you see it. It’s a subtle distinction.
The description of MVC is not so important. The typical use of django is for database backed web applications. And this describes a design pattern or paradigm for that. It's completely possible to use django in other ways as well.
If you want to build a django app that is a web interface for your existing scripts, you might not even need the django ORM at all. In any case, you can put as much or as little logic in your view as you want. Your use case might just not fit neatly into the MVC or MVT paradigm. Django views are just python functions (or classes, but Django class based views are more tightly coupled with the ORM).
I would recommend:
Leave your scripts largely as they are, but wrap the parts you want to reuse as functions. You can keep them working as standalone scripts with an if __name__=='__main__': block that calls the functions.
Import the functions to views.py - it doesn't matter where they are as long as your server will always be able to find them. I put mine right in the app directory.
Call the function(s) in your view(s), pass the text to an HttpResponse object, and return that from the view. (I think this is more direct than creating a template and a context and calling render, but it's not what I usually do, so there may be some issues?)
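The script side of these recommendations might look like the sketch below. The function name search_log and the file name syslog_search.py are placeholders, and the real scripts will of course do more than substring matching:

```python
import sys


def search_log(path, needle):
    """Return the lines of `path` that contain `needle`, joined as one string."""
    with open(path, errors="replace") as fh:
        return "".join(line for line in fh if needle in line)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Still usable from the command line:
    #   python syslog_search.py /var/log/syslog error
    sys.stdout.write(search_log(sys.argv[1], sys.argv[2]))
```

The view then reduces to importing search_log and returning HttpResponse(search_log("/var/log/syslog", "error"), content_type="text/plain").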
This code is a bit old, but it should give you enough of an idea to get started: see https://github.com/alex2/django_logtail (Django_LogTail).
I have a Python script that takes an input from one model, queries it, and appends the various results of the query to another model through a ForeignKey relationship. It works great from the Python shell, but I was wondering if there is a way to run it from the admin webpage, so that every time a new object for the first model is submitted, it runs the script and updates the database automatically for the other model. I'm using the Django admin interface as part of development for staff to do data entry, since I've found it's a very flexible interface. The script is written specifically for this app, so it is in the app's folder.
I was surprised that this wasn't already answered.
Either wrap the existing script as a management command, or import it into a management command.
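Once you've done that, you can override the Admin view in question, like this:
from django.contrib.admin import AdminSite
from django.utils.decorators import method_decorator
from django.views.decorators.cache import never_cache

class MyAdminSite(AdminSite):
    # method_decorator adapts the function decorator never_cache to a method
    @method_decorator(never_cache)
    def index(self, request, extra_context=None):
        # do stuff, then fall back to the stock index page
        return super().index(request, extra_context)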
Then, you create an instance of this class, and use this instance, rather than admin.site to register your models.
admin_site = MyAdminSite()
Then, later:
from django.contrib.admin import ModelAdmin
from somewhere import admin_site

class MyModelAdmin(ModelAdmin):
    ...

admin_site.register(MyModel, MyModelAdmin)
Lastly, in that overriding view, you can use django.core.management.call_command to call the management command. This lets you use it both from the command line and from inside your code, and if you need to, you can schedule it from cron, too. :)
My project's URLs are automatically generated in urls.py using a for loop (the URLs look like AppName/ViewName). According to the docs, urls.py is loaded upon every request. This appears to be slowing my site down since it requires a bunch of introspection code, so I want to generate the URLs less frequently. I could of course manually run a script to re-generate urls.py (or a file imported by urls.py) as needed, but would prefer if it happened automatically as part of project validation/startup (like the server starting up or the database being synced). I'm open-sourcing this project, and many people will be running it on their own servers, so I want to do this in a robust way. Any recommendations?
The docs do not say what you claim they do (or rather, you're reading too much into a phrase which only means "loads that Python module, if it has not already been loaded").
Generally, the only things that happen on every request are running the middleware and the specific view code associated with that request. Even then, nothing is reloaded on every request. The URLconf, like all Python code, is only loaded when a new process is started, and when that happens depends on your server setup. Your problem is elsewhere: you should profile your application carefully to find out exactly where.
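As a starting point for that profiling, the standard library's cProfile can be wrapped around any suspect function. This is a generic sketch; profile_call is a helper name invented here, not part of Django:

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()
```

Wrapping the function that builds the URL list, and separately a typical view, should show whether the introspection code is really where the time goes.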
For example, you can look at django-json-rpc, where the author has implemented URL generation via decorators. There is a main controller that receives all requests, plus a urls dict of the form {'pattern': method}. The urls dict is filled automatically by decorators such as @jsonrpc_method, which receive a function and add it to urls.
I think it should run faster than the for loop, and I believe the same approach could be applied to django.urls.
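The general pattern can be sketched in a few lines. This is not django-json-rpc's actual code; URLS, route, and dispatch are names made up for the illustration:

```python
URLS = {}  # maps a URL pattern to the function that handles it


def route(pattern):
    """Decorator that records the function under `pattern` at import time."""
    def register(func):
        URLS[pattern] = func
        return func
    return register


@route("reports/summary")
def summary(request):
    return "summary for %s" % request


def dispatch(path, request):
    """Single entry point: look the path up in the registry and call the handler."""
    try:
        handler = URLS[path]
    except KeyError:
        raise LookupError("no handler for %r" % path)
    return handler(request)
```

The registry is filled once, when the modules containing the decorated functions are imported, so no introspection loop is needed afterwards.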
I've written some python code to accomplish a task. Currently, there are 4-5 classes that I'm storing in separate files. I'd now like to change this whole thing into a database-backed web app. I've been reading tutorials on Django, and so far I get the impression that I'll need to manually specify the fields and their types for every "model" that I use. This is a little surprising to me, since I was expecting some kind of ORM capability that would just take the existing classes I've already defined, and map them onto a database somehow, in a manner abstracted away from me.
Is this not the case? Am I missing something? It looks like I need to specify all the fields and types in the file 'models.py'.
Okay, now beyond those specifics, does anyone have any general tips on the best way to migrate an object-oriented desktop application to a web application?
Thanks!
That is Django's ORM: it maps classes to tables. What else did you expect? There needs to be some way of specifying what the fields are, though, before you can use them, and that's managed through the models.Model class and the various models.Field subclasses. You can certainly use your classes as mixins in order to use the existing business logic on top of the field definitions.
If you are thinking about a database-backed web app, you have to specify which fields of the data you want to store and what type of value each one holds.
There is an abstraction that introspects the database and converts it into the Django models.py format. But I don't know of any that introspects a Python class and stores arbitrary data in the database. How would that even work? Are the objects currently stored as pickles?
You're going to have to check the output, but you can have Django automatically create models from existing databases through one-time introspection.
Taken from the link below, you would set up your database in settings.py, and then call
python manage.py inspectdb
This will dump a sample models.py file to standard output for your inspection. To create the file, simply redirect the output:
python manage.py inspectdb > models.py
For more, see:
http://docs.djangoproject.com/en/dev/howto/legacy-databases/?from=olddocs#auto-generate-the-models