I have an app that can generate gibberish model instances on cue. Instead of static initial_data fixtures, I would like something like the following to run during migrations:
if IS_DEBUG and not IS_TEST:
    for _ in xrange(100):
        create_post()
        create_user()
I know how to load static fixtures in Django, but I would like more control over the fake data I load into my application.
Just a note - this is not for unit/func/anything tests, but to populate the database to be able to browse the development version of a website and look around.
What is the best way to accomplish this? A custom South migration?
UPDATE: One thing that concerns me with South migrations: if I change a model in a migration that comes after the custom data-populating migration, things get messy. For example:
0001_initial
0002_generate_fixtures
0003_add_field_to_a_model
So now I'll have to remove 0002_generate_fixtures and create a new migration, 0004_generate_fixtures. That gets messy very quickly.
Serafeim's and nieka's answers have been very helpful, at the end of the day I decided to go with a custom management command in app_name/management/commands/populatedb.py.
Now, I can simply run python manage.py populatedb to dynamically populate my database.
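For reference, a minimal sketch of what that command can look like (create_post/create_user are the generator helpers from the question, their import path is an assumption, and IS_DEBUG/IS_TEST are the custom settings used above):

# app_name/management/commands/populatedb.py
from django.conf import settings
from django.core.management.base import BaseCommand
from app_name.generators import create_post, create_user  # assumed location of the helpers

class Command(BaseCommand):
    help = 'Populates the database with generated sample data.'

    def handle(self, *args, **options):
        if settings.IS_DEBUG and not settings.IS_TEST:
            for _ in xrange(100):
                create_post()
                create_user()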
I've used factory_boy (https://github.com/rbarrois/factory_boy) with great success to create custom fixtures (for tests or just for populating the database).
Copying from the project's README:
factory_boy is a fixtures replacement based on thoughtbot's
factory_girl.
Its features include:
- Straightforward syntax
- Support for multiple build strategies (saved/unsaved instances, attribute dicts, stubbed objects)
- Powerful helpers for common cases (sequences, sub-factories, reverse dependencies, circular factories, ...)
- Multiple factories per class support, including inheritance
- Support for various ORMs (currently Django, Mogo, SQLAlchemy)
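To give a taste of the syntax, a factory might look like this (a sketch only; the exact declaration style depends on your factory_boy version, and the Post model and its title field are assumptions):

import factory
from myapp.models import Post  # assumed model

class PostFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Post

    title = factory.Sequence(lambda n: 'Post #%d' % n)

# create 100 saved Post instances
posts = PostFactory.create_batch(100)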
I think the best way is to write a simple custom script to fill the database. Don't use South for this purpose.
Simply run your custom script whenever you want to fill the database:
fill_db.py:
from django.conf import settings
from django.contrib.auth.models import User
from myapp.models import Post  # assumed import; point this at your own models

def fill_db():
    if settings.IS_DEBUG and not settings.IS_TEST:
        for i in xrange(100):
            # pass whatever field values your models require
            Post.objects.create()
            User.objects.create()
You can also add this script as a Django management command:

from django.core.management.base import BaseCommand
from fill_db import fill_db  # adjust the import to wherever fill_db.py lives

class Command(BaseCommand):
    def handle(self, *args, **options):
        fill_db()
Related
Every Django tutorial/book I've seen approaches Django projects from what I am going to characterize as an ad-hoc database design method. A project ends up being a bunch of little apps, each with its own models, views, etc.
I am trying to locate a resource that covers how to structure a Django project when you do, in fact, start with a traditional DB design process.
For example, let's say that the starting point is an existing schema (the original post included an ER diagram here), along with a document describing it:
LEAD Database Guide
How does one approach this so that the various Django apps access a single "global" (bad word) db model that covers the entire schema rather than a bunch of models spread across a bunch of apps?
Are there any resources (books, PDFs, tutorials) that cover this approach rather than the piecemeal approach most commonly seen?
A corollary to this question might be: Is there a, perhaps automated, way to go from an SQL (or MySQL Workbench) schema definition to the equivalent Django ORM?
You can use the inspectdb command to generate models.py from an existing database:
http://docs.djangoproject.com/en/dev/ref/django-admin/#inspectdb
Then, you can treat the models as a single app, and add more features/tables into new apps in the future.
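For a rough idea, inspectdb emits a model class like this for each legacy table (the names here are made up, and the exact output varies by Django version):

class LeadContact(models.Model):
    first_name = models.CharField(max_length=45, db_column='FirstName')
    last_name = models.CharField(max_length=45, db_column='LastName')

    class Meta:
        db_table = 'lead_contact'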
I have a few databases in MongoDB that I want to create models for dynamically, since there are many databases and I cannot do it manually. Questions:
What should my models.py look like? (Does inspectdb work with mongodb databases or only SQL based dbs?)
Since the database models are created dynamically, how do I code the serializer class to return the dynamic fields?
Thanks in advance.
Django ships with an object-relational mapper that is aimed at traditional relational databases. While there are a number of MongoDB packages for Django, none of them support inspectdb to construct your models. Either way, inspectdb is a kludge designed as a one-off process to help a one-off migration away from a legacy system, i.e. you'd build your models.py file once and never run inspectdb again. This is not what you want, as you seem to want dynamic models that can be added or altered at runtime.
On the bright side, Django MongoDB Engine has some support for arbitrary embedded models within pre-defined models. But even then they don't seem too supportive of it:
As you can see, generic embedded models add a lot of overhead that bloats up your data records. If you want to use them anyway, here’s how you’d do it...
In summary, try to build your models as best you can to actually match your requirements. If you know nothing about your models ahead of production, then perhaps Django isn't the right solution for you.
The Django project I'm working on has a ton of initial_data fixture data. It seems that, by default, the only way to have data loaded automatically is to have a fixtures folder in your app, containing a file named initial_data.ext (ext being xml, json, yaml, or the like).
This is really inflexible, I think. I'd rather have a fixtures folder, and inside that an initial_data folder, and inside there one file for each model in that app. Or something to that effect. Is this possible to do in Django now? Or maybe some other scheme of better fixture organization?
In my experience, hard-coded fixtures are a pain to write and a pain to maintain. Whenever a model change breaks a fixture, the Django initial load returns a very unfriendly error message, and you end up adding a bunch of print statements in the Django core to find where the problem is coming from.
One of the developers I work with has written a very good library to overcome this problem. It's called django-dynamic-fixture and we really love it. Here is how it works:
Suppose you have these models:

class Author(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    author = models.ForeignKey(Author)  # required by default
    title = models.CharField(max_length=100)
In order to create a book instance in your test, all you have to do is:

from django.test import TestCase
from django_dynamic_fixture import get
from app.models import Book

class MyTest(TestCase):
    def setUp(self):
        self.book = get(Book)
django-dynamic-fixture automatically creates any dependencies required for the Book instance to exist (here, the Author). This is just a simple example, but the library can handle very complex model structures.
You can reorganize your initial data fixtures however you want and then write a post_syncdb signal handler that loads them. They will then be loaded automatically on syncdb, according to logic you define.
See: https://docs.djangoproject.com/en/1.3/ref/signals/#post-syncdb
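A sketch of such a handler (the fixture names and module layout are assumptions; the handler runs whenever syncdb processes your app):

# myapp/management/__init__.py
from django.core.management import call_command
from django.db.models.signals import post_syncdb
import myapp.models

def load_my_fixtures(sender, **kwargs):
    # load your reorganized fixture files, in whatever order you define
    call_command('loaddata', 'initial_data/authors.json', 'initial_data/books.json')

post_syncdb.connect(load_my_fixtures, sender=myapp.models)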
Yes, you can split fixtures into multiple files with subfolder structures. You can specify fixture files to load and create a script that loads some or all of them. I have done this before so can confirm that it works.
Example: django-admin.py loaddata application/module/model.json
See loaddata documentation for more information.
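A loader script can then be as simple as this sketch (the fixture paths are made up, and Django settings must already be configured when it runs):

from django.core.management import call_command

FIXTURES = [
    'application/module/users.json',
    'application/module/posts.json',
]

for fixture in FIXTURES:
    call_command('loaddata', fixture)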
A hacky way to load an additional initial_data.json or two is to create extra empty apps within your Django project that contain nothing but a fixtures folder and an initial_data.json file. If you need that fixture loaded before the other apps' fixtures, you could name the app something like aa1; if you need another one, name it aa2. Your directory structure would look like this:
aa1/
    fixtures/
        initial_data.json
aa2/
    fixtures/
        initial_data.json
myrealapp/
    fixtures/
        initial_data.json
...
You would need to add the apps to INSTALLED_APPS in settings.py.
You can then populate the initial_data.json files with arbitrary app data as necessary:
(virtualenv) ./manage.py dumpdata --indent=4 auth > aa1/fixtures/initial_data.json
(virtualenv) ./manage.py dumpdata --indent=4 oauth2 > aa2/fixtures/initial_data.json
(virtualenv) ./manage.py dumpdata --indent=4 myrealapp > myrealapp/fixtures/initial_data.json
When you run python manage.py syncdb, each of the fixtures will be loaded automatically, in alphabetical order.
As I mentioned, this is quite hacky, but if you only need a couple extra initial_data.json files and need to be able to control the order they are loaded in, this works.
I've written some python code to accomplish a task. Currently, there are 4-5 classes that I'm storing in separate files. I'd now like to change this whole thing into a database-backed web app. I've been reading tutorials on Django, and so far I get the impression that I'll need to manually specify the fields and their types for every "model" that I use. This is a little surprising to me, since I was expecting some kind of ORM capability that would just take the existing classes I've already defined, and map them onto a database somehow, in a manner abstracted away from me.
Is this not the case? Am I missing something? It looks like I need to specify all the fields and types in the file 'models.py'.
Okay, now beyond those specifics, does anyone have any general tips on the best way to migrate an object-oriented desktop application to a web application?
Thanks!
That is Django's ORM: it maps classes to tables. What else did you expect? There needs to be some way of specifying what the fields are, though, before you can use them, and that's managed through the models.Model class and the various models.Field subclasses. You can certainly use your classes as mixins in order to use the existing business logic on top of the field definitions.
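A sketch of that mixin idea (all names here are illustrative): keep the existing business logic in a plain class, and declare the fields in a Django model that inherits from it.

from django.db import models

class PricingLogic(object):
    # business logic carried over from the desktop app, unchanged
    def discounted_price(self, rate=0.1):
        return self.price * (1 - rate)

class Product(PricingLogic, models.Model):
    name = models.CharField(max_length=100)
    price = models.FloatField()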
If you are thinking about a database-backed web app, you have to specify which fields of the data you want stored and what type each value is.
There are tools that introspect a db and convert it into Django's models.py format, but I know of none that introspects a Python class and stores arbitrary data in a db. How would that even work? Would the objects be stored as pickles?
You're going to have to check the output, but you can have Django automatically create models from existing databases through one-time introspection.
Taken from the link below, you would set up your database in settings.py, and then call
python manage.py inspectdb
This will dump the sample models.py file to standard out for your inspection. In order to create the file, simply redirect the output
python manage.py inspectdb > models.py
See for more:
http://docs.djangoproject.com/en/dev/howto/legacy-databases/?from=olddocs#auto-generate-the-models
What is the best way to load data into a Django model from an external source?
For example, I have a model Run, and run data in an XML file that changes weekly.
Should I create a view and call that view's URL from a curl cronjob (with the advantage that the data can be loaded anytime, not only when the cronjob runs), or create a Python script and install it as a cron job (with the DJANGO_SETTINGS_MODULE variable set up before executing the script)?
There is an excellent way to do maintenance-like jobs in a project environment: write a custom manage.py command. It picks up all the environment configuration and lets you concentrate on the concrete task.
And of course you can call it directly from cron.
You don't need to create a view, you should just trigger a python script with the appropriate Django environment settings configured. Then call your models directly the way you would if you were using a view, process your data, add it to your model, then .save() the model to the database.
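A minimal sketch of such a script (the Run model, its fields, and the settings path are assumptions; newer Django versions would also call django.setup()):

import os
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

from myapp.models import Run

def import_runs(rows):
    # each row is a dict of field values parsed from the weekly XML file
    for row in rows:
        run = Run(**row)
        run.save()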
I've used cron to update my DB using both a script and a view. From cron's point of view it doesn't really matter which one you choose. As you've noted, though, it's hard to beat the simplicity of firing up a browser and hitting a URL if you ever want to update at a non-scheduled interval.
If you go the view route, it might be worth considering a view that accepts the XML file itself via an HTTP POST. If that makes sense for your data (you don't give much information about that XML file), it would still work from cron, but could also accept an upload from a browser -- potentially letting the person who produces the XML file update the DB by themselves. That's a big win if you're not the one making the XML file, which is usually the case in my experience.
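A rough sketch of such a view (parse_runs is a hypothetical helper for your XML format; request.raw_post_data was renamed to request.body in later Django versions):

from django.http import HttpResponse, HttpResponseNotAllowed
from myapp.models import Run

def upload_runs(request):
    if request.method != 'POST':
        return HttpResponseNotAllowed(['POST'])
    for row in parse_runs(request.raw_post_data):  # hypothetical XML-to-dict parser
        Run.objects.create(**row)
    return HttpResponse('OK')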
"create a python script and install that script as a cron (with DJANGO _SETTINGS _MODULE variable setup before executing the script)?"
First, be sure to declare your Forms in a separate module (e.g. forms.py)
Then, you can write batch loaders that look like this. (We have a LOT of these.)
from myapp.forms import MyObjectLoadForm
from myapp.models import MyObject
import xml.etree.ElementTree as ET

def xmlToDict(element):
    return dict(
        field1=element.findtext('tag1'),
        field2=element.findtext('tag2'),
    )

def loadRow(aDict):
    # validate and save through the same Form your views use
    f = MyObjectLoadForm(aDict)
    if f.is_valid():
        f.save()

def parseAndLoad(someFile):
    doc = ET.parse(someFile).getroot()
    for tag in doc.getiterator("someTag"):
        loadRow(xmlToDict(tag))
Note that there is very little unique processing here -- it just uses the same Form and Model as your view functions.
We put these batch scripts in with our Django application, since it depends on the application's models.py and forms.py.
The only "interesting" part is transforming your XML row into a dictionary so that it works seamlessly with Django's forms. Other than that, this command-line program uses all the same Django components as your view.
You'll probably want to add options parsing and logging to make a complete command-line app out of this. You'll also notice that much of the logic is generic -- only the xmlToDict function is truly unique. We call these "Builders" and have a class hierarchy so that our Builders are all polymorphic mappings from our source documents to Python dictionaries.
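A sketch of what such a Builder hierarchy might look like (an illustration of the idea described above, not the author's actual code):

class Builder(object):
    """Polymorphic mapping from a source-document row to a Python dictionary."""
    def build(self, row):
        raise NotImplementedError

class MyObjectXMLBuilder(Builder):
    def build(self, element):
        return dict(
            field1=element.findtext('tag1'),
            field2=element.findtext('tag2'),
        )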