I have a model that has just one column right now, description, which populates an area on a few different pages of the site. The client wants this split into a couple of different columns so they can show different values in certain parts of the site. So I have to change that description column into frontpage_description and resourcepage_description. I am trying to come up with a way to do this in South so that the value of the description column becomes the (initial) default value for both of the "new" columns. This is what I have so far:
# file: project/app/xxxx_mymigration.py
import datetime
from south.db import db
from south.v2 import SchemaMigration
from django.db import models

class Migration(SchemaMigration):

    def forwards(self, orm):
        db.rename_column('myapp', 'description', 'frontpage_description')
        db.add_column('myapp', 'resourcepage_description',
                      self.gf('myfields.TextField')(default='CHANGEME'),
                      keep_default=False)

    def backwards(self, orm):
        db.rename_column('myapp', 'frontpage_description', 'description')
        db.delete_column('myapp', 'resourcepage_description')

    models = {
        # ...
The part I am wondering about is the self.gf(...)(default='CHANGEME'): is there a way to set the value of description (or frontpage_description) as the default value for resourcepage_description?
I have looked into the orm parameter, which does allow you to access models during the migration, but all the examples I have come across involve using it to define relations during a migration, and not actually accessing individual records.
Am I just going to have to split this up into a schemamigration and a datamigration?
This is exactly a three-step process: a schema migration to add the new columns, a data migration to populate them, and another schema migration to delete the original description column. 90% of that is generated for you by the ./manage.py commands, so it's not as if this is tragically more work.
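For the middle step, the orm parameter does let you touch individual records. A minimal sketch of such a data migration, assuming the first schema migration added both new columns while leaving description in place, and with MyModel standing in for whatever model actually owns the column:

from south.v2 import DataMigration

class Migration(DataMigration):

    def forwards(self, orm):
        # Copy the old description into both of the new columns.
        for obj in orm.MyModel.objects.all():
            obj.frontpage_description = obj.description
            obj.resourcepage_description = obj.description
            obj.save()

    def backwards(self, orm):
        # Restore description from the front-page copy.
        for obj in orm.MyModel.objects.all():
            obj.description = obj.frontpage_description
            obj.save()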
I am new to Django.
I have changed some fields in my already created Django model, but it shows this message when I try to apply migrations:
It is impossible to add a non-nullable field 'name' to table_name without specifying a default. This is because the database needs something to populate existing rows.
Please select a fix:
1) Provide a one-off default now (will be set on all existing rows with a null value for this column)
2) Quit and manually define a default value in models.py.
Although I have deleted this table's data from the database, I cannot set a default value because the field has to store unique values. Do I need to delete my previous migration file related to that table?
I have applied a data migration, but I'm still getting the same error when applying migrations again:
from django.db import migrations

def add_name_and_teacher(apps, schema_editor):
    Student = apps.get_model('app_name', 'Student')
    Teacher = apps.get_model('app_name', 'Teacher')
    for student in Student.objects.all():
        student.name = 'name'
        student.teacher = Teacher.objects.get(id=1)
        student.save()

class Migration(migrations.Migration):

    dependencies = [
        ('app', '0045_standup_standupupdate'),
    ]

    operations = [
        migrations.RunPython(add_name_and_teacher),
    ]
So, before you had a nullable field "name". This means that it's possible to have null set as that field's value.
If you add a not null constraint to that field (null=False), and you run the migrations, you will get an integrity error from the database because there are rows in that table that have null set as that field's value.
In case you just made two migrations where first you added a nullable field, but then remembered it mustn't be nullable and you added the not null constraint, you should simply revert your migrations and delete the previous migration. It's the cleanest solution.
You can revert by running python manage.py migrate <app name> <the migration that you want to keep>
Then you simply delete the new migrations and run python manage.py makemigrations again.
In case the migration with the nullable field was defined very early on and there is already data there and it's impossible to delete that migration, you will need to figure out how to populate that data. Since you say that there is also the unique constraint, you can't just provide a default because it will cause issues with that constraint.
My suggestion is to edit the migration file and add migrations.RunSQL where you write custom SQL code which will insert values to the field. Make sure you place the RunSQL operation before the operation that adds the not null constraint (it should be AlterField or AddConstraint) as they are run in order.
You could also use migrations.RunPython, but I prefer RunSQL because future changes in the code might break your migrations, which is a hassle to deal with.
Docs for RunSQL
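A minimal sketch of that ordering, assuming the Student table is named app_student, the new column is name, and Postgres-style string concatenation (MySQL would use CONCAT); deriving the value from the primary key keeps it unique:

from django.db import migrations

class Migration(migrations.Migration):

    dependencies = [
        ('app', '0045_standup_standupupdate'),
    ]

    operations = [
        # Give every existing row a distinct value first...
        migrations.RunSQL(
            "UPDATE app_student SET name = 'student-' || id WHERE name IS NULL;",
            reverse_sql=migrations.RunSQL.noop,
        ),
        # ...then the auto-generated AlterField that adds null=False/unique=True
        # goes here, after the RunSQL operation.
    ]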
I need to delete fields from an existing Django model that already has a few objects associated with it. Deleting the fields from models.py gives me an error (obviously, as there are table columns still associated with them). The data in body2 and body3 are not necessary for my app.
I have copied that data from those fields to the body field. How would I go about deleting these fields without dropping the table entirely?
class Post(models.Model):
    # some fields
    body = EditorJsField(editorjs_config=editorjs_config)
    body2 = EditorJsField(editorjs_config=editorjs_config)
    body3 = EditorJsField(editorjs_config=editorjs_config)
I deleted body2 and body3 and ran migrations, and when creating a new object I get errors such as this:
django.db.utils.IntegrityError: null value in column "body2" of relation "second_posts" violates not-null constraint
DETAIL: Failing row contains (20, Wave | Deceptiveness, and unpredictability of nature, 2021-07-19 13:40:32.274815+00, 2021-07-19 13:40:32.274815+00, {"time":1626702023175,"blocks":[{"type":"paragraph","data":{"tex..., null, null, Just how unpredictable is nature? Nature is all around us, yet, ..., image/upload/v1626702035/dfaormaooiaa8felspqd.jpg, wave--deceptiveness-and-unpredictability-of-nature, #66c77c, l, 1, 1, 0).
This is the code that I'm using to save the sanitized data (after I've deleted those fields, of course):
post = Posts.objects.create(
    body=form.cleaned_data.get('body'),
    # ...
)
Since nobody seemed to have an answer, and since it looked like this error was an anomaly, I went the non-Python way and ran SQL queries to drop the columns. For those of you who ran into the same problem:
Warning: you will lose all the data in the fields you delete using this method.
First, make Django aware of the changes
Delete the fields you want removed and run the migrations.
Before
class Post(models.Model):
    # some fields
    body = EditorJsField(editorjs_config=editorjs_config)
    body2 = EditorJsField(editorjs_config=editorjs_config)
    body3 = EditorJsField(editorjs_config=editorjs_config)
After
class Post(models.Model):
    # some fields
    body = EditorJsField(editorjs_config=editorjs_config)
Command Prompt
python manage.py makemigrations
python manage.py migrate
Drop the columns using SQL queries
First, connect to your database (I used Postgres). The name of the table should be something like appname_modelname. My app was called "Second" and the model was Post, so the table was called second_post.
See if the columns still persist after the migrations.
In the SQL command prompt
\d second_post
This should print a description of the table with all the columns listed on the left side. To drop those columns, type:
ALTER TABLE second_post DROP COLUMN body2;
ALTER TABLE second_post DROP COLUMN body3;
After entering each query, the prompt should return a string ALTER TABLE if successful.
If you want to drop the data completely, you need to create a Django migration (using ./manage.py makemigrations preferably) that will remove those columns from the database.
Alternatively, if you want to play it safe and keep the old data, you can first make those fields nullable and create migrations for them; then either simply stop using them, or remove those columns from the model without reflecting that in migrations (though you'll need to fake the removal of those columns in migrations if you ever need to run another migration in this app).
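For the first option (actually dropping the columns with a regular migration), the file makemigrations generates boils down to a couple of RemoveField operations. A sketch, with the app label and dependency assumed from the question:

from django.db import migrations

class Migration(migrations.Migration):

    dependencies = [
        ('second', '0001_initial'),  # assumed: whatever your latest Post migration is
    ]

    operations = [
        migrations.RemoveField(model_name='post', name='body2'),
        migrations.RemoveField(model_name='post', name='body3'),
    ]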
I want to add a new field called color to my already existing model. I want it to be unique, and I also want it to be randomly selected by default. So what I do is:
color = models.CharField(max_length=15, default=random_color, unique=True)
My random_color function looks like this:
def random_color():
    """
    Returns:
        [string] : returns a rgb value in hex string
    """
    while True:
        generated_color = f'#{random_hex()}{random_hex()}{random_hex()}'
        if not MyModel.objects.filter(color=generated_color):
            return generated_color
I followed a similar logic to what has been provided here.
Now the problem with this approach is that there is no color to begin with to look for.
And I also want my migration to add in a bunch of default random color values to my already existing tables.
How do I fix this?
There might be a simpler way to accomplish this, but these steps should work:
1) Add the new color field with null=True and without the default and the unique=True.
2) Run makemigrations for your app.
3) Run makemigrations --empty to create a custom migration for your app. Add a RunPython operation using your random_color() logic to populate the new column (see the sketch after these steps).
4) Alter the color field to add the default, unique constraint, and remove the null=True.
5) Run makemigrations again.
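A rough sketch of the custom migration from step 3, assuming the model lives in an app called myapp, the previous migration added the still-nullable color column, and the random-color logic is inlined so it runs against the historical model rather than importing from your codebase:

import random

from django.db import migrations

def populate_colors(apps, schema_editor):
    # Use the historical model, not the one imported from models.py.
    MyModel = apps.get_model('myapp', 'MyModel')  # assumed app/model names
    used = set(
        MyModel.objects.exclude(color__isnull=True).values_list('color', flat=True)
    )
    for obj in MyModel.objects.filter(color__isnull=True):
        # Keep drawing random colors until we hit one that is not taken yet.
        while True:
            candidate = '#{:02x}{:02x}{:02x}'.format(
                random.randint(0, 255), random.randint(0, 255), random.randint(0, 255)
            )
            if candidate not in used:
                break
        used.add(candidate)
        obj.color = candidate
        obj.save(update_fields=['color'])

class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0002_mymodel_color'),  # assumed: the migration from step 2
    ]

    operations = [
        migrations.RunPython(populate_colors, migrations.RunPython.noop),
    ]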
I have two Django-ORM managed databases that I'd like to merge. Both have a very similar schema, and both have the standard auth_users table, along with a few other shared tables that reference each other as well as auth_users, which I'd like to merge into a single database automatically.
Understandably, this could be very non-trivial depending upon the foreign-key relationships, and what constitutes a "unique" record in each table.
Does anyone know if there exists a tool to do this merge operation?
If nothing like this currently exists, I was considering writing my own management command, based on the standard loaddata command. Essentially, you'd use the standard dumpdata command to export tables from a source database, and then use a modified version of loaddata to "merge" them into the destination database.
For example, if I have databases A and B, and I want to merge database B into database A, then I'd want to follow a procedure according to the pseudo-code:
merge_database_dst = A
merge_database_src = B
for table in sorted(merge_database_dst.get_redundant_tables(merge_database_src), key=acyclic_dependency):
    key = table.get_unique_column_key()
    src_id_to_dst_id = {}
    for record_src in merge_database_src.table.objects.all():
        src_key_value = record_src.get_key_value(key)
        try:
            record_dst = merge_database_dst.table.objects.get(key)
            dst_key_value = record_dst.get_key_value(key)
        except merge_database_dst.table.DoesNotExist:
            record_dst = merge_database_dst.table(**[(k, convert_fk(v)) for k, v in record_src._meta.fields])
            record_dst.save()
            dst_key_value = record_dst.get_key_value(key)
        src_id_to_dst_id[(table, record_src.id)] = record_dst.id
The convert_fk() function would use the src_id_to_dst_id index to convert foreign key references in the source table to the equivalent IDs in the destination table.
To summarize, the algorithm would iterate over the table to be merged in the order of dependency, with parents iterated over first. So if we wanted to merge tables auth_users and mycustomprofile, which is dependent on auth_users, we'd iterate ['auth_users','mycustomprofile'].
Each merged table would need some sort of indicator documenting the combination of columns that denotes a universally unique record (i.e. the "key"). For auth_users, that might be the "username" and/or "email" column.
If the value of the key in database B already exists in A, then the record is not imported from B, but the ID of the existing record in A is recorded.
If the value of the key in database B does not exist in A, then the record is imported from B, and the ID of the new record is recorded.
Using the previously recorded ID, a mapping is created, explaining how to map foreign-key references to that specific record in B to the new merged/pre-existing record in A. When future records are merged into A, this mapping would be used to convert the foreign keys.
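To make the convert_fk() idea concrete, a minimal sketch under the assumptions above, with src_id_to_dst_id keyed by (related table name, source id) and old-style field.rel attributes:

def convert_fk(field, src_value, src_id_to_dst_id):
    # Non-FK columns and NULLs are copied through unchanged.
    if src_value is None or not getattr(field, 'rel', None):
        return src_value
    related_table = field.rel.to._meta.db_table  # table the FK points at
    # Translate the source-database id into the id of the merged or
    # pre-existing record in the destination database.
    return src_id_to_dst_id[(related_table, src_value)]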
I could still envision some cases where an imported record references a table not included in the dumpdata, which might cause the entire import to fail, therefore some sort of "dryrun" option would be needed to simulate the import to ensure all FK references can be translated.
Does this seem like a practical approach? Is there a better way?
EDIT: This isn't exactly what I'm looking for, but I thought others might find it interesting. The Turbion project has a mechanism for copying changes between equivalent records in different Django models within the same database. It works by defining a translation layer (i.e. merging.ModelLayer) between two Django models, so, say, if you update the "www" field in user bob@bob.com's profile, it'll automatically update the "url" field in user bob@bob.com's otherprofile.
The functionality I'm looking for is a bit different, in that I want to merge an entire (or partial) database snapshot at infrequent intervals, sort of the way the loaddata management command does.
Wow. This is going to be a complex job regardless. That said:
If I understand the needs of your project correctly, this can be something that can be done using a data migration in South. Even so, I'd be lying if I said it was going to be a joke.
My recommendation is -- and this is mostly a parrot of an assumption in your question, but I want to make it clear -- that you have one "master" table that is the base, and which has records from the other table added to it. So, table A keeps all of its existing records, and only gets additions from B. B feeds additions into A, and once done, B is deleted.
I'm hesitant to write you sample code because your actual job will be so much more complex than this, but I will anyway to try and point you in the right direction. Consider something like...
import datetime
from south.db import db
from south.v2 import DataMigration
from django.db import models

class Migration(DataMigration):

    def forwards(self, orm):
        for b in orm.B.objects.all():
            # sanity check: does this item get copied into A at all?
            if orm.A.objects.filter(username=b.username):
                continue
            # make an A record with the properties of my B record
            a = orm.A(
                first_name=b.first_name,
                last_name=b.last_name,
                email_address=b.email_address,
                [...]
            )
            # save the new A record, and delete the B record
            a.save()
            b.delete()

    def backwards(self, orm):
        # backwards method, if you write one
This would end up migrating all of the Bs not in A to A, and leave you a table of Bs that are expected duplicates, which you could then check by some other means before deleting.
Like I said, this sample isn't meant to be complete. If you decide to go this route, spend time in the South documentation, and particularly make sure you look at data migrations.
That's my 2¢. Hope it helps.
I'm creating a set of SQL full database copy scripts using MySQL's INTO OUTFILE and LOAD DATA LOCAL INFILE.
Specifically:
SELECT {columns} FROM {table} INTO OUTFILE '{table}.csv'
LOAD DATA LOCAL INFILE '{table}.csv' REPLACE INTO TABLE {table} ({columns})
Because of this, I don't need just the tables, I also need the columns for the tables.
I can get all of the tables and columns, but this doesn't include m2m tables:
from django.db.models import get_models

for model in get_models():
    table = model._meta.db_table
    columns = [field.column for field in model._meta.fields]
I can also get all of the tables, but this doesn't give me access to the columns:
from django.db import connection
tables = connection.introspection.table_names()
How do you get every table and every corresponding column on that table for a Django project?
More details:
I'm doing this on a reasonably large dataset (>1GB) so using the flat file method seems to be the only reasonable way to make this large of a copy in MySQL. I already have the schema copied over (using ./manage.py syncdb --migrate) and the issue I'm having is specifically with copying the data, which requires me to have the tables and columns to create proper SQL statements. Also, the reason I can't use default column ordering is because the production database I'm copying from has different column ordering than what is created with a fresh syncdb (due to many months worth of migrations and schema changes).
Have you taken a look at manage.py?
You can get boatloads of SQL information. For example, to get all the CREATE TABLE syntax for an app within your project you can do:
python manage.py sqlall <appname>
If you type:
python manage.py help
You can see a ton of other features.
I dug in to the source to find this solution. I feel like there's probably a better way, but this does the trick.
This first block gets all of the normal (non-m2m) tables and their columns
from django.db import connection
from django.apps import apps

table_info = []
tables = connection.introspection.table_names()
seen_models = connection.introspection.installed_models(tables)

for model in apps.get_models():
    if model._meta.proxy:
        continue
    table = model._meta.db_table
    if table not in tables:
        continue
    columns = [field.column for field in model._meta.fields]
    table_info.append((table, columns))
This next block was the tricky part. It gets all the m2m field tables and their columns.
for model in apps.get_models():
    for field in model._meta.local_many_to_many:
        if not field.creates_table:
            continue
        table = field.m2m_db_table()
        if table not in tables:
            continue
        columns = ['id']  # They always have an id column
        columns.append(field.m2m_column_name())
        columns.append(field.m2m_reverse_name())
        table_info.append((table, columns))
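From there, producing the statements from the question is just a loop over table_info; a sketch, with file paths and quoting left to taste (note that LOAD DATA wants an explicit INTO TABLE and a parenthesised column list):

export_sql = []
import_sql = []
for table, columns in table_info:
    cols = ', '.join(columns)
    export_sql.append(
        "SELECT {cols} FROM {table} INTO OUTFILE '{table}.csv';".format(cols=cols, table=table)
    )
    import_sql.append(
        "LOAD DATA LOCAL INFILE '{table}.csv' REPLACE INTO TABLE {table} ({cols});".format(cols=cols, table=table)
    )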
Have you looked into "manage.py dumpdata" and "manage.py loaddata"? They dump and load in json format. I use it to dump stuff from one site and overwrite another site's database. It doesn't have an "every database" option on dumpdata, but you can call it in a loop on the results of a "manage.py dbshell" command.
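For example, something like this (app name assumed) dumps one app from the source site and loads it into the destination:
python manage.py dumpdata myapp --indent 2 > myapp.json
python manage.py loaddata myapp.json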