Problem trying to re-use a primary key id with SQLAlchemy

I'm trying to reuse a primary key in one of my tables with SQLAlchemy and am getting a foreign key constraint error.
In a nutshell:
PostgreSQL 8.4
Python 2.7
SQLAlchemy 0.7
I have 3 tables: User, Inventories and Devices. Inventories and Devices each have a one-to-one relationship with User: Inventories.user_id and Devices.user_id are foreign keys to User.id.
I've got User, Devices and Inventories set up in models/ according to standard python practices.
Within interactive python I can issue the following commands no problem:
>>>newUser = User.create()
>>>newUser.device = User.create_device(<*args>)
>>>Session.add(newUser)
>>>Session.commit()
(an inventory record is automatically created in code)
Now, let's say I want to re-use User record 1 (it's the only record that will allow a method called reset in code for security and internal testing reasons)
>>>oldUser = User.retrieve(1)
>>>Session.delete(oldUser)
>>>Session.commit()
(confirm that user 1 no longer exists)
>>>newUser = User.create()
>>>newUser.device = User.create_device(<*args>)
>>>newUser.id = 1
>>>Session.add(newUser)
>>>Session.commit()
At this point I'll get an error that Key(id)=(<id>) is still referenced from either table "devices" or table "inventories", where <id> is the value of newUser.id before re-assigning it to be id 1.
I've looked into cascading and have tried the various options (all, save-update, etc) with no effect.
Any information pointing to where I'm going wrong would greatly be appreciated,
Thanks,
Krys

To address the error you're seeing, you could update the foreign keys on all of the Device and Inventory models associated with that User model before committing. You'll have to make sure that your User model doesn't auto-increment the id (i.e., that it isn't a PostgreSQL sequence).
For example, the SQLAlchemy model declaration should be
class User(base):
    __tablename__ = 'user'
    id = Column('id', Integer, primary_key=True, unique=True, nullable=False)
instead of
class User(base):
    __tablename__ = 'user'
    id = Column('id', Integer, Sequence('user_id_seq'), primary_key=True)
BUT, this is probably not the right way to do it! It would be a better design to use a sequence on User.id (like in the second model declaration), and add another field on the user table that indicates if the user is an admin (for the security/testing purposes you mentioned). This way you don't have to rely on magic numbers in your application (e.g., the user id) for application logic, especially security.

I am not using SQLAlchemy, so I do not have a proper answer, but I can say that you should ask yourself whether what you want is really necessary.
Because:
You will probably break data integrity, and that may cause serious problems.
You will need to break the auto-increment structure of the ID, so from then on you have to assign ids by hand or use a hand-written pre-save trigger to get a proper id.
If you have tables with a foreign key to User that is set NOT NULL, then you will probably have problems freeing the records related to a deleted user. If you do not null them, a re-used id will create a serious data-integrity problem (wrongly referenced relations).
So first of all, you must decide whether it is worth it.

Since this is a problem that shouldn't be seen in production, just use SET CONSTRAINTS. You could use INITIALLY DEFERRED on your FOREIGN KEYs but I wouldn't recommend that since you're not dealing with a cyclic dependency that exists in production.
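To see what deferring a foreign key check buys you, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example runs anywhere), and the user and device tables and the renumber_user helper are hypothetical. In PostgreSQL, SET CONSTRAINTS ALL DEFERRED inside the transaction plays the same role, but it only affects constraints that were declared DEFERRABLE.

```python
import sqlite3

def renumber_user(deferred):
    """Move user 5 to id 1 inside one transaction; return device rows or None on failure."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # manual BEGIN/COMMIT
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY)")
    # Hypothetical schema: the only difference between the two runs is the timing clause.
    timing = "DEFERRABLE INITIALLY DEFERRED" if deferred else ""
    conn.execute(
        "CREATE TABLE device ("
        " id INTEGER PRIMARY KEY,"
        f" user_id INTEGER NOT NULL REFERENCES user(id) {timing})"
    )
    conn.execute("INSERT INTO user (id) VALUES (5)")
    conn.execute("INSERT INTO device (id, user_id) VALUES (1, 5)")

    conn.execute("BEGIN")
    try:
        # The device row briefly points at a user id that no longer exists.
        conn.execute("UPDATE user SET id = 1 WHERE id = 5")
        conn.execute("UPDATE device SET user_id = 1 WHERE user_id = 5")
        conn.execute("COMMIT")  # a deferred constraint is checked here
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return None
    return conn.execute("SELECT user_id FROM device").fetchall()

deferred_result = renumber_user(deferred=True)
immediate_result = renumber_user(deferred=False)
```

With the deferred constraint the transaction commits and the device follows the user to id 1; with an immediate constraint the first UPDATE is rejected, which mirrors the "still referenced from table" error in the question.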

Related

What does adding on_delete to models.py do, and what should I put in it? [duplicate]

I'm quite familiar with Django, but I recently noticed there exists an on_delete=models.CASCADE option with the models. I have searched for the documentation for the same, but I couldn't find anything more than:
Changed in Django 1.9:
on_delete can now be used as the second positional argument (previously it was typically only passed as a keyword argument). It will be a required argument in Django 2.0.
An example case of usage is:
from django.db import models
class Car(models.Model):
    manufacturer = models.ForeignKey(
        'Manufacturer',
        on_delete=models.CASCADE,
    )
    # ...
class Manufacturer(models.Model):
    # ...
    pass
What does on_delete do? (I guess the actions to be done if the model is deleted.)
What does models.CASCADE do? (any hints in documentation)
What other options are available (if my guess is correct)?
Where does the documentation for this reside?

This is the behaviour to adopt when the referenced object is deleted. It is not specific to Django; it comes from the SQL standard, although Django has its own implementation on top of SQL. (1)
There are seven possible actions to take when such an event occurs:
CASCADE: When the referenced object is deleted, also delete the objects that have references to it (when you remove a blog post for instance, you might want to delete comments as well). SQL equivalent: CASCADE.
PROTECT: Forbid the deletion of the referenced object. To delete it you will have to delete all objects that reference it manually. SQL equivalent: RESTRICT.
RESTRICT: (introduced in Django 3.1) Similar behavior as PROTECT that matches SQL's RESTRICT more accurately. (See django documentation example)
SET_NULL: Set the reference to NULL (requires the field to be nullable). For instance, when you delete a User, you might want to keep the comments he posted on blog posts, but say it was posted by an anonymous (or deleted) user. SQL equivalent: SET NULL.
SET_DEFAULT: Set the default value. SQL equivalent: SET DEFAULT.
SET(...): Set a given value. This one is not part of the SQL standard and is entirely handled by Django.
DO_NOTHING: Probably a very bad idea since this would create integrity issues in your database (referencing an object that actually doesn't exist). SQL equivalent: NO ACTION. (2)
Source: Django documentation
See also the documentation of PostgreSQL for instance.
In most cases, CASCADE is the expected behaviour, but for every ForeignKey, you should always ask yourself what the expected behaviour is in this situation. PROTECT and SET_NULL are often useful. Setting CASCADE where it does not belong can potentially delete your whole database in cascade, simply by deleting a single user.
Additional note to clarify cascade direction
It's funny to notice that the direction of the CASCADE action is not clear to many people. Actually, it's funny to notice that only the CASCADE action is not clear. I understand the cascade behavior might be confusing, however you must think that it is the same direction as any other action. Thus, if you feel that CASCADE direction is not clear to you, it actually means that on_delete behavior is not clear to you.
In your database, a foreign key is basically represented by an integer field whose value is the primary key of the foreign object. Let's say you have an entry comment_A, which has a foreign key to an entry article_B. If you delete the entry comment_A, everything is fine. article_B used to live without comment_A and doesn't care if it's deleted. However, if you delete article_B, then comment_A panics! It never lived without article_B and needs it; it's part of its attributes (article=article_B, but what is article_B???). This is where on_delete steps in, to determine how to resolve this integrity error, either by saying:
"No! Please! Don't! I can't live without you!" (which is said PROTECT or RESTRICT in Django/SQL)
"All right, if I'm not yours, then I'm nobody's" (which is said SET_NULL)
"Good bye world, I can't live without article_B" and commit suicide (this is the CASCADE behavior).
"It's OK, I've got spare lover, I'll reference article_C from now" (SET_DEFAULT, or even SET(...)).
"I can't face reality, I'll keep calling your name even if that's the only thing left to me!" (DO_NOTHING)
I hope it makes cascade direction clearer. :)
Footnotes
(1) Django has its own implementation on top of SQL. And, as mentioned by @JoeMjr2 in the comments below, Django will not create the SQL constraints. If you want the constraints to be ensured by your database (for instance, if your database is used by another application, or if you hang in the database console from time to time), you might want to set the related constraints manually yourself. There is an open ticket to add support for database-level on delete constraints in Django.
(2) Actually, there is one case where DO_NOTHING can be useful: If you want to skip Django's implementation and implement the constraint yourself at the database-level.
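The SQL equivalents named above can be tried directly. The following is a minimal sketch using Python's built-in sqlite3 module rather than Django itself; the article and comment tables and the delete_article helper are hypothetical names chosen to match the comment_A / article_B story, and SQLite stands in for whichever database you actually use.

```python
import sqlite3

def delete_article(on_delete):
    """Delete the referenced article under the given ON DELETE action.

    Returns the remaining comment rows, or "blocked" if the database refused.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE comment ("
        " id INTEGER PRIMARY KEY,"
        f" article_id INTEGER REFERENCES article(id) ON DELETE {on_delete})"
    )
    conn.execute("INSERT INTO article (id) VALUES (1)")
    conn.execute("INSERT INTO comment (id, article_id) VALUES (10, 1)")
    try:
        conn.execute("DELETE FROM article WHERE id = 1")
    except sqlite3.IntegrityError:
        return "blocked"
    return conn.execute("SELECT id, article_id FROM comment").fetchall()

cascade_rows = delete_article("CASCADE")    # comment is deleted with the article
set_null_rows = delete_article("SET NULL")  # comment survives, reference cleared
restrict_rows = delete_article("RESTRICT")  # deleting the article is refused
```

Note that the action is declared on the referencing table (comment), yet it is triggered by deleting a row in the referenced table (article), which is the direction the answer describes.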

The on_delete argument is used to tell Django what to do with model instances that depend on the model instance you delete (e.g. through a ForeignKey relationship). on_delete=models.CASCADE tells Django to cascade the deleting effect, i.e. continue deleting the dependent models as well.
Here's a more concrete example. Assume you have an Author model that is a ForeignKey in a Book model. Now, if you delete an instance of the Author model, Django would not know what to do with instances of the Book model that depend on that instance of the Author model. The on_delete argument tells Django what to do in that case. Setting on_delete=models.CASCADE will instruct Django to cascade the deleting effect, i.e. delete all the Book model instances that depend on the Author model instance you deleted.
Note: on_delete will become a required argument in Django 2.0. In older versions it defaults to CASCADE.
Here's the entire official documentation.

FYI, the on_delete parameter in models is backwards from what it sounds like. You put on_delete on a foreign key (FK) on a model to tell Django what to do if the FK entry that you are pointing to on your record is deleted. The options our shop have used the most are PROTECT, CASCADE, and SET_NULL. Here are the basic rules I have figured out:
Use PROTECT when your FK is pointing to a look-up table that really shouldn't be changing and that certainly should not cause your table to change. If anyone tries to delete an entry on that look-up table, PROTECT prevents them from deleting it if it is tied to any records. It also prevents Django from deleting your record just because it deleted an entry on a look-up table. This last part is critical. If someone were to delete the gender "Female" from my Gender table, I CERTAINLY would NOT want that to instantly delete any and all people I had in my Person table who had that gender.
Use CASCADE when your FK is pointing to a "parent" record. So, if a Person can have many PersonEthnicity entries (he/she can be American Indian, Black, and White), and that Person is deleted, I really would want any "child" PersonEthnicity entries to be deleted. They are irrelevant without the Person.
Use SET_NULL when you do want people to be allowed to delete an entry on a look-up table, but you still want to preserve your record. For example, if a Person can have a HighSchool, but it doesn't really matter to me if that high-school goes away on my look-up table, I would say on_delete=SET_NULL. This would leave my Person record out there; it would just set the high-school FK on my Person to null. Obviously, you will have to allow null=True on that FK.
Here is an example of a model that does all three things:
class PurchPurchaseAccount(models.Model):
    id = models.AutoField(primary_key=True)
    purchase = models.ForeignKey(PurchPurchase, null=True, db_column='purchase', blank=True, on_delete=models.CASCADE) # If "parent" rec gone, delete "child" rec!!!
    paid_from_acct = models.ForeignKey(PurchPaidFromAcct, null=True, db_column='paid_from_acct', blank=True, on_delete=models.PROTECT) # Disallow lookup deletion & do not delete this rec.
    _updated = models.DateTimeField()
    _updatedby = models.ForeignKey(Person, null=True, db_column='_updatedby', blank=True, related_name='acctupdated_by', on_delete=models.SET_NULL) # Person records shouldn't be deleted, but if they are, preserve this PurchPurchaseAccount entry, and just set this person to null.
    def __unicode__(self):
        return str(self.paid_from_acct.display)
    class Meta:
        db_table = u'purch_purchase_account'
As a last tidbit, did you know that if you don't specify on_delete (or didn't), the default behavior is CASCADE? This means that if someone deleted a gender entry on your Gender table, any Person records with that gender were also deleted!
I would say, "If in doubt, set on_delete=models.PROTECT." Then go test your application. You will quickly figure out which FKs should be labeled the other values without endangering any of your data.
Also, it is worth noting that on_delete=CASCADE is actually not added to any of your migrations, if that is the behavior you are selecting. I guess this is because it is the default, so putting on_delete=CASCADE is the same thing as putting nothing.

As mentioned earlier, CASCADE will delete the record that has a foreign key and references another object that was deleted. So, for example, if you have a real estate website and have a Property that references a City:
class City(models.Model):
    # define model fields for a city
    pass
class Property(models.Model):
    city = models.ForeignKey(City, on_delete=models.CASCADE)
    # define model fields for a property
then when the City is deleted from the database, all associated Properties (e.g. real estate located in that city) will also be deleted from the database.
Now I also want to mention the merit of other options, such as SET_NULL or SET_DEFAULT or even DO_NOTHING. Basically, from the administration perspective, you want to "delete" those records. But you don't really want them to disappear. For many reasons. Someone might have deleted it accidentally, or for auditing and monitoring. And plain reporting. So it can be a way to "disconnect" the property from a City. Again, it will depend on how your application is written.
For example, some applications have a field "deleted" which is 0 or 1. All their searches, list views and so on (anything that can appear in reports or anywhere the user can access it from the front end) exclude anything where deleted == 1. However, if you create a custom report or a custom query to pull down a list of records that were deleted, and even more so to see when each was last modified (another field) and by whom (i.e. who deleted it and when), that is very advantageous from the executive standpoint.
And don't forget that you can revert accidental deletions as simply as setting deleted = 0 for those records.
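The "deleted" flag idea can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the property table, its columns and the visible_names helper are hypothetical, and a real application would usually also record who flagged the row and when.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE property ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " deleted INTEGER NOT NULL DEFAULT 0)"  # 0 = live, 1 = soft-deleted
)
conn.executemany(
    "INSERT INTO property (id, name) VALUES (?, ?)",
    [(1, "Flat"), (2, "House")],
)

def visible_names(connection):
    """Names shown to end users: every row not flagged as deleted."""
    rows = connection.execute(
        "SELECT name FROM property WHERE deleted = 0 ORDER BY id"
    )
    return [name for (name,) in rows]

# "Delete" a record by flagging it instead of removing the row.
conn.execute("UPDATE property SET deleted = 1 WHERE id = 2")
after_delete = visible_names(conn)

# Revert the accidental deletion by clearing the flag.
conn.execute("UPDATE property SET deleted = 0 WHERE id = 2")
after_restore = visible_names(conn)
```

The row never leaves the table, so a custom report can still select the flagged records.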
My point is, if there is a functionality, there is always a reason behind it. Not always a good reason. But a reason. And often a good one too.

Using CASCADE tells Django that when the referenced record is deleted, the records that reference it should be deleted as well.
In the poll app example below: When a 'Question' gets deleted it will also delete the Choices this Question has.
e.g Question: How did you hear about us?
(Choices: 1. Friends 2. TV Ad 3. Search Engine 4. Email Promotion)
When you delete this question, it will also delete all these four choices from the table.
Note which direction it flows.
You don't have to put on_delete=models.CASCADE in the Question model; put it in the Choice model.
from django.db import models
class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date_published')
class Choice(models.Model):
    # Deleting a Question also deletes the Choices that point to it.
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    choice_text = models.CharField(max_length=200)
    votes = models.IntegerField(default=0)

Simply put, on_delete specifies what happens to the child object when the foreign object it references is deleted:
CASCADE: removes the child object when the foreign object is deleted
SET_NULL: sets the child object's foreign key to null
SET_DEFAULT: sets the child object's foreign key to the default value given when the model was defined
RESTRICT: prevents deletion of the foreign object by raising RestrictedError, unless the child objects are also being deleted through a CASCADE relationship in the same operation
PROTECT: prevents the foreign object from being deleted as long as there are child objects referencing it
additional links:
https://docs.djangoproject.com/en/4.0/ref/models/fields/#foreignkey

Here is an answer to your question: why do we use on_delete?
When an object referenced by a ForeignKey is deleted, Django by default emulates the behavior of the SQL constraint ON DELETE CASCADE and also deletes the object containing the ForeignKey. This behavior can be overridden by specifying the on_delete argument. For example, if you have a nullable ForeignKey and you want it to be set null when the referenced object is deleted:
user = models.ForeignKey(User, blank=True, null=True, on_delete=models.SET_NULL)
The possible values for on_delete are found in django.db.models:
CASCADE: Cascade deletes; the default.
PROTECT: Prevent deletion of the referenced object by raising ProtectedError, a subclass of django.db.IntegrityError.
SET_NULL: Set the ForeignKey null; this is only possible if null is True.
SET_DEFAULT: Set the ForeignKey to its default value; a default for the ForeignKey must be set.

Let's say you have two models, one named Person and another one named Companies, and that, by definition, one person can create more than one company.
Since a company belongs to one and only one person, we want all the companies associated with a person to be deleted when that person is deleted.
So, we start by creating a Person model, like this
class Person(models.Model):
    id = models.IntegerField(primary_key=True)
    name = models.CharField(max_length=20)
    def __str__(self):
        # id is an integer, so convert it before concatenating with the name
        return str(self.id) + self.name
Then, the Companies model can look like this
class Companies(models.Model):
    title = models.CharField(max_length=20)
    description = models.CharField(max_length=10)
    person = models.ForeignKey(Person, related_name='persons', on_delete=models.CASCADE)
Notice the usage of on_delete=models.CASCADE in the model Companies. That is to delete all companies when the person that owns them (an instance of class Person) is deleted.

Reorient your mental model of the functionality of "CASCADE" by thinking of adding a FK to an already existing cascade (i.e. a waterfall). The source of this waterfall is a primary key (PK). Deletes flow down.
So if you define a FK's on_delete as "CASCADE," you're adding this FK's record to a cascade of deletes originating from the PK. The FK's record may participate in this cascade or not ("SET_NULL"). In fact, a record with a FK may even prevent the flow of the deletes! Build a dam with "PROTECT."

To delete all child rows in the database when the parent object is deleted, use on_delete like so:
class user(models.Model):
    commodities = models.ForeignKey(commodity, on_delete=models.CASCADE)

CASCADE will also delete the rows that reference the deleted object.

SQLAlchemy: how to create a relationship programmatically

I'd like to create a 1:n relationship between two tables dynamically. My DB model is mapped via SQLAlchemy but due to some special features of my application I can not use the default declarative way.
E.g.
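class Foo(Base):
    __tablename__ = 'foo'  # table name implied by ForeignKey('foo.id') below
    id = Column(Integer, autoincrement=True, primary_key=True)
    flag = Column(Boolean)
class Bar(Base):
    __tablename__ = 'bar'  # assumed name; not given in the original snippet
    id = Column(Integer, autoincrement=True, primary_key=True)
    foo_id = Column(Integer, ForeignKey('foo.id'))
    # declarative version:
    # foo = relationship(Foo)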
So I want to add relationship named "foo" to the mapped class "Bar" after Bar was defined and SQLAlchemy did its job of defining a mapper etc.
Update 2017-09-05: Why is this necessary for me? (I thought I could omit this because I think it mostly distracts from the actual problem to solve, but since there were comments about it...)
First of all I don't have a single database but hundreds/thousands. Data in old databases must not be altered in any way but I want a single source code to access old data (even though data structure and calculation rules change significantly).
Currently we use multiple model definitions. Later definitions extend/modify previous ones. Often we manipulate SQLAlchemy models dynamically. We try not to have code in mapped classes because we think it will be much harder ensuring correctness of that code after changing a table many times (code must work in every intermediate step).
In many cases we extend tables (mapped classes) programmatically in model X after they were initially defined in model X-1. Adding columns to an existing SQLAlchemy ORM class is manageable. Now we are adding a new reference column to an existing table, and a relationship() provides a nicer Python API.

Well, my question above is again a nice example of SQLAlchemy's super powers (and my limited understanding):
Bar.__mapper__.add_property('foo', relationship('Foo'))
Likely I was unable to get this working initially because some of my surrounding code mixed adding relationships and columns. Also there is one important difference to declaring columns:
Column('foo', Integer)
For columns the first parameter can be the column name but you can not use this for relationships. relationship('foo', 'Foo') triggers exceptions when passing it to .add_property().

Why does Django set a shorter maximum length of foreign keys to auth.User.username?

I have a model with a foreign key that references the username field of auth.User. The original field has a maximum length of 150. But Django generates a foreign key with a maximum length of 30.
In my app's models.py:
class Profile(models.Model):
    user = models.ForeignKey('auth.User', to_field='username')
In django.contrib.auth.models:
username = models.CharField(
_('username'),
max_length=150,
Generated SQL:
CREATE TABLE "myapp_profile" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"user_id" varchar(30) NOT NULL REFERENCES "auth_user" ("username")
);
This only happens when referencing auth.User.username. If I reference a long field in my own model, the foreign key is generated fine.
Why is that? How can I overcome it?
Using Django 1.11.4 and Python 3.6.2. I tried PostgreSQL and SQLite and the problem occurs on both.
CLARIFICATION:
From the answers so far I think my question was misunderstood. I am not looking for a way to have long usernames. My problem is that the stock User model that comes with Django has one max_length (150), but when your model refers to it, the foreign key has a shorter max_length of 30. Therefore, if a user is registered with a username of 31 characters, I will not be able to create child objects of that user, because the foreign key constraint will be violated. And I need this because I have a REST API whose URLs nest resources under users, which are referred to by username, not ID. For example: /users/<username>/profiles/...
UPDATE:
I think the reason for this behavior is the undocumented swappable property of the User model. It is designed to be replaceable by custom models. However, the configured model must have its data in the initial migration of the app that defines the model. The migrations code seems to generate references to the initial migration of swappable models. I am using the default User model, and its initial migration sets the username to 30 chars. Hence my username FKs are 30 chars long. I am able to work around this with a RunSQL migration that alters the FK data type to varchar(150), but I doubt whether it's the right thing to do.

It is recommended to use a short identifier. varchar(30) can hold a very long number, something like 999999999999999999999999999999, and Django always uses the same length when it creates identifiers. I don't think you are going to have that many users; if you reach that number, you should create another type of identifier. Remember that the length of the user_id field applies to the id of the user and not to the username string.

You can use this hack described in this SO answer,
but be very careful!.
Or you can use this package.
However, I think that, as described in this discussion, the best way would be to create a custom User model and do whatever you want there.
Hope it helps!

You must use a custom user model. Taken from the Django docs:
150 characters or fewer. Usernames may contain alphanumeric, _, @, +, . and - characters.
The max_length should be sufficient for many use cases. If you need a longer length, please use a custom user model. If you use MySQL with the utf8mb4 encoding (recommended for proper Unicode support), specify at most max_length=191 because MySQL can only create unique indexes with 191 characters in that case by default.

Set alternative pk in Django

I would like to make Django generate numeric ids of 6 or more digits for one model. I don't want to start from zero, but I want these ids to be clear and readable by users. So a good one is, for example: 658975
How to do that?
I've tried this:
import uuid
from django.db import models
class MyUUIDModel(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
But uuid generates huge sequences, which is not user-friendly.
Do you have any advice? Maybe setting a minimum number for the autoincrement pk would be enough.

Can't you simply show the user pk + a random number and keep the rest of the logic the same?
Otherwise, see: How to make a primary key start from 1000?
Write a one-off migration for this.

The best option is to set the primary key to start from 100000. For example (MySQL):
ALTER TABLE django_app_table_name AUTO_INCREMENT = 100000;
Since these commands depend on the database vendor, it is better to write your own custom migration, following the Django migration documentation.
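As an illustration of how vendor-specific this is, here is a minimal sketch of the same idea for SQLite, using Python's built-in sqlite3 module. The orders table is hypothetical, and the sqlite_sequence trick applies only to SQLite tables declared with AUTOINCREMENT; in a Django project the statement for your own database would go into the custom migration mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " label TEXT NOT NULL)"
)

# SQLite keeps the last used AUTOINCREMENT value per table in sqlite_sequence.
# Seeding it with 99999 makes the next generated id 100000.
conn.execute("INSERT INTO sqlite_sequence (name, seq) VALUES ('orders', 99999)")

first_id = conn.execute("INSERT INTO orders (label) VALUES ('first')").lastrowid
second_id = conn.execute("INSERT INTO orders (label) VALUES ('second')").lastrowid
```

The ids then continue counting up from the seeded value, so users only ever see numbers with six or more digits.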

saving a django orm model with a foreign key

I am trying to figure out the best way to save a model that I've got using the django orm. I have a model/table, User. Additionally, I have a model/table called ContactInfo, where we store a foreign key to the User table.
I understand that common django orm practice would be to put the foreign key for the ContactInfo model into the User model, but at this point, we do not want to add anything to the already monolithic user table, so we put the foreign key into the ContactInfo model.
I understand that I can store the User model in the ContactInfo model, call save on ContactInfo, and it should save the User model, but what if I have a one-to-many relationship with users and their contact info? I would rather not have multiple instances of the user table within (1-many) instances of the contact info model/object.
If I can clear anything up, please let me know. At the current moment, the best idea I have is to store an instance of the ContactInfo list as user.contact_info, and override the save method for user user.save() to check for contact_info, and if it exists insert the user.id into each model and save. Unfortunately I just feel like this is a bit messy, but being new-er to django and python, I'm not sure what my options are.
Any help would be greatly appreciated, thanks!

I am not sure if I understand your question correctly. Django supports 1-N relationships well. If ContactInfo has a foreign key to User, it is a 1-N mapping by default.
ContactInfo ---------> User
     N                  1
So there is only one User record in your database, which looks like this:
Table User              Table ContactInfo
---------------------------------------------
id   user_name          id   user_id
1    someone            1    1
                        2    1
                        3    1
And you don't need to override the save method. When you need to add a contact:
contact = ContactInfo(user=target_user)
# other stuff
contact.save()
# or
target_user.contactinfo_set.create(...)  # contactinfo_set is the default reverse accessor on target_user
# Django maintains the foreign key for you.
If you use methods above to insert a new ContactInfo record, then you do not need to iterate your contact_info list to insert user.id into the database.

I am not sure whether you mean a custom User model or the standard model that ships with Django. If the latter, then Django provides a standard way of storing additional information associated with each user, called user profiles. See this section in the documentation for details.
