The documentation just says
To save an object back to the database, call save()
That doesn't make it clear. Experimenting, I found that if I include an id, it updates the existing entry, while if I don't, it creates a new row. Does the documentation specify what happens?
It's fully documented here:
https://docs.djangoproject.com/en/2.2/ref/models/instances/#how-django-knows-to-update-vs-insert
You may have noticed Django database objects use the same save()
method for creating and changing objects. Django abstracts the need to
use INSERT or UPDATE SQL statements. Specifically, when you call
save(), Django follows this algorithm:
If the object’s primary key attribute is set to a value that evaluates
to True (i.e., a value other than None or the empty string), Django
executes an UPDATE. If the object’s primary key attribute is not set
or if the UPDATE didn’t update anything (e.g. if primary key is set to
a value that doesn’t exist in the database), Django executes an
INSERT. The one gotcha here is that you should be careful not to
specify a primary-key value explicitly when saving new objects, if you
cannot guarantee the primary-key value is unused. For more on this
nuance, see Explicitly specifying auto-primary-key values above and
Forcing an INSERT or UPDATE below.
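The quoted algorithm is small enough to sketch in plain Python. This is purely illustrative - the real decision happens inside Django's save machinery, and save_action is a made-up name:

```python
def save_action(pk, pk_exists_in_db):
    """Mirror the documented save() decision for a single object."""
    # A truthy pk (not None, not the empty string) means: try UPDATE first...
    if pk not in (None, ""):
        # ...but fall back to INSERT if the UPDATE matched no existing row.
        return "UPDATE" if pk_exists_in_db else "INSERT"
    # No usable pk: plain INSERT.
    return "INSERT"

print(save_action(42, pk_exists_in_db=True))    # UPDATE
print(save_action(None, pk_exists_in_db=False))  # INSERT
print(save_action(42, pk_exists_in_db=False))    # INSERT: pk set but unused
```

The last case is the "gotcha" the docs mention: an explicit, unused primary key silently becomes an INSERT.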
As a side note: Django is open source, so when in doubt you can always read the source code ;-)
It depends on how the model instance was created. If it was queried from the database, UPDATE; if it's a new object that has not been saved before, INSERT.
Is it possible to use a couple of fields other than the primary key to retrieve items (already fetched earlier) from the identity map? For example, I often query a table by an (external_id, platform_id) pair, which is a unique key but not the primary key, and I want to avoid unnecessary SQL queries in such cases.
A brief overview of identity_map and get():
An identity map is kept for the lifecycle of a SQLAlchemy Session object; i.e., in the case of a web service or a RESTful API, the session object's lifecycle should span no more than a single request (as recommended).
From http://martinfowler.com/eaaCatalog/identityMap.html:
An Identity Map keeps a record of all objects that have been read from
the database in a single business transaction. Whenever you want an object, you check the Identity Map first to see if you already have it.
SQLAlchemy's ORM has a special query method, get(). It first looks in the identity map using the primary key (the only allowed argument) and, on a hit, returns the object from the identity map without actually executing a SQL query or hitting the database.
From the docs:
get(ident)
Return an instance based on the given primary key identifier, or None
if not found.
get() is special in that it provides direct access to the identity
map of the owning Session. If the given primary key identifier is
present in the local identity map, the object is returned directly
from this collection and no SQL is emitted, unless the object has been
marked fully expired. If not present, a SELECT is performed in order
to locate the object.
Only get() uses the identity map this way; from the official docs:
It’s somewhat used as a cache, in that it implements the identity map
pattern, and stores objects keyed to their primary key. However, it
doesn’t do any kind of query caching. This means, if you say
session.query(Foo).filter_by(name='bar'), even if Foo(name='bar')
is right there, in the identity map, the session has no idea about
that. It has to issue SQL to the database, get the rows back, and
then when it sees the primary key in the row, then it can look in the
local identity map and see that the object is already there. It’s
only when you say query.get({some primary key}) that the Session
doesn’t have to issue a query.
P.S. If you're querying by anything other than the pk, you aren't hitting the identity map in the first place.
A few relevant SO questions that help clarify the concept:
Forcing a sqlalchemy ORM get() outside identity map
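The identity-map behaviour of get() is easy to observe directly. A hedged sketch against a throwaway in-memory database, using the modern SQLAlchemy 1.4+ spelling Session.get() (the legacy form is Query.get()); the Foo model is illustrative:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Foo(Base):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("sqlite://")  # throwaway in-memory database
Base.metadata.create_all(engine)

session = Session(engine)
foo = Foo(name="bar")
session.add(foo)
session.flush()  # emits the INSERT; foo.id is now populated

# Served straight from the identity map: same Python object, no SELECT.
again = session.get(Foo, foo.id)
print(again is foo)  # True

# filter_by() still emits SQL, but the returned row is reconciled
# against the identity map, so the very same instance comes back.
same = session.query(Foo).filter_by(name="bar").one()
print(same is foo)  # True

session.close()
```

This demonstrates both halves of the quoted docs: get() can skip SQL entirely, while filter_by() must hit the database even though it ultimately yields the already-loaded instance.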
It's possible to access the whole identity map sequentially:
for obj in session.identity_map.values():
    print(obj)
To get an object by arbitrary attributes, you then have to filter for the object type first and then check your attributes.
It's not a lookup in constant time, but can prevent unnecessary queries.
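Generalizing that loop, a small helper can emulate the non-primary-key lookup asked about above. find_in_identity_map is a hypothetical name, not a SQLAlchemy API; it only relies on session.identity_map:

```python
def find_in_identity_map(session, cls, **attrs):
    """Linear scan over objects already loaded in this session.

    Returns the first instance of cls whose attributes all match,
    or None. No SQL is emitted either way.
    """
    for obj in session.identity_map.values():
        if isinstance(obj, cls) and all(
            getattr(obj, key, None) == value for key, value in attrs.items()
        ):
            return obj
    return None
```

On a miss, you would fall back to an ordinary filter_by() query; the scan is linear, not constant time, exactly as noted above.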
There is an argument that objects may have been modified by another process and that the identity map therefore doesn't hold the current state, but this argument is invalid: if your transaction isolation level is read committed (or lower) - and this is often the case - the data may ALWAYS have changed immediately after the query finished anyway.
I have created 31 objects in my database. Now, for some reason, if I create a new object through the Django admin page, the new object has an id of 33. Suppose I change its id and then delete it; if I then create a new object, its id will be 34. So the id is always shifted by 2. I'm very new to databases and Django - is there any reason for this behaviour? Thanks
Note: I didn't upload any code, because I don't think that's the problem...
By default, the id is an integer that is automatically incremented each time an object is created. It is incremented in such a way that the ids of deleted objects are never reused.
The incrementing is handled by the database itself, not by Django. For example, with PostgreSQL, the database column corresponding to "id" has the "PRIMARY KEY" constraint, which basically means the column must be not null and contain no duplicates. Moreover, the column is associated with a sequence that stores the id to use for the next row creation. To change this number, run this in the database shell:
ALTER SEQUENCE yourobjectstable_id_seq RESTART WITH 1234;
However, as emphasized in the comments to your question, this is something you should not do: it is better to keep the "uniqueness" feature of the primary key, even for deleted objects, since other tables may use the id to refer to a row in your main table.
Django provides the model field argument default (https://docs.djangoproject.com/en/dev/ref/models/fields/#default), but as far as I know it is only applied when a new object is created through Django.
If we insert/create records with raw queries (using django.db.connection.cursor), we get an exception because "Field 'xyz' doesn't have a default value".
How can I express a database-level default value for a column on the model, the way db_index works?
I hope you guys understand my question.
There is an open ticket 470 to include default values in the SQL schema.
Until this feature has been added to Django, you'll have to manually run alter table statements yourself or write a migration to run them if you want a default value in the SQL schema.
Note that Django allows callables as defaults, so even if this feature is added to Django, it won't be possible to have all defaults in the database.
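For the manual route, the DDL is a one-liner. A hedged sketch assuming PostgreSQL and a hypothetical table myapp_mymodel with column xyz; run it directly in the database shell, or wrap it in a migrations.RunSQL() operation inside a migration:

```sql
ALTER TABLE myapp_mymodel ALTER COLUMN xyz SET DEFAULT 'pending';
```

After this, raw INSERTs that omit xyz will get 'pending' from the database itself, independently of Django's default argument.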
I'm working with SQLAlchemy for the first time and was wondering: generally speaking, is it enough to rely on Python's default equality semantics when working with SQLAlchemy, versus id (primary key) equality?
In other projects I've worked on in the past using ORM technologies like Java's Hibernate, we'd always override .equals() to check for equality of an object's primary key/id, but when I look back I'm not sure this was always necessary.
In most, if not all, cases I can think of, there was only ever one reference to a given object with a given id, and that object was always the attached object, so technically you could get away with reference equality.
Short question: should I be overriding __eq__() and __hash__() for my business entities when using SQLAlchemy?
Short answer: No, unless you're working with multiple Session objects.
Longer answer, quoting the awesome documentation:
The ORM concept at work here is known as an identity map and ensures that all operations upon a particular row within a Session operate upon the same set of data. Once an object with a particular primary key is present in the Session, all SQL queries on that Session will always return the same Python object for that particular primary key; it also will raise an error if an attempt is made to place a second, already-persisted object with the same primary key within the session.
I have had a few situations where my SQLAlchemy application would load multiple instances of the same object (multithreading / different SQLAlchemy sessions ...). It was absolutely necessary to override __eq__() for those objects, or I would get various problems. This could be a problem in my application design, but it probably doesn't hurt to override __eq__() just to be sure.
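If you do go that route, the usual sketch keys equality on the primary key and falls back to object identity for unsaved (transient) instances. User and its columns are illustrative, not from the question:

```python
class User:
    def __init__(self, id=None, name=None):
        self.id = id
        self.name = name

    def __eq__(self, other):
        if self is other:
            return True
        if not isinstance(other, User):
            return NotImplemented
        # Persisted entities are equal iff they share a primary key;
        # two distinct transient instances are never equal.
        return self.id is not None and self.id == other.id

    def __hash__(self):
        # Hash on the pk so equal objects hash alike; transient
        # instances fall back to per-object identity.
        return hash(self.id) if self.id is not None else object.__hash__(self)
```

The usual caveat applies: if the pk is assigned after the object has been placed in a set or dict, its hash changes and lookups in that collection break.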
I'm using SQLAlchemy 0.7. I would like some 'post-processing' to occur after a session.flush(); namely, I need to access the instances involved in the flush() and iterate through them. The flush() call will update the database, but the instances involved also store some data in an LDAP database, and I would like SQLAlchemy to trigger an update to that LDAP database by calling an instance method.
I figured I'd be using the after_flush(session, flush_context) event, detailed here, but how do I get a list of update()'d instances?
On a side note, how can I determine which columns have changed (or are 'dirty') on an instance. I've been able to find out if an instance as a whole is dirty, but not individual properties.
According to the link you provided:
Note that the session’s state is still in pre-flush, i.e. ‘new’, ‘dirty’, and ‘deleted’ lists still show pre-flush state as well as the history settings on instance attributes.
This means that you should be able to access all the dirty objects through the session.dirty list. You'll note that the first parameter of the event callback is the current session object.
As for the second part, you can use the sqlalchemy.orm.attributes.get_history function to figure out which columns have been changed. It returns a History object for a given attribute which contains a has_changes() method.
If you're trying to listen for changes on specific class attributes, consider using Attribute Events instead.
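Putting both pieces together, here is a hedged sketch against a throwaway model. Person and its name column are illustrative, and the imports use the modern 1.4+ spellings rather than 0.7-era ones; a real handler would push to LDAP instead of recording tuples:

```python
from sqlalchemy import Column, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base
from sqlalchemy.orm.attributes import get_history

Base = declarative_base()

class Person(Base):
    __tablename__ = "person"
    id = Column(Integer, primary_key=True)
    name = Column(String)

changes = []

@event.listens_for(Session, "after_flush")
def collect_changes(session, flush_context):
    # Per the quoted docs, session.dirty and the per-attribute
    # history still reflect the pre-flush state at this point.
    for obj in session.dirty:
        history = get_history(obj, "name")
        if history.has_changes():
            changes.append((history.deleted[0], history.added[0]))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# expire_on_commit=False keeps attribute state readable across commits.
session = Session(engine, expire_on_commit=False)
p = Person(name="alice")
session.add(p)
session.commit()   # new object: session.dirty is empty, nothing recorded

p.name = "bob"
session.commit()   # update: the listener sees the old and new values
session.close()
print(changes)
```

The same per-attribute check via get_history() answers the side question about which columns changed on a given instance.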