I am trying to create a method to duplicate a model instance
new_object = old_object.duplicate_in_db()
According to Django 3.2 documentation model instance can be duplicated in database by setting entity pk and/or id to None :
# Model method
def duplicate_in_db(self):
self.pk = None
self._state.adding = True
# Some other duplication steps happens here e.g. duplicating OneToOneFields attributes
# ...
self.save()
However a side effect is that original object has its id changed (along other attributes...)
old_object_pk = old_object.pk
new_object = old_object.duplicate_in_db()
print(old_object_pk == old_object.pk) # False
What would be an effective duplicate method/function without any side effect?
Refreshing from db with correct id do the job, Downside is having to do multiple database call.
def duplicate_from_db(self):
old_pk = self.pk # storing for refreshing latter
self.pk = None
self._state.adding = True
self.save()
new_pk = self.pk
self.pk = old_pk
self.refresh_from_db()
return MyModel.objects.get(pk=new_pk)
I'm trying to override update method of ModelSerializer so I can update the related set of objects while updating the instance.
The problem is that validated_data doesn't include ids of these objects even when I'm passing there to the serializer.
def update(self, instance, validated_data):
subcontracts = validated_data.pop('subcontracts') # THERE ARE NO IDS HERE
contract = super().update(instance,validated_data)
instance.subcontracts.exclude(id__in=[x.get('id',None) for x in subcontracts if x is not None]).delete()
for subcontract in subcontracts:
_id = subcontract.get('id',None)
if _id:
sc = SubContract.objects.get(id=_id)
else:
sc = SubContract()
for attr, value in subcontract.items():
setattr(sc, attr, value)
sc.contract = instance
sc.save()
return contract
Basically, what I want to do:
{id:4, some_data..., subcontracts:[{id:1,description:'xxx'},{id:2,description:'yyy'}]}
This data would (except updating the instance) delete related subcontracts not having id 1 or 2 and update the rest.
Do you know what to do?
I'm working on a project using Flask and a PostgreSQL database, with SQLAlchemy.
I have Group objects which have a list of User IDs who are members of the group. For some reason, when I try to add an ID to a group, it will not save properly.
If I try members.append(user_id), it doesn't seem to work at all. However, if I try members += [user_id], the id will show up in the view listing all the groups, but if I restart the server, the added value(s) is (are) not there. The initial values, however, are.
Related code:
Adding group to the database initially:
db = SQLAlchemy(app)
# ...
g = Group(request.form['name'], user_id)
db.session.add(g)
db.session.commit()
The Group class:
from flask.ext.sqlalchemy import SQLAlchemy
from sqlalchemy.dialects.postgresql import ARRAY
class Group(db.Model):
__tablename__ = "groups"
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String(128))
leader = db.Column(db.Integer)
# list of the members in the group based on user id
members = db.Column(ARRAY(db.Integer))
def __init__(self, name, leader):
self.name = name
self.leader = leader
self.members = [leader]
def __repr__(self):
return "Name: {}, Leader: {}, Members: {}".format(self.name, self.leader, self.members)
def add_user(self, user_id):
self.members += [user_id]
My test function for updating the Group:
def add_2_to_group():
g = Group.query.all()[0]
g.add_user(2)
db.session.commit()
return redirect(url_for('show_groups'))
Thanks for any help!
As you have mentioned, the ARRAY datatype in sqlalchemy is immutable. This means it isn’t possible to add new data into array once it has been initialised.
To solve this, create class MutableList.
from sqlalchemy.ext.mutable import Mutable
class MutableList(Mutable, list):
def append(self, value):
list.append(self, value)
self.changed()
#classmethod
def coerce(cls, key, value):
if not isinstance(value, MutableList):
if isinstance(value, list):
return MutableList(value)
return Mutable.coerce(key, value)
else:
return value
This snippet allows you to extend a list to add mutability to it. So, now you can use the class above to create a mutable array type like:
class Group(db.Model):
...
members = db.Column(MutableList.as_mutable(ARRAY(db.Integer)))
...
You can use the flag_modified function to mark the property as having changed. In this example, you could change your add_user method to:
from sqlalchemy.orm.attributes import flag_modified
# ~~~
def add_user(self, user_id):
self.members += [user_id]
flag_modified(self, 'members')
To anyone in the future: so it turns out that arrays through SQLAlchemy are immutable. So, once they're initialized in the database, they can't change size. There's probably a way to do this, but there are better ways to do what we're trying to do.
This is a hacky solution, but what you can do is:
Store the existing array temporarily
Set the column value to None
Set the column value to the existing temporary array
For example:
g = Group.query.all()[0]
temp_array = g.members
g.members = None
db.session.commit()
db.session.refresh(g)
g.members = temp_array
db.session.commit()
In my case it was solved by using the new reference for storing a object variable and assiging that new created variable in object variable.so, Instead of updating the existing objects variable it will create a new reference address which reflect the changes.
Here in Model,
Table: question
optional_id = sa.Column(sa.ARRAY(sa.Integer), nullable=True)
In views,
option_list=list(question.optional_id if question.optional_id else [])
if option_list:
question.optional_id.clear()
option_list.append(obj.id)
question.optional_id=option_list
else:
question.optional_id=[obj.id]
I've been working with Scrapy but run into a bit of a problem.
DjangoItem has a save method to persist items using the Django ORM. This is great, except that if I run a scraper multiple times, new items will be created in the database even though I may just want to update a previous value.
After looking at the documentation and source code, I don't see any means to update existing items.
I know that I could call out to the ORM to see if an item exists and update it, but it would mean calling out to the database for every single object and then again to save the item.
How can I update items if they already exist?
Unfortunately, the best way that I found to accomplish this is to do exactly what was stated: Check if the item exists in the database using django_model.objects.get, then update it if it does.
In my settings file, I added the new pipeline:
ITEM_PIPELINES = {
# ...
# Last pipeline, because further changes won't be saved.
'apps.scrapy.pipelines.ItemPersistencePipeline': 999
}
I created some helper methods to handle the work of creating the item model, and creating a new one if necessary:
def item_to_model(item):
model_class = getattr(item, 'django_model')
if not model_class:
raise TypeError("Item is not a `DjangoItem` or is misconfigured")
return item.instance
def get_or_create(model):
model_class = type(model)
created = False
# Normally, we would use `get_or_create`. However, `get_or_create` would
# match all properties of an object (i.e. create a new object
# anytime it changed) rather than update an existing object.
#
# Instead, we do the two steps separately
try:
# We have no unique identifier at the moment; use the name for now.
obj = model_class.objects.get(name=model.name)
except model_class.DoesNotExist:
created = True
obj = model # DjangoItem created a model for us.
return (obj, created)
def update_model(destination, source, commit=True):
pk = destination.pk
source_dict = model_to_dict(source)
for (key, value) in source_dict.items():
setattr(destination, key, value)
setattr(destination, 'pk', pk)
if commit:
destination.save()
return destination
Then, the final pipeline is fairly straightforward:
class ItemPersistencePipeline(object):
def process_item(self, item, spider):
try:
item_model = item_to_model(item)
except TypeError:
return item
model, created = get_or_create(item_model)
update_model(model, item_model)
return item
I think it could be done more simply with
class DjangoSavePipeline(object):
def process_item(self, item, spider):
try:
product = Product.objects.get(myunique_id=item['myunique_id'])
# Already exists, just update it
instance = item.save(commit=False)
instance.pk = product.pk
except Product.DoesNotExist:
pass
item.save()
return item
Assuming your django model has some unique id from the scraped data, such as a product id, and here assuming your Django model is called Product.
for related models with foreignkeys
def update_model(destination, source, commit=True):
pk = destination.pk
source_fields = fields_for_model(source)
for key in source_fields.keys():
setattr(destination, key, getattr(source, key))
setattr(destination, 'pk', pk)
if commit:
destination.save()
return destination
I'd like to save my files using the primary key of the entry.
Here is my code:
def get_nzb_filename(instance, filename):
if not instance.pk:
instance.save() # Does not work.
name_slug = re.sub('[^a-zA-Z0-9]', '-', instance.name).strip('-').lower()
name_slug = re.sub('[-]+', '-', name_slug)
return u'files/%s_%s.nzb' % (instance.pk, name_slug)
class File(models.Model):
nzb = models.FileField(upload_to=get_nzb_filename)
name = models.CharField(max_length=256)
I know the first time an object is saved the primary key isn't available, so I'm willing to take the extra hit to save the object just to get the primary key, and then continue on.
The above code doesn't work. It throws the following error:
maximum recursion depth exceeded while calling a Python object
I'm assuming this is an infinite loop. Calling the save method would call the get_nzb_filename method, which would again call the save method, and so on.
I'm using the latest version of the Django trunk.
How can I get the primary key so I can use it to save my uploaded files?
Update #muhuk:
I like your solution. Can you help me implement it? I've updated my code to the following and the error is 'File' object has no attribute 'create'. Perhaps I'm using what you've written out of context?
def create_with_pk(self):
instance = self.create()
instance.save()
return instance
def get_nzb_filename(instance, filename):
if not instance.pk:
create_with_pk(instance)
name_slug = re.sub('[^a-zA-Z0-9]', '-', instance.name).strip('-').lower()
name_slug = re.sub('[-]+', '-', name_slug)
return u'files/%s_%s.nzb' % (instance.pk, name_slug)
class File(models.Model):
nzb = models.FileField(upload_to=get_nzb_filename, blank=True, null=True)
name = models.CharField(max_length=256)
Instead of enforcing the required field in my model I'll do it in my Form class. No problem.
It seems you'll need to pre-generate your File models with empty file fields first. Then pick up one and save it with the given file object.
You can have a custom manager method like this;
def create_with_pk(self):
instance = self.create()
instance.save() # probably this line is unneeded
return instance
But this will be troublesome if either of your fields is required. Because you are initially creating a null object, you can't enforce required fields on the model level.
EDIT
create_with_pk is supposed to be a custom manager method, in your code it is just a regular method. Hence self is meaningless. It is all properly documented with examples.
You can do this by setting upload_to to a temporary location and by creating a custom save method.
The save method should call super first, to generate the primary key (this will save the file to the temporary location). Then you can rename the file using the primary key and move it to it's proper location. Call super one more time to save the changes and you are good to go! This worked well for me when I came across this exact issue.
For example:
class File( models.Model ):
nzb = models.FileField( upload_to='temp' )
def save( self, *args, **kwargs ):
# Call save first, to create a primary key
super( File, self ).save( *args, **kwargs )
nzb = self.nzb
if nzb:
# Create new filename, using primary key and file extension
oldfile = self.nzb.name
dot = oldfile.rfind( '.' )
newfile = str( self.pk ) + oldfile[dot:]
# Create new file and remove old one
if newfile != oldfile:
self.nzb.storage.delete( newfile )
self.nzb.storage.save( newfile, nzb )
self.nzb.name = newfile
self.nzb.close()
self.nzb.storage.delete( oldfile )
# Save again to keep changes
super( File, self ).save( *args, **kwargs )
Context
Had the same issue.
Solved it attributing an id to the current object by saving the object first.
Method
create a custom upload_to function
detect if object has pk
if not, save instance first, retrieve the pk and assign it to the object
generate your path with that
Sample working code :
class Image(models.Model):
def upload_path(self, filename):
if not self.pk:
i = Image.objects.create()
self.id = self.pk = i.id
return "my/path/%s" % str(self.id)
file = models.ImageField(upload_to=upload_path)
You can create pre_save and post_save signals. Actual file saving will be in post_save, when pk is already created.
Do not forget to include signals in app.py so they work.
Here is an example:
_UNSAVED_FILE_FIELD = 'unsaved_file'
#receiver(pre_save, sender=File)
def skip_saving_file_field(sender, instance: File, **kwargs):
if not instance.pk and not hasattr(instance, _UNSAVED_FILE_FIELD):
setattr(instance, _UNSAVED_FILE_FIELD, instance.image)
instance.nzb = None
#receiver(post_save, sender=File)
def save_file_field(sender, instance: Icon, created, **kwargs):
if created and hasattr(instance, _UNSAVED_FILE_FIELD):
instance.nzb = getattr(instance, _UNSAVED_FILE_FIELD)
instance.save()
Here are 2 possible solutions:
Retrieve id before inserting a row
For simplicity I use postgresql db, although it is possible to adjust implementation for your db backend.
By default django creates id as bigserial (or serial depending on DEFAULT_AUTO_FIELD). For example, this model:
class File(models.Model):
nzb = models.FileField(upload_to=get_nzb_filename)
name = models.CharField(max_length=256)
Produces the following DDL:
CREATE TABLE "example_file" ("id" bigserial NOT NULL PRIMARY KEY, "nzb" varchar(100) NOT NULL, "name" varchar(256) NOT NULL);
There is no explicit sequence specification. By default bigserial creates sequence name in the form of tablename_colname_seq (example_file_id_seq in our case)
The solution is to retrieve this id using nextval :
def get_nextval(model, using=None):
seq_name = f"{model._meta.db_table}_id_seq"
if using is None:
using = "default"
with connections[using].cursor() as cursor:
cursor.execute("select nextval(%s)", [seq_name])
return cursor.fetchone()[0]
And set it before saving the model:
class File(models.Model):
# fields definition
def save(
self, force_insert=False, force_update=False, using=None, update_fields=None
):
if not self.pk:
self.pk = get_nextval(self, using=using)
force_insert = True
super().save(
force_insert=force_insert,
force_update=force_update,
using=using,
update_fields=update_fields,
)
Note that we rely on force_insert behavior, so make sure to read documentation and cover your code with tests:
from django.core.files.uploadedfile import SimpleUploadedFile
from django.forms import ModelForm
from django.test import TestCase
from example import models
class FileForm(ModelForm):
class Meta:
model = models.File
fields = (
"nzb",
"name",
)
class FileTest(TestCase):
def test(self):
form = FileForm(
{
"name": "picture",
},
{
"nzb": SimpleUploadedFile("filename", b"content"),
},
)
self.assertTrue(form.is_valid())
form.save()
self.assertEqual(models.File.objects.count(), 1)
f = models.File.objects.first()
self.assertRegexpMatches(f.nzb.name, rf"files/{f.pk}_picture(.*)\.nzb")
Insert without nzt then update with actual nzt value
The idea is self-explanatory - we basically pop nzt on the object creation and save object again after we know id:
def save(
self, force_insert=False, force_update=False, using=None, update_fields=None
):
nzb = None
if not self.pk:
nzb = self.nzb
self.nzb = None
super().save(
force_insert=force_insert,
force_update=force_update,
using=using,
update_fields=update_fields,
)
if nzb:
self.nzb = nzb
super().save(
force_insert=False,
force_update=True,
using=using,
update_fields=["nzb"],
)
Test is updated to check actual queries:
def test(self):
form = FileForm(
{
"name": "picture",
},
{
"nzb": SimpleUploadedFile("filename", b"content"),
},
)
self.assertTrue(form.is_valid())
with CaptureQueriesContext(connection) as ctx:
form.save()
self.assertEqual(models.File.objects.count(), 1)
f = models.File.objects.first()
self.assertRegexpMatches(f.nzb.name, rf"files/{f.pk}_picture(.*)\.nzb")
self.assertEqual(len(ctx.captured_queries), 2)
insert, update = ctx.captured_queries
self.assertEqual(
insert["sql"],
'''INSERT INTO "example_file" ("nzb", "name") VALUES ('', 'picture') RETURNING "example_file"."id"''',
)
self.assertRegexpMatches(
update["sql"],
rf"""UPDATE "example_file" SET "nzb" = 'files/{f.pk}_picture(.*)\.nzb' WHERE "example_file"."id" = {f.pk}""",
)
Ty, is there a reason you rolled your own slugify filter?
Django ships with a built-in slugify filter, you can use it like so:
from django.template.defaultfilters import slugify
slug = slugify(some_string)
Not sure if you were aware it was available to use...
You can use the next available primary key ID:
class Document(models.Model):
def upload_path(self, filename):
if not self.pk:
document_next_id = Document.objects.order_by('-id').first().id + 1
self.id = self.pk = document_next_id
return "my/path/document-%s" % str(self.pk)
document = models.FileField(upload_to=upload_path)
Details
My example is a modification of #vinyll's answer, however, the problem Giles mentioned in his comment (two objects being created) is resolved here.
I am aware that my answer is not perfect, and there can be issues with the "next available ID", e.g., when more users will attempt to submit many forms at once. Giles's answer is more robust, mine is simpler (no need to generate temp files, then moving files, and deleting them). For simpler applications, this will be enough.
Also credits to Tjorriemorrie for the clear example on how to get the next available ID of an object.
Well I'm not sure of my answer but -
use nested models, if you can -
class File(models.Model):
name = models.CharField(max_length=256)
class FileName(models.Model):
def get_nzb_filename(instance, filename):
return instance.name
name = models.ForeignKey(File)
nzb = models.FileField(upload_to=get_nzb_filename)
And in create method -
File_name = validated_data.pop(file_name_data)
file = File.objects.create(validated_data)
F = FileName.objects.create(name=file, **File_name)