Deleting EmbeddedDocument with FileField from ListField

In MongoEngine, when deleting an EmbeddedDocument from a ListField which includes a FileField, the referenced file does not get deleted. Currently, I have solved the issue by looping over the whole list field.
for embdoc in doc.embdocs:
    if embdoc.filtered == value:
        embdoc.dfile.delete()
doc.update(pull__embdocs={'filtered': value})
I was wondering if there was a better way to do this.

By default, MongoDB doesn’t check the integrity of your data, so deleting documents that other documents still hold references to will lead to consistency issues.
You should use a ListField of ReferenceFields. A ReferenceField can be used with the option reverse_delete_rule=mongoengine.PULL or one of the other rules:
mongoengine.DO_NOTHING
This is the default and won’t do anything. Deletes are fast, but may cause database inconsistency or dangling references.
mongoengine.DENY
Deletion is denied if there still exist references to the object being deleted.
mongoengine.NULLIFY
Any object’s fields still referring to the object being deleted are removed (using MongoDB’s “unset” operation), effectively nullifying the relationship.
mongoengine.CASCADE
Any object containing fields that refer to the object being deleted is deleted first.
mongoengine.PULL
Removes the reference to the object (using MongoDB’s “pull” operation) from any object’s fields of ListField (ReferenceField).

I also needed to delete a file in a list field inside an embedded document. After a lot of searching I came across this solution.
The document:
import mongoengine as me

class AllTheFiles(me.EmbeddedDocument):
    type1 = me.ListField(me.FileField())
    type2 = me.ListField(me.FileField())

class MainDocument(me.Document):
    files = me.EmbeddedDocumentField(AllTheFiles)
I am assuming here that you have some documents and that they have files. In the real world you will need to check that the documents exist and that they contain files.
So, in order to delete the first file (index 0) in the type1 field:
del_here = MainDocument.objects()[0]
del_here.files.type1[0].delete()
del_here.files.type1.pop(0)
del_here.save()
The file will be deleted from the embedded document's type1 list and also from the "fs.files" and "fs.chunks" collections.

Related

SQLAlchemy: Flush-Order for Inserts wrong?

I have 2 tables (contracts and contract_items) where the latter has a foreign key set to the first one.
When using SQLAlchemy to insert new data from a list into my PostgreSQL database, I'm basically doing the following:
for row in list:
    # Get contract and item from row
    ...
    session.add(contract)
    session.add(contract_item)
    # Do some select statements (which will raise an auto-flush)
    ...
session.commit()
Now... this works for maybe 2-3 runs, sometimes more, sometimes less. Then the part where an auto-flush is executed ends in an exception telling me that contract_item could not be inserted because it has a foreign key to contract and the contract row does not exist yet.
Is the order in which I pass the data to the add function of the session not the order in which the data will be flushed? I actually hoped SQLAlchemy would find the right order in which to flush statements on its own, based on the dependencies. It should be clear that the contract_item row should not be inserted before the contract row when contract_item has a foreign key to contract. Yet the order seems to be random.
I then tried to flush the contract manually before adding contract_item:
for row in list:
    # Getting contract and item from row
    ...
    session.add(contract)
    session.flush() # Flushing manually
    session.add(contract_item)
    # Do some select statements (which will raise an auto-flush)
    ...
session.commit()
This worked without any problems and the rows got inserted into the database.
Is there any way to set the order in which statements will be flushed for the session? Does SQLAlchemy really not care about dependencies such as foreign keys, or am I making a mistake when adding the data? I'd rather not manage the flushes manually if at all possible.
Is there a way to make SQLAlchemy get the order right?

Had the same problem. What solved it in my case was creating a bidirectional relationship: you need to define a relationship from contracts to contract_items, as described HERE.
UPD: actually you can do it more simply: just add a relationship from the contract_items table to the contract table, and that should do the trick.

The way the session handles related objects is defined by cascades. Use the "save-update" cascade on a relationship (it is enabled by default) to automatically add related objects, so that you only need a single add call. The documentation I linked contains a code example.

GAE python ndb - How to get_by_id with projection?

I'd like to do this.
Content.get_by_id(content_id, projection=['title'])
However, I got an error.
TypeError: Unknown configuration option ('projection')
I suppose I should do something like this, but how?
Content.query(key=Key('Content', content_id)).get(projection=['title'])
Why bother with a projection when getting a single entity? Because Content.body could be large, and I want to reduce the db read time and instance hours.

If you are using ndb, the below query should work
Content.query(key=Key('Content', content_id)).get(projection=[Content.title])
Note: It gets this data from the query index. So, make sure that index is enabled for the column. Reference https://developers.google.com/appengine/docs/python/ndb/queries#projection

I figured out that the following code works.
Content.query(Content.key == ndb.Key('Content', content_id)).get(projection=['etag'])
I found a hint from https://developers.google.com/appengine/docs/python/ndb/properties
Don't name a property "key." This name is reserved for a special
property used to store the Model key. Though it may work locally, a
property named "key" will prevent deployment to App Engine.

There is a simpler method than the currently posted answers.
As previous answers have mentioned, projections are only for ndb.Queries.
Previous answers suggest performing a projection query of the form:
<Model>.query(<Model>.key == ndb.Key('<Model>', model_id)).get(projection=['property_1', 'property_2', ...])
However, you can just manipulate the model's _properties directly. (See: https://cloud.google.com/appengine/docs/standard/python/ndb/modelclass#intro_properties)
For example:
desired_properties = ['title', 'tags']
content = Content.get_by_id(content_id)
content._properties = {k: v for k, v in content._properties.iteritems()
                       if k in desired_properties}
print content
This would update the entity properties and only return those properties whose keys are in the desired_properties list.
Not sure if this is the intended functionality behind _properties but it works, and it also prevents the need of generating/maintaining additional indexes for the projection queries.
The only down-side is that this retrieves the entire entity in-memory first. If the entity has arbitrarily large metadata properties that will affect performance, it would be a better idea to use the projection query instead.

Projection is only for query, not get by id. You can put the content.body in a different db model and store only the ndb.Key of it in the Content.

What is meant by OpenERP fields.reference?

I saw this code segment in the subscription.py class. It presents a selection field and a many2one field together to the user. I looked in the OpenERP documentation and in other modules as well, but I never found any details or other samples for this.
Here is its view:
Here is the code related to that field:
'doc_source': fields.reference('Source Document', required=True, selection=_get_document_types, size=128),
Here is the code of the selection function:
def _get_document_types(self, cr, uid, context=None):
    cr.execute('select m.model, s.name from subscription_document s, ir_model m WHERE s.model = m.id order by s.name')
    return cr.fetchall()
I need to know: can we make our own fields.reference type fields,
with another combination instead of MODEL,NAME?

In the OpenERP framework a fields.reference field is a pseudo-many2one relationship that can target multiple models. That is, it contains the name of the target model in addition to the foreign key, so that each value can belong to a different table. The user interface first presents a drop-down where the user selects the target document model, and then a many2one widget in which they can pick the specific document from that model.
You can of course use it in your own modules, but it will always behave in this manner.
This is typically used for attaching various documents (similarly to attachments except the target is another record rather than a file). It's also used in some internal OpenERP models that need to be attached to different types of record, such as properties (fields.property values) that may belong to any record.
The fields.reference constructor takes 3 main parameters:
'doc': fields.reference('Field Label', selection, size)
where selection contains the list of document models from which values can be selected (e.g. Partners, Products, etc.), in the same form as in a fields.selection declaration. The keys of the selection values must be the model names (e.g. 'res.partner').
As of OpenERP 7.0 the size parameter should be None, unless you want to specifically restrict the size of the database field where the values will be stored, which is probably a bad idea. Technically, fields.reference values are stored as text in the form model.name,id. You won't be able to use these fields in a regular SQL JOIN, so they won't behave like many2one fields in many cases.
Main API calls
When you programmatically read() a non-null reference value you have to split it on ',' to identify the target model and target ID
When you programmatically write() a non-null reference value you need to pass the 'model.name,id' string.
When you search() for a non-null reference value you need to search for the 'model.name,id' string (e.g. in a search domain)
Finally, when you browse() through a reference value programmatically the framework will automatically dereference it and follow the relationship as with a regular many2one field - this is the main exception to the rule ;-)
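
The read() and write() conventions above can be sketched with two small helpers. This is only an illustration in plain Python: the names parse_reference and format_reference are hypothetical and not part of the OpenERP API; they simply apply the 'model.name,id' string format described in the answer.

```python
def parse_reference(value):
    """Split a 'model.name,id' reference string into (model, id).

    Returns None for an empty or false value, i.e. an unset reference.
    """
    if not value:
        return None
    model, _, res_id = value.partition(',')
    return model, int(res_id)


def format_reference(model, res_id):
    """Build the 'model.name,id' string used when writing or searching."""
    return '%s,%d' % (model, res_id)


print(parse_reference('res.partner,42'))
print(format_reference('product.product', 7))
```

The same string built by format_reference would also be the value to compare against in a search domain, according to the rules listed above.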

MongoEngine 0.8.3 NotUniqueError on _id field

After upgrading MongoEngine from 0.7.9 to 0.8.3, any attempt to save an existing document in any collection results in a NotUniqueError (user collection shown in the example):
Tried to save duplicate unique keys (E11000 duplicate key error index: foo.user.$_id_ dup key: { : ObjectId('xxxxxx') })
I get the same error if I create a new document and save it more than once:
a = Foo()
a.save()
a.save() # results in duplicate error
Mongo by default creates an index on _id which cannot be removed, and I have no other indexes which use _id. Most issues similar to this that I've seen have been on duplicate indexes that aren't _id and can be removed, but this is really odd. I am doing nothing weird with the _id field, just letting Mongo generate it on its own.
Any ideas on what might be causing this to happen?
Thanks!

There was a custom save function which hadn't been migrated to the new save() arguments, so one of the arguments it passed along caused force_insert to evaluate to true.
So dumb...
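
The answer does not show the custom save function, so the following is only a guess at the kind of mistake involved: arguments forwarded by position end up in the wrong parameter once the parameter order changes. The functions below are simplified, hypothetical stand-ins written for this sketch, not MongoEngine's actual save() signatures.

```python
def save_old(safe=True, force_insert=False, validate=True):
    """Stand-in for an older save() whose first parameter was `safe`."""
    return {'force_insert': force_insert}


def save_new(force_insert=False, validate=True, clean=True):
    """Stand-in for a newer save() where `force_insert` comes first."""
    return {'force_insert': force_insert}


def custom_save_positional(save, safe=True, force_insert=False):
    # Forwarding by position is only correct while the order matches.
    return save(safe, force_insert)


def custom_save_keyword(save, force_insert=False):
    # Forwarding by keyword survives a reordering of the parameters.
    return save(force_insert=force_insert)


print(custom_save_positional(save_old))  # force_insert stays False
print(custom_save_positional(save_new))  # safe=True lands in force_insert
print(custom_save_keyword(save_new))     # force_insert stays False
```

If this is what happened, passing the arguments by keyword in the custom save function would avoid the problem.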

Django get_or_create raises Duplicate entry for key Primary with defaults

Help! Can't figure this out! I'm getting an IntegrityError on get_or_create even with a defaults parameter set.
Here's how the model looks, stripped down.
class Example(models.Model):
    user = models.ForeignKey(User)
    text = models.TextField()

    def __unicode__(self):
        return "Example"
I run this in Django:
def create_example_model(user, textJson):
    defaults = {'text': textJson.get("text", "undefined")}
    model, created = models.Example.objects.get_or_create(
        user=user,
        id=textJson.get("id", None),
        defaults=defaults)
    if not created:
        # The row already existed, so update its text explicitly.
        model.text = textJson.get("text", "undefined")
        model.save()
    return model
I'm getting an error on the get_or_create line:
IntegrityError: (1062, "Duplicate entry '3020' for key 'PRIMARY'")
It's live so I can't really tell what the input is.
Help? There is actually a defaults set, so it's not like the following question, where they do not have defaults. Plus, my model doesn't have unique_together. The question I mean: Django : get_or_create Raises duplicate entry with together_unique
I'm using python 2.6, and mysql.

In general you shouldn't be setting the id for objects yourself; you have to be careful when doing that.
Have you checked the value of 'id' that you are putting into the database?
If that doesn't fix your issue then it may be a database issue. In PostgreSQL a special sequence is used to increment the IDs, and sometimes it does not get incremented. You can reset it with something like the following:
SELECT setval('tablename_id_seq', (SELECT MAX(id) + 1 FROM tablename));

get_or_create() will try to create a new object if it can't find one that is an exact match to the arguments you pass in.
What I assume is happening is that a different user has made an object with the id 3020. Since there is no object with the user/id combination you're requesting, it tries to make a new object with that combination, but fails because a different user has already created an item with the id 3020.
Hopefully that makes sense. See what the following returns. Might give a little insight as to what has gone on.
models.Example.objects.get(id=3020)
You might need to make 3020 a string in the lookup. I'm assuming a string is coming back from your textJson.get() method.
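
The scenario described above can be reproduced without Django. The sketch below uses the standard-library sqlite3 module and a hypothetical get_or_create helper that mimics the "look up by all arguments, insert if nothing matches" behaviour; the table and column names are made up for the illustration, and the asker's real setup uses MySQL.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE example (id INTEGER PRIMARY KEY, user_id INTEGER, text TEXT)')
# A row with id 3020 already exists, but it belongs to user 1.
conn.execute("INSERT INTO example VALUES (3020, 1, 'first')")


def get_or_create(conn, user_id, row_id, text):
    """Hypothetical helper: look up by (user_id, id), insert if not found."""
    row = conn.execute(
        'SELECT id, user_id, text FROM example WHERE user_id = ? AND id = ?',
        (user_id, row_id)).fetchone()
    if row is not None:
        return row, False
    # No exact match, so try to create a row with the requested id.
    conn.execute('INSERT INTO example VALUES (?, ?, ?)',
                 (row_id, user_id, text))
    return (row_id, user_id, text), True


# Same user and id: the existing row is found, nothing is inserted.
print(get_or_create(conn, 1, 3020, 'ignored'))

# Same id but a different user: the lookup misses and the insert collides.
try:
    get_or_create(conn, 2, 3020, 'second')
    error = None
except sqlite3.IntegrityError as exc:
    error = exc
print('IntegrityError raised:', error is not None)
```

Looking the row up by id alone, and checking the user afterwards, would avoid attempting the insert in this situation.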

One common but little-documented cause of get_or_create() failures is corrupted database indexes.
Django depends on the assumption that there is only one record for a given identifier, and this is in turn enforced using a UNIQUE index on that particular field in the database. But indexes are constantly being rewritten and they may get corrupted, e.g. when the database crashes unexpectedly. In such a case the index may no longer return information about an existing record, another record with the same field value is added, and as a result you'll be hitting the IntegrityError each time you try to get or create this particular record.
The solution is, at least in PostgreSQL, to REINDEX this particular index, but you first need to get rid of the duplicate rows programmatically.
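
Finding and removing the duplicate rows could look roughly like the sketch below. It uses the standard-library sqlite3 module with a made-up table, purely to show the GROUP BY / HAVING pattern. The cleanup step relies on SQLite's rowid, which is an assumption of this sketch: PostgreSQL has no rowid, so there you would need another way to tell the duplicate rows apart (for example the ctid system column).

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# No UNIQUE constraint, to mimic a table whose index failed to enforce it.
conn.execute('CREATE TABLE example (id INTEGER, text TEXT)')
conn.executemany('INSERT INTO example VALUES (?, ?)',
                 [(1, 'a'), (2, 'b'), (2, 'b again'), (3, 'c')])

# Values of id that occur more than once, with their counts.
duplicates = conn.execute(
    'SELECT id, COUNT(*) FROM example GROUP BY id HAVING COUNT(*) > 1'
).fetchall()
print(duplicates)

# Keep the row with the lowest rowid for each id and delete the others.
conn.execute(
    'DELETE FROM example WHERE rowid NOT IN '
    '(SELECT MIN(rowid) FROM example GROUP BY id)')
remaining = conn.execute(
    'SELECT id, text FROM example ORDER BY id').fetchall()
print(remaining)
```

Which of the duplicate rows to keep is a decision about your data; keeping the oldest one here is just the simplest choice for the example.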
