How to delete a record using GQL? - python

I need to iterate over and delete all the records in my datastore. I am using the Google App Engine Launcher to test it on localhost. How can I do it?
When I try to delete all the records in the Person model this way:
qObj = Person.all()
db.delete(qObj)
I get the error BadValueError: Property y must be a str or unicode instance, not a long.
I guess there is a conflict in the model's data types.
class Person(db.Model):
    name = db.StringProperty()
    x = db.StringProperty()
    y = db.StringProperty()
    group = db.StringProperty()
The field y = db.StringProperty() previously was y = db.IntegerProperty().
At this moment I need to flush all the datastore records. How can I do that?
Is there a way to delete the local file that stores all the records?

The GQL language can only be used to retrieve entities or keys (cf. http://code.google.com/appengine/docs/python/datastore/gqlreference.html).
You'll have to do this:
persons = Person.all()
for p in persons:
    p.delete()
Regarding the error BadValueError: Property y must be a str or unicode instance, not a long, you'll have to modify all the data (from integer to string) in the database to resolve the conflict.
It seems that you want to delete everything, so another solution would be to just go to the datastore administration page - http://localhost:8080/_ah/admin on localhost or via https://appengine.google.com/ - and remove everything.
You might find this useful: http://code.google.com/appengine/articles/update_schema.html
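A rough sketch of that conversion, along the lines of the schema-update article above (untested; the temporary PersonMigration class is just an illustrative helper):
from google.appengine.ext import db

# Temporary migration model bound to the same datastore kind as Person.
# Expando skips property validation, so entities whose y is still a long
# can be loaded without raising BadValueError.
class PersonMigration(db.Expando):
    @classmethod
    def kind(cls):
        return 'Person'

for entity in PersonMigration.all():
    value = getattr(entity, 'y', None)
    if isinstance(value, (int, long)):
        entity.y = str(value)
        entity.put()
Once every entity stores y as a string, the regular Person model with y = db.StringProperty() will load them without complaint.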

If you have a variable that holds an entity from the datastore, you can simply call delete() on it.
That is, say you have an Entity called Persons, you can do:
personToDelete = db.GqlQuery("SELECT * FROM Persons WHERE name='Joe'")
person = personToDelete[0]
person.delete()
You also have to import the db module, but I'm assuming you do that anyway given that you're clearly using the datastore.

Just to share a helpful tip on top of the already accepted answer.
You can do the following with db.delete:
persons = Person.all()
d = []
for p in persons:
    d.append(p)
db.delete(d)
Passing the whole list to db.delete() issues a single batch call instead of one call per entity, which saves a lot of datastore operations.
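If you only need to delete, you can also skip loading the full entities and work with keys alone, which is both cheaper and sidesteps the BadValueError from the question, since keys-only queries never deserialize the properties (a sketch, assuming the classic db API):
# Fetch only the keys, then delete them in one batch call.
person_keys = Person.all(keys_only=True).fetch(1000)
db.delete(person_keys)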


Unable to update document without specifying Primary Key

schema.py:
class Test(Document):
    _id = StringField()
    classID = StringField(required=True, unique=True)
    status = StringField()
====================
database.py:
query = schema.Test(_id = id)
query.update(status = "confirm")
I get the following error: Critical error occured. attempt to update a document not yet saved
I can update the DB only if I declare _id = StringField(primary_key=True), but then when I insert new data the _id has to be supplied by me instead of being created automatically by MongoDB.
Anyone can help me with a solution?
Thanks!
Inserts and updates are distinct operations in MongoDB:
Insert adds a document to the collection
Update finds a document in the collection given a search criteria, then changes this document
If you haven't inserted a document, trying to update it won't do anything since it will never be found by any search criteria. Your ODM is pointing this out to you and prevents you from updating a document you haven't saved. Using the driver you can issue the update anyway but it won't have any effect.
If you want to add a new document to the database, use inserts. To change documents that are already saved, use updates. To change fields on document instances without saving them, consult your ODM documentation to figure out how to do that instead of attempting to save the documents.
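A minimal sketch of the two paths with MongoEngine (the field values and database name are made up; not tested):
from mongoengine import Document, StringField, connect

connect('mydb')  # hypothetical database name

class Test(Document):
    classID = StringField(required=True, unique=True)
    status = StringField()

# Insert: save() a new document; MongoDB generates the _id automatically.
doc = Test(classID='CS101', status='pending')
doc.save()

# Update: find the already-saved document by a search criterion, then change it.
Test.objects(classID='CS101').update_one(set__status='confirm')

# Or load it, change the field in memory, and save() again.
doc = Test.objects.get(classID='CS101')
doc.status = 'confirm'
doc.save()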

How to change Column label using SqlAlchemy ORM

I have an MS Access DB file (.accdb), and photos of persons are stored inside it as attachments. In the table designer I see only one field, "photos", of type "Attachment". Actually there are three hidden fields named photos.FileData, photos.FileName and photos.FileType. To parse these fields I created the following class:
class Person:
    __tablename__ = 'persons'
    name = Column(String(255), name='name')
    photos_data = Column(String, name='photos.FileData', quote=False)
    ....
If I try to get all attributes of Person at the same time, as follows:
persons = session.query(Person)
I get an error from the following generated piece of the SQL statement:
SELECT ... [persons].photos.FileData AS persons_photos.FileData ...;
As you can see, there is a dot in the alias, which raises an ODBC error. I can avoid this behavior by requesting FileData as a separate value:
persons = session.query(Person.photos_data.label('photos_data'))
Or I can use raw SQL without aliases. But this is not the normal ORM way that I need, because I have to manually construct Person objects each time after a request to the DB.
Is it possible to set my own label on a Column during its declaration, or even disable the label for a selected column?
I saw this great answer, but it seems it is not applicable to my case. The statement below doesn't work properly:
photos_data = Column(String, name='photos.FileData', quote=False).label('photos_data')
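For reference, a sketch of the manual workaround described above: label every column so no alias contains a dot, then rebuild plain row objects by hand (the PersonRow helper is hypothetical):
from collections import namedtuple

PersonRow = namedtuple('PersonRow', ['name', 'photos_data'])

# Each attachment sub-field gets a dot-free label, so the generated SQL
# no longer produces an alias like persons_photos.FileData.
rows = session.query(
    Person.name.label('name'),
    Person.photos_data.label('photos_data'),
).all()

persons = [PersonRow(*row) for row in rows]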

Where clause in Google App Engine Datastore

The model for my Resource class is as follows:
class Resource(ndb.Model):
    name = ndb.StringProperty()
    availability = ndb.StructuredProperty(Availability, repeated=True)
    tags = ndb.StringProperty(repeated=True)
    owner = ndb.StringProperty()
    id = ndb.StringProperty(indexed=True, required=True)
    lastReservedTime = ndb.DateTimeProperty(auto_now_add=False)
    startString = ndb.StringProperty()
    endString = ndb.StringProperty()
I want to extract records where the owner is equal to a certain string.
I have tried the below query. It does not give an error but does not return any result either.
Resource.query(Resource.owner== 'abc#xyz.com').fetch()
As per my understanding if a column has duplicate values it shouldn't be indexed and that is why owner is not indexed. Please correct me if I am wrong.
Can someone help me figure out how to achieve a where clause kind of functionality?
Any help is appreciated! Thanks!
Just tried this. It worked first time. Either you have no Resource entities with an owner of "abc#xyz.com", or the owner property was not indexed when the entities were put (which can happen if you had indexed=False at the time the entities were put).
My test:
Resource(id='1', owner='abc#xyz.com').put()
Resource(id='2', owner='abc#xyz.com').put()
resources = Resource.query(Resource.owner == 'abc#xyz.com').fetch()
assert len(resources) == 2
Also, your comment:
As per my understanding if a column has duplicate values it shouldn't be indexed and that is why owner is not indexed. Please correct me if I am wrong.
You're wrong!
Firstly, there is no concept of a 'column' in a datastore model, so I will assume you mean 'property'.
Next, to clarify what you mean by "if a property has duplicate values":
I assume you mean 'multiple entities created from the same model with the same value for a specific property', in your case 'owner'. This has no effect on indexing; each entity will be indexed as expected.
Or maybe you mean 'a single entity with a property that allows multiple values (i.e. a list)', which also does not prevent indexing. In this case, the entity will be indexed multiple times, once for each item in the list.
To further elaborate, most properties (i.e. ones that accept primitive types such as string, int, float, etc.) are indexed automatically, unless you add the attribute indexed=False to the property constructor. In fact, the only time you really need to worry about indexing is when you need to perform more complex queries, which involve querying against more than one property (and even then, by default, the App Engine dev server will auto-create the indexes for you in your local index.yaml file), or when using inequality filters.
Please read the docs for more detail.
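To illustrate (a hedged sketch; the extra properties and the 'gpu' tag are made up):
from google.appengine.ext import ndb

class Resource(ndb.Model):
    owner = ndb.StringProperty()                        # indexed by default
    internal_note = ndb.StringProperty(indexed=False)   # excluded from indexes, cannot be filtered on
    tags = ndb.StringProperty(repeated=True)            # list property: one index entry per item

# A single-property equality filter works out of the box on an indexed property.
Resource.query(Resource.owner == 'abc#xyz.com').fetch()

# Filtering on more than one property may need a composite index in index.yaml;
# the dev server adds it automatically when it first sees the query.
Resource.query(Resource.owner == 'abc#xyz.com', Resource.tags == 'gpu').fetch()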
Hope this helps!

Efficient way to do large IN query in Google App Engine?

A user accesses his contacts on his mobile device. I want to send back to the server all the phone numbers (say 250), and then query for any User entities that have matching phone numbers.
A user has a phone field which is indexed. So I do User.query(User.phone.IN(phone_list)), but I just looked at AppStats, and this is damn expensive. It costs me 250 reads for this one operation, and this is something I expect a user to do often.
What are some alternatives? I suppose I can set the User entity's id value to be his phone number (i.e. when creating a user I'd do user = User(id = phone_number)), and then get directly by keys via ndb.get_multi(phones), but I also want to perform this same query with emails too.
Any ideas?
You could create a PhoneUser model like so:
from google.appengine.ext import ndb

class PhoneUser(ndb.Model):
    number = ndb.StringProperty()
    user = ndb.KeyProperty()

class User(ndb.Model):
    pass

u = User()
u.put()
p = PhoneUser(id='123-456-7890', number='123-456-7890', user=u.key)
p.put()

u2 = User()
u2.put()
p2 = PhoneUser(id='555-555-5555', number='555-555-5555', user=u2.key)
p2.put()

result = ndb.get_multi([ndb.Key(PhoneUser, '123-456-7890'), ndb.Key(PhoneUser, '555-555-5555')])
I think that would work in this situation. You would just have to add/delete your PhoneUser model whenever you update your User. You can do this using post hooks: https://developers.google.com/appengine/docs/python/ndb/modelclass#Model__post_delete_hook
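A rough sketch of such a hook (assuming User also stores the phone number, as in the question; untested):
class User(ndb.Model):
    phone = ndb.StringProperty()

    def _post_put_hook(self, future):
        # Keep the lookup entity in sync whenever a User is written.
        if self.phone:
            PhoneUser(id=self.phone, number=self.phone, user=self.key).put()

    @classmethod
    def _post_delete_hook(cls, key, future):
        # The phone number is not available here without loading the entity
        # first, so real cleanup would need a bit more bookkeeping.
        pass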
I misunderstood part of your problem; I thought you were issuing a query that was giving you 250 entities.
I see what the problem is now: you're issuing an IN query with a list of 250 phone numbers. Behind the scenes, the datastore is actually doing 250 individual queries, which is why you're getting 250 read ops.
I can't think of a way to avoid this. I'd recommend avoiding searching on long lists of phone numbers. This seems like something you'd need to do only once, the first time the user logs in using that phone. Try to find some way to store the results and avoid the query again.
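For example, a hedged sketch of caching the result of the first lookup in memcache (the cache key and one-hour expiry are arbitrary choices):
from google.appengine.api import memcache

cache_key = 'matched-contacts:%s' % current_user_id  # hypothetical current user id
matched = memcache.get(cache_key)
if matched is None:
    matched = [u.key.id() for u in User.query(User.phone.IN(phone_list)).fetch()]
    memcache.set(cache_key, matched, time=3600)  # re-run the expensive IN query at most hourly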
There is no efficient way to do an IN query, so instead avoid it altogether.
How? Invert the query: instead of finding all the people that belong to this guy's phone list, try finding all the people that have this user's phone id in their list.
This, however, is not without some extra cost: the phone list for each user must be stored and indexed.
class User(ndb.Model):
    phone_id = ndb.StringProperty()
    phoneList = ndb.StringProperty(repeated=True)  # indexed by default

# All users whose stored contact list contains this user's phone id:
User.query(User.phoneList == this_phone_number).fetch()

Designing a scalable product database on Google App Engine

I've built a product database that is divided into 3 parts, and each part has a "sub" part containing labels. But the more I work with it, the more unstable it feels, and with each addition I make it takes more and more code to get it to work.
A product is built of parts, and each part is of a type. Each product, part and type has a label. And there's a label for each language.
A product contains parts in 2 lists: one list for default parts (one of each type) and one for optional parts.
Now I want to add currency in the mix and have come to the decision to re-model the entire way I handle this.
The result I want to get is a list of all product objects that contains the name, description, price, all parts and all types that match the parts. And for these the correct language labels.
Like so:
product
- name
- description (by language)
- price (by currency)
- parts
- part (type name and part name by language)
- partPrice (by currency)
The problem is that my current setup is a wild mix of db.ReferenceProperty and db.ListProperty(db.Key).
Getting all the data out is a bit of a hassle that requires multiple for-loops, matching dicts and datastore calls. Well, it's a bit of a mess.
The re-modeled version (untested) looks like this:
class Products(db.Model):
    name = db.StringProperty()
    imageUrl = db.StringProperty()
    optionalParts = db.ListProperty(db.Key)
    defaultParts = db.ListProperty(db.Key)
    active = db.BooleanProperty(default=True)

    @property
    def itemId(self):
        return self.key().id()

class ProductPartTypes(db.Model):
    name = db.StringProperty()

    @property
    def itemId(self):
        return self.key().id()

class ProductParts(db.Model):
    name = db.StringProperty()
    type = db.ReferenceProperty(ProductPartTypes)
    imageUrl = db.StringProperty()
    parts = db.ListProperty(db.Key)

    @property
    def itemId(self):
        return self.key().id()

class Labels(db.Model):
    key = db.StringProperty()  # want to store a key here
    language = db.StringProperty()
    label = db.StringProperty()

class Price(db.Model):
    key = db.StringProperty()  # want to store a key here
    language = db.StringProperty()
    price = db.IntegerProperty()
The major thing here is that I've split the Labels and Price models out, so they can contain labels and prices for any product, part or type.
So what I am curious about is: is this a solid solution from an architectural point of view? Will it hold even if there are thousands of entries in each model?
Also, any tips for retrieving data in a good manner are welcome. My current solution of getting all the data first, for-looping over it and sticking it in dicts works, but it feels like it could fail any minute.
..fredrik
You need to keep in mind that App Engine's datastore requires you to rethink your usual way of designing databases. It goes against intuition at first but you must denormalize your data as much as possible if you want your application to be scalable. The datastore has been designed this way.
The approach I usually take is to consider first what kind of queries will need to be done in different use cases, eg. what data do I need to retrieve at the same time ? In what order ? What properties should be indexed ?
If I understand correctly, your main goal is to fetch a list of products with complete details. BTW, if you have other query scenarios - ie. filtering on price, type, etc - you should take them into account too.
In order to fetch all the data you need from only one query, I suggest you create one model which could look like this:
class ProductPart(db.Model):
    product_name = db.StringProperty()
    product_image_url = db.StringProperty()
    product_active = db.BooleanProperty(default=True)
    product_description = db.StringListProperty(indexed=False)  # Contains product description in all languages
    part_name = db.StringProperty()
    part_image_url = db.StringProperty()
    part_type = db.StringListProperty(indexed=False)  # Contains part type in all languages
    part_label = db.StringListProperty(indexed=False)  # Contains part label in all languages
    part_price = db.ListProperty(float, indexed=False)  # Contains part price in all currencies
    part_default = db.BooleanProperty()
    part_optional = db.BooleanProperty()
About this solution:
ListProperties are set to indexed=False in order to avoid exploding indexes if you don't need to filter on them.
In order to get the right description, label or type, you will have to set the list values always in the same order. For example: part_label[0] is English, part_label[1] is Spanish, etc. Same idea for prices and currencies (see the sketch after this list).
After fetching entities from this model you will have to do some in-memory manipulations in order to get the data nicely structured the way you want, maybe in a new dictionary.
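A hedged sketch of that fixed-ordering convention and the in-memory restructuring (the language and currency lists are invented for illustration):
# Hypothetical fixed orderings, agreed on when the entities are written.
LANGUAGES = ['en', 'es', 'fr']
CURRENCIES = ['USD', 'EUR', 'SEK']

parts = ProductPart.all().filter('product_active =', True).fetch(100)

products = {}
for part in parts:
    entry = products.setdefault(part.product_name, {'parts': []})
    entry['parts'].append({
        'name': part.part_name,
        'labels': dict(zip(LANGUAGES, part.part_label)),
        'prices': dict(zip(CURRENCIES, part.part_price)),
    })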
Obviously, there will be a lot of redundancy in the datastore with such a design - but that's okay, since it allows you to query the datastore in a scalable fashion.
Besides, this is not meant as a replacement for the architecture that you had in mind, but rather an additional Model designed specifically for the user-facing kind of queries that you need to do, ie. retrieving lists of complete product/parts information.
These ProductPart entities could be populated by background tasks, replicating data located in your other normalized entities which would be the authoritative data source. Since you have plenty of data storage on App Engine, this should not be a problem.
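For the background population, a minimal sketch using the deferred library (rebuild_product_parts is a hypothetical function you would write against your normalized models):
from google.appengine.ext import deferred

def rebuild_product_parts(product_key):
    # Hypothetical: re-read the normalized Products/ProductParts/Labels/Price
    # entities for this product and rewrite its denormalized ProductPart rows.
    pass

# Enqueue a rebuild whenever the authoritative data changes.
deferred.defer(rebuild_product_parts, product.key())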
IMO your design mostly makes sense. I came up with almost the same design after reading your problem statement, with a few differences:
I had prices on Product and ProductPart, not in a separate table.
The other difference was part_types. If there are not many part types, you can simply have them as a Python list/tuple.
part_types = ('wheel', 'break', 'mirror')
It also depends on the kind of queries you are anticipating. If there are many queries of a price-calculation nature (independent of the rest of the product and part info), then it might make sense to design it the way you have done.
You have mentioned that you will get all the data first. Isn't querying possible? If you get all the data into your app and then sort/filter in Python, it will be slow. Which database are you considering? To me MongoDB looks like a good option here.
Finally, why are you worried about even 1000 records? You can run a few tests on your DB beforehand.
Bests
