How to fetch the latest data in GAE Python NDB - python

I am using GAE Python. I have two root entities:
class X(ndb.Model):
    subject = ndb.StringProperty()
    grade = ndb.StringProperty()

class Y(ndb.Model):
    identifier = ndb.StringProperty()
    name = ndb.StringProperty()
    school = ndb.StringProperty()
    year = ndb.StringProperty()
    result = ndb.StructuredProperty(X, repeated=True)
Since Google stores our data across several data centers, a query like the one below might not return the most recent data (for example, right after some changes have been "put"):
def post(self):
    identifier = self.request.get('identifier')
    name = self.request.get('name')
    school = self.request.get('school')
    year = self.request.get('year')
    qry = Y.query(ndb.AND(Y.name==name, Y.school==school, Y.year==year))
    record_list = qry.fetch()
My question: how should I modify the above fetch operation to always get the latest data?
I have gone through the related Google help doc but could not understand how to apply it here.
Based on hints from Isaac's answer, would the following be the solution (that is, would "latest_record_data" contain the latest data of the entity)?
def post(self):
    identifier = self.request.get('identifier')
    name = self.request.get('name')
    school = self.request.get('school')
    year = self.request.get('year')
    qry = Y.query(ndb.AND(Y.name==name, Y.school==school, Y.year==year))
    record_list = qry.fetch()
    record = record_list[0]
    latest_record_data = record.key.get()

There are a couple of ways to get strong consistency on App Engine, most commonly using gets instead of queries and using ancestor queries.
To use a get in your example, you could encode the name into the entity key:
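class Y(ndb.Model):
    result = ndb.StructuredProperty(X, repeated=True)

def put(name, result):
    # The name becomes the key ID, so the entity can be fetched by key later.
    Y(key=ndb.Key(Y, name), result=result).put()

def get_records(name):
    # A get by key is strongly consistent; it returns one entity or None.
    record = ndb.Key(Y, name).get()
    return record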
An ancestor query uses similar concepts to do something more powerful. For example, fetching the latest record with a specific name:
import time

class Y(ndb.Model):
    result = ndb.StructuredProperty(X, repeated=True)

    @classmethod
    def put_result(cls, name, result):
        # Don't use integers for last field in key. (one weird trick)
        key = ndb.Key('name', name, cls, str(int(time.time())))
        cls(key=key, result=result).put()

    @classmethod
    def get_latest_result(cls, name):
        qry = cls.query(ancestor=ndb.Key('name', name)).order(-cls.key)
        latest = qry.fetch(1)
        if latest:
            return latest[0]
The "ancestor" is the first pair of the entity's key. As long as you can put a key with at least the first pair into the query, you'll get strong consistency.
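To see how the key layout makes "latest record for a name" an ordering question, here is a minimal sketch in plain Python, with tuples standing in for NDB key paths (no datastore is involved). The helpers make_key_path, ancestor_of and latest_for are hypothetical names made up for this illustration. String IDs compare lexicographically, so the sketch zero-pads the timestamp; that padding is an assumption added here, not part of the answer above.

```python
def make_key_path(name, timestamp):
    """Build a key path as (kind, id) pairs, mimicking ndb.Key('name', name, 'Y', id)."""
    # Zero-pad so that lexicographic order of the string IDs matches numeric order.
    return (("name", name), ("Y", "%010d" % timestamp))


def ancestor_of(key_path):
    """The ancestor (entity group root) is the first pair of the key path."""
    return key_path[:1]


def latest_for(name, stored_paths):
    """Emulate an ancestor query ordered by descending key, limited to one result."""
    root = (("name", name),)
    in_group = [path for path in stored_paths if ancestor_of(path) == root]
    return max(in_group) if in_group else None


stored = [
    make_key_path("alice", 1700000000),
    make_key_path("alice", 1700000500),
    make_key_path("bob", 1700000900),
]
print(latest_for("alice", stored))
```

Only the key paths that share the first pair are considered, which mirrors how an ancestor query is restricted to one entity group.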

Related

Hybrid property expression with JOIN

I'm fairly new to peewee, but have a strong background in SQLAlchemy (and all the vices that come with it). I'm trying to create a custom hybrid expression that correlates to a third (or even Nth) table. I'll try to demonstrate with some example (untested) code:
class BaseModel(Model):
    class Meta:
        database = database

class Person(BaseModel):
    id = PrimaryKeyField(column_name="person_id")
    name = CharField(max_length=255, column_name="person_name")
    username = CharField(max_length=255, column_name="person_username")

class PersonTree(BaseModel):
    id = PrimaryKeyField(column_name="person_tree_id")
    name = CharField(max_length=255, column_name="person_tree_name")
    code = CharField(max_length=255, column_name="person_tree_code")
    person = ForeignKeyField(
        column_name="person_id",
        model=Person,
        field="id",
        backref="tree",
    )

class Article(BaseModel):
    id = PrimaryKeyField(column_name="article_id")
    name = CharField(max_length=255, column_name="article_name")
    branch = ForeignKeyField(
        column_name="person_tree_id",
        model=PersonTree,
        field="id",
        backref="articles",
    )

    @hybrid_property
    def username(self):
        """
        This gives me the possibility to grab the direct username of an article
        """
        return self.branch.person.username

    @username.expression
    def username(cls):
        """
        What if I wanted to do: Article.query().where(Article.username == "john_doe") ?
        """
        pass
With the username hybrid_property on Article, I can get the username of the Person related to an Article using the PersonTree as a correlation, so far so good, but ... What if I wanted to "create a shortcut" to query all Articles created by the "john_doe" Person username, without declaring the JOINs every time I make the query and without relying on .filter(branch__person__username="john_doe")? I know it's possible with SA (to a great extent), but I'm finding this hard to accomplish with peewee.
Just for clarification, here's the SQL I hope to be able to construct:
SELECT
    *
FROM
    article a
    JOIN person_tree pt ON a.person_tree_id = pt.person_tree_id
    JOIN person p ON pt.person_id = p.person_id
WHERE
    p.person_username = 'john_doe';
Thanks a lot in advance!
Hybrid properties can be used to allow an attribute to be expressed as a property of a model instance or as a scalar computation in a SQL query.
What you're trying to do, which is add multiple joins and stuff via the property, is not possible using hybrid properties.
As for a "shortcut" to query all Articles created by the "john_doe" Person username, just add a normal class method to Article:
    @classmethod
    def by_username(cls, username):
        # Filter on Person.username (not Person.name) to match the question.
        return (Article
                .select(Article, PersonTree, Person)
                .join(PersonTree)
                .join(Person)
                .where(Person.username == username))
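For reference, the join that such a method is meant to produce can be checked with Python's built-in sqlite3 module. This is a minimal sketch that does not use peewee: it assumes an in-memory database whose table and column names follow the column_name values in the models above, and the sample rows and the helper articles_by_username are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        person_name TEXT,
        person_username TEXT
    );
    CREATE TABLE person_tree (
        person_tree_id INTEGER PRIMARY KEY,
        person_tree_name TEXT,
        person_tree_code TEXT,
        person_id INTEGER REFERENCES person(person_id)
    );
    CREATE TABLE article (
        article_id INTEGER PRIMARY KEY,
        article_name TEXT,
        person_tree_id INTEGER REFERENCES person_tree(person_tree_id)
    );
    INSERT INTO person VALUES (1, 'John Doe', 'john_doe'), (2, 'Jane Roe', 'jane_roe');
    INSERT INTO person_tree VALUES (10, 'main', 'M', 1), (20, 'main', 'M', 2);
    INSERT INTO article VALUES (100, 'First', 10), (101, 'Second', 10), (102, 'Other', 20);
""")


def articles_by_username(connection, username):
    """Return the names of articles whose branch belongs to the given username."""
    rows = connection.execute(
        """
        SELECT a.article_name
        FROM article a
        JOIN person_tree pt ON a.person_tree_id = pt.person_tree_id
        JOIN person p ON pt.person_id = p.person_id
        WHERE p.person_username = ?
        ORDER BY a.article_id
        """,
        (username,),
    ).fetchall()
    return [name for (name,) in rows]


print(articles_by_username(conn, "john_doe"))
```

Wrapping the two joins in one function plays the same role as the class method: callers pass only the username and never repeat the join chain.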

Django - cannot retrieve just one record in multi-part filter on model with multiple relations

I can't seem to isolate a single record from this query:
subcust = OwnerCustom.objects.get(carcustom=ncset, owner=sset)
This is the error:
OwnerCustom matching query does not exist
In the actual data, there is only actually one matching record in OwnerCustom for each record in CarCustom. It's supposed to be a kind of many-to-many where there are standard differences listed in CarCustom for each Car, and each owner may maintain their own customizations (overrides) or those default OwnerCustom entries.
Note that there are many different Owners of the same Car. And of course, I'm not actually working with cars; the names have been changed from the original purpose.
Here's the relevant models:
class Car(models.Model):
    car_name = models.CharField(max_length=50)

class CarCustom(models.Model):
    car = models.ForeignKey(Car, models.PROTECT)

class Owner(models.Model):
    car = models.ForeignKey(Car, models.PROTECT)

class OwnerCustom(models.Model):
    owner = models.ForeignKey(Owner, models.PROTECT)
    carcustom = models.ForeignKey(CarCustom, models.PROTECT)
    name = models.CharField(max_length=50)
And the code:
car_queryset = Car.objects.filter(car_name="fancy car")
for nset in car_queryset:
    owner_queryset = Owner.objects.filter(car=nset)
    for sset in owner_queryset:
        carcustom_queryset = CarCustom.objects.filter(car=nset)
        for ncset in carcustom_queryset:
            subcust = OwnerCustom.objects.get(carcustom=ncset, owner=sset)
I've tried stuff like:
subcust = OwnerCustom.objects.filter(carcustom=ncset, owner=sset).first()
Which gives me a NoneType, and then tried:
subcust = OwnerCustom.objects.filter(carcustom=ncset, owner=sset)[:1].get()
Which gives "matching query does not exist" and this:
subcust = OwnerCustom.objects.filter(carcustom=ncset, owner=sset)[0]
Gives "list index out of range"
UPDATE: I CAN get a working function by using code like this, but I would think since there is only one (guaranteed by application) matching record possible for OwnerCustom.objects.filter(carcustom=ncset, owner=sset) that I could find a better way to fetch it:
car_queryset = Car.objects.filter(car_name="fancy car")
for nset in car_queryset:
    owner_queryset = Owner.objects.filter(car=nset)
    for sset in owner_queryset:
        carcustom_queryset = CarCustom.objects.filter(car=nset)
        for ncset in carcustom_queryset:
            subcust_queryset = OwnerCustom.objects.filter(carcustom=ncset, owner=sset)
            for subcust in subcust_queryset:
                logger.info(subcust.name)

NDB query using filters on a StructuredProperty which is also repeated?

I am creating a sample application that stores user details along with their class information.
The model classes being used are:
Model class for saving the user's class data
class MyData(ndb.Model):
    subject = ndb.StringProperty()
    teacher = ndb.StringProperty()
    strength = ndb.IntegerProperty()
    date = ndb.DateTimeProperty()
Model class for user
class MyUser(ndb.Model):
    user_name = ndb.StringProperty()
    email_id = ndb.StringProperty()
    my_data = ndb.StructuredProperty(MyData, repeated=True)
I am able to successfully store data in the datastore and can also run simple queries on the MyUser entity using filters on email_id and user_name.
But when I try to query MyUser with a filter on a property of the model's structured property, my_data, it does not give the correct result.
I think I am querying incorrectly.
Here is my query function, which queries on the repeated structured property:
def queryMyUserWithStructuredPropertyFilter():
    shail_users_query = MyUser.query(ndb.AND(MyUser.email_id == "napolean@gmail.com", MyUser.my_data.strength > 30))
    shail_users_list = shail_users_query.fetch(10)
    maindatalist = []
    for each_user in shail_users_list:
        logging.info('NEW QUERY :: The user details are : %s %s' % (each_user.user_name, each_user.email_id))
        # Class data
        myData = each_user.my_data
        for each_my_data in myData:
            templist = [each_my_data.strength, str(each_my_data.date)]
            maindatalist.append(templist)
            logging.info('NEW QUERY :: The class data is : %s %s %s %s' % (each_my_data.subject, each_my_data.teacher, str(each_my_data.strength), str(each_my_data.date)))
    return maindatalist
I want the fetched entity's repeated structured property (my_data) to be a list containing only the items with strength > 30.
Please help me understand what I am doing wrong.
Thanks.
A query over a repeated StructuredProperty returns every entity for which at least one of its structured sub-entities satisfies the condition, and each returned entity still carries all of its sub-entities. If you want only the matching sub-entities, you'll have to filter them afterwards.
Something like this should do the trick:
def queryMyUserWithStructuredPropertyFilter():
    shail_users_query = MyUser.query(MyUser.email_id == "napolean@gmail.com", MyUser.my_data.strength > 30)
    shail_users_list = shail_users_query.fetch(10)
    # Here, shail_users_list has at most 10 users with email being
    # 'napolean@gmail.com' and at least one element in my_data
    # with strength > 30
    maindatalist = [
        [[data.strength, str(data.date)] for data in user.my_data if data.strength > 30]
        for user in shail_users_list
    ]
    # Now in maindatalist you have ONLY those my_data with strength > 30
    return maindatalist
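The post-filtering step can be tried without a datastore. Below is a minimal sketch in plain Python, using namedtuples as stand-ins for the MyUser and MyData entities; the sample values and the helper name strong_classes are made up for illustration. It builds a flat list of [strength, date] pairs, like the maindatalist in the question, rather than one list per user.

```python
from collections import namedtuple
from datetime import datetime

# Plain stand-ins for the ndb models; only the fields used below are included.
MyData = namedtuple("MyData", ["subject", "strength", "date"])
MyUser = namedtuple("MyUser", ["user_name", "my_data"])


def strong_classes(users, min_strength=30):
    """Keep only the my_data items whose strength exceeds min_strength."""
    return [
        [data.strength, str(data.date)]
        for user in users
        for data in user.my_data
        if data.strength > min_strength
    ]


# What a query filtered on my_data.strength > 30 could hand back: the whole
# entity, including the sub-entity that does not match the filter.
fetched = [
    MyUser("shail", [
        MyData("math", 45, datetime(2013, 1, 5)),
        MyData("art", 12, datetime(2013, 1, 6)),
    ]),
]
print(strong_classes(fetched))
```

The user is returned because one sub-entity matches, and the second pass is what drops the "art" entry with strength 12.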

Google App Engine: defining custom id and querying

I want to define a custom string as an ID so I created the following Model:
class WikiPage(ndb.Model):
    id = ndb.StringProperty(required=True, indexed=True)
    content = ndb.TextProperty(required=True)
    history = ndb.DateTimeProperty(repeated=True)
Based on this SO thread, I believe this is right.
Now I try to query by this id by:
entity = WikiPage.get_by_id(page) # page is an existing string id, passed in as an arg
This is based on the NDB API.
This however isn't returning anything -- entity is None.
It only works when I run the following query instead:
entity = WikiPage.query(WikiPage.id == page).get()
Am I defining my custom key incorrectly or misusing get_by_id() somehow?
The entity's ID is part of its key rather than a model property, so pass it as the id argument when creating the entity and keep any copy of it under a different property name. Example:
class WikiPage(ndb.Model):
    your_id = ndb.StringProperty(required=True)
    content = ndb.TextProperty(required=True)
    history = ndb.DateTimeProperty(repeated=True)

entity = WikiPage(id='hello', your_id='hello', content=...., history=.....)
entity.put()
entity = WikiPage.get_by_id('hello')
or
key = ndb.Key('WikiPage','hello')
entity = key.get()
entity = WikiPage.get_by_id(key.id())
and this still works:
entity = WikiPage.query(WikiPage.your_id == 'hello').get()

Google app engine python problem

I'm having a problem with the datastore trying to replicate a left join to find items from model a that don't have a matching relation in model b:
class Page(db.Model):
    url = db.StringProperty(required=True)

class Item(db.Model):
    page = db.ReferenceProperty(Page, required=True)
    name = db.StringProperty(required=True)
I want to find any pages that don't have any associated items.
You cannot query for items using a "property is null" filter. However, you can add a boolean property to Page that signals if it has items or not:
class Page(db.Model):
    url = db.StringProperty(required=True)
    has_items = db.BooleanProperty(default=False)
Then override the "put" method of Item to flip the flag. But you might want to encapsulate this logic in the Page model (maybe Page.add_item(self, *args, **kwargs)):
class Item(db.Model):
    page = db.ReferenceProperty(Page, required=True)
    name = db.StringProperty(required=True)

    def put(self):
        if not self.page.has_items:
            self.page.has_items = True
            self.page.put()
        return db.put(self)
Hence, the query for pages with no items would be:
pages_with_no_items = Page.all().filter("has_items =", False)
The datastore doesn't support joins, so you can't do this with a single query. You need to do a query for items in A, then for each, do another query to determine if it has any matching items in B.
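The two-step approach can be sketched in plain Python, with lists of dicts standing in for the Page and Item kinds. This is only an illustration under that assumption: it makes no datastore calls, the sample data is made up, and has_matching_item is a hypothetical stand-in for the second, per-page query.

```python
pages = [{"url": "/a"}, {"url": "/b"}, {"url": "/c"}]
items = [
    {"page": "/a", "name": "first"},
    {"page": "/a", "name": "second"},
    {"page": "/c", "name": "third"},
]


def has_matching_item(page, all_items):
    """Stand-in for a second query that looks for any Item referencing the page."""
    return any(item["page"] == page["url"] for item in all_items)


def pages_without_items(all_pages, all_items):
    """Walk the pages (first query) and run one lookup per page (second query)."""
    return [page for page in all_pages if not has_matching_item(page, all_items)]


print(pages_without_items(pages, items))
```

Because this needs one lookup per page, the cost grows with the number of pages, which is the trade-off against maintaining a flag on Page at write time.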
Did you try it like this:
Page.all().filter("item_set = ", None)
Should work.
