Entity keys are different after migration to High Replication Datastore

My migration to hrd is not working on appspot.com. The app datastore has 3 "kind"s of data in both the original master/slave (MS) datastore and the High Replication Datastore (hrd): Group, Pin, and Log. Each Group entity has Pin entities and/or Log entities associated with it, but the associations no longer work in the hrd (which is all that survives the migration), so my app no longer works and I am looking for help to revive it.
Below I report the entity keys for the first two Pin entities in the datastore. I have inserted some spaces in the shorter key of each pair to facilitate lining up the keys to see their similarities. Notice that all the keys start and end similarly, but differ in MS vs hrd.
Decoded entity key: Group: name=250cc > Pin: id=1
Entity #1 MS key: ah NzaW1wbGlmeWNvbm5lY3Rpb25zchkLEgVHcm91cCIFMjUwY2MMCxIDUGluGAEM
Entity #1 hrd key: ahlzfnNpbXBsaWZ5Y29ubmVjdGlvbnMtaHJkchkLEgVHcm91cCIFMjUwY2MMCxIDUGluGAEM
Decoded entity key: Group: name=250cc > Pin: id=5001
Entity #2 MS key: ah NzaW1wbGlmeWNvbm5lY3Rpb25zchoLEgVHcm91cCIFMjUwY2MMCxIDUGluGIknDA
Entity #2 hrd key: ahlzfnNpbXBsaWZ5Y29ubmVjdGlvbnMtaHJkchoLEgVHcm91cCIFMjUwY2MMCxIDUGluGIknDA
To view the app yourself use this link. You will see the Group named "Playground" and see how it is called in the URL. However, the only markers (map pins) that appear are ones that were added since the migration to hrd.
edit #0
Below is my Python code for saving a Pin whose parent is a Group.
elif action == "add":
    pin = Pin(parent=place)
    pin.name = self.request.get('details')
    pin.lat = float(self.request.get('lat'))
    pin.lng = float(self.request.get('lng'))
    pin.category = int(self.request.get('category'))
    pin.label = self.request.get('label')
    new_id = pin.put()
    self.response.out.write(new_id)
And below is the class definition for Pin.
class Pin(db.Model):
    date = db.DateTimeProperty(auto_now_add=True)
    lat = db.FloatProperty()
    lng = db.FloatProperty()
    name = db.StringProperty()
    cornerColor = db.StringProperty(default='ffffff')
    height = db.IntegerProperty(default=32)
    label = db.StringProperty(default='')
    labelColor = db.StringProperty(default='000000')
    labelSize = db.IntegerProperty(default=2)
    primaryColor = db.StringProperty(default='ff0000')
    shadowColor = db.StringProperty(default='000000')
    shape = db.StringProperty(default='circle')
    strokeColor = db.StringProperty(default='000000')
    width = db.IntegerProperty(default=32)
    category = db.IntegerProperty(default=0)
    scategory = db.StringProperty()
    logindex = db.IntegerProperty(default=0)
    imageindex = db.IntegerProperty(default=0)
    deleteRequested = db.BooleanProperty(default=False)
edit #0
edit #1
The problem with my app is not with the entity keys, after all. Instead, the problem is with the way I tried to handle another deprecated Google (Maps) feature regarding stylized markers in my javascript/html.
I am sorry for the noise here. The problem resulted from my inability/ineptness with a try..catch pattern I attempted to employ as a workaround in the javascript/html template.
edit #1

The encoded key strings are expected to change. The encoded version contains the application's ID. During the migration process the keys are rewritten with the new application ID. References to keys are updated in the same way.
If you store a key as a db.ReferenceProperty, the key is automatically updated for you during the migration.
However, if you are storing strings like
ahNzaW1wbGlmeWNvbm5lY3Rpb25zchkLEgVHcm91cCIFMjUwY2MMCxIDUGluGAEM
in a db.StringProperty() (or in other similar ways, such as part of a URL), then they will not be updated and you need to update them yourself as described in the docs.
The Pin model you posted does not appear to link to other entities, so there shouldn't be any problems.
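To see the application ID inside the two encoded keys from the question, here is a minimal sketch that base64-decodes them by hand. It assumes the legacy encoded-key layout in which the string is URL-safe base64 of a serialized record whose first field (tag byte 0x6A) is the length-prefixed application ID, and that the ID is shorter than 128 bytes so its length fits in one byte. The helper name app_id_and_path is made up for this illustration and is not part of the App Engine SDK.

```python
import base64


def app_id_and_path(encoded_key):
    """Split an encoded key into (application id, remaining raw bytes).

    Assumption: byte 0 is the tag 0x6A, byte 1 is the id length (< 128),
    and the id follows immediately. This is an illustration only.
    """
    padded = encoded_key + "=" * (-len(encoded_key) % 4)  # restore stripped padding
    raw = base64.urlsafe_b64decode(padded)
    if raw[0] != 0x6A:
        raise ValueError("unexpected leading byte, layout assumption does not hold")
    length = raw[1]
    app_id = raw[2:2 + length].decode("utf-8")
    return app_id, raw[2 + length:]


# Entity #1 keys from the question (spaces removed from the MS key)
ms_key = "ahNzaW1wbGlmeWNvbm5lY3Rpb25zchkLEgVHcm91cCIFMjUwY2MMCxIDUGluGAEM"
hrd_key = "ahlzfnNpbXBsaWZ5Y29ubmVjdGlvbnMtaHJkchkLEgVHcm91cCIFMjUwY2MMCxIDUGluGAEM"

ms_app, ms_path = app_id_and_path(ms_key)
hrd_app, hrd_path = app_id_and_path(hrd_key)

print(ms_app)               # simplifyconnections
print(hrd_app)              # s~simplifyconnections-hrd
print(ms_path == hrd_path)  # True: only the application id differs
```

Under those assumptions the Group/Pin path bytes are identical in both keys, which matches the explanation above: only the application ID part was rewritten.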

Related

find all documents with same id in different collections in firestore

I have this structure in Firestore: many collections, each with a user_id as its ID, and inside each of them many documents whose IDs are the date of departure. The documents contain the fields "from" and "to" with the airport name.
I want to retrieve the IDs of all collections (the user IDs) that contain the same documents as a chosen input user, to see who shared each of that user's flights.
I'm using python.
UPDATE: I solved my issue in this way.
@app.route('/infos/<string:user_id>/', methods=['GET'])
def user_info(user_id):
    docs = db.collection(f'{user_id}').stream()
    travels = []
    for doc in docs:
        sharing_travellers = []
        tmp = doc.to_dict()
        tmp['date'] = doc.id
        colls = db.collections()
        for coll in colls:
            if coll.id != user_id:
                date = datetime.strptime(doc.id, '%Y-%m-%d')
                query = db.collection(f'{coll.id}').stream()
                for q in query:
                    other_date = datetime.strptime(q.id, '%Y-%m-%d')
                    if abs((date - other_date).days) < 1:
                        json_obj = q.to_dict()
                        if json_obj['from'] == tmp['from'] and json_obj['to'] == tmp['to']:
                            sharing_travellers.append(coll.id)
        tmp['shared'] = sharing_travellers
        travels.append(tmp)
    return render_template('user_info.html', title=user_id, travels=travels)

The only way to read across collections is if those collections have the same name. If that was the case, you could use a collection group query.
Since your collections don't have the same name though, you'll have to get the list of collections, and then look in each collection separately.

I support Frank's answer, but I want to add that it might be wise to restructure your database to better accommodate this type of situation. Cross-collection searching is limited to collection group queries, which are themselves limited, and any other method will be costly.
It's often better to have a dedicated collection with those IDs as field values, which you can query per user or across the whole group.
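To make the "dedicated collection" idea concrete, here is a small in-memory sketch in plain Python. It does not call Firestore at all: the list named flights stands in for a single hypothetical collection where each document carries user_id, date, "from" and "to" as fields, and the sample users and airports are invented for the example.

```python
from collections import defaultdict

# Hypothetical flat records, one per (user, flight), as they might be
# stored in one shared collection instead of one collection per user.
flights = [
    {"user_id": "alice", "date": "2021-05-01", "from": "FCO", "to": "JFK"},
    {"user_id": "bob", "date": "2021-05-01", "from": "FCO", "to": "JFK"},
    {"user_id": "carol", "date": "2021-05-01", "from": "FCO", "to": "LHR"},
    {"user_id": "alice", "date": "2021-06-10", "from": "JFK", "to": "FCO"},
]


def co_travellers(records, user_id):
    """Map each flight of user_id to the other users on the same flight."""
    by_flight = defaultdict(set)
    for rec in records:
        # A flight is identified by date plus departure and arrival airport
        by_flight[(rec["date"], rec["from"], rec["to"])].add(rec["user_id"])
    return {
        flight: sorted(users - {user_id})
        for flight, users in by_flight.items()
        if user_id in users
    }


shared = co_travellers(flights, "alice")
```

With the fields stored on each document, the same grouping could be pushed into equality filters on date, "from" and "to" instead of reading every collection, though the exact query shape depends on how you lay the data out.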

Querying objects using attribute of member of many-to-many

I have the following models:
class Member(models.Model):
    ref = models.CharField(max_length=200)
    # some other stuff
    def __str__(self):
        return self.ref
class Feature(models.Model):
    feature_id = models.BigIntegerField(default=0)
    members = models.ManyToManyField(Member)
    # some other stuff
A Member is basically just a pointer to a Feature. So let's say I have Features:
feature_id = 2, members = 1, 2
feature_id = 4
feature_id = 3
Then the members would be:
id = 1, ref = 4
id = 2, ref = 3
I want to find all of the Features which contain one or more Members from a list of "ok members." Currently my query looks like this:
# ndtmp is a query set of member-less Features which Members can point to
sids = [str(i) for i in list(ndtmp.values('feature_id'))]
# now make a query set that contains all rels and ways with at least one member with an id in sids
okmems = Member.objects.filter(ref__in=sids)
relsways = Feature.geoobjects.filter(members__in=okmems)
# now combine with nodes
op = relsways | ndtmp
This is enormously slow, and I'm not even sure if it's working. I've tried using print statements to debug, just to make sure anything is actually being parsed, and I get the following:
print(ndtmp.count())
>>> 12747
print(len(sids))
>>> 12747
print(okmems.count())
... and then the code just hangs for minutes, and eventually I quit it. I think that I just overcomplicated the query, but I'm not sure how best to simplify it. Should I:
Migrate Feature to use a CharField instead of a BigIntegerField? There is no real reason for me to use a BigIntegerField, I just did so because I was following a tutorial when I began this project. I tried a simple migration by just changing it in models.py and I got a "numeric" value in the column in PostgreSQL with format 'Decimal:( the id )', but there's probably some way around that that would force it to just shove the id into a string.
Use some feature of Many-To-Many Fields which I don't know about to more efficiently check for matches
Calculate the bounding box of each Feature and store it in another column so that I don't have to do this calculation every time I query the database (so just the single fixed cost of calculation upon Migration + the cost of calculating whenever I add a new Feature or modify an existing one)?
Or something else? In case it helps, this is for a server-side script for an ongoing OpenStreetMap related project of mine, and you can see the work in progress here.
EDIT - I think a much faster way to get ndids is like this:
ndids = ndtmp.values_list('feature_id', flat=True)
This works, producing a non-empty set of ids.
Unfortunately, I am still at a loss as to how to get okmems. I tried:
okmems = Member.objects.filter(ref__in=str(ndids))
But it returns an empty query set. And I can confirm that the ref points are correct, via the following test:
Member.objects.values('ref')[:1]
>>> [{'ref': '2286047272'}]
Feature.objects.filter(feature_id='2286047272').values('feature_id')[:1]
>>> [{'feature_id': '2286047272'}]

You should take a look at annotate:
okmems = Member.objects.annotate(
feat_count=models.Count('feature')).filter(feat_count__gte=1)
relsways = Feature.geoobjects.filter(members__in=okmems)

Ultimately, I was wrong to set up the database using a numeric id in one table and a text-type id in the other. I am not very familiar with migrations yet, but at some point I'll have to take a deep dive into that world and figure out how to migrate my database to use numerics on both. For now, this works:
# ndtmp is a query set of member-less Features which Members can point to
# get the unique ids from ndtmp as strings
strids = ndtmp.extra({'feature_id_str':"CAST( \
feature_id AS VARCHAR)"}).order_by( \
'-feature_id_str').values_list('feature_id_str',flat=True).distinct()
# find all members whose ref values can be found in strids
okmems = Member.objects.filter(ref__in=strids)
# find all features containing one or more members in the accepted members list
relsways = Feature.geoobjects.filter(members__in=okmems)
# combine that with my existing list of allowed member-less features
op = relsways | ndtmp
# prove that this set is not empty
op.count()
# takes about 10 seconds
>>> 8997148 # looks like it worked!
Basically, I am making a query set of feature_ids (numerics) and casting it to be a query set of text-type (varchar) field values. I am then using values_list to make it only contain these string id values, and then I am finding all of the members whose ref ids are in that list of allowed Features. Now I know which members are allowed, so I can filter out all the Features which contain one or more members in that allowed list. Finally, I combine this query set of allowed Features which contain members with ndtmp, my original query set of allowed Features which do not contain members.
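The two earlier attempts in the question are consistent with a plain-Python detail that is easy to miss, sketched below with ordinary lists and dicts standing in for the querysets (so this is an analogy, not Django code, and a real QuerySet prints differently). Calling str() on each row of values('feature_id') stringifies a whole dict, and calling str() on the whole collection of ids yields one string rather than a list of id strings, so neither can match a ref such as '2286047272'.

```python
# Stand-ins for the querysets in the question (assumed sample data)
rows = [{"feature_id": 2286047272}, {"feature_id": 42}]  # like ndtmp.values('feature_id')
ids = [2286047272, 42]  # like ndtmp.values_list('feature_id', flat=True)

# First attempt: str() of each row gives the text of the dict, not the id
as_dict_strings = [str(row) for row in rows]

# Second attempt: str() of the whole collection is a single string
one_big_string = str(ids)

# What the ref lookup actually needs: one string per id
wanted = [str(i) for i in ids]

print(as_dict_strings[0])  # {'feature_id': 2286047272}
print(one_big_string)      # [2286047272, 42]
print(wanted)              # ['2286047272', '42']
```

The CAST in the query above gets the same per-id strings, but lets the database produce them instead of building a Python list first.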

geoSpatial & Location based search in google appengine python

I want to achieve something like the map drag search on airbnb (https://www.airbnb.com/s/Paris--France?source=ds&page=1&s_tag=PNoY_mlz&allow_override%5B%5D=)
I am saving the data like this in datastore
user.lat = float(lat)
user.lon = float(lon)
user.geoLocation = ndb.GeoPt(float(lat),float(lon))
and whenever I drag & drop map or zoom in or zoom out I get following parameters in my controller
def get(self):
    """
    This is an ajax function. It gets the place name, north_east, and south_west
    coordinates. Then it fetches the results matching the search criteria and
    creates a result list. After that it returns the result in json format.
    :return: result
    """
    self.response.headers['Content-type'] = 'application/json'
    results = []
    north_east_latitude = float(self.request.get('nelat'))
    north_east_longitude = float(self.request.get('nelon'))
    south_west_latitude = float(self.request.get('swlat'))
    south_west_longitude = float(self.request.get('swlon'))
    points = Points.query(Points.lat < north_east_latitude, Points.lat > south_west_latitude)
    for row in points:
        # longitude must lie between the west (south_west) and east (north_east) edges
        if south_west_longitude < row.lon < north_east_longitude:
            listingdic = {'name': row.name, 'desc': row.description, 'contact': row.contact, 'lat': row.lat, 'lon': row.lon}
            results.append(listingdic)
    self.write(json.dumps({'listings': results}))
My model class is given below
class Points(ndb.Model):
    name = ndb.StringProperty(required=True)
    description = ndb.StringProperty(required=True)
    contact = ndb.StringProperty(required=True)
    lat = ndb.FloatProperty(required=True)
    lon = ndb.FloatProperty(required=True)
    geoLocation = ndb.GeoPtProperty()
I want to improve the query.
Thanks in advance.

No, you cannot improve the solution by checking all 4 conditions in the query because ndb queries do not support inequality filters on multiple properties. From NDB Queries (emphasis mine):
Limitations: The Datastore enforces some restrictions on queries.
Violating these will cause it to raise exceptions. For example,
combining too many filters, using inequalities for multiple
properties, or combining an inequality with a sort order on a
different property are all currently disallowed. Also filters
referencing multiple properties sometimes require secondary indexes to
be configured.
and
Note: As mentioned earlier, the Datastore rejects queries using inequality filtering on more than one property.
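Given that restriction, the usual shape is what the question already does: let the query handle the inequality on one property (latitude) and check the other (longitude) in Python afterwards. Here is a minimal sketch of that second step using plain dicts in place of query results. The function names are made up for this example, and the branch for a box that crosses the 180° meridian is an addition of mine that the question does not mention.

```python
def within_longitude(lon, sw_lon, ne_lon):
    """True if lon lies between the box's west (sw) and east (ne) edges."""
    if sw_lon <= ne_lon:
        return sw_lon <= lon <= ne_lon
    # The box crosses the 180-degree meridian (west edge 170, east edge -170, say)
    return lon >= sw_lon or lon <= ne_lon


def filter_by_longitude(rows, sw_lon, ne_lon):
    """Keep only rows (already filtered by latitude) inside the longitude range."""
    return [row for row in rows if within_longitude(row["lon"], sw_lon, ne_lon)]


# Sample rows standing in for entities the latitude query returned
rows = [
    {"name": "a", "lat": 48.85, "lon": 2.35},
    {"name": "b", "lat": 48.90, "lon": 12.00},
]
visible = filter_by_longitude(rows, sw_lon=2.0, ne_lon=3.0)
```

This keeps the datastore query legal, at the cost of reading every entity in the latitude band, so it works best when the band is narrow.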

How do I update a query from Google App Engine NDB?

We are using Google App Engine in Python. I have code that saves a new object to the database, and then queries the database to receive all the objects. The problem is that the query returns all the objects except the new object I created. I only see the new object after refreshing the page. Is there a way to update the query to include all the objects, including the new object I created? Here is my code:
if (self.request.get("add_a_new_feature") == "true"):
    features = Feature.gql("WHERE feature_name=:1 ORDER BY last_modified DESC LIMIT 1", NEW_FEATURE_NAME) # class Feature inherits from ndb.Model
    if (features.count() == 0):
        new_feature = Feature(feature_name=NEW_FEATURE_NAME)
        new_feature.put()
...
features = Feature.gql("ORDER BY date_created")
if (features.count() > 0):
    features_list = features.fetch()
    for feature in features_list:
        ... # the list doesn't contain new_feature

As mentioned in the comments, this is expected behaviour. Take a look at this article for additional information. As a quick fix/hack you could simply get the data from datastore before adding the new entity and then append it to the list.
# gql() returns a query object, not a list, so fetch the results first
features_list = Feature.gql("ORDER BY date_created").fetch()
if (self.request.get("add_a_new_feature") == "true"):
    if (Feature.gql("WHERE feature_name=:1 ORDER BY last_modified DESC LIMIT 1", NEW_FEATURE_NAME).count() == 0):
        new_feature = Feature(feature_name=NEW_FEATURE_NAME)
        new_feature.put()
        features_list.append(new_feature)
...
for feature in features_list:
    ... # the list now contains new_feature at the end
Because gql() returns a query object rather than a list, the results are fetched into a list before appending. You could also probably avoid the second query, since you already have a list of features and could loop through it in Python rather than sending another request to datastore.

ListProperty with GoogleAppEngine

How do I assign a ListProperty with Google App Engine?
name = self.request.get("name")
description = self.request.get("description")
list = '''insert code here'''
I want list to work like a dictionary, is this possible with Google App Engine, if so, how:
[wordone : score; wordtwo : score; wordthree : score]
^I want the list property to store some data like this, how is this possible?

You actually won't be able to store a true dictionary as a type in a ListProperty (it only supports datastore property types, of which dict is not one), so you won't be able to get the behavior you're looking for. Will all of the data be the same (i.e. each element represents a word score)? Assuming storing each word as its own property on the model doesn't make sense, one 'dirty' solution would be to make a ListProperty of type str, and then append the word and score as separate elements. Then, when you searched for a word in the list, you would return the value at the index position of the word + 1. That would look something like:
class MyEntity(db.Model):
    name = db.StringProperty()
    description = db.TextProperty()
    word_list = db.ListProperty(str)  # ListProperty requires an item type
You could then add words like:
new_entity = MyEntity()
# scores are stored as strings so that every element matches the item type
new_entity.word_list = ['word1', '1', 'word2', '2', 'word3', '10']
You could then query for a particular entity and then examine its word_list property (a list), looking for your target word and returning the element one position after it.
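The lookup step can be sketched in plain Python as follows. This assumes the alternating word, score layout described above, with scores kept as strings so that every element has the same type; the helper name score_for is made up for this example.

```python
def score_for(word_list, word):
    """Return the score stored right after word, or None if word is absent.

    Only even positions are inspected, so a score that happens to equal
    the search term is never mistaken for a word.
    """
    for i in range(0, len(word_list) - 1, 2):
        if word_list[i] == word:
            return int(word_list[i + 1])
    return None


word_list = ['word1', '1', 'word2', '2', 'word3', '10']
print(score_for(word_list, 'word3'))  # 10
print(score_for(word_list, 'dog'))    # None
```

Stepping two positions at a time is the reason to prefer this over list.index(word) + 1, which would also match a score that looks like the word being searched for.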
More convoluted suggestion
However if that isn't an option, you could look into creating another model (let's say WordScore) that looked something like:
class WordScore(db.Model):
    word = db.StringProperty()
    score = db.IntegerProperty()
Then, whenever you needed to add a new score, you would create a WordScore instance, fill out the properties and then assign it to the proper entity. I haven't tested any of this, but the idea would be something like:
# Pull the 'other' entity (this would be your main class as defined above)
q = OtherEntity.all()
q.filter('name =', 'Someone')
my_entity = q.get()
# Create new score
ws = WordScore(parent=my_entity)
ws.word = 'dog'
ws.score = 2
ws.put()
You could then pull out the score for dog for 'Someone' by doing something like this (again, completely untested for now - be warned :) ):
# Get key of 'Someone'
q = OtherEntity.all()
q.filter('name =', 'Someone')
my_entity = q.get().key()
# Now get the score
ws = WordScore.all()
ws.filter('word = ', 'dog').ancestor(my_entity)
word_score = ws.get().score

Change to NDB and use the Pickle property:
Value is a Python object (such as a list or a dict or a string) that is serializable using Python's pickle protocol; the Datastore stores the pickle serialization as a blob. Unindexed by default.
NDB Properties
Then you can use it directly:
class table(ndb.Model):
    data_dict = ndb.PickleProperty(default = {})
then
dd = table()
dd.data_dict['word_one'] = "Some_Score"
