I'm planning to do this for my application:
Store a unique id in the key_name of a User model.
At any given time, a user is allowed to choose a username once; I intend to replace the model's original key_name with the username the user chooses.
In my implementation, the User model for a new user is only created once the user is activated.
Given this situation, my question is: which of the following is the better approach?
1. Upon login, the user must choose a username, so that I can create the User model with key_name = the chosen username. However, this approach might feel unpleasant to users, since they should be allowed to choose a username whenever they want.
2. The approach explained in the situation above; however, I would need to use clone_entity. With clone_entity, will reference properties be assigned back to the new cloned entity? Also, since performance is a priority, will this be costly in terms of database operations if it involves a lot of users at the same time?
If you're set on having the username as the key, either approach should work fine (assuming you have logic in place to prevent duplicate usernames).
With clone_entity, will reference properties be assigned back to the new cloned entity?
If the clone is done correctly, reference properties will be copied over without a problem. However, if you have any entities referencing the entity you are cloning, those will not be updated to reference the new clone.
Also, since performance is a priority, will this be costly in terms of database operations if it involves a lot of users at the same time?
As long as the clone is implemented efficiently, and assuming you pass in the entity you want to clone, there should be only one database operation per clone call (the put of the newly created entity).
It looks like the clone_entity you linked to has an update that avoids excess datastore calls for reference properties, so you should be good.
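For reference, here is a minimal sketch of that kind of clone helper, based on the well-known clone_entity recipe for db.Model (the exact code you linked may differ). Reading reference properties with get_value_for_datastore is what avoids the extra fetches:

from google.appengine.ext import db

def clone_entity(e, **extra_args):
    # Build a new entity of the same kind, copying property values.
    klass = e.__class__
    props = {}
    for name, prop in klass.properties().items():
        if isinstance(prop, db.ReferenceProperty):
            # Copy the raw Key instead of dereferencing the
            # referenced entity, so no extra RPC is issued.
            props[name] = prop.get_value_for_datastore(e)
        else:
            props[name] = prop.__get__(e, klass)
    props.update(extra_args)
    return klass(**props)

A call like clone_entity(old_user, key_name=new_username) followed by a single put() of the result matches the one-write-per-clone expectation above.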
I want to add 'check username available' functionality on my signup page using AJAX. I have a few doubts about the way I should implement it.
1. With which event should I register my AJAX requests? We can send the requests when the user focuses out of the 'username' input field (blur event) or as he types (keyup event). Which provides the better user experience?
2. On the server side, a simple way of dealing with requests would be to query my main 'Accounts' database. But this could lead to a lot of requests hitting my database (even more if we POST on the keyup event). Should I maintain a separate model for registered usernames only and use that to get better results?
3. Is it possible to use Memcache in this case? For example, initializing the cache with every username as a key, updating it as we register users, and using a sentinel key to check whether the cache is actually initialized or whether to pass the queries directly to the db.
Answers -
Do the check on blur. If you do it on keyup, you will hammer your server with unnecessary queries, annoy the user who is not yet done typing, and likely lag the typing anyway.
If your Account entity is very large, you may want to create a separate AccountName entity, and create a matching AccountName whenever you create a real Account (though this is probably an unnecessary optimization). When you create the Account (or AccountName), be sure to assign id=name. Then you can do AccountName.get_by_id(name) to quickly see whether the name has already been taken, and it will automatically be pulled from memcache if it has been dealt with recently.
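A minimal sketch of that marker-entity idea (the AccountName kind and helper name are just illustrations):

from google.appengine.ext import ndb

class AccountName(ndb.Model):
    # Empty marker entity: the key id itself is the username.
    pass

def username_taken(name):
    # get_by_id consults NDB's built-in caches (context cache,
    # then memcache) before falling back to the datastore.
    return AccountName.get_by_id(name) is not None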
By default, GAE NDB will automatically populate memcache for you when you put or get entities. If you follow my advice in step 2, things will be very fast and you won't have to mess around with pre-populating memcache.
If you are concerned about two people simultaneously requesting the same username, put your create method in a transaction:
from google.appengine.ext import ndb

class Account(ndb.Model):
    @classmethod
    @ndb.transactional()
    def create_account(cls, name, **other_params):
        # The read is strongly consistent inside the transaction,
        # so two racing requests cannot both see the name as free.
        acct = cls.get_by_id(name)
        if not acct:
            acct = cls(id=name, **other_params)
            acct.put()
        return acct
I would recommend the blur event of the username field, combined with some sort of inline error/warning display.
I would also suggest maintaining a memcache of registered usernames, to reduce datastore hits and improve the user experience. Rather than pre-populating it with a warm-up, populate it lazily, only as requests are made (the cache-aside pattern).
BUT you should only cache USED usernames: do not store "available" usernames here (or if you do, use a much lower timeout, since availability can change at any moment).
You should always check directly against the datastore when actually performing the registration, ideally in some sort of transactional method, so that you don't have race conditions with multiple people registering.
All of this depends on several things, including how busy your app is and what data storage technology you are using.
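A minimal sketch of that lazy cache, assuming NDB Account entities keyed by username (the helper name and cache-key prefix are invented for illustration):

from google.appengine.api import memcache
from google.appengine.ext import ndb

def is_username_taken(name):
    # Cache-aside: consult memcache first, then the datastore.
    cache_key = 'username-taken:' + name
    cached = memcache.get(cache_key)
    if cached is not None:
        return cached
    taken = ndb.Key('Account', name).get() is not None
    if taken:
        # Only cache positive hits; availability can change at any time.
        memcache.set(cache_key, True, time=3600)
    return taken

At registration time you would still do the authoritative check inside a transaction, as described above.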
I have users who have "followers". I need to be able to navigate up and down the tree of users/followers. I will eventually hit App Engine's 1 MB entity size limit if I use ancestor relations and a user has many followers.
What's the best way to structure this data on App Engine?
You cannot use ancestor relations for the simple reason that your use case allows circular references (I follow you, you follow me).
The solution depends on your expected usage patterns. You can choose between two options:
(A) In each user entity, store a list of IDs of the other users that this user is following.
(B) Create a separate entity with two properties, "User" and "Follower"; each such entity represents a single "connection" between users.
While the first option seems simpler, you may run into the exploding-indexes problem. Besides, it may turn out to be the more expensive solution, as each change in a user's relationships requires rewriting the whole user entity along with updates to all of its other indexes. The second solution does not have these drawbacks, but requires a little extra code.
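A minimal sketch of option (B), assuming an existing User kind (the Follow kind and helper names are illustrative):

from google.appengine.ext import ndb

class Follow(ndb.Model):
    # One entity per edge: follower -> user.
    user = ndb.KeyProperty(kind='User')
    follower = ndb.KeyProperty(kind='User')

def followers_of(user_key):
    # Who follows this user?
    return Follow.query(Follow.user == user_key)

def followed_by(user_key):
    # Whom does this user follow?
    return Follow.query(Follow.follower == user_key)

Adding or removing a relationship is then a single put() or delete() of one small Follow entity, instead of rewriting a large user entity.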
I have an NDB model. Once the data in the model becomes stale, I want to remove stale data items from searches and updates. I could have deleted them, which is explained in this SO post, if not for the need to analyze old data later.
I see two choices
adding a boolean status field, and simply mark entities deleted
move entities to a different model
My understanding of the trade-off between these two options:
mark-deleted is faster
mark-deleted is more error prone: the extra property would require modifying all queries to exclude entities that are marked deleted, which increases complexity and the probability of bugs.
Question:
Can the move-entities option be made fast enough to be comparable to mark-deleted?
Any sample code as to how to move entities between models efficiently?
Update: 2014-05-14, I decided for the time being to use mark-deleted. I figure there is an additional benefit of fewer RPCs.
Related:
How to delete all entities for NDB Model in Google App Engine for python?
You could use a combination of the solutions you propose, although in my opinion that would be over-engineering.
1) First, write a task-queue job that updates all of your existing entities with the new is_deleted field set to its default value of False; this prevents the older entities from misbehaving when you ask whether they are deleted.
2) Write your queries at the model level, so you don't have to alter them every time you change your model; just pass the extra parameter you want to filter on when you make the relevant query. You can get an idea from the models in the bootstrap project gae-init. Then you simply query with is_deleted == False (a sketch follows this list).
3) Datastore query performance depends on the size of the result set, not on whether the kind holds 10 entities or 10 million. But if you want to move the deleted ones into a new entity model, you can create a cron job that copies them somewhere else at the end of each day and removes the originals. Don't forget that this will consume your quota, and you might end up paying, quite literally, for the cleanup.
Keep in mind also that if there are any dependencies on the entities you move, you will have to update those as well. So in my opinion it's better to leave the entities flagged, and index the flag.
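A minimal sketch of the model-level query idea from step 2, in the spirit of gae-init (the base-class and method names here are illustrative, not gae-init's actual API):

from google.appengine.ext import ndb

class BaseModel(ndb.Model):
    # Soft-delete flag; indexed by default so queries can filter on it.
    is_deleted = ndb.BooleanProperty(default=False)

    @classmethod
    def qry(cls, *filters, **kwargs):
        # Central query helper: excludes soft-deleted entities
        # unless the caller explicitly asks for them.
        include_deleted = kwargs.pop('include_deleted', False)
        q = cls.query(*filters, **kwargs)
        if not include_deleted:
            q = q.filter(cls.is_deleted == False)
        return q

Every model that inherits from BaseModel then gets soft-delete filtering for free, and only this one helper has to change if the flag's semantics ever do.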
I think I read something about a function App Engine has that can tell you whether an ID/key you want to use for an entity is available, or perhaps a function that returns an available ID to use. The App Engine team also said that we should set the ID when the entity is created and not change it afterwards. But in practice, can we just copy everything over to a new entity with the new ID?
Thanks!
Update
I think the function I'm looking for is allocateIDs from the docs:
http://code.google.com/appengine/docs/python/datastore/functions.html
To reserve one or more IDs, use allocate_ids(). To check whether an ID is already taken, just construct a Key for it using Key.from_path(kind, id) and try to db.get() it. Also note that IDs for keys with a parent are taken from separate pools and are only unique among keys with the same parent.
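A minimal sketch of both calls with the old db API from that page (the 'Account' kind is just an example):

from google.appengine.ext import db

# Reserve a batch of 10 IDs for the Account kind; the datastore
# will never assign these automatically afterwards.
first, last = db.allocate_ids(db.Key.from_path('Account', 1), 10)

# Check whether a specific ID is already taken.
key = db.Key.from_path('Account', first)
already_taken = db.get(key) is not None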
On the page describing transactions, a use case is presented where the entity in question, a SalesAccount, is updated, or created if it doesn't exist yet. The technique is to try to load the entity with the given key and, if nothing is returned, create it. It's important to do this inside a transaction, to avoid the situation where two users race for the same key, both see that it doesn't exist, and both try to create it.
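A minimal sketch of that get-or-create pattern with the db API (SalesAccount's property is invented for illustration):

from google.appengine.ext import db

class SalesAccount(db.Model):
    balance = db.IntegerProperty(default=0)  # illustrative property

def credit_account(key_name, amount):
    def txn():
        # Load-or-create inside the transaction, so two racing
        # requests cannot both create the same account.
        acct = SalesAccount.get_by_key_name(key_name)
        if acct is None:
            acct = SalesAccount(key_name=key_name)
        acct.balance += amount
        acct.put()
        return acct
    return db.run_in_transaction(txn)

For the plain create-if-missing case, db.Model.get_or_insert() wraps this same pattern for you.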
If you are familiar with Django, you know that it has an authentication system with a User model. Of course, I have many other tables that have a foreign key to this User model.
If I want to delete this user, how do I architect a script (or through mysql itself) to delete every table that is related to this user?
My only worry is doing this manually: if I add a table but forget to add it to my DELETE operation, I end up with rows that link to a deleted, non-existing User.
As far as I understand it, Django does an ON DELETE CASCADE by default:
http://docs.djangoproject.com/en/dev/topics/db/queries/#deleting-objects
You don't need a script for this. When you delete a record, Django will automatically delete all dependent records (thus taking care of its own database integrity).
This is simple to test. In the admin, go to delete a User. On the confirmation page, you'll see a list of all dependent records in the system. You can use this any time as a quick test to see what's dependent on what (as long as you don't actually click Confirm).
If you perform deletions from your view code with .delete(), all dependent objects will be deleted automatically with no option for confirmation.
From the docs:
When Django deletes an object, it emulates the behavior of the SQL constraint ON DELETE CASCADE -- in other words, any objects which had foreign keys pointing at the object to be deleted will be deleted along with it.
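A minimal illustration (the username is invented; deleting any instance or queryset behaves the same way):

from django.contrib.auth.models import User

# Deleting the user also deletes every object holding a
# ForeignKey to it, emulating ON DELETE CASCADE.
user = User.objects.get(username='alice')
user.delete()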