GAE post does not show directly after put

I have a very simple "guestbook" script on GAE/Python. However, entries that I put() into the datastore often do not show up right away; I almost always need to refresh the page.
def post(self):
    t = NewsBase(
        date = datetime.now(),
        text = self.request.get('text'),
        title = self.request.get('title'),
        link = self.request.get('link'),
        upvotes = [],
        downvotes = [],
    )
    t.put()
    q = db.GqlQuery('SELECT * FROM NewsBase ORDER BY date DESC')
    template_values = {
        'q' : q,
        'user' : user,
        'search' : search
    }
    template = jinja_environment.get_template('finaggnews.html')
    self.response.out.write(template.render(template_values))
I'm sure there is a solution to this?
Best,
Oliver

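This is due to the eventual consistency model of the High Replication Datastore (HRD): a query that is not an ancestor query may run before the indexes reflect the entity you just put(), so the new entry can be missing until a later request.
You should really read some of the intro docs, especially Structuring Data for Strong Consistency - https://developers.google.com/appengine/docs/python/datastore/structuring_for_strong_consistency - and do some searching on SO. This question has been asked many times before.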

Get Related Data from ManyToManyField in Django

Working with a ManyToManyField, I want to get the data of all the users related to each queried model object, along with the other field data in the model.
For example for the below model, I have 2 users related to this "ChatRoom"
class ChatRoomParticipants(models.Model):
    user = models.ManyToManyField(User, related_name='chatroom_users')
    room = models.ForeignKey(ChatRoom, on_delete=models.PROTECT)
With the below query
chatrooms = list(ChatRoomParticipants.objects.filter(user=user).values('user__user_uid', 'room__id', 'room__name'))
I'm able to fetch
[{'user__user_uid': UUID('f4253fbd-90d1-471f-b541-80813b51d610'), 'room__id': 4, 'room__name': 'f4253fbd-90d1-471f-b541-80813b51d610-872952bb-6c34-4e50-b6fd-7053dfa583de'}]
But I'm expecting something like
[{
    'user__user_uid1': UUID('f4253fbd-90d1-471f-b541-80813b51d610'),
    'user__user_uid2': UUID('872952bb-6c34-4e50-b6fd-7053dfa583de'),
    'room__id': 4,
    'room__name': 'f4253fbd-90d1-471f-b541-80813b51d610-872952bb-6c34-4e50-b6fd-7053dfa583de'
},
{
    'user__user_uid1': UUID('f4253fbd-90d1-471f-b541-80813b51d610'),
    'user__user_uid2': UUID('eecd66e7-4874-4b96-bde0-7dd37d0b83b3'),
    'room__id': 5,
    'room__name': 'f4253fbd-90d1-471f-b541-80813b51d610-eecd66e7-4874-4b96-bde0-7dd37d0b83b3'
},
{
    'user__user_uid1': UUID('f4253fbd-90d1-471f-b541-80813b51d610'),
    'user__user_uid2': UUID('4f4c0f3d-2292-4d06-afdc-1e95962ac5e6'),
    'room__id': 6,
    'room__name': 'f4253fbd-90d1-471f-b541-80813b51d610-4f4c0f3d-2292-4d06-afdc-1e95962ac5e6'
}]
I've searched and found I can do something like
user_data = chatrooms.users.all().values('user_uid')
But the above doesn't work well with filter, and I would miss out on the room data.
Note: I know that's not the correct way to do what I'm trying to achieve; I would appreciate it if anyone could show me the correct way to get this data.

Your example is somewhat confusing, but I think what you are looking for is the information of the users related to the same room.
chat_room = ChatRoomParticipants.objects.get(id=id_room)
users_chat = chat_room.user.all().values_list('user_uid', flat=True)
data = {
    "room__id": chat_room.room.id,
    "room__name": chat_room.room.name,
    "users": users_chat,
}
For something more consistent you can use serializers.

How to generate json data from a many to many database table?

I am using sqlite db and peewee as the ORM.
My data model is:
class User(UserMixin, db.Model):
    nickname = CharField(index=True, unique=True)
class Circle(db.Model):
    name = CharField(unique=True)
class UserInCircle(db.Model):
    user = ForeignKeyField(User, related_name="in_circles")
    circle = ForeignKeyField(Circle, related_name="include_users")
    privilege = IntegerField()
What I need is to get a data format like the following:
[{"nickname": "urbainy", "privilege": 7, "in_circles": [{"circle_name": "world"}, {"circle_name": "test"}]}, {"nickname": "ywe", "privilege": 1, "in_circles": [{"circle_name": "family"}]}, {"nickname": "ymo", "privilege": null, "in_circles": []}]
So this is a nested JSON object. I tried marshmallow but failed because of the many-to-many data structure: I could never get the in_circles field. I am a beginner programmer, so maybe this question is basic, but I really have no idea how to solve it. Thank you very much!

For now, I have adopted this approach to solve the problem:
@login_required
def setting():
    users_in_circles = (User.select(User.nickname,
                                    UserInCircle.privilege,
                                    Circle.name.alias("circle_name"))
                        .join(UserInCircle, JOIN.LEFT_OUTER)
                        .join(Circle, JOIN.LEFT_OUTER)
                        .order_by(User.id))
    users_in_circles_data = []
    user_nickname = ""
    user_in_circles = []
    for user_in_circle in users_in_circles.naive():
        if user_in_circle.nickname != user_nickname:
            # first row for this user: start a new entry
            user_nickname = user_in_circle.nickname
            user_in_circles = [dict(circle_name=str(user_in_circle.circle_name), privilege=str(user_in_circle.privilege))]
            users_in_circles_data.append(dict(nickname=user_in_circle.nickname, in_circles=user_in_circles))
        else:
            # further row for the same user: extend the current entry
            user_in_circles.append(dict(circle_name=str(user_in_circle.circle_name), privilege=str(user_in_circle.privilege)))
            users_in_circles_data[-1].update(nickname=user_in_circle.nickname, in_circles=user_in_circles)
    print(users_in_circles_data)
    return render_template("admin_setting.html", circles=Circle.select(), users=User.select(), users_in_circles_data=users_in_circles_data)
Still, I think there may be a neater way to implement this, such as marshmallow or some other tool. If you know a better solution, you are welcome to reply to my post.
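The grouping step in the loop above can also be sketched with itertools.groupby. This is only an illustration on plain tuples, not peewee code: the rows list and the nest_by_user helper are hypothetical, and taking the privilege from the first row of each user is an assumption made to match the output format asked for in the question.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical flat rows, shaped like the result of the LEFT OUTER JOIN:
# (nickname, privilege, circle_name), already ordered by user.
rows = [
    ("urbainy", 7, "world"),
    ("urbainy", 7, "test"),
    ("ywe", 1, "family"),
    ("ymo", None, None),  # user without a circle: the outer join yields NULLs
]

def nest_by_user(rows):
    """Group consecutive rows by nickname into one dict per user."""
    result = []
    for nickname, group in groupby(rows, key=itemgetter(0)):
        group = list(group)
        result.append({
            "nickname": nickname,
            # Assumption: one privilege per user, taken from the first row.
            "privilege": group[0][1],
            "in_circles": [
                {"circle_name": circle}
                for _, _, circle in group
                if circle is not None
            ],
        })
    return result

users_in_circles_data = nest_by_user(rows)
```

groupby only merges adjacent rows with the same key, so this relies on the query being ordered by user, as it is in the code above.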

ToscaWidgets2 Capture Data from GrowingGridLayout

Currently working on a project with TurboGears2 and ToscaWidgets2. I have a form set up with a few static fields: name, date, and point of contact information. Inside this form I have added a sub form where the user can dynamically add numerous entries in a GrowingGridLayout. The form, its layout, and submitting information all work fine, but I'm having a hard time figuring out how to capture the information from the GrowingGridLayout once it's passed on for saving. The main question is: how do I know how many entries were included in the form?
Included the code for the form:
class OnrampForm(twf.Form):
    title = "Onramp Submission Form"
    class child(twd.CustomisedTableForm):
        onramp_name = twf.TextField(validator=twc.Required)
        class Destinations (twd.GrowingGridLayout):
            environment = twf.SingleSelectField(label='Environment', validator=twc.Validator(required=True), options=[<OPTIONS>])
            location = twf.SingleSelectField(validator=twc.Required, label='Location', options=[<OPTIONS>])
            jms_type = twf.SingleSelectField(label='JMS Type', validator=twc.Validator(required=True), options=[<OPTIONS>])
            subscription_type = twf.SingleSelectField(label='Subscription Type', validator=twc.Validator(required=True), options=[<OPTIONS>])
        onramp_status = twf.SingleSelectField(prompt_text='Status', options=['Initial Release', 'Update'], validator=twc.Required)
        current_date = datetime.date.today()
        need_by_date = twd.CalendarDatePicker(validators=[twc.Required, twc.DateTimeValidator])
        need_by_date.default = current_date + datetime.timedelta(days=30)
        organization = twf.TextField(validator=twc.Required)
        poc_name = twf.TextField(validator=twc.Required)
        poc_email = twf.EmailField(validator=twc.EmailValidator)
        poc_phone = twf.TextField(validator=twc.Required)
        poc_address = twf.TextField()
        poc_city = twf.TextField()
        poc_state = twf.TextField()
        onramp_form = twf.FileField()
        submit = twf.SubmitButton(value="Submit")
    action = "/print_args"
    submit = ""

If your controller uses @validate against the form, you should get the data in the Destinations parameter, which should be a list of dictionaries; the length of that list tells you how many entries were submitted.
Also, I just noticed you have two nested forms, which is likely to confuse TW2. What you probably want is for OnrampForm to inherit from CustomisedForm and for child to inherit from TableLayout. See http://turbogears.readthedocs.org/en/latest/cookbook/TwForms.html#displaying-forms
PS: note that need_by_date.default = current_date + datetime.timedelta(days=30) will always give 30 days from when the server started, because current_date = datetime.date.today() is a class variable that is computed once, when the module is imported, and never again.
You should use default = Deferred(lambda: datetime.date.today() + datetime.timedelta(days=30)) to get a date that is computed on each request.
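The difference between a value computed in the class body and one wrapped in a callable can be shown in plain Python. This sketch does not use ToscaWidgets at all: FakeClock, EagerDefault and LazyDefault are hypothetical names, and the fake clock only stands in for datetime.date.today() so that the passage of time is visible.

```python
import datetime

class FakeClock:
    """Hypothetical stand-in for the system clock."""
    def __init__(self, today):
        self.today = today

clock = FakeClock(datetime.date(2013, 1, 1))

class EagerDefault:
    # Evaluated once, when the class body runs (that is, at import time).
    default = clock.today + datetime.timedelta(days=30)

class LazyDefault:
    # Stored as a callable and evaluated each time it is called.
    default = staticmethod(lambda: clock.today + datetime.timedelta(days=30))

# Time passes while the server keeps running.
clock.today = datetime.date(2013, 6, 1)

eager = EagerDefault.default   # still based on the import-time date
lazy = LazyDefault.default()   # based on the current date
```

The eager value stays pinned to the date the class was defined, while the callable picks up the new date, which is the behaviour the Deferred wrapper is meant to provide.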

Floats in JSON on GAE / gviz_api

I have a Python application running on Google App Engine which outputs data in JSON format structured by the gviz_api for Google Charts visualisation. The code is as follows:
class StatsItem(ndb.Model):
    added = ndb.DateTimeProperty(auto_now_add = True, verbose_name = "Upload date")
    originated = ndb.DateTimeProperty(verbose_name = "Origination date")
    host = ndb.StringProperty(verbose_name = "Originating host")
    uptime = ndb.IntegerProperty(indexed = False, verbose_name = "Uptime")
    load1 = ndb.FloatProperty(indexed = False, verbose_name = "1-min load")
    load5 = ndb.FloatProperty(indexed = False, verbose_name = "5-min load")
    load15 = ndb.FloatProperty(indexed = False, verbose_name = "15-min load")
class ChartDataPage(webapp2.RequestHandler):
    def get(self):
        span = int(self.request.get('span', 720))
        stats = StatsItem.query().order(-StatsItem.originated).fetch(span)
        header = { 'originated' : ("datetime", "date") }
        vars = []
        for v in self.request.get_all('v'):
            if v in StatsItem._properties.keys():
                vars.append(v)
                header[v] = ("number", StatsItem._properties[v]._verbose_name)
        data = []
        for s in stats:
            entry = { 'originated' : s.originated }
            for v in vars:
                entry[v] = getattr(s, v)
            data.append(entry)
        data_table = gviz_api.DataTable(header)
        data_table.LoadData(data)
        self.response.headers['Content-Type'] = 'application/json'
        self.response.out.write(data_table.ToJSonResponse(columns_order=(("originated",) + tuple(vars)),
                                                          order_by="originated"))
It is working all right, but I get the famous issue with float-type properties, namely this is the output I am seeing (example):
google.visualization.Query.setResponse({"status":"ok","table":{"rows":[{"c":[{"v":"Date(2013,11,19,12,55,22,460)"},{"v":0.33000000000000002}]},{"c":[{"v":"Date(2013,11,19,12,56,22,641)"},{"v":0.33000000000000002}]},{"c":[{"v":"Date(2013,11,19,12,57,22,747)"},{"v":0.28999999999999998}]},{"c":[{"v":"Date(2013,11,19,12,58,22,914)"},{"v":0.25}]},{"c":[{"v":"Date(2013,11,19,12,59,23,19)"},{"v":0.28000000000000003}]},{"c":[{"v":"Date(2013,11,19,13,0,23,169)"},{"v":0.28000000000000003}]},{"c":[{"v":"Date(2013,11,19,13,1,23,268)"},{"v":0.41999999999999998}]},{"c":[{"v":"Date(2013,11,19,13,2,23,385)"},{"v":0.40999999999999998}]},{"c":[{"v":"Date(2013,11,19,13,3,23,518)"},{"v":0.40999999999999998}]},{"c":[{"v":"Date(2013,11,19,13,4,23,643)"},{"v":0.40999999999999998}]}],"cols":[{"type":"datetime","id":"originated","label":"date"},{"type":"number","id":"load5","label":"5-min load"}]},"reqId":"0","version":"0.6"});
So a float with a value of 0.33 (as seen in the DataStore viewer) is represented as 0.33000000000000002 in JSON. While it works, this is not only ugly, but also takes up bandwidth, so I would like to round it to 2 digits, i.e. 0.33. Strangely enough in some cases, this is happening (see 0.25 above).
I am loading the gviz_api module from my applications directory.
I have tried the following solutions, none of these worked:
round()-ing the figure before inputting into the datatable (round(getattr(s, v)) in the above code). It gets invoked, as I see integers turning into floats, but has no impact on the above issue with floats.
Monkey-patching JSON both in the GAE application module and also in the gviz_api module. No effect, the code is just simply not invoked, as if it was not there at all.
Overriding the default() method in gviz_api.DataTableJSONEncoder. This is not working I guess because it gets invoked only for unknown data types.
I have not tried yet to process the JSON string produced with regexps and I would like to avoid that if possible. Any ideas how to fix it?
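For comparison, here is a minimal sketch of rounding every float in a nested structure before it is serialized with the standard json module. It does not involve gviz_api, and round_floats is a hypothetical helper. It relies on the interpreter printing the shortest round-trip repr of a float (Python 2.7 and 3.1 onwards); on an older runtime the long digits would come back, so whether it helps in the setup above is an open assumption.

```python
import json

def round_floats(value, ndigits=2):
    """Return a copy of value with every float rounded to ndigits places."""
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {k: round_floats(v, ndigits) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [round_floats(v, ndigits) for v in value]
    return value  # ints, strings, None and so on pass through unchanged

# Hypothetical entry, shaped like one row of the data list above.
entry = {"load5": 0.33000000000000002, "uptime": 1234}
encoded = json.dumps(round_floats(entry), sort_keys=True)
```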

Implementing a popularity algorithm in Django

I am creating a site similar to reddit and hacker news that has a database of links and votes. I am implementing hacker news' popularity algorithm and things are going pretty swimmingly until it comes to actually gathering up these links and displaying them. The algorithm is simple:
Y Combinator's Hacker News:
Popularity = (p - 1) / (t + 2)^1.5
Votes divided by age factor.
Where
p : votes (points) from users.
t : time since submission in hours.
p is subtracted by 1 to negate submitter's vote.
Age factor is (time since submission in hours plus two) to the power of 1.5.
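As a quick check of the formula above, it can be written as a small plain-Python function. The name popularity and its parameter names are only illustrative.

```python
def popularity(points, hours_since_submission):
    """Hacker News style score: votes divided by an age factor.

    points                  -- votes from users, including the submitter's own
    hours_since_submission  -- age of the link in hours
    """
    age_factor = (hours_since_submission + 2) ** 1.5
    return (points - 1) / age_factor

# A link with only the submitter's vote scores zero regardless of age.
fresh_unvoted = popularity(1, 0)
# The same number of votes is worth less as the link gets older.
young = popularity(9, 2)
old = popularity(9, 10)
```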
I asked a very similar question over yonder, Complex ordering in Django, but instead of contemplating my options I chose one and tried to make it work, because that's how I did it with PHP/MySQL. I now know Django does things a lot differently.
My models look something (exactly) like this
class Link(models.Model):
    category = models.ForeignKey(Category)
    user = models.ForeignKey(User)
    created = models.DateTimeField(auto_now_add = True)
    modified = models.DateTimeField(auto_now = True)
    fame = models.PositiveIntegerField(default = 1)
    title = models.CharField(max_length = 256)
    url = models.URLField(max_length = 2048)
    def __unicode__(self):
        return self.title
class Vote(models.Model):
    link = models.ForeignKey(Link)
    user = models.ForeignKey(User)
    created = models.DateTimeField(auto_now_add = True)
    modified = models.DateTimeField(auto_now = True)
    karma_delta = models.SmallIntegerField()
    def __unicode__(self):
        return str(self.karma_delta)
and my view:
def index(request):
    popular_links = Link.objects.select_related().annotate(karma_total = Sum('vote__karma_delta'))
    return render_to_response('links/index.html', {'links': popular_links})
Now from my previous question, I am trying to implement the algorithm using the sorting function. An answer from that question seems to think I should put the algorithm in the select and sort then. I am going to paginate these results so I don't think I can do the sorting in python without grabbing everything. Any suggestions on how I could efficiently do this?
EDIT
This isn't working yet but I think it's a step in the right direction:
from django.shortcuts import render_to_response
from linkett.apps.links.models import *
def index(request):
    popular_links = Link.objects.select_related()
    popular_links = popular_links.extra(
        select = {
            'karma_total': 'SUM(vote.karma_delta)',
            'popularity': '(karma_total - 1) / POW(2, 1.5)',
        },
        order_by = ['-popularity']
    )
    return render_to_response('links/index.html', {'links': popular_links})
This errors out into:
Caught an exception while rendering: column "karma_total" does not exist
LINE 1: SELECT ((karma_total - 1) / POW(2, 1.5)) AS "popularity", (S...
EDIT 2
Better error?
TemplateSyntaxError: Caught an exception while rendering: missing FROM-clause entry for table "vote"
LINE 1: SELECT ((vote.karma_total - 1) / POW(2, 1.5)) AS "popularity...
My index.html is simply:
{% block content %}
{% for link in links %}
karma-up
{{ link.karma_total }}
karma-down
{{ link.title }}
Posted by {{ link.user }} to {{ link.category }} at {{ link.created }}
{% empty %}
No Links
{% endfor %}
{% endblock content %}
EDIT 3
So very close! Again, all these answers are great but I am concentrating on a particular one because I feel it works best for my situation.
from django.db.models import Sum
from django.shortcuts import render_to_response
from linkett.apps.links.models import *
def index(request):
    popular_links = Link.objects.select_related().extra(
        select = {
            'popularity': '(SUM(links_vote.karma_delta) - 1) / POW(2, 1.5)',
        },
        tables = ['links_link', 'links_vote'],
        order_by = ['-popularity'],
    )
    return render_to_response('links/test.html', {'links': popular_links})
Running this I am presented with an error hating on my lack of group by values. Specifically:
TemplateSyntaxError at /
Caught an exception while rendering: column "links_link.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...karma_delta) - 1) / POW(2, 1.5)) AS "popularity", "links_lin...
I'm not sure why links_link.id wouldn't be in my GROUP BY, and I don't know how to alter the GROUP BY myself; Django usually does that.

On Hacker News, only the 210 newest stories and 210 most popular stories are paginated (7 pages worth * 30 stories each). My guess is that the reason for the limit (at least in part) is this problem.
Why not drop all the fancy SQL for the most popular stories and just keep a running list instead? Once you've established a list of the top 210 stories you only need to worry about reordering when a new vote comes in since relative order is maintained over time. And when a new vote does come in, you only need to worry about reordering the story that received the vote.
If the story that received the vote is not on the list, calculate the score of that story, plus the least popular story that is on the list. If the story that received the vote is lower, you're done. If it's higher, calculate the current score for the second-to-least most popular (story 209) and compare again. Continue working up until you find a story with a higher score and then place the newly-voted-upon story right below that one in the rankings. Unless, of course, it reaches #1.
The benefit of this approach is that it limits the set of stories you have to look at to figure out the top stories list. In the absolute worst case scenario, you have to calculate the ranking for 211 stories. So it's very efficient unless you have to establish the list from an existing data set - but that's just a one-time penalty assuming you cache the list someplace.
Downvotes are another issue, but I can only upvote (at my karma level, anyway).
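The running-list idea in this answer can be sketched in plain Python. Everything here is hypothetical: apply_vote, the score_of callable and the story ids are illustrative names, and the sketch only handles upvotes, as the answer does.

```python
def apply_vote(top, story_id, score_of, limit=210):
    """Reposition story_id in the ranked list after it received a vote.

    top       -- list of story ids, most popular first (changed in place)
    score_of  -- callable returning the current score for a story id
    limit     -- maximum number of stories kept on the list
    """
    if story_id in top:
        top.remove(story_id)
    new_score = score_of(story_id)
    # Walk up from the bottom until a story with a higher or equal score is found.
    pos = len(top)
    while pos > 0 and score_of(top[pos - 1]) < new_score:
        pos -= 1
    top.insert(pos, story_id)
    del top[limit:]  # drop whatever fell off the end of the list
    return top

# Hypothetical scores and a three-entry list for illustration.
scores = {"a": 5.0, "b": 3.0, "c": 1.0, "d": 0.5}
top = ["a", "b", "c"]
scores["d"] = 4.0                        # story d receives votes
apply_vote(top, "d", scores.get, limit=3)
```

Only the voted story and the entries it passes on the way up are scored, which is the saving the answer describes.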

popular_links = Link.objects.select_related()
popular_links = popular_links.extra(
    select = {
        'karma_total': 'SUM(vote.karma_delta)',
        'popularity': '(karma_total - 1) / POW(2, 1.5)'
    },
    order_by = ['-popularity']
)
Or select some sane number, sort the selection using Python in any way you like, and cache it if it's going to be static for all users, which it looks like it will; set the cache expiration to a minute or so.
But the extra will work better for paginated results in a highly dynamic setup.

Seems like you could overload the save of the Vote class and have it update the corresponding Link object. Something like this should work well:
from datetime import datetime, timedelta
class Link(models.Model):
    category = models.ForeignKey(Category)
    user = models.ForeignKey(User)
    created = models.DateTimeField(auto_now_add = True)
    modified = models.DateTimeField(auto_now = True)
    fame = models.PositiveIntegerField(default = 1)
    title = models.CharField(max_length = 256)
    url = models.URLField(max_length = 2048)
    #a field to keep the most recently calculated popularity
    popularity = models.FloatField(default = None)
    def CalculatePopularity(self):
        """
        Add a shortcut to make life easier ... this is used by the overloaded save() method and
        can be used in a management function to do a mass-update periodically
        """
        # created is only filled in on the first save, so fall back to "now" for a new link
        ts = datetime.now() - (self.created or datetime.now())
        # total_seconds() covers whole days too; ts.seconds alone would ignore them
        th = ts.total_seconds() / 3600.0
        # votes hang off the reverse relation of Vote.link; an unsaved link has none yet
        votes = self.vote_set.count() if self.pk else 0
        self.popularity = (votes - 1) / ((th + 2) ** 1.5)
    def save(self, *args, **kwargs):
        """
        Modify the save function to calculate the popularity
        """
        self.CalculatePopularity()
        super(Link, self).save(*args, **kwargs)
    def __unicode__(self):
        return self.title
class Vote(models.Model):
    link = models.ForeignKey(Link)
    user = models.ForeignKey(User)
    created = models.DateTimeField(auto_now_add = True)
    modified = models.DateTimeField(auto_now = True)
    karma_delta = models.SmallIntegerField()
    def save(self, *args, **kwargs):
        """
        Modify the save function to calculate the popularity of the Link object
        """
        # store the vote first so it is counted, then save the link,
        # whose save() recalculates and stores the popularity
        super(Vote, self).save(*args, **kwargs)
        self.link.save()
    def __unicode__(self):
        return str(self.karma_delta)
This way every time you call link_o.save() or vote_o.save() it will recalculate the popularity. You have to be a little careful, because a bulk Link.objects.all().update(...) won't call our overloaded save() function. So when I use this sort of thing I create a management command which updates all of the objects so they're not too out of date. Something like this will work:
for link in Link.objects.all().iterator():
    link.save()  # save() recalculates and stores the popularity
Because iterator() does not cache the results, this avoids holding every Link object in memory at once, so a giant database won't cause a memory error.
Now to do your ranking all you have to do is:
Link.objects.all().order_by('-popularity')
It will be super-fast since all of your Link items already have their popularity calculated.

Here was the final answer to my question although many months late and not exactly what I had in mind. Hopefully it will be useful to some.
def hot(request):
    links = Link.objects.select_related().annotate(votes=Count('vote')).order_by('-created')[:150]
    for link in links:
        delta_in_hours = (int(datetime.now().strftime("%s")) - int(link.created.strftime("%s"))) / 3600
        link.popularity = ((link.votes - 1) / (delta_in_hours + 2)**1.5)
    links = sorted(links, key=lambda x: x.popularity, reverse=True)
    links = paginate(request, links, 5)
    return direct_to_template(
        request,
        template = 'links/link_list.html',
        extra_context = {
            'links': links
        })
What's going on here: I pull the latest 150 submissions (5 pages of 30 links each). If you need more, you can grab them by altering my slice [:150]. This way I don't have to iterate over my whole queryset, which might eventually become very large, and really 150 links should be enough procrastination for anybody.
I then calculate the difference in time between now and when the link was created and turn it into hours (not nearly as easy as I thought).
Apply the algorithm to a nonexistent field (I like this method because I don't have to store the value in my database and it isn't reliant on surrounding links).
The line immediately after the for loop was where I had another bit of trouble. I can't order_by('popularity') because it's not a real field in my database and is calculated on the fly, so I have to convert my queryset into an object list and sort by popularity from there.
The next line is just my paginator shortcut; thankfully pagination does not require a queryset, unlike some generic views (talking to you, object_list).
Spit everything out into a nice direct_to_template generic view and be on my merry way.
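The hours calculation is the fiddly step here, and strftime("%s") is a platform-specific extension that is not available everywhere. A possible alternative, sketched on plain dicts instead of a queryset, is to subtract the datetimes and use timedelta.total_seconds(). The hours_since helper and the sample links are hypothetical.

```python
from datetime import datetime

def hours_since(created, now):
    """Hours (with fraction) between two datetimes."""
    return (now - created).total_seconds() / 3600.0

# Hypothetical links standing in for the annotated queryset.
now = datetime(2010, 6, 1, 12, 0, 0)
links = [
    {"title": "old but loved", "votes": 40, "created": datetime(2010, 5, 30, 12, 0, 0)},
    {"title": "fresh", "votes": 5, "created": datetime(2010, 6, 1, 10, 0, 0)},
    {"title": "new, no votes", "votes": 1, "created": datetime(2010, 6, 1, 11, 0, 0)},
]

for link in links:
    age = hours_since(link["created"], now)
    # Same formula as in the view above, applied to an in-memory value.
    link["popularity"] = (link["votes"] - 1) / (age + 2) ** 1.5

ranked = sorted(links, key=lambda link: link["popularity"], reverse=True)
```

Subtracting datetimes keeps the days component, so links older than a day are aged correctly.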
