Django How to update model object fields from external api data - python

I have a script that runs on a scheduler and fetches data from an API. I intend to use this data to update the existing records of a database model.
My model ShowInfo within main/models.py:
from django.db import models
from django.contrib.auth.models import User
class ShowInfo(models.Model):
    title = models.CharField(max_length=50)
    latest_ep_num = models.FloatField()
    ld = models.BooleanField()
    sd = models.BooleanField()
    hd = models.BooleanField()
    fhd = models.BooleanField()
    following = models.ManyToManyField(User, related_name = 'following', blank=True)
I managed to isolate the issue to this section of the script which runs but inserts duplicate shows with the same titles into the database:
else: #test if api fails
    for t in real_title:
        if t in data_title: #testing if the titles in the database and from the api match
            a = ShowInfo.objects.get(title=t)
            id = a.id
            b = next(item for item in show_list if item["title"] == t)
            a1 = ShowInfo(id = id, title = b["title"], latest_ep_num=b["latest_ep_num"], ld=b["ld"], sd=b["sd"],hd=b["hd"],fhd=b["fhd"])
            a1.save()
Some additional info about the lists (where show_list is a list of dictionaries gotten from an api):
database = ShowInfo.objects.values()
real_title = []
data_title = []
for show in show_list:
    real_title.append(show["title"])
for data in database:
    data_title.append(data["title"])
When the script runs, I can see in DB Browser for SQLite that the objects are being inserted rather than updated as I intended.
The script is supposed to catch shows that have the same title in the API data and in the database, and to update any changed information. Does anyone have any idea what is wrong with my save() call?

After a day of trial and error and scrounging around the internet I finally found a solution that worked for me.
For anyone interested, the solution came from another user on Stack Overflow: override the model's save() method so that it forces an update, rather than an insert, when an object with the same field value is already in the DB.
Other methods such as doing force_update=True or update_or_create() did not work in my case.
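As a rough illustration of the "update if a row with the same title exists, otherwise insert" behaviour described above, here is a minimal sketch that uses the standard-library sqlite3 module instead of the Django ORM. The table name show_info, the helper upsert_show, and the sample data are assumptions made for this example; it is not the code from the answer referred to above.

```python
import sqlite3

def upsert_show(conn, show):
    """Update the row whose title matches, or insert a new one (hypothetical helper)."""
    cur = conn.execute(
        "UPDATE show_info SET latest_ep_num = ?, hd = ? WHERE title = ?",
        (show["latest_ep_num"], show["hd"], show["title"]),
    )
    if cur.rowcount == 0:  # no existing row with this title, so insert one
        conn.execute(
            "INSERT INTO show_info (title, latest_ep_num, hd) VALUES (?, ?, ?)",
            (show["title"], show["latest_ep_num"], show["hd"]),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE show_info ("
    "id INTEGER PRIMARY KEY, title TEXT, latest_ep_num REAL, hd INTEGER)"
)

# Assumed sample data standing in for show_list from the API.
upsert_show(conn, {"title": "Show A", "latest_ep_num": 1.0, "hd": False})
upsert_show(conn, {"title": "Show A", "latest_ep_num": 2.0, "hd": True})
upsert_show(conn, {"title": "Show B", "latest_ep_num": 5.0, "hd": True})

rows = conn.execute(
    "SELECT title, latest_ep_num, hd FROM show_info ORDER BY title"
).fetchall()
```

The second call for "Show A" changes the existing row instead of adding a duplicate, which is the behaviour the scheduled script was after.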

Related

peewee 'no such table' error

I am trying to put data into a database using flask and peewee, and I have come across the following error: peewee.OperationalError: no such table: post
My models.py file is below:
from peewee import *
import datetime
db = SqliteDatabase('posts.db') #create database to interact with
#create a class for blogposts
class Post(Model):
    id = PrimaryKeyField()
    date = DateTimeField(default = datetime.datetime.now)
    title = CharField()
    text = TextField()
    class Meta:
        database = db
def initialize_db():
    db.connect()
    db.create_tables([Post], safe = True)
    db.close()
I have Googled this, and for most people the lack of 'db.create_tables()' seems to be the problem. Obviously, it's in my code, so I am really not sure where the error is coming from. Some advice would be much appreciated. The problem seems to arise specifically when I try to populate the 'text' field using another .py file.
I adapted your code into the following snippet and it works for me:
from peewee import *
import datetime
db = SqliteDatabase('posts.db') #create database to interact with
#create a class for blogposts
class Post(Model):
    id = PrimaryKeyField()
    date = DateTimeField(default = datetime.datetime.now)
    title = CharField()
    text = TextField()
    class Meta:
        database = db
def initialize_db():
    db.connect()
    db.create_tables([Post], safe = True)
    db.close()
initialize_db() #if db tables are not created, create them
post = Post.create(id=4, title="Some title", text="some text1") #add a new row
post.save() #persist it to db, not necessarily needed
You'll need to call the create method when creating a new Post (i.e. a new row in your database). Other than that, initialize_db() seems to work just fine.
If you are unable to perform any writes on the database, make sure you have write access to the directory where the database file lives (in this case, that would be your working directory).
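For context, the "no such table" message comes from SQLite itself and appears whenever a statement touches a table that does not exist in the database that is currently open. The following minimal sketch uses the standard-library sqlite3 module rather than peewee, and the post table layout is an assumption that only loosely mirrors the model above. It reproduces the error and shows that creating the table first avoids it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # fresh, empty database

try:
    # Writing before the table exists triggers the error.
    conn.execute("INSERT INTO post (title, text) VALUES (?, ?)", ("t", "body"))
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Create the table only if it is missing, then retry the write.
conn.execute(
    "CREATE TABLE IF NOT EXISTS post ("
    "id INTEGER PRIMARY KEY, title TEXT, text TEXT)"
)
conn.execute("INSERT INTO post (title, text) VALUES (?, ?)", ("t", "body"))
row_count = conn.execute("SELECT COUNT(*) FROM post").fetchone()[0]
```

Two things may be worth checking in the original setup, although neither is confirmed by the question: whether initialize_db() is actually called before the other .py file writes to the table, and whether both files run from the same working directory, since a relative path such as 'posts.db' is resolved against it.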

Performance optimization of a database access query in Python/Django web app

TL;DR
In a Django app I maintain, I'm making a DB call from inside a FOR loop. Bad idea; I want to take this outside the loop. Here's the code:
for link in context["object_list"]:
    try:
        latest_reply = link.publicreply_set.latest('submitted_on')
        #if latest_reply is something:
            #do something
    except:
        pass
What would be the DB call outside the FOR loop? As you can see, I'm trying to get the latest publicreply for each link object (foreign key relationship). Note that a publicreply may not exist for every link object. I can't seem to wrap my head around how to do this outside the loop. Profiling tells me this repeated call adds significant overhead.
More details:
Models are:
class Link(models.Model):
    description = models.TextField(validators=[MaxLengthValidator(500)])
    submitter = models.ForeignKey(User)
    submitted_on = models.DateTimeField(auto_now_add=True)
class Publicreply(models.Model):
    submitted_by = models.ForeignKey(User)
    answer_to = models.ForeignKey(Link)
    submitted_on = models.DateTimeField(auto_now_add=True)
    description = models.TextField(validators=[MaxLengthValidator(250)])
class Seen(models.Model):
    seen_status = models.BooleanField(default=False)
    seen_user = models.ForeignKey(User)
    seen_at = models.DateTimeField(auto_now_add=True)
    which_reply = models.ForeignKey(Publicreply, related_name="publicreply_seen_related")
And adding some more accompanying code to the snippet at the top:
link_ids = [link.id for link in context["object_list"]]
seen_replies = Publicreply.objects.filter(answer_to_id__in=link_ids,publicreply_seen_related__seen_user = user)
for link in context["object_list"]:
    try:
        latest_reply = link.publicreply_set.latest('submitted_on')
        if latest_reply in seen_replies:
            pass #do something
    except:
        pass
Lastly, context["object_list"] is a list of link objects. For each link object shown in the Django template, if it has a latest_reply, I'll compare it to some timestamp and put in visual markers if certain conditions are true.
# Note: distinct() with field names is only supported on PostgreSQL.
latest_replys = Publicreply.objects.all().order_by('answer_to','-submitted_on').distinct('answer_to')
for reply in latest_replys:
    if reply in seen_replies:
        pass #do something
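On databases without DISTINCT ON, another option is to fetch all replies for the displayed links in a single query and pick the latest one per link in Python. The sketch below shows only that grouping step, on plain tuples. The (reply_id, link_id, submitted_on) rows are assumed sample data standing in for something like Publicreply.objects.filter(answer_to_id__in=link_ids).values_list('id', 'answer_to_id', 'submitted_on'), and latest_reply_per_link is a hypothetical helper name.

```python
from datetime import datetime

def latest_reply_per_link(rows):
    """Map link_id -> (reply_id, submitted_on) for the newest reply of each link."""
    latest = {}
    for reply_id, link_id, submitted_on in rows:
        current = latest.get(link_id)
        if current is None or submitted_on > current[1]:
            latest[link_id] = (reply_id, submitted_on)
    return latest

# Assumed sample rows: (reply_id, link_id, submitted_on)
rows = [
    (1, 10, datetime(2015, 1, 1, 9, 0)),
    (2, 10, datetime(2015, 1, 2, 9, 0)),
    (3, 11, datetime(2015, 1, 1, 12, 0)),
]
latest = latest_reply_per_link(rows)

# Links without any reply (link 12 here) are simply absent from the dict,
# so no exception handling is needed inside the loop over links.
latest_for_10 = latest.get(10)
latest_for_12 = latest.get(12)
```

This trades the per-link query for one larger query plus a single pass in memory, so it is mainly worthwhile when the number of replies per page of links is modest.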

Filter latest record in Django

I'm writing my first Django app, which gets messages from other applications and stores reports about them.
It performs very slowly because of the following logic, which I hope can be improved, but I'm struggling to find a way to do it without a loop.
Basically, I'm just trying to go through all of the apps (there are about 500 unique ones) and get the latest report for each one. Here are my models and function:
class App(models.Model):
    app_name = models.CharField(max_length=200)
    host = models.CharField(max_length=50)
class Report(models.Model):
    app = models.ForeignKey(App)
    date = models.DateTimeField(auto_now_add=True)
    status = models.CharField(max_length=20)
    runtime = models.DecimalField(max_digits=13, decimal_places=2,blank=True,null=True)
    end_time = models.DateTimeField(blank=True,null=True)
def get_latest_report():
    """ Returns the latest report from each app """
    lset = set()
    ## get distinct app values
    for r in Report.objects.order_by().values_list('app_id').distinct():
        ## get latest report (by date) and push in to stack.
        lreport = Report.objects.filter(app_id=r).latest('date')
        lset.add(lreport.pk)
    ## Filter objects and return the latest runs
    return Report.objects.filter(pk__in = lset)
If you're not afraid of executing a query for every app in your database you can try it this way:
def get_latest_report():
    """ Returns the latest report from each app """
    return [app.report_set.latest('date') for app in App.objects.all()]
This adds a query for every app in your database, but is really expressive and sometimes maintainability and readability are more important than performance.
If you are using PostgreSQL, you can combine order_by and distinct to get the latest report for each app. The order_by fields must start with the fields passed to distinct:
Report.objects.order_by('app', '-date').distinct('app')
If you are using a database that does not support the DISTINCT ON clause (MySQL, for example) and you do not mind changing the default ordering of the Report model, you can use prefetch_related to reduce 500+ queries to 2. Note that this method uses a lot more memory, as it loads every report:
class Report(models.Model):
    # Fields
    class Meta:
        ordering = ['-date']
def get_latest_report():
    latest_reports = []
    for app in App.objects.all().prefetch_related('report_set'):
        try:
            latest_reports.append(app.report_set.all()[0])
        except IndexError:
            pass
    return latest_reports
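For databases without DISTINCT ON, the "latest row per group" result can also be obtained in a single SQL statement by joining the table against a grouped subquery. The sketch below demonstrates the idea with the standard-library sqlite3 module; the report table layout and the sample rows are assumptions that only loosely mirror the Report model, and it is not a drop-in replacement for the ORM code above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE report ("
    "id INTEGER PRIMARY KEY, app_id INTEGER, date TEXT, status TEXT)"
)
# Assumed sample data; ISO-formatted date strings sort chronologically.
conn.executemany(
    "INSERT INTO report (app_id, date, status) VALUES (?, ?, ?)",
    [
        (1, "2015-01-01 10:00:00", "ok"),
        (1, "2015-01-02 10:00:00", "failed"),
        (2, "2015-01-01 08:00:00", "ok"),
    ],
)

# Join every report against the newest date of its app: one query in total.
latest = conn.execute(
    """
    SELECT r.app_id, r.date, r.status
    FROM report AS r
    JOIN (SELECT app_id, MAX(date) AS max_date
          FROM report
          GROUP BY app_id) AS m
      ON r.app_id = m.app_id AND r.date = m.max_date
    ORDER BY r.app_id
    """
).fetchall()
```

One caveat of this approach: if two reports of the same app share the exact same newest date, both rows are returned.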

Querying multiple tables in Django

I'm a beginner in Django and Python, and I have started developing a survey app based on https://github.com/jessykate/django-survey
I added a few features, but I'm having a problem with the results page; more precisely, with how to get the data to present there. Here's what the models look like, with the most important fields:
class Survey(models.Model):
    name = models.CharField(max_length=250)
class Question(models.Model):
    text = models.TextField()
    survey = models.ForeignKey(Survey)
    choices = models.TextField()
class Response(models.Model):
    survey = models.ForeignKey(Survey)
class AnswerBase(models.Model):
    question = models.ForeignKey(Question)
    response = models.ForeignKey(Response)
class AnswerText(AnswerBase):
    body = models.TextField(blank=True, null=True)
class AnswerRadio(AnswerBase):
    body = models.TextField(blank=True, null=True)
and a few more Answer types.
I think data in this format would be easy to process later in JS and display as a bar chart:
results = [{'some_question_text':
[{'answer':'answer1','count': 11},{'answer':'answer2','count': 6}, ..]}
,..]
I couldn't come up with a way to do it the Django way, so I tried it in SQL. The problem is that it works with only one answer type; when I add another condition like 'or ab.id==polls_answerselect.answerbase_ptr_id', the query returns strange results.
Here's what I've done:
cursor = connection.cursor()
cursor.execute("select q.text as qtext, ar.body as ans, ab.id as Aid, q.id as Qid, count(ar.body) as count \
from polls_answerbase ab, polls_answerradio ar, polls_question q, polls_survey s \
where ab.id==ar.answerbase_ptr_id \
and ab.question_id==q.id \
and s.id==q.survey_id \
group by ar.body")
rows = dictfetchall(cursor)
result = {}
for r in rows:
    # setdefault keeps earlier answers instead of resetting the list on every row
    result.setdefault(r['qtext'], []).append({'ans': r['ans'], 'count': r['count']})
What is a better and correct way to solve my problem?
It looks like what you want here is a question list filtered by survey, and you want it in json format.
Take a look at http://django-rest-framework.org/ It comes with a set of predefined class based views that support multiple response formats, json being one of them. The tutorial on that site walks you through setting it up, and uses simple tests along the way to verify you're doing it right. You can do something similar for your models.
I'm a Python/Django beginner as well and found it very easy to pick up.
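Whichever way the data is fetched, the counting step that produces the structure shown in the question can be sketched in plain Python. This assumes the answers have already been collected as (question_text, answer_body) pairs, for example by combining something like values_list('question__text', 'body') from each Answer subclass; build_results and the sample pairs are hypothetical names and data for this example.

```python
from collections import Counter, defaultdict

def build_results(answers):
    """Group (question_text, answer_body) pairs into a chart-friendly structure."""
    counts = defaultdict(Counter)
    for question_text, body in answers:
        counts[question_text][body] += 1
    return [
        {question: [{"answer": a, "count": c} for a, c in counter.most_common()]}
        for question, counter in counts.items()
    ]

# Assumed sample data, mixing answers that could come from several answer types.
answers = [
    ("Favourite colour?", "red"),
    ("Favourite colour?", "blue"),
    ("Favourite colour?", "red"),
    ("Pets?", "yes"),
]
results = build_results(answers)
```

Because the counting happens per question, identical answer texts that belong to different questions are kept apart, and answers from several Answer types can be fed in together.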

How to query datastore when using ReferenceProperty?

I am trying to understand 1-to-many relationships in the datastore, but I fail to understand how to query and update a user's record when the model includes a ReferenceProperty. Say I have this model:
class User(db.Model):
    userEmail = db.StringProperty()
    userScore = db.IntegerProperty(default=0)
class Comment(db.Model):
    user = db.ReferenceProperty(User, collection_name="comments")
    comment = db.StringProperty()
class Venue(db.Model):
    user = db.ReferenceProperty(User, collection_name="venues")
    venue = db.StringProperty()
If I understand correctly, the same user, uniquely identified by userEmail can have many comments and may be associated with many venues (restaurants etc.)
Now, let's say the user az#example.com is already in the database and he submits a new entry.
Based on this answer I do something like:
q = User.all()
q.filter("userEmail =", "az#example.com")
results = q.fetch(1)
newEntry = results[0]
But I am not clear what this does! What I want to do is to update comment and venue fields which are under class Comment and class Venue.
Can you help me understand how this works? Thanks.
The snippet you posted is doing this (see comments):
q = User.all() # prepare User table for querying
q.filter("userEmail =", "az#example.com") # apply filter, email lookup - this is a simple where clause
results = q.fetch(1) # execute the query, apply limit 1
the_user = results[0] # the results is a list of objects, grab the first one
After this code, the_user will be an object that corresponds to the user record with email "az#example.com". Since you've set up your reference properties, you can access the user's comments and venues with the_user.comments and the_user.venues. Any of these venues can be modified, for example like this:
some_venue = the_user.venues[0] # the first from the list
some_venue.venue = 'At DC. square'
db.put(some_venue) # the entry will be updated
I suggest that you read through the GAE documentation, which has very good examples; you will find it very helpful:
http://code.google.com/appengine/docs/python/overview.html
** UPDATE **: To add a new venue for the user, simply create a new venue and assign the queried user object as the venue's user attribute:
new_venue = Venue(venue='Jeferson memorial', user=the_user) # careful with the quoting
db.put(new_venue)
To get all Comments for a given user, filter on the user property using the key of the user:
comments = Comment.all().filter("user =", the_user.key()).fetch(50)
So you could first look up the user by email, and then search for comments or venues using the user's key.
