I have the following code that iterates over the tags queryset and, for each item, creates a Department object and adds it to the departments list:
departments: List[Department] = []
tags = Tag.objects.filter(type="department")
for tag in tags:
    dept_id = tag.reference_id
    dept_name = tag.name
    parent_tag = Tag.objects.get(type="department", reference_id=tag.parent_reference_id)
    dept_parent_id = parent_tag.reference_id
    departments.append(Department(dept_id, dept_name, dept_parent_id))
However, as you can see, it makes a separate DB call via Tag.objects.get() for every tag, which seems highly inefficient. Is there an efficient way to populate the departments list without making so many DB calls?
TIA.
What you need is the "in" lookup (__in) in your query.
See the QuerySet documentation for examples:
Entry.objects.filter(id__in=[1, 3, 4])
Entry.objects.filter(headline__in='abc')
So in your case you can use something like the following:
tags = Tag.objects.filter(id=some_id, type="department").values('id')
tags_list = [tag['id'] for tag in tags]
# filter(), not get(): an __in lookup can match more than one row
parent_tags = Tag.objects.filter(id__in=tags_list, type="department")
I used parts of the answer from @Vanda to write the following solution, which solves my problem.
departments: List[Department] = []
tags = Tag.objects.filter(type="department")
# Reference ids of every department tag that actually exists
known_ids = {tag.reference_id for tag in tags}
for tag in tags:
    dept_id = tag.reference_id
    dept_name = tag.name
    dept_parent_id = tag.parent_reference_id
    # A parent id that matches no department tag is treated as "no parent"
    if dept_parent_id not in known_ids:
        dept_parent_id = None
    departments.append(Department(dept_id, dept_name, dept_parent_id))
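The single-fetch idea can be tried without a database. Here is a minimal sketch, assuming plain dataclasses in place of the Tag model and the Department class (TagRow and the Department fields below are hypothetical stand-ins, not the real models), and assuming a parent id only counts when some fetched tag has it as its reference_id:

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class TagRow:
    """Hypothetical stand-in for a Tag row (not the real Django model)."""
    reference_id: int
    name: str
    parent_reference_id: Optional[int]


@dataclass
class Department:
    """Hypothetical stand-in for the Department class from the question."""
    dept_id: int
    name: str
    parent_id: Optional[int]


def build_departments(tags: Iterable[TagRow]) -> List[Department]:
    tags = list(tags)  # evaluate once, like a single queryset fetch
    known_ids = {t.reference_id for t in tags}  # ids that really exist
    return [
        Department(
            t.reference_id,
            t.name,
            # keep the parent id only if a tag with that reference_id was fetched
            t.parent_reference_id if t.parent_reference_id in known_ids else None,
        )
        for t in tags
    ]
```

The set membership test replaces the per-tag Tag.objects.get() call, so the number of queries no longer grows with the number of tags.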
Updating a ListField with mongoengine is too slow. Here is an example:
# Comment is defined first so that Post can reference it
class Comment(EmbeddedDocument):
    comment = StringField()
class Post(Document):
    _id = StringField()
    txt = StringField()
    comments = ListField(EmbeddedDocumentField(Comment))
...
...
position = 3000
_id = 3
update_comment_str = "example"
#query
post_obj = Post.objects(_id=str(_id)).first()
#update
post_obj.comments[position].comment = update_comment_str
#save
post_obj.save()
The time it takes grows with the length of post_obj.comments.
How can I optimize it?
Post.objects(id=str(_id)).update(**{"comments__{}__comment".format(position): update_comment_str})
In your code:
You fetch the whole document into a Python instance, which takes up RAM.
You then update the comment at position 3000, which triggers some bookkeeping in mongoengine (marking changed fields and so on).
You then save the document.
In my answer, I send the update instruction to MongoDB instead of fetching the whole document with N comments into Python, which saves memory (RAM) and time.
mongoengine/MongoDB supports index-based updates such as
set__comments__1000__comment="blabla"
To supply the position from a variable, I used a Python dictionary and the **kwargs trick.
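The dictionary-and-kwargs trick can be looked at on its own, without a MongoDB connection. A minimal sketch, assuming a hypothetical helper named indexed_set_kwargs (not part of mongoengine) that only builds the keyword arguments:

```python
from typing import Any, Dict


def indexed_set_kwargs(list_field: str, position: int, subfield: str, value: Any) -> Dict[str, Any]:
    """Build the keyword arguments for an index-based 'set' update."""
    # e.g. "set__comments__3000__comment"
    key = "set__{}__{}__{}".format(list_field, position, subfield)
    return {key: value}


kwargs = indexed_set_kwargs("comments", 3000, "comment", "example")
print(kwargs)
# With mongoengine and the Post model available, this would be passed on as:
#   Post.objects(id=str(_id)).update(**kwargs)
```

Because the field path is an ordinary string key, the position can come from any variable at run time.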
I'm having a problem writing a query to get similar posts in a blog based on the tags they have. I have the following models:
class Articles(BaseModel):
    name = CharField()
    ...
class Tags(BaseModel):
    name = CharField()
class ArticleTags(BaseModel):
    article = ForeignKeyField(Articles, related_name = "articles")
    # named "tag" to match the ArticleTags.tag references in the query below
    tag = ForeignKeyField(Tags, related_name = "tags")
What I'd like to do is get articles with similar tags, sorted by the number of common tags.
Edit
After 2 hours of fiddling with it I got the answer I was looking for. I'm not sure if it's the most efficient way, but it works.
Here is the function in case anyone needs it in the future:
def get_similar_articles(self, common_tags=1, limit=3):
    """
    Get up to `limit` similar articles based on the tags used.
    At least `common_tags` common tags are required.
    """
    # Subquery: the tags attached to this article
    art = (ArticleTags.select(ArticleTags.tag)
           .join(Articles)
           .where(ArticleTags.article == self))
    # Other articles sharing any of those tags, most shared tags first
    return (Articles.select(Articles, ArticleTags)
            .join(ArticleTags)
            .where((ArticleTags.article != self) & (ArticleTags.tag << art))
            .group_by(Articles)
            .having(fn.Count(ArticleTags.id) >= common_tags)
            .order_by(fn.Count(Articles.id).desc())
            .limit(limit))
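To check what the query is meant to return, the same ranking can be written in plain Python. This is only an in-memory sketch, assuming the article-to-tags mapping is already loaded into a dictionary (the similar_articles function and its input shape are hypothetical, not part of peewee):

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def similar_articles(
    article_tags: Dict[str, Set[str]],
    target: str,
    common_tags: int = 1,
    limit: int = 3,
) -> List[Tuple[str, int]]:
    """Rank other articles by how many tags they share with `target`."""
    wanted = article_tags[target]
    shared: Counter = Counter()
    for name, tags in article_tags.items():
        if name == target:
            continue  # same role as ArticleTags.article != self
        overlap = len(tags & wanted)
        if overlap >= common_tags:  # same role as the HAVING clause
            shared[name] = overlap
    return shared.most_common(limit)  # highest overlap first, at most `limit`
```

This is handy for comparing against the database result on a small data set.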
Just a stylistic nit: table names (and model classes) should preferably be singular.
# Articles tagged with 'tag1'
Articles.select().join(ArticleTags).join(Tags).where(Tags.name == 'tag1')
I have a django model that looks something like this:
class Definition(models.Model):
    name = models.CharField(max_length=254)
    text = models.TextField()
If I do the following query:
animal = Definition.objects.get(name='Owl')
and if I have the following definitions with these names in my database:
Elephant, Owl, Zebra, Human
is there a way to write a Django query (or queries) that gives me the previous and the next Definitions relative to the animal object, based on the alphabetical order of the name field in the model?
I know that there are ways of getting previous/next based on datetime fields, but I am not so sure for this case.
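I don't know of any way of doing this in fewer than three queries.
target = 'Owl'
animal = Definition.objects.get(name=target)
# Closest name before the target: sort descending and take the first row
previous_animal = Definition.objects.filter(name__lt=target).order_by('-name')[0]
# Closest name after the target: sort ascending and take the first row
next_animal = Definition.objects.filter(name__gt=target).order_by('name')[0]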
If anyone comes across this like I just did, here's my solution.
It also wraps around (on the last item it shows the first item as next, and on the first item it shows the last item as previous).
def get_previous_by_title(self):
    curr_title = self.get_object().title
    queryset = self.my_queryset()
    try:
        prev = queryset.filter(title__lt=curr_title).order_by("-title")[0:1].get()
    except Video.DoesNotExist:
        # No earlier title: wrap around to the last item
        prev = queryset.order_by("-title")[0:1].get()
    return prev
def get_next_by_title(self):
    curr_title = self.get_object().title
    queryset = self.my_queryset()
    try:
        next_video = queryset.filter(title__gt=curr_title).order_by("title")[0:1].get()
    except Video.DoesNotExist:
        # No later title: wrap around to the first item
        next_video = queryset.order_by("title")[0:1].get()
    return next_video
I have custom querysets based on user level, but you could just use a normal queryset such as Video.objects.all(). Any place I would repeat code more than once, I make a function.
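The wrap-around behaviour is easy to check on a plain list. A minimal in-memory sketch, assuming a non-empty list of names and using bisect instead of database queries (the neighbours function is hypothetical and only mirrors the idea):

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple


def neighbours(names: List[str], current: str) -> Tuple[str, str]:
    """Return (previous, next) around `current` in alphabetical order, wrapping at the ends."""
    ordered = sorted(names)
    lo = bisect_left(ordered, current)    # first index with name >= current
    hi = bisect_right(ordered, current)   # first index with name > current
    previous = ordered[lo - 1]            # lo == 0 gives index -1, i.e. the last item
    following = ordered[hi % len(ordered)]  # running past the end wraps to the first item
    return previous, following
```

With the names from the question (Elephant, Owl, Zebra, Human), the neighbours of Owl are Human and Zebra, and Zebra wraps around to Elephant as its next item.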
I have a Django query and some Python code that I'm trying to optimize because 1) it's ugly and not as performant as the SQL I could write by hand, and 2) the hierarchical regrouping of the data looks messy to me.
So,
1. Is it possible to improve this to be a single query?
2. How can I improve my Python code to be more Pythonic?
Background
This is for a photo gallery system. The particular view is attempting to display the thumbnails for all photos in a gallery. Each photo is statically sized several times to avoid dynamic resizing, and I would like to also retrieve the URLs and "Size Type" (e.g. Thumbnail, Medium, Large) of each sizing so that I can Lightbox the alternate sizes without hitting the database again.
Entities
I have 5 models that are of relevance:
class Gallery(models.Model):
    Photos = models.ManyToManyField('Photo', through = 'GalleryPhoto', blank = True, null = True)
class GalleryPhoto(models.Model):
    Gallery = models.ForeignKey('Gallery')
    Photo = models.ForeignKey('Photo')
    Order = models.PositiveIntegerField(default = 1)
class Photo(models.Model):
    GUID = models.CharField(max_length = 32)
class PhotoSize(models.Model):
    Photo = models.ForeignKey('Photo')
    PhotoSizing = models.ForeignKey('PhotoSizing')
    PhotoURL = models.CharField(max_length = 1000)
class PhotoSizing(models.Model):
    SizeName = models.CharField(max_length = 20)
    Width = models.IntegerField(default = 0, null = True, blank = True)
    Height = models.IntegerField(default = 0, null = True, blank = True)
    Type = models.CharField(max_length = 10, null = True, blank = True)
So, the rough idea is that I would like to get all Photos in a Gallery through GalleryPhoto, and for each Photo, I want to get all the PhotoSizes, and I would like to be able to loop through and access all this data through a dictionary.
A rough sketch of the SQL might look like this:
Select PhotoSize.PhotoURL
From PhotoSize
Inner Join Photo On Photo.id = PhotoSize.Photo_id
Inner Join GalleryPhoto On GalleryPhoto.Photo_id = Photo.id
Inner Join Gallery On Gallery.id = GalleryPhoto.Gallery_id
Where Gallery.id = 5
Order By GalleryPhoto.Order Asc
I would like to turn this into a list that has a schema like this:
(
    photo: {
        'guid': 'abcdefg',
        'sizes': {
            'Thumbnail': 'http://mysite/image1_thumb.jpg',
            'Large': 'http://mysite/image1_full.jpg',
            more sizes...
        }
    },
    more photos...
)
I currently have the following Python code (it doesn't exactly mimic the schema above, but it'll do for an example).
import operator
gallery_photos = [(photo.Photo_id, photo.Order) for photo in GalleryPhoto.objects.filter(Gallery = gallery)]
photo_list = list(PhotoSize.objects.select_related('Photo', 'PhotoSizing').filter(Photo__id__in=[gallery_photo[0] for gallery_photo in gallery_photos]))
photos = {}
for photo in photo_list:
    order = 1
    for gallery_photo in gallery_photos:
        if gallery_photo[0] == photo.Photo.id:
            order = gallery_photo[1]  # this gets the Order column value
    guid = photo.Photo.GUID
    if guid not in photos:
        photos[guid] = { 'Photo': photo.Photo, 'Thumbnail': None, 'Sizes': [], 'Order': order }
    photos[guid]['Sizes'].append(photo)
sorted_photos = sorted(photos.values(), key=operator.itemgetter('Order'))
The Actual Question, Part 1
So, my question is first of all whether I can do my many-to-many query better so that I don't have to do the double query for both gallery_photos and photo_list.
The Actual Question, Part 2
I look at this code and I'm not too thrilled with the way it looks. I sure hope there's a better way to group up a hierarchical queryset result by a column name into a dictionary. Is there?
When you have a SQL query that is hard to write using the ORM, you can use PostgreSQL views (I'm not sure about MySQL). In this case you would have:
Raw SQL like:
CREATE VIEW photo_urls AS
Select
photo.id, --pseudo primary key for django mapper
Gallery.id as gallery_id,
PhotoSize.PhotoURL as photo_url
From PhotoSize
Inner Join Photo On Photo.id = PhotoSize.Photo_id
Inner Join GalleryPhoto On GalleryPhoto.Photo_id = Photo.id
Inner Join Gallery On Gallery.id = GalleryPhoto.Gallery_id
Order By GalleryPhoto.Order Asc
Django model like:
class PhotoUrls(models.Model):
    class Meta:
        managed = False
        db_table = 'photo_urls'
    gallery_id = models.IntegerField()
    # CharField requires max_length; 1000 matches PhotoSize.PhotoURL
    photo_url = models.CharField(max_length=1000)
ORM Queryset like:
PhotoUrls.objects.filter(gallery_id=5)
Hope this helps.
Django has some built-in functions that will clean up the way your code looks. They will result in subqueries, so I guess it depends on performance. https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.values
gallery_photos = GalleryPhoto.objects.filter(Gallery=gallery).values('Photo_id', 'Order')
photo_queryset = PhotoSize.objects.select_related('Photo', 'PhotoSizing').filter(
    Photo__id__in=gallery_photos.values_list('Photo_id', flat=True))
Calling list() evaluates the queryset immediately, which might affect performance if you have a lot of data.
Additionally, there should be a fairly easy way to get rid of the if gallery_photo[0] == photo.Photo.id: check. It seems this can be resolved with another query that gets the gallery_photos for all photos.
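One way to drop that inner loop without another query is a dictionary lookup. This is a different technique from the extra query suggested above, shown as a minimal sketch that assumes gallery_photos is a list of (photo_id, order) pairs as in the question (the sample values and the order_for helper are hypothetical):

```python
from typing import Dict, List, Tuple

# Hypothetical (photo_id, order) pairs, shaped like gallery_photos in the question
gallery_photos: List[Tuple[int, int]] = [(10, 2), (11, 1), (12, 3)]

# Build the lookup once instead of scanning the list for every photo size
order_by_photo_id: Dict[int, int] = dict(gallery_photos)


def order_for(photo_id: int, default: int = 1) -> int:
    """Return the gallery order for a photo, falling back to `default`."""
    return order_by_photo_id.get(photo_id, default)
```

Inside the grouping loop, order_for(photo.Photo.id) would then stand in for the nested for/if block.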
You can retrieve all the data with a single query and get a list of dictionaries. You can then work with that list, or build a new dictionary from it, to form your final structure. You can use reverse relations when filtering and when selecting specific columns. So:
Let x be your selected Gallery.
GalleryPhoto.objects.filter(Gallery=x).values('Order', 'Photo__GUID', 'Photo__photosize__PhotoURL', 'Photo__photosize__PhotoSizing__SizeName', 'Photo__photosize__PhotoSizing__Width', 'Photo__photosize__PhotoSizing__Height', 'Photo__photosize__PhotoSizing__Type')
Using Photo__ creates an inner join to the Photo table, Photo__photosize__ joins to PhotoSize (via the reverse relation, whose default query name is the lowercased model name), and Photo__photosize__PhotoSizing__ joins to PhotoSizing.
You get a list of dictionaries (keys abbreviated here; the actual keys are the full lookup strings passed to values()):
[{'Order': ..., 'GUID': ..., 'PhotoURL': ..., 'SizeName': ..., 'Width': ..., 'Height': ..., 'Type': ...}, {'Order': ..., 'GUID': ..., 'PhotoURL': ..., 'SizeName': ..., 'Width': ..., 'Height': ..., 'Type': ...}, ...]
You can select the rows you need and get all the values as a list of dictionaries. Then you can write a loop or iterator that goes through this list and builds a new dictionary that groups your data.
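The grouping loop itself could look roughly like this. A minimal sketch, assuming each row has already been reduced to the short, hypothetical keys 'order', 'guid', 'size_name' and 'url' (the real keys returned by values() would be the full lookup strings):

```python
from typing import Any, Dict, List


def group_photos(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Group flat rows (one per photo size) into one entry per photo, sorted by order."""
    photos: Dict[str, Dict[str, Any]] = {}
    for row in rows:
        # Create the photo entry on first sight, then reuse it for later sizes
        entry = photos.setdefault(
            row["guid"], {"guid": row["guid"], "order": row["order"], "sizes": {}}
        )
        entry["sizes"][row["size_name"]] = row["url"]
    return sorted(photos.values(), key=lambda p: p["order"])
```

Each resulting entry has a 'guid' and a 'sizes' dictionary keyed by size name, which is close to the schema sketched in the question.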