Django retrieve rows for the distinct column values

I want to query model rows in Django:
class Language(models.Model):
    language_id = models.CharField(max_length=100, default="")
    code = models.CharField(max_length=100, default="")
    name = models.CharField(max_length=500, default="")
In this table, language_id is not unique. For example, here is some sample data:
+-------------+------+---------+
| language_id | code | name    |
+-------------+------+---------+
| 12345       | en   | english |
| 12345       | te   | telugu  |
| 54321       | en   | english |
| 54321       | te   | telugu  |
+-------------+------+---------+
I want to fetch rows (with all columns) such that each language_id appears only once.
This is what I am currently doing:
language_list = Language.objects.all()
distinct_rows = []
seen_ids = []
for language in language_list:
    # Keep only the first row seen for each language_id
    if language.language_id not in seen_ids:
        distinct_rows.append(language)
        seen_ids.append(language.language_id)
Then distinct_rows holds one row (model object) per distinct language_id.
Is there a better way to do this? I don't want to loop over all the Language rows.

It's unclear what you are trying to do.
What your script does is keep the first occurrence of each ID, which is an arbitrary choice.
If that's what you want, the approach depends on which database backs your model.
PostgreSQL allows distinct on a field:
https://docs.djangoproject.com/en/2.1/ref/models/querysets/#distinct
On MySQL, what you could do is get all the unique IDs and then fetch one matching model instance per ID:
language_ids = Language.objects.values_list('language_id', flat=True).distinct()
result = []
for language_id in language_ids:
    result.append(Language.objects.filter(language_id=language_id).first())
It's not necessarily much better than your solution, simply because picking an arbitrary row isn't an expected use case for the ORM. Note that it runs one query per distinct ID.
If, on the other hand, you meant to get only the language_ids that appear exactly once:
from django.db.models import Count
Language.objects.values('language_id').annotate(cnt=Count('id')).filter(cnt=1)
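The two readings of the question can also be compared at the SQL level. The following is only a sketch, using Python's built-in sqlite3 module in place of the real database; the table name `language` and its integer `id` column are assumptions made for the illustration and are not taken from the thread.

```python
import sqlite3

# In-memory stand-in for the Language table (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE language ("
    "id INTEGER PRIMARY KEY, language_id TEXT, code TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO language (language_id, code, name) VALUES (?, ?, ?)",
    [
        ("12345", "en", "english"),
        ("12345", "te", "telugu"),
        ("54321", "en", "english"),
        ("54321", "te", "telugu"),
    ],
)

# Reading 1: one full row per language_id (here: the row with the lowest id).
first_per_id = conn.execute(
    """
    SELECT language_id, code, name
    FROM language
    WHERE id IN (SELECT MIN(id) FROM language GROUP BY language_id)
    ORDER BY id
    """
).fetchall()

# Reading 2: only the language_ids that occur exactly once.
once_only = conn.execute(
    "SELECT language_id FROM language GROUP BY language_id HAVING COUNT(id) = 1"
).fetchall()

print(first_per_id)
print(once_only)
```

With the sample data from the question, the first query keeps one row per language_id, while the second returns nothing because every language_id occurs twice.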

Related

PostgreSQL JOIN on JSON Object column

I'm supposed to join 3 different tables in Postgres:
lote_item (which holds some book IDs)
lote_item_log (which has an "attributes" column holding a JSON object such as {"aluno_id": "2823", "aluno_email": "someemail#outlook.com", "aluno_unidade": 174, "livro_codigo": "XOZK-0NOYP0Z1EMJ"}). Note: some values of aluno_unidade are null.
and finally
company (which holds the school name for every aluno_unidade).
Ex: aluno_unidade = 174 ==> nome_fantasia = mySchoolName
Joining the first two tables was easy, since lote_item_log has a foreign key which I could match like this:
SELECT * FROM lote_item JOIN lote_item_log ON lote_item.id = lote_item_log.lote_item_id
Now I need to get the school name, stored in the company table, using the aluno_unidade ID from lote_item_log.
My current query is:
SELECT
    *
FROM
    lote_item
JOIN
    lote_item_log
ON
    lote_item.id = lote_item_log.lote_item_id
JOIN
    company
ON
    (
        SELECT
            JSON_EXTRACT_PATH_TEXT(attributes, 'aluno_unidade')::int
        FROM
            lote_item_log
        WHERE
            operation_id = 6
    ) = company.senior_id
WHERE
    item_id = {book_id};
operation_id determines which school is active.
The error I'm getting:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.CardinalityViolation) more than one row returned by a subquery used as an expression
I tried LIMIT 1, but then I got just an empty array.
What I need is:
lote_item.created_at | lote_item.updated_at | lote_item.item_id | uuid | aluno_email | c014_id | nome_fantasia | cnpj | is_franchise | is_active
somedate | somedate | some_item_id | XJW4 | someemail#a | some_id | SCHOOL NAME | cnpj | t | t

I got it.
Not sure it's the best way, but it worked:
SELECT
    *
FROM
    lote_item
JOIN
    lote_item_log
ON
    lote_item.id = lote_item_log.lote_item_id
JOIN
    company
ON
    JSON_EXTRACT_PATH_TEXT(attributes, 'aluno_unidade')::int = company.senior_id
WHERE
    item_id = {book_id};
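The difference between the two queries is that the failing one used a standalone subquery, which returned one value per log row, whereas the working one evaluates the JSON expression once for each joined row. A rough plain-Python sketch of that row-by-row join follows; the sample rows are invented for illustration, and the code emulates the idea rather than running the SQL.

```python
import json

# Hypothetical sample rows standing in for the two tables.
lote_item_log = [
    {"lote_item_id": 1, "attributes": '{"aluno_id": "2823", "aluno_unidade": 174}'},
    {"lote_item_id": 2, "attributes": '{"aluno_id": "2900", "aluno_unidade": null}'},
]
company = [{"senior_id": 174, "nome_fantasia": "mySchoolName"}]

# Index companies by the join key so each log row can look up its school.
company_by_senior_id = {row["senior_id"]: row for row in company}

joined = []
for log in lote_item_log:
    # Extract the key from this row's JSON, like the ON expression does.
    unidade = json.loads(log["attributes"]).get("aluno_unidade")
    if unidade is None:
        # A null key matches nothing, so an inner join drops the row.
        continue
    school = company_by_senior_id.get(int(unidade))
    if school is not None:
        joined.append((log["lote_item_id"], school["nome_fantasia"]))

print(joined)
```

Rows whose aluno_unidade is null drop out, which mirrors what an inner join does with a null key. If those rows are needed, a LEFT JOIN on company would keep them.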

Django pivot table without id and primary key

In the database I have 3 tables:
languages
cities
city_language
The city_language table:
+-------------+---------------------+------+-----+---------+-------+
| Field       | Type                | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+-------+
| city_id     | bigint(20) unsigned | NO   | PRI | NULL    |       |
| language_id | bigint(20) unsigned | NO   | PRI | NULL    |       |
| name        | varchar(255)        | NO   |     | NULL    |       |
+-------------+---------------------+------+-----+---------+-------+
Model:
class CityLanguage(models.Model):
    city = models.ForeignKey('Cities', models.DO_NOTHING)
    language = models.ForeignKey('Languages', models.DO_NOTHING)
    name = models.CharField(max_length=255)
    class Meta:
        managed = False
        db_table = 'city_language'
        unique_together = (('city', 'language'),)
The model has no id field or primary key, and my table has no id column either. If I run this code I get the error:
(1054, "Unknown column 'city_language.id' in 'field list'")
If I define a primary key on a column, that column's values must be unique. If I set primary_key and then try to add the same city with different languages, I get:
With this city (name or language, depending on which column I choose as the primary key) already exists.
I don't want to create an id column for the pivot table; there is no reason to. Could you please tell me the correct way to use a pivot table? Thank you.

Django does not work without a primary key. There are two ways to work around this:
Create an id column (then you don't need to declare a primary key in the Django model).
Create another unique column and set it as the primary key.
I chose the second way: I created a column named unique_key and put this code in the model:
unique_key = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
You need to import uuid.
Good luck.
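One detail in the field definition above is that default=uuid.uuid4 passes the function itself, with no parentheses, so a fresh value is produced for every new row. The sketch below imitates that behaviour in plain Python; new_row is a hypothetical helper written for this illustration and is not part of Django.

```python
import uuid


def new_row(name, default=uuid.uuid4):
    # `default` is a callable, so it is invoked once per row,
    # mirroring how a callable field default is evaluated.
    return {"unique_key": default(), "name": name}


first = new_row("Paris")
second = new_row("Parijs")

print(first["unique_key"] != second["unique_key"])
```

Writing default=uuid.uuid4() instead would evaluate the call once at definition time, so every row would share the same key.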

how to group by a column and pick one object ordered by created time

I have a model like the one below:
class MusicData(BaseModel):
    name = models.CharField(max_length=100)
    url = models.URLField()
    description = models.TextField()
    age = models.CharField(max_length=25)
    language = models.ForeignKey(Language, on_delete=models.CASCADE,
                                 related_name="music_data",
                                 related_query_name="music_data")
    count = models.IntegerField()
    last_updated = models.CharField(max_length=255)
    playlist = models.ForeignKey(PlayList, on_delete=models.CASCADE,
                                 related_name="music_data",
                                 related_query_name="music_data")
I want to group MusicData by name and, within each group, get the row with the latest created_on (created_on is a DateTimeField in BaseModel).
Suppose I have the following data:
| Name | Created On |
| ----------- | ----------- |
| ABC | 2019-02-22 1:06:45 AM |
| ABC | 2019-02-22 1:07:45 AM |
| BAC | 2019-02-22 1:08:45 AM |
| BAC | 2019-02-22 1:09:45 AM |
| BAC | 2019-02-22 1:10:45 AM |
| BBC | 2019-02-22 1:11:45 AM |
The expected output is:
| Name | Created On |
| ----------- | ----------- |
| ABC | 2019-02-22 1:07:45 AM |
| BAC | 2019-02-22 1:10:45 AM |
| BBC | 2019-02-22 1:11:45 AM |
I have written this query, which works fine for the case above:
models.MusicData.objects.filter(playlist__isnull=True).values(
    "name").annotate(maxdate=Max("created_on"))
But the problem is that, along with name and created_on, I also need other values such as url, age, count, playlist__name, etc.
So I followed this guide: https://docs.djangoproject.com/en/2.1/topics/db/aggregation/#combining-multiple-aggregations
and came up with this query:
models.MusicData.objects.filter(playlist__isnull=True).values(
    "name").annotate(maxdate=Max("created_on")).values(
    "age",
    "name",
    "description",
    "url",
    "count",
    "last_updated",
    "playlist",
    language=F("language__name"),
)
But in this case I got duplicate objects. When I inspected the SQL queries, I found the following:
In the first case, only GROUP BY name is present along with the joins, which is fine.
But in the second case, GROUP BY contains all the columns I specified in values(). I understand that a column we want to SELECT must also be included in the GROUP BY clause.
I even tried to generate a list of ids and then filter on it, but the result is the same: it aggregates over the whole queryset.
result = models.MusicData.objects.filter(playlist__isnull=True).values(
    "name").annotate(maxdate=Max("created_on")).values_list("id", flat=True)
# Then filter on this list of ids
Can anyone help me?
Note: I am using a PostgreSQL database.

Finally figured it out. I could do this in PostgreSQL:
MusicData.objects.order_by('name', '-created_on').distinct('name').values(
    "age",
    "name",
    "description",
    "url",
    "since_last_pushed",
    "last_updated",
    language=F("language__name"),
)
which results in the following query:
SELECT DISTINCT ON (name)
    id,
    name,
    url,
    description,
    ... etc
FROM
    musicdata
ORDER BY name ASC,
    created_on DESC;
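DISTINCT ON keeps the first row of each group in the given ordering, so ordering by name and then by created_on descending yields the newest row per name. Here is a small plain-Python sketch of the same idea on the sample data from the question. The timestamps are rewritten as 24-hour strings so that they sort correctly as text; that is a choice made for this illustration.

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"name": "ABC", "created_on": "2019-02-22 01:06:45"},
    {"name": "ABC", "created_on": "2019-02-22 01:07:45"},
    {"name": "BAC", "created_on": "2019-02-22 01:08:45"},
    {"name": "BAC", "created_on": "2019-02-22 01:09:45"},
    {"name": "BAC", "created_on": "2019-02-22 01:10:45"},
    {"name": "BBC", "created_on": "2019-02-22 01:11:45"},
]

# Equivalent of ORDER BY name ASC, created_on DESC: sort by the secondary
# key first, then rely on Python's stable sort for the primary key.
ordered = sorted(rows, key=itemgetter("created_on"), reverse=True)
ordered.sort(key=itemgetter("name"))

# Equivalent of DISTINCT ON (name): keep the first row of each name group.
latest = [next(group) for _, group in groupby(ordered, key=itemgetter("name"))]

for row in latest:
    print(row["name"], row["created_on"])
```

Because whole rows are kept rather than aggregated, every other column stays available, which is what the GROUP BY approach in the question could not provide.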
I wondered whether this is a tough question. It has been more than 2 days without any response here, which surprised me since I expected an answer within hours. Did I mistag the topics?

Django JOIN eliminates desired columns

I'm trying to join two tables in Django that are related to each other by a foreign key field.
class Question(models.Model):
    description = models.TextField('Description', blank=True, null=True)
class Vote(models.Model):
    question = models.ForeignKey(Question)
    profile = models.ForeignKey(UserProfile)
    value = models.IntegerField('Value')
    creator = models.ForeignKey(User)
I tried to create a queryset by using
questions = Question.objects.filter(vote__creator=3).values()
which results in a set like this
+----+-------------+
| id | description |
+----+-------------+
....
If I run a similar query by hand in MySQL with
select * from questions as t1 join votes as t2 on t1.id=question_id where creator_id=3;
it results in a set like this
+----+-------------+------+-------------+------------+-------+------------+
| id | description | id | question_id | profile_id | value | creator_id |
How can I prevent Django from dropping columns from the resulting queryset? I'd really like to retrieve the fully joined table.

use objects.select_related():
questions = Question.objects.select_related().filter(vote__creator=3).values()

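A Question.objects.filter(vote__creator=3) query generates SQL along these lines:
SELECT question.id, question.description FROM question INNER JOIN vote ON question.id = vote.question_id WHERE vote.creator_id = 3
So, as you can see, it selects only the question columns, which is different from what you wanted (question joined with votes).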
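To illustrate why the vote columns go missing: a query can join a second table and still return only the first table's columns, depending on what it selects. The sketch below shows this with Python's built-in sqlite3 module. The table and column names follow the hand-written query in the question, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE questions (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE votes (
        id INTEGER PRIMARY KEY,
        question_id INTEGER,
        profile_id INTEGER,
        value INTEGER,
        creator_id INTEGER
    );
    INSERT INTO questions VALUES (1, 'first question');
    INSERT INTO votes VALUES (10, 1, 7, 5, 3);
    """
)

join = (
    "FROM questions AS t1 JOIN votes AS t2 "
    "ON t1.id = t2.question_id WHERE t2.creator_id = 3"
)

# Selecting only the first table's columns, as a queryset on Question does.
only_question = conn.execute("SELECT t1.* " + join)
only_question_columns = [col[0] for col in only_question.description]

# Selecting every column of the join, as the hand-written query does.
full = conn.execute("SELECT * " + join)
full_columns = [col[0] for col in full.description]

print(only_question_columns)
print(len(full_columns))
```

In Django, one possible way to get columns from both sides (a suggestion that is not taken from this thread) is to start from the Vote model, for example Vote.objects.filter(creator=3).values('id', 'value', 'question__description').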

SQLAlchemy Column to Row Transformation and vice versa -- is it possible?

I'm looking for a SQLAlchemy-only solution for converting a dict received from a form submission into a series of rows in the database, one for each field submitted. This is to handle preferences and settings that vary widely across applications, but it's very likely also applicable to creating pivot-table-like functionality. I've seen this type of thing in ETL tools, but I was looking for a way to do it directly in the ORM. I couldn't find any documentation on it, but maybe I missed something.
Example:
Submitted from form: {"UniqueId":1, "a":23, "b":"Hello", "c":"World"}
I would like it to be transformed (in the ORM) so that it is recorded in the database like this:
_______________________________________
|UniqueId| ItemName | ItemValue |
---------------------------------------
| 1 | a | 23 |
---------------------------------------
| 1 | b | Hello |
---------------------------------------
| 1 | c | World |
---------------------------------------
Upon a select the result would be transformed (in the ORM) back into a row of data from each of the individual values.
---------------------------------------------------
| UniqueId | a | b | c |
---------------------------------------------------
| 1 | 23 | Hello | World |
---------------------------------------------------
I would assume on an update that the best course of action would be to wrap a delete/create in a transaction so the current records would be removed and the new ones inserted.
The definitive list of ItemNames will be maintained in a separate table.
Totally open to more elegant solutions but would like to keep out of the database side if at all possible.
I'm using the declarative_base approach with SQLAlchemy.
Thanks in advance...
Cheers,
Paul

Here is a slightly modified example from the documentation that maps such a table structure to a dictionary on the model:
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, String, Text,
                        create_engine)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm.collections import attribute_mapped_collection
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import relationship, sessionmaker
metadata = MetaData()
Base = declarative_base(metadata=metadata, name='Base')
class Item(Base):
    __tablename__ = 'Item'
    UniqueId = Column(Integer, ForeignKey('ItemSet.UniqueId'),
                      primary_key=True)
    ItemSet = relationship('ItemSet')
    ItemName = Column(String(10), primary_key=True)
    ItemValue = Column(Text)  # Use PickleType?
def _create_item(ItemName, ItemValue):
    return Item(ItemName=ItemName, ItemValue=ItemValue)
class ItemSet(Base):
    __tablename__ = 'ItemSet'
    UniqueId = Column(Integer, primary_key=True)
    # Dict-like collection of Item rows keyed by ItemName
    _items = relationship(Item,
                          collection_class=attribute_mapped_collection('ItemName'))
    # Expose the collection as a plain {ItemName: ItemValue} mapping
    items = association_proxy('_items', 'ItemValue', creator=_create_item)
engine = create_engine('sqlite://', echo=True)
metadata.create_all(engine)
session = sessionmaker(bind=engine)()
data = {"UniqueId": 1, "a": 23, "b": "Hello", "c": "World"}
s = ItemSet(UniqueId=data.pop("UniqueId"))
s.items = data
session.add(s)
session.commit()
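Independently of the ORM mapping, the transformation itself is small. The following is a minimal plain-Python sketch of the column-to-row conversion and its inverse, assuming that values are stored as text (as in the Text column above) and that all rows passed to the inverse share one UniqueId. The names dict_to_rows and rows_to_dict are hypothetical helpers, not SQLAlchemy APIs.

```python
def dict_to_rows(data):
    """Turn one submitted dict into (UniqueId, ItemName, ItemValue) rows."""
    fields = dict(data)  # copy so the caller's dict is left untouched
    unique_id = fields.pop("UniqueId")
    # Values are stored as text, so 23 becomes "23".
    return [(unique_id, name, str(value)) for name, value in fields.items()]


def rows_to_dict(rows):
    """Rebuild a single dict from rows that share one UniqueId."""
    result = {}
    for unique_id, name, value in rows:
        result["UniqueId"] = unique_id
        result[name] = value
    return result


submitted = {"UniqueId": 1, "a": 23, "b": "Hello", "c": "World"}
rows = dict_to_rows(submitted)
restored = rows_to_dict(rows)

print(rows)
print(restored)
```

Note that the round trip returns "23" as a string, since the original type is lost once everything is stored as text. Keeping types would need something like the PickleType hinted at in the comment above, or a separate type column.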
