SQLAlchemy: How to select max from several tables - python

I am starting to use SQLAlchemy as an ORM rather than writing SQL directly. I have skimmed the docs but can't find an easy way to do the equivalent of this SQL:
select max(Table1.Date) from Table1, Table2
where...
I can do:
session.query(Table1, Table2)
...
order_by(Table1.c.Date.desc())
and then take the first row, but that must be quite inefficient. Could anyone tell me the proper way to select the max?
Many thanks

Ideally one would know the other parts of the query, but without any additional information the following should do it:
import sqlalchemy as sa
q = (
    session
    .query(sa.func.max(Table1.c.Date))
    .select_from(Table1, Table2) # or any other `.join(Table2)` would do
    .filter(...)
)
max_date = q.scalar() # the aggregate yields a single row, so no ORDER BY is needed
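For a version that can be run end to end, here is a minimal sketch against an in-memory SQLite database. The schema, the join condition, the `status` filter and the sample rows are all assumptions made up for illustration, since the question does not show the real tables or the `where` clause:

```python
import datetime as dt

import sqlalchemy as sa
from sqlalchemy.orm import Session

# Hypothetical schema: the question does not show the real tables.
metadata = sa.MetaData()
table1 = sa.Table(
    "table1", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("Date", sa.Date, nullable=False),
)
table2 = sa.Table(
    "table2", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("table1_id", sa.Integer, sa.ForeignKey("table1.id")),
    sa.Column("status", sa.String, nullable=False),
)

engine = sa.create_engine("sqlite://")  # in-memory database
metadata.create_all(engine)

# Invented sample rows.
with engine.begin() as conn:
    conn.execute(table1.insert(), [
        {"id": 1, "Date": dt.date(2024, 1, 15)},
        {"id": 2, "Date": dt.date(2024, 3, 1)},
        {"id": 3, "Date": dt.date(2024, 6, 30)},
    ])
    conn.execute(table2.insert(), [
        {"id": 1, "table1_id": 1, "status": "open"},
        {"id": 2, "table1_id": 2, "status": "open"},
        {"id": 3, "table1_id": 3, "status": "closed"},
    ])

session = Session(engine)
latest = (
    session.query(sa.func.max(table1.c.Date))
    # Explicit join; the ON condition is an assumption about the schema.
    .select_from(table1.join(table2, table2.c.table1_id == table1.c.id))
    .filter(table2.c.status == "open")
    .scalar()  # one row, one column: return the value itself
)
session.close()
print(latest)
```

The database computes the maximum and sends back a single value, so nothing has to be sorted or fetched row by row on the Python side.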

Related

How to create SQL Pypika Query with "Min()"

I am trying to create a Pypika Query which uses the MIN('') function of SQL. Pypika supports the function but I don't know how to use it.
Basically I want to create this SQL statement in Pypika:
select
"ID","Car","Road","House"
from "thingsTable"
where "ID" not in
(
select MIN("ID")
from "thingsTable"
GROUP BY
"Car","Road","House"
)
order by "ID"
I have tried something like this:
from pypika import Query, Table, Field, Function
query = Query.from_(table).select(min(table.ID)).groupby(table.Car, table.Road, table.House)
And variations of it, but can't figure out how to use this function. There are not a lot of examples around.
Thanks in advance.
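Try this one. The code is based on "Selecting Data with pypika":
from pypika import Query, Table, functions as fn
tbl = Table('thingsTable')
# Subquery: the smallest ID in each (Car, Road, House) group
min_ids = Query.from_(tbl).groupby(tbl.Car, tbl.Road, tbl.House).select(fn.Min(tbl.ID))
q = Query.from_(tbl).where(
    tbl.ID.notin(min_ids) # NOT IN, as in the target SQL
).select(
    tbl.ID, tbl.Car, tbl.Road, tbl.House
).orderby(tbl.ID)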
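It can help to run the target SQL on its own before translating it to Pypika, so there is something to compare the generated query against. A minimal sketch using Python's built-in sqlite3 module; the in-memory database and the sample rows are made up for illustration and are not part of the question:

```python
import sqlite3

# Throwaway in-memory database with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "thingsTable" '
    '("ID" INTEGER PRIMARY KEY, "Car" TEXT, "Road" TEXT, "House" TEXT)'
)
conn.executemany(
    'INSERT INTO "thingsTable" VALUES (?, ?, ?, ?)',
    [
        (1, "Audi", "A1", "Red"),
        (2, "Audi", "A1", "Red"),   # duplicate of ID 1
        (3, "BMW", "B2", "Blue"),
        (4, "Audi", "A1", "Red"),   # duplicate of ID 1
        (5, "BMW", "B2", "Blue"),   # duplicate of ID 3
    ],
)

# The statement from the question, unchanged apart from whitespace.
sql = """
select "ID","Car","Road","House"
from "thingsTable"
where "ID" not in (
    select MIN("ID")
    from "thingsTable"
    GROUP BY "Car","Road","House"
)
order by "ID"
"""
duplicate_rows = conn.execute(sql).fetchall()
conn.close()

for row in duplicate_rows:
    print(row)
```

With this data the statement keeps every row except the lowest-ID row of each Car/Road/House group, which is why the Pypika version needs a NOT IN condition rather than IN.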

How to write subquery in django

Is it possible to make the following SQL query in Django?
select * from (
select * from users
) order by id
It is just a minimal example. I have a long subquery instead of select * from users, but I can't understand how to insert it as a subquery.
UPDATED:
Subquery from the docs doesn't suit because it builds the following request:
SELECT "post"."id", (
SELECT U0."email"
FROM "comment" U0
WHERE U0."post_id" = ("post"."id")
ORDER BY U0."created_at" DESC LIMIT 1
) AS "newest_commenter_email" FROM "post"
and this subquery can return only one value (.values('email')).
It produces the construction select (subquery) as value from table instead of select value from (subquery).
I would use a Python connector for PostgreSQL (http://www.postgresqltutorial.com/postgresql-python/query/). That is what I do for MySQL, though I have not tried it with PostgreSQL.
Making a subquery is essentially setting up two queries and using one query to "feed" another:
from django.db.models import Subquery
all_users = User.objects.all()
User.objects.annotate(the_user=Subquery(all_users.values('email')[:1]))
This is more or less the same as what you provided. You can get about as complicated as you'd like here, but the best source for getting started with subqueries is the docs.

Selecting the first item of an ARRAY with PostgreSQL/SqlAlchemy

Trying to move some queries I run daily into an automated script. I have one in Postgres like the below:
SELECT regexp_split_to_array(col1, "|")[1] AS item, COUNT(*) AS itemcount FROM Tabel1 GROUP BY item ORDER BY itemcount
In SqlAlchemy I have this:
session.query((func.regexp_split_to_array(model.table1.col1, "|")[1]).label("item"), func.count().label("itemcount")).group_by("item").order_by("itemcount")
Python can't "get_item" since it's not actually a collection. I've looked through the docs and can't seem to find something that would let me do this without running raw SQL using execute (which I can do and works, but was looking for a solution for next time).
SQLAlchemy does support indexing with [...]. If the column is declared with the type postgresql.ARRAY, it works:
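from sqlalchemy import Column, MetaData, String, Table
from sqlalchemy.dialects import postgresql
meta = MetaData()
table2 = Table("table2", meta, Column("col1", postgresql.ARRAY(String)))
q = session.query(table2.c.col1[1])
print(q.statement.compile(dialect=postgresql.dialect()))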
# SELECT table2.col1[%(col1_1)s] AS anon_1
# FROM table2
The reason why your code doesn't work is that SQLAlchemy does not know that func.regexp_split_to_array(...) returns an array, since func.foo produces a generic function for convenience. To make it work, we need to make sure SQLAlchemy knows the return type of the function, by specifying the type_ parameter:
q = session.query(func.regexp_split_to_array(table1.c.col1, "|", type_=postgresql.ARRAY(String))[1].label("item"))
print(q.statement.compile(dialect=postgresql.dialect()))
# SELECT (regexp_split_to_array(table1.col1, %(regexp_split_to_array_1)s))[%(regexp_split_to_array_2)s] AS item
# FROM table1

Can I get table names along with column names using .description() in Python's DB API?

I am using Python with SQLite 3. I have user entered SQL queries and need to format the results of those for a template language.
So, basically, I need to use .description of the DB API cursor (PEP 249), but I need to get both the column names and the table names, since the users often do joins.
The obvious answer, i.e. to read the table definitions, is not possible -- many of the tables have the same column names.
I also need some intelligent behaviour on the column/table names for aggregate functions like avg(field)...
The only solution I can come up with is to use an SQL parser and analyse the SELECT statement (sigh), but I haven't found any SQL parser for Python that seems really good?
I haven't found anything in the documentation or anyone else with the same problem, so I might have missed something obvious?
Edit: To be clear -- the problem is to find the result of an SQL select, where the select statement is supplied by a user in a user interface. I have no control of it. As I noted above, it doesn't help to read the table definitions.
Python's DB API only specifies column names for the cursor.description (and none of the RDBMS implementations of this API will return table names for queries...I'll show you why).
What you're asking for is very hard, and only even approachable with an SQL parser...and even then there are many situations where even the concept of which "Table" a column is from may not make much sense.
Consider these SQL statements:
Which table is today from?
SELECT DATE('now') AS today FROM TableA FULL JOIN TableB
ON TableA.col1 = TableB.col1;
Which table is myConst from?
SELECT 1 AS myConst;
Which table is myCalc from?
SELECT a+b AS myCalc FROM (select t1.col1 AS a, t2.col2 AS b
FROM table1 AS t1
LEFT OUTER JOIN table2 AS t2 on t1.col2 = t2.col2);
Which table is myCol from?
SELECT SUM(a) as myCol FROM (SELECT a FROM table1 UNION SELECT b FROM table2);
The above were very simple SQL statements for which you either have to make up a "table", or arbitrarily pick one...even if you had an SQL parser!
What SQL gives you is a set of data back as results. The elements in this set can not necessarily be attributed to specific database tables. You probably need to rethink your approach to this problem.
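To see the limitation concretely, here is a minimal sketch using the built-in sqlite3 module. The author and book tables are invented for illustration; the point is only what cursor.description contains after a join of two tables that share column names:

```python
import sqlite3

# Two hypothetical tables that share the column names "id" and "name".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann');
    INSERT INTO book VALUES (10, 'Intro', 1);
""")

cur = conn.execute(
    "SELECT * FROM author JOIN book ON book.author_id = author.id"
)

# Each entry of cursor.description is a 7-item tuple; item 0 is the column name.
names = [col[0] for col in cur.description]

# Collect everything the driver reports besides the name.
extras = {field for col in cur.description for field in col[1:]}

conn.close()

print(names)   # duplicate names, with no hint of the source table
print(extras)  # sqlite3 leaves the other six fields unset
```

The two id columns and the two name columns cannot be told apart from the description alone, and nothing in the remaining fields names a table.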

SELECT DISTINCT ON (geometry column) equivalent with GeoDjango

I'm trying to create a Django query that will do the equivalent of the following PostgreSQL/PostGIS query:
SELECT DISTINCT ON (site) * FROM some_table;
site is a POINT type geometry column. How can this be done?
Basically, many of the records in some_table share the same POINT geometry; I just want a list of the geometries with no duplicates. I don't care about the rest of the some_table columns.
The rest of my query is pretty simple; it looks something like this:
qs = models.SomeTable.objects.filter(foo='bar', site__contained=some_polygon)
Side note:
The 'manager' for SomeTable (SomeTable.objects) is a django.contrib.gis.db.models.GeoManager type. I don't know if that helps at all.
Relevant version info:
Django 1.3
PostgreSQL 9.1.1
PostGIS 1.5.3
I figured it out. I had overlooked distinct: https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.distinct
Here's the django query that does exactly what I need:
qs = models.SomeTable.objects.filter(foo='bar', site__contained=some_polygon).values('site').distinct()
