I'm trying to use peewee to fetch and format data from a SQLite database using GROUP_CONCAT and Case, but I'm running into an issue with those functions.
First, let me start with what I want to achieve.
I simplified my table structure to isolate the problem: one simple table with two columns, name (Char) and is_controlled (Boolean).
This SQL query computes the desired result:
SELECT
SUM(is_controlled),
GROUP_CONCAT(CASE WHEN is_controlled = 1 THEN name ELSE NULL END, ':') as controlled,
GROUP_CONCAT(CASE WHEN is_controlled = 0 THEN name ELSE NULL END, ':') as not_controlled
FROM component;
Output (which is what I expect to have with peewee):
2
comp1:comp3
comp2
Here is a script that reproduces my problem:
from peewee import *
db = SqliteDatabase('test.db')
class Component(Model):
    name = CharField()
    is_controlled = BooleanField()
    class Meta:
        database = db
raw_data = [
    {'name': 'comp1', 'is_controlled': True},
    {'name': 'comp2', 'is_controlled': False},
    {'name': 'comp3', 'is_controlled': True},
]
db.connect()
# Populate database
db.create_tables([Component])
for item in raw_data:
    Component.get_or_create(**item)
res = Component.select(
    fn.Sum(Component.is_controlled).alias('controlled_count'),
    fn.GROUP_CONCAT(Case(None, [((Component.is_controlled == True), Component.name)], None), ':').alias('controlled'),
    fn.GROUP_CONCAT(Case(None, [((Component.is_controlled == False), Component.name)], None), ':').alias('not_controlled')
)
print res[0].controlled_count
print res[0].controlled
print res[0].not_controlled
db.close()
As you can see, the data structure is simple (I simplified the example as much as possible). The output is:
2
:comp3:
:
I inspected the SQL query generated by peewee (using res.sql()) and it looks like this:
sql = 'SELECT Sum("t1"."is_controlled") AS "controlled_count", GROUP_CONCAT(CASE WHEN ("t1"."is_controlled" = ?) THEN ? END, "t1"."name") AS "controlled", GROUP_CONCAT(CASE WHEN ("t1"."is_controlled" = ?) THEN ? END, "t1"."name") AS "not_controlled" FROM "component" AS "t1"'
params = [True, ':', False, ':']
We can see that the ELSE NULL part is missing from the SQL generated by peewee. I have tried several things, such as changing the parameters passed to the Case function, but I cannot get it to work correctly.
How can I use peewee correctly to get the same result as with the SQL query?
(I'm using Python 2.7.15 with peewee 3.6.4 and SQLite 3.19.4)
The Case function's signature provides a clue:
def Case(predicate, expression_tuples, default=None):
Inside the code, it checks:
if default is not None:
clauses.extend((SQL('ELSE'), default))
So, when you're passing None it's indistinguishable from the "empty/unspecified" case, and Peewee ignores it.
As a workaround you could instead specify SQL('NULL') as the default value. Or you could use an empty string, although I'm not sure whether you're relying on some behavior of group-concat with nulls, so that may not work?
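On the question of whether GROUP_CONCAT treats NULL and an empty string differently, here is a minimal sketch that uses only the standard-library sqlite3 module (no peewee) and an in-memory table mirroring the question's component table. It relies on SQLite's group_concat skipping NULL values; the helper name concat_controlled is made up for illustration.

```python
import sqlite3

ROWS = [("comp1", 1), ("comp2", 0), ("comp3", 1)]


def concat_controlled(else_sql):
    """Run GROUP_CONCAT over controlled names with the given ELSE expression.

    else_sql is a trusted SQL literal ("NULL" or "''"), never user input.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE component (name TEXT, is_controlled INTEGER)")
        conn.executemany("INSERT INTO component VALUES (?, ?)", ROWS)
        query = (
            "SELECT GROUP_CONCAT("
            "CASE WHEN is_controlled = 1 THEN name ELSE " + else_sql + " END, ':') "
            "FROM component"
        )
        return conn.execute(query).fetchone()[0]
    finally:
        conn.close()


with_null = concat_controlled("NULL")  # NULL rows are skipped entirely
with_empty = concat_controlled("''")   # empty strings still take a slot
print(with_null)
print(with_empty)
```

With ELSE NULL the result contains a single separator, while with ELSE '' the non-matching row contributes an empty element, so the empty-string workaround would change the output.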
I have a table Ticket in Oracle that has id (auto-incremented), ticket_number (set by a trigger that reads a sequence), value and date. I want to do the following:
INSERT INTO TICKET (value, date) values (100, TO_DATE('07-29-2015', 'mm-dd-yyyy')) returning ticket_number into :ticket_number;
I need to do this in raw SQL, but I don't know how to get the value back in SQLAlchemy. Is it possible?
I've tried this with a toy table in Postgres and it works. I think it should be equivalent in Oracle; please let me know.
In [15]:
result = session.connection().execute("insert into usertable values('m6', 'kk2', 'Chile') returning username")
for r in result:
    print r
(u'm6',)
Hope it helps.
EDIT for Oracle: the one way I've found to do it is not very elegant. It uses the raw connection underneath the SQLAlchemy connection, something like:
In [15]:
from sqlalchemy.sql import text
import cx_Oracle
cur = session.connection().connection.cursor()
out = cur.var(cx_Oracle.STRING)
par = { "u" : out }
cur.prepare("insert into usertable values('m34', 'kk2', 'Chile') returning username into :u")
cur.execute(None, par)
print(out)
print(type(out))
print(out.getvalue())
<cx_Oracle.STRING with value 'm34'>
<type 'cx_Oracle.STRING'>
m34
Unfortunately, I don't think there is a way to create a cx_Oracle variable instance directly; it is just not available in the API, see docs.
So there is no way to avoid creating the cursor, even though it also works when you delegate more to SQLAlchemy:
In [28]:
from sqlalchemy.sql import text
import cx_Oracle
cur = session.connection().connection.cursor()
out = cur.var(cx_Oracle.STRING)
par = { "u" : out }
q = text("insert into usertable values('m43', 'kk2', 'Chile') returning username into :u")
result = session.connection().execute(q, par)
print(par["u"])
print(out)
type(out)
<cx_Oracle.STRING with value 'm43'>
<cx_Oracle.STRING with value 'm43'>
Out[28]:
cx_Oracle.STRING
Of course, you should close the cursor in this second case (in the first one, Oracle closes it). The point is that there is no way to create an instance like out = cx_Oracle.STRING().
As I said, it is not very elegant, but I don't think there is a way to create an equivalent variable in SQLAlchemy. It is something that the code handles internally. I would just go for the raw connection and cursor.
Hope it helps.
EDIT2: In the code above, added out.getvalue() as suggested. Thanks!
You should be able to use the execute command.
Something like this:
raw_SQL = "INSERT INTO TICKET (value, date) values (100, TO_DATE('07-29-2015', 'mm-dd-yyyy')) returning ticket_number into :ticket_number;"
connection = engine.connect()
result = connection.execute(raw_SQL)
Is there a way to write a query with a multi-column IN clause using SQLAlchemy?
Here is an example of the actual query:
SELECT url FROM pages WHERE (url_crc, url) IN ((2752937066, 'http://members.aye.net/~gharris/blog/'), (3799762538, 'http://www.coxandforkum.com/'));
I have a table with a two-column primary key, and I'm hoping to avoid adding one more key just to be used as an index.
PS: I'm using a MySQL database.
Update: This query will be used for batch processing, so I would need to put a few hundred pairs into the IN clause. With the IN clause approach I hope to have a known, fixed limit on how many pairs I can put into one query, the way Oracle has a default limit of 1000 items.
Using an AND/OR combination might be limited by the length of the query in characters, which would be variable and less predictable.
Assuming that you have your model defined in Page, here's an example using tuple_:
keys = [
(2752937066, 'http://members.aye.net/~gharris/blog/'),
(3799762538, 'http://www.coxandforkum.com/')
]
select([
Page.url
]).select_from(
Page
).where(
tuple_(Page.url_crc, Page.url).in_(keys)
)
Or, using the query API:
session.query(Page.url).filter(tuple_(Page.url_crc, Page.url).in_(keys))
I do not think this is currently possible in SQLAlchemy, and not all RDBMSs support this.
You can always transform this into an OR(AND...) condition though:
filter_rows = [
(2752937066, 'http://members.aye.net/~gharris/blog/'),
(3799762538, 'http://www.coxandforkum.com/'),
]
qry = session.query(Page)
qry = qry.filter(or_(*(and_(Page.url_crc == crc, Page.url == url) for crc, url in filter_rows)))
print qry
should produce something like (for SQLite):
SELECT pages.id AS pages_id, pages.url_crc AS pages_url_crc, pages.url AS pages_url
FROM pages
WHERE pages.url_crc = ? AND pages.url = ? OR pages.url_crc = ? AND pages.url = ?
-- (2752937066L, 'http://members.aye.net/~gharris/blog/', 3799762538L, 'http://www.coxandforkum.com/')
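To see the OR-of-ANDs form run end to end without SQLAlchemy, here is a small sketch using the standard-library sqlite3 module. The in-memory pages table and the extra example.com row are assumptions added for illustration; only the two URL pairs come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (url_crc INTEGER, url TEXT, PRIMARY KEY (url_crc, url))"
)
conn.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [
        (2752937066, "http://members.aye.net/~gharris/blog/"),
        (3799762538, "http://www.coxandforkum.com/"),
        (1, "http://example.com/"),  # hypothetical row that must not match
    ],
)

filter_rows = [
    (2752937066, "http://members.aye.net/~gharris/blog/"),
    (3799762538, "http://www.coxandforkum.com/"),
]

# One "(url_crc = ? AND url = ?)" group per pair, joined with OR.
# (An empty filter_rows would need separate handling.)
where = " OR ".join(["(url_crc = ? AND url = ?)"] * len(filter_rows))
# Flatten the pairs so the values line up with the placeholders.
params = [value for pair in filter_rows for value in pair]

urls = sorted(
    row[0] for row in conn.execute("SELECT url FROM pages WHERE " + where, params)
)
conn.close()
print(urls)
```

The explicit parentheses around each pair are not strictly required, since AND binds tighter than OR, but they make the grouping obvious.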
Alternatively, you can combine two columns into just one:
filter_rows = [
(2752937066, 'http://members.aye.net/~gharris/blog/'),
(3799762538, 'http://www.coxandforkum.com/'),
]
qry = session.query(Page)
qry = qry.filter((func.cast(Page.url_crc, String) + '|' + Page.url).in_(["{}|{}".format(*_frow) for _frow in filter_rows]))
print qry
which produces the below (for SQLite), so you can use IN:
SELECT pages.id AS pages_id, pages.url_crc AS pages_url_crc, pages.url AS pages_url
FROM pages
WHERE (CAST(pages.url_crc AS VARCHAR) || ? || pages.url) IN (?, ?)
-- ('|', '2752937066|http://members.aye.net/~gharris/blog/', '3799762538|http://www.coxandforkum.com/')
I ended up using the text() based solution: I generated "(a,b) in ((:a1, :b1), (:a2,:b2), ...)" with named bind variables, plus a dictionary holding the bind variables' values.
enum_pairs = []
params = {}
for counter, r in enumerate(records):
    a_param = "a%s" % counter
    params[a_param] = r['a']
    b_param = "b%s" % counter
    params[b_param] = r['b']
    pair_text = "(:%s,:%s)" % (a_param, b_param)
    enum_pairs.append(pair_text)
multicol_in_enumeration = ','.join(enum_pairs)
multicol_in_clause = text(
    " (a,b) in (" + multicol_in_enumeration + ")")
q = session.query(Table.id, Table.a,
                  Table.b).filter(multicol_in_clause).params(params)
Another option I considered was using MySQL upserts, but that would make the whole thing even less portable to other database engines than the multi-column IN clause.
Update: SQLAlchemy has the sqlalchemy.sql.expression.tuple_(*clauses, **kw) construct, which can be used for the same purpose. (I haven't tried it yet.)
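The named-bind-variable generation described in this answer can be isolated into a small pure function, which makes it easy to check without a database. This is only a sketch: build_multicol_in and its col_a/col_b parameters are hypothetical names, and the column names are assumed to be trusted identifiers rather than user input.

```python
def build_multicol_in(records, col_a="a", col_b="b"):
    """Return (sql_fragment, params) for a two-column IN clause.

    records is a non-empty list of dicts with keys "a" and "b".
    Column names are interpolated directly, so they must be trusted.
    """
    if not records:
        raise ValueError("records must not be empty")
    pairs = []
    params = {}
    for i, rec in enumerate(records):
        a_name = "a%d" % i
        b_name = "b%d" % i
        params[a_name] = rec["a"]
        params[b_name] = rec["b"]
        pairs.append("(:%s, :%s)" % (a_name, b_name))
    clause = "(%s, %s) IN (%s)" % (col_a, col_b, ", ".join(pairs))
    return clause, params


clause, params = build_multicol_in([{"a": 1, "b": "x"}, {"a": 2, "b": "y"}])
print(clause)
print(params)
```

The returned fragment and dictionary are in the shape that a text() clause with .params(...) expects.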
I've got a SQLite query, which depends on 2 variables, gender and hand. Each of these can have 3 values, 2 which actually mean something (so male/female and left/right) and the third is 'all'. If a variable has a value of 'all' then I don't care what the particular value of that column is.
Is it possible to achieve this functionality with a single query, and just changing the variable? I've had a look for a wildcard or don't care operator but haven't been able to find any except for % which doesn't work in this situation.
Obviously I could make a bunch of if statements and have different queries to use for each case but that's not very elegant.
Code:
select_sql = """ SELECT * FROM table
WHERE (gender = ? AND hand = ?)
"""
cursor.execute(select_sql, (gender_var, hand_var))
I.e. this query works if gender_var = 'male' and hand_var = 'left', but not if gender_var or hand_var = 'all'.
You can indeed do this with a single query. Simply compare each variable to 'all' in your query.
select_sql = """ SELECT * FROM table
WHERE ((? = 'all' OR gender = ?) AND (? = 'all' OR hand = ?))
"""
cursor.execute(select_sql, (gender_var, gender_var, hand_var, hand_var))
Basically, when gender_var or hand_var is 'all', the first part of each OR expression is always true, so that branch of the AND is always true and matches all records, i.e., it is a no-op in the query.
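A runnable sketch of this pattern, using the standard-library sqlite3 module with an in-memory database. The people table, its rows and the find helper are assumptions made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, gender TEXT, hand TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("ann", "female", "left"),
        ("bob", "male", "left"),
        ("cat", "female", "right"),
        ("dan", "male", "right"),
    ],
)

SELECT_SQL = """SELECT name FROM people
WHERE (? = 'all' OR gender = ?) AND (? = 'all' OR hand = ?)
ORDER BY name"""


def find(gender_var, hand_var):
    """Return matching names; 'all' disables the corresponding filter."""
    # Each variable is bound twice: once for the 'all' test, once for the column.
    args = (gender_var, gender_var, hand_var, hand_var)
    return [row[0] for row in conn.execute(SELECT_SQL, args)]


print(find("male", "left"))
print(find("all", "left"))
print(find("all", "all"))
```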
It might be better to build a query dynamically in Python, however, that has just the fields you actually need to test. It might be noticeably faster, but you'd have to benchmark that to be sure.
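One way the dynamic alternative could look is sketched below. The build_query helper and the people table name are hypothetical; column names come from a fixed whitelist so that only values, never identifiers, originate from the caller.

```python
def build_query(filters):
    """Build (sql, params) from a dict of column -> value, skipping 'all'."""
    allowed = ("gender", "hand")  # fixed whitelist of filterable columns
    clauses = []
    params = []
    for column in allowed:
        value = filters.get(column, "all")
        if value != "all":
            clauses.append(column + " = ?")
            params.append(value)
    sql = "SELECT * FROM people"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, tuple(params)


print(build_query({"gender": "male", "hand": "all"}))
print(build_query({"gender": "all", "hand": "all"}))
```

The resulting pair can be passed straight to cursor.execute(sql, params); whether this is actually faster than the single-query form would still need to be measured.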
I've tried these queries with these results:
queryset.update(done=not F('boolean'))
{'time': '0.001', 'sql': u'UPDATE "todo_item" SET "done" = True'}
queryset.update(done=(F('boolean')==False))
{'time': '0.001', 'sql': u'UPDATE "todo_item" SET "done" = False'}
What I would like is something like this:
queryset.update(done=F('done'))
{'time': '0.002', 'sql': u'UPDATE "todo_item" SET "done" = "todo_item"."done"'}
But with
SET "done" = !"todo_item"."done"
to toggle the boolean value
I am developing a django-orm extension, and have already partially implemented a solution to your problem.
>>> from django_orm.expressions import F
>>> from niwi.models import TestModel
>>> TestModel.objects.update(done=~F('done'))
# SQL:
UPDATE "niwi_testmodel" SET "done" = NOT "niwi_testmodel"."done"; args=()
https://github.com/niwibe/django-orm
It is a partial solution and not very clean, and so far it works only for PostgreSQL. In a while I'll see how to improve it.
Update: now improved and works on postgresql, mysql and sqlite.
This is the standard way and works pretty well: conditional expressions.
I'll explain it with an even simpler example :)
Suppose we have a Restaurant model with an is_closed field (BooleanField), and we want to toggle is_closed for the object with pk=1. This snippet does that:
from django.db.models import Case, When
# update() is a queryset method, so select the row with filter() rather than get()
Restaurant.objects.filter(pk=1).update(
    is_closed=Case(
        When(is_closed=True, then=False),
        default=True
    ))
#update_status
update_status = (
    "UPDATE csv SET status = (NOT csv.status) "
    "WHERE id = %s")
print(random_number)
# The parameters must be a sequence, hence the one-element tuple
cursor.execute(update_status, (random_number,))
print("update_status")
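The same NOT-based toggle can be tried locally with the standard-library sqlite3 module. This is a sketch under assumptions: the in-memory csv table and the toggle helper are invented for the demonstration, and sqlite3 uses ? placeholders where the snippet above uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE csv (id INTEGER PRIMARY KEY, status INTEGER)")
conn.executemany("INSERT INTO csv VALUES (?, ?)", [(1, 0), (2, 1)])

UPDATE_STATUS = "UPDATE csv SET status = (NOT status) WHERE id = ?"


def toggle(row_id):
    """Flip the boolean status of one row and return the new value."""
    conn.execute(UPDATE_STATUS, (row_id,))  # note the one-element tuple
    return conn.execute(
        "SELECT status FROM csv WHERE id = ?", (row_id,)
    ).fetchone()[0]


first = toggle(1)   # 0 becomes 1
second = toggle(1)  # 1 becomes 0 again
print(first, second)
```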
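Note that not F() does not toggle the field: Python evaluates not before the query is built, so the update receives a constant, as the question's first attempt shows. To negate the column in SQL, use the inversion operator ~F() where the ORM supports it.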
I am new to Python; I come from the land of PHP. I constructed a SQL query like this in Python based on my PHP knowledge, and I get warnings and errors:
cursor_.execute("update posts set comment_count = comment_count + "+str(cursor_.rowcount)+" where ID = " + str(postid))
# rowcount here is int
What is the right way to form queries?
Also, how do I escape strings to make them SQL safe? For example, if I want to escape -, ', " etc., I used to use addslashes. How do we do it in Python?
Thanks
First of all, it's high time to learn to pass variables to queries safely, using the method Matus described. More clearly:
params = (foovar, barvar)
cursor.execute("QUERY WHERE foo = ? AND bar = ?", params)
If you only need to pass one variable, you must still make it a tuple: add a comma at the end to tell Python to treat it as a one-tuple: params = (onevar,)
Your example would be of form:
cursor_.execute("update posts set comment_count = comment_count + ? where id = ?",
(cursor_.rowcount, postid))
You can also use named parameters like this:
cursor_.execute("update posts set comment_count = comment_count + :count where id = :id",
{"count": cursor_.rowcount, "id": postid})
This time the parameters aren't a tuple, but a dictionary that is formed in pairs of "key": value.
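Both placeholder styles can be tried with the standard-library sqlite3 module, as in the sketch below. The in-memory posts table and the two helper functions are assumptions for illustration; other database drivers may use a different placeholder syntax (for example %s), so check the driver's documentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, comment_count INTEGER)")
conn.execute("INSERT INTO posts VALUES (1, 0)")


def add_comments_qmark(post_id, count):
    """Positional placeholders: values are passed as a tuple, in order."""
    conn.execute(
        "UPDATE posts SET comment_count = comment_count + ? WHERE id = ?",
        (count, post_id),
    )


def add_comments_named(post_id, count):
    """Named placeholders: values are passed as a dictionary."""
    conn.execute(
        "UPDATE posts SET comment_count = comment_count + :count WHERE id = :id",
        {"count": count, "id": post_id},
    )


add_comments_qmark(1, 2)
add_comments_named(1, 3)
total = conn.execute(
    "SELECT comment_count FROM posts WHERE id = ?", (1,)
).fetchone()[0]
print(total)
```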
From the Python manual:
t = (symbol,)
c.execute( 'select * from stocks where symbol=?', t )
This way you prevent SQL injection (I suppose this is the "SQL safe" you refer to) and also have the formatting solved.
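To illustrate why no manual escaping is needed, the sketch below stores and looks up a value containing a quote, then passes a hostile-looking string through the same placeholder. It uses the standard-library sqlite3 module; the stocks table contents and the lookup helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stocks (symbol TEXT, price REAL)")
# The quote in the value needs no escaping because it is bound, not interpolated.
conn.execute("INSERT INTO stocks VALUES (?, ?)", ("O'REILLY", 10.5))


def lookup(symbol):
    """Fetch rows for one symbol using a bound parameter."""
    t = (symbol,)
    return conn.execute("SELECT * FROM stocks WHERE symbol=?", t).fetchall()


hostile = "x' OR '1'='1"  # would match every row if pasted into the SQL text
print(lookup("O'REILLY"))
print(lookup(hostile))
```

The hostile string is compared as a plain value, so it matches nothing instead of altering the query.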