How to add a SQL function to sqlalchemy - python

I'm working with Oracle SQL and I want to use some of Oracle's functions that don't exist in other relational databases.
Basically, I want to add a function that returns the weekday for a given date.
From what I understand, SQLAlchemy gives me two ways to do that: one is to provide the SQL query as text, the other is to extend SQLAlchemy by implementing a new Python function that represents the SQL function. I'm leaning towards implementing the function because I expect to use it in a few queries.
Here is what I have implemented so far. I'm not really sure what my next step is, or if this is even correct.
from sqlalchemy.sql.expression import FunctionElement
from sqlalchemy.ext.compiler import compiles

class weekday(FunctionElement):
    name = 'weekday'

@compiles(weekday)
def compile_weekday(element, compiler, **kw):
    # Render Oracle's TO_CHAR(<date>,'DY') when exactly one argument is given.
    if len(element.clauses) == 1:
        return "TO_CHAR(%s,'DY')" % compiler.process(element.clauses)
    elif len(element.clauses) == 0:
        raise TypeError("Weekday needs a date as parameter")
    else:
        raise TypeError("Weekday needs just one parameter")
When I tried to add this function to one of my objects, I got the function itself back instead of the calculated result. Here is an example of what I'm talking about:
from sqlalchemy import Column, Date
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.hybrid import hybrid_property

Base = declarative_base()

class SomeObject(Base):
    __tablename__ = 'table1'
    asof = Column(Date, primary_key=True)

    @hybrid_property
    def weekday(self):
        return weekday(self.asof)
In shell I tried:
from datetime import datetime
my_object = SomeObject()
my_object.asof = datetime(2018,1,1)
session.add(my_object)
session.commit()
result = session.query(SomeObject).filter(SomeObject.asof == datetime(2018,1,1)).first()
result.weekday # returns `<orm.weekday as 0x1724b7deeb8; weekday>`
NOTE
I insist on extracting the weekday in the SQL query rather than in Python because I need it to filter out records, and in my case that function determines whether SQLAlchemy pulls a couple of million records or just a couple.

After trying out a few things I realized that hybrid_property is not supposed to return a SQL expression; it needs to return the actual value that the SQL expression would have returned.
That being said, my SQL function goes into the 'expression' part of the hybrid_property, which looks like this:
    @weekday.expression
    def weekday(cls):
        return weekday(cls.asof)
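Putting the two halves together, a minimal sketch might look like the following. It assumes SQLAlchemy 1.4 or later, compiles against the Oracle dialect without connecting to a database, and computes the Python-side value from a fixed English abbreviation tuple (an assumption; Oracle's 'DY' output depends on the session's NLS language settings).

```python
from datetime import date

from sqlalchemy import Column, Date, select
from sqlalchemy.dialects import oracle
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import declarative_base
from sqlalchemy.sql.expression import FunctionElement

# Assumed English abbreviations, Monday first (matches date.weekday()).
DAY_ABBREVIATIONS = ("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN")


class weekday(FunctionElement):
    """SQL-side weekday, rendered as Oracle's TO_CHAR(..., 'DY')."""
    name = "weekday"
    inherit_cache = True  # allow statement caching in SQLAlchemy 1.4+


@compiles(weekday)
def compile_weekday(element, compiler, **kw):
    if len(element.clauses) != 1:
        raise TypeError("weekday needs exactly one date argument")
    return "TO_CHAR(%s, 'DY')" % compiler.process(element.clauses, **kw)


Base = declarative_base()


class SomeObject(Base):
    __tablename__ = "table1"
    asof = Column(Date, primary_key=True)

    @hybrid_property
    def weekday(self):
        # Instance level: compute the value in Python.
        return DAY_ABBREVIATIONS[self.asof.weekday()]

    @weekday.expression
    def weekday(cls):
        # Class level: emit the SQL function defined above.
        return weekday(cls.asof)


# Class-level access produces SQL that the database can filter on.
stmt = select(SomeObject).where(SomeObject.weekday == "MON")
sql = str(stmt.compile(dialect=oracle.dialect()))

# Instance-level access returns a plain Python value.
python_side = SomeObject(asof=date(2018, 1, 1)).weekday
```

With this split, session.query(SomeObject).filter(SomeObject.weekday == 'MON') does the filtering in the database, while result.weekday on a loaded object returns a string.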

Old question, but you can also use SQLAlchemy's SQL and Generic Functions:
https://docs.sqlalchemy.org/en/13/core/functions.html
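As a rough illustration of that route (assuming SQLAlchemy 1.4 or later; the lightweight table1 construct below is a stand-in for the mapped table), func accepts any function name and renders it as a SQL call, so TO_CHAR needs no custom compiler:

```python
from sqlalchemy import column, func, select, table
from sqlalchemy.dialects import oracle

# Lightweight stand-in for the real table; only the column name matters here.
table1 = table("table1", column("asof"))

# func.<name>(...) renders <name>(...) in SQL; 'DY' is sent as a bound parameter.
weekday_expr = func.to_char(table1.c.asof, "DY")

stmt = select(table1.c.asof).where(weekday_expr == "MON")
sql = str(stmt.compile(dialect=oracle.dialect()))
```

The trade-off is that the call is not portable: a database without to_char will reject the query, whereas a FunctionElement with @compiles can render differently per dialect.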

Related

Is using quoted_name a safe way to parametrize table and field names in Python's SQLAlchemy?

I spent a lot of time looking for a way to parametrize table names and field names in SQLAlchemy plain textual SQL queries for SQL Server. I stumbled upon several Stack Overflow questions and other resources, such as:
SQL Alchemy Parametrized Query , binding table name as parameter gives error
The answer to the above, which I don't like because it just builds the query from a string, which is prone to SQL injection attacks
I know it is possible (I did it that way in the past) by creating table objects from sqlalchemy.ext.declarative.declarative_base, but that requires declaring the whole schema of the database, which is a lot of unscalable code.
Without much luck with SQL Server, I found a solution for Postgres in psycopg2, using
psycopg2.sql.Identifier. From there I started looking for an equivalent in SQLAlchemy and found quoted_name, which as I understand it works as an identifier and prevents SQL injection. But does it really? Could somebody confirm that it is safe to use?
The code example below returns the number of rows in the table passed in:
def count_rows(self, table_name: str) -> int:
    query_base = "SELECT COUNT(*) FROM {}"
    query_params = [quoted_name(table_name, True)]
    query = text((query_base).format(*query_params))
    with self.engine.connect() as con:
        result = con.execute(query).fetchone()
    return result[0]
I don't get the impression from the documentation this is the purpose for which quoted_name is intended. My reading was that it's for cases where non-standard naming conventions for column or table names are in use, requiring quotation for them to work.
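A quick way to check that reading, as a sketch: quoted_name is a str subclass that carries a quote flag for SQLAlchemy's own compiler, so str.format() inserts its text unchanged. The hostile value below is a made-up example of user input.

```python
from sqlalchemy.sql.expression import quoted_name

hostile = "users; DROP TABLE users"  # hypothetical user-supplied value
name = quoted_name(hostile, True)

# str.format() sees an ordinary string, so no quoting or escaping is applied.
query_text = "SELECT COUNT(*) FROM {}".format(name)
```

In other words, the quoting only takes effect when the name is rendered by SQLAlchemy's compiler (for example as part of a Table or Column), not when it is pasted into a text() string.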
I think there are two possible solutions:
1. exercise total control over the allowed table names
f"SELECT COUNT(*) FROM {table_name}" is fine if you don't allow table_name to be provided by the user without filtering.
For example, you could simply have
...
allowed = ["table_1", ..., "table_N"]
if table_name not in allowed:
    raise ValueError(f"Table name must be one of {allowed}. Received {table_name}.")
There are, of course, plenty of other ways to do this. But the idea is to either map user input to allowed values, reject disallowed values, or a mixture of both.
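The allow-list idea can be wrapped in a small helper, sketched below. The table names and the function name build_count_query are placeholders chosen for illustration.

```python
# Hypothetical set of tables the application is willing to query.
ALLOWED_TABLES = frozenset({"table_1", "table_2"})


def build_count_query(table_name: str) -> str:
    """Return a COUNT(*) query, refusing any table not on the allow-list."""
    if table_name not in ALLOWED_TABLES:
        raise ValueError(
            f"Table name must be one of {sorted(ALLOWED_TABLES)}. "
            f"Received {table_name!r}."
        )
    # Safe to interpolate: the value is known to be one of our own constants.
    return f"SELECT COUNT(*) FROM {table_name}"
```

Because the check is an exact membership test against constants, nothing the caller supplies reaches the SQL text unless it already matches a trusted name.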
2. reflect the schema
You mentioned that
I know it is possible (I did it that way in the past) by creating table objects from sqlalchemy.ext.declarative.declarative_base, but that requires declaring the whole schema of the database, which is a lot of unscalable code.
This is not true. You can 'reflect' the schema of an existing database as follows:
from sqlalchemy import create_engine, func, select, MetaData

class YourClass:
    def __init__(self, db_connection_string: str):
        """
        __init__ for YourClass
        (for example)
        """
        self.engine = create_engine(db_connection_string)
        self.metadata = MetaData(bind=self.engine)
        MetaData.reflect(self.metadata)

    def count_rows(self, table_name: str) -> int:
        """
        count_rows
        Returns the COUNT of the rows for a given table
        """
        table = self.metadata.tables[table_name]
        result = select([func.count()]).select_from(table).scalar()
        return result
Worth noting that this approach will also throw an exception if table_name doesn't exist in the database.
Alternative syntax - for full ORM-goodness, use a sessionmaker:
from sqlalchemy import create_engine, MetaData
from sqlalchemy.orm import sessionmaker

class YourClass:
    def __init__(self, db_connection_string: str):
        self.engine = create_engine(db_connection_string)
        self.Session = sessionmaker(bind=self.engine)
        self.metadata = MetaData(bind=self.engine)
        MetaData.reflect(self.metadata)

    def count_rows(self, table_name: str) -> int:
        table = self.metadata.tables[table_name]
        # if you want a new session every call:
        with self.Session.begin() as session:
            return session.query(table).count()
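The snippets above rely on MetaData(bind=...) and select([...]), which are legacy forms that SQLAlchemy 2.0 removed. A sketch of the same reflection idea for SQLAlchemy 1.4 or later follows; the class name TableCounter and the temporary SQLite file are assumptions made only so the example can run on its own.

```python
import os
import tempfile

from sqlalchemy import MetaData, create_engine, func, select, text


class TableCounter:
    """Hypothetical helper that counts rows in reflected tables."""

    def __init__(self, db_connection_string: str):
        self.engine = create_engine(db_connection_string)
        self.metadata = MetaData()
        # Load the table definitions from the live database.
        self.metadata.reflect(bind=self.engine)

    def count_rows(self, table_name: str) -> int:
        # A KeyError here means the table does not exist in the database.
        table = self.metadata.tables[table_name]
        stmt = select(func.count()).select_from(table)
        with self.engine.connect() as conn:
            return conn.execute(stmt).scalar_one()


# Demo setup: a throwaway SQLite file containing one small table.
db_url = "sqlite:///" + os.path.join(tempfile.mkdtemp(), "demo.db")
setup_engine = create_engine(db_url)
with setup_engine.begin() as conn:
    conn.execute(text("CREATE TABLE table_1 (id INTEGER PRIMARY KEY)"))
    conn.execute(text("INSERT INTO table_1 (id) VALUES (1), (2), (3)"))

counter = TableCounter(db_url)
row_count = counter.count_rows("table_1")
```

Since the table object comes from the reflected metadata, the table name is never interpolated into SQL text; an unknown name fails at the dictionary lookup instead.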

Is there a SqlAlchemy database agnostic FROM_UNIXTIME() function?

Currently I have a query similar to the below in flask sqlalchemy:
from sqlalchemy.sql import func
models = (
    Model.query
    .join(ModelTwo)
    .filter(Model.finish_time >= func.from_unixtime(ModelTwo.start_date))
    .all()
)
This works fine with MySQL, which I run in production. However, when I run tests against the method using an in-memory SQLite database, it fails because from_unixtime is not a SQLite function.
Leaving aside the question of running tests on a database as close to production as possible, and the fact that I have two different ways of representing dates in the database, is there a database-agnostic method in SQLAlchemy for converting dates to Unix timestamps and vice versa?
For anyone else interested in this, I found a way to create custom functions in SQLAlchemy based on the SQL dialect being used. The code below achieves what I need:
from sqlalchemy.sql import expression
from sqlalchemy.ext.compiler import compiles

class convert_timestamp_to_date(expression.FunctionElement):
    name = 'convert_timestamp_to_date'

# Default compilation: used for MySQL and any dialect without its own rule.
@compiles(convert_timestamp_to_date)
def mysql_convert_timestamp_to_date(element, compiler, **kwargs):
    return 'from_unixtime({})'.format(compiler.process(element.clauses))

# SQLite-specific compilation; 'unixepoch' is a string literal, so it is single-quoted.
@compiles(convert_timestamp_to_date, 'sqlite')
def sqlite_convert_timestamp_to_date(element, compiler, **kwargs):
    return "datetime({}, 'unixepoch')".format(compiler.process(element.clauses))
The query above can now be re-written as such:
models = (
    Model.query
    .join(ModelTwo)
    .filter(Model.finish_time >= convert_timestamp_to_date(ModelTwo.start_date))
    .all()
)
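To see the dialect dispatch at work without the Flask models, here is a small self-contained check. It assumes SQLAlchemy 1.4 or later, compiles for MySQL without connecting, and executes against an in-memory SQLite database; the function names default_convert and sqlite_convert are arbitrary.

```python
from sqlalchemy import create_engine, literal, select
from sqlalchemy.dialects import mysql
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql import expression


class convert_timestamp_to_date(expression.FunctionElement):
    name = "convert_timestamp_to_date"
    inherit_cache = True  # allow statement caching in SQLAlchemy 1.4+


@compiles(convert_timestamp_to_date)
def default_convert(element, compiler, **kw):
    # Fallback rendering, used for MySQL here.
    return "from_unixtime({})".format(compiler.process(element.clauses, **kw))


@compiles(convert_timestamp_to_date, "sqlite")
def sqlite_convert(element, compiler, **kw):
    # SQLite has no from_unixtime(); datetime(x, 'unixepoch') is its equivalent.
    return "datetime({}, 'unixepoch')".format(
        compiler.process(element.clauses, **kw)
    )


expr = convert_timestamp_to_date(literal(0))

# Compile only: no MySQL server or driver is needed for this step.
mysql_sql = str(expr.compile(dialect=mysql.dialect()))

# Execute for real against in-memory SQLite.
engine = create_engine("sqlite://")
with engine.connect() as conn:
    sqlite_value = conn.execute(select(expr)).scalar()
```

Note that the two databases still differ in what they return (MySQL a datetime value, SQLite a text string), so comparisons against date columns should be tested on both.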

SQLAlchemy parameter creation by loop or function

I'm building a short answer quiz (input as a string). While I've managed to get the quiz up and running, I'm finding that I'm writing code which seems to be a) a bit ugly/unwieldy and b) isn't expandable/reusable in different situations.
Here's an example (this pattern is used a number of times in different functions):
answers_to_store = Answer(answer_1=user_answer_list[0],
answer_2=user_answer_list[1],
answer_3=user_answer_list[2],
answer_4=user_answer_list[3],
answer_5=user_answer_list[4])
Is there a better way to create these parameters? I can't see a way of replacing the parameter name (i.e. answer_1) with a variable or similar (which is the only way I can think of making the task easier).
The only way I've found would be to create the parameters as text using a loop, and then running the resulting command via exec - is that the only way of achieving this, or is there a better way? The other limitation is that this means that having a storage function to store 5 answers from a quiz would be different to one storing 20 answers (the database has that many columns).
I've tried searching, but the problem is I don't know the right term for this, and the nearest thing I came up with was the creation of the appropriate command via a loop and exec - which would work, but seems a long-winded way of doing this.
Assuming this is a short answer quiz, create a table with a question ID and an answer string, then look up the row by question ID and compare the stored answer with the user input. Here is the idea:
from sqlalchemy import create_engine, Table, MetaData, Column, Integer, VARCHAR
from sqlalchemy.orm import mapper, sessionmaker

Session = sessionmaker()
engine = create_engine('sqlite:///foo.db')
Session.configure(bind=engine)
sess = Session()
metadata = MetaData()
question = Table('question', metadata,
    Column('id', Integer, primary_key=True),
    Column('answerKey', VARCHAR(None))
)

class Question(object):
    def __init__(self, answer):
        # The attribute name must match the mapped column name.
        self.answerKey = answer

mapper(Question, question)

def check(INquestionID, answerInput):
    # Fetch the answer string stored for the question (None if there is no such row).
    ans = sess.query(question.c.answerKey).filter(question.c.id == INquestionID).scalar()
    return answerInput == ans
This method stores a string (up to 8000 characters in SQL Server), accesses based on the ID and compares for grading.
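On the original question of generating the answer_1 ... answer_N keyword arguments without exec: Python can unpack a dictionary into keyword arguments with **, so the names can be built in a loop. The sketch below uses a plain stand-in class for Answer (an assumption); a declarative SQLAlchemy model accepts its column names as keyword arguments in the same way by default.

```python
class Answer:
    """Stand-in for the mapped Answer model (hypothetical)."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


def build_answer(user_answer_list):
    # Build {"answer_1": ..., "answer_2": ...} for however many answers exist.
    fields = {
        f"answer_{i}": answer
        for i, answer in enumerate(user_answer_list, start=1)
    }
    # ** expands the dict into answer_1=..., answer_2=..., and so on.
    return Answer(**fields)


answers_to_store = build_answer(["a", "b", "c", "d", "e"])
```

The same function handles 5 or 20 answers, as long as the model has a column for each generated name.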

SQLAlchemy - override orm.Query.count for a database without subselect

I am using sqlalchemy with a database that doesn't support subselects. What that means is that something like this wouldn't work (where Calendar is a model inheriting a declarative base):
Calendar.query.filter_by(uuid=uuid).count()
I am trying to override the count method with something like this:
def count(self):
    col = func.count(literal_column("'uuid'"))
    return self.from_self(col).scalar()
However, the from_self bit still does the subselect. I can't do something like this:
session.query(sql.func.count(Calendar.uuid)).scalar()
Because I want all the filter information from the Query. Is there a way I can get the filter arguments for the current Query without doing the subselect?
Thanks~
From the SQLAlchemy documentation:
For fine grained control over specific columns to count, to skip the usage of a subquery or otherwise control of the FROM clause, or to use other aggregate functions, use func expressions in conjunction with query(), i.e.:
from sqlalchemy import func
# count User records, without
# using a subquery.
session.query(func.count(User.id))
# return count of user "id" grouped
# by "name"
session.query(func.count(User.id)).\
    group_by(User.name)
from sqlalchemy import distinct
# count distinct "name" values
session.query(func.count(distinct(User.name)))
Source: SQLAlchemy (sqlalchemy.orm.query.Query.count)
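To keep the filters of an existing Query while avoiding the subselect, one option is Query.with_entities(), which replaces the SELECT list and leaves the WHERE clause in place. A minimal sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database standing in for the real one (the Calendar model here is a simplified guess at the original):

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Calendar(Base):
    __tablename__ = "calendar"
    id = Column(Integer, primary_key=True)
    uuid = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Calendar(uuid="a"), Calendar(uuid="a"), Calendar(uuid="b")])
    session.commit()

    filtered = session.query(Calendar).filter_by(uuid="a")

    # Swap the SELECT list for COUNT(...) while keeping the existing filters.
    count_query = filtered.with_entities(func.count(Calendar.uuid))
    count_sql = str(count_query)
    row_count = count_query.scalar()
```

If the original query carries an ORDER BY, it may need to be cleared with order_by(None) before counting, since some databases reject ordering on a non-aggregated column in an aggregate query.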

Filter on/Order by Postgres range type in SQLAlchemy

SQLAlchemy supports Postgres range types, as described here. It uses the postgresql+psycopg2 dialect for Postgres communication. These testcases give usage examples for the range types in SQLAlchemy.
How can I filter by, or order by, one component (lower or upper) of such a range field in SQLAlchemy?
Using the example from the first link
from datetime import datetime

from psycopg2.extras import DateTimeRange
from sqlalchemy import Column, Integer
from sqlalchemy.dialects.postgresql import TSRANGE
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class RoomBooking(Base):
    __tablename__ = 'room_booking'
    room = Column(Integer(), primary_key=True)
    during = Column(TSRANGE())

booking = RoomBooking(
    room=101,
    during=DateTimeRange(datetime(2013, 3, 23), None)
)
I would, e.g., like to filter on bookings with a during that begins on a given datetime or order the bookings by the start of the datetime.
As such I'm looking to generate roughly this SQL:
SELECT room, during
FROM room_booking
WHERE lower(during) = foo
ORDER BY upper(during)
I have tried constructs like
RoomBooking.query.filter(RoomBooking.during.lower == foo).order_by(RoomBooking.during.upper)
but recognize that this is likely not working because lower is an attribute on the python object and not associated with the underlying table column.
One possible solution to this might be finding a way to use the upper()/lower() range functions from SQLAlchemy.
One way to do this is to use the already existing func.lower()/func.upper() methods in sqlalchemy:
from sqlalchemy import func
RoomBooking.query.filter(func.lower(RoomBooking.during) == foo).order_by(func.upper(RoomBooking.during))
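func renders whatever function name is accessed on it, so these calls emit lower(during) and upper(during) in the SQL; PostgreSQL overloads both names for range types as well as for text. Other PostgreSQL functions that SQLAlchemy does not wrap explicitly can be called through func in the same way.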
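To inspect the SQL this produces without a running Postgres server, the statement can be compiled against the PostgreSQL dialect. This is a sketch assuming SQLAlchemy 1.4 or later; it uses a Core Table with the same columns as RoomBooking rather than the Flask-style query property.

```python
from datetime import datetime

from sqlalchemy import Column, Integer, MetaData, Table, func, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import TSRANGE

metadata = MetaData()
room_booking = Table(
    "room_booking",
    metadata,
    Column("room", Integer, primary_key=True),
    Column("during", TSRANGE),
)

# Filter on the lower bound of the range and order by the upper bound.
stmt = (
    select(room_booking)
    .where(func.lower(room_booking.c.during) == datetime(2013, 3, 23))
    .order_by(func.upper(room_booking.c.during))
)

# Compile only; no database connection is opened.
sql = str(stmt.compile(dialect=postgresql.dialect()))
```

The compiled text contains lower(room_booking.during) in the WHERE clause and upper(room_booking.during) in the ORDER BY, matching the SQL sketched in the question.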
