We have SQLite databases, and datetimes are actually stored in Excel serial format (there is a decent reason for this: it's our system's standard representation of choice, and the SQLite databases may be accessed by multiple languages/systems).
Have been introducing Python into the mix with great success in recent months, and SQLAlchemy is a part of that. The ability of the sqlite3 dbapi layer to swiftly bind custom Python functions where SQLite lacks a given SQL function is particularly appreciated.
I wrote an ExcelDateTime type decorator, and that works fine when retrieving result sets from the sqlite databases; Python gets proper datetimes back.
However, I'm having a real problem binding custom Python functions that expect input params to be Python datetimes; I'd have thought this was what bindparam was for, but I'm obviously missing something, as I cannot get this scenario to work. Unfortunately, modifying the functions to convert from Excel datetimes to Python datetimes is not an option, and neither is changing the representation of the datetimes in the database, as more than one system/language may access it.
The code below is a self-contained example that can be run "as-is", and is representative of the issue. The custom function "get_month" is created, but fails because it receives the raw data, not the type-converted data from the "Born" column. At the end you can see what I've tried so far, and the errors it spits out...
Is what I'm trying to do impossible? Or is there a different way of ensuring the bound function receives the appropriate python type? It's the only problem I've been unable to overcome so far, would be great to find a solution!
import sqlalchemy.types as types
from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData
from sqlalchemy.sql.expression import bindparam
from sqlalchemy.sql import select, text
from sqlalchemy.interfaces import PoolListener
import datetime
# setup type decorator for excel<->python date conversions
class ExcelDateTime( types.TypeDecorator ):
    impl = types.FLOAT

    def process_result_value( self, value, dialect ):
        lxdays = int( value )
        lxsecs = int( round((value-lxdays) * 86400.0) )
        if lxsecs == 86400:
            lxsecs = 0
            lxdays += 1
        return ( datetime.datetime.fromordinal(lxdays+693594)
                 + datetime.timedelta(seconds=lxsecs) )

    def process_bind_param( self, value, dialect ):
        # NB: test datetime before date (datetime is a subclass of date),
        # and only compare against 200000 once value is known to be numeric
        if isinstance( value, datetime.datetime ):
            date_part = value.toordinal() - 693594.0
            time_part = ((value.hour*3600) + (value.minute*60) + value.second) / 86400.0
            return date_part + time_part  # time part = day fraction
        elif isinstance( value, datetime.date ):
            return value.toordinal() - 693594.0
        elif value < 200000:  # already an excel float?
            return value
# custom SQL function: expects a python datetime, returns its month
def get_month( dt ):
    return dt.month

class ConnectionFactory( PoolListener ):
    def connect( self, dbapi_con, con_record ):
        dbapi_con.create_function( 'GET_MONTH', 1, get_month )

# create sqlite memory db via sqlalchemy
eng = create_engine('sqlite:///:memory:', listeners=[ConnectionFactory()])
eng.dialect.dbapi.enable_callback_tracebacks( 1 ) # show better errors from user functions
meta = MetaData()
birthdays = Table('Birthdays', meta,
                  Column('Name', String, primary_key=True),
                  Column('Born', ExcelDateTime),
                  Column('BirthMonth', Integer))
meta.create_all(eng)
dbconn = eng.connect()
dbconn.execute( "INSERT INTO Birthdays VALUES('Jimi Hendrix',15672,NULL)" )
# demonstrate the type decorator works and we get proper datetimes out
res = dbconn.execute( select([birthdays]) )
tuple(res)
# >>> ((u'Jimi Hendrix', datetime.datetime(1942, 11, 27, 0, 0)),)
# simple attempt (blows up with "AttributeError: 'float' object has no attribute 'month'")
dbconn.execute( text("UPDATE Birthdays SET BirthMonth = GET_MONTH(Born)") )
# more involved attempt (blows up with "InterfaceError: (InterfaceError) Error binding parameter 0 - probably unsupported type")
dbconn.execute( text( "UPDATE Birthdays SET BirthMonth = GET_MONTH(:Born)",
bindparams=[bindparam('Born',ExcelDateTime)],
typemap={'Born':ExcelDateTime} ),
Born=birthdays.c.Born )
Many thanks.
Instead of letting Excel/Microsoft dictate how you store date/time, it would be less trouble and work for you to rely on the standard, "obvious" way of doing things.
Process objects according to the standards of their domain - Python's way (datetime objects) inside Python/SQLAlchemy, SQL's way inside SQLite (native date/time type instead of float!).
Use APIs to do the necessary translation between domains. (Python talks to SQLite via SQLAlchemy, Python talks to Excel via xlrd/xlwt , Python talks to other systems, Python is your glue.)
Using standard date/time types in SQLite allows you to write SQL, without Python involved, in a standard readable way (WHERE date BETWEEN '2011-11-01' AND '2011-11-02' makes much more sense than WHERE date BETWEEN 48560.9999 AND 48561.00001). It also allows you to easily port it to another DBMS (without rewriting all those ad-hoc functions) when your application/database needs to grow.
Using native datetime objects in Python allows you to use a lot of freely available, well tested, and non-EEE (embrace, extend, extinguish) APIs. SQLAlchemy is one of those.
And I hope you are aware of the slight but dangerous difference between Excel datetime floats on Mac and Windows (the 1900 vs. 1904 date systems)? Who knows whether one of your clients will someday submit an Excel file from a Mac and crash your application (actually, what's worse is if they suddenly earn a million dollars from the error)?
So my suggestion is for you to use xlrd/xlwt when dealing with Excel from Python (there's another package out there for reading Excel 2007 up) and let SQLAlchemy and your database use standard datetime types. However, if you insist on continuing to store datetimes as Excel floats, it could save you a lot of time to reuse code from xlrd/xlwt, which has functions for converting Python objects to Excel data and vice-versa.
EDIT: for clarity...
You have no issues reading from the database to Python because you have that class that converts the float into Python datetime.
You have issues writing to the database through SQLAlchemy or using other native Python functions/modules/extensions because you are trying to force a non-standard type where they expect the standard Python datetime. The ExcelDateTime type, from the point of view of Python, is a float, not a datetime.
Although Python uses dynamic/duck typing, it is still strongly typed. It won't allow you to do "nonsense/silliness" like adding integers to strings, or forcing a float to act as a datetime.
At least two ways to address that:
Declare a custom type - Seems to be the path you wanted to take. Unfortunately this is the hard way: it's quite difficult to create a type that is a float but can also pretend to be a datetime. Possible, yes, but it requires a lot of study of type instrumentation. Sorry, you have to grok the documentation for that on your own.
Create utility functions - Should be the easier way, IMHO. You need 2 functions: a) float_to_datetime() for converting data from the database to return a Python datetime, and b) datetime_to_float() for converting Python datetime to Excel float.
About solution #2: as I was saying, you could simplify your life by reusing xldate_from_datetime_tuple() from xlrd. That function converts "a datetime tuple (year, month, day, hour, minute, second) to an Excel date value." Install xlrd, then go to /path_to_python/lib/site-packages/xlrd. The function is in xldate.py - the source is well documented for understanding.
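If you'd rather not add the xlrd dependency just for the conversion, here is a minimal sketch of option #2 that simply reuses the conversion math from the question's own type decorator (the function names are only suggestions):

import datetime

EXCEL_ORDINAL_OFFSET = 693594  # same day offset used by the type decorator above

def float_to_datetime(xl_float):
    """Convert an Excel serial date (1900 system) to a Python datetime."""
    days = int(xl_float)
    secs = int(round((xl_float - days) * 86400.0))
    if secs == 86400:  # guard against rounding up to a full day
        secs = 0
        days += 1
    return (datetime.datetime.fromordinal(days + EXCEL_ORDINAL_OFFSET)
            + datetime.timedelta(seconds=secs))

def datetime_to_float(dt):
    """Convert a Python date or datetime to an Excel serial date."""
    day_part = dt.toordinal() - EXCEL_ORDINAL_OFFSET
    if isinstance(dt, datetime.datetime):
        day_part += ((dt.hour * 3600) + (dt.minute * 60) + dt.second) / 86400.0
    return float(day_part)

For example, float_to_datetime(15672) gives back the datetime.datetime(1942, 11, 27, 0, 0) value seen in the question's result set.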
I'm learning how to use SQL Alchemy, and I'm trying to re-implement a previously defined API but now using Python.
The REST API has the following query parameter:
myService/v1/data?range=time:2015-08-01:2015-08-02
So I want to map something like field:FROM:TO to filter a range of results, like a date range, for example.
This is what I'm using at this moment:
rangeStatement = range.split(':')
if len(rangeStatement) == 3:
    query = query.filter(text('{} BETWEEN "{}" AND "{}"'.format(*rangeStatement)))
So, this will produce the following WHERE condition:
WHERE time BETWEEN "2015-08-01" AND "2015-08-02"
I know SQL Alchemy is a powerful tool that allows creating queries like Query.filter_by(MyClass.temp), but I need the API request to be as open as possible.
So, I'm worried that someone could pass something like DROP TABLE in the range parameter and exploit the text function
If queries are constructed using string formatting, then sqlalchemy.text will not prevent SQL injection - the "injection" will already be present in the query text. However, it's not difficult to build queries dynamically, in this case by using getattr to get a reference to the column. Assuming that you are using the ORM layer with model class Foo and table foos, you can do:
import sqlalchemy as sa
...
col, lower, upper = 'time:2015-08-01:2015-08-02'.split(':')
# Regardless of style, queries implement a fluent interface,
# so they can be built iteratively
# Classic/1.x style
q1 = session.query(Foo)
q1 = q1.filter(getattr(Foo, col).between(lower, upper))
print(q1)
or
# 2.0 style (available in v1.4+)
q2 = sa.select(Foo)
q2 = q2.where(getattr(Foo, col).between(lower, upper))
print(q2)
The respective outputs are (parameters will be bound at execution time):
SELECT foos.id AS foos_id, foos.time AS foos_time
FROM foos
WHERE foos.time BETWEEN ? AND ?
and
SELECT foos.id, foos.time
FROM foos
WHERE foos.time BETWEEN :time_1 AND :time_2
SQLAlchemy will delegate quoting of the values to the connector package being used by the engine, so your protection against injection will be as good as that provided by the connector package*.
* In general I believe correct quoting should be a good defence against SQL injections, however I'm not sufficiently expert to confidently state that it's 100% effective. It will be more effective than building queries from strings though.
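As an extra belt-and-braces measure, you might also validate the column name before the getattr call, so a crafted range parameter cannot reach arbitrary model attributes. A minimal sketch (the whitelist set is hypothetical):

# Hypothetical whitelist of columns the API is allowed to filter on
ALLOWED_RANGE_COLUMNS = {'time'}

col, lower, upper = 'time:2015-08-01:2015-08-02'.split(':')
if col not in ALLOWED_RANGE_COLUMNS:
    raise ValueError('unsupported range column: {!r}'.format(col))
q = session.query(Foo).filter(getattr(Foo, col).between(lower, upper))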
I created this procedure that I call with the cursor.execute method. The problem I'm having is that pyodbc sees more parameters than the ones I've given.
In this example query, the "*-" and "-*" are being read as extra parameters by pyodbc. Does anyone know why this is the case? This is happening any time I do any string concatenation in Access.
def GetAccessResults(self):
    with pyodbc.connect(SQL.DBPath) as con:
        cursor = con.cursor()
        if self.parameters is None:
            cursor.execute('{{Call {}}}'.format(self.storedProc))
        else:
            callString = self.__CreateStoredProcString()
            cursor.execute(callString, self.parameters)
        returnValue = cursor.fetchall()
    return returnValue

def __CreateStoredProcString(self):
    # one '?' placeholder per parameter, minus the trailing comma
    questionMarks = ('?,' * len(self.parameters))[:-1]
    return '{{Call {} ({})}}'.format(self.storedProc, questionMarks)
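For illustration, with hypothetical inputs the helper builds a call string like this:

storedProc = 'GetOrders'                   # hypothetical procedure name
parameters = ('2019-01-01', '2019-12-31')  # hypothetical parameter values
questionMarks = ('?,' * len(parameters))[:-1]
print('{{Call {} ({})}}'.format(storedProc, questionMarks))
# prints: {Call GetOrders (?,?)}  -- pyodbc then binds the parameters to the '?' marks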
As OP found out, MS Access, being both a frontend GUI application and a backend database, runs SQL differently in each context. Usually, the backend mode tends to be closer to standard SQL, namely:
Quotes: In backend, single quotes are reserved for literals and double quotes for identifiers as opposed to being interchangeable inside MSAccess.exe.
Wildcards: In backend, the default wildcard for LIKE is %, while the GUI uses *, unless the Access database runs in SQL Server compatible syntax (ANSI 92), which uses the standard %. For this reason, consider Access' ALIKE (ANSI-Like) with % to be compatible in both Access modes (see the pyodbc sketch after this list). Interestingly, this is not the case with stored queries as OP uses, but it will be if writing queries in code.
Parameter: In backend, any unquoted object not recognized in query is considered a named parameter and errs out. Meanwhile the GUI launches a pop-up Enter Parameter box (which beginners do not know is actually a runtime error) allowing typed answers to then be evaluated on fly.
GUI Objects: While in GUI, Access queries can reference controls in open forms and reports, even user-defined functions defined in standalone VBA modules, these same references will error out in backend which essentially runs Access in headless mode and only recognizes other tables and queries.
Optimization: See Allen Browne's differences in optimizations that can occur when creating queries in GUI vs backend especially referencing Access object library functions.
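For instance, a sketch of running such an ALIKE predicate from Python over ODBC, per the wildcards point above (the connection string, table, and pattern are hypothetical):

import pyodbc

# Hypothetical DSN-less connection string for the Access ODBC driver
con = pyodbc.connect(
    r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
    r'DBQ=C:\data\charts.accdb')
cursor = con.cursor()
# ALIKE keeps the standard % wildcard valid in both GUI and ODBC modes
cursor.execute("SELECT FileName FROM Charts WHERE FileName ALIKE ?",
               '%-SVC-%')
rows = cursor.fetchall()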
By the way, using LIKE against a subquery compares one scalar to another scalar. In fact, Access will err out if the subquery returns more than one row, which can potentially occur with the current setup:
Error 3354: At most one record can be returned by this subquery
In other databases, the evaluation runs against the first row of the subquery (which, without ORDER BY, can be a random row), not against all records of the subquery. Instead, consider re-factoring the SQL to use an EXISTS clause:
PARAMETERS prmServiceName Text(255);
SELECT c.*
FROM Charts c
WHERE EXISTS
  (SELECT 1
   FROM Services s
   WHERE s.ServiceName = prmServiceName
     AND c.FileName ALIKE '%-' & s.Service_Abbreviation & '-%');
Try using ampersands:
Select "*-" & Service_Abbreviation & "-*"
Also, Like expects a string wrapped in quotes, and your subquery doesn't return that. So perhaps:
Select "'*-" & Service_Abbreviation & "-*'"
I want to do a query based on two fields of a model, a date, offset by an int, used as a timedelta
model.objects.filter(last_date__gte=datetime.now()-timedelta(days=F('interval')))
is a no-go, as an F() expression cannot be passed into a timedelta
A little digging, and I discovered DateModifierNode - though it seems it was removed in this commit: https://github.com/django/django/commit/cbb5cdd155668ba771cad6b975676d3b20fed37b (from this now-outdated SO question Django: Using F arguments in datetime.timedelta inside a query)
the commit mentions:
The .dates() queries were implemented by using custom Query, QuerySet,
and Compiler classes. Instead implement them by using expressions and
database converters APIs.
which sounds sensible, and like there should still be a quick easy way - but I've been fruitlessly looking for how to do that for a little too long - anyone know the answer?
In Django 1.10 there's a simpler method, but you need to change the model a little: use a DurationField. My model is as follows:
class MyModel(models.Model):
    # DurationField expects a timedelta; default: one week
    timeout = models.DurationField(default=datetime.timedelta(days=7))
    last = models.DateTimeField(auto_now_add=True)
and the query to find objects where last was before now minus timeout is:
MyModel.objects.filter(last__lt=datetime.datetime.now()-F('timeout'))
Ah, answer from the docs: https://docs.djangoproject.com/en/1.9/ref/models/expressions/#using-f-with-annotations
from django.db.models import DateTimeField, ExpressionWrapper, F

Ticket.objects.annotate(
    expires=ExpressionWrapper(
        F('active_at') + F('duration'), output_field=DateTimeField()))
which should make my original query look like
model.objects.annotate(
    new_date=ExpressionWrapper(
        F('last_date') + F('interval'), output_field=DateTimeField())
).filter(new_date__gte=datetime.now())
I find myself stuck on this problem, and repeated Googling, checking SO, and reading numerous docs has not helped me get the right answer, so I hope this isn't a bad question.
One entity I want to create is an event taking place during a convention. I'm giving it the property start_time = ndb.TimeProperty(). I also have a property date = messages.DateProperty(), and I'd like to keep the two discrete (in other words, not using DateTimeProperty).
When a user enters information to create an event, I want to specify defaults for any fields they do not enter at creation and I'd like to set the default time as midnight, but I can't seem to format it correctly so the service accepts it (constant 503 Service Unavailable response when I try it using the API explorer).
Right now I've set things up like this (some unnecessary details removed):
event_defaults = {...
...
"start_time": 0000,
...
}
and then I try looping over my default values to enter them into a dictionary which I'll use to .put() the info on the server.
data = {field.name: getattr(request, field.name)
        for field in request.all_fields()}
for default in event_defaults:
    if data[default] in (None, []):
        data[default] = event_defaults[default]
        setattr(request, default, event_defaults[default])
In the logs, I see the error Encountered unexpected error from ProtoRPC method implementation: BadValueError (Expected time, got 0). I have also tried using the time and datetime modules, but I must be using them incorrectly, because I still receive errors.
I suppose I could work around this problem by using ndb.StringProperty() instead, and just deal with strings, but then I'd feel like I would be missing out on a chance to learn more about how GAE and NDB work (all of this is for a project on udacity.com, so learning is certainly the point).
So, how can I structure my default time properly for midnight? Sorry for the wall of text.
Link to code on github. The conference.py file contains the code I'm having the trouble with, and models.py contains my definitions for the entities I'm working with.
Update: I'm a dummy. I had my model class using a TimeProperty() and the corresponding message class using a StringField(), but I was never making the proper conversion between expected types. That's why I could never seem to give it the right thing, but it expected two different things at different points in the code. Issue resolved.
TimeProperty expects a datetime.time value (note that datetime.time() with no arguments is exactly midnight, 00:00:00):
import datetime
event_defaults = {...
...
"start_time": datetime.time(),
...
}
More in the docs: https://cloud.google.com/appengine/docs/python/ndb/entity-property-reference#Date_and_Time
Use the datetime library to convert it into a valid ndb time property value:
if data['time']:
    # parse just the HH:MM portion of the incoming string
    data['time'] = datetime.datetime.strptime(data['time'][:5], "%H:%M").time()
else:
    data['time'] = datetime.datetime.now().time()
ps: Don't forget to replace data['time'] with your own field name
I need to retrieve the current date and time of the database I'm connected to with SQLAlchemy (not the date and time of the machine running the Python code). I've seen these functions, but they don't seem to do what they say:
>>> from sqlalchemy import *
>>> print func.current_date()
CURRENT_DATE
>>> print func.current_timestamp()
CURRENT_TIMESTAMP
Moreover, it seems they don't need to be bound to any SQLAlchemy session or engine. It makes no sense...
Thanks!
I found the solution: these functions cannot be used the way I used them (just printing them); they need to be called inside code that interacts with the database. For instance:
print select([my_table, func.current_date()]).execute()
or assigned to a field in an insert operation.
By accident I discovered that at least a couple of parameters exist for these functions:
type_, which indicates the type of the value to return, I guess
bind, which indicates a binding to an SQLAlchemy engine
Two examples of use:
func.current_date(type_=types.Date, bind=engine1)
func.current_timestamp(type_=types.Time, bind=engine2)
Anyway, my tests seem to say these parameters are not so important.
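For completeness, a minimal self-contained sketch of the same idea in the newer 1.4-style API (the in-memory SQLite URL is just an example):

from sqlalchemy import create_engine, func, select

engine = create_engine('sqlite:///:memory:')
with engine.connect() as conn:
    # the function call is compiled and run on the database side,
    # so the value reflects the database server's clock
    db_now = conn.execute(select(func.current_timestamp())).scalar()
    print(db_now)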