SQLAlchemy Python 3.8 work with renamed columns [duplicate] - python

I'm using the following SQL expression, but I'm getting an error.
select
CampaignCustomer.CampaignCustomerID,
convert(varchar, CampaignCustomer.ModifiedDate, 111) as startdate,
CampaignCustomer.CampaignID,
CampaignCustomer.CampaignCallStatusID,
CampaignCustomer.UserID,
CampaignCustomerSale.Value,
Users.Name
from CampaignCustomer
inner join CampaignCustomerSale
on CampaignCustomer.CampaignCustomerID = CampaignCustomerSale.CampaignCustomerID
inner join Users
on CampaignCustomer.UserID = Users.UserID
where
CampaignCustomer.CampaignCallStatusID = 21
and CampaignCustomer.startdate = '2011/11/22' <------- THIS
order by
startdate desc,
Users.Name asc
Error:
Msg 207, Level 16, State 1, Line 1
Invalid column name 'startdate'.
SQL Server doesn't recognize my alias name startdate in the WHERE clause, but it does in my ORDER BY clause. What's wrong?
Edit:
And no, it is not possible for me to change the datatype to date instead of datetime. The time is needed elsewhere. But in this case, I only need to get all posts on a specific date, and I really don't care what time of day the modifieddate is :)
Maybe another method is needed instead of convert()?

You can't use a column alias in the WHERE clause.
Change it to:
where
CampaignCustomer.CampaignCallStatusID = 21
and convert(varchar, CampaignCustomer.ModifiedDate, 111) = '2011/11/22'

Do this:
select
CampaignCustomer.CampaignCustomerID,
convert(varchar, CampaignCustomer.ModifiedDate, 111) as startdate,
CampaignCustomer.CampaignID,
CampaignCustomer.CampaignCallStatusID,
CampaignCustomer.UserID,
CampaignCustomerSale.Value,
Users.Name
from CampaignCustomer
inner join CampaignCustomerSale
on CampaignCustomer.CampaignCustomerID = CampaignCustomerSale.CampaignCustomerID
inner join Users
on CampaignCustomer.UserID = Users.UserID
where
CampaignCustomer.CampaignCallStatusID = 21
and convert(varchar, CampaignCustomer.ModifiedDate, 111) = '2011/11/22'
order by
startdate desc,
Users.Name asc
You can't use aliases in the WHERE clause, so in the query above I replaced your alias with the expression it represents.

You didn't mention what version of SQL Server you're using - but if you're on 2008 or newer, you could use:
where
CampaignCustomer.CampaignCallStatusID = 21
and CAST(CampaignCustomer.ModifiedDate AS DATE) = '20111122'
You could cast it to a DATE - just for this comparison.
Also: I would recommend always using the ISO-8601 standard format when you need to compare a date to a string. ISO-8601 defines a date as YYYYMMDD, and it is the only format in SQL Server that will always work, no matter what language/regional settings you have. Any other string representation of a date is subject to your SQL Server's settings: it might work for you, but I bet for someone else it will break.
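On the Python side, a minimal sketch of the same idea: build the unambiguous YYYYMMDD string and bind it as a parameter rather than splicing it into the SQL (the cursor call is a hypothetical illustration; any DB-API driver works the same way):

```python
from datetime import date

# ISO-8601 "basic" form (YYYYMMDD), which SQL Server parses identically
# under any language/regional settings.
target_day = date(2011, 11, 22).strftime("%Y%m%d")
print(target_day)  # 20111122

# Hypothetical usage with a DB-API cursor (connection setup omitted):
# cursor.execute(
#     "SELECT ... WHERE CAST(CampaignCustomer.ModifiedDate AS DATE) = ?",
#     (target_day,),
# )
```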

Related

SQLite query in Python using DATETIME and variables not working as expected

I'm trying to query a database using Python/Pandas. This will be a recurring request where I'd like to look back into a window of time that changes over time, so I'd like to use some smarts in how I do this.
In my SQLite query, if I say
WHERE table.date BETWEEN DATETIME('now', '-6 month') AND DATETIME('now')
I get the result I expect. But if I try to move those into variables, the resulting table comes up empty. I found that the endDate variable works but startDate does not. Presumably I'm doing something wrong with the escapes around the apostrophes? Since the result comes up empty, it's as if it sees DATETIME(\'now\') without the '-6 month' part (comparing now vs. now, which would be empty). Any ideas how I can pass this through to the query correctly using Python?
startDate = 'DATETIME(\'now\', \'-6 month\')'
endDate = 'DATETIME(\'now\')'
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN ? AND ?
'''
df = pd.read_sql_query(query, db, params=[startDate, endDate])
You can try string formatting, as shown below:
startDate = "DATETIME('now', '-6 month')"
endDate = "DATETIME('now')"
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN {start_date} AND {end_date}
'''
df = pd.read_sql_query(query.format(start_date=startDate, end_date=endDate), db)
When you provide parameters to a query, they're treated as literals, not expressions that SQL should evaluate.
You can pass the function arguments rather than the function as a string.
startDate = 'now'
startOffset = '-6 month'
endDate = 'now'
endOffset = '+0 seconds'
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN DATETIME(?, ?) AND DATETIME(?, ?)
'''
df = pd.read_sql_query(query, db, params=[startDate, startOffset, endDate, endOffset])
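To see why the parameterized form works where the string-valued variables did not, here is a self-contained sketch with plain sqlite3 (the events table and its rows are made up for the demo):

```python
import sqlite3

# Hypothetical in-memory table standing in for the question's `table`.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (name TEXT, date TEXT)")
db.execute("INSERT INTO events VALUES ('old', DATETIME('now', '-12 month'))")
db.execute("INSERT INTO events VALUES ('recent', DATETIME('now', '-1 month'))")

# Bound parameters are literal values, so DATETIME(?, ?) lets SQLite
# evaluate the function with 'now' and '-6 month' as its string arguments.
rows = db.execute(
    "SELECT name FROM events "
    "WHERE date BETWEEN DATETIME(?, ?) AND DATETIME(?, ?)",
    ("now", "-6 month", "now", "+0 seconds"),
).fetchall()
print(rows)  # [('recent',)]
```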

sql: select list of columns

I want to pass a str or list argument and have SQL know how to treat it.
For example, list_col = 'date1, date2, date3, date4', and in the end I want a dataframe with the columns
date1, date2, date3, id
query = """
SELECT {list_col} AT TIME ZONE 'Europe/Paris' as {list_col}, {table}.{id}
FROM {table}
ORDER BY {table}.{id}
"""
def fun_query(table_name, list_col, id):
    return query.format(table=table_name, list_col=list_col, id=id)
Does anyone know how to do this?
As already noted, this is not doable the way you suggested, because the AT TIME ZONE and AS clauses must appear with each individual column. I would suggest doing something like this.
query = """
SELECT {date_cols_as_tz}, {table}.{id}
FROM {table}
ORDER BY {table}.{id}
"""
def fun_query(table_name, list_col, id, tz="'Europe/Paris'"):
    date_cols_as_tz = ",".join(f"{c} AT TIME ZONE {tz} as {c}" for c in list_col)
    return query.format(date_cols_as_tz=date_cols_as_tz, table=table_name, id=id)
When you call e.g. fun_query("my_table", ["date1", "date2"], "table_id") and print the result, you get the following query:
SELECT date1 AT TIME ZONE 'Europe/Paris' as date1,date2 AT TIME ZONE 'Europe/Paris' as date2, my_table.table_id
FROM my_table
ORDER BY my_table.table_id
The major changes are:
create date_cols_as_tz inside the fun_query
use real list for list_col parameter (not string like "date1,date2" but list like ["date1", "date2"])
added optional tz parameter to the function
The advantage of this solution is that you can easily change the timezone by using different value for tz instead of hard coded value.
Also note that this function expects that all columns in list_col are dates (but that's probably what you expect if I understood your question correctly).
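Putting the pieces together, a runnable version of the helper (table and column names are the question's examples; only interpolate trusted, whitelisted column names this way, since str.format offers no protection against SQL injection):

```python
query = """
SELECT {date_cols_as_tz}, {table}.{id}
FROM {table}
ORDER BY {table}.{id}
"""

def fun_query(table_name, list_col, id, tz="'Europe/Paris'"):
    # One "col AT TIME ZONE ... as col" clause per date column.
    date_cols_as_tz = ",".join(f"{c} AT TIME ZONE {tz} as {c}" for c in list_col)
    return query.format(date_cols_as_tz=date_cols_as_tz, table=table_name, id=id)

print(fun_query("my_table", ["date1", "date2"], "table_id"))
```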

Python postgreSQL sqlalchemy query a DATERANGE column

I have a booking system and I save the booked daterange in a DATERANGE column:
booked_date = Column(DATERANGE(), nullable=False)
I already know that I can access the actual dates with booked_date.lower or booked_date.upper
For example I do this here:
for bdate in room.RoomObject_addresses_UserBooksRoom:
    unaviable_ranges['ranges'].append([str(bdate.booked_date.lower),
                                       str(bdate.booked_date.upper)])
Now I need to filter my bookings by a given daterange. For example I want to see all bookings between 01.01.2018 and 10.01.2018.
Usually it's simple, because dates can be compared like this: date <= other date
But if I do it with the DATERANGE:
the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y')
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y')
bookings = UserBooks.query.filter(UserBooks.booked_date.lower >= the_daterange_lower,
                                  UserBooks.booked_date.upper <= the_daterange_upper).all()
I get an error:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with UserBooks.booked_date has an attribute 'lower'
EDIT
I found a sheet with useful range operators, and it looks like there are better options for what I want to do, but for this I somehow need to create a range variable, and plain Python can't do this. So I am still confused.
In my database my daterange column entries look like this:
[2018-11-26,2018-11-28)
EDIT
I am trying to use native SQL instead of SQLAlchemy, but I don't understand how to create a daterange object.
bookings = db_session.execute('SELECT * FROM usersbookrooms WHERE booked_date && [' + str(the_daterange_lower) + ',' + str(the_daterange_upper) + ')')
The query
the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y')
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y')
bookings = UserBooks.query.\
filter(UserBooks.booked_date.lower >= the_daterange_lower,
UserBooks.booked_date.upper <= the_daterange_upper).\
all()
could be implemented using "range is contained by" operator <#. In order to pass the right operand you have to create an instance of psycopg2.extras.DateRange, which represents a Postgresql daterange value in Python:
from datetime import datetime
from psycopg2.extras import DateRange

the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y').date()
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y').date()
the_daterange = DateRange(the_daterange_lower, the_daterange_upper)
bookings = UserBooks.query.\
    filter(UserBooks.booked_date.contained_by(the_daterange)).\
    all()
Note that the attributes lower and upper are part of the psycopg2.extras.Range types. The SQLAlchemy range column types do not provide such, as your error states.
If you want to use raw SQL and pass date ranges, you can use the same DateRange objects to pass values as well:
bookings = db_session.execute(
'SELECT * FROM usersbookrooms WHERE booked_date && %s',
(DateRange(the_daterange_lower, the_daterange_upper),))
You can also build literals manually, if you want to:
bookings = db_session.execute(
'SELECT * FROM usersbookrooms WHERE booked_date && %s::daterange',
(f'[{the_daterange_lower}, {the_daterange_upper})',))
The trick is to build the literal in Python and pass it as a single value, using placeholders as always. This avoids any SQL injection possibility; the only thing that can go wrong is that the literal has invalid syntax for a daterange. Alternatively, you can pass the bounds to a range constructor:
bookings = db_session.execute(
'SELECT * FROM usersbookrooms WHERE booked_date && daterange(%s, %s)',
(the_daterange_lower, the_daterange_upper))
All in all it is easier to just use the Psycopg2 Range types and let them handle the details.
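As a sanity check, the manual literal from the %s::daterange variant can be built and inspected without a database connection (the dates are the question's example values):

```python
from datetime import datetime

the_daterange = ['26.11.2018', '28.11.2018']
the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y').date()
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y').date()

# Inclusive lower bound, exclusive upper bound -- the same form Postgres
# uses when it displays a daterange.
literal = f'[{the_daterange_lower},{the_daterange_upper})'
print(literal)  # [2018-11-26,2018-11-28)
```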

peewee select() return SQL query, not the actual data

I'm trying to sum up the values in two columns and truncate my date fields by day. I've constructed the SQL query to do this (which works):
SELECT date_trunc('day', date) AS Day, SUM(fremont_bridge_nb) AS
Sum_NB, SUM(fremont_bridge_sb) AS Sum_SB FROM bike_count GROUP BY Day
ORDER BY Day;
But I then run into issues when I try to format this into peewee:
query = (Bike_Count
         .select(fn.date_trunc('day', Bike_Count.date).alias('Day'),
                 fn.SUM(Bike_Count.fremont_bridge_nb).alias('Sum_NB'),
                 fn.SUM(Bike_Count.fremont_bridge_sb).alias('Sum_SB'))
         .group_by('Day')
         .order_by('Day'))
I don't get any errors, but when I print out the variable I stored this in, it shows:
<class 'models.Bike_Count'> SELECT date_trunc(%s, "t1"."date") AS
Day, SUM("t1"."fremont_bridge_nb") AS Sum_NB,
SUM("t1"."fremont_bridge_sb") AS Sum_SB FROM "bike_count" AS t1 ORDER
BY %s ['day', 'Day']
The only thing that I've written in Python to get data successfully is:
Bike_Count.get(Bike_Count.id == 1).date
If you just stick a string into your group by / order by, Peewee will try to parameterize it as a value. This is to avoid SQL injection haxx.
To solve the problem, you can use SQL('Day') in place of 'Day' inside the group_by() and order_by() calls.
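The parameterization behavior described above can be reproduced with plain sqlite3, outside Peewee (the table and values here are made up for the demo):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bike_count (day TEXT, n INTEGER)")
db.executemany("INSERT INTO bike_count VALUES (?, ?)",
               [("2015-01-01", 1), ("2015-01-01", 2), ("2015-01-02", 5)])

# GROUP BY ? binds the *string* 'day' as a constant, so every row falls
# into one group -- the same collapse the parameterized alias causes.
bad = db.execute("SELECT SUM(n) FROM bike_count GROUP BY ?", ("day",)).fetchall()
print(bad)  # [(8,)]

# Naming the column (or expression) directly groups as intended.
good = db.execute("SELECT day, SUM(n) FROM bike_count "
                  "GROUP BY day ORDER BY day").fetchall()
print(good)  # [('2015-01-01', 3), ('2015-01-02', 5)]
```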
Another way is to just stick the function call into the GROUP BY and ORDER BY. Here's how you would do that:
day = fn.date_trunc('day', Bike_Count.date)
nb_sum = fn.SUM(Bike_Count.fremont_bridge_nb)
sb_sum = fn.SUM(Bike_Count.fremont_bridge_sb)
query = (Bike_Count
         .select(day.alias('Day'), nb_sum.alias('Sum_NB'), sb_sum.alias('Sum_SB'))
         .group_by(day)
         .order_by(day))
Or, if you prefer:
query = (Bike_Count
         .select(day.alias('Day'), nb_sum.alias('Sum_NB'), sb_sum.alias('Sum_SB'))
         .group_by(SQL('Day'))
         .order_by(SQL('Day')))

Django GROUP BY strftime date format

I would like to do a SUM on rows in a database and group by date.
I am trying to run this SQL query using Django aggregates and annotations:
select strftime('%m/%d/%Y', time_stamp) as the_date, sum(numbers_data)
from my_model
group by the_date;
I tried the following:
data = My_Model.objects.values("strftime('%m/%d/%Y', time_stamp)").annotate(Sum("numbers_data")).order_by()
but it seems like you can only use column names in the values() function; it doesn't like the use of strftime().
How should I go about this?
This works for me:
select_data = {"d": """strftime('%%m/%%d/%%Y', time_stamp)"""}
data = My_Model.objects.extra(select=select_data).values('d').annotate(Sum("numbers_data")).order_by()
Took a bit to figure out I had to escape the % signs.
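The escaping is needed because the raw fragment is later run through Python %-style parameter interpolation when the query is sent to the database. A minimal illustration of just that rule (not Django itself):

```python
# Literal percent signs in a raw SQL fragment must be doubled, because the
# fragment is interpolated with the %-operator when parameters are bound.
fragment = "strftime('%%m/%%d/%%Y', time_stamp)"
rendered = fragment % ()  # what effectively happens at bind time, no params
print(rendered)  # strftime('%m/%d/%Y', time_stamp)
```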
As of v1.8, you can use Func() expressions.
For example, if you happen to be targeting AWS Redshift's date and time functions:
from django.db.models import F, Func, Value
def TimezoneConvertedDateF(field_name, tz_name):
    tz_fn = Func(Value(tz_name), F(field_name), function='CONVERT_TIMEZONE')
    dt_fn = Func(tz_fn, function='TRUNC')
    return dt_fn
Then you can use it like this:
SomeDbModel.objects \
    .annotate(the_date=TimezoneConvertedDateF('some_timestamp_col_name',
                                              'America/New_York')) \
    .filter(the_date=...)
or like this:
SomeDbModel.objects \
    .annotate(the_date=TimezoneConvertedDateF('some_timestamp_col_name',
                                              'America/New_York')) \
    .values('the_date') \
    .annotate(...)
Any reason not to just do this in the database, by running the following query against it:
select date, sum(numbers_data)
from my_model
group by date;
If your answer is that the date is a datetime with non-zero hours, minutes, seconds, or milliseconds, my answer is to use a date function to truncate the datetime, but I can't tell you exactly which one without knowing which RDBMS you're using.
I'm not sure about strftime; my solution below uses Postgres' date_trunc...
select_data = {"date": "date_trunc('day', creationtime)"}
ttl = ReportWebclick.objects.using('cms')\
    .extra(select=select_data)\
    .filter(**filters)\
    .values('date', 'tone_name', 'singer', 'parthner', 'price', 'period')\
    .annotate(loadcount=Sum('loadcount'), buycount=Sum('buycount'), cancelcount=Sum('cancelcount'))\
    .order_by('date', 'parthner')
-- equivalent to this SQL query:
select date_trunc('day', creationtime) as date, tone_name, sum(loadcount), sum(buycount), sum(cancelcount)
from webclickstat
group by tone_name, date;
My solution looks like this when my DB is MySQL:
select_data = {"date":"""FROM_UNIXTIME( action_time,'%%Y-%%m-%%d')"""}
qs = ViewLogs.objects.filter().extra(select=select_data).values('mall_id', 'date').annotate(pv=Count('id'), uv=Count('visitor_id', distinct=True))
To decide which function to use, read the MySQL date and time function docs, e.g. DATE_FORMAT and FROM_UNIXTIME.
