I have a booking system and I save the booked daterange in a DATERANGE column:
booked_date = Column(DATERANGE(), nullable=False)
I already know that I can access the actual dates with booked_date.lower or booked_date.upper
For example I do this here:
for bdate in room.RoomObject_addresses_UserBooksRoom:
unaviable_ranges['ranges'].append([str(bdate.booked_date.lower),\
str(bdate.booked_date.upper)])
Now I need to filter my bookings by a given daterange. For example I want to see all bookings between 01.01.2018 and 10.01.2018.
Usually its simple, because dates can be compared like this: date <= other date
But if I do it with the DATERANGE:
the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y')
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y')
bookings = UserBooks.query.filter(UserBooks.booked_date.lower >= the_daterange_lower,\
UserBooks.booked_date.upper <= the_daterange_upper).all()
I get an error:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with UserBooks.booked_date has an attribute 'lower'
EDIT
I found a sheet with useful range operators and it looks like there are better options to do what I want to do, but for this I need somehow to create a range variable, but python cant do this. So I am still confused.
In my database my daterange column entries look like this:
[2018-11-26,2018-11-28)
EDIT
I am trying to use native SQL and not sqlalchemy, but I dont understand how to create a daterange object.
bookings = db_session.execute('SELECT * FROM usersbookrooms WHERE booked_date && [' + str(the_daterange_lower) + ',' + str(the_daterange_upper) + ')')
The query
the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y')
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y')
bookings = UserBooks.query.\
filter(UserBooks.booked_date.lower >= the_daterange_lower,
UserBooks.booked_date.upper <= the_daterange_upper).\
all()
could be implemented using "range is contained by" operator <#. In order to pass the right operand you have to create an instance of psycopg2.extras.DateRange, which represents a Postgresql daterange value in Python:
the_daterange_lower = datetime.strptime(the_daterange[0], '%d.%m.%Y').date()
the_daterange_upper = datetime.strptime(the_daterange[1], '%d.%m.%Y').date()
the_daterange = DateRange(the_dateranger_lower, the_daterange_upper)
bookings = UserBooks.query.\
filter(UserBooks.booked_date.contained_by(the_daterange)).\
all()
Note that the attributes lower and upper are part of the psycopg2.extras.Range types. The SQLAlchemy range column types do not provide such, as your error states.
If you want to use raw SQL and pass date ranges, you can use the same DateRange objects to pass values as well:
bookings = db_session.execute(
'SELECT * FROM usersbookrooms WHERE booked_date && %s',
(DateRange(the_daterange_lower, the_daterange_upper),))
You can also build literals manually, if you want to:
bookings = db_session.execute(
'SELECT * FROM usersbookrooms WHERE booked_date && %s::daterange',
(f'[{the_daterange_lower}, {the_daterange_upper})',))
The trick is to build the literal in Python and pass it as a single value – using placeholders, as always. It should avoid any SQL injection possibilities; only thing that can happen is that the literal has invalid syntax for a daterange. Alternatively you can pass the bounds to a range constructor:
bookings = db_session.execute(
'SELECT * FROM usersbookrooms WHERE booked_date && daterange(%s, %s)',
(the_daterange_lower, the_daterange_upper))
All in all it is easier to just use the Psycopg2 Range types and let them handle the details.
Related
I am pulling data from an online database using SQL/postgresql queries and converting it into a Python dataframe using Pandas. I want to be able to change the dates in the SQL query from one point in my Python script instead of having to manually go through every SQL query and change it one by one as there are many queries and many lines in each one.
This is what I have to begin with for example:
random_query = """
select *
from table_A as a
where date_trunc('day',a.created_at) >= date('2022-03-01')
and date_trunc('day',a.created_at) <= date('2022-03-31')
group by 1,2,3
"""
Then I will read the data into Pandas as follows:
df_random_query = pd.read_sql(random_query, conn)
The connection above is to the database - the issue is not there so I am excluding that portion of code here.
What I have attempted is the following:
start_date = '2022-03-01'
end_date = '2022-03-31'
I have set the above 2 dates as variables and then below I have tried to use them in the SQL query as follows:
attempted_solution = """
select *
from table_A as a
where date_trunc('day',a.created_at) >= date(
""" + start_date + """)
and date_trunc('day',a.created_at) <= date(
""" + end_date + """)
group by 1,2,3
"""
This does run but it gives me a dataframe with no data in it - i.e. no numbers. I am not sure what I am doing wrong - any assistance will really help.
try dropping date function and formatting:
my_query = f"... where date_trunc('day', a.created_at) >= {start_date}"
I was able to work it out as follows:
start_date = '2022-03-01'
end_date = '2022-03-31'
random_query = f"""
select *
from table_A as a
where date_trunc('day',a.created_at) >= date('start_date')
and date_trunc('day',a.created_at) <= date('end_date')
group by 1,2,3
"""
It was amusing to see that all I needed to do was put start_date and end_date in ' ' as well. I noticed this simply by printing what query was showing in the script. Key thing here is to know how to troubleshoot.
Another option was also to use the .format() at the end of the query and inside it say .format(start_date = '2022-03-01', end_date = '2022-03-31').
I'm trying to query a database using Python/Pandas. This will be a recurring request where I'd like to look back into a window of time that changes over time, so I'd like to use some smarts in how I do this.
In my SQLite query, if I say
WHERE table.date BETWEEN DATETIME('now', '-6 month') AND DATETIME('now')
I get the result I expect. But if I try to move those to variables, the resulting table comes up empty. I found out that the endDate variable does work but the startDate does not. Presumably I'm doing something wrong with the escapes around the apostrophes? Since the result is coming up empty it's like it's looking at DATETIME(\'now\') and not seeing the '-6 month' bit (comparing now vs. now which would be empty). Any ideas how I can pass this through to the query correctly using Python?
startDate = 'DATETIME(\'now\', \'-6 month\')'
endDate = 'DATETIME(\'now\')'
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN ? AND ?
'''
df = pd.read_sql_query(query, db, params=[startDate, endDate])
You can try with the string format as shown below,
startDate = "DATETIME('now', '-6 month')"
endDate = "DATETIME('now')"
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN {start_date} AND {end_data}
'''
df = pd.read_sql_query(query.format(start_date=startDate, end_data=endDate), db)
When you provide parameters to a query, they're treated as literals, not expressions that SQL should evaluate.
You can pass the function arguments rather than the function as a string.
startDate = 'now'
startOffset = '-6 month'
endDate = 'now'
endOffset = '+0 seconds'
query = '''
SELECT some stuff
FROM table
WHERE table.date BETWEEN DATETIME(?, ?) AND DATETIME(?, ?)
'''
df = pd.read_sql_query(query, db, params=[startDate, startOffset, endDate, endOffset])
I'm trying to run SQL statements through Python on a list.
By passing in a list, in this case date. Since i want to run multiple SELECT SQL queries and return them.
I've tested this by passing in integers, however when trying to pass in a date I am getting ORA-01036 error. Illegal variable name/number. I'm using an Oracle DB.
cursor = connection.cursor()
date = ["'01-DEC-21'", "'02-DEC-21'"]
sql = "select * from table1 where datestamp = :date"
for item in date:
cursor.execute(sql,id=item)
res=cursor.fetchall()
print(res)
Any suggestions to make this run?
You can't name a bind variable date, it's an illegal name. Also your named variable in cursor.execute should match the bind variable name. Try something like:
sql = "select * from table1 where datestamp = :date_input"
for item in date:
cursor.execute(sql,date_input=item)
res=cursor.fetchall()
print(res)
Some recommendation and warnings to your approach:
you should not depend on your default NLS date setting, while binding a String (e.g. "'01-DEC-21'") to a DATE column. (You probably need also remone one of the quotes).
You should ommit to fetch data in a loop if you can fetch them in one query (using an IN list)
use prepared statement
Example
date = ['01-DEC-21', '02-DEC-21']
This generates the query that uses bind variables for your input list
in_list = ','.join([f" TO_DATE(:d{ind},'DD-MON-RR','NLS_DATE_LANGUAGE = American')" for ind, d in enumerate(date)])
sql_query = "select * from table1 where datestamp in ( " + in_list + " )"
The sql_query generate is
select * from table1 where datestamp in
( TO_DATE(:d0,'DD-MON-RR','NLS_DATE_LANGUAGE = American'), TO_DATE(:d1,'DD-MON-RR','NLS_DATE_LANGUAGE = American') )
Note that the INlist contains one bind variable for each member of your input list.
Note also the usage of to_date with explicite mask and fixing the language to avoid problems with interpretation of the month abbreviation. (e.g. ORA-01843: not a valid month)
Now you can use the query to fetch the data in one pass
cur.prepare(sql_query)
cur.execute(None, date)
res = cur.fetchall()
I'm trying to fetch the data returned by a View in MySQL using Pony ORM but the documentation does not provide any information on how to achieve this (well, I couldn't find any solution until this moment). Can Pony ORM do this? If so, what should I do to get it working?
Here is my MySQL View:
CREATE
ALGORITHM = UNDEFINED
DEFINER = `admin`#`%`
SQL SECURITY DEFINER
VIEW `ResidueCountByDate` AS
SELECT
CAST(`ph`.`DATE` AS DATE) AS `Date`,
COUNT(`art`.`RESIDUE_TYPE_ID`) AS `Aluminum Count`,
COUNT(`prt`.`RESIDUE_TYPE_ID`) AS `PET Count`
FROM
((((`TBL_PROCESS_HISTORY` `ph`
JOIN `TBL_RESIDUE` `pr` ON ((`ph`.`RESIDUE_ID` = `pr`.`RESIDUE_ID`)))
LEFT JOIN `TBL_RESIDUE_TYPE` `prt` ON (((`pr`.`RESIDUE_TYPE_ID` = `prt`.`RESIDUE_TYPE_ID`)
AND (`prt`.`DESCRIPTION` = 'PET'))))
JOIN `TBL_RESIDUE` `ar` ON ((`ph`.`RESIDUE_ID` = `ar`.`RESIDUE_ID`)))
LEFT JOIN `TBL_RESIDUE_TYPE` `art` ON (((`ar`.`RESIDUE_TYPE_ID` = `art`.`RESIDUE_TYPE_ID`)
AND (`art`.`DESCRIPTION` = 'ALUMINUM'))))
GROUP BY CAST(`ph`.`DATE` AS DATE)
ORDER BY CAST(`ph`.`DATE` AS DATE)
You can try one of the following:
1) Define a new entity and specify the view name as a table name for that entity:
class ResidueCountByDate(db.Entity):
dt = PrimaryKey(date, column='Date')
aluminum_count = Required(int, column='Aluminum Count')
pet_count = Required(int, column='PET Count')
After that you can use that entity to select data from the view:
with db_session:
start_date = date(2017, 1, 1)
query = select(rc for rc in ResidueCountByDate if rc.date >= start_date)
for rc in query:
print(rc.date, rc.aluminum_count, rc.pet_count)
By default, a column name is equal to an attribute name. I explicitly specified column for each attribute, because in Python attribute names cannot contain spaces, and usually written in lowercase.
It is possible to explicitly specify table name if it is not equal to entity name:
class ResidueCount(db.Entity):
_table_ = 'ResidueCountByDate'
...
2) You can write raw SQL query without defining any entity:
with db_session:
start_date = date(2017, 1, 1)
rows = db.select('''
SELECT `Date` AS dt, `Aluminum Count` AS ac, `PET Count` AS pc
FROM `ResidueCountByDate`
WHERE `Date` >= $start_date
''')
for row in rows:
print(row[0], row[1], row[2])
print(row.dt, row.ac, row.pc) # the same as previous row
If column name can be used as a Python identifier (i.e. it does not contain spaces or special characters) you can access column value using dot notation as in the last line
I'm trying sum up the values in two columns and truncate my date fields by the day. I've constructed the SQL query to do this(which works):
SELECT date_trunc('day', date) AS Day, SUM(fremont_bridge_nb) AS
Sum_NB, SUM(fremont_bridge_sb) AS Sum_SB FROM bike_count GROUP BY Day
ORDER BY Day;
But I then run into issues when I try to format this into peewee:
Bike_Count.select(fn.date_trunc('day', Bike_Count.date).alias('Day'),
fn.SUM(Bike_Count.fremont_bridge_nb).alias('Sum_NB'),
fn.SUM(Bike_Count.fremont_bridge_sb).alias('Sum_SB'))
.group_by('Day').order_by('Day')
I don't get any errors, but when I print out the variable I stored this in, it shows:
<class 'models.Bike_Count'> SELECT date_trunc(%s, "t1"."date") AS
Day, SUM("t1"."fremont_bridge_nb") AS Sum_NB,
SUM("t1"."fremont_bridge_sb") AS Sum_SB FROM "bike_count" AS t1 ORDER
BY %s ['day', 'Day']
The only thing that I've written in Python to get data successfully is:
Bike_Count.get(Bike_Count.id == 1).date
If you just stick a string into your group by / order by, Peewee will try to parameterize it as a value. This is to avoid SQL injection haxx.
To solve the problem, you can use SQL('Day') in place of 'Day' inside the group_by() and order_by() calls.
Another way is to just stick the function call into the GROUP BY and ORDER BY. Here's how you would do that:
day = fn.date_trunc('day', Bike_Count.date)
nb_sum = fn.SUM(Bike_Count.fremont_bridge_nb)
sb_sum = fn.SUM(Bike_Count.fremont_bridge_sb)
query = (Bike_Count
.select(day.alias('Day'), nb_sum.alias('Sum_NB'), sb_sum.alias('Sum_SB'))
.group_by(day)
.order_by(day))
Or, if you prefer:
query = (Bike_Count
.select(day.alias('Day'), nb_sum.alias('Sum_NB'), sb_sum.alias('Sum_SB'))
.group_by(SQL('Day'))
.order_by(SQL('Day')))