Discrepancies when subtracting dates with timestamp in SQLAlchemy and Postgresql - python

I have some discrepancy when subtracting dates in Postgresql and SQLAlchemy. For instance, I have the following in Postgresql:
SELECT trunc(EXTRACT(EPOCH FROM ('2019-07-05 15:20:10.111497-07:00'::timestamp - '2019-07-04 11:45:17.293328-07:00'::timestamp)))
--99292
and the following query in SQLAlchemy:
date_diff = session.query(func.trunc((func.extract('epoch',
func.date('2019-07-05 15:20:10.111497-07:00'))-
func.extract('epoch',
func.date('2019-07-04 11:45:17.293328-07:00'))))).all()
print(date_diff)
#[(86400.0,)]
We can see that the more exact difference comes from the Postgresql query. How can I get the same result using SQLAlchemy? I have not been able to spot the cause of this difference. If you know, please let me know.
Thanks a lot.

I have never used SQLAlchemy before, but it looks like you are truncating to a date instead of a timestamp or datetime.
Don't worry, this is an easy mistake to make. Date/time libraries can be confusing with their definitions (a date is literally just a date, YYYY-MM-DD, whereas a timestamp includes both the date and the time to some precision).
This is why you get a difference of 86,400 seconds (one day): the query is comparing only the dates of the two values (2019-07-05 - 2019-07-04).
Try using the func.time.as_utc() or something similar to get a timestamp
You want to be comparing the WHOLE timestamp
EDIT: Sorry, didn't see your comment until after posting.
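A minimal sketch of one way to keep the full timestamps on the SQLAlchemy side (an untested assumption, reusing the session object from the question): cast the literals to TIMESTAMP instead of passing them through func.date(), which throws away the time of day.

from sqlalchemy import TIMESTAMP, cast, extract, func, literal

t1 = cast(literal('2019-07-05 15:20:10.111497-07:00'), TIMESTAMP)
t2 = cast(literal('2019-07-04 11:45:17.293328-07:00'), TIMESTAMP)

# Subtract the two timestamps first (yielding an interval), then take its epoch,
# mirroring the raw Postgresql query above.
date_diff = session.query(func.trunc(extract('epoch', t1 - t2))).all()
print(date_diff)
# expected: [(99292.0,)], matching the raw SQL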

Related

How to count the date between 1945 and 1950 year in SQLite? [duplicate]

I can't seem to get reliable results from queries against a sqlite database using a datetime string as a comparison, like so:
select *
from table_1
where mydate >= '1/1/2009' and mydate <= '5/5/2009'
how should I handle datetime comparisons in SQLite?
update:
field mydate is a DateTime datatype
To solve this problem, I store dates as YYYYMMDD. Thus,
where mydate >= '20090101' and mydate <= '20090505'
It just plain WORKS all the time. You may only need to write a parser to handle how users might enter their dates so you can convert them to YYYYMMDD.
SQLite doesn't have dedicated datetime types, but does have a few datetime functions. Follow the string representation formats (actually only formats 1-10) understood by those functions (storing the value as a string) and then you can use them, plus lexicographical comparison on the strings will match datetime comparison (as long as you don't try to compare dates to times or datetimes to times, which doesn't make a whole lot of sense anyway).
Depending on which language you use, you can even get automatic conversion. (Which doesn't apply to comparisons in SQL statements like the example, but will make your life easier.)
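A minimal sketch of that approach using Python's sqlite3 module (table and column names are assumptions): store datetimes as ISO-8601 text so plain string comparison matches datetime order.

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE table_1 (mydate TEXT)')
conn.executemany('INSERT INTO table_1 VALUES (?)',
                 [('2009-01-15 10:30:00',), ('2009-06-01 08:00:00',)])

rows = conn.execute(
    "SELECT * FROM table_1 WHERE mydate >= '2009-01-01' AND mydate <= '2009-05-05'"
).fetchall()
print(rows)  # only the January row matches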
I had the same issue recently, and I solved it like this:
SELECT * FROM table WHERE
strftime('%s', date) BETWEEN strftime('%s', start_date) AND strftime('%s', end_date)
The following is working fine for me using SQLite:
SELECT *
FROM ingresosgastos
WHERE fecharegistro BETWEEN "2010-01-01" AND "2013-01-01"
The following worked for me.
SELECT *
FROM table_log
WHERE DATE(start_time) <= '2017-01-09' AND DATE(start_time) >= '2016-12-21'
SQLite cannot compare dates directly; we need to convert them to seconds since the epoch and cast the result as an integer.
Example
SELECT * FROM Table
WHERE
CAST(strftime('%s', date_field) AS integer) <= CAST(strftime('%s', '2015-01-01') AS integer);
I have a situation where I want data from up to two days ago and up until the end of today.
I arrived at the following.
WHERE dateTimeRecorded between date('now', 'start of day','-2 days')
and date('now', 'start of day', '+1 day')
Ok, technically I also pull in midnight on tomorrow like the original poster, if there was any data, but my data is all historical.
The key thing to remember is that the initial poster excluded all data after 2009-11-15 00:00:00. So, any data recorded at midnight on the 15th was included, but any data after midnight on the 15th was not.
If their query was,
select *
from table_1
where mydate between Datetime('2009-11-13 00:00:00')
and Datetime('2009-11-15 23:59:59')
using the BETWEEN clause for clarity, it would have been slightly better. It still does not take into account leap seconds, in which a minute can actually have more than 60 seconds, but it is good enough for the discussion here :)
I had to store the time with the time-zone information in it, and was able to get queries working with the following format:
"SELECT * FROM events WHERE datetime(date_added) BETWEEN
datetime('2015-03-06 20:11:00 -04:00') AND datetime('2015-03-06 20:13:00 -04:00')"
The time is stored in the database as regular TEXT in the following format:
2015-03-06 20:12:15 -04:00
Right now I am developing with the System.Data.SQLite NuGet package (version 1.0.109.2), which uses SQLite version 3.24.0.
And this works for me.
SELECT * FROM tables WHERE datetime
BETWEEN '2018-10-01 00:00:00' AND '2018-10-10 23:59:59';
I don't need to use the datetime() function. Perhaps they already updated this behaviour in that SQLite version.
Below are the methods to compare dates, but before that we need to identify the format of the dates stored in the DB.
I have dates stored in MM/DD/YYYY HH:MM format, so they have to be compared in that format.
The query below converts the dates into MM/DD/YYYY format and gets data from the last five days until today. The BETWEEN operator helps here: you simply specify the start date AND the end date.
select * from myTable where myColumn BETWEEN strftime('%m/%d/%Y %H:%M', datetime('now','localtime'), '-5 day') AND strftime('%m/%d/%Y %H:%M',datetime('now','localtime'));
The query below uses the greater-than operator (>).
select * from myTable where myColumn > strftime('%m/%d/%Y %H:%M', datetime('now','localtime'), '-5 day');
All the computation I have done uses the current time; you can change the format and date as per your need.
Hope this will help you.
Summved
You could also write up your own user functions to handle dates in the format you choose. SQLite has a fairly simple method for writing your own user functions. For example, I wrote a few to add time durations together.
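For instance, here is a sketch of a user function registered from Python's sqlite3 module (the function name and date format are assumptions, not taken from the answer above):

import sqlite3
from datetime import datetime

def days_between(a, b):
    # Parse two text datetimes and return the whole days between them.
    fmt = '%Y-%m-%d %H:%M:%S'
    return abs(datetime.strptime(a, fmt) - datetime.strptime(b, fmt)).days

conn = sqlite3.connect(':memory:')
conn.create_function('days_between', 2, days_between)
print(conn.execute(
    "SELECT days_between('2009-01-01 00:00:00', '2009-05-05 12:00:00')"
).fetchone())  # (124,)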
I wrote my query as follows:
SELECT COUNT(carSold)
FROM cars_sales_tbl
WHERE date
BETWEEN '2015-04-01' AND '2015-04-30'
AND carType = "Hybrid"
I got the hint from @ifredy's answer. All I did was run this query in iOS, using Objective-C, and it works!
Hope someone who does iOS Development, will get use out of this answer too!
Here is a working example in C#, in three ways:
string tableName = "TestTable";
var startDate = DateTime.Today.ToString("yyyy-MM-dd 00:00:00");          // from today at midnight
var endDate = DateTime.Today.AddDays(1).ToString("yyyy-MM-dd HH:mm:ss"); // whole day
string way1 /*long way*/ = $"SELECT * FROM {tableName} WHERE strftime('%s', DateTime) BETWEEN strftime('%s', '{startDate}') AND strftime('%s', '{endDate}')";
string way2 = $"SELECT * FROM {tableName} WHERE DateTime BETWEEN '{startDate}' AND '{endDate}'";
string way3 = $"SELECT * FROM {tableName} WHERE DateTime >= '{startDate}' AND DateTime <= '{endDate}'";
select *
from table_1
where date(mydate) >= '1/1/2009' and date(mydate) <= '5/5/2009'
This works for me.

Using BigQuery SQL with Built-in Python Functions

I recently started using Google's BigQuery service, and their Python API, to query some large databases. I'm new to SQL, and the BigQuery documentation isn't incredibly helpful for what I'm doing.
Currently I'm looking through the reddit_comments database, and there's a 'created_utc' field that I'm trying to filter by. This created_utc field holds Unix timestamps (i.e. November 1st, 12:00 AM is 1541030400).
I'd like to grab comments day by day (or between two Unix timestamps) but in a way that I'm iterating over each day. Something like:
from datetime import datetime, timedelta

start = datetime.fromtimestamp(1538352000)
end = datetime.fromtimestamp(1541030400)
time = start
while time < end:
    print(time)
    time = time + timedelta(days=1)
Printing the times here yields values like: 2018-09-30 20:00:00
However in order to query, I have to convert back to the Unix timestamp by invoking datetime's timestamp() function like time.timestamp()
The problem is, I'm trying to use the timestamp() function inside the query like so:
SELECT *
FROM 'fh-bigquery.reddit_comments.2018_10'
...
AND (created_utc >= curr_day.timestamp() AND created_utc <= next_day.timestamp())
However, it throws BadRequest: 400 Function not found. Is there a way to use built-in Python functions in the way I've described above? Or does there need to be some alternative?
Everything so far seems pretty intuitive, but it's weird that I can't find much helpful information on this specifically.
You should use BigQuery's Built-in functions
For example:
To get current timestamp - CURRENT_TIMESTAMP()
To get timestamp of start of current date - TIMESTAMP_TRUNC(CURRENT_TIMESTAMP(), DAY)
To get timestamp of start of next date - TIMESTAMP_TRUNC(TIMESTAMP_ADD(CURRENT_TIMESTAMP() , INTERVAL 1 DAY), DAY)
and so on
Also, to convert created_utc to TIMESTAMP type - you can use TIMESTAMP_SECONDS(created_utc)
You can see more in the documentation on TIMESTAMP functions.
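Here is a sketch of the day-by-day iteration done in Python, passing each day's Unix-timestamp boundaries to BigQuery as query parameters rather than calling Python functions inside the SQL (it assumes the google-cloud-bigquery package and credentials are set up; the date range is just an example):

from datetime import datetime, timedelta, timezone
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT *
    FROM `fh-bigquery.reddit_comments.2018_10`
    WHERE created_utc >= @day_start AND created_utc < @day_end
"""

day = datetime(2018, 10, 1, tzinfo=timezone.utc)
end = datetime(2018, 11, 1, tzinfo=timezone.utc)
while day < end:
    next_day = day + timedelta(days=1)
    job_config = bigquery.QueryJobConfig(query_parameters=[
        bigquery.ScalarQueryParameter('day_start', 'INT64', int(day.timestamp())),
        bigquery.ScalarQueryParameter('day_end', 'INT64', int(next_day.timestamp())),
    ])
    rows = client.query(query, job_config=job_config).result()
    # ... process one day's worth of comments here ...
    day = next_day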

Python Pandas to_datetime Out of bounds nanosecond timestamp on a pandas.datetime

I am using Python 2 (I am behind on moving my code over), so perhaps this issue has gone away.
Using pandas, I can create a datetime like this:
import pandas as pd
big_date= pd.datetime(9999,12,31)
print big_date
9999-12-31 00:00:00
big_date2 = pd.to_datetime(big_date)
. . .
Out of bounds nanosecond timestamp: 9999-12-31 00:00:00
I understand the reason for the error in that there are obviously too many nanoseconds in a date that big. I also know that big_date2 = pd.to_datetime(big_date, errors='ignore') would work. However, in my situation, I have a column of what are supposed to be dates (read from SQL server) and I do indeed want it to change invalid data/dates to NaT. In effect, I was using pd.to_datetime as a validity check. To Pandas, on the one hand, 9999-12-31 is a valid date, and on the other, it's not. That means I can't use it and have had to come up with something else.
I've played around with the arguments in pandas to_datetime and not been able to solve this.
I've looked at other questions/problems of this nature, and not found an answer.
I had a similar issue and was able to find a solution.
I have a pandas dataframe with one column that contains a datetime (retrieved from a database table where the column was a DateTime2 data type), but I need to be able to represent dates that are further in the future than the Timestamp.max value.
Fortunately, I didn't need to worry about the time part of the datetime column - it was actually always 00:00:00 (I didn't create the database design and, yes, it probably should have been a Date data type and not a DateTime2 data type). So I was able to get round the issue by converting the pandas dataframe column to just a date type. For example:
import datetime

for i, row in df.iterrows():
    df.set_value(i, 'DateColumn', datetime.datetime(9999, 12, 31).date())
sets all of the values in the column to the date 9999-12-31 and you don't receive any errors when using this column anymore.
So, if you can afford to lose the time part of the date you are trying to use you can work round the limitation of the datetime values in the dataframe by converting to a date.
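A sketch of the same idea without iterrows(), assuming the dataframe df already exists and its 'DateColumn' holds plain Python datetime objects (object dtype) as read from the database:

import pandas as pd

# Keep only the date part of each value; the column stays object dtype,
# so pandas never tries to squeeze 9999-12-31 into a nanosecond Timestamp.
df['DateColumn'] = df['DateColumn'].apply(lambda d: d.date() if pd.notnull(d) else d)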

PostgreSQL time conversion/formatting from "24:00:00" to "00:00:00" in SELECT statement

I have some market data with time fields stored in a PostgreSQL database. PostgreSQL uses the format "00:00:00" to "24:00:00" to store times (see http://www.postgresql.org/docs/9.1/static/datatype-datetime.html) which works perfectly as long as I only work within the database.
The problem is that I have to do some data processing (using Python) afterwards and the Python datetime.time format only supports the hours from "00:00:00" to "23:00:00". So if I fetch a record that contains "24:00:00" using the psycopg2 module I get an error "ValueError: hour must be in 0..23" because the time field cannot be converted properly.
My idea for a clean workaround is to convert the time field that contains the "24:00:00" hour already in the SELECT statement to "00:00:00". This would solve the problem as the converter function would not fail afterwards.
I have already looked at the formatting functions (see http://www.postgresql.org/docs/9.4/static/functions-formatting.html) but could not find anything suitable.
Is there a way to realize this using SQL?
Thanks in advance!
The problem with the value '24:00:00'::time is clearly a psycopg2 error. While we wait for Daniele or me to fix it (if possible at all), here's a workaround: just use a CASE expression to check for the specific value that causes the error. If your table is named tab and the time column is t then you can do:
SELECT CASE t WHEN '24:00:00'::time THEN '0:00:00'::time ELSE t END FROM tab;
And everything should work.
Note that this is a problem only if you extract a time column. It seems that PostgreSQL converts timestamp columns (even ones representing a leap second) to the corresponding midnight, i.e., 2012-6-30T24:00:00 (30 June 2012 leap second) results in 2012-7-1T00:00:00.
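For completeness, here is a minimal sketch of that workaround as it would be used from Python with psycopg2 (the connection string, table name tab and column name t are assumptions carried over from the answer):

import psycopg2

conn = psycopg2.connect('dbname=market_data')
cur = conn.cursor()
# The CASE folds '24:00:00' to '00:00:00' on the server, so psycopg2 only ever
# sees values it can map to datetime.time.
cur.execute(
    "SELECT CASE t WHEN '24:00:00'::time THEN '0:00:00'::time ELSE t END FROM tab;"
)
rows = cur.fetchall()
print(rows)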
When you add a time to a date, the result is a timestamp, which you can cast to time:
select (current_date + market_data_time)::time;

Comparing a python date variable with timestamp from select query

I want to take some action based on comparing two dates. Date 1 is stored in a python variable. Date 2 is retrieved from the database in the select statement. For example I want to retrieve some records from the database where the associated date in the record (in form of the timestamp) is later than the date defined by the python variable. Preferably, I would like the comparison to be in readable date format rather than in timestamps.
I am a beginner with python.
----edit -----
Sorry for being ambiguous. Here's what I am trying to do:
import MySQLdb as mdb
from datetime import datetime
from datetime import date
import time
conn = mdb.connect('localhost','root','root','my_db')
cur = conn.cursor()
right_now = date.today()  # python date
This is the part which I want to figure out: the database has a table with a timestamp column. I want to compare that timestamp with this date and then retrieve records based on that comparison. For example, I want to retrieve all records whose timestamp is later than this date.
cur.execute("SELECT created from node WHERE timestamp > right_now")
results = cur.fetchall()
for row in results:
    print row
First of all, I guess Date 1 (the python variable) is a datetime object. http://docs.python.org/2/library/datetime.html
As far as I have used it, MySQLdb gives you results as (python) datetime objects if the SQL type was datetime.
So actually you have nothing special to do: you can use python datetime comparison methods on date 1 and date 2.
I am a little bit confused about "comparison to be in readable date format rather than in timestamps". I mean, the timestamp is readable enough, right?
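If you do want the comparison to happen inside the query itself, here is a minimal sketch that passes the Python date as a bind parameter so MySQLdb formats it for you (the table node and column created come from the question; treating created as the column to compare is an assumption):

import MySQLdb as mdb
from datetime import date

conn = mdb.connect('localhost', 'root', 'root', 'my_db')
cur = conn.cursor()

right_now = date.today()  # python date
# %s is the MySQLdb placeholder; the driver converts the date object for you
cur.execute("SELECT created FROM node WHERE created > %s", (right_now,))
for row in cur.fetchall():
    print(row)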
If Date 1 is timestamp data, then you simply do the comparison. If not, then convert it to a timestamp, or convert the date in the database to a date type; both ways work.
If you are asking how to write the code to do the comparison, you could use either '_mysql' or sqlalchemy to help you. The detailed syntax can be found anywhere.
Anyway, the question itself is not clear enough, so the answer is blurry, too.
