Comparing timestamp to date in Python SQLAlchemy

I have an API in Python using sqlalchemy.
I have a string which represents a date in ISO format. I convert it using datetime.strptime like so: datetime.strptime(ToActionDateTime, '%Y-%m-%dZ').
Now I have to compare a table column, which is a timestamp, to that date.
After converting the initial ISO string, an example result looks like this: 2018-12-06 00:00:00. I have to compare by date only, ignoring the time, but I can't manage to get it right. Any help would be appreciated.
Sample Python code:
ToActionDateTimeObj = datetime.strptime(ToActionDateTime, '%Y-%m-%dZ')
query = query.filter(db.c.Audit.ActionDateTime <= ToActionDateTimeObj)
Edit:
I have also tried applying cast to both sides of the comparison, but it does not work either. I can't get the right result when the selected date matches the date of the timestamp.
from sqlalchemy import Date, cast
ToActionDateTimeObj = datetime.strptime(ToActionDateTime, '%Y-%m-%dZ')
query = query.filter(cast(db.c.Audit.ActionDateTime, Date) <= cast(ToActionDateTimeObj, Date))

Since the Oracle DATE datatype actually stores both date and time, a cast to DATE will not rid the value of its time portion, as it would in most other DBMS. Instead, the function TRUNC(date [, fmt]) can be used to reduce a value to its date portion only. In its single-argument form it truncates to the day (the time portion is reset to midnight), or in other words uses 'DD' as the default format model:
from datetime import datetime
from sqlalchemy import func
ToActionDateObj = datetime.strptime(ToActionDateTime, '%Y-%m-%dZ').date()
...
query = query.filter(func.trunc(db.c.Audit.ActionDateTime) <= ToActionDateObj)
If using the 2-argument form, then the precision specifier for day precision is either 'DDD', 'DD', or 'J'.
However, wrapping the column in TRUNC prevents the database from using an ordinary index on ActionDateTime. To keep the query index-friendly, increment ToActionDateObj by one day and use a strict less-than comparison:
from datetime import datetime, timedelta
ToActionDateObj = datetime.strptime(ToActionDateTime, '%Y-%m-%dZ').date()
ToActionDateObj += timedelta(days=1)
...
query = query.filter(db.c.Audit.ActionDateTime < ToActionDateObj)
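To see why the two filters select the same rows, the comparison can be checked in plain Python without a database. This is only a sketch: the sample timestamps are made up, and calling .date() on each value stands in for Oracle's TRUNC.

```python
from datetime import datetime, timedelta

# Hypothetical sample values standing in for Audit.ActionDateTime rows.
timestamps = [
    datetime(2018, 12, 5, 23, 59, 59),
    datetime(2018, 12, 6, 0, 0, 0),
    datetime(2018, 12, 6, 17, 30, 0),
    datetime(2018, 12, 7, 0, 0, 0),
]

to_date = datetime.strptime("2018-12-06Z", "%Y-%m-%dZ").date()

# Variant 1: truncate each timestamp to its date, then compare with <=.
by_trunc = [ts for ts in timestamps if ts.date() <= to_date]

# Variant 2: leave the timestamp untouched and compare with < against
# midnight of the following day.
upper_bound = datetime.combine(to_date + timedelta(days=1), datetime.min.time())
by_range = [ts for ts in timestamps if ts < upper_bound]
```

Both lists contain the same three timestamps; only the value at midnight on 2018-12-07 is excluded. The second variant leaves the column expression bare, which is what allows an index on it to be used.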

Related

Using a table in python UDF in Redshift

I need to create a Python UDF (user-defined function) in Redshift that will be called from another procedure. This Python UDF takes two date values, walks through the dates between the given start and end date, and checks whether each of these intermediate dates occurs in some list.
This list needs to collect its values from another table's column. The issue is that Python UDFs are defined in the plpythonu language and they don't recognize any SQL. What should I do to build this list from the table's column?
This is my function:
create or replace function test_tmp (ending_date date, starting_date date)
returns integer
stable
as $$
    from datetime import timedelta
    def get_working_days(ending_date, starting_date):
        days = 0
        if starting_date is not None and ending_date is not None:
            for n in range(int((ending_date - starting_date).days)):
                btw_date = (starting_date + timedelta(n)).strftime('%Y-%m-%d')
                # date_list is not defined yet; it should hold the working days as strings
                if btw_date in date_list:
                    days = days + 1
            return days
        return 0
    return get_working_days(ending_date, starting_date)
$$ language plpythonu;
Now, I need to create this date_list as something like:
date_list = [str(each["WORK_DATE"]) for each in (select WORK_DATE from public.date_list_table).collect()]
But, using this line in the function obviously gives an error, as select WORK_DATE from public.date_list_table is SQL.
Following is the structure of table public.date_list_table:
CREATE TABLE public.date_list_table
(
work_date date ENCODE az64
)
DISTSTYLE EVEN;
Some sample values for this table (actually this table stores only the working days values for the entire year):
insert into date_list_table values ('2021-07-01'),('2021-06-30'),('2021-06-29');
An Amazon Redshift scalar UDF cannot access any tables. It needs to be self-contained, with all the necessary information passed into the function. Alternatively, you could store the date information inside the function so it doesn't need to access the table (not unreasonable, since it only needs to hold exceptions such as public holidays on weekdays).
It appears that your use case is to calculate the number of working days between two dates. One traditional way to solve this is to create a calendar table with one row per day and columns providing information such as:
Work Day (Boolean)
Weekend (Boolean)
Public Holiday (Boolean)
Month
Quarter
Day of Year
etc.
You can then JOIN or query the table to identify the desired information, such as:
SELECT COUNT(*) FROM calendar WHERE work_day AND date BETWEEN start_date AND end_date
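A minimal sketch of the calendar-table approach, using an in-memory SQLite database from Python purely for illustration (Redshift syntax and types will differ). The table layout, the column names day and work_day, and the sample dates are assumptions and do not come from the original question.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (day TEXT PRIMARY KEY, work_day INTEGER NOT NULL)")

# Populate one row per day for a short, made-up range.
# Here only weekends count as non-working days; holidays are ignored.
first_day = date(2021, 6, 28)  # a Monday
rows = []
for offset in range(14):
    d = first_day + timedelta(days=offset)
    rows.append((d.isoformat(), 1 if d.weekday() < 5 else 0))
conn.executemany("INSERT INTO calendar VALUES (?, ?)", rows)


def count_working_days(conn, start_date, end_date):
    """Count working days in the inclusive range [start_date, end_date]."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM calendar WHERE work_day = 1 AND day BETWEEN ? AND ?",
        (start_date.isoformat(), end_date.isoformat()),
    )
    return cur.fetchone()[0]


working_days = count_working_days(conn, date(2021, 6, 28), date(2021, 7, 4))
```

Because the dates are stored as ISO-8601 text, BETWEEN compares them correctly as strings in SQLite; with a real DATE column the same query shape applies.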

Filtering and extracting specific dates from SQLite file using python(Anaconda)

I'm trying to filter and extract a specific date with the month 2 inside my SQLite database using python and calculating their average monthly prices. This is what I've got so far...
The CurrentMonth variable currently holds the value 02. I keep receiving invalid syntax errors. My database is here:
Your syntax is invalid in SQLite. I think that you mean:
select * from stock_stats where strftime('%m', date) + 0 = ?
Rationale: strftime('%m', date) extracts the month part from the date column, and returns a string like '02'. You can just add 0 to force the conversion to a numeric value.
Note that:
you should also filter on the year part, to avoid mixing data from different years
a more efficient solution would be to pass 2 parameters, that define the start and end of the date range; this would avoid the need to use date functions on the date column.
date >= ? and date < ?
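The range-based filter could look like the following sketch. It assumes the date column stores ISO-8601 text (YYYY-MM-DD) and that the table has a price column; both are guesses, since the original schema is not shown, and the sample rows are invented.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock_stats (date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO stock_stats VALUES (?, ?)",
    [
        ("2020-01-31", 10.0),
        ("2020-02-01", 20.0),
        ("2020-02-29", 40.0),
        ("2020-03-01", 80.0),
        ("2019-02-15", 160.0),  # same month, different year: must be excluded
    ],
)


def month_bounds(year, month):
    """Return (first day of the month, first day of the next month) as ISO strings."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


start, end = month_bounds(2020, 2)
avg_price = conn.execute(
    "SELECT AVG(price) FROM stock_stats WHERE date >= ? AND date < ?",
    (start, end),
).fetchone()[0]
```

The half-open range (inclusive start, exclusive end) handles months of any length and restricts the result to a single year without calling a date function on the column.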

Changing Period to Datetime

My goal is to convert period to datetime.
If Life Was Easy:
master_df = master_df['Month'].to_datetime()
Back Story:
I built a new dataFrame that originally summed the monthly totals and made a 'Month' column by converting a timestamp to period. Now I want to convert that time period back to a timestamp so that I can create plots using matplotlib.
I have tried following:
Reading the docs for Period.to_timestamp.
Converting to a string and then back to datetime. Still keeps the period issue and won't convert.
Following a couple of similar questions on Stack Overflow, but I could not get them to work.
A simple goal would be to plot the following:
plot.bar(m_totals['Month'], m_totals['Showroom Visits']);
This is the error I get if I try to use a period dtype in my charts
ValueError: view limit minimum 0.0 is less than 1 and is an invalid Matplotlib date value.
This often happens if you pass a non-datetime value to an axis that has datetime units.
Additional Material:
Code I used to create the Month column (where period issue was created):
master_df['Month'] = master_df['Entry Date'].dt.to_period('M')
Codes I used to group to monthly totals:
m_sums = master_df.groupby(['DealerName','Month']).sum().drop(columns={'Avg. Response Time','Closing Percent'})
m_means = master_df.groupby(['DealerName','Month']).mean()
m_means = m_means[['Avg. Response Time','Closing Percent']]
m_totals = m_sums.join(m_means)
m_totals.reset_index(inplace=True)
m_totals
Resulting DataFrame:
I was able to cast the period type to string then to datetime. Just could not go straight from period to datetime.
m_totals['Month'] = m_totals['Month'].astype(str)
m_totals['Month'] = pd.to_datetime(m_totals['Month'])
m_totals.dtypes
I wish I did not get downvoted for not providing the entire dataFrame.
First change it to str then to date
index=pd.period_range(start='1949-01',periods=144 ,freq='M')
type(index)
#changing period to date
index=index.astype(str)
index=pd.to_datetime(index)
df.set_index(index,inplace=True)
type(df.index)
df.info()
Another potential solution is to use to_timestamp. For example: m_totals['Month'] = m_totals['Month'].dt.to_timestamp()
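A small sketch of the to_timestamp route, assuming pandas is installed. The DataFrame below is a made-up stand-in for m_totals; only the Month and Showroom Visits column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for m_totals with a period-typed 'Month' column.
m_totals = pd.DataFrame({
    "Month": pd.period_range(start="2021-01", periods=3, freq="M"),
    "Showroom Visits": [12, 7, 9],
})

# Each monthly period becomes the timestamp of its first day
# (how='start' is the default for to_timestamp).
m_totals["Month"] = m_totals["Month"].dt.to_timestamp()
month_starts = list(m_totals["Month"].dt.strftime("%Y-%m-%d"))
```

After the conversion the column has a datetime64 dtype, which is the kind of value matplotlib expects on a date axis.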

Convert dataframe to date format

I'm reading a sql query and using it as dataframe columns.
query = "SELECT count(*) as numRecords, YEARWEEK(date) as weekNum FROM events GROUP BY YEARWEEK(date)"
df = pd.read_sql(query, connection)
date = df['weekNum']
records = df['numRecords']
The date column, which holds int64 values, looks like this:
...
201850
201851
201852
201901
201902
...
How can I convert the column to real date values (instead of int64), so that when I plot this, the axis does not break because of the year change?
I'm using matplotlib.
All you need to do is use:
pd.to_datetime(date, format='%Y%W')
Edited:
It gave an error saying that a day must be specified to convert the value into a datetime. To tackle that, we append '-1' to the end (which means Monday; you can use any value from 0 to 6, where each represents a day of the week and 0 is Sunday).
Then parse the day of the week using an additional %w in the format and it will work:
pd.to_datetime(date.apply(lambda x: str(x)+'-1'), format="%Y%W-%w")
Remember that for any of the above operations, the values in the date DataFrame or Series should be strings. If they are not, you can convert them first using date.astype(str).
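The same format string can be tried with the standard library alone, which makes it easy to see which calendar date each week number maps to. This is a sketch under one assumption worth stating: Python's %W counts weeks starting on Monday, with days before the first Monday falling in week 0, while MySQL's YEARWEEK numbering depends on its mode argument, so the two may not line up exactly. The helper name yearweek_to_date is made up for illustration.

```python
from datetime import datetime


def yearweek_to_date(yearweek):
    """Map an integer like 201852 to the Monday of that %W week."""
    # Appending '-1' selects Monday as the day of the week (%w).
    return datetime.strptime(f"{yearweek}-1", "%Y%W-%w")


weeks = [201850, 201851, 201852, 201901, 201902]
dates = [yearweek_to_date(w) for w in weeks]
```

The resulting datetimes increase monotonically across the year boundary, so a plot using them on the x-axis no longer jumps between 201852 and 201901.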

How can I convert a timestamp string of the form "%d%H%MZ" to a datetime object?

I have timestamp strings of the form "091250Z", where the first two numbers are the date and the last four numbers are the hours and minutes. The "Z" indicates UTC. Assuming the timestamp corresponds to the current month and year, how can this string be converted reliably to a datetime object?
I have the parsing to a timedelta sorted, but the task quickly becomes nontrivial when going further and I'm not sure how to proceed:
datetime.strptime("091250Z", "%d%H%MZ")
What you need is to replace the year and month of your existing datetime object.
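from datetime import datetime, timezone
now = datetime.now(timezone.utc)  # the "Z" suffix means UTC, so take year and month from UTC
your_datetime_obj = datetime.strptime("091250Z", "%d%H%MZ")
new_datetime_obj = your_datetime_obj.replace(year=now.year, month=now.month, tzinfo=timezone.utc)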
Like this? You've basically already done it, you just needed to assign it to a variable
from datetime import datetime
dt = datetime.strptime('091250Z', '%d%H%MZ')
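Putting the pieces together, a small helper might look like the sketch below. The function name parse_day_time and its optional now parameter are inventions for illustration; passing a fixed now makes the behaviour reproducible. Note that replace() raises ValueError if the parsed day does not exist in the target month (for example day 31 in a 30-day month), and this sketch does not handle that case.

```python
from datetime import datetime, timezone


def parse_day_time(stamp, now=None):
    """Parse a 'DDHHMMZ' string, taking year and month from `now` (UTC)."""
    if now is None:
        now = datetime.now(timezone.utc)
    # strptime fills in a default year and month, which are replaced below.
    parsed = datetime.strptime(stamp, "%d%H%MZ")
    return parsed.replace(year=now.year, month=now.month, tzinfo=timezone.utc)


reference = datetime(2021, 7, 15, tzinfo=timezone.utc)
result = parse_day_time("091250Z", now=reference)
```

With the reference date above, result is 9 July 2021 at 12:50 UTC as a timezone-aware datetime.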
