pandas.tslib.Timestamp date matching - python

I am trying to check whether item_date contains today's date. But even when I hard-code the date, 'True' is never printed. Does anyone know how to solve this?
for item_date in buy_crossing_dates:
    print item_date
    print type(item_date)
    if item_date == '2015-03-25 00:00:00':
        print 'True'
result:
2015-03-25 00:00:00
<class 'pandas.tslib.Timestamp'>

Two options for checking for today's date in a pandas Series of Timestamps ...
import pandas as pd
# option 1 - compare using python datetime.date objects
dates = pd.Series(pd.date_range('2015-01-01', '2016-12-31')) # Timestamps
python_dates = pd.Series([x.date() for x in dates]) # datetime.date
today = pd.Timestamp('now').date() # datetime.date
print(python_dates[python_dates == today])
# option 2 - compare pandas.Timestamp objects using Series.dt accessor
dates = pd.Series(pd.date_range('2015-01-01', '2016-12-31')) # Timestamps
today = pd.Timestamp('now') # Timestamp
print(dates[(dates.dt.year == today.year) &
            (dates.dt.month == today.month) &
            (dates.dt.day == today.day)])
Note: option one uses a list comprehension to convert a pandas Series of Timestamps to a Series of datetime.date objects (using the pandas.Timestamp.date() method).
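A third possibility, offered here only as a sketch: convert the string to a Timestamp before comparing, and strip the time of day with normalize() so that only the calendar date matters. The helper name is_same_day and the fixed "today" value are assumptions made for illustration; in real code today would come from pd.Timestamp('now').

```python
import pandas as pd


def is_same_day(ts, day):
    """Return True if Timestamp `ts` falls on the calendar day `day`.

    `day` may be a string or a Timestamp; both sides are reduced to midnight.
    (Hypothetical helper, not part of pandas.)
    """
    return ts.normalize() == pd.Timestamp(day).normalize()


# Scalar case: compare Timestamp with Timestamp, not with a raw string.
item_date = pd.Timestamp('2015-03-25 00:00:00')
print(is_same_day(item_date, '2015-03-25'))

# Series case: normalize both sides, then use a boolean mask.
dates = pd.Series(pd.date_range('2015-03-24', '2015-03-26'))
today = pd.Timestamp('2015-03-25 13:45:00')  # stand-in for pd.Timestamp('now')
matches = dates[dates.dt.normalize() == today.normalize()]
print(matches)
```

All values here are timezone-naive; with timezone-aware data both sides would need the same timezone before comparing.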

Related

How to compare only day, month, and year using timestamp?

How can we check whether two timestamps fall on the same date, irrespective of the time?
For example:
from datetime import datetime

# millisecond Unix timestamps
date1 = 1508651293229
date2 = 1508651293220
date1 = datetime.fromtimestamp(int(date1) / 1e3)
date2 = datetime.fromtimestamp(int(date2) / 1e3)
if(date1 == date2):
    print(True)
But this compares the entire timestamp, including the time, and I only want to compare the day, month, and year.
I looked through the documentation but couldn't find anything relevant.
datetime.datetime instances have a date method that returns a datetime.date object, ignoring the time of the original value.
if date1.date() == date2.date():
You can also create the datetime.date instances directly if you don't care about the time, as datetime.date.fromtimestamp also exists.
# requires `import datetime` (the module); date1 here is still the integer millisecond value
date1 = datetime.date.fromtimestamp(date1 // 1000)
Try:
if date1.date() == date2.date():
    print('True')
# and indeed, they are:
True
datetime.date() returns a date object with the same year, month, and day
In your case, you should use:
if(date1.date() == date2.date()):
    print(True)
You can also shorten this by simply doing
print(date1.date() == date2.date())
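Putting the pieces together, here is a minimal sketch that goes from millisecond Unix timestamps straight to a date-only comparison. The function name same_calendar_day is hypothetical, and interpreting the timestamps in UTC is an assumption made so that the result does not depend on the machine's local timezone; pass a different tz if local dates are what you need.

```python
from datetime import datetime, timezone


def same_calendar_day(ms_a, ms_b, tz=timezone.utc):
    """Return True if two millisecond Unix timestamps fall on the same date in `tz`."""
    day_a = datetime.fromtimestamp(ms_a / 1000, tz=tz).date()
    day_b = datetime.fromtimestamp(ms_b / 1000, tz=tz).date()
    return day_a == day_b


date1 = 1508651293229
date2 = 1508651293220

# The two values differ by 9 ms, so they share a calendar day.
print(same_calendar_day(date1, date2))

# Exactly 24 hours later is always a different calendar day in UTC.
print(same_calendar_day(date1, date1 + 24 * 60 * 60 * 1000))
```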

Python pandas print value where column = X and row = Y

I am relatively new to working with python and pandas and I'm trying to get the value of a cell in an excel sheet with python. To make matters worse, the excel sheet I'm working with doesn't have proper column names.
Here's what the dataframe looks like:
Sign  Name       2020-09-05  2020-09-06  2020-09-07
JD    John Doe   A           A           B
MP    Max Power  B           B           A
What I want to do is to print the value of the "cell" where the column header is the current date and the sign is "MP".
What I've tried so far is this:
import pandas as pd
from datetime import datetime
time=datetime.now()
relevant_sheet = time.strftime("%B" " %y")
current_day = time.strftime("%Y-%m-%d")
excel_file = pd.ExcelFile('theexcelfile.xlsx')
df = pd.read_excel(excel_file, relevant_sheet, skiprows=[0,1,2,3]) # I don't need these
relevant_value = df.loc[df['Sign'] == "MP", df[current_day]]
This gives me a key error for current_day:
KeyError: '2020-09-07'
To fully disclose any possible issue with the real dataframe I'm working with: If I just print the dataframe, I get columns that look like this:
2020-09-01 00:00:00
Which is why I also tried:
current_day = time.strftime("%Y-%m-%d 00:00:00")
Of course I also "manually" tried all kinds of date formats, but to no avail. Am I going entirely wrong about this? Is this excel screwing with me?
If the column names are datetimes, use Timestamp.floor to remove the time part (set it to 00:00:00):
current_day = pd.to_datetime('now').floor('d')
print (current_day)
2020-09-07 00:00:00
relevant_value = df.loc[df['Sign'] == "MP", current_day]
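If the column names are datetimes stored as strings, keep current_day as a string in the same format (for example, the strftime result from the question) and use: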
relevant_value = df.loc[df['Sign'] == "MP", current_day]
If the column names are Python date objects:
current_day = pd.to_datetime('now').date()
print (current_day)
2020-09-07
relevant_value = df.loc[df['Sign'] == "MP", current_day]
You need to pass only the column name, not df[col_name].
See the .loc[] documentation for details.
df.loc[df['Sign'] == "MP", current_day]
Use df.loc to select the relevant column.
Get the column label by taking today's date and converting it to a string.
Then select the rows where Sign is MP:
# assumes `import datetime as dt` and string column labels such as '2020-09-07'
df.loc[df['Sign'] == 'MP', dt.date.today().strftime('%Y-%m-%d')]
Minor changes to how you are doing things will get you the result.
Step 1: strip out the 00:00:00 (if you want just the date value)
Step 2: your condition had an extra df[]
#strip last part of the column names if column starts with 2020
df.rename(columns=lambda x: x[:10] if x[:4] == '2020' else x, inplace=True)
current_day = datetime.date(datetime.now()).strftime("%Y-%m-%d")
relevant_value = df.loc[df['Sign'] == 'MP', current_day] #does not need df before current_day
print(relevant_value)
Since you are already using pandas, you don't need to import datetime. This gives you the date in yyyy-mm-dd format:
current_day = pd.to_datetime('now').strftime("%Y-%m-%d")
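The answers above depend on what type the column labels actually have, so a small self-contained sketch may help. It builds a stand-in for the Excel sheet with Timestamp column labels (an assumption based on the "2020-09-01 00:00:00" headers printed in the question) and uses a fixed timestamp in place of "now". It shows both routes: looking up with a Timestamp label, and converting the labels to strings first.

```python
import pandas as pd

# Stand-in for the sheet: two text columns plus date columns whose labels are Timestamps.
df = pd.DataFrame({
    'Sign': ['JD', 'MP'],
    'Name': ['John Doe', 'Max Power'],
    pd.Timestamp('2020-09-06'): ['A', 'B'],
    pd.Timestamp('2020-09-07'): ['B', 'A'],
})

# Route 1: the label is a Timestamp at midnight, so drop the time of day before the lookup.
now = pd.Timestamp('2020-09-07 14:30')  # stand-in for pd.Timestamp('now')
current_day = now.normalize()
value = df.loc[df['Sign'] == 'MP', current_day]
print(value.iloc[0])

# Route 2: turn Timestamp labels into 'YYYY-MM-DD' strings, then look up by string.
labelled = df.rename(
    columns=lambda c: c.strftime('%Y-%m-%d') if isinstance(c, pd.Timestamp) else c
)
value_by_string = labelled.loc[labelled['Sign'] == 'MP', now.strftime('%Y-%m-%d')]
print(value_by_string.iloc[0])
```

Printing list(df.columns) on the real DataFrame is a quick way to see which of the two routes applies.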

How to perform logical tests on time values in a pandas dataframe

I have an excel sheet where one column contains a time field, where the values are the time of day entered as four digits: i.e. 0845, 1630, 1000.
I've read this into a pandas dataframe for analysis, one piece of which is labeling each time as day or evening. To do this, I first changed the datatype and format:
# Get start time as time
df['START_TIME'] = pd.to_datetime(df['START_TIME'],format='%H%M').dt.time
Which gets the values looking like:
08:45:00
16:30:00
10:00:00
The new dtype is object.
When I try to perform a logical test on that field, i.e.
# Create indicator of whether course begins before or after 4:00 PM
df['DAY COURSE INDICATOR'] = df['START_TIME'] < '16:00:00'
I get a Type Error:
TypeError: '<' not supported between instances of 'datetime.time' and 'str'
or syntax error if I remove the quotes.
What is the best way to create that indicator? How do I work with stand-alone time values? Or am I better off just leaving them as integers?
You can't compare a datetime.time and a str but you certainly can compare a datetime.time and a datetime.time:
import datetime
df['DAY COURSE INDICATOR'] = df['START_TIME'] < datetime.time(16, 0)
You can also compare the values as full datetimes, as long as both sides are parsed with the same format (this assumes START_TIME holds strings such as '08:45:00', not datetime.time objects):
pd.to_datetime(df['START_TIME'], format='%H:%M:%S') < pd.to_datetime('16:00:00', format='%H:%M:%S')
Example:
>>> df = pd.DataFrame({'START_TIME': ['08:45:00']})
>>> pd.to_datetime(df['START_TIME'], format='%H:%M:%S') < pd.to_datetime('16:00:00', format='%H:%M:%S')
0    True
Name: START_TIME, dtype: bool
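On the question of leaving the values as integers: one option, sketched here under the assumption that START_TIME is read as four-digit strings such as '0845', is to parse the times once and compare minutes since midnight. This sidesteps object-dtype comparisons altogether. The sample data is made up for illustration.

```python
import pandas as pd

# Four-digit time strings, as described in the question (sample values).
df = pd.DataFrame({'START_TIME': ['0845', '1630', '1000']})

# Parse once; the date part defaults to 1900-01-01 and is ignored below.
parsed = pd.to_datetime(df['START_TIME'], format='%H%M')

# Minutes since midnight is a plain integer column, so ordinary comparisons work.
minutes = parsed.dt.hour * 60 + parsed.dt.minute
df['DAY COURSE INDICATOR'] = minutes < 16 * 60  # starts before 4:00 PM

print(df)
```

If Excel delivers the column as integers, leading zeros are lost (845 instead of '0845'); converting with astype(str).str.zfill(4) before parsing is one way to restore them.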

How do I change the Date but not the Time of a Timestamp within a dataframe column?

Python 3.6.0
I am importing a file with Unix timestamps.
I’m converting them to Pandas datetime and rounding to 10 minutes (12:00, 12:10, 12:20,…)
The data is collected from within a specified time period, but from different dates.
For our analysis, we want to change all dates to the same dates before doing a resampling.
At present we have a reduce_to_date that is the target for all dates.
current_date = pd.to_datetime('2017-04-05') #This will later be dynamic
reduce_to_date = current_date - pd.DateOffset(days=7)
I’ve tried to find an easy way to change the date in a series without changing the time.
I was trying to avoid lengthy conversions with .strftime().
One method that I've almost settled on is to add the difference between reduce_to_date and df['Timestamp'] to df['Timestamp']. However, I was trying to use the .date() method, and that only works on a single element, not on the Series.
GOOD!
passed_df['Timestamp'][0] = passed_df['Timestamp'][0] + (reduce_to_date.date() - passed_df['Timestamp'][0].date())
NOT GOOD
passed_df['Timestamp'][:] = passed_df['Timestamp'][:] + (reduce_to_date.date() - passed_df['Timestamp'][:].date())
AttributeError: 'Series' object has no attribute 'date'
I can use a loop:
x=1
for line in passed_df['Timestamp']:
    passed_df['Timestamp'][x] = line + (reduce_to_date.date() - line.date())
    x+=1
But this throws a warning:
C:\Users\elx65i5\Documents\Lightweight Logging\newmain.py:60: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
The goal is to have all dates the same, but leave the original time.
If we can simply specify the replacement date, that’s great.
If we can use mathematics and change each date according to a time delta, equally as great.
Can we accomplish this in a vectorized fashion without using .strftime() or a lengthy procedure?
If I understand correctly, you can simply subtract an offset
passed_df['Timestamp'] -= pd.offsets.Day(7)
demo
passed_df=pd.DataFrame(dict(
    Timestamp=pd.to_datetime(['2017-04-05 15:21:03', '2017-04-05 19:10:52'])
))
# Make sure your `Timestamp` column is datetime.
# Mine is because I constructed it that way.
# Use
# passed_df['Timestamp'] = pd.to_datetime(passed_df['Timestamp'])
passed_df['Timestamp'] -= pd.offsets.Day(7)
print(passed_df)
            Timestamp
0 2017-03-29 15:21:03
1 2017-03-29 19:10:52
using strftime
Though this is not ideal, I wanted to make a point that you absolutely can use strftime. When your column is datetime, you can use strftime via the dt date accessor with dt.strftime. You can create a dynamic column where you specify the target date like this:
pd.to_datetime(passed_df.Timestamp.dt.strftime('{} %H:%M:%S'.format('2017-03-29')))
0 2017-03-29 15:21:03
1 2017-03-29 19:10:52
Name: Timestamp, dtype: datetime64[ns]
I think you need to convert df['Timestamp'].dt.date with to_datetime, because .dt.date returns Python date objects, not pandas datetimes:
df=pd.DataFrame({'Timestamp':pd.to_datetime(['2017-04-05 15:21:03','2017-04-05 19:10:52'])})
print (df)
Timestamp
0 2017-04-05 15:21:03
1 2017-04-05 19:10:52
current_date = pd.to_datetime('2017-04-05')
reduce_to_date = current_date - pd.DateOffset(days=7)
# time of day (timestamp minus its own midnight) added to the target date
df['Timestamp'] = df['Timestamp'] - pd.to_datetime(df['Timestamp'].dt.date) + reduce_to_date
print (df)
            Timestamp
0 2017-03-29 15:21:03
1 2017-03-29 19:10:52
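When the rows come from several different dates, subtracting one fixed offset no longer lands them all on the same day. A possible vectorized sketch for that case: split each timestamp into "midnight of its own day" plus "time since midnight" using dt.normalize(), then attach the time part to the target date. The helper name set_date_keep_time and the sample values are assumptions for illustration.

```python
import pandas as pd


def set_date_keep_time(timestamps, target_date):
    """Move every timestamp in a datetime Series to `target_date`, keeping its time of day.

    (Hypothetical helper; assumes timezone-naive datetime64 values.)
    """
    # Timedelta elapsed since midnight of each row's own date.
    time_of_day = timestamps - timestamps.dt.normalize()
    # Timedelta Series + Timestamp gives a datetime Series on the target date.
    return time_of_day + pd.Timestamp(target_date).normalize()


passed_df = pd.DataFrame({
    'Timestamp': pd.to_datetime(['2017-04-05 15:21:03', '2017-04-03 19:10:52'])
})
reduce_to_date = pd.to_datetime('2017-04-05') - pd.DateOffset(days=7)

# Assigning the whole column at once avoids the chained-indexing loop.
passed_df['Timestamp'] = set_date_keep_time(passed_df['Timestamp'], reduce_to_date)
print(passed_df)
```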

python datetime and date comparison

Based on the example I found here I want to grab icalendar data and process it. This my code so far:
from datetime import datetime, timedelta
from icalendar import Calendar
import urllib
import time
ics = urllib.urlopen('https://www.google.com/calendar/ical/xy/basic.ics').read()
ical1 = Calendar.from_ical(ics)
for vevent in ical1.subcomponents:
    if vevent.name == "VEVENT":
        title = str(vevent.get('SUMMARY').encode('utf-8'))
        start = vevent.get('DTSTART').dt # datetime
        end = vevent.get('DTEND').dt # datetime
        print title
        print start
        print end
        print type(start)
        print "---"
It fetches the title and the start and end dates from my Google Calendar. This works, and the output looks like this:
This is a Test Title
2012-12-20 15:00:00+00:00
2012-12-20 18:00:00+00:00
<type 'datetime.datetime'>
---
Another Title
2012-12-10
2012-12-11
<type 'datetime.date'>
---
...
As you can see, the start and end dates can be of type datetime.datetime or datetime.date. It depends on whether the Google Calendar entry spans whole days (then it is a datetime.date) or a time period, e.g. 3 hours (then it is a datetime.datetime). I only need to print dates/datetimes from today through the next 4 weeks. I had problems comparing dates and datetimes (it seems it is not possible), so I failed to do that. How can I compare datetimes and dates while preserving the data types? If I need to "cast" them, that's OK. I was also not able to print "today + 4 weeks", only "4 weeks".
You will need to convert both values to a common type for comparison. Since datetimes and dates both carry a date, it makes sense to convert to the date type: just call date() on the datetime. For example:
>>> import datetime
>>> datetime.date(2014, 2, 22) == datetime.datetime.now().date()
True
A datetime object returns the corresponding date object when you call date() on it: start.date(). That result can then be compared with another date object, for example against "today + 4 weeks" (28 days): start.date() < datetime.date.today() + datetime.timedelta(4*7)
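A minimal sketch of the whole filter, assuming the event starts are already available as a list of mixed date and datetime values (the icalendar parsing is left out). The helper names as_date and in_next_four_weeks are hypothetical, and today is fixed so the example is reproducible; in real code it would be date.today().

```python
from datetime import date, datetime, timedelta


def as_date(value):
    """Return a datetime.date for either a date or a datetime."""
    # datetime is a subclass of date, so the datetime check has to come first.
    return value.date() if isinstance(value, datetime) else value


def in_next_four_weeks(start, today):
    """True if `start` falls between today and today + 4 weeks (inclusive)."""
    return today <= as_date(start) <= today + timedelta(weeks=4)


today = date(2012, 12, 5)  # stand-in for date.today()
starts = [
    datetime(2012, 12, 20, 15, 0),  # timed event
    date(2012, 12, 10),             # all-day event
    date(2013, 2, 1),               # too far in the future
]

upcoming = [s for s in starts if in_next_four_weeks(s, today)]
for s in upcoming:
    print(s)  # the original value is printed, so its type is preserved
```

Only the comparison uses the converted value, so each event keeps its original type when printed.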
