I've got a pandas.Series object that might look like this:
import pandas as pd
myVar = pd.Series(["VLADIVOSTOK 690090", "MAHE", float("nan"), float("nan"), "VLADIVOSTOK 690090", "2000-07-01 00:00:00"])
myVar[5] is parsed as a datetime.datetime object when the data is read into Python via pandas. I'm assuming that converting this value to the number of days since epoch (36708) isn't difficult at all. I'm just new to Python and don't know how to do it. Thanks in advance!
I'm not sure where you're getting 36,708 days since the epoch (it's only been 16,644 days since January 1, 1970), but datetime.timedelta objects (used in date arithmetic) have a days attribute:
>>> import datetime
>>> (datetime.datetime.utcnow() - datetime.datetime(1970,1,1)).days
16644
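Applied to the value from the question, the same subtraction can be sketched as follows. This assumes the string is parsed with strptime and that the epoch is 1970-01-01; the variable names are illustrative only.

```python
from datetime import datetime

# The timestamp string from the question, parsed into a datetime object.
value = datetime.strptime("2000-07-01 00:00:00", "%Y-%m-%d %H:%M:%S")

# Subtracting two datetimes yields a timedelta; .days is the whole-day count.
days_since_epoch = (value - datetime(1970, 1, 1)).days
print(days_since_epoch)  # 11139
```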
myVar = pd.Series(["VLADIVOSTOK 690090", "MAHE", "NaN", "NaN", "VLADIVOSTOK 690090", "2000-07-01 00:00:00"])
myVar[5] = pd.to_datetime(myVar[5]) - pd.Timestamp(1970, 1, 1)  # pd.datetime was removed in pandas 2.0
print(myVar)
0 VLADIVOSTOK 690090
1 MAHE
2 NaN
3 NaN
4 VLADIVOSTOK 690090
5 11139 days 00:00:00
dtype: object
You can convert this to seconds since epoch first, then divide by the number of seconds in a day (86,400). Note the integer division (//) here: it returns an int, not a float.
from datetime import datetime
now = datetime.now()
seconds = int(now.timestamp())  # seconds since epoch; portable, unlike strftime("%s")
days = seconds // 86400  # whole days since epoch
I added the import and now as an example of a datetime object I can play with.
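For a reproducible check, here is a minimal sketch of the same idea with a fixed, timezone-aware datetime instead of now(). The date is just an example, and pinning it to UTC is an assumption made so the result does not depend on the machine's local timezone.

```python
from datetime import datetime, timezone

# A fixed example moment, explicitly in UTC.
moment = datetime(2000, 7, 1, tzinfo=timezone.utc)

seconds = int(moment.timestamp())  # seconds since the Unix epoch
days = seconds // 86400            # whole days since the epoch
print(days)  # 11139
```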
For a Pandas Dataframe:
df_train["DaysSinceEpoch"] = [i.days for i in df_train["date"] - datetime.datetime(1970, 1, 1)]
Assuming that you want days since the Unix epoch of 1970-01-01 and you have a column of dtype datetime64[ns].
And see my other answer with the exact reverse.
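A vectorized alternative to the list comprehension, sketched under the assumption that the column is already datetime64[ns]. The frame and its values here are made up for illustration; .dt.days is the pandas accessor for the day component of a timedelta column.

```python
import pandas as pd

# Hypothetical data standing in for df_train.
df_train = pd.DataFrame({"date": pd.to_datetime(["2000-07-01", "2021-03-01"])})

# Subtracting a Timestamp gives a timedelta column; .dt.days extracts whole days.
df_train["DaysSinceEpoch"] = (df_train["date"] - pd.Timestamp("1970-01-01")).dt.days
print(df_train["DaysSinceEpoch"].tolist())  # [11139, 18687]
```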
The question in the title may seem familiar, since there are many blog posts and SO posts on it. However, I couldn't find a question similar to the issue I am facing. I have a netcdf file in which the variable time has a single data value, 10643385. The unit of this time variable is minutes since 2000-01-01 00:00:00, which is different from many examples I found on the internet. I am also aware that the actual value of time is 27-03-2020 05:45. How do I convert this epoch value (an int) to a datetime format like '27-03-2020 05:45'? Here is the sample code I have been trying, which results in the reference datetime rather than the actual datetime of the file:
print(datetime.datetime.fromtimestamp(int(epoch_time_value)).strftime('%Y-%m-%d %H:%M:%S'))
The above single line of code results in 1970-05-04 09:59:45. Can someone help me get the correct date?
import datetime
t = datetime.datetime(2000, 1, 1) + datetime.timedelta(minutes=10643385)
outputs
datetime.datetime(2020, 3, 27, 5, 45)
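To get the exact string asked for in the question, the result can be formatted with strftime. A small sketch; the day-month-year format string is taken from the question's example.

```python
import datetime

# Reference time of the netCDF variable plus the stored offset in minutes.
t = datetime.datetime(2000, 1, 1) + datetime.timedelta(minutes=10643385)

formatted = t.strftime("%d-%m-%Y %H:%M")
print(formatted)  # 27-03-2020 05:45
```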
fromtimestamp expects Unix epoch time in seconds, so we must first convert the value to seconds by multiplying by 60.
Unix epoch time starts on 1 January 1970. Since the netcdf time starts on 2000-01-01, we must adjust by adding the number of seconds from 1970 to 2000 (which happens to be 946684800).
Putting these together we get:
>>> import datetime
>>> epoch_time_value = 10643385
>>> _epoch_time_value = epoch_time_value * 60 + 946684800
>>> print(datetime.datetime.fromtimestamp(int(_epoch_time_value)).strftime('%Y-%m-%d %H:%M:%S'))
2020-03-26 22:45:00
Note that fromtimestamp returns local time, so the result is shifted by your UTC offset (the output above is 7 hours behind the expected 05:45). Make sure the timezones in your calculations are synced when you do this!
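One way to sidestep the shift, assuming the file's reference time is in UTC (the question does not say), is to pass an explicit timezone to fromtimestamp:

```python
import datetime

epoch_time_value = 10643385  # minutes since 2000-01-01 00:00:00

# Convert minutes to seconds and shift from the 2000 reference to the Unix epoch.
unix_seconds = epoch_time_value * 60 + 946684800

# tz=UTC keeps the result independent of the machine's local timezone.
dt = datetime.datetime.fromtimestamp(unix_seconds, tz=datetime.timezone.utc)
print(dt.strftime("%Y-%m-%d %H:%M:%S"))  # 2020-03-27 05:45:00
```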
I am just wondering how best to approach using this 24 hour time format as a predictive feature. My thoughts were to bin it into 24 categories for each hour of the day. Is there an easy way to convert this object into a python datetime object that would make binning easier or how would you advise handling this feature? Thanks :)
df['Duration']
0 2:50
1 7:25
2 19:00
3 5:25
4 4:45
5 2:25
df['Duration'].dtype
dtype('O')
The best solution will depend on what you hope to get from your model. In many cases it makes sense to convert it to total number of seconds (or minutes or hours) since some epoch. To convert your data to seconds since 00:00, you can use:
from datetime import datetime
t_str = "2:50"
t_delta = datetime.strptime(t_str, "%H:%M") - datetime(1900, 1, 1)
seconds = t_delta.total_seconds()
hours = seconds/60**2
print(seconds)
# 10200.0
Parsing with datetime.strptime and %H will not accept time values over 23:59. Since it appears that your data may actually be a duration, you may want to represent it as an instance of Python's timedelta class.
from datetime import timedelta
h, m = map(int, t_str.split(sep=':'))
t_delta = timedelta(hours=h, minutes=m)
# Get total number of seconds
seconds = t_delta.total_seconds()
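For the hour-of-day binning mentioned in the question, a minimal sketch along the same lines. The helper name hour_bin is hypothetical, and the sample strings are the ones shown in the question's column.

```python
from datetime import timedelta

def hour_bin(t_str):
    """Return the whole-hour bin for an 'H:MM' string."""
    h, m = map(int, t_str.split(":"))
    t_delta = timedelta(hours=h, minutes=m)
    return int(t_delta.total_seconds() // 3600)

durations = ["2:50", "7:25", "19:00", "5:25", "4:45", "2:25"]
bins = [hour_bin(s) for s in durations]
print(bins)  # [2, 7, 19, 5, 4, 2]
```

On a DataFrame this could be applied with df['Duration'].map(hour_bin).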
You can use datetime to create a useable datetime string
>>> from datetime import datetime
>>> x = datetime(2019, 1, 1, 0).strftime('%Y-%m-%d %H:%M:%S')
>>> # Use that for your timestring then you can reverse it nicely back into a datetime object
>>> d = datetime.strptime('2019-01-01 00:00:00', '%Y-%m-%d %H:%M:%S')
Of course you can use any valid format string.
You should calculate the time in seconds or minutes or hours from some initial time like the 1st time. Then you can make an x-y scatter plot of the data since the x-axis (time) is now numbers.
I want to take a time stamp from the epoch - 1507498737.999 and store it as float in a nosql database. I also want to convert the epoch time to:
Year
Month
Day
Hour
Minute
Seconds
Milliseconds
DayName
MonthName
others?
I keep running into issues with conversions. My thought is:
Get the now() timestamp (floating)
Convert the timestamp to Year, Month, Day, etc...
How?
If you create a python datetime object, you may access the properties you need.
>>> import datetime
>>> dt = datetime.datetime.fromtimestamp(1507498737.999)
>>> print(dt)
2017-10-09 08:38:57.999000
>>> dt.microsecond
999000
>>> dt.day
9
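The remaining fields from the list in the question can be read the same way. This sketch pins the timezone to UTC (an assumption; drop tz to get local time as above) and uses strftime's %A and %B for the names, which follow the current locale.

```python
import datetime

ts = 1507498737.999
dt = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)

parts = {
    "year": dt.year,
    "month": dt.month,
    "day": dt.day,
    "hour": dt.hour,
    "minute": dt.minute,
    "second": dt.second,
    "millisecond": dt.microsecond // 1000,  # datetime stores microseconds
    "day_name": dt.strftime("%A"),          # e.g. Sunday in an English locale
    "month_name": dt.strftime("%B"),        # e.g. October in an English locale
}
print(parts)
```

The float itself can be stored as-is; the breakdown can always be recomputed from it.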
How can one convert a serial date number, representing the number of days since epoch (1970), to the corresponding date string? I have seen multiple posts showing how to go from string to date number, but I haven't been able to find any posts on how to do the reverse.
For example, 15951 corresponds to "2013-09-02".
>>> import datetime
>>> (datetime.datetime(2013, 9, 2) - datetime.datetime(1970,1,1)).days + 1
15951
(The + 1 because whatever generated these date numbers followed the convention that Jan 1, 1970 = 1.)
TL;DR: Looking for something to do the following:
>>> serial_date_to_string(15951) # arg is number of days since 1970
"2013-09-02"
This is different from "Python: Converting Epoch time into the datetime" because I am starting with days since 1970. I'm not sure if you can just multiply by 86,400 due to leap seconds, etc.
Use the datetime package as follows:
import datetime
def serial_date_to_string(srl_no):
new_date = datetime.datetime(1970,1,1,0,0) + datetime.timedelta(srl_no - 1)
return new_date.strftime("%Y-%m-%d")
This is a function which returns the string as required.
So:
serial_date_to_string(15951)
Returns
"2013-09-02"
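On the leap-second worry in the question: Unix time does not count leap seconds, so every day is exactly 86,400 seconds and multiplying is safe. A sketch that checks both routes agree, keeping the question's convention that Jan 1, 1970 = 1 (the function name via_seconds is made up for this example):

```python
import datetime

def serial_date_to_string(srl_no):
    new_date = datetime.datetime(1970, 1, 1) + datetime.timedelta(days=srl_no - 1)
    return new_date.strftime("%Y-%m-%d")

def via_seconds(srl_no):
    # Same conversion, going through seconds since the epoch in UTC.
    seconds = (srl_no - 1) * 86400
    dt = datetime.datetime.fromtimestamp(seconds, tz=datetime.timezone.utc)
    return dt.strftime("%Y-%m-%d")

print(serial_date_to_string(15951), via_seconds(15951))  # 2013-09-02 2013-09-02
```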
And for a Pandas Dataframe:
df["date"] = pd.to_datetime(df["date"], unit="D")
... assuming that the "date" column contains values like 18687 which is days from Unix Epoch of 1970-01-01 to 2021-03-01.
Also handles seconds and milliseconds since Unix Epoch, use unit="s" and unit="ms" respectively.
Also see my other answer with the exact reverse.
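A quick check of the three units on the example value, as a sketch. The one-row inputs are made up; 18687 days equals 1614556800 seconds.

```python
import pandas as pd

from_days = pd.to_datetime(pd.Series([18687]), unit="D")
from_seconds = pd.to_datetime(pd.Series([1614556800]), unit="s")
from_millis = pd.to_datetime(pd.Series([1614556800000]), unit="ms")

for s in (from_days, from_seconds, from_millis):
    print(s.dt.strftime("%Y-%m-%d").iloc[0])  # 2021-03-01 each time
```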
I have some measurements that happened on specific days in a dictionary. It looks like
date_dictionary['YYYY-MM-DD'] = measurement.
I want to calculate the variance between the measurements within 7 days from a given date. When I convert the date strings to a datetime.datetime, the result looks like a tuple or an array, but doesn't behave like one.
Is there an easy way to generate all the dates one week from a given date? If so, how can I do that efficiently?
You can do this using timedelta. Example:
>>> from datetime import datetime,timedelta
>>> d = datetime.strptime('2015-07-22','%Y-%m-%d')
>>> for i in range(1,8):
... print(d + timedelta(days=i))
...
2015-07-23 00:00:00
2015-07-24 00:00:00
2015-07-25 00:00:00
2015-07-26 00:00:00
2015-07-27 00:00:00
2015-07-28 00:00:00
2015-07-29 00:00:00
You do not actually need to print it; adding a timedelta object to a datetime object returns a datetime object. You can use that returned datetime object directly in your calculation.
Using datetime, to generate 7 consecutive dates starting with (and including) the given date, you can do:
import datetime
dt = datetime.datetime(...)
week_dates = [ dt + datetime.timedelta(days=i) for i in range(7) ]
There are libraries providing nicer APIs for performing datetime/date operations, most notably pandas (though it includes much much more). See pandas.date_range.
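Tying this back to the original question, here is a rough sketch of the variance over a 7-day window. The dictionary contents are invented for illustration, and statistics.variance (the sample variance) is one choice; statistics.pvariance may be what you want instead.

```python
from datetime import datetime, timedelta
from statistics import variance

# Hypothetical measurements keyed by 'YYYY-MM-DD' strings.
date_dictionary = {
    "2015-07-22": 1.0,
    "2015-07-24": 3.0,
    "2015-07-28": 5.0,
    "2015-08-05": 100.0,  # outside the window
}

start = datetime.strptime("2015-07-22", "%Y-%m-%d")

# Keys for the start date and the 6 days after it.
week_keys = [(start + timedelta(days=i)).strftime("%Y-%m-%d") for i in range(7)]

# Keep only the days that actually have a measurement.
values = [date_dictionary[k] for k in week_keys if k in date_dictionary]
print(variance(values))  # 4.0
```

Formatting each date back to a string keeps the lookup compatible with the dictionary's existing keys.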