Python Pandas Convert 10 digit datetime to a proper date format - python

I have an Excel file that stores dates as 10-digit numbers.
For example:
Order Date as 1806825282.731065,
Purchase Date as 1806765295
Does anyone know how to convert them to a proper date format such as dd/mm/yyyy hh:mm or dd/mm/yyyy? Any date format will be fine.
I tried pd.to_datetime, but it does not work.
Thanks!

You can do this
(pd.to_timedelta(1806825282, unit='s') + pd.to_datetime('1960-1-1'))
or
(pd.to_timedelta(df['Order Date'], unit='s') + pd.to_datetime('1960-1-1'))

SAS timestamps are stored as seconds since 1960-01-01:
import pandas as pd
origin = pd.Timestamp('1960-1-1')
df = pd.DataFrame({'Order Date': [1806825282.731065],
'Purchase Date': [1806765295]})
df['Order Date'] = origin + pd.to_timedelta(df['Order Date'], unit='s')
df['Purchase Date'] = origin + pd.to_timedelta(df['Purchase Date'], unit='s')
Output:
>>> df
                     Order Date       Purchase Date
0 2017-04-03 07:54:42.731065035 2017-04-02 15:14:55
From The Essential Guide to SAS Dates and Times:
SAS has three separate counters that keep track of dates and times. The date counter started at zero on January 1, 1960. Any day before 1/1/1960 is a negative number, and any day after that is a positive number. Every day at midnight, the date counter is increased by one. The time counter runs from zero (at midnight) to 86,399.9999, when it resets to zero. The last counter is the datetime counter. This is the number of seconds since midnight, January 1, 1960. Why January 1, 1960? One story has it that the founders of SAS wanted to use the approximate birth date of the IBM 370 system, and they chose January 1, 1960 as an easy-to-remember approximation.
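pd.to_datetime also accepts an origin argument, which applies the same offset in a single call. A minimal sketch, assuming the values really are seconds counted from 1960-01-01 (the converted and formatted names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Order Date": [1806825282.731065],
                   "Purchase Date": [1806765295]})

# Assumed epoch: seconds are counted from 1960-01-01 instead of 1970-01-01.
sas_epoch = pd.Timestamp("1960-01-01")

# Convert every column with the same unit and origin.
converted = pd.DataFrame({
    col: pd.to_datetime(df[col], unit="s", origin=sas_epoch)
    for col in df.columns
})

# Render one column in the dd/mm/yyyy hh:mm layout asked for in the question.
formatted = converted["Purchase Date"].dt.strftime("%d/%m/%Y %H:%M")

print(converted)
print(formatted)
```

This should give the same timestamps as adding a timedelta to the origin by hand; which form to use is mostly a matter of taste.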

According to the pandas documentation:
https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html
Code
>>> pd.to_datetime(1674518400, unit='s')
Timestamp('2023-01-24 00:00:00')
>>> pd.to_datetime(1674518400433502912, unit='ns')
Timestamp('2023-01-24 00:00:00.433502912')
# Template for a whole column; pick the unit that matches your data
# ('s' for 10-digit values, 'ms' for 13-digit values)
df[DATE_FIELD] = pd.to_datetime(df[DATE_FIELD], unit='s')
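As a rough guide for present-day Unix timestamps, the number of digits hints at the unit: about 10 digits for seconds, 13 for milliseconds, 16 for microseconds and 19 for nanoseconds. The sketch below (variable names are illustrative) shows the same instant in each unit and what happens when the unit is wrong:

```python
import pandas as pd

seconds = 1674518400  # 10 digits: seconds since 1970-01-01

# The same instant expressed in four different units.
same_instant = [
    pd.to_datetime(seconds, unit="s"),
    pd.to_datetime(seconds * 1_000, unit="ms"),          # 13 digits
    pd.to_datetime(seconds * 1_000_000, unit="us"),      # 16 digits
    pd.to_datetime(seconds * 1_000_000_000, unit="ns"),  # 19 digits
]
print(same_instant[0])

# A wrong unit does not raise an error; it silently lands close to 1970.
wrong = pd.to_datetime(seconds, unit="ms")
print(wrong)
```

A result in January 1970 is usually a sign that the unit is too small for the data.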

You can use something like this:
# Convert the 10-digit datetime to a datetime object
df['date_column'] = pd.to_datetime(df['date_column'], unit='s')
# Format the datetime object to the desired format
df['date_column'] = df['date_column'].dt.strftime('%d/%m/%Y %H:%M')
Or if you want a one-liner:
df['date_column'] = pd.to_datetime(df['date_column'], unit='s').dt.strftime('%d/%m/%Y %H:%M')
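Note that unit='s' on its own counts from the Unix epoch (1970-01-01). For the values in the question that gives dates in 2027, while counting from 1960-01-01 gives dates in 2017, so it is worth checking which epoch the source system uses. A small sketch using only the standard library (the names are illustrative):

```python
from datetime import datetime, timedelta

raw = 1806765295  # "Purchase Date" from the question

# Same number of seconds, two different starting points.
from_unix_epoch = datetime(1970, 1, 1) + timedelta(seconds=raw)
from_1960_epoch = datetime(1960, 1, 1) + timedelta(seconds=raw)

print(from_unix_epoch.strftime("%d/%m/%Y %H:%M"))
print(from_1960_epoch.strftime("%d/%m/%Y %H:%M"))
```

The two results differ by 3653 days, the distance between the two epochs.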

Related

change YYYYDDMM to YYYYMMDD in python

I have a df with dates in a column converted to datetime. The current format is YYYYDDMM and I need it converted to YYYYMMDD. I tried the code below, but it does not change the format and still gives me YYYYDDMM. The end goal is to subtract 1 business day from the effective date, but the format needs to be YYYYMMDD for that to work; otherwise it subtracts 1 day from the month and not the day. Can someone help?
filtered_df['Effective Date'] = pd.to_datetime(filtered_df['Effective Date'])
# Effective Date = 20220408 (4th Aug 2022 for clarity)
filtered_df['Effective Date new'] = filtered_df['Effective Date'].dt.strftime("%Y%m%d")
# Effective Date new = 20220408
Desired output: Effective Date new = 20220804
By default, pd.to_datetime reads an eight-digit value as YYYYMMDD, so 20220408 is parsed as 8 April and printing it with %Y%m%d gives back the same digits. The dayfirst keyword is meant for delimited strings such as 10/11/12 and is not a reliable fix for a compact string like this one. Pass the input layout explicitly with format instead, which also handles days greater than 12:
filtered_df['Effective Date'] = pd.to_datetime(filtered_df['Effective Date'].astype(str), format='%Y%d%m')
I like to use the datetime library for this purpose. You can use strptime to convert a string into a datetime object and strftime to convert that object into the new string. This assumes Effective Date still holds the original strings.
from datetime import datetime
def change_date(row):
    # Read the value as YYYYDDMM and write it back out as YYYYMMDD
    row["Effective Date new"] = datetime.strptime(row["Effective Date"], "%Y%d%m").strftime("%Y%m%d")
    return row
df2 = df.apply(change_date, axis=1)
The output df2 will have Effective Date new as your new column.
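Since the stated end goal is to subtract one business day, here is a vectorized sketch of the whole path. It assumes the column holds YYYYDDMM strings; the sample values and the "Previous business day" column name are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"Effective Date": ["20220408", "20221508"]})  # YYYYDDMM

# Parse with the layout the data actually uses (day before month).
parsed = pd.to_datetime(df["Effective Date"], format="%Y%d%m")

# Write the dates back out as YYYYMMDD.
df["Effective Date new"] = parsed.dt.strftime("%Y%m%d")

# Step back one business day; weekends are skipped, holidays are not.
df["Previous business day"] = (parsed - pd.offsets.BDay(1)).dt.strftime("%Y%m%d")

print(df)
```

Doing the arithmetic on the parsed datetimes and formatting only at the end avoids any dependence on how the digits are ordered in the string.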

How would I do date time math on a DF column using today's date?

Essentially I want to create a new column that has the number of days remaining until maturity from today. The code below doesn't work, kind of stuck what to do next as nearly all examples showcase doing math on 2 DF columns.
today = date.today()
today = today.strftime("%m/%d/%y")
df['Maturity Date'] = df['Maturity Date'].apply(pd.to_datetime)
df['Remaining Days til Maturity'] = (df['Maturity Date'] - today)
You're mixing types; it's like subtracting apples from pears. In your example, today is a string that represents a date to us humans (in what looks like the US format). Your pandas Series (the column of interest in your DataFrame) has the datetime64[ns] dtype after the apply(pd.to_datetime). That conversion can be done more efficiently without apply, which runs element by element instead of in a vectorized way; see below, where the strings are converted to datetime64[ns] in one vectorized call.
The main idea is that whenever you do operations with multiple objects, they should be of the same type. Sometimes frameworks will automatically convert types for you, but don't rely on it.
import pandas as pd
df = pd.DataFrame({"date": ["2000-01-01"]})
df["date"] = pd.to_datetime(df["date"])
today = pd.Timestamp.today().floor("D") # That's one way to do it
today
# Timestamp('2021-11-02 00:00:00')
today - df["date"]
# 0 7976 days
# Name: date, dtype: timedelta64[ns]
Parse Maturity Date as a datetime using the month/day/year format, then subtract today's date from it and store the difference in days as Remaining Days til Maturity:
import pandas as pd
from datetime import date
today = date.today()
df=pd.DataFrame({'Maturity Date':'11/04/2021'},index=[0])
df['Maturity Date'] = pd.to_datetime(df['Maturity Date'], format='%m/%d/%Y')
# Subtract as a Timestamp so the result is a timedelta64 Series and .dt.days is available
df['Remaining Days til Maturity'] = (df['Maturity Date'] - pd.Timestamp(today)).dt.days
print(df)
Output (when run on 2021-11-02):
  Maturity Date  Remaining Days til Maturity
0    2021-11-04                            2
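The same idea can be wrapped in a small helper that takes the reference date as an argument, which makes the result reproducible. This is only a sketch; days_until is a hypothetical name, and in real use you would pass pd.Timestamp.today() instead of a fixed date:

```python
import pandas as pd

def days_until(dates: pd.Series, as_of) -> pd.Series:
    """Whole days from as_of (truncated to midnight) until each date."""
    reference = pd.Timestamp(as_of).normalize()
    return (dates - reference).dt.days

df = pd.DataFrame({"Maturity Date": ["11/04/2021", "12/31/2021"]})
df["Maturity Date"] = pd.to_datetime(df["Maturity Date"], format="%m/%d/%Y")

# Fixed reference date so the numbers do not change from day to day.
df["Remaining Days til Maturity"] = days_until(df["Maturity Date"], "2021-11-02")
print(df)
```

Both operands are datetime-like here, so the subtraction yields a timedelta64 column and .dt.days works directly.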

Unable to subtract a day from any specific date format

I'm trying to subtract a day from this date 06-30-2019 in order to make it 06-29-2019 but can't figure out a way to achieve that.
I've tried with:
import datetime
date = "06-30-2019"
date = datetime.datetime.strptime(date,'%m-%d-%Y').strftime('%m-%d-%Y')
print(date)
It surely gives me back the date I used above.
How can I subtract a day from a date in the above format?
try this
import datetime
date = "06/30/19"
date = datetime.datetime.strptime(date, "%m/%d/%y")
NewDate = date + datetime.timedelta(days=-1)
print(NewDate) # 2019-06-29 00:00:00
Your code:
date = "06-30-2019"
date = datetime.datetime.strptime(date,'%m-%d-%Y').strftime('%m-%d-%Y')
Check the type of the date variable:
type(date)
Out[]: str
It is a string. To subtract from it, you must first convert it to a date type. You can use pd.to_datetime():
# Import packages
import pandas as pd
from datetime import timedelta
# input date
date = "06-30-2019"
# Convert it to a pandas Timestamp
date = pd.to_datetime(date)
print(date)
# Subtracting days
number_of_days = 1
new_date = date - timedelta(number_of_days)
print(new_date)
output:
2019-06-29 00:00:00
If you want to drop the time portion, you can use:
str(new_date.date())
Out[]: '2019-06-29'
use timedelta
import datetime
date = datetime.datetime.strptime("06/30/19" ,"%m/%d/%y")
print( date - datetime.timedelta(days=1))
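To get the result back in the original MM-DD-YYYY layout, parse, shift, and format again with the same pattern. A minimal sketch using only the standard library (shift_date is a hypothetical helper name):

```python
from datetime import datetime, timedelta

def shift_date(text: str, days: int, fmt: str = "%m-%d-%Y") -> str:
    """Parse text with fmt, move it by the given number of days, format it back."""
    moved = datetime.strptime(text, fmt) + timedelta(days=days)
    return moved.strftime(fmt)

previous_day = shift_date("06-30-2019", -1)
print(previous_day)

# Month and leap-year boundaries are handled by timedelta.
leap_day = shift_date("03-01-2020", -1)
print(leap_day)
```

The key point is that the arithmetic happens on the datetime object; strftime is only applied at the very end.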

pandas.to_datetime with different length date strings

I have a column of timestamps that I would like to convert to datetime in my pandas dataframe. The format of the dates is %Y-%m-%d-%H-%M-%S which pd.to_datetime does not recognize. I have manually entered the format as below:
df['TIME'] = pd.to_datetime(df['TIME'], format = '%Y-%m-%d-%H-%M-%S')
My problem is some of the times do not have seconds so they are shorter
(format = %Y-%m-%d-%H-%M).
How can I get all of these strings to datetimes?
I was thinking I could add zero seconds (-0) to the end of my shorter dates but I don't know how to do that.
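Try strftime: strip the hyphens so pandas can infer the datetime, write every value back out in the full format, and then parse with that format. If pandas can't recognize your custom datetime format, you should provide it explicitly.
import pandas as pd
from functools import partial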
df1 = pd.DataFrame({'Date': ['2018-07-02-06-05-23','2018-07-02-06-05']})
newdatetime_fmt = partial(pd.to_datetime, format='%Y-%m-%d-%H-%M-%S')
df1['Clean_Date'] = (df1.Date.str.replace('-','').apply(lambda x: pd.to_datetime(x).strftime('%Y-%m-%d-%H-%M-%S'))
.apply(newdatetime_fmt))
print(df1,df1.dtypes)
output:
                  Date          Clean_Date
0  2018-07-02-06-05-23 2018-07-02 06:05:23
1     2018-07-02-06-05 2018-07-02 06:05:00
Date                  object
Clean_Date    datetime64[ns]
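The idea from the question, adding zero seconds to the shorter strings, can also be done directly. A sketch under the assumption that every value has either five or six hyphen-separated fields (the variable names are illustrative):

```python
import pandas as pd

times = pd.Series(["2018-07-02-06-05-23", "2018-07-02-06-05"])

# Full timestamps contain five hyphens; shorter ones are missing the seconds field.
has_seconds = times.str.count("-") == 5

# Keep complete values as they are and append "-00" to the others.
padded = times.where(has_seconds, times + "-00")

parsed = pd.to_datetime(padded, format="%Y-%m-%d-%H-%M-%S")
print(parsed)
```

After padding, a single explicit format covers the whole column.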

How Can I Convert Microseconds to Human Readable Date & Time w/ Python Pandas?

So I'm trying to convert the 'Event Time' (Time in microseconds since 1970-01-01 00:00:00 UTC) column in a pandas data frame to human readable date & time.
Name of column: df['Event Time']
input: df['Event Time'].iloc[0]
output: 1519952249827533
I need a script to transform the entire column.
I tried the following, but there needs to be something much more simple:
from datetime import datetime, timedelta
epoch = datetime(1970, 1, 1)
cookie_microseconds_since_epoch = df['Event Time'].iloc[0]
cookie_datetime = epoch + timedelta(microseconds=cookie_microseconds_since_epoch)
str(cookie_datetime)
Thank you!
Use pd.to_datetime and specify the unit as microseconds. Mind that the microseconds unit is us (I didn't know that a few minutes ago (-:)
df['Event Time'] = pd.to_datetime(df['Event Time'], unit='us')
MCVE
df = pd.DataFrame({'Event Time': [1519952249827533]})
df['Event Time'] = pd.to_datetime(df['Event Time'], unit='us').dt.floor('s')
df
           Event Time
0 2018-03-02 00:57:29
If you want a string result instead of Timestamps, use strftime:
pd.to_datetime(df['Event Time'], unit='us').dt.strftime('%Y-%m-%d %H:%M:%S')
0 2018-03-02 00:57:29
Name: Event Time, dtype: object
See strftime.org for more info on using strftime.
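As a quick sanity check, the pandas result can be compared with the manual epoch arithmetic from the question. The sketch below also passes utc=True, which is one way to mark the values as UTC, as the question describes them (variable names are illustrative):

```python
import pandas as pd
from datetime import datetime, timedelta

raw = 1519952249827533  # microseconds since 1970-01-01 00:00:00 UTC

# pandas conversion (timezone-naive).
via_pandas = pd.to_datetime(raw, unit="us")

# The manual approach from the question, for comparison.
via_stdlib = datetime(1970, 1, 1) + timedelta(microseconds=raw)

# Same conversion, but labelled as UTC.
aware = pd.to_datetime(raw, unit="us", utc=True)

print(via_pandas)
print(via_stdlib)
print(aware)
```

Both approaches give the same instant; the pandas call has the advantage of working on a whole column at once.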
