Convert 44710.37680 to readable timestamp [duplicate]

This question already has answers here:
Convert Excel style date with pandas
(3 answers)
Closed 7 months ago.
I'm having a hard time converting what is supposed to be a datetime column from an Excel file. When opening it with pandas I get 44710.37680 instead of 5/29/2022 9:02:36. I tried this piece of code to convert it.
df = pd.read_excel(file,'Raw')
df.to_csv(finalfile, index = False)
df = pd.read_csv(finalfile)
df['First LogonTime'] = df['First LogonTime'].apply(lambda x: pd.Timestamp(x).strftime('%Y-%m-%d %H:%M:%S'))
print(df)
And the result I get is 1970-01-01 00:00:00 :c
Don't know if this helps, but it's an .xlsb file that I'm working with.

You can use unit='d' (for days) and subtract 70 years:
pd.to_datetime(44710.37680, unit='d') - pd.DateOffset(years=70)
Result:
Timestamp('2022-05-30 09:02:35.520000')
For dataframes use:
import pandas as pd
df = pd.DataFrame({'First LogonTime':[44710.37680, 44757.00000]})
df['First LogonTime'] = pd.to_datetime(df['First LogonTime'], unit='d') - pd.DateOffset(years=70)
Or:
import pandas as pd
df = pd.DataFrame({'First LogonTime':[44710.37680, 44757.00000]})
df['First LogonTime'] = df['First LogonTime'].apply(lambda x: pd.to_datetime(x, unit='d') - pd.DateOffset(years=70))
Result:
First LogonTime
0 2022-05-30 09:02:35.520
1 2022-07-16 00:00:00.000
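Note that both results above fall one day later than the 5/29/2022 9:02:36 the question expects. A minimal sketch of an alternative, assuming the workbook uses Excel's default 1900 date system (where serial numbers count days from 1899-12-30), passes that date as origin to pd.to_datetime instead of subtracting years. The column name comes from the question; rounding to whole seconds is an optional extra, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'First LogonTime': [44710.37680, 44757.00000]})

# Assumption: Excel 1900 date system, so serial number 0 is 1899-12-30.
converted = pd.to_datetime(df['First LogonTime'], unit='D', origin='1899-12-30')

# Optional: round away the sub-second remainder left by the float fraction.
df['First LogonTime'] = converted.dt.round('s')
print(df)
```

For the two sample values this should give 2022-05-29 09:02:36 and 2022-07-15 00:00:00.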

Related

Python pandas add a years integer column to a date column

My question is similar to the one discussed in "How to add a year to a column of dates in pandas";
however, in my case the number of years to add to the date column is stored in another column. This is my non-working code:
import datetime
import pandas as pd
df1 = pd.DataFrame( [ ["Tom",5], ['Jane',3],['Peter',1]], columns = ["Name","Years"])
df1['Date'] = datetime.date.today()
df1['Final_Date'] = df1['Date'] + pd.offsets.DateOffset(years=df1['Years'])
The goal is to add 5 years to the current date for row 1, 3 years to the current date for row 2, and so on.
Any suggestions? Thank you

Convert to time delta by converting years to days, then adding to a converted datetime column:
df1['Final_Date'] = pd.to_datetime(df1['Date']) \
+ pd.to_timedelta(df1['Years'] * 365, unit='D')
Use of to_timedelta with unit='Y' for years is deprecated and throws ValueError.
Edit. If you need day-exact changes, you will need to go row-by-row and update the date objects accordingly. Other answers explain.
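To see why the 365-day approximation is not day-exact, here is a small sketch comparing it with a per-row pd.DateOffset. The fixed start date 2021-07-13 and the column names Approx and Exact are chosen only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([["Tom", 5], ["Jane", 3], ["Peter", 1]], columns=["Name", "Years"])
df1["Date"] = pd.Timestamp("2021-07-13")  # fixed date, for reproducibility only

# Approximation: every year counts as 365 days, so leap days are lost.
df1["Approx"] = df1["Date"] + pd.to_timedelta(df1["Years"] * 365, unit="D")

# Calendar-exact: one DateOffset per row.
df1["Exact"] = [d + pd.DateOffset(years=int(y)) for d, y in zip(df1["Date"], df1["Years"])]
print(df1)
```

The rows whose span crosses 29 February 2024 (Tom and Jane) come out one day early in Approx, while Peter's row matches in both columns.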

Assuming the number of different values in Years is limited, you can try groupby and do the operation with pd.DateOffset like:
df1['new_date'] = (
    # group_keys=False keeps the original row index so the result aligns with df1
    df1.groupby('Years', group_keys=False)
       ['Date'].apply(lambda x: x + pd.DateOffset(years=x.name))
)
print(df1)
Name Years Date new_date
0 Tom 5 2021-07-13 2026-07-13
1 Jane 3 2021-07-13 2024-07-13
2 Peter 1 2021-07-13 2022-07-13
Otherwise, you can extract the year, month and day, add the Years column to the year, and rebuild a datetime column:
df1['Date'] = pd.to_datetime(df1['Date'])
df1['new_date'] = (
    df1.assign(year=lambda x: x['Date'].dt.year+x['Years'],
               month=lambda x: x['Date'].dt.month,
               day=lambda x: x['Date'].dt.day,
               new_date=lambda x: pd.to_datetime(x[['year','month','day']]))
       ['new_date']
)
This gives the same result.

import datetime
import pandas as pd
df1 = pd.DataFrame( [ ["Tom",5], ['Jane',3],['Peter',1]], columns = ["Name","Years"])
df1['Date'] = datetime.date.today()
df1['Final_date'] = datetime.date.today()
df1['Final_date'] = df1.apply(lambda g: g['Date'] + pd.offsets.DateOffset(years = g['Years']), axis=1)
print(df1)
Try this, you were trying to add the whole column when you called pd.offsets.DateOffset(years=df1['Years']) instead of just 1 value in the column.
EDIT: I changed from iterrows to a vectorization method due to iterrows's poor performance

Pandas: Multiple date formats in one column

I have two date formats in one Pandas series (column) that need to be standardized into one format (mmm dd & mm/dd/YY)
Date
Jan 3
Jan 2
Jan 1
12/31/19
12/30/19
12/29/19
Even Excel won't recognize the mmm dd format as a date format. I can change the mmm to a fully-spelled out month using str.replace:
df['Date'] = df['Date'].str.replace('Jan', 'January', regex=True)
But how do I add the current year? How do I then convert January 1, 2020 to 01/01/20?

Have you tried parse() from dateutil?
from dateutil.parser import parse
import pandas as pd
def clean_date(text):
    # parse() returns a datetime; fields missing from the text
    # (such as the year in "Jan 3") are taken from today's date
    return parse(text)
df['Date'] = df['Date'].apply(clean_date)
df['Date'] = pd.to_datetime(df['Date'])

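If it's in a data frame use this:
from dateutil.parser import parse
import pandas as pd
# Parse every value first, then assign the whole column at once;
# writing df['Date'][i] in a loop is chained assignment and may not update df
df['Date'] = [parse(value) for value in df['Date']]
df['Date'] = pd.to_datetime(df['Date']).dt.strftime("%d-%m-%Y")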

Found the solution (needed to use apply):
df['date'] = df['date'].apply(dateutil.parser.parse)
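A minimal sketch that also covers the two follow-up points in the question (which year to attach to "Jan 3", and how to get the mm/dd/yy output). It assumes the year-less entries belong to 2020; default_year is a hypothetical name, and the default argument of dateutil's parse supplies whatever fields the text leaves out.

```python
from datetime import datetime

import pandas as pd
from dateutil.parser import parse

df = pd.DataFrame({'Date': ['Jan 3', 'Jan 2', 'Jan 1', '12/31/19', '12/30/19', '12/29/19']})

# Assumption: entries without a year (e.g. "Jan 3") belong to 2020.
default_year = datetime(2020, 1, 1)

# Fields missing from the text are filled in from default_year.
parsed = df['Date'].apply(lambda text: parse(text, default=default_year))

# Standardize every entry to mm/dd/yy.
df['Date'] = pd.to_datetime(parsed).dt.strftime('%m/%d/%y')
print(df['Date'].tolist())
```

Passing an explicit default keeps the result from silently changing when the script is rerun in a later year.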

Convert Stacked DataFrame of Years and Months to DataFrame with Datetime Indices

I am reading a csv file of the number of employees in the US by year and month (in thousands). It starts out like this:
Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
1961,45119,44969,45051,44997,45119,45289,45400,45535,45591,45716,45931,46035
1962,46040,46309,46375,46679,46668,46644,46720,46775,46888,46927,46910,46901
1963,46912,47000,47077,47316,47328,47356,47461,47542,47661,47805,47771,47863
...
I want my Pandas Dataframe to have the datetime as the index for each month's value. I'm doing this so I can later add values for specific time ranges. I want it to look something like this:
1961-01-01 45119.0
1961-02-01 44969.0
1961-03-01 45051.0
1961-04-01 44997.0
1961-05-01 45119.0
...
I did some research and thought that if I stacked the years and months together, I could combine them into a datetime. Here is what I have done:
import pandas as pd
import numpy as np
df = pd.read_csv("BLS_private.csv", header=5, index_col="Year")
df.columns = range(1, 13) # I transformed months into numbers 1-12 for easier datetime conversion
df = df.stack() # Months are no longer columns
print(df)
Here is my output:
Year
1961 1 45119.0
2 44969.0
3 45051.0
4 44997.0
5 45119.0
...
I do not know how to combine the year and the months in the stacked indices. Does stacking the indices help at all in my case? I am also not the most familiar with Pandas datetime, so any explanation about how I could use that would be very helpful. Also if anyone has alternate solutions than making datetime the index, I welcome ideas.

After the stack create the DateTimeIndex from the current index
from datetime import datetime
dt_index = pd.to_datetime([datetime(year=year, month=month, day=1)
for year, month in df.index.values])
df.index = dt_index
df.head(3)
# 1961-01-01 45119
# 1961-02-01 44969
# 1961-03-01 45051

import pandas as pd
df = pd.read_csv("BLS_private.csv", index_col="Year")
# inclusive='left' (pandas 1.4+) replaces closed='left', which pandas 2.0 removed
dates = pd.date_range(start=str(df.index[0]), end=str(df.index[-1] + 1), inclusive='left', freq="MS")
df = df.stack()
df.index = dates
df.to_frame()

from io import StringIO
import pandas as pd
s = """Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
1961,45119,44969,45051,44997,45119,45289,45400,45535,45591,45716,45931,46035
1962,46040,46309,46375,46679,46668,46644,46720,46775,46888,46927,46910,46901
1963,46912,47000,47077,47316,47328,47356,47461,47542,47661,47805,47771,47863"""
df = pd.read_csv(StringIO(s))
# set index and stack
stack = df.set_index('Year').stack().reset_index()
# create a new index from strings such as "1961-Jan"
stack.index = pd.to_datetime(stack['Year'].astype(str) +'-'+ stack['level_1'], format='%Y-%b')
# remove columns
final = stack[0].to_frame()
1961-01-01 45119
1961-02-01 44969
1961-03-01 45051
1961-04-01 44997
1961-05-01 45119
1961-06-01 45289
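Once the index is a DatetimeIndex, the "add values for specific time ranges" goal from the question can be sketched with partial-string slicing. This is only an illustration on the three sample rows; csv_text, wide, series and total are hypothetical names, and the March to May 1961 range is picked arbitrarily.

```python
from io import StringIO

import pandas as pd

csv_text = """Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
1961,45119,44969,45051,44997,45119,45289,45400,45535,45591,45716,45931,46035
1962,46040,46309,46375,46679,46668,46644,46720,46775,46888,46927,46910,46901
1963,46912,47000,47077,47316,47328,47356,47461,47542,47661,47805,47771,47863"""

wide = pd.read_csv(StringIO(csv_text), index_col="Year")
series = wide.stack()  # index is now (year, month abbreviation)

# Turn each (year, month) pair into the first day of that month.
series.index = pd.to_datetime(
    [f"{year}-{month}-01" for year, month in series.index], format="%Y-%b-%d"
)

# Partial-string slicing on a DatetimeIndex includes both end months.
total = series.loc["1961-03":"1961-05"].sum()
print(total)
```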

how to sort by dates in format d-month-y in pandas?

I have a column with dates looking like this: 10-apr-18.
When I transpose my df or do anything else with it, pandas automatically sorts this column by the day (the first number), so the order is not chronological.
I've tried to use to_datetime, but because the month is a string it doesn't work.
How can I convert this to a date, or turn off the automatic sorting? (My raw data is already in the right order.)

I suggest converting to datetimes with to_datetime and the format parameter:
df = pd.DataFrame({'dates':['10-may-18','10-apr-18']})
#also working for me
#df['dates'] = pd.to_datetime(df['dates'])
df['dates'] = pd.to_datetime(df['dates'], format='%d-%b-%y')
df = df.sort_values('dates')
df['dates'] = df['dates'].dt.strftime('%d-%B-%y')
print (df)
dates
1 10-April-18
0 10-May-18
df = pd.DataFrame({'dates':['10-may-18','10-apr-18']})
#also working for me
#df['dates'] = pd.to_datetime(df['dates'])
df['datetimes'] = pd.to_datetime(df['dates'], format='%d-%b-%y')
df = df.sort_values('datetimes')
df['full'] = df['datetimes'].dt.strftime('%d-%B-%y')
print (df)
dates datetimes full
1 10-apr-18 2018-04-10 10-April-18
0 10-may-18 2018-05-10 10-May-18

df['dates'] = pd.to_datetime(df['dates'], format='%d-%b-%y').dt.strftime('%d/%B/%y')
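If the column should keep its original strings and only the row order needs to be chronological, one possible sketch uses the key argument of sort_values (available from pandas 1.1). The third sample row is added here for illustration and is not from the question.

```python
import pandas as pd

df = pd.DataFrame({'dates': ['10-may-18', '10-apr-18', '02-jan-19']})

# key= sorts by the parsed dates while the column keeps its original strings.
df = df.sort_values('dates', key=lambda s: pd.to_datetime(s, format='%d-%b-%y'))
print(df['dates'].tolist())
```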

How to convert date format QQ-YYYY to a datetime object [duplicate]

This question already has answers here:
Convert Pandas Column to DateTime
(8 answers)
Closed 4 years ago.
I have a pandas dataframe with a column that should indicate the end of a financial quarter. The format is of the type "Q1-2009". Is there a quick way to convert these strings into a timestamp as "2009-03-31"?
I have found only the conversion from the format "YYYY-QQ", but not the opposite.

Create quarterly periods by swapping the quarter and year parts with replace, then convert to datetimes with PeriodIndex.to_timestamp:
df = pd.DataFrame({'per':['Q1-2009','Q3-2007']})
# regex=True is needed on pandas 2.0+, where str.replace matches literally by default;
# normalize() drops the 23:59:59.999999999 time that recent pandas attaches to a period end
df['date'] = (pd.PeriodIndex(df['per'].str.replace(r'(Q\d)-(\d+)', r'\2-\1', regex=True), freq='Q')
                .to_timestamp(how='e')
                .normalize())
print (df)
per date
0 Q1-2009 2009-03-31
1 Q3-2007 2007-09-30
Another solution is to use string indexing:
df['date'] = (pd.PeriodIndex(df['per'].str[-4:] + df['per'].str[:2], freq='Q')
                .to_timestamp(how='e')
                .normalize())

One solution using a list comprehension followed by pd.offsets.MonthEnd:
# data from #jezrael
df = pd.DataFrame({'per':['Q1-2009','Q3-2007']})
def get_values(x):
    ''' Returns string with quarter number multiplied by 3 '''
    return f'{int(x[0][1:])*3}-{x[1]}'
values = [get_values(x.split('-')) for x in df['per']]
df['LastDay'] = pd.to_datetime(values, format='%m-%Y') + pd.offsets.MonthEnd(1)
print(df)
per LastDay
0 Q1-2009 2009-03-31
1 Q3-2007 2007-09-30
