How can I get years, months, and days from an array of Timestamps?
I have a DataFrame whose index consists of Timestamps, and I tried this:
for i in data_frame.index:
    print(datetime.fromtimestamp(i).isoformat())
But I got this error:
print(datetime.fromtimestamp(i).isoformat())
TypeError: an integer is required (got type Timestamp)
First use df['date'] = pd.to_datetime(df['timestamp']) to convert the column to a proper datetime type,
then create new columns for year, month, and day:
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day
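The question has the Timestamps in the index rather than in a column, so here is a minimal sketch of that case. The sample frame and the "value" column are made up for illustration. A DatetimeIndex exposes year, month, and day directly, and each element is a pd.Timestamp with its own isoformat() method, so datetime.fromtimestamp is not needed.

```python
import pandas as pd

# Hypothetical sample frame with a DatetimeIndex, standing in for data_frame
data_frame = pd.DataFrame(
    {"value": [10, 20]},
    index=pd.to_datetime(["2020-03-15", "2021-11-02"]),
)

# A DatetimeIndex exposes the components directly, without the .dt accessor
data_frame["year"] = data_frame.index.year
data_frame["month"] = data_frame.index.month
data_frame["day"] = data_frame.index.day

# Each element of the index is a pd.Timestamp, which has isoformat() itself
iso_strings = [ts.isoformat() for ts in data_frame.index]
print(data_frame)
print(iso_strings)
```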
Related
I have a column in my dataframe that contains dates such as 1/6/2023 (m/d/yyyy format). The column's dtype is object, but I want to convert it to int64. I have tried the following code, but it drastically changes the date values:
df = df.astype({'date':'int'})
Is there an alternative that does not change the values?
Convert the values to datetimes, then to strings (here in YYYYMMDD format), and finally to integers. The dates are in m/d/yyyy format, so the format is passed explicitly:
print(df)
       date
0  1/6/2023
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y').dt.strftime('%Y%m%d').astype(int)
print(df)
       date
0  20230106
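As an alternative sketch, the same YYYYMMDD integer can be built arithmetically from the date components, without going through strings. This assumes the strings are month/day/year, as the question states. The sample values and the date_int column name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data in m/d/yyyy format
df = pd.DataFrame({"date": ["1/6/2023", "12/25/2023"]})

# Parse with an explicit format so 1/6/2023 is read as January 6
parsed = pd.to_datetime(df["date"], format="%m/%d/%Y")

# year * 10000 + month * 100 + day gives the YYYYMMDD integer
df["date_int"] = (
    parsed.dt.year * 10000 + parsed.dt.month * 100 + parsed.dt.day
).astype("int64")
print(df)
```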
I have a dataset with a start date and an end date column. I need to create duplicate rows based on the date difference and increment the start date in each duplicated row. I have been able to duplicate the rows, but I cannot increment the start date, because the difference is the same in every duplicate of a row.
for s in range(0,len(Fortnight_1)):
    if Fortnight_1['Date_Difference'].iloc[s]>0:
        for i in range(0,int(Fortnight_1['Date_Difference'].iloc[s])):
            df = df.append(Fortnight_1.iloc[s])
With the code above I have duplicated the rows based on Date_Difference. I need to put the dates in a Dates column:
We need the Dates column as described, but I am not able to increment the dates in the duplicated rows.
Use:
#convert to datetimes
df['Start Date'] = pd.to_datetime(df['Start Date'])
df['End Date'] = pd.to_datetime(df['End Date'])
df['Date_difference'] = df['End Date'].sub(df['Start Date']).dt.days
#repeat indices by Date_difference
df1 = df.loc[df.index.repeat(df['Date_difference'] + 1)]
#create a per-row counter (0, 1, 2, ...) converted to day timedeltas
s = pd.to_timedelta(df1.groupby(level=0).cumcount(), unit='d')
#add the timedeltas to Start Date and store the result in a new Dates column
out = df1.assign(Dates = df1['Start Date'].add(s))
#if you need the day of the month instead of full datetimes
out = df1.assign(Dates = df1['Start Date'].add(s).dt.day)
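Here is a self-contained sketch of the same repeat-and-offset approach on made-up data. The "Name" column and the sample dates are hypothetical. The other column names follow the answer above.

```python
import pandas as pd

# Hypothetical sample data; column names follow the answer above
df = pd.DataFrame({
    "Name": ["A", "B"],
    "Start Date": ["2023-01-30", "2023-02-10"],
    "End Date": ["2023-02-01", "2023-02-10"],
})
df["Start Date"] = pd.to_datetime(df["Start Date"])
df["End Date"] = pd.to_datetime(df["End Date"])
df["Date_difference"] = (df["End Date"] - df["Start Date"]).dt.days

# One row per day in the inclusive range, hence difference + 1 repeats
df1 = df.loc[df.index.repeat(df["Date_difference"] + 1)]

# 0, 1, 2, ... within each original row, used as a day offset
offset = pd.to_timedelta(df1.groupby(level=0).cumcount(), unit="D")

# Start Date plus the offset gives one date per duplicated row
out = df1.assign(Dates=df1["Start Date"] + offset).reset_index(drop=True)
print(out[["Name", "Start Date", "End Date", "Dates"]])
```

Row A spans three days and is repeated three times. Row B starts and ends on the same day, so it appears once.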
I'm trying to extract date information from a date column, and append the new columns to the original dataframe. However, I kept getting this message saying I cannot use .dt with this column. Not sure what I did wrong here, any help will be appreciated.
Error message that I got in python:
First do df.datecolumn = pd.to_datetime(df.datecolumn), then live happily ever after.
This will give you year, month and day in that month. You can also easily get week of the year and day of the week.
import pandas as pd
df = pd.DataFrame(data=[['1920-01-01'], ['2008-12-06']], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'])
df['Year'] = df['Date'].apply(lambda x : x.year)
df['Month'] = df['Date'].apply(lambda x : x.month)
df['Day'] = df['Date'].apply(lambda x : x.day)
print(df)
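As a sketch of the week of the year and day of the week mentioned above, the .dt accessor can be used once the column has been converted with pd.to_datetime. The new column names here are made up for illustration, and dt.isocalendar() assumes pandas 1.1 or later.

```python
import pandas as pd

df = pd.DataFrame({"Date": pd.to_datetime(["1920-01-01", "2008-12-06"])})

# Vectorized equivalents of the apply/lambda version
df["Year"] = df["Date"].dt.year

# ISO 8601 week number (isocalendar() returns year, week, day columns)
df["WeekOfYear"] = df["Date"].dt.isocalendar().week

# Day of the week as a number, Monday=0 through Sunday=6
df["DayOfWeek"] = df["Date"].dt.dayofweek

# Day of the week as a name
df["DayName"] = df["Date"].dt.day_name()
print(df)
```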
In your Time list you have a typo: Dayorweek should be dayofweek.
I have a dataframe with a column date of type datetime64[ns].
When I try to create a new column day in MM-DD format based on the date column, only the first method below works. Why doesn't the second method work in pandas?
df['day'] = df['date'].dt.strftime('%m-%d')
df['day2'] = str(df['date'].dt.month) + '-' + str(df['date'].dt.day)
Result for one row:
day 01-04
day2 0 1\n1 1\n2 1\n3 1\n4 ...
Types of columns
day object
day2 object
The problem is that str(df['date'].dt.month) converts the whole Series into a single string (its printed representation) instead of converting each value. The correct way is to use Series.astype:
df['day2'] = df['date'].dt.month.astype(str) + '-' + df['date'].dt.day.astype(str)
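One caveat: astype(str) does not zero-pad, so January 4 becomes 1-4 rather than 01-04. If the result should match the strftime output, here is a sketch that pads each part with Series.str.zfill. The sample dates are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"date": pd.to_datetime(["2023-01-04", "2023-11-15"])})

# Reference: strftime already zero-pads month and day
df["day"] = df["date"].dt.strftime("%m-%d")

# Element-wise string conversion, padded to two digits
month = df["date"].dt.month.astype(str).str.zfill(2)
day = df["date"].dt.day.astype(str).str.zfill(2)
df["day2"] = month + "-" + day
print(df)
```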
I originally have dates in string format.
I want to extract the month as a number from these dates.
df = pd.DataFrame({'Date':['2011/11/2', '2011/12/20', '2011/8/16']})
I convert them to a pandas datetime object.
df['Date'] = pd.to_datetime(df['Date'])
I then want to extract all the months.
When I try:
df.loc[0]["Date"].month
This works returning the correct value of 11.
But when I try to call multiple months it doesn't work?
df.loc[1:2]["Date"].month
AttributeError: 'Series' object has no attribute 'month'
df.loc[0]["Date"] returns a scalar: pd.Timestamp objects have a month attribute, which is what you are accessing.
df.loc[1:2]["Date"] returns a series: pd.Series objects do not have a month attribute, they do have a dt.month attribute if df['Date'] is a datetime series.
In addition, don't use chained indexing. You can use:
df.loc[0, 'Date'].month for a scalar
df.loc[1:2, 'Date'].dt.month for a series
These are different attributes: pandas.Series.dt.month is for a Series of datetimes, pandas.Timestamp.month is for a scalar, and pandas.DatetimeIndex.month is for an index, where there is no .dt accessor.
So you need:
#Series
df.loc[1:2, "Date"].dt.month
#scalar
df.loc[0, 'Date'].month
#DatetimeIndex
df.set_index('Date').index.month
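The three cases can be checked side by side with the data from the question. This is a minimal sketch. The explicit format argument and the variable names are additions for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Date": ["2011/11/2", "2011/12/20", "2011/8/16"]})
df["Date"] = pd.to_datetime(df["Date"], format="%Y/%m/%d")

# Scalar: a pd.Timestamp has a .month attribute
scalar_month = df.loc[0, "Date"].month

# Series: go through the .dt accessor
series_months = df.loc[1:2, "Date"].dt.month

# DatetimeIndex: .month is available on the index itself, without .dt
index_months = df.set_index("Date").index.month

print(scalar_month)
print(series_months.tolist())
print(index_months.tolist())
```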