This question already has an answer here:
Python: Adding hours to pandas timestamp
(1 answer)
Closed 3 years ago.
I have a pandas dataframe where the date and hour are in two different columns, as shown below:
I want to combine these two columns into a new datetime column so that I can apply pandas window/shift functions. Please share your views.
date hour
0 20190409 0
1 20190409 0
2 20190409 0
3 20190409 0
4 20190409 0
Use pandas.to_datetime and pd.to_timedelta and add them together:
df['datetime'] = pd.to_datetime(df['date'], format='%Y%m%d') + pd.to_timedelta(df['hour'], unit='h')
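As a minimal sketch of how the combined column could then feed the shift functions mentioned in the question: the sample values and the prev_hour column name below are made up for illustration, and the date column is cast to str first on the assumption that it is stored as integers.

```python
import pandas as pd

# Hypothetical sample data; the hour values are varied for illustration
df = pd.DataFrame({"date": [20190409, 20190409, 20190410], "hour": [0, 13, 23]})

# Parse the date part, then add the hour as a timedelta
df["datetime"] = (
    pd.to_datetime(df["date"].astype(str), format="%Y%m%d")
    + pd.to_timedelta(df["hour"], unit="h")
)

# With a datetime index in place, shift (and rolling windows) can be applied
df = df.set_index("datetime")
df["prev_hour"] = df["hour"].shift(1)
print(df)
```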
This question already has answers here:
Pandas Timedelta in Days
(5 answers)
Closed 1 year ago.
I have a DataFrame like:
from datetime import date, timedelta
import pandas as pd
df = pd.DataFrame([{'a': date(2020, 2, 1), 'b': date(2020, 4, 2)}])
df['c'] = df['b']-df['a']
# df:
#             a           b       c
# 0  2020-02-01  2020-04-02 61 days
Column c counts the days between the dates in a and b. However, its dtype is timedelta64[ns], not an int count of days. So I have to do:
df['c'] = (df['b']-df['a']).apply(lambda x: x.days)
This works. But I am just wondering if there is a more vectorized solution to perform better. Thanks.
Don't use apply:
df['c'] = (df['b'] - df['a']).dt.days
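A small self-contained check of the .dt.days accessor, as a sketch: it assumes the two columns are first converted with pd.to_datetime so that the difference is guaranteed to be timedelta64.

```python
import pandas as pd

df = pd.DataFrame({
    "a": pd.to_datetime(["2020-02-01"]),
    "b": pd.to_datetime(["2020-04-02"]),
})

# The difference of two datetime64 columns is a timedelta64 column
delta = df["b"] - df["a"]

# .dt.days extracts the whole-day component as integers, without apply
df["c"] = delta.dt.days
print(df)
```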
This question already has answers here:
How can I pivot a dataframe?
(5 answers)
Closed 1 year ago.
I need some help
I have the following CSV file, loaded as this DataFrame:
How could I move the case counts into columns week 1, week 2, (...) using Python and pandas?
It would be something like this:
x = (
df.pivot_table(
index=["city", "population"],
columns="week",
values="cases",
aggfunc="max",
)
.add_prefix("week ")
.reset_index()
.rename_axis("", axis=1)
)
print(x)
Prints:
city population week 1 week 2
0 x 50000 5 10
1 y 88000 2 15
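The input data is not shown in the question, so here is a runnable sketch with a hypothetical long-format frame (one row per city and week) whose values are inferred from the printed result above:

```python
import pandas as pd

# Assumed input layout; the original CSV is not available
df = pd.DataFrame({
    "city": ["x", "x", "y", "y"],
    "population": [50000, 50000, 88000, 88000],
    "week": [1, 2, 1, 2],
    "cases": [5, 10, 2, 15],
})

x = (
    df.pivot_table(
        index=["city", "population"],
        columns="week",
        values="cases",
        aggfunc="max",
    )
    .add_prefix("week ")        # 1 -> "week 1", 2 -> "week 2"
    .reset_index()              # city/population back to regular columns
    .rename_axis("", axis=1)    # drop the leftover "week" columns label
)
print(x)
```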
This question already has answers here:
How to calculate number of days between two given dates
(15 answers)
Closed 1 year ago.
How do you convert a pandas dataframe column from a date formatted as below to a number as shown below:
date
0 4/5/2010
1 9/26/2014
2 8/3/2010
To this
date newFormat
0 4/5/2010 40273
1 9/26/2014 41908
2 8/3/2010 40393
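Where the second column is the number of days since 1/1/1900.
Use the following, which assumes the column is already datetime64 and yields the date as an integer in YYYYMMDD form (for example 20100405), not a day count:
data['newFormat'] = data['date'].dt.strftime("%Y%m%d").astype(int)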
This has been answered before:
Pandas: convert date 'object' to int
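For the day count the question asks for, a hedged sketch: the sample values (40273, 41908, 40393) match Excel-style serial dates, which correspond to an origin of 1899-12-30 rather than 1900-01-01. That origin is an assumption inferred from the numbers, not something stated in the question.

```python
import pandas as pd

data = pd.DataFrame({"date": ["4/5/2010", "9/26/2014", "8/3/2010"]})

# Parse the month/day/year strings into datetime64 values
parsed = pd.to_datetime(data["date"], format="%m/%d/%Y")

# Assumed origin: reproduces the sample values in the question
origin = pd.Timestamp("1899-12-30")

data["newFormat"] = (parsed - origin).dt.days
print(data)
```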
This question already has answers here:
pandas: multiple conditions while indexing data frame - unexpected behavior
(5 answers)
Pandas slicing/selecting with multiple conditions with or statement
(1 answer)
Closed 2 years ago.
I have a dataframe which looks like this:
id start_date end_date
0 1 2017/06/01 2021/05/31
1 2 2018/10/01 2022/09/30
2 3 2015/01/01 2019/02/28
3 4 2017/11/01 2021/10/31
Can anyone tell me how to select only the rows whose start date is 2017/06/01 and whose end date is 2021/10/31?
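A minimal sketch of boolean-mask selection for this frame. The question can be read two ways, so both are shown: rows matching either condition (|) and rows matching both (&). The variable names either and both are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "start_date": ["2017/06/01", "2018/10/01", "2015/01/01", "2017/11/01"],
    "end_date": ["2021/05/31", "2022/09/30", "2019/02/28", "2021/10/31"],
})
for col in ["start_date", "end_date"]:
    df[col] = pd.to_datetime(df[col], format="%Y/%m/%d")

start = pd.Timestamp("2017-06-01")
end = pd.Timestamp("2021-10-31")

# Each condition needs its own parentheses; use | for "or" and & for "and"
either = df[(df["start_date"] == start) | (df["end_date"] == end)]
both = df[(df["start_date"] == start) & (df["end_date"] == end)]

print(either)
print(both)
```

With the sample data, no single row satisfies both conditions, so the & version comes back empty.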
I have sales information for different types of parts with different durations. I want to take the difference in months when my date is in 'YYYYMM' format.
I have tried this:
data.YYYYMM.max() - data.YYYYMM.min()
which gives me the difference in days. How can I get this difference in months?
You can convert the column with to_datetime and then to_period:
df = pd.DataFrame({'YYYYMM':['201505','201506','201508','201510']})
print (df)
YYYYMM
0 201505
1 201506
2 201508
3 201510
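df['YYYYMM'] = pd.to_datetime(df['YYYYMM'], format='%Y%m').dt.to_period('M')
# In recent pandas versions, subtracting two Periods returns an offset such as <5 * MonthEnds>; .n extracts the integer count
a = (df.YYYYMM.max() - df.YYYYMM.min()).n
print (a)
5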