import pandas as pd
import sys
df = pd.read_csv(sys.stdin, sep='\t', parse_dates=['Date'], index_col=0)
df.to_csv(sys.stdout, sep='\t')
Date Open
2020/06/15 182.809924
2021/06/14 257.899994
I got the following output with the input shown above.
Date Open
2020-06-15 182.809924
2021-06-14 257.899994
The date format has changed. Is there a way to preserve the date format automatically? (For example, if the input is in YYYY/MM/DD format, the output should be in YYYY/MM/DD. If the input is in YYYY-MM-DD, the output should be in YYYY-MM-DD, etc.)
I would prefer not to have to test the date format manually. Ideally there would be an automatic way to preserve the date format, whatever that format is.
You can specify the date_format argument in to_csv:
df.to_csv(sys.stdout, sep='\t', date_format="%Y/%m/%d")
Alternatively, keep the dates as strings and parse them into an extra column if you need to operate on them as dates. Because index_col=0 makes Date the index, parse df.index rather than df["Date"]:
df = pd.read_csv(sys.stdin, sep='\t', index_col=0)
df['DateParsed'] = pd.to_datetime(df.index)
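A minimal sketch of the "keep the strings" idea, assuming the same two-row input as in the question (fed through io.StringIO here instead of stdin so it is self-contained). The parsed values live in a separate variable named parsed, which is a hypothetical name, so the written output still carries the original text:

```python
import io
import pandas as pd

raw = "Date\tOpen\n2020/06/15\t182.809924\n2021/06/14\t257.899994\n"

# No parse_dates here, so the Date index stays as the original strings.
df = pd.read_csv(io.StringIO(raw), sep="\t", index_col=0)

# Parsed copy for date arithmetic; df itself is left untouched.
parsed = pd.to_datetime(df.index)

# Writing df back out reproduces the input date format as-is.
out = io.StringIO()
df.to_csv(out, sep="\t")
written = out.getvalue()
```

This sidesteps format detection entirely: the text that was read is the text that is written, whatever the date format was.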
I want to do a time-series analysis with Python, but I can't convert the data into datetime because the data is still stored as strings (MM-DD).
Period
Jan-10
Feb-10
Mar-10
Apr-10
etc
Is there any other way to convert this kind of data into a datetime object?
There is no need to use the datetime module. Pandas can convert strings to dates when reading the data from the CSV file, or you can use the to_datetime function after the data is loaded.
import pandas as pd
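# Note: infer_datetime_format is deprecated since pandas 2.0, where format inference is the default.
df = pd.read_csv('file.csv', parse_dates=['date'], infer_datetime_format=True)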
If you are using a non-standard format, then you will get better results if you specify a format string. Here, it looks like the format string is '%b-%y', which is the abbreviated month name and the two-digit year without the century.
import pandas as pd
df = pd.read_csv('file.csv')
df['date'] = pd.to_datetime(df['date'], format='%b-%y')
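As a rough illustration of the '%b-%y' format string applied to the Period values from the question, here is a small sketch. The DataFrame is built inline rather than read from a file, and it assumes an English locale, since %b matches locale-specific month abbreviations:

```python
import pandas as pd

# Sample values copied from the question's Period column.
df = pd.DataFrame({"Period": ["Jan-10", "Feb-10", "Mar-10", "Apr-10"]})

# %b = abbreviated month name, %y = two-digit year.
# With no day in the string, each value lands on the first of the month.
df["Period"] = pd.to_datetime(df["Period"], format="%b-%y")

# A DatetimeIndex is usually the starting point for time-series work.
df = df.set_index("Period")
```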
Please have a look at both of these images, especially the dates from Sno 32 onward. The month and day columns are not converted properly. How can I correct this? I have already looked at questions about time series but haven't found an answer to this kind of issue.
The problem is that pandas by default parses the month first whenever that is possible.
You can specify the format as DD/MM/YY:
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%y')
Or try using dayfirst=True parameter:
df['date'] = pd.to_datetime(df['date'], dayfirst=True)
Or, if you create the DataFrame from a file, use the parse_dates and dayfirst=True parameters:
df = pd.read_csv(file, parse_dates=['date'], dayfirst=True)
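To make the ambiguity concrete, the sketch below parses one string under both interpretations using explicit format strings. The sample value '05/01/05' and the variable names are illustrative assumptions, not data from the question:

```python
import pandas as pd

ambiguous = "05/01/05"

# Day-first reading: 5 January 2005.
day_first = pd.to_datetime(ambiguous, format="%d/%m/%y")

# Month-first reading: 1 May 2005.
month_first = pd.to_datetime(ambiguous, format="%m/%d/%y")

# An explicit format applied to a whole column removes the guesswork.
dates = pd.to_datetime(pd.Series(["05/01/05", "25/12/19"]), format="%d/%m/%y")
```

An explicit format also fails loudly on values that do not match it, which tends to be easier to debug than a silent day/month swap.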
I have a CSV file that contains a date column in the format 'dd.mm.yy'. When pandas parses the dates, it treats the day as the month whenever the day is less than or equal to 12, so 05.01.05 becomes 01/05/2005.
How can I solve this issue?
Regards
One way to solve this is to use pandas.to_datetime with the argument dayfirst=True. However, I've had to make assumptions about the format of your data since you have not shared any code. In the example below, the original dtype of the date column is object.
import pandas as pd
df = pd.DataFrame({
'date': ['01.02.20', '25.12.19', '10.03.18'],
})
df['date'] = pd.to_datetime(df['date'], dayfirst=True)
df['date']
0 2020-02-01
1 2019-12-25
2 2018-03-10
Name: date, dtype: datetime64[ns]
I have a CSV file and want to select one specific column (a date string). Then I want to change the format of the date string from yyyymmdd to dd.mm.yyyy for every entry.
I read the CSV file into a DataFrame with pandas and then saved the column with the header DATE to a variable.
import pandas as pd
# read csv file
df = pd.read_csv('csv_file')
# save specific column
df_date_col = df['DATE']
Now I want to change the values in df_date_col. How can I do this?
I know I can do it one step earlier, like this:
df['DATE'] = modify(df['DATE'])
Is this possible using just the variable df_date_col?
If I try df_date_col['DATE']=... it raises a KeyError.
Use to_datetime with Series.dt.strftime:
df['DATE'] = pd.to_datetime(df['DATE'], format='%Y%m%d').dt.strftime('%d.%m.%Y')
Is this possible using just the variable df_date_col?
Sure, but df_date_col is a Series rather than a DataFrame, so you cannot select a column from it again with []:
df_date_col = df['DATE']
df_date_col = pd.to_datetime(df_date_col, format='%Y%m%d').dt.strftime('%d.%m.%Y')
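One point worth spelling out: rebinding df_date_col creates a new Series and leaves the DataFrame as it was, so the result has to be assigned back if the DataFrame should change too. A small sketch with made-up sample values (real data read by read_csv may arrive as integers, in which case an astype(str) step would likely be needed first):

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV contents.
df = pd.DataFrame({"DATE": ["20200131", "20191225"]})

df_date_col = df["DATE"]
# This rebinds the variable to a new Series of formatted strings.
df_date_col = pd.to_datetime(df_date_col, format="%Y%m%d").dt.strftime("%d.%m.%Y")

# The DataFrame column still holds the original strings at this point.
unchanged = df["DATE"].tolist()

# Assign back to update the DataFrame itself.
df["DATE"] = df_date_col
updated = df["DATE"].tolist()
```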
I'm using pandas version 0.12.0 to import a CSV file with dates.
The dates are in the format 'SEP2005'.
I read the CSV file with pandas:
import pandas as pd
mydata = pd.read_csv('mydata.csv')
mydata.head()
Out[40]:
Date Quantity
0 APR2002 282.0000
1 APR2002 NaN
2 APR2002 0.0000
3 APR2002 20.2253
4 APR2002 55.6853
I then turn the Date column into the index using the following:
mydata.index = pd.to_datetime(mydata.pop('Date'))
Here is what is very strange: in the past this parsed my dates and turned the format into
2002-04-15, which is what I want. Then I would just make sure the days were set to the last day of the month:
mydata.index = mydata.index.to_period('M').to_timestamp('M')
In the past, pandas did a great job of picking the best date format.
However, when I do this now I get my DataFrame back with the same text, "APR2002".
As you would guess, the final to_period call does not work on that.
I have not changed my code and I have not updated pandas, so I'm not sure where this change is coming from.
I don't care too much about the why. What I really need help with is how to format the index column as Year-Month-Day (%Y-%m-%d), as in 2005-04-30.
I'm coming from R, so any help would be huge!
You could try
pd.to_datetime(mydata.pop('Date'), format="%b%Y")
but that would expect the date to appear like Apr2002 (note not all caps).
You can specify a datetime format using the format argument, which accepts the standard strftime directives. The pandas documentation covers this as well.
Try:
DF = pd.read_csv('mydata.csv', parse_dates=[0])
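Putting the pieces together, one possible sketch for getting from 'APR2002' to a month-end index is shown below. It assumes an English locale for %b, normalises the case with str.title() to sidestep the all-caps concern raised above, and uses pd.offsets.MonthEnd(0) in place of the to_period/to_timestamp round trip from the question. The two sample rows are illustrative:

```python
import pandas as pd

# Hypothetical sample rows in the question's 'MONYYYY' layout.
mydata = pd.DataFrame({
    "Date": ["APR2002", "SEP2005"],
    "Quantity": [282.0, 20.2253],
})

# 'APR2002' -> 'Apr2002', then parse as abbreviated month + four-digit year.
parsed = pd.to_datetime(mydata.pop("Date").str.title(), format="%b%Y")

# MonthEnd(0) rolls each date forward to the last day of its month.
month_end = parsed + pd.offsets.MonthEnd(0)

mydata.index = pd.DatetimeIndex(month_end, name="Date")

# Year-Month-Day strings, if a text representation is needed.
labels = mydata.index.strftime("%Y-%m-%d").tolist()
```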