Date and Month formatting issue into Excel from DF - python

On Python 3.9 and Pandas 1.3.4.
I'm trying to format two columns of my DataFrame and export it to Excel. The two columns are Date and Month. Date should be formatted as %m/%d/%y and Month as %B %Y.
When I run print(df['Date']) and print(df['Month']), they print 01/04/22 and January 2022 respectively. However, when I run df.to_csv('file.csv'), Excel shows them as 1/4/2022 and Jan-22. I would like them to appear as 01/04/22 and January 2022. How can I solve this?
This is my current code:
import pandas as pd
df = pd.read_csv("file.csv", dtype=str)
df["Date"] = pd.Timestamp("today").strftime("%m/%d/%y")
df["Month"] = pd.Timestamp("today").strftime("%B %Y")
print(df["Date"])
print(df["Month"])
df.to_excel('file.xlsx', index=False)
NOTE: Switching from to_csv to to_excel fixed this issue.
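A likely explanation for the behaviour described above: a CSV file stores only text, so pandas writes 01/04/22 and January 2022 verbatim, and Excel re-interprets date-like text when it opens the file. The sketch below uses a fixed date and an in-memory CSV instead of file.csv (both are assumptions for illustration) to show that the text pandas writes is already in the requested format.

```python
import pandas as pd

# Fixed date for a reproducible illustration (the original code uses "today").
stamp = pd.Timestamp("2022-01-04")

df = pd.DataFrame({
    "Date": [stamp.strftime("%m/%d/%y")],   # plain string, not a datetime
    "Month": [stamp.strftime("%B %Y")],     # plain string, not a datetime
})

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Opening the resulting text in a plain text editor rather than Excel is a quick way to confirm what was actually written.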

Related

Bad datetime conversion in pandas when a CSV file is opened

I have a simple CSV with a Date column and an Activity column, like this:
When I open it with pandas and convert the Date column with pd.to_datetime, the dates change. Where the month changes, like this:
pandas seems to swap the day and the month, or something similar:
The date format I want is dd-mm-yyyy or yyyy-mm-dd.
This is the code I am using:
import pandas as pd
dataset = pd.read_csv(directory + "Time 2020 (Activities).csv", sep = ";")
dataset[["Date"]] = dataset[["Date"]].apply(pd.to_datetime)
How can I fix that?
You could specify the date format in the pd.to_datetime parameters:
dataset['Date'] = pd.to_datetime(dataset['Date'], format='%Y-%m-%d')
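The format string has to match how the dates are actually stored in the file. The file contents are not shown in the question, so the sketch below assumes dd-mm-yyyy strings; the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample values, assumed to be stored as dd-mm-yyyy.
dates = pd.Series(["01-02-2020", "13-02-2020"])

# An explicit format removes the day/month ambiguity of "01-02-2020".
parsed = pd.to_datetime(dates, format="%d-%m-%Y")
print(parsed)

# Render back as dd-mm-yyyy strings if that display format is needed.
as_text = parsed.dt.strftime("%d-%m-%Y")
print(as_text.tolist())
```

If the file stores yyyy-mm-dd instead, the format from the answer above ('%Y-%m-%d') is the one to use.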

Pandas: Multiple date formats in one column

I have two date formats in one Pandas series (column) that need to be standardized into one format (mmm dd & mm/dd/YY)
Date
Jan 3
Jan 2
Jan 1
12/31/19
12/30/19
12/29/19
Even Excel won't recognize the mmm dd format as a date format. I can change the mmm to a fully-spelled out month using str.replace:
df['Date'] = df['Date'].str.replace('Jan', 'January', regex=True)
But how do I add the current year? How do I then convert January 1, 2020 to 01/01/20?
Have you tried parse() from dateutil?
from dateutil.parser import parse
import pandas as pd

def clean_date(text):
    # parse() returns a datetime; fields missing from the text
    # (such as the year in "Jan 3") default to today's values
    return parse(text)

df['Date'] = df['Date'].apply(clean_date)
df['Date'] = pd.to_datetime(df['Date'])
If it's in a data frame use this:
from dateutil.parser import parse
import pandas as pd
for i in range(len(df['Date'])):
    # .loc avoids chained assignment; assumes the default 0..n-1 index
    df.loc[i, 'Date'] = parse(df.loc[i, 'Date'])
df['Date'] = pd.to_datetime(df['Date']).dt.strftime("%d-%m-%Y")
Found the solution (needed to use apply):
import dateutil.parser
df['date'] = df['date'].apply(dateutil.parser.parse)
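The question also asks how to add a year to values like Jan 3 and end up with 01/03/20. Here is a minimal sketch using only the standard library that tries each of the two known formats in turn. The helper name normalize_date and its default_year parameter are hypothetical, and the sketch assumes year-less values belong to 2020.

```python
from datetime import datetime

def normalize_date(text, default_year=2020):
    """Return text as mm/dd/yy; default_year is an assumed year for 'Jan 3'-style values."""
    try:
        # Values such as "12/31/19" already carry a year.
        parsed = datetime.strptime(text, "%m/%d/%y")
    except ValueError:
        # Values such as "Jan 3" have no year, so append the assumed one before parsing.
        parsed = datetime.strptime(f"{text} {default_year}", "%b %d %Y")
    return parsed.strftime("%m/%d/%y")

samples = ["Jan 3", "Jan 1", "12/31/19"]
print([normalize_date(s) for s in samples])
```

On a DataFrame column the same helper could be applied with df['Date'].apply(normalize_date).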

Converting Date Format in a Dataframe from a CSV File [duplicate]

I need to convert the dates in my CSV file into a proper pandas datetime format so I can sort by them later. The current format is hard to work with in pandas, so I have to convert it.
This is what my csv file looks like:
ARTIST,ALBUM,TRACK,DATE
ARTIST1,ALBUM1,TRACK1,23 Nov 2019 02:08
ARTIST1,ALBUM1,TRACK1,23 Nov 2019 02:11
ARTIST1,ALBUM1,TRACK1,23 Nov 2019 02:15
So far I've successfully converted it into pandas format by doing this:
df= pd.read_csv("mycsv.csv", delimiter=',')
convertdate= pd.to_datetime(df["DATE"])
print convertdate
####
#Original date format: 23 Nov 2019 02:08
#Output and desired date format: 2019-11-23 02:08:00
However, that only produces a converted copy of the "DATE" column. Printing the DataFrame still shows the original, unconverted date format. I need to write the converted format back to the source CSV file.
My desired output would then be
ARTIST,ALBUM,TRACK,DATE
ARTIST1,ALBUM1,TRACK1,2019-11-23 02:08:00
ARTIST1,ALBUM1,TRACK1,2019-11-23 02:11:00
ARTIST1,ALBUM1,TRACK1,2019-11-23 02:15:00
There are many options to the read_csv method.
Make sure to read the data in in the format you want instead of fixing it later.
df = pd.read_csv('mycsv.csv', parse_dates=['DATE'])
Just pass in to the parse_dates argument the column names you want transformed.
There were 2 problems in the original code.
It wasn't a part of the original dataframe because you didn't save it back to the column once you transformed it.
so instead of:
convertdate= pd.to_datetime(df["DATE"])
use:
df["DATE"]= pd.to_datetime(df["DATE"])
and for goodness sake stop using python 2.
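from datetime import datetime
# The format must match the file ("23 Nov 2019 02:08"). date_parser is deprecated in pandas 2.0+, where date_format='%d %b %Y %H:%M' replaces it.
dateparse = lambda x: datetime.strptime(x, '%d %b %Y %H:%M')
df = pd.read_csv('mycsv.csv', parse_dates=['DATE'], date_parser=dateparse)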
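The question's stated goal is to get the converted dates back into a CSV file, which the answers above stop short of. A minimal sketch of the full round trip follows, using in-memory text as a stand-in for mycsv.csv (an assumption made so the example is self-contained) and an explicit format for the conversion.

```python
import io
import pandas as pd

# In-memory stand-in for mycsv.csv, using rows from the question.
raw = io.StringIO(
    "ARTIST,ALBUM,TRACK,DATE\n"
    "ARTIST1,ALBUM1,TRACK1,23 Nov 2019 02:08\n"
    "ARTIST1,ALBUM1,TRACK1,23 Nov 2019 02:11\n"
)

df = pd.read_csv(raw)
# Assign the result back to the column so the DataFrame itself changes.
df["DATE"] = pd.to_datetime(df["DATE"], format="%d %b %Y %H:%M")

# date_format controls how datetime columns are written out.
converted = df.to_csv(index=False, date_format="%Y-%m-%d %H:%M:%S")
print(converted)
```

Passing a file path as the first argument to to_csv writes the same text to disk instead of returning it.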

Change date from specific format in pandas?

I currently have a pandas DF with date column in the following format:
JUN 05, 2028
Expected Output:
2028-06-05.
Most of the examples I see online do not use the original format I have posted, is there not an existing solution for this?
Use to_datetime with custom format from python's strftime directives:
df = pd.DataFrame({'dates':['JUN 05, 2028','JUN 06, 2028']})
df['dates'] = pd.to_datetime(df['dates'], format='%b %d, %Y')
print (df)
dates
0 2028-06-05
1 2028-06-06

Stripping and testing against Month component of a date

I have a dataset that looks like this:
import numpy as np
import pandas as pd
raw_data = {'Series_Date':['2017-03-10','2017-04-13','2017-05-14','2017-05-15','2017-06-01']}
df = pd.DataFrame(raw_data,columns=['Series_Date'])
print(df)
I would like to pass in a date parameter as a string as follows:
date = '2017-03-22'
I would now like to know if there are any dates in my DataFrame 'df' for which the month is 3 months after the month in the date parameter.
That is if the month in the date parameter is March, then it should check if there are any dates in df from June. If there are any, I would like to see those dates. If not, it should just output 'No date found'.
In this example, the output should be '2017-06-01' as it is a date from June as my date parameter is from March.
Could anyone help me get started with this?
Convert your column to Timestamp:
df.Series_Date = pd.to_datetime(df.Series_Date)
date = pd.to_datetime('2017-03-01')
Then
df[
    (df.Series_Date.dt.year - date.year) * 12 +
    df.Series_Date.dt.month - date.month == 3
]
Series_Date
4 2017-06-01
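The question also asks for a 'No date found' result when nothing matches, which the filter above does not cover. One way to wrap the same month arithmetic is sketched below; the function name dates_months_after and its months parameter are hypothetical.

```python
import pandas as pd

def dates_months_after(series, date, months=3):
    """Return dates in series whose month is `months` months after the month of `date`."""
    series = pd.to_datetime(series)
    date = pd.to_datetime(date)
    # Count months on one scale so year boundaries are handled (e.g. November -> February).
    diff = (series.dt.year - date.year) * 12 + (series.dt.month - date.month)
    return series[diff == months]

df = pd.DataFrame({"Series_Date": ["2017-03-10", "2017-04-13", "2017-05-14",
                                   "2017-05-15", "2017-06-01"]})
matches = dates_months_after(df["Series_Date"], "2017-03-22")
result = matches.dt.strftime("%Y-%m-%d").tolist() if not matches.empty else "No date found"
print(result)
```

Only the month and year of the date parameter are used, so '2017-03-22' and '2017-03-01' give the same result.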
