Converting a timestamp string to the right format - python

I have a data field in the format '13:55:07 03-01-2023'.
This is the 3rd of January, not the 1st of March.
I want to convert this into a timestamp. When I do it directly using
pd.to_datetime(order_data['exch_tm'])
I get a timestamp like this: Timestamp('2023-03-01 13:55:07')
However, this is incorrect: it is parsed as the 1st of March, whereas it should be the 3rd of January.

Is the datetime format always the same in the data? If so, what about using a format parameter to pd.to_datetime:
>>> import pandas as pd
>>> pd.to_datetime('13:55:07 03-01-2023', format='%H:%M:%S %d-%m-%Y')
Timestamp('2023-01-03 13:55:07')

You just need to specify the format of the dates in your data.
For example:
pd.to_datetime(order_data['exch_tm'], format='%H:%M:%S %d-%m-%Y')
Modify the format string as per your needs; you can find more about the format codes in the datetime documentation.
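As a rough sketch of how this looks on a whole column rather than a single string: the sample rows below are made up for illustration, and only the column name exch_tm comes from the question. Passing errors="coerce" is optional; it is included here on the assumption that you would rather get NaT than an exception for rows that do not match the format.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
order_data = pd.DataFrame(
    {"exch_tm": ["13:55:07 03-01-2023", "09:15:00 25-12-2023", "not a time"]}
)

# An explicit format removes the day/month ambiguity.
# errors="coerce" turns values that do not match the format into NaT
# instead of raising an exception.
order_data["exch_tm"] = pd.to_datetime(
    order_data["exch_tm"], format="%H:%M:%S %d-%m-%Y", errors="coerce"
)

print(order_data["exch_tm"])
```

The first row should come out as the 3rd of January, and the unparseable third row as NaT.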

Related

How do I convert object type data to datetime data in this case?

So I have data given in the format of:
1/1/2022 0:32
I looked up the type with dataframe.dtypes and found that the column has the object dtype. For my further analysis, I think it would be best to convert it to a datetime dtype.
In order to do so, I tried using
dataframe["time"] = pd.to_datetime(["time"], format = "%d%m%y")
, though that just leaves me with NaT showing up in the rows where I want my time to appear. What is the correct way to convert my given data?
Would it be better to split time and date or is there a way to convert the whole?
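You can do it like this:
df['time'] = pd.to_datetime(df['time'])
or
df["time"] = df["time"].astype('datetime64[ns]')
Either will convert the object column to datetime64[ns]. Append .dt.normalize() to the first form only if you want to reset the time of day to midnight. Note that newer pandas versions reject the unit-less 'datetime64', so spell out the unit.
A datetime64[ns] column is displayed in YYYY-MM-DD HH:MM:SS form by default; if you want a different text format you can use:
df['time'] = df['time'].dt.strftime('%m/%d/%Y')
You can change '%m/%d/%Y' to your desired format. Note that strftime returns strings, so the column is no longer a datetime column afterwards.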
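The NaT values in the question most likely come from a format string that does not match the data: "%d%m%y" has no separators, no time part, and a two-digit year. A minimal sketch with an explicit format follows. It assumes the values are day-first (the question does not say whether 1/1/2022 is day/month or month/day), and the second sample row is invented to make the ordering visible.

```python
import pandas as pd

# Sample data modeled on the question; the second row is made up.
df = pd.DataFrame({"time": ["1/1/2022 0:32", "15/1/2022 13:05"]})

# Assumption: day comes before month. Swap %d and %m if your data is month-first.
# Pass the column (df["time"]), not the list ["time"], as in the question.
df["time"] = pd.to_datetime(df["time"], format="%d/%m/%Y %H:%M")

print(df["time"].dtype)
print(df["time"])
```

There is no need to split date and time first; the whole string is parsed in one go, and the parts remain available afterwards through df["time"].dt.date and df["time"].dt.time.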

convert time to UTC in pandas

I have multiple csv files, I've set DateTime as the index.
df6.set_index("gmtime", inplace=True)
#correct the underscores in old datetime format
df6.index = [" ".join( str(val).split("_")) for val in df6.index]
df6.index = pd.to_datetime(df6.index)
The time was put in GMT, but I think it's been saved as BST (British summertime) when I set the clock for raspberry pi.
I want to shift the time one hour backwards. When I use
df6.tz_convert(pytz.timezone('utc'))
it gives me the error below, as it assumes that the time is correct.
Cannot convert tz-naive timestamps, use tz_localize to localize
How can I shift the time by one hour?
Given a column that contains date/time info as string, you would convert to datetime, localize to a time zone (here: Europe/London), then convert to UTC. You can do that before you set as index.
Ex:
import pandas as pd
dti = pd.to_datetime(["2021-09-01"]).tz_localize("Europe/London").tz_convert("UTC")
print(dti) # notice 1 hour shift:
# DatetimeIndex(['2021-08-31 23:00:00+00:00'], dtype='datetime64[ns, UTC]', freq=None)
Note: setting a time zone means that DST is accounted for, i.e. here, during winter you'd have UTC+0 and during summer UTC+1.
To add to FObersteiner's response (sorry, new user, can't comment on answers yet):
In all the real-world situations I've run across (full DataFrames or pandas Series instead of just a single date), .tz_localize() and .tz_convert() need to be called slightly differently: on a Series they go through the .dt accessor.
What's worked for me is
df['column'] = pd.to_datetime(df['column']).dt.tz_localize('Europe/London').dt.tz_convert('UTC')
Without the .dt, I get "index is not a valid DatetimeIndex or PeriodIndex."
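Putting the two answers together, here is a small sketch of the conversion done on a column before it becomes the index. The sample rows and the value column are invented; only the gmtime name is taken from the question, and it assumes the stored times really are London local time.

```python
import pandas as pd

# Hypothetical data: one summer and one winter reading.
df = pd.DataFrame(
    {"gmtime": ["2021-09-01 12:00:00", "2021-12-01 12:00:00"], "value": [1, 2]}
)

# On a Series, the time zone methods live under the .dt accessor.
local = pd.to_datetime(df["gmtime"]).dt.tz_localize("Europe/London")
df["gmtime"] = local.dt.tz_convert("UTC")

# Set the index only after the conversion.
df = df.set_index("gmtime")
print(df.index)
```

The summer row moves back by one hour (12:00 BST becomes 11:00 UTC), while the winter row stays at 12:00 because London is on UTC+0 then.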

Aligning datetime formats for comparison

I'm having trouble aligning two different dates. I have an Excel import which I turn into a DateTime in pandas, and I would like to compare this DateTime with the current DateTime. The trouble is in the formatting of the imported DateTime.
Excel format of the date:
2020-07-06 16:06:00 (which is yyyy-dd-mm hh:mm:ss)
When I add the DateTime to my DataFrame it creates the datatype Object. After I convert it with pd.to_datetime it creates the format yyyy-mm-dd hh:mm:ss. It seems that the month and the day are getting mixed up.
Example code:
df = pd.read_excel('my path')
df['Arrival'] = pd.to_datetime(df['Arrival'], format='%Y-%d-%m %H:%M:%S')
print(df.dtypes)
Expected result:
2020-06-07 16:06:00
Actual result:
2020-07-06 16:06:00
How do I resolve this?
Gr,
Sempah
An ISO-8601 date/time is always yyyy-MM-dd, not yyyy-dd-MM. You've got the month and date positions switched around.
While localized date/time strings are inconsistent about the order of month and date, this particular format where the year comes first always starts with the biggest units (years) and decreases in unit size going right (month, date, hour, etc.)
It's solved. I think that I misunderstood the results. It was already working without my knowledge. Thanks for the help anyway.
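For anyone landing here with the same confusion, a short sketch of the comparison itself may help. The data below is made up and stands in for the Excel import; the point is that the display format plays no role once the column is a real datetime, and strftime is only needed when a particular text layout is wanted.

```python
import pandas as pd

# Hypothetical stand-in for the Excel import.
df = pd.DataFrame({"Arrival": ["2020-07-06 16:06:00", "2100-01-01 00:00:00"]})

# The strings are year-month-day, so the format is %Y-%m-%d.
df["Arrival"] = pd.to_datetime(df["Arrival"], format="%Y-%m-%d %H:%M:%S")

# Comparison works on the datetime values, not on how they are printed.
now = pd.Timestamp.now()
df["in_past"] = df["Arrival"] < now

# Only for display: render day-first text without changing the stored values.
shown = df["Arrival"].dt.strftime("%d-%m-%Y %H:%M")
print(df)
print(shown)
```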

Find the earliest and latest date in a list of dates' string representations

How can I convert this to a Python Date so I can find the latest date in a list?
["2018-06-12","2018-06-13", ...] to date
Then:
max(date_objects)
min(date_objects)
Since you want to convert from a list, you'll need to use my linked duplicate with a list comprehension,
from datetime import datetime
list_of_string_dates = ["2018-06-12","2018-06-13","2018-06-14","2018-06-15"]
list_of_dates= [datetime.strptime(date,"%Y-%m-%d") for date in list_of_string_dates]
print(max(list_of_dates)) # latest
print(min(list_of_dates)) # earliest
2018-06-15 00:00:00
2018-06-12 00:00:00
Basically, you're converting the string representation of your dates to a Python date using datetime.strptime and then applying max and min which are implemented on this type of objects.
import datetime
timestamp = datetime.datetime.strptime("2018-06-12", "%Y-%m-%d")
date_only = timestamp.date()
You can use the datetime module. In particular, since your dates are in ISO 8601 format (YYYY-MM-DD), you can use date.fromisoformat(date_string) (Python 3.7+) to get a date object from each string; date.fromtimestamp expects a numeric POSIX timestamp rather than a string, so it does not apply here. Otherwise, you can use the strptime function to read in a more complicated string; it has a few intricacies that you can figure out by looking at the documentation.
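A minimal sketch using date objects instead of datetime objects, assuming Python 3.7 or later and strings that are all in YYYY-MM-DD form. The list is sample data in the shape given in the question.

```python
from datetime import date

list_of_string_dates = ["2018-06-12", "2018-06-15", "2018-06-13"]

# date.fromisoformat parses YYYY-MM-DD strings directly into date objects.
dates = [date.fromisoformat(s) for s in list_of_string_dates]

earliest = min(dates)
latest = max(dates)

print(earliest)
print(latest)
```

Because zero-padded ISO dates sort the same way as text and as dates, min and max on the raw strings would pick the same entries, but converting first gives you real date objects for any later arithmetic.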

How can I convert a timestamp string of the form "%d%H%MZ" to a datetime object?

I have timestamp strings of the form "091250Z", where the first two numbers are the date and the last four numbers are the hours and minutes. The "Z" indicates UTC. Assuming the timestamp corresponds to the current month and year, how can this string be converted reliably to a datetime object?
I have the parsing to a datetime sorted, but the task quickly becomes nontrivial when going further and I'm not sure how to proceed:
datetime.strptime("091250Z", "%d%H%MZ")
What you need is to replace the year and month of your existing datetime object.
your_datetime_obj = datetime.strptime("091250Z", "%d%H%MZ")
new_datetime_obj = your_datetime_obj.replace(year=datetime.now().year, month=datetime.now().month)
Like this? You've basically already done it, you just needed to assign it a variable
from datetime import datetime
dt = datetime.strptime('091250Z', '%d%H%MZ')
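Since the "Z" means UTC, one possible refinement is to take the current year and month from a single UTC "now" and attach the UTC time zone to the result. The helper name parse_day_time_z and its now parameter are made up for this sketch; they are not part of the datetime module.

```python
from datetime import datetime, timezone


def parse_day_time_z(stamp, now=None):
    """Parse a 'DDHHMMZ' string, assuming the current UTC year and month.

    parse_day_time_z is a hypothetical helper; `now` exists only so the
    reference moment can be fixed, e.g. in tests.
    """
    # Read the clock once so year and month come from the same instant.
    if now is None:
        now = datetime.now(timezone.utc)
    parsed = datetime.strptime(stamp, "%d%H%MZ")
    # replace() raises ValueError if the day does not exist in the
    # current month (for example day 31 in February).
    return parsed.replace(year=now.year, month=now.month, tzinfo=timezone.utc)


reference = datetime(2024, 3, 15, tzinfo=timezone.utc)
print(parse_day_time_z("091250Z", now=reference))
```

Whether a stamp whose day lies ahead of today should instead belong to the previous month depends on where the data comes from, so that case is deliberately left unhandled here.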
