I am trying to work with some time series data recorded as cumulative hours, but I am having trouble getting the times to convert to datetime correctly.
csv format
cumulative_time,temperature
01:03:10,30,
02:03:10,31,
...
22:03:10,30,
23:03:10,29,
24:03:09,29,
25:03:09,25,
etc
df['cumulative_time']=pd.to_datetime(df['cumulative_time'],format='%H:%M:%S').dt.time
keeps yielding the error:
time data '24:03:09' does not match format '%H:%M:%S'
Any thoughts on how to convert just times to datetime format, especially if the hours exceed 24 hours?
You probably want the pd.to_timedelta function instead.
A "datetime" is a point in time, e.g. "at 3pm"; it's complaining about "24:03:09" because hour 24 is not a valid clock time (it would be 0:03:09 the next day).
A "timedelta" is an amount of elapsed time.
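As a minimal sketch of that suggestion, the snippet below builds a small frame from a few of the rows in the question (the trailing commas are dropped) and parses the strings with pd.to_timedelta. The column names elapsed and elapsed_hours are made up for illustration.

```python
import pandas as pd

# A few rows copied from the question's CSV sample
df = pd.DataFrame({
    "cumulative_time": ["01:03:10", "23:03:10", "24:03:09", "25:03:09"],
    "temperature": [30, 29, 29, 25],
})

# Parse the strings as elapsed time; hour values of 24 and above are accepted
df["elapsed"] = pd.to_timedelta(df["cumulative_time"])

# Elapsed time as a plain number of hours, handy for plotting or arithmetic
df["elapsed_hours"] = df["elapsed"].dt.total_seconds() / 3600

print(df)
```

The elapsed column has dtype timedelta64[ns], so "24:03:09" is stored as 1 day plus 00:03:09 rather than being rejected.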
Related
I have a column of float values which are tweet creation dates. This is the code I used to convert them from float to datetime:
t = 1508054212.0
datetime.utcfromtimestamp(t).strftime('%Y-%m-%d %H:%M:%S')
All the values returned belong to October 2017. However, the data is supposed to be collected over multiple months. So the dates should have different months and not just different Hours, Minutes and Seconds.
These are some values which I need to convert:
1508054212.0
1508038548.0
1506890436.0
Please suggest an alternative approach to determine the dates. Thank you.
I assumed df['tweet_creation'].loc[1] returns a number like the examples you gave.
Unfortunately, I don't know what f is, but I assumed it was a float.
My answer is inspired by this other answer: Converting unix timestamp string to readable date. You have a UNIX timestamp, so the easiest way is to use it directly rather than converting it to a string.
from datetime import datetime, timedelta
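# f is assumed to be the same float timestamp; its fractional part is a fraction of a second, so it is added as seconds
dtobj = datetime.utcfromtimestamp(int(df['tweet_creation'].loc[1])) + timedelta(seconds=f-int(f))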
To have the string representation you can use the function strftime.
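For illustration, here is a small standard-library sketch that converts the three sample values from the question and formats them with strftime. The helper name to_utc_string is hypothetical, and it uses datetime.fromtimestamp with an explicit UTC timezone instead of utcfromtimestamp, which is a substitution on my part.

```python
from datetime import datetime, timezone

# Sample values copied from the question (UNIX timestamps in seconds)
samples = [1508054212.0, 1508038548.0, 1506890436.0]

def to_utc_string(ts):
    """Format a UNIX timestamp (seconds) as a UTC date string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S')

for ts in samples:
    print(ts, '->', to_utc_string(ts))
# 1508054212.0 -> 2017-10-15 07:56:52
# 1508038548.0 -> 2017-10-15 03:35:48
# 1506890436.0 -> 2017-10-01 20:40:36
```

All three samples really do fall in October 2017, so the conversion in the question looks correct for these values; dates in other months would need timestamps outside this range.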
I'm converting a string-type time series to datetime in Python, and I'm confused about why my datetime always displays a result I don't expect.
What I want is shown in my image here.
import datetime
time = '23:30:00' # Time in string format
dt=datetime.datetime.strptime(time, '%H:%M:%S')
print(dt.time()) # time method will only return the time
I hope this helps
You should put your question in the question, not some off-site illustration. We do have code blocks available. Also, you converted to Pandas datetime, not Python datetime. Both of these have "date" in their name because they do contain the date. You could represent just a time using e.g. Pandas timedelta or Python datetime.time. The format you pass to pandas.to_datetime is how to parse the input, not how to display the result.
You have converted your string Series to a Series of pd.Timestamp. Internally a Timestamp is a number of nanoseconds from 1970-01-01 00:00:00.
The correct way to format a date in pandas is to convert it to a string with .dt.strftime, when you no longer need to process it as a Timestamp.
TL;DR:
if you want it in HH:MM:SS format, leave it in string dtype
if you need to process it as a Timestamp and yet have it in HH:MM:SS format, convert it to Timestamp, process it, and when done convert it back to a string
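The second option can be sketched roughly as follows. The sample strings and the 45-minute shift are invented purely to have something to process.

```python
import pandas as pd

# Hypothetical sample data: times stored as HH:MM:SS strings
times = pd.Series(["23:30:00", "01:15:00", "12:00:05"])

# 1. Convert to Timestamp; format describes the input, and the date defaults to 1900-01-01
stamps = pd.to_datetime(times, format="%H:%M:%S")

# 2. Process as Timestamp (here: shift every value by 45 minutes)
shifted = stamps + pd.Timedelta(minutes=45)

# 3. Convert back to strings only once processing is finished
as_text = shifted.dt.strftime("%H:%M:%S")

print(as_text.tolist())  # ['00:15:00', '02:00:00', '12:45:05']
```

Note that the first value rolls over midnight; the date part is still tracked internally and is only hidden by the final strftime call.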
I have a large list of timestamps in nanoseconds (can easily be converted to milliseconds). I now want to make an instance of DatetimeIndex using these timestamps. Yet simply passing
timestamps = [3377536510631, 3377556564631, 3377576837400, 3377596513631, ...]
dti = DatetimeIndex(timestamps)
yields dates in 1970, yet they should be in 2017. Dividing them by a million to get milliseconds gives the same result. It seems that the input isn't as expected, but I don't know how to easily set either the input or the parameters correctly.
Your timestamps probably have a wrong starting time (wrong offset). This usually happens if the time is not set correctly on the measurement device. If you cold-start the measurement, it will probably start at timestamp 0, which is 01/01/1970.
If you know the exact time and date the measurement was started, simply subtract the .min() value from the timestamp column and add the timestamp of the actual start time to the result.
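A rough sketch of that re-basing, using the four values shown in the question. The start time below is a made-up placeholder, since the real start of the measurement is not given.

```python
import pandas as pd

# Values copied from the question, taken to be nanoseconds as stated there
timestamps = [3377536510631, 3377556564631, 3377576837400, 3377596513631]

# ASSUMPTION: hypothetical start of the measurement; replace with the real one
start = pd.Timestamp("2017-06-01 12:00:00")

ts = pd.Series(timestamps)

# Subtract the smallest value so the first sample sits at elapsed time zero
elapsed = pd.to_timedelta(ts - ts.min(), unit="ns")

# Add the known start time to get absolute timestamps
dti = pd.DatetimeIndex(start + elapsed)

print(dti)
```

This keeps the spacing between samples exactly as recorded and only shifts the origin from the device's zero point to the chosen start time.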
I'm working with some video game speedrunning (basically, races where people try to beat a game as fast as they can) data, and I have many different run timings in HH:MM:SS format. I know it's possible to convert to seconds, but I want to keep in this format for the purposes of making the axes on any graphs easy to read.
I have all the data in a data frame already and tried converting the timing data to datetime format, with format = '%H:%M:%S', but it just uses this as the time on 1900-01-01.
data=[['Aggy','01:02:32'], ['Kirby','01:04:54'],['Sally','01:06:04']]
df=pd.DataFrame(data, columns=['Runner','Time'])
df['Time']=pd.to_datetime(df['Time'], format='%H:%M:%S')
I thought specifying the format to be just hours/minutes/seconds would strip away any date, but when I print out the header of my dataframe, it says that the time data is now 1900-01-01 01:02:32, as an example. 1:02:32 AM on January 1st, 1900. I want Python to recognize the 1:02:32 as a duration of time, not a datetime format. What's the best way to go about this?
The format argument defines the format of the input date, not the format of the resulting datetime object (reference).
For your needs you can either use just the H:M:S (time) part of the datetime, or use the to_timedelta method.
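Both options can be sketched on the data from the question. The column names Duration, Behind and ClockTime are illustrative only.

```python
import pandas as pd

data = [["Aggy", "01:02:32"], ["Kirby", "01:04:54"], ["Sally", "01:06:04"]]
df = pd.DataFrame(data, columns=["Runner", "Time"])

# Option 1: treat the run times as durations (timedelta64), which supports arithmetic
df["Duration"] = pd.to_timedelta(df["Time"])
df["Behind"] = df["Duration"] - df["Duration"].min()  # gap to the fastest run

# Option 2: keep only the time-of-day part of the parsed datetime (plain datetime.time objects)
df["ClockTime"] = pd.to_datetime(df["Time"], format="%H:%M:%S").dt.time

print(df)
```

The timedelta column is usually the better fit for run times, because differences, sorting and means all work on it directly, whereas datetime.time objects do not support subtraction.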
I am trying to figure out the best way to create a list of timestamps in Python, where the values for the items in the list increment by one minute. The timestamps would be by minute, and would be for the previous 24 hours. I need to create timestamps of the format "MM/dd/yyyy HH:mm:ss" or to at least contain all of those measures. The timestamps will be an axis for a graph of data that I am collecting.
Calculating the times alone isn't too bad, as I could just get the current time, convert it to seconds, and change the value by one minute very easily. However, I am kind of stuck on figuring out the date aspect of it without having to do a lot of checking, which doesn't feel very Pythonic.
Is there an easier way to do this? For example, in JavaScript, you can get a Date() object, and simply subtract one minute from the value and JS will take care of figuring out if any of the other fields need to change and how they need to change.
datetime is the way to go, you might want to check out This Blog.
import datetime
now = datetime.datetime.now()
print(now)
print(now.ctime())
print(now.isoformat())
print(now.strftime("%Y%m%dT%H%M%S"))
This would output
2003-08-05 21:36:11.590000
Tue Aug 5 21:36:11 2003
2003-08-05T21:36:11.590000
20030805T213611
You can also do subtraction with datetime and timedelta objects
now = datetime.datetime.now()
minute = datetime.timedelta(days=0, seconds=60, microseconds=0)
print(now - minute)
would output
2015-07-06 10:12:02.349574
You are looking for datetime and timedelta objects. See the docs.
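Putting the two together for the original question, a minimal sketch might look like this. The function name minute_stamps and its parameters are hypothetical, and "%m/%d/%Y %H:%M:%S" is the strftime equivalent of the "MM/dd/yyyy HH:mm:ss" pattern asked for.

```python
from datetime import datetime, timedelta

def minute_stamps(end, hours=24):
    """Return one datetime per minute for the `hours` leading up to `end` (inclusive)."""
    start = end - timedelta(hours=hours)
    count = hours * 60
    # timedelta arithmetic handles day, month and year rollovers automatically
    return [start + timedelta(minutes=i) for i in range(1, count + 1)]

# Drop seconds and microseconds so every stamp falls on a whole minute
now = datetime.now().replace(second=0, microsecond=0)
stamps = minute_stamps(now)

# Format for display, e.g. as axis labels
labels = [t.strftime("%m/%d/%Y %H:%M:%S") for t in stamps]

print(len(labels), labels[0], labels[-1])
```

Keeping the list as datetime objects and formatting only at the end means the same values can be passed to a plotting library directly if it understands datetimes.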