This question already has answers here:
Convert Pandas Column to DateTime
(8 answers)
Closed 1 year ago.
I have a dataframe which contains a datetime column like this:
As you can see, in the "date_time" column the smallest time unit is the minute; it has no seconds component. For example, in the first six rows 4:24 is repeated, which means the data were gathered every 10 seconds, or 4:25 is repeated 10 times, which means the data were recorded every 6 seconds.
I am looking for a way to include seconds in the "date_time" column.
The desirable format is like this:
Just use the to_datetime() method of pandas.
Solution:
df['date_time'] = pd.to_datetime(df['date_time'])
Then use the apply() method:
df['date_time'] = df['date_time'].apply(lambda x: x.strftime("%H:%M:%S"))
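For completeness, a minimal runnable sketch of the same two steps (the column name is taken from the question, but the sample values are made up, since the original frame is only shown as an image):
import pandas as pd

# Hypothetical stand-in for the frame shown in the question
df = pd.DataFrame({'date_time': ['2021-01-01 4:24', '2021-01-01 4:24', '2021-01-01 4:25']})

df['date_time'] = pd.to_datetime(df['date_time'])  # parse the strings into Timestamps
df['date_time'] = df['date_time'].apply(lambda x: x.strftime("%H:%M:%S"))  # keep only HH:MM:SS
print(df)  # seconds are shown, but as :00, since the source data has none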
This question already has an answer here:
Pandas read_excel: parsing Excel datetime field correctly [duplicate]
(1 answer)
Closed 1 year ago.
I'm trying to convert an entire column containing a 5-digit date code (e.g. 43390, 43599) to a normal date format. This is just to make data analysis easier; it doesn't matter which way it's formatted. As a Series, the DATE column looks like this:
1 43390
2 43599
3 43605
4 43329
5 43330
...
264832 43533
264833 43325
264834 43410
264835 43461
264836 43365
I don't understand the previous answers to this question, and when I tried code such as
date_col = df.iloc[:,0]
print((datetime.utcfromtimestamp(0) + timedelta(date_col)).strftime("%Y-%m-%d"))
I get this error
unsupported type for timedelta days component: Series
Thanks, sorry if this is a basic question.
You are selecting a whole column of the dataframe and assigning it to date_col, so date_col is a Series. If you want the value from your first row, for example, use date_col = df.iloc[0, 0]; that will return a single value.
timedelta takes an integer value for its days component, not a Series.
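If the 5-digit values are Excel serial day numbers (an assumption based on the sample values, which land in 2018-2019 with the usual 1899-12-30 origin), the whole column can be converted at once without timedelta; a minimal sketch:
import pandas as pd

# Hypothetical column of Excel serial dates
df = pd.DataFrame({'DATE': [43390, 43599, 43605]})

# to_datetime accepts a day unit and an origin, so the conversion is vectorized
df['DATE'] = pd.to_datetime(df['DATE'], unit='D', origin='1899-12-30')
print(df['DATE'].dt.strftime("%Y-%m-%d"))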
This question already has answers here:
Pandas - convert strings to time without date
(3 answers)
Closed 1 year ago.
I have a stop_time column with values like 05:38 (MM:SS), but it is showing up as an object. Is there a way to turn this into a time?
I tried using perf_dfExtended['Stop_Time'] = pd.to_datetime(perf_dfExtended['Stop_Time'], format='%M:%S')
but then it adds a date to the output: 1900-01-01 00:05:38
I guess what you're looking for is pd.to_timedelta (https://pandas.pydata.org/docs/reference/api/pandas.to_timedelta.html) rather than to_datetime, which will of course always try to create a date.
What you have to keep in mind, though, is that pd.to_timedelta can raise a ValueError for your column as it stands, because it expects an hh:mm:ss format. Try using apply on your column to prepend '00:' to each value (which I think are strings?), and then convert the column to timedelta. It could be something like:
pd.to_timedelta(perf_dfExtended['Stop_Time'].apply(lambda x: f'00:{x}'))
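A minimal sketch of that idea, using made-up sample values for the column:
import pandas as pd

perf_dfExtended = pd.DataFrame({'Stop_Time': ['05:38', '10:17', '23:45']})  # assumed sample data

# Prepend '00:' so the strings match the hh:mm:ss layout that to_timedelta expects
perf_dfExtended['Stop_Time'] = pd.to_timedelta(
    perf_dfExtended['Stop_Time'].apply(lambda x: f'00:{x}'))
print(perf_dfExtended['Stop_Time'])  # dtype: timedelta64[ns]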
This may work for you:
perf_dfExtended['Stop_Time'] = \
pd.to_datetime(perf_dfExtended['Stop_Time'], format='%M:%S').dt.time
Output (with some additional examples)
0 00:05:38
1 00:10:17
2 00:23:45
This question already has answers here:
Resampling Minute data
(2 answers)
Closed 2 years ago.
I have some dataset. Let's presume it is:
dataset = pd.read_csv('some_stock_name_here.csv', index_col=['Date'], parse_dates=['Date'])
The csv file has 2500 observations (Date and Close price), and I want to create a new csv file that includes the same time series but at a much lower frequency, for example every 40th row of the original. How can I do this?
Also, I'm wondering whether I could manipulate that frequency within the notebook without creating a new csv file.
Thanks in advance.
You can slice your df using iloc:
This goes over all rows and takes those at indexes divisible by X.
X = 40
df.iloc[::X]
Saving the dataframe is achieved by the following code:
df.to_csv(FILE_PATH_HERE)
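Putting the two pieces together, a sketch under the assumptions of the question (the output file name is a placeholder):
import pandas as pd

dataset = pd.read_csv('some_stock_name_here.csv', index_col=['Date'], parse_dates=['Date'])

X = 40
reduced = dataset.iloc[::X]  # keep every 40th row, starting from the first

# Write the thinned series to a new csv file
reduced.to_csv('some_stock_name_here_reduced.csv')

# For the second part of the question: you can simply keep working with `reduced`
# inside the notebook instead of writing it out to a file.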
This question already has answers here:
Repeat each row of data.frame the number of times specified in a column
(10 answers)
Closed 2 years ago.
Is there a way in Excel, Python, or R to convert data that is in the format of time and quantity per date into one long column? For instance:
Current format:
Instead, I want this data to be one long column of 17 zeros, followed by a single 1, then 176 zeros, and so on.
Thank you in advance for any help.
To elaborate, the data looks like this:
Current data:
And I need this data to look like this:
Final result:
One option with uncount
library(tidyr)
uncount(dat, quantity)
Or with rep
with(dat, rep(time, quantity))
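Since the question also allows a Python answer, a pandas equivalent is sketched below (the frame name dat and the columns time and quantity are taken from the R snippets above; the values are hypothetical):
import pandas as pd

# Hypothetical data: a value and how many times it should be repeated
dat = pd.DataFrame({'time': [0, 1, 0], 'quantity': [17, 1, 176]})

# Repeat each `time` value `quantity` times, giving one long column
long_col = dat['time'].repeat(dat['quantity']).reset_index(drop=True)
print(long_col)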
This question already has answers here:
Pandas groupby: How to get a union of strings
(8 answers)
Closed 3 years ago.
I'm new to pandas and I was able to create a dataframe from a csv file. I was also able to sort it.
What I am struggling with now is the following. I give an image of a pandas data frame as an example:
First column is the index,
Second column is a group number
Third column is what happened.
Based on the second column (the group number), I want to extract the values of the third column as a sequence for each unique group, within the same data frame.
I highlight a few examples: for group number 9, get back the sequence
[60,61,70,51]
For group number 6, get back the sequence
[65,55,56]
For group number 8, get back the single element 8.
How can groupby be used to do this extraction?
Thanks a lot
Regards
Alex
Starting from the answers to this question, we can use the following code to get the desired result.
dataframe = pd.DataFrame({'index':[0,1,2,3,4], 'groupNumber':[9,9,9,9,9], 'value':[12,13,14,15,16]})
grouped = dataframe.groupby('groupNumber')['value'].apply(list)
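To pull a single group's sequence out of the result, index the grouped Series by the group number (the values below come from the toy frame above, not from the image in the question):
print(grouped.loc[9])  # -> [12, 13, 14, 15, 16] for the toy data;
                       #    with the real data, group 9 would give back [60, 61, 70, 51]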