This question already has answers here:
Resampling Minute data
(2 answers)
Closed 2 years ago.
I have a dataset. Let's assume it is loaded like this:
dataset = pd.read_csv('some_stock_name_here.csv', index_col=['Date'], parse_dates=['Date'])
The CSV file has 2500 observations (Date and Close price), and I want to create a new CSV file that contains the same time series at a much lower frequency, for example only every 40th row of the original. How can I do this?
Also, I'm wondering whether I could change that frequency within the notebook without creating a new CSV file.
Thanks in advance.
You can slice your df using iloc with a step. This keeps every X-th row, i.e. the rows whose positions are divisible by X:
X = 40
df.iloc[::X]
Saving the data frame to a CSV file is done with the following code:
df.to_csv(FILE_PATH_HERE)
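Putting the two steps together, here is a minimal sketch. The synthetic 2500-row frame, the Close values, and the output file name are assumptions standing in for the real CSV; only the iloc slice and to_csv come from the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real CSV: 2500 daily rows with a Close column.
dates = pd.date_range("2020-01-01", periods=2500, freq="D")
dataset = pd.DataFrame({"Close": np.arange(2500, dtype=float)}, index=dates)
dataset.index.name = "Date"

X = 40
# Keep rows at positions 0, 40, 80, ...; the original frame is left unchanged.
downsampled = dataset.iloc[::X]

# Only needed if a new file is wanted (file name is an assumption):
# downsampled.to_csv("downsampled.csv")
```

Since the slice returns a new DataFrame, the reduced series can be used directly in the notebook; writing a CSV is optional.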
This question already has answers here:
How do I transpose dataframe in pandas without index?
(3 answers)
Closed 11 months ago.
I am trying to analyze Chinese GDP by province. I want to make a line chart that shows how GDP changes over time, but I cannot group the data.
I want to pivot the table, but it is not working the way I want.
Instead, I want it to look like this:
It looks like you want to swap the rows and columns. Use transpose; the T attribute is a shorthand for it.
transposed_df = df_data.T
print(transposed_df)
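As a rough illustration, assuming the table has one row per province and one column per year (the column name "Province", the province labels, and all numbers below are made up), setting the province as the index before transposing keeps the default integer index out of the result:

```python
import pandas as pd

# Hypothetical wide table: one row per province, one column per year.
df_data = pd.DataFrame({
    "Province": ["A", "B"],
    "2018": [1.0, 2.0],
    "2019": [1.5, 2.5],
})

# Move the province names into the index, then swap rows and columns.
transposed_df = df_data.set_index("Province").T
print(transposed_df)
```

After this, each province is a column and each year is a row, which is the shape a line chart over time usually expects.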
This question already has answers here:
Convert Pandas Column to DateTime
(8 answers)
Closed 1 year ago.
I have a dataframe which contains a datetime column like this:
As you can see, the smallest time unit in the "date_time" column is the minute; there is no seconds component. For example, 4:24 is repeated in the first six rows, which means the data was gathered every 10 seconds, while 4:25 is repeated 10 times, which means the data was recorded every 6 seconds.
I am looking for a way to include seconds in the "date_time" column.
The desirable format is like this:
Just use the to_datetime() function of pandas:
df['date_time']=pd.to_datetime(df['date_time'])
Then use the apply() method to format each value as an HH:MM:SS string:
df['date_time']=df['date_time'].apply(lambda x:x.strftime("%H:%M:%S"))
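If the goal is to reconstruct the missing seconds rather than only display them, one possible sketch is to spread the rows that share a minute evenly across that minute. This assumes the readings within each minute are evenly spaced and already in order; the sample timestamps are invented, and this is not part of the answer above.

```python
import pandas as pd

# Hypothetical data: six readings stamped 04:24 and ten stamped 04:25.
df = pd.DataFrame({
    "date_time": ["2021-01-01 04:24"] * 6 + ["2021-01-01 04:25"] * 10
})
df["date_time"] = pd.to_datetime(df["date_time"])

grouped = df.groupby("date_time")["date_time"]
position = grouped.cumcount()      # 0, 1, 2, ... within each minute
size = grouped.transform("size")   # number of rows sharing that minute

# Assumption: rows in a minute are evenly spaced, so row k of n sits at k * 60 / n seconds.
offset = pd.to_timedelta(position * 60 // size, unit="s")
df["date_time"] = df["date_time"] + offset
```

With six rows in a minute the step is 10 seconds, and with ten rows it is 6 seconds, matching the spacing described in the question.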
This question already has answers here:
Repeat each row of data.frame the number of times specified in a column
(10 answers)
Closed 2 years ago.
Is there a way in Excel, Python, or R to convert data that is in the format of time and quantity per date into one long column? For instance:
Current format:
Instead, I want this data to be one long column of 17 0s followed by one 1, then 176 0s, etc.
Thank you in advance for any help.
To elaborate the data looks like this:
Current data:
And I need this data to look like this:
Final result:
One option is uncount from tidyr:
library(tidyr)
uncount(dat, quantity)
Or with rep from base R:
with(dat, rep(time, quantity))
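Since the question also allows Python, a rough pandas equivalent of the two R options might look like this. The column names time and quantity follow the R code above; the sample values are invented.

```python
import pandas as pd

# Hypothetical data mirroring the R example: each time value and how often to repeat it.
dat = pd.DataFrame({"time": [0, 1, 0], "quantity": [3, 1, 2]})

# Like rep(time, quantity): one long column of repeated values.
long_column = dat["time"].repeat(dat["quantity"]).reset_index(drop=True)

# Like uncount(dat, quantity): repeat whole rows instead of a single column.
expanded = dat.loc[dat.index.repeat(dat["quantity"])].reset_index(drop=True)
```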
This question already has answers here:
The first three max value in a column in python
(1 answer)
Count and Sort with Pandas
(5 answers)
Closed 3 years ago.
I am doing an online course which has a problem like "Find the name of the state with the maximum number of counties". The problem dataframe is shown in the image below.
Problem Dataframe
I have now given the dataframe a two-level hierarchical index, and after that the dataframe looks like the image below.
Modified Dataframe
I have used this code to get the modified dataframe:
def answer_five():
    new_df = census_df[census_df['SUMLEV'] == 50]
    new_df = new_df.set_index(['STNAME', 'CTYNAME'])
    return new_df

answer_five()
What I want to do now is find the name of the state with the most counties, i.e. the first-level index value with the maximum number of rows. How can I do that?
I know this can be done with something like the groupby() method, but I'm not familiar with it yet and so don't want to use it. Can anyone help? I have searched for this without success. Sorry if the problem is rudimentary. Thanks in advance.
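One way to do this without groupby, sketched here under the assumption that the frame is indexed by STNAME and CTYNAME as in the code above, is to count how often each state name appears in the first index level. The tiny census_df below is invented sample data, not the course dataset.

```python
import pandas as pd

# Hypothetical miniature of the census data (SUMLEV 40 = state row, 50 = county row).
census_df = pd.DataFrame({
    "SUMLEV": [40, 50, 50, 40, 50, 50, 50],
    "STNAME": ["Alabama", "Alabama", "Alabama", "Texas", "Texas", "Texas", "Texas"],
    "CTYNAME": ["Alabama", "Autauga County", "Baldwin County",
                "Texas", "Anderson County", "Andrews County", "Angelina County"],
})

new_df = census_df[census_df["SUMLEV"] == 50].set_index(["STNAME", "CTYNAME"])

# Count rows per state from the first index level, then take the most frequent label.
counties_per_state = new_df.index.get_level_values("STNAME").value_counts()
state_with_most = counties_per_state.idxmax()
```

Here value_counts does the counting and idxmax returns the state name with the highest count.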
This question already has answers here:
Pandas groupby: How to get a union of strings
(8 answers)
Closed 3 years ago.
I am new to pandas. I was able to create a dataframe from a CSV file and to sort it.
What I am struggling with now is the following. I give an image of a pandas data frame as an example.
The first column is the index,
the second column is a group number,
and the third column is what happened.
Based on the second column, I want to extract the values of the third column for each unique group number.
To highlight a few examples: for the number 9, return the sequence
[60,61,70,51]
For the number 6, get back the sequence
[65,55,56]
For the number 8, get back the single element 8.
How can groupby be used to do this extraction?
Thanks a lot
Regards
Alex
Based on the answers to the linked question, the following code produces the desired result:
import pandas as pd
dataframe = pd.DataFrame({'index':[0,1,2,3,4], 'groupNumber':[9,9,9,9,9], 'value':[12,13,14,15,16]})
grouped = dataframe.groupby('groupNumber')['value'].apply(list)
print(grouped)
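To see the same idea with several groups, here is a small sketch using the sequences quoted in the question. The column names are reused from the answer; the row order is an assumption, since the original image is not available.

```python
import pandas as pd

# Values taken from the question text; row order is assumed.
dataframe = pd.DataFrame({
    "groupNumber": [9, 9, 9, 9, 6, 6, 6, 8],
    "value": [60, 61, 70, 51, 65, 55, 56, 8],
})

# One list of values per group number, in original row order.
grouped = dataframe.groupby("groupNumber")["value"].apply(list)

sequence_for_9 = grouped.loc[9]
sequence_for_6 = grouped.loc[6]
```

The result is a Series indexed by group number, so grouped.loc[9] looks up the list for group 9, and grouped.to_dict() gives a plain dictionary if that is easier to work with.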