how to group several columns into one column [duplicate] - python

This question already has answers here:
How do I transpose dataframe in pandas without index?
(3 answers)
Closed 11 months ago.
I am trying to analyze Chinese GDP by province. I want to make a line chart that shows how GDP changes over time, but I cannot group the columns.
I tried to pivot the table, but it does not work the way I want.
I want to lay it out differently (the desired layout was shown in an image in the original post).

It looks like you want to switch the x and y axes, that is, swap rows and columns. Use transpose, which is available as the T attribute:
transposed_df = df_data.T
print(transposed_df)
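As a minimal sketch of what the transpose does, assume a table with one row per province and one column per year. The province names and numbers below are placeholders, not real GDP figures, and the original table layout is an assumption because it is not shown.

```python
import pandas as pd

# Hypothetical layout: one row per province, one column per year.
# The values are placeholders, not real GDP data.
df_data = pd.DataFrame(
    {"2019": [1.0, 2.0], "2020": [3.0, 4.0]},
    index=["Province A", "Province B"],
)

# .T swaps rows and columns: years become the index,
# and each province becomes its own column.
transposed_df = df_data.T
print(transposed_df)
```

With years on the index, calling transposed_df.plot() (which needs matplotlib installed) should draw one line per province.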

Related

Show the entire output when pandas dataframe is big (when many nan values in multiple columns need to be shown) [duplicate]

This question already has answers here:
Pandas: Setting no. of max rows
(10 answers)
Closed 9 months ago.
What pandas display setting (or other approach) do I need in order to see the entire output of the following:
As you can see, only a few lines are shown, but I would like to see all of them (669 in total), each on its own line, so that I can see how many NaN values there are per line (and so find the 27 spread across the dataframe).
You can try:
DataFrame.to_string()
Or:
DataFrame.to_markdown()
Or:
pandas.set_option('display.max_rows', None)
You could also display only the rows that contain NaN values:
- by column:
df[df['column name'].isna()]
- or for the entire dataframe:
df[df.isna().any(axis=1)]
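Since the question asks how many NaN values each line contains, a per-row count may be more direct than reading the full printout. The sketch below uses a small made-up frame (the column names a and b are assumptions) and lifts the row limit only temporarily with pandas.option_context.

```python
import numpy as np
import pandas as pd

# Made-up sample data; the real frame has 669 rows.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

# Number of NaN values in each row.
nan_per_row = df.isna().sum(axis=1)

# Only the rows that contain at least one NaN.
rows_with_nan = df[df.isna().any(axis=1)]

# option_context removes the row limit only inside the with block,
# so the global display setting is left untouched afterwards.
with pd.option_context("display.max_rows", None):
    print(nan_per_row)
    print(rows_with_nan)
```

Summing nan_per_row gives the total number of NaN values in the frame.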

How to reverse a series in pandas by Python? [duplicate]

This question already has an answer here:
Reversing the order of values in a single column of a Dataframe
(1 answer)
Closed 1 year ago.
I need to reverse a series for correct plotting.
So I wrote the code below:
dataFrame["close"] = dataFrame["close"][::-1]
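But nothing changes. Why?
Assigning the reversed column back has no effect because pandas aligns the values on the index, so each value returns to the row it came from. Reverse the entire dataframe instead (both calls return a new dataframe):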
dataFrame.reindex(index=dataFrame.index[::-1])
or
dataFrame.iloc[::-1]
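The index alignment behind this can be seen on a tiny frame. This is a sketch with made-up values; the column names unchanged and reversed are hypothetical, and to_numpy() is one possible way to drop the index if only a single column should be reversed.

```python
import pandas as pd

# Made-up sample data.
dataFrame = pd.DataFrame({"close": [1, 2, 3]})

# The reversed Series keeps its index labels (2, 1, 0), so assignment
# aligns each value back onto its original row: nothing changes.
dataFrame["unchanged"] = dataFrame["close"][::-1]

# Dropping the index first (plain NumPy array) assigns by position,
# so the column really is reversed.
dataFrame["reversed"] = dataFrame["close"].to_numpy()[::-1]

# Reversing the whole frame returns a new object with rows flipped.
flipped = dataFrame.iloc[::-1]
print(dataFrame)
print(flipped)
```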

Finding mean from two different variables [duplicate]

This question already has answers here:
Get statistics for each group (such as count, mean, etc) using pandas GroupBy?
(9 answers)
Pandas Groupby: Count and mean combined
(6 answers)
Closed 2 years ago.
I have a dataset like this:
df = pd.DataFrame({"type" :["A","B","C","A","B","B"], "value": [40,25,33,22,45,62]})
I want to find the mean for each individual type, i.e., type A has a mean of 31.
I did it by subsetting:
df_a = df.loc[df['type']=="A"]
df_a['value'].mean()
I want to do it in a single line.
Thanks in advance.
A possible solution might be:
df.groupby('type')['value'].mean()
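Run against the dataframe from the question, the one-liner gives a mean per type. The sketch below also shows agg, which the linked duplicate covers, as one way to get a count and a mean together; the variable names means and stats are illustrative.

```python
import pandas as pd

# Data taken from the question.
df = pd.DataFrame({"type": ["A", "B", "C", "A", "B", "B"],
                   "value": [40, 25, 33, 22, 45, 62]})

# One mean per type, returned as a Series indexed by type.
means = df.groupby("type")["value"].mean()
print(means)

# Several statistics at once, returned as a DataFrame.
stats = df.groupby("type")["value"].agg(["count", "mean"])
print(stats)
```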

groupby and extract data [duplicate]

This question already has answers here:
Pandas groupby: How to get a union of strings
(8 answers)
Closed 3 years ago.
I am new to pandas and I was able to create a dataframe from a CSV file. I was also able to sort it.
What I am struggling with now is the following. I give an image as an example of a pandas dataframe.
First column is the index,
Second column is a group number
Third column is what happened.
Based on the second column, I want to extract the values of the third column from the same dataframe.
I highlight a few examples: For the number 9 return back the sequence
[60,61,70,51]
For the number 6 get back the sequence
[65,55,56]
For the number 8 get back the single element 8.
How can groupby be used to do this extraction?
Thanks a lot
Regards
Alex
Starting from the answers to the linked question, we can derive the following code to get the desired result:
dataframe = pd.DataFrame({'index':[0,1,2,3,4], 'groupNumber':[9,9,9,9,9], 'value':[12,13,14,15,16]})
grouped = dataframe.groupby('groupNumber')['value'].apply(list)
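The same call can be checked against the sequences the question expects. The frame below is a reconstruction: the group numbers and values come from the question text, but the row order and the column names groupNumber and value are assumptions, since the original image is not available.

```python
import pandas as pd

# Reconstructed from the examples in the question; row order is assumed.
dataframe = pd.DataFrame({
    "groupNumber": [9, 9, 9, 9, 6, 6, 6, 8],
    "value": [60, 61, 70, 51, 65, 55, 56, 8],
})

# Collect the values of each group into a Python list,
# preserving the order in which the rows appear.
grouped = dataframe.groupby("groupNumber")["value"].apply(list)
print(grouped)

# Look up one group by its number.
print(grouped.loc[9])
```

A group with a single row still comes back as a one-element list, so group 8 yields [8] rather than a bare 8.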

How to filter dataset to contain only specific keywords? [duplicate]

This question already has answers here:
Use a list of values to select rows from a Pandas dataframe
(8 answers)
Filter dataframe rows if value in column is in a set list of values [duplicate]
(7 answers)
Closed 4 years ago.
I have a dataset which contains multiple countries.
How can I filter it so that it contains only specific countries?
For example now it contains UK, Belgium, France, ...etc
I would like to filter it so that it shows only France and Belgium.
So far I have tried this:
dataset = dataset.loc[dataset.Country == "France"].copy()
dataset.head()
and it works, because it keeps only the data for France, but if I add Belgium:
dataset = dataset.loc[dataset.Country == "France","Belgium"].copy()
dataset.head()
It doesn't work any more.
I get the following error:
'the label [Belgium] is not in the [columns]'
Any help will be highly appreciated.
What you tried failed because loc treats 'Belgium' as a column label to look for, and no such column exists. If you want to filter against multiple values, use isin:
dataset = dataset[dataset['Country'].isin([ "France","Belgium"])].copy()
When you use loc, the argument after the comma is treated as the label to look for, in this case on the column axis.
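A small sketch of both points, using a made-up dataset (the Value column and its numbers are assumptions): isin builds the row mask, and the slot after the comma in loc selects columns.

```python
import pandas as pd

# Made-up sample data.
dataset = pd.DataFrame({
    "Country": ["UK", "Belgium", "France", "France"],
    "Value": [1, 2, 3, 4],
})

# Boolean mask: True where Country is one of the listed values.
mask = dataset["Country"].isin(["France", "Belgium"])

# Keep only the matching rows.
filtered = dataset[mask].copy()
print(filtered)

# With loc, the first slot selects rows and the second selects columns.
values = dataset.loc[mask, "Value"]
print(values)
```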
