I have a dataframe like this:
Continent Surplus1980 Surplus1985 ... Surplus2005 Surplus2010
Africa -711.186834 -894.362995 ... -1001.189049 -960.203280
Asia -1464.995609 -1528.688190 ... -1511.834129 -1529.459409
Europe 716.832130 580.341819 ... 574.808741 590.688746
North America 1586.628358 2559.054466 ... 2851.819722 2867.880633
Oceania 4163.456825 3899.532718 ... 3807.652781 3796.396563
South America 1455.955084 1196.506188 ... 1086.940969 1093.484142
Now I want to plot a bar chart that shows each continent's value, with the years from 1980 to 2010 on the x-axis. I am using this call:
df.plot(kind="bar", rot=0, ax=ax, width=0.5)
My result looks like this:
[image: bar chart]
How can I change this so that the continent names appear in the legend and each year shows the value for every continent?
Use:
df.set_index('Continent').T.plot(kind='bar', rot=0, width=0.5, figsize=(10,8))
Output: [image not included]
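To see why the transpose helps, here is a minimal sketch with made-up numbers (not the asker's data). `DataFrame.plot(kind="bar")` draws one x-axis group per row label and one bar series per column, so after `set_index('Continent').T` the years become the groups and the continents become the legend entries.

```python
import pandas as pd

# Made-up numbers for illustration only (not the asker's data).
df = pd.DataFrame({
    "Continent": ["Africa", "Asia", "Europe"],
    "Surplus1980": [-700.0, -1400.0, 700.0],
    "Surplus2010": [-900.0, -1500.0, 600.0],
})

# Continent names become the index, then the transpose swaps rows and columns.
wide = df.set_index("Continent").T

print(wide.index.tolist())    # row labels: one x-axis group per year column
print(wide.columns.tolist())  # column labels: one legend entry per continent
```

Calling `wide.plot(kind="bar", rot=0)` on this frame should then give the layout asked for in the question.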
I have these data from the years 1991-2020 and for five countries.
Tempcountries
Date Temperature Units Year Month Statistics Country CODE
Jan 1991 -26.2 Celsius 1991 Jan Average Canada CAN
Feb 1991 -21.0 Celsius 1991 Feb Average Canada CAN
Mar 1991 -18.2 Celsius 1991 Mar Average Canada CAN
Apr 1991 -8.6 Celsius 1991 Apr Average Canada CAN
May 1991 0.8 Celsius 1991 May Average Canada CAN
Merge
pop_est continent name CODE ... Country ISO2 latitude longitude
0 35623680 North America Canada CAN ... Canada CA 56.130366 -106.346771
1 35623680 North America Canada CAN ... Canada CA 56.130366 -106.346771
2 35623680 North America Canada CAN ... Canada CA 56.130366 -106.346771
3 35623680 North America Canada CAN ... Canada CA 56.130366 -106.346771
4 35623680 North America Canada CAN ... Canada CA 56.130366 -106.346771
I want to plot the temperature values on the world map for every month just for the year 1991. So at the end, I will get 12 plots, showing the temperature of each country.
How do I select only the year 1991 and how do I put a title indicating the Year and month of each plot?
I did this:
import matplotlib as mpl
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas
location=pd.read_excel('countries.xlsx') #Longitude latitude data
tempcountries = pd.read_excel('Temperature Countries.xlsx')
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
world.columns=['pop_est', 'continent', 'name', 'CODE', 'gdp_md_est', 'geometry']
for y in tempcountries['Year']:
    for i in tempcountries['Month']:
        fig, ax = plt.subplots(figsize=(8,6))
        world.plot(ax=ax, color='lightgrey')
        merge.plot(x="longitude", y="latitude", kind="scatter",
                   c="Temperature", colormap="coolwarm",
                   ax=ax)
        plt.show()
I get multiple plots but I do not see that the colors (indicating the temperature) on the map change.
Assuming your merge dataframe is the result of merging tempcountries and world, you could do the following:
month_map = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
# only select year 1991 (use 1991 without quotes if the Year column is numeric)
df = merge.loc[merge["Year"] == '1991']
fig, axs = plt.subplots(3, 4, figsize=(16,9), sharey=True)
for i, ax in enumerate(axs.flat):
    # plot basemap
    world.plot(ax=ax, color='lightgrey')
    # plot values of current month
    df.loc[df["Month"] == month_map[i]].plot(ax=ax, x="longitude", y="latitude", kind="scatter", c="Temperature", colormap="coolwarm")
    ax.set_title(month_map[i])
As I'm not allowed to post images yet, here is the output of the above code.
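The question also asked for titles that show both the year and the month. A minimal sketch of the selection and title-building step, using a made-up stand-in for the merged frame (the rows and the `panels` name are assumptions for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the merged frame; values are made up.
merge = pd.DataFrame({
    "Year": [1991, 1991, 1991, 1992],
    "Month": ["Jan", "Jan", "Feb", "Jan"],
    "Country": ["Canada", "Chile", "Canada", "Canada"],
    "Temperature": [-26.2, 15.0, -21.0, -24.0],
})

year = 1991
# Compare as strings so the filter works whether Year was read as int or str.
df_year = merge.loc[merge["Year"].astype(str) == str(year)]

# One (title, rows) pair per month, in order of first appearance.
panels = [(f"{month} {year}", rows)
          for month, rows in df_year.groupby("Month", sort=False)]

for title, rows in panels:
    print(title, len(rows))  # pass `title` to ax.set_title(title) when plotting
```

In the plotting loop, each `rows` frame would be drawn on its own axes and `ax.set_title(title)` would label it with the month and year.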
I would like to find a shortcut for labeling data, since I am working with a large data set.
Here's the data I'm charting from the large data set:
Nationality
Afghanistan 4
Albania 40
Algeria 60
Andorra 1
Angola 15
...
Uzbekistan 2
Venezuela 67
Wales 129
Zambia 9
Zimbabwe 13
Name: count, Length: 164, dtype: int64
And so far this is my code:
import pandas as pd
import matplotlib.pyplot as plt
the_data = pd.read_csv('fifa_data.csv')
plt.title('Percentage of Players from Each Country')
the_data['count'] = 1
Nations = the_data.groupby(['Nationality']).count()['count']
plt.pie(Nations)
plt.show()
Creating the pie chart is easy and quick this way, but I haven't figured out how to automatically label each country in the pie chart without having to label each data point one by one.
The pandas plot function labels the data for you automatically, using the index of the Series:
# count:
Nations = the_data.groupby('Nationality').size()
# plot data
Nations.plot.pie()
plt.title('Percentage of Players from Each Country')
plt.show()
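The labels come from the index of the grouped Series, which a small sketch with made-up rows (standing in for `fifa_data.csv`) makes visible:

```python
import pandas as pd

# Made-up sample standing in for fifa_data.csv.
the_data = pd.DataFrame({
    "Nationality": ["Wales", "Wales", "Angola", "Zambia", "Wales", "Angola"],
})

nations = the_data.groupby("Nationality").size()

labels = nations.index.tolist()  # what pandas uses as wedge labels
shares = (nations / nations.sum() * 100).round(1)  # percentage per country

print(labels)
print(shares.to_dict())
```

If you prefer to stay with plain matplotlib, the same index can be passed explicitly, for example `plt.pie(nations, labels=nations.index)`. With 164 countries the labels are likely to overlap, so some grouping of small categories may be worth considering.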
I have a pandas dataframe which looks like this:
Country
Japan
Japan
Korea
India
India
USA
USA
USA
I need to count the occurrences of each unique value in the Country column, convert the counts to percentages, and use them as the x-axis and y-axis of a plotly bar chart. Can anyone show me how to do it?
Use value_counts:
df.Country.value_counts(normalize=True)
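`normalize=True` returns fractions, so a multiplication by 100 is still needed for percentages. A minimal sketch that turns the result into a two-column frame suitable for a bar chart (the `Percent` column name and the `chart_data` variable are assumptions for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["Japan", "Japan", "Korea", "India", "India", "USA", "USA", "USA"],
})

# Fractions per country, scaled to percentages.
pct = df["Country"].value_counts(normalize=True).mul(100)

# Build the frame explicitly so the column names do not depend on the pandas version.
chart_data = pd.DataFrame({"Country": pct.index, "Percent": pct.to_numpy()})
print(chart_data)
```

With plotly installed, something like `plotly.express.bar(chart_data, x="Country", y="Percent")` should then draw the chart.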
I have a dataframe with 4 columns and I want to do a groupby and plot the data. But I am not sure how to go about this.
Cont Coun X3 Y1
Africa nigeria A 10
Africa nigeria B 93
Africa nigeria C 124
Africa nigeria D 24
-------------------------------
Africa kenya A 123
Africa kenya B 540
Africa kenya C 1000
Africa kenya D 183
--------------------------------
Asia Japan A 1234
Asia Japan B 820
Asia Japan C 2130
Asia Japan D 912
For every distinct continent(cont) and country(coun) pair, plot 4 different bars corresponding to the column X3. The Y1 column is the Y-axis
Result:-
I'd recommend seaborn for this kind of plot (recent seaborn versions require x and y to be passed as keywords):
import seaborn as sns
sns.barplot(x=df.Cont+'\n'+df.Coun, y='Y1', hue='X3', data=df)
For adjusting figure size you can create a figure with a subplot first and then put the seaborn plot into the desired destination with the ax kwarg:
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(16, 8))
sns.barplot(x=df.Cont+'\n'+df.Coun, y='Y1', hue='X3', data=df, ax=ax)
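As an alternative that needs only pandas (this is not the method from the answer above, just a sketch), the frame can be reshaped so that each (Cont, Coun) pair is a row and each X3 value is a column:

```python
import pandas as pd

df = pd.DataFrame({
    "Cont": ["Africa"] * 8 + ["Asia"] * 4,
    "Coun": ["nigeria"] * 4 + ["kenya"] * 4 + ["Japan"] * 4,
    "X3": list("ABCD") * 3,
    "Y1": [10, 93, 124, 24, 123, 540, 1000, 183, 1234, 820, 2130, 912],
})

# Rows: (Cont, Coun) pairs. Columns: the X3 categories. Values: Y1.
wide = df.set_index(["Cont", "Coun", "X3"])["Y1"].unstack("X3")
print(wide)
# wide.plot(kind="bar", rot=0) would then draw four bars per (Cont, Coun) pair.
```

This relies on each (Cont, Coun, X3) combination appearing only once, as in the sample data; duplicates would need to be aggregated first.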
I've been trying to create a heatmap with seaborn. The dataframe I use is: https://raw.githubusercontent.com/resbaz/r-novice-gapminder-files/master/data/gapminder-FiveYearData.csv
The dataset has 6 columns: country, year, pop, continent, lifeExp and gdpPercap. I want to create a pivot table with year along the x-axis, continent along the y-axis and lifeExp in the cells, then plot it as a heatmap.
The first thing I did was pivot the dataframe using this code:
df1 = pd.read_csv('https://raw.githubusercontent.com/resbaz/r-novice-gapminder-files/master/data/gapminder-FiveYearData.csv')
df2 = df1.pivot('year','continent','lifeExp')
but got an error.
So I changed my code to:
df = pd.read_csv('https://raw.githubusercontent.com/resbaz/r-novice-gapminder-files/master/data/gapminder-FiveYearData.csv')
print(df.head())
df2 = df.pivot_table(values= 'lifeExp', index=['year', 'continent'])
print(df2)
and the output of df2 looks like this:
lifeExp
year continent
1952 Africa 39.135500
Americas 53.279840
Asia 46.314394
Europe 64.408500
Oceania 69.255000
1957 Africa 41.266346
Americas 55.960280
Asia 49.318544
Europe 66.703067
Oceania 70.295000
.....
and when I tried to plot it with seaborn
sns.heatmap(df2)
the lifeExp values don't fill a year-by-continent grid in the heatmap.
How can I fix this?
-- Hi ebuzz168,
It looks to me like you have set both 'year' and 'continent' as index and nothing as column. Looking at the documentation the function call should look like this:
table = df.pivot_table(values='lifeExp', index='year', columns='continent', aggfunc='mean')
sns.heatmap(table)
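One likely reason a plain `pivot` cannot be used here is that each (year, continent) pair occurs for several countries, and `pivot` refuses duplicate index/column pairs, while `pivot_table` aggregates them. A minimal sketch with made-up rows (not the gapminder data):

```python
import pandas as pd

# Made-up rows: two Asian countries share each (year, continent) pair.
df = pd.DataFrame({
    "country": ["A", "B", "C", "A", "B", "C"],
    "continent": ["Asia", "Asia", "Europe", "Asia", "Asia", "Europe"],
    "year": [1952, 1952, 1952, 1957, 1957, 1957],
    "lifeExp": [40.0, 50.0, 65.0, 42.0, 52.0, 67.0],
})

try:
    df.pivot(index="year", columns="continent", values="lifeExp")
    pivot_failed = False
except ValueError:
    # Duplicate (year, continent) pairs cannot be reshaped without aggregation.
    pivot_failed = True

# pivot_table averages the duplicates and returns a year-by-continent grid.
table = df.pivot_table(values="lifeExp", index="year",
                       columns="continent", aggfunc="mean")
print(pivot_failed)
print(table)
```

The resulting `table` has one row per year and one column per continent, which is the 2D shape `sns.heatmap` expects. To put years on the x-axis as described in the question, pass `table.T` instead.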