I would like to write scouting reports on some football players, and for that I need visualizations, one type of which is pie charts. I need pie charts that look like the one below, with slices of different sizes (proportional to the value each slice represents). Can anyone suggest how to do this, or point me to a website where I can learn it?
What you are looking for is called a "Radar Pie Chart". It's analogous to the more commonly used "Radar Chart", but I think it looks better because it highlights the values rather than focusing on meaningless shapes.
The challenge you face with your football dataset is that each category is on a different scale, so you want to plot each value as a percentage of some max. My code will accomplish that, but you'll want to annotate the original values to finish off these charts.
The plot itself can be done with just the standard matplotlib library using polar axes. I borrowed code from here (https://raphaelletseng.medium.com/getting-to-know-matplotlib-and-python-docx-5ee67bad38d2).
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from math import pi
from random import random, seed
seed(12345)
# Generate dataset with 10 rows, different maxes
maxes = [5, 5, 5, 2, 2, 10, 10, 10, 10, 10]
df = pd.DataFrame(
data = {
'categories': ['category_{}'.format(x) for x, _ in enumerate(maxes)],
'scores': [random()*m for m in maxes],  # m avoids shadowing the built-in max
'max_values': maxes,
},
)
df['pct'] = df['scores'] / df['max_values']
df = df.set_index('categories')
# Plot pie radar chart
N = df.shape[0]
theta = np.linspace(0.0, 2*np.pi, N, endpoint=False)
categories = df.index
df['radar_angles'] = theta
ax = plt.subplot(polar=True)
ax.bar(df['radar_angles'], df['pct'], width=2*pi/N, linewidth=2, edgecolor='k', alpha=0.5)
ax.set_xticks(theta)
ax.set_xticklabels(categories)
_ = ax.set_yticklabels([])
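To finish the chart off as suggested above, the original (unscaled) values can be written onto the slices. This is only a minimal sketch: the metric names, scores and maxima are made up for illustration, and the label placement (at the outer edge of each slice) is just one possible choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical football metrics; the names, scores and maxima are invented.
df = pd.DataFrame(
    {
        "scores": [3.2, 1.1, 7.5, 42.0],
        "max_values": [5, 2, 10, 90],
    },
    index=["shots", "assists", "tackles", "passes"],
)
df["pct"] = df["scores"] / df["max_values"]

n = len(df)
theta = np.linspace(0.0, 2 * np.pi, n, endpoint=False)

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.bar(theta, df["pct"], width=2 * np.pi / n,
       linewidth=2, edgecolor="k", alpha=0.5)
ax.set_ylim(0, 1)            # every slice is a fraction of its own maximum
ax.set_xticks(theta)
ax.set_xticklabels(df.index)
ax.set_yticklabels([])

# Write the original score at the outer edge of each slice.
labels = []
for angle, pct, score in zip(theta, df["pct"], df["scores"]):
    labels.append(ax.text(angle, pct, f"{score:g}", ha="center", va="bottom"))
```

In an interactive session, `plt.show()` displays the figure. Fixing the radial limit to 1 keeps the slices comparable across players.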
I have previously worked with rose (polar bar) charts. Here is an example:
import plotly.express as px
df = px.data.wind()
fig = px.bar_polar(df, r="frequency", theta="direction",
color="strength", template="plotly_dark",
color_discrete_sequence= px.colors.sequential.Plasma_r)
fig.show()
This is my first foray into Plotly. I love the ease of use compared to matplotlib and bokeh. However, I'm stuck on some basic questions about how to beautify my plot. First, here is the code (it's fully functional, just copy and paste!):
import plotly.express as px
from plotly.subplots import make_subplots
import plotly as py
import pandas as pd
from plotly import tools
d = {'Mkt_cd': ['Mkt1','Mkt2','Mkt3','Mkt4','Mkt5','Mkt1','Mkt2','Mkt3','Mkt4','Mkt5'],
'Category': ['Apple','Orange','Grape','Mango','Orange','Mango','Apple','Grape','Apple','Orange'],
'CategoryKey': ['Mkt1Apple','Mkt2Orange','Mkt3Grape','Mkt4Mango','Mkt5Orange','Mkt1Mango','Mkt2Apple','Mkt3Grape','Mkt4Apple','Mkt5Orange'],
'Current': [15,9,20,10,20,8,10,21,18,14],
'Goal': [50,35,21,44,20,24,14,29,28,19]
}
dataset = pd.DataFrame(d)
grouped = dataset.groupby('Category', as_index=False).sum()
data = grouped.to_dict(orient='list')
v_cat = grouped['Category'].tolist()
v_current = grouped['Current']
v_goal = grouped['Goal']
fig1 = px.bar(dataset, x = v_current, y = v_cat, orientation = 'h',
color_discrete_sequence = ["#ff0000"],height=10)
fig2 = px.bar(dataset, x = v_goal, y = v_cat, orientation = 'h',height=15)
trace1 = fig1['data'][0]
trace2 = fig2['data'][0]
fig = make_subplots(rows = 1, cols = 1, shared_xaxes=True, shared_yaxes=True)
fig.add_trace(trace2, 1, 1)
fig.add_trace(trace1, 1, 1)
fig.update_layout(barmode = 'overlay')
fig.show()
Here is the Output:
Question1: how do I make the width of v_current (shown in red bar) smaller? As in, it should be smaller in height since this is a horizontal bar. I added the height as 10 for trace1 and 15 for trace2, but they are still showing at the same heights.
Question2: Is there a way to make v_goal (shown as the blue bar) show only its right edge, instead of a filled-in bar? Something like this:
As you may have noticed, I also added a line under each category. Is there a quick way to add this as well? Not a deal breaker, just a bonus. Other things I'm trying to do include adding animation, but that's for some other time!
Thanks in advance for answering!
Running plotly.express will return a plotly.graph_objs._figure.Figure object. The same goes for plotly.graph_objects when you run go.Figure() together with, for example, go.Bar(). So after building a figure using plotly express, you can add or edit lines and traces through direct references to the figure, like:
fig['data'][0].width = 0.4
Which is exactly what you need to set the width of your bars. And you can easily use this in combination with plotly express:
Code 1
fig = px.bar(grouped, y='Category', x = ['Current'],
orientation = 'h', barmode='overlay', opacity = 1,
color_discrete_sequence = px.colors.qualitative.Plotly[1:])
fig['data'][0].width = 0.4
Plot 1
In order to get the bars or shapes to indicate the goal levels, you can use the approach described by DerekO, or you can use:
for i, g in enumerate(grouped.Goal):
    fig.add_shape(type="rect",
                  x0=g+1, y0=grouped.Category[i], x1=g, y1=grouped.Category[i],
                  line=dict(color='#636EFA', width = 28))
Complete code:
import plotly.express as px
from plotly.subplots import make_subplots
import plotly as py
import pandas as pd
from plotly import tools
d = {'Mkt_cd': ['Mkt1','Mkt2','Mkt3','Mkt4','Mkt5','Mkt1','Mkt2','Mkt3','Mkt4','Mkt5'],
'Category': ['Apple','Orange','Grape','Mango','Orange','Mango','Apple','Grape','Apple','Orange'],
'CategoryKey': ['Mkt1Apple','Mkt2Orange','Mkt3Grape','Mkt4Mango','Mkt5Orange','Mkt1Mango','Mkt2Apple','Mkt3Grape','Mkt4Apple','Mkt5Orange'],
'Current': [15,9,20,10,20,8,10,21,18,14],
'Goal': [50,35,21,44,20,24,14,29,28,19]
}
dataset = pd.DataFrame(d)
grouped = dataset.groupby('Category', as_index=False).sum()
fig = px.bar(grouped, y='Category', x = ['Current'],
orientation = 'h', barmode='overlay', opacity = 1,
color_discrete_sequence = px.colors.qualitative.Plotly[1:])
fig['data'][0].width = 0.4
fig['data'][0].marker.line.width = 0
for i, g in enumerate(grouped.Goal):
    fig.add_shape(type="rect",
                  x0=g+1, y0=grouped.Category[i], x1=g, y1=grouped.Category[i],
                  line=dict(color='#636EFA', width = 28))
f = fig.full_figure_for_development(warn=False)
fig.show()
You can use Plotly Express and then directly access the figure object as @vestland described, but personally I prefer to use graph_objects to make all of the changes in one place.
I'll also point out that since you are stacking bars in one chart, you don't need subplots. You can create a graph_object with fig = go.Figure() and add traces to get stacked bars, similar to what you already did.
For question 1, if you are using go.Bar(), you can pass a width parameter. However, this is in units of the position axis, and since your y-axis is categorical, width=1 will fill the entire category, so I have chosen width=0.25 for the red bar, and width=0.3 (slightly larger) for the blue bar since that seems like it was your intention.
For question 2, the only thing that comes to mind is a hack. Split the bars into two sections (one with height = original height - 1), and set its opacity to 0 so that it is transparent. Then place down bars of height 1 on top of the transparent bars.
If you don't want the traces to show up in the legend, you can set this individually for each bar by passing showlegend=False to fig.add_trace, or hide the legend entirely by passing showlegend=False to the fig.update_layout method.
import plotly.express as px
import plotly.graph_objects as go
# from plotly.subplots import make_subplots
import plotly as py
import pandas as pd
from plotly import tools
d = {'Mkt_cd': ['Mkt1','Mkt2','Mkt3','Mkt4','Mkt5','Mkt1','Mkt2','Mkt3','Mkt4','Mkt5'],
'Category': ['Apple','Orange','Grape','Mango','Orange','Mango','Apple','Grape','Apple','Orange'],
'CategoryKey': ['Mkt1Apple','Mkt2Orange','Mkt3Grape','Mkt4Mango','Mkt5Orange','Mkt1Mango','Mkt2Apple','Mkt3Grape','Mkt4Apple','Mkt5Orange'],
'Current': [15,9,20,10,20,8,10,21,18,14],
'Goal': [50,35,21,44,20,24,14,29,28,19]
}
dataset = pd.DataFrame(d)
grouped = dataset.groupby('Category', as_index=False).sum()
data = grouped.to_dict(orient='list')
v_cat = grouped['Category'].tolist()
v_current = grouped['Current']
v_goal = grouped['Goal']
fig = go.Figure()
## you have a categorical plot and the units for width are in position axis units
## therefore width = 1 will take up the entire allotted space
## a width value of less than 1 will be the fraction of the allotted space
fig.add_trace(go.Bar(
x=v_current,
y=v_cat,
marker_color="#ff0000",
orientation='h',
width=0.25
))
## you can show the right edge of the bar by splitting it into two bars
## with the majority of the bar being transparent (opacity set to 0)
fig.add_trace(go.Bar(
x=v_goal-1,
y=v_cat,
marker_color="#ffffff",
opacity=0,
orientation='h',
width=0.30,
))
fig.add_trace(go.Bar(
x=[1]*len(v_cat),
y=v_cat,
marker_color="#1f77b4",
orientation='h',
width=0.30,
))
fig.update_layout(barmode='relative')
fig.show()
I'm trying to create a simple line chart in Python. For each customer, I want to show a single line with the trend of the value (y-axis) over the date (x-axis). The marker colours should be based on the colour field.
Please find the sample data and the expected output below (sorry for my scribbling).
Sample Data Here
Expected Output
A simple way is to first use the plot function to plot the lines, for example in black, and then use scatter to plot the colored markers. Here I show an example:
I generate data similar to yours as follows (I suppose you are storing the data in a pandas dataframe):
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
columns = ['CustID', 'Date', 'Value', 'Color']
data = []
np.random.seed(42)
for cus_id in np.arange(1, 5):
    for date in ['jan', 'feb', 'mar', 'apr']:
        data.append([cus_id, date, np.random.randint(20,35),
                     np.random.choice(['red', 'blue', 'green'])])
df = pd.DataFrame(data=data, columns=columns)
At this point, df contains this table.
Finally, to plot your line chart:
fig, ax = plt.subplots(1)
for _, df_cus in df.groupby('CustID'):
    ax.plot(df_cus.Date, df_cus.Value, color='black', zorder=0)
    ax.scatter(df_cus.Date, df_cus.Value, color=df_cus.Color, zorder=1)
Here's the output
I use zorder to make sure the scatter plots are always on top of the black lines.
Following up my previous question: Sorting datetime objects by hour to a pandas dataframe then visualize to histogram
I need to plot 3 bars for each X-axis value, representing viewer counts. Currently the bars show viewers who watched for under one minute and those who watched for longer. I need a third one showing the overall viewers. I have the DataFrame, but I can't seem to make the bars look right. With just 2 bars I have no problem; it looks just how I want it:
The relevant part of the code for this:
# Time and date stamp variables
allviews = int(df['time'].dt.hour.count())
date = str(df['date'][0].date())
hours = df_hist_short.index.tolist()
hours[:] = [str(x) + ':00' for x in hours]
The hours variable that I use to represent the X-axis may be problematic, since I convert it to string so I can make the hours look like 23:00 instead of just the pandas index output 23 etc. I have seen examples where people add or subtract values from the X to change the bars position.
fig, ax = plt.subplots(figsize=(20, 5))
short_viewers = ax.bar(hours, df_hist_short['time'], width=-0.35, align='edge')
long_viewers = ax.bar(hours, df_hist_long['time'], width=0.35, align='edge')
For now I set align='edge' and gave the two bars widths of the same magnitude but opposite sign. But I have no idea how to make it look right with 3 bars. I didn't find any positioning arguments for the bars. I have also tried to work with plt.hist(), but I couldn't get the same output as with the plt.bar() function.
So as a result I wish to have a 3rd bar on the graph shown above on the left side, a bit wider than the other two.
pandas will do this alignment for you, if you make the bar plot in one step rather than two (or three). Consider this example (adapted from the docs to add a third bar for each animal).
import pandas as pd
import matplotlib.pyplot as plt
speed = [0.1, 17.5, 40, 48, 52, 69, 88]
lifespan = [2, 8, 70, 1.5, 25, 12, 28]
height = [1, 5, 20, 3, 30, 6, 10]
index = ['snail', 'pig', 'elephant',
'rabbit', 'giraffe', 'coyote', 'horse']
df = pd.DataFrame({'speed': speed,
'lifespan': lifespan,
'height': height}, index=index)
ax = df.plot.bar(rot=0)
plt.show()
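Applied to the viewer counts from the question, the same one-step approach might look like the sketch below. The hour labels and counts are invented, and the column names `all`, `short` and `long` are placeholders for whatever the real frames contain.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Made-up viewer counts per hour; the index already holds the "HH:00" labels.
hours = [f"{h}:00" for h in range(20, 24)]
viewers = pd.DataFrame(
    {
        "all": [50, 80, 65, 30],
        "short": [30, 50, 40, 20],
        "long": [20, 30, 25, 10],
    },
    index=hours,
)

# One call draws three side-by-side bars per hour; width is the cluster width.
ax = viewers.plot.bar(rot=0, width=0.8, figsize=(10, 4))
ax.set_xlabel("hour")
ax.set_ylabel("viewers")
```

Because the string labels live in the index, no manual shifting of x positions is needed here.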
In pure matplotlib, instead of using the width parameter to position the bars as you've done, you can adjust the x-values for your plot:
import numpy as np
import matplotlib.pyplot as plt
# Make some fake data:
n_series = 3
n_observations = 5
x = np.arange(n_observations)
data = np.random.random((n_observations,n_series))
# Plotting:
fig, ax = plt.subplots(figsize=(20,5))
# Determine bar widths
width_cluster = 0.7
width_bar = width_cluster/n_series
for n in range(n_series):
    x_positions = x+(width_bar*n)-width_cluster/2
    ax.bar(x_positions, data[:,n], width_bar, align='edge')
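The same idea extends to bars of unequal width and to string tick labels such as 23:00, which were the two sticking points in the question. The following is a rough sketch with invented counts; the widths and the series names `all`, `short` and `long` are assumptions, not values from the original data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np

hours = [f"{h}:00" for h in range(20, 24)]  # made-up hour labels
x = np.arange(len(hours))                   # numeric positions for the clusters

# Invented counts; the "all" bar is drawn a bit wider than the other two.
series = {
    "all": np.array([50, 80, 65, 30]),
    "short": np.array([30, 50, 40, 20]),
    "long": np.array([20, 30, 25, 10]),
}
widths = {"all": 0.35, "short": 0.2, "long": 0.2}

fig, ax = plt.subplots(figsize=(10, 4))
offset = -sum(widths.values()) / 2  # left edge of the cluster, relative to the tick
for name, values in series.items():
    ax.bar(x + offset, values, widths[name], align="edge", label=name)
    offset += widths[name]          # the next bar starts where this one ends

# Plot against numbers, then relabel the ticks with the hour strings.
ax.set_xticks(x)
ax.set_xticklabels(hours)
ax.legend()
```

Keeping the x values numeric and only relabelling the ticks is what makes the position arithmetic possible.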
In your particular case, seaborn is probably a good option. You should (almost always) try to keep your data in long form: instead of three separate data frames for short, medium and long, it is much better practice to keep a single data frame and add a column that labels each row as short, medium or long. Use this new column as the hue parameter in seaborn's barplot.
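As a rough illustration of the long-form suggestion, the sketch below reshapes a small wide table with `DataFrame.melt` and passes the new label column as `hue`. The counts and the column names `hour`, `duration` and `viewers` are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import pandas as pd
import seaborn as sns

# Made-up counts in wide form: one column per viewing-duration class.
wide = pd.DataFrame(
    {
        "hour": ["20:00", "21:00", "22:00"],
        "short": [30, 50, 40],
        "medium": [15, 20, 18],
        "long": [5, 10, 7],
    }
)

# Long form: one row per (hour, duration) pair, with the count in "viewers".
long_df = wide.melt(id_vars="hour", var_name="duration", value_name="viewers")

# hue splits each hour into one bar per duration class.
ax = sns.barplot(data=long_df, x="hour", y="viewers", hue="duration")
```

If the data are collected in long form from the start, the `melt` step is not needed.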