Using Python to graph student progress

I'm experimenting with Python graphing for the first time and I want to try what I've learned by graphing some of my students' progress. My progress data is in a table in the same format as what I have mocked up below. I have used MSPaint (sorry) to mock up what I think would be a decent graph to show them their progress.
What is the right name for this type of graph and what would be the first steps to achieve it? I can't see anything quite like it on http://matplotlib.org/ or on https://plot.ly/
Please feel free to tell me I am laying out the graph all wrong.

I took a stab at generating your example chart in matplotlib. I suspect others with stronger matplotlib-fu could greatly improve this :)
import matplotlib.pyplot as plt
import numpy as np
students = ['steve', 'bob', 'ralph']
progress = [
[1, 3, 4, 4, 5],
[2, 3, 4, 4, 5],
[3, 3, 4, 5, 5]]
(fig, ax) = plt.subplots(1, 1)
# Offset lines by some fraction of one
dx = 1.0 / len(progress)
xoff = dx / 2.0
for i, (name, data) in enumerate(zip(students, progress)):
    ax.plot(np.arange(len(data)) + xoff, data, label=name, marker='o')
    xoff += dx
ax.set_xticks(np.arange(0, len(progress[0]) + 0.01, dx), minor=True)
ax.set_xticks(np.arange(1, len(progress[0])+1))
labels = students * len(progress[0])
week = 1
for i,l in enumerate(labels):
    if l == students[1]:
        # hack to add Week label below the second label for each block
        labels[i] = "%s\nWeek %s" % (l, week)
        week += 1
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_xticklabels(labels, fontsize=8, ha='left', minor=True)
ax.set_xticklabels([])
ax.tick_params(which='both', direction = 'out')
ax.tick_params(axis='x', which='major', width=4)
ax.tick_params(axis='x', which='major', length=7)
ax.tick_params(axis='y', which='major', width=0, length=0)
ax.set_ylim(0, 6)
ax.set_yticks(range(1, 6))
ax.get_xaxis().tick_bottom()
ax.get_yaxis().tick_left()
ax.set_title("Student Progress")
ax.legend(loc='best')
fig.show()

Something like this is probably what you are looking for:
import matplotlib.pyplot as plt
weeks = range(1,6)
steve = [1, 3, 4, 4, 5]
bob = [2, 3, 4, 4, 5]
ralph = [3, 3, 4, 5, 5]
plt.figure()
plt.plot(weeks, bob, label='Bob')
plt.plot(weeks, steve, label='Steve')
plt.plot(weeks, ralph, label='Ralph')
plt.title('Student Progress')
plt.ylabel('Score')
plt.xlabel('Week')
plt.xticks(range(6))
plt.ylim(0, 6)
plt.legend(loc='lower right')
plt.show()

Try bokeh. It supports categorical axes, and additionally supports datetime categorical axes (docs link)

Related

Plot a curve on top of 2 subplots simultaneously

EDIT: My question was closed because someone thought another question already answered it (it doesn't: Matplotlib different size subplots). To clarify what I want:
I would like to replicate something like what is done in this picture: a 3rd dataset plotted on top of 2 subplots, with its y-axis displayed on the right.
I have 3 datasets spanning the same time interval (speed, position, precipitation). I would like to plot the speed and position in 2 horizontal subplots, and the precipitation spanning the 2 subplots.
For example, in the code below, instead of having the twinx() only on the first subplot, I would like it to overlap the two subplots (i.e. on the right side, a y-axis with 0 at the bottom right of the 2nd subplot and 20 at the top right of the 1st subplot).
How could I achieve that?
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(2,1,figsize=(20,15), dpi = 600)
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
ax[0].plot(x,y, label = 'speed')
plt.legend()
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
ax[1].plot(x,y, label = 'position')
plt.legend()
#plot 3:
x = np.array([0, 1, 2, 3])
y = np.array([10, 0, 4, 20])
ax2=ax[0].twinx()
ax2.plot(x,y, label = 'precipitation')
plt.legend(loc='upper right')
plt.show()

The best way I found is not very elegant, but it works:
# Prepare 2 subplots
fig, ax = plt.subplots(2,1,figsize=(20,15), dpi = 600)
#plot 1:
# Dummy values for plotting
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
ax[0].plot(x,y, label = 'speed')
# Prints the legend
plt.legend()
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
ax[1].plot(x,y, label = 'position')
plt.legend()
#plot 3:
x = np.array([0, 1, 2, 3])
y = np.array([10, 0, 4, 20])
# Add manually a 3rd subplot that stands on top of the 2 others
ax2 = fig.add_subplot(111, label="new subplot", facecolor="none")
# Move the y-axis to the right otherwise it will overlap with the ones on the left
ax2.yaxis.set_label_position("right")
# "Erase" every tick and label of this 3rd plot
ax2.tick_params(left=False, right=True, labelleft=False, labelright=True,
bottom=False, labelbottom=False)
# This line merges the x axes of the 1st and 3rd plot, and indicates
# that the y-axis of the 3rd plot will be drawn on the entirety of the
# figure instead of just 1 subplot (because fig.add_subplot(111) makes it spread on the entirety of the figure)
ax[0].get_shared_x_axes().join(ax[0],ax2)
ax2.plot(x,y, label = 'precipitation')
# Prints the legend in the upper right corner
plt.legend(loc='upper right')
plt.show()
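The `join` call on the object returned by `get_shared_x_axes()` may no longer be available in recent Matplotlib releases, so it is worth checking against your installed version. Below is a minimal sketch of the same idea that shares the x-axis through the `sharex` argument of `fig.add_subplot` instead. The function name `plot_with_overlay`, the figure size, and the colour are illustrative choices, not part of the original answer.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_with_overlay(x, speed, position, precipitation):
    """Draw two stacked subplots plus one transparent Axes spanning both."""
    fig, axes = plt.subplots(2, 1, figsize=(8, 6))
    axes[0].plot(x, speed, label="speed")
    axes[0].legend(loc="upper left")
    axes[1].plot(x, position, label="position")
    axes[1].legend(loc="upper left")

    # A 1x1 subplot covers the same area as the whole 2x1 grid. Sharing x
    # with the first subplot keeps the x-limits in sync without join().
    overlay = fig.add_subplot(111, sharex=axes[0], facecolor="none",
                              label="overlay")
    overlay.plot(x, precipitation, color="tab:green", label="precipitation")

    # Keep only the right-hand y-axis of the overlay visible.
    overlay.yaxis.tick_right()
    overlay.yaxis.set_label_position("right")
    overlay.set_ylabel("precipitation")
    overlay.tick_params(bottom=False, labelbottom=False)
    for side in ("top", "bottom", "left"):
        overlay.spines[side].set_visible(False)
    overlay.legend(loc="upper right")
    return fig, axes, overlay


# Dummy values taken from the question
x = np.array([0, 1, 2, 3])
fig, axes, overlay = plot_with_overlay(
    x,
    speed=np.array([3, 8, 1, 10]),
    position=np.array([3, 8, 1, 10]),
    precipitation=np.array([10, 0, 4, 20]),
)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. Calling `legend()` on each Axes, rather than `plt.legend()`, makes it explicit which subplot each legend belongs to.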

How to have 2 different scales on same Y axis in Python using Matplotlib

I need to draw four X-vs-Y plots that share the same X values but have different Y values. I used the code below to get the plots, but I need the Y scale shown on both sides of the secondary Y axis (Y Axis 2 in the image), the way the primary Y axis has it (both inward and outward). Right now, both scales appear on the same side of the secondary Y axis. How can I modify the code below to achieve this?
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
fig.subplots_adjust(left=0.1,right=0.9)
twin2 = ax.twinx()
twin3 = ax.twinx()
twin4 = ax.twinx()
twin3.spines['right'].set_position(("axes", 0.0))
twin4.spines['right'].set_position(('axes', 1.0))
p1, = ax.plot([0, 1, 2], [0, 1, 2], "b-", label="plot1")
p2, = twin2.plot([0, 1, 2], [0, 3, 2], "r-", label="plot2")
p3, = twin3.plot([0, 1, 2], [50, 30, 15], "g-", label="plot3")
p4, = twin4.plot([0, 1, 2], [5, 3, 1], "y-", label="plot4")
ax.set_xlim(0, 2)
ax.set_ylim(0, 2)
ax.set_xlabel("X Axis")
ax.set_ylabel("Y Axis")
twin2.set_ylabel("Y Axis 2")
plt.show()

On your 4th axes, set tick_left and move the left spine to the right-hand side:
twin4.yaxis.tick_left()
twin4.spines['left'].set_position(('axes', 1.0))
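For context, here is a sketch of the question's script with those two lines applied. It replaces the question's `twin4.spines['right'].set_position(('axes', 1.0))` line, and the data and labels are copied from the question. The expected effect is that the scale of `twin4` is drawn on the inner side of the right-hand axis while the scale of `twin2` stays on the outer side.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
fig.subplots_adjust(left=0.1, right=0.9)

twin2 = ax.twinx()
twin3 = ax.twinx()
twin4 = ax.twinx()

# From the question: draw twin3's scale at the left edge of the plot.
twin3.spines["right"].set_position(("axes", 0.0))

# From the answer: use the "left" tick set for twin4, then move its left
# spine to the right edge so the ticks and labels sit inside the plot.
twin4.yaxis.tick_left()
twin4.spines["left"].set_position(("axes", 1.0))

ax.plot([0, 1, 2], [0, 1, 2], "b-", label="plot1")
twin2.plot([0, 1, 2], [0, 3, 2], "r-", label="plot2")
twin3.plot([0, 1, 2], [50, 30, 15], "g-", label="plot3")
twin4.plot([0, 1, 2], [5, 3, 1], "y-", label="plot4")

ax.set_xlim(0, 2)
ax.set_ylim(0, 2)
ax.set_xlabel("X Axis")
ax.set_ylabel("Y Axis")
twin2.set_ylabel("Y Axis 2")
```

Finish with `plt.show()` to display the figure.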

Using a different dataset for ticks with matplotlib

I have a question concerning matplotlib in Python. I am working with a dataset that has 30 sessions. In each session there are 0 to 5 runs. I have created a plot that displays the values of each run over the run. So the runs go from 0-200. However, I need the ticks to be reset when a new run starts. So instead of 0-200, I want 0,1,2,3...0,1,2...0,1,2,3,4,5. The graph itself, however, is not supposed to change. Do you have any idea how this would be possible?
The code:
for ses in range(len(all_runs)):
    if len(all_runs[ses]) > 0:
        plt.plot(xval[ses],all_runs[ses],'.-',color='tab:blue')

You can pass a labels argument to plt.xticks(), specifying the repeating tick labels, without changing the plotted data. For example:
import matplotlib.pyplot as plt
n = 5 # number of ticks per run
r = 3 # number of runs
# sample plot
plt.plot(list(range(n * r)))
plt.xticks(list(range(n * r)))
# set repeating tick labels
ticks = list(plt.xticks()[0])
plt.xticks(ticks, labels = ticks[:n] * r);
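The snippet above assumes every run has the same number of points. When the runs differ in length, as in the question, the labels can be built run by run. This is a minimal sketch: the helper name `repeating_tick_labels` is hypothetical, and it assumes the runs are plotted back to back at consecutive integer x positions.

```python
def repeating_tick_labels(run_lengths):
    """Return (positions, labels) where the labels restart at 0 for each run.

    Assumes runs are laid out back to back on consecutive integer positions.
    """
    positions, labels = [], []
    start = 0
    for length in run_lengths:
        positions.extend(range(start, start + length))
        labels.extend(range(length))
        start += length
    return positions, labels


# Three runs with 4, 3 and 6 points respectively
positions, labels = repeating_tick_labels([4, 3, 6])
```

The two lists can then be passed to `plt.xticks(positions, labels)` after plotting, which changes only the tick text and leaves the plotted data untouched.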

If I understand the question correctly, this is how it would work. Optionally, the minor ticks could indicate the sessions.
import matplotlib.pyplot as plt
import numpy as np
xval = [np.array([0, 1, 2, 3, 4]), np.array([5, 6, 7, 8, 9, 10]), np.array([11, 12, 13, 14, 15, 16, 17])]
all_runs = [np.random.randint(1, 10, len(xv)) for xv in xval]
total_len = sum([len(xv) for xv in xval])
for ses in range(len(all_runs)):
    if len(all_runs[ses]) > 0:
        plt.plot(xval[ses], all_runs[ses], '.-', color='tab:blue')
        # if ses > 0:
        #     plt.axvline(xval[ses][0] - 0.5, ls=':', lw=1, color='purple')
plt.xticks(range(total_len), [i for xv in xval for i in range(len(xv))])
ax = plt.gca()
ax.set_xticks([xv[0] + 0.1 for xv in xval if len(xv) > 0], minor=True)
ax.set_xticklabels([f'session {i}' for i, xv in enumerate(xval) if len(xv) > 0], minor=True)
ax.tick_params(axis='x', which='minor', length=0, pad=18)
for tick in ax.xaxis.get_minor_ticks():
    tick.label1.set_horizontalalignment('left')
plt.show()

How to plot a scatter plot over a line plot? [duplicate]

Does anybody have a suggestion on what's the best way to present overlapping lines on a plot? I have a lot of them, and I had the idea of having full lines of different colors where they don't overlap, and having dashed lines where they do overlap so that all colors are visible and overlapping colors are seen.
But how do I do that?

I have the same issue on a plot with a high degree of discretization.
Here the starting situation:
import matplotlib.pyplot as plt
grid=[x for x in range(10)]
graphs=[
[1,1,1,4,4,4,3,5,6,0],
[1,1,1,5,5,5,3,5,6,0],
[1,1,1,0,0,3,3,2,4,0],
[1,2,4,4,3,2,3,2,4,0],
[1,2,3,3,4,4,3,2,6,0],
[1,1,3,3,0,3,3,5,4,3],
]
for gg,graph in enumerate(graphs):
    plt.plot(grid,graph,label='g'+str(gg))
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()
No one can say where the green and blue lines run exactly
and my "solution"
import matplotlib.pyplot as plt
grid=[x for x in range(10)]
graphs=[
[1,1,1,4,4,4,3,5,6,0],
[1,1,1,5,5,5,3,5,6,0],
[1,1,1,0,0,3,3,2,4,0],
[1,2,4,4,3,2,3,2,4,0],
[1,2,3,3,4,4,3,2,6,0],
[1,1,3,3,0,3,3,5,4,3],
]
for gg,graph in enumerate(graphs):
    lw=10-8*gg/len(graphs)
    ls=['-','--','-.',':'][gg%4]
    plt.plot(grid,graph,label='g'+str(gg), linestyle=ls, linewidth=lw)
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()
I am grateful for suggestions on improvement!

Just decrease the opacity of the lines so that they are see-through. You can achieve that using the alpha parameter. Example:
plt.plot(x, y, alpha=0.7)
Here alpha ranges from 0 to 1, with 0 being fully transparent (invisible) and 1 fully opaque.
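As a small illustration, the sketch below applies `alpha` to the first three series from the example earlier in this thread. The line width and the value 0.5 are arbitrary choices for demonstration.

```python
import matplotlib.pyplot as plt

grid = list(range(10))
graphs = [
    [1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
    [1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
    [1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
]

fig, ax = plt.subplots()
# Thick, half-transparent lines: where series overlap, the colours blend
# instead of the last-drawn line hiding the others.
lines = [
    ax.plot(grid, graph, linewidth=4, alpha=0.5, label=f"g{i}")[0]
    for i, graph in enumerate(graphs)
]
ax.legend()
```

Call `plt.show()` to view it. Transparency helps most with a handful of lines; with many overlapping series the blended colours can become hard to tell apart.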

Imagine your pandas DataFrame is called respone_times; you can then use alpha to set a different opacity for your graphs. Compare the picture before and after using alpha.
plt.figure(figsize=(15, 7))
plt.plot(respone_times,alpha=0.5)
plt.title('a sample title')
plt.grid(True)
plt.show()

Depending on your data and use case, it might be OK to add a bit of random jitter to artificially separate the lines.
from numpy.random import default_rng
import pandas as pd
rng = default_rng()
def jitter_df(df: pd.DataFrame, std_ratio: float) -> pd.DataFrame:
    """
    Add jitter to a DataFrame.

    Adds normal distributed jitter with mean 0 to each of the
    DataFrame's columns. The jitter's std is the column's std times
    `std_ratio`.

    Returns the jittered DataFrame.
    """
    std = df.std().values * std_ratio
    jitter = pd.DataFrame(
        std * rng.standard_normal(df.shape),
        index=df.index,
        columns=df.columns,
    )
    return df + jitter
Here's a plot of the original data from Markus Dutschke's example:
And here's the jittered version, with std_ratio set to 0.1:
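The plots themselves are not reproduced here, so the following sketch shows one way the function might be applied to that example data. It uses a variant of `jitter_df` that takes the random generator as an argument (an illustrative change so the result is reproducible with a seed), and it assumes each series is stored as one DataFrame column.

```python
import pandas as pd
from numpy.random import default_rng


def jitter_df(df, std_ratio, rng):
    """Variant of the function above that takes the generator explicitly."""
    std = df.std().values * std_ratio
    noise = pd.DataFrame(
        std * rng.standard_normal(df.shape),
        index=df.index,
        columns=df.columns,
    )
    return df + noise


graphs = [
    [1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
    [1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
    [1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
    [1, 2, 4, 4, 3, 2, 3, 2, 4, 0],
    [1, 2, 3, 3, 4, 4, 3, 2, 6, 0],
    [1, 1, 3, 3, 0, 3, 3, 5, 4, 3],
]

# One column per series, one row per grid point
df = pd.DataFrame({f"g{i}": graph for i, graph in enumerate(graphs)})
jittered = jitter_df(df, std_ratio=0.1, rng=default_rng(0))
```

`jittered.plot()` then draws one line per column. Because the noise is scaled by each column's own standard deviation, a column with constant values receives no jitter at all.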

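Replacing solid lines by dots or dashes works too:
import seaborn as sns
# `data` is a long-form DataFrame with 'config', 'outputs', 'lag', 'correlation' and 'card' columns
g = sns.FacetGrid(data, col='config', row='outputs', sharex=False)
g.map_dataframe(sns.lineplot, x='lag',y='correlation',hue='card', linestyle='dotted')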

Instead of random jitter, the lines can be offset just a little bit, creating a layered appearance:
import matplotlib.pyplot as plt
from matplotlib.transforms import offset_copy
grid = list(range(10))
graphs = [[1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
[1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
[1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
[1, 2, 4, 4, 3, 2, 3, 2, 4, 0],
[1, 2, 3, 3, 4, 4, 3, 2, 6, 0],
[1, 1, 3, 3, 0, 3, 3, 5, 4, 3]]
fig, ax = plt.subplots()
lw = 1
for gg, graph in enumerate(graphs):
    trans_offset = offset_copy(ax.transData, fig=fig, x=lw * gg, y=lw * gg, units='dots')
    ax.plot(grid, graph, lw=lw, transform=trans_offset, label='g' + str(gg))
ax.legend(loc='upper left', bbox_to_anchor=(1.01, 1.01))
# manually set the axes limits, because the transform doesn't set them automatically
ax.set_xlim(grid[0] - .5, grid[-1] + .5)
ax.set_ylim(min([min(g) for g in graphs]) - .5, max([max(g) for g in graphs]) + .5)
plt.tight_layout()
plt.show()

Connecting lines between points plotted in Matplotlib

I have 4 X and Y lists that I am plotting separately with matplotlib.pyplot, i.e. point1[x1,y1], point2[x2,y2], point3[x3,y3] and point4[x4,y4]. What I am trying to do in the plot is to connect point1 to point2, point2 to point3, etc. until all 4 points are connected which would represent a square in my case. The data is for dynamic x and y displacements for a rectangular pump I'm working on that shows if a displacement limitation is exceeded inside a vessel's moonpool.
Here is the code I have so far, along with the plot it generates:
## SSLP displacement time histories to be plotted
point1 = (3.61, 4, -3)
point2 = (3.61, -4, -3)
point3 = (-3.61, -4, -3)
point4 = (-3.61, 4, -3)
SSLPXPoint1 = SSLP.TimeHistory('X', 1, objectExtra=OrcFxAPI.oeBuoy(point1))
SSLPYPoint1 = SSLP.TimeHistory('Y', 1, objectExtra=OrcFxAPI.oeBuoy(point1))
SSLPXPoint2 = SSLP.TimeHistory('X', 1, objectExtra=OrcFxAPI.oeBuoy(point2))
SSLPYPoint2 = SSLP.TimeHistory('Y', 1, objectExtra=OrcFxAPI.oeBuoy(point2))
SSLPXPoint3 = SSLP.TimeHistory('X', 1, objectExtra=OrcFxAPI.oeBuoy(point3))
SSLPYPoint3 = SSLP.TimeHistory('Y', 1, objectExtra=OrcFxAPI.oeBuoy(point3))
SSLPXPoint4 = SSLP.TimeHistory('X', 1, objectExtra=OrcFxAPI.oeBuoy(point4))
SSLPYPoint4 = SSLP.TimeHistory('Y', 1, objectExtra=OrcFxAPI.oeBuoy(point4))
# setup plot
caseName = os.path.splitext(info.modelFileName)[0]
point1Plot = [3.61, 4]
point2Plot = [3.61, -4]
point3Plot = [-3.61, -4]
point4Plot = [-3.61, 4]
vesselPointsX = [90.89, 100.89, 100.89, 90.89, 90.89]
vesselPointsY = [5, 5, -5, -5, 5]
moonpoolCLX = [89, 103]
moonpoolCLY = [0, 0]
fig = plt.figure(figsize=(20, 15))
ax = fig.add_subplot(1, 1, 1)
plt.plot(vesselPointsX, vesselPointsY, 'r', lw=2, label='OCV Moonpool Limits')
plt.plot(moonpoolCLX, moonpoolCLY, 'k--', label='Moonpool CL')
plt.plot(SSLPXPoint1, SSLPYPoint1, 'k')
plt.plot(SSLPXPoint2, SSLPYPoint2, 'k')
plt.plot(SSLPXPoint3, SSLPYPoint3, 'k')
plt.plot(SSLPXPoint4, SSLPYPoint4, 'k')
ax.set_title("SSLP Maximum Offsets Inside Moonpool for {}".format(caseName), fontsize=20)
ax.set_xlabel('Distance Along OCV from Stern [m]', fontsize=15)
ax.set_ylabel('Distance from Moonpool Centerline, (+) Towards Portside [m]', fontsize=15)
ax.set_xlim(89, 103)
ax.set_ylim(-7, 7)
plt.gca().set_aspect('equal', adjustable='box')
plt.draw()
plt.legend()
plt.tight_layout()
plt.show()
Any help would be greatly appreciated.
Thanks,
Brian

This should help you; adapt it as needed.
from matplotlib.pyplot import plot, show
plot([SSLPXPoint1, SSLPXPoint2], [SSLPYPoint1, SSLPYPoint2])
plot([SSLPXPoint3, SSLPXPoint2], [SSLPYPoint3, SSLPYPoint2])
plot([SSLPXPoint3, SSLPXPoint4], [SSLPYPoint3, SSLPYPoint4])
show()
Edited, because previous one was connecting all dots.
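To draw the full rectangle as a single closed outline, one option is to repeat the first corner at the end of the coordinate lists, which is what the question's own `vesselPointsX` and `vesselPointsY` lists already do. The sketch below uses the static corner coordinates from the question. The helper name `closed_outline` is hypothetical. For the displacement time histories, one would presumably pick the four (x, y) values at a single time step and pass those in instead.

```python
import matplotlib.pyplot as plt


def closed_outline(points):
    """Return x and y lists that visit each point and end at the first one."""
    loop = list(points) + [points[0]]
    xs = [p[0] for p in loop]
    ys = [p[1] for p in loop]
    return xs, ys


# Corner coordinates from the question (point1Plot ... point4Plot)
corners = [(3.61, 4), (3.61, -4), (-3.61, -4), (-3.61, 4)]
xs, ys = closed_outline(corners)

fig, ax = plt.subplots()
ax.plot(xs, ys, "k-", marker="o")  # one line through all corners and back
ax.set_aspect("equal", adjustable="box")
```

Call `plt.show()` to display it. A single `plot` call with a closed list keeps the outline as one line object, which is easier to style and label than several separate segments.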
