Scatter plot to show the majorities and include extreme numbers - python

I have simple data, shown below, that I want to put in a scatter plot.
It works well when there are no outliers (i.e. extremely large numbers).
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
dates = ["2021-01-01",
"2021-01-01", "2021-01-06",
"2021-01-08", "2021-01-12",
"2021-02-01", "2021-02-11",
"2021-02-12", "2021-02-15",
"2021-02-16", "2021-03-11",
"2021-03-21", "2021-03-22",
"2021-03-23", "2021-03-24",
"2021-04-02", "2021-04-12",
"2021-04-22", "2021-04-26",
"2021-04-30"]
numbers= [6400,
5100,5000,
4000,3686,
9000,8050,
8000,6050,
6000,9000,
8500,7800,
7000,6000,
10000,9600,
8000,7883,
6686]
dates = [pd.to_datetime(d) for d in dates]
plt.scatter(dates, numbers, s =100, c = 'red')
plt.show()
But when there are one or more extreme numbers, for example if the last number 6686 becomes 66860, the new plot squeezes most of the points together and makes them insignificant (because of the new y-axis range).
What is a good way to keep the scatter plot as before (with the y-axis as it was) while still visualizing the extreme numbers?
The purpose of the chart is to show and focus on the distribution of the points under 10000, while also noting that extreme numbers exist.

If you don't want to use a log scale, you can break the plot in two (or more) and plot the values below/above a threshold:
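df = pd.DataFrame({'num': numbers}, index=dates)
thresh = 12000  # values at or above this go in the upper panel
# two stacked axes sharing the x-axis; the lower (main) panel is three times taller
f, (ax1, ax2) = plt.subplots(nrows=2, sharex=True,
                             gridspec_kw={'height_ratios': (1, 3)},
                             figsize=(10, 4)
                             )
lows = df.mask(df['num'].ge(thresh))   # NaN where num >= thresh
highs = df.mask(df['num'].lt(thresh))  # NaN where num < thresh
ax2.scatter(df.index, lows['num'])
ax1.scatter(df.index, highs['num'])
plt.show()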
output:
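The answer above names a log scale as the alternative to splitting the plot. As a minimal sketch of that option (the helper name `scatter_log` is hypothetical, and the shortened data assumes the last value has become 66860 as described in the question), a logarithmic y-axis keeps the outlier on the same axes as the points below 10000:

```python
import matplotlib.pyplot as plt
import pandas as pd


def scatter_log(dates, numbers):
    """Scatter the values on a logarithmic y-axis (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.scatter(pd.to_datetime(dates).to_numpy(), numbers, s=100, c="red")
    ax.set_yscale("log")  # compresses the gap between typical values and outliers
    return fig, ax


# shortened sample in the spirit of the question, with one extreme value
dates = ["2021-01-01", "2021-02-01", "2021-03-11", "2021-04-02", "2021-04-30"]
numbers = [6400, 9000, 9000, 10000, 66860]
fig, ax = scatter_log(dates, numbers)
```

Calling `plt.show()` afterwards displays the figure. The trade-off is that distances on a log axis are no longer proportional to the values, which is why the split-axis approach may be preferable when the distribution below 10000 is the main point.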

Seaborn color bar on FacetGrid for histplot with normalized color mapping

I seem unable to show the color bar for a two dimensional histplot using seaborn FacetGrid. Can someone point me to the missing link please?
Understanding that similar solutions have been discussed I have not been able to adapt to my use case:
Has the right position and values for color bar but isn't working for histplot
This proposal is not running at all & is rather dated so I am not sure it is still supposed to work
Seems to have fixed vmin/vmax and does not work with histplot
Specifically I am looking to extend the code below so that color bar is shown.
import random
import string
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame(list(zip([random.randint(0,10) for i in range(1000)], pd.to_datetime(
[d.strftime('%Y-%m-%d') for d in pd.date_range('1800-01-01', periods=250, freq='1d')]+\
[d.strftime('%Y-%m-%d') for d in pd.date_range('1800-01-01', periods=250, freq='1d')]+\
[d.strftime('%Y-%m-%d') for d in pd.date_range('1800-01-01', periods=250, freq='1d')]+\
[d.strftime('%Y-%m-%d') for d in pd.date_range('1800-01-01', periods=250, freq='1d')]),
[random.choice(string.ascii_letters[26:30]) for i in range(1000)])),
columns=["range","date","case_type"])
df["range"][df["case_type"]=="A"] = [random.randint(4562,873645) for i in range(1000)]
df["range"][df["case_type"]=="C"] = [random.random() for i in range(1000)]
fg = sns.FacetGrid(df, col="case_type", col_wrap=2, sharey=False)
fg.map(sns.histplot, "date", "range", stat="count", data=df)
fg.set_xticklabels(rotation=30)
fg.fig.show()
The objective is to have a color bar on the right side of the facet grid, spanning the entire chart (two rows here, but more may be shown). The displayed 2D histograms feature some very different data types, so the counts per bin and the colors are likely very different, and it matters to know whether "dark blue" is 100 or 1000.
EDIT: For sake of clarity it appears from comments that the problem breaks down into two steps:
How to normalize the color coding among all plots and
Display a color bar on the right side of the plot using the normalized color mapping
I am not sure there is a seaborn-inherent way to achieve your desired plot. But we can pre-compute sensible values for bin number and vmin/vmax and apply them to all histplots:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
#generate a test dataset with different case_type probabilities
np.random.seed(123)
p1, p2, p3 = 0.8, 0.1, 0.03
df = pd.DataFrame(list(zip(np.random.randint(0, 20, 1000),
pd.to_datetime(4 * [d.strftime('%Y-%m-%d') for d in pd.date_range('1800-01-01', periods=250, freq='1d')]),
np.random.choice(list("ABCD"),size=1000, p=[p1, p2, p3, 1-(p1+p2+p3)]))),
columns=["range","date","case_type"])
df.loc[df.case_type == "A", "range"] *= 3
df.loc[df.case_type == "B", "range"] *= 23
df.loc[df.case_type == "C", "range"] *= 123
#determine the bin number for the x-axis
_, bin_edges = np.histogram(df["date"].dt.strftime("%Y%m%d").astype(int), bins="auto")
bin_nr = len(bin_edges)-1
#predetermine min and max count for each category
c_types = df["case_type"].unique()
vmin_list, vmax_list = [], []
for c_type in c_types:
    arr, _, _ = np.histogram2d(df.loc[df.case_type == c_type, "date"], df.loc[df.case_type == c_type, "range"], bins=bin_nr)
    vmin_list.append(arr.min())
    vmax_list.append(arr.max())
#find lowest and highest counts for all subplots
vmin_all = min(vmin_list)
vmax_all = max(vmax_list)
#now we are ready to plot
fg = sns.FacetGrid(df, col="case_type", col_wrap=2, sharey=False)
#create common colorbar axis
cax = fg.fig.add_axes([.92, .12, .02, .8])
#map colorbar to colorbar axis with common vmin/vmax values
fg.map(sns.histplot,"date", "range", stat="count", bins=bin_nr, vmin=vmin_all, vmax=vmax_all, cbar=True, cbar_ax=cax, data=df)
#prevent overlap
fg.fig.subplots_adjust(right=.9)
fg.set_xticklabels(rotation=30)
plt.show()
Sample output:
You may also notice that I changed your sample dataframe so that the case_types occur at different frequencies, otherwise you don't see much difference between histplots. You should also be aware that the histplots are plotted in the order they appear in the dataframe, which might not be the order you would like to see in your graph.
Disclaimer: This is largely based on mwaskom's answer.
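The two steps from the question (normalize the colors across all plots, then show one color bar for that normalization) can also be sketched with plain matplotlib instead of seaborn. This is only an illustration under assumptions: the data in `samples` is made up, and the variable names are hypothetical. All histograms are computed first, a single `Normalize` object is built from their global minimum and maximum, and every panel is drawn with that same object:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

rng = np.random.default_rng(123)
# hypothetical data: per-category (x, y) samples with very different sizes
samples = {
    "A": rng.normal(size=(800, 2)),
    "B": rng.normal(size=(150, 2)),
    "C": rng.normal(size=(50, 2)),
}
bins = 10

# compute all 2D histograms first so the global count range is known
edges = np.linspace(-4, 4, bins + 1)
hists = {k: np.histogram2d(v[:, 0], v[:, 1], bins=[edges, edges])[0]
         for k, v in samples.items()}
norm = Normalize(vmin=min(h.min() for h in hists.values()),
                 vmax=max(h.max() for h in hists.values()))

fig, axes = plt.subplots(1, len(hists), figsize=(9, 3), sharey=True)
meshes = []
for ax, (name, h) in zip(axes, hists.items()):
    # histogram2d returns counts indexed [x, y]; pcolormesh expects [y, x]
    meshes.append(ax.pcolormesh(edges, edges, h.T, norm=norm, cmap="Blues"))
    ax.set_title(name)

# one colorbar for all panels; it is valid for every mesh because they share norm
cbar = fig.colorbar(meshes[0], ax=axes, location="right")
cbar.set_label("count")
```

Because every mesh shares the same `norm`, a given shade of blue means the same count in every panel, which is the property the question asks for.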

How to create a wind rose or polar bar plot

I would like to write a scouting report on some football players, and for that I need visualizations, one type of which is pie charts. I need pie charts that look like the one below, with slices of different sizes (proportionate to the number the slice represents). Can anyone suggest how to do this, or link to a website where I can learn it?
What you are looking for is called a "Radar Pie Chart". It's analogous to the more commonly used "Radar Chart", but I think it looks better as it highlights the values, rather than focus on meaningless shapes.
The challenge you face with your football dataset is that each category is on a different scale, so you want to plot each value as a percentage of some max. My code will accomplish that, but you'll want to annotate the original values to finish off these charts.
The plot itself can be done with just the standard matplotlib library using polar axes. I borrowed code from here (https://raphaelletseng.medium.com/getting-to-know-matplotlib-and-python-docx-5ee67bad38d2).
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from math import pi
from random import random, seed
seed(12345)
# Generate dataset with 10 rows, different maxes
maxes = [5, 5, 5, 2, 2, 10, 10, 10, 10, 10]
df = pd.DataFrame(
data = {
'categories': ['category_{}'.format(x) for x, _ in enumerate(maxes)],
'scores': [random()*m for m in maxes],  # m avoids shadowing the built-in max
'max_values': maxes,
},
)
df['pct'] = df['scores'] / df['max_values']
df = df.set_index('categories')
# Plot pie radar chart
N = df.shape[0]
theta = np.linspace(0.0, 2*np.pi, N, endpoint=False)
categories = df.index
df['radar_angles'] = theta
ax = plt.subplot(polar=True)
ax.bar(df['radar_angles'], df['pct'], width=2*pi/N, linewidth=2, edgecolor='k', alpha=0.5)
ax.set_xticks(theta)
ax.set_xticklabels(categories)
_ = ax.set_yticklabels([])
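The answer notes that the original values still need to be annotated, since each bar only shows a fraction of its own maximum. A minimal sketch of that last step follows; the category names and numbers are invented for illustration and are not from the question:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# hypothetical scores and per-category maxima
df = pd.DataFrame(
    {"scores": [3.2, 1.5, 7.8, 4.1], "max_values": [5, 2, 10, 10]},
    index=["pace", "passing", "shots", "tackles"],
)
df["pct"] = df["scores"] / df["max_values"]

n = len(df)
theta = np.linspace(0.0, 2 * np.pi, n, endpoint=False)

fig = plt.figure()
ax = fig.add_subplot(polar=True)
ax.bar(theta, df["pct"], width=2 * np.pi / n, linewidth=2, edgecolor="k", alpha=0.5)
ax.set_xticks(theta)
ax.set_xticklabels(df.index)
ax.set_yticklabels([])
ax.set_ylim(0, 1)  # every bar is a fraction of its own maximum

# write the original (unscaled) value in the middle of each bar
labels = [
    ax.text(angle, pct / 2, f"{score:g}", ha="center", va="center")
    for angle, pct, score in zip(theta, df["pct"], df["scores"])
]
```

In polar axes the first argument of `ax.text` is the angle and the second is the radius, so `pct / 2` places each label halfway up its bar.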
I have previously worked with rose (polar bar) charts. Here is an example using Plotly Express:
import plotly.express as px
df = px.data.wind()
fig = px.bar_polar(df, r="frequency", theta="direction",
color="strength", template="plotly_dark",
color_discrete_sequence= px.colors.sequential.Plasma_r)
fig.show()

Axis ticks in histogram of times in matplotlib/seaborn

I've got a df with messages from a WhatsApp chat, the sender and the corresponding time in datetime format.
Time                 Sender    Message
2020-12-21 22:23:00  Sender 1  "..."
2020-12-21 22:26:00  Sender 2  "..."
2020-12-21 22:35:00  Sender 1  "..."
I can plot the histogram with sns.histplot(df["Time"], bins=48)
But now the ticks on the x-axis don't make much sense. I end up with 30 ticks even though there should be 24, and each tick label contains the whole date plus the time, whereas I want only the time in "%H:%M" format.
Where is the issue with the wrong ticks coming from?
Thanks!
Both seaborn and pandas use matplotlib for their plotting functions. Let's see which of them returns the bin values that we need in order to adapt the x-ticks:
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(15, 5))
#fake data generation
np.random.seed(1234)
n=20
start = pd.to_datetime("2020-11-15")
df = pd.DataFrame({"Time": pd.to_timedelta(np.random.rand(n), unit="D") + start, "A": np.random.randint(1, 100, n)})
#print(df)
#pandas histogram plotting function, left
pd_g = df["Time"].hist(bins=5, xrot=90, ax=ax1)
#no bin information
print(pd_g)
ax1.set_title("Pandas")
#seaborn histogram plotting, middle
sns_g = sns.histplot(df["Time"], bins=5, ax=ax2)
ax2.tick_params(axis="x", labelrotation=90)
#no bin information
print(sns_g)
ax2.set_title("Seaborn")
#matplotlib histogram, right
mpl_g = ax3.hist(df["Time"], bins=5, edgecolor="white")
ax3.tick_params(axis="x", labelrotation=90)
#hooray, bin information, alas in floats representing dates
print(mpl_g)
ax3.set_title("Matplotlib")
plt.tight_layout()
plt.show()
Sample output:
From this exercise we can conclude that all three refer to the same routine. So, we can directly use matplotlib which provides us with the bin values:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.dates import num2date
fig, ax = plt.subplots(figsize=(8, 5))
#fake data generation
np.random.seed(1234)
n=20
start = pd.to_datetime("2020-11-15")
df = pd.DataFrame({"Time": pd.to_timedelta(np.random.rand(n), unit="D") + start, "A": np.random.randint(1, 100, n)})
#plots histogram, returns counts, bin border values, and the bars themselves
h_vals, h_bins, h_bars = ax.hist(df["Time"], bins=5, edgecolor="white")
#plot x ticks at the place where the bin borders are
ax.set_xticks(h_bins)
#label them with dates in HH:MM format after conversion of the float values that matplotlib uses internally
ax.set_xticklabels([num2date(curr_bin).strftime("%H:%M") for curr_bin in h_bins])
plt.show()
Sample output:
Seaborn and pandas make life easier because they provide convenience wrappers and some additional functionality for commonly used plotting functions. However, if the parameters they provide do not suffice, one often has to revert to matplotlib, which is more flexible in what it can do. There might be an easier way in pandas or seaborn that I am not aware of; I will happily upvote any better suggestion within these libraries.
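One possible alternative to labeling the bin borders by hand is to let matplotlib's date locators and formatters place and format the ticks. The following is a sketch on made-up data, not the asker's chat log; it assumes all timestamps fall within a single day, and it converts them to matplotlib's float date numbers explicitly with `date2num` so the axis does not depend on any pandas converter:

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# fake data: 200 timestamps spread over one day
rng = np.random.default_rng(1234)
start = pd.Timestamp("2020-11-15")
times = start + pd.to_timedelta(rng.random(200), unit="D")

fig, ax = plt.subplots(figsize=(8, 5))
# convert to matplotlib's float date numbers, then use 48 bins (about 30 minutes each)
ax.hist(mdates.date2num(times.to_numpy()), bins=48, edgecolor="white")

ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))   # a tick every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))  # show the time only
fig.autofmt_xdate()
```

With this approach the ticks sit on round hours rather than on bin borders, so the number of ticks is controlled by the `interval` argument and not by the number of bins.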

Plot point on time series line graph

I have this dataframe and I want to plot it as a line plot, which I have done.
The graph is:
The code to generate it is:
fig, ax = plt.subplots(figsize=(15, 5))
date_time = pd.to_datetime(df.Date)
df = df.set_index(date_time)
plt.xticks(rotation=90)
pd.DataFrame(df, columns=df.columns).plot.line( ax=ax,
xticks=pd.to_datetime(frame.Date))
I want a marker showing the innovationScore value on the open and close lines wherever innovationScore is not 0. The goal is to show where the change occurs when innovationScore changes.
You have to address two problems - marking the corresponding spots on your curves and using the dates on the x-axis. The first problem can be solved by identifying the dates, where the score is not zero, then plotting markers on top of the curve at these dates. The second problem is more of a structural nature - pandas often interferes with matplotlib when it comes to date time objects. Using pandas standard plotting functions is good because it addresses common problems. But mixing pandas with matplotlib plotting (and to set the markers, you have to revert to matplotlib afaik) is usually a bad idea because they do not necessarily present the date time in the same format.
import pandas as pd
from matplotlib import pyplot as plt
#fake data generation, the following code block is just for illustration
import numpy as np
np.random.seed(1234)
n = 50
date_range = pd.date_range("20180101", periods=n, freq="D")
choice = np.zeros(10)
choice[0] = 3
df = pd.DataFrame({"Date": date_range,
"Open": np.random.randint(100, 150, n),
"Close": np.random.randint(100, 150, n),
"Innovation Score": np.random.choice(choice, n)})
fig, ax = plt.subplots()
#plot the three curves
l = ax.plot(df["Date"], df[["Open", "Close", "Innovation Score"]])
ax.legend(iter(l), ["Open", "Close", "Innovation Score"])
#filter dataset for score not zero
IS = df[df["Innovation Score"] > 0]
#plot markers on these positions
ax.plot(IS["Date"], IS[["Open", "Close"]], "ro")
#and/or set vertical lines to indicate the position
ax.vlines(IS["Date"], 0, max(df[["Open", "Close"]].max()), ls="--")
#label x-axis score not zero
ax.set_xticks(IS["Date"])
#beautify the output
ax.set_xlabel("Month")
ax.set_ylabel("Artificial score people take seriously")
fig.autofmt_xdate()
plt.show()
Sample output:
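The question also asks for the score value to be shown at each marker. As a rough sketch of one way to do that (the data below is invented, and only the "Open" line is drawn to keep it short), `markevery` restricts the markers to the rows with a non-zero score and `ax.annotate` writes the score next to each of them:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# fake data with a few non-zero scores (column names follow the answer above)
dates = pd.date_range("2018-01-01", periods=10, freq="D")
df = pd.DataFrame({
    "Date": dates,
    "Open": [100, 104, 101, 108, 110, 107, 112, 115, 111, 118],
    "Innovation Score": [0, 0, 3, 0, 0, 0, 5, 0, 0, 0],
})

# integer positions of the rows where the score is not zero
hits = np.flatnonzero(df["Innovation Score"].to_numpy() != 0)

fig, ax = plt.subplots()
# markevery draws markers only at the listed positions of the line
ax.plot(df["Date"].to_numpy(), df["Open"].to_numpy(), "-o",
        markevery=hits.tolist(), label="Open")

# label every marked point with its score value
notes = [
    ax.annotate(
        str(df["Innovation Score"].iloc[i]),
        xy=(df["Date"].iloc[i], df["Open"].iloc[i]),
        xytext=(0, 8),
        textcoords="offset points",
        ha="center",
    )
    for i in hits
]
ax.legend()
fig.autofmt_xdate()
```

The same pattern can be repeated for the "Close" line; the offset of 8 points in `xytext` is an arbitrary choice that lifts the label just above the marker.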
You can achieve it like this:
import matplotlib.pyplot as plt
plt.plot([1, 2, 3], "ro-") # r is red, o is a circle marker, - is a solid line
plt.plot([3, 2, 1], "b.-") # b is blue, . is a (smaller) point marker, - is a solid line
plt.show()
Also check out this example for another approach:
https://matplotlib.org/3.3.3/gallery/lines_bars_and_markers/markevery_prop_cycle.html
I very often get inspiration from this list of examples:
https://matplotlib.org/3.3.3/gallery/index.html

Fix x-axis scale seaborn factorplot

I'm attempting to make a figure that shows two plots, with each plot separated based on a set of categorical data. However, although I can make the graph, I can't figure out how to get the x-axis to be properly spaced.
I want the x-axis to start before the first value (at 60; the first value is 63) and end after the last (at 95; the last value is 92.1), with xticks going up in 5's.
Any help is much appreciated! Thanks in advance!
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.axes
import seaborn as sns
Temperature = [63.0,63.3,63.6,63.9,64.2,64.5,64.8,65.2,65.5,65.8,66.1,66.4,66.7,67.0,67.3,67.7,68.0,68.3,68.6,68.9,69.2,69.5,69.9,70.2,70.5,70.8,71.1,71.4,71.8,72.1,72.4,72.7,73.0,73.4,73.7,74.0,74.3,74.6,74.9,75.2,75.6,75.9,76.2,76.5,76.9,77.2,77.5,77.8,78.1,78.5,78.8,79.1,79.4,79.7,80.1,80.4,80.7,81.0,81.3,81.6,81.9,82.3,82.6,82.9,83.2,83.5,83.8,84.1,84.4,84.8,85.1,85.4,85.7,86.0,86.3,86.6,86.9,87.2,87.5,87.8,88.1,88.4,88.7,89.0,89.3,89.6,89.8,90.1,90.4,90.7,91.0,91.2,91.5,91.8,92.1,63.0,63.3,63.6,63.9,64.2,64.5,64.8,65.2,65.5,65.8,66.1,66.4,66.7,67.0,67.3,67.7,68.0,68.3,68.6,68.9,69.2,69.5,69.9,70.2,70.5,70.8,71.1,71.4,71.8,72.1,72.4,72.7,73.0,73.4,73.7,74.0,74.3,74.6,74.9,75.2,75.6,75.9,76.2,76.5,76.9,77.2,77.5,77.8,78.1,78.5,78.8,79.1,79.4,79.7,80.1,80.4,80.7,81.0,81.3,81.6,81.9,82.3,82.6,82.9,83.2,83.5,83.8,84.1,84.4,84.8,85.1,85.4,85.7,86.0,86.3,86.6,86.9,87.2,87.5,87.8,88.1,88.4,88.7,89.0,89.3,89.6,89.8,90.1,90.4,90.7,91.0,91.2,91.5,91.8,92.1]
Derivative = [0.0495,0.0507,0.0525,0.0548,0.0570,0.0579,0.0579,0.0574,0.0574,0.0576,0.0581,0.0587,0.0593,0.0592,0.0584,0.0580,0.0579,0.0580,0.0582,0.0588,0.0592,0.0594,0.0588,0.0581,0.0578,0.0579,0.0580,0.0579,0.0582,0.0581,0.0579,0.0574,0.0571,0.0563,0.0548,0.0538,0.0536,0.0540,0.0544,0.0551,0.0556,0.0551,0.0542,0.0535,0.0536,0.0542,0.0564,0.0623,0.0748,0.0982,0.1360,0.1897,0.2550,0.3228,0.3807,0.4177,0.4248,0.3966,0.3365,0.2558,0.1713,0.0971,0.0438,0.0140,0.0034,0.0028,0.0048,0.0058,0.0057,0.0050,0.0042,0.0038,0.0039,0.0041,0.0038,0.0031,0.0023,0.0017,0.0014,0.0012,0.0015,0.0019,0.0020,0.0018,0.0017,0.0015,0.0014,0.0014,0.0015,0.0014,0.0013,0.0011,0.0007,0.0004,0.0011,0.0105,0.0100,0.0096,0.0090,0.0084,0.0081,0.0077,0.0071,0.0066,0.0063,0.0064,0.0060,0.0057,0.0055,0.0054,0.0051,0.0047,0.0046,0.0042,0.0037,0.0035,0.0040,0.0043,0.0039,0.0032,0.0028,0.0028,0.0027,0.0029,0.0034,0.0038,0.0034,0.0027,0.0024,0.0021,0.0017,0.0015,0.0016,0.0015,0.0011,0.0008,0.0012,0.0019,0.0025,0.0027,0.0026,0.0019,0.0012,0.0010,0.0014,0.0016,0.0014,0.0010,0.0007,0.0007,0.0010,0.0017,0.0021,0.0020,0.0013,0.0012,0.0013,0.0014,0.0015,0.0018,0.0017,0.0012,0.0013,0.0018,0.0028,0.0031,0.0033,0.0027,0.0022,0.0015,0.0016,0.0022,0.0026,0.0026,0.0019,0.0012,0.0006,0.0007,0.0011,0.0016,0.0014,0.0010,0.0009,0.0012,0.0015,0.0014,0.0008,0.0001,-0.0003,0.0002]
Category = ["a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b"]
df = pd.DataFrame({"Temperature": Temperature,
"Derivative": Derivative,
"Category" : Category})
g = sns.factorplot(x="Temperature", y="Derivative", data=df, col="Category")
g.set_xticklabels(step=10)
All the desired features you describe suggest that a factorplot is the wrong choice here. Instead, use a normal matplotlib plot and then set the limits as usual with plt.xlim(60,95).
import pandas as pd
import matplotlib.pyplot as plt
Temperature = [63.0,63.3,63.6,63.9,64.2,64.5,64.8,65.2,65.5,65.8,66.1,66.4,66.7,67.0,67.3,67.7,68.0,68.3,68.6,68.9,69.2,69.5,69.9,70.2,70.5,70.8,71.1,71.4,71.8,72.1,72.4,72.7,73.0,73.4,73.7,74.0,74.3,74.6,74.9,75.2,75.6,75.9,76.2,76.5,76.9,77.2,77.5,77.8,78.1,78.5,78.8,79.1,79.4,79.7,80.1,80.4,80.7,81.0,81.3,81.6,81.9,82.3,82.6,82.9,83.2,83.5,83.8,84.1,84.4,84.8,85.1,85.4,85.7,86.0,86.3,86.6,86.9,87.2,87.5,87.8,88.1,88.4,88.7,89.0,89.3,89.6,89.8,90.1,90.4,90.7,91.0,91.2,91.5,91.8,92.1,63.0,63.3,63.6,63.9,64.2,64.5,64.8,65.2,65.5,65.8,66.1,66.4,66.7,67.0,67.3,67.7,68.0,68.3,68.6,68.9,69.2,69.5,69.9,70.2,70.5,70.8,71.1,71.4,71.8,72.1,72.4,72.7,73.0,73.4,73.7,74.0,74.3,74.6,74.9,75.2,75.6,75.9,76.2,76.5,76.9,77.2,77.5,77.8,78.1,78.5,78.8,79.1,79.4,79.7,80.1,80.4,80.7,81.0,81.3,81.6,81.9,82.3,82.6,82.9,83.2,83.5,83.8,84.1,84.4,84.8,85.1,85.4,85.7,86.0,86.3,86.6,86.9,87.2,87.5,87.8,88.1,88.4,88.7,89.0,89.3,89.6,89.8,90.1,90.4,90.7,91.0,91.2,91.5,91.8,92.1]
Derivative = [0.0495,0.0507,0.0525,0.0548,0.0570,0.0579,0.0579,0.0574,0.0574,0.0576,0.0581,0.0587,0.0593,0.0592,0.0584,0.0580,0.0579,0.0580,0.0582,0.0588,0.0592,0.0594,0.0588,0.0581,0.0578,0.0579,0.0580,0.0579,0.0582,0.0581,0.0579,0.0574,0.0571,0.0563,0.0548,0.0538,0.0536,0.0540,0.0544,0.0551,0.0556,0.0551,0.0542,0.0535,0.0536,0.0542,0.0564,0.0623,0.0748,0.0982,0.1360,0.1897,0.2550,0.3228,0.3807,0.4177,0.4248,0.3966,0.3365,0.2558,0.1713,0.0971,0.0438,0.0140,0.0034,0.0028,0.0048,0.0058,0.0057,0.0050,0.0042,0.0038,0.0039,0.0041,0.0038,0.0031,0.0023,0.0017,0.0014,0.0012,0.0015,0.0019,0.0020,0.0018,0.0017,0.0015,0.0014,0.0014,0.0015,0.0014,0.0013,0.0011,0.0007,0.0004,0.0011,0.0105,0.0100,0.0096,0.0090,0.0084,0.0081,0.0077,0.0071,0.0066,0.0063,0.0064,0.0060,0.0057,0.0055,0.0054,0.0051,0.0047,0.0046,0.0042,0.0037,0.0035,0.0040,0.0043,0.0039,0.0032,0.0028,0.0028,0.0027,0.0029,0.0034,0.0038,0.0034,0.0027,0.0024,0.0021,0.0017,0.0015,0.0016,0.0015,0.0011,0.0008,0.0012,0.0019,0.0025,0.0027,0.0026,0.0019,0.0012,0.0010,0.0014,0.0016,0.0014,0.0010,0.0007,0.0007,0.0010,0.0017,0.0021,0.0020,0.0013,0.0012,0.0013,0.0014,0.0015,0.0018,0.0017,0.0012,0.0013,0.0018,0.0028,0.0031,0.0033,0.0027,0.0022,0.0015,0.0016,0.0022,0.0026,0.0026,0.0019,0.0012,0.0006,0.0007,0.0011,0.0016,0.0014,0.0010,0.0009,0.0012,0.0015,0.0014,0.0008,0.0001,-0.0003,0.0002]
Category = ["a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b"]
df = pd.DataFrame({"Temperature": Temperature,
"Derivative": Derivative,
"Category" : Category})
for n, data in df.groupby("Category"):
    plt.plot(data["Temperature"], data["Derivative"], marker="o", label=n)
plt.xlim(60,95)
plt.legend()
plt.show()
Or if subplots are desired,
fig,axes = plt.subplots(ncols=len(df["Category"].unique()), sharey=True)
for ax, (n, data) in zip(axes, df.groupby("Category")):
    ax.plot(data["Temperature"], data["Derivative"], marker="o", label=n)
    ax.set_title("Category {}".format(n))
    ax.set_xlim(60, 95)
plt.show()
Finally, you may use a seaborn FacetGrid, onto which you map your data with plt.plot:
g = sns.FacetGrid(df, col="Category")
g.map(plt.plot, "Temperature", "Derivative", marker="o")
for ax in g.axes.flat:
    ax.set_xlim(60, 95)
plt.show()
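The snippets above set the axis limits but not the tick spacing the question asks for ("xticks going up in 5's"). A small sketch of that remaining step, using stand-in data rather than the question's lists, is to attach a `MultipleLocator` to the x-axis:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MultipleLocator

# stand-in data covering the same range as the question's temperatures
temperature = np.linspace(63.0, 92.1, 95)
derivative = np.exp(-((temperature - 80.5) ** 2) / 8)

fig, ax = plt.subplots()
ax.plot(temperature, derivative, marker="o")
ax.set_xlim(60, 95)                             # start before 63, end after 92.1
ax.xaxis.set_major_locator(MultipleLocator(5))  # ticks at multiples of 5
```

The same two calls can be applied to each axes in the subplot and FacetGrid loops. An equivalent option would be `ax.set_xticks(np.arange(60, 96, 5))`, which fixes the tick positions explicitly instead of using a locator.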
