Different y axis in one array subplot - python

I don't know how to tell matplotlib to use a different y axis in one particular subplot of a subplot array:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
def plotter():
    y = np.random.rand(10)
    y1 = np.random.rand(10) * 100
    x = np.arange(len(y))
    f, axarr = plt.subplots(2, 2, sharex=True)
    axarr[0][0].errorbar(x, y)
    axarr[0][0].errorbar(x, y1)
    axarr[1][1].twinx()
    axarr[1][1].errorbar(x, y)
    axarr[1][1].errorbar(x, y1)
    plt.show()
plotter()
This gives:
The issue is that one of my data sets is larger by a factor of a hundred, so plotting both on the same y axis is useless. What I want for the lower right panel (and only for this panel) is one y axis that ranges from (0,10) on the right side of the plot and one that ranges from (0,100) on the left side. One line should be represented by the right (0,10) y axis, while the other should be represented by the left (0,100) y axis.

One way of doing this is:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
def plotter():
    y = np.random.rand(10)
    y1 = np.random.rand(10) * 100
    x = np.arange(len(y))
    f, axarr = plt.subplots(2, 2, sharex=True)
    axarr[0][0].errorbar(x, y)
    axarr[0][0].errorbar(x, y1)
    axarr[1][1].errorbar(x, y)
    # twinx() returns a new Axes that shares x with the panel but has its own y axis
    ax2 = axarr[1][1].twinx()
    ax2.plot(x, y1, 'r')
    #ax2.tick_params('y', colors='r')
    plt.show()
plotter()
Which gives this:
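A slightly more explicit variant of the same idea is sketched below. It is only an illustration: the names ax_left and ax_right, the colors, and the seeded random data are assumptions, and the left limit of (0, 100) is taken from the question while the right axis is left to autoscale. The key point is that twinx() returns a new Axes, and the second data set has to be plotted on that returned object.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded stand-in data with the same shapes and scales as in the question.
rng = np.random.default_rng(0)
y = rng.random(10)
y1 = rng.random(10) * 100
x = np.arange(len(y))

fig, axarr = plt.subplots(2, 2, sharex=True)

ax_left = axarr[1, 1]          # the lower right panel
ax_right = ax_left.twinx()     # new Axes: same x axis, independent y axis on the right

ax_left.plot(x, y1, color="b")   # large values use the left y axis
ax_right.plot(x, y, color="r")   # small values use the right y axis

ax_left.set_ylim(0, 100)         # range requested in the question for the left axis

# Color the tick labels so it is clear which axis belongs to which line.
ax_left.tick_params(axis="y", colors="b")
ax_right.tick_params(axis="y", colors="r")

# plt.show()  # uncomment to display the figure interactively
```

Only the lower right panel gets a second y axis; the other three panels are untouched.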

Related

Create a heat map out of three 1D arrays

I want to create a heatmap out of three one-dimensional arrays. Something that looks like this:
Up to this point, I was only able to create a scatter plot where the markers have a different color and marker size depending on the third value:
My code:
xf = np.random.rand(1000)
yf = np.random.rand(1000)
zf = 1e5*np.random.rand(1000)
ms1 = (zf).astype('int')
from matplotlib.colors import LinearSegmentedColormap
# Remove the middle 40% of the RdBu_r colormap
interval = np.hstack([np.linspace(0, 0.4), np.linspace(0.6, 1)])
colors = plt.cm.RdBu_r(interval)
cmap = LinearSegmentedColormap.from_list('name', colors)
col = cmap(np.linspace(0,1,len(ms1)))
#for i in range(len(ms1)):
plt.scatter(xf, yf, c=zf, s=5*ms1/1e4, cmap=cmap,alpha=0.8)#, norm =matplotlib.colors.LogNorm())
ax1 =plt.colorbar(pad=0.01)
is giving me this result:
Any idea how I could make it look like the first figure?
Essentially what I want to do is find the average of the z value for groups of the x and y arrays
I think the functionality you are looking for is provided by scipy.stats.binned_statistic_2d. You can use it to organize values of xf and yf arrays into 2-dimensional bins, and compute the mean of zf values in each bin:
import numpy as np
from scipy import stats
np.random.seed(0)
xf = np.random.rand(1000)
yf = np.random.rand(1000)
zf = 1e5 * np.random.rand(1000)
means = stats.binned_statistic_2d(xf,
                                  yf,
                                  values=zf,
                                  statistic='mean',
                                  bins=(5, 5))[0]
Then you can use e.g. seaborn to plot a heatmap of the array of mean values:
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(10, 8))
sns.heatmap(means,
            cmap="Reds_r",
            annot=True,
            annot_kws={"fontsize": 16},
            cbar=True,
            linewidths=2,
            square=True)
plt.show()
This gives:
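To see what the binning step produces, here is a minimal sketch on four hand-picked points (the data, the 2 x 2 grid and the explicit range are made up for illustration). It shows that each cell holds the mean of the z values falling into it, and that empty cells come back as NaN.

```python
import numpy as np
from scipy import stats

# Four made-up points: two in the lower-left quadrant, two in the upper-right.
x = np.array([0.1, 0.2, 0.8, 0.9])
y = np.array([0.1, 0.2, 0.8, 0.9])
z = np.array([10.0, 20.0, 30.0, 50.0])

result = stats.binned_statistic_2d(x, y, values=z,
                                   statistic='mean',
                                   bins=(2, 2),
                                   range=[[0, 1], [0, 1]])
means = result.statistic

# means[i, j] is the mean of z for x in bin i and y in bin j.
print(means)
```

Because the first index of the result runs over the x bins, you may want to plot means.T if x should run along the horizontal axis of the heatmap.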

Bar Chart using Matplotlib

I have two values:
test1 = 0.75565
test2 = 0.77615
I am trying to plot a bar chart (using matplotlib in a Jupyter notebook) with the two tests on the x-axis and their values on the y-axis, but I keep getting a crazy plot with just one big box.
Here is the code I've tried:
plt.bar(test1, 1, width = 2, label = 'test1')
plt.bar(test2, 1, width = 2, label = 'test2')
As you can see in this example, you should define X and Y in two separate arrays, so you can do it like this:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(2)
y = [0.75565,0.77615]
fig, ax = plt.subplots()
plt.bar(x, y)
# set your labels for the x axis here :
plt.xticks(x, ('test1', 'test2'))
plt.show()
The final plot looks like this:
UPDATE
If you want each bar in a different color, you can call the bar method once per bar. You can pass a color to each call, although matplotlib cycles through its default colors if you don't:
import matplotlib.pyplot as plt
import numpy as np
number_of_points = 2
x = np.arange(number_of_points)
y = [0.75565,0.77615]
fig, ax = plt.subplots()
for i in range(number_of_points):
    plt.bar(x[i], y[i])
# set your labels for the x axis here :
plt.xticks(x, ('test1', 'test2'))
plt.show()
Or, better still, you can choose the colors yourself:
import matplotlib.pyplot as plt
import numpy as np
number_of_points = 2
x = np.arange(number_of_points)
y = [0.75565,0.77615]
# choosing the colors and keeping them in a list
colors = ['g','b']
fig, ax = plt.subplots()
for i in range(number_of_points):
    plt.bar(x[i], y[i], color=colors[i])
# set your labels for the x axis here :
plt.xticks(x, ('test1', 'test2'))
plt.show()
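As a possible shortcut for the per-bar coloring above, bar also accepts a list of colors and string labels in a single call. The sketch below assumes a Matplotlib version with categorical axis support; the variable names and the y label are made up.

```python
import matplotlib.pyplot as plt

labels = ["test1", "test2"]
values = [0.75565, 0.77615]

fig, ax = plt.subplots()
# One call draws both bars; the color list is applied bar by bar.
bars = ax.bar(labels, values, color=["g", "b"])
ax.set_ylabel("value")

# plt.show()  # uncomment to display the figure interactively
```

The string labels become the tick labels, so no separate xticks call is needed.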
The main reason your plot shows one big box is that the width you set for the bars (2) is far greater than the distance between the explicit x values you chose (about 0.02). Reduce the width to see the individual bars. The only advantage of doing it this way is if you need to set the x values (and y values) explicitly on a bar chart for some reason. Otherwise, the other answer is what you need for a "traditional bar chart".
import matplotlib.pyplot as plt
test1 = 0.75565
test2 = 0.77615
plt.bar(test1, 1, width = 0.01, label = 'test1')
plt.bar(test2, 1, width = 0.01, label = 'test2')

Marking y value using dotted line in matplotlib.pyplot

I am trying to plot a graph using matplotlib.pyplot.
import matplotlib.pyplot as plt
import numpy as np
x = [i for i in range (1,201)]
y = np.loadtxt('final_fscore.txt', dtype=np.float128)
plt.plot(x, y, lw=2)
plt.show()
It looks something like this:
I want to mark the first value of x where y reaches its maximum (which is already known, say x = 23, y = y[23]), as in the figure below:
I have been searching for a while with little success. For now I have tried adding a straight line, which does not behave the way I want:
import matplotlib.pyplot as plt
import numpy as np
x = [i for i in range (1,201)]
y = np.loadtxt('final_fscore.txt', dtype=np.float128)
plt.plot(x, y, lw=2)
plt.plot([23,y[23]], [23,0])
plt.show()
Resulting graph:
Note: I want to make the figure like in the second graph.
It's not clear what y[23] would do here. You would need to find out the maximum value and the index at which this occurs (np.argmax). You may then use this to plot a 3 point line with those coordinates.
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(9)
x = np.arange(200)
y = np.cumsum(np.random.randn(200))
plt.plot(x, y, lw=2)
amax = np.argmax(y)
xlim,ylim = plt.xlim(), plt.ylim()
# three points: bottom of the y axis -> the maximum -> left edge of the x axis
plt.plot([x[amax], x[amax], xlim[0]], [ylim[0], y[amax], y[amax]],
         linestyle="--")
plt.xlim(xlim)
plt.ylim(ylim)
plt.show()
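An alternative sketch of the same marking uses vlines and hlines with a dotted style. The data here is made up so that the maximum occurs twice, which also shows that np.argmax returns the index of the first occurrence, as the question asks.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up scores; the maximum value 0.9 appears at index 1 and index 3.
y = np.array([0.2, 0.9, 0.5, 0.9, 0.1])
x = np.arange(1, len(y) + 1)

imax = int(np.argmax(y))   # index of the first maximum

fig, ax = plt.subplots()
ax.plot(x, y, lw=2)

# Remember the axis origin before adding the marker lines.
xmin, ymin = ax.get_xlim()[0], ax.get_ylim()[0]

ax.vlines(x[imax], ymin, y[imax], linestyles="dotted")   # from the x axis up to the peak
ax.hlines(y[imax], xmin, x[imax], linestyles="dotted")   # from the y axis across to the peak

# Restore the origin so the dotted lines touch the axes.
ax.set_xlim(left=xmin)
ax.set_ylim(bottom=ymin)

# plt.show()  # uncomment to display the figure interactively
```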

Partial shade of distribution plot using Seaborn

Following simple code:
import numpy as np
import seaborn as sns
dist = np.random.normal(loc=0, scale=1, size=1000)
ax = sns.kdeplot(dist, shade=True);
Yields the following image:
I would like to shade only the area to the right (or left) of some x value. What is the simplest way? I am ready to use something other than Seaborn.
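After calling ax = sns.kdeplot(dist, shade=True), the last line in ax.get_lines() corresponds to the kde density curve. (This relies on kdeplot adding a line for the curve. Newer seaborn releases may draw a shaded KDE as a filled polygon only, in which case get_lines() is empty and the curve has to be drawn without shading first.)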
ax = sns.kdeplot(dist, shade=True)
line = ax.get_lines()[-1]
You can extract the data corresponding to that curve using line.get_data:
x, y = line.get_data()
Once you have the data, you can, for instance, shade the region corresponding to x > 0 by selecting those points and calling ax.fill_between:
mask = x > 0
x, y = x[mask], y[mask]
ax.fill_between(x, y1=y, alpha=0.5, facecolor='red')
The complete script:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
dist = np.random.normal(loc=0, scale=1, size=1000)
ax = sns.kdeplot(dist, shade=True)
line = ax.get_lines()[-1]
x, y = line.get_data()
mask = x > 0
x, y = x[mask], y[mask]
ax.fill_between(x, y1=y, alpha=0.5, facecolor='red')
plt.show()
Using seaborn is often fine for standard plots, but when some customized requirements come into play, falling back to matplotlib is often easier.
So one may first calculate the kernel density estimate and then plot it in the region of interest.
import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt
plt.style.use("seaborn-darkgrid")  # named "seaborn-v0_8-darkgrid" in Matplotlib 3.6 and later
dist = np.random.normal(loc=0, scale=1, size=1000)
kde = stats.gaussian_kde(dist)
# plot complete kde curve as line
pos = np.linspace(dist.min(), dist.max(), 101)
plt.plot(pos, kde(pos))
# plot shaded kde only right of x=0.5
shade = np.linspace(0.5,dist.max(), 101)
plt.fill_between(shade,kde(shade), alpha=0.5)
plt.ylim(0,None)
plt.show()
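A variant of the same approach, sketched here with a seeded sample and a hypothetical threshold variable, evaluates the density once and lets the where argument of fill_between select the region to shade, so the curve and the shaded area share the same grid.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
dist = rng.normal(loc=0, scale=1, size=1000)
kde = stats.gaussian_kde(dist)

# Evaluate the density once on a common grid.
pos = np.linspace(dist.min(), dist.max(), 201)
density = kde(pos)

threshold = 0.5   # hypothetical cut-off: shade everything to the right of it

fig, ax = plt.subplots()
ax.plot(pos, density)
# `where` restricts the fill to grid points at or beyond the threshold.
ax.fill_between(pos, density, where=pos >= threshold, alpha=0.5)
ax.set_ylim(0, None)

# plt.show()  # uncomment to display the figure interactively
```

Flipping the comparison to pos <= threshold shades the left side instead.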

How to plot one line in different colors

I have two lists as below:
latt=[42.0,41.978567980875397,41.96622693388357,41.963791391892457,...,41.972407378075879]
lont=[-66.706920989908909,-66.703116557977069,-66.707351643324543,...-66.718218142021925]
Now I want to plot these as a line, treating every 10 'latt' and 'lont' records as one period and giving each period a unique color.
What should I do?
There are several different ways to do this. The "best" approach will depend mostly on how many line segments you want to plot.
If you're just going to be plotting a handful (e.g. 10) line segments, then just do something like:
import numpy as np
import matplotlib.pyplot as plt
def uniqueish_color():
    """There're better ways to generate unique colors, but this isn't awful."""
    return plt.cm.gist_ncar(np.random.random())
xy = (np.random.random((10, 2)) - 0.5).cumsum(axis=0)
fig, ax = plt.subplots()
for start, stop in zip(xy[:-1], xy[1:]):
    x, y = zip(start, stop)
    ax.plot(x, y, color=uniqueish_color())
plt.show()
If you're plotting something with a million line segments, though, this will be terribly slow to draw. In that case, use a LineCollection. E.g.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
xy = (np.random.random((1000, 2)) - 0.5).cumsum(axis=0)
# Reshape things so that we have a sequence of:
# [[(x0,y0),(x1,y1)],[(x0,y0),(x1,y1)],...]
xy = xy.reshape(-1, 1, 2)
segments = np.hstack([xy[:-1], xy[1:]])
fig, ax = plt.subplots()
coll = LineCollection(segments, cmap=plt.cm.gist_ncar)
coll.set_array(np.random.random(len(segments)))  # one value per segment
ax.add_collection(coll)
ax.autoscale_view()
plt.show()
For both of these cases, we're just drawing random colors from the "gist_ncar" colormap. Have a look at the colormaps here (gist_ncar is about 2/3 of the way down): http://matplotlib.org/examples/color/colormaps_reference.html
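Applied to the question itself, where every 10 records form one period, the loop approach might look like the sketch below. The latt and lont values in the question are elided, so a seeded random walk stands in for them here; the period length of 10 comes from the question, and everything else (names, colormap range) is an assumption.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in track: the real latt/lont lists are not fully shown in the question.
rng = np.random.default_rng(0)
latt = 42.0 + 0.01 * rng.standard_normal(95).cumsum()
lont = -66.7 + 0.01 * rng.standard_normal(95).cumsum()

period = 10
n_periods = int(np.ceil(len(latt) / period))

# One color per period, sampled evenly from the colormap.
colors = plt.cm.gist_ncar(np.linspace(0, 0.9, n_periods))

fig, ax = plt.subplots()
for k in range(n_periods):
    start = k * period
    # Take one extra point so consecutive periods join up without a gap.
    stop = min(start + period + 1, len(latt))
    ax.plot(lont[start:stop], latt[start:stop], color=colors[k])

# plt.show()  # uncomment to display the figure interactively
```

With only a handful of periods, one plot call per period is fast enough; for very many periods the LineCollection approach is the better fit.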
Copied from this example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap, BoundaryNorm
x = np.linspace(0, 3 * np.pi, 500)
y = np.sin(x)
z = np.cos(0.5 * (x[:-1] + x[1:])) # first derivative
# Create a colormap for red, green and blue and a norm to color
# f' < -0.5 red, f' > 0.5 blue, and the rest green
cmap = ListedColormap(['r', 'g', 'b'])
norm = BoundaryNorm([-1, -0.5, 0.5, 1], cmap.N)
# Create a set of line segments so that we can color them individually
# This creates the points as a N x 1 x 2 array so that we can stack points
# together easily to get the segments. The segments array for line collection
# needs to be numlines x points per line x 2 (x and y)
points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
# Create the line collection object, setting the colormapping parameters.
# Have to set the actual values used for colormapping separately.
lc = LineCollection(segments, cmap=cmap, norm=norm)
lc.set_array(z)
lc.set_linewidth(3)
fig1 = plt.figure()
plt.gca().add_collection(lc)
plt.xlim(x.min(), x.max())
plt.ylim(-1.1, 1.1)
plt.show()
See the answer here to generate the "periods", then use the matplotlib scatter function as @tcaswell mentioned. With plt.hold you can plot each period in turn and the colors increment automatically (plt.hold was removed in Matplotlib 3.0, where successive plot calls on the same axes accumulate by default).
Cribbing the color choice off of @JoeKington,
import numpy as np
import matplotlib.pyplot as plt
def uniqueish_color(n):
    """There're better ways to generate unique colors, but this isn't awful."""
    return plt.cm.gist_ncar(np.random.random(n))
plt.scatter(latt, lont, c=uniqueish_color(len(latt)))
You can do this with scatter.
I have been searching for a short way to use pyplot's line plot to show a time series coloured by a label feature, without using scatter because of the number of data points.
I came up with the following workaround:
plt.plot(np.where(df["label"]==1, df["myvalue"], None), color="red", label="1")
plt.plot(np.where(df["label"]==0, df["myvalue"], None), color="blue", label="0")
plt.legend()
The drawback is you are creating two different line plots so the connection between the different classes is not shown. For my purposes it is not a big deal. It may help someone.
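The same masking idea can be tried without a DataFrame. The sketch below uses made-up values and labels arrays and np.nan as the fill value instead of None, which keeps the arrays numeric; Matplotlib leaves gaps where the data is NaN.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up series and binary labels standing in for df["myvalue"] and df["label"].
values = np.array([1.0, 2.0, 3.0, 2.5, 1.5, 0.5])
labels = np.array([1, 1, 0, 0, 1, 1])

fig, ax = plt.subplots()
# Points of the other class become NaN, so each line is only drawn where its label applies.
ax.plot(np.where(labels == 1, values, np.nan), color="red", label="1")
ax.plot(np.where(labels == 0, values, np.nan), color="blue", label="0")
ax.legend()

# plt.show()  # uncomment to display the figure interactively
```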
