I have to plot multiple data series and their fitted lines on a single plot. Everything is plotted inside one for loop, so the fit line drawn in each later iteration is plotted over the markers of the earlier iterations, as shown in the figure.
The reproducible code:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
y = np.array([[4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24],
[6, 5.2, 8.5, 9.1, 13.4, 15.1, 16.1, 18.3, 20.4, 22.1, 23.7]])
m, n = x.shape
figure = plt.figure(figsize=(5.15, 5.15))
figure.clf()
plot = plt.subplot(111)
for i in range(m):
    poly = np.polyfit(x[i, :], y[i, :], deg=1)
    plt.plot(poly[0] * x[i, :] + poly[1], linestyle='-')
    plt.plot(x[i, :], y[i, :], linestyle='', marker='o', markersize=20)
plot.set_ylabel('Y', labelpad = 6)
plot.set_xlabel('X', labelpad = 6)
plt.show()
I can fix this using another loop as:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
y = np.array([[4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24],
[6, 5.2, 8.5, 9.1, 13.4, 15.1, 16.1, 18.3, 20.4, 22.1, 23.7]])
m, n = x.shape
figure = plt.figure(figsize=(5.15, 5.15))
figure.clf()
plot = plt.subplot(111)
for i in range(m):
    poly = np.polyfit(x[i, :], y[i, :], deg=1)
    plt.plot(poly[0] * x[i, :] + poly[1], linestyle='-')
for i in range(m):
    plt.plot(x[i, :], y[i, :], linestyle='', marker='o', markersize=20)
plot.set_ylabel('Y', labelpad = 6)
plot.set_xlabel('X', labelpad = 6)
plt.show()
which gives me all the fit lines below the markers.
But is there any built-in function in Python/matplotlib to do this without using two loops?
Update
The example uses only two data sets (m = 2). There can be more than two, i.e. the loop may run many more times.
Update 2 after answer
Can I do this for the same line also? As an example:
plt.plot(x[i, :], y[i, :], linestyle = ':', marker = 'o', markersize = 20)
Can I give the linestyle a zorder = 1 and the markers a zorder = 3?
Editing just your plotting lines:
plt.plot(poly[0] * x[i, :] + poly[1], linestyle = '-',
zorder=-1)
plt.plot(x[i, :], y[i, :], linestyle = '', marker = 'o', markersize = 20,
zorder=3)
Now the markers are all in front of the lines, though within the marker group and within the line group the artists are still stacked in plotting order.
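For completeness, here is a minimal sketch of the whole single-loop version with the zorder arguments in place. It assumes the data from the question; the names fit_lines and marker_sets are only illustrative, and the Agg backend line is there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.tile(np.arange(11), (2, 1))
y = np.array([[4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24],
              [6, 5.2, 8.5, 9.1, 13.4, 15.1, 16.1, 18.3, 20.4, 22.1, 23.7]])

fig, ax = plt.subplots(figsize=(5.15, 5.15))
fit_lines, marker_sets = [], []  # illustrative containers, not required
for xi, yi in zip(x, y):
    slope, intercept = np.polyfit(xi, yi, deg=1)
    # low zorder: fit lines are drawn underneath
    fit_lines += ax.plot(xi, slope * xi + intercept, linestyle='-', zorder=-1)
    # high zorder: markers are drawn on top, regardless of loop order
    marker_sets += ax.plot(xi, yi, linestyle='', marker='o',
                           markersize=20, zorder=3)
ax.set_xlabel('X', labelpad=6)
ax.set_ylabel('Y', labelpad=6)
# call plt.show() here when running interactively
```

Artists with a higher zorder are drawn later, so one loop is enough however many data sets there are.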
Update answer
No. One call to plot, one zorder argument.
If you want to match the color and style of the markers and the line in each pass through the loop, set up an iterator or generator for colors, take the next color (say, current_color) on each pass, and pass it as the color argument to both plot calls.
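A rough sketch of that idea, assuming the same data as in the question: the dotted line and its markers are drawn by two separate plot calls that share one color taken from matplotlib's default color cycle. The names colors and current_color are just illustrative.

```python
from itertools import cycle

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.tile(np.arange(11), (2, 1))
y = np.array([[4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24],
              [6, 5.2, 8.5, 9.1, 13.4, 15.1, 16.1, 18.3, 20.4, 22.1, 23.7]])

# endless iterator over the default color cycle
colors = cycle(plt.rcParams['axes.prop_cycle'].by_key()['color'])

fig, ax = plt.subplots()
for xi, yi in zip(x, y):
    current_color = next(colors)
    # first call: only the dotted line, drawn underneath
    ax.plot(xi, yi, linestyle=':', color=current_color, zorder=1)
    # second call: only the markers, drawn on top in the same color
    ax.plot(xi, yi, linestyle='', marker='o', markersize=20,
            color=current_color, zorder=3)
# call plt.show() here when running interactively
```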
Related
Let's begin with the code:
import pandas as pd
import plotly.express as px
b_min = [-3, 6, 0]
b_max = [24, 24, 9]
n_vertices = [10, 10, 10]
p_a = [-1, 11, 4.5]
p_b = [11, 9, 9]
p_c = [6, 8, 6]
p_d = [-1, 7, 10]
points = [p_a, p_b, p_c, p_d]
df = pd.DataFrame(points, columns=['x', 'y', 'z'])
figure = px.scatter_3d(df, x='x', y='y', z='z')
figure.update_layout(scene={
'xaxis': {'nticks': n_vertices[0], 'range': [b_min[0], b_max[0]]},
'yaxis': {'nticks': n_vertices[1], 'range': [b_min[1], b_max[1]]},
'zaxis': {'nticks': n_vertices[2], 'range': [b_min[2], b_max[2]]}
})
figure.show()
I expect there to be 10 ticks per axis. This is true for axis y and z, but not for x. Why?
According to the documentation for scatter3d traces, nticks is the maximum number of ticks for an axis; the actual number of ticks is chosen automatically and is less than or equal to nticks.
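If an exact number of ticks is needed, one option is to place them explicitly with tickmode='array' and tickvals instead of relying on nticks. The sketch below only builds the scene dictionary; the helper axis_with_exact_ticks is a hypothetical name, and evenly spaced ticks across the full range are an assumption about what is wanted.

```python
import numpy as np

b_min = [-3, 6, 0]
b_max = [24, 24, 9]
n_vertices = [10, 10, 10]


def axis_with_exact_ticks(lo, hi, n):
    """Return an axis dict with exactly n evenly spaced ticks from lo to hi."""
    return {
        'range': [lo, hi],
        'tickmode': 'array',
        'tickvals': np.linspace(lo, hi, n).tolist(),
    }


scene = {
    name: axis_with_exact_ticks(lo, hi, n)
    for name, lo, hi, n in zip(('xaxis', 'yaxis', 'zaxis'),
                               b_min, b_max, n_vertices)
}
# With the figure from the question this would be applied as:
# figure.update_layout(scene=scene)
```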
I have been trying to create a matplotlib subplot (1 x 3) with horizontal bar plots on either side of a lineplot.
It looks like this:
The code for generating the above plot -
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
u_list = [2, 0, 0, 0, 1, 5, 0, 4, 0, 0]
n_list = [0, 0, 1, 0, 4, 3, 1, 1, 0, 6]
arr_ = list(np.arange(10, 11, 0.1))
data_ = pd.DataFrame({
'points': list(np.arange(0, 10, 1)),
'value': [10.4, 10.5, 10.3, 10.7, 10.9, 10.5, 10.6, 10.3, 10.2, 10.4][::-1]
})
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(20, 8))
ax1 = plt.subplot(1, 3, 1)
sns.barplot(x=u_list, y=arr_, orient="h", ax=ax1)
ax2 = plt.subplot(1, 3, 2)
x = data_['points'].tolist()
y = data_['value'].tolist()
ax2.plot(x, y)
ax2.set_yticks(arr_)
plt.gca().invert_yaxis()
ax3 = plt.subplot(1, 3, 3, sharey=ax1, sharex=ax1)
sns.barplot(x=n_list, y=arr_, orient="h", ax=ax3)
fig.tight_layout()
plt.show()
Edit
How do I share the y-axis of the central line plot with the other horizontal bar plots?
I would set the limits of all y-axes to the same range, set the ticks on all axes, and then make the ticks and tick labels of all but the leftmost axis empty. Here is what I mean:
from matplotlib import pyplot as plt
import numpy as np
u_list = [2, 0, 0, 0, 1, 5, 0, 4, 0, 0]
n_list = [0, 0, 1, 0, 4, 3, 1, 1, 0, 6]
arr_ = list(np.arange(10, 11, 0.1))
x = list(np.arange(0, 10, 1))
y = [10.4, 10.5, 10.3, 10.7, 10.9, 10.5, 10.6, 10.3, 10.2, 10.4]
fig, axs = plt.subplots(1, 3, figsize=(20, 8))
axs[0].barh(arr_,u_list,height=0.1)
axs[0].invert_yaxis()
axs[1].plot(x, y)
axs[1].invert_yaxis()
axs[2].barh(arr_,n_list,height=0.1)
axs[2].invert_yaxis()
for i in range(1, len(axs)):
    axs[i].set_ylim(axs[0].get_ylim())  # align axes
    axs[i].set_yticks([])  # set ticks to be empty (no ticks, no tick-labels)
fig.tight_layout()
plt.show()
This is a minimal example; for the sake of conciseness I refrained from mixing matplotlib and seaborn. Since seaborn uses matplotlib under the hood, you can reproduce the same output there (but with nicer bars).
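As an alternative sketch, plt.subplots can link the y-axes itself via sharey=True, so the limits do not have to be copied by hand. This assumes the same data as above; arr_ is built here as 10 + 0.1 * np.arange(10) only to guarantee exactly ten positions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

u_list = [2, 0, 0, 0, 1, 5, 0, 4, 0, 0]
n_list = [0, 0, 1, 0, 4, 3, 1, 1, 0, 6]
arr_ = 10 + 0.1 * np.arange(10)  # 10.0, 10.1, ..., 10.9
x = np.arange(10)
y = [10.4, 10.5, 10.3, 10.7, 10.9, 10.5, 10.6, 10.3, 10.2, 10.4]

# sharey=True links the y-limits of all three axes and hides the
# tick labels on the inner axes
fig, axs = plt.subplots(1, 3, figsize=(20, 8), sharey=True)
axs[0].barh(arr_, u_list, height=0.1)
axs[1].plot(x, y)
axs[2].barh(arr_, n_list, height=0.1)
axs[0].invert_yaxis()  # inverting one shared axis inverts all of them
fig.tight_layout()
# call plt.show() here when running interactively
```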
How to generate a histogram with the list below?
[[0, 0, 0, 19, 7], [0, 0, 0, 21, 7], [0, 0, 0, 21, 7], [0, 0, 0, 29, 0]]
Explaining the list: [0, 0, 0, 19, 7]
First value = repetition average between 0-20
Second value = repetition average between 20-40
Third value = repetition average between 40-60
Fourth value = average repetition between 60-80
Fifth value = repetition average between 80-100
The number of sublists can grow considerably, so I would like some spacing between the groups of bars belonging to each sublist to make the graph easier to read.
What I have achieved so far:
result = [[[0, 0, 0, 19, 7], [0, 0, 0, 21, 7], [0, 0, 0, 21, 7], [0, 0, 0, 29, 0]]]
fig, ax = plt.subplots(figsize=(10,6))
for i in range(len(result)):
    data = np.array(result[i])
    x = np.arange(len(data)) + i*6
    # draw means
    ax.bar(x-0.2, data[:,0], color='blue', width=0.4)
    ax.bar(x+0.2, data[:,1], color='green', width=0.4)
    ax.bar(x-0.2, data[:,2], color='yellow', width=0.4)
    ax.bar(x+0.2, data[:,3], color='orange', width=0.4)
    ax.bar(x+0.2, data[:,4], color='red', width=0.4)
# separation line
ax.axvline(4.75)
# turn off xticks
ax.set_xticks([])
ax.legend(labels=['0-20', '20-40', '40-60', '60-80', '80-100'])
leg = ax.get_legend()
leg.legendHandles[0].set_color('blue')
leg.legendHandles[1].set_color('green')
leg.legendHandles[2].set_color('yellow')
leg.legendHandles[3].set_color('orange')
leg.legendHandles[4].set_color('red')
plt.title("Histogram")
plt.ylabel('Consume')
plt.xlabel('Percent')
plt.show()
Any suggestions?
Here is an approach to draw the described plot. Note that normally matplotlib only sets one legend entry for a complete bar graph. To have an entry for individual bars, a label needs to be set to each of them explicitly. In the code below such a label is added to each bar in the first set.
(Note that I left out one set of square brackets for result, as in the original post it is a 3D list. If such a 3D list is necessary, you could write the loop as for i, data in enumerate(result[0]).)
import numpy as np
import matplotlib.pyplot as plt
result = [[0, 0, 0, 19, 7], [0, 0, 0, 21, 7], [0, 0, 0, 21, 7], [0, 0, 0, 29, 0]]
colors = ['blue', 'green', 'yellow', 'orange', 'red']
labels = ['0-20', '20-40', '40-60', '60-80', '80-100']
fig, ax = plt.subplots(figsize=(10, 6))
for i, data in enumerate(result):
    x = np.arange(len(data)) + i*6
    bars = ax.bar(x, data, color=colors, width=0.4)
    if i == 0:
        for bar, label in zip(bars, labels):
            bar.set_label(label)
    if i < len(result) - 1:
        # separation line after each part, but not after the last
        ax.axvline(4.75 + i*6, color='black', linestyle=':')
ax.set_xticks([])
ax.legend()
ax.set_title("Histogram")
ax.set_ylabel('Consume')
ax.set_xlabel('Percent')
plt.show()
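Another way to get one legend entry per color, sketched here as an alternative rather than a replacement, is to build the legend from proxy artists (matplotlib.patches.Patch) instead of labeling the bars of the first group. The names handles and legend are only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import numpy as np

result = [[0, 0, 0, 19, 7], [0, 0, 0, 21, 7], [0, 0, 0, 21, 7], [0, 0, 0, 29, 0]]
colors = ['blue', 'green', 'yellow', 'orange', 'red']
labels = ['0-20', '20-40', '40-60', '60-80', '80-100']

fig, ax = plt.subplots(figsize=(10, 6))
for i, data in enumerate(result):
    x = np.arange(len(data)) + i * 6  # shift each group by 6 units
    ax.bar(x, data, color=colors, width=0.4)

# one proxy patch per color; these exist only to populate the legend
handles = [Patch(facecolor=c, label=l) for c, l in zip(colors, labels)]
legend = ax.legend(handles=handles)
ax.set_xticks([])
# call plt.show() here when running interactively
```

This keeps the legend independent of which bars happen to be drawn first.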
X = np.array([[24,13,38],[8,3,17],[21,6,40],[1,14,-9],[9,3,21],[7,1,14],[8,7,11],[10,16,3],[1,3,2],
[15,2,30],[4,6,1],[12,10,18],[1,9,-4],[7,3,19],[5,1,13],[1,12,-6],[21,9,34],[8,8,7],
[1,18,-18],[15,8,25],[16,10,29],[7,0,17],[14,2,31],[3,7,0],[5,6,7]])
pca = PCA(n_components=1)
pca.fit(X)
a = pca.components_[0][0] # a
b = pca.components_[0][1] # b
c = pca.components_[0][2] # c
def average(values):
    if len(values) == 0:
        return None
    return sum(values, 0.0) / len(values)
x, y, z = X.T  # coordinate columns of the data
x_mean = average(x) # For an approximation
y_mean = average(y)
z_mean = average(z)
d = -(a * x_mean + b * y_mean + c * z_mean)
so -0.375978766054x + 0.10612154283y -0.920531469111z + 15.1366572005 = 0
Actually, I'm not sure it is right.
I want to draw a plane in this situation using matplotlib library.
How can I code this?
Each principal component defines a vector in the feature space. PCA orders those vectors based on the variance of the data in each direction. So the first vector will represent the maximum variance of the data and the last vector minimum variance. Assuming the data are distributed around a plane the third vector should be perpendicular to the plane. Here's the code:
import numpy as np
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
X = np.array([[24,13,38],[8,3,17],[21,6,40],[1,14,-9],[9,3,21],[7,1,14],[8,7,11],[10,16,3],[1,3,2],
[15,2,30],[4,6,1],[12,10,18],[1,9,-4],[7,3,19],[5,1,13],[1,12,-6],[21,9,34],[8,8,7],
[1,18,-18],[15,8,25],[16,10,29],[7,0,17],[14,2,31],[3,7,0],[5,6,7]])
pca = PCA(n_components=3)
pca.fit(X)
eig_vec = pca.components_
print(pca.explained_variance_ratio_)
# [0.90946569 0.08816839 0.00236591]
# Percentage of variance explained by the last vector is about 0.24%
# This is the normal vector of minimum variance
normal = eig_vec[2, :] # (a, b, c)
centroid = np.mean(X, axis=0)
# Every point (x, y, z) on the plane should satisfy a*x + b*y + c*z + d = 0
# Taking centroid as a point on the plane
d = -centroid.dot(normal)
# Draw plane
xx, yy = np.meshgrid(np.arange(np.min(X[:, 0]), np.max(X[:, 0])), np.arange(np.min(X[:, 1]), np.max(X[:, 1])))
z = (-normal[0] * xx - normal[1] * yy - d) * 1. / normal[2]
# plot the surface
plt3d = plt.figure().add_subplot(projection='3d')
plt3d.plot_surface(xx, yy, z)
plt3d.scatter(*(X.T))
plt.show()
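As a quick sanity check of the plane equation, here is a sketch that recomputes the normal with NumPy's SVD instead of scikit-learn (a swapped-in technique; the normal may differ from the PCA result by sign). It assumes the same X and verifies that the centroid lies on the plane and that the signed distances of the points average to zero.

```python
import numpy as np

X = np.array([[24, 13, 38], [8, 3, 17], [21, 6, 40], [1, 14, -9], [9, 3, 21],
              [7, 1, 14], [8, 7, 11], [10, 16, 3], [1, 3, 2], [15, 2, 30],
              [4, 6, 1], [12, 10, 18], [1, 9, -4], [7, 3, 19], [5, 1, 13],
              [1, 12, -6], [21, 9, 34], [8, 8, 7], [1, 18, -18], [15, 8, 25],
              [16, 10, 29], [7, 0, 17], [14, 2, 31], [3, 7, 0], [5, 6, 7]])

centroid = X.mean(axis=0)
# rows of vt are the principal directions, ordered by decreasing variance
_, singular_values, vt = np.linalg.svd(X - centroid, full_matrices=False)
normal = vt[-1]              # direction of minimum variance, unit length
d = -centroid.dot(normal)    # plane: normal . p + d = 0

# signed distance of every point to the plane
distances = X @ normal + d
# share of the total variance along the normal direction
variance_ratio = singular_values[-1] ** 2 / np.sum(singular_values ** 2)

print("normal:", normal, "d:", d)
print("max |distance|:", np.abs(distances).max())
print("variance ratio along normal:", variance_ratio)
```

Small distances relative to the spread of the data indicate that the points really do lie close to the fitted plane.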
The first principal component doesn't define a plane, it defines a vector in three dimensions. Here's how to visualize it in 3D: the code starts out with yours, and then has the plotting steps:
import numpy as np
from sklearn.decomposition import PCA
X = np.array([[24, 13, 38], [8, 3, 17], [21, 6, 40], [1, 14, -9], [9, 3, 21], [7, 1, 14],
[8, 7, 11], [10, 16, 3], [1, 3, 2], [15, 2, 30], [4, 6, 1], [12, 10, 18], [1, 9, -4],
[7, 3, 19], [5, 1, 13], [1, 12, -6], [21, 9, 34], [8, 8, 7], [1, 18, -18],
[15, 8, 25], [16, 10, 29], [7, 0, 17], [14, 2, 31], [3, 7, 0], [5, 6, 7]])
pca = PCA(n_components=1)
pca.fit(X)
## New code below
p = pca.components_
centroid = np.mean(X, 0)
segments = np.arange(-40, 40)[:, np.newaxis] * p
import matplotlib
matplotlib.use('TkAgg') # might not be necessary for you
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
plt.ion()
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
scatterplot = ax.scatter(*(X.T))
lineplot = ax.plot(*(centroid + segments).T, color="red")
plt.xlabel('x')
plt.ylabel('y')
plt.savefig('result.png', dpi=150)
(Note the above code was auto-formatted with yapf, which I highly recommend.) Resulting figure:
I guess I just didn't use the right keywords, because this has probably been asked before, but I didn't find a solution. Anyway, I have a problem where the bars of a histogram do not line up with the xticks. I want the bars to be centred over the xticks they correspond to, but they get placed between ticks to fill the space in between evenly.
import matplotlib.pyplot as plt
data = [1, 1, 1, 1.5, 2, 4, 4, 4, 4, 4.5, 5, 6, 6.5, 7, 9,9, 9.5]
bins = [x+n for n in range(1, 10) for x in [0.0, 0.5]]+[10.0]
plt.hist(data, bins, rwidth = .3)
plt.xticks(bins)
plt.show()
Note that what you are plotting here is not a histogram. A histogram would be
import matplotlib.pyplot as plt
data = [1, 1, 1, 1.5, 2, 4, 4, 4, 4, 4.5, 5, 6, 6.5, 7, 9,9, 9.5]
bins = [x+n for n in range(1, 10) for x in [0.0, 0.5]]+[10.0]
plt.hist(data, bins, edgecolor="k", alpha=1)
plt.xticks(bins)
plt.show()
Here, the bars range between the bins as expected. E.g. you have 3 values in the interval 1 <= x < 1.5.
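To see the counts behind those bars, here is a small sketch using np.histogram with the same data and bin edges (plt.hist computes its bar heights the same way). Each bin is half-open, [left, right), except the last one, which also includes its right edge.

```python
import numpy as np

data = [1, 1, 1, 1.5, 2, 4, 4, 4, 4, 4.5, 5, 6, 6.5, 7, 9, 9, 9.5]
bins = [x + n for n in range(1, 10) for x in [0.0, 0.5]] + [10.0]

counts, edges = np.histogram(data, bins=bins)
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:4.1f} <= x < {right:4.1f}: {count}")
```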
Conceptually what you want to do here is get a bar plot of the counts of data values. This would not require any bins at all and could be done as follows:
import numpy as np
import matplotlib.pyplot as plt
data = [1, 1, 1, 1.5, 2, 4, 4, 4, 4, 4.5, 5, 6, 6.5, 7, 9,9, 9.5]
u, inv = np.unique(data, return_inverse=True)
counts = np.bincount(inv)
plt.bar(u, counts, width=0.3)
plt.xticks(np.arange(1,10,0.5))
plt.show()
Of course you can "misuse" a histogram plot to get a similar result. This requires moving the center of each bar to the left bin edge with plt.hist(.., align="left").
import matplotlib.pyplot as plt
data = [1, 1, 1, 1.5, 2, 4, 4, 4, 4, 4.5, 5, 6, 6.5, 7, 9,9, 9.5]
bins = [x+n for n in range(1, 10) for x in [0.0, 0.5]]+[10.0]
plt.hist(data, bins, align="left", rwidth = .6)
plt.xticks(bins)
plt.show()
This results in the same plot as above.