Problem: one plot showing up on top of another plot in Python

I am plotting a correlation matrix for my dataset's features with this code:
#Correlation matrix/Heatmap
fig, ax = plt.subplots(figsize=(14,8))
sns.heatmap(cdf.corr() , annot = True, vmin=-1, vmax=1, center= 0)
and then plotting two of the features against each other on a grid with:
plt.plot(cdf['BALANCE'], cdf['PAYMENTS'], marker='.', linewidth=0, color='#128128')
plt.grid(which='major', color='#cccccc', alpha=0.45)
plt.xlabel('Balance', fontsize=16)
plt.ylabel('Payment', fontsize=16)
plt.title('Balance vs payment', fontsize=20)
plt.show()
The problem is that the correlation matrix is drawn together with the second plot. What is the reason for that?
Like this:

The two plots are drawn on the same axes. You can either clear the axes with plt.cla() after the heatmap or use different axes (either both in the same figure or in different ones).
For different figures:
fig1 , ax1 = plt.subplots()
fig2 , ax2 = plt.subplots()
sns.heatmap(cdf.corr(), ax = ax1 )
ax2.plot( cdf['BALANCE'], cdf['PAYMENTS'] )
plt.show()
Or on the same figure:
fig , axs = plt.subplots(2)
sns.heatmap( cdf.corr() , ax = axs[0] )
axs[1].plot( cdf['BALANCE'], cdf['PAYMENTS'] )
plt.show()
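For anyone who wants to run this without the original data, here is a minimal self-contained sketch of the same-figure option. It assumes synthetic random data in place of cdf and uses matplotlib's imshow as a stand-in for sns.heatmap; the names data, corr, ax_heat and ax_scatter are mine, not from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))        # synthetic stand-in for the cdf DataFrame
corr = np.corrcoef(data, rowvar=False)  # 4x4 correlation matrix of the columns

# One figure, two separate axes: each plot gets its own target.
fig, (ax_heat, ax_scatter) = plt.subplots(1, 2, figsize=(12, 5))

# imshow stands in for sns.heatmap(..., ax=ax_heat) here.
im = ax_heat.imshow(corr, vmin=-1, vmax=1, cmap="coolwarm")
fig.colorbar(im, ax=ax_heat)
ax_heat.set_title("Correlation matrix")

# The point cloud is drawn explicitly on the second axes.
ax_scatter.plot(data[:, 0], data[:, 1], marker=".", linewidth=0)
ax_scatter.grid(which="major", color="#cccccc", alpha=0.45)
ax_scatter.set_title("Feature 0 vs feature 1")

fig.canvas.draw()
```

Because every call names the axes it draws on, neither plot can end up on top of the other.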

Related

How can I space legend items with variable spacing and have legend marker colors reflect the colormap?

I would like the spacing between legend items to increase from one item to the next instead of being a single value (labelspacing). The latter accepts only one number for the whole legend, but I want a variable spacing between legend items. Also, I want the markerfacecolor to follow the colormap used when creating the scatter plot.
N = 45
x, y = np.random.rand(2, N)
s = np.random.randint(10, 1000, size=N)
fig, ax = plt.subplots()
scatter = ax.scatter(x, y, c=s, s=s)
cbar = fig.colorbar(scatter,
ax=ax,
label='Size',
fraction=0.1,
pad=0.04)
# produce a legend with a cross section of sizes from the scatter
handles, labels = scatter.legend_elements(prop="sizes", alpha=0.6)
for hd in handles:
    hd.set_markeredgewidth(2)
    hd.set_markeredgecolor("red")
    hd.set_markerfacecolor('blue')
legend2 = ax.legend(
handles[::2], labels[::2], loc="upper right", title="Sizes", labelspacing=1.2
)
plt.show()
I searched StackOverflow and tried several approaches without success. Could someone show me how to achieve the desired output?
Update: I managed to make markerfacecolor follow the colormap, but I am still struggling with the variable label spacing.
Any help is appreciated.
N = 45
x, y = np.random.rand(2, N)
s = np.random.randint(10, 1000, size=N)
fig, ax = plt.subplots()
scatter = ax.scatter(x, y, c=s, s=s)
cbar = fig.colorbar(scatter,
ax=ax,
label='Size',
fraction=0.1,
pad=0.04)
# produce a legend with a cross section of sizes from the scatter
handles, labels = scatter.legend_elements(prop="sizes", alpha=0.6)
leg_colrs = [color.get_markerfacecolor() for color in scatter.legend_elements()[0]]
for hd, color in zip(handles, leg_colrs):
    hd.set_markeredgewidth(2)
    hd.set_markeredgecolor("red")
    hd.set_markerfacecolor(color)
legend2 = ax.legend(
handles[::2], labels[::2], loc="upper right", title="Sizes", labelspacing=1.2
)
plt.show()
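A possible variation on the colour part, in case it helps: the two legend_elements calls above are not guaranteed to return the same number of handles, so zipping them may pair a size with the wrong colour. The sketch below instead looks each size handle's colour up through the scatter's own colormap and norm. It assumes that c and s are the same array (as in the question) and that each size handle is drawn with a markersize equal to the square root of the size it represents, which is how legend_elements behaves in the matplotlib versions I have used. It does not address the variable spacing.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

N = 45
rng = np.random.default_rng(0)
x, y = rng.random((2, N))
s = rng.integers(10, 1000, size=N)

fig, ax = plt.subplots()
scatter = ax.scatter(x, y, c=s, s=s)  # colour and size both encode s
fig.colorbar(scatter, ax=ax, label="Size", fraction=0.1, pad=0.04)

handles, labels = scatter.legend_elements(prop="sizes", alpha=0.6)

face_colors = []
for hd in handles:
    # Assumption: markersize is sqrt(size), so squaring recovers the size value.
    size_value = hd.get_markersize() ** 2
    # Map that value through the same norm and colormap the scatter uses.
    rgba = scatter.cmap(scatter.norm(size_value))
    hd.set_markeredgewidth(2)
    hd.set_markeredgecolor("red")
    hd.set_markerfacecolor(rgba)
    face_colors.append(rgba)

ax.legend(handles[::2], labels[::2], loc="upper right", title="Sizes", labelspacing=1.2)
fig.canvas.draw()
```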

How do you control zorder across twinx in matplotlib?

I'm trying to control the zorder of different plots across twinx axes. How can I get the blue noisy plots to appear in the background and the orange smoothed plots to appear in the foreground in this plot?
from matplotlib import pyplot as plt
import numpy as np
from scipy.signal import savgol_filter
random = np.random.RandomState(0)
x1 = np.linspace(-10,10,500)**3 + random.normal(0, 100, size=500)
x2 = np.linspace(-10,10,500)**2 + random.normal(0, 100, size=500)
fig,ax1 = plt.subplots()
ax1.plot(x1, zorder=0)
ax1.plot(savgol_filter(x1,99,2), zorder=1)
ax2 = ax1.twinx()
ax2.plot(x2, zorder=0)
ax2.plot(savgol_filter(x2,99,2), zorder=1)
plt.show()
As in a similar thread, one approach (though not ideal) is to use twiny along with twinx:
# set up plots
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax3 = ax1.twiny()
ax4 = ax2.twiny()
# background
ax1.plot(x1)
ax2.plot(x2)
# smoothed
ax3.plot(savgol_filter(x1,99,2), c='orange')
ax4.plot(savgol_filter(x2,99,2), c='orange')
# turn off extra ticks and labels
ax3.tick_params(axis='x', which='both', bottom=False, top=False)
ax4.tick_params(axis='x', which='both', bottom=False, top=False)
ax3.set_xticklabels([])
ax4.set_xticklabels([])
# fix zorder
ax1.set_zorder(1)
ax2.set_zorder(2)
ax3.set_zorder(3)
ax4.set_zorder(4)
plt.show()
Output:

Trying to plot 2 charts side-by-side, but one of them always comes out empty

I have two plots that I generated from my data:
Here the second plot shows the distribution of results from the first one.
What I want is to plot them side-by-side so you could see both the data and the distribution on the same plot. And I want plots to share y-axis as well.
I tried to do the following:
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(40, 15), sharey=True)
ax1 = sns.lineplot(plotting_df.index, plotting_df.error, color=('#e65400'), lw=2, label='random forest residual error')
ax1 = sns.lineplot(plotting_df.index, plotting_df.val, color=('#9b9b9b'), lw=1, label='current model residual error')
ax1 = sns.lineplot(plotting_df.index, 0, color=('#2293e3'), lw=1)
ax1.xaxis.set_visible(False)
ax1.set_ylabel('Residual Fe bias', fontsize=16)
ax1.set_title('Models residual error comparison', fontsize=20, fontweight='bold')
sns.despine(ax=ax1, top=True, bottom=True, right=True)
ax2 = sns.distplot(results_df.error, hist=True, color=('#e65400'), bins=81,
label='Random forest model', vertical=True)
ax2 = sns.distplot(plotting_df.val, hist=True, color=('#9b9b9b'),
bins=81, label='Rolling averages model', vertical=True)
ax2.set_title('Error distribution comparison between models', fontsize=20, fontweight='bold')
sns.despine(ax=ax2, top=True, right=True)
fig.savefig("blabla.png", format='png')
But when I run it I get strange results: the first chart is in the second column, whereas I wanted it on the left, and the second chart is completely blank. I'm not sure what I did wrong here.
Both lineplot and distplot accept a matplotlib axes object as an argument (ax), which tells them which axes to plot onto. If no axes is passed, the plot is placed on the current axes.
You create a figure and two axes using:
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(40, 15), sharey=True)
After this call, ax2 is the current axes, so your distplot is being plotted on top of your lineplot, both in ax2. Because each seaborn call returns the axes it drew on, the assignments ax1 = sns.lineplot(...) also rebind the name ax1 to that same right-hand axes.
You need to pass the axes into the seaborn plotting functions:
sns.lineplot(..., ax=ax1)
sns.distplot(..., ax=ax2)
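The "current axes" behaviour can be checked with matplotlib alone. The sketch below is only an illustration under my own assumptions: errors is synthetic data, and plain ax.plot / ax.hist calls stand in for the seaborn functions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
errors = rng.normal(size=300)  # synthetic stand-in for the residual errors

fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, sharey=True)

# The axes created last is the current one, so any call that does not
# name its target would draw here (the right-hand panel).
current = plt.gca()
print(current is ax2)

# Naming the target explicitly puts each plot where it belongs.
ax1.plot(errors, color="#e65400", lw=1)
ax2.hist(errors, bins=30, orientation="horizontal", color="#e65400")

fig.canvas.draw()
```

With seaborn the same idea applies through the ax= keyword shown above.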

Subplot problem: how to plot for each plot a histogram by categorical values?

I have a DataFrame with three numerical variables Porosity, Perm and AI. I would like to make a subplot and in each plot, I would like the histogram of the three variables, by a categorical variable 'Facies'. Facies can take only two values: Sand and Shale.
In summary, each subplot needs a histogram, and each histogram must be split by the categorical variable Facies so the two facies can be compared.
So far, I can make it work, but I cannot add the axis title to each subplot.
plt.subplot(311)
plt.hist(df_sd['Porosity'].values, label='Sand', bins=30, alpha=0.6)
plt.hist(df_sh['Porosity'].values, label='Shale', bins=30, alpha=0.6)
ax.set(xlabel='Porosity (fraction)', ylabel='Density', title='Porosity Histogram')
plt.legend()
plt.subplot(312)
plt.hist(df_sd['log10Perm'].values, label='Sand', bins=30, alpha=0.6,)
plt.hist(df_sh['log10Perm'].values, label='Shale', bins=30, alpha=0.6)
ax.set(xlabel='Permeability (mD)', ylabel='Density', title='Permeability Histogram')
plt.legend()
plt.subplot(313)
plt.hist(df_sd['AI'].values, label='Sand', bins=30, alpha=0.6)
plt.hist(df_sh['AI'].values, label='Shale', bins=30, alpha=0.6)
ax.set(xlabel='AI (units)', ylabel='Density', title='Acoustic Impedance Histogram')
plt.legend()
plt.subplots_adjust(left=0.0, bottom=0.0, right=1.5, top=3.5, wspace=0.1, hspace=0.2)
I have also tried:
fig, axs = plt.subplots(2, 1)
but when I write
axs[0].hist(df_sd['Porosity'].values, label='Sand', bins=30, alpha=0.6)
axs[0].hist(df_sd['Porosity'].values, label='Shale', bins=30, alpha=0.6)
the histogram for Shale covers the histogram for Sand.
I would like to have this result but with both x and y axis with label names. Furthermore, it would be helpful to have a title for each subplot.
I just did a subplot with contours, but I think the framework will be very similar:
fig, axs = plt.subplots(2, 2, constrained_layout=True)
for ax, extend in zip(axs.ravel(), extends):
    cs = ax.contourf(X, Y, Z, levels, cmap=cmap, extend=extend, origin=origin)
    fig.colorbar(cs, ax=ax, shrink=0.9)
    ax.set_title("extend = %s" % extend)
    ax.locator_params(nbins=4)
plt.show()
I think the main point to note (and this I learned from the link below) is their use of zip(axs.ravel()) in the for loop to establish each ax and then plot what you wish on that ax. I'm fairly certain you can adapt this for your uses.
The full example is available at: https://matplotlib.org/gallery/images_contours_and_fields/contourf_demo.html#sphx-glr-gallery-images-contours-and-fields-contourf-demo-py
I have found an answer:
fig = plt.figure()
# one axes per row; the extra full-figure axes (111) is not needed
ax1 = fig.add_subplot(311)
ax2 = fig.add_subplot(312)
ax3 = fig.add_subplot(313)  # was assigned to ax2 again by mistake
ax1.hist(df_sd['Porosity'].values, label='Sand', bins=30, alpha=0.6)
ax1.hist(df_sh['Porosity'].values, label='Shale', bins=30, alpha=0.6)
ax1.set(xlabel='Porosity (fraction)', ylabel='Density', title='Porosity Histogram')
ax1.legend()

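Combining the two answers above, the three panels can probably be written as one loop. This is only a sketch under my own assumptions: the sand and shale dictionaries hold synthetic numbers standing in for the df_sd and df_sh columns, and I pass density=True so that the 'Density' y-label from the question is accurate.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for the df_sd (Sand) and df_sh (Shale) columns.
sand = {"Porosity": rng.normal(0.25, 0.03, 500),
        "log10Perm": rng.normal(2.0, 0.5, 500),
        "AI": rng.normal(6000, 400, 500)}
shale = {"Porosity": rng.normal(0.12, 0.03, 500),
         "log10Perm": rng.normal(0.5, 0.5, 500),
         "AI": rng.normal(7500, 400, 500)}

# (column, x-axis label, title) for each panel, taken from the question.
panels = [("Porosity", "Porosity (fraction)", "Porosity Histogram"),
          ("log10Perm", "Permeability (mD)", "Permeability Histogram"),
          ("AI", "AI (units)", "Acoustic Impedance Histogram")]

fig, axs = plt.subplots(3, 1, figsize=(6, 10))
for ax, (column, xlabel, title) in zip(axs.ravel(), panels):
    # Both facies go on the same axes; alpha keeps the overlap visible.
    ax.hist(sand[column], bins=30, alpha=0.6, density=True, label="Sand")
    ax.hist(shale[column], bins=30, alpha=0.6, density=True, label="Shale")
    ax.set(xlabel=xlabel, ylabel="Density", title=title)
    ax.legend()

fig.tight_layout()  # keep titles and axis labels from overlapping
fig.canvas.draw()
```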
Difference between a loglog plot of scatter data and loglog plot of non scatter data

I am trying to represent data in a log-log plot, but I can't figure out the difference between the two plotting methods. In FIG4 the data is scattered; in FIG5 it is not. How should I interpret this?
Here is the code:
fig4, ax4 = plt.subplots()
ax4.scatter(t, sigma, marker='o', label='strain', color='red', s=0.5)
ax4.set_xlabel('log(t)')
ax4.set_ylabel('log(Sigma)')
ax4.set_title('FIG4:Log(t),log(sigma)')
ax4.set_yscale('log')
ax4.set_xscale('log')
plt.grid()
plt.show()
fig5, ax5 = plt.subplots()
ax5.set_xlabel('log(t)')
ax5.set_ylabel('log(Sigma)')
ax5.set_title('FIG5: Log(t),log(sigma)')
plt.loglog(t,sigma)
plt.grid()
plt.show()
Here are the two plots:
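As far as I can tell, both calls end up with the same logarithmic axes; the difference is only in what gets drawn. scatter draws one unconnected marker per point, while loglog behaves like plot with log scales and joins the points with a line in the order they are given. The sketch below illustrates this with a made-up power law; t and sigma here are hypothetical values, not the data from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: a simple power law sampled on a log-spaced grid.
t = np.logspace(0, 3, 50)
sigma = 5.0 * t ** -0.5

fig, (ax_scatter, ax_line) = plt.subplots(1, 2, figsize=(10, 4))

# Markers only; the log scales are set separately.
ax_scatter.scatter(t, sigma, marker="o", color="red", s=5)
ax_scatter.set_xscale("log")
ax_scatter.set_yscale("log")
ax_scatter.set_title("scatter + log scales")

# loglog sets both scales to log and draws a connected line.
(line,) = ax_line.loglog(t, sigma)
ax_line.set_title("loglog")

fig.canvas.draw()
```

If t is not sorted, the connected line will zigzag between points, which can make the loglog version look very different from the scatter even though the data is identical.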
