pyplot hist, plotting binned data

Here is the code for the non-binned plot of the data.
link_weights = [float(s)/float(k) for s,k in zip(strengths, degrees)]
fig2 = plt.figure()
ax = fig2.add_subplot(121)
ax.set_xlabel('degree [k]')
ax.set_ylabel('Average link weight <w>')
ax.scatter(degrees, link_weights, alpha=0.5)
ax = fig2.add_subplot(122)
ax.set_xlabel('degree [k]')
ax.set_ylabel('Average link weight <w>')
ax.set_xscale('log')
ax.scatter(degrees, link_weights, alpha=0.5)
fig2.savefig("figure-1b-1.pdf", format='pdf')
I want to plot a histogram of the binned data (bin-averaged versions of the plots above). I managed to create the binned data but am stuck on plotting it.
n_bins = 20
fig3 = plt.figure()
ax = fig3.add_subplot(121)
bin_means, bin_edges, _= stats.binned_statistic(link_weights, degrees,
statistic='mean', bins=n_bins)
I need to know how to plot bin_means as a histogram (with a linear x-axis and with a log x-axis). My initial attempt is below, but it failed.
ax.hist(bin_means, n_bins, normed=True, histtype='bar')
Any help would be greatly appreciated. Thank you.
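One possible approach, sketched under assumptions: ax.hist computes a new histogram from whatever it is given, so passing bin_means to it counts the means instead of drawing them. Precomputed bin heights can be drawn with ax.bar, using bin_edges for the positions and widths. The sketch below also passes degrees as the first argument of binned_statistic, on the assumption that the goal is the mean link weight per degree bin. The data are randomly generated stand-ins for the asker's strengths and degrees, and the logarithmically spaced edges for the second panel are an illustrative choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Hypothetical stand-in data (the question's real data are not shown)
rng = np.random.default_rng(0)
degrees = rng.integers(1, 200, size=500)
strengths = degrees * rng.uniform(0.5, 2.0, size=500)
link_weights = strengths / degrees

n_bins = 20
fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(10, 4))

# Linear bins: degrees define the bins, link_weights are averaged within each bin
bin_means, bin_edges, _ = stats.binned_statistic(
    degrees, link_weights, statistic='mean', bins=n_bins)
ok = ~np.isnan(bin_means)  # empty bins yield NaN; skip them
ax_lin.bar(bin_edges[:-1][ok], bin_means[ok], width=np.diff(bin_edges)[ok],
           align='edge', edgecolor='black', alpha=0.6)

# Log x-axis: logarithmically spaced edges keep the bars equally wide on screen
log_edges = np.geomspace(degrees.min(), degrees.max(), n_bins + 1)
log_means, log_edges, _ = stats.binned_statistic(
    degrees, link_weights, statistic='mean', bins=log_edges)
ok_log = ~np.isnan(log_means)
ax_log.bar(log_edges[:-1][ok_log], log_means[ok_log],
           width=np.diff(log_edges)[ok_log],
           align='edge', edgecolor='black', alpha=0.6)
ax_log.set_xscale('log')

for ax in (ax_lin, ax_log):
    ax.set_xlabel('degree [k]')
    ax.set_ylabel('Average link weight <w>')
```

Call plt.show() or fig.savefig(...) afterwards to view the result.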

Related

problem: Visualization shape showing up on Visualization shape in python

I am working on showing a correlation matrix for dataset features using this code
#Correlation matrix/Heatmap
fig, ax = plt.subplots(figsize=(14, 8))
sns.heatmap(cdf.corr() , annot = True, vmin=-1, vmax=1, center= 0)
and then show the distribution of two features on the grid with
plt.plot(cdf['BALANCE'], cdf['PAYMENTS'], marker='.', linewidth=0, color='#128128')
plt.grid(which='major', color='#cccccc', alpha=0.45)
plt.xlabel('Balance', fontsize=16)
plt.ylabel('Payment', fontsize=16)
plt.title('Balance vs payment', fontsize=20)
plt.show()
The problem is that the correlation matrix is displayed together with the second plot. What is the reason for that?
Like this:

The two plots are drawn on the same axes. You can either clear the axes with plt.cla() after the heatmap or use different axes (either in the same figure or in different ones).
For different figures:
fig1 , ax1 = plt.subplots()
fig2 , ax2 = plt.subplots()
sns.heatmap(cdf.corr(), ax = ax1 )
ax2.plot( cdf['BALANCE'], cdf['PAYMENTS'] )
plt.show()
Or on the same figure:
fig , axs = plt.subplots(2)
sns.heatmap( cdf.corr() , ax = axs[0] )
axs[1].plot( cdf['BALANCE'], cdf['PAYMENTS'] )
plt.show()

Plotting histogram in Python with frequency percentage

I have a list of ratings for which I am plotting a histogram. The y-axis on the left shows the frequency as a count. Is there a way for it to show the percentage of the total instead?
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.hist(item['ratings'], bins = 5)
ax.legend()
ax.set_title("Ratings Frequency")
ax.set_xlabel("Ratings")
ax.set_ylabel("frequency")
ax.axhline(y=0, linestyle='--', color='k')

You can use countplot from the seaborn library, which makes this kind of data visualization easy:
import seaborn as sns
sns.countplot(x=item['ratings'])
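An alternative that stays with ax.hist, offered as a minimal sketch: give every sample a weight of 100/N so the bar heights sum to 100, then format the y-axis ticks as percentages. The ratings array below is made-up example data standing in for item['ratings'].

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import PercentFormatter

# Hypothetical ratings standing in for item['ratings']
ratings = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 5])

fig, ax = plt.subplots()
# Each sample contributes 100/N, so bar heights are percentages of the total
weights = np.full(len(ratings), 100.0 / len(ratings))
heights, edges, _ = ax.hist(ratings, bins=5, weights=weights)

ax.yaxis.set_major_formatter(PercentFormatter(xmax=100))
ax.set_title("Ratings Frequency")
ax.set_xlabel("Ratings")
ax.set_ylabel("Frequency (%)")
```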

Subplot problem: how to plot for each plot a histogram by categorical values?

I have a DataFrame with three numerical variables Porosity, Perm and AI. I would like to make a subplot and in each plot, I would like the histogram of the three variables, by a categorical variable 'Facies'. Facies can take only two values: Sand and Shale.
In summary, each subplot needs a histogram and each histogram must be drawn based in the categorical variable Facies, to make a comparison between facies.
So far, I can make it work, but I cannot add the axis title to each subplot.
plt.subplot(311)
plt.hist(df_sd['Porosity'].values, label='Sand', bins=30, alpha=0.6)
plt.hist(df_sh['Porosity'].values, label='Shale', bins=30, alpha=0.6)
ax.set(xlabel='Porosity (fraction)', ylabel='Density', title='Porosity Histogram')
plt.legend()
plt.subplot(312)
plt.hist(df_sd['log10Perm'].values, label='Sand', bins=30, alpha=0.6)
plt.hist(df_sh['log10Perm'].values, label='Shale', bins=30, alpha=0.6)
ax.set(xlabel='Permeability (mD)', ylabel='Density', title='Permeability Histogram')
plt.legend()
plt.subplot(313)
plt.hist(df_sd['AI'].values, label='Sand', bins=30, alpha=0.6)
plt.hist(df_sh['AI'].values, label='Shale', bins=30, alpha=0.6)
ax.set(xlabel='AI (units)', ylabel='Density', title='Acoustic Impedance Histogram')
plt.legend()
plt.subplots_adjust(left=0.0, bottom=0.0, right=1.5, top=3.5, wspace=0.1, hspace=0.2)
I have also tried:
fig, axs = plt.subplots(2, 1)
but when I write
axs[0].hist(df_sd['Porosity'].values, label='Sand', bins=30, alpha=0.6)
axs[0].hist(df_sd['Porosity'].values, label='Shale', bins=30, alpha=0.6)
the histogram for Shale overrides the histogram for Sand.
I would like to have this result, but with label names on both the x and y axes. It would also be helpful to have a title for each subplot.

I just did a subplot with contours, but I think the framework will be very similar:
fig, axs = plt.subplots(2, 2, constrained_layout=True)
for ax, extend in zip(axs.ravel(), extends):
    cs = ax.contourf(X, Y, Z, levels, cmap=cmap, extend=extend, origin=origin)
    fig.colorbar(cs, ax=ax, shrink=0.9)
    ax.set_title("extend = %s" % extend)
    ax.locator_params(nbins=4)
plt.show()
I think the main point to note (which I learned from the link below) is the use of zip(axs.ravel(), ...) in the for loop to get each ax in turn and then plot whatever you wish on that ax. I'm fairly certain you can adapt this to your use case.
The full example is available at: https://matplotlib.org/gallery/images_contours_and_fields/contourf_demo.html#sphx-glr-gallery-images-contours-and-fields-contourf-demo-py

I have found an answer: create each subplot's axes explicitly, then call set() and legend() on that axes object.
fig = plt.figure()
ax1 = fig.add_subplot(311)
ax2 = fig.add_subplot(312)
ax3 = fig.add_subplot(313)
# Draw both facies on the same axes, then label that axes directly
ax1.hist(df_sd['Porosity'].values, label='Sand', bins=30, alpha=0.6)
ax1.hist(df_sh['Porosity'].values, label='Shale', bins=30, alpha=0.6)
ax1.set(xlabel='Porosity (fraction)', ylabel='Density', title='Porosity Histogram')
ax1.legend()
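The same idea can be written as a loop over all three variables. This is only a sketch: the sand and shale dictionaries hold randomly generated values standing in for the df_sd and df_sh columns, and density=True is an assumption made so that the bars match the 'Density' axis label.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for the df_sd (Sand) and df_sh (Shale) columns
rng = np.random.default_rng(1)
sand = {'Porosity': rng.normal(0.20, 0.03, 300),
        'log10Perm': rng.normal(2.0, 0.5, 300),
        'AI': rng.normal(5500, 600, 300)}
shale = {'Porosity': rng.normal(0.10, 0.03, 300),
         'log10Perm': rng.normal(0.5, 0.5, 300),
         'AI': rng.normal(7000, 600, 300)}

# (column, x-axis label, title) for each subplot
panels = [('Porosity', 'Porosity (fraction)', 'Porosity Histogram'),
          ('log10Perm', 'log10 Permeability (mD)', 'Permeability Histogram'),
          ('AI', 'AI (units)', 'Acoustic Impedance Histogram')]

fig, axs = plt.subplots(3, 1, figsize=(6, 10), constrained_layout=True)
for ax, (col, xlabel, title) in zip(axs, panels):
    # Both facies go on the same axes so they can be compared directly
    ax.hist(sand[col], bins=30, alpha=0.6, density=True, label='Sand')
    ax.hist(shale[col], bins=30, alpha=0.6, density=True, label='Shale')
    ax.set(xlabel=xlabel, ylabel='Density', title=title)
    ax.legend()
```

Because every call goes through ax rather than plt, each subplot gets its own labels, title and legend.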

Difference between a loglog plot of scatter data and loglog plot of non scatter data

I am trying to represent data in a log-log plot, but I can't figure out the difference between the two plotting methods. In FIG4 the data is scattered. In FIG5 the data is not scattered. What is the interpretation?
Here is the code:
fig4, ax4 = plt.subplots()
ax4.scatter(t, sigma, marker='o', label='strain', color='red', s=0.5)
ax4.set_xlabel('log(t)')
ax4.set_ylabel('log(Sigma)')
ax4.set_title('FIG4:Log(t),log(sigma)')
ax4.set_yscale('log')
ax4.set_xscale('log')
plt.grid()
plt.show()
fig5, ax5 = plt.subplots()
ax5.set_xlabel('log(t)')
ax5.set_ylabel('log(Sigma)')
ax5.set_title('FIG5: Log(t),log(sigma)')
plt.loglog(t,sigma)
plt.grid()
plt.show()
Here are the two plots:
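For comparison, here is a small sketch of the two calls on synthetic data (the power-law values of t and sigma are invented for illustration). Both axes end up with logarithmic scales; the difference is only in what is drawn. scatter draws one unconnected marker per point, while loglog is a line plot on log-scaled axes that joins the points in array order. Note also that on log-scaled axes the ticks show t and sigma themselves, not their logarithms.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical power-law data standing in for the question's t and sigma
t = np.logspace(-1, 3, 50)
sigma = 5.0 * t ** -0.3

fig, (ax_scatter, ax_line) = plt.subplots(1, 2, figsize=(9, 4))

# Scatter: individual markers, with log scales set afterwards
ax_scatter.scatter(t, sigma, marker='o', color='red', s=4)
ax_scatter.set_xscale('log')
ax_scatter.set_yscale('log')
ax_scatter.set_title('scatter + log scales')

# loglog: a connected line on log-scaled axes
ax_line.loglog(t, sigma)
ax_line.set_title('loglog')

for ax in (ax_scatter, ax_line):
    ax.set_xlabel('t')
    ax.set_ylabel('sigma')
    ax.grid(True)
```

If the data are not sorted by t, the loglog line will zigzag between points; passing a marker with no line (for example ax_line.loglog(t, sigma, 'o')) gives a picture close to the scatter version.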

Matplotlib format the scale label

I have searched around SO and haven't been able to find how to format this text (I've also checked around google and the matplotlib docs)
I'm currently creating a figure and then adding 4 subplots in a 2x2 matrix format so I'm trying to scale down all the text:
fig = plt.figure()
ax1 = fig.add_subplot(221)
ax1.tick_params(labelsize='xx-small')
ax1.set_title(v, fontdict={'fontsize':'small'})
ax1.hist(results[v], histtype='bar', label='data', bins=bins, alpha=0.5)
ax1.hist(results[v+'_sim'], histtype='bar', label='truth', bins=bins, alpha=0.8)
ax1.legend(loc='best', fontsize='x-small')

You can set the parameters before plotting:
plt.rcParams['xtick.labelsize'] = "xx-small"
plt.rcParams['ytick.labelsize'] = "xx-small"
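If the "scale label" in the question means the offset or exponent text that matplotlib places at the end of an axis (for example 1e6), which is an assumption since the question does not show it, that text can also be resized directly on a single axes. A minimal sketch with made-up data and an arbitrary font size of 6 points:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [1e6, 2e6, 3e6])  # hypothetical data with large values

# Force scientific notation so an exponent label appears on the y-axis
ax.ticklabel_format(axis='y', style='sci', scilimits=(0, 0))

# Shrink the tick labels and the exponent (offset) text on this axes only
ax.tick_params(labelsize='xx-small')
ax.yaxis.get_offset_text().set_fontsize(6)
```

Unlike the rcParams approach, this affects only the one axes it is called on.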
