Importing histogram from matplotlib to plotly


I have generated these histograms with the Python code below, and they look fine in matplotlib:
d_norm_1 = np.random.normal(loc=0.0, scale=3.0, size=5000)
## Build a Gaussian Mixture Model:
array1 = np.random.normal(loc=4.0, scale=2.0, size=2000)
array2 = np.random.normal(loc=-5.0, scale=4.0, size=2000)
d_norm_2 = np.concatenate((array1, array2))
fig3 = plt.figure(3, figsize=(8, 6))
ax3 = fig3.add_subplot(1, 1, 1)
plt.hist(d_norm_1, bins=40, normed=True, color='b', alpha=0.4, rwidth=1.0)
plt.hist(d_norm_2, bins=40, normed=True, color='g', alpha=0.4, rwidth=0.8)
plt.xlabel('$x$', size=20)
plt.ylabel('Probability Density', size=20)
plt.title('Histogram', size=20)
plt.setp(ax3.get_xticklabels(), rotation='horizontal', fontsize=16)
plt.setp(ax3.get_yticklabels(), rotation='horizontal', fontsize=16)
plt.show()
But when I import this into plotly, the histogram bars are replaced by lines. I think plotly is not compatible with this version of matplotlib.
Here is the plotly version of the same histogram shown above:
https://plot.ly/~vmirjalily/11/histogram/
I am using matplotlib 1.4.2

Your code for converting the histogram to plotly is working.
You are just missing one last step. What your plotly figure shows is a grouped bar chart. Essentially, plotly has placed the two bars side by side within each bin.
What you need to do is go to
traces > mode and change it to an 'overlay' bar chart.
Here's my implementation:
https://plot.ly/1/~quekxc

biobirdman's solution is perfectly fine if you want to use the web tools. Here's another way to do it strictly from Python:
import matplotlib.pyplot as plt
import numpy as np
import plotly.plotly as py
d_norm_1 = np.random.normal(loc=0.0, scale=3.0, size=5000)
## Build a Gaussian Mixture Model:
array1 = np.random.normal(loc=4.0, scale=2.0, size=2000)
array2 = np.random.normal(loc=-5.0, scale=4.0, size=2000)
d_norm_2 = np.concatenate((array1, array2))
fig3 = plt.figure(3, figsize=(8, 6))
ax3 = fig3.add_subplot(1, 1, 1)
plt.hist(d_norm_1, bins=40, normed=True, color='b', alpha=0.4, rwidth=1.0)
plt.hist(d_norm_2, bins=40, normed=True, color='g', alpha=0.4, rwidth=0.8)
plt.xlabel('$x$', size=20)
plt.ylabel('Probability Density', size=20)
plt.title('Histogram', size=20)
plt.setp(ax3.get_xticklabels(), rotation='horizontal', fontsize=16)
plt.setp(ax3.get_yticklabels(), rotation='horizontal', fontsize=16)
# note the `update` argument, it's formatted as a plotly Figure object
# this says: "convert the figure as best you can, then apply the update on the result"
py.iplot_mpl(fig3, update={'layout': {'barmode': 'overlay'}})
For more information online, check out https://plot.ly/matplotlib/ or https://plot.ly/python/
For Python help, check out help(py.iplot_mpl) or help(Figure)
It can sometimes be useful to see exactly what got converted as well; you might try this:
import plotly.tools as tls
pfig = tls.mpl_to_plotly(fig3) # turns the mpl object into a plotly Figure object
print(pfig.to_string()) # prints out a `pretty` looking text representation

Related

Seaborn axvspan altering x-axis

I'm trying to create some scatter plots with seaborn, with a specific area of each plot highlighted in red. However, when I add the code for axvspan, it changes the x-axis. This is how the plots look before axvspan is applied.
When I apply the line for axvspan:
fig, (ax0, ax1) = plt.subplots(2,1, figsize=(5,10))
ax0.axvspan("0.4", "0.8", color='red', alpha=0.3, label ='Problem Area')
sns.scatterplot(x='Values_1', y='Values_2', data=df3, color='green', ax=ax0)
sns.scatterplot(x='Values_3', y='Values_4', data=df3, color='green', ax=ax1)
plt.show()
It ends up looking like this:
Ultimately, the red section needs to cover only the data between 0.4 and 0.7, but because the x-axis is altered it ends up covering all of it.
Any advice?

The unexpected behavior results from passing the xmin and xmax arguments to matplotlib.pyplot.axvspan as str rather than as float. Pass them as numbers instead:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
# generate data
rng = np.random.default_rng(12)
df3 = pd.DataFrame({"Values_2": rng.random(100), "Values_1": np.linspace(0., 0.6, 100)})
fig, ax0 = plt.subplots(1,1, figsize=(6, 4))
ax0.axvspan(0.4, 0.8, color='red', alpha=0.3, label ='Problem Area')
sns.scatterplot(x='Values_1', y='Values_2', data=df3, color='green', ax=ax0)
plt.show()
This gives a plot in which the red span covers only the region from x = 0.4 to x = 0.8.

Combine 2 kde-functions in one plot in seaborn

I have the following code for plotting the histogram and the kde-functions (Kernel density estimation) of a training and validation dataset:
#Plot histograms
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
displot_dataTrain=sns.displot(data_train, bins='auto', kde=True)
displot_dataTrain._legend.remove()
plt.ylabel('Count')
plt.xlabel('Training Data')
plt.title("Histogram Training Data")
plt.show()
displot_dataValid =sns.displot(data_valid, bins='auto', kde=True)
displot_dataValid._legend.remove()
plt.ylabel('Count')
plt.xlabel('Validation Data')
plt.title("Histogram Validation Data")
plt.show()
# Try to plot the kde-functions together --> yields an AttributeError
X1 = np.linspace(data_train.min(), data_train.max(), 1000)
X2 = np.linspace(data_valid.min(), data_valid.max(), 1000)
fig, ax = plt.subplots(1,2, figsize=(12,6))
ax[0].plot(X1, displot_dataTest.kde.pdf(X1), label='train')
ax[1].plot(X2, displot_dataValid.kde.pdf(X1), label='valid')
Plotting each histogram together with its kde-function in one plot works without problems. Now I would like to have the 2 kde-functions inside one plot, but when using the posted code I get the following error: AttributeError: 'FacetGrid' object has no attribute 'kde'
Do you have any idea how I can combine the 2 kde-functions inside one plot (without the histogram)?

sns.displot() returns a FacetGrid. That doesn't work as input for ax.plot(). Also, displot_dataTest.kde.pdf is never valid. However, you can write sns.kdeplot(data=data_train, ax=ax[0]) to create a kdeplot inside the first subplot. See the docs; note the optional parameters cut= and clip= that can be used to adjust the limits.
If you only want one subplot, you can use fig, ax = plt.subplots(1, 1, figsize=(12,6)) and use ax=ax instead of ax=ax[0] as in that case ax is just a single subplot, not an array of subplots.
The following code has been tested using the latest seaborn version:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
fig, ax = plt.subplots(figsize=(12, 6))
sns.kdeplot(data=np.random.normal(0.1, 1, 100).cumsum(),
            color='crimson', label='train', fill=True, ax=ax)
sns.kdeplot(data=np.random.normal(0.1, 1, 100).cumsum(),
            color='limegreen', label='valid', fill=True, ax=ax)
ax.legend()
plt.tight_layout()
plt.show()
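If you want the density curves as plain arrays, closer to what the question attempted with `.kde.pdf(...)`, one option is to compute the KDE yourself. The sketch below swaps in `scipy.stats.gaussian_kde` rather than anything seaborn exposes; `data_train` and `data_valid` are random placeholders for the real datasets, and scipy's default bandwidth may give curves that differ slightly from the ones seaborn draws.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Placeholder data standing in for the real training/validation sets
rng = np.random.default_rng(1)
data_train = rng.normal(0.0, 1.0, 500)
data_valid = rng.normal(0.5, 1.2, 300)

# Fit one kernel density estimate per dataset
kde_train = gaussian_kde(data_train)
kde_valid = gaussian_kde(data_valid)

# Evaluate both estimates on a shared grid so the curves are comparable
lo = min(data_train.min(), data_valid.min())
hi = max(data_train.max(), data_valid.max())
grid = np.linspace(lo, hi, 1000)

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(grid, kde_train.pdf(grid), label='train')
ax.plot(grid, kde_valid.pdf(grid), label='valid')
ax.set_xlabel('Value')
ax.set_ylabel('Density')
ax.legend()
```

Call `plt.show()` afterwards to display the figure. Having the values in `kde_train.pdf(grid)` is also handy if you need to compare the two densities numerically.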

Plotting histogram in Python with frequency percentage

I have a list of ratings for which I am plotting a histogram. On the left (y-axis) it shows the frequency count. Is there a way for it to show the % based on traffic?
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.hist(item['ratings'], bins = 5)
ax.legend()
ax.set_title("Ratings Frequency")
ax.set_xlabel("Ratings")
ax.set_ylabel("frequency")
ax.axhline(y=0, linestyle='--', color='k')

You can use countplot from the seaborn library, which makes this kind of data visualization very easy. Note that countplot draws the count per rating rather than a percentage:
import seaborn as sns
sns.countplot(x=item['ratings'])
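To get percentages on the y-axis with plain matplotlib, one common approach is to give every observation a weight of 100/N, so that the bar heights add up to 100. This is a minimal sketch: the `ratings` array is made-up placeholder data standing in for `item['ratings']`.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

# Placeholder ratings standing in for item['ratings']
ratings = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5])

# Each observation contributes 100/N, so the bars sum to 100 percent
weights = np.full(len(ratings), 100.0 / len(ratings))

fig, ax = plt.subplots()
heights, edges, _ = ax.hist(ratings, bins=5, weights=weights)

# Show the tick labels with a percent sign (values are already 0..100)
ax.yaxis.set_major_formatter(PercentFormatter(xmax=100))
ax.set_title("Ratings Frequency")
ax.set_xlabel("Ratings")
ax.set_ylabel("Share of ratings")
```

Call `plt.show()` afterwards to display the figure.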

figure/subplot confusion with regard to x/y limits

I am trying to set the x and y limits on a subplot but am having difficulty. I suspect that the difficulty stems from my fundamental lack of understanding of how figures and subplots work. I have read these two questions:
question 1
question 2
I tried to use that approach, but neither had any effect on the x and y limits. Here's my code:
fig = plt.figure(figsize=(9,6))
ax = plt.subplot(111)
ax.hist(sub_dict['b'], bins=30, color='r', alpha=0.3)
ax.set_ylim=([0,200])
ax.set_xlim=([0,100])
plt.xlabel('x')
plt.ylabel('y')
plt.title('title')
plt.show()
I am confused as to whether to apply commands to fig or ax. For instance, .xlabel and .title don't seem to be available for ax. Thanks

Why don't you do:
ax = fig.add_subplot(111)
import matplotlib.pyplot as plt
import numpy as np
mu, sigma = 100, 15
x = mu + sigma*np.random.randn(100)
fig = plt.figure(figsize=(9,6))
ax = fig.add_subplot(111)
ax.hist(x, bins=30, color='r', alpha=0.3)
# set_ylim and set_xlim are methods: call them instead of assigning to them
ax.set_ylim(0, 200)
ax.set_xlim(0, 100)
plt.xlabel('x')
plt.ylabel('y')
plt.title('title')
plt.show()
I've run your code on some sample data, and I'm attaching the screenshot. I'm not sure this is the desired result, but this is what I got.

For a multiplot, where you have several subplots in a single figure, you can have several x-labels and one title:
fig.suptitle("foobar")
ax.set_xlabel("x")
This is explained in great detail here on the Matplotlib website.
In your case, you use a subplot for just a single plot. This is possible; it just doesn't make a lot of sense. Plots like the one below are supposed to be created with the subplot feature:
To answer your question: you can set the x- and y-limits on a per-subplot and per-axis basis by addressing the respective subplot directly (ax for subplot 1) and then calling its set_xlim and set_ylim member functions. Labels work the same way: set_xlabel sets the label on the x-axis.
EDIT
For your updated question:
Use this code as inspiration; I had to generate some data on my own, so no guarantees:
import matplotlib.pyplot as plt
plt.hist(sub_dict['b'], bins=30, color='r', alpha=0.3)
plt.ylim(0,200)
plt.xlim(0,100)
plt.xlabel('x')
plt.ylabel('y')
plt.title('title')
plt.show()

A bit more googling and I got the following, which worked:
sub_dict = subset(data_dict, 'b', 'a', greater_than, 10)
fig = plt.figure(figsize=(9,6))
ax = fig.add_subplot(111)
ax.hist(sub_dict['b'], bins=30, color='r', alpha=0.3)
plt.ylim(0,250)
plt.xlim(0,100)
plt.xlabel('x')
plt.ylabel('y')
plt.title('title')
plt.show()
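For reference, the limits in the original code had no effect because `ax.set_ylim=([0,200])` assigns a list to the attribute name instead of calling the method. The Axes object also does have label and title setters; they just carry a `set_` prefix. The sketch below does everything through `ax`; the `values` array is random placeholder data standing in for `sub_dict['b']`.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data standing in for sub_dict['b']
rng = np.random.default_rng(2)
values = rng.normal(50, 15, 1000)

fig, ax = plt.subplots(figsize=(9, 6))
ax.hist(values, bins=30, color='r', alpha=0.3)

# Call the setters; writing `ax.set_xlim = (0, 100)` would only
# overwrite the method on this Axes object and change nothing in the plot
ax.set_xlim(0, 100)
ax.set_ylim(0, 200)

# Axes-level equivalents of plt.xlabel, plt.ylabel and plt.title
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('title')
```

Call `plt.show()` afterwards to display the figure. As a rule of thumb, figure-wide things such as size or saving go through `fig`, while anything about one plot (data, limits, labels, title) goes through `ax`.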

Creating sparklines using matplotlib in python

I am working on matplotlib and created some graphs like bar chart, bubble chart and others.
Can someone please explain with an example what the difference is between a line graph and a sparkline graph, and how to draw sparkline graphs in Python using matplotlib?
For example, with the following code
import matplotlib.pyplot as plt
import numpy as np
x=[1,2,3,4,5]
y=[5,7,2,6,2]
plt.plot(x, y)
plt.show()
the line graph generated is the following:
But I couldn't work out what the difference is between a line chart and a sparkline chart for the same data. Please help me understand.

A sparkline is the same as a line plot but without axes or coordinates. They can be used to show the "shape" of the data in a compact way.
You can cram several line plots in the same figure just by using subplots and changing properties of the resulting Axes for each subplot:
import numpy as np
import matplotlib.pyplot as plt
data = np.cumsum(np.random.rand(1000)-0.5)
data = data - np.mean(data)
fig = plt.figure()
ax1 = fig.add_subplot(411) # nrows, ncols, plot_number, top sparkline
ax1.plot(data, 'b-')
ax1.axhline(c='grey', alpha=0.5)
ax2 = fig.add_subplot(412, sharex=ax1)
ax2.plot(data, 'g-')
ax2.axhline(c='grey', alpha=0.5)
ax3 = fig.add_subplot(413, sharex=ax1)
ax3.plot(data, 'y-')
ax3.axhline(c='grey', alpha=0.5)
ax4 = fig.add_subplot(414, sharex=ax1) # bottom sparkline
ax4.plot(data, 'r-')
ax4.axhline(c='grey', alpha=0.5)
for axes in [ax1, ax2, ax3, ax4]: # remove all borders
    plt.setp(axes.get_xticklabels(), visible=False)
    plt.setp(axes.get_yticklabels(), visible=False)
    plt.setp(axes.get_xticklines(), visible=False)
    plt.setp(axes.get_yticklines(), visible=False)
    plt.setp(axes.spines.values(), visible=False)
# bottom sparkline
plt.setp(ax4.get_xticklabels(), visible=True)
plt.setp(ax4.get_xticklines(), visible=True)
ax4.xaxis.tick_bottom() # but only the lower x ticks, not x ticks at the top
plt.tight_layout()
plt.show()

A sparkline graph is just a regular plot with all the axes removed. This is quite simple to do with matplotlib:
import matplotlib.pyplot as plt
import numpy as np
# create some random data
x = np.cumsum(np.random.rand(1000)-0.5)
# plot it
fig, ax = plt.subplots(1,1,figsize=(10,3))
plt.plot(x, color='k')
plt.plot(len(x)-1, x[-1], color='r', marker='o')
# remove all the axes
for k,v in ax.spines.items():
    v.set_visible(False)
ax.set_xticks([])
ax.set_yticks([])
#show it
plt.show()
