I finished analyzing my data and want to show that the differences are statistically significant using an independent t-test (t-test_ind). However, the only working approach I have found is the one referenced in (How does one insert statistical annotations (stars or p-values) into matplotlib / seaborn plots?):
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
from statannot import add_stat_annotation
ax = sns.barplot(x=x, y=y, order=order)
add_stat_annotation(ax, data=df, x=x, y=y,
boxPairList=[(order[0], order[1]), (order[0], order[2])],
test='t-test_ind',
textFormat='star',
loc='outside')
With this approach, however, whenever I try to save the plot using plt.savefig(), the significance annotations added by add_stat_annotation are discarded (matplotlib does not seem to recognize the added annotations). Using the loc='inside' option messes up my plot, so it isn't really an option.
I am therefore asking whether there is a simpler way to add the significance annotations directly in matplotlib / seaborn, or whether plt.savefig() can be called with enough border / padding to include everything.
It was mainly an xlabel cut-off problem. In future applications I would use add_stat_annotation from webermarcolivier/statannot. To save your files, use one of the following options:
import matplotlib.pyplot as plt
plt.tight_layout() # Option 1
plt.autoscale() # Option 2
plt.savefig('filename.png', bbox_inches = "tight") # Option 3
Hope this will help someone for future use.
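For readers who want to avoid the extra dependency, here is a minimal sketch of drawing the annotation directly in matplotlib and saving it with bbox_inches="tight". The data, the helper p_to_stars, the star thresholds, and the output file name are all made up for illustration; scipy.stats.ttest_ind is used as the independent t-test, on the assumption that it matches what 't-test_ind' computes.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Hypothetical sample data: two groups with different means.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=5.0, scale=1.0, size=30)
group_b = rng.normal(loc=6.0, scale=1.0, size=30)

# Independent two-sample t-test.
t_stat, p_value = stats.ttest_ind(group_a, group_b)


def p_to_stars(p):
    """Map a p-value to a conventional star label (thresholds are a common choice)."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


fig, ax = plt.subplots()
means = [group_a.mean(), group_b.mean()]
ax.bar([0, 1], means, tick_label=["A", "B"])
ax.set_xlabel("group")
ax.set_ylabel("mean value")

# Draw a bracket above the two bars and place the label on top of it.
y = max(means) * 1.05
h = max(means) * 0.03
ax.plot([0, 0, 1, 1], [y, y + h, y + h, y], color="black", linewidth=1.2, clip_on=False)
label = p_to_stars(p_value)
ax.text(0.5, y + h, label, ha="center", va="bottom", clip_on=False)

# bbox_inches="tight" expands the saved area to include artists outside the axes.
out_path = os.path.join(tempfile.mkdtemp(), "annotated_bars.png")
fig.savefig(out_path, bbox_inches="tight")
plt.close(fig)
```

The bracket is an ordinary line, so it takes part in autoscaling; the text does not, which is why the tight bounding box matters when the label ends up above the axes.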
I generate a lot of figures with a script; I do not display them but store them to the hard drive. After a while I get the message
/usr/lib/pymodules/python2.7/matplotlib/pyplot.py:412: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (matplotlib.pyplot.figure) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam figure.max_num_figures).
max_open_warning, RuntimeWarning)
Thus, I tried to close or clear the figures after storing them. So far I have tried all of the following, but none of them works; I still get the message above.
plt.figure().clf()
plt.figure().clear()
plt.clf()
plt.close()
plt.close('all')
plt.close(plt.figure())
And furthermore I tried to restrict the number of open figures by
plt.rcParams.update({'figure.max_num_figures':1})
Here follows a piece of sample code that behaves like described above. I added the different options I tried as comments at the places I tried them.
from pandas import DataFrame
from numpy import random
df = DataFrame(random.randint(0,10,40))
import matplotlib.pyplot as plt
plt.ioff()
#plt.rcParams.update({'figure.max_num_figures':1})
for i in range(0,30):
    fig, ax = plt.subplots()
    ax.hist([df])
    plt.savefig("/home/userXYZ/Development/pic_test.png")
    #plt.figure().clf()
    #plt.figure().clear()
    #plt.clf()
    #plt.close() # results in an error
    #plt.close('all') # also error
    #plt.close(plt.figure()) # also error
To be complete, that is the error I get when using plt.close:
can't invoke "event" command: application has been destroyed
while executing "event generate $w <>"
(procedure "ttk::ThemeChanged" line 6)
invoked from within "ttk::ThemeChanged"
The correct way to close your figures would be to use plt.close(fig), as can be seen in the below edit of the code you originally posted.
from pandas import DataFrame
from numpy import random
df = DataFrame(random.randint(0,10,40))
import matplotlib.pyplot as plt
plt.ioff()
for i in range(0,30):
    fig, ax = plt.subplots()
    ax.hist(df)
    name = 'fig'+str(i)+'.png' # Note that the name should change dynamically
    plt.savefig(name)
    plt.close(fig) # <-- use this line
The error that you describe at the end of your question suggests to me that your problem is not with matplotlib, but rather with another part of your code (such as ttk).
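One way to check that plt.close(fig) really releases each figure is to count the figures pyplot still tracks after every iteration. The sketch below assumes a headless run (Agg backend) and writes to a temporary directory; out_dir and open_counts are names invented for the example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: no GUI is needed, files are only written to disk
import matplotlib.pyplot as plt
import numpy as np

out_dir = tempfile.mkdtemp()          # hypothetical output location
data = np.random.randint(0, 10, 40)

open_counts = []
for i in range(30):
    fig, ax = plt.subplots()
    ax.hist(data)
    fig.savefig(os.path.join(out_dir, f"fig{i}.png"))
    plt.close(fig)                    # release this specific figure
    # plt.get_fignums() lists the figures pyplot is still holding on to.
    open_counts.append(len(plt.get_fignums()))

print(max(open_counts))  # 0 when every figure was closed
```

If the count ever grows, some code path is creating figures that are never closed (for example a stray plt.figure() call, which opens a new figure each time it is invoked).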
plt.show() is a blocking function, so in the above code, plt.close() will not execute until the fig windows are closed.
You can use plt.ion() at the beginning of your code to make it non-blocking. Even though this has some other implications the fig will be closed.
I was still having the same issue on Python 3.9.7, matplotlib 3.5.1, and VS Code (no combination of plt.close() calls closed the figure). I have three nested loops, and the innermost one plots more than 20 figures. The solution that works for me is to use agg as the backend and to call del someFig after plt.close(someFig). The order of the code would be something like:
import matplotlib
matplotlib.use('agg')
import matplotlib.pyplot as plt
someFig = plt.figure()
.
.
.
someFig.savefig('OUTPUT_PATH')
plt.close(someFig) # --> (Note 1)
del someFig
.
.
.
NOTE 1: If this line is removed, the output figures may not be plotted correctly! Especially when the number of elements to be rendered in the figure is high.
NOTE 2: I don't know whether this solution could backfire or not, but at least it is working and not hogging RAM or preventing figures from being plotted!
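A related option, which is a different technique from the one above, is to skip pyplot entirely and build each figure with matplotlib.figure.Figure. Such figures are not registered in pyplot's global figure list, so there is nothing to close and they can be garbage-collected like any other object. The sketch below is only an illustration; out_dir, saved_paths, and the plotted data are made up.

```python
import os
import tempfile

import numpy as np
from matplotlib.figure import Figure  # note: pyplot is not imported at all

out_dir = tempfile.mkdtemp()  # hypothetical output location
saved_paths = []

for i in range(25):
    fig = Figure(figsize=(4, 3))      # not tracked by pyplot's figure manager
    ax = fig.subplots()
    ax.plot(np.arange(10), np.arange(10) * i)
    ax.set_title(f"figure {i}")
    path = os.path.join(out_dir, f"fig{i}.png")
    fig.savefig(path)                 # picks a suitable renderer from the file extension
    saved_paths.append(path)
    # No plt.close() needed: once 'fig' is rebound, the old figure can be collected.
```

Because pyplot never sees these figures, the "More than 20 figures have been opened" warning should not be triggered by this loop.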
import tensorflow as tf
from matplotlib import pyplot as plt
sample_image = tf.io.read_file(str(PATH / 'Path to your file'))
sample_image = tf.io.decode_jpeg(sample_image)
print(sample_image.shape)
plt.figure("1 - Sample Image ")
plt.title(label="Sample Image", fontsize=12, color="red")
plt.imshow(sample_image)
plt.show(block=False)
plt.pause(3)
plt.close()
plt.show(block=False) followed by plt.pause(interval) does the trick.
This does not really solve my problem, but it is a work-around to handle the high memory consumption I faced and I do not get any of the error messages as before:
from pandas import DataFrame
from numpy import random
df = DataFrame(random.randint(0,10,40))
import matplotlib.pyplot as plt
plt.ioff()
for i in range(0,30):
    plt.close('all')
    fig, ax = plt.subplots()
    ax.hist([df])
    plt.savefig("/home/userXYZ/Development/pic_test.png")
I am trying to plot data to a figure and respective axis in matplotlib and as new work comes up, recall the figure with the additional plot on the axis:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
x=np.arange(0,20)
y=2*x
fig,ax=plt.subplots()
ax.scatter(x,x)
ax.scatter(x,y)
fig
Which works fine with matplotlib, if I however use seaborn's regplot:
fig2,ax2=plt.subplots()
sns.regplot(x,x,ax=ax2,fit_reg=False)
sns.regplot(x,y,ax=ax2,fit_reg=False)
fig2
fig2 generates the figure that I want but the regplot command generates an empty figure. Is there a way to suppress the regplot's empty output or have it display the updated ax2 without recalling fig2?
It seems you are using the jupyter notebook with the inline backend. In some circumstances regplot triggers the creation of a new figure even if the artists are being added to the previous one and this messes up the output. I don't know why this happens but I found a workaround that might help you, using plt.ioff to temporarily disable automatic display of figures.
plt.ioff()
fig, ax = plt.subplots()
sns.regplot(x, x, ax=ax)
fig
sns.regplot(x, 2 * x, ax=ax)
fig
You have to call plt.ioff before creating the figure for this to work. After that you have to explicitly display the figure. Then you can call plt.ion to restore the default behaviour.
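As a sketch of the same idea, plt.ioff() can also be used as a context manager, assuming matplotlib 3.4 or newer, so the previous interactive state is restored automatically. The Agg backend and the plt.ion() call below only stand in for a notebook session; in Jupyter you would drop them and display fig explicitly afterwards.

```python
import matplotlib
matplotlib.use("Agg")  # stand-in for the notebook backend, so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

x = np.arange(20)

plt.ion()  # simulate the interactive mode a notebook session usually has

with plt.ioff():  # interactive mode is off only inside this block
    fig, ax = plt.subplots()
    sns.regplot(x=x, y=x, ax=ax, fit_reg=False)
    sns.regplot(x=x, y=2 * x, ax=ax, fit_reg=False)
    inside = plt.isinteractive()

restored = plt.isinteractive()
print(inside, restored)  # False True

plt.close(fig)
```

Both regplot calls draw onto the same axes, so ax holds two scatter collections when the block ends.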
regplot does not generate an empty figure. According to the documentation:
Understanding the difference between regplot() and lmplot() can be a
bit tricky. In fact, they are closely related, as lmplot() uses
regplot() internally and takes most of its parameters. However,
regplot() is an axes-level function, so it draws directly onto an axes
(either the currently active axes or the one provided by the ax
parameter), while lmplot() is a figure-level function and creates its
own figure, which is managed through a FacetGrid.
When I do the following:
fig2,ax2 = plt.subplots()
same_fig2 = sns.regplot(x,x,ax=ax2,fit_reg=False)
same_fig2.figure is fig2
>>> True
I am working on generating some scatter plot with matplotlib.pyplot.scatter() in jupyter notebook, and I found that if I import seaborn package, the scatter plot will lose its color. I am wondering if anyone has a similar issue?
Here is an example code
import matplotlib.pyplot as plt
import seaborn as sb
plt.scatter(range(4),range(4), c=range(4))
The output is
The scatter plot without seaborn is:
That seems to be the way it behaves. In seaborn 0.3 the default color scale was changed to greyscale. If you change your code to:
plt.scatter(range(4),range(4), c=sb.color_palette(n_colors=4))  # one color per point
You will get an image with colors similar to your original.
See the Seaborn docs on choosing color palettes for more info.
Another way to fix this is to specify cmap option for plt.scatter() so that it would not be affected by seaborn:
ax = plt.scatter(range(4),range(4), c=range(4), cmap='gist_rainbow')
plt.colorbar(ax)
The result is:
There are many options for cmap here:
http://matplotlib.org/examples/color/colormaps_reference.html
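The effect of passing cmap explicitly can be checked without seaborn at all. The sketch below assumes the greyscale default comes from the global image.cmap rcParam (which is how a library can change matplotlib's default colormap) and mimics that with plt.rc_context; the variable names are made up.

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

# Mimic a library that switched the global default colormap to greyscale.
with plt.rc_context({"image.cmap": "gray"}):
    fig, ax = plt.subplots()
    # No cmap given: the scatter picks up the (modified) global default.
    default_sc = ax.scatter(range(4), range(4), c=range(4))
    # Explicit cmap: unaffected by the global default.
    explicit_sc = ax.scatter(range(4), [v + 1 for v in range(4)],
                             c=range(4), cmap="gist_rainbow")
    fig.colorbar(explicit_sc, ax=ax)

print(default_sc.get_cmap().name)   # gray
print(explicit_sc.get_cmap().name)  # gist_rainbow
plt.close(fig)
```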
seaborn is a beautiful Python package that acts, for the most part, as an additional layer on top of matplotlib. However, it changes, for instance, things that would be matplotlib methods on a plot object to direct seaborn functions.
seaborn's despine() removes any spines (the outer edges of the plot) from a plot. But I cannot do the opposite.
I cannot seem to recreate the spine in the standard way that I would / could if I had used matplotlib entirely from the start. Is there a way to do that? How would I?
Below is an example. Could I, for instance, add a spine on the bottom and the left of the plot?
from sklearn import datasets
import pandas as pd
tmp = datasets.load_iris()
iris = pd.DataFrame(tmp.data, columns=tmp.feature_names)
iris['species'] = tmp.target_names[tmp.target]
iris.species = iris.species.astype('category')
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style('darkgrid')
sns.boxplot(x='species', y='sepal length (cm)', data=iris)
plt.show()
Thanks for all the great comments! I knew some of what you wrote, but not that both the 'axes.linewidth' and 'axes.edgecolor' needed to be set.
I'm writing an answer here, since it is a compilation of a few comments.
That is, the following code generates the plot below:
sns.set_style('darkgrid', {'axes.linewidth': 2, 'axes.edgecolor':'black'})
sns.boxplot(x='species', y='sepal length (cm)', data=iris)
plt.show()
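If only some spines should come back, a per-axes alternative to the global style setting is to work on ax.spines directly. This is a rough sketch with made-up data rather than the iris example; it sets visibility, color, and line width together because the darkgrid style draws spines with zero width and a light edge color.

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import seaborn as sns

# Apply the darkgrid style only inside this block, so global settings are untouched.
with sns.axes_style("darkgrid"):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [1, 3, 2])

    # Hide all four spines (top and right are removed by default).
    sns.despine(ax=ax, left=True, bottom=True)

    # Bring back the left and bottom spines and make them clearly visible.
    for side in ("left", "bottom"):
        spine = ax.spines[side]
        spine.set_visible(True)
        spine.set_color("black")
        spine.set_linewidth(2)

plt.close(fig)
```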
I am using Seaborn to plot some data in Pandas.
I am making some very large plots (factorplots).
To see them, I am using some visualisation facilities at my university.
I am using a Compound screen made up of 4 by 4 monitors with small (but nonzero) bevel -- the gap between the screens.
This gap is black.
To minimise the disconnect between the screens, I want the graph background to be black.
I have been digging around the documentation and playing around, and I can't work it out.
Surely this is simple.
I can get a grey background using set_style('darkgrid').
Do I need to access the plot in matplotlib directly?
seaborn.set takes an rc argument that accepts a dictionary of valid matplotlib rcParams. So we need to set two things: axes.facecolor, which is the color of the area where the data are drawn, and figure.facecolor, which is the color of everything in the figure outside of the axes object.
(edited with advice from #mwaskom)
So if you do:
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn
seaborn.set(rc={'axes.facecolor':'cornflowerblue', 'figure.facecolor':'cornflowerblue'})
fig, ax = plt.subplots()
You get:
And that'll work with your FacetGrid as well.
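For the black background asked about in the question, the same rc mechanism can be sketched as follows. The extra text and tick color entries are an addition of this example (so labels stay readable on black), and black_rc, out_path, and the plotted data are made up.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import seaborn as sns

# rcParams for a black canvas with light text.
black_rc = {
    "axes.facecolor": "black",
    "figure.facecolor": "black",
    "text.color": "white",
    "axes.labelcolor": "white",
    "xtick.color": "white",
    "ytick.color": "white",
}
sns.set(rc=black_rc)

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], color="white")
ax.set_xlabel("x")
ax.set_ylabel("y")

# Pass the figure's own face color explicitly so the saved file is black as well.
out_path = os.path.join(tempfile.mkdtemp(), "black_background.png")
fig.savefig(out_path, facecolor=fig.get_facecolor())

fig_is_black = tuple(fig.get_facecolor()) == to_rgba("black")
ax_is_black = tuple(ax.get_facecolor()) == to_rgba("black")
plt.close(fig)
```

Note that sns.set changes matplotlib's global rcParams, so every figure created afterwards in the same session inherits the black background.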
I am not familiar with seaborn but the following appears to let you change
the background by setting the axes background. It can set any of the ax.set_*
elements.
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
m=pd.DataFrame({'x':['1','1','2','2','13','13'],
'y':np.random.randn(6)})
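# factorplot was renamed to catplot in later seaborn releases; kind='point' was its default
facet = sns.catplot(x='x', y='y', data=m, kind='point')
# axis_bgcolor was replaced by facecolor in later matplotlib releases
facet.set(facecolor='k')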
plt.show()
Another way is to set the theme:
seaborn.set_theme(style='white')
In newer versions of seaborn you can also use axes_style() and set_style() to quickly set the plot style to one of the predefined styles: darkgrid, whitegrid, dark, white, ticks
st = seaborn.axes_style("whitegrid")
seaborn.set_style("ticks", {"xtick.major.size": 8, "ytick.major.size": 8})
More info in seaborn docs
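axes_style() can also be used in a with statement, which applies a style temporarily instead of changing it for the whole session. A small sketch, with variable names invented for the example:

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import seaborn as sns

grid_before = plt.rcParams["axes.grid"]

# The style is active only inside the block; the figure must be created inside it.
with sns.axes_style("whitegrid"):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [2, 1, 3])
    grid_inside = plt.rcParams["axes.grid"]

grid_after = plt.rcParams["axes.grid"]  # restored to the previous value
plt.close(fig)
```

This is handy when a single plot needs a different look from the rest of a script or notebook.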