seaborn plots the same size? - python

seaborn has a convenient keyword named size= that aims to make the plots a certain size. However, the plots differ significantly in size depending on the x/y ticks and the axis labels. What is the best way to generate plots with exactly the same dimensions regardless of ticks and axis labels?

Seaborn sizing options vary by the plot type, which can be a bit confusing, so this is a useful universal approach.
First run this: import matplotlib.pyplot as plt
Then add the line plt.figure(figsize=(9, 9)) in the notebook cells for each of the plots. You can adjust the integer values as you see fit.
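If the plotting area itself has to be identical, and not just the figure, one option is to fix the axes rectangle as well as the figure size. The sketch below assumes the plot is drawn by matplotlib or by an axes-level seaborn function. The helper name fixed_size_axes and the rectangle values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt


def fixed_size_axes(fig_size=(9, 9), axes_rect=(0.15, 0.15, 0.8, 0.8)):
    """Create a figure of fixed size with one axes at a fixed position.

    axes_rect is (left, bottom, width, height) in figure fractions, so the
    plotting area does not move or shrink when tick or axis labels change.
    """
    fig = plt.figure(figsize=fig_size)
    ax = fig.add_axes(axes_rect)
    return fig, ax


# Two plots with very different tick labels and axis labels
fig_a, ax_a = fixed_size_axes()
ax_a.plot([0, 1, 2], [0, 1, 4])
ax_a.set_xlabel("short")

fig_b, ax_b = fixed_size_axes()
ax_b.plot([0, 1, 2], [0, 1000000, 4000000])
ax_b.set_ylabel("a much longer axis label")
```

Axes-level seaborn functions such as sns.heatmap accept an ax= argument, so passing ax=ax_a draws into the fixed rectangle. Nothing is resized to make room for labels, so very long labels may be clipped at the figure edge.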

Related

Python Heatmaps (Basic and Complex)

What's the best way to do a heatmap in python (2.7)? I've found the heatmap.py module, and I was wondering if people have any advice on using it, or if there are other packages that do a good job.
I'm dealing with pretty basic data, like xy = np.random.rand(1000,2) superimposed on an image.
Although there's another thing I want to try, which is doing a heatmap that's scaled to a different heatmap. E.g., I have
attempts = np.random.rand(5000,2)
successes = np.random.rand(500,2)
And I want a heatmap of the successes relative to the density of the attempts. Is this possible?
Seaborn is a pretty widely-used library for making nice-looking plots, and has a heatmap function. Seaborn uses matplotlib under the hood.
import numpy as np
import seaborn as sns
xy = np.random.rand(1000,2)
sns.heatmap(xy, yticklabels=100)
Regarding your second question, I'm not sure what you mean. But my advice would be to create a numpy array or pandas dataframe of "successes [scaled] relative to the density of the attempts", however you mean that, and then pass that scaled array or dataframe to sns.heatmap
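One possible reading of "successes relative to the density of the attempts" is a per-bin success rate: bin both point sets on the same grid and divide the counts. The sketch below assumes that reading. The 10x10 grid and the random placeholder data are arbitrary choices, not something from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
attempts = rng.random((5000, 2))   # placeholder data, as in the question
successes = rng.random((500, 2))

# Shared bin edges so both histograms describe the same grid cells
bins = 10
edges = np.linspace(0.0, 1.0, bins + 1)

attempt_counts, _, _ = np.histogram2d(attempts[:, 0], attempts[:, 1], bins=[edges, edges])
success_counts, _, _ = np.histogram2d(successes[:, 0], successes[:, 1], bins=[edges, edges])

# Successes per attempt in each cell; cells without attempts become NaN
ratio = np.divide(
    success_counts,
    attempt_counts,
    out=np.full_like(success_counts, np.nan),
    where=attempt_counts > 0,
)
```

The resulting ratio array can then be passed to sns.heatmap(ratio) or plt.imshow(ratio) like any other 2D array.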
You can plot very complex heatmaps using the Python package PyComplexHeatmap: https://github.com/DingWB/PyComplexHeatmap
https://github.com/DingWB/PyComplexHeatmap/blob/main/examples.ipynb
The most basic heatmap you can get is an image plot:
import matplotlib.pyplot as plt
import numpy as np
xy = np.random.rand(100,2)
plt.imshow(xy, aspect="auto")
plt.colorbar()
plt.show()
Note that using more points than you have pixels to show the heatmap might not make too much sense.
There are of course other methods for drawing heatmaps; you may go through the matplotlib example gallery and see which plot appeals most to you.

Determine kind of Matplotlib Axes subplot

Given a matplotlib.axes._subplots.AxesSubplot object, how do I tell what type of plot it contains? Is there a matplotlib feature that will determine this for me? For example...
I commonly plot data with pandas
import pandas as pd
df = pd.DataFrame({'y':range(10)})
line_ax = df.plot()
or
bar_ax = df.plot(kind='bar')
or
barh_ax = df.plot(kind='barh')
The matplotlib axes does not care about which plot it contains and it does not even know about it.
The question would also be how to distinguish "kinds" of plots. What kind of plot is in an axes which contains 2 bars, several markers, 2 lines and 3 arrows?
The kind argument to pandas plot function is simply a flag by which pandas decides which plotting function to call. This is independent of the axes and you may of course also have a plot produced by kind='bar' and kind='scatter' in the same axes.
So the answer is: No, there is no general way to determine the kind of plot in an axes, mainly because there is no such thing as a "kind of plot".
Of course, depending on what you'd need this type of information for, there are probably alternative ways to accomplish what you need.
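As one such alternative, an axes does keep lists of the artists drawn into it, and counting them gives a rough picture of its contents. The sketch below is a heuristic under that assumption, not a way to recover the pandas kind argument. The helper name summarize_artists is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def summarize_artists(ax):
    """Count the artists an axes holds, grouped by the container they live in."""
    return {
        "lines": len(ax.lines),              # e.g. from plot()
        "patches": len(ax.patches),          # e.g. the rectangles from bar()
        "collections": len(ax.collections),  # e.g. from scatter()
        "images": len(ax.images),            # e.g. from imshow()
    }


fig, (line_ax, bar_ax) = plt.subplots(1, 2)
line_ax.plot(range(10))
bar_ax.bar(range(10), range(10))

line_summary = summarize_artists(line_ax)
bar_summary = summarize_artists(bar_ax)
```

An axes with one line and no patches was probably produced by a line plot, but as noted above, nothing prevents a single axes from holding a mixture of all of these.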

Multiple histograms with logarithmic x scale

This is a combination of this thread on multiple histograms and this thread on logarithmic scales.
I am trying to have two (or more) histograms in a plot with a logarithmic x-scale, using this code: (with some external lists)
import numpy as np
import matplotlib.pyplot as plt
plt.hist([capacity_list, capacity_list2], np.logspace(-1,4,11))
plt.gca().set_xscale("log")
plt.show()
It works in principle; my only problem is that the logarithmic scale also seems to affect the bin width of the histograms, so one of them always has narrower bars, which doesn't look nice.
Does anybody know how to fix that?
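A possible workaround, sketched here as an assumption and not as an answer from the thread: count each dataset with np.histogram, then draw the bars manually, splitting every bin into equal slices in log space so that all bars look equally wide on the logarithmic axis. The lognormal data below is a placeholder for the external lists.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Placeholder data standing in for capacity_list and capacity_list2
capacity_list = rng.lognormal(mean=2.0, sigma=1.5, size=500)
capacity_list2 = rng.lognormal(mean=3.0, sigma=1.0, size=500)
datasets = [capacity_list, capacity_list2]

bins = np.logspace(-1, 4, 11)
log_edges = np.log10(bins)
log_widths = log_edges[1:] - log_edges[:-1]
n = len(datasets)

fig, ax = plt.subplots()
for i, data in enumerate(datasets):
    counts, _ = np.histogram(data, bins=bins)
    # Slice i of n within each bin, computed in log space and mapped back
    left = 10 ** (log_edges[:-1] + log_widths * i / n)
    right = 10 ** (log_edges[:-1] + log_widths * (i + 1) / n)
    ax.bar(left, counts, width=right - left, align="edge", label=f"dataset {i + 1}")

ax.set_xscale("log")
ax.legend()
```

Unlike plt.hist with a list of datasets, this leaves no gap between neighbouring bins; shrinking each slice slightly in log space would add one.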

matplot and seaborn figure parameters/customizations

I'm so confused between the two. Every time I make a chart in either pyplot or seaborn, I have to guess what syntax to use. For example, seaborn doesn't have a title setter, so I have to remember to use plt.title. Or, for seaborn charts, plt.xlabel doesn't work, so I have to use sns.axlabel(x,y).
And also, randomly I run into the following problem. I'm simply trying to make my seaborn jointplot bigger but I have no success trying both the plt nor the seaborn methods (any tips as to a good documentation showing all the chart parameters??? I find them scattered on the web and it seems like each solution on stack overflow is unique...which adds to the overall confusion).
Here's my code:
a = plt.figure(figsize=(30,30))
a.set_size_inches(30,30)
sns.jointplot(x='COAST',y='NORTH',data = data_df, kind = 'kde')
Notice I used the plt method and the sns.set_size_inches methods. Both gave me a small chart.
So frustrated with the random overlaps of the two libraries. Any pro tips to lessen the confusion will be greatly appreciated!
edit: This is also true for seaborn's pairplot. I have no success in changing the pairplot's size.
sns.jointplot creates its own figure instance (as @tcaswell suspected). It doesn't appear that you can tell jointplot to use an existing figure. I think you have two options:
You can give sns.jointplot the size option (renamed to height in seaborn 0.9 and later), e.g.:
sns.jointplot(x='COAST', y='NORTH', data=data_df, kind='kde', size=30)
You can alter the JointGrid figure size after creating it, using:
g=sns.jointplot(x='COAST', y='NORTH', data=data_df, kind='kde')
g.fig.set_size_inches(30,30)
I presume option 1 is the better option, as it is a built-in seaborn option

Python, matplotlib: how to set tick label values to their logarithmic values

I have some data that I plot on a semi-log plot (log-lin style, with a logarithmic scale on the y-axis). Is there a way to change the y-axis tick labels from their actual values to their logarithmic values?
As an example, consider the following code:
import matplotlib.pyplot as plt
import numpy as np
x=np.array([1,2,3,4,5])
def f(x):
    return 10**(x-1)
plt.plot(x,f(x))
plt.yscale(u'log')
plt.show()
Which produces the following plot:
(Sorry it is kind of big, I do not know how to make it smaller, feel free to edit to help out with that).
In this plot the tick labels are shown as 10^0, 10^1, 10^2, etc.; however I would like them to display as their logarithmic values: 0, 1, 2, etc.
I realize I could go back and change plt.plot(x,f(x)) to plt.plot(x,np.log10(f(x))) and then make the y-axis linear again instead of logarithmic, but I want to know if there is any way matplotlib can just change the y-axis tick labels themselves without me having to put np.log10() in all my plt.plot() calls. My reason for this is two-fold: I have many plt.plot() lines in my code and would rather not go back and change all of them, and I would also lose the logarithmically spaced minor ticks (although I'm sure there's some way to get those even with a linear axis).
EDIT: I am aware of this question which has some similarities to mine but is not the same. The person in that question wants to change the tick labels from scientific form to "normal" decimal form. I want to change my tick labels from scientific form to the logarithmic (base 10) value of the number. I am sure the answer will be similar to the one I linked but it is not obvious to me how to do it. In fact, I looked at that question before posting mine but still decided to post mine because I did not know how to apply it to my problem. Perhaps to experienced programmers it is obvious how to apply the methods of the question I linked to my situation but it isn't obvious to me so please step me through it.
If you could show me a code sample (by copying my code sample and putting in the necessary lines) how this works I would much appreciate it.
You can use a custom formatter, for example:
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
import numpy as np
x=np.array([1,2,3,4,5])
def f(x):
    return 10**(x-1)
plt.plot(x,f(x))
plt.yscale(u'log')
# Set custom tick formatting: label each major tick with its base-10 logarithm
plt.gca().yaxis.set_major_formatter(FuncFormatter(lambda x, pos: '{:g}'.format(np.log10(x))))
plt.show()
