I have a heatmap using seaborn and am trying to adjust the height of the 4th plot below. You will see that it only has 2 rows of data vs the others that have more:
I have used the following code to create the plot:
f, ax = plt.subplots(nrows=4,figsize=(20,10))
cmap = plt.cm.GnBu_r
sns.heatmap(df,cbar=False,cmap=cmap,ax=ax[0])
sns.heatmap(df2,cbar=False,cmap=cmap,ax=ax[1])
sns.heatmap(df3,cbar=False,cmap=cmap,ax=ax[2])
sns.heatmap(df4,cbar=False,cmap=cmap,ax=ax[3])
Does anyone know the next step to make the 4th plot smaller in height, and thus stretch out the other 3? The 4th plot will generally have 2-3 rows, whereas the others will usually have 6-7. Thanks very much!
As normal, it is pretty funky/tedious with matplotlib. But here it is!
f = plt.figure(constrained_layout = True)
specs = f.add_gridspec(ncols = 1, nrows = 4, height_ratios = [1,1,1,.5])
for spec, frame in zip(specs, (df, df2, df3, df4)):
    ax = sns.heatmap(frame, cbar=False, cmap=cmap, ax=f.add_subplot(spec))
You can change the heights relative to each other using height_ratios. You could also pass a width_ratios parameter if you wanted to change the relative widths. The for loop simply iterates over the grid cells and the data frames, so the plotting call is written only once.
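Since the 4th frame will not always have the same number of rows, the ratios could also be derived from the data instead of being hard-coded. The following is only a sketch under that assumption; `height_ratios_from_rows` and `plot_stacked_heatmaps` are hypothetical helper names, not part of seaborn or matplotlib.

```python
def height_ratios_from_rows(frames, min_rows=1):
    """Return one height ratio per frame, proportional to its row count."""
    # len() gives the row count for a pandas DataFrame and also works for lists.
    # min_rows keeps an empty frame from collapsing to zero height.
    return [max(len(frame), min_rows) for frame in frames]


def plot_stacked_heatmaps(frames, figsize=(20, 10)):
    """Draw one heatmap per frame, each axes sized by its number of rows."""
    # Imported lazily so the ratio helper above has no plotting dependency.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, axes = plt.subplots(
        nrows=len(frames),
        figsize=figsize,
        squeeze=False,  # always get a 2D array, even for a single frame
        gridspec_kw={"height_ratios": height_ratios_from_rows(frames)},
        constrained_layout=True,
    )
    for ax, frame in zip(axes.ravel(), frames):
        sns.heatmap(frame, cbar=False, cmap=plt.cm.GnBu_r, ax=ax)
    return fig
```

With the frames from the question this would be called as `plot_stacked_heatmaps([df, df2, df3, df4])`, which should give every heatmap row roughly the same height.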
I am trying to create a figure with three bar plots side by side. These bar plots have different yscales, but the data is fundamentally similar so I'd like all the bars to have the same width.
The only way I was able to get the bars to have the exact same width was by using sharex when creating the subplots, in order to keep the same x scale.
import matplotlib.pyplot as plt
BigData = [[100,300],[400,200]]
MediumData = [[40, 30],[50,20],[60,50],[30,30]]
SmallData = [[3,2],[11,3],[7,5]]
data = [BigData, MediumData, SmallData]
colors = ['#FC766A','#5B84B1']
fig, axs = plt.subplots(1, 3, figsize=(30,5), sharex=True)
subplot = 0
for scale in data:
    for type in range(2):
        bar_x = [x + type*0.2 for x in range(len(scale))]
        bar_y = [d[type] for d in scale]
        axs[subplot].bar(bar_x,bar_y, width = 0.2, color = colors[type])
    subplot += 1
plt.show()
This creates this figure:
The problem with this is that the x-limits of the plot are also shared, leading to unwanted whitespace. I've tried setting the x-bounds after the fact, but it doesn't seem to override sharex. Is there a way to make the bars have the same width, without each subplot also being the same width?
Additionally, is there a way to create such a plot (one with different y scales depending on the size of the data) without having to sort the data manually beforehand, as shown in my code?
Thanks!
Thanks to Jody Klymak for help finding this solution! I thought I should document it for future users.
We can make use of the 'width_ratios' GridSpec parameter. Unfortunately there's no way to specify these ratios after we've already drawn a graph, so the best way I found to implement this is to write a function that creates a dummy graph, and measures the x-limits from that graph:
def getXRatios(data, size):
    phig, aks = plt.subplots(1, 3, figsize=size)
    subplot = 0
    for scale in data:
        for type in range(2):
            bar_x = [x + type*0.2 for x in range(len(scale))]
            bar_y = [d[type] for d in scale]
            aks[subplot].bar(bar_x,bar_y, width = 0.2)
        subplot += 1
    ratios = [aks[i].get_xlim()[1] for i in range(3)]
    plt.close(phig)
    return ratios
This is essentially identical to the code that creates the actual figure, with the cosmetic aspects removed, as all we want from this dummy figure is the x-limits of the graph (something we can't get from our actual figure as we need to define those limits before we start in order to solve the problem).
Now all you need to do is call this function when you're creating your subplots:
fig, axs = plt.subplots(1, 3, figsize=(40,5), gridspec_kw = {'width_ratios':getXRatios(data,(40,5))})
As long as your getXRatios function creates its graph in the same way your actual graph is created, everything should work! Here's my output using this solution.
To save space you could re-purpose the getXRatios function to also construct your final graph, by calling itself in the arguments and giving an option to return either the ratios or the final figure. I couldn't be bothered.
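For this particular layout the ratios could probably also be computed without a dummy figure. The sketch below assumes centre-aligned bars placed at `x = i + k * bar_width`, as in the code above, and matplotlib's default proportional x-margins (which scale every subplot equally and so drop out of the ratio). It uses the full x-span rather than only the upper x-limit. `bar_span` and `width_ratios` are hypothetical helper names.

```python
def bar_span(n_positions, n_series=2, bar_width=0.2):
    """Width in data units covered by centre-aligned grouped bars."""
    # Bars sit at x = i + k * bar_width, so the outer edges are
    # -bar_width / 2 and (n_positions - 1) + (n_series - 0.5) * bar_width.
    return (n_positions - 1) + n_series * bar_width


def width_ratios(data, n_series=2, bar_width=0.2):
    """One width ratio per subplot, proportional to its x-span."""
    return [bar_span(len(scale), n_series, bar_width) for scale in data]
```

The result would be passed in the same place as before, for example `plt.subplots(1, 3, figsize=(40,5), gridspec_kw={'width_ratios': width_ratios(data)})`.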
I have been trying to merge these two plots together but have not found a built-in way to do so in the Matplotlib documentation. I want to show the two bar values next to each other and, for every new entry, add the new entry to the graph while shifting the other entries over to make space. The plots are below.
As stated prior, when I say merge, I do not simply mean just plop Plot A onto Plot B, but rather join the plots together so both bar values are shown in the same graph, like this:
The reasoning for this is that I will be able to log all the entries in a single plot without having to manually do so. By implementing something like this in my code, it would make entries go a lot quicker.
EDIT: I understand that I can graph these two together, but that is not what I am looking for. Once I get the necessary input, my program creates a graph of that data and saves it as a file. I am looking to append any new data to that original file by just shifting the original value over to the left in order to make space.
EDIT 2: How could I extract the data from each plot and after doing so, create a new graph? This would seem to be another acceptable workaround.
Is there anything preventing you from plotting each of them side by side but changing the index?
a, b, c = 2, 5, 3
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1)
count = 0
ax.bar(count, a)
# if the program produces a new output then...
count += 1
ax.bar(count, b) # index means new bar plot has shifted
# again
count += 1
ax.bar(count, c) # shifted again
This should automatically expand the x-axis anyway. You may have to alter this slightly if you're particularly concerned about the width of these bars.
If this isn't what you wanted you could consider replotting with the bar container or even just stripping the height to reuse.
fig, ax = plt.subplots(1, 1)
count = 0
bar_cont = ax.bar(count, a) # reference to the bar container of interest
# a BarContainer holds one Rectangle per bar, so read the height from each one
print([rect.get_height() for rect in bar_cont])
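Regarding the second edit in the question: a saved image file no longer contains the bar values, so one possible workaround is to keep the values in a small data file next to the image and redraw the whole chart whenever an entry is appended. This is only a sketch under that assumption; the JSON layout and the names `append_entry` and `redraw` are made up for illustration.

```python
import json
from pathlib import Path


def append_entry(store_path, label, value):
    """Append one (label, value) pair to a JSON file and return all entries."""
    path = Path(store_path)
    entries = json.loads(path.read_text()) if path.exists() else []
    entries.append({"label": label, "value": value})
    path.write_text(json.dumps(entries))
    return entries


def redraw(entries, image_path):
    """Redraw every stored entry as one bar chart and save it."""
    # Imported lazily so the bookkeeping above needs only the standard library.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # Numeric positions plus tick_label keep bars separate even if labels repeat.
    ax.bar(
        range(len(entries)),
        [e["value"] for e in entries],
        tick_label=[e["label"] for e in entries],
    )
    fig.savefig(image_path)
    plt.close(fig)
```

Each new result would then be handled with something like `redraw(append_entry("entries.json", "JOHN DOE", 14), "plot.png")`, where both file names are placeholders.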
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
people = ['JOHN DOE', 'BOB SMITH']
values = [14,14]
ax.bar(people,values)
plt.show()
This should be the solution. You just have to pass lists instead of single values to ax.bar(). More detailed explanation here.
I have a data file which consists of 131 columns and 4 rows. I am plotting it into python as follows
df = pd.read_csv('data.csv')
df.plot(figsize = (15,10))
Once it is plotted, all 131 legends are coming together like a huge tower over the line plots.
Please see the image here, which I have got :
Link to Image, I have clipped after v82 for better understanding
I have found some solutions on Stack Overflow (SO) for moving the legend anywhere in the plot, but I could not find any solution for breaking this legend tower into several smaller pieces and stacking them side by side.
Moreover, I want my plot to look something like this
My desired plot :
Any help would be appreciated. Thank you.
You can specify the position of the legend in relative coordinates using loc and use ncol parameter to split the single legend column into multiple columns. To do so, you need an axis handle returned by the df.plot
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data.csv')
ax = df.plot(figsize = (10,7))
ax.legend(loc=(1.01, 0.01), ncol=4)
plt.tight_layout()
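The value of ncol does not have to be hard-coded. As a sketch, it could be chosen so that the legend never gets taller than a given number of rows; `legend_columns` and the limit of 20 rows are assumptions for illustration, not anything matplotlib provides.

```python
import math


def legend_columns(n_labels, max_rows=20):
    """Smallest column count that keeps the legend at most max_rows tall."""
    return max(1, math.ceil(n_labels / max_rows))
```

It would be used as `ax.legend(loc=(1.01, 0.01), ncol=legend_columns(len(df.columns)))`; for 131 columns and the default limit this gives 7 legend columns.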
I am working on getting some graphs generated for 4 columns, with the COLUMN_NM being the main index.
The issue I am facing is the column names are showing along the bottom. This is problematic for 2 reasons, first being there could be dozens of these columns so the graph would look messy and could stretch too far to the right. Second being they are getting cut off (though I am sure that can be fixed)
I would prefer to have the column names listed vertically in the box where 'MAX_COL_LENGTH' currently resides, and have the bars in different colors per column instead.
Any ideas how I would adjust this or suggestions to make this better?
for col in ['DISTINCT_COUNT', 'MAX_COL_LENGTH', 'MIN_COL_LENGTH', 'NULL_COUNT']:
    grid[['COLUMN_NM', col]].set_index('COLUMN_NM').plot.bar(title=col)
plt.show()
In this case you can plot the bars one by one and set the label for each of them, so that the names appear in the legend instead of along the x-axis:
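import matplotlib.pyplot as plt
from matplotlib import gridspec
gs = gridspec.GridSpec(1,1)
fig = plt.figure(figsize=(5, 5))
ax = fig.add_subplot(gs[:, :])
data = [1,2,3,4,5]
label = ['l1','l2','l3','l4','l5']
# one bar() call per value gives each bar its own color and legend entry
for n,(p,l) in enumerate(zip(data,label)):
    ax.bar(n,p,label=l)
ax.set_xticklabels([])
ax.legend()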
This is the output for the code above:
I'm trying to make a grouped bar plot in matplotlib, following the example in the gallery. I use the following:
import matplotlib.pyplot as plt
from numpy import arange
plt.figure(figsize=(7,7), dpi=300)
xticks = [0.1, 1.1]
groups = [[1.04, 0.96],
[1.69, 4.02]]
group_labels = ["G1", "G2"]
num_items = len(group_labels)
ind = arange(num_items)
width = 0.1
s = plt.subplot(1,1,1)
for num, vals in enumerate(groups):
    print("plotting: ", vals)
    group_len = len(vals)
    gene_rects = plt.bar(ind, vals, width,
                         align="center")
    ind = ind + width
num_groups = len(group_labels)
# Make label centered with respect to group of bars
# Is there a less complicated way?
offset = (num_groups / 2.) * width
xticks = arange(num_groups) + offset
s.set_xticks(xticks)
print("xticks: ", xticks)
plt.xlim([0 - width, max(xticks) + (num_groups * width)])
s.set_xticklabels(group_labels)
My questions are:
How can I control the space between the groups of bars? Right now the spacing is huge and it looks silly. Note that I do not want to make the bars wider - I want them to have the same width, but be closer together.
How can I get the labels to be centered below the groups of bars? I tried to come up with some arithmetic calculations to position the xlabels in the right place (see code above) but it's still slightly off... it feels a bit like writing a plotting library rather than using one. How can this be fixed? (Is there a wrapper or built in utility for matplotlib where this is default behavior?)
EDIT: Reply to @mlgill: thank you for your answer. Your code is certainly much more elegant but still has the same issue, namely that the width of the bars and the spacing between the groups are not controlled separately. Your graph looks correct but the bars are far too wide -- it looks like an Excel graph -- and I wanted to make the bars thinner.
Width and margin are now linked, so if I try:
margin = 0.60
width = (1.-2.*margin)/num_items
It makes the bar skinnier, but brings the group far apart, so the plot again does not look right.
How can I make a grouped bar plot function that takes two parameters: the width of each bar, and the spacing between the bar groups, and plots it correctly like your code did, i.e. with the x-axis labels centered below the groups?
I think that since the user has to compute specific low-level layout quantities like margin and width, we are still basically writing a plotting library :)
Actually I think this problem is best solved by adjusting figsize and width; here is my output with figsize=(2,7) and width=0.3:
By the way, this type of thing becomes a lot simpler if you use the pandas wrappers (I've also imported seaborn, which is not necessary for the solution, but makes the plot a lot prettier and more modern looking in my opinion):
import matplotlib.pyplot as plt
import pandas as pd
import seaborn
seaborn.set()
df = pd.DataFrame(groups, index=group_labels)
df.plot(kind='bar', legend=False, width=0.8, figsize=(2,5))
plt.show()
The trick to both of your questions is understanding that bar graphs in Matplotlib expect each series (G1, G2) to have a total width of "1.0", counting margins on either side. Thus, it's probably easiest to set margins up and then calculate the width of each bar depending on how many of them there are per series. In your case, there are two bars per series.
Assuming you left align each bar, instead of center aligning them as you had done, this setup will result in series which span from 0.0 to 1.0, 1.0 to 2.0, and so forth on the x-axis. Thus, the exact center of each series, which is where you want your labels to appear, will be at 0.5, 1.5, etc.
I've cleaned up your code as there were a lot of extraneous variables. See comments within.
import matplotlib.pyplot as plt
import numpy as np
plt.figure(figsize=(7,7), dpi=300)
groups = [[1.04, 0.96],
[1.69, 4.02]]
group_labels = ["G1", "G2"]
num_items = len(group_labels)
# This needs to be a numpy range for xdata calculations
# to work.
ind = np.arange(num_items)
# Bar graphs expect a total width of "1.0" per group
# Thus, you should make the sum of the two margins
# plus the sum of the width for each entry equal 1.0.
# One way of doing that is shown below. You can make
# The margins smaller if they're still too big.
margin = 0.05
width = (1.-2.*margin)/num_items
s = plt.subplot(1,1,1)
for num, vals in enumerate(groups):
    print("plotting: ", vals)
    # The position of the xdata must be calculated for each of the two data series
    xdata = ind+margin+(num*width)
    # align="edge" left-aligns the bars, which is what this method of
    # calculating positions assumes (newer matplotlib centers bars by default)
    gene_rects = plt.bar(xdata, vals, width, align="edge")
# You should no longer need to manually set the plot limit since everything
# is scaled to one.
# Also the ticks should be much simpler now that each group of bars extends from
# 0.0 to 1.0, 1.0 to 2.0, and so forth and, thus, are centered at 0.5, 1.5, etc.
s.set_xticks(ind+0.5)
s.set_xticklabels(group_labels)
I read an answer that Paul Ivanov posted on Nabble that might solve this problem with less complexity. Just set the index as below. This will increase the spacing between grouped columns.
ind = np.arange(0,12,2)
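To control the bar width and the gap between groups independently, as asked in the edit to the question, the x positions can be computed from those two numbers directly. The following is a minimal sketch, not a built-in matplotlib feature; `grouped_bar_positions` and `plot_grouped_bars` are hypothetical names, and `series` is assumed to hold one list of values per bar series (like `groups` above), with one value per group.

```python
def grouped_bar_positions(num_groups, bars_per_group, bar_width, group_gap):
    """Return (left_edges, tick_centers) for a grouped bar chart.

    left_edges[k][g] is the left edge of bar k in group g;
    tick_centers[g] is the centre of group g, for the x tick label.
    """
    group_width = bars_per_group * bar_width
    pitch = group_width + group_gap  # distance between starts of adjacent groups
    left_edges = [
        [g * pitch + k * bar_width for g in range(num_groups)]
        for k in range(bars_per_group)
    ]
    tick_centers = [g * pitch + group_width / 2 for g in range(num_groups)]
    return left_edges, tick_centers


def plot_grouped_bars(series, group_labels, bar_width=0.1, group_gap=0.15):
    """Plot grouped bars with independent bar width and group spacing."""
    # Imported lazily so the position helper has no plotting dependency.
    import matplotlib.pyplot as plt

    lefts, centers = grouped_bar_positions(
        len(group_labels), len(series), bar_width, group_gap
    )
    fig, ax = plt.subplots()
    for x, vals in zip(lefts, series):
        # align="edge" because the helper returns left edges
        ax.bar(x, vals, bar_width, align="edge")
    ax.set_xticks(centers)
    ax.set_xticklabels(group_labels)
    return fig
```

Calling `plot_grouped_bars(groups, group_labels, bar_width=0.1, group_gap=0.15)` should centre each label under its group. Both parameters are in data units, so how thick the bars look on screen still depends on the figure width and the x-limits.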