Hello there!
I am trying to create a figure consisting of a choropleth map and a bar plot in Matplotlib. To achieve this, I am using the GeoPandas library alongside Pandas and Matplotlib. I've run into an interesting problem that I couldn't find an answer for on the internet. Here's the problem:
This link leads to an image that replicates the problem.
As can be seen in the image above, the map on the top (generated by GeoPandas) does not span the same width as the bar chart on the bottom. There is too much whitespace to the left and right of the map. I want to get rid of this whitespace and make the map fill the horizontal space allocated to it. Here is a code sample for those who wish to recreate it:
import matplotlib.pyplot as plt
from numpy import arange

# istanbul_districts is a GeoDataFrame; health and h_inst_per_district_eng are DataFrames (not shown here).
fig = plt.figure(figsize=(25.60, 14.40))  # overall figure size

ax_1 = fig.add_subplot(2, 1, 1)  # top: the map
istanbul_districts.plot(ax=ax_1,
                        edgecolor="black",
                        alpha=1,
                        color="red")

ax_2 = fig.add_subplot(2, 1, 2)  # bottom: the bar plot
labels = list(health.loc[:, "district_eng"].value_counts().sort_values(ascending=False).index)
bar_positions = arange(len(labels)) + 1
bar_heights = h_inst_per_district_eng.loc[:, "health_count"].values.astype(int)
ax_2.bar(bar_positions, bar_heights,
         width=0.7,
         align="center",
         color="blue")  # a generic bar plot from Matplotlib
I am leaving a second image that shows the end result of the code snippet above:
This link also leads to an image that replicates the problem.
It can be clearly seen above that the axes of the two subplots do not start and end at the same location. Perhaps that could be the problem? What can be done to make them the same size?
Thanks in advance to everyone who takes the time to answer!
Adding an explanation, since you have already found a solution.
If you create a Matplotlib figure with two axes the way you did, the figure is split in half and both axes get the same size. If, say, the figure's ratio is 1:1, both axes will have a ratio of 1:2 (height to width).
This arbitrary ratio is fine for a bar chart, which can be scaled to essentially any ratio. It does not matter much if it is horizontal or vertical (from a plotting perspective, not data-viz).
However, if you want your map to show correct, undistorted shapes, you can't choose the aspect ratio freely; it follows the data. So if the bounding box of your map has a 1:1 ratio, you can't expect it to fill the whole 1:2 axes. GeoPandas sets the aspect ratio of the axes to follow the map's ratio.
That is why the first example leaves gaps on the sides and the "solution" does not. In the solution, the leftover space is above and below the axes, so it does not show. In the original issue it is on the sides, so it stays visible. If your plots were next to each other instead of stacked, it would be the other way around.
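To make the explanation above concrete, here is a minimal sketch that uses only Matplotlib (no GeoPandas). It assumes a hypothetical rectangle, twice as wide as it is tall, as a stand-in for the map, forces an equal aspect on the top axes (roughly what happens to the map axes), and measures how much of the figure width each axes ends up occupying. The function name and the rectangle are illustrative assumptions, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def axes_width_fractions(figsize, data_w=2.0, data_h=1.0):
    """Return (map_width, bar_width) as fractions of the figure width."""
    fig = plt.figure(figsize=figsize)
    ax_map = fig.add_subplot(2, 1, 1)
    ax_bar = fig.add_subplot(2, 1, 2)

    # Stand-in for the map (assumption): a filled rectangle with the given bounding box.
    ax_map.fill([0, data_w, data_w, 0], [0, 0, data_h, data_h], color="red")
    ax_map.set_aspect("equal")  # one data unit is equally long in x and y

    ax_bar.bar([1, 2, 3], [3, 1, 2], width=0.7, color="blue")

    fig.canvas.draw()  # the aspect is applied when the figure is drawn
    widths = (ax_map.get_position().width, ax_bar.get_position().width)
    plt.close(fig)
    return widths


wide_map, wide_bar = axes_width_fractions((25.6, 14.4))
square_map, square_bar = axes_width_fractions((19, 19))
print(f"25.6 x 14.4 figure: map {wide_map:.2f}, bars {wide_bar:.2f}")
print(f"19 x 19 figure:     map {square_map:.2f}, bars {square_bar:.2f}")
```

In the wide figure the map axes is limited by its height, so it cannot use the full width. A squarer figure gives each row more height, so the map can grow wider. The exact numbers depend on the real bounding box of the map.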
Hope it is clearer.
Hello again!
swatchai's comment put me on the right track and I found the culprit. Simply changing the figsize to a square value like (19,19) fixed the problem. I'd still be happy if anyone can explain exactly why this happens.
Here's what it looks like when the figsize is a square (19,19):
Thanks for your efforts!
I have problems drawing a horizontal bar chart.
First, if we draw it with plt.bar:
plt.figure(figsize=(8,5))
plt.bar(range_df.range_start, range_df.cum_trade_vol, width=30)
plt.show()
But if we draw it with plt.barh:
plt.figure(figsize=(8, 5))
plt.barh(range_df.range_start, range_df.cum_trade_vol)
plt.show()
it's either
or like:
The problem, I think, is that the data is so crowded that too few gaps are left.
What can we do to draw the graph properly? (Since we cannot set width with barh, or can we?)
Maybe another plotting package?
Please do not reset the y-axis values, as the current values are important.
The data can be downloaded at
https://drive.google.com/file/d/1y8fHazEFhVR_u2KL6uUsBqv0qmXOd2xT/view?usp=share_link
the notebook is at:
https://colab.research.google.com/drive/1MbjJE4B-mspDRqCYXnDf8hyFRK_uLmRp?usp=sharing
Thank you
I am somewhat unsure what you are asking. When you switched to the horizontal plot, you dropped the width parameter. plt.barh has an equivalent parameter, which is height. So try
plt.barh(range_df.range_start, range_df.cum_trade_vol, height=30)
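As a minimal sketch of the suggestion above, the following uses synthetic data in place of range_df (the arrays and their spacing are assumptions, since the real data lives in the linked file). The point it illustrates is that height in barh is given in y-axis data units, just like width in bar.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for range_df (assumption): bins 50 units apart on the y axis.
range_start = np.arange(1000, 2000, 50)
cum_trade_vol = np.linspace(10, 500, len(range_start))

fig, ax = plt.subplots(figsize=(8, 5))
# height is measured in y-axis data units, so 30 leaves a gap of 20 between bars here.
bars = ax.barh(range_start, cum_trade_vol, height=30)
ax.set_xlabel("cum_trade_vol")
ax.set_ylabel("range_start")
```

The y-axis values are left untouched; only the thickness of each bar changes.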
sns.boxplot(data=df, width=0.5)
plt.title(f'Distribution of scores for initial and resubmission\
\nonly among students who resubmitted at all.\
\n(n = {df.shape[0]})')
I want to use a bigger font, and leave more space in the top white margin so that the title doesn't get crammed in. Surprisingly, I am totally unable to find the option despite some serious googling!
The basic problem you have is that the multi-line title is too tall, and is rendered "off the page".
A few options for you:
The least-effort solution is probably to use tight_layout(). plt.tight_layout() adjusts the subplot locations and spacing so that labels, ticks and titles fit more nicely.
If this isn't enough, also look at plt.subplots_adjust(), which gives you control over how much whitespace is used around one or more subplots; you can modify just one aspect at a time, and all the other settings are left alone. In your case, you could use plt.subplots_adjust(top=0.8).
If you are generating a final figure for publication or similar, you might be aiming to tweak a lot to perfect it. In this case, you can precisely control the (sub)plot locations, using add_axes (see this example https://stackoverflow.com/a/17479417).
Here is an example, with a 6-line title for emphasis. The left panel shows the default - with half the title clipped off. The right panel has all measurements the same except the top; the middle has automatically removed whitespace on all sides.
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
data = 55 + 5* np.random.randn(1000,) # some data
vlongtitle = "\n".join(["long title"]*6) # a 6-line title
# using tight_layout, all the margins are reduced
plt.figure()
sns.boxplot(data, width=0.5)
plt.title(vlongtitle)
plt.tight_layout()
# 2nd option, just edit one aspect.
plt.figure()
sns.boxplot(data, width=0.5)
plt.title(vlongtitle)
plt.subplots_adjust(top=0.72)
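The example above covers tight_layout() and subplots_adjust(), but not the add_axes option or the font size asked about in the question. Here is a minimal sketch of both, assuming a plain Matplotlib boxplot instead of seaborn and made-up axes coordinates; tune the numbers to your own figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = 55 + 5 * rng.standard_normal(1000)   # some data
title_text = "\n".join(["long title"] * 3)  # a 3-line title

fig = plt.figure()
# [left, bottom, width, height] in figure fractions; the axes top sits at 0.70,
# which leaves the upper 30% of the figure free for the title.
ax = fig.add_axes([0.12, 0.10, 0.80, 0.60])
ax.boxplot(data, widths=0.5)  # plain Matplotlib boxplot (assumption: seaborn not needed here)
title = ax.set_title(title_text, fontsize=16)  # fontsize sets the title font size in points
```

The fontsize keyword is also accepted by plt.title(), so it can be combined with either of the other two options.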
Disclaimer: I am very inexperienced using matplotlib and python in general.
Here is the figure I'm trying to make:
Using GridSpec works well for laying out the plots, but when I try to include a colorbar on the right of each row, it changes the size of the corresponding subplot. This seems to be a well-known and unavoidable problem with GridSpec. So, on the advice of this question: Matplotlib 2 Subplots, 1 Colorbar
I've decided to remake the whole plot using ImageGrid. Unfortunately the documentation only lists the options cbar_mode=[None|single|each], whereas I want 1 colorbar per row. Is there a way to do this inside a single ImageGrid, or will I have to make 2 grids and deal with the nightmare of alignment?
What about the 5th plot at the bottom? Is there a way to include that in the image grid somehow?
The only way I can see this working is to somehow nest two ImageGrids into a GridSpec in a 1x3 column. This seems overly complicated and difficult, so I don't want to build that script until I know it's the right way to go.
Thanks for any help/advice!
Ok I figured it out. It seems ImageGrid uses subplot somehow inside it. So I was able to generate the following plot using something like
TopGrid = ImageGrid( fig, 311,
nrows_ncols=(1,2),
axes_pad=0,
share_all=True,
cbar_location="right",
cbar_mode="single",
cbar_size="3%",
cbar_pad=0.0,
cbar_set_cax=True
)
<Plotting commands for the top row of plots and colorbar>
BotGrid = ImageGrid( fig, 312,
nrows_ncols=(1,2),
axes_pad=0,
share_all=True,
cbar_location="right",
cbar_mode="single",
cbar_size="3%",
cbar_pad=0.0,
)
<Plotting commands for bottom row . . .>
StemPlot = plt.subplot(313)
<plotting commands for bottom stem plot>
EDIT: the whitespace in the color plots is intentional, not some artifact from adding the colorbars
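For readers who want something runnable, here is a minimal sketch of the same layout with random placeholder images. The helper name image_row and the data are assumptions; cbar_set_cax is left out, and the colorbar is attached with fig.colorbar(..., cax=...) rather than whatever the original plotting commands used.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.axes_grid1 import ImageGrid

rng = np.random.default_rng(0)
fig = plt.figure(figsize=(6, 9))


def image_row(fig, position):
    """One row of two images that share a single colorbar on the right."""
    grid = ImageGrid(fig, position,
                     nrows_ncols=(1, 2),
                     axes_pad=0,
                     share_all=True,
                     cbar_location="right",
                     cbar_mode="single",
                     cbar_size="3%",
                     cbar_pad=0.0)
    for ax in grid.axes_all:
        im = ax.imshow(rng.random((20, 20)), vmin=0, vmax=1)  # placeholder data
    fig.colorbar(im, cax=grid.cbar_axes[0])  # one colorbar for the whole row
    return grid


top_grid = image_row(fig, 311)     # first of three stacked slots
bottom_grid = image_row(fig, 312)  # second slot

stem_ax = fig.add_subplot(313)     # ordinary subplot in the third slot
stem_ax.stem(np.arange(10), rng.random(10))
```

Because each ImageGrid takes a regular subplot position (311, 312), it can sit in the same figure as an ordinary subplot (313).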
Is it possible for matplotlib to update only the newest point in the figure instead of redrawing the whole figure?
For example, this may be the fastest way to do dynamic plotting:
Initialization:
fig1 = Figure(figsize = (8.0,8.0),dpi = 100)
axes1 = fig1.add_subplot(111)
line1, = axes1.plot([],[],animated = True)
when new data is coming:
line1.set_data(new_xarray,new_yarray)
axes1.draw_artist(line1)
fig1.canvas.update()
fig1.canvas.flush_events()
But this will redraw the whole figure! I'm wondering whether something like this is possible:
when new data is coming:
axes1.draw_only_last_point(new_x,new_y)
update_the_canvas()
It would only add the new point (new_x, new_y) to the axes instead of redrawing every point.
And if you know which graphics library for Python can do that, please answer or comment, thank you so much!
Really appreciate your help!
Is redrawing the entire figure the only problem, i.e. is it OK to redraw the line itself as long as the rest of the figure is unchanged? Is the data known beforehand?
If the answers to those questions are NO and YES, then it might be worth looking into the animation module of matplotlib. One example where the data is known beforehand, but the points are plotted one by one, is this example. In the example, the figure is redrawn if the newest point is outside of the current x-lim. If you know the range of your data, you can avoid that by setting the limits beforehand.
You might also want to look into this answer, the animation example list or the animation documentation.
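A minimal sketch of that approach, assuming the data is known beforehand (a sine curve here, purely as a placeholder): the limits are fixed up front and FuncAnimation is used with blit=True, so that only the artists returned by the update function are redrawn on each frame.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation

# Data known beforehand (assumption for this sketch).
xs = np.linspace(0, 2 * np.pi, 200)
ys = np.sin(xs)

fig, ax = plt.subplots()
# Fix the limits up front so the axes never have to be rescaled.
ax.set_xlim(xs.min(), xs.max())
ax.set_ylim(-1.1, 1.1)
(line,) = ax.plot([], [])


def init():
    line.set_data([], [])
    return (line,)


def update(frame):
    # Show the points up to and including the current frame.
    line.set_data(xs[: frame + 1], ys[: frame + 1])
    return (line,)


# blit=True redraws only the artists returned by update(), not the whole figure.
anim = FuncAnimation(fig, update, frames=len(xs), init_func=init,
                     interval=20, blit=True, repeat=False)
# In an interactive session, plt.show() would start the animation.
```

Note that the line itself is still redrawn on every frame; what is saved is the redraw of the axes, ticks and labels.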
This is my (so far limited) experience.
I started some months ago with Python (2.x) and OpenCV (2.4.13) as the graphics library. I found in my first project that OpenCV for Python works with numpy structures, as matplotlib does, and (with slight differences) they can work together.
I had to update some pixels under certain conditions. I first did my processing on the images with OpenCV, obtaining a numpy 2D array, like a matrix.
The trick is: OpenCV mainly thinks about input as images, in terms of X as width first, then Y as height. The numpy structure wants rows and columns, which in fact is Y before X.
With this in mind, I updated the image matrix A pixel by pixel and plotted it again with a colormap
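import matplotlib.pyplot as plt
import cv2
A = cv2.imread('your_image.png', 0)  # 0 means grayscale
# the image is now loaded as a numpy array A
# for every new (x, y) pixel; new_pixels is a placeholder for your own (x, y, value) data
for x, y, value in new_pixels:
    A[y, x] = value  # row (y) first, then column (x)
plot = plt.imshow(A, 'CMRmap')
plt.show()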
If you want an image file again, consider using this
import matplotlib.image as mpimg
#previous code
mpimg.imsave("newA.png", A)
If you want to work with colours, remember that colour images are height by width by 3 numpy arrays (rows, columns, channels). Matplotlib expects the channels in RGB order, while OpenCV works with BGR order. So
C = cv2.imread('colour_reference.png',1) # 1 means BGR
A[y,x,0] = newRedvalue = C[y,x][2]
A[y,x,1] = newGreenvalue = C[y,x][1]
A[y,x,2] = newBluevalue = C[y,x][0]
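The two indexing rules above can be checked with numpy alone. This sketch does not call OpenCV; it only assumes an array laid out the way cv2.imread returns a colour image (rows, columns, channels in B, G, R order) and reverses the channel axis in one step instead of copying channel by channel.

```python
import numpy as np

height, width = 2, 3
# Emulates a colour image as OpenCV returns it: shape (height, width, 3), channels B, G, R.
bgr = np.zeros((height, width, 3), dtype=np.uint8)

x, y = 2, 1                # column 2, row 1
bgr[y, x] = (10, 20, 30)   # B=10, G=20, R=30; numpy indexes as [row, column]

# Reverse the last axis to get the RGB order that matplotlib expects.
rgb = bgr[..., ::-1]

print(bgr.shape)           # (2, 3, 3)
print(rgb[y, x].tolist())  # [30, 20, 10]
```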
I hope this will help you in some way
In the graphic below, I want to put in a legend for the calendar plot. The calendar plot was made using ax.plot(...,label='a') and drawing rectangles in a 52x7 grid (52 weeks, 7 days per week).
The legend is currently made using:
plt.gca().legend(loc="upper right")
How do I correct this legend to something more like a colorbar? Also, the colorbar should be placed at the bottom of the plot.
EDIT:
Uploaded code and data for reproducing this here:
https://www.dropbox.com/sh/8xgyxybev3441go/AACKDiNFBqpsP1ZttsZLqIC4a?dl=0
Aside - existing bugs
The code you put on the dropbox doesn't work "out of the box". In particular - you're trying to divide a datetime.timedelta by a numpy.timedelta64 in two places and that fails.
You do your own normalisation and colour mapping (calling into color_list based on an int() conversion of your normalised value). You subtract 1 from this and you don't need to - you already floor the value by using int(). The result of doing this is that you can get an index of -1 which means your very smallest values are incorrectly mapped to the colour for the maximum value. This is most obvious if you plot column 'BIOM'.
I've hacked this by adding a tiny value (0.00001) to the total range of the values that you divide by. It's a hack - I'm not sure that this method of mapping is at all the best use of matplotlib, but that's a different question entirely.
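The original mapping code is not shown here, so the following is only a hypothetical reconstruction of the arithmetic described above (normalise, scale by the number of colours, truncate with int(), subtract 1). The function names and the 9-colour default are assumptions for illustration.

```python
def buggy_index(value, min_val, max_val, n_colors=9):
    """Hypothetical reconstruction of the described mapping, including the extra '- 1'."""
    frac = (value - min_val) / (max_val - min_val)
    return int(frac * n_colors) - 1


def patched_index(value, min_val, max_val, n_colors=9, eps=0.00001):
    """Same mapping without the '- 1'; eps widens the range so frac stays below 1."""
    frac = (value - min_val) / (max_val - min_val + eps)
    return int(frac * n_colors)


print(buggy_index(0.0, 0.0, 10.0))     # -1: wraps around to the last colour in the list
print(patched_index(0.0, 0.0, 10.0))   # 0: first colour
print(patched_index(10.0, 0.0, 10.0))  # 8: last colour
```

An index of -1 is valid in Python and selects the last list element, which is why the smallest values silently end up with the colour for the maximum.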
Solution adapting your code
With those bugs fixed, and adding a last subplot below all the existing ones (i.e. replacing 3 with 4 in all your calls to subplot2grid()), you can do the following:
Replace your
plt.gca().legend(loc="upper right")
with
# plot an overall colorbar type legend
# Grab the new axes object to plot the colorbar on
ax_colorbar = plt.subplot2grid((4,num_yrs), (3,0),rowspan=1,colspan=num_yrs)
mappableObject = matplotlib.cm.ScalarMappable(cmap = palettable.colorbrewer.sequential.BuPu_9.mpl_colormap)
mappableObject.set_array(numpy.array(df[col_name]))
col_bar = fig.colorbar(mappableObject, cax = ax_colorbar, orientation = 'horizontal', boundaries = numpy.arange(min_val,max_val,(max_val-min_val)/10))
# You can change the boundaries kwarg to either make the scale look less boxy (increase 10)
# or to get different values on the tick marks, or even omit it altogether to let
# matplotlib sort the boundaries out for itself
col_bar.set_label(col_name)
ax_colorbar.set_title(col_name + ' color mapping')
I tested this with two of your columns ('NMN' and 'BIOM') and on Python 2.7 (I assume you're using Python 2.x given the print statement syntax)
The finalised code that works directly with your data file is in a gist here
You get
How does it work?
It creates a ScalarMappable object that matplotlib can use to map values to colors. It sets the array on which this map is based to all the values in the column you are dealing with. It then uses Figure.colorbar() to add the colorbar, passing in the mappable object so that the labels are correct. I've added boundaries so that the minimum value is shown explicitly; you can omit that if you want matplotlib to sort that out for itself.
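A self-contained sketch of the same steps, assuming random values in place of df[col_name], Matplotlib's built-in "BuPu" colormap in place of the palettable one, and an explicit Normalize (which the code above does not use):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)
values = rng.uniform(5.0, 50.0, size=365)  # stand-in for df[col_name] (assumption)
min_val, max_val = values.min(), values.max()

fig = plt.figure(figsize=(8, 4))
ax_main = plt.subplot2grid((4, 1), (0, 0), rowspan=3, fig=fig)  # placeholder for the calendar plot
ax_colorbar = plt.subplot2grid((4, 1), (3, 0), fig=fig)         # bottom row for the colorbar

# The mappable carries the value-to-colour mapping that the colorbar displays.
mappable = ScalarMappable(norm=Normalize(vmin=min_val, vmax=max_val), cmap="BuPu")
mappable.set_array(values)
col_bar = fig.colorbar(mappable, cax=ax_colorbar, orientation="horizontal",
                       boundaries=np.linspace(min_val, max_val, 11))
col_bar.set_label("value")
```

Dropping the boundaries argument gives a smooth gradient instead of ten discrete steps.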
P.S. I've set the colormap to palettable.colorbrewer.sequential.BuPu_9.mpl_colormap, matching your get_colors() function, which gets these colours as a 9-member list. I strongly recommend importing the palette you want to use under a descriptive name, to make the use of mpl_colors and mpl_colormap easier to understand, e.g.
from palettable.colorbrewer.sequential import BuPu_9 as color_scale
Then access it as
color_scale.mpl_colormap
That way, you can keep your code DRY and change the colors with only one change.
Layout (in response to comments)
The colorbar may be a little big (certainly too tall) to be aesthetically ideal. There are a few possible ways to fix that. I'll point you to two:
The "right" way to do it is probably to use a Gridspec
You could use your existing approach, but increase the number of rows and have the colorbar still in one row, while the other elements span more rows than they do currently.
I've implemented that with 9 rows, an extra column (so that the month labels don't get lost) and the colorbar on the bottom row, spanning two fewer columns than the main figure. I've also used tight_layout with w_pad=0.0 to avoid label clashes. You can play with this to get your exact preferred size. New code here.
This gives:
There are functions to do this in matplotlib.colorbar. With some specific code from your example, I could give you a better answer, but you'll use something like:
myColorbar = matplotlib.colorbar.ColorbarBase(myAxes, cmap=myColorMap,
norm=myNorm,
orientation='vertical')