Plotting multiple subplots with different shapefiles in background - python

I am trying to plot side by side GeoPandas shapefiles using matplotlib but the titles, xlabel and ylabel are not plotting correctly.
fig, axes = plt.subplots(1,2, figsize=(10,3), sharex=True, sharey=True)
base = subs.boundary.plot(color='black', linewidth=0.1, ax=axes[0])
cluster.plot(ax=base, column='pixel', markersize=20, legend=True, zorder=2)
plt.title('THHZ')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
base = forest.boundary.plot(color='black', linewidth=0.2, ax=axes[1])
cluster.plot(ax=base, column='forest', markersize=20, legend=True, zorder=2)
plt.title('Forest')
This is what I get
This is what I want

You are mixing object-oriented and pyplot-style matplotlib calls. The plt.* functions act on whatever matplotlib currently considers the "current" axes, which is not necessarily the one you intend. More detail is in the matplotlib docs: Pyplot vs Object Oriented Interface. I don't know how that interacts with your plotting function calls (code not included in your post).
To be certain of what axes you are interacting with, use the object-oriented calls using the axes object you already have:
axes[0].set_title('THHZ')
axes[0].set_xlabel('Longitude')
axes[0].set_ylabel('Latitude')
axes[1].set_title('Forest')
You can also add fig.tight_layout() at the very end for a compacted figure layout.
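As a rough, self-contained sketch of the same idea: the random points below are only a hypothetical stand-in for the GeoPandas layers (which would be drawn with ax=axes[0] and ax=axes[1]), and the Agg backend is chosen just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, axes = plt.subplots(1, 2, figsize=(10, 3), sharex=True, sharey=True)

# Left panel: hypothetical points standing in for cluster.plot(ax=axes[0], ...)
axes[0].scatter(rng.uniform(-5, 5, 30), rng.uniform(40, 50, 30), s=20)
axes[0].set_title("THHZ")
axes[0].set_xlabel("Longitude")
axes[0].set_ylabel("Latitude")

# Right panel: every call names its own Axes, so nothing depends on the "current" axes
axes[1].scatter(rng.uniform(-5, 5, 30), rng.uniform(40, 50, 30), s=20)
axes[1].set_title("Forest")
axes[1].set_xlabel("Longitude")

fig.tight_layout()
```

Because each label is set through an explicit Axes object, the order in which the panels are drawn no longer matters.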

Matplotlib scatter plot dual y-axis

I am trying to figure out how to create a scatter plot in matplotlib with two different y-axes.
Right now I have one y-axis and need to add a second one showing the index column values.
points1 = plt.scatter(r3_load["TimeUTC"], r3_load["r3_load_MW"],
c=r3_load["r3_load_MW"], s=50, cmap="rainbow", alpha=1) #set style options
plt.rcParams['figure.figsize'] = [20,10]
#plt.colorbar(points)
plt.title("timeUTC vs Load")
#plt.xlim(0, 400)
#plt.ylim(0, 300)
plt.xlabel('timeUTC')
plt.ylabel('Load_MW')
cbar = plt.colorbar(points1)
cbar.set_label('Load')
The result I expect looks like this:
So the second scatter set should be TimeUTC vs index. Colors are not the point here ;) Also, in the Excel chart the y-axes are on different sides, but that doesn't matter.
I appreciate your help! Thanks, Paulina
Continuing after the suggestions in the comments.
There are two ways of using matplotlib:
Via the matplotlib.pyplot interface, as you were doing in your original code snippet with the plt.* calls.
The object-oriented way. This is the suggested way to use matplotlib, especially when you need more customisation, as in your case. In your code, ax1 is an Axes instance.
From an Axes instance, you can plot your data using the Axes.plot and Axes.scatter methods, very much like you did through the pyplot interface. This means you can write an Axes.scatter call instead of Axes.plot and use the same parameters as in your original code:
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.scatter(r3_load["TimeUTC"], r3_load["r3_load_MW"],
c=r3_load["r3_load_MW"], s=50, cmap="rainbow", alpha=1)
ax2.plot(r3_dda249["TimeUTC"], r3_dda249.index, c='b', linestyle='-')
ax1.set_xlabel('TimeUTC')
ax1.set_ylabel('r3_load_MW', color='g')
ax2.set_ylabel('index', color='b')
plt.show()
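If the second series should also be drawn as points (as the question asks), a minimal sketch could look like the following. The arrays are made-up stand-ins for r3_load["TimeUTC"], r3_load["r3_load_MW"] and the DataFrame index, and the Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for the columns used in the question
hours = np.arange(24)
load_mw = 100 + 20 * np.sin(hours / 24 * 2 * np.pi)
row_index = np.arange(len(hours))

fig, ax1 = plt.subplots(figsize=(10, 5))
ax2 = ax1.twinx()  # second y-axis that shares the x-axis with ax1

# Left y-axis: load, coloured by its own value
ax1.scatter(hours, load_mw, c=load_mw, s=50, cmap="rainbow")
# Right y-axis: the index values, drawn as a second scatter set
ax2.scatter(hours, row_index, c="b", marker="x")

ax1.set_xlabel("TimeUTC")
ax1.set_ylabel("Load_MW")
ax2.set_ylabel("index", color="b")
```

Each scatter call ends up as one collection on its own Axes, so the two y-scales stay independent.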

set custom tick labels on heatmap color bar

I have a list of dataframes named merged_dfs that I am looping through to get the correlation and plot subplots of heatmap correlation matrix using seaborn.
I want to customize the colorbar tick labels, but I am having trouble figuring out how to do it with my example.
Currently, my colorbar scale values from top to bottom are
[1,0.5,0,-0.5,-1]
I want to keep these values, but change the tick labels to be
[1,0.5,0,0.5,1]
for my diverging color bar.
Here is the code and my attempt:
fig, ax = plt.subplots(nrows=6, ncols=2, figsize=(20,20))
for i, (title,merging) in enumerate (zip(new_name_data,merged_dfs)):
    graph = merging.corr()
    colormap = sns.diverging_palette(250, 250, as_cmap=True)
    a = sns.heatmap(graph.abs(), cmap=colormap, vmin=-1,vmax=1,center=0,annot = graph, ax=ax.flat[i])
    cbar = fig.colorbar(a)
    cbar.set_ticklabels(["1","0.5","0","0.5","1"])
fig.delaxes(ax[5,1])
plt.show()
plt.close()
I keep getting this error:
AttributeError: 'AxesSubplot' object has no attribute 'get_array'
Several things are going wrong:
fig.colorbar(...) would create a new colorbar, by default appended to the last subplot that was created.
sns.heatmap returns an ax (indicates a subplot). This is very different to matplotlib functions, e.g. plt.imshow(), which would return the graphical element that was plotted.
You can suppress the heatmap's colorbar (cbar=False), and then create it newly with the parameters you want.
fig.colorbar(...) needs a parameter ax=... when the figure contains more than one subplot.
Instead of creating a new colorbar, you can pass the colorbar parameters to sns.heatmap via cbar_kws=.... The colorbar itself can be found via ax.collections[0].colorbar. (ax.collections[0] is where matplotlib stored the graphical object that contains the heatmap.)
Looping with an explicit index is generally discouraged in Python. It is usually more readable, easier to maintain and less error-prone to put everything into the zip command.
Now that your vmin is -1, taking the absolute value for the coloring seems to be a mistake.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
merged_dfs = [pd.DataFrame(data=np.random.rand(5, 7), columns=[*'ABCDEFG']) for _ in range(5)]
new_name_data = [f'Dataset {i + 1}' for i in range(len(merged_dfs))]
# tick positions on the colorbar; the labels shown will be their absolute values
ticks = np.linspace(-1, 1, 5)
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(12, 7))
for title, merging, ax in zip(new_name_data, merged_dfs, axes.flat):
    graph = merging.corr()
    colormap = sns.diverging_palette(250, 250, as_cmap=True)
    sns.heatmap(graph, cmap=colormap, vmin=-1, vmax=1, center=0, annot=True, ax=ax, cbar_kws={'ticks': ticks})
    # the heatmap's own colorbar is reachable through the mesh stored on the axes
    ax.collections[0].colorbar.set_ticklabels([abs(t) for t in ticks])
fig.delaxes(axes.flat[-1])  # remove the unused sixth subplot
fig.tight_layout()
plt.show()
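For comparison, here is a small matplotlib-only sketch of the points made above about fig.colorbar. It uses ax.imshow (not seaborn) on made-up data as a stand-in for the heatmap, since imshow returns the plotted image object that fig.colorbar expects. The Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data in [-1, 1], standing in for a correlation matrix
data = np.random.default_rng(0).uniform(-1, 1, size=(5, 5))
ticks = np.linspace(-1, 1, 5)  # -1, -0.5, 0, 0.5, 1

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
colorbars = []
for ax in axes:
    # imshow returns the image (a "mappable"), not the Axes
    im = ax.imshow(data, cmap="coolwarm", vmin=-1, vmax=1)
    # ax=... tells the figure which subplot this colorbar belongs to
    cbar = fig.colorbar(im, ax=ax, ticks=ticks)
    # keep the tick positions, but show their absolute values as labels
    cbar.set_ticklabels([f"{abs(t):g}" for t in ticks])
    colorbars.append(cbar)

label_texts = [t.get_text() for t in colorbars[0].ax.get_yticklabels()]
```

Passing the Axes returned by sns.heatmap to fig.colorbar fails precisely because an Axes is not such a mappable.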

Seaborn heatmap auto sizing the cells

I am trying to build a heat map which is working great if there are lot of categorical variables but not looking great when there are two to three data points as shown in the second image. I am looking for a way to auto adjust based on the data points.
here is the function
def bivariate(col1,col2,Title,cbar_size):
    temp2=modifiedloan.groupby([col1,col2,'loan_status']).id.agg('count').to_frame('count').reset_index()
    temp3=temp2.pivot_table(index=(col1,col2), columns='loan_status', values='count').fillna(0)
    temp3['default%']=(temp3[0]/(temp3[0]+temp3[1]))
    temp3=temp3.reset_index()
    temp4=temp3.pivot_table(index=col1, columns=col2, values='default%').fillna(0)
    temp5=temp3.pivot_table(index=col1, columns=col2, values=[0]).fillna(0)
    f, (ax1) = plt.subplots(nrows=1, ncols=1, figsize=(12,6))
    cmap = sns.cm.rocket_r
    sns.heatmap(temp4,linewidths=1, ax=ax1,annot=False, fmt='g',cmap=cmap,cbar=True,cbar_kws={"shrink": cbar_size})
    sns.heatmap(temp5, annot=True, annot_kws={'va':'top'}, fmt="", cbar=False,ax=ax1)
    sns.heatmap(temp4, annot=True, fmt=".1%",annot_kws={'va':'bottom'}, cbar=False,cmap=cmap)
    plt.ylim(b, t) # update the ylim(bottom, top) values
    ax1.set_title(Title)
    plt.tight_layout()
I realized the problem is with the version of seaborn I have installed. The same function worked well on one of my colleague's machines.

x axis label disappearing in matplotlib and basic plotting in python

I am new to matplotlib, and I am finding it very confusing. I have spent quite a lot of time on the matplotlib tutorial website, but I still cannot really understand how to build a figure from scratch. To me, this means doing everything manually... not using the plt.plot() function, but always setting figure, axis handles.
Can anyone explain how to set up a figure from the ground up?
Right now, I have this code to generate a double y-axis plot. But my x labels are disappearing and I don't know why:
fig, ax1 = plt.subplots()
ax1.plot(yearsTotal,timeseries_data1,'r-')
ax1.set_ylabel('Windspeed [m/s]')
ax1.tick_params('y',colors='r')
ax2 = ax1.twinx()
ax2.plot(yearsTotal,timeseries_data2,'b-')
ax2.set_xticks(np.arange(min(yearsTotal),max(yearsTotal)+1))
ax2.set_xticklabels(ax1.xaxis.get_majorticklabels(), rotation=90)
ax2.set_ylabel('Open water duration [days]')
ax2.tick_params('y',colors='b')
plt.title('My title')
fig.tight_layout()
plt.savefig('plots/my_figure.png',bbox_inches='tight')
plt.show()
Because you are using a twinx, it makes sense to operate only on the original axes (ax1).
Further, the ticklabels are not defined at the point where you call ax1.xaxis.get_majorticklabels().
If you want to set the ticks and ticklabels manually, you can use your own data to do so (although I wouldn't know why you'd prefer this over using the automatic labeling) by specifying a list or array
ticks = np.arange(min(yearsTotal),max(yearsTotal)+1)
ax1.set_xticks(ticks)
ax1.set_xticklabels(ticks)
Since the ticklabels are the same as the tickpositions here, you may also just do
ax1.set_xticks(np.arange(min(yearsTotal),max(yearsTotal)+1))
plt.setp(ax1.get_xticklabels(), rotation=70)
Complete example:
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(1)
yearsTotal = np.arange(1977, 1999)
timeseries_data1 = np.cumsum(np.random.normal(size=len(yearsTotal)))+5
timeseries_data2 = np.cumsum(np.random.normal(size=len(yearsTotal)))+20
fig, ax1 = plt.subplots()
ax1.plot(yearsTotal,timeseries_data1,'r-')
ax1.set_ylabel('Windspeed [m/s]')
ax1.tick_params('y',colors='r')
ax1.set_xticks(np.arange(min(yearsTotal),max(yearsTotal)+1))
plt.setp(ax1.get_xticklabels(), rotation=70)
ax2 = ax1.twinx()
ax2.plot(yearsTotal,timeseries_data2,'b-')
ax2.set_ylabel('Open water duration [days]')
ax2.tick_params('y',colors='b')
plt.title('My title')
fig.tight_layout()
plt.show()
Based on your code, the labels are not disappearing; they are being overwritten by these two calls:
ax2.set_xticks(np.arange(min(yearsTotal),max(yearsTotal)+1))
ax2.set_xticklabels(ax1.xaxis.get_majorticklabels(), rotation=90)
set_xticks() sets the tick locations on the axes, and set_xticklabels() sets the tick labels from a list of strings.
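To make the division of labour between the two calls concrete, here is a small sketch on made-up data (years, wind and days are hypothetical stand-ins for the variables in the question). It operates on ax1 only and passes the label strings explicitly; the Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for yearsTotal and the two time series
years = np.arange(1977, 1999)
rng = np.random.default_rng(1)
wind = np.cumsum(rng.normal(size=len(years))) + 5
days = np.cumsum(rng.normal(size=len(years))) + 20

fig, ax1 = plt.subplots()
ax1.plot(years, wind, "r-")
ax2 = ax1.twinx()
ax2.plot(years, days, "b-")

# set_xticks decides where the ticks are placed ...
ax1.set_xticks(years)
# ... and set_xticklabels decides which text is shown at those places
ax1.set_xticklabels([str(y) for y in years], rotation=90)

tick_positions = list(ax1.get_xticks())
tick_texts = [label.get_text() for label in ax1.get_xticklabels()]
```

Building the label strings from your own data avoids depending on whatever text the existing tick labels happen to hold at that moment.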

Purpose of 'ax' keyword in pandas scatter_matrix function

I'm puzzled by the meaning of the 'ax' keyword in the pandas scatter_matrix function:
pd.scatter_matrix(frame, alpha=0.5, figsize=None, ax=None, grid=False, diagonal='hist', marker='.', density_kwds={}, hist_kwds={}, **kwds)
The only clue given in the docstring for the ax keyword is too cryptic for me:
ax : Matplotlib axis object
I had a look in the pandas code for the scatter_matrix function, and the ax variable is incorporated in the following matplotlib subplots call:
fig, axes = plt.subplots(nrows=n, ncols=n, figsize=figsize, ax=ax,
squeeze=False)
But, for the life of me, I can't find any reference to an 'ax' keyword in matplotlib subplots!
Can anyone tell me what this ax keyword is for???
This is tricky here. When looking at the source of pandas scatter_matrix you will find this line right after the docstring:
fig, axes = _subplots(nrows=n, ncols=n, figsize=figsize, ax=ax, squeeze=False)
Hence, internally, a new figure and axes combination is created using the internal _subplots method. This is closely related to matplotlib's subplots command but slightly different. Here, the ax keyword is supplied as well. If you look at the corresponding source (pandas.tools.plotting._subplots) you will find these lines:
if ax is None:
    fig = plt.figure(**fig_kw)
else:
    fig = ax.get_figure()
    fig.clear()
Hence, if you supply an axes object (e.g. created using matplotlib's subplots command), pandas scatter_matrix grabs the corresponding (matplotlib) figure object and deletes its content. Afterwards a new subplots grid is created in this figure object.
All in all, the ax keyword allows you to plot the scatter matrix into a given figure (even though IMHO in a slightly strange way).
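A quick way to see this behaviour is the sketch below. It assumes a recent pandas, where the function lives at pandas.plotting.scatter_matrix rather than pd.scatter_matrix, and it uses a made-up DataFrame; the Agg backend is only there so it runs without a display.

```python
import warnings

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data with three numeric columns
df = pd.DataFrame(np.random.default_rng(0).normal(size=(50, 3)),
                  columns=["a", "b", "c"])

fig, ax = plt.subplots(figsize=(6, 6))

with warnings.catch_warnings():
    # pandas may warn that the figure containing the passed axes is being cleared
    warnings.simplefilter("ignore", UserWarning)
    grid = pd.plotting.scatter_matrix(df, ax=ax)

# grid is a 3x3 array of new Axes living in the same figure;
# the Axes that was passed in has been removed from it
```

So the passed ax acts only as a handle to its figure; the plot itself is not drawn into that single axes.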
In short, it targets a subplot within a grid.
If you have nrows=2 and ncols=2, for example, then ax allows you to plot on a specific axis by passing ax=axes[0,0] (top left) or ax=axes[1,1] (bottom right), etc.
When you create the subplots, you receive an axes variable. You can later plot (or subplot) with an element of that axes variable as above.
Take a look at the "Targeting different subplots" section of this page: http://pandas.pydata.org/pandas-docs/dev/visualization.html#targeting-different-subplots
I hope this helps.
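For ordinary pandas plot calls (not scatter_matrix), the targeting described above can be sketched like this. The DataFrame is made up for illustration, and the Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data
df = pd.DataFrame({"a": [1, 3, 2, 4], "b": [4, 2, 3, 1]})

fig, axes = plt.subplots(nrows=2, ncols=2)

# ax=... picks the subplot each Series is drawn into
top_left = df["a"].plot(ax=axes[0, 0], title="a")
bottom_right = df["b"].plot(ax=axes[1, 1], title="b")
```

The plot call returns the same Axes object it was given, so it can be customised further afterwards.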
