I'm using geopandas (Python 3.8.2) to plot variables contained in a GeoDataFrame.
I would like to plot all datasets on a single figure, each with its own colormap.
The problem is that the plot shows only the last dataset, which corresponds to 'var_5' with colormap 'Reds'. Even if I set ax = geodataframe.plot(), it does not work.
Any ideas? Many thanks!
import geopandas as gpd
import matplotlib.pyplot as plt
filename = 'myfile.geojson'
geodataframe = gpd.read_file(filename)
cmaps = ['plasma', 'Greens', 'Blues', 'binary', 'Reds']
variables = ['var_1', 'var_2', 'var_3', 'var_4', 'var_5']
plt.rcParams['figure.figsize'] = (20, 10)
ax = plt.gca()
for i, var in enumerate(variables):
    geodataframe.plot(ax=ax, column=var, cmap=cmaps[i])
plt.show()
Edit:
After taking into account the answers, I got this image:
I just stumbled upon a problem I simply cannot solve. I have a dataset with raw data, which I have uploaded here: https://file.io/oJqkZjAGyqV1
It's an Excel file with the data inside.
I then wrote some code to open it, read it, and compute the mean and SEM of my data, as below.
# Import required packages
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from pylab import cm
df = pd.read_excel("Chlorophyll_data_mod.xlsx")
#----Calculation of meanvalues and sem from raw_data---------
meandf2 = df.set_index(["Group"])
sets = []
for x in ["A","B","AB","xc"]:
    meandf3 = meandf2.filter(like=f"Chl_{x}_").reset_index()
    sets.append(meandf3)
#---------Grouping DataFrame----------#
means = []
ster = []
for x in range(len(sets)):
    meandf = sets[x].groupby(["Group"]).mean()
    meandf = meandf.reset_index()
    means.append(meandf)
    sems = sets[x].groupby("Group").sem()
    sems = sems.reset_index()
    ster.append(sems)
#----Selecting Dataframe from List-----#
plotdf = means[0]
ploter = ster[0]
plotgroup = plotdf.iloc[:,[0,]]
plotdata = plotdf.iloc[:,[1,]]
grouparray = plotgroup.to_numpy()
dataarray = plotdata.to_numpy()
#-----CreatePlot------#
fig, ax = plt.subplots(nrows=3, ncols=1, sharex="all", figsize=(10,8))
plotdf.plot(ax=ax[0,],x="Group",y="Chl_A_0D", kind="bar", legend=False, color="black")
plt.errorbar(x=plotdf["Group"], y=plotdf["Chl_A_0D"],yerr=ploter["Chl_A_0D"])
plotdf.plot(ax=ax[1,],x="Group",y="Chl_A_10DaT", kind="bar", legend=False, color="blue")
plt.errorbar(x=plotdf["Group"], y=plotdf["Chl_A_10DaT"],yerr=ploter["Chl_A_10DaT"])
plotdf.plot(ax=ax[2,],x="Group",y="Chl_A_7DaR", kind="bar", legend=False, color="magenta")
plt.errorbar(x=plotdf["Group"], y=plotdf["Chl_A_7DaR"],yerr=ploter["Chl_A_7DaR"])
#----Legend of the Plot-----#
fig.legend(loc="lower center", bbox_to_anchor=(0.5,0), fancybox=True, ncol=6)
#----Layout------#
plt.tight_layout(rect=[0, 0.02, 1,1])
plt.show()
I managed to create a figure with subplots showing three of the data columns I am interested in. However, I am struggling with the error bars.
My approach was to calculate the SEM and store it in a new DataFrame, then read it from there for yerr. However, this doesn't work:
plotdf.plot(ax=ax[2,],x="Group",y="Chl_A_7DaR", kind="bar", legend=False, color="magenta", yerr=ploter["Chl_A_7DaR"])
This results in an array error because of the structure.
My current approach, as in the main code above, only draws the error bars in the last subplot, not in each individual plot.
Maybe someone here could help me understand this function?
Best regards
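One possible way to get error bars in every subplot is sketched below. plt.errorbar draws on the current axes (here the last subplot created), so the sketch calls the Axes-level bar method on each panel and passes yerr there. The data values are made up for illustration; only the column names come from the question, and the sketch assumes the means and SEMs share the same Group index.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up raw data standing in for the Excel sheet (values are hypothetical)
raw = pd.DataFrame({
    "Group": ["ctrl", "ctrl", "ctrl", "treat", "treat", "treat"],
    "Chl_A_0D": [1.0, 1.2, 1.1, 0.9, 1.0, 1.1],
    "Chl_A_10DaT": [0.8, 0.9, 1.0, 0.5, 0.6, 0.4],
    "Chl_A_7DaR": [1.1, 1.0, 1.2, 0.8, 0.9, 0.7],
})

grouped = raw.groupby("Group")
means = grouped.mean()  # index is Group, one column per measurement
sems = grouped.sem()    # same shape and index as means

columns = ["Chl_A_0D", "Chl_A_10DaT", "Chl_A_7DaR"]
colors = ["black", "blue", "magenta"]

fig, axes = plt.subplots(nrows=3, ncols=1, sharex="all", figsize=(10, 8))
for ax, col, color in zip(axes, columns, colors):
    # yerr is passed to the Axes method, so the error bars land in this subplot
    ax.bar(means.index, means[col], yerr=sems[col], color=color,
           ecolor="gray", capsize=4, label=col)
    ax.set_ylabel(col)

fig.legend(loc="lower center", ncol=3)
fig.tight_layout(rect=[0, 0.05, 1, 1])
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Keeping Group as the index of both means and sems avoids the reset_index step and keeps the values aligned by label.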
I'm trying to add markeredgecolor to my plot (marker='.', and I want the markers to be surrounded by different colors depending on their characteristics).
I tried to do it like this, with geographic data: https://python-graph-gallery.com/131-custom-a-matplotlib-scatterplot/
fig, ax = plt.subplots(figsize = (8,6))
df.plot(ax=ax,color='green', marker=".", markersize=250)
df2.plot(ax=ax,color='green', marker=".", markerfacecolor="orange", markersize=250)
However, I get this error:
AttributeError: 'PathCollection' object has no property 'markeredgecolor'
Do you know what the problem is and what to do?
Edit - with a reproducible example:
#Packages needed
import pandas as pd
import matplotlib.pyplot as plt
import geopandas as gpd
import shapely.wkt
#Creating GeoDataFrame df2
df2 = pd.DataFrame([shapely.wkt.loads('POINT (7.23173 43.68249)'),shapely.wkt.loads('POINT (7.23091 43.68147)')])
df2.columns=['geometry']
df2 = gpd.GeoDataFrame(df2)
df2.crs = {'init' :'epsg:4326'}
#Plotting df2
fig, ax = plt.subplots()
df2.plot(ax=ax,color='green', marker=".", markerfacecolor="orange")
The GeoPandas plot method does not accept all the arguments that matplotlib's plot does (see the reference: https://geopandas.readthedocs.io/en/latest/docs/reference/api/geopandas.GeoDataFrame.plot.html).
However, there are some **style_kwds that can do the job for you (though they are not clearly explained in the docs).
Taking your code again:
df2.plot(
    ax=ax,
    color='green',
    marker=".",
    markersize=1000,
    edgecolor='orange',  # this modifies the color of the surrounding circle
    linewidth=5  # this modifies the width of the surrounding circle
)
This was tested with the following configuration in requirements.txt:
geopandas~=0.9.0
matplotlib~=3.3.4
pandas~=1.2.3
shapely~=1.7.1
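If the goal is a different edge color per marker, one option is to pass a list of colors. The sketch below uses plain Matplotlib scatter rather than GeoPandas, on the assumption (suggested by the 'PathCollection' in the error message) that point geometries end up in a scatter call. The kinds list and the color mapping are hypothetical.

```python
import matplotlib.pyplot as plt

# Coordinates taken from the two points in the question
lons = [7.23173, 7.23091]
lats = [43.68249, 43.68147]

# Hypothetical characteristic per point and a color assigned to each value
kinds = ["a", "b"]
edge_for_kind = {"a": "orange", "b": "blue"}
edge_colors = [edge_for_kind[k] for k in kinds]

fig, ax = plt.subplots()
points = ax.scatter(
    lons, lats,
    s=1000,                  # marker area in points^2
    marker=".",
    c="green",               # face color shared by all markers
    edgecolors=edge_colors,  # one edge color per marker
    linewidths=5,
)
```

With GeoPandas, the same list can be tried as edgecolor=... in df2.plot, but whether a per-point list is forwarded unchanged may depend on the GeoPandas version.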
Hello everyone. In the picture you can see a sample of my code (it repeats for i==6) and the outputs. Can someone tell me how to add coastlines/boundaries to the maps? ax.coastlines() failed. Thank you.
I think the problem is that your axes is not a GeoAxes. To make it a GeoAxes, you have to tell matplotlib which projection (e.g. PlateCarree) you would like to use.
What you could do is use the cartopy library and add the projection keyword to your subplots. See the example below:
import xarray as xr
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
# Create sample data
lon = np.arange(129.4, 153.75+0.05, 0.25)
lat = np.arange(-43.75, -10.1+0.05, 0.25)
data = 10 * np.random.rand(len(lat), len(lon))
data_array = xr.Dataset({"DC": (["lat", "lon"], data),
                         'DMC': (["lat", "lon"], data),
                         'FFMC': (["lat", "lon"], data)},
                        coords={"lon": lon,"lat": lat})
# Just checking the datasets are not empty
print(data_array)
# Plotting
fig, axes = plt.subplots(nrows=3, ncols=3, subplot_kw={'projection': ccrs.PlateCarree()}) # Use PlateCarree projection
data_array['DC'].plot(ax=axes[0,0], cmap='Spectral_r', add_colorbar=True, extend='both')
axes[0,0].coastlines() # Add coastlines
I have produced a boxplot/swarmplot graph using Matplotlib/Seaborn with pandas. Some outliers can be seen in the graph (as dots outside the "whiskers"/"fence" area). I am looking for a way to trim the dataset directly after the outliers have been identified in the graph, without removing them from the original dataset. I do not want to simply hide the outlier dots.
Some methods have been recommended, and pandas quantile looks promising, but I am not sure how to implement these with the code I have been using.
My graph with the outliers.
The code I used to produce this graph. The data has been organized into the tidy format.
# Import libraries and modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Set seaborn style
sns.set(style="whitegrid", palette="colorblind")
# load length tidy data
length_tidy = pd.read_csv('results/tidy/length_tidy.csv')
score_tidy = pd.read_csv('results/tidy/score_tidy.csv')
# Define and save boxplot and swarmplot for length data
fig, ax = plt.subplots(figsize=(10,6))
ax = sns.boxplot(x='Metric', y='Length', data=length_tidy, ax=ax)
ax = sns.swarmplot(x="Metric", y="Length", data=length_tidy, color=".25")
ax.set_xlabel('Condition')
ax.set_ylabel('Length in micrometers')
plt.savefig('statistics/boxplot/length_boxplot.png', dpi=300)
fig, ax = plt.subplots(figsize=(10,6))
ax = sns.boxplot(x='Metric', y='Score', data=score_tidy, ax=ax)
ax = sns.swarmplot(x="Metric", y="Score", data=score_tidy, color=".25")
ax.set_xlabel('Condition')
ax.set_ylabel('Score')
plt.savefig('statistics/boxplot/score_boxplot.png', dpi=300)
An example of some of the data I am working with in the CSV format.
Object,Metric,Length
M11,B2A10,1.807782
MT1,B2A10,3.2207116666666664
MT1,B2A1,3.57675
MT1,B2A2,2.9474600000000004
MT1,B2A3,2.247772857142857
MT1,B2A4,3.754455
MT1,B2A5,2.716282
MT1,B2A6,2.91325
MT1,B2A7,1.24806
MT1,B2A8,2.00371875
MT1,B2A9,1.5435599999999998
MT1,B2B1,2.2051515384615388
MT1,B2B2,1.5278873333333332
MT1,B2B3,1.7283750000000002
MT1,B2B4,1.4547385714285714
MT1,B2B5,3.237578333333333
MT1,B2B6,2.47016
MT1,B2B7,2.1185947777777776
MT1,B2B8,1.8502877777777773
MT10,B2A10,3.07143
MT10,B2A1,3.34361
MT10,B2A2,2.889958333333333
MT10,B2A3,2.22087
MT10,B2A4,2.87669
MT10,B2A5,1.6745005555555557
MT10,B2A7,2.09018
MT10,B2A8,2.4947450000000004
MT10,B2B1,1.849095882352941
MT10,B2B2,1.5291758000000002
MT10,B2B5,1.6423770999999998
MT10,B2B6,1.9680385714285715
MT10,B2B7,1.7207240000000001
MT10,B2B8,2.9618275
MT12,B2A10,1.7243058333333334
MT12,B2A1,3.3938900000000003
MT12,B2A2,2.00601
MT12,B2A3,2.1720200000000003
MT12,B2A4,2.452923333333333
MT12,B2A5,2.986948
MT12,B2A7,2.08466
MT12,B2A8,1.29047
MT12,B2B1,2.528839230769232
MT12,B2B2,1.4011425454545454
MT12,B2B5,1.626078333333333
MT12,B2B6,1.074394454545455
MT12,B2B7,2.0897078571428573
MT12,B2B8,1.4102533333333336
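A minimal sketch of one way to do this with pandas quantile, assuming "outlier" means what the boxplot shows by default: points further than 1.5 × IQR from the quartiles, computed separately for each Metric. The helper name trim_outliers and the small DataFrame are made up for illustration. The original DataFrame is left untouched.

```python
import pandas as pd

def trim_outliers(df, group_col, value_col, k=1.5):
    """Return a copy of df without rows outside [Q1 - k*IQR, Q3 + k*IQR] per group."""
    grouped = df.groupby(group_col)[value_col]
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    keep = df[value_col].between(q1 - k * iqr, q3 + k * iqr)
    return df[keep].copy()

# Hypothetical data in the same tidy layout as the CSV sample
length_tidy = pd.DataFrame({
    "Metric": ["B2A1"] * 6 + ["B2A2"] * 6,
    "Length": [3.1, 3.3, 3.4, 3.5, 3.6, 9.0,
               2.0, 2.1, 2.2, 2.3, 2.4, 2.5],
})

length_trimmed = trim_outliers(length_tidy, "Metric", "Length")
print(len(length_tidy), len(length_trimmed))  # prints: 12 11
```

The trimmed frame can then be passed as data=length_trimmed to sns.boxplot and sns.swarmplot, while length_tidy keeps every row.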
I need to change the colors of a boxplot drawn with the pandas plotting utility. I can change most properties using the color argument, but I can't figure out how to change the facecolor of the box. Does anyone know how to do it?
import pandas as pd
import numpy as np
data = np.random.randn(100, 4)
labels = list("ABCD")
df = pd.DataFrame(data, columns=labels)
props = dict(boxes="DarkGreen", whiskers="DarkOrange", medians="DarkBlue", caps="Gray")
df.plot.box(color=props)
While I still recommend seaborn and raw matplotlib over the plotting interface in pandas, it turns out that you can pass patch_artist=True as a kwarg to df.plot.box, which will pass it as a kwarg to df.plot, which will pass it as a kwarg to matplotlib.axes.Axes.boxplot.
import pandas as pd
import numpy as np
data = np.random.randn(100, 4)
labels = list("ABCD")
df = pd.DataFrame(data, columns=labels)
props = dict(boxes="DarkGreen", whiskers="DarkOrange", medians="DarkBlue", caps="Gray")
df.plot.box(color=props, patch_artist=True)
As suggested, I ended up creating a function to plot this, using raw matplotlib.
def plot_boxplot(data, ax):
    bp = ax.boxplot(data.values, patch_artist=True)
    for box in bp['boxes']:
        box.set(color='DarkGreen')
        box.set(facecolor='DarkGreen')
    for whisker in bp['whiskers']:
        whisker.set(color="DarkOrange")
    for cap in bp['caps']:
        cap.set(color="Gray")
    for median in bp['medians']:
        median.set(color="white")
    ax.axhline(0, color="DarkBlue", linestyle=":")
    ax.set_xticklabels(data.columns)
I suggest using df.plot.box with patch_artist=True and return_type='both' (which returns the matplotlib axes the boxplot is drawn on and a dictionary whose values are the matplotlib artists of the boxplot: lines, plus the box patches when patch_artist=True) in order to have the best customization possibilities.
For example, given this data:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(
data=np.random.randn(100, 4),
columns=list("ABCD")
)
you can set a specific color for all the boxes:
fig,ax = plt.subplots(figsize=(9,6))
ax,props = df.plot.box(patch_artist=True, return_type='both', ax=ax)
for patch in props['boxes']:
    patch.set_facecolor('lime')
plt.show()
you can set a specific color for each box:
colors = ['green','blue','yellow','red']
fig,ax = plt.subplots(figsize=(9,6))
ax,props = df.plot.box(patch_artist=True, return_type='both', ax=ax)
for patch,color in zip(props['boxes'],colors):
    patch.set_facecolor(color)
plt.show()
you can easily integrate a colormap:
colors = np.random.randint(0,10, 4)
# plt.get_cmap is used here because plt.cm.get_cmap was deprecated and later removed in newer Matplotlib releases
cm = plt.get_cmap('rainbow')
colors_cm = [cm((c-colors.min())/(colors.max()-colors.min())) for c in colors]
fig,ax = plt.subplots(figsize=(9,6))
ax,props = df.plot.box(patch_artist=True, return_type='both', ax=ax)
for patch,color in zip(props['boxes'],colors_cm):
    patch.set_facecolor(color)
# to add colorbar (the colormap is taken from the ScalarMappable)
fig.colorbar(plt.cm.ScalarMappable(
    plt.cm.colors.Normalize(min(colors),max(colors)),
    cmap='rainbow'
), ax=ax)
plt.show()