Here is my code:
import pandas as pd
import matplotlib.pyplot as plt
wine = pd.read_csv('red wine quality.csv')
wine = wine.dropna()
plt.figure()
wine.plot.scatter(x = 'pH', y = 'alcohol', c = 'quality', alpha = 0.4,\
cmap = plt.get_cmap('jet'), colorbar = True)
plt.savefig('scatter plot.png')
plt.tight_layout()
plt.show()
Here is the plot that I get:
I get a scatter plot with the y-axis labeled as 'alcohol' ranging from 9-15 and the color bar labeled as 'quality' ranging from 3-8. I thought I had designated in my code that the x-axis would show up labeled as 'pH', but I get nothing. I have tried adjusting my figsize down to [8, 8], setting dpi to 100, and labeling the axes, but nothing will make the x-axis show up. What am I doing wrong?
Here is an MCVE to play with:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.randn(50, 4), columns=['A', 'B', 'C', 'D'])
plt.figure()
df.plot.scatter(x = 'A', y = 'B', c = 'C', alpha = 0.4,\
cmap = plt.get_cmap('jet'), colorbar = True)
plt.tight_layout()
plt.show()
The x-axis label and minor tick labels not showing on a Pandas scatter plot is a known and open issue with Pandas (see BUG: Scatterplot x-axis label disappears with colorscale when using matplotlib backend · Issue #36064).
This bug occurs with Jupyter notebooks displaying Pandas scatterplots that have a colormap while using Matplotlib as the plotting backend. The simplest workaround is passing sharex=False to pandas.DataFrame.plot.scatter.
See Make pandas plot() show xlabel and xvalues
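As a minimal sketch of that workaround, here is the MCVE from the question with sharex=False added. The fixed random seed is only an assumption for reproducibility. Since the bug is reported for Jupyter notebooks, the label may show either way in a plain script.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random demo data, as in the MCVE (the seed is an arbitrary choice).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((50, 4)), columns=["A", "B", "C", "D"])

# sharex=False is the workaround from the linked issue: it stops pandas
# from hiding the x-axis label and tick labels when the colorbar is added.
ax = df.plot.scatter(x="A", y="B", c="C", alpha=0.4, cmap="jet",
                     colorbar=True, sharex=False)
ax.figure.tight_layout()
```

Call plt.show() or ax.figure.savefig(...) afterwards as usual; calling tight_layout before saving makes sure the saved file uses the adjusted layout.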
Related
I have a series of scatterplots (one example below), but I want to modify it so that the colors of the points in the plot become more red (or "hot") when they are clustered more closely with other points, while points that are spread out further are colored more blue (or "cold"). Is it possible to do this?
Currently, my code is pretty basic in its setup.
import plotly.express as px
fig = px.scatter(data, x='A', y='B', trendline='ols')
Using scipy.stats.gaussian_kde you can calculate the density and then use this to color the plot:
import pandas as pd
import plotly.express as px
from scipy import stats
df = pd.DataFrame({
'x':[0,0,1,1,2,2,2.25,2.5,2.5,3,3,4,2,4,8,2,2.75,3.5,2.5],
'y':[0,2,3,2,1,2,2.75,2.5,3,3,4,1,5,4,8,4,2.75,1.5,3.25]
})
kernel = stats.gaussian_kde([df.x, df.y])
df['z'] = kernel([df.x, df.y])
fig = px.scatter(df, x='x', y='y', color='z', trendline='ols', color_continuous_scale=px.colors.sequential.Bluered)
How do you change the colors of the y-axis labels in a joyplot using joypy package?
Here is some sample code where I can change the color of the x-axis labels, but not the y-axis.
import joypy
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
## DATA
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
new_names = ['SepalLength','SepalWidth','PetalLength','PetalWidth','Name']
iris = pd.read_csv(url, names=new_names, skiprows=0, delimiter=',')
## PLOT
fig, axes = joypy.joyplot(iris)
## X AXIS
plt.tick_params(axis='x', colors='red')
## Y AXIS (NOT WORKING)
plt.tick_params(axis='y', colors='red')
I'm pretty sure the issue is that there are multiple sub-y-axes, one for each density plot, and they are actually hidden already.
Not sure how to access the y-axis that is actually shown (I want to change the color of "SepalLength")
Joyplot is using Matplotlib
r-beginners' comment worked for me. If you want to change the colors of all the y-axis labels, you can iterate through them like this:
for ax in axes:
    label = ax.get_yticklabels()
    ax.set_yticklabels(label, fontdict={'color': 'r'})
This results in a warning that set_yticklabels() should only be used after fixing the tick positions with set_yticks (see documentation here), but with joypy it didn't result in any errors for me.
Here's another solution that just changes the color of the label directly:
for ax in axes:
    label = ax.get_yticklabels()
    label[0].set_color('red')
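A third option that avoids set_yticklabels altogether is to call tick_params on each axes object instead of through plt, which only touches the current axes. The sketch below uses plain Matplotlib subplots as a stand-in for the axes list that joypy returns, so it runs without joypy. The tick setup inside the loop is an assumption about how such axes might look, not something taken from joypy's source.

```python
import matplotlib.pyplot as plt

# Stand-in for the list of axes returned by joypy.joyplot (assumption).
names = ["SepalLength", "SepalWidth", "PetalLength"]
fig, axes = plt.subplots(nrows=len(names), sharex=True)

for ax, name in zip(axes, names):
    # Give each axes a single labelled y tick, mimicking a joyplot row.
    ax.set_yticks([0])
    ax.set_yticklabels([name])

# Color the y tick labels on every axes, not just the current one.
for ax in axes:
    ax.tick_params(axis="y", labelcolor="red")
```

With joypy, only the last loop would be needed, run over the axes returned by joypy.joyplot(iris).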
I'm trying to visualize correlations using a heatmap in matplotlib (1.4.3), which works fine. I'd like to highlight specific cells/points in the heatmap, and my first guess was to overlay a second plot that creates the highlights. As imshow creates a new window, this does not work as intended, though. A condensed version of my code is below. Is there another way to render something matrix-like on top of an existing figure?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(4, 4), columns=list('ABCD'))
corrmatrix = df.corr()
fig, ax = plt.subplots()
im = ax.imshow(corrmatrix, cmap='afmhot', interpolation='none')
plt.colorbar(im)
ax.set_xticks(np.arange(len(df.columns)))
ax.set_xticklabels(df.columns)
ax.set_yticks(np.arange(len(df.columns)))
ax.set_yticklabels(df.columns)
relevant_cells = df > 0.9
rel_ax = ax.imshow(relevant_cells, cmap='YlOrBr', interpolation='none')
plt.show()
You can emphasize cells by overlaying the second heatmap on the first and making it partially transparent with alpha. The color map has been changed intentionally for clarity. For example, for the case where cells C,C and A,C are True:
rel_ax = ax.imshow(relevant_cells, cmap='Blues', interpolation='none', alpha=0.7)
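If the overlay should not tint the cells that are not highlighted, one variant is to mask those cells so that imshow leaves them fully transparent. This is only a sketch and not part of the answer above: it assumes the threshold is meant to apply to corrmatrix rather than df, and the seed and the names relevant and highlight are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
df = pd.DataFrame(rng.random((4, 4)), columns=list("ABCD"))
corrmatrix = df.corr()

fig, ax = plt.subplots()
im = ax.imshow(corrmatrix, cmap="afmhot", interpolation="none")
fig.colorbar(im)

# Boolean matrix of cells to highlight (assumed: threshold on correlations).
relevant = corrmatrix.to_numpy() > 0.9

# Mask everything that is NOT relevant; masked cells are drawn transparent.
highlight = np.ma.masked_where(~relevant, relevant.astype(float))
ax.imshow(highlight, cmap="Blues", vmin=0, vmax=1, alpha=0.7,
          interpolation="none")
```

Both images are drawn on the same ax, so no new window or figure is created.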
I am having an issue trying to superimpose plots with seaborn. I am able to generate the two plots separately as
fig, (ax1,ax2) = plt.subplots(ncols=2,figsize=(30, 7))
sns.lineplot(data=data1, y='MSE',x='pct_gc',ax=ax1)
sns.boxplot(x="pct_gc", y="MSE", data=data2,ax=ax2,width=0.4)
The output looks like this:
But the problem appears when I try to superimpose both plots by assigning both to the same ax object:
fig, (ax1,ax2) = plt.subplots(ncols=2,figsize=(30, 7))
sns.lineplot(data=data1, y='MSE',x='pct_gc',ax=ax1)
sns.boxplot(x="pct_gc", y="MSE", data=data2,ax=ax1,width=0.4)
I cannot figure out why the x-axis of the line plot changes when the two plots are superimposed (in both plots the x-axis goes from 0 to 0.069).
My goal is for both plots to be superimposed, while keeping the same X axis range.
Seaborn's boxplot creates a categorical x-axis, with all boxes evenly spaced. Internally the x-axis is numbered 0, 1, 2, ..., but externally it gets the labels from 0 to 0.069.
To combine a line plot with a boxplot, matplotlib's boxplot can be called directly, so that positions and widths can be set explicitly. When patch_artist=True, a rectangle is created (instead of just lines), for which a facecolor can be given. manage_ticks=False prevents boxplot from changing the x ticks and their limits. Optionally, notch=True would accentuate the median a bit more, but depending on the data, the confidence interval might be too large and look weird.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
data1 = pd.DataFrame({'pct_gc': np.linspace(0, 0.069, 200), 'MSE': np.random.normal(0.02, 0.1, 200).cumsum()})
data1['pct_range'] = pd.cut(data1['pct_gc'], 10)
fig, ax1 = plt.subplots(ncols=1, figsize=(20, 7))
sns.lineplot(data=data1, y='MSE', x='pct_gc', ax=ax1)
for interval, color in zip(np.unique(data1['pct_range']), plt.cm.tab10.colors):
    ax1.boxplot(data1[data1['pct_range'] == interval]['MSE'],
                positions=[interval.mid], widths=0.4 * interval.length,
                patch_artist=True, boxprops={'facecolor': color},
                notch=False, medianprops={'color':'yellow', 'linewidth':2},
                manage_ticks=False)
plt.show()
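To see the categorical numbering described above for yourself, you can inspect the tick positions that seaborn's boxplot sets. This is a small sketch with made-up data2 values (the original data2 is not shown in the question), relying on seaborn's default categorical treatment of the x variable.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in for data2: three distinct pct_gc values, two rows each.
data2 = pd.DataFrame({
    "pct_gc": [0.0, 0.0, 0.0345, 0.0345, 0.069, 0.069],
    "MSE": [0.1, 0.2, 0.3, 0.5, 0.4, 0.6],
})

fig, ax = plt.subplots()
sns.boxplot(x="pct_gc", y="MSE", data=data2, ax=ax, width=0.4)

# The boxes sit at integer positions, regardless of the pct_gc values.
tick_positions = list(ax.get_xticks())
print(tick_positions)
```

A line plot drawn on the same ax with x values between 0 and 0.069 therefore ends up squeezed near the first box, which is why the answer places the boxes at explicit positions instead.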
I have a seaborn heatmap and I would like to plot a lineplot on top of it while using the same x and y axis that the heatmap is using.
I expected the line to behave like in this post and take up most of the space of the heatmap, but instead the output I got was the following plot where it only occupied a small section of the heatmap. How can I make the line take up most of the space in the heatmap?
Below is the minimal working example that produced the plot I linked above.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
num = 11
a = np.eye(num)
x = np.round(np.linspace(0, 1, num=num), 1)
y = np.round(np.linspace(0, 1, num=num), 1)
df = pd.DataFrame(a, columns=x, index=y)
f, ax = plt.subplots()
ax = sns.heatmap(df, cbar=False)
ax.axes.invert_yaxis()
sns.lineplot(x=x, y=y)
plt.show()
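Perhaps just a simple fix here: scale the line's coordinates to the heatmap's axis units, which run from 0 to num rather than from 0 to 1: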
sns.lineplot(x=x*num, y=y*num)