When I try to plot the following data in Python, I do not see the green portion of my graph. The code is below. Note that I am using Python 2.7.4.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline
# Use a name other than `range` so the builtin is not shadowed
date_index = pd.date_range('2015-01-01', '2015-12-31', freq='15min')
df = pd.DataFrame(index=date_index)
df
# Average speed in miles per hour
df['speed'] = np.random.randint(low=0, high=60, size=len(df.index))
# Distance in miles (speed * 0.25 hours, i.e. one 15-minute interval)
df['distance'] = df['speed'] * 0.25
# Cumulative distance travelled
df['cumulative_distance'] = df.distance.cumsum()
df.head()
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.plot(df.index, df['speed'], 'g-')
ax2.plot(df.index, df['distance'], 'b-')
ax1.set_xlabel('Date')
ax1.set_ylabel('Speed', color='g')
ax2.set_ylabel('Distance', color='b')
plt.show()
plt.rcParams['figure.figsize'] = 12,5
Speed and distance are directly proportional here (distance = speed * 0.25), so once each y-axis is autoscaled the two curves trace exactly the same shape. If you normalize the speed and distance sets, you get exactly the same graph. Because both lines are drawn with alpha=1 (opaque), the only color you see is the last one drawn (blue). If you use alpha < 1:
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.plot(df.index, df['speed'], 'g-', alpha=0.5)
ax2.plot(df.index, df['distance'], 'b-', alpha=0.1)
ax1.set_xlabel('Date')
ax2.set_ylabel('Distance', color='b')
ax1.set_ylabel('Speed', color='g')
plt.show()
plt.rcParams['figure.figsize'] = 12,5
you see the green color (in fact a mixture of green and blue):
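The proportionality claim can be checked numerically by min-max normalizing both series and comparing them. This is a minimal sketch that assumes the same construction as in the question (distance = speed * 0.25); the helper name normalize and the seeded generator are illustrative choices, not part of the original code.

```python
import numpy as np


def normalize(values):
    """Min-max scale an array to the range [0, 1]."""
    values = np.asarray(values, dtype=float)
    return (values - values.min()) / (values.max() - values.min())


# Assumption: random speeds, generated the same way as in the question
rng = np.random.default_rng(0)
speed = rng.integers(low=0, high=60, size=1000)
distance = speed * 0.25

# If the two normalized series match, the autoscaled curves overlap exactly
curves_coincide = np.allclose(normalize(speed), normalize(distance))
print(curves_coincide)
```

When the check passes, the two lines occupy the same pixels on the twin axes, which is why only the one drawn last shows up at full opacity.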
I have a seaborn.heatmap plotted from a DataFrame:
import seaborn as sns
import matplotlib.pyplot as plt
fig = plt.figure(facecolor='w', edgecolor='k')
sns.heatmap(collected_data_frame, annot=True, vmax=1.0, cmap='Blues', cbar=False, fmt='.4g')
I would like to create some sort of highlight for a maximum value in each column - it could be a red box around that value, or a red dot plotted next to that value, or the cell could be colored red instead of using Blues. Ideally I'm expecting something like this:
I got the highlight working for DataFrame printing in Jupyter Notebook using tips from this answer:
How can I achieve a similar thing but on a heatmap?
I customized the heatmap example from the official seaborn reference, drawing on answers from this site. The idea is to add a patch on top of an existing plot. Here I added a frame around the maximum value, but its position is set manually.
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import seaborn as sns
sns.set()
# Load the example flights dataset and convert to long-form
flights_long = sns.load_dataset("flights")
flights = flights_long.pivot(index="month", columns="year", values="passengers")
# Draw a heatmap with the numeric values in each cell
f, ax = plt.subplots(figsize=(9, 6))
ax = sns.heatmap(flights, annot=True, fmt="d", linewidths=.5, ax=ax)
ax.add_patch(Rectangle((10,6),2,2, fill=False, edgecolor='blue', lw=3))
Finding the position of the maximum value:
ymax = flights.max().idxmax()  # column (year) that holds the largest value
ymax
1960
flights.columns.get_loc(ymax)
11
xmax = flights[ymax].idxmax()
xmax
'July'
xpos = flights.index.get_loc(xmax)
xpos
6
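# Rectangle needs the column position (11), not the column label (1960)
ax.add_patch(Rectangle((flights.columns.get_loc(ymax), xpos), 1, 1, fill=False, edgecolor='blue', lw=3))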
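The same lookup can be done purely by position, which avoids mixing labels and positions. The sketch below uses a small made-up table instead of the flights dataset; the names table and rect_xy are illustrative. It assumes the usual heatmap layout, where the cell in row i and column j has its corner at x=j, y=i.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the pivoted flights table
table = pd.DataFrame(
    [[1, 5, 3], [4, 2, 9], [7, 8, 6]],
    index=["a", "b", "c"],
    columns=["x", "y", "z"],
)

# argmax works on the flattened array; unravel_index turns it back into (row, column)
row_pos, col_pos = np.unravel_index(np.argmax(table.to_numpy()), table.shape)

# A Rectangle anchor is (x, y) = (column position, row position)
rect_xy = (int(col_pos), int(row_pos))
print(rect_xy, table.index[row_pos], table.columns[col_pos])
```

The resulting rect_xy can be passed as the first argument of Rectangle with a width and height of 1.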
Complete solution based on the answer of @r-beginners:
Generate DataFrame:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import seaborn
arr = np.array([[0.9336719 , 0.90119269, 0.90791181, 0.3112451 , 0.56715989,
0.83339874, 0.14571595, 0.36505745, 0.89847367, 0.95317909,
0.16396293, 0.63463356],
[0.93282304, 0.90605976, 0.91276066, 0.30288519, 0.56366228,
0.83032344, 0.14633036, 0.36081791, 0.9041638 , 0.95268572,
0.16803188, 0.63459491],
[0.15215358, 0.4311569 , 0.32324376, 0.51620611, 0.69872915,
0.08811177, 0.80087247, 0.234593 , 0.47973905, 0.21688613,
0.2738223 , 0.38322856],
[0.90406056, 0.89632902, 0.92220635, 0.3022458 , 0.58843012,
0.78159595, 0.17089609, 0.33443782, 0.89997103, 0.93128579,
0.15942313, 0.62644379],
[0.93868063, 0.45617598, 0.17708323, 0.81828266, 0.72986428,
0.82543775, 0.41530088, 0.2604382 , 0.33132295, 0.94686745,
0.05607774, 0.54141198]])
columns_text = [str(num) for num in range(0,12)]
index_text = ['C1', 'C2', 'C3', 'C4', 'C5']
arr_data_frame = pd.DataFrame(arr, columns=columns_text, index=index_text)
Highlighting maximum in a column:
fig,ax = plt.subplots(figsize=(15, 3), facecolor='w', edgecolor='k')
ax = seaborn.heatmap(arr_data_frame, annot=True, vmax=1.0, vmin=0, cmap='Blues', cbar=False, fmt='.4g', ax=ax)
column_max = arr_data_frame.idxmax(axis=0)
for col, variable in enumerate(columns_text):
    position = arr_data_frame.index.get_loc(column_max[variable])
    ax.add_patch(Rectangle((col, position), 1, 1, fill=False, edgecolor='red', lw=3))
plt.savefig('max_column_heatmap.png', dpi = 500, bbox_inches='tight')
Highlighting maximum in a row:
fig,ax = plt.subplots(figsize=(15, 3), facecolor='w', edgecolor='k')
ax = seaborn.heatmap(arr_data_frame, annot=True, vmax=1.0, vmin=0, cmap='Blues', cbar=False, fmt='.4g', ax=ax)
row_max = arr_data_frame.idxmax(axis=1)
for row, index in enumerate(index_text):
    position = arr_data_frame.columns.get_loc(row_max[index])
    ax.add_patch(Rectangle((position, row), 1, 1, fill=False, edgecolor='red', lw=3))
plt.savefig('max_row_heatmap.png', dpi = 500, bbox_inches='tight')
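Both loops above boil down to an argmax along one axis. As a rough sketch, the rectangle anchors can be computed directly with NumPy; the helper max_cell_anchors and the demo array are hypothetical names introduced here for illustration only.

```python
import numpy as np


def max_cell_anchors(values, axis):
    """Return (x, y) anchors of the maximum cell per column (axis=0) or per row (axis=1)."""
    values = np.asarray(values)
    best = values.argmax(axis=axis)
    if axis == 0:
        # One entry per column: x is the column, y is the row of its maximum
        return [(col, int(row)) for col, row in enumerate(best)]
    # One entry per row: x is the column of its maximum, y is the row
    return [(int(col), row) for row, col in enumerate(best)]


demo = np.array([[0.9, 0.1, 0.4],
                 [0.2, 0.8, 0.7]])
column_anchors = max_cell_anchors(demo, axis=0)
row_anchors = max_cell_anchors(demo, axis=1)
print(column_anchors)
print(row_anchors)
```

Each anchor can then be drawn with ax.add_patch(Rectangle(anchor, 1, 1, fill=False, edgecolor='red', lw=3)), as in the loops above.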
I have a list of values which I want to plot the distribution for. I'm using a box-plot but it would be nice to add some dotted lines going from the boxplot quartiles to the axis. Also I want just the quartile values displayed on the x ticks.
Here's a rough idea but with values at the end instead of names.
import numpy as np
import pandas as pd
import matplotlib.pylab as plt
vel_arr = np.random.rand(1000,1)
fig = plt.figure(1, figsize=(9, 6))
ax = fig.add_subplot(111)
# Create the boxplot
ax.boxplot(vel_arr,vert=False, manage_ticks=True)
ax.set_xlabel('value')
plt.yticks([1], ['category'])
plt.show()
- np.quantile calculates the desired quantiles.
- ax.vlines draws vertical lines, here from y=0 up to the center of the boxplot (y=1). zorder=0 makes sure these lines go behind the boxplot.
- ax.set_ylim(0.5, 1.5) resets the y limits. By default, the vlines would stretch the y limits to include them, plus some padding.
- ax.set_xticks(quantiles) sets an x tick at the position of every quantile.
import numpy as np
import matplotlib.pyplot as plt
vel_arr = np.random.rand(50, 1)
fig = plt.figure(1, figsize=(9, 6))
ax = fig.add_subplot(111)
ax.boxplot(vel_arr, vert=False, manage_ticks=True)
ax.set_xlabel('value')
ax.set_yticks([1])
ax.set_yticklabels(['category'])
quantiles = np.quantile(vel_arr, np.array([0.00, 0.25, 0.50, 0.75, 1.00]))
ax.vlines(quantiles, [0] * quantiles.size, [1] * quantiles.size,
color='b', ls=':', lw=0.5, zorder=0)
ax.set_ylim(0.5, 1.5)
ax.set_xticks(quantiles)
plt.show()
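To see what np.quantile returns, here is a small sketch on hand-picked data where the result is easy to verify by eye. The array values is an assumption chosen for illustration; it is not the random data used above.

```python
import numpy as np

values = np.arange(1, 10)  # 1, 2, ..., 9

# Minimum, lower quartile, median, upper quartile, maximum
quantiles = np.quantile(values, [0.00, 0.25, 0.50, 0.75, 1.00])
print(quantiles)
```

Note that the 0.00 and 1.00 quantiles are the minimum and maximum of the data. They coincide with the whisker ends only when the boxplot draws no outlier points; otherwise the whiskers stop short of them.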
I would like to display the following dataframe as a bar chart with two y-axes, showing the area columns on the left axis and the price columns on the right axis:
           area1     area2    price1    price2
level
first     263.16    906.58  10443.32  35101.88
second   6879.83  14343.03   2077.79   4415.53
third   31942.75  60864.24    922.87   1774.47
I tried the code below. It works, but it only displays the left axis.
import matplotlib.pyplot as plt
df.plot(kind='bar')
plt.xticks(rotation=45, fontproperties="SimHei")
plt.xlabel("")
plt.legend()
Thank you.
If I understood you correctly, one way could be this, but you have to "play" a bit with the values of width and position of the ticks:
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(12,5))
ax = fig.add_subplot(111)
ax2 = ax.twinx()
width = 0.1
df.area1.plot(kind='bar', color='red', ax=ax, width=width, position=0 )
df.area2.plot(kind='bar', color='orange', ax=ax, width=width, position=1)
df.price1.plot(kind='bar', color='blue', ax=ax2, width=width, position=2)
df.price2.plot(kind='bar', color='green', ax=ax2, width=width, position=3)
ax.set_ylabel('Area')
ax2.set_ylabel('Price')
ax.legend(["Area1", "Area2"], bbox_to_anchor=(0.8,1.0))
ax2.legend(["Price1", "Price2"], bbox_to_anchor=(0.9,1.0))
plt.show()
Another way is this:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure(figsize=(10,5))
ax = fig.add_subplot(111)
ax2 = ax.twinx()
# ax.set_xticklabels(ax.get_xticklabels(),rotation=45) # Rotation 45 degrees
width = 0.1
ind = np.arange(len(df))
ax.set_ylabel('Area')
ax2.set_ylabel('Price')
ax.set_xlabel('Level')
ax.bar(ind, df.area1, width, color='red', label='area1')
ax.bar(ind + width, df.area2, width, color='orange', label='area2')
ax2.bar(ind + 2*width, df.price1, width, color='blue', label='price1')
ax2.bar(ind + 3*width, df.price2, width, color='green', label='price2')
ax.set(xticks=(ind + 1.5*width), xticklabels=df.index, xlim=[2*width - 1, len(df)])
ax.legend(["Area1", "Area2"], bbox_to_anchor=(1,1))
ax2.legend(["Price1", "Price2"], bbox_to_anchor=(1,0.87))
plt.show()
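Instead of tuning the offsets by hand, the bar positions within each group can be computed so that the group is centered on its tick. This is a minimal sketch; group_offsets is a hypothetical helper, and the width of 0.2 is an arbitrary choice.

```python
import numpy as np


def group_offsets(n_bars, width):
    """Offsets that center n_bars bars of the given width on each tick."""
    return (np.arange(n_bars) - (n_bars - 1) / 2) * width


offsets = group_offsets(n_bars=4, width=0.2)
print(offsets)
```

With these offsets, the four calls become ax.bar(ind + offsets[0], df.area1, width), ax.bar(ind + offsets[1], df.area2, width), and so on for ax2, and the ticks can simply be placed at ind.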
Here I plot a bar graph and a line graph in the same figure:
There are 2 y-axes, money and increase_rate, each on a different scale.
How can I set the ticks of the two y-axes to be at the same height?
import numpy as np
import matplotlib.pyplot as plt
time = [2000,2001,2002,2003]
money = [1000,2000,4000,6000]
increase_rate =[2,1,6,12]
fig, ax1 = plt.subplots()
width = 0.75
ax1.set_xlabel("")
ax1.set_ylabel("")
ax1.bar(time, money ,width = width, color = "#9370DB", alpha=0.6)
ax1.tick_params(axis='y')
ax1.spines['right'].set_visible(False)
ax1.spines['left'].set_visible(False)
ax1.spines['top'].set_visible(False)
ax1.spines['bottom'].set_visible(False)
ax2 = ax1.twinx() # instantiate a second axes that shares the same x-axis
ax2.set_ylabel("")
ax2.plot(time, increase_rate, color = "#FFFF00", lw = 3)
ax2.tick_params(axis='y')
ax2.spines['right'].set_visible(False)
ax2.spines['left'].set_visible(False)
ax2.spines['top'].set_visible(False)
ax2.grid(color='black', linestyle='dotted', linewidth=0.8, alpha = 0.5)
fig.tight_layout() # otherwise the right y-label is slightly clipped
plt.show()
Use set_yticks to place the same number of ticks on each axis:
ax1.set_yticks(np.linspace(0, max(money), 5))
ax2.set_yticks(np.linspace(0, max(increase_rate), 5))
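Equal tick counts line up exactly only if both axes also span the same relative range, because autoscaling adds different margins to a bar plot and a line plot. The sketch below sets the limits explicitly as well; the 5% headroom and the names n_ticks, rel_left, and rel_right are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np

time = [2000, 2001, 2002, 2003]
money = [1000, 2000, 4000, 6000]
increase_rate = [2, 1, 6, 12]
n_ticks = 5

fig, ax1 = plt.subplots()
ax1.bar(time, money, width=0.75, color="#9370DB", alpha=0.6)
ax2 = ax1.twinx()
ax2.plot(time, increase_rate, color="#FFFF00", lw=3)

for axis, values in ((ax1, money), (ax2, increase_rate)):
    top = max(values)
    axis.set_yticks(np.linspace(0, top, n_ticks))
    # Same relative limits on both axes: 0 at the bottom, 5% headroom at the top
    axis.set_ylim(0, top * 1.05)

# Relative tick heights (0 = bottom of the axes, 1 = top) should now match
rel_left = np.array(ax1.get_yticks()) / ax1.get_ylim()[1]
rel_right = np.array(ax2.get_yticks()) / ax2.get_ylim()[1]
print(np.allclose(rel_left, rel_right))
```

With matching relative heights, a single grid drawn on either axis passes through the ticks of both.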
I have a dataframe that I'd like to use to build a scatterplot where different points have different colors:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
dat=pd.DataFrame(np.random.rand(20, 2), columns=['x','y'])
dat['c']=np.random.randint(0,100,20)
dat['c_norm']=(dat['c']-dat['c'].min())/(dat['c'].max()-dat['c'].min())
dat['group']=np.append(np.repeat('high',10), np.repeat('low',10))
As you can see, the column c_norm holds the c column normalized to between 0 and 1. I would like to show a continuous legend whose color range reflects the normalized values, but labeled with the original c values. Say, the minimum (1), the maximum (86), and the median (49). I also want different markers depending on group.
So far I was able to do this:
fig = plt.figure(figsize = (8,8))
ax = fig.add_subplot(1,1,1)
for row in dat.index:
    if dat.loc[row, 'group'] == 'low':
        i_marker = '.'
    else:
        i_marker = 'x'
    ax.scatter(
        x=dat.loc[row, 'x'],
        y=dat.loc[row, 'y'],
        s=50, alpha=0.5,
        marker=i_marker
    )
ax.legend(dat['c_norm'], loc='center right', bbox_to_anchor=(1.5, 0.5), ncol=1)
Questions:
- How to generate a continuous legend based on the values?
- How to adapt its ticks to show the original ticks in c, or at least a min, max, and mean or median?
Thanks in advance
Partial answer. Do you actually need to determine your marker colors based on the normed values? See the output of the snippet below.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dat = pd.DataFrame(np.random.rand(20, 2), columns=['x', 'y'])
dat['c'] = np.random.randint(0, 100, 20)
dat['c_norm'] = (dat['c'] - dat['c'].min()) / (dat['c'].max() - dat['c'].min())
dat['group'] = np.append(np.repeat('high', 10), np.repeat('low', 10))
fig, (ax, bx) = plt.subplots(nrows=1, ncols=2, num=0, figsize=(16, 8))
mask = dat['group'] == 'low'
scat = ax.scatter(dat['x'][mask], dat['y'][mask], s=50, c=dat['c'][mask],
marker='s', vmin=np.amin(dat['c']), vmax=np.amax(dat['c']),
cmap='plasma')
ax.scatter(dat['x'][~mask], dat['y'][~mask], s=50, c=dat['c'][~mask],
marker='X', vmin=np.amin(dat['c']), vmax=np.amax(dat['c']),
cmap='plasma')
cbar = fig.colorbar(scat, ax=ax)
scat = bx.scatter(dat['x'][mask], dat['y'][mask], s=50, c=dat['c_norm'][mask],
marker='s', vmin=np.amin(dat['c_norm']),
vmax=np.amax(dat['c_norm']), cmap='plasma')
bx.scatter(dat['x'][~mask], dat['y'][~mask], s=50, c=dat['c_norm'][~mask],
marker='X', vmin=np.amin(dat['c_norm']),
vmax=np.amax(dat['c_norm']), cmap='plasma')
cbar2 = fig.colorbar(scat, ax=bx)
plt.show()
You could definitely modify the second colorbar so that it matches the first one, but is that necessary?
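One possible way to get a colorbar labeled with the original c values is to color by c directly through a shared Normalize object and then restrict the colorbar ticks to the minimum, median, and maximum. This is a sketch under those assumptions, not a definitive answer to the question; the seeded generator and the names norm and ticks are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)
dat = pd.DataFrame(rng.random((20, 2)), columns=["x", "y"])
dat["c"] = rng.integers(0, 100, 20)
dat["group"] = ["high"] * 10 + ["low"] * 10

# One shared norm maps min(c) to 0 and max(c) to 1, like c_norm does
norm = Normalize(vmin=dat["c"].min(), vmax=dat["c"].max())

fig, ax = plt.subplots(figsize=(8, 8))
for group, marker in (("low", "s"), ("high", "X")):
    sub = dat[dat["group"] == group]
    scat = ax.scatter(sub["x"], sub["y"], s=50, c=sub["c"],
                      marker=marker, norm=norm, cmap="plasma")

# Both scatters share the norm, so either one can drive the colorbar
cbar = fig.colorbar(scat, ax=ax)
ticks = [dat["c"].min(), dat["c"].median(), dat["c"].max()]
cbar.set_ticks(ticks)
fig.savefig("scatter_colorbar.png")
```

Because the norm already rescales c to the range 0 to 1 internally, the colors are the same as when plotting c_norm, while the tick labels stay in the original units.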