How can I apply clipping to mark_text() in altair? - python

I have my plot clipped so it only shows certain ranges on the y axis. I added text to it using this code:
text2 = plot2.mark_text(align='left', dx=5, dy= -8, size = 15).encode(text = alt.Text('Accuracy', format = ',.2f'))
But the added annotation appears outside the plot area, so I need to get rid of the part that falls outside.
In the plot itself, I'm using something like clip=True in mark_line().

You need to set clip=True for the text mark explicitly:
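import altair as alt
import pandas as pd

df = pd.DataFrame({'x': [1, 3], 'y': [1, 4], 'text': ['a', 'b']})
chart = alt.Chart(df).mark_line(clip=True).encode(
    x=alt.X('x', scale=alt.Scale(domain=[0, 2])),
    y='y'
)
# Without clip=True, the text mark is still drawn outside the plot area:
chart + chart.mark_text().encode(text='text')
# With clip=True, the text is clipped just like the line:
chart + chart.mark_text(clip=True).encode(text='text')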

Related

How to plot rects and circles with two legends with Altair?

I would like to create two charts that are superimposed, but with two legends. One chart uses rects with one color palette, and the second chart displays circles with a second color palette. This should be very straightforward, but something is wrong. I only get a single legend. I also want the legends to be selectable. Here is a self-contained MWE, representative of a more complex use case. Below the code, I show an image of what the code produces: single legend, single color palette. Is this expected behavior or some kind of bug? Any insight is appreciated. Thanks!
import pandas as pd
import altair as alt
import streamlit as st
# Demonstrate two categorical legends with selection_multi.
# There appears to be a bug when using shift-click on one menu, then the other.
def drawPlot():
    x1 = [1, 2, 3]
    y1 = [1, 2, 3]
    x2 = [4, 5, 6]
    y2 = [4, 5, 6]
    df = pd.DataFrame({'x1':x1, 'y1':y1, 'x2':x2, 'y2':y2})
    palette1 = alt.Color('x1:N',
        scale=alt.Scale(
            domain=[1, 2, 3],
            range=['lightgreen', 'darkgreen', 'yellow'],
        )
    )
    palette2 = alt.Color('x2:N',
        scale=alt.Scale(
            domain=[4, 5, 6],
            range=['lightblue', 'darkblue', 'purple'],
        )
    )
    select1 = alt.selection_multi(fields=['x1'], bind='legend')
    select2 = alt.selection_multi(fields=['x2'], bind='legend')
    nodes1 = alt.Chart(df).mark_rect(
        width=20, height=20,
    ).encode(
        x = 'x1:N',
        y = 'y1:N',
        color = palette1,
    ).add_selection(
        select1
    )
    nodes2 = alt.Chart(df).mark_circle(
        width=20, height=20, size=1200,
    ).encode(
        x = 'x2:N',
        y = 'y2:N',
        color = palette2,
    ).add_selection(
        select2
    )
    full_chart = (nodes1 + nodes2).properties(
        height=500,
        width=1000,
    )
    return full_chart
#----------------------------------------------------------------
if __name__ == "__main__":
    chart = drawPlot()
    st.altair_chart(chart, use_container_width=True)
By default, Altair/Vega-Lite combines the scales of layered charts into a single legend when possible, for a more compact layout. When the scales are independent of each other and should be represented in separate legends, you need to resolve them manually. In your case it would look like this:
chart.resolve_scale(color='independent')
You can read more on this page in the docs.

How to configure Chart Position in HConcatChart in Altair

I'm trying to horizontally concatenate two charts in altair, but I can't get them to look just like I want them to.
Here is what they look like:
And here is the code I'm using:
pick_ausbildung = alt.selection_single(fields = ["Ausbildungsstand"], on = "mouseover")
ausbildung_chart = alt.Chart(umfrage,
title = "Ausbildungsstand").mark_bar().encode(
y=alt.Y("Ausbildungsstand", axis = alt.Axis(title = None)),
x="count()",
color = alt.condition(pick_ausbildung,
alt.Color("Ausbildungsstand:N",
legend = None), alt.value("lightgrey")),
tooltip = ["Ausbildungsstand","count()"]).properties(height=200).add_selection(pick_ausbildung)
g_ausbildung_chart = alt.Chart(umfrage).mark_bar().encode(
x="Geschlecht",
y="count()",
color = "Geschlecht",
tooltip = ["Geschlecht","count()"]).properties(width=300).transform_filter(pick_ausbildung)
ausbildung_chart|g_ausbildung_chart
And basically, I would like to place the chart "Ausbildungsstand" in the middle of the chart area. I mean, I'd like to separate it from the top edge of the canvas.
I can sort of get the result I want by adjusting the height of the charts (if they have the same height, they're aligned), but I'd like to know how to move the chart inside the "canvas".
Thanks in advance for any help.
You can use the alt.hconcat() function and pass center=True. For example:
import altair as alt
import pandas as pd
df = pd.DataFrame({
'label': ['A', 'B', 'C', 'D', 'E'],
'value': [3, 5, 4, 6, 2],
})
chart1 = alt.Chart(df).mark_bar().encode(y='label', x='value')
chart2 = alt.Chart(df).mark_bar().encode(x='label', y='value')
alt.hconcat(chart1, chart2, center=True)

Filter altair heatmap with heat shading and text value

I'm trying to create a heatmap using the Altair lib, but I'm looking to filter my data with a slider for different views. The slider works fine with the standard color only heatmap, but when I try to add text to the boxes to describe the values in each cell I get the javascript error below. (Adding the text to the heatmap works fine without any filter slider.)
import altair as alt
import pandas as pd
source = pd.DataFrame({'year':[2017, 2017, 2018, 2018],
'age':[1, 2, 1, 2],
'y':['a', 'a', 'a', 'a'],
'n':[1, 2, 3, 4]})
slider = alt.binding_range(min=2017, max=2018, step=1)
select_year = alt.selection_single(name="my_year_slider", fields=['year'], bind=slider)
base = alt.Chart(source).add_selection(select_year).transform_filter(select_year)
heatmap = base.mark_rect().encode(
x='age:O',
y='y:O',
color='n:Q')
text = base.mark_text(baseline='middle').encode(
x='age:O',
y='y:O',
text='n:Q')
heatmap + text
This returns Javascript Error: Duplicate signal name: "my_year_slider_tuple"
Because you added the selection to the base chart and then layered two copies of it, the selection is defined twice. The solution is to only define the selection once; something like this:
import altair as alt
import pandas as pd
source = pd.DataFrame({'year':[2017, 2017, 2018, 2018],
'age':[1, 2, 1, 2],
'y':['a', 'a', 'a', 'a'],
'n':[1, 2, 3, 4]})
slider = alt.binding_range(min=2017, max=2018, step=1)
select_year = alt.selection_single(name="my_year_slider", fields=['year'], bind=slider)
base = alt.Chart(source).encode(
x='age:O',
y='y:O',
).transform_filter(select_year)
heatmap = base.mark_rect().encode(color='n:Q').add_selection(select_year)
text = base.mark_text(baseline='middle').encode(text='n:Q')
heatmap + text

Pyplot shows different colors in legend but plots in same color

I'm trying to plot nstrats different pareto fronts using Pyplot, but whatever I seem to try, each front comes out in the same color. I've tried doing this both with and without the colors array below, as I read that Python automatically cycles through colors. Maybe something is wrong with how I'm using the plot/scatter function in a loop? My code and a link to the output plot is below. Especially note that the legend shows the correct colors for my fronts, but all of the output is one color. I've validated that the output are from multiple strategies. Thank you for your help!
colors = ['r', 'g', 'b']
for i in range(nstrats):
    plt.scatter(NPV[i], dev[i], color=colors[i], label = 'Strategy ' + str(i+1))
plt.xlabel("NPV of Harvest Strategy")
plt.ylabel("Standard Dev of Yearly Harvest")
plt.title("Pareto Front for Each Strategy")
plt.legend(loc='best')
plt.show()
What my program outputs:
This happens because everything is plotted by the same single plot command.
Here is an example:
import matplotlib.pyplot as plt
import numpy as np
data = [1, 100, 2, 200, 3, 300, 4, 400, 5, 500]
data = np.reshape(data, (5, 2))
column = 0
chunks = dict()
chunks[0] = data[data[:, column] < 300]
chunks[1] = data[data[:, column] >= 300]
for i in chunks:
    plt.plot(chunks[i][:, 0], chunks[i][:, 1], "x", label="Chunk %i" % i)
plt.legend()
plt.show()
The data should be split into chunks based on the y value. By mistake, however, column 0 (x) was used instead of column 1 (y). Because of this mistake, chunk 0 is a complete copy of data and chunk 1 is empty. Everything is therefore plotted by the first plot command in the same color. The second command only adds a label with a different color to the legend, since there is no data to plot in chunk 1.
Try changing the argument color to c.
colors = ['r', 'g', 'b']
for i in range(nstrats):
    plt.scatter(NPV[i], dev[i], c=colors[i], label = 'Strategy ' + str(i+1))
plt.xlabel("NPV of Harvest Strategy")
plt.ylabel("Standard Dev of Yearly Harvest")
plt.title("Pareto Front for Each Strategy")
plt.legend(loc='best')
plt.show()
NOTE: Keep nstrats <= 3. The colors list has only three entries, so colors[i] raises an IndexError for any additional strategy.
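If more than three strategies are needed, one option is to take the colors from a colormap instead of a hand-written list. This is only a sketch: the random arrays below are hypothetical stand-ins for NPV and dev, and "tab10" is just one of matplotlib's built-in qualitative colormaps.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: one array of points per strategy.
rng = np.random.default_rng(0)
nstrats = 5
NPV = [rng.random(10) + i for i in range(nstrats)]
dev = [rng.random(10) for _ in range(nstrats)]

cmap = plt.get_cmap("tab10")  # ten distinct colors, indexed by integer
fig, ax = plt.subplots()
for i in range(nstrats):
    # i % 10 wraps around if there are more strategies than colors
    ax.scatter(NPV[i], dev[i], color=cmap(i % 10), label=f"Strategy {i + 1}")
ax.set_xlabel("NPV of Harvest Strategy")
ax.set_ylabel("Standard Dev of Yearly Harvest")
ax.legend(loc="best")
```

Each scatter call still receives its own slice of data; if all points come out in one color, the data passed to the first call is the more likely culprit.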

seaborn factorplot: set series order of display in legend

Seaborn, in some special cases, orders the legend differently from the plotting order:
data = {'group': [-2, -1, 0] * 5,
'x': list(range(5)) * 3,
'y' : range(15)}
df = pd.DataFrame(data)
sns.factorplot(kind='point', x='x', y='y', hue='group', data=df)
While the plotting sequence is [-2, -1, 0], the legend is listed in order of [-1, -2, 0].
My current workaround is to disable the legend in factorplot and then add the legend afterwards using matplotlib. Is there a better way?
I think what you're looking for is hue_order = [-2, -1, 0]
df = pd.DataFrame({'group': ['-2','-1','0'] * 5, 'x' : list(range(5)) * 3, 'y' : range(15)})
sns.factorplot(kind = 'point', x = 'x', y= 'y', hue_order = ['-2', '-1', '0'], hue = 'group', data = df)
I just stumbled across this oldish post. The only answer doesn't seem to work for me but I found a more satisfying solution to change legend order.
Although in your examples the legends are set correctly for me, it is possible to change the order via the add_legend() method:
df = pd.DataFrame({'group': [-2,-1,0] * 5, 'x' : list(range(5)) * 3, 'y' : range(15)})
ax = sns.factorplot(kind = 'point', x = 'x', y= 'y', hue = 'group', data = df, legend = False)
ax.add_legend(label_order = ['0','-1','-2'])
And for automated numerical sorting:
ax.add_legend(label_order = sorted(ax._legend_data.keys(), key = int))
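The key = int argument matters because the legend labels are strings. A small sketch with a hypothetical label list shows the difference between the default string sort and the numeric sort:

```python
# Hypothetical legend labels, as strings, in the order reported in the question.
labels = ['-1', '-2', '0']

# Default sort compares the strings character by character.
lexicographic = sorted(labels)

# key=int converts each label to a number before comparing.
numeric = sorted(labels, key=int)

print(lexicographic)
print(numeric)
```

Only the numeric sort puts '-2' before '-1'.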
