I've got a three-level bokeh.models.FactorRange which I use to draw tick labels on a vbar-plot. The problem is that there are dozens of factors in total and the lowest-level labels get very cramped.
I can use plot.xaxis.formatter = bokeh.models.PrintfTickFormatter(format='') to suppress drawing of the lowest-level labels, but this seems like an ugly hack. Also, I need the second-level tick labels to be rotated, yet plot.xaxis.major_label_orientation = ... only ever affects the lowest-level ticks (just like plot.xaxis.formatter does).
How to control each level of bokeh.models.FactorRange individually?
As of Bokeh 0.12.13, there is no way to control the individual orientations or formatting of different levels.
The basic initial work to revamp categorical support (for multi-level axes, etc.) was a large update. Rather than add even more complexity and risk up front for features we were not sure anyone would want or need, we started with basic capability, expecting to hear from users in time what additional features were justified. This seems to have come up a few times, so it would be reasonable to consider adding, but it would represent new work, so a GitHub feature request issue is the appropriate next step.
For completeness, I will mention that Bokeh is extensible, so it's always technically possible to create a Custom Extension. Axes are some of the most complicated code in Bokeh, and a full custom Axis would be non-trivial to write. However, it's possible that it would be sufficient to make a subclass of CategoricalAxis and just override this one method:
https://github.com/bokeh/bokeh/blob/master/bokehjs/src/coffee/models/axes/categorical_axis.ts#L83-L110
That's where the currently hard-coded 'parallel' orientations are, and also where formatting could be overridden.
In latest Bokeh (2.2.0), the feature #bigreddot was talking about seems to have been implemented: you can call
p.xaxis.group_label_orientation = [angle in radians]
to set orientation of the outer labels, while as in the question
p.xaxis.major_label_orientation = [angle in radians]
sets the orientation of the inner labels.
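A minimal sketch of how this might look for a three-level range. As far as I know, CategoricalAxis also has a subgroup_label_orientation property for the middle level, which is what the question's second-level labels would need; treat that as an assumption to check against your Bokeh version. The helper names rotate_axis_labels and build_plot, the factors and the bar heights are all made up for illustration.

```python
import math


def rotate_axis_labels(axis, outer=0.0, middle=0.0, inner=0.0):
    """Set one orientation (in radians) per level of a categorical axis."""
    axis.group_label_orientation = outer      # top-level factors
    axis.subgroup_label_orientation = middle  # second-level factors
    axis.major_label_orientation = inner      # lowest-level factors
    return axis


def build_plot():
    # Imported lazily so the helper above does not itself depend on Bokeh.
    from bokeh.models import FactorRange
    from bokeh.plotting import figure

    # Hypothetical three-level factors: (year, quarter, month).
    factors = [
        ("2019", "Q1", "jan"), ("2019", "Q1", "feb"),
        ("2019", "Q2", "apr"), ("2020", "Q1", "jan"),
    ]
    p = figure(x_range=FactorRange(*factors))
    p.vbar(x=factors, top=[3, 5, 2, 4], width=0.8)

    # Rotate the second level by 45 degrees and the lowest level by 90.
    rotate_axis_labels(p.xaxis, middle=math.pi / 4, inner=math.pi / 2)
    return p
```

Calling bokeh.plotting.show(build_plot()) should render the plot with each level rotated independently.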
Related
So I am making a program to plot a bar graph for a probability data set. The data set is not stored, or at least I don't want it to be. I need to plot a bar for every possibility, and I want the bars to be dynamic: I don't want them to be plotted by counting the occurrences of each item in a stored data set, since the data set is not stored. I want the bars to grow as the data is generated.
I was trying to use python lists. So the bars would look something like, 36[****************]. But I can't think of using them dynamically. I am left with two possibilities, one that I generate like 60-120 bars (which is stupid). Or I store the data (which increases my work and execution time and load). And I also can't think of other things. So suggest me something please!
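One way to get bars that grow with the data, sketched here under assumptions: keep only a running tally per outcome (a collections.Counter) and redraw the text bars from the tally, so no individual data point is ever stored. The two-dice data source, the seed and the render helper are invented for illustration; they are not from the question.

```python
import random
from collections import Counter


def render(counts, scale=1):
    """Draw one text bar per outcome, e.g. ' 36[****]'."""
    lines = []
    for outcome in sorted(counts):
        bar = "*" * (counts[outcome] // scale)
        lines.append(f"{outcome:>3}[{bar}]")
    return "\n".join(lines)


counts = Counter()        # only the tallies are kept, never the raw data
rng = random.Random(0)    # seeded so the run is repeatable

for _ in range(200):
    # Hypothetical data source: the sum of two dice.
    roll = rng.randint(1, 6) + rng.randint(1, 6)
    counts[roll] += 1     # the bar for this outcome grows immediately

print(render(counts, scale=2))
```

Memory use depends on the number of distinct outcomes, not on how many samples have been generated.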
How do I dynamically visualize large and complex data sets (stored in Elasticsearch) with dozens of sub-graphs and drill-downs? I would like some kind of dynamic control over this, but would prefer to not roll my own (in Python). Is it possible to use Kibana for this kind of thing? Or is there some better tool for the task?
Best would be if I could have rudimentary control over layout, then be able to show a number of bar charts if the user wants to see time series, but if she wants to slice it perpendicular I could show for instance pie charts. User should be able to interactively click her way down, generating AND/OR Lucene expressions, etc.
The more I rubber duck, the more I feel like I need to build this myself in Bokeh or something of the kind. If I need to create all such business logic manually, what would be my best HTML/graphing library? Or are there plugins for Kibana that do this perhaps? If I have to create things manually, it does not necessarily need to be in Python, but it needs to be back-end (for Elasticsearch security).
I wrote my own, took around 500 LoC and a week or two of devopsing.
I'm working with a dataset regarding the survivors on the Titanic, where I'm trying to show the relationship between Age of passengers and the fare they paid.
This is what the data is currently formatted as:
From here, it was fairly easy to make a simple scatterplot, like so:
However, I am curious whether there is a way to set the color of some of the points to be different based on the sex from the dataset. Most examples I have seen across the internet focus on how to change the color for two separate data sets. I initially tried to use an if statement to change the color depending on sex, but that didn't work for me the way I hoped it would.
Perhaps much easier with seaborn:
import seaborn as sns
data = sns.load_dataset('titanic')
sns.scatterplot(x='age', y='fare', data=data, hue='sex')
One potential solution I came up with after pondering a bit could look like this as well:
The problem with this solution is you have to add more variables, which isn't ideal, and the results stack over each other a bit making it harder to see the data trends.
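A different sketch, not the code referred to above: map each row's category to a color and pass the resulting list to a single scatter call, with some transparency so overlapping points stay readable. The helper names colors_for and scatter_by_category and the palette are hypothetical.

```python
def colors_for(values, palette, default="gray"):
    """Translate category values into colors; unknown values get `default`."""
    return [palette.get(v, default) for v in values]


def scatter_by_category(x, y, categories, palette):
    # Imported lazily so colors_for can be reused without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # One scatter call; `c` takes one color per point.
    ax.scatter(x, y, c=colors_for(categories, palette), alpha=0.5)
    return fig, ax
```

With the Titanic frame this might be called as scatter_by_category(df['Age'], df['Fare'], df['Sex'], {'male': 'tab:blue', 'female': 'tab:orange'}), assuming those column names match your data.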
I'm currently pumping out some histograms with matplotlib. The issue is that because of one or two outliers my whole graph is incredibly small and almost impossible to read due to having two separate histograms being plotted. The solution I am having problems with is dropping the outliers at around a 99/99.5 percentile. I have tried using:
plt.xlim([np.percentile(df,0), np.percentile(df,99.5)])
plt.xlim([df.min(),np.percentile(df,99.5)])
Seems like it should be a simple fix, but I'm missing some key information to make it happen. Any input would be much appreciated, thanks in advance.
To restrict focus to just the middle 99% of the values, you could do something like this:
trimmed_data = df[(df.Column > df.Column.quantile(0.005)) & (df.Column < df.Column.quantile(0.995))]
Then you could do your histogram on trimmed_data. Exactly how to exclude outliers is more of a stats question than a Python question, but basically the idea I was suggesting in a comment is to clean up the data set using whatever methods you can defend, and then do everything (plots, stats, etc.) on only the cleaned dataset, rather than trying to tweak each individual plot to make it look right while still having the outlier data in there.
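To illustrate the trimming idea without any dependencies, here is a sketch using the standard library's statistics.quantiles as a stand-in for the pandas quantile call above. Its default interpolation differs slightly from pandas, so the cut points may not match exactly. The function name trim_outliers and the sample data are made up.

```python
import statistics


def trim_outliers(values, lower_pct=0.5, upper_pct=99.5):
    """Keep values strictly between two percentile cut points.

    Percentiles must be multiples of 0.1 between 0.1 and 99.9.
    """
    # n=1000 yields 999 cut points, one per 0.1 percentile step.
    cuts = statistics.quantiles(values, n=1000)
    lo = cuts[round(lower_pct * 10) - 1]
    hi = cuts[round(upper_pct * 10) - 1]
    return [v for v in values if lo < v < hi]


# Hypothetical data: a well-behaved range plus one extreme outlier.
data = list(range(1, 1001)) + [10**6]
trimmed = trim_outliers(data)
print(len(data), len(trimmed), max(trimmed))
```

The histogram would then be drawn from trimmed rather than from the full data, so the axis limits no longer need adjusting per plot.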
I'm trying to allow wrapping of the text in a Gtk.CellRendererText but I have a small problem:
Those rows are very large.
The only code I changed was this:
cell = Gtk.CellRendererText(markup=0)
cell.set_property("wrap_mode", Pango.WrapMode.WORD)
cell.set_property("wrap_width", 20)
And that makes it wrap, but it also seemed to make this visual issue appear
I seem to remember reading something about this on a blog at Planet GNOME quite a while ago. From what I remember, it has something to do with the height-for-width drawing model: when wrapping is enabled, GtkLabel etc. request enough height to reflow the text for wrap-width even if there is more horizontal space available, which leaves loads of empty space when the width is wider. There was a fix but I'm afraid I can't remember it at the moment. I'll try and find the original post later.
I've tried but I can't find the post. However, having read some more, I'm pretty sure this is the problem. There is some discussion related to GtkTable doing similar things at https://bugs.launchpad.net/ubuntu/+source/gtk+3.0/+bug/825173 and I've a nasty feeling the fix I can't remember properly might have been to turn off wrapping. I guess it would be possible to get a notification when the column width changes and make wrap-width the correct width for that value, but that's a bit of a pain.
If you can live with the column being a fixed width, set the expand property of the column to False and the fixed-width property to True, then set the wrap-width, width-chars and max-width-chars properties of the renderer all to the same value. The text then wraps without any extra space.