Matplotlib Ticker - python

Can someone give me an example of how to use the following tickFormatters. The docs are uninformative to me.
ticker.StrMethodFormatter()
ticker.IndexFormatter()
for example I might think that
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import ticker

x = np.array([316566.962, 294789.545, 490032.382, 681004.044, 753757.024,
              385283.153, 651498.538, 937628.225, 199561.358, 601465.455])
y = np.array([208.075, 262.099, 550.066, 633.525, 612.804, 884.785,
              862.219, 349.805, 279.964, 500.612])
plt.scatter(x, y)
ax = plt.gca()
fmtr = ticker.StrMethodFormatter('${:,}')
ax.xaxis.set_major_formatter(fmtr)
would give me tick labels with a dollar sign and commas as thousands separators, like
['$300,000', '$400,000', '$500,000', '$600,000', '$700,000', '$800,000', '$900,000']
but instead I get an index error.
IndexError: tuple index out of range
For IndexFormatter the docs say:
Set the strings from a list of labels
I don't really know what this means, and when I try to use it my ticks disappear.

The StrMethodFormatter does work by taking a string that is then formatted with its format method, so using '${:,}' goes in the right direction.
However, the documentation states:
The field used for the value must be labeled x and the field used for the position must be labeled pos.
This means the field must be given the explicit name x. You may also want the g number format so that the labels are shown without a decimal point.
fmtr = matplotlib.ticker.StrMethodFormatter('${x:,g}')
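In essence, StrMethodFormatter builds each label by calling the string's format method with the keyword arguments x and pos, so the behaviour can be checked with plain str.format, without drawing anything. The sketch below reproduces the IndexError and shows the corrected string. The helper name label is only for illustration. It also shows one caveat of g: with the default precision it switches to exponent notation from one million upward, so a fixed-point spec such as .0f may be the safer choice for large amounts.

```python
def label(fmt, x, pos=0):
    """Format one tick value roughly the way StrMethodFormatter does."""
    return fmt.format(x=x, pos=pos)

# An unnamed field "{}" looks for a positional argument, but only keywords are passed.
try:
    label('${:,}', 300000.0)
except IndexError as err:
    print("IndexError:", err)

print(label('${x:,g}', 300000.0))     # $300,000
print(label('${x:,g}', 1000000.0))    # $1e+06  (g falls back to exponent notation)
print(label('${x:,.0f}', 1000000.0))  # $1,000,000
```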
The IndexFormatter is of little use here. As you found out, it needs a list of labels. Each label is looked up by using the tick value as an index into that list, starting at 0, so this formatter only makes sense when the x axis starts at zero and runs over whole numbers.
Example:
plt.scatter(range(len(y)), y)
ax = plt.gca()
fmtr = matplotlib.ticker.IndexFormatter(list("ABCDEFGHIJ"))
ax.xaxis.set_major_formatter(fmtr)
Here, the ticks are placed at (0,2,4,6,....) and the respective letters from the list (A, C, E, G, I, ...) are used as labels.
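As far as I know, IndexFormatter was deprecated in Matplotlib 3.3 and is no longer available in recent releases. The same labelling can be sketched with FuncFormatter, which calls a function with the tick value and position. The names LABELS, index_label and use_index_labels are made up for this example. Unlike IndexFormatter, this version returns an empty label for ticks that fall between whole numbers.

```python
LABELS = list("ABCDEFGHIJ")

def index_label(x, pos=None):
    """Return the label whose list index equals the tick value, or '' if there is none."""
    i = int(round(x))
    if abs(x - i) > 1e-9 or not 0 <= i < len(LABELS):
        return ""
    return LABELS[i]

def use_index_labels(ax):
    """Attach index_label to the x axis of an existing Matplotlib Axes."""
    # Imported here so that index_label itself can be tried without Matplotlib.
    from matplotlib.ticker import FuncFormatter
    ax.xaxis.set_major_formatter(FuncFormatter(index_label))
```

After plt.scatter(range(len(y)), y), calling use_index_labels(plt.gca()) should label the ticks at 0, 2, 4, ... with A, C, E, ...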

Related

Differentiate some points in a python plot

I have a plot in Python of the following
a = (1,2,3,4,5)
b = (1,2,3,4,5)
plot(a,b)
I want to mark some of the points in the plot with dots of a distinct color,
for example the points at these x-values:
c = (2,4)
I tried the following:
a = (1,2,3,4,5)
b = (1,2,3,4,5)
plot(a,b)
matplotlib.pyplot.scatter(a, c)
But I got the error "x and y must be the same size"
The Matplotlib documentation for scatter describes it as "a scatter plot of y vs. x", so the error message you shared makes sense: a has five elements and c has only two, and scatter needs exactly one y-value for every x-value. I hope that explains why your original code raises that error.
Your problem is specific enough that there isn't an out-of-the-box solution in matplotlib that I'm aware of, so this will require the use of your own custom functions. I'll give one approach that might be helpful to you. I'm structuring my answer here so you can see how to solve the issue to your own specifications, instead of relying too heavily on copying & pasting other people's code, since the latter makes it harder for you to do exactly what you want to do.
To restate the problem in more concise terms: How can a matplotlib user plot a line, and then put markers on a subset of the line's points, specified by their x-values?
To begin, here is what your program might look like currently:
import matplotlib.pyplot as plt
x_coords = [ ] # fill in x_coords here
y_coords = [ ] # fill in y_coords here
marker_x_coords = [ ] # fill in with x coordinates of points you want to have markers
plt.plot(x_coords, y_coords) # plots the line
#### TODO: plot the markers ####
Now, you have the x-values of the points you want to put markers on. How might you get their corresponding y-values? Well, you can make a function that searches for the index of the x-value in x_coords, and then gives the corresponding value at the same index of y_coords:
def getYVals(x_coords_lst, y_coords_lst, marker_x_coords_lst):
    marker_y_coords = [] # start with an empty list
    for x_point in marker_x_coords_lst:
        point_index = x_coords_lst.index(x_point) # get the index of a point in the x list
        marker_y_coords.append(y_coords_lst[point_index]) # add the value of the y list at that index to the list that will be returned
    return marker_y_coords
This isn't the fastest method, but it is the clearest about what is happening. Here's a more compact alternative that gives the same results using a "list comprehension". It still calls index once per marker, so any speed-up is modest:
def getYVals(x_coords_lst, y_coords_lst, marker_x_coords_lst):
    return [y_coords_lst[x_coords_lst.index(x_val)] for x_val in marker_x_coords_lst]
The output of either of the getYVals functions above will work as the y-values of the markers. Pass it as the y argument of plt.scatter together with the x-values you already have, and from there you should be good to go!
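Putting the pieces together with the data from the question, a minimal sketch could look like this. It swaps the repeated index search for a dictionary lookup, which is a different technique from the functions above: it skips marker x-values that are not on the line instead of raising an error, and if an x-value occurs more than once it keeps the last y-value rather than the first. The function name get_marker_points is made up for this example.

```python
def get_marker_points(x_coords, y_coords, marker_x_coords):
    """Return the (x, y) lists for the markers, skipping x-values that are not on the line."""
    y_by_x = dict(zip(x_coords, y_coords))  # one pass over the line's points
    xs = [x for x in marker_x_coords if x in y_by_x]
    ys = [y_by_x[x] for x in xs]
    return xs, ys

a = (1, 2, 3, 4, 5)
b = (1, 2, 3, 4, 5)
c = (2, 4)
marker_x, marker_y = get_marker_points(a, b, c)
print(marker_x, marker_y)  # [2, 4] [2, 4]

# With Matplotlib installed, the two lists plug straight into the plot:
# import matplotlib.pyplot as plt
# plt.plot(a, b)
# plt.scatter(marker_x, marker_y, color="red", zorder=3)
# plt.show()
```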

What is the best way to display numeric and symbolic expressions in python?

I need to produce calculation reports that detail step by step calculations, showing the formulas that are used and then showing how the results are achieved.
I have looked at using sympy to display symbolic equations. The problem is that a sympy symbol is stored as a variable, and therefore I cannot also store the numerical value of that symbol.
For example, for the formula σ=My/I , I need to show the value of each symbol, then the symbolic formula, then the formula with values substituted in, and finally the resolution of the formula.
M=100
y= 25
I=5
σ=My/I
σ=100*25/5
σ=500
I’m new to programming and this is something I’m struggling with. I’ve thought of perhaps building my own class, but I'm not sure how to distinguish between the different forms. In the example above, σ is at one point a numerical value, one half of a symbolic expression, and also one half of a numerical expression.
Hopefully the following helps. This produces more or less what you want. You cannot get your fifth line of workings easily as you'll see in the code.
from sympy import *
# define all variables needed
# trying to keep things clear that symbols are different from their numeric values
M_label, y_label, l_label = ("M", "y", "l")
M_symbol, y_symbol, l_symbol = symbols(f"{M_label} {y_label} {l_label}", real=True)
M_value, y_value, l_value = (100, 25, 5)
# define the dictionary whose keys are string names
# and whose values are a tuple of symbols and numerical values
symbols_values = {M_label: (M_symbol, M_value),
                  y_label: (y_symbol, y_value),
                  l_label: (l_symbol, l_value)}
for name, symbol_value in symbols_values.items():
    print(f"{name} = {symbol_value[1]}") # an f-string or formatted string
sigma = M_symbol * y_symbol / l_symbol
print(f"sigma = {sigma}")
# option 1
# changes `/5` to 5**(-1) since this is exactly how sympy views division
# credit for UnevaluatedExpr
# https://stackoverflow.com/questions/49842196/substitute-in-sympy-wihout-evaluating-or-simplifying-the-expression
sigma_substituted = sigma\
.subs(M_symbol, UnevaluatedExpr(M_value))\
.subs(y_symbol, UnevaluatedExpr(y_value))\
.subs(l_symbol, UnevaluatedExpr(l_value))
print(f"sigma = {sigma_substituted}")
# option 2
# using string substitution
# note this could replace words like `log`, `cos` or `exp` to something completely different
# this is why it is unadvised. The code above is far better for that purpose
sigma_substituted = str(sigma)\
.replace(M_label, str(M_value))\
.replace(y_label, str(y_value))\
.replace(l_label, str(l_value))
print(f"sigma = {sigma_substituted}")
sigma_simplified = sigma\
.subs(M_symbol, M_value)\
.subs(y_symbol, y_value)\
.subs(l_symbol, l_value)
print(f"sigma = {sigma_simplified}")
Also note that if you change the symbols_values dictionary so that the keys are the symbols and the values are the numbers, looking entries up can appear buggy. SymPy compares symbols by name and assumptions together, so Symbol("x") and Symbol("x", real=True) are two completely different variables even though they print identically. It is far easier to use strings as keys.
If you begin to use more variables and choose to work this way, I suggest using lists and for loops instead of writing the same code over and over.

matplotlib convert length of string to get coordinate units of fig.text

Basic questions:
Is there a way to convert the length of a string to coordinate units as used in the arguments in fig.text? If so, how?
More info:
In attempting to use fig.text like this:
fig.text(0.54,0.909,'Title Starts Here', ha="center", va="bottom", fontsize=12,color="black")
fig.text(0.7,0.909,'and Ends Here', ha="center", va="bottom", fontsize=12,color="red")
I think it is possible to make getting the correct text coordinates easier in the case of trying to create a centered title with multiple instances of fig.text, like above.
I would like to do the following to achieve this using the coordinate system and the length of the strings, but that's where I'm stuck.
I believe the length of a character does not equal a unit of the length of a coordinate. But if I could somehow do the following, I think it would work:
1) figwidth = 1.0 (the width available for the entire title).
2) Get the length of the entire string to be plotted on one line of the title, i.e. strlen = len('Title Starts Here and Ends Here').
3) Convert strlen to units of the coordinate system (strlenconv).
4) start = (figwidth/2) - (strlenconv/2). This tells me where to begin the first part of the text.
5) start + len('Title Starts Here ') (converted to coordinate units). This tells me where to start the second part of the text; note the extra space.
So, I think this can work if only I can get help with step 3.
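One possible way to approach step 3 is to let Matplotlib measure the rendered text rather than converting a character count, since characters in a proportional font do not all have the same width. The sketch below is only an outline under stated assumptions: the helper names left_edges, width_in_figure_coords and centered_title are invented here, the measurement depends on the figure size and DPI at the time it is taken, and whether a trailing space contributes to the measured width may depend on the backend, so the gap between the pieces is worth checking by eye.

```python
def left_edges(widths, total_width=1.0):
    """Left x-coordinate of each text piece so that the pieces, laid end to end, are centered."""
    start = total_width / 2 - sum(widths) / 2
    edges = []
    for width in widths:
        edges.append(start)
        start += width
    return edges

def width_in_figure_coords(fig, s, **text_kwargs):
    """Rendered width of s as a fraction of the figure width (needs Matplotlib)."""
    t = fig.text(0, 0, s, **text_kwargs)
    fig.canvas.draw()             # make sure the text has been laid out by a renderer
    bbox = t.get_window_extent()  # bounding box in display (pixel) units
    t.remove()                    # the probe text is only needed for measuring
    return bbox.transformed(fig.transFigure.inverted()).width

def centered_title(fig, pieces, y=0.909, fontsize=12):
    """Draw (string, color) pieces side by side so that the whole line is centered."""
    widths = [width_in_figure_coords(fig, s, fontsize=fontsize) for s, _ in pieces]
    for (s, color), x in zip(pieces, left_edges(widths)):
        # ha="left" because each x is the left edge of its piece
        fig.text(x, y, s, ha="left", va="bottom", fontsize=fontsize, color=color)
```

For the title in the question this would be called as centered_title(fig, [('Title Starts Here ', 'black'), ('and Ends Here', 'red')]).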

Sort arrays by two criteria

My figure has a very large legend, and to make it easier to find each corresponding line, I want to sort the legend by the y value of the line at the last datapoint.
plots[] is a list of Line2D objects,
labels[] holds the corresponding label for each Line2D object, generated through labels = [plot._label for plot in plots]
I want to sort each/both arrays by plots._y[-1], the value of y at the last point
Bonus points if I can also sort first by _linestyle (a string) and then by the y value.
I am unsure of how to do this well, I wouldn't think it would require a loop, but it might because I am sorting by 2 criteria, one of which will be tricky to deal with (':' and '-' are the values of linestyle). Is there a function that can help me out here?
edit: it just occurred to me that I can generate labels after I sort, so that uncomplicates things a bit. However, I still have to sort plots by each object's linestyle and y[-1] value.
I believe this may work:
sorted(plots, key = lambda plot :(plot._linestyle, plot._y[-1]))
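The same idea can be written with Line2D's public getters (get_linestyle, get_ydata, get_label) instead of the underscore-prefixed attributes, which are private and could change between versions. This is a sketch, not a tested recipe for every figure: the function names are invented here, and it assumes every line has at least one data point.

```python
def sort_lines_for_legend(lines):
    """Sort by linestyle first, then by the y-value at the last data point (ascending)."""
    return sorted(lines, key=lambda line: (line.get_linestyle(), line.get_ydata()[-1]))

def sorted_legend(ax):
    """Rebuild the legend of a Matplotlib Axes from its lines in sorted order."""
    lines = sort_lines_for_legend(ax.get_lines())
    # The labels are generated after sorting, so they always match their lines.
    ax.legend(lines, [line.get_label() for line in lines])
```

Python compares the key tuples element by element, which is what gives the two-level ordering without an explicit loop. The string '-' sorts before ':', so solid lines come first.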

Matplotlib (pylab): how to specify a substituted html colour code for plotting?

I'm running into a problem which I suspect is related to order of substitution in the parser.
From the matplotlib docs, it has:
"For a greater range of colors, you have two options. You can specify the color using an html hex string, as in:
color = '#eeefff'"
I like this method because it gives me colour values that are (for me) easy to understand - and thus easy to set programmatically to colour choices that I want to achieve. So, I have a snippet of code, that, for instance, cycles between increasingly bright values of green (or at least that's what I would like it to achieve):
for t in range(runtime):
    ...
    green = 0x008000
    if t < 32:
        x_vals = [x for line in film.frames[t] for x in range(len(line)) if line[x] > 0]
        y_vals = [y for y in range(len(film.frames[t])) for x in film.frames[t][y] if x > 0]
        if len(y_vals) > 0:
            pylab.scatter(x_vals, y_vals, color=('#%x' % green), s=8, marker='s')
        green += 0x400
        if green > 0x00FF00: green = 0x00FF00
...except that this doesn't work. What I get is:
ValueError: to_rgba: Invalid rgba arg "#"
to_rgb: Invalid rgb arg "#"
invalid hex color string "#"
As I mentioned this looks like one of those arcane order-of-substitution issues because if I type a literal
pylab.scatter(x_vals, y_vals, color='#008000', s=8, marker='s')
things seem to work.
So how do I get a substituted html colour value into the plot function? It seems as though it should be a simple matter of exactly the right syntactic form - but I've not been able to guess what that is yet.
One thing I ABSOLUTELY DO NOT!!! want to end up having to do is using the 'alternative' method of floating-point tuples, from the documentation:
or you can pass an R , G , B tuple, where each of R , G , B are in the
range [0,1].
because:
1) Floating-point values are always something of an approximation, thus mapped into an integral RGB colour space there is a chance if your computations round in poor manner to get an imprecise colour - not quite the one you selected (or wished to select)
2) The hex triplet method of colour specification is, at least to me, quite intuitive and obvious. While it's easy enough to think about how they would map into the 0-1 colour space, I'd rather not have to resort to that method because it means a "translation" in my head - as well as in the code - you'd have to do, e.g. green = hex_green/0x100 (or indeed is that right?) Does the [0,1] space actually map as [0,1) - so that hex 255 is one less than what would map to notional float 1, or do they map 0,1 in a space mapping from 0-255 as truly 256 values over the closed range [0,1] so that 1=255 etc. etc.? So the mapping is not intuitive and in fact is somewhat ambiguous.
So unless there is really NO option but to use a triplet form if you want to use a computed substitution, if it really is the case that the low-level design of the command parser permits no option other than a text literal, if you want to use the html colour specification, I'd rather not be offered 'solutions' using such a workaround. But by all means if this is the only way it can be done at all, please let me know.
Thanks.
The problem seems to be the missing leading zeros. Python drops zeros that do not affect the numerical value, so '#%x' % green produces '#8000' rather than '#008000'.
Use '{0:06x}'.format(green) to always pad with zeros to a width of 6 digits.
In [221]: '{0:x}'.format(green)
Out[221]: '8000'
In [222]: '{0:06x}'.format(green)
Out[222]: '008000'
'%06x' % green also works if you prefer the % syntax.
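Applied to the loop from the question, the padding can be wrapped in a small helper that also adds the leading '#'. This is a sketch: the name hex_color and the range check are additions for illustration, not part of Matplotlib.

```python
def hex_color(value):
    """Turn an integer in 0x000000..0xFFFFFF into a '#rrggbb' string."""
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError("colour value out of range: %r" % (value,))
    return "#{:06x}".format(value)

green = 0x008000
shades = []
for _ in range(4):
    shades.append(hex_color(green))
    green = min(green + 0x400, 0x00FF00)  # step the green channel, capped at full green

print(shades)  # ['#008000', '#008400', '#008800', '#008c00']
```

In the original loop the call would then read pylab.scatter(x_vals, y_vals, color=hex_color(green), s=8, marker='s').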
