How to prevent x-axis values ranging from least to greatest? - python

I can't stop the x values from being ordered from least to greatest, but I need them in a specific order. Is there a way to do this in Python?
This is what the order of the x values needs to be instead.

You can plot them with "default" x values and change the tick labels.
plt.plot(Y)
plt.xticks(ticks=range(len(Y)), labels=X) # where X is your list with the order you want
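A fuller sketch of the same idea, in case it helps. The X and Y values here are made up, and the Agg backend is only an assumption so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; drop this for interactive use
import matplotlib.pyplot as plt

# Hypothetical data: X holds the labels in the order they should appear.
X = ["30", "10", "40", "20"]
Y = [1.0, 2.5, 2.0, 3.5]

fig, ax = plt.subplots()
positions = list(range(len(Y)))  # evenly spaced "default" x positions 0, 1, 2, ...
ax.plot(positions, Y, marker="o")
ax.set_xticks(positions)         # one tick per data point
ax.set_xticklabels(X)            # label the ticks in the custom order
```

The points are plotted at positions 0 to 3, so Matplotlib never sees the real x values and has nothing to sort; only the tick text carries them.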

Related

Python - How to get integrated values from np.array?

I am trying to get integrated values from np.array, list of values. Not the surface under the function, but values. I have values of acceleration and want to get values of velocity.
So let's say I have an array like:
a_x = np.array([111.2, 323.2, 123.3, 99.38, 65.23, -0.19, -34.67])
And I try to get integrated values from this array to get the values of velocity.
If I use lets say simps, quad, trapz, I get the one number (surface).
So how do you integrate np.array values and get integrated values that you can store in a list?
You can't do it quite the way you want, because of how the process behind it works. If you are given an acceleration, then velocity follows from the equation:
v(t) = integral of a(t) dt + c
This only gives you an INDEFINITE integral: you know the acceleration, but you don't know the starting conditions, so the constant c (the initial velocity) stays undetermined.
If your acceleration is constant, so that it doesn't depend on t, this can be written as v(t) = at + c, but we still don't know how long the acceleration lasted or what the starting condition is.
But answering the question about getting values which can be stored in a list - you do it by indexing your values of np.array:
import numpy as np
a_x = np.array([111.2,323.2,123.3])
#Gets first value
print(a_x[0])
If I use lets say simps, quad, trapz, I get the one number (surface).
Because quad, simps, and trapz are methods that take a function or a set of points and return the value of the definite integral over them, each using its own rule, for example:
numpy.trapz(y, x=None, dx=1.0, axis=-1)
If x isn't specified (as in your case), it assumes the points in y are evenly spaced, dx apart, and uses the trapezoidal rule to estimate the area under them. It has to give one value.
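If what you are after is one velocity value per sample, a running (cumulative) trapezoidal sum does that. This is only a sketch: the time step dt and the initial velocity v0 are assumptions, since neither is given in the question.

```python
import numpy as np

a_x = np.array([111.2, 323.2, 123.3, 99.38, 65.23, -0.19, -34.67])
dt = 1.0  # assumption: constant time step between samples
v0 = 0.0  # assumption: initial velocity (the constant c)

# Area of each trapezoid between two consecutive acceleration samples.
increments = 0.5 * (a_x[1:] + a_x[:-1]) * dt

# Running sum of those areas, starting from the initial velocity.
v_x = v0 + np.concatenate(([0.0], np.cumsum(increments)))

velocities = v_x.tolist()  # plain Python list, one velocity per sample
print(velocities)
```

The last element is the single number the trapezoidal rule returns for the whole array (plus v0). If SciPy is installed, scipy.integrate.cumulative_trapezoid computes the same kind of running integral.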

Matplotlib Ticker

Can someone give me an example of how to use the following tickFormatters. The docs are uninformative to me.
ticker.StrMethodFormatter()
ticker.IndexFormatter()
for example I might think that
x = np.array([ 316566.962, 294789.545, 490032.382, 681004.044, 753757.024,
385283.153, 651498.538, 937628.225, 199561.358, 601465.455])
y = np.array([ 208.075, 262.099, 550.066, 633.525, 612.804, 884.785,
862.219, 349.805, 279.964, 500.612])
money_formatter = tkr.StrMethodFormatter('${:,}')
plt.scatter(x,y)
ax = plt.gca()
fmtr = ticker.StrMethodFormatter('${:,}')
ax.xaxis.set_major_formatter(fmtr)
would set my tick labels to be dollar signed and comma sep for thousands places ala
['$300,000', '$400,000', '$500,000', '$600,000', '$700,000', '$800,000', '$900,000']
but instead I get an index error.
IndexError: tuple index out of range
For IndexFormatter docs say:
Set the strings from a list of labels
I don't really know what this means, and when I try to use it my ticks disappear.
The StrMethodFormatter works indeed by supplying a string that can be formatted using the format method. So the approach of using '${:,}' goes in the right direction.
However from the documentation we learn
The field used for the value must be labeled x and the field used for the position must be labeled pos.
This means that you need to give an actual label x to the field. Additionally you may want to specify the number format as g not to have the decimal point.
fmtr = matplotlib.ticker.StrMethodFormatter('${x:,g}')
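The IndexError can be reproduced without Matplotlib, on the understanding that the formatter fills the string through str.format with the keyword arguments x and pos. The tick value and position below are made up:

```python
value, position = 300000.0, 0  # hypothetical tick value and tick position

# The question's format string has an unnamed field, which asks for a
# positional argument, but only keyword arguments are supplied.
try:
    '${:,}'.format(x=value, pos=position)
    unnamed_field_failed = False
except IndexError:
    unnamed_field_failed = True

# Naming the field x picks up the tick value; ',' adds the thousands
# separators and 'g' drops the trailing '.0'.
label = '${x:,g}'.format(x=value, pos=position)
print(unnamed_field_failed, label)
```

One thing to watch with g: it switches to exponent notation from a million upwards (1000000.0 is rendered as 1e+06), so for large tick values a spec such as '${x:,.0f}' may be the safer choice.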
The IndexFormatter is of little use here. As you found out, you would need to provide a list of labels. Those labels are looked up by index, starting at 0. So using this formatter would require the x axis to start at zero and range over whole numbers.
Example:
plt.scatter(range(len(y)),y)
ax = plt.gca()
fmtr = matplotlib.ticker.IndexFormatter(list("ABCDEFGHIJ"))
ax.xaxis.set_major_formatter(fmtr)
Here, the ticks are placed at (0,2,4,6,....) and the respective letters from the list (A, C, E, G, I, ...) are used as labels.
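As far as I know, IndexFormatter was deprecated in later Matplotlib releases, so it may be missing from a current install. The same lookup can be sketched with FuncFormatter; the helper name index_label is made up for this example:

```python
from matplotlib.ticker import FuncFormatter

labels = list("ABCDEFGHIJ")

def index_label(x, pos=None):
    """Return the label whose index is nearest to tick value x, or '' if out of range."""
    i = int(round(x))
    return labels[i] if 0 <= i < len(labels) else ""

fmtr = FuncFormatter(index_label)
# On an existing Axes object ax: ax.xaxis.set_major_formatter(fmtr)
```

Ticks that fall outside the list get an empty label, which is why the tick labels can appear to vanish when the x values are large numbers rather than small indices.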

Getting data of a box plot - Matplotlib

I have to plot a boxplot of some data, which I could easily do with Matplotlib. However, I was requested to provide a table with the data presented there, like the whiskers, the medians, standard deviation, and so on.
I know that I could calculate these "by hand", but I also know, from the reference, that the boxplot method:
Returns a dictionary mapping each component of the boxplot to a list of the matplotlib.lines.Line2D instances created. That dictionary has the following keys (assuming vertical boxplots):
boxes: the main body of the boxplot showing the quartiles and the median’s confidence intervals if enabled.
medians: horizontal lines at the median of each box.
whiskers: the vertical lines extending to the most extreme, non-outlier data points.
caps: the horizontal lines at the ends of the whiskers.
fliers: points representing data that extend beyond the whiskers (outliers).
So I'm wondering how could I get these values, since they are matplotlib.lines.Line2D.
Thank you.
As you've figured out, you need to access the members of the return value of boxplot.
Namely, e.g. if your return value is stored in bp
bp['medians'][0].get_ydata()
>> array([ 2.5, 2.5])
As the boxplot is vertical, and the median line is therefore a horizontal line, you only need to focus on one of the y-values; i.e. the median is 2.5 for my sample data.
For each "key" in the dictionary, the value will be a list to handle for multiple boxes. If you have just one boxplot, the list will only have one element, hence my use of bp['medians'][0] above.
If you have multiple boxes in your boxplot, you will need to iterate over them using e.g.
for medline in bp['medians']:
    linedata = medline.get_ydata()
    median = linedata[0]
CT Zhu's answer doesn't work unfortunately, as the different elements behave differently. Also e.g. there's only one median, but two whiskers...therefore it's safest to manually treat each quantity as outlined above.
NB the closest you can come is the following;
res = {}
for key, value in bp.items():
    res[key] = [v.get_data() for v in value]
or equivalently
res = {key : [v.get_data() for v in value] for key, value in bp.items()}
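Putting the pieces together, here is a rough sketch that collects one table row per box. It assumes a vertical boxplot with the default caps and fliers shown, made-up sample data, and the Agg backend so it runs without a display. It also relies on there being two caps per box, lower first. The standard deviation is not drawn on a boxplot, so that would still have to be computed from the data itself.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import matplotlib.pyplot as plt

data = [1, 2, 3, 4, 100]  # hypothetical sample with one obvious outlier

fig, ax = plt.subplots()
bp = ax.boxplot(data)

rows = []
for i, medline in enumerate(bp['medians']):
    # Two caps per box: the lower one first, then the upper one.
    lower_cap, upper_cap = bp['caps'][2 * i], bp['caps'][2 * i + 1]
    rows.append({
        'median': float(medline.get_ydata()[0]),
        'whisker_low': float(lower_cap.get_ydata()[0]),
        'whisker_high': float(upper_cap.get_ydata()[0]),
        'fliers': [float(v) for v in bp['fliers'][i].get_ydata()],
    })

print(rows)
```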

Sort arrays by two criteria

My figure has a very large legend, and to make it easier to find each corresponding line, I want to sort the legend by the y value of the line at the last datapoint.
plots[] contains a list of Line2D objects,
labels[] is the corresponding labels to each Line2D object, generated through labels = [plot._label for plot in plots]
I want to sort each/both arrays by plots._y[-1], the value of y at the last point
Bonus points if I can also sort first by _linestyle (a string) and then by the y value.
I am unsure of how to do this well, I wouldn't think it would require a loop, but it might because I am sorting by 2 criteria, one of which will be tricky to deal with (':' and '-' are the values of linestyle). Is there a function that can help me out here?
edit: it just occurred to me that I can generate labels after I sort, so that uncomplicates things a bit. However, I still have to sort plots by each object's linestyle and y[-1] value.
I believe this may work:
sorted(plots, key = lambda plot :(plot._linestyle, plot._y[-1]))
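The same sort can be written with the public getters instead of the underscore attributes. A small sketch with made-up lines (the data and labels are hypothetical):

```python
from matplotlib.lines import Line2D

# Hypothetical lines standing in for the plots list from the question.
plots = [
    Line2D([0, 1], [0, 5], linestyle='-', label='a'),
    Line2D([0, 1], [0, 1], linestyle=':', label='b'),
    Line2D([0, 1], [0, 2], linestyle='-', label='c'),
    Line2D([0, 1], [0, 3], linestyle=':', label='d'),
]

# Tuples compare element by element: linestyle first, then the last y value.
plots_sorted = sorted(
    plots,
    key=lambda line: (line.get_linestyle(), line.get_ydata()[-1]),
)

# Build the labels after sorting so both lists stay in step.
labels = [line.get_label() for line in plots_sorted]
print(labels)
```

Note that '-' sorts before ':' simply because of character order; if you want dotted lines first, map the linestyle to a number in the key instead.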

Most Efficient Way to Automate Grouping of List Entries

Background: I have a very large list of 3D Cartesian coordinates, and I need to process this list to group the coordinates by their Z coordinate (i.e. all coordinates in that plane). Currently, I manually create groups from the list using a loop for each Z coordinate, but now that there are dozens of possible Z coordinates (I was previously handling only 2-3 planes) this becomes impractical. I know how to group lists based on like elements, of course, but I am looking for a method to automate this process for n possible values of Z.
Question: What's the most efficient way to automate the process of grouping list elements with the same Z coordinate and then create a unique list for each plane?
Code Snippet:
I'm just using a simple list comprehension to group individual planes:
newlist=[x for x in coordinates_xyz if insert_possible_Z in x]
I'm looking for it to automatically make a new unique list for every Z plane in the data set.
Data Format:
((x1,y1,0), (x2, y2, 0), ... (xn, yn, 0), (xn+1,yn+1, 50),(xn+2,yn+2, 50), ... (x2n+1,y2n+1, 100), (x2n+2,y2n+2, 100)...) etc.
I want to automatically get all coordinates where Z=0, Z=50, Z=100, etc. Note that the value of Z (increments of 50) is an example only; the actual data can have any value.
Notes: My data is imported either from a file or generated by a separate module in lists. This is necessary for interfacing with another program (that I have not written).
The most efficient way to group elements by Z and make a list of them so grouped is to not make a list.
itertools.groupby does the grouping you want without the overhead of creating new lists, provided the data is already sorted by Z, since it only groups consecutive items.
Python generators take a little getting used to when you aren't familiar with the general mechanism. The official generator documentation is a good starting point for learning why they are useful.
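A minimal sketch of the groupby approach, with made-up coordinates. The data is sorted by Z first, and the groups are turned into lists only so they can be printed; you can iterate over each group lazily instead.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical data: (x, y, z) tuples in no particular order.
coordinates_xyz = [(1, 2, 0), (3, 4, 50), (5, 6, 0), (7, 8, 100), (9, 10, 50)]

get_z = itemgetter(2)

# groupby only merges neighbouring items, so sort by Z before grouping.
by_z = sorted(coordinates_xyz, key=get_z)

planes = {z: list(points) for z, points in groupby(by_z, key=get_z)}
print(planes)
```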
If I am interpreting this correctly, you have a set of coordinates C = (X,Y,Z) with a discrete number of Z values. If this is the case, why not use a dictionary to associate a list of the coordinates with the associated Z value as a key?
Your data structure would look something like:
z_ordered = {}
z_ordered[3] = [(x1,y1,z1),(x2,y2,z2),(x3,y3,z3)]
Where each list associated with a key has the same Z-value.
Of course, if your Z-values are continuous, you may need to modify this, say by making the key only the whole number associated with a Z-value, so you are binning in increments of 1.
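As a rough sketch of the dictionary idea, including the binning in increments of 1 for continuous Z values (the coordinates below are made up):

```python
import math
from collections import defaultdict

# Hypothetical data with non-integer Z values.
coordinates_xyz = [
    (0.0, 0.0, 0.2),
    (1.0, 1.0, 0.9),
    (2.0, 2.0, 1.5),
    (3.0, 3.0, 50.0),
]

z_ordered = defaultdict(list)
for point in coordinates_xyz:
    key = math.floor(point[2])  # whole-number part of Z is the bin
    z_ordered[key].append(point)

print(dict(z_ordered))
```

This makes a single pass over the data and needs neither sorting nor knowing the Z values in advance. For exact plane values such as 0, 50, 100, use point[2] itself as the key.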
So this is the simple solution I came up with:
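groups = []
No_Planes = ...  # set to the number of planes
dz = ...  # set to the Z spacing
for i in range(No_Planes):
    # compare only the Z component (index 2), so a matching x or y value is not picked up
    newlist = [x for x in coordinates_xyz if x[2] == i*dz]
    groups.append(newlist)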
This lets me manipulate any plane within my data set simply with groups[i]. I can also change my spacing. This is an extension of my existing code: after reading @msw's response about itertools, I realised that looping through my current method was staring me in the face, and was far simpler than I imagined!
