I have defined a function to plot a histogram. Inside this function I am doing some analysis of the data which I obtain from 2 clicks on the figure.
My code is below:
def hist_maker():
    heights, edges = np.histogram(data, 1000)
    edges = edges[:-1] + (edges[1] - edges[0])
    fig, ax = plt.subplots()
    ax.plot(edges, heights)  # plot histogram
    plt.yscale('log', nonposy='clip')
    ax.set(title=titl, xlabel='ADC Value(DN/40)', ylabel='Frequency')
    point1, point2 = fig.ginput(2)  # get input from 2 clicks on figure
    ax.axvspan(point1[0], point2[0], color='red', alpha=0.5)  # paint selected area in red
    mask = (edges > point1[0]) & (edges < point2[0])
    # calculate which values are selected and display mean
    fig.text(0.2, 0.84, 'Mean: ' + str((sum(edges[mask]*heights[mask])/sum(heights[mask]))))
    mean = sum(edges[mask]*heights[mask])/sum(heights[mask])
    mean_noise = edges[heights.argmax()]  # Find the x value corresponding to the max y value
    fig.text(0.2, 0.8, 'Std: ' + str(g))
Everything inside the function works fine. But if, for example, I want to use the calculated mean_noise later in the code, I get an error saying that mean_noise is not defined (which is correct, because it isn't defined outside of the function).
So my question is: how do I extract the value of mean_noise that is calculated when hist_maker runs, so that I can use it later on?
One way around this is to get rid of the function hist_maker and just repeat the code inside for each histogram I am plotting which I'm sure would work. But as I am plotting multiple histograms I thought it would be easier to define a function and then just keep calling that for each histogram.
Simplest solution - the first line of your function should be:
global mean_noise
If you then run (outside the function):
hist_maker()
print(mean_noise)
The print should work. If you reverse the order of those two lines, you'll get a NameError, because mean_noise does not exist until hist_maker has run.
Note, though, that this is generally not considered good programming. The generally preferred solution is to return mean_noise at the end of your function.
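A minimal sketch of the return-based approach, assuming the analysis can be separated from the plotting. The name hist_stats and its lo/hi parameters are hypothetical: lo and hi stand in for the x values of the two clicks obtained with fig.ginput(2), and the plotting calls are left out to keep the example short. The sample data is synthetic.

```python
import numpy as np


def hist_stats(data, lo, hi, bins=1000):
    """Return (mean, mean_noise) for the histogram of data between lo and hi.

    lo and hi are hypothetical stand-ins for point1[0] and point2[0].
    """
    heights, edges = np.histogram(data, bins)
    xs = edges[:-1] + (edges[1] - edges[0])  # same shift as in the question
    mask = (xs > lo) & (xs < hi)
    mean = np.sum(xs[mask] * heights[mask]) / np.sum(heights[mask])
    mean_noise = xs[heights.argmax()]  # x value of the tallest bin
    return mean, mean_noise


# Synthetic data, only for illustration
rng = np.random.default_rng(0)
data = rng.normal(100.0, 5.0, 10_000)

# The returned values are ordinary variables in the calling scope
mean, mean_noise = hist_stats(data, 90.0, 110.0)
print(mean, mean_noise)
```

Because the values come back as a return value, the function can be called once per histogram and each result kept under its own name, without any global state.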
I am writing some simple scripts to plot a graph given a trigonometric function (in this example, a sine).
My issue is that I'd like to plot JUST two periods of the given trig function. To clarify, in trigonometry a Period is the length (on a graph) that ONE wave takes up. For sin and cos, one period is 2pi.
I'd like to take my existing code, and (preferably) using matplotlib, plot two periods of my given trig function, and line up a couple of points on the graph with a couple of points on my function.
If it's possible, I would like to be able to plot my function so that the start of the first period lines up with my first label, the highest point of the first period lines up with the second label, the point where my function crosses the x-axis with the third label, the lowest point with the fourth label, and the end of my first period/beginning of my second period with the fifth label. This pattern would then repeat for the second period. From here on, I'm going to refer to the x labels as the "Period Markings".
I've come up with three possible solutions for this:
I could set the borders of my graph (in this case x = -4 and x = 4) to be labeled as the first and ninth Period Markings respectively, then constrain my function to just be within the graph somehow.
I could somehow set a parameter in matplotlib to only plot 4pi (the length of two periods) units' worth of line, although in that case I don't think that the Period Markings would match up with their desired points.
If matplotlib supports it, I could find the low points, x-intercepts, and high points of the graph, then assign my Period Markers to each one from left to right. This would have the advantage of removing the necessity to plot ONLY two periods, as the Period Markers would dictate the beginning and end of the two periods.
Below I've inserted a couple of things:
A copy of the plotting part of my code, containing a sample equation and some sample Period Markings
A screenshot of the graph of the given sample equation
A visual representation of where each Period Marking would line up with, ideally, as well as a line demarcating an estimation of two full periods.
The standard form of a sin function is y = aSIN(bx-c)+d. The equation here is just sin(x), but you can see how variables c and d play a role in determining the graph. Usually, the xlabels array would be filled in with variables that are determined earlier in the script, as would all the variables at the top (func, a, b, c, d).
import math
import matplotlib.pyplot as plt
import numpy as np
func = 'sin'
a = 1
b = 1
c = 0
d = 0
xlabels = np.array(['-2pi', '-3pi/2', '-pi', '-pi/2',
                    '0', 'pi/2', 'pi', '3pi/2', '2pi'])
xlabelcount = -4, -3, -2, -1, 0, 1, 2, 3, 4
x = np.arange(-4, 4, 0.01)
if func == 'sin':
    ypoints = a*np.sin(2*x-c)+d
if func == 'cos':
    ypoints = a*np.cos(2*x+c)+d
if b < 0:
    plt.gca().invert_yaxis()
plt.title('Wave Function')
plt.xlabel('Period (Not to Scale)')
plt.ylabel('Amplitude')
plt.grid(True, which='both')
plt.axhline(y=0, color='k')
plt.plot(x, ypoints)
plt.xticks(ticks=xlabelcount, labels=xlabels)
plt.show()
[Figure: Plot of sin(x)]
[Figure: Preferred Period Marking placements]
I hope this can provide a comprehensive understanding of the issue I face, and any help would be greatly appreciated. I feel that I've done a fair amount of Googling around, but nothing has yielded a good answer. I apologize in advance if I'm missing something really obvious.
Thanks,
dreadlearner
If I understand this correctly, you would like to add points on the curve at certain predefined locations on x-axis (period markings). If this is correct, the best way is to evaluate the value of the function at those particular "period markings" and plot this as a single point. Something like:
import matplotlib.pyplot as plt
import numpy as np
fn = "sin"
if fn == "sin":
    fn = np.sin
elif fn == "cos":
    fn = np.cos
# if required, the next three statements can be
# customized for each function by shifting them
# inside the if ... else blocks
x = np.linspace(-2*np.pi, 2*np.pi, 1000)
points = [i * np.pi/2 for i in range(-4, 5)]
labels = ["-2π", "-3π/2", "-π", "-π/2", "0", "π/2", "π", "3π/2", "2π"]
fig, ax = plt.subplots()
ax.plot(x, fn(x))
ax.set_xticks(points)
ax.set_xticklabels(labels)
# the next line is what you probably want
for pt in points:
    ax.plot(pt, fn(pt), "ok")
ax.hlines(0, x[0], x[-1], colors="r")
plt.show()
[Figure: the resulting plot]
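For the "exactly two periods" part of the question, one possible sketch is to derive the plotting range from the standard form y = a*sin(b*x - c) + d: one period is 2π/|b| long and starts at x = c/b, so the nine Period Markings sit a quarter period apart. The helper name two_period_marks is hypothetical, the sketch assumes b > 0, and the labels shown are only correct for b = 1, c = 0.

```python
import matplotlib.pyplot as plt
import numpy as np


def two_period_marks(b, c, periods=2):
    """Return the x positions of the quarter-period marks (hypothetical helper).

    Assumes b > 0, so the first period starts at x = c / b.
    """
    period = 2 * np.pi / abs(b)
    start = c / b
    return start + np.arange(4 * periods + 1) * period / 4


a, b, c, d = 1, 1, 0, 0
marks = two_period_marks(b, c)
y_marks = a * np.sin(b * marks - c) + d

# Plot only the span covered by the marks, i.e. exactly two periods
x = np.linspace(marks[0], marks[-1], 1000)
fig, ax = plt.subplots()
ax.plot(x, a * np.sin(b * x - c) + d)
ax.plot(marks, y_marks, "ok")  # one dot per Period Marking
ax.axhline(d, color="r")       # midline of the wave
ax.set_xticks(marks)
ax.set_xticklabels(["0", "π/2", "π", "3π/2", "2π", "5π/2", "3π", "7π/2", "4π"])
# plt.show()  # uncomment when running interactively
```

With this layout the first mark is the start of the period, the second the maximum, the third the midline crossing, the fourth the minimum, and the fifth the end of the first period, after which the pattern repeats.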
There is a for loop in my code, and at every step it generates a new tpr (as X) and fpr (as Y), like this:
0.05263157894736842 0.1896551724137931
0.06578947368421052 0.19540229885057472
0.07894736842105263 0.22988505747126436
0.07894736842105263 0.25862068965517243
0.07894736842105263 0.28735632183908044
I want to collect all these points and get a full plot, but it didn't work. My code is attached below:
for i in range(-30, 20):
    predicted = (np.sign(t+i*1e-4)+1)/2.
    vals, cm = re.get_CM_vals(y_test, predicted)
    tpr = re.TPR_CM(cm)
    fpr = re.FPR_CM(cm)
    #print(tpr, fpr)
    plt.plot(fpr, tpr, 'b.-', linewidth=1)
plt.show()
Besides, I want right-angle (step-like) lines between the points. Is there a function for this in matplotlib?
Using your current code, I suggest adding the x values to an array and the y values to another array. You could also use something like: ArrayName = [[],[]], then append the x and y values to ArrayName[0] and ArrayName[1], respectively. Not only would this actually work, but it would be slightly faster, since the plt.plot and plt.scatter functions work faster plotting all the points at once instead of through a for loop.
If you don't want to plot the points connected with lines, I still suggest using an array since that would be faster. (It wouldn't be much faster in this case, but it's a good habit to have.)
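A minimal sketch of that suggestion, using synthetic scores and labels because the question's re module and its data are not shown. The function roc_points is hypothetical and computes the rates directly instead of going through a confusion matrix. For the right-angle lines, matplotlib's step method is used here in place of plot.

```python
import matplotlib.pyplot as plt
import numpy as np


def roc_points(scores, labels, offsets):
    """Collect one (fpr, tpr) pair per threshold offset (hypothetical helper)."""
    positives = labels == 1
    negatives = ~positives
    fprs, tprs = [], []
    for offset in offsets:
        predicted = (np.sign(scores + offset) + 1) / 2  # 1 where score + offset > 0
        tprs.append(np.mean(predicted[positives] == 1))
        fprs.append(np.mean(predicted[negatives] == 1))
    return np.array(fprs), np.array(tprs)


# Synthetic data, only for illustration
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
scores = (labels - 0.5) + rng.normal(0.0, 0.5, size=200)
offsets = np.linspace(-1.0, 1.0, 50)

fprs, tprs = roc_points(scores, labels, offsets)

# One call draws all points, joined by horizontal-then-vertical segments
fig, ax = plt.subplots()
ax.step(fprs, tprs, "b.-", where="post", linewidth=1)
ax.set(xlabel="FPR", ylabel="TPR")
# plt.show()  # uncomment when running interactively
```

The key change from the loop in the question is that the lists are filled inside the loop and the plotting call happens once afterwards, so the points end up connected.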
The Problem:
Using NumPy, I have created an array of random points within a range.
import numpy as np
min_square = 5
positions = (np.random.random(size=(100, 2)) - 0.5) * 2 * container_radius
Where container_radius is an integer and min_square is an integer.
Following that, using matplotlib, I plot the points on a graph.
import matplotlib.pyplot as plt
plt.plot(positions[:, 0], positions[:, 1], 'r.')
plt.show()
This graph shows me the distribution of the points in relation to each other.
What I am looking for is a method to implement something similar to or exactly a k-d tree to draw a rectangle over the densest area of the scatter plot with a defined minimum for the size.
This would be done using plt.gca().add_patch(plt.Rectangle((x, y), width=square_side, height=square_side, fill=None)), where square_side is defined by the density function and is at least min_square.
Attempts to Solve the Problem:
So far, I have created my own sort of density function that is within my understanding of Python and easy enough to code without lagging my computer too hard.
The solution comes in the form of creating an additional predefined variable, intervals, which is an integer.
Using what I had so far, I define a function to calculate the densities by checking if the points are within a range of floats.
# clb stands for calculate_lower_bound
def clb(x):
    return -1 * container_radius + (x * 2 * container_radius - min_square) / (intervals - 1)
# crd stands for calculate_regional_density
def crd(x, y):
    return np.where(np.logical_and(
        np.logical_and(positions[:, 0] >= clb(x), positions[:, 0] < clb(x) + min_square),
        np.logical_and(positions[:, 1] >= clb(y), positions[:, 1] < clb(y) + min_square)))[0].shape[0]
Then, I create a NumPy array of size size=(intervals, intervals) and pass the indices of the array (I have another question about this as I am currently using a quite inefficient method) as inputs into crd(x,y) and store the values in another array called densities. Then using some method, I calculate the maximum value in my densities array and draw the rectangle using some pretty straightforward code that I do not think is necessary to include here as it is not the problem.
What I Am Looking For:
I am looking for some function, f(x), that computes the dimensions and coordinates of a square encompassing the densest region on a scatterplot graph. The function would have access to all the variables it needs such as positions, min_square, etc. If you could use informative variable names or explain what each variable means, that would be a great help as well.
Other (Potentially) Important Notes:
I am looking for something that gets the job done in a reasonable time. In most scenarios, I am going to be working with around 10000 points and I need to calculate the densest region around 100 times so the function needs to be efficient enough so that the task completes within around 10-20 seconds.
As such, approximations using formulas like the example I have shown are completely valid as long as they implement well and are able to grow the dimensions of the square larger if necessary.
Thanks!
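One way the grid-based idea described above could be vectorized is sketched below. It is an approximation, not a k-d tree: the points are binned into a fine grid with np.histogram2d, and a summed-area table gives the point count of every window of min_square width in one pass. The function name densest_square and the cells_per_side parameter are hypothetical, the square is not grown beyond the minimum size, and the test data is synthetic.

```python
import numpy as np


def densest_square(positions, container_radius, min_square, cells_per_side=100):
    """Approximate the densest square of side >= min_square (hypothetical helper).

    Returns (x, y, side, count), where (x, y) is the lower-left corner.
    """
    span = (-container_radius, container_radius)
    counts, xedges, yedges = np.histogram2d(
        positions[:, 0], positions[:, 1], bins=cells_per_side, range=[span, span]
    )
    cell = xedges[1] - xedges[0]
    # Window width in cells, rounded up so the square is at least min_square wide
    k = min(cells_per_side, max(1, int(np.ceil(min_square / cell))))
    # Summed-area table with a zero border: any k-by-k window sum is four lookups
    sat = np.zeros((cells_per_side + 1, cells_per_side + 1))
    sat[1:, 1:] = counts.cumsum(axis=0).cumsum(axis=1)
    window = sat[k:, k:] - sat[:-k, k:] - sat[k:, :-k] + sat[:-k, :-k]
    i, j = np.unravel_index(window.argmax(), window.shape)
    return xedges[i], yedges[j], k * cell, int(window[i, j])


# Synthetic data: a sparse uniform background plus one dense cluster
rng = np.random.default_rng(0)
container_radius, min_square = 50, 5
background = (rng.random((200, 2)) - 0.5) * 2 * container_radius
cluster = rng.uniform([10, 20], [15, 25], size=(300, 2))
positions = np.vstack([background, cluster])

x, y, side, count = densest_square(positions, container_radius, min_square)
print(x, y, side, count)
```

The returned corner and side can be passed to plt.Rectangle((x, y), width=side, height=side, fill=None) as in the question. The accuracy depends on cells_per_side, since the square can only start on a grid line.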
Create a method plot(window, expression, color = "black") to plot the expression in the window.
This is what I've done:
from math import *
from graphics import *
win = GraphWin()
def plot(window, expression, color = "black"):
    #Evaluates given expression and plots it in "window". Returns the list of all the plotted points.
    points = []
    #Evaluate expression over 1000 different values and for each (x,y) pair plot the point.
    for i in range(0, 1001):
        try:
            x = i/100.0
            y = eval(expression)
            plot(x,y)
        except Exception:
            print("For ", x, " the expression is invalid")
    return points
So I guess I have done something wrong. Can someone help me? :)
Looking at your code you have a function called plot that calls plot - this is because of the classic error of from somewhere import *.
I suspect that you are trying to call graphics.plot within plot so get rid of the from graphics import * and put graphics. before the items that you are using from there.
You are also not filling in, or using your points list.
There are a couple of obvious problems:
You create a list of points, never put anything into it, then return it (still empty) at the end; and
For each individual point x, y you call plot again recursively (see Steve Barnes' answer), passing x as window and y as expression.
I suggest you separate this into two parts: one to create a list of points based on the function, and one to plot this list of points.
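A minimal sketch of that two-part split. The names evaluate_points and draw_points are hypothetical. The drawing step assumes that window is a graphics.py GraphWin whose plot(x, y, color) method draws a single pixel; the graphics module is not imported here, so the first part runs on its own. Note that eval should only be used on expressions you trust.

```python
import math


def evaluate_points(expression, start=0.0, stop=10.0, steps=1000):
    """Return a list of (x, y) pairs for expression evaluated over [start, stop]."""
    # Expose the math functions (sin, sqrt, pi, ...) plus the variable x to eval
    names = {name: getattr(math, name) for name in dir(math) if not name.startswith("_")}
    points = []
    for i in range(steps + 1):
        x = start + (stop - start) * i / steps
        try:
            y = eval(expression, {"__builtins__": {}}, {**names, "x": x})
        except (ArithmeticError, ValueError):
            print("For", x, "the expression is invalid")
            continue
        points.append((x, y))
    return points


def draw_points(window, points, color="black"):
    """Draw each point; assumes window.plot(x, y, color) exists (e.g. a GraphWin)."""
    for x, y in points:
        window.plot(x, y, color)


points = evaluate_points("x**2")
print(len(points), points[-1])
# With graphics.py available (assumption):
#   win = GraphWin()
#   draw_points(win, points)
```

Keeping the two steps apart also removes the name clash: the function that draws no longer shares a name with the method it calls, and the list of points is actually filled before it is returned.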
Don't use the name plot as your own function name; this will shadow the matplotlib.pyplot.plot method name if you have import *.
The matplotlib plot method is used to create lines, which needs a series of Xs and Ys each time you call it. E.g., plot(1,2) or plot([1], [2]) will plot nothing in the figure, while plot([1,2], [3,4]) draws a line between points (1,3) and (2,4). You need to call scatter(1, 2) if you insist on plotting one point each time.
Let's say I have two histograms and I set the opacity using the parameter of hist: 'alpha=0.5'
I have plotted two histograms yet I get three colors! I understand this makes sense from an opacity point of view.
But it makes it very confusing to show someone a graph of two things with three colors. Can I just somehow set the smallest bar for each bin to be in front with no opacity?
[Figure: Example graph]
The usual way this issue is handled is to have the plots with some small separation. This is done by default when plt.hist is given multiple sets of data:
import pylab as plt
x = 200 + 25*plt.randn(1000)
y = 150 + 25*plt.randn(1000)
n, bins, patches = plt.hist([x, y])
You instead wish to stack them (this could be done above using the argument histtype='barstacked'), but notice that the ordering is incorrect.
This can be fixed by individually checking each pair of points to see which is larger and then using zorder to set which one comes first. For simplicity I am using the output of the code above (i.e., n is two stacked arrays of the number of points in each bin for x and y):
n_x = n[0]
n_y = n[1]
for i in range(len(n[0])):
    if n_x[i] > n_y[i]:
        zorder=1
    else:
        zorder=0
    plt.bar(bins[:-1][i], n_x[i], width=10)
    plt.bar(bins[:-1][i], n_y[i], width=10, color="g", zorder=zorder)
Here is the resulting image:
With the ordering changed like this, the image looks very weird indeed, which is probably why it is not implemented and needs a hack to do it. I would stick with the small separation method; anyone used to these plots assumes the bars take the same x-value.
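If the "smallest bar in front" look is still wanted, a loop-free variant of the same idea can be sketched as follows. It assumes both data sets are binned on shared edges, then draws the taller bar of each bin first and the shorter one on top, with no transparency. The colors and bin count are arbitrary choices, and the data is synthetic.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = 200 + 25 * rng.standard_normal(1000)
y = 150 + 25 * rng.standard_normal(1000)

# Shared bin edges so the bars of both data sets line up
bins = np.histogram_bin_edges(np.concatenate([x, y]), bins=20)
n_x, _ = np.histogram(x, bins)
n_y, _ = np.histogram(y, bins)
widths = np.diff(bins)

# Per-bin colors: the taller bar goes behind, the shorter one in front
x_taller = n_x > n_y
tall_colors = ["tab:blue" if t else "tab:green" for t in x_taller]
short_colors = ["tab:green" if t else "tab:blue" for t in x_taller]

fig, ax = plt.subplots()
ax.bar(bins[:-1], np.maximum(n_x, n_y), width=widths, align="edge",
       color=tall_colors, zorder=1)
ax.bar(bins[:-1], np.minimum(n_x, n_y), width=widths, align="edge",
       color=short_colors, zorder=2)
# plt.show()  # uncomment when running interactively
```

Here blue always means x and green always means y, so only two colors appear, at the cost of the color of the front bar changing from bin to bin.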