Controlling brightness in pcolormesh - python

I am attempting to plot the electron's probability density (in a hydrogen atom), using python and matplotlib's pcolormesh.
All is well except that since the distribution drops off so rapidly, some details are not visible; for example, the surroundings of the zeros of the radial function (in the higher energy states) are too faint, making it hard to notice that the wave function actually vanishes at some radii.
I know I can handle this with some rescaling and "adjustments" to the wave function, but I would rather tweak my plotting skills and realize how to do this with matplotlib.
I want to adjust the heat map so that more of the map would be bright.
Is there a way to control its sensitivity?
Thanks in advance.

You can use gamma correction to do that. I've used it in quite similar situations with very good results.
One way to do that:
import numpy
import matplotlib.pyplot as plt

normalized = original / original.max()      # rescale to between 0 and 1
corrected = numpy.power(normalized, gamma)  # try gamma values between 0.5 and 2 as a starting point
plt.imshow(corrected)
This works because raising values in the interval [0, 1] to a given exponent yields a monotonically increasing curve that passes through (0, 0) and (1, 1). It is similar to moving the middle slider of the Photoshop/GIMP "levels" dialog.
EDIT: better yet, it seems that Matplotlib already has a class for that.
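Presumably this refers to matplotlib.colors.PowerNorm, which lets the plotting call apply the gamma correction itself. A minimal sketch (my own, with made-up data standing in for the probability density):
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import PowerNorm

# Made-up stand-in for a rapidly decaying probability density on an (x, y) grid
x, y = np.meshgrid(np.linspace(-1, 1, 200), np.linspace(-1, 1, 200))
density = np.exp(-10 * (x**2 + y**2))

# gamma < 1 brightens the low end of the colormap, revealing faint structure
plt.pcolormesh(x, y, density, norm=PowerNorm(gamma=0.4), shading='auto')
plt.colorbar()
plt.show()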

Related

How to draw smooth contour/level curves of multivariable functions

G'day programmers and math enthusiasts.
Recently I have been exploring how CAS graphing calculators function; in particular, how they are able to draw level curves and hence contours for multivariable functions.
Just a couple of notes before I ask my question:
I am using Python's Pygame library purely for the window and graphics. Of course there are better options out there but I really wanted to keep my code as primitive as I am comfortable with, in an effort to learn the most.
Yes, yes. I know about matplotlib! God have I seen 100 different suggestions for using other supporting libraries. And while they are definitely stunning and robust tools, I am really trying to build up my knowledge from the foundations here so that one day I may even be able to grow and support libraries such as them.
My ultimate goal is to get plots looking as smooth as this:
Mathematica Contour Plot Circle E.g.
What I currently do is:
Evaluate the function over a 500x500 grid of points and keep the points where it is approximately equal to 0, within some error tolerance (mine is 0.01). This gives me a rough approximation of the level curve at f(x,y)=0.
Then I use a dodgy distance function to find each point's closest neighbour, and draw an anti-aliased line between the two.
The results of both of these steps can be seen here:
First Evaluating Valid Grid Points
Then Drawing Lines to Closest Points
For obvious reasons I've got gaps in the graph where the nearest-neighbour connection leaves the curve discontinuous. Alas! I thought of another janky workaround: what if, on top of finding the closest point, it looks for the closest point that hasn't already been visited? This idea came close, but still doesn't seem anywhere near efficient. Here are my results after implementing that:
Slightly Smarter Point Connecting
My question is, how is this sort of thing typically implemented in graphing calculators? Have I been going about this all wrong? Any ideas or suggestions would be greatly appreciated :)
(I haven't included any code, mainly because it's not super clear, and also not particularly relevant to the problem).
Also, if anyone has some hardcore math answers to suggest, don't be afraid to share them; I've got a healthy background in coding and mathematics (especially numerical and computational methods), so here's hoping I'll be able to cope with them.
So you are evaluating the equation at every (x, y) point on your plane, then checking whether the result is < 0.01 and, if so, drawing the point.
A better way to decide whether a point should be drawn is to check whether one of the following is true (a minimal sketch follows the list):
(a) the point is zero
(b) the point is positive and has at least one negative neighbor
(c) the point is negative and has at least one positive neighbor
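Here is a minimal NumPy sketch of that sign-change test (my own code, not from the original answer; it marks every grid point whose sign differs from a horizontal or vertical neighbor):
import numpy as np

def level_curve_mask(f, xs, ys):
    """Boolean mask of grid points lying on, or straddling, the zero level of f."""
    X, Y = np.meshgrid(xs, ys)
    Z = f(X, Y)
    sign = np.sign(Z)
    mask = (Z == 0)                      # condition (a): the point is exactly zero
    horiz = sign[:, :-1] != sign[:, 1:]  # sign change between horizontal neighbors
    vert = sign[:-1, :] != sign[1:, :]   # sign change between vertical neighbors
    mask[:, :-1] |= horiz
    mask[:, 1:] |= horiz
    mask[:-1, :] |= vert
    mask[1:, :] |= vert
    return mask

# Example: the unit circle x**2 + y**2 - 1 = 0 on a 500x500 grid
xs = np.linspace(-2, 2, 500)
mask = level_curve_mask(lambda x, y: x**2 + y**2 - 1, xs, xs)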
There are 3 problems with this approach:
it doesn't support any kind of antialiasing, so the result will not look as smooth as you would want
you can't make thicker lines (more than 1 pixel)
if the zero-level line only touches the grid (the function is positive on both sides rather than positive on one side and negative on the other), the curve will be missed there
This second solution may fix those problems, but I came up with it myself and haven't tested it, so it may or may not work:
You assign the function value to each corner of a pixel and then, for each pixel, estimate its distance to the zero line from its corner values. This is the algorithm for finding the distance:
def distance(tl, tr, bl, br):  # the 4 corner values of the pixel
    avg = abs((tl + tr + bl + br) / 4)   # absolute value of the average
    m = min(map(abs, (tl, tr, bl, br)))  # absolute minimum of the corner values
    if m == 0:                           # special case: a corner lies exactly on the line
        return float('inf')
    return avg / m                       # estimated distance to the zero line, assuming the trend continues
This returns the estimated distance to the zero line, and you can now draw the pixel. E.g. if you want a 5-pixel-wide line: if the result is < 4, draw the pixel at full colour; elif it is < 5, draw it with an opacity of 5 - distance (times 255 if you are using pygame's alpha option).
This solution assumes that the function is roughly linear across each pixel.
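For illustration, a hypothetical use of distance() when drawing a 5-pixel-wide line (my own sketch; the corner values are made up):
tl, tr, bl, br = 0.8, 0.3, -0.2, -0.6   # function values at the pixel's four corners
d = distance(tl, tr, bl, br)
if d < 4:
    alpha = 255                  # draw the pixel at full colour
elif d < 5:
    alpha = int((5 - d) * 255)   # fade out over the last pixel
else:
    alpha = 0                    # pixel is not on the line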
Just try it; in the worst case it doesn't work...
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.1319&rep=rep1&type=pdf
This 21-page doc has everything I need to draw the implicit curves accurately and smoothly. It even covers optimisation methods and supports bifurcation points for implicit functions. I would highly recommend it to anyone with questions similar to my own above.
Thanks to everyone who had recommendations and clarifying questions, they all helped lead me to this resource.

Fixing dihedral angles beyond 180 degrees limit to create smooth curves

I have an output from a commercial program that contains the dihedral angles of a molecule over time. The problem apparently comes from a known quadrant issue when taking inverse cosines: the angles are reported in the interval -180 to 180, which I am not familiar with. If a dihedral goes beyond 180, this commercial program (SHARC, for molecular dynamics simulations) treats it as being just above -180 instead, creating jumps in the plots (you can see an example in the figure below).
Is there a correct mathematical way to convert these plots into smooth curves, even if it means going to dihedrals greater than 180?
What I am trying to do is create a Python program to handle each special case (when going from 180 to -180 or vice versa, and cases near 90 or 0 degrees) by using sines and cosines... But it is becoming extremely complex, with more than 12 nested if statements inside a for loop running through the x axis.
If it was only one figure, I could do it by hand, but I will have dozens of similar plots.
I attach an ASCII file with the data for plotting this figure.
What I would like it to look like is this:
Thank you very much,
Cayo Gonçalves
Ok, I've found a pretty easy solution.
Numpy has the unwrap function. I just need to feed the function a vector with the angles in radians.
Thank you Yves for giving me the name of the problem. This helped me find the solution.
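For reference, a minimal sketch of that approach (my own example data; convert to radians, unwrap, convert back to degrees):
import numpy as np

angles_deg = np.array([170.0, 178.0, -179.0, -171.0, -160.0])  # example jumpy data
unwrapped = np.degrees(np.unwrap(np.radians(angles_deg)))
# -> [170. 178. 181. 189. 200.]  the 360-degree jumps are gone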
This is called phase unwrapping.
As your curves are smooth and slowly varying, every time you see a large negative (positive) jump, add (subtract) 360 to every value after the jump. This will restore the original curve. (For the jump threshold, 170 should be good, I guess.)
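A small sketch of that rule (my own code, using a 170-degree threshold and shifting every value after a jump by a running offset):
def unwrap_degrees(angles, threshold=170):
    out = [angles[0]]
    offset = 0.0
    for prev, cur in zip(angles, angles[1:]):
        jump = cur - prev
        if jump < -threshold:
            offset += 360.0   # large negative jump: shift everything after it up
        elif jump > threshold:
            offset -= 360.0   # large positive jump: shift everything after it down
        out.append(cur + offset)
    return out

print(unwrap_degrees([170, 178, -179, -171, -160]))  # [170, 178, 181, 189, 200]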

Creating a packed bubble / scatter plot in python (jitter based on size to avoid overlapping)

I have come across a number of plots (end of page) that are very similar to scatter / swarm plots, which jitter the y-axis in order to avoid overlapping dots / bubbles.
How can I get the y values (ideally in an array) based on a given set of x and z values (dot sizes)?
I found the python circlify library but it's not quite what I am looking for.
Example of what I am trying to create
EDIT: For this project I need to be able to output the x, y and z values so that they can be plotted in the user's tool of choice. Therefore I am more interested in solutions that generate the y-coords rather than the actual plot.
Answer:
What you describe in your text is known as a swarm plot (or beeswarm plot), and there are Python implementations of these (especially see seaborn), but also, e.g., in R. That is, these plots adjust the y-position of each data point so that the points don't overlap but are otherwise closely packed.
Seaborn swarm plot:
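For context, a minimal seaborn example of such a swarm plot (my own sketch, with made-up data):
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

values = np.random.normal(size=200)  # stand-in for the x data
sns.swarmplot(x=values)              # seaborn jitters the other axis so points don't overlap
plt.show()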
Discussion:
But the plots that you show aren't standard swarm plots (which almost always have the weird looking "arms"), but instead seem to be driven by some type of physics engine which allows for motion along x as well as y, which produces the well packed structures you see in the plots (eg, like a water drop on a spiders web).
That is, in the plot above, by imagining moving points only along the vertical axis so that it packs better, you can see that, for the most part, you can't really do it. (Honestly, maybe the data shown could be packed a bit better, but not dramatically so -- eg, the first arm from the left couldn't be improved, and if any of them could, it's only by moving one or two points inward). Instead, to get the plot like you show, you'll need some motion in x, like would be given by some type of physics engine, which hopefully is holding x close to its original value, but also allows for some variation. But that's a trade-off that needs to be decided on a data level, not a programming level.
For example, here's a plotting library, RAWGraphs, which produces a compact beeswarm plot like the Politico graphs in the question:
But critically, they give the warning:
"It’s important to keep in mind that a Beeswarm plot uses forces to avoid collision between the single elements of the visual model. While this helps to see all the circles in the visualization, it also creates some cases where circles are not placed in the exact position they should be on the linear scale of the X Axis."
Or, similarly, in notes from this D3 package: "Other implementations use force layout, but the force layout simulation naturally tries to reach its equilibrium by pushing data points along both axes, which can be disruptive to the ordering of the data." And here's a nice demo based on D3 force layout where sliders adjust the relative forces pulling the points to their correct values.
Therefore, this plot is a compromise between a swarm plot and a violin plot (which shows a smoothed average for the distribution envelope). Both of those plots give an honest representation of the data, whereas in these closely packed plots the tight packing comes at the cost of misrepresenting the x-position of the individual data points. Their advantage seems to be that you can colour and click on the individual points (where, if you wanted, you could show the actual x-data, although that's not done in the linked plots).
Seaborn violin plot:
Personally, I'm really hesitant to misrepresent the data in some unknown way (that's the outcome of a physics engine calculation but not obvious to the reader). Maybe a better compromise would be a violin filled with non-circular patches, or something like a Raincloud plot.
I created an Observable notebook to calculate the y values of a beeswarm plot with variable-sized circles. The image below gives an example of the results.
If you need to use the JavaScript code in a script, it should be straightforward to copy and paste the code for the AccurateBeeswarm class.
The algorithm simply places the points one by one, as close as possible to the x=0 line while avoiding overlaps. There are also options to add a little randomness to improve the appearance. x values are never altered; this is the one big advantage of this approach over force-directed algorithms such as the one used by RAWGraphs.
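For readers who prefer Python, here is a rough sketch of that greedy idea (my own code, not the notebook's JavaScript; all names are mine): each circle keeps its x value and is given the y closest to the baseline that avoids every circle placed before it.
import math

def place_beeswarm(x, radii):
    """Return y coordinates so that circles (x[i], y[i], radii[i]) do not overlap."""
    placed = []   # (x, y, r) of circles already positioned
    ys = []
    for xi, ri in zip(x, radii):
        # Candidate y positions: the baseline, plus positions just touching
        # each previously placed circle from above or below.
        candidates = [0.0]
        for xj, yj, rj in placed:
            dx, rr = xi - xj, ri + rj
            if abs(dx) < rr:                       # the circles can collide in y
                dy = math.sqrt(rr * rr - dx * dx)
                candidates += [yj + dy, yj - dy]
        def ok(y):                                 # no overlap with any placed circle
            return all((xi - xj) ** 2 + (y - yj) ** 2 >= (ri + rj) ** 2 - 1e-9
                       for xj, yj, rj in placed)
        y = min((c for c in candidates if ok(c)), key=abs)
        placed.append((xi, y, ri))
        ys.append(y)
    return ys
The quadratic candidate search is fine for a few hundred bubbles; for large datasets you would want a spatial index.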

How can I work around overflow error in matplotlib?

I'm solving a set of coupled differential equations with odeint package from scipy.integrate.
For the integration time I have:
import numpy
t = numpy.linspace(0, 8e+9, 5000000)
where 5000000 is the number of time points.
I then plot the equations I have as such:
import matplotlib.ticker
import matplotlib.pyplot as plt

plt.xscale('symlog') #x axis logarithmic scale
plt.yscale('log',basey=2) #Y axis logarithmic scale
plt.gca().set_ylim(8, 100000) #Changing y axis ticks
ax = plt.gca()
ax.yaxis.set_major_formatter(matplotlib.ticker.ScalarFormatter())
ax.xaxis.set_major_formatter(matplotlib.ticker.ScalarFormatter())
plt.title("Example graph")
plt.xlabel("time (yr)")
plt.ylabel("quantity a")
plt.plot(t,a,"r-", label = 'Example graph')
plt.legend(loc='best')
where a is time dependent variable. (This is just one graph from many.)
However, the graphs look a bit jagged rather than oscillatory, and I obtain this error:
OverflowError: Exceeded cell block limit (set 'agg.path.chunksize' rcparam)
I'm not overly sure what this error means; I've looked at other answers but don't know how to apply 'agg.path.chunksize' here.
Also, the integration + plotting takes around 7 hours and that is with some CPU processing hacks, so I really do not want to implement anything that would increase the time.
How can I overcome this error?
I have attempted to reduce the timestep, however I obtain this error instead:
Excess work done on this call (perhaps wrong Dfun type).
Run with full_output = 1 to get quantitative information.
As the error message suggests, you may set the chunksize to a larger value.
plt.rcParams['agg.path.chunksize'] = 1000
However, you may also want to reflect on why this error occurs in the first place. It only occurs when you try to plot an unreasonably large amount of data on the graph. Meaning, if you try to plot 200000000 points, the renderer might have problems keeping them all in memory. But one should probably ask oneself why it is necessary to plot so many points: a screen may display some 2000 points in the lateral direction, a printed paper maybe 6000. Using more points than that does not make sense, generally speaking.
Now if the solution of your differential equations requires a large point density, it does not automatically mean that you need to plot them all.
E.g. one could just plot every 100th point,
plt.plot(x[::100], y[::100])
most probably without even affecting the visual plot appearance.
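Putting both suggestions together, a minimal sketch using the variable names from the question (the ODE solution is replaced by a dummy sine curve):
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['agg.path.chunksize'] = 10000   # let the Agg renderer split very long paths

t = np.linspace(0, 8e9, 5000000)             # time axis as in the question
a = np.sin(2 * np.pi * t / 1e9) + 10         # dummy stand-in for the ODE solution

plt.plot(t[::100], a[::100], "r-", label='Example graph')  # downsample before plotting
plt.legend(loc='best')
plt.show()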

Higher sampling for image's projection

My software should judge spectrum bands, and given the location of the bands, find the peak point and width of the bands.
I learned to take the projection of the image and to find the width of each peak.
But I need a better way to find the projection.
The method I used reduces a 1600-pixel-wide image (e.g. 1600x40) to a 1600-point sequence. Ideally, I would want to get a 10000-point sequence from the same image.
I want a longer sequence because 1600 points give too low a resolution: a single point makes a large difference to the measurement (there is a 4% difference if a band boundary is judged at 18 rather than 19).
How do I get a longer projection from the same image?
Code I used: https://stackoverflow.com/a/9771560/604511
from PIL import Image
import numpy as np
from scipy.optimize import leastsq  # used for the peak fit in the linked answer

# Load the picture with PIL, process if needed
pic = np.asarray(Image.open("band2.png"))
# Average the pixel values over the colour channels
pic_avg = pic.mean(axis=2)
# Sum along the vertical axis to get the 1D projection
projection = pic_avg.sum(axis=0)
# Normalise and set the min value to zero for a nice fit
projection /= projection.mean()
projection -= projection.min()
What you want to do is called interpolation. Scipy has an interpolate module with a whole bunch of different functions for differing situations; take a look here, or specifically for images here.
Here is a recently asked question that has some example code, and a graph that shows what happens.
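For example, a minimal sketch of resampling the 1600-point projection to 10000 points with scipy.interpolate (a random array stands in for the real projection):
import numpy as np
from scipy.interpolate import interp1d

projection = np.random.rand(1600)   # stand-in for the projection from the question

x_old = np.arange(projection.size)
x_new = np.linspace(0, projection.size - 1, 10000)
projection_fine = interp1d(x_old, projection, kind='cubic')(x_new)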
But it is really important to realise that interpolating will not make your data more accurate, so it will not help you in this situation.
If you want more accurate results, you need more accurate data, and there is no other way: you need to start with a higher-resolution image. (If you resample or interpolate, your results will actually be less accurate!)
Update - as the question has changed
@Hooked has made a nice point. Another way to think about it: instead of immediately averaging (which throws away the variance in the data), you can produce 40 graphs (like the lower one in your posted image), one from each horizontal row in your spectrum image. All these graphs will be pretty similar, but with some variation in peak position, height and width. You should calculate the position, height, and width of each peak in each of these 40 graphs, then combine this data (matching peaks across the 40 graphs) and use the appropriate variance as an error estimate (for peak position, height, and width), relying on the central limit theorem. That way you can get the most out of your data. However, this assumes some independence between the rows of the spectrogram, which may or may not be the case.
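A very small sketch of that per-row idea (my own code; a crude argmax stands in for a proper peak fit, and random data stands in for the real image):
import numpy as np

pic_avg = np.random.rand(40, 1600)   # stand-in for the 40x1600 colour-averaged image

peak_positions = pic_avg.argmax(axis=1)   # crude per-row estimate of the peak position
mean_position = peak_positions.mean()     # combined estimate
std_error = peak_positions.std(ddof=1) / np.sqrt(len(peak_positions))  # error bar via the CLT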
I'd like to offer some more detail on @fraxel's answer (too long for a comment). He's right that you can't get out any more information than you put in, but I think it needs some elaboration...
You are projecting your data from 1600x40 -> 1600, which looks like you are throwing some data away. While technically correct, the whole point of a projection is to bring higher-dimensional data down to a lower dimension. This only makes sense if your data can be adequately represented in the lower dimension. Correct me if I'm wrong, but it looks like your data is indeed essentially one-dimensional: the vertical axis is a measure of the variability of that particular point on the x-axis (wavelength?).
Given that the projection makes sense, how can we best summarize the data for each particular wavelength point? In my previous answer, you can see I took the average for each point. In the absence of other information about the particular properties of the system, this is a reasonable first-order approximation.
You can keep more of the information if you like. Below I've plotted the variance along the y-axis. This tells me that your measurements have more variability when the signal is higher, and low variability when the signal is lower (which seems useful!):
What you need to do then, is decide what you are going to do with those extra 40 pixels of data before the projection. They mean something physically, and your job as a researcher is to interpret and project that data in a meaningful way!
The code to produce the image is below, the spec. data was taken from the screencap of your original post:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

# Load the picture with PIL, process if needed
pic = np.asarray(Image.open("spec2.png"))
# Average the pixel values over the colour channels
pic_avg = pic.mean(axis=2)
# Sum along the vertical axis to get the 1D projection
projection = pic_avg.sum(axis=0)
# Compute the variance along the vertical axis
variance = pic_avg.var(axis=0)

scale = 1/40.
X_val = range(projection.shape[0])
plt.errorbar(X_val, projection*scale, yerr=variance*scale)
plt.imshow(pic, origin='lower', alpha=.8)
plt.axis('tight')
plt.show()
