I have the x and y values obtained from a histogram, and I want to rebuild that histogram. How can I do that? I tried this:
plt.hist(x, bins=len(x), weights=y)
But the points are not exactly at the bin centers, and they get significantly shifted after a while (the points on the x-axis are not equally spaced).
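One way to keep every bar centered is to reconstruct the bin edges from the centers and pass those explicit edges to plt.hist instead of a bin count. A minimal sketch, assuming x holds the bin centers and y the bar heights (the sample values are made up for illustration):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for this sketch
import matplotlib.pyplot as plt

# Hypothetical data: x are bin centers, y are the bar heights.
x = np.linspace(0.5, 9.5, 10)
y = np.array([1, 3, 5, 8, 9, 7, 5, 3, 2, 1])

# Rebuild the bin edges: midpoints between neighbouring centers,
# extended by half a bin at each end.
mid = (x[:-1] + x[1:]) / 2
edges = np.concatenate(([2 * x[0] - mid[0]], mid, [2 * x[-1] - mid[-1]]))

# Passing explicit edges keeps each bar centered on its x value.
plt.hist(x, bins=edges, weights=y)
```

With explicit edges, bins=len(x) is no longer needed, and the bars stay aligned even when the centers are not equally spaced.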
I want to create a 2D histogram where altitude is represented on the y-axis and max wind speed on the x-axis, and each bin is scaled by the total number of data points in its row (altitude level).
The desired output looks similar to the attached figure; however, here the color represents a scaled density for each altitude level.
The 2d histogram without scaling looks like this:
The goal of the scaling is to get a more accurate Pearson correlation, since we're mainly interested in the effect of altitude on the wind speed maxima.
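Row-wise scaling like this can be applied directly to the array returned by np.histogram2d, by dividing each altitude row by its total count. A minimal sketch with made-up wind data (the variable names and distributions are assumptions, not the original dataset):

```python
import numpy as np

# Hypothetical data: max wind speed (x-axis) and altitude (y-axis).
rng = np.random.default_rng(0)
speed = rng.gamma(4.0, 5.0, size=5000)
altitude = rng.uniform(0.0, 12000.0, size=5000)

counts, xedges, yedges = np.histogram2d(speed, altitude, bins=(40, 24))

# counts[i, j] is the number of points in speed bin i and altitude bin j,
# so summing over axis 0 gives the total per altitude level.
row_totals = counts.sum(axis=0, keepdims=True)
scaled = np.divide(counts, row_totals,
                   out=np.zeros_like(counts), where=row_totals > 0)
```

scaled can then be drawn with plt.pcolormesh(xedges, yedges, scaled.T); each altitude row now carries the same total weight, so dense altitude levels no longer dominate the colormap.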
I would like to create spatial bins like (for example) the ones in this plot here:
Here there are 412 bins, but this can vary depending on how many I want (https://arxiv.org/pdf/1909.04701.pdf). I have already computed all the boundary lines that define these bins. If I have millions of points defined by their x, y coordinates, how would I efficiently put each one into one of these 412 bins?
[update]
The latitude bins are always equal in size, while the longitudinal bins are not. I can use np.digitize to find the latitudinal bin easily; once that's found, I also know the longitude bin edges for that latitude. However, I'm not sure np.digitize can be vectorized to use a different bin array for each point I provide. longitudinal_bins_arr would be an array of the longitudinal bin edges for each latitude, indexed for every point that I'm trying to bin.
lat_bin_for_points = np.digitize(latlon[:, 0], latitudinal_bins)
lon_bin_for_points = np.digitize(latlon[:, 1], longitudinal_bins_arr[lat_bin_for_points])
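np.digitize indeed accepts only one set of edges per call, but since there are only a few hundred latitude rings, one digitize call per ring (rather than per point) stays fast even for millions of points. A sketch under assumed names and a toy bin layout (longitudinal_bins_per_ring stands in for the real per-latitude edge arrays):

```python
import numpy as np

# Hypothetical layout: 4 equal-width latitude rings, each with its own
# longitude edges (ring 0 has 2 cells, ring 1 has 4 cells, and so on).
latitudinal_bins = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
longitudinal_bins_per_ring = [
    np.linspace(0.0, 360.0, n + 1) for n in (2, 4, 4, 2)
]

# Hypothetical points to bin: (latitude, longitude) pairs.
rng = np.random.default_rng(1)
latlon = np.column_stack([rng.uniform(0, 40, 10000),
                          rng.uniform(0, 360, 10000)])

# Latitude bins are equal width, so one vectorized digitize suffices.
lat_bin = np.digitize(latlon[:, 0], latitudinal_bins) - 1

# Loop over the (few) rings, not over the (many) points.
lon_bin = np.empty(len(latlon), dtype=int)
for ring, edges in enumerate(longitudinal_bins_per_ring):
    mask = lat_bin == ring  # all points falling in this ring
    lon_bin[mask] = np.digitize(latlon[mask, 1], edges) - 1
```

The (lat_bin, lon_bin) pair identifies a unique spatial cell; a cumulative offset over the rings converts it to a single flat bin index if needed.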
I have a list of sizes for every 10 seconds, but instead of just single points, I want to plot a range for each point, like the example graph I attached.
I tried using hist2d(), but this makes every bin the same size. My data increases its spacing as it goes up, for example:
d = [0, 0.062, 0.8, 1.1, 1.5, 2.8, 4, 7, 11, 16]
This is the type of graph I want:
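hist2d can actually take a sequence of explicit edges per axis, so the variable spacing in d can be used directly as the y-axis bins. A minimal sketch with made-up size measurements (t, sizes, and the gamma distribution are illustrative assumptions):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for this sketch
import matplotlib.pyplot as plt

# Variable-width size bins from the question.
d = [0, 0.062, 0.8, 1.1, 1.5, 2.8, 4, 7, 11, 16]

# Hypothetical data: one size measurement every 10 seconds.
rng = np.random.default_rng(2)
t = np.arange(0, 600, 10)
sizes = rng.gamma(2.0, 2.0, size=t.size)

# Explicit edge arrays per axis: each cell spans one 10 s interval
# in x and one variable-width size range in y.
t_edges = np.arange(0, 610, 10)
counts, _, _, img = plt.hist2d(t, sizes, bins=[t_edges, d])
```

Sizes above 16 simply fall outside the last edge and are not counted.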
My apologies for my ignorance in advance; I've only been learning Python for about two months. Every example question that I've seen on Stack Overflow seems to discuss a single distribution over a series of data, but not one distribution per data point with band broadening.
I have some (essentially) infinitely-thin bars at value x with height y that I need to run a line over so that it looks like the following photo:
The bars are obtained from the table of data on the far right. The curve is what I'm trying to make.
I am doing some TD-DFT work to calculate a theoretical UV/visible spectrum. It will output absorbance strengths (y-values, i.e., heights) for specific wavelengths of light (x-values). Theoretically, these are typically plotted as infinitely-thin bars, though we experimentally obtain a curve instead. The theoretical data can be made to appear like an experimental spectrum by running a curve over it that hugs y=0 and has a Gaussian lineshape around every absorbance bar.
I'm not sure if there's a feature that will do this for me, or if I need to do something like make a loop summing Gaussian curves for every individual absorbance, and then plot the resulting formula.
Thanks for reading!
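The loop-of-Gaussians idea works and can be vectorized: evaluate one Gaussian per absorbance bar on a common wavelength grid and sum them. A sketch with made-up stick data (the wavelengths, strengths, and the sigma broadening width are all illustrative assumptions):

```python
import numpy as np

# Hypothetical stick spectrum: wavelengths (nm) and absorbance strengths.
wavelengths = np.array([220.0, 280.0, 350.0])
strengths = np.array([0.8, 0.3, 0.5])
sigma = 10.0  # assumed Gaussian broadening width in nm

# One Gaussian per bar, all evaluated on a shared wavelength grid,
# then summed across the bars (axis 1).
grid = np.linspace(180.0, 420.0, 1000)
spectrum = (strengths[None, :]
            * np.exp(-((grid[:, None] - wavelengths[None, :]) ** 2)
                     / (2.0 * sigma ** 2))).sum(axis=1)
```

Plotting grid against spectrum gives the experimental-looking envelope that hugs y = 0 away from the bars.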
It looks like my answer used Seaborn to do a kernel density estimation. Because a KDE isn't weighted and only considers the density of x-values, I had to write a small loop to create a new list in which each x-entry is repeated in proportion to its intensity:
for j in range(len(list1)):                   # list1 contains x-values
    list5.append([list1[j]] * int(list3[j]))  # list5 was empty; see below for list3

# now to drop the brackets from within the list:
for k in range(len(list5)):  # list5 contains intensity-proportional x-values
    for l in list5[k]:
        list4.append(l)      # now just a flat list, rather than a list of lists
(I had to make another list earlier of the intensities, each multiplied by 1000000 to make them all integers):
list3 = [i * 1000000 for i in list2]  # list3 now contains integer intensities
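For what it's worth, the repeat-and-flatten loops can be collapsed into a single np.repeat call. A sketch with made-up list1/list2 values (the names mirror the lists above; the data is illustrative):

```python
import numpy as np

# Hypothetical x-values and their small fractional intensities.
list1 = [220.0, 280.0, 350.0]
list2 = [8e-6, 3e-6, 5e-6]

# Scale the intensities up to integer counts (rounding rather than
# truncating), then repeat each x-value that many times in one call.
counts = (np.asarray(list2) * 1_000_000).round().astype(int)
list4 = np.repeat(list1, counts)
```

list4 is the same flat, intensity-proportional array the two loops build, ready to hand to the KDE.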
This is the graph in question; the dots should appear in the bottom plane, not "above" the plane like I managed to.
bx.scatter(xs, ys, zs, c=plt.cm.jet(np.linspace(0, 1, N)))  # scatter points

# plot the lines between points
for i in range(N - 1):
    bx.plot(xs[i:i+2], ys[i:i+2], zs[i:i+2], color=plt.cm.jet(i / N), alpha=0.5)

# project the points onto the xy-plane
bx.scatter(xs, ys, zs=732371.0, zdir="z", c=plt.cm.jet(np.linspace(0, 1, N)), depthshade=True)
bx.set_zlim3d(732371.0)  # lower limit so the projected points land on the floor
As you'll notice, the points are drawn above the xy-grid, and I had to set a lower limit for the z-axis so that the first projected point doesn't interfere with the first scatter point.
I would prefer the points to be projected in 2D, and something less hacky, since I have 50 other graphs to do like this and fine-tuning each one would be cumbersome.
Got a simpler method you want to share?
There are many options, and ultimately, it depends on the range of your data in the other plots.
1) Offset the projection plane by a fixed amount
You could calculate the minimum z value and plot your projection at a fixed offset below that minimum:
zs = min(zs) - offset
2) Offset the projection by a relative amount that depends on the range of your data
You could take into account the range of your data (i.e. the distance from min to max z) and calculate an offset proportional to that (e.g. 10-15%):
zs = min(zs) - 0.15 * (max(zs) - min(zs))
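Put together, option 2 looks like this as a self-contained sketch (the data, N, and the z values are made up to mimic the question's setup):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for this sketch
import matplotlib.pyplot as plt

# Hypothetical data mimicking the question's setup.
rng = np.random.default_rng(3)
N = 50
xs, ys = np.cumsum(rng.normal(size=(2, N)), axis=1)
zs = np.linspace(732371.0, 732380.0, N)

fig = plt.figure()
bx = fig.add_subplot(projection="3d")
bx.scatter(xs, ys, zs, c=plt.cm.jet(np.linspace(0, 1, N)))

# Project onto a floor plane offset by 15% of the z-range, so the
# projection never collides with the lowest data point.
z_floor = zs.min() - 0.15 * (zs.max() - zs.min())
bx.scatter(xs, ys, zs=z_floor, zdir="z", c=plt.cm.jet(np.linspace(0, 1, N)))
bx.set_zlim3d(z_floor, zs.max())
```

Because the offset scales with the data, the same few lines work unchanged across all 50 graphs.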