How do you make a 2d histogram graph of binned data?

I have a list of sizes for every 10 seconds. But instead of plotting single points, I want to plot a range for each point, like the example graph I attached.
I tried using hist2d(), but this makes every cell the same shape. The spacing of my data increases as the values go up, for example:
d=[0,0.062,0.8,1.1,1.5,2.8,4,7,11,16]
This is the type of graph I want:
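One possible approach, sketched here under assumptions: treat the values in d as the edges of the size bins and draw the cells with pcolormesh, which accepts unevenly spaced edges. The time edges and the counts array below are made up for illustration; only d comes from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

# Size bin edges from the question (unevenly spaced).
d = np.array([0, 0.062, 0.8, 1.1, 1.5, 2.8, 4, 7, 11, 16])

# Hypothetical time axis: one column per 10-second interval.
n_steps = 6
t_edges = np.arange(n_steps + 1) * 10.0

# Made-up counts: one row per size bin, one column per time step.
rng = np.random.default_rng(0)
counts = rng.integers(0, 50, size=(len(d) - 1, n_steps))

# With shading="flat", pcolormesh takes the cell edges, so each cell
# spans exactly one size bin and one time interval.
fig, ax = plt.subplots()
mesh = ax.pcolormesh(t_edges, d, counts, shading="flat")
fig.colorbar(mesh, ax=ax, label="count")
ax.set_xlabel("time (s)")
ax.set_ylabel("size")
```

If the input is raw (time, size) samples rather than pre-binned counts, hist2d also accepts explicit edge arrays, for example bins=[t_edges, d]. Call plt.show() to display the figure.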

Related

How can I make an interpolated function back into discrete data points at different x points than the original data?

I used the interpolate function on two data sets. Now I want to transform the resulting functions into discrete x-y datasets, with 500 points for each dataset. It feels like something I should be able to do, but I haven't been able to figure out how. Here is my code. Ch_resting_spores is a species of diatom within my dataframe.
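import matplotlib.pyplot as plt
from scipy.interpolate import interp1d

# counting and diatoms are pandas DataFrames loaded elsewhere
intersmall=interp1d(counting.age_med_Ka, counting.small)
# one cubic interpolant per column of diatoms
elems={}
for i in diatoms.columns:
    elems[i]=interp1d(diatoms.Age_ka,diatoms[i],kind='cubic')
fig,ax=plt.subplots()
ax.plot(diatoms["Age_ka"], elems['Ch_resting_spores'](diatoms["Age_ka"]),'r')
ax2=ax.twinx()
ax2.plot(counting['age_med_Ka'],intersmall(counting['age_med_Ka']))
ax.set_xlim(1030,1102)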
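A minimal sketch of one way to do this: an interp1d object is callable, so it can be evaluated on any new x grid that lies inside the original x range. The sample arrays below are made up and stand in for the Age_ka column and one species column.

```python
import numpy as np
from scipy.interpolate import interp1d

# Made-up stand-ins for diatoms.Age_ka and one species column.
age = np.array([1030.0, 1045.0, 1060.0, 1075.0, 1090.0, 1102.0])
abundance = np.array([2.0, 5.0, 3.0, 8.0, 6.0, 4.0])

f = interp1d(age, abundance, kind="cubic")

# 500 evenly spaced x values spanning the original range.
# Staying inside [age.min(), age.max()] avoids out-of-bounds errors.
x_new = np.linspace(age.min(), age.max(), 500)
y_new = f(x_new)

# x_new and y_new now form a discrete x-y dataset with 500 points.
```

The same x_new grid can be passed to each interpolant in a dictionary such as elems, provided every interpolant was built over a range that covers x_new.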

Build a histogram in python by giving bins parameters

I have the x and y obtained from a histogram, and I want to rebuild that histogram. How can I do that? I tried this:
plt.hist(x,bins=len(x),weights=y)
But it seems like the points are not exactly on the center of the bin and they get significantly shifted after a while (the points on the x-axis are not equally spaced).
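Passing bins=len(x) asks hist for that many equal-width bins between the smallest and largest x, which is why unevenly spaced points drift away from the bin centers. A hedged sketch of an alternative: build explicit bin edges and draw the bars directly. Here the edges are assumed to lie halfway between neighboring x values, which is only an approximation if the original bins were not built that way; if the original edges are known, use them instead. The x and y values are made up.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up histogram output: unevenly spaced bin positions and heights.
x = np.array([0.5, 1.5, 3.0, 5.5, 9.0])
y = np.array([4, 7, 3, 2, 1])

# Assumption: interior edges sit halfway between neighboring x values;
# the two outer edges mirror the nearest interior edge around x[0] and x[-1].
mid = (x[:-1] + x[1:]) / 2
first = x[0] - (mid[0] - x[0])
last = x[-1] + (x[-1] - mid[-1])
edges = np.concatenate(([first], mid, [last]))

# Draw one bar per bin, starting at its left edge with its own width.
fig, ax = plt.subplots()
ax.bar(edges[:-1], y, width=np.diff(edges), align="edge", edgecolor="black")
```

The same edges array can also be passed to plt.hist(x, bins=edges, weights=y), since each x then falls inside exactly one bin.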

What do the numbers on the axis mean when visualizing clusters in 2-dimensions?

I followed the code in this link.
What do the numbers on the x-axis and y-axis mean in this plot? Why are they discrete numbers?
When I used my own data, it gave me this kind of plot, and I can't understand what the plot is trying to say.

Because they are working with more than two dimensions (features), they use PCA to project the data into two dimensions so that it can be plotted. These two dimensions do not need to correspond to any of the dimensions of the original data.
So each data point is projected onto the dimensions PCA1 and PCA2, which are real-valued (not discrete).
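To illustrate what the two axes contain, here is a small sketch of a PCA projection done with NumPy's SVD. It is not the code from the linked tutorial, and the data is random; it only shows that the plotted coordinates are real-valued combinations of the original features.

```python
import numpy as np

# Made-up data: 100 samples with 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Center the data, then take the SVD; the rows of Vt are the principal axes.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Project every sample onto the first two principal axes.
# Column 0 is the PCA1 coordinate and column 1 the PCA2 coordinate.
scores = Xc @ Vt[:2].T
```

Each row of scores is one point in the 2D plot, so the axis numbers are just positions along these two derived directions.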

Is there a way to plot Matplotlib's Imshow against a specific array rather than the indices?

I'm trying to use imshow to plot a 2D Fourier transform of my data. However, imshow plots the data against its index in the array. I would like to plot the data against a set of arrays I have containing the corresponding frequency values (one array for each dimension), but I can't figure out how.
I have a 2D array of data (a Gaussian pulse signal) that I Fourier transform with np.fft.fft2. This all works fine. I then get the corresponding frequency bins for each dimension with np.fft.fftfreq(len(data))*sampling_rate. I can't figure out how to use imshow to plot the data against these frequencies though. The 1D equivalent of what I'm trying to do is using plt.plot(x,y) rather than just plt.plot(y).
My first attempt was to use imshow's "extent" argument, but as far as I can tell that just changes the axis limits, not the actual bins.
My next attempt was to use np.fft.fftshift to arrange the data in numerical order and then simply rescale the axis using this answer: Change the axis scale of imshow. However, the mapping from index to frequency bin is not a pure scaling factor; there is typically a constant offset as well.
Another attempt was to use a 2D histogram instead of imshow, but that doesn't work since a 2D histogram plots the number of times an ordered pair occurs, while I want to plot a scalar value corresponding to specific ordered pairs (i.e. the power of the signal at specific frequency combinations).
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
f = 200
st = 2500
x = np.linspace(-1,1,2*st)
y = signal.gausspulse(x, fc=f, bw=0.05)
data = np.outer(np.ones(len(y)),y) # A simple example with constant y
Fdata = np.abs(np.fft.fft2(data))**2
freqx = np.fft.fftfreq(len(x))*st # What I want to plot my data against
freqy = np.fft.fftfreq(len(y))*st
plt.imshow(Fdata)
I should see a peak at (200,0) corresponding to the frequency of my signal (with some falloff around it corresponding to the bandwidth), but instead my maximum occurs at some arbitrary position corresponding to the frequency's index in my data array. If anyone has any ideas, fixes, or other functions to use, I would greatly appreciate it!

I cannot run your code, but I think you are looking for the extent= argument to imshow(). See the page on origin and extent for more information.
Something like this may work?
plt.imshow(Fdata, extent=(freqx[0],freqx[-1],freqy[0],freqy[-1]))
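One caveat worth checking: np.fft.fftfreq returns frequencies in FFT order (zero first, then positive, then negative), so its first and last elements are not the minimum and maximum. A hedged sketch of one way to handle this is to apply np.fft.fftshift to both the data and the frequency arrays before building extent. The signal below is a small made-up cosine rather than the Gaussian pulse from the question, chosen so the example runs quickly.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up test signal: a 20 Hz cosine sampled at 128 Hz, repeated on every row.
n = 256
fs = 128.0
t = np.arange(n) / fs
row = np.cos(2 * np.pi * 20.0 * t)
data = np.outer(np.ones(n), row)

power = np.abs(np.fft.fft2(data)) ** 2
freqs = np.fft.fftfreq(n, d=1 / fs)

# Reorder both the data and the frequencies so they increase monotonically.
power_s = np.fft.fftshift(power)
freqs_s = np.fft.fftshift(freqs)

# extent gives the outer pixel edges, so pad by half a bin on each side.
df = freqs_s[1] - freqs_s[0]
lo, hi = freqs_s[0] - df / 2, freqs_s[-1] + df / 2

fig, ax = plt.subplots()
ax.imshow(power_s, extent=(lo, hi, lo, hi), origin="lower", aspect="auto")
ax.set_xlabel("frequency along columns")
ax.set_ylabel("frequency along rows")

# Locate the strongest bin and read its frequency from the shifted axes.
iy, ix = np.unravel_index(np.argmax(power_s), power_s.shape)
peak_fx, peak_fy = freqs_s[ix], freqs_s[iy]
```

Because the signal is real, the spectrum has matching peaks at plus and minus 20 on the horizontal axis; the row frequency of the peak is zero since every row is identical.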

How can I account for identical data points in a scatter plot?

I'm working with some data that has several identical data points. I would like to visualize the data in a scatter plot, but scatter plotting doesn't do a good job of showing the duplicates.
If I change the alpha value, then the identical data points become darker, which is nice, but not ideal.
Is there some way to map the color of a dot to how many times it occurs in the data set? What about size? How can I assign the size of the dot to how many times it occurs in the data set?

As it was pointed out, whether this makes sense depends a bit on your dataset. If you have reasonably discrete points and exact matches make sense, you can do something like this:
import numpy as np
import matplotlib.pyplot as plt
test_x=[2,3,4,1,2,4,2]
test_y=[1,2,1,3,1,1,1] # I am just generating some test x and y values. Use your data here
#Generate a list of unique points
points=list(set(zip(test_x,test_y)))
#Generate a list of point counts
count=[len([x for x,y in zip(test_x,test_y) if x==p[0] and y==p[1]]) for p in points]
#Now for the plotting:
plot_x=[i[0] for i in points]
plot_y=[i[1] for i in points]
count=np.array(count)
plt.scatter(plot_x,plot_y,c=count,s=100*count**0.5,cmap='Spectral_r')
plt.colorbar()
plt.show()
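Notice: You will need to adjust the marker size (the value 100 in the s argument) according to your point density. Keep in mind that s sets the marker area in points squared, so scaling by the square root of the count makes the area grow with the square root of the count; use s=100*count if the area should be proportional to the count.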
Also note: If you have very dense points, it might be more appropriate to use a different kind of plot. Histograms for example (I personally like hexbin for 2d data) are a decent alternative in these cases.
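As a variation on the counting step above, the unique points and their counts can also be obtained in one call with np.unique. This is a sketch of an alternative, not a change to the answer's method; here s is set proportional to the count so that the marker area scales with the count.

```python
import numpy as np
import matplotlib.pyplot as plt

# Same test values as in the answer above.
test_x = [2, 3, 4, 1, 2, 4, 2]
test_y = [1, 2, 1, 3, 1, 1, 1]

# Stack into an (n, 2) array and count identical rows.
pts = np.column_stack((test_x, test_y))
unique_pts, counts = np.unique(pts, axis=0, return_counts=True)

# Color and marker area both encode how often each point occurs.
fig, ax = plt.subplots()
sc = ax.scatter(unique_pts[:, 0], unique_pts[:, 1],
                c=counts, s=100 * counts, cmap="Spectral_r")
fig.colorbar(sc, ax=ax, label="occurrences")
```

This relies on exact equality, so it suits integer or otherwise discrete coordinates; floating-point data would usually need rounding or binning first.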
