Making a histogram via matplotlib - python

I have a problem. I'm working on an assignment for my class and doing my best, but the teacher does not seem to care, so I have to work out the problem myself while still meeting his requirements.
I had to write a program; what it does is not important, so I won't explain it. I just need to make a histogram to show the results. The problem is that I can't use .hist(), because we have to build our own histogram with .bar() from the matplotlib library.
Here is the code:
import random
import math
import matplotlib.pyplot as plt
import numpy as np
list = []
N = 1000
for i in range(N):
    x_1 = random.random()
    x_2 = random.random()
    xx = ((-2 * math.log(x_1)) ** (1 / 2)) * math.sin(2 * math.pi * x_2)
    xy = ((-2 * math.log(x_1)) ** (1 / 2)) * math.cos(2 * math.pi * x_2)
    list.append(xx)
    list.append(xy)
plt.hist(list, alpha=0.5)
plt.show()
I need to change the plt.hist() to plt.bar(), doing so I end up with this:
plt.bar(list, y_pos, align='center', alpha=0.5)
The bars overlap and the histogram is unclear. The teacher's assistant told me to sum up the bars like this: when a value is between, say, 1 and 1.99 you add it to bar 1, when it is between 2 and 2.99 you add it to bar 2, and so on.
I don't know how to do this, please help.

First off, using numpy allows vectorization and makes everything much faster. Also, please don't use list as a variable name, because it shadows the built-in of that name.
Calculating a histogram "by hand" isn't that difficult.
First you need to decide on some bins; usually they are regularly spaced over the complete domain of the data. By default, plt.hist uses 10 bins, equally divided over the range from the minimum to the maximum of the data.
Then you create an array to count the number of x values that fall inside each of the 10 bins. To know in which of the bins (numbered 0 to 9) a particular x falls, subtract the minimum x and divide by the range. This gives a number between 0 and 1. Multiplying by the number of bins gives a float between 0 and 10. Converting that number to an integer gives the index of the bin that should be incremented by one. Multiplying by a factor slightly lower than 1 prevents the maximum x from being put into bin index 10, which doesn't exist.
The code below first creates a bar plot in yellow and then draws a standard histogram in transparent red on top of it. As both coincide perfectly, together they show an orange histogram.
import matplotlib.pyplot as plt
import numpy as np
N = 1000
x_1 = np.random.random(N)
x_2 = np.random.random(N)
xx = ((-2 * np.log(x_1)) ** (1 / 2)) * np.sin(2 * np.pi * x_2)
xy = ((-2 * np.log(x_1)) ** (1 / 2)) * np.cos(2 * np.pi * x_2)
lst = np.concatenate([xx, xy])
minx = min(lst)
maxx = max(lst)
bins = 10
bin_counts = np.zeros(bins)
bin_factor = bins * 0.999999 / (maxx - minx)
for x in lst:
    bin_counts[int((x - minx) * bin_factor)] += 1
plt.bar(np.linspace(minx, maxx, bins, endpoint=False), bin_counts, width=(maxx - minx) / bins,
        alpha=.5, ec='w', align='edge', color='yellow')
plt.hist(lst, alpha=0.5, bins=bins, ec='w', color='red')
plt.show()
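The counting loop can also be vectorized, in line with the earlier remark about numpy. This is a minimal sketch, assuming synthetic normal data from a seeded generator; the names rng, data, idx and ref_counts are illustrative and not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only so the sketch is reproducible
data = rng.normal(size=2000)     # stand-in for the Box-Muller samples

bins = 10
minx, maxx = data.min(), data.max()

# Same index formula as the loop, applied to the whole array at once.
idx = ((data - minx) * (bins * 0.999999 / (maxx - minx))).astype(int)
bin_counts = np.bincount(idx, minlength=bins)

# Reference counts from numpy's own histogram routine, for comparison.
ref_counts, edges = np.histogram(data, bins=bins)
print(bin_counts)
print(ref_counts)
```

The two arrays should agree, apart from a rare sample that happens to lie within the 0.999999 margin of a bin edge.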
To draw a standard normal (Gaussian) curve over the histogram, it needs to be scaled by the number of samples and by the width of the bars (1 / bin_factor is approximately the bar width):
from scipy.stats import norm
x = np.linspace(minx, maxx, 200)
plt.plot(x, norm.pdf(x, 0, 1)*len(lst)/bin_factor, color='green', lw=2)
PS: You might want to read more about the Box-Muller transform.
To have bars that are only 0.1 wide, you could change the code as follows. You'd also need a larger N to keep the bar heights from being very irregular.
minx = -2
maxx = 2
bins = 40
bin_counts = np.zeros(bins)
factor = bins * 0.999999 / (maxx - minx)
for x in lst:
    if minx <= x <= maxx: # this test is needed when minx is larger than the real minimum (similar for maxx)
        bin_counts[int((x - minx) * factor)] += 1
# optionally show ticks every .1 steps
plt.xticks(np.arange(minx, maxx+0.001, 0.1), rotation=90)
The following plot uses N = 10000

Whoa, excellent. This looks awesome. I hope there won't be any problems with the labels on the x axis. Would there be a way to make the steps ten times smaller, so I get for example -0.3, -0.2, -0.1, 0, 0.1, 0.2, 0.3?
I've also used that method for another task.
import random
import math
from scipy import constants
import matplotlib.pyplot as plt
import numpy as np
kb = constants.Boltzmann
m = 5e-26
var = (kb*300/m)**(1/2)
v_list = []
v = 0
N = 1000
for i in range(N):
    x_1 = random.random()
    x_2 = random.random()
    n1 = ((-2 * math.log(x_1, math.e)) ** (1 / 2)) * math.cos(2 * math.pi * x_2) * var
    n2 = ((-2 * math.log(x_1, math.e)) ** (1 / 2)) * math.sin(2 * math.pi * x_2) * var
    y_1 = random.random()
    y_2 = random.random()
    # natural log here as well; log10 would scale this component differently from n1 and n2
    n3 = ((-2 * math.log(y_1)) ** (1 / 2)) * math.cos(2 * math.pi * y_2) * var
    v += math.sqrt(n1*n1+n2*n2+n3*n3)
    if math.sqrt(n1*n1+n2*n2+n3*n3) <= 900:
        v_list.append(math.sqrt(n1*n1+n2*n2+n3*n3))
    else:
        continue
minx = min(v_list)
maxx = max(v_list)
bins = 10
bin_counts = np.zeros(bins)
factor = bins * 0.999999 / (maxx - minx)
for x in v_list:
    bin_counts[int((x - minx) * factor)] += 1
plt.bar(np.linspace(minx, maxx, bins, endpoint=False), bin_counts, width=(maxx - minx) / bins,
        alpha=0.5, ec='w', align='edge', color='green')
plt.show()
I had some problems with plt.savefig(), but it worked later. I also could not convert the math calls to numpy, so I left them as they are.
EDIT: I may have a cool task with the Mandelbrot set later. I looked for an answer all over the place but could not manage it. It takes a strange approach that I could not work out how to solve. All this in addition to teachers who don't want to simply tell me the answer. I had high hopes for these Python lessons, but what I'm left with is simply disgust.

Related

How to correctly extract the phase of the spectrum in python

I describe a pulse in the time domain and apply a Fourier transform to convert it to the frequency domain.
In the frequency domain I multiply it by a phase factor e^{i*phase}, where phase is a polynomial.
I then use numpy's angle function to extract the phase, and what I get is the dense set of peaks shown in the figure. I don't know whether this is correct, or how I should extract the polynomial again.
import numpy as np
import matplotlib.pyplot as plt
fs = 1e-15
THz = 1e12
nm = 1e-9
c = 3e8
N = 2 ** 13
time_window = 3000 * fs
wavelength = 800 * nm
t = np.linspace(-time_window / 2,time_window / 2, N)
df = np.append(np.linspace(0, N / 2, int(N / 2)),(np.linspace(-N / 2, -1, int(N / 2))))/ time_window
f = c/wavelength + df
dw = 2 * np.pi * df
FWHM = 50 * fs
m = 4 * np.log(2)
A_t = np.exp(-m * t ** 2 * (1 / 2) / FWHM ** 2)
A_w = np.fft.fft(A_t)
GDD = 500 * fs*fs
TOD = 0 * fs*fs*fs
FOD = 0
A_w = np.exp(1j * (GDD / 2.0) * dw**2 +
             1j * (TOD / 6.0) * dw ** 3 +
             1j * (FOD / 24.0) * dw ** 4) * A_w
fig_1 = plt.figure(1, facecolor='w', edgecolor='k')
ax_1 = fig_1.add_subplot(1, 1, 1)
ax_2 = ax_1.twinx()
ax_1.plot(np.fft.fftshift(f/THz),np.fft.fftshift(np.abs(A_w) ** 2 / max(np.abs(A_w) ** 2)),'b')
ax_2.plot(np.fft.fftshift(f/THz),np.fft.fftshift(np.angle(A_w)),'r')
ax_1.set_ylabel('Intensity / a.u.')
ax_2.set_ylabel('Phase / rad')
ax_1.tick_params(axis='y', colors='b')
ax_2.tick_params(axis='y', colors='r')
plt.xlim(300,450)
plt.show()
Two things. To center your data relative to phase, you need to either fftshift your data before the FFT, or flip the sign of the imaginary component in every other result element.
Then look at the magnitude result. When the magnitudes go sufficiently near zero, the phase becomes that of random numerical noise, rather than informative. So the phase of near zero magnitudes can be zeroed to make the plot look cleaner.
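Both suggestions can be tried on a small synthetic pulse. The following is only a sketch under stated assumptions: it uses dimensionless units rather than the femtosecond scaling of the question, a hypothetical gdd value of 3.0, and an arbitrary magnitude threshold of 1e-3 of the peak. It masks the low-magnitude region instead of zeroing it, then unwraps the remaining phase and fits a quadratic to recover the coefficient.

```python
import numpy as np

N = 2 ** 12
dt = 0.05
t = (np.arange(N) - N // 2) * dt            # time axis with t = 0 at index N // 2
pulse = np.exp(-t ** 2 / 2.0)               # Gaussian pulse (unit width, assumed)

# Shift t = 0 to index 0 before the FFT so the symmetric pulse has zero phase.
spectrum = np.fft.fft(np.fft.ifftshift(pulse))
w = 2 * np.pi * np.fft.fftfreq(N, d=dt)     # angular frequency of each FFT bin

gdd = 3.0                                   # hypothetical quadratic phase coefficient
spectrum = spectrum * np.exp(1j * (gdd / 2.0) * w ** 2)

# Put the frequencies in ascending order for unwrapping and fitting.
w_sorted = np.fft.fftshift(w)
spec_sorted = np.fft.fftshift(spectrum)

# Keep only bins whose magnitude is well above numerical noise.
mag = np.abs(spec_sorted)
mask = mag > 1e-3 * mag.max()

# Unwrap the phase on the kept bins and fit a second-order polynomial.
phase = np.unwrap(np.angle(spec_sorted[mask]))
coeffs = np.polyfit(w_sorted[mask], phase, 2)
gdd_est = 2 * coeffs[0]
print(gdd_est)
```

Unwrapping only works while the phase changes by less than pi between neighbouring bins, so a larger chirp would need a finer frequency grid (a longer time window).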

Python separate sin waves Fourier Transform

I'm currently trying to make a sine wave separator on replit.com. However, when I run it, the bottom graph is off. There should be one spike, but there are multiple spikes in that area. It uses the function e ^ (-2pi * i * frequency * the height of the sine wave). Can anyone help me? The math I am basing it on is from this video. Thank you!
for i in range(180):
    height.append(cos(2 * i * const))
center = (0 + 0j)
centers = []
centers2 = []
centers3 = []
for a in range(180):
    f = a * const
    center = (0 + 0j)
    for a, i in enumerate(height):
        center += -i * e**(-2 * pi * 1j * f * a)
    #maybe fix equation? It looks off...
    center *= 1/(a)
    centers3.append(sqrt(center.real ** 2 + center.imag **2))
    centers.append(center.real)
    centers2.append(center.imag)
The thing in the exponential has a factor of 2 pi too much in it. So you could say that f is a factor of 2 pi too high, or that the factor of 2 pi does belong in f but then it shouldn't be repeated in the exponential, either way it's in there twice while it should be in there once.
Changed code, plus minor other edits:
from math import cos, pi, e, sqrt, atan2
import matplotlib.pyplot as plt
import numpy as np
const = 2 * pi/180
height = []
for i in range(180):
    height.append(cos(2 * i * const))
amplitudes = []
phases = []
print(90 / 180, 90 * const)
for a in range(180):
    f = a / 180
    center = (0 + 0j)
    for index, sample in enumerate(height):
        center += sample * e**(-2 * pi * 1j * f * index)
    # fixed equation
    amplitudes.append(sqrt(center.real ** 2 + center.imag **2))
    phases.append(atan2(center.imag, center.real))
x = np.linspace(0, np.pi, 180)
y = height
fig, ax = plt.subplots(2,2)
ax[0,0].plot(x, y)
ax[1,0].plot(x, amplitudes)
ax[1,1].plot(x, phases, 'orange')
#ax[0,1].plot(x, centers3, 'orange')
plt.show()
Output:
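As a cross-check on the corrected sum, the same spectrum can be computed with np.fft.fft, which evaluates the same discrete Fourier transform as the double loop. This is a sketch only; the names n, samples and peak_bins are illustrative, and the threshold of 1.0 is an arbitrary cut between the peaks and rounding noise.

```python
import numpy as np

n = 180
# Same signal as `height`: two full cosine cycles over 180 samples.
samples = np.cos(2 * np.arange(n) * 2 * np.pi / n)

spectrum = np.fft.fft(samples)
amplitudes = np.abs(spectrum)

# Bins that stand out from rounding noise.
peak_bins = np.flatnonzero(amplitudes > 1.0)
print(peak_bins, amplitudes[peak_bins])
```

A real cosine with two cycles in the window shows up at bin 2 and at its mirror bin 178, each with amplitude n / 2 = 90, so two spikes in the full-range amplitude plot are expected.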

Adding the output of one graph to another graph

I would like to find a way to translate the bottom graph (from y = -20 to 0) and add it onto the graph above (from y = 0 to 20), so that the final domain runs from y = 0 to 20:
However, I am having trouble doing this, because the grid I used to draw the bottom graph (R12) already has negative inputs, so it will not show up on the positive y-axis. Here is my code:
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
N = 1001
lower = 0
upper = 20
u = np.linspace(lower, upper, N)
t1a, t3a = np.meshgrid(u,-u)
t1, t3 = np.meshgrid(u,u)
omega = 10
delta = 5
tau = 2
mu = 1
t2 = 0.1
def g(t):
    return delta * (omega ** 2) * (tau ** 2) * ((np.e ** (-t/tau))+(t/tau)-1)
R12 = 1 * (mu ** 4) * (np.e **(-1j * omega * (t3a-(t1a)))) * (np.e ** (-g(t1a)+g(t2)-g(t3a)-g(t1a+t2)-g(t2+t3a)+g(t1a+t2+t3a)))
R45 = 1 * (mu ** 4) * (np.e ** (-1j * omega * (t3+t1))) * (np.e ** (-g(t1)-g(t2)-g(t3)+g(t1+t2)+g(t2+t3)-g(t1+t2+t3)))
R12_fft = np.fft.fftshift((np.fft.fft2((R12)))) / np.sqrt(len(R12))
R45_fft = np.fft.fftshift((np.fft.fft2((R45)))) / np.sqrt(len(R45))
R_pure = (R12_fft + R45_fft)
plt.contourf(t1a,t3a,R12_fft, cmap = 'seismic')
plt.contourf(t1,t3,R45_fft, cmap = 'seismic')
plt.xlabel(r'${\omega}_{3}$', fontsize = 24)  # raw string so that \o is not treated as an escape
plt.ylabel(r'${\omega}_{1}$', fontsize = 24)
plt.xlim(-20, 20)
plt.ylim(-20, 20)
plt.gca().set_aspect('equal', adjustable='box')
plt.colorbar()
plt.show()
As an example, if I try to do the simple adding:
plt.contourf(t1,t3,R_pure, cmap = 'seismic')
it basically gives me back the same shape of the graph. What I would like instead is superimposing the bottom graph onto the top and adding the output together. Is there any way I can achieve this? Thank you!
I feel like it's a bit too dumb, so it's probably wrong.
R_pure = (R12_fft + abs(R45_fft))

How to correctly sample a density?

I do not understand why the following code works with the normal density but not with a custom one.
Here is the example where I sample the normal distribution:
n = 100000
xx = np.random.uniform(-5, 5, n)
rho = mpl.pylab.normpdf(xx, 0, 1)
rnd = np.random.rand(n)
ix = np.where(rho > rnd)
xx = xx[ix]
h = plt.hist(xx, bins=20, normed=True)
# plot density
x = np.linspace(-5, 5, 100)
plt.plot(x, mpl.pylab.normpdf(x, 0, 1))
It works and I got:
Now if I change the density, it is no longer sampled correctly. I checked that the density is properly normalized, and it is, so I do not understand where I am going wrong.
n = 100000
xx = np.random.uniform(0, 1, n)
rho = 2 * np.sin(2 * xx * np.pi)**2
rnd = np.random.rand(n)
ix = np.where(rho > rnd)
xx = xx[ix]
h = plt.hist(xx, bins=20, normed=True)
# plot density
x = np.linspace(0, 1, 100)
print(np.trapz(2 * np.sin(2 * x * np.pi)**2, x))
plt.plot(x, 2 * np.sin(2 * x * np.pi)**2)
You are doing rejection sampling.
In the first case the max value of the pdf is < 1, and you are drawing rnd from [0,1], so all the values are below the max. You are throwing away more values than needed though, since the max is strictly less than 1. In the second case the max of the pdf is 2 but you are still drawing rnd from [0,1] in the line
rnd = np.random.rand(n)
You should change that line so it samples uniformly from [0,2]. Note that the somewhat flat tops of your histograms correspond to the parts of [0,1] where the pdf is > 1. Your code has no way of treating some of those values differently than others.
You're rejecting too much in the first example, and not enough in the second.
The optimal case is when you sample Y uniformly from 0 to PDFmax.
In the first case, you should call
rnd = np.random.rand(n) / np.sqrt(2.0 * np.pi)
In the second case
rnd = 2.0 * np.random.rand(n)
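Put together, the fix amounts to drawing the vertical coordinate from [0, PDFmax] instead of [0, 1]. The following is a minimal sketch of that idea for the second density; the helper names rejection_sample and sin2_pdf, the seeded generator and the histogram comparison are illustrative assumptions, not code from the question.

```python
import numpy as np

def rejection_sample(pdf, low, high, pdf_max, n, rng):
    """Draw n candidate points and keep those that fall under the pdf curve."""
    x = rng.uniform(low, high, n)        # candidate positions
    y = rng.uniform(0.0, pdf_max, n)     # heights drawn from [0, pdf_max]
    return x[y < pdf(x)]

def sin2_pdf(x):
    # The custom density from the question; its maximum value is 2.
    return 2 * np.sin(2 * np.pi * x) ** 2

rng = np.random.default_rng(0)           # seeded only for reproducibility
n = 100_000
samples = rejection_sample(sin2_pdf, 0.0, 1.0, 2.0, n, rng)

# Compare a normalized histogram of the kept samples with the density.
hist, edges = np.histogram(samples, bins=20, range=(0.0, 1.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
print(len(samples) / n)                  # acceptance rate
print(np.max(np.abs(hist - sin2_pdf(centers))))
```

Because the density integrates to 1 and the sampling box has area 2, about half of the candidates are kept.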

How to generate random points in a circular distribution

I am wondering how I could generate random numbers that appear in a circular distribution.
I am able to generate random points in a rectangular distribution such that the points are generated within the square (0 <= x < 1000, 0 <= y < 1000):
How would I go about generating the points within a circle such that:
(x−500)^2 + (y−500)^2 < 250000 ?
import random
import math
# radius of the circle
circle_r = 10
# center of the circle (x, y)
circle_x = 5
circle_y = 7
# random angle
alpha = 2 * math.pi * random.random()
# random radius
r = circle_r * math.sqrt(random.random())
# calculating coordinates
x = r * math.cos(alpha) + circle_x
y = r * math.sin(alpha) + circle_y
print("Random point", (x, y))
In your example, circle_x and circle_y are both 500, and circle_r is 500.
Another way to calculate the radius so that the points are uniformly distributed, based on this answer:
u = random.random() + random.random()
r = circle_r * (2 - u if u > 1 else u)
FIRST ANSWER:
An easy solution would be to do a check to see if the result satisfies your equation before proceeding.
Generate x, y (there are ways to randomize into a select range)
Check if ((x−500)^2 + (y−500)^2 < 250000) is true
if not, regenerate.
The only downside would be inefficiency.
SECOND ANSWER:
Or, you could do something similar to the Riemann sums used for approximating integrals. Approximate your circle by dividing it up into many rectangles (the more rectangles, the more accurate), and use your rectangle algorithm for each rectangle within your circle.
What you need is to sample from (polar form):
r, theta = [math.sqrt(random.randint(0,500))*math.sqrt(500), 2*math.pi*random.random()]
You can then transform r and theta back to cartesian coordinates x and y via
x = 500 + r * math.cos(theta)
y = 500 + r * math.sin(theta)
Related (although not Python), but gives the idea.
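You can use the code below; to learn more, see
https://programming.guide/random-point-within-circle.html
import random
import math
circle_x = 500
circle_y = 500
circle_r = 500  # radius from the question: (x-500)^2 + (y-500)^2 < 250000
# the angle must be a random fraction of a full turn; an integer multiple of 2*pi always points the same way
a = random.random() * 2 * math.pi
# the square root keeps the points uniform over the area of the circle
r = circle_r * math.sqrt(random.random())
x = r * math.cos(a) + circle_x
y = r * math.sin(a) + circle_y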
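Here's an example that picks random points on the circle itself (its circumference); I hope it helps someone :).
import numpy as np
import matplotlib.pyplot as plt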
randProba = lambda a: a/sum(a)
npoints = 5000 # points to chose from
r = 1 # radius of the circle
plt.figure(figsize=(5,5))
t = np.linspace(0, 2*np.pi, npoints, endpoint=False)
x = r * np.cos(t)
y = r * np.sin(t)
plt.scatter(x, y, c='0.8')
n = 2 # number of points to chose
t = np.linspace(0, 2*np.pi, npoints, endpoint=False)[np.random.choice(range(npoints), n, replace=False, p=randProba(np.random.random(npoints)))]
x = r * np.cos(t)
y = r * np.sin(t)
plt.scatter(x, y)
You can use rejection sampling: generate a random point within the (2r)×(2r) square that covers the circle, and repeat until you get a point within the circle.
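A minimal sketch of that rejection loop, assuming the circle from the question (center (500, 500), radius 500); the function name random_point_in_circle and its parameters are illustrative.

```python
import random

def random_point_in_circle(cx, cy, radius, rng=random):
    """Return a uniformly distributed point strictly inside the circle."""
    while True:
        # Candidate point in the square that covers the circle.
        x = rng.uniform(cx - radius, cx + radius)
        y = rng.uniform(cy - radius, cy + radius)
        # Keep it only if it satisfies the circle inequality.
        if (x - cx) ** 2 + (y - cy) ** 2 < radius ** 2:
            return x, y

rng = random.Random(1)   # seeded only for reproducibility
points = [random_point_in_circle(500, 500, 500, rng) for _ in range(1000)]
print(points[:3])
```

The circle covers pi/4 (about 78.5%) of the square, so on average fewer than 1.3 candidates are needed per accepted point.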
I would use polar coordinates:
r_squared, theta = [random.randint(0,250000), 2*math.pi*random.random()]
Then r is always less than or equal to the radius, and theta is always between 0 and 2*pi radians.
Since the circle is not centered at the origin, shift the result so that it is centered at (500, 500), if I understand correctly:
x = 500 + math.sqrt(r_squared)*math.cos(theta)
y = 500 + math.sqrt(r_squared)*math.sin(theta)
Choose r_squared randomly because of this
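The reason for drawing the squared radius (or, equivalently, taking the square root of a uniform number) is that the area inside radius r grows with r squared. A small sketch to check this, assuming the circle from the question and a continuous random.random() draw instead of randint; the names point_in_circle and inner_fraction are illustrative. If the points are uniform over the disc, about a quarter of them should lie within half the radius.

```python
import math
import random

def point_in_circle(cx, cy, radius, rng):
    """Polar sampling with a square-root radius, uniform over the disc."""
    theta = 2 * math.pi * rng.random()
    r = radius * math.sqrt(rng.random())
    return cx + r * math.cos(theta), cy + r * math.sin(theta)

rng = random.Random(0)   # seeded only for reproducibility
points = [point_in_circle(500, 500, 500, rng) for _ in range(20000)]

# The inner disc of half the radius has one quarter of the total area.
inner = sum(1 for x, y in points if math.hypot(x - 500, y - 500) < 250)
inner_fraction = inner / len(points)
print(inner_fraction)
```

Drawing r itself uniformly would instead put about half of the points inside the inner disc, which is why the plain polar approach clusters points near the center.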
