matplotlib doesn't display the correct data - python

I am new to Python. For some reason when I look at the plot it displays all the data as if Y = 0 but the last one, which is weird since when I ask it to print Y it displays the right values. What am I doing wrong?
import math
import numpy as np
import matplotlib.pyplot as plt
y0=2 # [m]
g=9.81 # [m/s^2]
v=20 # initial speed [m/s]
y_target=1 # [m]
x=35 # [m]
n_iter=50
theta=np.linspace(0,0.5*math.pi,n_iter) # theta input [rad]
Y=np.zeros(n_iter) # y output [m]
for i in range(n_iter):
    Y[i]=math.tan(theta[i])*x-g/(2*(v*math.cos(theta[i]))**2)*x**2+y0
plt.plot(theta,Y)
plt.ylabel('y [m]')
plt.xlabel('theta [rad]')
plt.ylim(top=max(Y),bottom=min(Y))
plt.show()

The problem is that the function blows up a bit as theta approaches π/2. Notice the little 1e33 at the top of the y-axis in the plot: the scale of that axis is huge, because the last value of y is essentially minus infinity (because of dividing by almost zero). If you change the limits of the y-axis, e.g. to (-1000, +1000), the plot looks correct.
But I can't resist helping you with something you didn't ask for help on... You are not using NumPy correctly. NumPy gives you two things: n-dimensional arrays as a data structure, and fast, optimized code for 'vectorized' computing with those arrays. In essence, you never need a loop in NumPy — you just compute with everything at once. Try doing 10 * np.array([1, 2, 3]) and you will get the idea.
So I would write your code like this:
import numpy as np
import matplotlib.pyplot as plt
# Problem parameters.
y0 = 2 # [m]
g = 9.81 # [m/s^2]
v = 20 # initial speed [m/s]
x = 35 # [m]
# Make theta [rad].
steps = 50
theta = np.linspace(0, 0.5*np.pi, steps)
# Compute y.
y = np.tan(theta) * x - g / (2 * (v * np.cos(theta))**2) * x**2 + y0
# Plot.
plt.plot(theta, y)
plt.ylabel('y [m]')
plt.xlabel('theta [rad]')
plt.ylim(-1000, 1000)
plt.show()
Notice that there's no loop — you just use the vector theta as if it were a scalar. And the math library (which can't handle NumPy's arrays, only scalars) is not needed at all when you're using NumPy.
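To see the blow-up numerically rather than reading it off the axis label, here is a minimal sketch using the same parameters. The cutoff of 1000 and the name y_plot are illustrative assumptions, not part of the original answer; replacing out-of-range values with NaN is just one way to keep them from stretching the axis.

```python
import numpy as np

# Same problem parameters as above.
y0, g, v, x = 2, 9.81, 20, 35
theta = np.linspace(0, 0.5 * np.pi, 50)
y = np.tan(theta) * x - g / (2 * (v * np.cos(theta))**2) * x**2 + y0

# The last sample sits at theta = pi/2, where cos(theta) is almost zero,
# so the second term becomes enormous and negative.
print("last value:", y[-1])
print("largest magnitude among the other values:", np.abs(y[:-1]).max())

# Hypothetical cutoff: values outside [-cutoff, cutoff] become NaN,
# which matplotlib skips when drawing a line.
cutoff = 1000.0
y_plot = np.where(np.abs(y) <= cutoff, y, np.nan)
```

Passing y_plot to plt.plot would have a similar effect to setting plt.ylim(-1000, 1000), except that the out-of-range points are dropped instead of clipped by the axes.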

Related

How can I find output frequencies of an FFT in Python, where amplitude is greater than 100?

I have a plot of 3 Dirac Delta functions after computing an fft using scipy. I want to find the 3 frequencies at which the delta dirac functions occur, therefore, where the y component(amplitude) is greater than 0, but how do I then find their corresponding x component (frequency). Is there a simpler way to print the dominating output frequencies of an fft?
I tried using the np.interp function but it accepts x values and returns y values. I tried inputting the reverse but it only returned the maximum frequency. I don't have an equation simply relating x and y as I've used an fft on my x and y values.
I found all y values above a certain level, y>100 in this case.
But how do I find their corresponding x values?
Fourier Plot
import matplotlib.pyplot as plt
import numpy as np
import math
import pandas as pd
from scipy.fft import fft, fftfreq
import scipy
sample_rate = 44100
duration = 5
N = sample_rate*duration
time = x = np.linspace(0, duration, N, endpoint=False)
amplitude = np.sin(7000*time*2*np.pi)*10 + np.cos(time*5000*(2*np.pi)) + np.sin(time*200*2*np.pi)*5
plt.plot(amplitude[:1000])
from scipy.fft import rfft, rfftfreq
yf = scipy.fft.rfft(amplitude)
xf = scipy.fft.rfftfreq(N, 1/sample_rate)
plt.plot(xf, np.abs(yf))
plt.xlim(0, 10000)
def check_amp(number):
    if np.abs(number) > 100:
        return True
    return False
fft_outputs_iterator = filter(check_amp, yf)
fft_outputs = list(fft_outputs_iterator)
print(np.abs(fft_outputs))
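One possible way to get the frequencies, sketched under the assumption that "dominant" simply means "magnitude above a fixed threshold": build a boolean mask from the spectrum and apply the same mask to the frequency array, since both arrays share their indices. This sketch uses np.fft.rfft and np.fft.rfftfreq, which return the same one-sided spectrum and frequency bins as the scipy.fft functions in the question; the names mask, peak_freqs and peak_amps are made up for illustration.

```python
import numpy as np

sample_rate = 44100
duration = 5
N = sample_rate * duration
time = np.linspace(0, duration, N, endpoint=False)
amplitude = (10 * np.sin(2 * np.pi * 7000 * time)
             + np.cos(2 * np.pi * 5000 * time)
             + 5 * np.sin(2 * np.pi * 200 * time))

yf = np.fft.rfft(amplitude)                 # one-sided spectrum
xf = np.fft.rfftfreq(N, 1 / sample_rate)    # matching frequency bins in Hz

# xf and yf have the same length, so one boolean mask indexes both.
mask = np.abs(yf) > 100
peak_freqs = xf[mask]
peak_amps = np.abs(yf[mask])

for f, a in zip(peak_freqs, peak_amps):
    print(f"{f:.1f} Hz  |Y| = {a:.1f}")
```

This works cleanly here because each tone falls exactly on a frequency bin. With real data a peak usually leaks into neighbouring bins, so a peak finder such as scipy.signal.find_peaks may be a better fit than a plain threshold.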

How to validate the downsampling is as intended

How can I validate whether the downsampled output is correct? For example, I made the example below, but I am not sure whether the output is correct.
Any ideas on how to validate it?
Code
import numpy as np
import matplotlib.pyplot as plt # For plotting
from scipy import signal
import mne
fs = 100 # sample rate
rsample=50 # downsample frequency
fTwo=400 # frequency of the signal
x = np.arange(fs)
y = [ np.sin(2*np.pi*fTwo * (i/fs)) for i in x]
f_res = signal.resample(y, rsample)
xnew = np.linspace(0, 100, f_res.size, endpoint=False)
#
# ##############################
#
plt.figure(1)
plt.subplot(211)
plt.stem(x, y)
plt.subplot(212)
plt.stem(xnew, f_res, 'r')
plt.show()
Plotting the data is a good first take at verification. Here I made a regular plot with the points connected by lines. The lines are useful since they give a guide for where you expect the downsampled data to lie, and they also emphasize what the downsampled data is missing. (It would also work to show lines only for the original data, but vertical lines, as in a stem plot, are too confusing, imho.)
import numpy as np
import matplotlib.pyplot as plt # For plotting
from scipy import signal
fs = 100 # sample rate
rsample=43 # downsample frequency
fTwo=13 # frequency of the signal
x = np.arange(fs, dtype=float)
y = np.sin(2*np.pi*fTwo * (x/fs))
print(y)
f_res = signal.resample(y, rsample)
xnew = np.linspace(0, 100, f_res.size, endpoint=False)
#
# ##############################
#
plt.figure()
plt.plot(x, y, '-o')
plt.plot(xnew, f_res, '-or')
plt.show()
A few notes:
If you're trying to make a general algorithm, use non-rounded numbers, otherwise you could easily introduce bugs that don't show up when things are even multiples. Similarly, if you need to zoom in to verify, go to a few random places, not, for example, only the start.
Note that I changed fTwo to be significantly less than the number of samples. You need more than two data points per oscillation (the Nyquist criterion) if you want to make sense of it.
I also removed the loop for calculating y: in general, you should try to vectorize calculations when using numpy.
The spectrum of the resampled signal should have a tone at the same frequency as the input signal just in a smaller nyquist bandwidth.
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
import scipy.fftpack as fft
fs = 100 # sample rate
rsample=50 # downsample frequency
fTwo=10 # frequency of the signal
n = np.arange(1024)
y = np.sin(2*np.pi*fTwo/fs*n)
y_res = signal.resample(y, len(n)//2)  # the sample count must be an integer
Y = fft.fftshift(fft.fft(y))
f = fs*np.arange(-512, 512)/1024
Y_res = fft.fftshift(fft.fft(y_res, 1024))
f_res = fs/2*np.arange(-512, 512)/1024
plt.figure(1)
plt.subplot(211)
plt.stem(f, abs(Y))
plt.subplot(212)
plt.stem(f_res, abs(Y_res))
plt.show()
The tone is still at 10.
If you downsample a signal, both signals will still have the exact same value at a given time, so just loop through time and check that the values are the same. In your case you go from a sample rate of 100 to 50. Assuming you have 1 second's worth of data (from building your x from fs), loop from t = 0 to t = 1 in increments of 1/50 and make sure that Yd(t) = Ys(t), where Yd is the downsampled signal and Ys is the originally sampled signal. Or, to say it simply, Yd(n) = Ys(2n) for n = 0, 1, 2, ... up to the last downsampled index.
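The sample-by-sample check can be sketched as follows. It assumes a test tone that is periodic over the window and lies below the new Nyquist frequency (13 cycles in 100 samples, halved to 50 samples); for signals that do not meet those assumptions, scipy.signal.resample is not expected to reproduce every second sample exactly. The names f_sig and max_err are illustrative.

```python
import numpy as np
from scipy import signal

fs = 100        # original number of samples in the one-second window
f_sig = 13      # tone frequency, well below the new Nyquist limit of 25
n = np.arange(fs)
y = np.sin(2 * np.pi * f_sig * n / fs)

# Downsample by a factor of two.
y_res = signal.resample(y, fs // 2)

# Each downsampled point should match every second original point.
max_err = np.max(np.abs(y_res - y[::2]))
print("largest difference:", max_err)
```

A difference close to machine precision indicates the resampling did what was intended. A large difference points to aliasing or a mismatch in the time axis.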

Simultaneously fit linearly every line of a 2d numpy array

I am working in Python on image analysis. I have an image (2d numpy array) with some intensity drift in it. I want to level it.
To remove the increasing/decreasing intensity over the width of the image, I want to fit every row of the 2d numpy array with a line. I however do not want to loop through every row index.
MWE:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
width=1500
height=2500
fill_fun = lambda x,a,b : a*x+b
play_image = fill_fun(np.tile(np.arange(width),(height,1)),0.15,2)+np.random.random( (height,width) )
#For representation purposes:
#plt.imshow(play_image,cmap='Greys_r')
#plt.show()
#1) Fit every row and kill the intensity decrease/increase tendency
fit_func = lambda p,x: p[0]*x+p[1]
errfunc = lambda p, x, y: abs(fit_func(p, x) - y) # Distance to the target function
x_axis=np.linspace(0,width,width)
for i in range(height):
    row_val=play_image[i,:]
    p0=[(row_val[-1]-row_val[0])/float(width),row_val[0]] #guess
    p1, success = optimize.leastsq(errfunc, p0[:], args=(x_axis,row_val))
    play_image[i,:]-= fit_func(p1,x_axis)-p1[1]
By doing this I effectively level my image intensity horizontally. Is there any way I can replace the loop with a matrix operation? To somehow fit all the lines at the same time with a (height,2) parameter vector?
Thanks for the help
Fitting a line is a simple formula to use directly, which can be done in about three short lines in numpy (most of the code below is just making and plotting the data and fits):
import numpy as np
import matplotlib.pyplot as plt
# make the data as sequential sections of a circle
theta = np.linspace(np.pi, 0, 120)
y = np.reshape(np.sin(theta), (10,12))
x = np.repeat(np.arange(12)[None,:], 10, axis=0)
# fit the line
m = lambda x: np.mean(x, axis=1)
beta = ( m(y*x) - m(x)*m(y) )/(m(x*x) - m(x)**2)
alpha = m(y) - beta*m(x)
# plot the data and fits
plt.plot([y[:,i] for i in range(12)], ".") # plot the data
plt.gca().set_prop_cycle(None) # reset the color cycle (set_color_cycle was removed from matplotlib)
fits = alpha[:,None] + beta[:,None]*x # make lines from the fits for the plots
plt.plot(fits.T)
plt.show()
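As a sanity check on the closed-form fit, here is a small sketch that applies the same formulas to a synthetic drifting image and compares the result with np.polyfit, which accepts a 2-D y and fits each column independently. The image size, noise level and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
height, width = 6, 50
x = np.arange(width, dtype=float)

# Synthetic image: each row has its own slope and offset plus a little noise.
true_slope = rng.uniform(-1, 1, size=height)
true_offset = rng.uniform(0, 5, size=height)
img = (true_slope[:, None] * x + true_offset[:, None]
       + 0.01 * rng.standard_normal((height, width)))

# Closed-form least squares for every row at once (x is shared by all rows).
xm = x.mean()
ym = img.mean(axis=1)
slope = ((img * x).mean(axis=1) - xm * ym) / ((x * x).mean() - xm**2)
intercept = ym - slope * xm

# Reference: polyfit fits each column of a 2-D y, so pass the transpose.
ref = np.polyfit(x, img.T, 1)   # shape (2, height): slopes, then intercepts

# Remove only the drift, keeping each row's offset.
leveled = img - slope[:, None] * x
```

Here slope and intercept each have shape (height,), which is the (height, 2) parameter set the question asks for, just stored as two arrays.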
You can implement the normal equations and their solution pretty easily. The main challenge is keeping track of the appropriate dimensions so all the vectorized operations work correctly. Here's one method:
import numpy as np
# image size
m = 100
n = 125
# A random image to work with.
np.random.seed(123)
img = np.random.randint(0, 100, size=(m, n))
# X is the design matrix. It is the same for each row. It has shape (n, 2).
X = np.column_stack((np.ones(n), np.arange(n)))
# A is X.T.dot(X), but in this case we can use an explicit formula for each term.
s1 = 0.5*n*(n - 1) # Sum of integers
s2 = n*(n - 0.5)*(n - 1)/3.0 # Sum of squared integers
A = np.array([[n, s1], [s1, s2]])
# Y has shape (2, m). Each column is a vector on the right-hand-side of the
# normal equations.
Y = X.T.dot(img.T)
# Solve the normal equations. beta has shape (2, m). Each column gives the
# coefficients of the linear fit for each row of img.
beta = np.linalg.solve(A, Y)
# Create an array that holds the linear drift for each row.
# X has shape (n, 2) and beta has shape (2, m), so row_drift has shape (m, n),
# the same as img.
row_drift = X.dot(beta).T
# Remove the drift from img.
img2 = img - row_drift
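If keeping track of the normal equations by hand feels error-prone, np.linalg.lstsq can be handed the same design matrix together with all right-hand sides at once. The sketch below is a cross-check under the same setup, not a replacement for the approach above; it builds A as X.T @ X instead of using the explicit sums, and the names beta_normal, beta_lstsq and resid_fit are illustrative.

```python
import numpy as np

rng = np.random.default_rng(123)
m, n = 100, 125
img = rng.integers(0, 100, size=(m, n)).astype(float)

# Design matrix shared by every row: a column of ones and the pixel index.
X = np.column_stack((np.ones(n), np.arange(n)))

# Normal equations, solved for all m rows in one call.
beta_normal = np.linalg.solve(X.T @ X, X.T @ img.T)

# The same fit through lstsq; each column of img.T is one right-hand side.
beta_lstsq, *_ = np.linalg.lstsq(X, img.T, rcond=None)

# After removing the fitted line, refitting should give coefficients near zero.
residual = img - (X @ beta_lstsq).T
resid_fit, *_ = np.linalg.lstsq(X, residual.T, rcond=None)
```

Both coefficient arrays have shape (2, m), with the intercepts in the first row and the slopes in the second.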

Healpy: From Data to Healpix map

I have a data grid where the rows represent theta (0, pi) and the columns represent phi (0, 2*pi) and where f(theta,phi) is the density of dark matter at that location. I wanted to calculate the power spectrum for this and have decided to use healpy.
What I can not understand is how to format my data for healpy to use. If someone could provide code (in python for obvious reasons) or point me to a tutorial, that would be great! I have tried my hand at doing it with the following code:
#grid dimensions are Nrows*Ncols (subject to change)
theta = np.linspace(0, np.pi, num=grid.shape[0])[:, None]
phi = np.linspace(0, 2*np.pi, num=grid.shape[1])
nside = 512
print("Pixel area: %.2f square degrees" % hp.nside2pixarea(nside, degrees=True))
pix = hp.ang2pix(nside, theta, phi)
healpix_map = np.zeros(hp.nside2npix(nside), dtype=np.double)
healpix_map[pix] = grid
But when I try to compute the power spectrum, specifically:
cl = hp.anafast(healpix_map[pix], lmax=1024)
I get this error:
TypeError: bad number of pixels
If anyone could point me to a good tutorial or help edit my code that would be great.
More specifications:
my data is in a 2d np array and I can change the numRows/numCols if I need to.
Edit:
I have solved this problem by first changing the args of anafast to healpix_map.
I also improved the spacing by making my Nrows*Ncols=12*nside*nside.
But, my power spectrum is still giving errors. If anyone has links to good documentation/tutorial on how to calculate the power spectrum (condition of theta/phi args), that would be incredibly helpful.
There you go, hope it's what you're looking for. Feel free to comment with questions :)
import healpy as hp
import numpy as np
import matplotlib.pyplot as plt
# Set the number of sources and the coordinates for the input
nsources = int(1.e4)
nside = 16
npix = hp.nside2npix(nside)
# Coordinates and the density field f
thetas = np.random.random(nsources) * np.pi
phis = np.random.random(nsources) * np.pi * 2.
fs = np.random.randn(nsources)
# Go from HEALPix coordinates to indices
indices = hp.ang2pix(nside, thetas, phis)
# Initiate the map and fill it with the values
hpxmap = np.zeros(npix, dtype=float)
for i in range(nsources):
    hpxmap[indices[i]] += fs[i]
# Inspect the map
hp.mollview(hpxmap)
Since the map above contains nothing but noise, the power spectrum should just contain shot noise, i.e. be flat.
# Get the power spectrum
Cl = hp.anafast(hpxmap)
plt.figure()
plt.plot(Cl)
There is a faster way to do the map initialization using numpy.add.at, following this answer.
This is several times faster on my machine as compared to the first section of Daniel's excellent answer:
import healpy as hp
import numpy as np
import matplotlib.pyplot as plt
# Set the number of sources and the coordinates for the input
nsources = int(1e7)
nside = 64
npix = hp.nside2npix(nside)
# Coordinates and the density field f
thetas = np.random.uniform(0, np.pi, nsources)
phis = np.random.uniform(0, 2*np.pi, nsources)
fs = np.random.randn(nsources)
# Go from HEALPix coordinates to indices
indices = hp.ang2pix(nside, thetas, phis)
# Baseline, from Daniel Lenz's answer:
# time: ~5 s
hpxmap1 = np.zeros(npix, dtype=float)
for i in range(nsources):
    hpxmap1[indices[i]] += fs[i]
# Using numpy.add.at
# time: ~0.6 ms
hpxmap2 = np.zeros(npix, dtype=float)
np.add.at(hpxmap2, indices, fs)
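The reason np.add.at is needed at all is that plain fancy-index assignment does not accumulate when an index repeats. The sketch below shows this with random integers standing in for the output of hp.ang2pix (an assumption so the example runs without healpy), and also compares against np.bincount with weights, which is an alternative not mentioned in the answers above.

```python
import numpy as np

rng = np.random.default_rng(0)
npix = 12          # stand-in for hp.nside2npix(nside)
nsources = 1000
indices = rng.integers(0, npix, size=nsources)   # stand-in for hp.ang2pix(...)
fs = rng.standard_normal(nsources)

# Reference: explicit loop, accumulating every source.
loop_map = np.zeros(npix)
for i in range(nsources):
    loop_map[indices[i]] += fs[i]

# Naive fancy indexing: repeated indices are NOT accumulated.
naive_map = np.zeros(npix)
naive_map[indices] += fs

# Unbuffered accumulation.
addat_map = np.zeros(npix)
np.add.at(addat_map, indices, fs)

# Weighted bin count gives the same sums in a single call.
bincount_map = np.bincount(indices, weights=fs, minlength=npix)
```

The same pitfall applies to healpix_map[pix] = grid in the question: when several grid cells fall into one pixel, only one of them ends up in the map.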

turn scatter data into binned data with errors bars equal to standard deviation

I have a bunch of data scattered x, y. If I want to bin these according to x and put error bars equal to the standard deviation on them, how would I go about doing that?
The only way I know of in Python is to loop over the data in x and group them according to bins of width (max(X)-min(X))/nbins, then loop over those blocks to find the std. I'm sure there are faster ways of doing this with numpy.
I want it to look similar to "vert symmetric" in: http://matplotlib.org/examples/pylab_examples/errorbar_demo.html
You can bin your data with np.histogram. I'm reusing code from this other answer to calculate the mean and standard deviation of the binned y:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.rand(100)
y = np.sin(2*np.pi*x) + 2 * x * (np.random.rand(100)-0.5)
nbins = 10
n, _ = np.histogram(x, bins=nbins)
sy, _ = np.histogram(x, bins=nbins, weights=y)
sy2, _ = np.histogram(x, bins=nbins, weights=y*y)
mean = sy / n
std = np.sqrt(sy2/n - mean*mean)
plt.plot(x, y, 'bo')
plt.errorbar((_[1:] + _[:-1])/2, mean, yerr=std, fmt='r-')
plt.show()
No loop! NumPy lets you avoid explicit looping in most cases.
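To check that the histogram trick really yields the per-bin mean and standard deviation, here is a sketch that compares it with an explicit per-bin computation based on np.digitize. The data, seed and reference loop are assumptions added for illustration; the sketch also assumes no bin is empty, since an empty bin would give a division by zero.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random(500)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(500)

nbins = 10
n, edges = np.histogram(x, bins=nbins)                 # counts per bin
sy, _ = np.histogram(x, bins=nbins, weights=y)         # sum of y per bin
sy2, _ = np.histogram(x, bins=nbins, weights=y * y)    # sum of y^2 per bin
mean = sy / n
std = np.sqrt(sy2 / n - mean * mean)                   # population std per bin

# Reference: assign each point to a bin, then use the usual NumPy reductions.
idx = np.digitize(x, edges[1:-1])                      # values in 0 .. nbins-1
ref_mean = np.array([y[idx == k].mean() for k in range(nbins)])
ref_std = np.array([y[idx == k].std() for k in range(nbins)])

centers = (edges[1:] + edges[:-1]) / 2                 # x positions for errorbar
```

The arrays centers, mean and std are what would be passed to plt.errorbar. Note that the formula sqrt(E[y^2] - E[y]^2) can lose precision when the spread is tiny compared with the mean.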
I am not sure I understand everything: do you have the same x vector for all the data and many y vectors corresponding to different measurements? And you want to plot your data like the "vert symmetric" example, with the mean value of y for each x and the standard deviation for each x as an error bar?
Then it is easy. I assume you have an M-long x vector and an N*M array of your N sets of y data already loaded in variables named x and y.
import numpy as np
import matplotlib.pyplot as pl
# y has shape (N, M): reduce over the N data sets (axis 0) to get one value per x
error = np.std(y,axis=0)
ymean = np.mean(y,axis=0)
pl.errorbar(x,ymean,error)
pl.show()
I hope it helps. Let me know if you have any question or if it is not clear.
