I have a MATLAB script that computes the DFT of a signal and plots it:
(data can be found here)
clc; clear; close all;
fid = fopen('s.txt');
txt = textscan(fid,'%f');
s = cell2mat(txt);
nFFT = 100;
fs = 24000;
deltaF = fs/nFFT;
FFFT = [0:nFFT/2-1]*deltaF;
win = hann(length(s));
sw = s.*win;
FFT = fft(sw, nFFT)/length(s);
FFT = [FFT(1); 2*FFT(2:nFFT/2)];
absFFT = 20*log10(abs(FFT));
plot(FFFT, absFFT)
grid on
I am trying to translate it to Python and can't get the same result.
import numpy as np
from matplotlib import pyplot as plt
x = np.genfromtxt("s.txt", delimiter=' ')
nfft = 100
fs = 24000
deltaF = fs/nfft;
ffft = [n * deltaF for n in range(nfft/2-1)]
ffft = np.array(ffft)
window = np.hanning(len(x))
xw = np.multiply(x, window)
fft = np.fft.fft(xw, nfft)/len(x)
fft = fft[0]+ [2*fft[1:nfft/2]]
fftabs = 20*np.log10(np.absolute(fft))
plt.figure()
plt.plot(ffft, np.transpose(fftabs))
plt.grid()
The plots I get (MATLAB on the left, Python on the right):
What am I doing wrong?
The two pieces of code are different: in the MATLAB code you concatenate two vectors,
FFT = [FFT(1); 2*FFT(2:nFFT/2)];
while in the Python code you add the first value of fft to the rest of the vector:
fft = fft[0]+ [2*fft[1:nfft/2]]
'+' does not concatenate here, because you have numpy arrays.
In Python, it should be (note the integer division // so the slice indices stay integers):
fft = fft[0:nfft//2]
fft[1:nfft//2] = 2*fft[1:nfft//2]
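Equivalently, a one-liner that mirrors the MATLAB concatenation directly, sketched with np.concatenate:
fft = np.concatenate(([fft[0]], 2*fft[1:nfft//2]))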
I am not a MATLAB user, so I am not sure, but there are a few things I'd ask to see if I can help you.
You called np.array after the array (ffft) had already been made. That probably will not change the nature of the array the way you hoped; it may be better to build it inside np.array directly, as in the sketch below. The other thing is that the range doesn't seem right to me: MATLAB's [0:nFFT/2-1] produces 50 values (0 through 49), while range(nfft/2-1) produces only 49.
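For instance, the frequency axis could be built in one go; a minimal sketch (note the integer division, which Python 3 needs):
ffft = np.array([n * deltaF for n in range(nfft//2)])
# or, without the list comprehension:
ffft = deltaF * np.arange(nfft//2)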
Another one is fft = fft[0]+ [2*fft[1:nfft/2]] compared to FFT = [FFT(1); 2*FFT(2:nFFT/2)];. I am not sure the comparison is accurate; it just seems to be a different type of definition to me.
Also, when I do these kinds of calculations, I print out the intermediate steps so I can compare the numbers and see where it breaks.
Hope this helps.
I found out that using np.fft.rfft instead of np.fft.fft and modifying the code as follows does the job:
import numpy as np
from matplotlib import pyplot as pl
x = np.genfromtxt("../Matlab/s.txt", delimiter=' ')
nfft = 100
fs = 24000
deltaF = fs/nfft
ffft = np.array([n * deltaF for n in range(nfft//2+1)])
window = np.hanning(len(x))
xw = np.multiply(x, window)
fft = np.fft.rfft(xw, nfft)/len(x)
fftabs = 20*np.log10(np.absolute(fft))
pl.figure()
pl.plot(ffft, fftabs)
pl.grid()
The resulting plot (the correct result with Python):
I can see that the first and last points, as well as the amplitudes, are not the same. It isn't a problem for me (I am more interested in the general shape), but if someone can explain, I'd be happy.
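A likely explanation for the amplitude difference: the MATLAB code doubles every bin except DC (FFT = [FFT(1); 2*FFT(2:nFFT/2)];) to fold the negative-frequency energy into the positive bins, while the rfft version above skips that step and also keeps the Nyquist bin. A sketch of applying the same scaling to the rfft output:
fft = np.fft.rfft(xw, nfft)/len(x)
fft[1:-1] *= 2  # double everything except the DC and Nyquist bins
fftabs = 20*np.log10(np.absolute(fft))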
Create a function that finds the largest derivative in the derivative
list. Feel free to compare with the numpy function max. Let the
program print out what volume this corresponds to. This is the volume
of strong base added at the equivalence point. Also find the pH at the
equivalence point using your program.
I was able to do the first part of the question by making a function to find the max, and I got the correct answer from that, but I'm stuck on how to use that information to find the pH at the equivalence point.
My code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
fil = pd.read_csv('https://raw.githubusercontent.com/andreasdh/programmering-i-kjemi/master/docs/datafiler/titreringsdata.txt', delimiter = ",")
volum = fil['volum']
pH = fil['pH']
print(pH, volum)
plt.plot(volum, pH, color = "#B00B69", label = "Tilpasset modell")
plt.scatter(volum, pH, color = "hotpink", label = "Datapunkter")
plt.xlabel("volum")
plt.ylabel("pH")
plt.grid()
plt.show()
d = []
for i in range(len(volum)-1):
    dery = pH[i+1] - pH[i]
    dert = volum[i+1] - volum[i]
    dydt = dery/dert
    d.append(dydt)
print(d)

def fmax(list):
    max = list[0]
    for x in list:
        if x > max:
            max = x
    return max

print('the biggest element in the derivative is', fmax(d))
I believe that at some point I will have to use matplotlib.pyplot to make a graph and scatter the data, but I still can't understand what I'm supposed to do.
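One way to finish, as a minimal sketch building on the variables above (taking the midpoint of the two samples around the jump is an assumption; using the nearest data point would also be defensible):
# index of the largest derivative: that is where the pH jump happens
i = d.index(fmax(d))          # or: i = int(np.argmax(d))
# d[i] is the slope between points i and i+1, so take the midpoint
v_eq = (volum[i] + volum[i+1]) / 2
pH_eq = (pH[i] + pH[i+1]) / 2
print('equivalence point at volume', v_eq, 'with pH', pH_eq)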
I have a program in MATLAB which I want to port to Python. The problem is that in it I use the built-in spectrogram function and, although the matplotlib specgram function seems identical, I'm getting different results when I run both.
This is the code I've been running.
MATLAB:
data = 1:999; %Dummy data. Just for testing.
Fs = 8000; % All the songs we'll be working on will be sampled at an 8KHz rate
tWindow = 64e-3; % The window must be long enough to get 64ms of the signal
NWindow = Fs*tWindow; % Number of elements the window must have
window = hamming(NWindow); % Window used in the spectrogram
NFFT = 512;
NOverlap = NWindow/2; % We want a 50% overlap
[S, F, T] = spectrogram(data, window, NOverlap, NFFT, Fs);
Python:
import numpy as np
from matplotlib import mlab
data = range(1,1000) #Dummy data. Just for testing
Fs = 8000
tWindow = 64e-3
NWindow = Fs*tWindow
window = np.hamming(NWindow)
NFFT = 512
NOverlap = NWindow/2
[s, f, t] = mlab.specgram(data, NFFT = NFFT, Fs = Fs, window = window, noverlap = NOverlap)
And this is the result I get in both executions:
http://i.imgur.com/QSPvYsC.png
(The F and T variables are exactly the same in both programs)
It's obvious that they're different; in fact, the Python execution doesn't even return complex numbers. What could be the problem? Is there any way to fix it, or should I use another spectrogram function?
Thank you so much in advance for your help.
In matplotlib, specgram returns the power spectral density by default (mode='psd'). In MATLAB, spectrogram by default returns the short-time Fourier transform, unless nargout==4, in which case it also computes the PSD. To get the matplotlib behaviour to match the MATLAB behaviour, set mode='complex'.
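A minimal sketch, reusing the question's setup (the only change is mlab.specgram's mode parameter):
import numpy as np
from matplotlib import mlab
data = np.arange(1, 1000)
Fs = 8000
NFFT = 512
window = np.hamming(512)
NOverlap = 256
# mode='complex' returns the complex STFT values, like MATLAB's spectrogram
s, f, t = mlab.specgram(data, NFFT=NFFT, Fs=Fs, window=window,
                        noverlap=NOverlap, mode='complex')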
I have a signal in the frequency domain. I took numpy.fft.ifft of the signal and got the time-domain signal back. But when I take the fft of that time signal again, I am not getting the positive and negative frequencies properly (Plot 3 in the figure).
time = np.arange(0, 10, .01)
N = len(time)
signal_td = np.cos(2.0*np.pi*2.0*time)
signal_fd = np.fft.fft(signal_td)
signal_fd2 = signal_fd[0:N//2]
inv_td2 = np.fft.ifft(signal_fd2)
fd2 = np.fft.fft(inv_td2)
General comment: I avoid using time as a variable name because IPython loads it as a "magic" command.
Something I find at times confusing about matplotlib is that when you plot a complex array, it actually plots the real part. In the code snippet:
tt = np.arange(0, 10, .01)
N = len(tt)
signal_td = np.cos(2.0*np.pi*2.0*tt)
signal_fd = np.fft.fft(signal_td)
signal_fd2 = signal_fd[0:N//2]
inv_td2 = np.fft.ifft(signal_fd2)
fd2 = np.fft.fft(inv_td2)
The following arrays have a dtype of float64: tt and signal_td. The others are complex128. The reason you only see one peak in fd2 is that it is a transform of exp(4j*np.pi*tt) rather than cos(4*np.pi*tt): by keeping only the positive-frequency half of the spectrum, you turned the real cosine into a complex exponential.
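If the goal is to get a real cosine back from only the non-negative frequencies, a minimal sketch using np.fft.irfft, which reconstructs the negative frequencies by conjugate symmetry:
import numpy as np
tt = np.arange(0, 10, .01)
N = len(tt)
signal_td = np.cos(2.0*np.pi*2.0*tt)
signal_fd = np.fft.fft(signal_td)
# keep the N//2 + 1 non-negative frequency bins and let irfft
# fill in the negative frequencies by conjugate symmetry
inv_td = np.fft.irfft(signal_fd[:N//2 + 1], n=N)
fd = np.fft.fft(inv_td)  # now shows both the positive and negative peak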
I'm really new to Python programming, and I was just wondering if you can create a regular grid with a resolution of 0.5 by 0.5 m using LiDAR points.
My data are in LAS format (read with from liblas import file as lasfile) and have the following format: X, Y, Z, where X and Y are coordinates.
The points are randomly positioned, so some pixels are empty (NaN value) and some pixels contain more than one point. Where there is more than one point, I wish to obtain the mean value. In the end I need to save the data in TIF or ASCII format.
I am studying the osgeo module and GDAL, but to be honest I don't know if the osgeo module is the best solution.
I would be really glad for some code that I can study and implement. Thanks in advance for the help, I really need it.
I don't know the best way to get a grid with these parameters.
It's a bit late but maybe this answer will be useful for others, if not for you...
I have done this with Numpy and Pandas, and it's pretty fast. I was using TLS data and could do this with several million data points without any trouble on a decent 2009-vintage laptop. The key is 'binning' by rounding the data, and then using Pandas' GroupBy methods to do the aggregating and calculate the means.
If you need to round to a power of 10 you can use np.round, otherwise you can round to an arbitrary value by making a function to do so, which I have done by modifying this SO answer.
import numpy as np
import pandas as pd
# make rounding function:
def round_to_val(a, round_val):
return np.round( np.array(a, dtype=float) / round_val) * round_val
# load data
data = np.load('your_data.npy')  # hypothetical filename; an array of shape (n_data, 3)
n_d = data.shape[0]
# round the data
d_round = np.empty( [n_d, 5] )
d_round[:,0] = data[:,0]
d_round[:,1] = data[:,1]
d_round[:,2] = data[:,2]
del data # free up some RAM
d_round[:,3] = round_to_val( d_round[:,0], 0.5)
d_round[:,4] = round_to_val( d_round[:,1], 0.5)
# sorting data
ind = np.lexsort( (d_round[:,4], d_round[:,3]) )
d_sort = d_round[ind]
# making dataframes and grouping stuff
df_cols = ['x', 'y', 'z', 'x_round', 'y_round']
df = pd.DataFrame( d_sort)
df.columns = df_cols
df_round = df[['x_round', 'y_round', 'z']]
group_xy = df_round.groupby(['x_round', 'y_round'])
# calculating the mean, write to csv, which saves the file with:
# [x_round, y_round, z_mean] columns. You can exit Python and then start up
# later to clear memory if that's an issue.
group_mean = group_xy.mean()
group_mean.to_csv('your_binned_data.csv')
# Restarting...
import numpy as np
from scipy.interpolate import griddata
binned_data = np.loadtxt('your_binned_data.csv', skiprows=1, delimiter=',')
x_bins = binned_data[:,0]
y_bins = binned_data[:,1]
z_vals = binned_data[:,2]
pts = np.array( [x_bins, y_bins])
pts = pts.T
# make grid (with borders rounded to 0.5...)
xmax, xmin = 640000.5, 637000
ymax, ymin = 6070000.5, 6067000
# note: mgrid needs start < stop with a positive step, so go from min to max
grid_x, grid_y = np.mgrid[xmin:xmax:0.5, ymin:ymax:0.5]
# interpolate onto grid
data_grid = griddata(pts, z_vals, (grid_x, grid_y), method='cubic')
# save to ascii
np.savetxt('data_grid.txt', data_grid)
When I've done this, I have saved the output as a .npy and converted to a tiff with the Image library, and then georeferenced in ArcMap. There is probably a way to do that with osgeo but I haven't used it.
Hope this helps someone at least...
You can use the histogram function in NumPy to do the binning; for instance:
import numpy as np
points = np.random.random(1000)   # point positions
values = np.random.random(1000)   # the values to average per bin
# 10 edges from 0 to 1, i.e. 9 bins
bins = np.linspace(0, 1, 10)
means = (np.histogram(points, bins, weights=values)[0] /
         np.histogram(points, bins)[0])
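For the 2-D case in the question, the same trick works with np.histogram2d; a sketch with made-up x, y, z arrays and 0.5 m cells:
import numpy as np
# hypothetical point cloud (replace with the LAS x, y, z columns)
x = np.random.uniform(0, 100, 10000)
y = np.random.uniform(0, 100, 10000)
z = np.random.uniform(0, 1, 10000)
# 0.5 m bin edges in each direction
xedges = np.arange(x.min(), x.max() + 0.5, 0.5)
yedges = np.arange(y.min(), y.max() + 0.5, 0.5)
zsum, _, _ = np.histogram2d(x, y, bins=[xedges, yedges], weights=z)
counts, _, _ = np.histogram2d(x, y, bins=[xedges, yedges])
with np.errstate(invalid='ignore'):
    zmean = zsum / counts  # NaN wherever a cell holds no points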
Try LAStools, particularly lasgrid or las2dem.
In my application, the data is sampled on a distorted grid, and I would like to resample it to an undistorted grid. To test this, I wrote this program with example distortions and a simple function as data:
from __future__ import division
import numpy as np
import scipy.interpolate as intp
import pylab as plt
# Defining some variables:
quadratic = -3/128
linear = 1/16
pn = np.poly1d([quadratic, linear,0])
pixels_x = 50
pixels_y = 30
frame = np.zeros((pixels_x,pixels_y))
x_width = np.concatenate((np.linspace(8, 7.8, 15), np.linspace(7.8, 8, pixels_y-15)))  # the split point (15 here) must not exceed pixels_y
def data(x, y):
    z = y*(np.exp(-(x-5)**2/3) + np.exp(-(x)**2/5) + np.exp(-(x+5)**2))
    return z
# Generating grid coordinates
yt = np.arange(380,380+pixels_y*4,4)
xt = np.linspace(-7.8,7.8,pixels_x)
X, Y = np.meshgrid(xt,yt)
Y=Y.T
X=X.T
Y_m = np.zeros((pixels_x,pixels_y))
X_m = np.zeros((pixels_x,pixels_y))
# generating distorted grid coordinates:
for i in range(pixels_y):
    Y_m[:,i] = Y[:,i] - pn(xt)
    X_m[:,i] = np.linspace(-x_width[i], x_width[i], pixels_x)
# Sample data:
for i in range(pixels_y):
    for j in range(pixels_x):
        frame[j,i] = data(X_m[j,i], Y_m[j,i])
Y_m = Y_m.flatten()
X_m = X_m.flatten()
frame = frame.flatten()
##
Y = Y.flatten()
X = X.flatten()
ipf = intp.interp2d(X_m,Y_m,frame)
interpolated_frame = ipf(xt,yt)
At this point, I have two questions:
The code works, but I get the following warning:
Warning: No more knots can be added because the number of B-spline coefficients
already exceeds the number of data points m. Probably causes: either
s or m too small. (fp>s)
kx,ky=1,1 nx,ny=54,31 m=1500 fp=0.000006 s=0.000000
Also, some interpolation artifacts appear, and I assume they are related to the warning. Do you know what I am doing wrong?
For my actual application, the frames need to be around 500*100, but when I try that I get a MemoryError. Is there something I can do about it, apart from splitting the frame into several parts?
Thanks!
This problem is most likely related to the usage of bisplrep and bisplev within interp2d. The docs mention that interp2d uses a smoothing factor of s=0.0, and that bisplrep and bisplev should be used directly if more control over s is needed. The related docs mention that s should be chosen between (m-sqrt(2*m), m+sqrt(2*m)), where m is the number of points used to construct the splines. I had a similar problem and found that it was solved by using bisplrep and bisplev directly, where s is only optional.
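A minimal sketch of that approach, reusing the names from the question (the value of s is an assumption, simply taken from the recommended range in the docs):
import numpy as np
import scipy.interpolate as intp
# X_m, Y_m, frame are the flattened arrays from the question
m = len(frame)
s = m + np.sqrt(2*m)  # smoothing factor inside the recommended range
tck = intp.bisplrep(X_m, Y_m, frame, kx=1, ky=1, s=s)
interpolated_frame = intp.bisplev(xt, yt, tck)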
For 2-D interpolation, griddata is solid, local, and fast. Take a look at problem-with-2d-interpolation-in-scipy-non-rectangular-grid on SO.
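A sketch of what that would look like with the arrays from the question (X_m, Y_m and frame are the flattened distorted coordinates and samples; X and Y are the flattened target grid):
from scipy.interpolate import griddata
interpolated_frame = griddata((X_m, Y_m), frame, (X, Y), method='linear')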
You might want to look at the following interp method in basemap:
mpl_toolkits.basemap.interp
http://matplotlib.sourceforge.net/basemap/doc/html/api/basemap_api.html
unless you really need spline-based interpolation.