I have EEG data in the form of a 3D numpy array (epoch x channel x timepoint). The timepoint dimension has 256 elements, one per sampled timepoint (1 s total, at 256 Hz). Each epoch is an experimental trial.
I'm trying to import the numpy array into a form MNE-Python (http://martinos.org/mne/stable/mne-python.html) understands, but I'm having some trouble.
First, I'm not sure if I should be importing this raw data as a RawArray or an EpochsArray. I tried the latter with this:
ch_names = [...]  # list containing my 64 EEG channel names
allData = ...     # 3D numpy array as described above
info = mne.create_info(ch_names, 256, ch_types='eeg')
event_id = 1
events = np.array([200, event_id]) # I got this from a tutorial but really unsure what it does and I think this may be the problem
raw = mne.EpochsArray(allData, info, events=events)
picks = mne.pick_types(info, meg=False, eeg=True, misc=False)
raw.plot(picks=picks, show=True, block=True)
When I run this I get an index error: "too many indices for array"
Ultimately I want to do some STFT and CSP analysis on the data, but right now I'm in need of some help with the initial restructuring and importing into MNE.
What's the correct way to import this numpy data so that it is easiest to complete my intended analyses?
Is there any way you can convert the data you acquired from your EEG setup into the .fif format? The 'raw' data format the MNE page talks about in their tutorial is a .fif file. If you can get your EEG data into .fif format, you can pretty much just follow the tutorial step by step...
Functions to convert from various other EEG file formats to .fif: http://martinos.org/mne/stable/manual/convert.html
If that's not an option, here are some thoughts:
EpochsArray() looks to be the correct function, as it expects a data array of shape (n_epochs, n_channels, n_times). Just to be sure, check the shape of your allData array with np.shape(allData) and confirm it matches that layout.
On a related note, the help page for EpochsArray() mentions mne.read_events(); the big question, though, is where your events data might be stored so that you can actually read it...
Based on the tutorial you linked, it seems like the way to get 'events' if you're starting from a .fif file is:
events = mne.find_events(raw, stim_channel='STI 014'). This makes me wonder whether you have more than 64 channels in your numpy array and one of them is in fact a stimulation channel... if that's the case, you could try feeding that stim channel to mne.find_events(). Alternatively, perhaps your stim or events channel is a separate array, or perhaps it hasn't been processed yet?
Hope this is at least somewhat helpful and good luck!
In case someone else is wondering, they added a tutorial to their docs: Creating MNE-Python data structures from scratch. You should be able to find the two needed steps (sketched below):
info structure creation
epochs from array creation
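For what it's worth, here is a minimal sketch of those two steps for data shaped like the question's (64 EEG channels at 256 Hz). Since no real event information is available it assigns the same dummy event code to every epoch; ch_names, allData and event_id refer to the question's variables, and the {'trial': 1} label is just a made-up name:

import numpy as np
import mne

n_epochs = allData.shape[0]          # allData: (n_epochs, 64, 256)

# 1) info structure creation
info = mne.create_info(ch_names=ch_names, sfreq=256, ch_types='eeg')

# 2) epochs-from-array creation
# events must be an (n_events, 3) int array: [sample, previous value, event id]
event_id = 1
events = np.column_stack((np.arange(0, n_epochs * 256, 256),
                          np.zeros(n_epochs, dtype=int),
                          np.full(n_epochs, event_id, dtype=int)))

epochs = mne.EpochsArray(allData, info, events=events, event_id={'trial': event_id})
epochs.plot(picks=mne.pick_types(info, meg=False, eeg=True), block=True)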
I'm new to Python, and to programming in general, so please don't be too hard on me.
I am currently trying to figure out how to write a new wav file using a string (which was derived from another wav file's data).
I performed a Fourier transform on that file's data, so now I'm trying to get the values from the Fourier transform written into a new wav file.
I can only use numpy and the Python standard library, not scipy.
According to the documentation, I have to use the wave module's Wave_write methods, but I have no idea what the code is supposed to look like.
I think I'm supposed to do something pertaining to
wave_write.writeframesraw(data)
Then again, not totally sure of what to do.
Any help is greatly appreciated!
Two functions in NumPy can help you with this: astype and tostring.
If you have an array of sound samples, say X, then you can convert it to the right format using astype. The right format depends on what data type is used in the wav file and on the library you are using to save it, but let's say for this example that you want to store it as 16-bit integers. You'll need to scale X according to the selected data type - in this case the range is -32768 to 32767 for a signed 16-bit int. If your samples go from -1.0 to 1.0, you can simply multiply by 32767.
The next part is simply to convert it to a string using tostring; it could look something like the following:
scaled = X * 32767
scaled.astype('<i2').tostring()
You can find the documentation for the functions here:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.astype.html
https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tostring.html
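In case it helps, here is a rough sketch of how that conversion could feed into the standard-library wave module; the output filename, mono channel count and 44100 Hz rate are assumptions, not values taken from your file:

import wave
import numpy as np

# X is assumed to be a 1-D numpy array of float samples in the range -1.0 to 1.0
scaled = (X * 32767).astype('<i2')      # little-endian signed 16-bit integers

out = wave.open('output.wav', 'wb')
out.setnchannels(1)          # mono - an assumption, use your source file's value
out.setsampwidth(2)          # 2 bytes per sample = 16-bit
out.setframerate(44100)      # sample rate - also an assumption
out.writeframes(scaled.tostring())   # tostring() is an older alias of tobytes()
out.close()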
I have data in the form of a 10000x500 matrix contained in a .txt file. In each row, data points are separated by a single space, and each row ends with a newline.
Normally I was able to read this kind of multidimensional array data into Python by using the following snippet of code:
with open("position.txt") as f:
data = [line.split() for line in f]
# Get the data and convert to floats
ytemp = np.array(data)
y = ytemp.astype(np.float)
This code worked until now. When I try to use the exact same code with another set of data formatted in the same way, I get the following error:
setting an array element with a sequence.
When I try to get the 'shape' of ytemp, it gives me the following:
(10001,)
So it converts the rows into an array, but not the columns into a second dimension.
I thought about what other information to include, but nothing came to mind. Basically I'm trying to convert my data from a .txt file to a multidimensional array in Python. The code worked before, but now, for some reason that is unclear to me, it doesn't. I tried to compare the two data sets; of course they're huge, but everything seems quite similar between the data that works and the data that doesn't.
I would be more than happy to provide any other information you may need. Thanks in advance.
Use numpy's built-in function:
data = numpy.loadtxt('position.txt')
Check out the documentation to explore other available options.
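As a side note, a shape of (10001,) usually means that at least one row has a different number of columns than the rest, so numpy falls back to a 1-D array of Python lists. A quick diagnostic sketch (assuming 500 values per row, as described):

with open("position.txt") as f:
    rows = [line.split() for line in f]

lengths = {len(r) for r in rows}
print(lengths)                                   # expect {500}
bad = [i for i, r in enumerate(rows) if len(r) != 500]
print(bad)                                       # 0-based indices of malformed rows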
I'm working with various climate models, and right now I'm working on regridding the latitudes and longitudes of these files from 2.5x2.5 to 0.5x0.5, and I am completely lost. I've been using the Anaconda distribution for all of my netCDF4 needs and have made good progress; it's just the regridding that baffles me completely. I have three main arrays that I'm using:
The first is the data_array, a numpy array that contains the information for precipitation.
The second is the lat_array, a numpy array containing the latitude information.
The third is the lon_array, a numpy array containing the longitude information.
All this data came from the netCDF4 file.
Again, my data is currently on a 2.5x2.5 grid, meaning lon x lat is 144x72. I use np.meshgrid(lon_array, lat_array) to turn lon and lat into matching 144x72 grids, and my data_array lines up with these, thus matching up perfectly.
This is where I get stuck and I have no idea how to proceed.
My thoughts: I want my 144x72 to convert to 720x360 in order for it to be 0.5x0.5.
I know one way of creating the lon x lat grid that I want is with np.arange(-89.75, 90.25, 0.5) (360 latitudes) and np.arange(-179.75, 180.25, 0.5) (720 longitudes). But I don't know how to get the data_array to match up with that.
Can anyone please offer any assistance? Any help is much appreciated!
Note: I also have ESMF modules available to me.
An easy option would be nctoolkit (https://nctoolkit.readthedocs.io/en/latest/installing.html). This has a built-in method called to_latlon that easily achieves what you want. Just do the following for bilinear interpolation (and see the user guide for other methods):
import nctoolkit as nc
data = nc.open("infile.nc")
data.to_latlon(lon = [-179.75, 179.75], lat = [-89.75, 89.75], res = [0.5, 0.5])
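If you would rather stay with the plain numpy arrays you already have (or can't install nctoolkit), a bilinear-interpolation sketch using SciPy's RegularGridInterpolator could look like the following. This assumes SciPy is available, that lat_array and lon_array are 1-D and strictly increasing, and that data_array is ordered (lat, lon); it also ignores the longitude wrap-around at the dateline:

import numpy as np
from scipy.interpolate import RegularGridInterpolator

# data_array assumed to have shape (72, 144), i.e. (lat, lon)
interp = RegularGridInterpolator((lat_array, lon_array), data_array,
                                 method='linear', bounds_error=False, fill_value=None)

new_lat = np.arange(-89.75, 90.25, 0.5)     # 360 points
new_lon = np.arange(-179.75, 180.25, 0.5)   # 720 points
lat2d, lon2d = np.meshgrid(new_lat, new_lon, indexing='ij')

pts = np.column_stack([lat2d.ravel(), lon2d.ravel()])
regridded = interp(pts).reshape(lat2d.shape)   # shape (360, 720)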
I am coding in Python and trying to use netCDF4 to read in some floating-point netCDF data. My original code looked like:
from netCDF4 import Dataset
import numpy as np
infile='blahblahblah'
ds = Dataset(infile)
start_pt = 5 # or whatever
x = ds.variables['thedata'][start_pt:start_pt+2,:,:,:]
Now, because of various and sundry other things, I have to read 'thedata' one slice at a time:
x = np.zeros([2,I,J,K]) # I,J,K match size of input array
for n in range(2):
    x[n,:,:,:] = ds.variables['thedata'][start_pt+n,:,:,:]
The thing is that the two methods of reading give slightly different results. Nothing big, on the order of one part in 10^5, but still...
So can anyone tell me why this is happening and how I can guarantee the same results from the two methods? My thought was that the first method perhaps automatically establishes x as being the same type as the input data, while the second method establishes x as the default type for a numpy array. However, the input data is 64 bit and I thought the default for a numpy array was also 64 bit. So that doesn't explain it. Any ideas? Thanks.
The first example pulls the data into a NetCDF4 Variable object, while the second example pulls the data into a numpy array. Is it possible that the Variable object is just displaying the data with a different amount of precision?
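One way to test that idea (just a suggestion, using the variable names from the question) is to compare the dtypes and, if they differ, allocate x with the variable's own dtype:

var = ds.variables['thedata']
print(var.dtype, x.dtype)    # compare the file's dtype with what numpy allocated

# allocate the buffer with the variable's own dtype before slicing into it
x = np.zeros([2, I, J, K], dtype=var.dtype)
for n in range(2):
    x[n, :, :, :] = var[start_pt + n, :, :, :]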
Context:
I am discovering the vast field of DSP. Yes I'm a beginner.
My goal:
Apply an FFT to an audio array given by audiolab, to get the different frequencies of the signal.
Question:
I just cannot figure out what to do with a numpy array which contains the audio data obtained thanks to audiolab:
import numpy as np
from scikits.audiolab import Sndfile
f = Sndfile('first.ogg', 'r')
# Sndfile instances can be queried for the audio file meta-data
fs = f.samplerate
nc = f.channels
enc = f.encoding
print(fs,nc,enc)
# Reading is straightforward
data = f.read_frames(10)
print(data)
print(np.fft.fft(data))
Now I have my data.
Readings
I read these two nice articles:
Analyze audio using Fast Fourier Transform (the accepted answer is wonderful)
and
http://www.onlamp.com/pub/a/python/2001/01/31/numerically.html?page=2
Now there are two techniques: apparently one suggests squaring (first link), whereas the other suggests a log scale, specifically: 10*log10(abs(1.10**-20 + value))
Which one is best?
SUM UP:
I would like to get the Fourier analysis of my array, but either of those two answers seems only to emphasize the signal rather than isolate its components.
I may be wrong; I am still a noob.
What should I really do then ?
Thanks,
UPDATE:
I asked this question:
DSP - get the amplitude of all the frequencies, which is related to this one.
Your question seems pretty confused, but you've obviously tried something, which is great. Let me take a step back and suggest an overall route for you:
Start by breaking your audio into chunks of some size, say N.
Perform the FFT on each chunk of N samples.
THEN worry about displaying the data as RMS (the square approach) or dB (the log-based approach).
Really, you can think of those values as scaling factors for display.
If you need help with the FFT itself, my blog post on pitch detection with the FFT may help: http://blog.bjornroche.com/2012/07/frequency-detection-using-fft-aka-pitch.html
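To make those steps a bit more concrete, here is a rough sketch; the chunk size and the Hann window are my own choices rather than anything prescribed above, and data/fs are assumed to be a 1-D (mono) sample array and the sample rate from the question's audiolab code:

import numpy as np

N = 1024                      # chunk size (step 1) - my own choice
window = np.hanning(N)        # taper each chunk to reduce spectral leakage

for start in range(0, len(data) - N + 1, N):
    chunk = data[start:start + N]
    spectrum = np.fft.rfft(chunk * window)        # step 2: FFT of one chunk
    magnitude = np.abs(spectrum)
    db = 20 * np.log10(magnitude + 1e-20)         # step 3: dB scaling for display
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)        # fs = f.samplerate from your code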
Adding to the answer given by @Bjorn Roche.
Here is some simple code for plotting a frequency spectrum on a dB scale.
It uses matplotlib (via pylab) for plotting.
import numpy as np
import pylab
# for a real signal
def plotfftspectrum(signal, dt):        # dt is the sampling interval (1/sample rate)
    n = signal.size
    spectrum = np.abs(np.fft.fft(signal))
    spectrum = 20*np.log10(spectrum/spectrum.max())   # dB scale
    frequencies = np.fft.fftfreq(n, dt)
    pylab.plot(frequencies[:n//2], spectrum[:n//2])
    # plot only the first n//2 points, due to the symmetry of a real signal's FFT
    pylab.show()
You can use it after reading at least some samples of your data, e.g. 1024:
data = f.read_frames(1024)
plotfftspectrum(data, 1./f.samplerate)
Here 1./f.samplerate is the sampling interval, since f.samplerate is the sample rate in Hz.