Syntax Error when including a module that imports another module - python

I am trying to create smaller files that can each be run standalone in order to test them. One of my files reads data from a CSV, and its test is to plot the data. I am now creating a module that computes a moving average of the data; its test will be to read in the data (using the other module), run itself on the data, and plot the result. Each file is stored in its own directory (the read-file module is stored in _ReadBand).
The issue seems to be with the %matplotlib inline command that I am using (I am working in IPython), which appears only in the test area.
Here is my read-CSV file:
##
# Read sensor band data
# Creates Timestamps from RTC ticks
##
import pandas as pd
from ticksConversion import ticks_to_time #converts difference in rtc to float of seconds
def read_sensor(folder_path, file_path):
    ##This function takes a csv of raw data from the sensor
    ##and returns the individual signals as well as a timestamp made from the real time clock
    #Create full path
    data_path = folder_path + file_path
    #Read CSV
    sensor_data = pd.read_csv(data_path, header=0, names=['time', 'band_rtc', 'ps1', 'ps2', 'ps3'])
    #Extract sensor signals and rtc into lists
    sensorData = [list(sensor_data.ps1), list(sensor_data.ps2), list(sensor_data.ps3)]
    #Create Timestamps based on RTC
    sensorTimestamp = [0] #first timestamp value
    secondsElapsed = 0 #running total of seconds for timestamps
    for indx, val in enumerate(sensor_data.band_rtc):
        if( indx == 0):
            continue #If first rtc value simply continue, this data already has timestamp zero
        secondsElapsed += ticks_to_time(sensor_data.band_rtc[indx-1], sensor_data.band_rtc[indx]) #convert rtc elapsed to seconds and add to total
        sensorTimestamp.append(secondsElapsed) #add timestamp for this data point
    return sensorTimestamp, sensorData
#Test code
if __name__ == "__main__":
    #matplotlib - 2D plotting library
    import matplotlib.pyplot as plt
    import matplotlib as mpl
    %matplotlib inline
    #Test Data Path
    folder_path = './'
    file_path = 'testRead.csv'
    #Test Function
    sensorTimestamp, sensorData = read_sensor(folder_path, file_path)
    #Plot Data
    for indx, data in enumerate(sensorData):
        plt.figure(figsize=(20,3))
        plt.plot(sensorTimestamp, sensorData[indx], label='data from LED %i' %indx)
        plt.title('Raw Reading from Band')
        plt.legend(loc='best')
        plt.show()
This is my moving average file:
##
# Moving average Filter
##
import pandas as pd
from pandas import Series
def loop_rolling_mean(numRolls, dataset, window):
    ## Moving average with a window of the specified size moved over the data a specified number of times
    ## Input:
    # dataset - data to average
    # numRolls - number of times to do a moving average
    # window - window size for average, must be odd for unshifted data
    ##Output
    # rolledMean - averaged data
    rolledMean = Series(dataset) # Copy data over and cast to pandas series
    for x in range(0,numRolls): #iterate as many times as you want to run the window
        rolledMean = pd.rolling_mean(rolledMean, window, min_periods=1, center=True)
    return rolledMean ## the dataset that had the rolling mean going forward x number of times
def loop_rolling_mean_with_reverse(numRolls, dataset, window):
    ##roll over data set forward and backward to get rid of offset
    ## Input:
    # dataset - data to average
    # numRolls - number of rolls (forward and backward), must be even for unshifted data
    # window - window size for average
    ##Output
    # rolledMean - averaged data
    #Error Checking
    if(numRolls%2 != 0):
        return "Number of rolls must be even for un-shifted data"
    ## Now going to do the alternating rolling
    rolledMean = Series(dataset) # Copy data over and cast to pandas series
    for x in range(0, int(numRolls/2)):
        forwardRoll = pd.rolling_mean(rolledMean, window, min_periods=1) #roll data in forward direction
        reversdData = forwardRoll.reindex(index=forwardRoll.index[::-1]) #reverse data
        reverseRoll = pd.rolling_mean(reversdData, window, min_periods=1) #roll over reversed data
        rolledMean = reverseRoll.reindex(index=reverseRoll.index[::-1]) #reverse data again so it's in correct order
    return rolledMean
#Test code
if __name__ == "__main__":
    # import readBand
    import sys
    sys.path.insert(0, '../_ReadBand')
    import readBand
    # matplotlib - 2D plotting library
    import matplotlib.pyplot as plt
    import matplotlib as mpl
    %matplotlib inline
    #Set number of rolls and Window
    numRolls = 2
    windowSize = 3
    rolledSensorData = []
    for indx, val in enumerate(sensorData):
        rolledSensorData.append(loop_rolling_mean(numRolls, sensorData[indx], windowSize))
    #Plot Data
    for indx, val in enumerate(rolledSensorData):
        plt.figure(figsize=(20,3))
        plt.plot(sens_data.timestamp, sensorData[indx], label='raw data')
        plt.plot(sensor_data.timestamp, rolledSensorData[indx], label='Moving Avg forward')
        plt.title('LED %i' %indx)
        #plt.axis([8,22, 7000, 8600])
        plt.legend(loc='best')
        plt.show()
    for indx, val in enumerate(rolledSensorData):
        print len(rolledSensorData[indx])
And this is the error I receive. As you can see, it refers to the test area in readBand.py:
File "../_ReadBand\readBand.py", line 43
%matplotlib inline
^
SyntaxError: invalid syntax
I don't even think this part of the code should be running, since the file is being imported and is not the main script.
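A likely explanation, sketched under assumptions: Python compiles an entire file before executing any of it, so a line that is not valid Python raises SyntaxError on import even when it sits inside an `if __name__ == "__main__":` block that would never run. `%matplotlib inline` is IPython syntax, not Python. The snippet below reproduces this with the built-in `compile()` on a made-up source string, then shows one possible replacement that only uses ordinary Python. The helper name `enable_inline_plots` is hypothetical; `get_ipython()` and `run_line_magic()` are part of IPython.

```python
# A tiny stand-in for readBand.py: the magic sits in a branch that is
# never executed on import, yet the file still fails to compile.
source = '''
def read_sensor():
    return 42

if __name__ == "__main__":
    %matplotlib inline
'''

try:
    compile(source, "readBand.py", "exec")
    compiled = True
except SyntaxError:
    compiled = False  # raised at compile time, before any line runs


def enable_inline_plots():
    """Turn on inline plotting when running under IPython; otherwise do nothing.

    Hypothetical helper: returns True if the magic was applied.
    """
    try:
        from IPython import get_ipython
    except ImportError:
        return False          # plain Python without IPython installed
    shell = get_ipython()
    if shell is None:
        return False          # IPython installed, but not an IPython session
    shell.run_line_magic("matplotlib", "inline")
    return True
```

Calling `enable_inline_plots()` inside the test area in place of the `%matplotlib inline` line keeps the file importable from both plain Python and IPython.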

Related

How do I find, plot, and output the peaks of a live plotted Fast Fourier Transform (FFT) in Python?

I am working with the pyaudio and matplotlib packages for the first time and I am attempting to plot live audio data from microphone input, transform it to frequency domain information, and then output peaks with an input distance. This project is a modification of the three-part guide to build a spectrum analyzer found here.
Currently the code is formatted in a class as I have alternative methods that I am applying to the audio but I am only posting the class with the relevant methods as they don't make reference to each and are self-contained. Another quirk of the program is that it calls upon a local file though it only uses input from the user microphone; this is a leftover from the original functionality of plotting a sound file's intensity while it played and is no longer integral to the code.
import pyaudio
import wave
import struct
import pandas as pd
from scipy.fftpack import fft
from scipy.signal import find_peaks
import matplotlib.pyplot as plt
import numpy as np
class Wave:
    def __init__(self, file) -> None:
        self.CHUNK = 1024 * 4
        self.obj = wave.open(file, "r")
        self.callback_output = []
        self.data = self.obj.readframes(self.CHUNK)
        self.rate = 44100
        # Initiate an instance of PyAudio
        self.p = pyaudio.PyAudio()
        # Open a stream with the file specifications
        self.stream = self.p.open(format = pyaudio.paInt16,
                                  channels = self.obj.getnchannels(),
                                  rate = self.rate,
                                  output = True,
                                  input = True,
                                  frames_per_buffer = self.CHUNK)
    def fft_plot(self, distance: float):
        x_fft = np.linspace(0, self.rate, self.CHUNK)
        fig, ax = plt.subplots()
        line_fft, = ax.semilogx(x_fft, np.random.rand(self.CHUNK), "-", lw = 2)
        # Bind plot window sizes
        ax.set_xlim(20, self.rate / 2)
        plot_data = self.stream.read(self.CHUNK)
        self.data_int = pd.DataFrame(struct.unpack(\
            str(self.CHUNK * 2) + 'h', plot_data)).astype(dtype = "b")[::2]
        y_fft = fft(self.data_int)
        line_fft.set_ydata(np.abs(y_fft[0:self.CHUNK]) / (256 * self.CHUNK))
        plt.show(block = False)
        while True:
            # Read incoming audio data
            data = self.stream.read(self.CHUNK)
            # Convert data to bits then to array
            self.data_int = struct.unpack(str(4 * self.CHUNK) + 'B', data)
            # Recompute FFT and update line
            yf = fft(self.data_int)
            line_data = np.abs(yf[0:self.CHUNK]) / (128 * self.CHUNK)
            line_fft.set_ydata(line_data)
            # Find all values above threshold
            peaks, _ = find_peaks(line_data, distance = distance)
            # Update the plot
            plt.plot(peaks, line_data[peaks], "x")
            fig.canvas.draw()
            fig.canvas.flush_events()
            # Exit program when plot window is closed
            fig.canvas.mpl_connect('close_event', exit)
test_file = "C:/Users/Tam/Documents/VScode/Final Project/PrismGuitars.wav"
audio_test = Wave(test_file)
audio_test.fft_plot(2000)
The code does not throw any errors and runs fine with an okay framerate and only terminates when the plot window is closed, all of which is good. The issue I'm encountering is with the determination and plotting of the peaks of line_data as when I run this code the output over time looks like this matplotlib graph instance.
It seems that the peaks (or peak) are being found but at a lower frequency than the x of line_data and as such are shifted comparatively. The other, more minor, issue is that since this is a live plot I would like to clear the previous instance of the peak marker so that it only shows the current instance and not all of the ones plotted prior.
I have attempted in prior fixes to use the line_fft in the peak detection but as it is cast to a Line2D format the peak detection algorithm isn't able to deal with the data type. I have also tried implementing a list comprehension as seen in this post but the time to cast to list is prohibitively slow and did not return any peak markers when I ran it.
EDIT: Following Jody's input the program now returns the proper values as I was only printing an index for the x-coordinate of the peak marker. Nevertheless I would still appreciate some insight as to whether it is possible to update per marker rather than having all the previous ones constantly displayed.
As for the marker updating I have attempted to clear the plot in the while loop both before and after drawing the markers (in different tests of course) but I only ever end up with a completely blank graph.
Please let me know if there is anything I should clarify and thank you for your time.
As Jody pointed out the peaks variable contains indexes for the detected peaks that then need to be retrieved from x_fft and line_data in order to match up with the displayed data.
First we create a scatter plot:
scat = ax.scatter([], [], c = "purple", marker = "x")
This data can then be stacked using a container variable in the while loop as such:
array_peaks = np.c_[x_fft[peaks], line_data[peaks]]
and update the data in the while loop with:
scat.set_offsets(array_peaks)
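The index-versus-coordinate point above can be illustrated without audio hardware. The following is a minimal sketch on made-up data: `find_local_maxima` is a hypothetical pure-Python stand-in for `scipy.signal.find_peaks`, and the sample rate and spectrum values are invented for the example.

```python
def find_local_maxima(values):
    """Return the indices where a value is strictly greater than both neighbours."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


rate = 8000      # assumed sample rate, for illustration only
n_bins = 8
x_fft = [k * rate / n_bins for k in range(n_bins)]       # frequency axis in Hz
line_data = [0.0, 0.2, 0.9, 0.3, 0.1, 0.6, 0.2, 0.0]     # made-up magnitudes

peaks = find_local_maxima(line_data)    # positions in the array, not frequencies

# Look the indices up in BOTH arrays so the markers line up with the curve
peak_points = [(x_fft[i], line_data[i]) for i in peaks]
print(peaks)         # [2, 5]
print(peak_points)   # [(2000.0, 0.9), (5000.0, 0.6)]
```

Plotting `peaks` directly on the x axis would place the markers at 2 and 5 rather than at 2000 Hz and 5000 Hz, which matches the shift described in the question. Because `scat.set_offsets` replaces the scatter's points on each call, only the current markers remain visible.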

Making a dot animation in python

The problem is straightforward but I am beginner in python and stuck on the optimal way to implement this:
I have a .txt file that contains time (s) and frequency.
I want a visualization of this data as a dot moving up and down a vertical axis.
I need the dot to move at the corresponding time stamp of the frequency, since I plan to output an mp4 file and sync up the animation to the original sound file.
The max and min of the y axis would be those of the frequency in the file.
By "moving" I think making the dot appear and disappear before the next one should work, so it's not continuous.
Below is what I have so far:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation
from matplotlib.colors import LinearSegmentedColormap
# Import Data
time = np.loadtxt("sentence 1.txt", usecols=0, skiprows=1, dtype=float)
print(time)
print(time.shape)
f0 = np.loadtxt("sentence 1.txt", usecols=1, skiprows=1, dtype=float)
print(f0)
print(f0.shape)
# These are the indices of onset times of syllables
# time[0], time[10], time[22], time[34], time[85], time[100]
onset = [0,10,22,34,85,100]
# Do f0 averages for each syllable and create an array
def utterances(array):
    f0_mean = []
    end = 0
    for i in range(len(array)):
        if i == len(array)-1:
            end = len(f0) - 1
        else:
            end = array[i+1]-1
        avg = np.mean(f0[array[i]:end])
        f0_mean.append(avg)
    return f0_mean
utterance = utterances(onset)
print(utterance)
print(min(f0), max(f0))
Is there a way to handle specific times in seconds (for example [0.15 0.160714 0.171429 0.182143 0.192857 0.203571 0.214286 0.225 0.235714 0.246429]) in matplotlib.animation.FuncAnimation()?
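One possible approach, offered as a sketch rather than a definitive answer: FuncAnimation takes a single fixed `interval` between frames, so irregular timestamps can be handled by choosing a fixed video frame rate and working out, for every frame, which data sample is current at that moment. The helper `sample_index_per_frame` and the example timestamps are hypothetical; only the standard-library `bisect` module is used.

```python
import bisect


def sample_index_per_frame(times, fps):
    """For each video frame, return the index of the latest sample at or before it.

    `times` must be sorted in ascending order (seconds).
    """
    duration = times[-1] - times[0]
    n_frames = int(duration * fps) + 1
    indices = []
    for frame in range(n_frames):
        t = times[0] + frame / fps
        # bisect_right returns the insertion point after any equal entries,
        # so subtracting 1 gives the last sample whose time is <= t
        idx = bisect.bisect_right(times, t) - 1
        indices.append(max(idx, 0))
    return indices


times = [0.0, 0.5, 1.25, 2.0]          # made-up timestamps in seconds
frame_to_sample = sample_index_per_frame(times, fps=2)
print(frame_to_sample)                  # [0, 1, 1, 2, 3]
```

The resulting list could then be passed as the `frames` argument of FuncAnimation with `interval=1000 / fps`, so that the update function receives a sample index and draws the dot at `f0[index]`; saving with the same `fps` should keep the video aligned with the audio.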

graphing a line of data from a text file

This is my first time creating a graph in Python. I have a text file holding data of "weekly gas averages". There are 52 of them (it's a year's worth of data). I think I understand how to read the data and make it into a list, and I can do the basics of making a graph if I make the points myself. But I don't know how to connect the two, that is, turn the data in the file into my X axis and then make my own Y axis (1-52). My code is a bunch of thoughts I've slowly put together. Any help or direction would be amazing.
import matplotlib.pyplot as plt
def main():
    print("Welcome to my program. This program will read data off a file"\
          +" called 1994_Weekly_Gas_Averages.txt. It will plot the"\
          +" data on a line graph.")
    print()
    gasFile = open("1994_Weekly_Gas_Averages.txt", 'r')
    gasList= []
    gasAveragesPerWeek = gasFile.readline()
    while gasAveragesPerWeek != "":
        gasAveragePerWeek = float(gasAveragesPerWeek)
        gasList.append(gasAveragesPerWeek)
        gasAveragesPerWeek = gasFile.readline()
    index = 0
    while index<len(gasList):
        gasList[index] = gasList[index].rstrip('\n')
        index += 1
    print(gasList)
    #create x and y coordinates with data
    x_coords = [gasList]
    y_coords = [1,53]
    #build line graph
    plt.plot(x_coords, y_coords)
    #add title
    plt.title('1994 Weekly Gas Averages')
    #add labels
    plt.xlabel('Gas Averages')
    plt.ylabel('Week')
    #display graph
    plt.show()
main()
Two errors I can spot while reading your code:
The object gasList is already a list, so when you write x_coords = [gasList] you are creating a list of lists, which will not work.
The line y_coords = [1,53] creates a list with only two values: 1 and 53. When you plot, you need as many y-values as there are x-values, so that list should have 52 values. You don't have to write them all by hand; you can use range(start, stop) to generate them for you.
That being said, you will probably gain a lot by using functions that have already been written for you. For instance, if you use the numpy module (import numpy as np), you can use np.loadtxt() to read the content of the file and create an array in one line. It is going to be much faster, and less error-prone than trying to parse files yourself.
The final code:
import matplotlib.pyplot as plt
import numpy as np
def main():
    print(
        "Welcome to my program. This program will read data off a file called 1994_Weekly_Gas_Averages.txt. It will "
        "plot the data on a line graph.")
    print()
    gasFile = "1994_Weekly_Gas_Averages.txt"
    gasList = np.loadtxt(gasFile)
    y_coords = range(1, len(gasList) + 1)  # better not to hardcode the length of y_coords,
    # in case there are fewer values than expected
    # build line graph
    plt.plot(gasList, y_coords)
    # add title
    plt.title('1994 Weekly Gas Averages')
    # add labels
    plt.xlabel('Gas Averages')
    plt.ylabel('Week')
    # display graph
    plt.show()
if __name__ == "__main__":
    main()
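If numpy is not available, the same two fixes (a flat list of numbers, and one week number per value) can be sketched with plain lists. This is only an illustration: `parse_weekly_averages` is a hypothetical helper, and the sample lines stand in for the contents of the text file.

```python
def parse_weekly_averages(lines):
    """Convert text lines to floats, skipping blank lines."""
    values = []
    for line in lines:
        text = line.strip()      # removes the trailing newline and stray spaces
        if text:
            values.append(float(text))
    return values


# Stand-in for the lines read from the file, e.g. with open(...) as f: lines = f.readlines()
sample_lines = ["1.05\n", "1.10\n", "\n", "0.98\n"]

gas_list = parse_weekly_averages(sample_lines)
weeks = list(range(1, len(gas_list) + 1))    # one week number per value

print(gas_list)   # [1.05, 1.1, 0.98]
print(weeks)      # [1, 2, 3]
```

Both lists have the same length, so they can be handed to `plt.plot(gas_list, weeks)` directly.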

Continuous analog read from National Instruments DAQ with nidaqmx python package

Inspired by the answer to this question, I have tried the following code:
import numpy as np
import nidaqmx
from nidaqmx import stream_readers
from nidaqmx import constants
import time
sfreq = 1000
bufsize = 100
data = np.zeros((1, 1), dtype = np.float32) # initializes total data file
with nidaqmx.Task() as task:
    task.ai_channels.add_ai_voltage_chan("cDAQ2Mod1/ai1")
    task.timing.cfg_samp_clk_timing(rate = sfreq, sample_mode = constants.AcquisitionType.CONTINUOUS,
                                    samps_per_chan = bufsize) # unclear whether samps_per_chan is needed or why it would be different than bufsize
    stream = stream_readers.AnalogMultiChannelReader(task.in_stream)
    def reading_task_callback(task_id, event_type, num_samples, callback_data=None): # num_samples is set to bufsize
        buffer = np.zeros((1, num_samples), dtype = np.float32) # probably better to define it here inside the callback
        stream.read_many_sample(buffer, num_samples, timeout = constants.WAIT_INFINITELY)
        data = np.append(data, buffer, axis = 1) # hoping to retrieve this data after the read is stopped
    task.register_every_n_samples_acquired_into_buffer_event(bufsize, reading_task_callback)
Expected behavior: it reads continuously from a channel. I am not even trying to get it to do something specific yet (such as plotting in real time), but I would expect the python console to run until one stops it, since the goal is to read continuously.
Observed behavior: running this code proceeds quickly and the console prompt is returned.
Problem: it seems to me this is not reading continuously at all. Furthermore, the data variable does not get appended like I would like it to (I know that retrieving a certain number of data samples does not require such convoluted code with nidaqmx; this is just one way I thought I could try and see if this is doing what I wanted, i.e. read continuously and continuously append the buffered sample values to data, so that I can then look at the total data acquired).
Any help would be appreciated. I am essentially certain the way to achieve this is by making use of these callbacks which are part of nidaqmx, but somehow I do not seem to manage them well. Note I have been able to read a predefined and finite amount of data samples from analog input channels by making use of read_many_sample.
Details: NI cDAQ 9178 with NI 9205 module inserted, on Lenovo laptop running Windows Home 10, python 3.7 and nidaqmx package for python.
EDIT: for anyone interested, I now have this working in the following way, with live visual feedback using matplotlib, and (I am not 100% sure yet) there seem to be no buffer problems even for long acquisitions (>10 minutes). Here is the code (not cleaned, sorry):
"""
Analog data acquisition for QuSpin's OPMs via National Instruments' cDAQ unit
The following assumes:
"""
# Imports
import matplotlib.pyplot as plt
import numpy as np
import nidaqmx
from nidaqmx.stream_readers import AnalogMultiChannelReader
from nidaqmx import constants
# from nidaqmx import stream_readers # not needed in this script
# from nidaqmx import stream_writers # not needed in this script
import threading
import pickle
from datetime import datetime
import scipy.io
# Parameters
sampling_freq_in = 1000 # in Hz
buffer_in_size = 100
bufsize_callback = buffer_in_size
buffer_in_size_cfg = round(buffer_in_size * 1) # clock configuration
chans_in = 3 # set to number of active OPMs (x2 if By and Bz are used, but that is not recommended)
refresh_rate_plot = 10 # in Hz
crop = 10 # number of seconds to drop at acquisition start before saving
my_filename = 'test_3_opms' # with full path if target folder different from current folder (do not leave trailing /)
# Initialize data placeholders
buffer_in = np.zeros((chans_in, buffer_in_size))
data = np.zeros((chans_in, 1)) # will contain a first column with zeros but that's fine
# Definitions of basic functions
def ask_user():
    global running
    input("Press ENTER/RETURN to stop acquisition and coil drivers.")
    running = False
def cfg_read_task(acquisition): # uses above parameters
    acquisition.ai_channels.add_ai_voltage_chan("cDAQ2Mod1/ai1:3") # has to match with chans_in
    acquisition.timing.cfg_samp_clk_timing(rate=sampling_freq_in, sample_mode=constants.AcquisitionType.CONTINUOUS,
                                           samps_per_chan=buffer_in_size_cfg)
def reading_task_callback(task_idx, event_type, num_samples, callback_data): # bufsize_callback is passed to num_samples
    global data
    global buffer_in
    if running:
        # It may be wiser to read slightly more than num_samples here, to make sure one does not miss any sample,
        # see: https://documentation.help/NI-DAQmx-Key-Concepts/contCAcqGen.html
        buffer_in = np.zeros((chans_in, num_samples)) # double definition ???
        stream_in.read_many_sample(buffer_in, num_samples, timeout=constants.WAIT_INFINITELY)
        data = np.append(data, buffer_in, axis=1) # appends buffered data to total variable data
    return 0 # Absolutely needed for this callback to be well defined (see nidaqmx doc).
# Configure and setup the tasks
task_in = nidaqmx.Task()
cfg_read_task(task_in)
stream_in = AnalogMultiChannelReader(task_in.in_stream)
task_in.register_every_n_samples_acquired_into_buffer_event(bufsize_callback, reading_task_callback)
# Start threading to prompt user to stop
thread_user = threading.Thread(target=ask_user)
thread_user.start()
# Main loop
running = True
time_start = datetime.now()
task_in.start()
# Plot a visual feedback for the user's mental health
f, (ax1, ax2, ax3) = plt.subplots(3, 1, sharex='all', sharey='none')
while running: # make this adapt to number of channels automatically
    ax1.clear()
    ax2.clear()
    ax3.clear()
    ax1.plot(data[0, -sampling_freq_in * 5:].T) # 5 seconds rolling window
    ax2.plot(data[1, -sampling_freq_in * 5:].T)
    ax3.plot(data[2, -sampling_freq_in * 5:].T)
    # Label and axis formatting
    ax3.set_xlabel('time [s]')
    ax1.set_ylabel('voltage [V]')
    ax2.set_ylabel('voltage [V]')
    ax3.set_ylabel('voltage [V]')
    xticks = np.arange(0, data[0, -sampling_freq_in * 5:].size, sampling_freq_in)
    xticklabels = np.arange(0, xticks.size, 1)
    ax3.set_xticks(xticks)
    ax3.set_xticklabels(xticklabels)
    plt.pause(1/refresh_rate_plot) # required for dynamic plot to work (if too low, nulling performance bad)
# Close task to clear connection once done
task_in.close()
duration = datetime.now() - time_start
# Final save data and metadata ... first in python reloadable format:
filename = my_filename
with open(filename, 'wb') as f:
    pickle.dump(data, f)
'''
Load this variable back with:
with open(name, 'rb') as f:
    data_reloaded = pickle.load(f)
'''
# Human-readable text file:
extension = '.txt'
np.set_printoptions(threshold=np.inf, linewidth=np.inf) # turn off summarization, line-wrapping
with open(filename + extension, 'w') as f:
    f.write(np.array2string(data.T, separator=', ')) # improve precision here!
# Now in matlab:
extension = '.mat'
scipy.io.savemat(filename + extension, {'data':data})
# Some messages at the end
num_samples_acquired = data[0,:].size
print("\n")
print("OPM acquisition ended.\n")
print("Acquisition duration: {}.".format(duration))
print("Acquired samples: {}.".format(num_samples_acquired - 1))
# Final plot of whole time course the acquisition
plt.close('all')
f_tot, (ax1, ax2, ax3) = plt.subplots(3, 1, sharex='all', sharey='none')
ax1.plot(data[0, 10:].T) # note the exclusion of the first 10 iterations (automatically zoomed in plot)
ax2.plot(data[1, 10:].T)
ax3.plot(data[2, 10:].T)
# Label formatting ...
ax3.set_xlabel('time [s]')
ax1.set_ylabel('voltage [V]')
ax2.set_ylabel('voltage [V]')
ax3.set_ylabel('voltage [V]')
xticks = np.arange(0, data[0, :].size, sampling_freq_in)
xticklabels = np.arange(0, xticks.size, 1)
ax3.set_xticks(xticks)
ax3.set_xticklabels(xticklabels)
plt.show()
Of course comments are appreciated. This is probably still suboptimal.
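One possible refinement of the accumulation step, sketched here without any hardware: np.append copies the whole array on every call, so for long acquisitions it may be cheaper to collect each buffer in a list and join them once at the end. The class `ChunkCollector` and the function `fake_driver` are hypothetical stand-ins; `fake_driver` only simulates the driver invoking the callback from another thread, and the lock is a precaution under that assumption.

```python
import threading


class ChunkCollector:
    """Collects fixed-size chunks delivered by a callback and joins them on demand."""

    def __init__(self):
        self._chunks = []
        self._lock = threading.Lock()

    def callback(self, chunk):
        # Appending to a list is cheap: no earlier data is copied
        with self._lock:
            self._chunks.append(list(chunk))
        return 0   # mirrors the integer return value used by the callback above

    def flatten(self):
        # Join all chunks once, when the data is actually needed
        with self._lock:
            return [sample for chunk in self._chunks for sample in chunk]


collector = ChunkCollector()


def fake_driver(n_events, chunk_size):
    """Simulated acquisition: delivers n_events chunks of consecutive integers."""
    for event in range(n_events):
        start = event * chunk_size
        collector.callback(range(start, start + chunk_size))


worker = threading.Thread(target=fake_driver, args=(5, 100))
worker.start()
worker.join()

samples = collector.flatten()
print(len(samples))   # 500
```

With real buffers the same idea would store a copy of each NumPy buffer in the list and call np.concatenate once after the task is closed.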

How does one plot a running average without importing external modules (other than matplotlib)?

Here is a link to the file with the information in 'sunspots.txt'. With the exception of the external modules matplotlib.pyplot and seaborn, how could one compute the running average without importing external modules like numpy and future? (If it helps, I can do linspace and loadtxt without numpy.)
If it helps, my code thus far is posted below:
## open/read file
f2 = open("/Users/location/sublocation/sunspots.txt", 'r')
## extract data
lines = f2.readlines()
## close file
f2.close()
t = [] ## time
n = [] ## number
## col 1 == col[0] -- number identifying which month
## col 2 == col[1] -- number of sunspots observed
for col in lines: ## 'col' can be replaced by 'line' iff change below is made
    new_data = col.split() ## 'col' can be replaced by 'line' iff change above is made
    t.append(float(new_data[0]))
    n.append(float(new_data[1]))
## extract data ++ close file
## check ##
# print(t)
# print(n)
## check ##
## import
import matplotlib.pyplot as plt
import seaborn as sns
## plot
sns.set_style('ticks')
plt.figure(figsize=(12,6))
plt.plot(t,n, label='Number of sunspots observed monthly' )
plt.xlabel('Time')
plt.ylabel('Number of Sunspots Observed')
plt.legend(loc='best')
plt.tight_layout()
plt.savefig("/Users/location/sublocation/filename.png", dpi=600)
The question is from the weblink from this university (p.11 of the PDF, p.98 of the book, Exercise 3-1).
Before marking this as a duplicate:
A similar question was posted here. The difference is that all posted answers require importing external modules like numpy and future whereas I am trying to do without external imports (with the exceptions above).
Noisy data that needs to be smoothed
y = [1.0016, 0.95646, 1.03544, 1.04559, 1.0232,
1.06406, 1.05127, 0.93961, 1.02775, 0.96807,
1.00221, 1.07808, 1.03371, 1.05547, 1.04498,
1.03607, 1.01333, 0.943, 0.97663, 1.02639]
Try a running average with a window size of n
n = 3
Each window can be represented by a slice
window = y[i:i+n]
Need something to store the averages in
averages = []
Iterate over n-length slices of the data; get the average of each slice; save the average in another list.
from __future__ import division # For Python 2
for i in range(len(y) - n + 1): # + 1 so the last full window is included
    window = y[i:i+n]
    avg = sum(window) / n
    print(window, avg)
    averages.append(avg)
When you plot the averages you'll notice there are fewer averages than there are samples in the data.
Maybe you could import an internal/built-in module and make use of this SO answer -https://stackoverflow.com/a/14884062/2823755
Lots of hits searching with running average algorithm python
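A variant of the loop above, as a minimal sketch using only built-ins: instead of summing every slice from scratch, keep a running total and update it as the window slides by one sample. The function name `running_average` is hypothetical, and the example uses Python 3 division.

```python
def running_average(values, window):
    """Return the averages of every full window of the given size."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])          # sum of the first window
    averages = [total / window]
    for i in range(window, len(values)):
        # Slide by one: add the entering sample, drop the leaving one
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages


result = running_average([1, 2, 3, 4, 5], 3)
print(result)   # [2.0, 3.0, 4.0]
```

There are `len(values) - window + 1` averages, so when plotting them against time the x values need to be trimmed to the same length. With floating-point data the running total can accumulate small rounding differences compared with summing each slice separately.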
