How to do audio preprocessing in flutter? - python

My team and I are building an audio classification app for Android. We originally ran classification on a Python backend connected to the Flutter app, but we now want to drop the backend and do everything in Flutter with TFLite. The problem is that we relied on librosa for data preprocessing (computing a mel spectrogram), and we can't find a Flutter library that computes mel spectrograms. Does anyone know of one, or can you recommend another way to preprocess audio input for our TFLite model?
Basically, is there a Flutter library that can do this:
mels = np.mean(librosa.feature.melspectrogram(y=X, sr=sample_rate).T,axis=0)
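For anyone who ends up porting this by hand: the line above boils down to framing the signal, windowing each frame, taking an FFT, applying a mel filterbank, and averaging over time. The following is a minimal NumPy sketch of those steps, not librosa's implementation. It uses librosa's default sizes (n_fft=2048, hop_length=512, n_mels=128), but the HTK mel formula, the unnormalized triangular filters, and the symmetric Hann window are simplifying assumptions, and all function names are made up for illustration.

```python
import numpy as np


def hz_to_mel(hz):
    # HTK-style mel formula (assumption: librosa defaults to the Slaney variant).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose corner points are evenly spaced on the mel scale.
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        lo, mid, hi = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mel_mean_features(y, sr, n_fft=2048, hop_length=512, n_mels=128):
    # Zero-pad both ends so that frames are centred on their time stamps.
    y = np.pad(np.asarray(y, dtype=float), n_fft // 2, mode="constant")
    n_frames = 1 + (len(y) - n_fft) // hop_length
    window = np.hanning(n_fft)  # symmetric Hann window (a simplification)
    frames = np.stack([
        y[i * hop_length:i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # Power spectrum per frame: shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Project onto the mel filters: shape (n_frames, n_mels).
    mel_spec = power @ mel_filterbank(sr, n_fft, n_mels).T
    # Average over time, giving one value per mel band.
    return mel_spec.mean(axis=0)
```

Each step here (array slicing, a real FFT, a matrix product) has a direct equivalent in Dart. If the model was trained on librosa features, the on-device code has to reproduce librosa's own filterbank (Slaney-style scale and normalization by default) rather than this simplified one, otherwise the inputs will differ from what the model saw during training.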

Related

how to train a model in machine learning

I am working on my graduation project, a security detection program, and I have a few questions:
1- How do I train a model for object detection with YOLO? I already have the images with annotations, but I don't know how to train on them.
2- I am using a Haar cascade with OpenCV for face recognition. Is there a better model or a newer approach? The Haar cascade is inaccurate, drops frames, and only works on frontal faces. If not, how can I train the Haar cascade on more images to improve it?
3- The whole program is written in Python. Once I finish it, how can I turn it into a website or desktop app? For example, I run YOLO through a command-line argument parser, so I don't know how to make a button do that.
What I have tried:
1- I tried Google Colab to train on my data, but it was based on TensorFlow, so it didn't help me.
2- I tried Cascade Trainer, but training failed every time.

Real-time audio classification using Python

I want to do real-time audio classification in Python. I have a trained deep learning model, but I can only give it a WAV file as input. I want to make it real-time, using the microphone as input. Is that possible? If so, how would I do it?
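One common pattern for this, sketched here under assumptions: a capture library (for example sounddevice or PyAudio, not shown) delivers small chunks of samples, and the program buffers them into fixed-length, overlapping windows that are passed to the model one at a time. In the sketch below the microphone is simulated with an array, and `classify_window` is a hypothetical stand-in for the trained model's prediction call.

```python
import numpy as np


class SlidingWindow:
    """Buffers streamed audio chunks and emits fixed-length windows."""

    def __init__(self, window_size, hop_size):
        self.window_size = window_size
        self.hop_size = hop_size
        self.buffer = np.zeros(0, dtype=np.float32)

    def push(self, chunk):
        # Append the new samples, then emit every complete window available.
        chunk = np.asarray(chunk, dtype=np.float32)
        self.buffer = np.concatenate([self.buffer, chunk])
        windows = []
        while len(self.buffer) >= self.window_size:
            windows.append(self.buffer[:self.window_size].copy())
            # Advance by hop_size so consecutive windows overlap.
            self.buffer = self.buffer[self.hop_size:]
        return windows


def classify_window(window):
    # Hypothetical placeholder for the real model: a simple loudness threshold.
    rms = float(np.sqrt(np.mean(window ** 2)))
    return "sound" if rms > 0.1 else "silence"


def simulated_microphone(signal, chunk_size):
    # Stand-in for a capture callback: yields the signal chunk by chunk.
    for start in range(0, len(signal), chunk_size):
        yield signal[start:start + chunk_size]


def run_stream(signal, chunk_size=1024, window_size=16000, hop_size=8000):
    # Feed chunks into the buffer and classify each window as it completes.
    windows = SlidingWindow(window_size, hop_size)
    labels = []
    for chunk in simulated_microphone(signal, chunk_size):
        for window in windows.push(chunk):
            labels.append(classify_window(window))
    return labels
```

With a real microphone, the loop over `simulated_microphone` would be replaced by whatever callback or blocking read the capture library provides. The window length should match the clip length the model was trained on.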

Problem in finding code for Simple image classification between two classes in android studio

I trained a Keras model on two classes and converted it to a .tflite model. Now I want to use this model in Android Studio for simple classification between the two classes when I pick an image from the gallery. I can't find any help online for this simple case. There are examples that use the camera, but I don't need that in my simple project.
You will need to use TensorFlow Lite's Java API to run inference on-device. Start with the Android quickstart if you haven't installed the dependencies yet.
If your model works like standard image classification models (e.g. MobileNet), you can start with the TFLite Image Classification app source for inspiration. Specifically, you might be interested in the Classifier base class (and its floating-point child class). Classifier demonstrates how to use TFLite's Java API to instantiate a new Interpreter and run inference on image inputs.

How do I implement keras mnist model with video camera

I have successfully built a model that recognizes handwritten digits. How would I load the model and use it with live data coming from a video camera? I would like it to draw a box around each number and label it.
Your question is very broad; however, there might be one video that answers all your questions.
This is how to use an ML model on Android.

How to train CNN on common voice dataset

I am trying to train a CNN on the Common Voice dataset. I am new to speech recognition and cannot find any resources on how to use the dataset with Keras. I followed this article to build a simple word classification network, but I want to scale it up with the Common Voice dataset. Any help is appreciated.
Thank you
What you can do is look at MFCCs (mel-frequency cepstral coefficients). In short, these are features extracted from the audio waveform using signal processing techniques that approximate the way humans perceive sound. In Python, you can use python-speech-features to compute MFCCs.
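To make the idea concrete, here is a minimal NumPy sketch of the last step of MFCC extraction: taking the log of mel filterbank energies and applying a DCT-II. It assumes the mel energies have already been computed (one row per frame), it is not the python-speech-features implementation, and the function names are illustrative.

```python
import numpy as np


def dct_ii(x, n_coeffs):
    # Orthonormal DCT-II along the last axis, keeping the first n_coeffs terms.
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    i = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    scale = np.full(n_coeffs, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return (x @ basis.T) * scale


def mfcc_from_mel_energies(mel_energies, n_mfcc=13, eps=1e-10):
    # The log compresses the dynamic range; eps avoids log(0) on silent bands.
    log_mel = np.log(np.asarray(mel_energies, dtype=float) + eps)
    # The DCT decorrelates the bands; the low-order coefficients are the MFCCs.
    return dct_ii(log_mel, n_mfcc)
```

The result has shape (n_frames, n_mfcc) and can be treated like a single-channel image when it is fed to a CNN.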
Once you have prepared your data, you can build a CNN; for example something like this one:
You can also use RNNs (LSTM or GRU for example), but this is a bit more advanced.
EDIT: A very good dataset to start with, if you want:
Speech Commands Dataset
