So here we have a Python script:
""" Record a few seconds of audio and save to a WAVE file. """
import pyaudio
import wave
chunk = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=chunk)
print "* recording"
all = []
for i in range(0, RATE / chunk * RECORD_SECONDS):
    data = stream.read(chunk)
    all.append(data)
print "* done recording"
stream.close()
p.terminate()
# write data to WAVE file
data = ''.join(all)
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(data)
wf.close()
This script does what the first comment line says: if you run it in a terminal, it outputs a ".wav" file in the directory you're in at the moment of execution. What I want to do is take that file and manipulate it: instead of writing it to disk, I want to store it in a variable or something like that, and then POST it to a URL, passing some parameters along with it. I saw some interesting examples of posting multipart-encoded files using requests, as you can see here:
http://docs.python-requests.org/en/latest/user/quickstart/
But I made several attempts at achieving what I'm describing in this question and was unlucky... Maybe a little guidance will help with this one :)
To be brief, what I need is to record a WAV file from the microphone, POST it to a URL (passing data like headers along with it), and then print the response in the terminal.
Thank You!!
wave.open lets you pass either a file name or a file-like object to save into. If you pass in a StringIO object rather than WAVE_OUTPUT_FILENAME, you can get a string object that you can presumably use to construct a POST request.
Note that this will load the file into memory -- if it might be really long, you might prefer to write it to a temporary file and then use that to make your request. Of course, you're already loading it into memory, so maybe that's not an issue.
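In Python 3 the binary counterpart of StringIO is io.BytesIO. A minimal sketch of the idea (the silent frames, the upload URL, and the multipart field names are hypothetical stand-ins, not from the question):

```python
import io
import wave

# Hypothetical stand-in for the frames captured by the PyAudio loop above:
frames = [b'\x00\x00' * 1024 for _ in range(4)]  # 16-bit mono silence

buf = io.BytesIO()
wf = wave.open(buf, 'wb')   # wave accepts any file-like object
wf.setnchannels(1)
wf.setsampwidth(2)          # pyaudio.paInt16 -> 2 bytes per sample
wf.setframerate(44100)
wf.writeframes(b''.join(frames))
wf.close()

wav_bytes = buf.getvalue()  # the complete WAV file, header included, in memory

def post_wav(url, wav_bytes, params=None):
    """POST the in-memory WAV as a multipart upload (not executed here)."""
    import requests  # third-party; assumed installed
    files = {'file': ('output.wav', wav_bytes, 'audio/wav')}
    resp = requests.post(url, files=files, data=params)
    return resp.text
```

`print(post_wav(some_url, wav_bytes))` would then print the server's response in the terminal, as asked.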
Related
I have the following code that reads a video and saves it to another path. The problem is that the saved file is not playable. Why?
import subprocess
import shlex
from io import BytesIO
with open('a.mkv', 'rb') as fh:
    buf = BytesIO(fh.read())
args = shlex.split('ffmpeg -i pipe: -codec copy -f rawvideo pipe:')
proc = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate(input=buf.getbuffer())
proc.wait()
with open("a.mp4", "wb") as f:
    f.write(out)
I need to keep the buffers so that the video has the correct size. How can I solve this?
You could use ffmpeg-python
pip install ffmpeg-python
And then:
import ffmpeg
from io import BytesIO

with open('a.mkv', 'rb') as fh:
    buf = BytesIO(fh.read())

process = (
    ffmpeg
    .input('pipe:')
    .output('a.mp4')
    .overwrite_output()
    .run_async(pipe_stdin=True)
)
process.communicate(input=buf.getbuffer())
And isn't there any way of making a pipe of streaming data like in Node.js? For example, start a pipe that downloads bytes from S3, processes them with ffmpeg, and uploads the result back to S3. In Node.js I guess this can be done byte by byte instead of filling the RAM; in Python the idea would be to create a temp file on the backend server and write the files there.
Yes, there is, but there is no prescribed mechanism in Python like in Node.js. You need to run your own threads (or asyncio coroutines): one to send data to the FFmpeg process and another to receive data from it. Here is a sketch of what I would do:
from threading import Thread
from queue import Queue
import subprocess as sp

# let's say getting mp4 in and outputting mkv, copying all the streams
# NOTE: you cannot pipe out mp4
args = ['ffmpeg', '-f', 'mp4', '-i', '-', '-c', 'copy', '-f', 'matroska', '-']
proc = sp.Popen(args, stdin=sp.PIPE, stdout=sp.PIPE)

queue = Queue()  # downloaded byte blocks get put on this queue

def writer():
    while True:
        # get next downloaded data block
        data = queue.get()
        queue.task_done()
        if data is None:
            break
        try:
            nbytes = proc.stdin.write(data)
        except (OSError, ValueError):
            # stdin stream closed/FFmpeg terminated, end the thread as well
            break
        if not nbytes and proc.stdin.closed:  # just in case
            break

def reader():
    # output block size
    blocksize = ...  # set to something reasonable
    # I use the frame byte size for raw data in, but it would be
    # different for receiving encoded data
    while True:
        try:
            data = proc.stdout.read(blocksize)
        except (OSError, ValueError):
            # stdout stream closed/FFmpeg terminated, end the thread as well
            break
        if not data:  # done, no more data
            break
        # upload the data
        ...

writer_thread = Thread(target=writer)
reader_thread = Thread(target=reader)
writer_thread.start()
reader_thread.start()
writer_thread.join()  # first wait until all the data are written
proc.stdin.close()    # triggers ffmpeg to stop waiting for input and wrap up its encoding
proc.wait()           # waits for ffmpeg
reader_thread.join()  # wait till all the ffmpeg outputs are processed
I tried multiple different approaches for my ffmpegio.streams.SimpleFilterBase class and settled on this one.
In Short
Is there a way to convert raw audio data (obtained with the PyAudio module) into the form of a virtual file (like the one obtained from Python's open() function), without saving it to disk and reading it back? Details are provided below.
What Am I Doing
I'm using PyAudio to record audio, which is then fed into a TensorFlow model to get a prediction. Currently it works if I first save the recorded sound as a .wav file on disk and then read it again to feed it into the model. Here is the recording and saving code:
import pyaudio
import wave
CHUNK_LENGTH = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
RECORD_SECONDS = 1
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK_LENGTH)
print("* recording")
frames = [stream.read(RATE * RECORD_SECONDS)] # here is the recorded data, in the form of list of bytes
print("* done recording")
stream.stop_stream()
stream.close()
p.terminate()
After I get the raw audio data (the variable frames), it can be saved using the Python wave module as below. Note that when saving, some metadata must be written by calling functions like wf.setxxx.
import os
import wave
from datetime import datetime
output_dir = "data/"
output_path = output_dir + "{:%Y%m%d_%H%M%S}.wav".format(datetime.now())
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
# save the recorded data as wav file using python `wave` module
wf = wave.open(output_path, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
And here is the code that uses the saved file to run inference with the TensorFlow model. It simply reads the file as binary, and the model handles the rest.
import classifier # my tensorflow model
with open(output_path, 'rb') as f:
    w = f.read()
classifier.run_graph(w, labels, 5)
THE PROBLEM
For real-time use, I need to keep streaming the audio and feeding it into the model every once in a while. But it seems unreasonable to keep saving the file to disk and then reading it again and again, which will spend lots of time on I/O.
I want to keep the data in memory and use it directly, rather than saving and reading it repeatedly. However, the Python wave module does not support reading and writing simultaneously (see here).
If I directly feed the data without the metadata (e.g. channels, frame rate) that the wave module adds during saving, like this:
w = b''.join(frames)
classifier.run_graph(w, labels, 5)
I get an error like the one below:
2021-04-07 11:05:08.228544: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at decode_wav_op.cc:55 : Invalid argument: Header mismatch: Expected RIFF but found
Traceback (most recent call last):
File "C:\Users\anaconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
return fn(*args)
File "C:\Users\anaconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
target_list, run_metadata)
File "C:\Users\anaconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Header mismatch: Expected RIFF but found
The tensorflow model I'm using is provided here: ML-KWS-for-MCU, hope this helps.
Here is the code that produces the error: (classifier.run_graph())
def run_graph(wav_data, labels, num_top_predictions):
    """Runs the audio data through the graph and prints predictions."""
    with tf.Session() as sess:
        # Feed the audio data as input to the graph.
        # predictions will contain a two-dimensional array, where one
        # dimension represents the input image count, and the other has
        # predictions per class
        softmax_tensor = sess.graph.get_tensor_by_name("labels_softmax:0")
        predictions, = sess.run(softmax_tensor, {"wav_data:0": wav_data})
        # Sort to show labels in order of confidence
        top_k = predictions.argsort()[-num_top_predictions:][::-1]
        for node_id in top_k:
            human_string = labels[node_id]
            score = predictions[node_id]
            print('%s (score = %.5f)' % (human_string, score))
    return 0
You should be able to use io.BytesIO instead of a physical file; they share the same interface, but a BytesIO object is kept only in memory:
import io
import wave

container = io.BytesIO()
wf = wave.open(container, 'wb')
wf.setnchannels(4)
wf.setsampwidth(4)
wf.setframerate(4)
wf.writeframes(b'abcdef')
# Read the data up to this point
container.seek(0)
data_package = container.read()
# add some more data...
wf.writeframes(b'ghijk')
# read the data added since last
container.seek(len(data_package))
data_package = container.read()
This should allow you to continuously stream the data into the file while reading the excess using your TensorFlow code.
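A sketch of that idea, assuming the parameters from the recorder in the question (the one-second silent chunk stands in for the output of stream.read, and the classifier call is left commented out): wrap each raw PCM chunk in a complete in-memory WAV before feeding the model:

```python
import io
import wave

def pcm_to_wav_bytes(pcm, channels=1, sampwidth=2, rate=16000):
    """Wrap raw PCM frames in a complete in-memory WAV (RIFF header included)."""
    container = io.BytesIO()
    wf = wave.open(container, 'wb')
    wf.setnchannels(channels)
    wf.setsampwidth(sampwidth)
    wf.setframerate(rate)
    wf.writeframes(pcm)
    wf.close()
    return container.getvalue()

# Hypothetical per-chunk loop: each chunk becomes a self-contained WAV.
chunk = b'\x00\x00' * 16000  # one second of 16-bit mono silence as a stand-in
w = pcm_to_wav_bytes(chunk)
# classifier.run_graph(w, labels, 5)  # same call as with the on-disk file
```

Since the header is regenerated for every chunk, the model always sees a valid RIFF file and the "Header mismatch: Expected RIFF" error goes away.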
I need to transcribe speech that is being written to a wav file. I've implemented the following iterator to incrementally read the audio from the file:
import wave
def read_audio(path, chunk_size=1024):
    wave_file = wave.open(open(path, 'rb'))
    while True:
        data = wave_file.readframes(chunk_size)
        if data != "":
            yield data
In order to test the generator, I've implemented a function that keeps writing to a wav file the audio captured by the computer's microphone:
import pyaudio
import wave

def record_to_file(out_path):
    fmt = pyaudio.paInt16
    channels = 1
    rate = 16000
    chunk = 1024
    audio = pyaudio.PyAudio()
    stream = audio.open(format=fmt, channels=channels,
                        rate=rate, input=True,
                        frames_per_buffer=chunk)
    wave_file = wave.open(out_path, 'wb')
    wave_file.setnchannels(channels)
    wave_file.setsampwidth(audio.get_sample_size(fmt))
    wave_file.setframerate(rate)
    while True:
        data = stream.read(chunk)
        wave_file.writeframes(data)
Below is the test script:
import threading
import time

WAV_PATH = 'out.wav'

def record_worker():
    record_to_file(WAV_PATH)

if __name__ == '__main__':
    t = threading.Thread(target=record_worker)
    t.setDaemon(True)
    t.start()
    time.sleep(5)
    reader = read_audio(WAV_PATH)
    for chunk in reader:
        print(len(chunk))
It doesn't work as I'd expect: the reader stops yielding after a while. Since the test succeeds if I adapt record_to_file to set the wav file's nframes to a very large number beforehand and do the writing with writeframesraw, my guess is that wave.open eagerly reads nframes and doesn't try to read anything after that number of frames has been consumed.
Is it possible to obtain that incremental read in Python 2.7 without resorting to this setnframes hack? It's worth noting that, contrary to the test script, I have no control over the wav file's generation in the scenario in which I plan to use this feature. The writing is done by a SWIG-adapted C library named pjsip (http://www.pjsip.org/python/pjsua.htm), so I don't expect it to be possible to make any modifications on that end.
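For reference, the setnframes hack described above can be sketched like this (shown in Python 3; the parameter values are assumptions matching the recorder in the question): the writer claims a huge frame count up front and emits frames with writeframesraw, so a reader opening the file mid-write does not stop at the true frame count:

```python
import os
import tempfile
import wave

def open_wav_for_tailing(path, channels=1, sampwidth=2, rate=16000):
    """Open a WAV writer whose header claims far more frames than exist yet."""
    raw = open(path, 'wb')
    wf = wave.open(raw, 'wb')
    wf.setnchannels(channels)
    wf.setsampwidth(sampwidth)
    wf.setframerate(rate)
    wf.setnframes(2 ** 30)  # lie about the length so readers don't stop early
    return raw, wf

path = os.path.join(tempfile.gettempdir(), 'out.wav')
raw, wf = open_wav_for_tailing(path)
wf.writeframesraw(b'\x00\x00' * 1024)  # writeframesraw never patches the header
raw.flush()
# Deliberately not calling wf.close() here: close() would rewrite the header
# with the true frame count, undoing the trick.
```

A concurrent reader opening `path` now sees a header claiming 2**30 frames and will keep reading as data arrives, which is why the asker's test passes with this hack.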
import urllib.request, io
url = 'http://www.image.com/image.jpg'
path = io.BytesIO(urllib.request.urlopen(url).read())
I'd like to check the file size of the URL image in the filestream path before saving. How can I do this?
Also, I don't want to rely on Content-Length headers, I'd like to fetch it into a filestream, check the size and then save
You can get the size of the io.BytesIO() object the same way you can get it for any file object: by seeking to the end and asking for the file position:
path = io.BytesIO(urllib.request.urlopen(url).read())
path.seek(0, 2) # 0 bytes from the end
size = path.tell()
However, you could just as easily have just taken the len() of the bytestring you just read, before inserting it into an in-memory file object:
data = urllib.request.urlopen(url).read()
size = len(data)
path = io.BytesIO(data)
Note that this means your image has already been loaded into memory. You cannot use this to prevent loading too large an image object. For that using the Content-Length header is the only option.
If the server uses chunked transfer encoding to facilitate streaming (so no content length has been set up front), you can use a loop to limit how much data is read.
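A sketch of such a loop (the 5 MiB cap and the helper names are arbitrary choices): stream the body in fixed-size chunks and abort once the limit is exceeded, without trusting Content-Length at all:

```python
import io
import urllib.request

def read_limited(stream, limit, chunk_size=64 * 1024):
    """Read a file-like object into memory, raising once `limit` bytes is exceeded."""
    buf = io.BytesIO()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        if buf.tell() + len(chunk) > limit:
            raise ValueError("download exceeds %d bytes" % limit)
        buf.write(chunk)
    buf.seek(0)
    return buf

def fetch_limited(url, limit=5 * 1024 * 1024):
    """Fetch a URL into memory with a hard size cap (not executed here)."""
    with urllib.request.urlopen(url) as resp:
        return read_limited(resp, limit)
```

Unlike checking Content-Length, this genuinely bounds memory use: the loop raises before the oversized body is ever fully buffered.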
Try importing urllib.request
import urllib.request, io
url = 'http://www.elsecarrailway.co.uk/images/Events/TeddyBear-3.jpg'
path = urllib.request.urlopen(url)
meta = path.info()
>>> meta.get("Content-Length")
'269898'  # i.e. ~270 kB
You could ask the server for the Content-Length information. Using urllib2 (which I hope is available in your Python):
import urllib2

req = urllib2.urlopen(url)
meta = req.info()
length_text = meta.getheader("Content-Length")
try:
    length = int(length_text)
except (TypeError, ValueError):
    # length unknown, you may need to read the body to find out
    length = -1
I have been trying to download a video file with Python and play it with VLC at the same time.
I have tried a few ways. One of them is to download in a single thread, continuously fetching and appending data. This approach is slow, but the video plays. The code is something like below:
self.fp = open(dest, "wb")
# inside the urllib2 request loop:
while not self.stop_down and _continue:
    try:
        size = 1024 * 8
        data = page.read(size)
        bytes_read += size
        self.fp.write(data)
This function takes longer to download, but I am able to play the video while it's loading.
However, I have also been trying to download multiple parts at the same time, with proper threading logic:
req = urllib2.Request(self.url)
req.headers['Range'] = 'bytes=%s-%s' % (self.startPos, self.end)
response = urllib2.urlopen(req)
content = response.read()
if os.path.exists(self.dest):
    out_fd = open(self.dest, "r+b")
else:
    out_fd = open(self.dest, "w+b")
out_fd.seek(self.startPos, 0)
out_fd.write(content)
out_fd.close()
With my threading I make sure that each part of the file is saved sequentially.
But for some reason I can't play this file at all while downloading.
Is there anything I'm not doing right? Should the "Range" be computed a different way?
Turns out that for each block of data in threaded mode, the Range has to overlap by one byte: if the first block ends at byte 1024, the next one starts from 1023 onwards.
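For reference, HTTP Range values are inclusive on both ends, which makes off-by-one gaps or overlaps between blocks very easy to introduce. A hypothetical helper for computing non-overlapping inclusive block boundaries:

```python
def byte_ranges(total_size, block_size):
    """Split total_size bytes into non-overlapping inclusive HTTP byte ranges."""
    ranges = []
    start = 0
    while start < total_size:
        end = min(start + block_size, total_size) - 1  # Range ends are inclusive
        ranges.append((start, end))
        start = end + 1
    return ranges

# Each tuple maps to a header like 'Range: bytes=%d-%d'
print(byte_ranges(2500, 1024))  # [(0, 1023), (1024, 2047), (2048, 2499)]
```

Checking that consecutive tuples satisfy `next_start == prev_end + 1` is a quick way to confirm the downloaded parts will tile the file exactly.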