I'd like to accomplish the following in Python. I want to call a subprocess (ffmpeg in this case, using the ffmpy3 wrapper) and pipe the process's output directly into a file-like object that can be consumed by another function's open() call. Since audio and video data can get quite big, I explicitly never want to load the whole output into memory, but only "stream" it in a buffered fashion. Here is some example code:
async def convert_and_process(file: FileIO):
    ff = ffmpy3.FFmpeg(
        inputs={str(file.name): None},
        outputs={'pipe:1': '-y -ac 1 -ar 16000 -acodec pcm_s16le -f wav'}
    )
    stdout: StreamReader = (await ff.run_async(stdout=subprocess.PIPE)).stdout
    with wave.open(help_needed, 'rb') as wf:
        # do stuff with wave file
        pass
Here is the code of run_async; it's just a simple wrapper around asyncio.create_subprocess_exec().
My problem is basically just to turn the StreamReader returned by run_async() into a file-like object that can be consumed by wave.open(). Moreover, does this approach actually avoid loading the whole output into memory, as Popen.wait() or Popen.communicate() would do?
I was thinking that os.pipe() might be useful, but I'm not sure how.
If your example is a true representation of your ultimate goal (reading audio samples in blocks), then you can accomplish it much more easily with just FFmpeg and subprocess.Popen.stdout. If there is more to it than using the wave library to read a memory-mapped .wav file, then please ignore this answer or clarify.
First, a shameless plug: if you are willing to try another library, my ffmpegio can do what you want. Here is an example:
import ffmpegio

# audio stream reader
with ffmpegio.open(file, 'ra', blocksize=1024, ac=1, ar=16000,
                   sample_fmt='s16le') as f:
    for block in f:  # block: [1024 x channels] ndarray
        do_your_thing(block)
blocksize argument sets the number of samples to retrieve at a time (so 1024 audio samples in this example).
This library is still pretty young, and if you have any issues please report on its GitHub Issues board.
Second, if you prefer to implement it yourself, it's actually fairly straightforward if you know the FFmpeg output stream formats AND you only need one stream (multiple streams could also be done easily on non-Windows platforms, I think). For your example above, try the following:
import subprocess
import numpy as np

ff = ffmpy3.FFmpeg(
    inputs={str(file.name): None},
    outputs={'pipe:1': '-y -ac 1 -ar 16000 -acodec pcm_s16le -f s16le'}
)
stdout = (await ff.run_async(stdout=subprocess.PIPE)).stdout

nsamples = 1024  # read 1024 samples at a time
itemsize = 2     # bytes per sample: int16 x 1 channel

while True:
    try:
        b = await stdout.read(nsamples*itemsize)  # StreamReader.read() is a coroutine
    except BrokenPipeError:
        break
    if not b:  # end of stream
        break
    x = np.frombuffer(b, dtype=np.int16)
    # do stuff with audio samples in x
Note that I changed -f wav to -f s16le so that only the raw samples are sent to stdout. Then stdout.read(n) is essentially identical to wave.readframes(n), except for what their n's mean (bytes vs. audio frames).
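If it helps, that read loop can also be wrapped in a small async generator that yields NumPy blocks, similar in spirit to the ffmpegio reader above. This is just a sketch under the same assumptions (s16le output, 16-bit samples); block_reader and its defaults are names I'm introducing here, not part of ffmpy3:

import asyncio
import numpy as np

async def block_reader(stream, nsamples=1024, channels=1, itemsize=2):
    """Yield [nsamples x channels] int16 blocks from an asyncio StreamReader."""
    nbytes = nsamples * channels * itemsize
    while True:
        try:
            b = await stream.readexactly(nbytes)
        except asyncio.IncompleteReadError as e:
            b = e.partial  # shorter final block at end of stream
        if not b:
            break
        yield np.frombuffer(b, dtype=np.int16).reshape(-1, channels)
        if len(b) < nbytes:
            break

# usage inside convert_and_process():
#   async for x in block_reader(stdout):
#       do_your_thing(x)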
Related
I want to convert an ogg byte array/bytes (Opus codec) to a wav byte array/bytes without saving to disk. I have downloaded audio via the Telegram API and it is in byte array format with the .ogg extension. I do not want to save it to the filesystem, to eliminate filesystem I/O latency.
Currently, what I am doing is saving the audio file in .ogg format using the code below with the Telegram API (for reference: https://docs.python-telegram-bot.org/en/stable/telegram.file.html#telegram.File.download_to_drive):
# listen for audio messages
async def audio(update, context):
    newFile = await context.bot.get_file(update.message.voice.file_id)
    await newFile.download_to_drive(output_path)
I am using the code
subprocess.call(["ffmpeg", "-i", output_path, output_path.replace(".ogg", ".wav"), '-y'], stderr=subprocess.DEVNULL, stdout=subprocess.DEVNULL)
to convert the ogg file to a wav file. But this is not what I want.
I want the code
async def audio(update, context):
    newFile = await context.bot.get_file(update.message.voice.file_id)
    byte_array = await newFile.download_as_bytearray()
to get byte_array, and now I want this byte_array to be converted to wav without saving to disk and without using ffmpeg. Let me know in the comments if something is unclear. Thanks!
Note: I have set up a Telegram bot on the backend which listens for audio sent to a private chat, which I do manually for testing purposes.
We may write the OGG data to the FFmpeg stdin pipe, and read the encoded WAV data from the FFmpeg stdout pipe.
My following answer describes how to do it with video, and we may apply the same solution to audio.
The example assumes that the OGG data has already been downloaded and is stored in a bytes array (in RAM).
Piping architecture:
 --------------------  Encoded     ---------   Encoded       ------------
| Input OGG encoded  | OGG data   | FFmpeg  |  WAV data     | Store to   |
| stream             | ---------> | process | ------------> | BytesIO    |
 --------------------  stdin PIPE  ---------  stdout PIPE    ------------
The implementation is equivalent to the following shell command:
Linux: cat input.ogg | ffmpeg -y -f ogg -i pipe: -f wav pipe: > test.wav
Windows: type input.ogg | ffmpeg -y -f ogg -i pipe: -f wav pipe: > test.wav
The example uses the ffmpeg-python module, but it's just a thin binding to the FFmpeg sub-process (the FFmpeg CLI must be installed and present in the execution path).
Execute FFmpeg sub-process with stdin pipe as input and stdout pipe as output:
ffmpeg_process = (
    ffmpeg
    .input('pipe:', format='ogg')
    .output('pipe:', format='wav')
    .run_async(pipe_stdin=True, pipe_stdout=True)
)
The input format is set to ogg, the output format is set to wav (using default encoding parameters).
Assuming the audio file is relatively large, we can't write the entire OGG data at once, because doing so (without "draining" stdout pipe) causes the program execution to halt.
We may have to write the OGG data (in chunks) in a separate thread, and read the encoded data in the main thread.
Here is a sample for the "writer" thread:
def writer(ffmpeg_proc, ogg_bytes_arr):
    chunk_size = 1024  # Define chunk size of 1024 bytes (the exact size is not important).
    n_chunks = len(ogg_bytes_arr) // chunk_size  # Number of chunks (without the smaller remainder chunk at the end).
    remainder_size = len(ogg_bytes_arr) % chunk_size  # Remainder bytes (assume total size is not a multiple of chunk_size).

    for i in range(n_chunks):
        ffmpeg_proc.stdin.write(ogg_bytes_arr[i*chunk_size:(i+1)*chunk_size])  # Write a chunk of data bytes to the stdin pipe of the FFmpeg sub-process.

    if remainder_size > 0:
        ffmpeg_proc.stdin.write(ogg_bytes_arr[chunk_size*n_chunks:])  # Write the remaining data bytes to the stdin pipe of the FFmpeg sub-process.

    ffmpeg_proc.stdin.close()  # Close the stdin pipe - closing stdin finishes encoding the data and closes the FFmpeg sub-process.
The "writer thread" writes the OGG data in small chucks.
The last chunk is smaller (assume the length is not a multiple of chuck size).
At the end, stdin pipe is closed.
Closing stdin finish encoding the data, and closes FFmpeg sub-process.
In the main thread, we start the writer thread and read the encoded WAV data from the stdout pipe (in chunks):
thread = threading.Thread(target=writer, args=(ffmpeg_process, ogg_bytes_array))
thread.start()

while thread.is_alive():
    wav_chunk = ffmpeg_process.stdout.read(1024)  # Read chunk with arbitrary size from stdout pipe
    out_stream.write(wav_chunk)  # Write the encoded chunk to the "in-memory file".
For reading the remaining data, we may use ffmpeg_process.communicate():
# Read the last encoded chunk.
wav_chunk = ffmpeg_process.communicate()[0]
out_stream.write(wav_chunk) # Write the encoded chunk to the "in-memory file".
Complete code sample:
import ffmpeg
import base64
from io import BytesIO
import threading
async def download_audio(update, context):
    # This method is not used here - we read the audio from a file instead (just for testing).
    newFile = await context.bot.get_file(update.message.voice.file_id)
    bytes_array = await newFile.download_as_bytearray()
    return bytes_array
# Equivalent Linux shell command:
# cat input.ogg | ffmpeg -y -f ogg -i pipe: -f wav pipe: > test.wav
# Equivalent Windows shell command:
# type input.ogg | ffmpeg -y -f ogg -i pipe: -f wav pipe: > test.wav
# Writer thread - write the OGG data to the FFmpeg stdin pipe in small chunks of 1 KByte.
def writer(ffmpeg_proc, ogg_bytes_arr):
    chunk_size = 1024  # Define chunk size of 1024 bytes (the exact size is not important).
    n_chunks = len(ogg_bytes_arr) // chunk_size  # Number of chunks (without the smaller remainder chunk at the end).
    remainder_size = len(ogg_bytes_arr) % chunk_size  # Remainder bytes (assume total size is not a multiple of chunk_size).

    for i in range(n_chunks):
        ffmpeg_proc.stdin.write(ogg_bytes_arr[i*chunk_size:(i+1)*chunk_size])  # Write a chunk of data bytes to the stdin pipe of the FFmpeg sub-process.

    if remainder_size > 0:
        ffmpeg_proc.stdin.write(ogg_bytes_arr[chunk_size*n_chunks:])  # Write the remaining data bytes to the stdin pipe of the FFmpeg sub-process.

    ffmpeg_proc.stdin.close()  # Close the stdin pipe - closing stdin finishes encoding the data and closes the FFmpeg sub-process.
if False:
    # We may assume that ogg_bytes_array is the output of the download_audio method
    ogg_bytes_array = download_audio(update, context)
else:
    # The example reads the OGG data from a file instead (just for testing).
    with open('input.ogg', 'rb') as f:
        ogg_bytes_array = f.read()
# Execute FFmpeg sub-process with stdin pipe as input and stdout pipe as output.
ffmpeg_process = (
    ffmpeg
    .input('pipe:', format='ogg')
    .output('pipe:', format='wav')
    .run_async(pipe_stdin=True, pipe_stdout=True)
)
# Open in-memory file for storing the encoded WAV file
out_stream = BytesIO()
# Starting a thread that writes the OGG data in small chunks.
# We need the thread because writing too much data to the stdin pipe at once causes a deadlock.
thread = threading.Thread(target=writer, args=(ffmpeg_process, ogg_bytes_array))
thread.start()
# Read encoded WAV data from stdout pipe of FFmpeg, and write it to out_stream
while thread.is_alive():
    wav_chunk = ffmpeg_process.stdout.read(1024)  # Read chunk with arbitrary size from stdout pipe
    out_stream.write(wav_chunk)  # Write the encoded chunk to the "in-memory file".
# Read the last encoded chunk.
wav_chunk = ffmpeg_process.communicate()[0]
out_stream.write(wav_chunk) # Write the encoded chunk to the "in-memory file".
out_stream.seek(0) # Seek to the beginning of out_stream
ffmpeg_process.wait() # Wait for FFmpeg sub-process to end
# Write out_stream to file - just for testing:
with open('test.wav', 'wb') as f:
    f.write(out_stream.getbuffer())
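For completeness, here is how the flow above might be packaged so the bot handler from the question can call it. This is just a sketch reusing the writer() helper and imports from the sample above; ogg_to_wav_stream is a name I'm introducing, not an existing API:

def ogg_to_wav_stream(ogg_bytes_array):
    """Convert in-memory OGG bytes to an in-memory WAV file (BytesIO)."""
    ffmpeg_process = (
        ffmpeg
        .input('pipe:', format='ogg')
        .output('pipe:', format='wav')
        .run_async(pipe_stdin=True, pipe_stdout=True)
    )

    out_stream = BytesIO()
    thread = threading.Thread(target=writer, args=(ffmpeg_process, ogg_bytes_array))
    thread.start()

    while thread.is_alive():
        out_stream.write(ffmpeg_process.stdout.read(1024))

    out_stream.write(ffmpeg_process.communicate()[0])  # read the last encoded chunk
    ffmpeg_process.wait()
    out_stream.seek(0)
    return out_stream

async def audio(update, context):
    new_file = await context.bot.get_file(update.message.voice.file_id)
    ogg_bytes_array = await new_file.download_as_bytearray()
    wav_stream = ogg_to_wav_stream(ogg_bytes_array)
    # ... pass wav_stream (a file-like object holding WAV data) to whatever consumes it ...

Note that ogg_to_wav_stream blocks while FFmpeg runs; in a busy bot you may want to off-load it with asyncio.to_thread or an executor.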
I am using Python's bz2 module to generate (and compress) a large jsonl file (17 GB after bzip2 compression).
However, when I later try to decompress it using pbzip2, it only seems to use one CPU core for decompression, which is quite slow.
When I compress it with pbzip2, it can leverage multiple cores on decompression. Is there a way to compress within Python in a pbzip2-compatible format?
import bz2, sys
from Queue import Empty
# ...

compressor = bz2.BZ2Compressor(9)
f = open(path, 'a')

try:
    while 1:
        m = queue.get(True, 1*60)
        f.write(compressor.compress(m+"\n"))
except Empty, e:
    pass
except Exception as e:
    traceback.print_exc()
finally:
    sys.stderr.write("flushing")
    f.write(compressor.flush())
    f.close()
A pbzip2 stream is nothing more than the concatenation of multiple bzip2 streams.
An example using the shell:
bzip2 < /usr/share/dict/words > words_x_1.bz2
cat words_x_1.bz2{,,,,,,,,,} > words_x_10.bz2
time bzip2 -d < words_x_10.bz2 > /dev/null
time pbzip2 -d < words_x_10.bz2 > /dev/null
I've never used python's bz2 module, but it should be easy to close/reopen a stream in 'a'ppend mode, every so-many bytes, to get the same result. Note that if BZ2File is constructed from an existing file-like object, closing the BZ2File will not close the underlying stream (which is what you want here).
I haven't measured how many bytes is optimal for chunking, but I would guess every 1-20 megabytes - it definitely needs to be larger than the bzip2 block size (900k) though.
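As an illustration, here is a minimal sketch of that idea, assuming the data arrives as text lines (the stream-size threshold and function name are arbitrary choices of mine). Each flush()/new BZ2Compressor pair ends one bzip2 stream and starts the next, and the concatenation of those streams is exactly what pbzip2 parallelizes over:

import bz2

STREAM_SIZE = 5 * 1024 * 1024  # ~5 MB of uncompressed data per bzip2 stream; tune as needed

def compress_pbzip2_compatible(lines, path):
    with open(path, 'wb') as f:
        compressor = bz2.BZ2Compressor(9)
        pending = 0
        for line in lines:
            data = line.encode() + b'\n'
            f.write(compressor.compress(data))
            pending += len(data)
            if pending >= STREAM_SIZE:
                f.write(compressor.flush())        # finish the current bzip2 stream
                compressor = bz2.BZ2Compressor(9)  # start a new, independent stream
                pending = 0
        f.write(compressor.flush())                # finish the last stream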
Note also that if you record the compressed and uncompressed offsets of each chunk, you can do fairly efficient random access. This is how the dictzip program works, though that is based on gzip.
If you absolutely must use pbzip2 on decompression this won't help you, but the alternative lbzip2 can perform multicore decompression of "normal" .bz2 files, such as those generated by Python's BZ2File or a traditional bzip2 command. This avoids the limitation of pbzip2 you're describing, where it can only achieve parallel decompression if the file is also compressed using pbzip2. See https://lbzip2.org/.
As a bonus, benchmarks suggest lbzip2 is substantially faster than pbzip2, both on decompression (by 30%) and compression (by 40%) while achieving slightly superior compression ratios. Further, its peak RAM usage is less than 50% of the RAM used by pbzip2. See https://vbtechsupport.com/1614/.
My Python script is supposed to write to /dev/xconsole. It works as expected when I am reading from /dev/xconsole, for example with tail -F /dev/xconsole. But if I don't have tail running, my script hangs and waits.
I am opening the file as follows:
xconsole = open('/dev/xconsole', 'w')
and writing to it:
for line in sys.stdin:
    xconsole.write(line)
Why does my script hang, when nobody is reading the output from /dev/xconsole ?
/dev/xconsole is a named pipe: an on-demand FIFO.
When you write to it, the data is held in memory in a fixed-size kernel buffer. If the reading application doesn't consume the data in time, the buffer fills up and the writing application blocks.
To avoid this, interleave writes and reads (write, read, write, and so on) and make sure the buffer never fills up. On Linux the pipe buffer is usually around 64 KB.
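To illustrate the read side, here is a minimal reader sketch (an assumption on my part, not something from the question): as long as a reader like this runs in another process, the writer will not block.

# reader.py - keep this running so the writer never blocks
with open('/dev/xconsole', 'r') as fifo:
    for line in fifo:
        print(line, end='')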
@Vishnudev summarized this nicely already and his answer should be accepted as the correct one. I'll just add the following code to his answer, to resize your FIFO memory buffer:
import fcntl

F_SETPIPE_SZ = 1031
F_GETPIPE_SZ = 1032

fifo_fd = open("/path/to/fifo", "rb")

print(f"fifo buffer size before: {fcntl.fcntl(fifo_fd, F_GETPIPE_SZ)}")
fcntl.fcntl(fifo_fd, F_SETPIPE_SZ, 1000000)
print(f"fifo buffer size after: {fcntl.fcntl(fifo_fd, F_GETPIPE_SZ)}")
I have a program that outputs a huge array of float32 values preceded by a 40-byte header. The program writes to stdout.
I can dump the output into a file, open it in Python, skip the 40 bytes of the header, and load it into numpy using numpy.fromfile(). However, that takes a lot of time.
So what I would like to do is to load the array directly into numpy by reading the stdout of the program that generates it. However I'm having a hard time figuring this out.
Thanks!
You can memory-map the file instead of reading it all. This will take almost no time:
np.memmap(filename, np.float32, offset=40)
Of course actually reading data from the result will take some time, but probably this will be hidden by interleaving the I/O with computation.
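For instance, a minimal usage sketch (the file name and slice size here are placeholders):

import numpy as np

data = np.memmap('program_output.bin', dtype=np.float32, mode='r', offset=40)

# Only the parts of the file you actually touch are read from disk:
chunk = np.array(data[:1_000_000])  # copy just this slice into regular memory
print(chunk.mean())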
If you really don't want the data to ever be written to disk, you could use subprocess.Popen() to run your program with stdout=subprocess.PIPE and pass the resulting stdout file-like object directly to numpy.fromfile().
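A sketch of that idea follows; since numpy.fromfile can be unhappy with non-seekable pipes in some NumPy versions, this version reads the bytes from the pipe explicitly and hands them to numpy.frombuffer instead (the program name is a placeholder):

import subprocess
import numpy as np

proc = subprocess.Popen(['your_program'], stdout=subprocess.PIPE)  # placeholder command

header = proc.stdout.read(40)   # skip the 40-byte header
payload = proc.stdout.read()    # the float32 array, streamed from the pipe (no file on disk)
proc.wait()

arr = np.frombuffer(payload, dtype=np.float32)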
Many thanks to all that responded/commented.
I followed downshift's advice and looked into the link he provides...
And came up with the following:
import subprocess as sp
import numpy as np

nLayers = <Number of layers in program output>
nRows = <Number of rows in layer>
nCols = <Number of columns in layer>
nBytes = <Number of bytes for each value>
noDataValue = <Value used to code no data in program output>
DataType = <appropriate numpy data type for values>

perLayer = nRows*nCols*nBytes

proc = sp.Popen(cmd+args, stdout=sp.PIPE, shell=True)

data = bytearray()
for i in range(0, nLayers):
    dump40 = proc.stdout.read(40)  # skip the 40-byte header of each layer
    data = data + bytearray(proc.stdout.read(perLayer))

ndata = np.frombuffer(data, dtype=DataType)
ndata[ndata == noDataValue] = np.nan
ndata.shape = (nLayers, nRows, nCols)
The key here is using numpy.frombuffer, which creates the ndarray on top of the same read buffer and thus avoids duplicating the data in memory.
I need to parse the output produced by an external program (third party, I have no control over it) which produces large amounts of data. Since the size of the output greatly exceeds the available memory, I would like to parse the output while the process is running and remove the already-processed data from memory.
So far I do something like this:
import subprocess
p_pre = subprocess.Popen("preprocessor",stdout = subprocess.PIPE)
# preprocessor is an external bash script that produces the input for the third-party software
p_3party = subprocess.Popen("thirdparty",stdin = p_pre.stdout, stdout = subprocess.PIPE)
(data_to_parse,can_be_thrown) = p_3party.communicate()
parsed_data = myparser(data_to_parse)
When "thirdparty" output is small enough, this approach works. But as stated in the Python documentation:
The data read is buffered in memory, so do not use this method if the data size is large or unlimited.
I think a better approach (that could actually save me some time) would be to start processing data_to_parse while it is being produced, and once a piece has been parsed correctly, "clear" data_to_parse by removing the data that has already been parsed.
I have also tried to use a for loop like:
parsed_data = []
for i in p_3party.stdout:
    parsed_data.append(myparser(i))
but it gets stuck and I can't understand why.
So I would like to know: what is the best approach to accomplish this? What issues should I be aware of?
You can use subprocess.Popen() to create a stream from which you read lines.
import subprocess

stream = subprocess.Popen(cmd, stdout=subprocess.PIPE).stdout

for line in stream:
    # parse lines as you receive them
    print(line)
You could pass the lines to your myparser() method, or append them to a list until you are ready to use them - whatever fits your case.
In your case, using two sub-processes, it would work something like this:
import subprocess

def method(stream, retries=3):
    while retries > 0:
        line = stream.readline()
        if line:
            yield line
        else:
            retries -= 1

pre_stream = subprocess.Popen(cmd, stdout=subprocess.PIPE).stdout
stream = subprocess.Popen(cmd, stdin=pre_stream, stdout=subprocess.PIPE).stdout

parsed_data = []
for parsed in method(stream):
    # do what you want with the parsed data.
    parsed_data.append(parsed)
Iterating over a file as in for i in p_3party.stdout: uses a read-ahead buffer. The readline() method may be more reliable with a pipe -- AFAIK it reads character by character.
while True:
    line = p_3party.stdout.readline()
    if not line:
        break
    parsed_data.append(myparser(line))