I am trying to use pyAudioAnalysis to analyse audio in real time from an HTTP stream. My goal is to use the Zero Crossing Rate (ZCR) and other methods in this library to identify events in the stream.
pyAudioAnalysis only supports input from a file, but converting an HTTP stream to a .wav file would create a large overhead and temporary-file management that I would like to avoid.
My method is as follows:
Using ffmpeg I was able to get the raw audio bytes into a subprocess pipe.
import subprocess

song = subprocess.Popen(
    ["ffmpeg", "-i", "https://media-url/example", "-acodec", "pcm_s16le",
     "-ac", "1", "-f", "wav", "pipe:1"],
    stdout=subprocess.PIPE)
I then buffered this data using PyAudio, with the hope of being able to use the bytes in pyAudioAnalysis.
import pyaudio

CHUNK = 65536
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=44100,
                output=True)

data = song.stdout.read(CHUNK)
while len(data) > 0:
    stream.write(data)
    data = song.stdout.read(CHUNK)
However, feeding this data into AudioBasicIO.read_audio_generic() produces an empty numpy array.
Is there a valid solution to this problem without temporary file creation?
You can try my ffmpegio package:
pip install ffmpegio
import ffmpegio
# read entire stream
fs, x = ffmpegio.audio.read("https://media-url/example", ac=1, sample_fmt='s16')
# fs - sampling rate
# x - [nx1] numpy array
# or read a block at a time:
with ffmpegio.open("https://media-url/example", "ra", blocksize=1024, ac=1, sample_fmt='s16') as f:
    fs = f.rate
    for x in f:
        # x: [1024x1] numpy array (or shorter for the last block)
        process_data(x)
Note that if you need normalized samples, you can set sample_fmt to 'flt' or 'dbl'.
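For instance, a normalized read might look like this (same placeholder URL as above):
fs, x = ffmpegio.audio.read("https://media-url/example", ac=1, sample_fmt='dbl')
# x is float64, scaled to [-1.0, 1.0]; no manual division by 32768 needed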
If you prefer to keep dependencies low, the key when calling the ffmpeg subprocess is to use a raw output format:
import subprocess as sp
import numpy as np
song = sp.Popen(["ffmpeg", "-i", "https://media-url/example", "-f", "s16le",
                 "-c:a", "pcm_s16le", "-ac", "1", "pipe:1"], stdout=sp.PIPE)

CHUNK = 65536
n = CHUNK // 2  # 2 bytes/sample
data = np.frombuffer(song.stdout.read(CHUNK), np.int16)
while len(data) > 0:
    # process the int16 samples in `data` here
    data = np.frombuffer(song.stdout.read(CHUNK), np.int16)
I cannot speak for pyAudioAnalysis, but I suspect it expects samples and not bytes.
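If it does, a rough sketch of the bridge might look like this. It assumes pyAudioAnalysis exposes ShortTermFeatures.feature_extraction(signal, fs, window, step) as in recent releases (older versions name the module audioFeatureExtraction), and it reuses the `song` pipe from the snippet above; treat it as a starting point, not a tested recipe:

import numpy as np
from pyAudioAnalysis import ShortTermFeatures  # assumed module name; check your installed version

FS = 44100
CHUNK = 65536  # bytes -> 32768 int16 samples

data = song.stdout.read(CHUNK)
while len(data) > 0:
    samples = np.frombuffer(data, dtype=np.int16).astype(np.float64)
    # 50 ms windows with 25 ms steps (both in samples); 'zcr' is one of the returned features
    feats, feat_names = ShortTermFeatures.feature_extraction(
        samples, FS, int(0.050 * FS), int(0.025 * FS))
    zcr = feats[feat_names.index('zcr'), :]
    # ... event detection on zcr goes here ...
    data = song.stdout.read(CHUNK)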
Hello Stack community,
I'm reading frames from an IP-camera stream and storing them in a list to later on create a video file.
I'm using the Python OpenCV library and it works well, but...
The frames sent from the IP camera should be h264-compressed, but when I check the size of the frames they are about 25 MB each for a 4K stream (a raw BGR frame at 3840 × 2160 × 3 bytes is roughly 25 MB, so the frames appear to be decoded already). I run out of memory quickly.
This is not the code, but similar to it:
import cv2
cap = cv2.VideoCapture(0)
frames = []
while cap.isOpened():
    ret, frame = cap.read()
    if ret == True:
        frame = cv2.flip(frame, 0)
        frames.append(frame)
    else:
        break
cap.release()

out = cv2.VideoWriter('output.avi', -1, 20.0, (640, 480))
for frm in frames:
    out.write(frm)
out.release()
cv2.destroyAllWindows()
It seems like ret, frame = cap.read() unpacks (decodes) the frame?
This generates extra processing on each loop iteration and is unnecessary for my intentions with the script. Is there a way to retrieve frames without unpacking them?
Sorry in advance for my probable ignorance.
I built a test sample for reading h264 stream into memory using ffmpeg-python.
The sample reads the data from a file (I don't have a camera for testing it).
I also tested the code reading from RTSP stream.
Here is the code (please read the comments):
import ffmpeg
import threading
import io
import subprocess as sp  # needed for sp.TimeoutExpired below
in_filename = 'test_vid.264' # Input file for testing (".264" or ".h264" is a convention for elementary h264 video stream file)
## Build synthetic video, for testing:
################################################
# ffmpeg -y -r 10 -f lavfi -i testsrc=size=192x108:rate=1 -c:v libx264 -crf 23 -t 50 test_vid.264
width, height = 192, 108
(
    ffmpeg
    .input('testsrc=size={}x{}:rate=1'.format(width, height), f='lavfi')
    .output(in_filename, vcodec='libx264', crf=23, t=50)
    .overwrite_output()
    .run()
)
################################################
# Use ffprobe to get video frames resolution
###############################################
# p = ffmpeg.probe(in_filename, select_streams='v');
# width = p['streams'][0]['width']
# height = p['streams'][0]['height']
###############################################
# Stream the video as array of bytes (simulate the stream from the camera for testing)
###############################################
## https://github.com/kkroening/ffmpeg-python/blob/master/examples/README.md
#sreaming_process = (
# ffmpeg
# .input(in_filename)
# .video # Video only (no audio).
# .output('pipe:', format='h264')
# .run_async(pipe_stdout=True) # Run asynchronous, and stream to stdout
#)
###############################################
# Read from stdout in chunks of 16K bytes
def reader():
    chunk_len_in_byte = 16384  # I don't know what is the optimal chunk size

    # Read until the number of bytes read is less than chunk_len_in_byte
    # Also stop after 10000 chunks (just for testing)
    chunks_counter = 0
    while chunks_counter < 10000:
        in_bytes = process.stdout.read(chunk_len_in_byte)  # Read 16KBytes from PIPE.
        stream.write(in_bytes)  # Write data to in-memory bytes stream
        chunks_counter += 1

        if len(in_bytes) < chunk_len_in_byte:
            break
# Use public RTSP Streaming for testing
# in_stream = "rtsp://wowzaec2demo.streamlock.net/vod/mp4:BigBuckBunny_115k.mov"
# Execute ffmpeg as asynchronous sub-process.
# The input is in_filename, and the output is a PIPE.
# Note: you should replace the input from file to camera (I might have forgotten an argument that tells ffmpeg to expect an h264 input stream).
process = (
    ffmpeg
    .input(in_filename)  # .input(in_stream)
    .video
    .output('pipe:', format='h264')
    .run_async(pipe_stdin=True, pipe_stdout=True)
)
# Open In-memory bytes streams
stream = io.BytesIO()
thread = threading.Thread(target=reader)
thread.start()
# Join thread, and wait for processes to end.
thread.join()
try:
    process.wait(timeout=5)
except sp.TimeoutExpired:
    process.kill()  # Kill subprocess in case of a timeout (there might be a timeout because input stream still lives).
#sreaming_process.wait() # sreaming_process is used
stream.seek(0) #Seek to beginning of stream.
# Write result to "in_vid.264" file for testing (the file is playable).
with open("in_vid.264", "wb") as f:
    f.write(stream.getvalue())
In case you find it useful, I may add some more background descriptions before the code.
Please let me know if the code is working with a camera, and what you had to modify.
I'm trying to capture audio that is playing internally. Currently, my script has a more hardware-based solution: the audio outputs through the auxiliary out, which connects to a USB auxiliary line-in adapter. For simplicity, it would be much better to record the audio internally rather than having to use hardware to loop the audio signal back into itself.
My relevant code:
def encode(**kwarg):
    global audio
    print('Encoding: ' + str(kwarg))
    # encoding algorithm goes here.
    writeSaveFlag = False
    # self.queueEncoding()
    print('process encode')
    # create pyaudio stream
    stream = False
    while not stream:
        try:
            stream = audio.open(format=kwarg['resolution'],
                                rate=kwarg['sampleRate'],
                                channels=kwarg['channels'],
                                input_device_index=kwarg['deviceIndex'],
                                input=True,
                                frames_per_buffer=kwarg['chunk'])
        except:
            audio.terminate()
            audio = pyaudio.PyAudio()
            self.rewindSong()
    t = Timer(songDuration - encodingDelayTolarance, checkStatus,
              kwargs={'currentSong': kwarg['currentSong'], 'tolerance': kwarg['tolerance']})
    t.start()
    startTime = time.time()
    playFlag = False
    print("recording")
    frames = []
    # loop through stream and append audio chunks to frame array
    for ii in range(0, int((kwarg['sampleRate'] / kwarg['chunk']) * kwarg['encodeDuration'])):
        # if time.time() - startTime > 2000 and playFlag == False:
        #     self.play()
        data = stream.read(kwarg['chunk'])
        frames.append(data)
    print("finished recording")
    stream.stop_stream()
    stream.close()
    # save the audio frames as .wav file
    wavefile = wave.open(saveFilePath + kwarg['outputFileName'] + '.wav', 'wb')
    wavefile.setnchannels(kwarg['channels'])
    wavefile.setsampwidth(audio.get_sample_size(kwarg['resolution']))
    wavefile.setframerate(kwarg['sampleRate'])
    wavefile.writeframes(b''.join(frames))
    wavefile.close()
    processEncode(trackID=kwarg['trackID'])
    # clear memory
    gc.collect()
    # create a new instance for next recording
    self.queueEncoding()
I found this related question, but the only answer posted suggests looping the audio as I already have. Would it be better to use an alternative library for this internal recording functionality? Does ALSA recognize the internal audio as an audio device? Does PyAudio recognize non-physical audio devices such as an internal audio stream?
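One quick way to check the last question empirically is to enumerate the input devices PyAudio can see. This is only a sketch, assuming a Linux/PulseAudio setup where internal playback is exposed as a "monitor" source; on other systems the device naming (and whether a loopback device exists at all) will differ:

import pyaudio

p = pyaudio.PyAudio()
for i in range(p.get_device_count()):
    info = p.get_device_info_by_index(i)
    if info.get('maxInputChannels', 0) > 0:
        # a PulseAudio "monitor" source captures whatever is being played back
        print(i, info['name'], info['maxInputChannels'])
p.terminate()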
I'm trying to read a twitch stream via streamlink (https://streamlink.github.io/api_guide.html)
into OpenCV for further processing in Python.
What works: reading the stream into a stream.ts file via Popen and then into OpenCV:
import subprocess
import os
import time
import cv2

def create_new_streaming_file(stream_filename="stream0", stream_link="https://www.twitch.tv/tsm_viss"):
    try:
        os.remove('./Engine/streaming_util/' + stream_filename + '.ts')
    except OSError:
        pass
    cmd = "streamlink --force --output ./Engine/streaming_util/" + stream_filename + ".ts " + stream_link + " best"
    subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

create_new_streaming_file()
video_capture = cv2.VideoCapture('./Engine/streaming_util/stream0.ts')
This is very slow, and the stream stops after about 30 seconds.
I would like to read the byte stream directly into OpenCV via streamlink's Python API.
What works is printing out the latest n bytes of a stream into the console:
import streamlink
streams = streamlink.streams("https://www.twitch.tv/grimmmz")
stream = streams["best"]
fd = stream.open()
while True:
    data = fd.read(1024)
    print(data)
I'm looking for something like this (does not work but you'll get the concept):
streams = streamlink.streams("https://www.twitch.tv/grimmmz")
stream = streams["best"]
fd = stream.open()

bytes = ''
while True:
    # to read mjpeg frame -
    bytes += fd.read(1024)
    a = bytes.find('\xff\xd8')
    b = bytes.find('\xff\xd9')
    if a != -1 and b != -1:
        jpg = bytes[a:b+2]
        bytes = bytes[b+2:]
        img = cv2.imdecode(np.fromstring(jpg, dtype=np.uint8), cv2.CV_LOAD_IMAGE_COLOR)
        cv2.imwrite('messigray.png', img)
        cv2.imshow('cam2', img)
    else:
        continue
Thanks a lot in advance!
It was quite tricky to accomplish with reasonable performance, though.
Check out the project: https://github.com/DanielTea/rage-analytics/blob/master/README.md
The main file is realtime_Videostreamer.py in the engine folder. If you initialize this object, it creates an ffmpeg subprocess and fills a queue with video frames in an extra thread. This architecture prevents the main thread from blocking, so depending on your network speed and CPU power, a couple of streams can be analyzed in parallel (a minimal sketch of this pattern follows below).
This solution works with twitch streams very well. Didn’t try out other streaming sites.
More info about this project.
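For reference, here is a minimal sketch of that subprocess-plus-queue pattern. It is not the project's realtime_Videostreamer.py; the output resolution, chunk size, and the way streamlink is fed into ffmpeg are assumptions you would need to adapt:

import subprocess, threading, queue
import numpy as np
import streamlink

WIDTH, HEIGHT = 1280, 720  # assumed output resolution; probe the real stream for correct values

fd = streamlink.streams("https://www.twitch.tv/grimmmz")["best"].open()

# ffmpeg reads the muxed stream on stdin and writes raw BGR frames to stdout
proc = subprocess.Popen(
    ["ffmpeg", "-i", "pipe:0", "-loglevel", "quiet",
     "-f", "rawvideo", "-pix_fmt", "bgr24",
     "-s", "{}x{}".format(WIDTH, HEIGHT), "pipe:1"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)

def feeder():
    # copy bytes from streamlink into ffmpeg
    while True:
        chunk = fd.read(65536)
        if not chunk:
            break
        proc.stdin.write(chunk)
    proc.stdin.close()

frames = queue.Queue(maxsize=64)

def reader():
    # collect decoded frames in an extra thread so the main thread never blocks on the network
    frame_bytes = WIDTH * HEIGHT * 3
    while True:
        buf = proc.stdout.read(frame_bytes)
        if len(buf) < frame_bytes:
            break
        frames.put(np.frombuffer(buf, np.uint8).reshape(HEIGHT, WIDTH, 3))

threading.Thread(target=feeder, daemon=True).start()
threading.Thread(target=reader, daemon=True).start()

frame = frames.get()  # HEIGHT x WIDTH x 3 BGR array, ready for cv2.imshow etc.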
I'm working on an experiment concerned with spatial sound perception. In this experiment, different sounds should be simultaneously presented from up to eight speakers. For this purpose, I would like to create Python code for OS X (10.10.5) that can read from a multichannel sound file and send each channel of this sound file to a designated speaker (via an appropriate hardware device).
I came across a rather convenient solution for the "present from multiple speakers"-part of the problem: Following this post, it can be easily done for mono/stereo files by adding more entries to the channel_map in PyAudio. The (slightly modified) code looks like this:
import pyaudio
import wave
import sys
chunk = 4096
PyAudio = pyaudio.PyAudio
if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)
wf = wave.open(sys.argv[1], 'rb')
p = PyAudio()
channel_map = (0, 1, -1, -1, -1, -1, -1, -1)
try:
    stream_info = pyaudio.PaMacCoreStreamInfo(
        flags=pyaudio.PaMacCoreStreamInfo.paMacCorePlayNice,  # default
        channel_map=channel_map)
except AttributeError:
    print("Sorry, couldn't find PaMacCoreStreamInfo. Make sure that "
          "you're running on Mac OS X.")
    sys.exit(-1)
print("Stream Info Flags:", stream_info.get_flags())
print("Stream Info Channel Map:", stream_info.get_channel_map())
print("channels",wf.getnchannels())
print('sample width',wf.getsampwidth())
stream = p.open(
    format=p.get_format_from_width(wf.getsampwidth()),
    channels=wf.getnchannels(),
    rate=wf.getframerate(),
    output=True,
    output_host_api_specific_stream_info=stream_info)
data = wf.readframes(chunk)
while len(data) > 0:
    stream.write(data)
    data = wf.readframes(chunk)
stream.stop_stream()
stream.close()
p.terminate()
However, I wonder whether this also works with multichannel sound files in PyAudio. Is it possible to read from a multichannel (i.e., 8-channel) sound file and to send specific channels to different output devices (e.g. speakers) with PyAudio?
If yes, can someone provide an example of how it can be done? I don't mind digging some more into libraries/modules, but providing an example for the code in question would help me (as a novice) a lot.
If PyAudio is not the right choice, I would really appreciate any further recommendations/ideas/comments on how it can be done.
Thanks a lot!
Malte
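A minimal sketch of how the channel_map might be extended for an 8-channel file played through a single 8-channel output device. This follows the pattern of the stereo example above and is untested speculation, not a confirmed multichannel setup; routing channels to several separate devices would presumably require one stream per device:

# One entry per device output channel, following the stereo example above:
# the value is the file channel routed to that output, -1 mutes that output.
channel_map = (0, 1, 2, 3, 4, 5, 6, 7)

stream_info = pyaudio.PaMacCoreStreamInfo(
    flags=pyaudio.PaMacCoreStreamInfo.paMacCorePlayNice,
    channel_map=channel_map)

stream = p.open(
    format=p.get_format_from_width(wf.getsampwidth()),
    channels=wf.getnchannels(),  # 8 for an 8-channel wav
    rate=wf.getframerate(),
    output=True,
    output_host_api_specific_stream_info=stream_info)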
I have been pulling my hair out trying to get a proxy working. I need to decrypt the packets from a server and client (this may be out of order...), then decompress everything but the packet header.
The first 2 packets (10101 and 20104) are not compressed, and decrypt, destruct, and decompile properly.
Alas, to no avail: zlib.error: Error -5 while decompressing data: incomplete or truncated stream
I get the same error when attempting to decompress the encrypted version of the packet.
When I include the packet header, I get a -3 error instead.
I have also tried changing -zlib.MAX_WBITS to zlib.MAX_WBITS, as well as a few others, but still get the same error.
Here's the code:
import socket, sys, os, struct, zlib
from Crypto.Cipher import ARC4 as rc4
cwd = os.getcwd()
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ss = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('192.168.2.12',9339))
s.listen(1)
client, addr = s.accept()
key = "fhsd6f86f67rt8fw78fw789we78r9789wer6renonce"
cts = rc4.new(key)
stc = rc4.new(key)
skip = 'a'*len(key)
cts.encrypt(skip)
stc.encrypt(skip)
ss.connect(('game.boombeachgame.com',9339))
ss.settimeout(0.25)
s.settimeout(0.25)
def io():
    while True:
        try:
            pack = client.recv(65536)
            decpack = cts.decrypt(pack[7:])
            msgid, paylen = dechead(pack)
            if msgid != 10101:
                decopack = zlib.decompress(decpack, -zlib.MAX_WBITS)
            print "ID:", msgid
            print "Payload Length", paylen
            print "Payload:\n", decpack
            ss.send(pack)
            dump(msgid, decpack)
        except socket.timeout:
            pass
        try:
            pack = ss.recv(65536)
            msgid, paylen = dechead(pack)
            decpack = stc.decrypt(pack[7:])
            if msgid != 20104:
                decopack = zlib.decompress(decpack, -zlib.MAX_WBITS)
            print "ID:", msgid
            print "Payload Length", paylen
            print "Payload:\n", decpack
            client.send(pack)
            dump(msgid, decpack)
        except socket.timeout:
            pass

def dump(msgid, decpack):
    global cwd
    pdf = open(cwd + "/" + str(msgid) + ".bin", 'wb')
    pdf.write(decpack)
    pdf.close()

def dechead(pack):
    msgid = struct.unpack('>H', pack[0:2])[0]
    print int(struct.unpack('>H', pack[5:7])[0])
    payload_bytes = struct.unpack('BBB', pack[2:5])
    payload_len = ((payload_bytes[0] & 255) << 16) | ((payload_bytes[1] & 255) << 8) | (payload_bytes[2] & 255)
    return msgid, payload_len

io()
I realize it's messy, disorganized and very bad, but it all works as intended minus the decompression.
Yes, I am sure the packets are zlib compressed.
What is going wrong here and why?
Full Traceback:
Traceback (most recent call last):
File "bbproxy.py", line 68, in <module>
io()
File "bbproxy.py", line 33, in io
decopack = zlib.decompress(decpack, zlib.MAX_WBITS)
zlib.error: Error -5 while decompressing data: incomplete or truncated stream
I ran into the same problem while trying to decompress a file using zlib with Python 2.7. The issue had to do with the size of the stream (or file input) exceeding the size that could be stored in memory. (My PC has 16 GB of memory, so it was not exceeding the physical memory size, but the buffer default size is 16384.)
The easiest fix was to change the code from:
import zlib
f_in = open('my_data.zz', 'rb')
comp_data = f_in.read()
data = zlib.decompress(comp_data)
To:
import zlib
f_in = open('my_data.zz', 'rb')
comp_data = f_in.read()
zobj = zlib.decompressobj() # obj for decompressing data streams that won’t fit into memory at once.
data = zobj.decompress(comp_data)
It handles the stream by buffering it and feeding it into the decompressor in manageable chunks.
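In the same spirit, here is a sketch of fully chunked decompression, reading the file piece by piece instead of loading it all at once (the 16 KB chunk size and the 'my_data.zz'/'my_data.out' filenames are just placeholders):

import zlib

zobj = zlib.decompressobj()
with open('my_data.zz', 'rb') as f_in, open('my_data.out', 'wb') as f_out:
    while True:
        chunk = f_in.read(16384)
        if not chunk:
            break
        f_out.write(zobj.decompress(chunk))
    f_out.write(zobj.flush())  # emit anything still buffered inside the decompressor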
I hope this helps to save you time trying to figure out the problem. I had help from my friend Jordan! I was trying all kinds of different window sizes (wbits).
Edit: Even with the approach below working on partial gz files, for some files I got an empty byte array when I decompressed, and everything I tried kept returning empty even though the function reported success. Eventually I resorted to running a gunzip process, which always works:
import subprocess

def gunzip_string(the_string):
    proc = subprocess.Popen('gunzip', stdout=subprocess.PIPE,
                            stdin=subprocess.PIPE, stderr=subprocess.DEVNULL)
    proc.stdin.write(the_string)
    proc.stdin.close()
    body = proc.stdout.read()
    proc.wait()
    return body
Note that gunzip can exit with a non-zero code indicating that the input string is incomplete, yet it still performs the decompression; that is why stderr is swallowed above. You may wish to check for errors to allow for this case, as in the sketch below.
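For instance, a small variant (not from the original answer) that keeps stderr and the exit code so you can decide whether a truncated input is acceptable:

import subprocess

def gunzip_string_checked(the_string):
    proc = subprocess.Popen(['gunzip'], stdout=subprocess.PIPE,
                            stdin=subprocess.PIPE, stderr=subprocess.PIPE)
    body, err = proc.communicate(the_string)
    if proc.returncode != 0:
        # non-zero often just means truncated/trailing input;
        # the decompressed prefix is still in `body`
        print('gunzip exited with', proc.returncode, err.decode(errors='replace'))
    return body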
/edit
I think the zlib decompression library is throwing an exception because you are not passing in a complete stream, just a 65536-byte chunk from ss.recv(65536). If you change from this:
decopack = zlib.decompress(decpack, -zlib.MAX_WBITS)
to
decompressor = zlib.decompressobj(-zlib.MAX_WBITS)
decopack = decompressor.decompress(decpack)
it should work, as that approach can handle streaming data.
As the docs say:
zlib.decompressobj - Returns a decompression object, to be used for decompressing data streams that won’t fit into memory at once.
or even if it does fit into memory, you might just want to process the beginning of the file.
Try this:
decopack = zlib.decompressobj(zlib.MAX_WBITS).decompress(decpack)