Capture first image from H.264 video streaming using websocket - Python

I'm trying to capture a single image from an H.264 video stream on my Raspberry Pi. The stream is produced by raspivid and sent over a websocket. However, I cannot show a correct image in imshow(). I also tried to set .reshape(), but got ValueError: cannot reshape array of size 3607 into shape (480,640,3).
On the client side, I successfully connect to the video stream and get the incoming bytes. The server uses raspivid-broadcaster for video streaming. I guess the first bytes received can be decoded into an image? So, I wrote the following code.
async def get_image_from_h264_streaming():
    uri = "ws://127.0.0.1:8080"
    async with websockets.connect(uri) as websocket:
        frame = json.loads(await websocket.recv())
        print(frame)
        width, height = frame["width"], frame["height"]

        response = await websocket.recv()
        print(response)

        # transform the bytes read into a numpy array
        in_frame = (
            numpy
            .frombuffer(response, numpy.uint8)
            # .reshape([height, width, 3])
        )

        # Display the frame
        cv2.imshow('in_frame', in_frame)
        cv2.waitKey(0)

asyncio.get_event_loop().run_until_complete(get_image_from_h264_streaming())
print(frame) shows
{'action': 'init', 'width': 640, 'height': 480}
print(response) shows
b"\x00\x00\x00\x01'B\x80(\x95\xa0(\x0fh\x0..............xfc\x9f\xff\xf9?\xff\xf2\x7f\xff\xe4\x80"
Any suggestions?
---------------------------------- EDIT ----------------------------------
Thanks for this suggestion. Here is my updated code.
def decode(raw_bytes: bytes):
    code_ctx = av.CodecContext.create("h264", "r")
    packets = code_ctx.parse(raw_bytes)
    for i, packet in enumerate(packets):
        frames = code_ctx.decode(packet)
        if frames:
            return frames[0].to_ndarray()

async def save_img():
    async with websockets.connect("ws://127.0.0.1:8080") as websocket:
        image_init = await websocket.recv()
        count = 0
        combined = b''
        while count < 3:
            response = await websocket.recv()
            combined += response
            count += 1
        frame = decode(combined)
        print(frame)
        cv2.imwrite('test.jpg', frame)

asyncio.get_event_loop().run_until_complete(save_img())
print(frame) shows
[[109 109 109 ... 115 97 236]
[109 109 109 ... 115 97 236]
[108 108 108 ... 115 97 236]
...
[111 111 111 ... 101 103 107]
[110 110 110 ... 101 103 107]
[112 112 112 ... 104 106 110]]
Below is the saved image I get. It has the wrong size of 740 (height) x 640 (width); the correct one is 480 (height) x 640 (width). Also, I am not sure why the image is grayscale instead of a color one.
---------------------------------- EDIT 2 ----------------------------------
Below is the main method to send data in raspivid.
raspivid - index.js
const {port, ...raspividOptions} = {...options, profile: 'baseline', timeout: 0};
videoStream = raspivid(raspividOptions)
    .pipe(new Splitter(NALSeparator))
    .pipe(new stream.Transform({
        transform: function (chunk, _encoding, callback){
            ...
            callback();
        }
    }));

videoStream.on('data', (data) => {
    wsServer.clients.forEach((socket) => {
        socket.send(data, {binary: true});
    });
});
stream-split - index.js (A line of code shows the max. size is 1Mb)
class Splitter extends Transform {

    constructor(separator, options) {
        ...
        this.bufferSize = options.bufferSize || 1024 * 1024 * 1; //1Mb
        ...
    }

    _transform(chunk, encoding, next) {
        if (this.offset + chunk.length > this.bufferSize - this.bufferFlush) {
            var minimalLength = this.bufferSize - this.bodyOffset + chunk.length;
            if (this.bufferSize < minimalLength) {
                //console.warn("Increasing buffer size to ", minimalLength);
                this.bufferSize = minimalLength;
            }

            var tmp = new Buffer(this.bufferSize);
            this.buffer.copy(tmp, 0, this.bodyOffset);
            this.buffer = tmp;

            this.offset = this.offset - this.bodyOffset;
            this.bodyOffset = 0;
        }
        ...
    }
};
----------Completed Answer (Thanks Ann and Christoph for the direction)----------
Please see in answer section.

One question: how is the frame/stream transmitted through the websocket? The byte sequence looks like a NAL unit; it could be PPS or SPS, etc. How do you know it is an I-frame, for example? I don't know if cv2.imshow supports raw H.264. Look into PyAV: there you can open raw H.264 bytes and then try to extract one frame out of it :) Let me know if you need help with PyAV. Look at this post,
there is an example of how you can do it.
Update
Based on your comment, you need a way to parse and decode a raw H.264 stream.
Below is a function that gives you an idea of how to do that. You need to pass the bytes you received from the websocket to this function, and be aware that there needs to be enough data to extract one frame.
pip install av
PyAV docs
import av

# Feed in your raw bytes from the socket
def decode(raw_bytes: bytes):
    code_ctx = av.CodecContext.create("h264", "r")
    packets = code_ctx.parse(raw_bytes)
    for i, packet in enumerate(packets):
        frames = code_ctx.decode(packet)
        if frames:
            return frames[0].to_ndarray()
You could also try to read the stream directly with PyAV using av.open("tcp://127.0.0.1:")
Update 2
Could you please test this? The issues you describe in your edit are weird; you don't need a websocket layer, I think you can read directly from raspivid:
raspivid -a 12 -t 0 -w 1280 -h 720 -vf -ih -fps 30 -l -o tcp://0.0.0.0:5000
def get_first_frame(path):
    stream = av.open(path, 'r')
    for packet in stream.demux():
        frames = packet.decode()
        if frames:
            return frames[0].to_ndarray(format='bgr24')

ff = get_first_frame("tcp://0.0.0.0:5000")
cv2.imshow("Video", ff)
cv2.waitKey(0)

The PyAV and Pillow packages are required; there is no need to use cv2 anymore. So, add the packages:
pip3 install av
pip3 install Pillow
Code
import asyncio
import websockets
import av
import PIL

def decode_image(raw_bytes: bytes):
    code_ctx = av.CodecContext.create("h264", "r")
    packets = code_ctx.parse(raw_bytes)
    for i, packet in enumerate(packets):
        frames = code_ctx.decode(packet)
        if frames:
            return frames[0].to_image()

async def save_img_from_streaming():
    uri = "ws://127.0.0.1:8080"
    async with websockets.connect(uri) as websocket:
        image_init = await websocket.recv()
        count = 0
        combined = b''
        while count < 2:
            response = await websocket.recv()
            combined += response
            count += 1
        img = decode_image(combined)
        img.save("img1.png", "PNG")

asyncio.get_event_loop().run_until_complete(save_img_from_streaming())
In Christoph's answer, to_ndarray is suggested, but I found that it somehow results in a grayscale image. This is caused by it returning a numpy array of the wrong form, like [[...], [...], [...], ...]; a color image should be an array like [[[...], [...], [...], ...], ...]. Then I looked at the PyAV docs: there is another method called to_image, which returns an RGB PIL.Image of the frame. Just using that function gives what I need.
Note that the response from await websocket.recv() may be different. It depends on how the server sends the data.
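If you prefer to keep using OpenCV instead of Pillow, a minimal sketch of an alternative is shown below. It uses the same parsing flow as decode_image above but asks PyAV for an explicit packed pixel format via to_ndarray(format="bgr24"), which yields a (height, width, 3) array that cv2.imwrite can save directly. The helper name decode_bgr is illustrative, not from the original post.

import av
import cv2

def decode_bgr(raw_bytes: bytes):
    # Sketch: same flow as decode_image, but return a BGR numpy array instead
    # of a PIL image. format='bgr24' converts the decoded YUV frame to a
    # packed (height, width, 3) array that OpenCV understands.
    codec_ctx = av.CodecContext.create("h264", "r")
    for packet in codec_ctx.parse(raw_bytes):
        frames = codec_ctx.decode(packet)
        if frames:
            return frames[0].to_ndarray(format="bgr24")
    return None

# Usage (assuming `combined` holds enough H.264 bytes for one frame):
# frame = decode_bgr(combined)
# if frame is not None:
#     cv2.imwrite("test.jpg", frame)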

This is a problem I once had when attempting to send numpy images (converted to bytes) through sockets. The problem was that the bytes string was too long.
So instead of sending the entire image at once, I sliced the image so that I had to send, say, 10 slices of the image. Once the other end receives the 10 slices, simply stack them together.
Keep in mind that depending on the size of your images, you may need to slice them more or less to achieve the optimal results (efficiency, no errors).
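A minimal sketch of that idea, assuming the images are numpy arrays and that the transport delivers each slice intact (slice_image and reassemble are illustrative names, not from the original post):

import numpy as np

def slice_image(img: np.ndarray, n_slices: int = 10):
    # Split an image into horizontal bands for sending.
    return np.array_split(img, n_slices, axis=0)

def reassemble(slices):
    # Stack the received bands back into a single image.
    return np.vstack(slices)

# Example round trip:
# img = np.zeros((480, 640, 3), dtype=np.uint8)
# parts = [s.tobytes() for s in slice_image(img)]    # send these over the socket
# shapes = [s.shape for s in slice_image(img)]       # the receiver must know the shapes
# received = [np.frombuffer(b, np.uint8).reshape(shape)
#             for b, shape in zip(parts, shapes)]
# restored = reassemble(received)
# assert (restored == img).all()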

Related

Asyncio, Arduino BLE, and not reading characteristic updates

I have an Arduino 33 BLE that is updating a few bluetooth characteristics with a string representation of BNO055 sensor calibration and quaternion data. On the Arduino side, I see the calibration and quaternion data getting updated in a nice orderly sequence as expected.
I have a Python (3.9) program running on Windows 10 that uses asyncio to subscribe to the characteristics on the Arduino and read the updates. Everything works fine when I have an update rate on the Arduino of 1/second. By "works fine" I mean I see the orderly sequence of updates: quaternion, calibration, quaternion, calibration, .... The problem is that I changed the update rate to 10/second (100 ms delay on the Arduino) and now I am getting, for example, 100 updates for quaternion data but only 50 updates for calibration data, when the number of updates should be equal. Somehow I'm not handling the updates properly on the Python side.
The python code is listed below:
import asyncio
import pandas as pd
from bleak import BleakClient
from bleak import BleakScanner

ardAddress = ''
found = ''
exit_flag = False

temperaturedata = []
timedata = []
calibrationdata = []
quaterniondata = []

# loop: asyncio.AbstractEventLoop

tempServiceUUID = '0000290c-0000-1000-8000-00805f9b34fb'     # Temperature Service UUID on Arduino 33 BLE
stringUUID = '00002a56-0000-1000-8000-00805f9b34fb'          # Characteristic of type String [Write to Arduino]
inttempUUID = '00002a1c-0000-1000-8000-00805f9b34fb'         # Characteristic of type Int [Temperature]
longdateUUID = '00002a08-0000-1000-8000-00805f9b34fb'        # Characteristic of type Long [datetime millis]
strCalibrationUUID = '00002a57-0000-1000-8000-00805f9b34fb'  # Characteristic of type String [BNO055 Calibration]
strQuaternionUUID = '9e6c967a-5a87-49a1-a13f-5a0f96188552'   # Characteristic of type Long [BNO055 Quaternion]

async def scanfordevices():
    devices = await BleakScanner.discover()
    for d in devices:
        print(d)
        if (d.name == 'TemperatureMonitor'):
            global found, ardAddress
            found = True
            print(f'{d.name=}')
            print(f'{d.address=}')
            ardAddress = d.address
            print(f'{d.rssi=}')
            return d.address

async def readtemperaturecharacteristic(client, uuid: str):
    val = await client.read_gatt_char(uuid)
    intval = int.from_bytes(val, byteorder='little')
    print(f'readtemperaturecharacteristic: Value read from: {uuid} is: {val} | as int={intval}')

async def readdatetimecharacteristic(client, uuid: str):
    val = await client.read_gatt_char(uuid)
    intval = int.from_bytes(val, byteorder='little')
    print(f'readdatetimecharacteristic: Value read from: {uuid} is: {val} | as int={intval}')

async def readcalibrationcharacteristic(client, uuid: str):
    # Calibration characteristic is a string
    val = await client.read_gatt_char(uuid)
    strval = val.decode('UTF-8')
    print(f'readcalibrationcharacteristic: Value read from: {uuid} is: {val} | as string={strval}')

async def getservices(client):
    svcs = await client.get_services()
    print("Services:")
    for service in svcs:
        print(service)
        ch = service.characteristics
        for c in ch:
            print(f'\tCharacteristic Desc:{c.description} | UUID:{c.uuid}')

def notification_temperature_handler(sender, data):
    """Simple notification handler which prints the data received."""
    intval = int.from_bytes(data, byteorder='little')
    # TODO: review speed of append vs extend. Extend using iterable but is faster
    temperaturedata.append(intval)
    # print(f'Temperature: Sender: {sender}, and byte data= {data} as an Int={intval}')

def notification_datetime_handler(sender, data):
    """Simple notification handler which prints the data received."""
    intval = int.from_bytes(data, byteorder='little')
    timedata.append(intval)
    # print(f'Datetime: Sender: {sender}, and byte data= {data} as an Int={intval}')

def notification_calibration_handler(sender, data):
    """Simple notification handler which prints the data received."""
    strval = data.decode('UTF-8')
    numlist = extractvaluesaslist(strval, ':')
    # Save to list for processing later
    calibrationdata.append(numlist)
    print(f'Calibration Data: {sender}, and byte data= {data} as a List={numlist}')

def notification_quaternion_handler(sender, data):
    """Simple notification handler which prints the data received."""
    strval = data.decode('UTF-8')
    numlist = extractvaluesaslist(strval, ':')
    # Save to list for processing later
    quaterniondata.append(numlist)
    print(f'Quaternion Data: {sender}, and byte data= {data} as a List={numlist}')

def extractvaluesaslist(raw, separator=':'):
    # Get everything after separator
    s1 = raw.split(sep=separator)[1]
    s2 = s1.split(sep=',')
    return list(map(float, s2))

async def runmain():
    # Based on code from: https://github.com/hbldh/bleak/issues/254
    global exit_flag
    print('runmain: Starting Main Device Scan')
    await scanfordevices()
    print('runmain: Scan is done, checking if found Arduino')
    if found:
        async with BleakClient(ardAddress) as client:
            print('runmain: Getting Service Info')
            await getservices(client)

            # print('runmain: Reading from Characteristics Arduino')
            # await readdatetimecharacteristic(client, uuid=inttempUUID)
            # await readcalibrationcharacteristic(client, uuid=strCalibrationUUID)

            print('runmain: Assign notification callbacks')
            await client.start_notify(inttempUUID, notification_temperature_handler)
            await client.start_notify(longdateUUID, notification_datetime_handler)
            await client.start_notify(strCalibrationUUID, notification_calibration_handler)
            await client.start_notify(strQuaternionUUID, notification_quaternion_handler)

            while not exit_flag:
                await asyncio.sleep(1)

            # TODO: This does nothing. Understand why?
            print('runmain: Stopping notifications.')
            await client.stop_notify(inttempUUID)

            print('runmain: Write to characteristic to let it know we plan to quit.')
            await client.write_gatt_char(stringUUID, 'Stopping'.encode('ascii'))
    else:
        print('runmain: Arduino not found. Check that its on')
    print('runmain: Done.')

def main():
    # get main event loop
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(runmain())
    except KeyboardInterrupt:
        global exit_flag
        print('\tmain: Caught keyboard interrupt in main')
        exit_flag = True
    finally:
        pass

    print('main: Getting all pending tasks')
    # From book Pg 26.
    pending = asyncio.all_tasks(loop=loop)
    print(f'\tmain: number of tasks={len(pending)}')
    for task in pending:
        task.cancel()
    group = asyncio.gather(*pending, return_exceptions=True)
    print('main: Waiting for tasks to complete')
    loop.run_until_complete(group)
    loop.close()

    # Display data recorded in Dataframe
    if len(temperaturedata) == len(timedata):
        print(f'Temperature data len={len(temperaturedata)}, and len of timedata={len(timedata)}')
        df = pd.DataFrame({'datetime': timedata,
                           'temperature': temperaturedata})
        # print(f'dataframe shape={df.shape}')
        # print(df)
        df.to_csv('temperaturedata.csv')
    else:
        print(f'No data or lengths different: temp={len(temperaturedata)}, time={len(timedata)}')

    if len(quaterniondata) == len(calibrationdata):
        print('Processing Quaternion and Calibration Data')
        # Load quaternion data
        dfq = pd.DataFrame(quaterniondata, columns=['time', 'qw', 'qx', 'qy', 'qz'])
        print(f'Quaternion dataframe shape={dfq.shape}')
        # Add datetime millis data
        # dfq.insert(0,'Time',timedata)
        # Load calibration data
        dfcal = pd.DataFrame(calibrationdata, columns=['time', 'syscal', 'gyrocal', 'accelcal', 'magcal'])
        print(f'Calibration dataframe shape={dfcal.shape}')
        # Merge two dataframes together
        dffinal = pd.concat([dfq, dfcal], axis=1)
        dffinal.to_csv('quaternion_and_cal_data.csv')
    else:
        print(f'No data or lengths different. Quat={len(quaterniondata)}, Cal={len(calibrationdata)}')
        if len(quaterniondata) > 0:
            dfq = pd.DataFrame(quaterniondata, columns=['time', 'qw', 'qx', 'qy', 'qz'])
            dfq.to_csv('quaterniononly.csv')
        if len(calibrationdata) > 0:
            dfcal = pd.DataFrame(calibrationdata, columns=['time', 'syscal', 'gyrocal', 'accelcal', 'magcal'])
            dfcal.to_csv('calibrationonly.csv')

    print("main: Done.")

if __name__ == "__main__":
    '''Starting Point of Program'''
    main()
So, my first question is: can anyone help me understand why I do not seem to be getting all the updates in my Python program? I should be seeing notification_quaternion_handler() and notification_calibration_handler() called the same number of times, but I am not. I assume I am not using asyncio properly, but I am at a loss to debug it at this point.
My second question is: are there best practices for receiving relatively high-frequency updates over Bluetooth, for example every 10-20 ms? I am trying to read IMU sensor data and it needs to be done at a fairly high rate.
This is my first attempt at Bluetooth and asyncio, so clearly I have a lot to learn.
Thank you for the help.
Fantastic answer by @ukBaz.
In summary, for others who may have a similar issue:
On the Arduino side I ended up with something like this (important parts only shown):
typedef struct __attribute__ ((packed)) {
    unsigned long timeread;
    int qw;   // float Quaternion values will be scaled to int by multiplying by constant
    int qx;
    int qy;
    int qz;
    uint8_t cal_system;
    uint8_t cal_gyro;
    uint8_t cal_accel;
    uint8_t cal_mag;
} sensordata;

// Declare struct and populate
sensordata datareading;
datareading.timeread = tnow;
datareading.qw = (int) (quat.w() * 10000);
datareading.qx = (int) (quat.x() * 10000);
datareading.qy = (int) (quat.y() * 10000);
datareading.qz = (int) (quat.z() * 10000);
datareading.cal_system = system;
datareading.cal_gyro = gyro;
datareading.cal_accel = accel;
datareading.cal_mag = mag;

// Write values to Characteristics.
structDataChar.writeValue((uint8_t *)&datareading, sizeof(datareading));
Then on the Python (Windows Desktop) side I have this to unpack the data being sent:
def notification_structdata_handler(sender, data):
    """Simple notification handler which prints the data received."""
    # NOTE: IT IS CRITICAL THAT THE UNPACK BYTE STRUCTURE MATCHES THE STRUCT
    # CONFIGURATION SHOWN IN THE ARDUINO C PROGRAM.
    # <hh meaning: <=little endian, h=short (2 bytes), b=1 byte, i=int 4 bytes, unsigned long = 4 bytes

    # Scale factor used in Arduino to convert floats to ints.
    scale = 10000

    # Main Sensor struct
    t, qw, qx, qy, qz, cs, cg, ca, cm = struct.unpack('<5i4b', data)
    sensorstructdata.append([t, qw/scale, qx/scale, qy/scale, qz/scale, cs, cg, ca, cm])
    print(f'--->Struct Decoded. time={t}, qw={qw/scale}, qx={qx/scale}, qy={qy/scale}, qz={qz/scale},'
          f'cal_s={cs}, cal_g={cg}, cal_a={ca}, cal_m={cm}')
Thanks for all the help and as promised the performance is MUCH better than what I started with!
You have multiple characteristics that are being updated at the same frequency. It is more efficient in Bluetooth Low Energy (BLE) to transmit those values in the same characteristic. The other thing I noticed is that you appear to be sending the values as strings. It looks like the string format might be "key:value", judging by the way you are extracting information from the string. This is also an inefficient way to send data via BLE.
The data that is transmitted over BLE is always a list of bytes, so if a float is required, it needs to be changed into an integer to be sent as bytes. As an example, if we wanted to send a value with two decimal places, multiplying it by 100 would always remove the decimal places. To go the other way, divide by 100. e.g.:
>>> value = 12.34
>>> send = int(value * 100)
>>> print(send)
1234
>>> send / 100
12.34
The struct library allows integers to be easily packed into a series of bytes to send. As an example:
>>> import struct
>>> value1 = 12.34
>>> value2 = 67.89
>>> send_bytes = struct.pack('<hh', int(value1 * 100), int(value2 * 100))
>>> print(send_bytes)
b'\xd2\x04\x85\x1a'
To then unpack that:
>>> r_val1, r_val2 = struct.unpack('<hh', send_bytes)
>>> print(f'Value1={r_val1/100} : Value2={r_val2/100}')
Value1=12.34 : Value2=67.89
Using a single characteristic with the minimum number of bytes being transmitted should allow for faster notifications.
To look at how other characteristics do this then look at the following document from the Bluetooth SIG:
https://www.bluetooth.com/specifications/specs/gatt-specification-supplement-5/
A good example might be the Blood Pressure Measurement characteristic.

Get data as a dataframe from a Look through the Looker API SDK package in Python

I have the following code to get a Look from Looker through the API. I stored the API credentials in a .ini file. All of that works, but now I want to get the data from this Look as a dataframe in Python, so that I can use the data for further analysis. How can I do that? I used this code, but that only saves it to a png. I can't find a way to create a dataframe from this, as I want the data itself and not just the resulting image.
import sys
import textwrap
import time

import looker_sdk
from looker_sdk import models

sdk = looker_sdk.init40("/Name.ini")

def get_look(title: str) -> models.Look:
    title = title.lower()
    look = next(iter(sdk.search_looks(title=title)), None)
    if not look:
        raise Exception(f"look '{title}' was not found")
    return look

def download_look(look: models.Look, result_format: str, width: int, height: int):
    """Download specified look as png/jpg"""
    id = int(look.id)
    task = sdk.create_look_render_task(id, result_format, width, height,)

    if not (task and task.id):
        raise sdk.RenderTaskError(
            f"Could not create a render task for '{look.title}'"
        )

    # poll the render task until it completes
    elapsed = 0.0
    delay = 0.5  # wait .5 seconds
    while True:
        poll = sdk.render_task(task.id)
        if poll.status == "failure":
            print(poll)
            raise Exception(f"Render failed for '{look.title}'")
        elif poll.status == "success":
            break
        time.sleep(delay)
        elapsed += delay
    print(f"Render task completed in {elapsed} seconds")

    result = sdk.render_task_results(task.id)
    filename = f"{look.title}.{result_format}"
    with open(filename, "wb") as f:
        f.write(result)
    print(f"Look saved to '{filename}'")

look_title = sys.argv[1] if len(sys.argv) > 1 else "Name"
image_width = int(sys.argv[2]) if len(sys.argv) > 2 else 545
image_height = int(sys.argv[3]) if len(sys.argv) > 3 else 842
image_format = sys.argv[4] if len(sys.argv) > 4 else "png"

if not look_title:
    raise Exception(
        textwrap.dedent(
            """
            Please provide: <lookTitle> [<img_width>] [<img_height>] [<img_format>]
                img_width defaults to 545
                img_height defaults to 842
                img_format defaults to 'png'"""
        )
    )

look = get_look(look_title)
# Dataframe storage
download_look(look, image_format, image_width, image_height)
The SDK function you are using (create_look_render_task), which is described here, only allows you to download either pdf, png, or jpg.
If you want to get the data from a Look into a dataframe, then you may want to look into using the run_look function instead, described here. When you use run_look you can change the result_format to csv and then write your own code to convert it to a dataframe.
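A minimal sketch of that approach, assuming the same sdk and get_look helper as in the question, and that run_look returns the CSV result as a string (check the exact signature against your SDK version; look_to_dataframe is an illustrative name):

import io
import pandas as pd

def look_to_dataframe(title: str) -> pd.DataFrame:
    # Find the Look by title (reusing the question's get_look helper),
    # run it with a csv result format, and load the CSV text into pandas.
    look = get_look(title)
    csv_text = sdk.run_look(look_id=look.id, result_format="csv")
    return pd.read_csv(io.StringIO(csv_text))

# df = look_to_dataframe("Name")
# print(df.head())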

Images are seemingly randomly truncated when sent over sockets in python using PIL and BytesIO

I want to send PIL images over a socket in Python. I used a method suggested on Stack Overflow:
def exchange_image(self, img_send):
    fd = io.BytesIO()
    img_send.save(fd, "png")
    img_recv = self.exchange(fd.getvalue(), MAX_IMG_SIZE)  # max size is 10000000
    return Image.open(io.BytesIO(img_recv))

def exchange(self, to_send, size):
    if self.role == "client":
        self.conn.sendall(to_send)
        return self.conn.recv(size)
    elif self.role == "server":
        received = self.conn.recv(size)
        self.conn.sendall(to_send)
        return received
The problem with this method is that it sometimes throws a strange error which, in my opinion, should not happen.
struct.error: unpack_from requires a buffer of at least 4 bytes for unpacking 4 bytes at offset 0 (actual buffer size is 0)
which causes
OSError("image file is truncated")
The error happens when I use the returned image and make a PhotoImage out of it.
self.image_gui = ImageTk.PhotoImage(self.image)
I am unable to wrap my head around this, as the buffer apparently disappears at random (one can call the function five times with perfect results and then suddenly it does not work anymore).
This is the first time I have done "networking" and it most probably has something to do with the way I am misunderstanding the inner workings of BytesIO, send and recv. Help would be greatly appreciated.
Now it works.
def exchange_image(self, img_send):
    fd = io.BytesIO()
    img_send.save(fd, "png")
    img_recv = self.exchange(fd.getvalue())
    fd.close()
    return Image.open(io.BytesIO(img_recv))

def exchange(self, to_send):
    size_send = str(len(to_send))
    size_send = "0" * (MAX_INIT_SIZE - len(size_send)) + size_send
    if self.role == "client":
        self.conn.sendall(size_send.encode())
        size_recv = int(self.recvall(MAX_INIT_SIZE).decode())
        self.conn.sendall(to_send)
        return self.recvall(size_recv)
    elif self.role == "server":
        size_recv = int(self.recvall(MAX_INIT_SIZE).decode())
        self.conn.sendall(size_send.encode())
        received = self.recvall(size_recv)
        self.conn.sendall(to_send)
        return received

def recvall(self, size):
    msg = bytes()
    while len(msg) < size:
        msg += self.conn.recv(size)
    return msg
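One detail worth noting about the recvall helper above: socket.recv(size) may return up to size bytes, so requesting the full size on every iteration can, in principle, pull in bytes that belong to the next message on the same connection. A slightly safer sketch (same method, only the read size and an end-of-stream check changed) caps each read at the number of bytes still missing:

def recvall(self, size):
    # Only request the bytes still missing, so a read never spills into the
    # next message; also stop if the peer closes the connection early.
    msg = bytearray()
    while len(msg) < size:
        chunk = self.conn.recv(size - len(msg))
        if not chunk:
            raise ConnectionError("socket closed before full message arrived")
        msg.extend(chunk)
    return bytes(msg)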

Sending and rendering multiple images from Flask backend to Flutter frontend

I've been trying to send multiple images from my Flask server to Flutter. I've tried everything; I either get "bytes cannot be JSON serialized" or Flutter gives an error parsing the image. I've been using Image.memory() for the response.
The weird part is, if I send over one image in bytes format, it works as intended.
Any help is greatly appreciated
@app.route('/image', methods=['POST'])
def hola():
    with open("1.jpg", "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
    return encoded_string
This server side code works as intended. Following is the code I used for Flutter
Future<String> uploadImage(filename, url) async {
  // List<String> images;
  var request = http.MultipartRequest('POST', Uri.parse(url));
  request.files.add(
    await http.MultipartFile.fromPath('picture', filename),
  );
  request.headers.putIfAbsent('Connection', () => "Keep-Alive, keep-alive");
  request.headers.putIfAbsent('max-age', () => '100000');
  print(request.headers.entries);
  http.Response response =
      await http.Response.fromStream(await request.send());
  print("Result: ${response.statusCode}");
  // print(y);
  return response.body;
  // return res;
}
Then I call this function with the help of an on-button-click event, like this:
var res = await uploadImage(file.path, url);
setState(() {
  images = res;
});

Container(
  child: state == ""
      ? Text('No Image Selected')
      : Image.memory(base64.decode(images)),
),
The above is the working example; it renders the image I send. The following is where I face a problem:
Server Side:
@app.route('/send', methods=['GET'])
def send():
    with open("1.jpg", "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
    with open("2.jpg", "rb") as image_file:
        encoded_string2 = base64.b64encode(image_file.read())
    x = [str(encoded_string2), str(encoded_string)]
    return jsonify({'images': x})
To handle the above here is my flutter code:
var request = http.MultipartRequest('POST', Uri.parse(url));
request.files.add(
  await http.MultipartFile.fromPath('picture', filename),
);
request.headers.putIfAbsent('Connection', () => "Keep-Alive, keep-alive");
request.headers.putIfAbsent('max-age', () => '100000');
print(request.headers.entries);
http.Response response =
    await http.Response.fromStream(await request.send());
print("Result: ${response.statusCode}");

var x = jsonDecode(response.body);
var y = x['images'];
var z = y[0];
images = z;
To render the image, the container code remains the same. I get this error:
The following _Exception was thrown resolving an image codec:
Exception: Invalid image data
or I get:
Unexpected character at _
I tried parsing in a different manner, for ex:
var x = jsonDecode(response.body);
var y = x['images'];
var z = utf8.encode(y[0]);
images = base64Encode(x[0]);
or this:
var x = jsonDecode(response.body);
var y = x['images'];
var z = base64Decode(y[0]);
images = z;
but nothing works
If you are trying to return several image binaries in a response, I assume it looks something like:
{ "image1": "content of image one bytes", "image2": "content of image two bytes" }
and, as you have found, a problem arises in that binary content cannot be naively encoded into JSON.
What you would typically do is convert it to base64:
{"image1": base64.b64encode(open("my.png", "rb").read()).decode(), "image2": open(...)}
Most things can render base64 (not entirely sure specifically for Flutter Image widgets, but certainly for <img> tags in HTML):
<img src="data:image/png;base64,<BASE64 encoded data>" />
If not, you can always get the bytes back with base64Decode (which Flutter has in dart:convert).
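A minimal sketch of the server side under that approach, assuming Flask and the same two files as in the question; the key point is decoding the base64 bytes to str before handing the list to jsonify:

import base64
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/send', methods=['GET'])
def send():
    images = []
    for name in ("1.jpg", "2.jpg"):
        with open(name, "rb") as f:
            # b64encode returns bytes; decode to str so jsonify can serialize it
            images.append(base64.b64encode(f.read()).decode("ascii"))
    return jsonify({'images': images})

On the Flutter side, each entry of the returned images list can then be passed to base64Decode(...) and on to Image.memory(...).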

Problem with streaming audio in Python from a mic via MQTT to Google Streaming using generators

I've read the Google documentation and looked at their examples, but I have not managed to get this working correctly in my particular use case. The problem is that the packets of the audio stream are broken up into smaller chunks (frame size), base64 encoded, and sent over MQTT - meaning that the generator approach is likely to stop part way through despite the sender not being fully done. My MicrophoneSender component sends the final part of the message with segment_key = -1, so this is the flag that the complete message has been sent and that a full/final processing of the stream can be completed. Prior to that point the buffer may not hold the complete stream, so it's difficult to get either a) the generator to stop yielding, or b) Google to return a partial transcription. A partial transcription is required once every 10 or so frames.
To illustrate this better here is my code.
inside receiver:
STREAMFRAMETHRESHOLD = 10

def mqttMsgCallback(self, client, userData, msg):
    if msg.topic.startswith("MicSender/stream"):
        msgDict = json.loads(msg.payload)
        streamBytes = b64decode(msgDict['audio_data'].encode('utf-8'))
        frameNum = int(msgDict['segment_num'])
        if frameNum == 0:
            self.asr_time_start = time.time()
            self.asr.endOfStream = False
        if frameNum >= 0:
            self.asr.store_stream_bytes(streamBytes)
            self.asr.endOfStream = False
            if frameNum % STREAMFRAMETHRESHOLD == 0:
                self.asr.get_intermediate_and_print()
        else:
            # FINAL, received -1
            trans = self.asr.finish_stream()
            self.send_message(trans)
            self.frameCount = 0
inside Google Speech Class implementation:
class GoogleASR(ASR):

    def __init__(self, name):
        super().__init__(name)
        # STREAMING
        self.stream_buf = queue.Queue()
        self.stream_gen = self.getGenerator(self.stream_buf)
        self.endOfStream = True
        self.requests = (types.StreamingRecognizeRequest(audio_content=chunk) for chunk in self.stream_gen)
        self.streaming_config = types.StreamingRecognitionConfig(config=self.config)
        self.current_transcript = ''
        self.numCharsPrinted = 0

    def getGenerator(self, buff):
        while not self.endOfStream:
            # Use a blocking get() to ensure there's at least one chunk of
            # data, and stop iteration if the chunk is None, indicating the
            # end of the audio stream.
            chunk = buff.get()
            if chunk is None:
                return
            data = [chunk]

            # Now consume whatever other data's still buffered.
            while True:
                try:
                    chunk = buff.get(block=False)
                    data.append(chunk)
                except queue.Empty:
                    self.endOfStream = True
                    yield b''.join(data)
                    break
            yield b''.join(data)

    def store_stream_bytes(self, bytes):
        self.stream_buf.put(bytes)

    def get_intermediate_and_print(self):
        self.get_intermediate()

    def get_intermediate(self):
        if self.stream_buf.qsize() > 1:
            print("stream buf size: {}".format(self.stream_buf.qsize()))
        responses = self.client.streaming_recognize(self.streaming_config, self.requests)
        # print(responses)
        try:
            # Now, put the transcription responses to use.
            if not self.numCharsPrinted:
                self.numCharsPrinted = 0
            for response in responses:
                if not response.results:
                    continue
                # The `results` list is consecutive. For streaming, we only care about
                # the first result being considered, since once it's `is_final`, it
                # moves on to considering the next utterance.
                result = response.results[0]
                if not result.alternatives:
                    continue
                # Display the transcription of the top alternative.
                self.current_transcript = result.alternatives[0].transcript
                # Display interim results, but with a carriage return at the end of the
                # line, so subsequent lines will overwrite them.
                #
                # If the previous result was longer than this one, we need to print
                # some extra spaces to overwrite the previous result
                overwrite_chars = ' ' * (self.numCharsPrinted - len(self.current_transcript))
                sys.stdout.write(self.current_transcript + overwrite_chars + '\r')
                sys.stdout.flush()
                self.numCharsPrinted = len(self.current_transcript)
        except Exception:
            # assumption: the except clause was not shown in the original listing
            pass

    def finish_stream(self):
        self.endOfStream = False
        self.get_intermediate()
        self.endOfStream = True
        final_result = self.current_transcript
        self.stream_buf = queue.Queue()
        self.allBytes = bytearray()
        self.current_transcript = ''
        self.requests = (types.StreamingRecognizeRequest(audio_content=chunk) for chunk in self.stream_gen)
        self.streaming_config = types.StreamingRecognitionConfig(config=self.config)
        return final_result
Currently, what this does is output nothing from the transcription side.
stream buf size: 21
stream buf size: 41
stream buf size: 61
stream buf size: 81
stream buf size: 101
stream buf size: 121
stream buf size: 141
stream buf size: 159
But the response/transcript is empty. If I put a breakpoint on the for response in responses inside the get_intermediate function, then it never runs, which means that for some reason it's empty (not returned from Google). However, if I put a breakpoint on the generator and take too long (> 5 seconds) to continue yielding the data, Google tells me that the data is probably being sent to the server too slowly: google.api_core.exceptions.OutOfRange: 400 Audio data is being streamed too slow. Please stream audio data approximately at real time.
Maybe someone can spot the obvious here...
The way you have organized your code, the generator you give to the Google API is initialized exactly once, in __init__, using a generator expression: self.requests = (...). As constructed, this generator will also run exactly once and become 'exhausted'. The same applies to the generator function that the (for ...) generator itself calls (self.getGenerator()). It will run only once and stop when it has retrieved 10 chunks of data (which are very small, from what I can see). Then the outer generator (what you assigned to self.requests) will also stop forever, giving the ASR only a short bit of data (10 times 20 bytes, looking at the printed debug output). There's most likely nothing recognizable in that.
BTW, note you have a redundant yield b''.join(data) in your function; the data will be sent twice.
You will need to redo the (outer) generator so it does not return until all data is received. If you want to keep using another generator, as you do, to gather each bigger chunk for the 'outer' generator from which the Google API is reading, you will need to re-create it every time you begin a new loop with it.
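A minimal sketch of such a generator, assuming the MQTT callback pushes raw audio bytes into a queue and pushes a None sentinel when the segment_num == -1 message arrives (request_generator is an illustrative name; types.StreamingRecognizeRequest is taken from the question's code):

import queue

def request_generator(buff: queue.Queue):
    # Yield audio chunks until a None sentinel is received, instead of
    # stopping whenever the queue happens to be momentarily empty.
    while True:
        chunk = buff.get()          # block until data (or the sentinel) arrives
        if chunk is None:
            return                  # end of the utterance, stop the stream
        data = [chunk]
        # Drain whatever else is already buffered, but keep going afterwards.
        while True:
            try:
                data.append(buff.get(block=False))
            except queue.Empty:
                break
        yield b''.join(data)

# The streaming request generator would then be rebuilt for every utterance,
# e.g. (names follow the question's code):
# requests = (types.StreamingRecognizeRequest(audio_content=chunk)
#             for chunk in request_generator(self.stream_buf))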
