How to send encoded and decoded image over HTTP - python

I need to send an encoded and decoded image along with some metadata via HTTP.
I would like to send the images as binary data instead of encoding them as base64, because encoding and decoding add unnecessary latency.
So for example, the encoded image may look like this:
img = open(img_file, 'rb').read()
and the decoded image may look like this:
img = cv2.imread(img_file)
Assume I also need to send some additional information in the POST request, such as the image name.
What is the most efficient way to send these? What would the code look like in Python? What content-type or other headers would I need to use?
I've found some examples like this online, but they only send a single image and therefore set the content-type to image/jpeg. I'm wondering what happens when you have additional fields to send.

If you want to send additional fields, you have a few options:
1. Base64-encode the image data and embed it in a JSON string with all your extra data.
2. Add custom HTTP headers containing your fields.
3. Add your fields to the image metadata itself.
I know you said you didn't want to do 1, but how do you know it is adding unnecessary latency if you've never tried it? I expect it's far less than the latency of an HTTP request. Option 2 is risky, as the headers can get stripped or changed by network infrastructure, and your users might not expect to find data in the headers. Option 3 depends a bit on what the data is and whether it makes sense for it to be inside the image (and again on whether your users know to look for it there).
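For option 1, here is a minimal sketch of the round trip using only the standard library. The field names `name` and `image_b64` and the sample bytes are made up for illustration; in practice the bytes would come from open(img_file, 'rb').read().

```python
import base64
import json


def build_payload(img_bytes, name):
    """Pack raw image bytes and metadata into one JSON string."""
    return json.dumps({
        "name": name,  # hypothetical metadata field
        "image_b64": base64.b64encode(img_bytes).decode("ascii"),
    })


def parse_payload(payload):
    """Recover the metadata and the original bytes on the receiving side."""
    doc = json.loads(payload)
    return doc["name"], base64.b64decode(doc["image_b64"])


# Stand-in for open(img_file, 'rb').read()
img = bytes(range(256)) * 12
payload = build_payload(img, "cat.jpg")
name, decoded = parse_payload(payload)
```

A payload like this would normally go out with Content-Type: application/json. Base64 output is about 4/3 the size of its input, so timing it on real images is the easiest way to see whether the overhead matters. A decoded array from cv2.imread would also need its shape and dtype sent as metadata, since the raw bytes alone do not carry them.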

Related

Get Image size from URL

I have a list of URIs of images from essentially a Wordpress site.
I want a script that gets their file sizes (KB, MB, GB) using just the URIs.
I don't have access to the server and need to add the sizes to a Google sheet. This seems like the fastest way to do it, as there are over 5k images and attachments.
However when I do this in Python
>>> import requests
>>> response = requests.get("https://xxx.xxxxx.com/wp-content/uploads/2017/05/resources.png")
>>> len(response.content)
3232
I get 3232 bytes but when I check in Chrome Dev Tools, it's 3.4KB
What is being added? Or is the image actually 3.4KB and my script is only checking content-length?
Also, I don't want to check using the Content-Length header as some of the images may be large and chunked so I want to be sure I'm getting the actual file size of the image.
What is a good way to go about this? I feel like there should be some minimal code or script I could run.
The value you are seeing (3.4KB) includes the network overhead such as response headers.
As a side note, I am not sure which version of Chrome you are using, but the transfer size (including response headers) and the resource size (i.e. the file size) are displayed separately for me.
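To count only the body bytes without relying on Content-Length, one option is to sum the chunks of a streamed download. This is a rough sketch: the fake two-chunk body stands in for the network, and the decimal units in human_size are an assumption about how you want sizes displayed.

```python
def body_size(chunks):
    """Count the bytes of a response body delivered as an iterable of chunks."""
    return sum(len(chunk) for chunk in chunks)


def human_size(num_bytes):
    """Format a byte count using decimal units (1 kB = 1000 bytes)."""
    size = float(num_bytes)
    for unit in ("B", "kB", "MB", "GB"):
        if size < 1000 or unit == "GB":
            return f"{size:.1f} {unit}"
        size /= 1000


# With requests, the chunks could come from a streamed download, e.g.
#   with requests.get(url, stream=True) as r:
#       n = body_size(r.iter_content(chunk_size=8192))
# Here a fake two-chunk body stands in for the network.
n = body_size([b"x" * 3000, b"x" * 232])
label = human_size(n)
```

Note that requests decodes gzip and deflate content encodings in iter_content, so for compressed responses this counts the decoded body rather than the bytes on the wire.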

decode bytes that are in string form with unknown encoding

So I just started my own little project to create a bot for a game, but I have only done a little coding before, so I am definitely no expert. If I get something mixed up or forget to mention some information, I apologize in advance!
Basically, my Python bot connects to the server (WebSocket connection 13; the header says "Accept-Encoding: gzip, deflate, br"). I use the WebSocket module and that works well. The game sends messages in JSON format. However, they are filled with backslashes; I think a JavaScript routine internally clears those out or splits each message into multiple ones and removes the outermost layer. So far my solution is to just strip out the backslashes, and from there on it's pretty straightforward.
The problem is that the map data is apparently encoded. The message looks like this:
{"type":"pkg","data":"[\"{\\\"type\\\":\\\"pl\\\",\\\"data\\\":[\\\"{\\\\\\\"type\\\\\\\":\\\\\\\"p\\\\\\\",\\\\\\\"id\\\\\\\":227727,\\\\\\\"tpl\\\\\\\":227727,\\\\\\\"s\\\\\\\":458
.... and then at the end of the message (it's a lot longer, I just didn't want to post 30 lines of compressed data):
{\\\"type\\\":\\\"zip\\\",\\\"data\\\":\\\"{\\\\\\\"type\\\\\\\":\\\\\\\"map\\\\\\\",\\\\\\\"xî182îyî478îtilesî\\\\\\\"1:î¢î¤î£î¦526_21î¢254
which is obviously the encoded / compressed map data. Firefox dev tools, however, shows it decompressed too; it then looks more like this:
\\\\\\\"map\\\\\\\",\\\\\\\"xî\u0080\u0086182î\u0080\u008dyî\u0080\u0086478î\u0080\u008dtilesî\u0080\u0086\\\\\\\"1:î\u0080¢î\u0080¤î\u0080£î\u0080¦526_21î\u0080¢254:36î\u0080²î\u0080´î\u0080³î\u0080µ:î\u0080¬î\u0080¸î\u0080ºî\u0080·î\u0080·î\u0080ºî\u0080¼î\u0080¶
I tried different commands and modules like zlib, but honestly, I'm really lost. Is that data already decoded and now in byte form, or is it still compressed zip data? If so, how can I decode it, given that I currently handle it as a raw string? Or should I put it into a data file from the get-go? What does the xi at the beginning stand for? Is it the encoding scheme?
Any help is greatly appreciated. I would really like to know what the heck is going on here :D
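On the backslashes: they look like what you get when a JSON document is stored as a string inside another JSON document, several layers deep. Assuming that is what the game does, each layer can be decoded with json.loads instead of stripping backslashes by hand. The sample message below is built locally to mimic the nesting; it is not real game data, and this sketch says nothing about how the "zip" payload itself is compressed.

```python
import json


def unwrap(value):
    """Recursively decode strings that themselves contain JSON."""
    if isinstance(value, str):
        try:
            inner = json.loads(value)
        except ValueError:
            return value  # plain string, not another JSON layer
        # Only descend into containers; keep things like "123" as strings.
        return unwrap(inner) if isinstance(inner, (dict, list)) else value
    if isinstance(value, list):
        return [unwrap(item) for item in value]
    if isinstance(value, dict):
        return {key: unwrap(item) for key, item in value.items()}
    return value


# Build a message with the same kind of nesting as the sample above.
inner = json.dumps({"type": "p", "id": 227727})
middle = json.dumps({"type": "pl", "data": [inner]})
raw = json.dumps({"type": "pkg", "data": json.dumps([middle])})

message = unwrap(raw)
```

After unwrapping, message is an ordinary nested dict/list structure, so the inner fields can be read directly, for example message["data"][0]["type"].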

How can I send both text and image?

In my chat application, only text can be sent right now. I'm trying to add a feature so that images can also be sent. However, there is one point where I'm stuck. When receiving the data, how can I discriminate between a photo and text? I'm asking because these are two completely different procedures. In one of them, we encode the text as UTF-8 and send it, while in the other we send raw bytes. On the server side, how can I tell them apart?
I was able to add a send-photo feature on the client side as shown below. When I try it, it successfully sends the image bytes. The only thing I need to do is to discriminate the text from the bytes on the server side.
As my code is too long, I prefer not to add all of it here. You can access it through my github https://github.com/suleymanyaman/randomchatserver
Client
def sendphoto():
    dlg = QFileDialog()
    dlg.setFileMode(QFileDialog.AnyFile)
    if dlg.exec_():
        img_path = dlg.selectedFiles()[0]
        # Read the whole image as raw bytes
        with open(img_path, 'rb') as f:
            data = f.read()
        # sendall() keeps sending until every byte has been written
        s.sendall(data)
Server
while 1:
    msg = client.recv(100000000).decode("utf-8")
Once it's on the network, everything is bytes. To add support for images, you just need to send some message that says "An image is coming next." Your protocol hopefully already has some "control messages" that you can use for this.
If you want to keep the protocol "readable" (i.e. you prefer all bytes to be sensible UTF-8), you could use base 64 encoding or similar to turn your images into "text" before sending them. But that probably isn't necessary.
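One common way to implement such a control message is a small fixed header in front of every payload. The sketch below assumes a made-up layout (1 byte for the type, 4 bytes for the length); the type codes and function names are hypothetical, not part of the linked project.

```python
import struct

TEXT, IMAGE = 0, 1             # hypothetical message-type codes
HEADER = struct.Struct("!BI")  # 1-byte type, 4-byte big-endian length


def encode_frame(kind, payload):
    """Prefix the payload with its type and length."""
    return HEADER.pack(kind, len(payload)) + payload


def decode_frames(buffer):
    """Split a receive buffer into complete (kind, payload) frames.

    Returns the frames found and the leftover bytes of any partial frame.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((kind, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


stream = encode_frame(TEXT, "hello".encode("utf-8"))
stream += encode_frame(IMAGE, b"\x89PNG\r\n\x1a\n")
frames, rest = decode_frames(stream + b"\x00")  # trailing partial header
```

On the server, the idea would be to append each recv() result to a buffer, call decode_frames on it, and call .decode("utf-8") only on frames whose type is TEXT. The length prefix also matters because a single recv() is not guaranteed to return exactly one message.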

receiving PNG file in django

I have a falcon server that I am trying to port to django. One of the falcon endpoints processes a request that contains a PNG file sent with content_type = 'application/octet-stream'. It writes the data to a file maintaining the correct PNG structure.
The falcon code does this:
form = cgi.FieldStorage(fp=req.stream, environ=req.env)
and then writes the png like this:
fd.write(form[key].file.read())
I cannot figure out how to do the same thing in django. When my view is called the data in request.POST[key] has already been decoded to unicode text and it's no longer valid png data.
How can I do this with django? Should/can I use cgi.FieldStorage? The request I get (of type django.core.handlers.wsgi.WSGIRequest) does not have a stream method. I'm sure there's some way to do this, but I have not come up with anything googling.
I solved this by changing the client to set the file and filename fields for each part of the multipart request. I was then able to iterate through request.FILES and successfully write the files as PNG.
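A rough sketch of what that iteration could look like. The helper name save_uploads is made up, and FakeUpload is only a stand-in so the example runs without Django; in a real view the first argument would be request.FILES, whose values expose .name and .chunks().

```python
import os
import tempfile


def save_uploads(files, dest_dir):
    """Write each uploaded file to dest_dir and return the saved paths.

    `files` is expected to behave like Django's request.FILES: a mapping
    whose values expose .name and .chunks().
    """
    saved = []
    for upload in files.values():
        # Drop any directory part the client may have put in the filename.
        path = os.path.join(dest_dir, os.path.basename(upload.name))
        with open(path, "wb") as fd:
            for chunk in upload.chunks():
                fd.write(chunk)  # bytes are written untouched
        saved.append(path)
    return saved


class FakeUpload:
    """Stand-in for an uploaded file so the sketch runs without Django."""

    def __init__(self, name, data):
        self.name = name
        self._data = data

    def chunks(self):
        yield self._data[:4]
        yield self._data[4:]


png_bytes = b"\x89PNG\r\n\x1a\n" + b"rest of file"
dest = tempfile.mkdtemp()
paths = save_uploads({"file": FakeUpload("shot.png", png_bytes)}, dest)
with open(paths[0], "rb") as fd:
    written = fd.read()
```

The key point is that file parts arrive as bytes in request.FILES, whereas values in request.POST have already been decoded to text.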

How to handle unicode of an unknown encoding in Django?

I want to save some text to the database using the Django ORM wrappers. The problem is, this text is generated by scraping external websites and many times it seems they are listed with the wrong encoding. I would like to store the raw bytes so I can improve my encoding detection as time goes on without redoing the scrapes. But Django seems to want everything to be stored as unicode. Can I get around that somehow?
You can store the data encoded as base64, for example. Or try to analyze the HTTP headers from the browser; it may be simpler to get the proper encoding from there.
Create a File with the data. Use a Django models.FileField to hold a reference to the file.
No, it does not involve a ton of I/O. If your file is small, it adds 2 or 3 I/Os (the directory read, the inode read and the data read).
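For the base64 suggestion, a minimal sketch of keeping the raw bytes in a text column and retrying the decoding later. The function names and the list of candidate encodings are only examples, and the sample bytes are invented.

```python
import base64


def to_storable(raw):
    """Encode raw bytes as ASCII text that a text column can hold."""
    return base64.b64encode(raw).decode("ascii")


def from_storable(text):
    """Recover the exact original bytes."""
    return base64.b64decode(text)


def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Try candidate encodings in order; the list is only an example."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1"), "latin-1"  # never fails, may be wrong


scraped = "café".encode("cp1252")  # bytes from a site with a wrong label
stored = to_storable(scraped)
text, used = decode_with_fallback(from_storable(stored))
```

Because the stored value round-trips to the exact original bytes, the detection logic in decode_with_fallback can be improved later without redoing the scrapes.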
