I've been trying to serve images when someone requests a URL like localhost:5000/image.png. I'm building the HTTP response (headers and body) by hand and sending it over Python sockets.
Steps I tried:
When the file is requested, I first loaded the raw binaries of the file as shown below.
data=open(filename, "rb").read()
After loading the binaries, I formatted it into the headers as shown below
HTTP/1.1 200 OK
Content-Type: image/png
Content-Length: {len(data)}
{data}
When I return the response, the browser displays a small white box.
After trying several solutions I found online (listed below), I concluded that the white box only appears when the data returned in the response is invalid.
Things I've tried:
Encoding the data variable using base64 encode, same result
Attempting to decode the data using utf-8, raises "Invalid Start Byte"
Attempting to decode the data using ISO-8859-1, it did decode but it displays the same white box
I've tried the same thing with non image files with their respective content-types, they work perfectly
Source code of the part
class ClassicFileRequest(object):
    @staticmethod
    def create(response, rtp):
        basestring=f"HTTP/1.1 200 OK\nContent-Type: {rtp}\nContent-Length: {len(response)}\n\n{response}" # Base string, this is where all the file data is formatted
        return basestring
    def responsefromfile(filename):
        print(filename)
        ft=mimetypes.guess_type(filename)[0]
        if filename.endswith((".webp",".jpeg",".png",".gif",".ico")):
            data=open(filename, "rb").read() # Read binaries for image files
            return ClassicFileRequest.create(data, ft)
        else:
            data=open(filename, "r").read()
            return ClassicFileRequest.create(data,ft)
UPDATE:
The error has been fixed. Apparently formatting a bytes object into a string wasn't a great thing to do: the f-string inserts the text representation of the bytes (b'...') instead of the raw bytes.
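A minimal sketch of the fix described in the update, assuming the result is sent with something like socket.sendall(). Only the header text is encoded, and the file bytes are appended unchanged. The helper name build_response is hypothetical and not part of the original code.

```python
def build_response(body: bytes, content_type: str) -> bytes:
    """Return a complete HTTP/1.1 response as bytes (hypothetical helper)."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    # Encode only the header text; the body bytes are appended as they are.
    return head.encode("ascii") + body


# The first 8 bytes of every PNG file, used here as stand-in image data.
png_signature = b"\x89PNG\r\n\x1a\n"
raw = build_response(png_signature, "image/png")

# What the f-string in the question put into the response instead:
# the text of the Python literal, starting with b' and full of \x escapes.
broken_body = f"{png_signature}"
```

With this shape, text files would also have to be opened in "rb" mode (or encoded) before being passed in, so that body is always bytes.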
Related
I'm trying to decode a base64 pdf file and send it to another endpoint.
I used a python policy for the decoding part and here's the code
import base64
pdfB64 = flow.getVariable("request.content")
pdfFile = base64.b64decode(pdfB64)
flow.setVariable("pdfFileDecoded",pdfFile)
Now, when I send my http post request which is below
headers :
Accept : */*
boundary : --Boundaryy
--Boundaryy
Content-Disposition: form-data; name="testdu12janvier"; filename="testdu12janvier.pdf"
Content-Type: application/pdf
<< Here is sensitive data, which is basically a base64-encoded pdf file >>
--Boundaryy--
When I send this POST request and trace it in Apigee Edge, I notice that something else is decoded along with the pdf file. I think it's either the boundary or one of the headers. This produces a corrupt pdf file which can't be read.
How do I isolate the pdf file from the request body without removing the boundaries? I'll need to send multiple files in the near future.
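One possible approach, sketched here for plain Python 3 with only the standard library: parse the multipart body first and base64-decode each part on its own, rather than decoding the whole request body (which likely also feeds the boundary and part headers to the decoder). The function name extract_files is hypothetical, the boundary is assumed to be Boundaryy (the delimiter lines without the leading dashes), and the code may need adapting to the interpreter the policy actually runs on.

```python
import base64
from email.parser import BytesParser
from email.policy import default


def extract_files(body: bytes, boundary: str) -> dict:
    """Map each part's filename to its base64-decoded content (sketch)."""
    # The email parser needs a Content-Type header that names the boundary.
    header = (
        f'Content-Type: multipart/form-data; boundary="{boundary}"\r\n\r\n'
    ).encode("ascii")
    message = BytesParser(policy=default).parsebytes(header + body)

    files = {}
    for part in message.iter_parts():
        # The payload of each part is only the base64 text, without the
        # boundary lines or the part's own headers.
        files[part.get_filename()] = base64.b64decode(part.get_payload())
    return files


# Small stand-in for the real request body.
sample_pdf = b"%PDF-1.4 demo"
sample_body = (
    b"--Boundaryy\r\n"
    b'Content-Disposition: form-data; name="doc"; filename="doc.pdf"\r\n'
    b"Content-Type: application/pdf\r\n"
    b"\r\n"
    + base64.b64encode(sample_pdf)
    + b"\r\n--Boundaryy--\r\n"
)
decoded = extract_files(sample_body, "Boundaryy")
```

Because each part is handled separately, the same loop also covers requests that carry several files.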
I'm trying to send a raw response (a JPEG image from my laptop camera) in Python using Flask. This is the relevant snippet:
@app.route("/")
def stream():
    frm = imencode('.jpg',cap.read()[1])[1].tobytes()
    resp = Response()
    resp.set_data(value="HTTP/1.1 200 Ok\nContent-Type:image/jpeg\nContent-Length:"+str(len(frm))+"\n\n"+str(frm))
    return resp
My browser seems to display it as text nonetheless. If I instead initialize the response as
Response(frm, headers={"Content-Type":"image/jpeg"})
then the browser decodes the image fine. I'm not very good with web stuff, but from what I've found so far, a response consists of a first line specifying the HTTP version, the response status code and the respective message. Then come the header fields, each on a new line. Then an empty line separates the metadata from the body. I've read that the bare minimum for a simple response is the Content-Type and Content-Length headers. Some sources also mention using \r in combination with \n to separate the lines, but I haven't found an example so far, and this source didn't specify where exactly \r should be added.
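A short illustration of two points from this question, using a few stand-in bytes instead of a real camera frame: what str() does to a bytes object, and where \r\n goes in a raw HTTP message. Note that Flask's Response writes the status line and headers itself, so the data passed to it should be only the image bytes; the hand-built message below is just to show the wire layout.

```python
# Stand-in for the encoded camera frame: the first bytes of a JPEG file.
frame = b"\xff\xd8\xff\xe0"

# str() does not give the raw content; it returns the Python literal as text.
as_text = str(frame)
print(as_text)       # b'\xff\xd8\xff\xe0'
print(len(as_text))  # 19 characters, although the frame is 4 bytes long

# On the wire, every line of the head ends with \r\n, and one empty line
# (a second \r\n) separates the head from the body.
head = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: image/jpeg\r\n"
    f"Content-Length: {len(frame)}\r\n"
    "\r\n"
).encode("ascii")
message = head + frame

# A client splits at the first empty line to find the body.
received_head, received_body = message.split(b"\r\n\r\n", 1)
```

This is why the snippet in the question shows up as text: the body that reaches the browser is the literal b'...' string, preceded by a second, hand-written header block that Flask treats as part of the body.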
I have tested some requests inside the Postman app. First, I want to get the body information of an HTTP request inside Python (package requests used). The response appears positive with 200 OK.
response = session.request("POST", url, headers=headers, data=payload, verify ='custom-proxy-ca.crt')
Now I would like to get the body with
body = response.content
print(body) delivers
b'\x83\x84\x01\x00\xc4\xff\xd4\xe9\xb4\xf6\xde,\x13\xa9\xc0(\xc7_\x8dL\x90\xf0\xb4K\xc4<\xe7\xb1M\x02)\xe0\x80z\xd0\xdf>\xcf\xd7\xd2\xec\x8d\x1e\xe4un\x0c\x83\xa1\x88g\xe7fah\x89\xbe\xca\xa8\x04_\xa2W\xbd\xfe]W\xd1\x06\x1f\xef~ZN\xa6\x0bq\xfa\x18\xc4\x1f\xb3\xf8\xc2\x9dF\xc5\xf0\xe6\x8d\xb6\xc1\xa0\xab\x7f\xfbyM\xe0\x88I\xb4\xd4\x82\xa1%\xd9R7Nt\xa4~<\x8c\x8e\xdb\xe7<xx-.\xab\xa7|16\xcb"\xba\x89\xbc\xe7\xcaF\xd1\xacV-u\xbf\xaa\x04\xf7\xa2\x88\xa1\x1bUI\xdfkI$`\x18:j\x7fU\x02\x0e\xcb\x97\x8em\xc6\x81\xe6\x85\xbe\xa5\xb9vbjQ$}M&n\xe0$A\xe0\xd9\xd2\xc6\x9aA\xf4\x12\x81/1\x0c\xf0(\x0cy\xf5\xaf\xca\x1bQ\x1082\xa1\xb4n4VRR\xbb7\xa5XO\x08\x0c\x13\xf2:\xc0-\x06\xa9\xda\xaeGX\x97B\x81!\x17\x87\xfa\xd1\x1b\xc0\xd0\x89|\xe8E\x0f\rp\xfd\x00\x96\xeaI\xbe\xda\xbb\xe3\x87\xc7\xdb\x9b\xfd\xab\xe8\xc7\xdd\x0cEL-x\xe0\x9bVhY\x0cT\x08\x95S\xa3\xfd\xdc\xe3\x81/1\x9d\x9e\'T\xf6\xe0pl\xd33#0,T}X%\x04\x0e\xd7r\xfd\x10\x0cs\xe90\x05\xe8\xe8\xf8\xea\xfc\xe5\xf8\xe1\xfd\xb9\xea\xe7\xe0\xc0\x9a!\xa1\\M\xa8\x9d\x9f\xe4\xa2\x07_\xae\xd7\x0c\xdd\xb8\xaa\xbf\xe9\xfc\x1a|\x89^\xf59\x81\xe3J\x91\xa4v(\xff7J1\x1ao\x9c\x89\xa1#0\xf4\xaa\xa0\xc7\xbc\xea\x9f\xae\xa6\xe8\xa9-T\xc9#\xd1\x81\x7f\xee\x9a\xbb\xfd\x87\xc3\xe3+|K\xe2\xfdPe\xa0\xaa\x9d\x18\xf0\xcc\xc0\xf10\x80\xca\xb0XuW\x9d\xcc\xc0\xa5\xc8;bP\xdd\x9d\x1aeC\xfd\xf84\xa6\x14yG\xeb\xb5\x01\x03'
Now I try to search a token in the body, but it seems to be encrypted.
If I want to get the result of the JSON parser with
json.loads(body)
it returns
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 0: invalid start byte
Okay, it seems that the encoding is done in a different way than expected. But how did the Postman app do the decoding of the body? For example, I can read it there parsed as JSON (see the figure below). What am I doing wrong in Python?
Request
Okay, the problem is solved, but I want to share with you how to deal with this kind of problem.
The initial problem was that the HTTP POST request was sent with an Accept-Encoding header like
'Accept-Encoding': 'gzip, deflate, br'
This line of code means: the client can accept data in a compressed format.
The server compresses the response body before sending it back, and the client is expected to decompress it after receiving it.
The reason for the error is: the program did not decompress the body.
Solution: delete this line of code and it works.
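As far as I know, requests decompresses gzip and deflate bodies on its own, while br (Brotli) is only handled when an extra Brotli package is installed, so dropping br from the header may be enough. For cases where the body has to be decompressed by hand, here is a minimal standard-library sketch; decode_body is a hypothetical helper, and the sample payload is made up.

```python
import gzip
import json
import zlib


def decode_body(raw: bytes, content_encoding: str) -> bytes:
    """Undo the Content-Encoding of a response body (hypothetical helper)."""
    encoding = content_encoding.strip().lower()
    if encoding in ("", "identity"):
        return raw
    if encoding == "gzip":
        return gzip.decompress(raw)
    if encoding == "deflate":
        return zlib.decompress(raw)
    # "br" (Brotli) is not in the standard library; a third-party package
    # would be needed, so this sketch refuses it instead of guessing.
    raise ValueError(f"unsupported Content-Encoding: {content_encoding!r}")


# Simulate a gzip-compressed JSON body and read a field from it.
compressed = gzip.compress(b'{"token": "abc"}')
parsed = json.loads(decode_body(compressed, "gzip"))
```

In a real call, the encoding to pass in would come from the response's Content-Encoding header.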
I am new to Python and was trying to build an API that mimics how AWS presigned URLs work. I have created a custom URL that hits our endpoint, and after resolving the URL it should return the data associated with that URL. The data is a zip file stored on S3. I tried using aioboto3 to get the data
async with aioboto3.client("s3", config=Config(signature_version='s3v4'), aws_access_key_id=AWS_ACCESS_KEY,
                           aws_secret_access_key=AWS_SECRET_KEY,) as s3_client:
    try:
        s3_ob = await s3_client.get_object(Bucket=DEFAULT_BUCKET_NAME, Key=path)
however this step fails as it gives a
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 12: invalid continuation byte
which is understandable as zip might not be utf-8 encoded.
Hence, I moved on to use boto3 and simply get the object data like this
s3_file = s3_client.get_object(Bucket=DEFAULT_BUCKET_NAME, Key=path)
response = s3_file["Body"].read()
response is a byte array with data like
b'PK\x03\x04\x14\x00\x00\x00\x08\x00\x10\x08...
How do I return this data from the API if I want that at the user's end it should automatically download this zip file if this url is hit in the browser? Right now it does get downloaded since I have set the following, but I'm unable to open it as it says unsupported format.
(Set this so that it triggers download automatically)
request.response.content_type = 'binary/octet-stream'
request.response.headers['Access-Control-Allow-Methods'] = 'POST, GET'
request.response.headers['Access-Control-Expose-Headers'] = 'X-filename'
request.response.headers['X-filename'] = path
request.response.content_disposition = 'attachment;filename=hello.zip'
This API is being written in Frontdoor(API gateway). The function returning this data is an async function. This code works fine for a csv file and I simply return response.decode('utf-8'), however I can't do the same in this case since decoding it gives an error. I tried using latin1 as the decoding for the zip file data however it messes with the data and the downloaded file says format not supported.
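A small experiment that may explain the "unsupported format" result, using an in-memory zip as a stand-in for the S3 object. It assumes the framework encodes text bodies as UTF-8 before sending them, which is a guess about Frontdoor and not something confirmed here.

```python
import io
import zipfile

# Build a small zip in memory as a stand-in for the S3 object body.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hello.bin", bytes(range(256)))
data = buffer.getvalue()

# Decoding as Latin-1 never fails, but if the resulting str is later encoded
# as UTF-8 (assumed here to be what happens to text bodies), every byte
# >= 0x80 becomes two bytes and the archive is corrupted.
mangled = data.decode("latin-1").encode("utf-8")
print(mangled == data)           # False
print(len(mangled) > len(data))  # True

# Keeping the bytes untouched leaves the archive readable.
with zipfile.ZipFile(io.BytesIO(data)) as archive:
    names = archive.namelist()
print(names)                     # ['hello.bin']
```

If that is what happens, the zip has to be returned as bytes without any decode step. How to hand raw bytes to the response depends on the framework, so that part is left open here.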
I am trying to get the following URL with requests.get() in Python 3.x: http://www.finanzen.net/suchergebnis.asp?strSuchString=DE0005933931 (this URL consists of a base URL with the search string DE0005933931).
The request gets redirected (via HTTP status code 301) to http://www.finanzen.net/etf/ishares_core_dax%AE_ucits_etf_de in a browser (the URL contains the byte 0xAE, which is the character ® in Latin-1). Using requests.get() with the redirected URL works as well.
When trying to get the search string URL with Python 2.7 everything works and I get the redirected response, using Python 3.x I get the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xae in position 21: invalid start byte
The code snippet to test this:
import requests
url_1 = 'http://www.finanzen.net/suchergebnis.asp?strSuchString=LU0274208692'
# redirected to http://www.finanzen.net/etf/db_x-trackers_msci_world_index_ucits_etf_1c
url_2 = 'http://www.finanzen.net/suchergebnis.asp?strSuchString=DE0005933931'
# redirected to http://www.finanzen.net/etf/ishares_core_dax%AE_ucits_etf_de
print(requests.get(url_1).status_code) # working
print(requests.get(url_2).status_code) # error with Python 3.x
Some more information:
I am working on Windows 7 using Python 3.6.3 with requests.__version__ = '2.18.4' but I
get the same error with other Python versions as well (3.4, 3.5).
Using other search strings, everything works with Python 3.x as well,
e.g.
http://www.finanzen.net/suchergebnis.asp?strSuchString=LU0274208692
Interestingly I even get an Internal Server Error with https://www.hurl.it trying to GET the above mentioned URL. Maybe it is no Python problem.
Any idea, why this is working in Python 2.7 but not in Python 3.x and what I can do about this?
The server responds with a Location header whose URL contains raw Latin-1 bytes instead of being percent-encoded (non-ASCII bytes are shown here as 0x?? hex escapes):
Location: /etf/ishares_core_dax0xAE_ucits_etf_de
The 0xAE byte there is not a valid URL character; the server is violating standards here. What they should be sending is
Location: /etf/ishares_core_dax%AE_ucits_etf_de
or
Location: /etf/ishares_core_dax%C2%AE_ucits_etf_de
Using escaped data for the Latin-1 or UTF-8 encoding of the URL.
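The two escaped forms can be reproduced with urllib.parse.quote. This is only an illustration of the encoding difference; the location string below is typed in by hand to stand for the header value as Python 3 would see it after decoding the raw bytes as Latin-1.

```python
from urllib.parse import quote

# Hand-written stand-in for the header value: the raw 0xAE byte, decoded as
# Latin-1, is the character U+00AE (the registered sign).
location = "/etf/ishares_core_dax\xae_ucits_etf_de"

# Percent-encode the original Latin-1 bytes ...
latin1_escaped = quote(location.encode("latin-1"))
# ... or the UTF-8 encoding of the same characters (quote's default).
utf8_escaped = quote(location)

print(latin1_escaped)  # /etf/ishares_core_dax%AE_ucits_etf_de
print(utf8_escaped)    # /etf/ishares_core_dax%C2%AE_ucits_etf_de
```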
We can patch requests to be more robust in the face of this error, by returning the Location header unchanged:
from requests.sessions import SessionRedirectMixin
def get_redirect_target(
        self, resp, _orig=SessionRedirectMixin.get_redirect_target):
    try:
        return _orig(self, resp)
    except UnicodeDecodeError:
        return resp.headers['location']

SessionRedirectMixin.get_redirect_target = get_redirect_target
With this patch applied the redirects work as expected.
I created a pull request to improve Location handling.