Django Rest - FileUploadView - Unexpected data being added into an Uploaded File - python

Here is my FileUploadView class, which handles the POST request for an uploaded file. The files I expect are XML files, which I parse with ElementTree in fileHandler(). However, when I send a file with Postman using 'form-data', some kind of header gets attached to the uploaded file, which makes ET.parse() fail with a syntax error because it is reading something that is not valid XML.
I tried sending the file with HTTPie, which worked with no issue: the XML parser parsed it correctly and entered the data into the expected object.
I then wrote a Django test case for the file upload. The parser raised a syntax error again, because a header was attached to the file once more.
class UploadTest(APITestCase):
    def test_file_upload(self):
        c = Client()
        with open("/Users/Ren/Desktop/Capstone/Backend/projectB/VMA/testing/Test.xml") as fp:
            c.post('/upload/TestXML.xml', {'filename' : 'Test.xml', 'attachment': fp})
My question is: what is causing that header to be added to the uploaded file? I'm guessing it has something to do with how the POST request is sent by Postman and by the Django test case, which differs from how HTTPie sends it.
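view.py
from rest_framework.parsers import FileUploadParser
from rest_framework.response import Response
from rest_framework.views import APIView

class FileUploadView(APIView):
    parser_classes = (FileUploadParser,)

    def post(self, request, filename, format=None):
        print(request.FILES)
        file_obj = request.FILES['file']
        fileHandler(file_obj)
        return Response(status=204)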
FileReader.py
import xml.etree.ElementTree as ET

def fileHandler(file):
    filepath = file.temporary_file_path()
    print(file.read())
    tree = ET.parse(filepath)
    root = tree.getroot()
XML File and output when calling file.read()
XML I need to read in (Expected Output):
<site host="192.168.212.4" name="http://192.168.212.4" port="80" ssl="false"><alerts><alertitem>\n <pluginid>10021</pluginid>\n <alert>X-Content-Type-Options header missing</alert>\n <riskcode>1</riskcode>\n <reliability>2</reliability>\n <riskdesc>Low (Warning)</riskdesc>\n <desc>The Anti-MIME-Sniffing header X-Content-Type-Options was not set to \'nosniff\'.\n\tThis allows older versions of Internet Explorer and Chrome to perform MIME-sniffing on the response body, potentially causing the response body to be interpreted and displayed as a content type other than the declared content type.\n\tCurrent (early 2014) and legacy versions of Firefox will use the declared content type (if one is set), rather than performing MIME-sniffing.\n\t</desc>\n <uri>http://192.168.212.4/</uri>\n <param/>\n <attack/>\n <otherinfo/>\n <solution>Ensure that the application/web server sets the Content-Type header appropriately, and that it sets the X-Content-Type-Options header to \'nosniff\' for all web pages.\n\tIf possible, ensure that the end user uses a standards-compliant and modern web browser that does not perform MIME-sniffing at all, or that can be directed by the web application/web server to not perform MIME-sniffing.\n\t</solution>\n <reference>\n\t</reference>\n</alertitem>
The Output when running request.FILES['file'].read() --- Current Output
b'----------------------------507481440966899800347275\r\nContent-Disposition: form-data; name=""; filename="sampleXML.xml"\r\nContent-Type: application/xml\r\n\r\n<site host="192.168.212.4" name="http://192.168.212.4" port="80" ssl="false"><alerts><alertitem>\n <pluginid>10021</pluginid>\n <alert>X-Content-Type-Options header missing</alert>\n <riskcode>1</riskcode>\n <reliability>2</reliability>\n <riskdesc>Low (Warning)</riskdesc>\n <desc>The Anti-MIME-Sniffing header X-Content-Type-Options was not set to \'nosniff\'.\n\tThis allows older versions of Internet Explorer and Chrome to perform MIME-sniffing on the response body, potentially causing the response body to be interpreted and displayed as a content type other than the declared content type.\n\tCurrent (early 2014) and legacy versions of Firefox will use the declared content type (if one is set), rather than performing MIME-sniffing.\n\t</desc>\n <uri>http://192.168.212.4/</uri>\n <param/>\n <attack/>\n <otherinfo/>\n <solution>Ensure that the application/web server sets the Content-Type header appropriately, and that it sets the X-Content-Type-Options header to \'nosniff\' for all web pages.\n\tIf possible, ensure that the end user uses a standards-compliant and modern web browser that does not perform MIME-sniffing at all, or that can be directed by the web application/web server to not perform MIME-sniffing.\n\t</solution>\n <reference>\n\t</reference>\n</alertitem>\n\n \r\n----------------------------507481440966899800347275--\r\n'
Contains the unnecessary: b'----------------------------507481440966899800347275\r\nContent-Disposition: form-data; name=""; filename="sampleXML.xml"\r\nContent-Type: application/xml\r\n\r\n

I played around with the code for a bit and made a tiny change to the test case:
class UploadTest(APITestCase):
    def test_file_upload(self):
        c = Client()
        with open("/Users/Ren/Desktop/Capstone/Backend/projectB/VMA/testing/Test.xml") as fp:
            c.post('/upload/TestXML.xml', {'filename' : 'Test.xml', 'attachment': fp})
I changed
{'filename' : 'Test.xml', 'attachment': fp}
to
{'filename' : b'Test.xml', 'attachment': fp}
I remember reading about this somewhere, though unfortunately I do not remember where. Turning the filename value into bytes fixed it.
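As for where the extra lines come from: the boundary line and the Content-Disposition / Content-Type lines in the output above are what multipart/form-data encoding looks like on the wire. Postman's 'form-data' mode and Django's test Client (by default) both encode the request that way, while DRF's FileUploadParser treats the whole request body as the file. The sketch below uses only the standard library to show the difference; the function names and the boundary string are made up for illustration.

```python
import xml.etree.ElementTree as ET


def encode_multipart_file(field, filename, content, boundary):
    """Wrap `content` (bytes) in a single multipart/form-data part."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/xml\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + content + tail


def parses_as_xml(data):
    """Return True if `data` is well-formed XML, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


xml = b'<site host="192.168.212.4"><alerts/></site>'

# A raw upload sends the file as the entire request body.
raw_body = xml
# A form-data upload wraps the same bytes in a boundary and part headers.
multipart_body = encode_multipart_file("file", "Test.xml", xml, "boundary123")

print(parses_as_xml(raw_body))        # True
print(parses_as_xml(multipart_body))  # False
```

If the client has to send form-data, the DRF documentation suggests MultiPartParser rather than FileUploadParser, since it strips the multipart wrapping before the file reaches the view.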

Related

python flask sending raw http response with jpeg encoded image

I'm trying to send raw response (which is a jpeg image from my laptop camera) in python using flask, this is the particular snippet:
@app.route("/")
def stream():
    frm = imencode('.jpg',cap.read()[1])[1].tobytes()
    resp = Response()
    resp.set_data(value="HTTP/1.1 200 Ok\nContent-Type:image/jpeg\nContent-Length:"+str(len(frm))+"\n\n"+str(frm))
    return resp
My browser seems to display it as text nonetheless. If I initialize the response as
Response(frm.tobytes(), headers={"Content-Type":"image/jpeg"})
then the browser decodes the image fine. I'm not very good with web stuff, but from what I've found so far, a response starts with a line specifying the HTTP version, the status code and the corresponding message. Then come the header fields, each on a new line, and then an empty line separates the metadata from the body. I've read that the bare minimum for a simple response is the Content-Type and Content-Length headers. Some sources also mention using \r in combination with \n to separate the lines, but I haven't found an example so far, and this source didn't specify where exactly \r should be added.
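For reference, here is a minimal standard-library sketch of the raw layout described above: a status line, header lines, an empty line, then the body, with every line before the body ending in \r\n. The function name and the stand-in image bytes are hypothetical. Note that Flask's Response.set_data() only sets the body, so a status line passed to it ends up inside the body, and str() on a bytes object produces its "b'...'" representation rather than the image data.

```python
def build_raw_response(body, content_type="image/jpeg"):
    """Assemble a raw HTTP/1.1 response as bytes, using CRLF line endings."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # the empty line that separates headers from the body
    ).encode("ascii")
    # Concatenate bytes with bytes; the body is never converted to str.
    return head + body


# Stand-in bytes for illustration, not a real JPEG image.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 8

raw = build_raw_response(fake_jpeg)
head, _, payload = raw.partition(b"\r\n\r\n")

print(head.decode("ascii"))
print(payload == fake_jpeg)  # True
```

Inside a Flask view this assembly is normally left to the framework, which is why passing the bytes and a Content-Type header to Response works.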

Header info being written into file when PUT-ing to a Webdav server

I have the following code:
r = requests.put(
    config.get('webdav', 'url') + file_name,
    auth=(
        config.get('webdav', 'username'),
        config.get('webdav', 'password')
    ),
    files={
        "files": open(os.path.expanduser(charges_file_path), 'rb')
    }
)
Which is fairly straightforward. It simply calls a PUT request to a webdav server, and pushes the data that is in files (plain text) to the server.
It works, except for a strange (or maybe not so strange if I am just missing something small) issue. When I do a GET on the file, or the file is viewed on the server directly, the file itself contains header information:
--55e72d74a10b423590cd4faa68212192
Content-Disposition: form-data; name="files"; filename="test_file6.txt"
(file_data)
--55e72d74a10b423590cd4faa68212192--
I haven't been able to find a reason or way around this. When I cURL the file from command line, it works fine.
Any ideas?
I am not really familiar with how Python requests works, but after reading through some docs and finding a similar issue someone had with sending files to Zendesk (this post), you might want to try using the data (or json) parameter instead of files in your request. You could also pass the filename through params, as in the post I linked, if that applies here.
Another thing to do would be to put a Content-Type header on this request.
For example:
requests.put(
    ...,
    headers={'Content-Type': 'application/binary'},
    data=open(os.path.expanduser(charges_file_path), 'rb').read()
)
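The reason this helps is that files= makes requests build a multipart/form-data body (hence the boundary and Content-Disposition lines stored in the file), while data= sends the bytes as they are. As a rough standard-library analogue, the sketch below builds a PUT request whose body is exactly the payload, without sending anything. The URL, payload, and function name are placeholders, not values from the question.

```python
import urllib.request


def build_put_request(url, payload, content_type="application/octet-stream"):
    """Build (but do not send) a PUT request whose body is exactly `payload`."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": content_type},
        method="PUT",
    )


# Placeholder values for illustration only.
payload = b"charge 1\ncharge 2\n"
req = build_put_request("http://webdav.example.com/test_file6.txt", payload)

print(req.get_method())     # PUT
print(req.data == payload)  # True

# urllib.request.urlopen(req) would actually send the request.
```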

Cannot POST data to URL using Python's requests library from external file

I'm trying to POST data to a URL using Python's requests library.
If I try and do this by setting a multiline string variable which contains the post data in my script, everything works fine.
If I try to read in an external file with the same data in, the request fails on the application server I'm posting to, because it thinks there is invalid XML.
For example:
This works
import requests

starturl = "http://myserver.example.com/location/where/I/post"
username = "user"
password = "mypassword"
# Set the XML data
xmldata = """<?xml version="1.0" encoding="utf-8"?>
(Lots more xml)
"""
# POST the job data (post_headers is defined elsewhere in the script)
session = requests.Session()
request = session.post(starturl, auth=(username, password), data=xmldata, headers=post_headers)
Server side application processes the request just fine. However, if the only change I make is to read the xml data from an external file, this no longer works.
This does not work
xmlfile = "/path/to/my/xmldata.xml"
xmldata = open(xmlfile, 'r')
session = requests.Session()
request = session.post(starturl, auth=(username, password), data=xmldata.read(), headers=post_headers)
The server-side application then fails with:
"Data at the root level is invalid. Line 1, position 1"
When inspecting the traffic with Wireshark, I can see a difference in the request body of my POST. Three little dots appear from somewhere.
When it works:
Content-Type: application/xml
Authorization: Basic c3BvdGFkbTpQQHNzdzByZA==
<?xml version="1.0" encoding="utf-8"?>
When it fails:
Content-Type: application/xml
Authorization: Basic c3BvdGFkbTpQQHNzdzByZA==
...<?xml version="1.0" encoding="utf-8"?>
I'm not sure what's causing the 3 leading dots to appear in the request body. I've inspected the source XML file, tried stripping newlines from it. Nothing seems to do the trick?
It's impossible to tell for sure without your XML file, but you might have a BOM (byte order mark) at the beginning of it, which would explain the three unprintable bytes that Wireshark shows as dots. Microsoft tools are (in)famous for putting a BOM at the start of UTF-8 files.
You can check the first three bytes of your file for the codecs.BOM_UTF8 sequence (b'\xef\xbb\xbf') and strip it out if it's there.
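A minimal sketch of that check, assuming the file content is available as bytes (the function name is made up for illustration):

```python
import codecs


def strip_utf8_bom(data):
    """Return `data` (bytes) without a leading UTF-8 byte order mark."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Sample bytes standing in for the contents of the XML file.
with_bom = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="utf-8"?>'
without_bom = b'<?xml version="1.0" encoding="utf-8"?>'

print(strip_utf8_bom(with_bom) == without_bom)     # True
print(strip_utf8_bom(without_bom) == without_bom)  # True
```

When reading the file in text mode instead, opening it with encoding="utf-8-sig" decodes the content and drops a leading BOM if one is present.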

Python - Read headers inside controller?

I'm building a controller between a source control system and Odoo, so that a hosted source control system (such as Bitbucket or GitHub) can send payload data to Odoo as JSON. Reading the actual payload data works, but I'm struggling to read the header data inside the controller.
I need the header data to identify which system the payload came from (for example, the data structure might differ between Bitbucket and GitHub). If I could read that header, I would know which system sent the data and how to parse it properly.
So my controller looks like this:
from odoo import http
from odoo.http import request

class GitData(http.Controller):
    """Controller responsible for receiving git data."""

    @http.route(['/web/git.data'], type='json', auth="public")
    def get_git_data(self, **kwargs):
        """Get git data."""
        # How to read headers inside here??
        data = request.jsonrequest
        # do something with data
        return '{"response": "OK"}'
Now for example I can call this route with:
import requests
import json

url = 'http://some_url/web/git.data'
headers = {
    'Accept': 'text/plain',
    'Content-Type': 'application/json',
    'type': 'bitbucket'}
data = {'some': 'thing'}
r = requests.post(url, data=json.dumps(data), headers=headers)
Now it looks like the controller reads headers automatically, because it understands that the request is of JSON type. But what if I need to manually check a specific header, like headers['type'] (in my example it was bitbucket)?
I tried looking into dir(self) and dir(request), but did not see anything related to headers. Also, **kwargs is empty, so no headers there.
Note: the request object is actually:
# Thread local global request object
_request_stack = werkzeug.local.LocalStack()
request = _request_stack()
"""
A global proxy that always redirect to the current request object.
"""
# (This is taken from Odoo 10 source)
So basically it is part of werkzeug.
Maybe someone has more experience with werkzeug or controllers in general, so could point me in the right direction?
P.S. I also did not find any example in Odoo itself that reads headers the way I want. It looks like the only place headers are used (actually set rather than read) is after the fact, when building the response.
from openerp.http import request
Within the controller method handling your specific path, you can access the request headers using the code below (confirmed on Odoo 8 and Odoo 10; it probably works for Odoo 9 as well).
headers = request.httprequest.headers
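Once the headers object is available, picking the parser can be a plain lookup. The helper below is a hypothetical sketch, not part of Odoo: it accepts any mapping with a .get() method, so it can be tried with a plain dict here and called as detect_source(request.httprequest.headers) inside the controller. The header name 'type' comes from the example request in the question.

```python
def detect_source(headers, default="unknown"):
    """Return the lower-cased value of the custom 'type' header, or `default`."""
    value = headers.get("type")
    return value.lower() if value else default


# Plain dict standing in for request.httprequest.headers.
sample_headers = {
    "Accept": "text/plain",
    "Content-Type": "application/json",
    "type": "bitbucket",
}

print(detect_source(sample_headers))  # bitbucket
print(detect_source({}))              # unknown
```

One difference to keep in mind: a plain dict matches header names case-sensitively, while werkzeug's headers object looks them up case-insensitively.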

Getting isMultipartContent = false while using python poster library

I'm using the Python poster library to try to upload a form that includes an image to a servlet. Locally it runs fine, but when I deploy to App Engine, the servlet doesn't recognize the request as multipart content.
ServletFileUpload.isMultipartContent(request) returns false
Here's how I'm using the poster library:
register_openers()
datagen, headers = multipart_encode({"image": open(filename)})
request = urllib2.Request(url, datagen, headers)
The servlet checks to make sure it is Multipart, but it fails that check. What can I do to further debug?
Thanks,
jean
*******update*********
Printing out the stack trace, here's what I get. It complains that the content type header is null:
org.apache.commons.fileupload.FileUploadBase$InvalidContentTypeException: the request doesn't contain a multipart/form-data or multipart/mixed stream, content type header is null
at org.apache.commons.fileupload.FileUploadBase$FileItemIteratorImpl.<init>(FileUploadBase.java:885)
at org.apache.commons.fileupload.FileUploadBase.getItemIterator(FileUploadBase.java:331)
at org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:349)
at org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:126)
If you're on Windows (or a pedant;-), open(filename) is the wrong way to open a binary file and might mess things up -- use open(filename, 'rb'). Apart from that, assuming of course that you continue with a urllib2.urlopen(request) which you've omitted, that your imports are correct, and that filename and url are properly set previously, then your code seems legit.
