Parse HTTPRequest body from multipart/form-data in Python - python

I receive a Response from the server with the following body:
body='------WebKitFormBoundarylY6hpxLHtLTD33AY\r\nContent-Disposition: form-data; name="file"; filename="language.py"\r\nContent-Type: text/x-python-script\r\n\r\n#!/usr/bin/env python\n
.....
.....
\r\n------WebKitFormBoundarylY6hpxLHtLTD33AY--\r\n'
I want to parse this body and extract the name, filename, content type, and full content of the file so that I can store it.
Is this possible?
Thanks in advance.

Tornado should parse this for you; the contents will be available in self.request.files.
http://www.tornadoweb.org/en/stable/httpserver.html#tornado.httpserver.HTTPRequest.files
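In a Tornado handler, each entry in self.request.files[name] exposes filename, content_type and body. If you only have the raw body and want to see roughly what that parsing involves, here is a minimal sketch using the standard library's email package. It assumes you have the body as bytes and the request's Content-Type header, which carries the boundary. parse_multipart and the sample file content are illustrative, not part of Tornado.

```python
from email.parser import BytesParser
from email.policy import HTTP


def parse_multipart(body, content_type):
    """Split a multipart/form-data body into a list of part dictionaries.

    body is the raw request body (bytes); content_type is the value of the
    request's Content-Type header, including the boundary parameter.
    """
    # The email parser needs the Content-Type header to learn the boundary,
    # so prepend it to the body as if it were a MIME message.
    raw = (
        b"Content-Type: " + content_type.encode("ascii") + b"\r\n"
        b"MIME-Version: 1.0\r\n\r\n" + body
    )
    message = BytesParser(policy=HTTP).parsebytes(raw)

    parts = []
    for part in message.iter_parts():
        parts.append({
            "name": part.get_param("name", header="content-disposition"),
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            "content": part.get_payload(decode=True),  # bytes
        })
    return parts


# Sample body modelled on the one in the question (file content shortened).
body = (
    b'------WebKitFormBoundarylY6hpxLHtLTD33AY\r\n'
    b'Content-Disposition: form-data; name="file"; filename="language.py"\r\n'
    b'Content-Type: text/x-python-script\r\n\r\n'
    b'#!/usr/bin/env python\nprint("hi")\n'
    b'\r\n------WebKitFormBoundarylY6hpxLHtLTD33AY--\r\n'
)
# The header's boundary has two fewer leading dashes than the delimiter lines.
content_type = "multipart/form-data; boundary=----WebKitFormBoundarylY6hpxLHtLTD33AY"

parts = parse_multipart(body, content_type)
for p in parts:
    print(p["name"], p["filename"], p["content_type"], len(p["content"]))
```

The email parser is written for mail rather than HTTP uploads, so treat this as a way to inspect a body, not as a replacement for the framework's own parser.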

Related

Http request boundaries getting caught in pdf decoding process

I'm trying to decode a base64-encoded PDF file and send it to another endpoint.
I used a Python policy for the decoding part; here's the code:
import base64
pdfB64 = flow.getVariable("request.content")
pdfFile = base64.b64decode(pdfB64)
flow.setVariable("pdfFileDecoded",pdfFile)
Here is the HTTP POST request I send:
headers :
Accept : */*
boundary : --Boundaryy
--Boundaryy
Content-Disposition: form-data; name="testdu12janvier"; filename="testdu12janvier.pdf"
Content-Type: application/pdf
<< Here is sensitive data, which is basically a base64-encoded PDF file >>
--Boundaryy--
When I send this POST request and trace it in Apigee Edge, I notice that something else is encoded before the PDF file; I think it's either the boundary or one of the headers. This produces a corrupt PDF file that can't be read.
How do I isolate the PDF file from the request body without removing the boundaries? I'll need to send multiple files in the near future.

Python requests : trying to understand form data

I am new to Requests in Python, and I'm trying to understand what data I send in the request and what I get back.
First, to understand this better, I used the network inspector in Chrome and uploaded a file on the website I'm going to send requests to later (the ultimate goal is to upload my file with Requests).
It starts by opening a modal window with parameters, so I'm guessing that in Python it is something as simple as this:
url = 'myurl'
params = {'whatever params i need'}
export = s.get(url, params=params)
If I print the status_code of this, I get 200, so I'm guessing it's fine up to that point.
Then it sends a POST to the URL without any parameters but with data, like this (in Python):
url = 'myurl'
data= {'confused'}
export = s.post(url, data=data)
Here is where I'm getting a little confused. In the network inspector, the data sent looks like this:
------WebKitFormBoundaryf2WTKCh05lDGbAAG
Content-Disposition: form-data; name="form[_token]"
Kmzz8c_N9qfuo8AZ1Pd1OFgaYzE9AFtitmaLkg0-y_g
------WebKitFormBoundaryf2WTKCh05lDGbAAG
Content-Disposition: form-data; name="form[importModule]"; filename="myfile.xml"
Content-Type: text/xml
------WebKitFormBoundaryf2WTKCh05lDGbAAG--
What does all this mean? How am I supposed to write this in Python? I'm guessing that "Kmzz8c_N9qfuo8AZ1Pd1OFgaYzE9AFtitmaLkg0-y_g" is the token, but how do I get it in the first place?
Thank you for your help and time!
You seem to be confused about "parameters" (query string parameters, "GET parameters", in any case the thing you use params= for in Requests) and form data.
What you see in the network inspector in the POST request is the form data (in particular, multipart/form-data data). If you inspect the form in the modal window, you'll probably find a hidden field with name="form[_token]", and a file field with name="form[importModule]".
To emulate that POST (with a file upload) with Requests, you'd do something like
s.post(
    url="...",
    data={
        "form[_token]": "....",
    },
    files={
        "form[importModule]": open("some_file.xlsx", "rb"),
    },
)
To actually get the value for _token, you'd probably need to parse the response from the first GET request you do.
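As a rough sketch of that parsing step, the standard library's html.parser can collect the hidden fields from the HTML returned by the first GET. The sample HTML below is made up to match the field names from the question, and HiddenFieldParser and extract_hidden_fields are hypothetical helper names; the real page may well be structured differently.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> fields."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.hidden_fields[attributes["name"]] = attributes.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.hidden_fields


# Assumed shape of the modal's form; in practice this would be the text of
# the response to the first GET request.
sample_html = """
<form method="post" enctype="multipart/form-data">
  <input type="hidden" name="form[_token]" value="Kmzz8c_N9qfuo8AZ1Pd1OFgaYzE9AFtitmaLkg0-y_g">
  <input type="file" name="form[importModule]">
</form>
"""

fields = extract_hidden_fields(sample_html)
print(fields["form[_token]"])
```

The resulting dictionary could then be passed as data= in the POST, alongside files=, using the same session so that any cookies tied to the token are sent too.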

How to use WebKitFormBoundary data as your payload when posting to the server with python?

Hopefully, my question makes sense, but I'll try to explain it better here.
So, this is the post request data that was sent to the server when I analyzed the post request headers:
------WebKitFormBoundaryq4q6NLNtlzAsbRBY
Content-Disposition: form-data; name="form_type"
product
------WebKitFormBoundaryq4q6NLNtlzAsbRBY
Content-Disposition: form-data; name="utf8"
✓
------WebKitFormBoundaryq4q6NLNtlzAsbRBY
Content-Disposition: form-data; name="id"
36110014939287
------WebKitFormBoundaryq4q6NLNtlzAsbRBY
Content-Disposition: form-data; name="add"
I have two issues here. First, I am trying to send this data as the payload of a POST request, as a dictionary, but I'm not sure how to do that because I've never seen anything like this before.
Second, I see there is a hidden value for the "utf8" name, so how would I go about decoding that value and converting it back to a string?
Again, hopefully this makes sense and I'm sorry if it doesn't - I will do my best to respond to any follow up questions.
Thanks!
If you are using Ajax, set contentType: false,
or
use enctype="application/x-www-form-urlencoded" in the form:
<form enctype="application/x-www-form-urlencoded">
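On the Python side, the same fields can be written as a plain dictionary. The sketch below uses only the standard library to show how that dictionary looks once it is URL-encoded, and how the "utf8" value round-trips: the check mark is just a UTF-8 character that gets percent-encoded. Only the three fields whose values appear in the question are included; the value of "add" is not shown there, so it is left out.

```python
from urllib.parse import parse_qs, urlencode

# Field names and values copied from the request shown in the question.
payload = {
    "form_type": "product",
    "utf8": "✓",
    "id": "36110014939287",
}

# application/x-www-form-urlencoded representation of the payload.
encoded = urlencode(payload)
print(encoded)

# Decoding gives the original strings back (each value arrives as a list).
decoded = parse_qs(encoded)
print(decoded["utf8"][0])
```

With Requests, passing a dictionary like this as data= sends it URL-encoded, so there is usually no need to build the WebKitFormBoundary text by hand.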

Python remove multipart header content

When I upload a CSV file and read it, the multipart header content is also added to the CSV, at the top and at the end of the file:
----------------------------1323424324242342
Content-Disposition: form-data; name="file"; filename="test.csv"
Content-Type: text/csv
<actual content here>
----------------------------113131313331313133--
How do I get the actual content in the file and ignore the multipart headers?

Python requests, how to know what format data is in?

How do I find out what format the data I am requesting is in?
The data can be found at the following address: https://api.coinmarketcap.com/v1/ticker/
Thank you :)
The response headers for this request include
content-type: application/json
as your browser will tell you. So it is JSON.
The Content-Type header field in the response is what you're looking for.
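As a small sketch of acting on that header, the helper below picks a decoder based on the Content-Type value. decode_body is a hypothetical name and the sample body is invented for illustration; with Requests you would read the header from response.headers["Content-Type"] and could call response.json() directly once you know the body is JSON.

```python
import json
from email.message import EmailMessage


def decode_body(content_type_header, body):
    """Decode raw response bytes according to the Content-Type header value."""
    # EmailMessage is used here only to parse the header's media type
    # and optional charset parameter.
    msg = EmailMessage()
    msg["Content-Type"] = content_type_header
    media_type = msg.get_content_type()
    charset = msg.get_content_charset() or "utf-8"

    text = body.decode(charset)
    if media_type == "application/json" or media_type.endswith("+json"):
        return json.loads(text)
    return text  # fall back to plain text for other media types


# Invented sample body; the real API response has more fields.
sample_body = b'[{"id": "bitcoin", "symbol": "BTC"}]'

data = decode_body("application/json; charset=utf-8", sample_body)
print(type(data).__name__, data[0]["symbol"])
```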
