Submit multipart/form-data using mechanize python? - python

I'm trying to make a multipart/form-data POST request using mechanize. Here is what the request looks like in Firefox's Live HTTP Headers when I submit the form in the browser:
http://example.com/new/example
POST /new/example HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Referer: http://example.com/new/example
Cookie: tmgioct=c32MbAGn1sTuZrH8etPqVNU5; __qca=P0-495598852-1339139301054; __utma=189990958.911848588.1339139302.1339556345.1339561805.32; __utmz=189990958.1339139302.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); logged_in=1; tog_appearance_fieldset=fieldset_open; __utmc=189990958; pfu=42375294; pfp=h2YrFoaTr5LtrVys8PMmKNdyuoeA9FNLakxGzrJK; pfe=1371048319; __utmb=189990958.5.10.1339561805
Content-Type: multipart/form-data; boundary=---------------------------41184676334
Content-Length: 2947
-----------------------------41184676334
Content-Disposition: form-data; name="UPLOAD_IDENTIFIER"
0ad3af1c502c7cb59577b01720ee58ff014810c4
-----------------------------41184676334
Content-Disposition: form-data; name="post[state]"
2
-----------------------------41184676334
blahblahblahblah....
-----------------------------41184676334--
And here's my code:
import urllib2
import mechanize
browser = mechanize.Browser()
url = "http://example.com/new/example"
header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 5.1; rv:13.0) Gecko/20100101 Firefox/13.0',
    'Referer': 'http://example.com/new/example',
    'Content-Type': 'multipart/form-data; boundary=---------------------------41184676334'
}
data = '-----------------------------41184676334\rContent-Disposition: form-data; name="UPLOAD_IDENTIFIER"\r\r0ad3af1c502c7cb59577b01720ee58ff014810c4\r-----------------------------41184676334\rContent-Disposition: form-data; name="post[state]"\r\r2\r-----------------------------41184676334\rblahblahblahblah....\r\r-----------------------------41184676334--\r'
req = urllib2.Request(url, data, header)
response = browser.open(req, timeout=30)
response.close()
I don't know why it does not work. Does anybody know? Please help me out.
By the way, does it have something to do with the boundary? I used random digits for it in the code above.

From the MIME media types RFC 2046:
The canonical form of any MIME "text" subtype MUST always represent a
line break as a CRLF sequence.
Your code uses carriage returns ('\r') only; you need to add line feeds ('\n') as well, so that every line break is '\r\n'.
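To illustrate the CRLF point, here is a minimal sketch that assembles a multipart/form-data body by hand with '\r\n' line endings. The helper name build_multipart_body is hypothetical, the field names and values are the ones from the question, and generating the boundary with uuid is my own assumption; this is not what mechanize does internally.

```python
import uuid


def build_multipart_body(fields, boundary):
    """Join simple text fields into a multipart/form-data body using CRLF."""
    crlf = "\r\n"
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)  # delimiter = "--" + boundary parameter
        lines.append('Content-Disposition: form-data; name="%s"' % name)
        lines.append("")  # blank line separates the part headers from the value
        lines.append(value)
    lines.append("--" + boundary + "--")  # closing delimiter
    lines.append("")  # so that the body ends with a final CRLF
    return crlf.join(lines).encode("utf-8")


boundary = uuid.uuid4().hex
fields = {
    "UPLOAD_IDENTIFIER": "0ad3af1c502c7cb59577b01720ee58ff014810c4",
    "post[state]": "2",
}
body = build_multipart_body(fields, boundary)
headers = {
    "Content-Type": "multipart/form-data; boundary=" + boundary,
    "Content-Length": str(len(body)),
}
```

Note that the boundary in the Content-Type header has no leading "--"; the two extra hyphens only appear on the delimiter lines in the body.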

browser.form.enctype = "application/x-www-form-urlencoded"

I ended up using the requests module for this task. It turned out to be more convenient and reliable.
For details, see the "POST a Multipart-Encoded File" section of the requests documentation.

Related

What is the purpose of the header element 'x-instagram-ajax' in API calls via Python to Instagram?

On the instagram login page, if one inspects the element of the POST call for the url 'https://www.instagram.com/accounts/web_create_ajax/', it lists the following as headers:
Host: www.instagram.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:61.0) Gecko/20100101 Firefox/61.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://www.instagram.com/
X-CSRFToken: 7dmO9F3JuVGvSXumd79yByPxnHoWHz1A
X-Instagram-AJAX: c2d8f4380025
Content-Type: application/x-www-form-urlencoded
X-Requested-With: XMLHttpRequest
Content-Length: 102
Cookie: csrftoken=7dmO9F3JuVGvSXumd79yByPxnHoWHz1A; mid=W30zsQAEAAErXHJ3iUojfTceCd53; mcd=3; csrftoken=7dmO9F3JuVGvSXumd79yByPxnHoWHz1A; rur=FTW
Connection: keep-alive
I am wondering if anyone would have any idea what X-Instagram-AJAX is and how I can generate it each time. Is it connected as a pair with X-CSRFToken? Thanks.
Follow, like, and similar requests work without this header. I don't know exactly what it is, but I think Instagram uses it to detect suspicious requests and log them. The value appears on any Instagram page, so you can parse it from there and reuse it.

How to make post request to Content-Type text/x-gwt-rpc; charset=utf-8

I am a beginner in Python. I would like to parse a website, but the request headers show the content type text/x-gwt-rpc; charset=utf-8 and the request payload is:
7|0|4|https://kekeke.cc/com.liquable.hiroba.home.gwt.HomeModule/|53263EDF7F9313FDD5BD38B49D3A7A77|com.liquable.hiroba.gwt.client.square.IGwtSquareService|getNoOfCrowd|1|2|3|4|0|
Request:
POST /com.liquable.hiroba.gwt.server.GWTHandler/squareService HTTP/1.1
Host: kekeke.cc
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0
Accept: */*
Accept-Language: zh-TW,zh;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Referer: https://kekeke.cc/
Content-Type: text/x-gwt-rpc; charset=utf-8
X-GWT-Permutation: 8F22796231EB8C8312C5D1BB10451262
X-GWT-Module-Base: https://kekeke.cc/com.liquable.hiroba.home.gwt.HomeModule/
Content-Length: 177
DNT: 1
Connection: keep-alive
Can anyone tell me how to make this POST request in Python?
I found the solution. It can be solved by simply passing the payload as a string:
r = requests.post(url, "7|0|4|https://kekeke.cc/com.liquable.hiroba.home.gwt.HomeModule/|53263EDF7F9313FDD5BD38B49D3A7A77|com.liquable.hiroba.gwt.client.square.IGwtSquareService|getNoOfCrowd|1|2|3|4|0|", headers=headers)
Many online tutorials only show POST requests where the data is a dictionary such as {key: value}. When the server expects a raw body, as it does here, the data can be submitted as a string instead.
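As a sketch of the same idea without third-party packages, the standard library's urllib.request can also carry a pre-serialized string as the body. I use it here only as a stand-in for requests; the request is constructed but not sent, the URL, payload, and headers are copied from the question, and whether the server accepts it is not verified.

```python
import urllib.request

url = "https://kekeke.cc/com.liquable.hiroba.gwt.server.GWTHandler/squareService"
payload = (
    "7|0|4|https://kekeke.cc/com.liquable.hiroba.home.gwt.HomeModule/|"
    "53263EDF7F9313FDD5BD38B49D3A7A77|"
    "com.liquable.hiroba.gwt.client.square.IGwtSquareService|"
    "getNoOfCrowd|1|2|3|4|0|"
)
headers = {
    "Content-Type": "text/x-gwt-rpc; charset=utf-8",
    "X-GWT-Permutation": "8F22796231EB8C8312C5D1BB10451262",
    "X-GWT-Module-Base": "https://kekeke.cc/com.liquable.hiroba.home.gwt.HomeModule/",
}

# The body is passed as bytes, exactly as serialized; no form encoding is applied.
request = urllib.request.Request(
    url, data=payload.encode("utf-8"), headers=headers, method="POST"
)

# To actually send it (network access required):
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.read().decode("utf-8"))
```

The encoded payload is 177 bytes long, which matches the Content-Length in the captured request.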

Differences in sending a multipart/form-data post via requests

I've got a problem posting a file to a server. I'm writing a file upload script, and this server is very sensitive to the correctness of the POST request.
I inspected the page that sends the file to the server, and the browser sends this (TextView):
POST http://example.com/post HTTP/1.1
Host: example.com
Connection: keep-alive
Content-Length: 20625
Accept: application/json, text/javascript, */*; q=0.01
Origin: http://example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.104 Safari/537.36
Content-Type: multipart/form-data; boundary=----WebKitFormBoundarykGHBkXoER9gNuVna
Referer: http://example.com/foo
Accept-Encoding: gzip, deflate
Accept-Language: pl-PL,pl;q=0.8,en-US;q=0.6,en;q=0.4,pt;q=0.2
------WebKitFormBoundarykGHBkXoER9gNuVna
Content-Disposition: form-data; name="files[]"; filename="file.zip"
Content-Type: application/octet-stream
...raw file data...
------WebKitFormBoundarykGHBkXoER9gNuVna--
However, my script is sending this (TextView):
POST http://example.com/post HTTP/1.1
Host: example.com
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.18.1
Content-Length: 20604
--f8c266cf436941019c5a80c7d4779a57
Content-Disposition: form-data; name="files[]"; filename="file.zip"
Content-Type: application/zip
...raw file data...
--f8c266cf436941019c5a80c7d4779a57--
This causes an error on the server. Additional note: the error started when I changed files=files to data=files.
Current Code:
files = MultipartEncoder({'files[]': (filename, open(local_path,'rb'), mimetype)})
UploadFile = requests.post(self.UploadURL, data=files, allow_redirects=False)
Working code:
files = {'files[]': (filename, open(local_path,'rb'), mimetype)}
UploadFile = requests.post(self.UploadURL, files=files, allow_redirects=False)
I'm using MultipartEncoder so that I can send huge files.
I see that the biggest mismatch is the boundary, but why is this boundary generated in the working code and not in the current code?
How can I fix that?
You are not setting the Content-Type header. The MultipartEncoder provides it for you:
files = MultipartEncoder({'files[]': (filename, open(local_path,'rb'), mimetype)})
UploadFile = requests.post(
    self.UploadURL, data=files, allow_redirects=False,
    headers={'Content-Type': files.content_type})
The header must come from the multipart encoder, because the encoder is responsible for picking the boundary used to delineate the various MIME parts in the multipart request body. In your upload, the delimiter line is:
--f8c266cf436941019c5a80c7d4779a57
but the boundary is generated at random each time your code runs. The delimiter is the boundary value prefixed with two hyphens, so the header provided would look like:
Content-Type: multipart/form-data; boundary=f8c266cf436941019c5a80c7d4779a57
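The relationship between the header and the body can be checked with a small sketch. It uses only string handling; the function name boundary_matches_body is hypothetical, and the sample values are the ones from the browser capture in the question.

```python
def boundary_matches_body(content_type, body):
    """Return True if the body is delimited by the boundary named in the header."""
    marker = "boundary="
    if marker not in content_type:
        return False  # no boundary parameter, so the parts cannot be split
    boundary = content_type.split(marker, 1)[1].split(";", 1)[0].strip().strip('"')
    opening = "--" + boundary         # every part starts with "--" + boundary
    closing = "--" + boundary + "--"  # the last delimiter has a trailing "--"
    return body.startswith(opening) and body.rstrip("\r\n").endswith(closing)


body = (
    "------WebKitFormBoundarykGHBkXoER9gNuVna\r\n"
    'Content-Disposition: form-data; name="files[]"; filename="file.zip"\r\n'
    "Content-Type: application/octet-stream\r\n"
    "\r\n"
    "...raw file data...\r\n"
    "------WebKitFormBoundarykGHBkXoER9gNuVna--\r\n"
)

with_boundary = "multipart/form-data; boundary=----WebKitFormBoundarykGHBkXoER9gNuVna"
without_boundary = "multipart/form-data"

print(boundary_matches_body(with_boundary, body))     # True
print(boundary_matches_body(without_boundary, body))  # False
```

The header carries four hyphens and the body six, because the delimiter lines add "--" in front of the boundary value.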

Send HTTP Post with Python

I want to write a program that can send HTTP POST requests and handle the responses.
So, I want to send this POST:
POST https://example.com/index.php?s=&&app=box&module=ajax&section=coreAjax&secure_key=&type=submit&lastid=87311&global=1 HTTP/1.1
Host: example.com
Connection: keep-alive
Content-Length: 10
Accept: text/javascript, text/html, application/xml, text/xml, */*
X-Prototype-Version: 1.7.2
Origin: https://example.com
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
Content-type: application/x-www-form-urlencoded; charset=UTF-8
Referer: https://x.com/
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.8
Cookie: cookieconsent_status=dismiss;
And then enter the request body:
message= # Which I will make: "message= %s" % (messagex)
But I do not know how to send them and can't seem to find any way online. Could someone help, please?
The main parts are:
import requests  # third-party library; install it with pip, for example
# define your custom headers (as many as you want)
headers = {
    'X-Prototype-Version': '1.7.2'
}
# define your URL params (these go in the query string, not in the body of the POST request)
params = {
    'your_first_param': 'its_value',
    'your_second_param': 'its_value'
}
# define the body of the POST request
data = {
    'message': 'your message'
}
# send the POST request
response = requests.post('https://example.com/index.php', params=params, data=data, headers=headers)
# here is the response
print(response.text)
Hope that helps.
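To see where params and data end up, the sketch below encodes both with the standard library instead of sending anything. The parameters are the non-empty ones from the URL in the question; the message text is a placeholder of mine.

```python
from urllib.parse import urlencode

base_url = "https://example.com/index.php"

# Query-string parameters: these become part of the URL.
params = {
    "app": "box",
    "module": "ajax",
    "section": "coreAjax",
    "type": "submit",
    "lastid": "87311",
    "global": "1",
}

# Form fields: these are sent in the request body as
# application/x-www-form-urlencoded.
data = {"message": "hello world"}

full_url = base_url + "?" + urlencode(params)
body = urlencode(data)

print(full_url)
print(body)  # message=hello+world
```

Passing a dictionary as data lets the library handle the escaping, which is safer than building the body with string formatting.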

Python Mechanize Prevent Connection:Close

I'm trying to use mechanize to get information from a web page. It's basically succeeding in getting the first bit of information, but the web page includes a button for "Next" to get more information. I can't figure out how to programmatically get the additional information.
By using Live HTTP Headers, I can see the http request that is generated when I click the next button within a browser. It seems as if I can issue the same request using mechanize, but in the latter case, instead of getting the next page, I am redirected to the home page of the website.
Obviously, mechanize is doing something different than my browser is, but I can't figure out what. In comparing the headers, I did find one difference, which was the browser used
Connection: keep-alive
while mechanize used
Connection: close
I don't know if that's the culprit, but when I tried to add the header ('Connection','keep-alive'), it didn't change anything.
[UPDATE]
When I click the button for "page 2" within Firefox, the generated http is (according to Live HTTP Headers):
GET /statistics/movies/ww_load/the-fast-and-the-furious-6-2012?authenticity_token=ItU38334Qxh%2FRUW%2BhKoWk2qsPLwYKDfiNRoSuifo4ns%3D&facebook_fans_page=2&tbl=facebook_fans&authenticity_token=ItU38334Qxh%2FRUW%2BhKoWk2qsPLwYKDfiNRoSuifo4ns%3D HTTP/1.1
Host: www.boxoffice.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0
Accept: text/javascript, text/html, application/xml, text/xml, */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
X-Requested-With: XMLHttpRequest
X-Prototype-Version: 1.6.0.3
Referer: http://www.boxoffice.com/statistics/movies/the-fast-and-the-furious-6-2012
Cookie: __utma=179025207.1680379428.1359475480.1360001752.1360005948.13; __utmz=179025207.1359475480.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __qca=P0-668235205-1359475480409; zip=13421; country_code=US; _boxoffice_session=2202c6a47fc5eb92cd0ba57ef6fbd2c8; __utmc=179025207; user_credentials=d3adbc6ecf16c038fcbff11779ad16f528db8ebd470befeba69c38b8a107c38e9003c7977e32c28bfe3955909ddbf4034b9cc396dac4615a719eb47f49cc9eac%3A%3A15212; __utmb=179025207.2.10.1360005948
Connection: keep-alive
When I try to request the same url within mechanize, it looks like this:
GET /statistics/movies/ww_load/the-fast-and-the-furious-6-2012?facebook_fans_page=2&tbl=facebook_fans&authenticity_token=ZYcZzBHD3JPlupj%2F%2FYf4dQ42Kx9ZBW1gDCBuJ0xX8X4%3D HTTP/1.1
Accept-Encoding: identity
Host: www.boxoffice.com
Accept: text/javascript, text/html, application/xml, text/xml, */*
Keep-Alive: 115
Connection: close
Cookie: _boxoffice_session=ced53a0ca10caa9757fd56cd89f9983e; country_code=US; zip=13421; user_credentials=d3adbc6ecf16c038fcbff11779ad16f528db8ebd470befeba69c38b8a107c38e9003c7977e32c28bfe3955909ddbf4034b9cc396dac4615a719eb47f49cc9eac%3A%3A15212
Referer: http://www.boxoffice.com/statistics/movies/the-fast-and-the-furious-6-2012
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1
--
Daryl
The server was checking X-Requested-With and/or X-Prototype-Version, so adding those two headers to the mechanize request fixed it.
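A minimal sketch of adding those two headers, using the standard library's urllib.request as a stand-in for mechanize so that it runs without extra packages. The header values come from the Firefox capture above, the session-specific authenticity_token is left out of the URL, and the request is only built, not sent.

```python
import urllib.request

url = (
    "http://www.boxoffice.com/statistics/movies/ww_load/"
    "the-fast-and-the-furious-6-2012"
    "?facebook_fans_page=2&tbl=facebook_fans"
)

request = urllib.request.Request(url)
# Headers that mark the request as an XMLHttpRequest issued by Prototype.js.
request.add_header("X-Requested-With", "XMLHttpRequest")
request.add_header("X-Prototype-Version", "1.6.0.3")
request.add_header(
    "Referer",
    "http://www.boxoffice.com/statistics/movies/the-fast-and-the-furious-6-2012",
)

# urllib normalizes header names with str.capitalize(), so compare case-insensitively.
sent = {name.lower(): value for name, value in request.header_items()}
print(sent["x-requested-with"])  # XMLHttpRequest
```

With mechanize itself, the usual equivalent should be to append the same (name, value) pairs to browser.addheaders before opening the URL.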
Maybe a little late with an answer, but I fixed this by changing a line in _urllib2_forked.py.
Line 1098 reads: headers["Connection"] = "Close"
Change this to:
if 'Connection' not in headers:
    headers["Connection"] = "Close"
Then make sure you set the header in your script, and it will work.
Gr. Squandor
