I already have binary data read from a file. Most of the examples I see online link directly to the file and upload the whole file. I want to know how to upload binary data that I already have from another source via HTTP POST in Python.
Alternatively:
import urllib2

# `data` holds the binary blob you already read elsewhere
req = urllib2.Request("http://example.com", data, {'Content-Type': 'application/octet-stream'})
urllib2.urlopen(req)
That also shows how you can specify the Content-Type of the data.
I'm not sure which online examples you're looking at, but urllib2.urlopen takes the data to post as a chunk of bytes, not as a file at all.
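For completeness, the same idea works with the requests library; here is a minimal sketch assuming the binary payload is already in memory (binary_data is a placeholder name):

import requests

binary_data = b'\x00\x01\x02'  # stand-in for the bytes you already read elsewhere
response = requests.post('http://example.com',
                         data=binary_data,
                         headers={'Content-Type': 'application/octet-stream'})
print(response.status_code)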
I found out that requests library can upload file on website by POST request (below is an example from the documentation)
url = 'https://httpbin.org/post'
files = {'file': open('report.xls', 'rb')}
r = requests.post(url, files=files)
But I don't totally understand how to apply this to my issue. I have the website https://smallpdf.com/excel-to-pdf (just an example, the site may be different) and I need to upload an Excel file and get back the converted one using the requests library. I would appreciate an explanation of how to correctly make a POST request based on the developer tools in the browser, what arguments to pass, and so on.
THANKS!
Most online converters may not allow bots onto their sites.
Instead, use an online API that supports this to do your task. Here you can use ConvertApi. It is documented here.
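As a rough sketch of what such an API upload usually looks like with requests (the URL, field name, and secret parameter below are hypothetical placeholders; take the real ones from the ConvertApi documentation):

import requests

# Hypothetical endpoint and parameter names; consult the ConvertApi docs
url = 'https://example-convert-api.com/convert/xlsx/to/pdf'
with open('report.xls', 'rb') as f:
    r = requests.post(url, files={'file': f}, params={'secret': 'YOUR_API_KEY'})

# Save the converted document returned in the response body
with open('report.pdf', 'wb') as out:
    out.write(r.content)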
I have the basic code (from https://docs.python.org/2/howto/urllib2.html):
import urllib2
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
the_page = response.read()
I would like to get the size of the entire request and the size of the entire response. Is there any way?
(haven't seen one for urllib2 or for requests)
"entire" - means including headers and any meta-data that might be sent with it.
Thanks.
res.headers may or may not contain a Content-Length field provided by the server. If it does, int(res.headers['content-length']) will give you that information.
A very simple implementation of an HTTP stream might not provide this information at all, so you don't know it until you reach EOF.
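As a rough sketch of estimating both sizes with the requests library (this sums the header names, values, and body bytes; it ignores the status and request lines and any transfer framing, so it is an approximation, not an exact wire count):

import requests

def approx_http_size(headers, body):
    # Count each header as "Name: value\r\n", then add the body length
    header_bytes = sum(len(k) + 2 + len(v) + 2 for k, v in headers.items())
    return header_bytes + len(body or b'')

r = requests.get('http://www.voidspace.org.uk')
print(approx_http_size(r.request.headers, r.request.body))  # request estimate
print(approx_http_size(r.headers, r.content))               # response estimate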
I am having a bit of trouble understanding API calls and the URLs I'm supposed to use for grabbing data from Imgur. I'm using the following URL to grab JSON data, but I'm receiving old data: http://imgur.com/r/wallpapers/top/day.json
But if I strip the .json from the end of the URL, I see the top pictures from today.
All I want is the JSON data for the top posts of today from Imgur, but I keep getting data that refers to Dec 18th, 2014.
I'm using the call in a Python script. I have a token from Imgur to do the stuff, and reading the API documentation, I see a lot of the examples start with https://api. instead of http://imgur.
Which one should I use?
It's probably due to cache control; you can set it to no-cache in your headers and send them along with your request.
Sample (I'm using requests):
import requests
r = requests.get('http://imgur.com/r/wallpapers/top/day.json',
                 headers={'Cache-Control': 'no-cache'})
# ... your stuff here ...
Imgur updated their docs, so the new and correct form of the URL I used was:
r = requests.get("https://api.imgur.com/3/gallery/r/earthporn/top/")
I have been having problems with a script I am developing whereby I am receiving no output and the memory usage of the script is getting larger and larger over time. I have figured out the problem lies with some of the URLs I am checking with the Requests library. I am expecting to download a webpage however I download a large file instead. All this data is then stored in memory causing my issues.
What I want to know is: is there any way with the requests library to check what is being downloaded? With wget I can see: Length: 710330974 (677M) [application/zip].
Is this information available in the headers with requests? If so, is there a way of terminating the download upon figuring out it is not an HTML page?
Thanks in advance.
Yes, the headers can tell you a lot about the page, most pages will include a Content-Length header.
By default, however, the response body is downloaded in its entirety before the .get() or .post(), etc. call returns. Set the stream=True keyword argument to defer downloading the response body:
response = requests.get(url, stream=True)
Now you can inspect the headers and just discard the response if you don't like what you find:
length = int(response.headers.get('Content-Length', 0))
if length > 1048576:
    print('Response larger than 1MB, discarding')
    response.close()  # releases the connection without downloading the body
Subsequently accessing the .content or .text attributes, or the .json() method will trigger a full download of the response.
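Since the original goal is to skip anything that is not an HTML page, here is a sketch that checks the Content-Type header as well (the URL is a placeholder):

import requests

response = requests.get('http://example.com/some/url', stream=True)
content_type = response.headers.get('Content-Type', '')
if content_type.startswith('text/html'):
    html = response.text  # it claims to be HTML, so downloading is safe
else:
    response.close()  # a zip or other large file; skip downloading the body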
I'm new to Python and in need of some help. My aim is to send some XML with a POST request to a URL, which is going to trigger an SMS being sent.
I have a small XML document that I want to post to the URL. Can I reference the XML document on my server in the Python code, or do I include the XML data to be sent in the actual Python code? Can anyone help me out with an example?
If you need to send XML I would recommend that you take a look at requests. It allows you to easily send data using POST requests.
You should be able to transmit the XML data directly from your Python code using requests.
xml = """my xml"""
headers = {'Content-Type': 'application/xml'}
requests.post('http://www.my-website.net/xml', data=xml, headers=headers)
You could also load the XML from a text file and send that, if you don't want to have the XML document hard-coded.
If you don't want to use an outside library, you can just use urllib2. See this answer for an example of how to do so.
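For reference, a minimal sketch of the urllib2 route (Python 2; the URL is the same placeholder as above):

import urllib2

xml = """my xml"""
req = urllib2.Request('http://www.my-website.net/xml', data=xml,
                      headers={'Content-Type': 'application/xml'})
response = urllib2.urlopen(req)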
To extract the XML from the file you just have to do
XML_STRING = open('path/to/xml_file').read()