I'm trying to write a simple app in Python 3, using Tornado for the server and requests for the client, and I'm getting some headers in 'self.request.body' that I can't get rid of. For instance, for a file containing 'blahblahblah', I get:
--cb5f6ba84bdf42d382dfd3204f6307c7\r\nContent-Disposition: form-data; name="file"; filename="1.bin"\r\n\r\nblahblahblah\n\r\n--cb5f6ba84bdf42d382dfd3204f6307c7--\r\n
Files are sent by
f = {'file': open(FILE, 'rb')}
requests.post(URL_UPLOAD, files=f)
and received by
class UploadHandler(tornado.web.RequestHandler):
    def post(self, filename):
        with open(Dir + filename, 'wb') as f:
            f.write(self.request.body)
My full code can be seen here
When I send the file by curl with curl -X POST -d $(cat ./1.bin) http://localhost:8080/upload/1.bin I get the correct file, but without \n.
There must be something I missed. Please can someone help me with that? Thank You.
There are two ways to upload files: simply using the file as the request body (usually, but not necessarily, with the HTTP PUT method), or using a multipart wrapper (usually with the HTTP POST method). If you upload the file from an HTML form, it will usually use the multipart wrapper. Your requests example is using a multipart wrapper and the curl one is not; your server is not expecting the wrapper.
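To make the difference concrete, here is a minimal sketch of what the multipart wrapper adds around a file. The helper name wrap_multipart is made up for illustration, and the boundary is simply the one from the question; real clients such as requests pick a random boundary and may add further headers to each part.

```python
def wrap_multipart(field, filename, content, boundary):
    """Wrap raw file bytes in a single-part multipart/form-data body."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + content + tail


# What a plain-body upload sends: just the file bytes.
raw_body = b"blahblahblah\n"

# What a multipart upload sends: the same bytes plus the wrapper.
wrapped_body = wrap_multipart(
    "file", "1.bin", raw_body, "cb5f6ba84bdf42d382dfd3204f6307c7"
)

print(len(raw_body), len(wrapped_body))
print(wrapped_body)
```

A server that writes the whole request body to disk will store the wrapper along with the file, which matches what the question shows.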
To use a multipart wrapper: in requests, pass files= as you've done here. With curl, see this answer: Using curl to upload POST data with files. On the server, use self.request.files instead of self.request.body: http://www.tornadoweb.org/en/stable/httpserver.html#tornado.httpserver.HTTPRequest.files
To not use the multipart wrapper, use data=open(FILE, 'rb').read() from requests, and keep the other two components the same.
It is possible to support both styles simultaneously on the server: use self.request.files when self.request.headers['Content-Type'] starts with 'multipart/form-data' (the header also carries a boundary parameter, so an exact comparison will not match) and self.request.body otherwise.
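A rough sketch of that server-side decision, written as a plain function so it can be tried without a running server. The name pick_payload and the default field name "file" are assumptions for illustration; in a Tornado handler the three arguments would come from self.request.headers.get('Content-Type', ''), self.request.body and self.request.files.

```python
def pick_payload(content_type, body, files, field="file"):
    """Return the uploaded bytes for either upload style."""
    # A multipart header looks like "multipart/form-data; boundary=...",
    # so compare only the media type, not the whole header value.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "multipart/form-data":
        # Tornado stores each field as a list of dict-like objects with
        # 'filename', 'body' and 'content_type' keys.
        return files[field][0]["body"]
    # No wrapper: the request body is the file itself.
    return body


# Simulated multipart request (shape of self.request.files in Tornado).
multipart_files = {
    "file": [
        {
            "filename": "1.bin",
            "body": b"blahblahblah\n",
            "content_type": "application/octet-stream",
        }
    ]
}
from_multipart = pick_payload(
    "multipart/form-data; boundary=cb5f6ba84bdf42d382dfd3204f6307c7",
    b"(wrapped body, ignored here)",
    multipart_files,
)

# Simulated raw-body request.
from_raw = pick_payload("application/octet-stream", b"blahblahblah\n", {})

print(from_multipart == from_raw)
```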
I'm trying to upload a secure file to my repository in GitLab.
While I am able to upload a secure file with curl, I encounter an error when using requests in Python.
my python code:
r = requests.post("https://gitlab.com/api/v4/projects/10186699/secure_files",
                  headers={"PRIVATE-TOKEN": "glpat-TH7FM3nThKmHgOp"},
                  files={"file": open("/Users/me/Desktop/dev/web-server/utils/a.txt", "r"),
                         "name": "a.txt"})
print(r.status_code, r.json())
Response:
400 {'error': 'name is invalid'}
The equivalent curl command I use that actually works:
curl --request POST --header "PRIVATE-TOKEN: glpat-TH7FM3nThKmHgOp" https://gitlab.com/api/v4/projects/10186699/secure_files --form "name=a.txt" --form "file=@/Users/me/Desktop/dev/web-server/utils/a.txt"
The equivalent call will be
import requests
resp = requests.post(
    "https://gitlab.com/api/v4/projects/10186699/secure_files",
    headers={"PRIVATE-TOKEN": "glpat-TH7FM3nThKmHgOp"},
    files={"file": open("/Users/me/Desktop/dev/web-server/utils/a.txt", "rb")},
    data={"name": "a.txt"}
)
print(resp.status_code, resp.json())
This is because the files= parameter is intended only for uploading files. name, on the other hand, is ordinary form data, so it needs to be passed in the data= parameter.
It's also recommended to open files in binary mode. (docs)
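To illustrate why the two parameters are not interchangeable, here is a small hand-rolled multipart encoder. It is only a sketch: encode_multipart and the boundary string are made up for this example, and requests builds its bodies with its own code. The point is that a plain field and a file part differ in whether the part carries a filename, and servers typically use that to tell strings from uploads, which is presumably why name was rejected when it arrived through files=.

```python
def encode_multipart(fields, files, boundary):
    """Build a multipart/form-data body from text fields and files.

    fields: dict of name -> str
    files:  dict of name -> (filename, bytes)
    """
    lines = []
    for name, value in fields.items():
        # Plain form field: no filename in the Content-Disposition header.
        lines.append(f"--{boundary}".encode("ascii"))
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"))
        lines.append(b"")
        lines.append(value.encode("utf-8"))
    for name, (filename, content) in files.items():
        # File part: the filename marks it as an upload.
        lines.append(f"--{boundary}".encode("ascii"))
        lines.append(
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode("utf-8")
        )
        lines.append(b"")
        lines.append(content)
    # Closing delimiter, followed by a final CRLF.
    lines.append(f"--{boundary}--".encode("ascii"))
    lines.append(b"")
    return b"\r\n".join(lines)


body = encode_multipart(
    fields={"name": "a.txt"},
    files={"file": ("a.txt", b"hello\n")},
    boundary="exampleboundary",
)
print(body.decode("utf-8"))
```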
I am trying to use the requests library in Python to upload a file into Fedora commons repository on localhost. I'm fairly certain my main problem is not understanding open() / read() and what I need to do to send data with an http request.
def postBinary(fileName,dirPath,url):
    path = dirPath+'/'+fileName
    print('to ' + url + '\n' + path)
    openBin = {'file':(fileName,open(path,'rb').read())}
    headers = {'Slug': fileName} #not important
    r = requests.put(url, files=openBin,headers=headers, auth=HTTPBasicAuth('username', 'pass'))
    print(r.text)
    print("and the url used:")
    print(r.url)
This will successfully upload a file in the repository, but it will be slightly larger and corrupted after. For example an image that was 6.6kb became 6.75kb and was not openable anymore.
So how should I properly open and upload a file using put in python?
Extra details:
When I replace files=openBin with data=openBin I end up with my dictionary and I presume the data as a string. I don't know if that information is helpful or not.
"file=FILE_NAME.extension&file=TYPE89a%24%02Q%03%E7%FF%00E%5B%19%FC%....
and the size of the file increases to a number of megabytes
I am using specifically put because the Fedora RESTful HTTP API end point says to use put.
The following command does work:
curl -u username:password -H "Content-Type: text/plain" -X PUT -T /path/to/someFile.jpeg http://localhost:8080/fcrepo/rest/someFile.jpeg
Updated
Using requests.put() with the files parameter sends a multipart/form-data encoded request which the server does not seem to be able to handle without corrupting the data, even when the correct content type is declared.
The curl command simply performs a PUT with the raw data contained in the body of the request. You can create a similar request by passing the file data in the data parameter. Specify the content type in the header:
headers = {'Content-type': 'image/jpeg', 'Slug': fileName}
r = requests.put(url, data=open(path, 'rb'), headers=headers, auth=('username', 'pass'))
You can vary the Content-type header to suit the payload as required.
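If you would rather not hard-code the type, one option is to guess it from the file extension with the standard library's mimetypes module. This is just a sketch; guess_content_type is a name invented here, and the fallback to application/octet-stream is my own choice for files whose type cannot be guessed.

```python
import mimetypes


def guess_content_type(filename, default="application/octet-stream"):
    """Guess a Content-Type from the file extension, with a binary fallback."""
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or default


print(guess_content_type("someFile.jpeg"))
print(guess_content_type("notes.txt"))
print(guess_content_type("mystery_blob"))
```

The result could then be used when building the headers, e.g. headers = {'Content-type': guess_content_type(fileName), 'Slug': fileName}.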
Try setting the Content-type for the file.
If you are sure that it is a text file, then try text/plain, which you used in your curl command (even though you appear to be uploading a JPEG file). For a JPEG image, however, you should use image/jpeg.
Otherwise, for arbitrary binary data, you can use application/octet-stream. The content type goes in the third element of the file tuple, shown here for a JPEG:
openBin = {'file': (fileName, open(path,'rb'), 'image/jpeg' )}
Also, it is not necessary to read the file contents explicitly in your code; requests will do that for you, so just pass the open file handle as shown above.
I have the following code:
r = requests.put(
    config.get('webdav', 'url') + file_name,
    auth=(
        config.get('webdav', 'username'),
        config.get('webdav', 'password')
    ),
    files={
        "files": open(os.path.expanduser(charges_file_path), 'rb')
    }
)
Which is fairly straightforward. It simply calls a PUT request to a webdav server, and pushes the data that is in files (plain text) to the server.
It works, except for a strange (or maybe not so strange if I am just missing something small) issue. When I do a GET on the file, or the file is viewed on the server directly, the file itself contains header information:
--55e72d74a10b423590cd4faa68212192
Content-Disposition: form-data; name="files"; filename="test_file6.txt"
(file_data)
--55e72d74a10b423590cd4faa68212192--
I haven't been able to find a reason or way around this. When I cURL the file from command line, it works fine.
Any ideas?
I am not really familiar with how Python requests works, but after reading through some docs and finding a similar issue someone had with sending files to Zendesk (this post), you might want to try using the data (or json) parameter instead of files in your request. You may also want to attach params with the filename, if that's applicable here, similar to the post I linked.
Another thing to do would be to put a Content-Type header on this request.
i.e.
requests.put(
    ...,
    headers={'Content-Type': 'application/binary'},
    data=open(os.path.expanduser(charges_file_path), 'rb').read()
)
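For comparison, here is a sketch of the same kind of raw-body PUT built with only the standard library, which makes it easy to inspect what would be sent without touching the network. The URL, the file contents and the helper name build_raw_put are placeholders for illustration, and authentication is left out.

```python
import urllib.request


def build_raw_put(url, payload, content_type="application/octet-stream"):
    """Create (but do not send) a PUT request whose body is exactly payload."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": content_type},
        method="PUT",
    )


# Hypothetical URL and contents, for illustration only.
req = build_raw_put(
    "http://localhost:8080/dav/test_file6.txt",
    b"(file_data)\n",
    "text/plain",
)

print(req.get_method(), req.full_url)
print(req.data)  # only the file bytes, no multipart boundary or part headers
```

Sending it would be a matter of calling urllib.request.urlopen(req).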
Using curl I can post a file like
curl -X POST -d "pxeconfig=`cat boot.txt`" https://ip:8443/tftp/syslinux
My file looks like
$ cat boot.txt
line 1
line 2
line 3
I am trying to achieve the same thing using requests module in python
r=requests.post(url, files={'pxeconfig': open('boot.txt','rb')})
When I open the file on server side, the file contains
{:filename=>"boot.txt", :type=>nil, :name=>"pxeconfig",
:tempfile=>#<Tempfile:/tmp/RackMultipart20170405-19742-1cylrpm.txt>,
:head=>"Content-Disposition: form-data; name=\"pxeconfig\";
filename=\"boot.txt\"\r\n"}
Please suggest how I can achieve this.
Your curl request sends the file contents as form data, as opposed to an actual file! You probably want something like
with open('boot.txt', 'rb') as f:
    r = requests.post(url, data={'pxeconfig': f.read()})
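To get a feel for what such a form-encoded body looks like, here is a small standard-library sketch using the boot.txt contents from the question. It shows the URL-encoded form that a data= dictionary is turned into; note that curl's -d sends the string as given, without percent-encoding it.

```python
from urllib.parse import parse_qs, urlencode

# Contents of boot.txt from the question.
file_text = "line 1\nline 2\nline 3\n"

# A form field is just a name=value pair in the request body.
form_body = urlencode({"pxeconfig": file_text})
print(form_body)

# The server decodes the body back into the original field value.
decoded = parse_qs(form_body)
print(decoded["pxeconfig"][0] == file_text)
```

One small difference from the curl command: the shell's backtick substitution drops trailing newlines, so if the server needs a byte-for-byte match you may want to strip the final newline from the value before sending it.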
The two actions you are performing are not the same.
In the first, you explicitly read the file using cat and pass its contents to curl, instructing it to send them as the value of the form field pxeconfig.
In the second example, by contrast, you are using multipart file uploading, which is a completely different thing. The server is supposed to parse the received file in that case.
To obtain the same behavior as the curl command you should do:
requests.post(url, data={'pxeconfig': open('file.txt').read()})
For contrast, if you actually wanted to send the file multipart-encoded, the curl request would look like this:
curl -F "header=@filepath" url
with open('boot.txt', 'rb') as f:
    r = requests.post(url, files={'boot.txt': f})
You would probably want to do something like that, so that the file is also closed afterwards.
Check here for more: Send file using POST from a Python script
I would like to make a POST request to upload a file to a web service (and get response) using Python. For example, I can do the following POST request with curl:
curl -F "file=@style.css" -F output=json http://jigsaw.w3.org/css-validator/validator
How can I make the same request with python urllib/urllib2? The closest I got so far is the following:
with open("style.css", 'r') as f:
    content = f.read()
post_data = {"file": content, "output": "json"}
request = urllib2.Request("http://jigsaw.w3.org/css-validator/validator",
                          data=urllib.urlencode(post_data))
response = urllib2.urlopen(request)
I got a HTTP Error 500 from the code above. But since my curl command succeeds, it must be something wrong with my python request?
I am quite new to this topic and my question may have very simple answers or mistakes.
Personally I think you should consider the requests library to post files.
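import requests

url = 'http://jigsaw.w3.org/css-validator/validator'
with open('style.css', 'rb') as f:  # binary mode; the file is closed after the request
    # files= carries the upload, data= carries the plain output=json field
    response = requests.post(url, files={'file': f}, data={'output': 'json'})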
Uploading files using urllib2 is not impossible but quite a complicated task: http://pymotw.com/2/urllib2/#uploading-files
After some digging around, it seems this post solved my problem. It turns out I need to have the multipart encoder setup properly.
from poster.encode import multipart_encode
from poster.streaminghttp import register_openers
import urllib2
register_openers()
with open("style.css", 'r') as f:
    datagen, headers = multipart_encode({"file": f})
    request = urllib2.Request("http://jigsaw.w3.org/css-validator/validator",
                              datagen, headers)
    response = urllib2.urlopen(request)
Well, there are multiple ways to do it. As mentioned above, you can send the file in "multipart/form-data". However, the target service may not be expecting this type, in which case you may try some more approaches.
Pass the file object
urllib2 can accept a file object as data. When you pass this type, the library reads the file as a binary stream and sends it out. However, it will not set the proper Content-Type header. Moreover, if the Content-Length header is missing, it will try to call len() on the object, which file objects do not support. You must therefore provide both the Content-Type and the Content-Length headers for this method to work:
import os
import urllib2
filename = '/var/tmp/myfile.zip'
headers = {
    'Content-Type': 'application/zip',
    'Content-Length': os.stat(filename).st_size,
}
request = urllib2.Request('http://localhost', open(filename, 'rb'),
                          headers=headers)
response = urllib2.urlopen(request)
Wrap the file object
To not deal with the length, you may create a simple wrapper object. With just a little change you can adapt it to get the content from a string if you have the file loaded in memory.
class BinaryFileObject:
    """Simple wrapper for a binary file for urllib2."""

    def __init__(self, filename):
        self.__size = int(os.stat(filename).st_size)
        self.__f = open(filename, 'rb')

    def read(self, blocksize):
        return self.__f.read(blocksize)

    def __len__(self):
        return self.__size
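The in-memory variant mentioned above might look roughly like this. It is a sketch written for Python 3, the class name BinaryBytesObject is invented here, and whether a particular HTTP library accepts such an object as a request body depends on the library and its version; the code only illustrates the read()/len() interface.

```python
import io


class BinaryBytesObject:
    """In-memory counterpart of the file wrapper: exposes read() and len()."""

    def __init__(self, content):
        self._size = len(content)
        self._buffer = io.BytesIO(content)

    def read(self, blocksize=-1):
        # Returns up to blocksize bytes, or b"" once the data is exhausted.
        return self._buffer.read(blocksize)

    def __len__(self):
        return self._size


payload = BinaryBytesObject(b"hello world")
print(len(payload))
first_chunk = payload.read(5)
rest = payload.read(100)
print(first_chunk, rest)
```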
Encode the content as base64
Another way is to encode the data via base64.b64encode and provide a Content-Transfer-Encoding: base64 header. However, this method requires support on the server side. Depending on the implementation, the service can either accept the file and store it incorrectly, or return HTTP 400. E.g. the GitHub API won't throw an error, but the uploaded file will be corrupted.
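A quick sketch of what the encoding step does to the data, using made-up sample bytes. It also shows why the server has to cooperate: what travels over the wire is the encoded text, which is about a third larger than the file, and it only becomes the original file again if the receiver decodes it.

```python
import base64

# Stand-in for arbitrary binary file content.
data = bytes(range(256))

encoded = base64.b64encode(data)
print(len(data), len(encoded))

# The encoded form is plain ASCII text, safe for text-only channels.
as_text = encoded.decode("ascii")

# A server that supports the scheme decodes it back to the original bytes.
decoded = base64.b64decode(as_text)
print(decoded == data)
```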