Wrong content length when sending a file - python

I am having the problem that the requests library for some reason is making the payload bigger and causing issues. I enabled http logging, and in in the output I can see the content length being 50569, not 50349, as the actual file size should indicate.
send: b'POST /api/1/images HTTP/1.1\r\nHost: localhost:8000\r\nUser-Agent: python-requests/2.21.0\r\nAccept-Encoding: gzip, deflate\r\nAc
cept: application/json\r\nConnection: keep-alive\r\nAuthorization: Bearer 28956340ba9c7e25b49085b4d273522b\r\ncontent-type: image/png\r\n
Content-Length: 50569\r\n\r\n'
send: b'--ac9e15d6d3aa3a77506c2daccca2ee47\r\nContent-Disposition: form-data; name="0007-Alternate-Lateral-Pulldown_back-STEP1"; filename
="0007-Alternate-Lateral-Pulldown_back-STEP1.png"\r\n\r\n\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\xf0\x00\x00\x00\xf0\x08\x06\x00\
x00\x00>U\xe9\x92\x00\x00\x00\tpHYs\x00\x00\x0b\x13\x00\x00\x0b\x13\x01\x00\x9a\x9c\x18\x00\x00FMiTXtXML:com.adobe.xmp\x00\x00\x00\x00\x0
0<?xpacket begin="\xef\xbb\xbf" id="W5M0MpCehiHzreSzNTczkc9d"?>\n<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.5-c021 79.
155772, 2014/01/13-19:44:00 ">\n <rdf:RDF xmlns:rdf="http:
Chrome has exactly the same headers when sending this, but the correct content length, so I am assuming this is why my server complains of a invalid image being sent.
This is my code
url = self.server + "/api/1/images";
headers = self.default_headers()
headers['content-type'] = 'image/png'
# neither of these are actually used for anything
filename = os.path.basename(image)
field_name = os.path.splitext(filename)[0]
files = {field_name: (filename, open(image, 'rb'), '')}
# Post image
r = requests.post(url, headers=headers, files=files, timeout = 5.0)
As can be seen, I am using the b flag when opening the file to preserve the binary content, so it should not change.
File size is 50349
$ ls -l 0007-Alternate-Lateral-Pulldown_back-STEP1.png
-rw-rw-r--# 1 carlerik staff 50349 Nov 26 2019 0007-Alternate-Lateral-Pulldown_back-STEP1.png

I used Charles proxy to dig into this and I now have gotten to the bottom of this. There are basically two things to note:
The difference in content length between the request sent by Chrome and Requests is exactly the length of the boundary fields (a multipart form concept) before and after the file + the six CRLF (\r\n) sequences + the Content-Disposition header.
echo '50569-178-36-6' | bc
50349
The boundary field looks like this: --ac9e15d6d3aa3a77506c2daccca2ee47\r\n
You can also see from the HTTP header and body dump that the header is actually in the body and after the boundary field, not as part of the normal headers. This was important and led me on the right path.
The second part of the answer is that the guys that wrote the server API I am interfacing with did not understand/read the HTTP spec for the exact bits they ask for: the Content-Disposition header.
The API docs for .../images state that this header must be present always, as they use (well, used in the past) its content to extract filenames and such. The problem is that the way they use it is not in accordance with how it is intended to be used: in a multipart HTTP request it is part of the body of the HTTP request, describing the part (form field) of the request it precedes.
This is, of course, also how Requests uses it, but I did not have this information before venturing into this abyss, so I was misinformed by the code in the controller that states this in its doc. So I assumed that Requests would put the header in the header section, which it did not, and not the body, which it did. After all, I saw that Chrome "did the right thing", but it turned out that was only because these requests were handcrafted in javascript:
apiService.js
/**
* Upload image
* #param file
* #returns {*}
* #private
*/
_api.postImage = function (file) {
if (typeof file != 'object') {
throw Error('ApiService: Object expected');
}
file.fileName = (/(.*)\./).exec(file.name)[1];
var ContentDisposition = 'form-data; name="' + encodeURI(file.fileName) + '"; filename="' + encodeURI(file.name) + '"';
return Upload.http({
url: Routing.generate('api_post_image'),
headers: {
'Content-Disposition': ContentDisposition,
'Content-Type': file.type
},
data: file
});
};
So the Content-Disposition header here is basically a proprietary header to convey information about the filename, that shares its appearance with the general in-body header from the spec. That means all it takes to fix this is to create a request with a custom body read from file and set this header.
To round this off, this was how it was all simplified down to:
headers = dict()
headers['authorization'] = "<something>"
headers['content-type'] = 'image/png'
with open(image, 'rb') as imagefile:
# POST image
r = requests.post(url, headers=headers, data=imagefile)

Related

How do you push files to a github repo in python

This bounty has ended. Answers to this question are eligible for a +50 reputation bounty. Bounty grace period ends in 19 hours.
user19926715 is looking for an up-to-date answer to this question:
I would like an example that shows the before and after of the packet before it gets sent and the response after and showing proof of it going into a repository and explain what was different from the code given by me.
I have a file called w.txt that holds 4 random numbers
I am trying to push it to a git repo and I have the following code:
aa=open(fileName,'r').read()
print(aa)
text= base64.b64encode(aa.encode("utf-8"))
api="https://api.github.com/repos/Ixonblitz-MatOS/Worm/contents/w.txt"
headers = {
"Authorization": f'''Bearer {token}''',
"Content-type": "application/vnd.github+json"
}
data = {
"message": msg, # Put your commit message here.
"content": text.decode("utf-8")
}
r = requests.put(api, headers=headers, json=data)
print(r.text)
In this case fileName is "w.txt" and the api is the api for my repo and the correct token is being used. What am I missing? It is leaving the file on my repo empty instead of putting the numbers there.
You need to provide the SHA of the file as the error message says.
To get the SHA, you can get the file like this:
import requests
url = "https://api.github.com/repos/Ixonblitz-MatOS/Worm/contents/w.txt"
headers = {
'Accept': 'application/vnd.github+json',
'Authorization': 'Bearer <token>',
}
response = requests.request("GET", url, headers=headers)
data = response.json()
The SHA will be at data['sha'].
Then you need to modify your request to include the SHA:
data = {
"message": msg, # Put your commit message here.
"content": text.decode("utf-8"),
"sha": data['sha']
}
The /repos/{owner}/{repo}/contents/{path} API route to fetch files has some size limits. See below:
If the requested file's size is:
1 MB or smaller: All features of this endpoint are supported.
Between
1-100 MB: Only the raw or object custom media types are supported.
Both will work as normal, except that when using the object media
type, the content field will be an empty string and the encoding field
will be "none". To get the contents of these larger files, use the raw
media type.
Greater than 100 MB: This endpoint is not supported.
For larger files, you will have to clone the repository and then calculate the SHA locally.
Why don't you do it this way.
from git import Repo
PATH_OF_GIT_REPO = r'path\to\your\project\folder\.git' # make sure .git folder is properly configured
COMMIT_MESSAGE = 'comment from python script'
def git_push():
try:
repo = Repo(PATH_OF_GIT_REPO)
repo.git.add(update=True)
repo.index.commit(COMMIT_MESSAGE)
origin = repo.remote(name='origin')
origin.push()
except:
print('Some error occured while pushing the code')
git_push()
Based on the code you've provided, it seems that you are encoding the content of the file as base64 before pushing it to your Git repository. While this is a valid approach, you need to make sure that you are decoding the content before writing it to the file on the Git repository. Here's a modified version of your code that should work:
import base64
import requests
fileName = "w.txt"
msg = "Updated w.txt with new numbers"
token = "<your GitHub access token>"
with open(fileName, 'r') as f:
content = f.read()
encoded_content = base64.b64encode(content.encode("utf- 8")).decode("utf-8")
api_url = "https://api.github.com/repos/Ixonblitz-MatOS/Worm/contents/w.txt"
headers = {
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
"Accept": "application/vnd.github+json"
}
data = {
"message": msg,
"content": encoded_content
}
response = requests.put(api_url, headers=headers, json=data)
if response.status_code == 200:
print("Successfully pushed the file to Git repository.")
else:
print(f"Error pushing file to Git repository. Response status code: {response.status_code}")
The above code reads the content of the file into the content variable and then encodes it as base64 using the base64.b64encode method. It then sends a PUT request to the GitHub API to create or update the file in your Git repository. The data payload contains the encoded content of the file along with a commit message. The response status code is then checked to verify whether the operation was successful.
Note that you may also want to make sure that the headers dictionary contains the correct User-Agent field as per GitHub API requirements.

Send files to flask endpoints in python [duplicate]

I'm performing a simple task of uploading a file using Python requests library. I searched Stack Overflow and no one seemed to have the same problem, namely, that the file is not received by the server:
import requests
url='http://nesssi.cacr.caltech.edu/cgi-bin/getmulticonedb_release2.cgi/post'
files={'files': open('file.txt','rb')}
values={'upload_file' : 'file.txt' , 'DB':'photcat' , 'OUT':'csv' , 'SHORT':'short'}
r=requests.post(url,files=files,data=values)
I'm filling the value of 'upload_file' keyword with my filename, because if I leave it blank, it says
Error - You must select a file to upload!
And now I get
File file.txt of size bytes is uploaded successfully!
Query service results: There were 0 lines.
Which comes up only if the file is empty. So I'm stuck as to how to send my file successfully. I know that the file works because if I go to this website and manually fill in the form it returns a nice list of matched objects, which is what I'm after. I'd really appreciate all hints.
Some other threads related (but not answering my problem):
Send file using POST from a Python script
http://docs.python-requests.org/en/latest/user/quickstart/#response-content
Uploading files using requests and send extra data
http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow
If upload_file is meant to be the file, use:
files = {'upload_file': open('file.txt','rb')}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
and requests will send a multi-part form POST body with the upload_file field set to the contents of the file.txt file.
The filename will be included in the mime header for the specific field:
>>> import requests
>>> open('file.txt', 'wb') # create an empty demo file
<_io.BufferedWriter name='file.txt'>
>>> files = {'upload_file': open('file.txt', 'rb')}
>>> print(requests.Request('POST', 'http://example.com', files=files).prepare().body.decode('ascii'))
--c226ce13d09842658ffbd31e0563c6bd
Content-Disposition: form-data; name="upload_file"; filename="file.txt"
--c226ce13d09842658ffbd31e0563c6bd--
Note the filename="file.txt" parameter.
You can use a tuple for the files mapping value, with between 2 and 4 elements, if you need more control. The first element is the filename, followed by the contents, and an optional content-type header value and an optional mapping of additional headers:
files = {'upload_file': ('foobar.txt', open('file.txt','rb'), 'text/x-spam')}
This sets an alternative filename and content type, leaving out the optional headers.
If you are meaning the whole POST body to be taken from a file (with no other fields specified), then don't use the files parameter, just post the file directly as data. You then may want to set a Content-Type header too, as none will be set otherwise. See Python requests - POST data from a file.
(2018) the new python requests library has simplified this process, we can use the 'files' variable to signal that we want to upload a multipart-encoded file
url = 'http://httpbin.org/post'
files = {'file': open('report.xls', 'rb')}
r = requests.post(url, files=files)
r.text
Client Upload
If you want to upload a single file with Python requests library, then requests lib supports streaming uploads, which allow you to send large files or streams without reading into memory.
with open('massive-body', 'rb') as f:
requests.post('http://some.url/streamed', data=f)
Server Side
Then store the file on the server.py side such that save the stream into file without loading into the memory. Following is an example with using Flask file uploads.
#app.route("/upload", methods=['POST'])
def upload_file():
from werkzeug.datastructures import FileStorage
FileStorage(request.stream).save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
return 'OK', 200
Or use werkzeug Form Data Parsing as mentioned in a fix for the issue of "large file uploads eating up memory" in order to avoid using memory inefficiently on large files upload (s.t. 22 GiB file in ~60 seconds. Memory usage is constant at about 13 MiB.).
#app.route("/upload", methods=['POST'])
def upload_file():
def custom_stream_factory(total_content_length, filename, content_type, content_length=None):
import tempfile
tmpfile = tempfile.NamedTemporaryFile('wb+', prefix='flaskapp', suffix='.nc')
app.logger.info("start receiving file ... filename => " + str(tmpfile.name))
return tmpfile
import werkzeug, flask
stream, form, files = werkzeug.formparser.parse_form_data(flask.request.environ, stream_factory=custom_stream_factory)
for fil in files.values():
app.logger.info(" ".join(["saved form name", fil.name, "submitted as", fil.filename, "to temporary file", fil.stream.name]))
# Do whatever with stored file at `fil.stream.name`
return 'OK', 200
You can send any file via post api while calling the API just need to mention files={'any_key': fobj}
import requests
import json
url = "https://request-url.com"
headers = {"Content-Type": "application/json; charset=utf-8"}
with open(filepath, 'rb') as fobj:
response = requests.post(url, headers=headers, files={'file': fobj})
print("Status Code", response.status_code)
print("JSON Response ", response.json())
#martijn-pieters answer is correct, however I wanted to add a bit of context to data= and also to the other side, in the Flask server, in the case where you are trying to upload files and a JSON.
From the request side, this works as Martijn describes:
files = {'upload_file': open('file.txt','rb')}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
However, on the Flask side (the receiving webserver on the other side of this POST), I had to use form
#app.route("/sftp-upload", methods=["POST"])
def upload_file():
if request.method == "POST":
# the mimetype here isnt application/json
# see here: https://stackoverflow.com/questions/20001229/how-to-get-posted-json-in-flask
body = request.form
print(body) # <- immutable dict
body = request.get_json() will return nothing. body = request.get_data() will return a blob containing lots of things like the filename etc.
Here's the bad part: on the client side, changing data={} to json={} results in this server not being able to read the KV pairs! As in, this will result in a {} body above:
r = requests.post(url, files=files, json=values). # No!
This is bad because the server does not have control over how the user formats the request; and json= is going to be the habbit of requests users.
Upload:
with open('file.txt', 'rb') as f:
files = {'upload_file': f.read()}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
Download (Django):
with open('file.txt', 'wb') as f:
f.write(request.FILES['upload_file'].file.read())
Regarding the answers given so far, there was always something missing that prevented it to work on my side. So let me show you what worked for me:
import json
import os
import requests
API_ENDPOINT = "http://localhost:80"
access_token = "sdfJHKsdfjJKHKJsdfJKHJKysdfJKHsdfJKHs" # TODO: get fresh Token here
def upload_engagement_file(filepath):
url = API_ENDPOINT + "/api/files" # add any URL parameters if needed
hdr = {"Authorization": "Bearer %s" % access_token}
with open(filepath, "rb") as fobj:
file_obj = fobj.read()
file_basename = os.path.basename(filepath)
file_to_upload = {"file": (str(file_basename), file_obj)}
finfo = {"fullPath": filepath}
upload_response = requests.post(url, headers=hdr, files=file_to_upload, data=finfo)
fobj.close()
# print("Status Code ", upload_response.status_code)
# print("JSON Response ", upload_response.json())
return upload_response
Note that requests.post(...) needs
a url parameter, containing the full URL of the API endpoint you're calling, using the API_ENDPOINT, assuming we have an http://localhost:8000/api/files endpoint to POST a file
a headers parameter, containing at least the authorization (bearer token)
a files parameter taking the name of the file plus the entire file content
a data parameter taking just the path and file name
Installation required (console):
pip install requests
What you get back from the function call is a response object containing a status code and also the full error message in JSON format. The commented print statements at the end of upload_engagement_file are showing you how you can access them.
Note: Some useful additional information about the requests library can be found here
Some may need to upload via a put request and this is slightly different that posting data. It is important to understand how the server expects the data in order to form a valid request. A frequent source of confusion is sending multipart-form data when it isn't accepted. This example uses basic auth and updates an image via a put request.
url = 'foobar.com/api/image-1'
basic = requests.auth.HTTPBasicAuth('someuser', 'password123')
# Setting the appropriate header is important and will vary based
# on what you upload
headers = {'Content-Type': 'image/png'}
with open('image-1.png', 'rb') as img_1:
r = requests.put(url, auth=basic, data=img_1, headers=headers)
While the requests library makes working with http requests a lot easier, some of its magic and convenience obscures just how to craft more nuanced requests.
In Ubuntu you can apply this way,
to save file at some location (temporary) and then open and send it to API
path = default_storage.save('static/tmp/' + f1.name, ContentFile(f1.read()))
path12 = os.path.join(os.getcwd(), "static/tmp/" + f1.name)
data={} #can be anything u want to pass along with File
file1 = open(path12, 'rb')
header = {"Content-Disposition": "attachment; filename=" + f1.name, "Authorization": "JWT " + token}
res= requests.post(url,data,header)

Upload multipart file using POST request [duplicate]

How to send a multipart/form-data with requests in python? How to send a file, I understand, but how to send the form data by this method can not understand.
Basically, if you specify a files parameter (a dictionary), then requests will send a multipart/form-data POST instead of a application/x-www-form-urlencoded POST. You are not limited to using actual files in that dictionary, however:
>>> import requests
>>> response = requests.post('http://httpbin.org/post', files=dict(foo='bar'))
>>> response.status_code
200
and httpbin.org lets you know what headers you posted with; in response.json() we have:
>>> from pprint import pprint
>>> pprint(response.json()['headers'])
{'Accept': '*/*',
'Accept-Encoding': 'gzip, deflate',
'Connection': 'close',
'Content-Length': '141',
'Content-Type': 'multipart/form-data; '
'boundary=c7cbfdd911b4e720f1dd8f479c50bc7f',
'Host': 'httpbin.org',
'User-Agent': 'python-requests/2.21.0'}
Better still, you can further control the filename, content type and additional headers for each part by using a tuple instead of a single string or bytes object. The tuple is expected to contain between 2 and 4 elements; the filename, the content, optionally a content type, and an optional dictionary of further headers.
I'd use the tuple form with None as the filename, so that the filename="..." parameter is dropped from the request for those parts:
>>> files = {'foo': 'bar'}
>>> print(requests.Request('POST', 'http://httpbin.org/post', files=files).prepare().body.decode('utf8'))
--bb3f05a247b43eede27a124ef8b968c5
Content-Disposition: form-data; name="foo"; filename="foo"
bar
--bb3f05a247b43eede27a124ef8b968c5--
>>> files = {'foo': (None, 'bar')}
>>> print(requests.Request('POST', 'http://httpbin.org/post', files=files).prepare().body.decode('utf8'))
--d5ca8c90a869c5ae31f70fa3ddb23c76
Content-Disposition: form-data; name="foo"
bar
--d5ca8c90a869c5ae31f70fa3ddb23c76--
files can also be a list of two-value tuples, if you need ordering and/or multiple fields with the same name:
requests.post(
'http://requestb.in/xucj9exu',
files=(
('foo', (None, 'bar')),
('foo', (None, 'baz')),
('spam', (None, 'eggs')),
)
)
If you specify both files and data, then it depends on the value of data what will be used to create the POST body. If data is a string, only it willl be used; otherwise both data and files are used, with the elements in data listed first.
There is also the excellent requests-toolbelt project, which includes advanced Multipart support. It takes field definitions in the same format as the files parameter, but unlike requests, it defaults to not setting a filename parameter. In addition, it can stream the request from open file objects, where requests will first construct the request body in memory:
from requests_toolbelt.multipart.encoder import MultipartEncoder
mp_encoder = MultipartEncoder(
fields={
'foo': 'bar',
# plain file object, no filename or mime type produces a
# Content-Disposition header with just the part name
'spam': ('spam.txt', open('spam.txt', 'rb'), 'text/plain'),
}
)
r = requests.post(
'http://httpbin.org/post',
data=mp_encoder, # The MultipartEncoder is posted as data, don't use files=...!
# The MultipartEncoder provides the content-type header with the boundary:
headers={'Content-Type': mp_encoder.content_type}
)
Fields follow the same conventions; use a tuple with between 2 and 4 elements to add a filename, part mime-type or extra headers. Unlike the files parameter, no attempt is made to find a default filename value if you don't use a tuple.
Requests has changed since some of the previous answers were written. Have a look at this Issue on Github for more details and this comment for an example.
In short, the files parameter takes a dictionary with the key being the name of the form field and the value being either a string or a 2, 3 or 4-length tuple, as described in the section POST a Multipart-Encoded File in the Requests quickstart:
>>> url = 'http://httpbin.org/post'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
In the above, the tuple is composed as follows:
(filename, data, content_type, headers)
If the value is just a string, the filename will be the same as the key, as in the following:
>>> files = {'obvius_session_id': '72c2b6f406cdabd578c5fd7598557c52'}
Content-Disposition: form-data; name="obvius_session_id"; filename="obvius_session_id"
Content-Type: application/octet-stream
72c2b6f406cdabd578c5fd7598557c52
If the value is a tuple and the first entry is None the filename property will not be included:
>>> files = {'obvius_session_id': (None, '72c2b6f406cdabd578c5fd7598557c52')}
Content-Disposition: form-data; name="obvius_session_id"
Content-Type: application/octet-stream
72c2b6f406cdabd578c5fd7598557c52
You need to use the files parameter to send a multipart form POST request even when you do not need to upload any files.
From the original requests source:
def request(method, url, **kwargs):
"""Constructs and sends a :class:`Request <Request>`.
...
:param files: (optional) Dictionary of ``'name': file-like-objects``
(or ``{'name': file-tuple}``) for multipart encoding upload.
``file-tuple`` can be a 2-tuple ``('filename', fileobj)``,
3-tuple ``('filename', fileobj, 'content_type')``
or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``,
where ``'content-type'`` is a string
defining the content type of the given file
and ``custom_headers`` a dict-like object
containing additional headers to add for the file.
The relevant part is: file-tuple can be a:
2-tuple (filename, fileobj)
3-tuple (filename, fileobj, content_type)
4-tuple (filename, fileobj, content_type, custom_headers).
☝ What might not be obvious is that fileobj can be either an actual file object when dealing with files, OR a string when dealing with plain text fields.
Based on the above, the simplest multipart form request that includes both files to upload and form fields will look like this:
import requests
multipart_form_data = {
'upload': ('custom_file_name.zip', open('myfile.zip', 'rb')),
'action': (None, 'store'),
'path': (None, '/path1')
}
response = requests.post('https://httpbin.org/post', files=multipart_form_data)
print(response.content)
☝ Note the None as the first argument in the tuple for plain text fields — this is a placeholder for the filename field which is only used for file uploads, but for text fields passing None as the first parameter is required in order for the data to be submitted.
Multiple fields with the same name
If you need to post multiple fields with the same name then instead of a dictionary you can define your payload as a list (or a tuple) of tuples:
multipart_form_data = (
('file2', ('custom_file_name.zip', open('myfile.zip', 'rb'))),
('action', (None, 'store')),
('path', (None, '/path1')),
('path', (None, '/path2')),
('path', (None, '/path3')),
)
Streaming requests API
If the above API is not pythonic enough for you, then consider using requests toolbelt (pip install requests_toolbelt) which is an extension of the core requests module that provides support for file upload streaming as well as the MultipartEncoder which can be used instead of files, and which also lets you define the payload as a dictionary, tuple or list.
MultipartEncoder can be used both for multipart requests with or without actual upload fields. It must be assigned to the data parameter.
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder
multipart_data = MultipartEncoder(
fields={
# a file upload field
'file': ('file.zip', open('file.zip', 'rb'), 'text/plain')
# plain text fields
'field0': 'value0',
'field1': 'value1',
}
)
response = requests.post('http://httpbin.org/post', data=multipart_data,
headers={'Content-Type': multipart_data.content_type})
If you need to send multiple fields with the same name, or if the order of form fields is important, then a tuple or a list can be used instead of a dictionary:
multipart_data = MultipartEncoder(
fields=(
('action', 'ingest'),
('item', 'spam'),
('item', 'sausage'),
('item', 'eggs'),
)
)
Here is the simple code snippet to upload a single file with additional parameters using requests:
url = 'https://<file_upload_url>'
fp = '/Users/jainik/Desktop/data.csv'
files = {'file': open(fp, 'rb')}
payload = {'file_id': '1234'}
response = requests.put(url, files=files, data=payload, verify=False)
Please note that you don't need to explicitly specify any content type.
NOTE: Wanted to comment on one of the above answers but could not because of low reputation so drafted a new response here.
By specifying a files parameter in the POST request, the Content-Type of the request is automatically set to multipart/form-data (followed by the boundary string used to separate each body part in the multipart payload), whether you send only files, or form-data and files at the same time (thus, one shouldn't attempt setting the Content-Type manually in this case). Whereas, if only form-data were sent, the Content-Type would automatically be set to application/x-www-form-urlencoded.
You can print out the Content-Type header of the request to verify the above using the example given below, which shows how to upload multiple files (or a single file) with (optionally) the same key (i.e., 'files' in the case below), as well as with optional form-data (i.e., data=data in the example below). The documentation on how to POST single and multiple files can be found here and here, respectively. In case you need to upload large files without reading them into memory, have a look at Streaming Uploads.
For the server side—in case this is needed—please have a look at this answer, from which the code snippet below has been taken, and which uses the FastAPI web framework.
import requests
url = 'http://127.0.0.1:8000/submit'
files = [('files', open('a.txt', 'rb')), ('files', open('b.txt', 'rb'))]
#file = {'file': open('a.txt','rb')} # to send a single file
data ={"name": "foo", "point": 0.13, "is_accepted": False}
r = requests.post(url=url, data=data, files=files)
print(r.json())
print(r.request.headers['content-type'])
You need to use the name attribute of the upload file that is in the HTML of the site. Example:
autocomplete="off" name="image">
You see name="image">? You can find it in the HTML of a site for uploading the file. You need to use it to upload the file with Multipart/form-data
script:
import requests
site = 'https://prnt.sc/upload.php' # the site where you upload the file
filename = 'image.jpg' # name example
Here, in the place of image, add the name of the upload file in HTML
up = {'image':(filename, open(filename, 'rb'), "multipart/form-data")}
If the upload requires to click the button for upload, you can use like that:
data = {
"Button" : "Submit",
}
Then start the request
request = requests.post(site, files=up, data=data)
And done, file uploaded succesfully
import requests
# assume sending two files
url = "put ur url here"
f1 = open("file 1 path", 'rb')
f2 = open("file 2 path", 'rb')
response = requests.post(url,files={"file1 name": f1, "file2 name":f2})
print(response)
To clarify examples given above,
"You need to use the files parameter to send a multipart form POST request even when you do not need to upload any files."
files={}
won't work, unfortunately.
You will need to put some dummy values in, e.g.
files={"foo": "bar"}
I came up against this when trying to upload files to Bitbucket's REST API and had to write this abomination to avoid the dreaded "Unsupported Media Type" error:
url = "https://my-bitbucket.com/rest/api/latest/projects/FOO/repos/bar/browse/foobar.txt"
payload = {'branch': 'master',
'content': 'text that will appear in my file',
'message': 'uploading directly from python'}
files = {"foo": "bar"}
response = requests.put(url, data=payload, files=files)
:O=
Send multipart/form-data key and value
curl command:
curl -X PUT http://127.0.0.1:8080/api/xxx ...
-H 'content-type: multipart/form-data; boundary=----xxx' \
-F taskStatus=1
python requests - More complicated POST requests:
updateTaskUrl = "http://127.0.0.1:8080/api/xxx"
updateInfoDict = {
"taskStatus": 1,
}
resp = requests.put(updateTaskUrl, data=updateInfoDict)
Send multipart/form-data file
curl command:
curl -X POST http://127.0.0.1:8080/api/xxx ...
-H 'content-type: multipart/form-data; boundary=----xxx' \
-F file=#/Users/xxx.txt
python requests - POST a Multipart-Encoded File:
filePath = "/Users/xxx.txt"
fileFp = open(filePath, 'rb')
fileInfoDict = {
"file": fileFp,
}
resp = requests.post(uploadResultUrl, files=fileInfoDict)
that's all.
import json
import os
import requests
from requests_toolbelt import MultipartEncoder
AUTH_API_ENDPOINT = "http://localhost:3095/api/auth/login"
def file_upload(path_img, token ):
url = 'http://localhost:3095/api/shopping/product/image'
name_img = os.path.basename(path_img)
mp_encoder = MultipartEncoder(
fields={
'email': 'mcm9#gmail.com',
'source': 'tmall',
'productId': 'product_0001',
'image': (name_img, open(path_img, 'rb'), 'multipart/form-data')
#'spam': ('spam.txt', open('spam.txt', 'rb'), 'text/plain'),
}
)
head = {'Authorization': 'Bearer {}'.format(token),
'Content-Type': mp_encoder.content_type}
with requests.Session() as s:
result = s.post(url, data=mp_encoder, headers=head)
return result
def do_auth(username, password, url=AUTH_API_ENDPOINT):
data = {
"email": username,
"password": password
}
# sending post request and saving response as response object
r = requests.post(url=url, data=data)
# extracting response text
response_text = r.text
d = json.loads(response_text)
# print(d)
return d
if __name__ == '__main__':
result = do_auth('mcm4#gmail.com','123456')
token = result.get('data').get('payload').get('token')
print(token)
result = file_upload('/home/mcm/Pictures/1234.png',token)
print(result.json())
Here is the python snippet you need to upload one large single file as multipart formdata. With NodeJs Multer middleware running on the server side.
import requests
latest_file = 'path/to/file'
url = "http://httpbin.org/apiToUpload"
files = {'fieldName': open(latest_file, 'rb')}
r = requests.put(url, files=files)
For the server side please check the multer documentation at: https://github.com/expressjs/multer
here the field single('fieldName') is used to accept one single file, as in:
var upload = multer().single('fieldName');
This is one way to send file in multipart request
import requests
headers = {"Authorization": "Bearer <token>"}
myfile = 'file.txt'
myfile2 = {'file': (myfile, open(myfile, 'rb'),'application/octet-stream')}
url = 'https://example.com/path'
r = requests.post(url, files=myfile2, headers=headers,verify=False)
print(r.content)
Other approach
import requests
url = "https://example.com/path"
payload={}
files=[
('file',('file',open('/path/to/file','rb'),'application/octet-stream'))
]
headers = {
'Authorization': 'Bearer <token>'
}
response = requests.request("POST", url, headers=headers, data=payload, files=files)
print(response.text)
I have tested both , both works fine.
I'm trying to send a request to URL_server with request module in python 3.
This works for me:
# -*- coding: utf-8 *-*
import json, requests
URL_SERVER_TO_POST_DATA = "URL_to_send_POST_request"
HEADERS = {"Content-Type" : "multipart/form-data;"}
def getPointsCC_Function():
file_data = {
'var1': (None, "valueOfYourVariable_1"),
'var2': (None, "valueOfYourVariable_2")
}
try:
resElastic = requests.post(URL_GET_BALANCE, files=file_data)
res = resElastic.json()
except Exception as e:
print(e)
print (json.dumps(res, indent=4, sort_keys=True))
getPointsCC_Function()
Where:
URL_SERVER_TO_POST_DATA = Server where we going to send data
HEADERS = Headers sended
file_data = Params sended
Postman generated code for file upload with additional form fields:
import http.client
import mimetypes
from codecs import encode
conn = http.client.HTTPSConnection("data.XXXX.com")
dataList = []
boundary = 'wL36Yn8afVp8Ag7AmP8qZ0SA4n1v9T'
dataList.append(encode('--' + boundary))
dataList.append(encode('Content-Disposition: form-data; name=batchSize;'))
dataList.append(encode('Content-Type: {}'.format('text/plain')))
dataList.append(encode(''))
dataList.append(encode("1"))
dataList.append(encode('--' + boundary))
dataList.append(encode('Content-Disposition: form-data; name=file; filename={0}'.format('FileName-1.json')))
fileType = mimetypes.guess_type('FileName-1.json')[0] or 'application/octet-stream'
dataList.append(encode('Content-Type: {}'.format(fileType)))
dataList.append(encode(''))
with open('FileName-1.json', 'rb') as f:
dataList.append(f.read())
dataList.append(encode('--'+boundary+'--'))
dataList.append(encode(''))
body = b'\r\n'.join(dataList)
payload = body
headers = {
'Cookie': 'XXXXXXXXXXX',
'Content-type': 'multipart/form-data; boundary={}'.format(boundary)
}
conn.request("POST", "/fileupload/uri/XXXX", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))

python requests post file using multipart form parameters [duplicate]

I'm performing a simple task of uploading a file using Python requests library. I searched Stack Overflow and no one seemed to have the same problem, namely, that the file is not received by the server:
import requests
url='http://nesssi.cacr.caltech.edu/cgi-bin/getmulticonedb_release2.cgi/post'
files={'files': open('file.txt','rb')}
values={'upload_file' : 'file.txt' , 'DB':'photcat' , 'OUT':'csv' , 'SHORT':'short'}
r=requests.post(url,files=files,data=values)
I'm filling the value of 'upload_file' keyword with my filename, because if I leave it blank, it says
Error - You must select a file to upload!
And now I get
File file.txt of size bytes is uploaded successfully!
Query service results: There were 0 lines.
Which comes up only if the file is empty. So I'm stuck as to how to send my file successfully. I know that the file works because if I go to this website and manually fill in the form it returns a nice list of matched objects, which is what I'm after. I'd really appreciate all hints.
Some other threads related (but not answering my problem):
Send file using POST from a Python script
http://docs.python-requests.org/en/latest/user/quickstart/#response-content
Uploading files using requests and send extra data
http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow
If upload_file is meant to be the file, use:
files = {'upload_file': open('file.txt','rb')}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
and requests will send a multi-part form POST body with the upload_file field set to the contents of the file.txt file.
The filename will be included in the mime header for the specific field:
>>> import requests
>>> open('file.txt', 'wb') # create an empty demo file
<_io.BufferedWriter name='file.txt'>
>>> files = {'upload_file': open('file.txt', 'rb')}
>>> print(requests.Request('POST', 'http://example.com', files=files).prepare().body.decode('ascii'))
--c226ce13d09842658ffbd31e0563c6bd
Content-Disposition: form-data; name="upload_file"; filename="file.txt"
--c226ce13d09842658ffbd31e0563c6bd--
Note the filename="file.txt" parameter.
You can use a tuple for the files mapping value, with between 2 and 4 elements, if you need more control. The first element is the filename, followed by the contents, and an optional content-type header value and an optional mapping of additional headers:
files = {'upload_file': ('foobar.txt', open('file.txt','rb'), 'text/x-spam')}
This sets an alternative filename and content type, leaving out the optional headers.
If you are meaning the whole POST body to be taken from a file (with no other fields specified), then don't use the files parameter, just post the file directly as data. You then may want to set a Content-Type header too, as none will be set otherwise. See Python requests - POST data from a file.
(2018) the new python requests library has simplified this process, we can use the 'files' variable to signal that we want to upload a multipart-encoded file
url = 'http://httpbin.org/post'
files = {'file': open('report.xls', 'rb')}
r = requests.post(url, files=files)
r.text
Client Upload
If you want to upload a single file with Python requests library, then requests lib supports streaming uploads, which allow you to send large files or streams without reading into memory.
with open('massive-body', 'rb') as f:
requests.post('http://some.url/streamed', data=f)
Server Side
Then store the file on the server.py side such that save the stream into file without loading into the memory. Following is an example with using Flask file uploads.
#app.route("/upload", methods=['POST'])
def upload_file():
from werkzeug.datastructures import FileStorage
FileStorage(request.stream).save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
return 'OK', 200
Or use werkzeug Form Data Parsing as mentioned in a fix for the issue of "large file uploads eating up memory" in order to avoid using memory inefficiently on large files upload (s.t. 22 GiB file in ~60 seconds. Memory usage is constant at about 13 MiB.).
#app.route("/upload", methods=['POST'])
def upload_file():
def custom_stream_factory(total_content_length, filename, content_type, content_length=None):
import tempfile
tmpfile = tempfile.NamedTemporaryFile('wb+', prefix='flaskapp', suffix='.nc')
app.logger.info("start receiving file ... filename => " + str(tmpfile.name))
return tmpfile
import werkzeug, flask
stream, form, files = werkzeug.formparser.parse_form_data(flask.request.environ, stream_factory=custom_stream_factory)
for fil in files.values():
app.logger.info(" ".join(["saved form name", fil.name, "submitted as", fil.filename, "to temporary file", fil.stream.name]))
# Do whatever with stored file at `fil.stream.name`
return 'OK', 200
You can send any file via post api while calling the API just need to mention files={'any_key': fobj}
import requests
import json
url = "https://request-url.com"
headers = {"Content-Type": "application/json; charset=utf-8"}
with open(filepath, 'rb') as fobj:
response = requests.post(url, headers=headers, files={'file': fobj})
print("Status Code", response.status_code)
print("JSON Response ", response.json())
#martijn-pieters answer is correct, however I wanted to add a bit of context to data= and also to the other side, in the Flask server, in the case where you are trying to upload files and a JSON.
From the request side, this works as Martijn describes:
files = {'upload_file': open('file.txt','rb')}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
However, on the Flask side (the receiving webserver on the other side of this POST), I had to use form
#app.route("/sftp-upload", methods=["POST"])
def upload_file():
if request.method == "POST":
# the mimetype here isnt application/json
# see here: https://stackoverflow.com/questions/20001229/how-to-get-posted-json-in-flask
body = request.form
print(body) # <- immutable dict
body = request.get_json() will return nothing. body = request.get_data() will return a blob containing lots of things like the filename etc.
Here's the bad part: on the client side, changing data={} to json={} results in this server not being able to read the KV pairs! As in, this will result in a {} body above:
r = requests.post(url, files=files, json=values). # No!
This is bad because the server does not have control over how the user formats the request; and json= is going to be the habbit of requests users.
Upload:
with open('file.txt', 'rb') as f:
files = {'upload_file': f.read()}
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
r = requests.post(url, files=files, data=values)
Download (Django):
with open('file.txt', 'wb') as f:
f.write(request.FILES['upload_file'].file.read())
Regarding the answers given so far, there was always something missing that prevented it to work on my side. So let me show you what worked for me:
import json
import os
import requests
API_ENDPOINT = "http://localhost:80"
access_token = "sdfJHKsdfjJKHKJsdfJKHJKysdfJKHsdfJKHs" # TODO: get fresh Token here
def upload_engagement_file(filepath):
url = API_ENDPOINT + "/api/files" # add any URL parameters if needed
hdr = {"Authorization": "Bearer %s" % access_token}
with open(filepath, "rb") as fobj:
file_obj = fobj.read()
file_basename = os.path.basename(filepath)
file_to_upload = {"file": (str(file_basename), file_obj)}
finfo = {"fullPath": filepath}
upload_response = requests.post(url, headers=hdr, files=file_to_upload, data=finfo)
fobj.close()
# print("Status Code ", upload_response.status_code)
# print("JSON Response ", upload_response.json())
return upload_response
Note that requests.post(...) needs
a url parameter, containing the full URL of the API endpoint you're calling, using the API_ENDPOINT, assuming we have an http://localhost:8000/api/files endpoint to POST a file
a headers parameter, containing at least the authorization (bearer token)
a files parameter taking the name of the file plus the entire file content
a data parameter taking just the path and file name
Installation required (console):
pip install requests
What you get back from the function call is a response object containing a status code and also the full error message in JSON format. The commented print statements at the end of upload_engagement_file are showing you how you can access them.
Note: Some useful additional information about the requests library can be found here
Some may need to upload via a put request and this is slightly different that posting data. It is important to understand how the server expects the data in order to form a valid request. A frequent source of confusion is sending multipart-form data when it isn't accepted. This example uses basic auth and updates an image via a put request.
url = 'foobar.com/api/image-1'
basic = requests.auth.HTTPBasicAuth('someuser', 'password123')
# Setting the appropriate header is important and will vary based
# on what you upload
headers = {'Content-Type': 'image/png'}
with open('image-1.png', 'rb') as img_1:
r = requests.put(url, auth=basic, data=img_1, headers=headers)
While the requests library makes working with http requests a lot easier, some of its magic and convenience obscures just how to craft more nuanced requests.
In Ubuntu you can apply this way,
to save file at some location (temporary) and then open and send it to API
path = default_storage.save('static/tmp/' + f1.name, ContentFile(f1.read()))
path12 = os.path.join(os.getcwd(), "static/tmp/" + f1.name)
data={} #can be anything u want to pass along with File
file1 = open(path12, 'rb')
header = {"Content-Disposition": "attachment; filename=" + f1.name, "Authorization": "JWT " + token}
res= requests.post(url,data,header)

Python POST binary data

I am writing some code to interface with redmine and I need to upload some files as part of the process, but I am not sure how to do a POST request from python containing a binary file.
I am trying to mimic the commands here:
curl --data-binary "#image.png" -H "Content-Type: application/octet-stream" -X POST -u login:password http://redmine/uploads.xml
In python (below), but it does not seem to work. I am not sure if the problem is somehow related to encoding the file or if something is wrong with the headers.
import urllib2, os
FilePath = "C:\somefolder\somefile.7z"
FileData = open(FilePath, "rb")
length = os.path.getsize(FilePath)
password_manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_manager.add_password(None, 'http://redmine/', 'admin', 'admin')
auth_handler = urllib2.HTTPBasicAuthHandler(password_manager)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
request = urllib2.Request( r'http://redmine/uploads.xml', FileData)
request.add_header('Content-Length', '%d' % length)
request.add_header('Content-Type', 'application/octet-stream')
try:
response = urllib2.urlopen( request)
print response.read()
except urllib2.HTTPError as e:
error_message = e.read()
print error_message
I have access to the server and it looks like a encoding error:
...
invalid byte sequence in UTF-8
Line: 1
Position: 624
Last 80 unconsumed characters:
7z¼¯'ÅÐз2^Ôøë4g¸R<süðí6kĤª¶!»=}jcdjSPúá-º#»ÄAtD»H7Ê!æ½]j):
(further down)
Started POST "/uploads.xml" for 192.168.0.117 at 2013-01-16 09:57:49 -0800
Processing by AttachmentsController#upload as XML
WARNING: Can't verify CSRF token authenticity
Current user: anonymous
Filter chain halted as :authorize_global rendered or redirected
Completed 401 Unauthorized in 13ms (ActiveRecord: 3.1ms)
Basically what you do is correct. Looking at redmine docs you linked to, it seems that suffix after the dot in the url denotes type of posted data (.json for JSON, .xml for XML), which agrees with the response you get - Processing by AttachmentsController#upload as XML. I guess maybe there's a bug in docs and to post binary data you should try using http://redmine/uploads url instead of http://redmine/uploads.xml.
Btw, I highly recommend very good and very popular Requests library for http in Python. It's much better than what's in the standard lib (urllib2). It supports authentication as well but I skipped it for brevity here.
import requests
with open('./x.png', 'rb') as f:
data = f.read()
res = requests.post(url='http://httpbin.org/post',
data=data,
headers={'Content-Type': 'application/octet-stream'})
# let's check if what we sent is what we intended to send...
import json
import base64
assert base64.b64decode(res.json()['data'][len('data:application/octet-stream;base64,'):]) == data
UPDATE
To find out why this works with Requests but not with urllib2 we have to examine the difference in what's being sent. To see this I'm sending traffic to http proxy (Fiddler) running on port 8888:
Using Requests
import requests
data = 'test data'
res = requests.post(url='http://localhost:8888',
data=data,
headers={'Content-Type': 'application/octet-stream'})
we see
POST http://localhost:8888/ HTTP/1.1
Host: localhost:8888
Content-Length: 9
Content-Type: application/octet-stream
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/1.0.4 CPython/2.7.3 Windows/Vista
test data
and using urllib2
import urllib2
data = 'test data'
req = urllib2.Request('http://localhost:8888', data)
req.add_header('Content-Length', '%d' % len(data))
req.add_header('Content-Type', 'application/octet-stream')
res = urllib2.urlopen(req)
we get
POST http://localhost:8888/ HTTP/1.1
Accept-Encoding: identity
Content-Length: 9
Host: localhost:8888
Content-Type: application/octet-stream
Connection: close
User-Agent: Python-urllib/2.7
test data
I don't see any differences which would warrant different behavior you observe. Having said that it's not uncommon for http servers to inspect User-Agent header and vary behavior based on its value. Try to change headers sent by Requests one by one making them the same as those being sent by urllib2 and see when it stops working.
This has nothing to do with a malformed upload. The HTTP error clearly specifies 401 unauthorized, and tells you the CSRF token is invalid. Try sending a valid CSRF token with the upload.
More about csrf tokens here:
What is a CSRF token ? What is its importance and how does it work?
you need to add Content-Disposition header, smth like this (although I used mod-python here, but principle should be the same):
request.headers_out['Content-Disposition'] = 'attachment; filename=%s' % myfname
You can use unirest, It provides easy method to post request.
`
import unirest
def callback(response):
print "code:"+ str(response.code)
print "******************"
print "headers:"+ str(response.headers)
print "******************"
print "body:"+ str(response.body)
print "******************"
print "raw_body:"+ str(response.raw_body)
# consume async post request
def consumePOSTRequestASync():
params = {'test1':'param1','test2':'param2'}
# we need to pass a dummy variable which is open method
# actually unirest does not provide variable to shift between
# application-x-www-form-urlencoded and
# multipart/form-data
params['dummy'] = open('dummy.txt', 'r')
url = 'http://httpbin.org/post'
headers = {"Accept": "application/json"}
# call get service with headers and params
unirest.post(url, headers = headers,params = params, callback = callback)
# post async request multipart/form-data
consumePOSTRequestASync()

Categories

Resources