Python requests zip upload makes zipfile unreadable in Windows

I'm trying to upload a zip file to a server using Python requests. The upload works fine. However, the uploaded file cannot be opened using Windows Explorer or Ark. I suspect a problem with the MIME type or Content-Length.
Oddly, uploading the file using curl does not seem to cause the same problem.
Here is my Python code for the request:
s = requests.Session()
headers = {'Content-Type': 'application/zip'}
zip = open('file.zip', 'rb')
files = {'file': ('file.zip', zip, 'application/zip')}
fc = {'Content-Disposition': 'attachment; filename=file.zip'}
headers.update(fc)
r = requests.Request('POST', url, files=files, headers=headers, auth=(user, password))
prepared = r.prepare()
resp = s.send(prepared)
This is the curl code, which works flawlessly:
curl -X POST \
-ik \
-u user:password \
--data-binary '@file.zip' \
-H 'Content-Type: application/zip' \
-H "Content-Disposition: attachment; filename=file.zip" \
url
Uploading the file works in both cases, and the server also seems to recognize the content type. However, the file turns out to be invalid when it is downloaded again. The zip file is readable before sending via requests, and after sending with plain curl using --data-binary.
Opening the downloaded zip file with unzip or file-roller works either way.
EDIT:
I was uploading two files in succession. Oddly, the error went away when I uploaded the exact same files in reverse order.
This has NOT been a Python problem. When trying with standard curl,
I must have accidentally reversed the order, which is why it worked.
I cannot explain this behavior, nor do I have a fix for it.
In conclusion: uploading the bigger file first did the trick.
All of the above seems to apply to curl, pycurl and Python requests, so I assume it's some kind of bug in one of the curl libraries.
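Independently of the ordering effect described in the edit, one visible difference between the two snippets is the body format: files= makes requests build a multipart/form-data body (even when Content-Type is overridden to application/zip), while curl --data-binary sends the file's bytes as the whole body. The sketch below prepares both variants without sending anything, so the bodies can be compared. The URL, the helper names and the in-memory archive are placeholders for illustration, not part of the original post.

```python
import io
import zipfile

import requests

URL = "https://example.invalid/upload"  # placeholder, never contacted


def make_zip_bytes():
    """Build a tiny zip archive in memory to stand in for file.zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hello")
    return buf.getvalue()


def prepare_multipart(payload):
    """Same shape as the question: files= plus an explicit Content-Type."""
    req = requests.Request(
        "POST",
        URL,
        files={"file": ("file.zip", io.BytesIO(payload), "application/zip")},
        headers={"Content-Type": "application/zip"},
    )
    return req.prepare()


def prepare_raw(payload):
    """Closer to curl --data-binary: the zip bytes are the entire body."""
    req = requests.Request(
        "POST",
        URL,
        data=payload,
        headers={
            "Content-Type": "application/zip",
            "Content-Disposition": "attachment; filename=file.zip",
        },
    )
    return req.prepare()


if __name__ == "__main__":
    payload = make_zip_bytes()
    # The multipart body begins with a boundary line, not with zip data.
    print(prepare_multipart(payload).body[:40])
    # The raw body begins with the zip local-file signature.
    print(prepare_raw(payload).body[:4])
```

If a server stores the request body verbatim, the multipart variant leaves a boundary and part headers around the archive. Tools that locate the zip directory from the end of the file may still open it while stricter ones refuse, which would match the symptoms above, but this is only one possible explanation and not a confirmed diagnosis.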

Related

Upload secure files to GitLab using requests python module

I'm trying to upload a secure file to my repository in GitLab.
While I am able to upload a secure file with curl, I encounter an error when using requests in Python.
My Python code:
r = requests.post("https://gitlab.com/api/v4/projects/10186699/secure_files",
                  headers={"PRIVATE-TOKEN": "glpat-TH7FM3nThKmHgOp"},
                  files={"file": open("/Users/me/Desktop/dev/web-server/utils/a.txt", "r"),
                         "name": "a.txt"})
print(r.status_code, r.json())
Response:
400 {'error': 'name is invalid'}
The equivalent curl command I use that actually works:
curl --request POST --header "PRIVATE-TOKEN: glpat-TH7FM3nThKmHgOp" https://gitlab.com/api/v4/projects/10186699/secure_files --form "name=a.txt" --form "file=@/Users/me/Desktop/dev/web-server/utils/a.txt"

The equivalent call is:
import requests
resp = requests.post(
    "https://gitlab.com/api/v4/projects/10186699/secure_files",
    headers={"PRIVATE-TOKEN": "glpat-TH7FM3nThKmHgOp"},
    files={"file": open("/Users/me/Desktop/dev/web-server/utils/a.txt", "rb")},
    data={"name": "a.txt"}
)
print(resp.status_code, resp.json())
This is because the files= parameter is intended only for uploading files. name, on the other hand, is ordinary form data, which you pass in the data= parameter.
It's also recommended to open files in binary mode (see the requests docs).
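To see why the original call was rejected, the request can be prepared without sending it and the body inspected. In this sketch the URL and the in-memory file are placeholders chosen for illustration. A plain string passed under files= becomes a file part whose filename is the field name, whereas data= produces an ordinary form field.

```python
import io

import requests

URL = "https://gitlab.example.invalid/api/v4/projects/1/secure_files"  # placeholder


def prepare_like_question():
    """name passed through files=, as in the failing call."""
    req = requests.Request(
        "POST",
        URL,
        files={"file": ("a.txt", io.BytesIO(b"secret")), "name": "a.txt"},
    )
    return req.prepare()


def prepare_like_answer():
    """name passed through data=, as in the working call."""
    req = requests.Request(
        "POST",
        URL,
        files={"file": ("a.txt", io.BytesIO(b"secret"))},
        data={"name": "a.txt"},
    )
    return req.prepare()


if __name__ == "__main__":
    # Print both bodies; the interesting lines are the Content-Disposition headers.
    print(prepare_like_question().body.decode())
    print(prepare_like_answer().body.decode())
```

In the first body the name part carries a filename attribute, so the server sees an uploaded file called "name" rather than a text field, which is consistent with the 'name is invalid' response.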

zip file upload with Python Requests Not working

Hi, I am attempting to upload a file to a server with a POST request using requests, but every time I get errors. If I do the same with cURL it goes through, but I'm not familiar with cURL or with uploading files (I mostly make GET requests), so I have no clue what it is doing differently. I am on Windows and am running the code below. The file is being uploaded to McAfee ePO, so I'm not sure whether their API is just super picky or what the difference is, but every Python example I've tried for uploading a file via the requests module has failed for me.
import requests
url = "https://server.url.com:1234/remote/repository.checkInPackage.do?&allowUnsignedPackages=True&option=Normal&branch=Evaluation"
user = "domain\\user"  # backslash escaped; "\u" would otherwise start a Unicode escape
password = "mypass"
filepath = "C:\\my\\folder\\with\\afile.zip"
with open(filepath, "rb") as f:
    file_dict = {"file": f}
    # send the request while the file is still open
    response = requests.post(url, auth=(user, password), files=file_dict)
I usually get an error like the following:
'Error 0 :\r\njava.lang.reflect.InvocationTargetException\r\n'
If I use cURL it works, though:
curl.exe -k -s -u "domain\username:mypass" "https://server.url.com:1234/remote/repository.checkInPackage.do?&allowUnsignedPackages=True&option=Normal&branch=Evaluation" -F file=@"C:\my\folder\with\afile.zip"
I can't really see the difference, and I'm wondering what cURL is doing differently behind the scenes, or what I could be doing wrong in Python.
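One way to look for the difference is to prepare the request without sending it and compare it with what curl reports (for example via its --trace-ascii option). Two differences are worth checking, though neither is confirmed as the cause here: curl's -k switches off certificate verification, which corresponds to verify=False in requests, and curl's -F normally labels the file part with a Content-Type, which requests only adds when the files value is a tuple that includes one. The sketch below uses a placeholder URL, placeholder credentials and an in-memory file.

```python
import io

import requests

# Placeholder endpoint for illustration; it is never contacted by prepare_upload().
URL = "https://server.example.invalid/remote/repository.checkInPackage.do"


def prepare_upload(fileobj, filename, content_type="application/octet-stream"):
    """Prepare (but do not send) a multipart POST with an explicit part type."""
    req = requests.Request(
        "POST",
        URL,
        params={
            "allowUnsignedPackages": "True",
            "option": "Normal",
            "branch": "Evaluation",
        },
        files={"file": (filename, fileobj, content_type)},
        auth=("domain\\user", "mypass"),  # placeholder credentials
    )
    return req.prepare()


def send(prepared):
    """Send the prepared request; verify=False mirrors curl's -k (testing only)."""
    with requests.Session() as session:
        return session.send(prepared, verify=False)


if __name__ == "__main__":
    prepared = prepare_upload(io.BytesIO(b"PK\x03\x04demo"), "afile.zip")
    print(prepared.url)
    for key, value in prepared.headers.items():
        print(f"{key}: {value}")
    # Show the part headers at the start of the multipart body.
    print(prepared.body[:200])
```

The printed part headers can be compared line by line with curl's trace to see whether the filename or the part Content-Type differs.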

Python version for curl --output

I need to call the GitLab API (v4) to download a project subdirectory (apparently new in v14.4, and seemingly not yet supported by the python-gitlab library). In curl this can be done with the following command:
curl --header "PRIVATE-TOKEN: A_Token001" http://192.168.156.55/api/v4/projects/10/repository/archive?path=ProjectSubDirectory --output ~./temp/ProjectSubDirectory.tar.gz
The issue is with the last part, the --output ~./GitLab/some_project_files/ProjectSubDirectory.tar.gz
I tried different approaches (.content, .text), which failed, for example:
...
response = requests.get(url=url, headers=headers, params=params).content
# and save the response content with open(...)
but in all cases it saved an invalid tar.gz file or ran into other issues.
I even tried https://curlconverter.com/, but the code it generates does not work either. It seems to ignore precisely the --output parameter and shows nothing about the file itself:
headers = {'PRIVATE-TOKEN': 'A_Token001',}
params = (('path', 'ProjectSubDirectory'),)
response = requests.get('http://192.168.156.55/api/v4/projects/10/repository/archive', headers=headers, params=params)
For now, I have just created a script and call it via subprocess, but I don't much like this approach, since Python has libraries such as requests that I'd guess offer some way to do the same...

Two key things:
1. Allow redirects.
2. Use raise_for_status() to make sure the request was successful before writing the file. This will help uncover other potential issues, like failed authentication.
After that, write response.content to a file opened in binary mode for writing ('wb'):
import requests
url = "https://..."
headers = {} # ...
params = {} # ...
output_path = 'path/to/local/file.tar.gz'
response = requests.get(url, headers=headers, params=params, allow_redirects=True)
response.raise_for_status() # make sure the request is successful
with open(output_path, 'wb') as f:
    f.write(response.content)
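For large archives, a variation on the code above is to stream the response to disk instead of holding response.content in memory. This is a minimal sketch of that variation; the function names, chunk size and timeout are choices made for illustration.

```python
import requests


def write_chunks(chunks, output_path):
    """Write an iterable of byte chunks to output_path; return the bytes written."""
    total = 0
    with open(output_path, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fh.write(chunk)
                total += len(chunk)
    return total


def download(url, output_path, headers=None, params=None, chunk_size=64 * 1024):
    """Stream a GET response to a file, roughly like curl --output."""
    with requests.get(
        url, headers=headers, params=params, stream=True, timeout=60
    ) as response:
        response.raise_for_status()  # fail before creating the output file
        return write_chunks(response.iter_content(chunk_size=chunk_size), output_path)
```

A call would look like download(url, 'path/to/local/file.tar.gz', headers=headers, params=params), with the same url, headers and params as in the answer.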

How to implement a CURL -o in Python using requests or aiohttp

I need to send a GET request to a URL with certain headers and parameters from my Python code and then write the response to an Excel file. I am trying the code below, but I get the error shown below when I try to open the Excel file. Also given below is a redacted version of a curl command that I use to test this API endpoint successfully, but that I cannot replicate in my Python code.
Python Code:
import requests
rheaders = {
    "Content-type": "application/json",
    "Accept": "application/json",
    "Authorization": "Token XXX",
}
url = "https://api.YYY"
r = requests.get(url, headers=rheaders)
with open("test.xlsx", mode="wb") as output:
    output.write(r.content)
Error: Excel cannot open the file 'test.xlsx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.
The associated curl command, which runs fine in the macOS terminal and writes the output file without errors:
curl -H "Accept: application/json; indent=4" -H 'Content-Type: application/json' -H 'Authorization: Token XXX' https://api.YYY --output ~/Downloads/output.xlsx
I have tried replacing the content-type header with this:
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
No luck. Any suggestions? The above is my attempt with the Python requests library. I have also tried using aiohttp.

One possibility is that the bytes written to the file are not in Excel's format. Saving a response as test.xlsx does not convert it, so if the API returns JSON (as the Accept: application/json header requests), Excel will reject the file. To produce a workbook from such data, use a library that can write .xlsx files, such as openpyxl (xlrd only reads Excel files). Alternatively, you could write a .csv file instead, which needs only the standard csv module and would still be readable by Excel.
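Before saving, it may help to check what the server actually returned. An .xlsx workbook is a zip archive that contains an entry named [Content_Types].xml, so the bytes can be tested with the standard library alone. The helper name below is made up for this sketch.

```python
import io
import zipfile


def looks_like_xlsx(content):
    """Return True if the bytes are a zip archive with an OOXML content-types entry."""
    buf = io.BytesIO(content)
    if not zipfile.is_zipfile(buf):
        return False
    with zipfile.ZipFile(buf) as zf:
        return "[Content_Types].xml" in zf.namelist()
```

With the code from the question, calling looks_like_xlsx(r.content) and printing r.headers.get("Content-Type") before opening test.xlsx would show whether the response is a workbook or, for instance, a JSON document.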

How do I use requests.put() to upload a file using Python?

I am trying to use the requests library in Python to upload a file into Fedora commons repository on localhost. I'm fairly certain my main problem is not understanding open() / read() and what I need to do to send data with an http request.
import requests
from requests.auth import HTTPBasicAuth
def postBinary(fileName, dirPath, url):
    path = dirPath + '/' + fileName
    print('to ' + url + '\n' + path)
    openBin = {'file': (fileName, open(path, 'rb').read())}
    headers = {'Slug': fileName}  # not important
    r = requests.put(url, files=openBin, headers=headers, auth=HTTPBasicAuth('username', 'pass'))
    print(r.text)
    print("and the url used:")
    print(r.url)
This successfully uploads a file to the repository, but the stored file is slightly larger and corrupted. For example, an image that was 6.6 kB became 6.75 kB and could no longer be opened.
So how should I properly open and upload a file using PUT in Python?
Extra details:
When I replace files=openBin with data=openBin, the request body ends up being my dictionary, form-encoded as a string, I presume. I don't know if that information is helpful or not.
"file=FILE_NAME.extension&file=TYPE89a%24%02Q%03%E7%FF%00E%5B%19%FC%....
and the size of the file increases to a number of megabytes.
I am specifically using PUT because the Fedora RESTful HTTP API endpoint says to use PUT.
The following command does work:
curl -u username:password -H "Content-Type: text/plain" -X PUT -T /path/to/someFile.jpeg http://localhost:8080/fcrepo/rest/someFile.jpeg

Updated
Using requests.put() with the files parameter sends a multipart/form-data encoded request, which the server does not seem able to handle without corrupting the data, even when the correct content type is declared.
The curl command simply performs a PUT with the raw file data in the body of the request. You can create a similar request by passing the open file in the data parameter and specifying the content type in the headers:
headers = {'Content-type': 'image/jpeg', 'Slug': fileName}
r = requests.put(url, data=open(path, 'rb'), headers=headers, auth=('username', 'pass'))
You can vary the Content-type header to suit the payload as required.
Try setting the Content-type for the file.
If you are sure that it is a text file, try text/plain, which you used in your curl command (even though you appear to be uploading a JPEG file). For a JPEG image, however, you should use image/jpeg.
Otherwise, for arbitrary binary data, you can use application/octet-stream. The content type goes in the third element of the files tuple:
openBin = {'file': (fileName, open(path,'rb'), 'image/jpeg' )}
Also, it is not necessary to read the file contents explicitly in your code; requests will do that for you, so just pass the open file handle as shown above.
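Putting the raw-body approach together, a small helper could pick the Content-type from the file name and close the file once the upload finishes. This is a sketch under assumptions: the function names are invented here, and guessing the type with the standard mimetypes module is a convenience added for illustration, not something the Fedora API requires.

```python
import mimetypes

import requests


def guess_content_type(path):
    """Guess a MIME type from the file name, falling back to a generic binary type."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or "application/octet-stream"


def put_file(url, path, auth=None):
    """PUT the file's raw bytes as the request body, like curl -T."""
    headers = {"Content-Type": guess_content_type(path)}
    with open(path, "rb") as fh:
        # Passing the open file lets requests stream it from disk.
        return requests.put(url, data=fh, headers=headers, auth=auth)
```

For the example in the question this would be called as put_file('http://localhost:8080/fcrepo/rest/someFile.jpeg', '/path/to/someFile.jpeg', auth=('username', 'pass')).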
