Write files to disk with python 3.x - python

Using BottlePy, I use the following code to upload a file and write it to disk :
upload = request.files.get('upload')
raw = upload.file.read()
filename = upload.filename
with open(filename, 'w') as f:
    f.write(raw)
return "You uploaded %s (%d bytes)." % (filename, len(raw))
It returns the correct number of bytes every time.
The upload works fine for files like .txt, .php, .css ...
But it produces a corrupted file for other types like .jpg, .png, .pdf, .xls ...
I tried changing the open() call:
with open(filename, 'wb') as f:
It returns the following error:
TypeError('must be bytes or buffer, not str',)
I guess it's an issue related to binary files?
Is there something to install on top of Python to make uploads work for any file type?
Update
Just to be sure, as pointed out by @thkang, I tried this with the dev version of BottlePy and its built-in .save() method:
upload = request.files.get('upload')
upload.save(upload.filename)
It raises the exact same exception:
TypeError('must be bytes or buffer, not str',)
Update 2
Here is the final code, which "works" (it no longer raises TypeError('must be bytes or buffer, not str',)):
upload = request.files.get('upload')
raw = upload.file.read().encode()
filename = upload.filename
with open(filename, 'wb') as f:
    f.write(raw)
Unfortunately, the result is the same: every .txt file works fine, but other files like .jpg, .pdf ... are corrupted.
I've also noticed that the corrupted files are larger than the originals (before upload).
This binary handling must be the issue with Python 3.x.
Note:
I use Python 3.1.3
I use BottlePy 0.11.6 (raw bottle.py file, no 2to3 on it or anything)

Try this:
upload = request.files.get('upload')
with open(upload.file, "rb") as f1:
    raw = f1.read()
filename = upload.filename
with open(filename, 'wb') as f:
    f.write(raw)
return "You uploaded %s (%d bytes)." % (filename, len(raw))
Update
Try value:
# Get a cgi.FieldStorage object
upload = request.files.get('upload')
# Get the data
raw = upload.value
# Write to file
filename = upload.filename
with open(filename, 'wb') as f:
    f.write(raw)
return "You uploaded %s (%d bytes)." % (filename, len(raw))
Update 2
See this thread, which seems to do the same thing you are trying to do:
# Test if the file was uploaded
if fileitem.filename:
    # strip leading path from file name to avoid directory traversal attacks
    fn = os.path.basename(fileitem.filename)
    open('files/' + fn, 'wb').write(fileitem.file.read())
    message = 'The file "' + fn + '" was uploaded successfully'
else:
    message = 'No file was uploaded'
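The path-stripping step in that snippet can be tried on its own. This is only a sketch: `safe_name` is a hypothetical helper name, and os.path.basename splits only on the separators of the platform the code runs on, so it is a first line of defence rather than complete validation.

```python
import os

def safe_name(client_filename):
    """Drop any directory part from a client-supplied file name."""
    return os.path.basename(client_filename)

# Both calls keep only the last path component.
print(safe_name('photos/holiday.jpg'))
print(safe_name('../../etc/passwd'))
```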

In Python 3.x all strings are Unicode, so the value returned by read() in this file upload code has to be converted.
Here read() returns a Unicode string as well, which you can convert back into proper bytes with the encode() method.
Use the code from my original question, and replace the line
raw = upload.file.read()
with
raw = upload.file.read().encode('ISO-8859-1')
That's all ;)
Further reading : http://python3porting.com/problems.html
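The size growth noted in Update 2, and the effect of encode('ISO-8859-1'), can be checked without Bottle. The sketch below assumes, as the fix implies, that the uploaded bytes were decoded to str as ISO-8859-1 somewhere upstream; `original` is a made-up payload, not real upload data.

```python
# Made-up payload containing every possible byte value.
original = bytes(range(256))

# Assumption: the upload was decoded as ISO-8859-1, which maps each
# byte to the code point with the same number.
as_text = original.decode('ISO-8859-1')

# encode() defaults to UTF-8, which needs two bytes for code points 128-255,
# so the result is longer than the original.
utf8_bytes = as_text.encode()

# Encoding with the codec that was used for decoding restores the bytes.
latin1_bytes = as_text.encode('ISO-8859-1')

print(len(original), len(utf8_bytes), len(latin1_bytes))
print(latin1_bytes == original)
```

If the assumption holds, this would explain both symptoms: the default encode() inflates every byte above 127, while the ISO-8859-1 round trip is lossless.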

Related

Python issue in writing file to pdf

I am working with PDF content in Python, and my input from a service response is of type _io.BufferedRandom. I need to save this file as a PDF within my service for further use.
response = open('test_file.pdf', 'rb+')
This is the input to my service; it is of type _io.BufferedRandom.
with open('output.pdf', 'wb+') as f:
    f.write(response)
Doing this, I get the error: TypeError: a bytes-like object is required, not '_io.BufferedRandom'
Any help is appreciated, thank you.
open() returns a file object that you use to read, write, or append to a file, for example:
open(filename, mode)
f = open('workfile', 'w')
In your case, you are trying to write the file object itself to another file, not its content:
f.write(response)
So you need to call read() on it:
f.read(size) - reads from the file and returns a string (in text mode) or a bytes object (in binary mode).
The final procedure is then:
with open('output.pdf', 'wb+') as f:
    f.write(response.read())
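For large files, the same copy can be done in chunks instead of one read() call. A minimal sketch, assuming the input is any open binary file object; `save_stream` is a hypothetical helper name and the placeholder bytes stand in for a real PDF.

```python
import os
import shutil
import tempfile

def save_stream(source, dest_path):
    """Copy everything from an open binary file object to dest_path."""
    with open(dest_path, 'wb') as dest:
        # copyfileobj reads and writes in chunks, so the whole file
        # is never held in memory at once.
        shutil.copyfileobj(source, dest)

# Demo with throwaway files standing in for the real PDF.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, 'test_file.pdf')
out_path = os.path.join(workdir, 'output.pdf')
with open(src_path, 'wb') as f:
    f.write(b'%PDF-1.4 placeholder bytes')

with open(src_path, 'rb+') as response:  # an _io.BufferedRandom object
    save_stream(response, out_path)
```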

Open file to read from web form

I am working on a project where I have to upload a file from file storage (via a web form) to MongoDB. To achieve this, I need to open the file in "rb" mode, encode it, and finally upload it to MongoDB. I am stuck at opening the file in "rb" mode.
if form.validate():
    for inFile in request.files.getlist("file"):
        connection = pymongo.MongoClient()
        db = connection.test
        uploads = db.uploads
        with open(inFile, "rb") as fin:
            f = fin.read()
            encoded = Binary(f,0)
        try:
            uploads.insert({"binFile": encoded})
            check = True
        except Exception as e:
            self.errorList.append("Document upload is unsuccessful"+e)
            check = False
The above code throws TypeError: coercing to Unicode: need string or buffer, FileStorage found in the open step, i.e. this line:
with open(inFile, "rb") as fin:
Is there a way I can change my code to make it work?
Thanks in advance.
The FileStorage object is already file-like, so you can use it as a file. You don't need to call open() on it; just call inFile.read().
If this doesn't work for you for some reason, you can save the file to disk first using inFile.save() and open it from there.
Reference: http://werkzeug.pocoo.org/docs/0.11/datastructures/#werkzeug.datastructures.FileStorage
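The idea of reading the upload directly can be sketched without a web framework. Here io.BytesIO is only a stand-in for the uploaded file object, and `read_upload` is a hypothetical helper name; the point is that a file-like object is read as it is, with no open() call.

```python
import io

def read_upload(file_like):
    """Return the full contents of an already-open, file-like upload."""
    # No open() here: the object already behaves like an open file.
    return file_like.read()

# io.BytesIO plays the role of the uploaded file in this demo.
fake_upload = io.BytesIO(b'\x89PNG\r\n\x1a\n')
data = read_upload(fake_upload)
print(type(data), len(data))
```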

Write Binary data with python to a zip file

I am trying to write binary data to a zip file.
The code below works, but if I add .zip as a file extension to "check" in the variable x, nothing is written to the file. I am stuck manually adding .zip.
urla = "some url"
tok = "some token"
pp = {"token": tok}
t = requests.get(urla, params=pp)
b = t.content
x = r"C:\temp" + "\check"
z = 'C:\temp\checks.zip'
with open(x, "wb") as work:
    work.write(b)
To have the correct extension appended to the file, I attempted to use the ZipFile class:
with ZipFile(x, "wb") as work:
    work.write(b)
but get a RuntimeError:
RuntimeError: ZipFile() requires mode "r", "w", or "a"
If I remove the b flag, an empty zip file is created and I get a TypeError:
TypeError: must be encoded string without NULL bytes, not str
I also tried renaming the file, but that creates a corrupted zip file:
os.rename(x, z )
How do you write binary data to a zip file?
I converted a zip file into binary data and was able to regenerate the zip file in the following way:
bin_data = b"\x00\x12"  # Whatever binary data you have stored in a variable
binary_file_path = 'file.zip'  # Name of the new zip file you want to regenerate
with open(binary_file_path, 'wb') as f:
    f.write(bin_data)
Use the writestr method.
import zipfile
z = zipfile.ZipFile(path, 'w')
z.writestr(filename, data)  # data holds the bytes to store
z.close()
Reference: zipfile.ZipFile.writestr
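A self-contained version of the writestr approach, as a sketch: the payload, the member name 'check.bin', and the temporary output path are all made up for the demo.

```python
import os
import tempfile
import zipfile

payload = b'\x00\x01binary\xff'  # made-up binary data
zip_path = os.path.join(tempfile.mkdtemp(), 'checks.zip')

# The mode is 'w', not 'wb': ZipFile accepts only modes such as 'r', 'w', 'a'.
with zipfile.ZipFile(zip_path, 'w') as archive:
    archive.writestr('check.bin', payload)  # member name first, then the data

# Read the archive back to confirm the member is stored unchanged.
with zipfile.ZipFile(zip_path, 'r') as archive:
    names = archive.namelist()
    restored = archive.read('check.bin')
```

Using the with statement closes the archive automatically, which is what finalizes the zip directory on disk.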
With ZipFile.write() you don't write the data directly to the zip file. You write it to a regular file first, then pass that file's path to the zip file.
binary_file_path = '/path/to/binary/file.ext'
with open(binary_file_path, 'wb') as f:
    f.write(b'BINARYDATA')
zip_file_path = '/path/to/zip/file.zip'
with ZipFile(zip_file_path, 'w') as zip_file:
    zip_file.write(binary_file_path)

PyPDF2 IOError: [Errno 22] Invalid argument on PyPdfFileReader Python 2.7

Goal = Open file, encrypt file, write encrypted file.
I am trying to use the PyPDF2 module to accomplish this. I have verified that "input" is a file-type object. I have researched this error, and it translates to "file not found". I believe it is linked somehow to the file or file path, but I am unsure how to debug or troubleshoot it. I am getting the following error:
Traceback (most recent call last):
File "CommissionSecurity.py", line 52, in <module>
inputStream = PyPDF2.PdfFileReader(input)
File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1065, in __init__
File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1660, in read
IOError: [Errno 22] Invalid argument
Below is the relevant code. I'm not sure how to correct this issue because I'm not really sure what the issue is. Any guidance is appreciated.
for ID in FileDict:
    if ID in EmailDict :
        path = "C:\\Apps\\CorVu\\DATA\\Reports\\AlliD\\Monthly Commission Reports\\Output\\pdcom1\\"
        #print os.listdir(path)
        file = os.path.join(path + FileDict[ID])
        with open(file, 'rb') as input:
            print type(input)
            inputStream = PyPDF2.PdfFileReader(input)
            output = PyPDF2.PdfFileWriter()
            output = inputStream.encrypt(EmailDict[ID][1])
        with open(file, 'wb') as outputStream:
            output.write(outputStream)
    else : continue
I think your problem might be caused by the fact that you use the same filename to both open and write to the file, opening it twice:
with open(file, 'rb') as input :
with open(file, 'wb') as outputStream :
The w mode truncates the file, so the second line truncates the input.
I'm not sure what your intention is, because you can't read from the beginning of the file and overwrite it at the same time. Even if you try to write to the end of the file, you'll have to position the file pointer somewhere.
So create an extra output file that has a different name; you can always rename that output file to your input file after both files are closed, thus overwriting your input file.
Or you could first read the complete file into memory, then write to it:
with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)
    output = PyPDF2.PdfFileWriter()
    output = input.encrypt(EmailDict[ID][1])
with open(file, 'wb') as outputStream:
    output.write(outputStream)
Notes:
you assign inputStream, but never use it
you assign PdfFileWriter() to output, and then assign something else to output in the next line. Hence, you never used the result from the first output = line.
Please check carefully what you're doing, because it feels there are numerous other problems with your code.
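The earlier suggestion of writing to a separate file and renaming it afterwards could look roughly like this. It is a Python 3 sketch that does not use PyPDF2: `rewrite_file` is a hypothetical helper, and bytes.upper merely stands in for the real processing step (such as encryption).

```python
import os
import tempfile

def rewrite_file(path, transform):
    """Replace the file at path with transform(old_bytes), via a temp file."""
    with open(path, 'rb') as src:
        data = src.read()  # the input is fully read and closed first
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, 'wb') as tmp:
        tmp.write(transform(data))
    # Swap the finished file into place only after it is completely written.
    os.replace(tmp_path, path)

# Demo on a throwaway file.
demo_path = os.path.join(tempfile.mkdtemp(), 'report.pdf')
with open(demo_path, 'wb') as f:
    f.write(b'original content')
rewrite_file(demo_path, bytes.upper)
```

Because the original is never opened in 'wb' mode, a failure during processing leaves it intact rather than truncated to an empty file.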
Alternatively, here are some other tips that may help:
The documentation suggests that you can also use the filename as first argument to PdfFileReader:
stream – A File object or an object that supports the standard read
and seek methods similar to a File object. Could also be a string
representing a path to a PDF file.
So try:
inputStream = PyPDF2.PdfFileReader(file)
You can also try to set the strict argument to False:
strict (bool) – Determines whether user should be warned of all
problems and also causes some correctable problems to be fatal.
Defaults to True.
For example:
inputStream = PyPDF2.PdfFileReader(file, strict=False)
Using open(file, 'rb') was causing the issue because PdfFileReader() does that automatically. I just removed the with statement, and that corrected the problem.
with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)
This error was raised because the PDF file was empty.
My PDF file was empty, which is why the error was raised. So first I filled my PDF file with some data and then started reading it using PyPDF2.PdfFileReader,
and that solved my problem!
Late, but you may be opening an invalid PDF file, or an empty file that is named x.pdf and that you assume is a PDF file.

query method to copy .pdf, .html, .jpeg files

def fetch(self, query, secret):
    if secret != self.secret: raise AccessDenied
    result = self.query(query)
    f = open(join(self.dirname, query), 'w')
    f.write(result)
    f.close()
    return 0
I am trying to let peers fetch files from one host to another using this method (a peer-to-peer program).
This method only handles text, because it opens the file in text mode and writes the contents to f.
How can I get .pdf, .mpeg, and .jpeg files copied/downloaded to the peer's directory?
As long as your query method supports binary, try 'wb' instead of 'w'.
To write binary data you should open the file using the file mode 'wb' (write binary). i.e.:
f = open(join(self.dirname, query), 'wb')
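A minimal sketch of the binary write, assuming the data arrives as bytes. `save_result` is a hypothetical helper, the temporary directory stands in for self.dirname, and the blob is made-up data rather than a real image.

```python
import os
import tempfile

def save_result(dirname, name, result):
    """Write bytes to dirname/name without any text-mode translation."""
    with open(os.path.join(dirname, name), 'wb') as f:
        f.write(result)

peer_dir = tempfile.mkdtemp()            # stand-in for self.dirname
blob = b'\xff\xd8\xff\xe0\r\n\x00JFIF'   # made-up binary data
save_result(peer_dir, 'photo.jpeg', blob)

# Read the file back in binary mode to compare with what was written.
with open(os.path.join(peer_dir, 'photo.jpeg'), 'rb') as f:
    round_trip = f.read()
```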
