I am trying to save a Base64-encoded image with Python. The string is too large to post here.
When Python receives it, the last 2 characters are ==, but the string is not formatted correctly, so I do this:
import base64
data = "data:image/png;base64," + photo_base64.replace(" ", "+")
And then I do this
imgdata = base64.b64decode(data)
filename = 'some_image.jpg' # I assume you have a way of picking unique filenames
with open(filename, 'wb') as f:
    f.write(imgdata)
But this causes this error
Traceback (most recent call last):
File "/var/www/cgi-bin/save_info.py", line 83, in <module>
imgdata = base64.b64decode(data)
File "/usr/lib64/python2.7/base64.py", line 76, in b64decode
raise TypeError(msg)
TypeError: Incorrect padding
I also printed the length of the string after prepending data:image/png;base64, and replacing the spaces with +, and it is 34354. I have tried a bunch of different images, but every saved file is reported as damaged when I try to open it.
What is happening and why is the file corrupt?
Thanks
EDIT
Here is some base64 that also failed
iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAMAAAAoLQ9TAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAADBQTFRFA6b1q Ci5/f2lt/9yu3 Y8v2cMpb1/DSJbz5i9R2NLwfLrWbw m T8I8////////SvMAbAAAABB0Uk5T////////////////////AOAjXRkAAACYSURBVHjaLI8JDgMgCAQ5BVG3//9t0XYTE2Y5BPq0IGpwtxtTP4G5IFNMnmEKuCopPKUN8VTNpEylNgmCxjZa2c1kafpHSvMkX6sWe7PTkwRX1dY7gdyMRHZdZ98CF6NZT2ecMVaL9tmzTtMYcwbP y3XeTgZkF5s1OSHwRzo1fkILgWC5R0X4BHYu7t/136wO71DbvwVYADUkQegpokSjwAAAABJRU5ErkJggg==
This is what I receive in my python script from the POST Request
Note that I have not replaced the spaces with +'s.
There is no need to add data:image/png;base64, in front of the data. I tried the code below and it works fine.
import base64
data = 'iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAMAAAAoLQ9TAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAADBQTFRFA6b1q Ci5/f2lt/9yu3 Y8v2cMpb1/DSJbz5i9R2NLwfLrWbw m T8I8////////SvMAbAAAABB0Uk5T////////////////////AOAjXRkAAACYSURBVHjaLI8JDgMgCAQ5BVG3//9t0XYTE2Y5BPq0IGpwtxtTP4G5IFNMnmEKuCopPKUN8VTNpEylNgmCxjZa2c1kafpHSvMkX6sWe7PTkwRX1dY7gdyMRHZdZ98CF6NZT2ecMVaL9tmzTtMYcwbP y3XeTgZkF5s1OSHwRzo1fkILgWC5R0X4BHYu7t/136wO71DbvwVYADUkQegpokSjwAAAABJRU5ErkJggg=='.replace(' ', '+')
imgdata = base64.b64decode(data)
filename = 'some_image.jpg' # I assume you have a way of picking unique filenames
with open(filename, 'wb') as f:
    f.write(imgdata)
If you prepend data:image/png;base64, to data, you get the error. If your data already has this prefix, you must remove it:
new_data = initial_data.replace('data:image/png;base64,', '')
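If the prefix is not always image/png, a slightly more general sketch may help. It is only an illustration: decode_posted_image is a hypothetical helper name, and it assumes the only damage in transit is '+' turned into spaces and possibly lost '=' padding.

```python
import base64


def decode_posted_image(payload):
    """Decode Base64 image text that may carry a data-URI prefix (hypothetical helper)."""
    # Drop an optional "data:<mime>;base64," header, whatever the MIME type is.
    if payload.startswith("data:") and "," in payload:
        payload = payload.split(",", 1)[1]
    # Form decoding turns '+' into ' ', so undo that.
    payload = payload.replace(" ", "+")
    # Restore any '=' padding that was lost in transit.
    payload += "=" * (-len(payload) % 4)
    return base64.b64decode(payload)
```

Calling decode_posted_image(photo_base64) and writing the result with open(filename, 'wb') should then give a readable image, provided the assumptions above hold.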
I am using the cryptography library for Python. My goal is to take a string, encrypt it and then write it to a file.
This may be done multiple times, each time appending more encrypted data to the end of the file.
I have tried a few solutions, such as:
Using the hazmat level API to avoid as much meta data stored in the encrypted text.
Writing each encrypted string to a new line in a text file.
This is the code that uses ECB mode and the hazmat API. It attempts to read the file and decrypt line by line. I understand it is unsafe, my main use is to log this data only locally to a file and then use a safe PKCS over the wire.
from cryptography import fernet
key = 'WqSAOfEoOdSP0c6i1CiyoOpTH2Gma3ff_G3BpDx52sE='
crypt_obj = fernet.Fernet(key)
file_handle = open('test.txt', 'a')
data = 'Hello1'
data = crypt_obj.encrypt(data.encode())
file_handle.write(data.decode() + '\n')
file_handle.close()
file_handle_two = open('test.txt', 'a')
data_two = 'Hello2'
data_two = crypt_obj.encrypt(data_two.encode())
file_handle_two.write(data_two.decode() + '\n')
file_handle_two.close()
file_read = open('test.txt', 'r')
file_lines = file_read.readlines()
file_content = ''
for line in file_lines:
    line = line[:-2]
    file_content = crypt_obj.decrypt(line.encode()).decode()
    print(file_content)
file_read.close()
For the code above I get the following error:
Traceback (most recent call last):
File "C:\Dev\Python\local_crypt_test\venv\lib\site-packages\cryptography\fernet.py", line 110, in _get_unverified_token_data
data = base64.urlsafe_b64decode(token)
File "C:\Users\19097\AppData\Local\Programs\Python\Python39\lib\base64.py", line 133, in urlsafe_b64decode
return b64decode(s)
File "C:\Users\19097\AppData\Local\Programs\Python\Python39\lib\base64.py", line 87, in b64decode
return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Dev\Python\local_crypt_test\main.py", line 25, in <module>
file_content = crypt_obj.decrypt(line.encode()).decode()
File "C:\Dev\Python\local_crypt_test\venv\lib\site-packages\cryptography\fernet.py", line 83, in decrypt
timestamp, data = Fernet._get_unverified_token_data(token)
File "C:\Dev\Python\local_crypt_test\venv\lib\site-packages\cryptography\fernet.py", line 112, in _get_unverified_token_data
raise InvalidToken
cryptography.fernet.InvalidToken
Process finished with exit code 1
These examples are only to demonstrate the issue, my real code looks much different so you may ignore errors in the example that do not pertain to my main issue. That is, appending encrypted data to a file and decrypting/reading that data from the file at a later time. The file does not need to be in any specific format, as long as it can be read from and decrypted to obtain the original message. Also, the mode of operation is not tied to ECB, if your example uses another type, that works too.
I am honestly stumped and would appreciate any help!
There are a couple details at play here...
1. Trailing newline character(s) are included in each line
When you loop through file_lines, each line includes the trailing newline character(s).
I say "character(s)" because this can vary based on the platform (e.g. Linux/macOS = '\n' versus Windows = '\r\n').
2. base64 decoding silently discards invalid characters
Fernet.encrypt(data) returns a bytes instance containing a base64 encoded "Fernet token".
Conversely, the first step Fernet.decrypt(token) takes is decoding the token by calling base64.urlsafe_b64decode(). This function uses the default non-validating behavior in which characters not within the base64 set are discarded (described here).
Note: This is why the answer from TheTS happens to work despite leaving the extraneous newline character intact.
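To see both effects without Fernet, here is a small sketch using only the standard library. The sample data is made up; the point is that a trailing newline is silently discarded, while slicing off two characters also removes an '=' and breaks the padding.

```python
import base64
import binascii

# 11 bytes encode to 16 Base64 characters ending in a single '='.
token = base64.urlsafe_b64encode(b"hello world")

# A trailing newline is outside the Base64 alphabet and is simply discarded.
with_newline = token + b"\n"
decoded = base64.urlsafe_b64decode(with_newline)

# line[:-2] removes the newline AND one '=' padding character.
truncated = with_newline[:-2]
try:
    base64.urlsafe_b64decode(truncated)
    truncation_error = None
except binascii.Error as exc:
    truncation_error = exc

print(decoded)
print(truncation_error)
```

The second decode fails with binascii.Error, which is the same exception that Fernet turns into InvalidToken in the traceback from the question.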
Solution
I'd recommend making sure you provide Fernet.decrypt() the token exactly as produced by Fernet.encrypt(). I'm guessing this is what you were trying to do by stripping the last two characters.
Here's an approach that should be safe and not platform dependent.
When you call open() for writing, provide the newline='\n' argument to prevent the default behavior of converting instances of '\n' to the platform dependent os.linesep value (in the section describing the newline argument, see the second bullet point detailing how the argument applies when writing files).
When processing each line, use rstrip('\n') to remove the expected trailing newline.
Here's a code example that demonstrates this:
#!/usr/bin/python3
from cryptography import fernet
to_encrypt = ['Hello1', 'Hello2']
output_file = 'test.txt'
key = 'WqSAOfEoOdSP0c6i1CiyoOpTH2Gma3ff_G3BpDx52sE='
crypt = fernet.Fernet(key)
print("ENCRYPTING...")
for data in to_encrypt:
    data_bytes = data.encode('utf-8')
    token_bytes = crypt.encrypt(data_bytes)
    print(f'data: {data}')
    print(f'token_bytes: {token_bytes}\n')
    with open(output_file, 'a', newline='\n') as f:
        f.write(token_bytes.decode('utf-8') + '\n')
print("\nDECRYPTING...")
with open(output_file, 'r') as f:
    for line in f:
        # Create a copy of line which shows the trailing newline.
        line_escaped = line.encode('unicode_escape').decode('utf-8')
        line_stripped = line.rstrip('\n')
        token_bytes = line_stripped.encode('utf-8')
        data = crypt.decrypt(token_bytes).decode('utf-8')
        print(f'line_escaped: {line_escaped}')
        print(f'token_bytes: {token_bytes}')
        print(f'decrypted data: {data}\n')
Output:
Note the trailing newline when line_escaped is printed.
$ python3 solution.py
ENCRYPTING...
data: Hello1
token_bytes: b'gAAAAABi-LAo-h8w-ayc267hrLbswMZtkT4RQQ9wt0EusYNrZGjuzbpyRLoKDZZF4oQPOU-iH1PnCc7vSIOoTVMLlCFnHTkN6A=='
data: Hello2
token_bytes: b'gAAAAABi-LAoHUT8Iu1bVMcGSIrFRvtVZQFh4O52XYSCgd0leYWS-n38irhv3Ch7oEx6SXazHwAL7a57ncFoMJTQQAms52yf3w=='
DECRYPTING...
line_escaped: gAAAAABi-LAo-h8w-ayc267hrLbswMZtkT4RQQ9wt0EusYNrZGjuzbpyRLoKDZZF4oQPOU-iH1PnCc7vSIOoTVMLlCFnHTkN6A==\n
token_bytes: b'gAAAAABi-LAo-h8w-ayc267hrLbswMZtkT4RQQ9wt0EusYNrZGjuzbpyRLoKDZZF4oQPOU-iH1PnCc7vSIOoTVMLlCFnHTkN6A=='
decrypted data: Hello1
line_escaped: gAAAAABi-LAoHUT8Iu1bVMcGSIrFRvtVZQFh4O52XYSCgd0leYWS-n38irhv3Ch7oEx6SXazHwAL7a57ncFoMJTQQAms52yf3w==\n
token_bytes: b'gAAAAABi-LAoHUT8Iu1bVMcGSIrFRvtVZQFh4O52XYSCgd0leYWS-n38irhv3Ch7oEx6SXazHwAL7a57ncFoMJTQQAms52yf3w=='
decrypted data: Hello2
from cryptography import fernet
key = 'WqSAOfEoOdSP0c6i1CiyoOpTH2Gma3ff_G3BpDx52sE='
crypt_obj = fernet.Fernet(key)
file_handle = open('test.txt', 'a')
data = 'Hello1'
data = crypt_obj.encrypt(data.encode('utf-8'))
file_handle.write(data.decode('utf-8') + '\n')
file_handle.close()
file_handle_two = open('test.txt', 'a')
data_two = 'Hello2'
data_two = crypt_obj.encrypt(data_two.encode('utf-8'))
file_handle_two.write(data_two.decode('utf-8') + '\n')
file_handle_two.close()
file_read = open('test.txt', 'r')
file_lines = file_read.readlines()
file_content = ''
for line in file_lines:
    # line = line[:-2]
    file_content = crypt_obj.decrypt(line.encode('utf-8')).decode()
    print(file_content)
file_read.close()
By slicing off the last two characters with line[:-2] you also remove characters that are needed for decoding (here, part of the token's '=' padding), so leave the line intact.
I need to send a JPG image over the network via JSON. I tried to convert the data into a str via base64, as below:
from PIL import Image
from tinydb import TinyDB, Query
import base64
import io
from pdb import set_trace as bp
# note: with 'encoding' in name, it is always a bytes obj
in_jpg_encoding = None
# open some random image
with open('rkt2.jpg', 'rb') as f:
    # The file content is a jpeg encoded bytes object
    in_jpg_encoding = f.read()
# output is a bytes object
in_b64_encoding = base64.b64encode(in_jpg_encoding)
# interpret above bytes as str, because json value need to be string
in_str = in_b64_encoding.decode(encoding='utf-8')
# in_str = str(in_b64_encoding) # alternative way of above statement
# simulates a transmission, e.g. sending the image data over internet using json
out_str = in_str
# strip-off the utf-8 interpretation to restore it as a base64 encoding
out_utf8_encoding = out_str.encode(encoding='utf-8')
# out_utf8_encoding = out_str.encode() # same way of writing above statement
# strip off the base64 encoding to restore it as its original jpeg encoded content
# note: output is still a bytes obj due to b64 decoding
out_b64_decoding = base64.b64decode(out_utf8_encoding)
out_jpg_encoding = out_b64_decoding
# ---- verification stage
out_jpg_file = io.BytesIO(out_jpg_encoding)
out_jpg_image = Image.open(out_jpg_file)
out_jpg_image.show()
But I get an error at the deserialization stage, saying that it cannot identify the image file:
Traceback (most recent call last):
File "3_test_img.py", line 38, in <module>
out_jpg_image = Image.open(out_jpg_file)
File "/home/gaopeng/Envs/venv_celeb_parser/lib/python3.6/site-packages/PIL/Image.py", line 2687, in open
% (filename if filename else fp))
OSError: cannot identify image file <_io.BytesIO object at 0x7f6f823c6b48>
Did I miss something?
I need to append many binary files into one binary file. All my binary files are saved in one folder:
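One way to narrow this down is to check the Base64/JSON round trip on its own, without PIL. The sketch below uses only the standard library; the function names and the "name"/"data" keys are made up for illustration. If the decoded bytes equal the original bytes, the encoding steps are probably not the cause and the input file itself is worth checking.

```python
import base64
import json


def image_bytes_to_json(raw, name):
    """Pack raw image bytes into a JSON string (hypothetical helper)."""
    # b64encode returns bytes; decode them to str so json.dumps accepts the value.
    payload = {"name": name, "data": base64.b64encode(raw).decode("ascii")}
    return json.dumps(payload)


def image_bytes_from_json(message):
    """Recover the name and the raw bytes from the JSON string (hypothetical helper)."""
    payload = json.loads(message)
    return payload["name"], base64.b64decode(payload["data"])
```

After a round trip, comparing the result with the bytes read from rkt2.jpg (out == in_jpg_encoding) shows whether anything was lost on the way.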
file1.bin
file2.bin
...
For that I tried this code:
import numpy as np
import glob
import os
Power_Result_File_Path ="/home/Deep_Learning_Based_Attack/Test.bin"
Folder_path =r'/home/Deep_Learning_Based_Attack/Test_Folder/'
os.chdir(Folder_path)
binfiles = glob.glob("*.bin")
loadedFiles = [np.load(bf) for bf in binfiles]
PowerArray=np.concatenate(loadedFiles, axis=0)
np.save(Power_Result_File_Path, PowerArray)
It gives me this error:
"Failed to interpret file %s as a pickle" % repr(file))
OSError: Failed to interpret file 'file.bin' as a pickle
My problem is how to concatenate the binary files; it is not about analysing every file independently.
Taking your question literally: Brute raw data concatenation
files = ['my_file1', 'my_file2']
out_data = b''
for fn in files:
    with open(fn, 'rb') as fp:
        out_data += fp.read()
with open('the_concatenation_of_all', 'wb') as fp:
    fp.write(out_data)
Comment about your example
You seem to be interpreting the files as saved NumPy arrays (i.e. saved via np.save()). The error, however, tells me that you didn't save those files via NumPy (because it fails to decode them). np.load expects NumPy's own .npy/.npz format and otherwise falls back to pickle, so if you try to open an arbitrary binary file with np.load the call will raise an error.
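For many or large files it may be preferable not to hold everything in memory. A minimal sketch, assuming the files should be joined in sorted name order; concatenate_files is a hypothetical helper name:

```python
import glob
import os
import shutil


def concatenate_files(pattern, output_path):
    """Append every file matching pattern to output_path, in sorted order."""
    output_abs = os.path.abspath(output_path)
    # Skip the output file itself in case it matches the pattern.
    paths = sorted(p for p in glob.glob(pattern) if os.path.abspath(p) != output_abs)
    with open(output_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                # Copies in chunks, so no file is loaded into memory as a whole.
                shutil.copyfileobj(src, out)
    return paths
```

With the paths from the question this would be called as concatenate_files('/home/Deep_Learning_Based_Attack/Test_Folder/*.bin', '/home/Deep_Learning_Based_Attack/Test.bin').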
# Note: this must run inside an "async def" function, with aiofiles imported.
for file in files:
    async with aiofiles.open(file, mode='rb') as f:
        contents = await f.read()
    if file == files[0]:
        write_mode = 'wb' # overwrite file
    else:
        write_mode = 'ab' # append to end of file
    async with aiofiles.open(output_file, write_mode) as f:
        await f.write(contents)
Using BottlePy, I use the following code to upload a file and write it to disk :
upload = request.files.get('upload')
raw = upload.file.read()
filename = upload.filename
with open(filename, 'w') as f:
    f.write(raw)
return "You uploaded %s (%d bytes)." % (filename, len(raw))
It returns the proper amount of bytes every single time.
The upload works fine for file like .txt, .php, .css ...
But it results in a corrupted file for other files like .jpg, .png, .pdf, .xls ...
I tried to change the open() function
with open(filename, 'wb') as f:
It returns the following error:
TypeError('must be bytes or buffer, not str',)
I guess its an issue related to binary files ?
Is there something to install on top of Python to run upload for any file type ?
Update
Just to be sure, as pointed out by #thkang I tried to code this using the dev version of bottlepy and the built-in method .save()
upload = request.files.get('upload')
upload.save(upload.filename)
It returns the exact same Exception error
TypeError('must be bytes or buffer, not str',)
Update 2
Here is the final code which "works" (and doesn't raise the error TypeError('must be bytes or buffer, not str',)):
upload = request.files.get('upload')
raw = upload.file.read().encode()
filename = upload.filename
with open(filename, 'wb') as f:
    f.write(raw)
Unfortunately, the result is the same: every .txt file works fine, but other files like .jpg, .pdf ... are corrupted.
I've also noticed that the corrupted files are larger than the originals (before upload).
This binary thing must be the issue with Python 3.x.
Note :
I use python 3.1.3
I use BottlePy 0.11.6 (raw bottle.py file, no 2to3 on it or anything)
Try this:
upload = request.files.get('upload')
with open(upload.file, "rb") as f1:
    raw = f1.read()
filename = upload.filename
with open(filename, 'wb') as f:
    f.write(raw)
return "You uploaded %s (%d bytes)." % (filename, len(raw))
Update
Try value:
# Get a cgi.FieldStorage object
upload = request.files.get('upload')
# Get the data
raw = upload.value;
# Write to file
filename = upload.filename
with open(filename, 'wb') as f:
    f.write(raw)
return "You uploaded %s (%d bytes)." % (filename, len(raw))
Update 2
See this thread, it seems to do same as what you are trying...
# Test if the file was uploaded
if fileitem.filename:
    # strip leading path from file name to avoid directory traversal attacks
    fn = os.path.basename(fileitem.filename)
    open('files/' + fn, 'wb').write(fileitem.file.read())
    message = 'The file "' + fn + '" was uploaded successfully'
else:
    message = 'No file was uploaded'
In Python 3.x all strings are Unicode, so you need to convert the result of the read() call used in this file upload code.
Here read() returns a Unicode string as well, which you can convert into proper bytes with the encode() function.
Use the code contained in my first question, and replace the line
raw = upload.file.read()
with
raw = upload.file.read().encode('ISO-8859-1')
That's all ;)
Further reading : http://python3porting.com/problems.html
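A small sketch of why the encoding matters, assuming (as the fix above implies) that the uploaded bytes were decoded as ISO-8859-1 before reaching the handler. It also shows why a plain encode(), which defaults to UTF-8, produces a larger, corrupted file:

```python
# Every possible byte value, standing in for the contents of a binary upload.
raw = bytes(range(256))

# ISO-8859-1 maps each byte to exactly one code point, so the round trip is lossless.
as_text = raw.decode("ISO-8859-1")
round_tripped = as_text.encode("ISO-8859-1")

# encode() defaults to UTF-8, which needs two bytes for code points 128-255.
utf8_size = len(as_text.encode())

print(round_tripped == raw)   # True
print(len(raw), utf8_size)    # 256 384
```

Plain-text files mostly contain bytes below 128, which is consistent with .txt uploads surviving while .jpg and .pdf uploads grew and broke.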
I want to encode an image into a string using the base64 module. I've ran into a problem though. How do I specify the image I want to be encoded? I tried using the directory to the image, but that simply leads to the directory being encoded. I want the actual image file to be encoded.
EDIT
I tried this snippet:
with open("C:\Python26\seriph1.BMP", "rb") as f:
    data12 = f.read()
UU = data12.encode("base64")
UUU = base64.b64decode(UU)
print UUU
self.image = ImageTk.PhotoImage(Image.open(UUU))
but I get the following error:
Traceback (most recent call last):
File "<string>", line 245, in run_nodebug
File "C:\Python26\GUI1.2.9.py", line 473, in <module>
app = simpleapp_tk(None)
File "C:\Python26\GUI1.2.9.py", line 14, in __init__
self.initialize()
File "C:\Python26\GUI1.2.9.py", line 431, in initialize
self.image = ImageTk.PhotoImage(Image.open(UUU))
File "C:\Python26\lib\site-packages\PIL\Image.py", line 1952, in open
fp = __builtin__.open(fp, "rb")
TypeError: file() argument 1 must be encoded string without NULL bytes, not str
What am I doing wrong?
I'm not sure I understand your question. I assume you are doing something along the lines of:
import base64
with open("yourfile.ext", "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read())
You have to open the file first of course, and read its contents - you cannot simply pass the path to the encode function.
Edit:
Ok, here is an update after you have edited your original question.
First of all, remember to use raw strings (prefix the string with 'r') when using path delimiters on Windows, to prevent accidentally hitting an escape character. Second, PIL's Image.open accepts either a filename or a file-like object (that is, the object has to provide read, seek and tell methods).
That being said, you can use cStringIO to create such an object from a memory buffer:
import cStringIO
import PIL.Image
# assume data contains your decoded image
file_like = cStringIO.StringIO(data)
img = PIL.Image.open(file_like)
img.show()
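On Python 3 the cStringIO module no longer exists; io.BytesIO plays the same role for bytes. A rough sketch with stand-in data (not a real image) showing that it offers the read, seek and tell methods mentioned above:

```python
import base64
import io

# Stand-in for real image bytes: the 8-byte PNG signature plus filler.
original = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
encoded = base64.b64encode(original)

# Wrap the decoded bytes in an in-memory, file-like object.
file_like = io.BytesIO(base64.b64decode(encoded))

header = file_like.read(8)    # read the first 8 bytes
position = file_like.tell()   # current offset, 8 after the read above
file_like.seek(0)             # rewind before handing the object to a consumer
```

With real image data, PIL.Image.open(file_like) should accept such an object in place of a filename.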
On Python 3, the first answer prints a string with the prefix b'.
That means your string will look like b'your_string'. To get rid of the prefix, decode the bytes before printing:
encoded_string= base64.b64encode(img_file.read())
print(encoded_string.decode('utf-8'))
I ran into this while converting an image to a Base64 string. You can also take a look at how I removed the prefix there: Image to base64 string and fix 'b from prefix.
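The difference between str() and decode() on the encoded bytes can be seen in a tiny example (the input b"hi" is arbitrary):

```python
import base64

encoded = base64.b64encode(b"hi")    # bytes: b'aGk='

# str() gives the printable representation, including the b'' wrapper.
as_repr = str(encoded)               # "b'aGk='"

# decode() gives just the Base64 characters as text.
as_text = encoded.decode("ascii")    # 'aGk='

print(as_repr)
print(as_text)
```

Only the decoded form can be passed back to base64.b64decode to recover the original bytes; the str() form carries the extra b' and ' characters.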
import base64
from PIL import Image
from io import BytesIO
with open("image.jpg", "rb") as image_file:
    data = base64.b64encode(image_file.read())
im = Image.open(BytesIO(base64.b64decode(data)))
im.save('image1.png', 'PNG')
Borrowing from what Ivo van der Wijk and gnibbler have developed earlier, this is a dynamic solution
import cStringIO
import PIL.Image
image_data = None
def imagetopy(image, output_file):
    with open(image, 'rb') as fin:
        image_data = fin.read()
    with open(output_file, 'w') as fout:
        fout.write('image_data = '+ repr(image_data))
def pytoimage(pyfile):
    pymodule = __import__(pyfile)
    img = PIL.Image.open(cStringIO.StringIO(pymodule.image_data))
    img.show()
if __name__ == '__main__':
    imagetopy('spot.png', 'wishes.py')
    pytoimage('wishes')
You can then decide to compile the output image file with Cython to make it cool. With this method, you can bundle all your graphics into one module.
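For Python 3, the same repr idea might look like the sketch below. The helper names and the module name embedded_image are made up for illustration, and the module is loaded by path with importlib rather than by a plain import:

```python
import importlib.util


def write_image_module(raw, module_path):
    """Write raw bytes into a .py file as a bytes literal (hypothetical helper)."""
    # repr() of a bytes object is pure ASCII, so text mode is safe here.
    with open(module_path, "w", encoding="ascii") as fout:
        fout.write("image_data = " + repr(raw) + "\n")


def load_image_module(module_path):
    """Import the generated file by path and return its image_data (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location("embedded_image", module_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.image_data
```

The returned bytes can then be wrapped in io.BytesIO and passed to PIL.Image.open, in the same way pytoimage does with cStringIO.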
As I said in your previous question, there is no need to base64 encode the string, it will only make the program slower. Just use the repr
>>> with open("images/image.gif", "rb") as fin:
...     image_data=fin.read()
...
>>> with open("image.py","wb") as fout:
...     fout.write("image_data="+repr(image_data))
...
Now the image is stored as a variable called image_data in a file called image.py
Start a fresh interpreter and import the image_data
>>> from image import image_data
>>>
This works for me:
import base64
import requests
# Getting image in bytes
response = requests.get("image_url")
# image encoding
encoded_image = base64.b64encode(response.content)
# image decoding; without it, it won't work due to some '\xff' error
decoded_image= base64.b64decode(encoded_image)