Decode Base64 after it has been saved as a string object? - python

I am fairly new to Python and am attempting to compile a text (.txt) document that acts as a save file and can be loaded later.
I would like it to be a standalone document that holds all attributes the user is working with (including some images that I would like to be saved in the file as encoded base64 binary strings).
I have written the program and it saves everything to the text file correctly (although I did have to pass the encoded values through a str()) but I am unable to access the images later for decoding. Here is an example of my creation of the text information:
if os.path.isfile("example.png"):  # if the user has created this type of image...
    with open("example.png", "rb") as image_file:
        image_data_base64_encoded_string = base64.b64encode(image_file.read())
    f = open("example_save.txt", 'a+')
    f.write(str(image_data_base64_encoded_string) + "\n")
    f.close()  # save its information to the text doc
And here is an example of one of my many attempts to re-access this information.
master.filename = filedialog.askopenfilename(initialdir="/", title="Select file", filetypes=((".txt files", "*.txt"), ("all files", "*.*")))
with open(master.filename) as f:
    image_import = f.readlines()[3]  # pulling the specific line the data string is in
    image_imported = tk.PhotoImage(data=image_import)
This is only the most recent of many attempts, and it still returns an error. I tried decoding the encoded information before passing it to the tkinter PhotoImage function, but I think Python may be seeing the encoded information as a string (since I made it one when I saved it), and I do not know how to change it back without altering the information.
Any help would be appreciated.

I would recommend using the Pillow module for working with images, but if you want to stick with your current approach, try the code below:
from tkinter import *
from tkinter import filedialog  # the star import does not bring in the filedialog submodule
import base64
import os
if os.path.isfile("example.png"):  # if the user has created this type of image...
    with open("example.png", "rb") as image_file:
        image_data_base64_encoded_string = base64.b64encode(image_file.read())
    f = open("example_save.txt", 'a+')
    # decode the Base64 bytes to str so that no b'...' wrapper is written
    f.write(image_data_base64_encoded_string.decode("utf-8") + "\n")
    f.close()
filename = filedialog.askopenfilename(initialdir="/", title="Select file", filetypes=((".txt files", "*.txt"), ("all files", "*.*")))
with open(filename) as f:
    # strip() removes the trailing newline left over from the save step
    image_import = f.readlines()[3].strip()
    image_imported = PhotoImage(data=image_import)  # assumes a Tk root window already exists
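The Base64 bytes need to be decoded to a string (UTF-8 here) before they are written; passing them through str() stores them with a b'...' wrapper instead. The trailing newline character was also preventing PhotoImage() from interpreting your data as an image, which is why the line is stripped when it is read back.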
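To see the save and load steps in isolation, here is a minimal sketch of the same round trip without tkinter. The stand-in bytes and the temporary save path are assumptions made up for the illustration; in the real program the bytes would come from example.png and the line index would be whatever your save file uses.

```python
import base64
import os
import tempfile

# Stand-in bytes (assumption); the real program would read these from example.png.
image_bytes = b"\x89PNG\r\n\x1a\n" + bytes(range(256))

# Hypothetical save file in a temporary directory.
save_path = os.path.join(tempfile.mkdtemp(), "example_save.txt")

# Save: Base64-encode, then decode the resulting ASCII bytes to str before writing.
with open(save_path, "a+") as f:
    f.write(base64.b64encode(image_bytes).decode("utf-8") + "\n")

# Load: read the line, drop the trailing newline, and Base64-decode it.
with open(save_path) as f:
    stored_line = f.readlines()[0].strip()

restored_bytes = base64.b64decode(stored_line)
print(restored_bytes == image_bytes)
```

The stored line is plain Base64 text with no b'...' wrapper, so it can be handed to a decoder (or to PhotoImage(data=...)) as it is.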

When you write out the value as such:
str(image_data_base64_encoded_string)
That's writing it as follows:
b'...blah...'
Look at the file you're writing and you'll find that the line is surrounded by b' '.
You want to decode the bytes into a string using the appropriate encoding for your file, for example:
f.write(image_data_base64_encoded_string.decode('utf-8') + "\n")
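If you already have save files whose lines were written through str(), the data is not lost. One possible way to recover it, sketched here with made-up sample bytes, is to let ast.literal_eval turn the textual b'...' literal back into a bytes object and then decode that as usual. Only use this on files you trust to contain such literals.

```python
import ast
import base64

original = b"example image bytes"  # made-up sample data

# What str() on the encoded bytes produces, i.e. what ended up in the save file.
bad_line = str(base64.b64encode(original)) + "\n"

# literal_eval parses the "b'...'" text back into a real bytes object.
encoded = ast.literal_eval(bad_line.strip())
recovered = base64.b64decode(encoded)
print(recovered == original)
```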

Related

ReConvert bytes to a file Python

For a personal project I would like to convert files of any type (pdf, png, mp3...) to bytes and then convert the bytes back to the original file type.
I have done the first part, but I need help with the second part.
In the following example, I read a .jpg file as bytes and save its content in the "content" object. Now I would like to convert "content" (bytes type) back to the original .jpg type.
test_file = open("cadenas.jpg", "rb")
content = test_file.read()
content
b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x0 ...
Could you help me?
Regards
Images are often passed around as Base64-encoded text.
This round trip should do the job:
import base64
with open('cadenas.jpg', 'rb') as test_file:
    content = test_file.read()
# encodebytes replaces encodestring, which was removed in Python 3.9
content_encode = base64.encodebytes(content)
content_decode = base64.decodebytes(content_encode)
with open('cadenas2.jpg', 'wb') as result_file:
    result_file.write(content_decode)
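Note that the Base64 step is optional if all you want is the file back: the bytes you read are already the file, so writing them out in binary mode reproduces it. A small sketch, using hypothetical file names in a temporary directory and a few made-up header bytes in place of a real .jpg:

```python
import os
import tempfile

work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, "source.bin")  # hypothetical stand-in for cadenas.jpg
copy_path = os.path.join(work_dir, "copy.bin")      # hypothetical stand-in for cadenas2.jpg

# Create a small binary file to read from (made-up bytes).
with open(source_path, "wb") as f:
    f.write(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00")

# Read the file as bytes ...
with open(source_path, "rb") as f:
    content = f.read()

# ... and write the same bytes back out in binary mode.
with open(copy_path, "wb") as f:
    f.write(content)

with open(copy_path, "rb") as f:
    copied = f.read()
print(copied == content)
```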

Create a pdf file, write in it and return its byte stream with PyMuPDF

Using PyMuPDF, I need to create a PDF file, write some text into it, and return its byte stream.
This is the code I have, but it uses the filesystem to create and save the file:
import io
import fitz

def create_pdf_bytes():  # illustrative name; wrapped in a function so that return is valid
    path = "PyMuPDF_test.pdf"
    doc = fitz.open()
    page = doc.newPage()
    where = fitz.Point(50, 100)
    page.insertText(where, "PDF created with PyMuPDF", fontsize=50)
    doc.save(path)  # Here I'm saving to the filesystem
    with open(path, "rb") as file:
        return io.BytesIO(file.read()).getvalue()
Is there a way I can create a PDF file, write some text in it, and return its byte stream without using the filesystem?
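Checking the documentation for save(), I found write(), which returns the document directly as bytes. (Newer PyMuPDF releases use snake_case names, new_page() and insert_text(), and provide tobytes() in place of write().)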
import fitz
#path = "PyMuPDF_test.pdf"
doc = fitz.open()
page = doc.newPage()
where = fitz.Point(50, 100)
page.insertText(where, "PDF created with PyMuPDF", fontsize=50)
print(doc.write())

Parse excel attachment from .eml file in python

I'm trying to parse a .eml file. The .eml has an Excel attachment that's currently Base64 encoded. I'm trying to figure out how to decode it into XML so that I can later turn it into a CSV I can do stuff with.
This is my code right now:
import email
data = file('Openworkorders.eml').read()
msg = email.message_from_string(data)
for part in msg.walk():
    c_type = part.get_content_type()
    c_disp = part.get('Content-Disposition')
    if part.get_content_type() == 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet':
        excelContents = part.get_payload(decode = True)
        print excelContents
The problem is that when I try to decode it, it spits back unreadable output.
I used this post to help me write the code above: How can I get an email message's text content using Python?
Update:
This is exactly following the post's solution with my file, but part.get_payload() returns everything still encoded. I haven't figured out how to access the decoded content this way.
import email
data = file('Openworkorders.eml').read()
msg = email.message_from_string(data)
for part in msg.walk():
    if part.get_content_type() == 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet':
        name = part.get_param('name') or 'MyDoc.doc'
        f = open(name, 'wb')
        f.write(part.get_payload(None, True))
        f.close()
    print part.get("content-transfer-encoding")
As is clear from this table (and as you have already concluded), this file is an .xlsx. You can't just decode it as Unicode or Base64 text: you need a dedicated package. Excel files specifically are a bit trickier (for example, this one handles PowerPoint and Word, but not Excel). There are a few online, see here; xlrd might be the best.
Here is my solution.
I found out two things:
1.) I thought open() was going inside the .eml and changing the selected decoded elements, and that I needed to see decoded data before moving forward. What open() really does is create a new file, the .xlsx, in the same directory. You must write the attachment out to disk before you can work with the data.
2.) You must open an xlrd workbook with the file path.
import email
import xlrd
data = file('EmailFileName.eml').read()
msg = email.message_from_string(data)  # entire message
if msg.is_multipart():
    for payload in msg.get_payload():
        bdy = payload.get_payload()
else:
    bdy = msg.get_payload()
attachment = msg.get_payload()[1]
# open and save excel file to disk
f = open('excelFile.xlsx', 'wb')
f.write(attachment.get_payload(decode=True))
f.close()
# path of the file just written; a full path such as
# '/Users/mymac/thisProjectsFolder/excelFileName.xlsx' also works
excelFilePath = 'excelFile.xlsx'
xls = xlrd.open_workbook(excelFilePath)
# Here's a bonus for how to start accessing excel cells and rows
for sheet in xls.sheets():
    cell_values = []  # renamed from list to avoid shadowing the builtin
    for row in range(sheet.nrows):
        for col in range(sheet.ncols):
            cell_values.append(str(sheet.cell(row, col).value))
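The snippets in this thread are Python 2. For reference, here is a rough Python 3 sketch of the extraction step only, using the standard email package. To keep it self-contained it builds a small message in memory instead of reading a real .eml file; the subject, the attachment file name and the fake workbook bytes are all made up for the illustration.

```python
import email
from email import policy
from email.message import EmailMessage

XLSX_SUBTYPE = "vnd.openxmlformats-officedocument.spreadsheetml.sheet"
XLSX_TYPE = "application/" + XLSX_SUBTYPE

# Build a stand-in for the .eml file (all values here are made up).
msg = EmailMessage()
msg["Subject"] = "Open work orders"
msg.set_content("See attached.")
fake_workbook = b"PK\x03\x04 not a real workbook"
msg.add_attachment(fake_workbook, maintype="application",
                   subtype=XLSX_SUBTYPE, filename="Openworkorders.xlsx")
raw_eml = msg.as_bytes()  # what open("file.eml", "rb").read() would give you

# Parse the raw bytes and collect the decoded spreadsheet attachments.
parsed = email.message_from_bytes(raw_eml, policy=policy.default)
attachments = {}
for part in parsed.walk():
    if part.get_content_type() == XLSX_TYPE:
        # decode=True undoes the Content-Transfer-Encoding (Base64 here)
        attachments[part.get_filename()] = part.get_payload(decode=True)

print(sorted(attachments))
```

Each value in attachments is the raw .xlsx content as bytes, ready to be written to disk in 'wb' mode and opened with a spreadsheet library.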

KineticJS toDataURL() gives incorrect padding error in Python

I have a base64 string that I've acquired from KineticJS toDataURL(). It's a valid base64 string. This fiddle shows that it is: http://jsfiddle.net/FakRe/
My Problem: I'm sending this dataURI up to my server, decoding using python, and saving on the server. However, I'm getting a padding error. Here's my code:
def base64_to_file(self, image_data, image_name):
    extension = re.search('data\:image\/(\w+)\;', image_data)
    if extension:
        ext = extension.group(1)
        image_name = '{0}.{1}'.format(image_name, ext)
    else:
        # it's not in a format we understand
        return None
    image_data = image_data + '=' #fix incorrect padding
    image_path = os.path.join('/my/image/path/', image_name)
    image_file = open(image_path, 'w+')
    image_file.write(image_data.decode('base64'))
    image_file.close()
    return image_file
I can get around this padding error by doing this at the top of my function:
image_data = image_data + '=' #fixes incorrect padding
After I add the arbitrary padding, it decodes successfully and writes to the filesystem. However, whenever I try and view the image, it doesn't render, and gives me the 'broken image' icon. No 404, just a broken image. What gives? Any help would be greatly appreciated. I've already spent way too much time on this as it is.
Steps to recreate (May be pedantic but I'll try and help)
Grab the base64 string from the JSFiddle
Save it to a text file
Open up the file in python, read in the data, save to variable
Change '/my/image/path/' in the function to anywhere on your machine
Pass your encoded text variable into the function with a name
See the output
Again, any help would be awesome. Thanks.
If you need to add padding, you have the wrong string. Make sure you are parsing the data URI correctly: the data:image/png;base64, prefix is metadata about the encoded value, and only the part after the comma is the actual Base64 value.
The actual data portion is 223548 characters long:
>>> len(image_data)
223548
>>> import hashlib
>>> hashlib.md5(image_data).hexdigest()
'03918c3508fef1286af8784dc65f23ff'
If your URI still includes the data: prefix, do parse that out:
from urllib import unquote
if image_data.startswith('data:'):
    params, data = image_data.split(',', 1)
    params = params[5:] or 'text/plain;charset=US-ASCII'
    params = params.split(';')
    if '=' not in params[0] and '/' in params[0]:
        mimetype = params.pop(0)
    else:
        mimetype = 'text/plain'
    if 'base64' in params:
        # handle base64 parameters first
        data = data.decode('base64')
    for param in params:
        if param.startswith('charset='):
            # handle characterset parameter
            data = unquote(data).decode(param.split('=', 1)[-1])
This then lets you make some more informed decisions about what extension to use for image URLs, for example, as you now have the mimetype parsed out as well.
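The code above is Python 2. A rough Python 3 sketch of the same idea follows; parse_data_uri is a hypothetical helper name, it skips the charset handling, and the sample payload is just a few made-up bytes rather than a real image.

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_uri(uri):
    """Return (mimetype, payload_bytes) for a data: URI (hypothetical helper)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is metadata, everything after is the value.
    header, _, data = uri[5:].partition(",")
    params = header.split(";") if header else []
    mimetype = "text/plain"
    if params and "/" in params[0] and "=" not in params[0]:
        mimetype = params.pop(0)
    if "base64" in params:
        return mimetype, base64.b64decode(data)
    # Without the base64 flag the value is percent-encoded.
    return mimetype, unquote_to_bytes(data)


sample = b"\x89PNG\r\n\x1a\n"  # made-up payload
uri = "data:image/png;base64," + base64.b64encode(sample).decode("ascii")
mimetype, payload = parse_data_uri(uri)
extension = mimetype.split("/")[1]
print(mimetype, extension, payload == sample)
```

The returned payload is bytes, so it should be written with the file opened in 'wb' mode.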

Encoding an image file with base64

I want to encode an image into a string using the base64 module. I've run into a problem though. How do I specify the image I want to be encoded? I tried using the directory to the image, but that simply leads to the directory being encoded. I want the actual image file to be encoded.
EDIT
I tried this snippet:
with open("C:\Python26\seriph1.BMP", "rb") as f:
    data12 = f.read()
    UU = data12.encode("base64")
    UUU = base64.b64decode(UU)
    print UUU
    self.image = ImageTk.PhotoImage(Image.open(UUU))
but I get the following error:
Traceback (most recent call last):
  File "<string>", line 245, in run_nodebug
  File "C:\Python26\GUI1.2.9.py", line 473, in <module>
    app = simpleapp_tk(None)
  File "C:\Python26\GUI1.2.9.py", line 14, in __init__
    self.initialize()
  File "C:\Python26\GUI1.2.9.py", line 431, in initialize
    self.image = ImageTk.PhotoImage(Image.open(UUU))
  File "C:\Python26\lib\site-packages\PIL\Image.py", line 1952, in open
    fp = __builtin__.open(fp, "rb")
TypeError: file() argument 1 must be encoded string without NULL bytes, not str
What am I doing wrong?
I'm not sure I understand your question. I assume you are doing something along the lines of:
import base64
with open("yourfile.ext", "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read())
You have to open the file first of course, and read its contents - you cannot simply pass the path to the encode function.
Edit:
Ok, here is an update after you have edited your original question.
First of all, remember to use raw strings (prefix the string with 'r') when using path delimiters on Windows, to prevent accidentally hitting an escape character. Second, PIL's Image.open either accepts a filename, or a file-like (that is, the object has to provide read, seek and tell methods).
That being said, you can use cStringIO to create such an object from a memory buffer:
import cStringIO
import PIL.Image
# assume data contains your decoded image
file_like = cStringIO.StringIO(data)
img = PIL.Image.open(file_like)
img.show()
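In Python 3 the cStringIO module no longer exists, and io.BytesIO fills the same role for binary data. The sketch below only demonstrates the file-like behaviour (read, tell, seek) that Image.open relies on; the "image" is a few made-up bytes, so PIL itself is left out.

```python
import base64
import io

# Made-up bytes standing in for real image data.
encoded = base64.b64encode(b"GIF89a fake image bytes")
data = base64.b64decode(encoded)

# Wrap the decoded bytes in an in-memory, file-like object.
file_like = io.BytesIO(data)
header = file_like.read(6)   # read the first six bytes
position = file_like.tell()  # current offset in the buffer
file_like.seek(0)            # rewind before handing it to a consumer
print(header, position)
```

With Pillow installed and real image data, file_like could then be passed to PIL.Image.open in the same way as the cStringIO object above.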
The first answer will print a string with the prefix b', that is, your string will look like b'your_string'. To solve this, decode the bytes before printing:
encoded_string = base64.b64encode(image_file.read())
print(encoded_string.decode('utf-8'))
I ran into this while converting an image to a Base64 string. You can also see how I removed the prefix in my post: Image to base64 string and fix 'b from prefix
import base64
from PIL import Image
from io import BytesIO
with open("image.jpg", "rb") as image_file:
    data = base64.b64encode(image_file.read())
im = Image.open(BytesIO(base64.b64decode(data)))
im.save('image1.png', 'PNG')
Borrowing from what Ivo van der Wijk and gnibbler have developed earlier, this is a dynamic solution
import cStringIO
import PIL.Image
image_data = None
def imagetopy(image, output_file):
    with open(image, 'rb') as fin:
        image_data = fin.read()
    with open(output_file, 'w') as fout:
        fout.write('image_data = '+ repr(image_data))
def pytoimage(pyfile):
    pymodule = __import__(pyfile)
    img = PIL.Image.open(cStringIO.StringIO(pymodule.image_data))
    img.show()
if __name__ == '__main__':
    imagetopy('spot.png', 'wishes.py')
    pytoimage('wishes')
You can then decide to compile the output image file with Cython to make it cool. With this method, you can bundle all your graphics into one module.
As I said in your previous question, there is no need to Base64-encode the string; it will only make the program slower. Just use the repr:
>>> with open("images/image.gif", "rb") as fin:
...     image_data=fin.read()
...
>>> with open("image.py","wb") as fout:
...     fout.write("image_data="+repr(image_data))
...
Now the image is stored as a variable called image_data in a file called image.py
Start a fresh interpreter and import the image_data
>>> from image import image_data
>>>
This works for me:
import base64
import requests
# Getting the image as bytes
response = requests.get("image_url")
# image encoding
encoded_image = base64.b64encode(response.content)
# image decoding; without this step it won't work due to some '\xff' error
decoded_image = base64.b64decode(encoded_image)
