Insert/Read pdf file from MySQL Database through Python - python

I am trying to create function that I can upload and download (or View) pdf file into my database with PyQt5. Basically, I want to achieve following steps
Click button to upload, then opens the window to select pdf file
From the TableWidget (or some other available view), provide a function to download or view the PDF file that was saved to my database in step 1.
I think I figured out one way to upload the file:
import mysql.connector
from mysql.connector import Error
def write_file(data, filename):
    """Write binary BLOB data fetched from the database to a file on disk.

    Args:
        data: bytes-like object (the BLOB column value).
        filename: destination path for the binary file.
    """
    # BUG FIX: 'wb' is a binary mode, so open() must not receive
    # encoding/errors arguments — passing them raises
    # "ValueError: binary mode doesn't take an encoding argument".
    with open(filename, 'wb') as file:
        file.write(data)
def readBLOB(emp_id, photo, bioData):
    """Fetch one row of python_Employee and write its BLOB columns to disk.

    Args:
        emp_id: primary-key value of the row to fetch.
        photo: destination path for the image BLOB (row column 2).
        bioData: destination path for the bio-data BLOB (row column 3).
    """
    print("Reading BLOB data from python_Employee table")
    # BUG FIX: initialize before the try block — if connect() raises,
    # the original 'finally' clause crashed with NameError on 'connection'.
    connection = None
    cursor = None
    try:
        connection = mysql.connector.connect(host="00.00.00.000",
                                             user="user",
                                             password="pswd",
                                             database="database")
        cursor = connection.cursor()
        # Parameterized query: emp_id is bound as %s, never interpolated.
        sql_fetch_blob_query = """SELECT * from python_Employee where id = %s"""
        cursor.execute(sql_fetch_blob_query, (emp_id,))
        record = cursor.fetchall()
        for row in record:
            print("Id = ", row[0], )
            print("Name = ", row[1])
            image = row[2]
            file = row[3]
            print("Storing employee image and bio-data on disk \n")
            write_file(image, photo)
            write_file(file, bioData)
    except mysql.connector.Error as error:
        print("Failed to read BLOB data from MySQL table {}".format(error))
    finally:
        if connection is not None and connection.is_connected():
            cursor.close()
            connection.close()
            print("MySQL connection is closed")
# Destination paths: the PDF BLOB is written to path1, the bio-data
# text BLOB to path2.
path1 = r'C:\Users\Bruce Ko\Desktop\Minsub Lee\Development\Git\Inventory Software\testdata.pdf'
path2 = r'C:\Users\Bruce Ko\Desktop\Minsub Lee\Development\Git\Inventory Software\eric_bioData.txt'
# Fetch the employee row with id=1 and dump its BLOB columns to disk.
readBLOB(1, path1, path2)
From the above program, I could see in MySQL Workbench that my pdf file was uploaded (even though I cannot read it from there — only binary information was shown, while the uploaded image file was viewable as an image).
So if the above function is correct for uploading the pdf file, how can I download or read it? The code above is from https://pynative.com/python-mysql-blob-insert-retrieve-file-image-as-a-blob-in-mysql/ and the blob-reading part from that link does not work for me; it gives me the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
I would much appreciated any help on this!

Related

Retrieve zipped file from bytea column in PostgreSQL using Python

I have a table in my PostgreSQL database in which a column type is set to bytea in order to store zipped files.
The storing procedure works fine. I have problems when I need to retrieve the zipped file I uploaded.
def getAnsibleByLibrary(projectId):
    """Fetch the 'ansiblezip' bytea of the library linked to *projectId*
    and write it to zippedOne.zip (side effect kept from the original).

    Args:
        projectId: id of the project whose library zip is wanted.
    """
    con = psycopg2.connect(
        database="xyz",
        user="user",
        password="pwd",
        host="localhost",
        port="5432",
    )
    print("Database opened successfully")
    try:
        cur = con.cursor()
        # BUG FIX: the original concatenated str(projectId) into the SQL
        # string, which is an SQL-injection risk and breaks on quoting.
        # Bind it as a parameter instead.
        query = (
            "SELECT ansiblezip FROM library "
            "WHERE library.id = (SELECT libraryid FROM project WHERE project.id = %s)"
        )
        cur.execute(query, (str(projectId),))
        rows = cur.fetchall()
        repository = rows[0][0]
        con.commit()
    finally:
        con.close()
    print(repository, type(repository))
    # psycopg2 returns a memoryview; a binary-mode write accepts it as-is.
    with open("zippedOne.zip", "wb") as fin:
        fin.write(repository)
This code creates a zippedOne.zip file but it seems to be an invalid archive.
I tried also saving repository.tobytes() but it gives the same result.
I don't understand how I can handle memoriview objects.
If I try:
print(repository, type(repository))
the result is:
<memory at 0x7f6b62879348> <class 'memoryview'>
If I try to unzip the file:
chain#wraware:~$ unzip zippedOne.zip
The result is:
Archive: zippedOne.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of zippedOne.zip or
zippedOne.zip.zip, and cannot find zippedOne.zip.ZIP, period.
Trying to extract it in windows gives me the error: "The compressed (zipped) folder is invalid"
This code, based on the example in the question, works for me:
import io
import zipfile
import psycopg2
# Demo: build a zip archive entirely in memory, store it in a bytea
# column, read it back, and dump it to disk so it can be unzipped.
DROP = """DROP TABLE IF EXISTS so69434887"""
CREATE = """\
CREATE TABLE so69434887 (
id serial primary key,
ansiblezip bytea
)
"""

# Assemble the archive into a BytesIO buffer. The ZipFile must be
# closed (context manager) before the buffer holds valid zip data.
buf = io.BytesIO()
with zipfile.ZipFile(buf, mode='w') as zf:
    zf.writestr('so69434887.txt', 'abc')

with psycopg2.connect(database="test") as conn:
    cur = conn.cursor()
    cur.execute(DROP)
    cur.execute(CREATE)
    conn.commit()
    # buf.getvalue() hands the raw zip bytes to the bytea parameter.
    cur.execute("""INSERT INTO so69434887 (ansiblezip) VALUES (%s)""", (buf.getvalue(),))
    conn.commit()
    cur.execute("""SELECT ansiblezip FROM so69434887""")
    memview, = cur.fetchone()

# psycopg2 returns a memoryview; writing it in 'wb' mode round-trips
# the original bytes.
with open('so69434887.zip', 'wb') as f:
    f.write(memview)
and is unzippable (on Linux, at least)
$ unzip -p so69434887.zip so69434887.txt
abc
So perhaps the data is not being inserted correctly.
FWIW I got the "End-of-central-directory signature not found" until I made sure I closed the zipfile object before writing to the database.

Insert PDF into Postgres database with python and retrieve it

I have to store a small PDF file in a Postgres database (already have a table ready with a bytea column for the data), then be able to delete the file, and use the data in the database to restore the PDF as it was.
For context, I'm working with FastApi in Python3 so I can get the file as bytes, or as a whole file. So the main steps are:
Getting the file as bytes or a file via FastAPI
Inserting it into the Postgres DB
Retrieve the data in the DB
Make a new PDF file with the data.
How can I do that in a clean way?
The uploading function from FastAPI :
def import_data(file: UploadFile = File(...)):
    """FastAPI upload endpoint: read the uploaded file, store it in the
    database, and save a local copy to verify the upload arrived intact.
    """
    # Pull the entire upload into memory as bytes.
    pdf_bytes = file.file.read()
    database.insertPdfInDb(pdf_bytes)

    # Saving the file we just got to check if it's intact (it is).
    safe_name = file.filename.replace(" ", "-")
    with open(safe_name, 'wb+') as out:
        out.write(pdf_bytes)
    return {"filename": file.filename}
The function inserting the data into the Postgres DB :
def insertPdfInDb(pdfFile):
    """Insert *pdfFile* (raw PDF bytes) into the PDFSTORAGE table.

    Args:
        pdfFile: bytes-like object holding the PDF content.

    Returns:
        0 on completion (kept for backward compatibility).
    """
    conn = connectToDb()
    curs = conn.cursor()
    # psycopg2.Binary wraps the bytes so they are sent as a bytea parameter.
    curs.execute("INSERT INTO PDFSTORAGE(pdf, description) values (%s, 'Some description...')", (psycopg2.Binary(pdfFile),))
    conn.commit()
    print("PDF insertion in the database attempted.")
    disconnectFromDb(conn)
    return 0
    # NOTE(review): the original snippet had a stray, unreachable copy of the
    # endpoint's file-saving code pasted after this return (it referenced an
    # undefined 'file' and used 'return' at module level); it was removed.
The exporting part is just started and entirely try-and-error code.

How to generalise the import script

I have a query to generate a CSV file from the data in a Postgres Table.The script is working fine.
But i have a situation where i need to create separate files using the data from a different table.
So basically only the below hardcoded one change and rest code is same.Now the situation is i have to create separate scripts for all CSV's.
Is there a way i can have one script and only change this parameters.
I'm using Jenkins to automate the CSV file creation.
filePath = '/home/jenkins/data/'
fileName = 'data.csv'
import csv
import os
import psycopg2
from pprint import pprint
from datetime import datetime
from utils.config import Configuration as Config
from utils.postgres_helper import get_connection
from utils.utils import get_global_config
# File path, file name and source table. Overridable via environment
# variables so one script can serve several CSV exports (e.g. set by a
# Jenkins job parameter); the defaults preserve the original behavior.
filePath = os.environ.get('CSV_EXPORT_PATH', '/home/jenkins/data/')
fileName = os.environ.get('CSV_EXPORT_FILE', 'data.csv')
tableName = os.environ.get('CSV_EXPORT_TABLE', 'data')

# Database connection variable.
connect = None

# Check if the file path exists.
if os.path.exists(filePath):
    try:
        # Connect to database.
        connect = get_connection(get_global_config(), 'dwh')
    except psycopg2.DatabaseError as e:
        # Confirm unsuccessful connection and stop program execution.
        print("Database connection unsuccessful.")
        quit()

    # Cursor to execute query.
    cursor = connect.cursor()
    # SQL to select data from the configured table. The table name comes
    # from trusted deployment configuration, not user input.
    sqlSelect = "SELECT * FROM " + tableName
    try:
        # Execute query and fetch the data returned.
        cursor.execute(sqlSelect)
        results = cursor.fetchall()
        # Extract the table headers from the cursor description.
        headers = [i[0] for i in cursor.description]
        # BUG FIX: open the CSV file with a context manager so the handle
        # is flushed and closed — the original csv.writer(open(...)) leaked
        # the file object.
        with open(filePath + fileName, 'w', newline='') as outfile:
            csvFile = csv.writer(outfile,
                                 delimiter=',', lineterminator='\r\n',
                                 quoting=csv.QUOTE_ALL, escapechar='\\')
            # Add the headers and data to the CSV file.
            csvFile.writerow(headers)
            csvFile.writerows(results)
        # Message stating export successful.
        print("Data export successful.")
        print('CSV Path : ' + filePath + fileName)
    except psycopg2.DatabaseError as e:
        # Message stating export unsuccessful.
        print("Data export unsuccessful.")
        quit()
    finally:
        # Close database connection.
        connect.close()
else:
    # Message stating file path does not exist.
    print("File path does not exist.")

How to download BLOB .docx file from MySQL using python?

Current Code:
import mysql.connector
import sys
def write_file(data, filename):
    """Persist raw BLOB *data* to disk at *filename* (binary mode)."""
    out = open(filename, 'wb')
    try:
        out.write(data)
    finally:
        out.close()
# Fetch the BLOB stored in row id=1 of the 'test' table and save it
# next to the script as User1.jpg.
sampleNum = 0

# Read database configuration and connect.
db_config = mysql.connector.connect(user='root', password='test',
                                    host='localhost',
                                    database='technical')
# Query blob data from the test table.
cursor = db_config.cursor()
try:
    sampleNum += 1
    query = "SELECT file FROM test WHERE id=%s"
    cursor.execute(query, (sampleNum,))
    photo = cursor.fetchone()[0]
    write_file(photo, 'User' + str(sampleNum) + '.jpg')
except AttributeError as e:
    # fetchone() returns None when no row matches; subscripting it
    # raises, which is caught and printed here.
    print(e)
finally:
    cursor.close()
Goal of above code
Code above allows me to get the image from MySQL that is stored as a BLOB and save it into a folder where .py script is saved.
It works fine!
Similar code with .docx
import mysql.connector
import sys
def write_file(data, filename):
    """Write the fetched BLOB *data* to *filename* in binary mode."""
    with open(filename, mode='wb') as sink:
        sink.write(data)
# Same flow as the image fetch, but targeting the fileAttachment BLOB
# of document_control and saving it as a .docx.
sampleNum = 0

db_config = mysql.connector.connect(user='root', password='test',
                                    host='localhost',
                                    database='technical')
cursor = db_config.cursor()
try:
    sampleNum += 1
    query = "SELECT fileAttachment FROM document_control WHERE id=%s"
    cursor.execute(query, (sampleNum,))
    file = cursor.fetchone()[0]
    write_file(file, 'User' + str(sampleNum) + '.docx')
except AttributeError as e:
    # Raised (and printed) when fetchone() returns None for a missing row.
    print(e)
finally:
    cursor.close()
Here I am trying to extract and save a docx file from MySQL stored as a BLOB and it does not work.
The output of above script is the following:
f.write(data)
TypeError: a bytes-like object is required, not 'str'
How can I extract the .docx file from MySQL?
As per the insert query you mentioned
insert into document_control (fileattachment) values ('C:/Users/<user>/Desktop/Weekly Checks.xlsx');
it seems that you are just inserting the filepath in the database.
You must use LOAD_FILE to insert the actual file in the database blob object.
How to use LOAD_FILE to load a file into a MySQL blob?

How can i convert blob data received from database to image in python

I receive longblob data from database.
and i try to convert blob data to image and read image using cv2.
So I tried converting blob data to base64 like below code but it failed.
img = base64.decodebytes(img_str)
How can i convert blob to image? Is there a converting feature for this problem in the cv2 package?
You don't need cv2 to convert a blob to an image: store the blob/image on disk and then show it. Here is an example of retrieving a blob from MySQL into a file on disk.
Good luck!
Page URL referenced:URL
import mysql.connector
from mysql.connector import Error
def write_file(data, filename):
    """Save BLOB *data* (a bytes-like object) to *filename* on disk."""
    with open(filename, 'wb') as out:
        out.write(data)
def readBLOB(emp_id, photo, bioData):
    """Fetch employee row *emp_id* and store its photo and bio-data BLOBs.

    Args:
        emp_id: id of the python_employee row to read.
        photo: destination file path for the image BLOB (row column 2).
        bioData: destination file path for the bio-data BLOB (row column 3).
    """
    print("Reading BLOB data from python_employee table")
    # Initialize so the 'finally' clause cannot hit an unbound name when
    # connect() itself raises.
    connection = None
    try:
        connection = mysql.connector.connect(host='localhost',
                                             database='python_db',
                                             user='pynative',
                                             password='pynative##29')
        cursor = connection.cursor()
        # BUG FIX: the query selected only the 'photo' column, yet the loop
        # below reads row[0]..row[3]. Select the full row so the indexing
        # (id, name, photo, biodata) matches and no IndexError is raised.
        sql_fetch_blob_query = """SELECT * from python_employee where id = %s"""
        cursor.execute(sql_fetch_blob_query, (emp_id,))
        record = cursor.fetchall()
        for row in record:
            print("Id = ", row[0], )
            print("Name = ", row[1])
            image = row[2]
            file = row[3]
            print("Storing employee image and bio-data on disk \n")
            write_file(image, photo)
            write_file(file, bioData)
    except mysql.connector.Error as error:
        print("Failed to read BLOB data from MySQL table {}".format(error))
    finally:
        if connection is not None and connection.is_connected():
            cursor.close()
            connection.close()
            print("MySQL connection is closed")
# Export two employee rows: each call writes the photo BLOB and the
# bio-data BLOB to the given Windows paths.
readBLOB(1, "D:\Python\Articles\my_SQL\query_output\eric_photo.png",
"D:\Python\Articles\my_SQL\query_output\eric_bioData.txt")
readBLOB(2, "D:\Python\Articles\my_SQL\query_output\scott_photo.png",
"D:\Python\Articles\my_SQL\query_output\scott_bioData.txt")
If you want to save the image, use the code from 'Danilo Mercado Oudalova'.
But if you want to use without save file, use the example below.
import mysql.connector
from mysql.connector import Error
from io import BytesIO #from io import StringIO.
import PIL.Image
# Fetch the BLOB and open it as an image straight from memory —
# no intermediate file written to disk.
try:
    connection = mysql.connector.connect(host='localhost',
                                         database='python_db',
                                         user='pynative',
                                         password='pynative##29')
    cursor = connection.cursor()
    sql = "your query"
    cursor.execute(sql)
    result = cursor.fetchall()
except mysql.connector.Error as error:
    print("Failed to read BLOB data from MySQL table {}".format(error))
finally:
    if (connection.is_connected()):
        cursor.close()
        connection.close()
        print("MySQL connection is closed")

# Inspect the type of the fetched cell.
type(result[0][0])
# If the cell is bytes, use `from io import BytesIO`; if str, StringIO.
file_like = BytesIO(result[0][0])
img = PIL.Image.open(file_like)
img.show()

Categories

Resources