I have a program that collects some data from a website. Text data is appended to an "info" DataFrame and photo URLs are appended to a "photos" DataFrame.
I have already inserted the "info" table into my SQL database and it works really well:
data.to_sql('Flat', con=conn, if_exists='replace', index=False)
Now I need to understand how to convert image links to BLOB data and insert them into the database.
BLOBs are Binary Large OBjects. First you need to convert the image to a binary object.
def convertToBinaryData(imageLocation):
    # Read the file in binary mode and return its raw bytes
    with open(imageLocation, 'rb') as file:
        blobData = file.read()
    return blobData
The rest is a simple insert; make sure you are connected. Create an insert statement and inject your binaries into it.
insert = "INSERT INTO images (id, image) VALUES (?, ?)"
id = 1
image = convertToBinaryData(imageLocation)
cursor.execute(insert, (id, image))
connection.commit()
These snippets omit how to create a connection and get a cursor; a full example can be found at: https://pynative.com/python-sqlite-blob-insert-and-retrieve-digital-data/
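Since the question starts from image *links* rather than local files, the two halves can be combined: fetch the URL's bytes, then insert them as the BLOB parameter. A minimal sketch using SQLite; the helper names (`fetch_image_bytes`, `insert_image`) and the table layout are assumptions, not part of the original code:

```python
import sqlite3
from urllib.request import urlopen

def fetch_image_bytes(url):
    # Download the image and return its raw bytes -- the BLOB payload.
    with urlopen(url) as resp:
        return resp.read()

def insert_image(conn, image_id, blob):
    # Parametrized insert; sqlite3 stores a bytes value as a BLOB automatically.
    conn.execute("CREATE TABLE IF NOT EXISTS images (id INTEGER PRIMARY KEY, image BLOB)")
    conn.execute("INSERT INTO images (id, image) VALUES (?, ?)", (image_id, blob))
    conn.commit()
```

For the scraping use case, each URL from the "photos" DataFrame would be passed through `fetch_image_bytes` and the result handed to `insert_image`.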
I am using Streamlit, where one function is to upload a .csv file. The uploader returns an io.StringIO or io.BytesIO object.
I need to upload this object to my Postgres database, where I have a column holding an array of bytea:
CREATE TABLE files (
    id int4 NOT NULL,
    docs_array bytea[] NOT NULL
    ...
);
Usually, I would use an SQL query like
UPDATE files SET docs_array = array_append(docs_array, pg_read_binary_file('/Users/XXX//Testdata/test.csv')::bytea) WHERE id = '1';
however, since I have a StringIO object, this does not work.
I have tried
sql = "UPDATE files SET docs_array = array_append(docs_array, %s::bytea) WHERE id = %s;"
cursor.execute(sql, (file, testID,))
and cursor.execute(sql, (psycopg.Binary(file), testID,))
yet I always get one of the following errors:
can't adapt type '_io.BytesIO'
can't escape _io.BytesIO to binary
can't adapt type '_io.StringIO'
can't escape _io.StringIO to binary
How can I load the object?
UPDATE (thanks to Mike Organek for the suggestion!):
example file.read() looks like b'"Datum/Uhrzeit","Durchschnittsverbrauch Strom (kWh/100km)","Durchschnittsverbrauch Verbrenner (l/100km)","Fahrstrecke (km)","Fahrzeit (h)","Durchschnittsgeschwindigkeit (km/h)"\r\n"2015-11-28T11:44:06Z","7,6","8,5","1.792","14:01","128"\r\n"2015-11-28T12:28:45Z","7,7","8,5","1.473","14:21","103"\r\n"2015-12-24T06:04:43Z","5,5","8,3","4.848","48:01","101"\r\n"2015-12-24T12:15:25Z","27,2","8,0","290","3:20","88"\r\n'
yet if I try to execute
cursor.execute(sql, (file.read(), testID,) )
only "\x" is loaded to the DB; the whole data is lost for whatever reason.
Yet if I define file directly as b'"Datum/Uhrzeit","Durchschnittsverbrauch Strom (kWh/100km)"..."8,5"\r\n', everything works. So my only guess is that the problem lies somewhere with the io object and .read()?
Mike Organek pointed out the solution:
calling file.read() directly inside cursor.execute(sql, (file.read(), testID,)), without having read the file earlier, did the trick. An earlier read leaves the stream position at the end, so the later .read() returns empty bytes.
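The stream-exhaustion pitfall can be reproduced without a database at all; the byte content below is a stand-in for the uploaded file:

```python
import io

buf = io.BytesIO(b'"Datum/Uhrzeit","7,6","8,5"\r\n')  # stand-in for the uploaded file

first = buf.read()    # returns every byte
second = buf.read()   # position is now at the end -> b''
buf.seek(0)           # rewind before reading again
again = buf.read()    # full content once more
```

This is why the earlier attempt stored only "\x": an empty bytes value was bound to the `%s::bytea` parameter.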
I have a rather large dataframe (500k+ rows) that I'm trying to load to Vertica. I have the following code working, but it is extremely slow.
# convert df to list format
lists = output_final.values.tolist()
# make insert string
insert_qry = "INSERT INTO SCHEMA.TABLE(DATE, ID, SCORE) VALUES (%s, %s, %s)"
# load into database
for row in lists:
    cur.execute(insert_qry, row)
conn_info.commit()
I have seen a few posts talking about using COPY rather than EXECUTE to do this large of a load, but haven't found a good working example.
After a lot of trial and error... I found that the following worked for me.
# COPY statement
copy_str = "COPY SCHEMA.TABLE(DATE, ID, SCORE) FROM STDIN DELIMITER ','"
# turn the df into a csv-like object
stream = io.StringIO()
output_final.to_csv(stream, sep=",", index=False, header=False)
# reset the position of the stream variable
stream.seek(0)
# load the data
with conn_info.cursor() as cursor:
    cursor.copy(copy_str, stream.getvalue())
conn_info.commit()
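The CSV-staging half of this can be sanity-checked without a Vertica connection. A minimal sketch (the column values are made up):

```python
import io
import pandas as pd

# a tiny stand-in for the 500k-row dataframe
df = pd.DataFrame({"DATE": ["2024-01-01"], "ID": [1], "SCORE": [0.5]})

stream = io.StringIO()
df.to_csv(stream, sep=",", index=False, header=False)
stream.seek(0)  # rewind so the COPY reads from the start

payload = stream.getvalue()
# payload is what gets handed to cursor.copy(copy_str, payload)
```

Staging the whole frame as one CSV stream is what makes COPY fast: Vertica parses and loads it server-side in bulk, instead of one round-trip per row.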
So, I'm trying to save an image into my PostgreSQL table in Python using psycopg2
INSERT Query (Insert.py)
# Folder/Picture is my path / ID is generated / IDnum is an increment
picopen = open("Folder/Picture." + str(ID) + "." + str(IDnum) + ".jpg", 'rb').read()
filename = "Picture." + str(ID) + "." + str(IDnum) + ".jpg"
# A sample file name: Picture.1116578795.7.jpg
# upload_name holds the file name, upload_content stores the image
SQL = "INSERT INTO tbl_upload (upload_name, upload_content) VALUES (%s, %s)"
data = (filename, psycopg2.Binary(picopen))
cur.execute(SQL, data)
conn.commit()
Now, to recover the saved image, I perform this query (recovery.py):
cur.execute("SELECT upload_content, upload_name FROM tbl_upload")
for row in cur:
    mypic = cur.fetchone()
    open('Folder/' + row[1], 'wb').write(str(mypic[0]))
What happens is that when I execute recovery.py it does generate a ".jpg" file, but I can't view or open it.
If it helps, I'm doing this with Python 2.7 on CentOS 7.
For the sake of additional information, this is what the image viewer reports when I open the generated file:
Error interpreting JPEG image file (Not a JPEG file: starts with 0x5c 0x78)
I also tried other formats as well (.png, .bmp).
I double-checked my database and apparently the upload_content datatype was text; it is supposed to be bytea. I thought I had already set it to bytea when I created my DB. Problem solved. (Fittingly, 0x5c 0x78 in the error above is literally "\x": the file began with the textual escape form of the data rather than raw JPEG bytes.)
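With a proper bytea column, psycopg2 returns the value as a buffer (Python 2) or memoryview (Python 3), and the fix on the reading side is to write raw bytes, never str(). A minimal sketch; the memoryview below is a stand-in for a value fetched from the database:

```python
import os
import tempfile

# stand-in for a bytea value returned by cursor.fetchone()
blob = memoryview(b'\xff\xd8\xff\xe0JFIF-like bytes')

out_path = os.path.join(tempfile.mkdtemp(), "restored.jpg")
with open(out_path, 'wb') as f:
    f.write(bytes(blob))  # bytes(), never str(): str() would write escape text, not the image
```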
Craig Ringer, I cannot work with the large object functions.
My database looks like this; this is my table:
-- Table: files
DROP TABLE files;
CREATE TABLE files
(
    id serial NOT NULL,
    orig_filename text NOT NULL,
    file_data bytea NOT NULL,
    CONSTRAINT files_pkey PRIMARY KEY (id)
)
WITH (
    OIDS=FALSE
);
ALTER TABLE files
I want to save a .pdf in my database. I saw your last answer, but I am using Python 2.7 (read the file and convert it to a buffer object, or use the large object functions).
My code looks like:
path = "D:/me/A/Res.pdf"
listaderuta = path.split("/")
longitud = len(listaderuta)
f = open(path, 'rb')
f.read().__str__()
cursor = con.cursor()
cursor.execute("INSERT INTO files(id, orig_filename, file_data) VALUES (DEFAULT, %s, %s) RETURNING id", (listaderuta[longitud-1], f.read()))
But when I download it, i.e. save it back out:
fula = open("D:/INSTALL/pepe.pdf", 'wb')
cursor.execute("SELECT file_data, orig_filename FROM files WHERE id = %s", (int(17),))
(file_data, orig_filename) = cursor.fetchone()
fula.write(file_data)
fula.close()
the file cannot be opened; it is damaged.
I repeat, I cannot work with the large object functions.
I tried this and it failed; can you help?
I updated my previous answer to indicate usage for Python 2.7. The general idea is to read the manual and follow the instructions there.
Here's the relevant part:
In Python 2, you can't just pass a str directly, because psycopg2 will think it's an encoded text string, not raw bytes. You must either wrap it with psycopg2.Binary, or load the data into a bytearray object.
So either:
filedata = psycopg2.Binary( f.read() )
or
filedata = buffer( f.read() )
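One more detail worth noting in the question's snippet: f.read() is called twice, and the second call returns nothing because the file pointer is already at EOF, which by itself produces a damaged stored file. A minimal stdlib-only sketch (the file path and content are made up; the psycopg2 call is only referenced in the comment):

```python
import os
import tempfile

# create a stand-in PDF so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "Res.pdf")
with open(path, 'wb') as f:
    f.write(b'%PDF-1.4 fake content')

with open(path, 'rb') as f:
    file_data = f.read()  # read once and keep the result
    leftover = f.read()   # pointer is at EOF now -> empty

# file_data (not a second f.read()) is what belongs in the INSERT parameters,
# wrapped as psycopg2.Binary(file_data) on Python 2.
```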
I am trying to take BLOB-type data from rows in a SQLite database and save it into picture files. They are all .jpg files stored in the BLOBs. When I loop through and create my JPEG files from the database blobs, I get strange, corrupted-looking JPEGs that contain some of the original data and do show the image, but with a lot of distortion too. I think it has something to do with the way the data is handled coming out of the DB. Anyway, it's column six that contains the blobs; here is my code...
import sqlite3

cnt = 0
con = sqlite3.connect("C:\\my.db")
cur = con.cursor()
data = cur.execute("SELECT * FROM friends")
for item in data:
    cnt = cnt + 1
    a = open("C:\\images\\" + str(cnt) + ".jpg", 'w')
    a.write(item[6])
    a.close()
It creates an image for each blob in each row, 902 to be exact, and the images are actually viewable, even the correct pictures, but heavy distortion/corruption has been added and really mucks up the pictures. How can I create uncorrupted .jpg files from each blob in my database using Python?
Thanks!
Write the files in binary mode: open("C:\\images\\" + str(cnt) + ".jpg", 'wb'). In text mode ('w'), Python performs newline translation on write, which corrupts binary data like JPEGs.
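A self-contained round trip showing the fix; the table layout and the JPEG-like byte string are stand-ins, not the asker's actual data:

```python
import os
import sqlite3
import tempfile

jpeg = b'\xff\xd8\xff\xe0' + bytes(16) + b'\xff\xd9'  # stand-in JPEG bytes

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE friends (name TEXT, photo BLOB)")
con.execute("INSERT INTO friends VALUES (?, ?)", ("pic", jpeg))

out_dir = tempfile.mkdtemp()
for cnt, (photo,) in enumerate(con.execute("SELECT photo FROM friends"), 1):
    with open(os.path.join(out_dir, "%d.jpg" % cnt), 'wb') as f:  # 'wb': no newline translation
        f.write(photo)
```

sqlite3 returns BLOB columns as bytes objects, so they can be written to the file unchanged.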