Below is a code snippet of a function that I can't get to work correctly.
def upload_csv():
    conn = sqlite3.connect("data.db")
    cursor = conn.cursor()
    #tkFileDialog returns the absolute path to the file, so we slice from the last slash to the end
    filename = fn[fn.rfind("/")+1:]
    cursor.execute("CREATE TABLE IF NOT EXISTS {0}('MSISDN' INTEGER PRIMARY KEY, 'IMEI' TEXT, 'TAC' INTEGER );".format(filename))
    reader = csv.reader(open(fn,'r'))
    for row in reader:
        to_db = [unicode(row[0]),unicode(row[1]),unicode(int(row[2][0:8]))]
        print to_db
        cursor.execute("INSERT INTO data.{0} (MSISDN,IMEI,TAC) VALUES (?,?,?);".format(filename), to_db)
    conn.commit()
I receive an OperationalError:
OperationalError: unknown database May2015
So guys, I found the problem.
My code didn't strip the .csv file extension, so the table name contained a dot and SQLite read the part before it (May2015) as a database name.
Thanks to CL for the advice to look closely at the name of the file.
For anyone stuck on a similar problem, the right code is:
#tkFileDialog returns the absolute path to the file, so keep only the base name and drop the extension
#(str.strip('.csv') would strip the characters '.', 'c', 's', 'v' from both ends, not just the suffix)
filename = os.path.splitext(os.path.basename(fn))[0]  #needs "import os"
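One caveat when dropping the extension: str.strip takes a set of characters to remove from both ends, not a suffix, so it can eat into the file name itself. A minimal sketch of the safer approach with os.path, where table_name_from_path is a hypothetical helper name and the sample paths are made up:

```python
import os


def table_name_from_path(path):
    """Return the file name without its directory or extension (hypothetical helper)."""
    return os.path.splitext(os.path.basename(path))[0]


# A made-up absolute path, like the one a file dialog would return.
print(table_name_from_path("/home/user/May2015.csv"))

# str.strip removes any of the characters '.', 'c', 's', 'v' from both ends,
# so it damages names that start or end with those letters.
print("services.csv".strip(".csv"))
```

The first call prints May2015, while the strip call prints ervice rather than services.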
Related
I am trying to get a list of files in a user specified directory to be saved to a database. What I have at the moment is :
import os
import sqlite3

def get_list():
    folder = input("Directory to scan : ")
    results = []
    for path in os.listdir(folder):
        if os.path.isfile(os.path.join(folder, path)):
            results.append(path)
    print(results)
    return results

def populate(results):
    connection = sqlite3.connect("videos.db")
    with connection:
        connection.execute("CREATE TABLE IF NOT EXISTS files (id INTEGER PRIMARY KEY, file_name TEXT);")
    for filename in results:
        insert_string = "INSERT INTO files (file_name) VALUES ('"+filename+"');"
        connection.execute(insert_string)

filelist = get_list()
populate(filelist)
It runs without a problem and prints out a list of the file names, which is great, but then when it's running the INSERT SQL statement, that seems to have no effect on the database table. I have tried to debug it, and the statement which is saved in the variable looks good, and when executing it manually in the console, it inserts a row in the table, but when running it, nothing changes. Am I missing something really simple here ?
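Python's sqlite3 module doesn't auto-commit by default, so you need to call connection.commit() after you've finished executing queries (or run the queries inside a "with connection:" block, which commits when the block exits without an error). This is covered in the tutorial.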
In addition, use ? placeholders to avoid SQL injection issues:
cur.execute('INSERT INTO files (file_name) VALUES (?)', (filename,))
Once you do that, you can insert all of your filenames at once using executemany:
cur.executemany(
'INSERT INTO files (file_name) VALUES (?)',
[(filename,) for filename in results],
)
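Putting the pieces together, here is a minimal sketch that can be run as-is. It assumes an in-memory database instead of videos.db and two made-up file names, and populate is rewritten here to take the connection as an argument purely for the demo:

```python
import sqlite3


def populate(connection, file_names):
    """Insert one row per file name and commit (hypothetical variant of populate)."""
    # Used as a context manager, the connection commits when the block
    # succeeds and rolls back if an exception is raised inside it.
    with connection:
        connection.execute(
            "CREATE TABLE IF NOT EXISTS files (id INTEGER PRIMARY KEY, file_name TEXT)"
        )
        # The ? placeholder also copes with quotes inside file names.
        connection.executemany(
            "INSERT INTO files (file_name) VALUES (?)",
            [(name,) for name in file_names],
        )


connection = sqlite3.connect(":memory:")  # in-memory database, only for this demo
populate(connection, ["a.mp4", "it's here.mkv"])
rows = connection.execute("SELECT file_name FROM files ORDER BY id").fetchall()
print(rows)
```

Because the inserts run inside the with block, no explicit commit() call is needed here.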
I'm using Python. I have a daily CSV file that I need to copy into a PostgreSQL table each day. Some of the records may repeat from one day to the next, so I want to ignore those, based on a primary key field. Using cursor.copy_from, day 1 is fine: the new table is created. On day 2, copy_from throws a duplicate key error (as it should), but it stops at the first error. Is there a copy_from parameter that would ignore the duplicates and continue? If not, are there any recommendations other than copy_from?
f = open(csv_file_name, 'r')
c.copy_from(f, 'mytable', sep=',')
This is how I'm doing it with psycopg3.
Assumes the file is in the same folder as the script and that it has a header row.
from pathlib import Path
from psycopg import sql

file = Path(__file__).parent / "the_data.csv"
target_table = "mytable"
conn = <your connection>

with conn.cursor() as cur:
    # Create an empty temp table with the same columns (and types) as target_table.
    cur.execute(
        sql.SQL("CREATE TEMP TABLE tmp_table (LIKE {})").format(sql.Identifier(target_table))
    )
    # The csv file imports as text.
    # Copying into the typed temp table lets postgres convert the text to the proper column types.
    copy_query = sql.SQL("COPY tmp_table FROM STDIN WITH (FORMAT csv, HEADER true)")
    with cur.copy(copy_query) as copy:
        with file.open() as csv_data:
            copy.write(csv_data.read())
    # Move the rows over, skipping any that would violate a unique constraint.
    cur.execute(
        sql.SQL("INSERT INTO {} SELECT * FROM tmp_table ON CONFLICT DO NOTHING").format(
            sql.Identifier(target_table)
        )
    )
# Call conn.commit() afterwards unless the connection is in autocommit mode.
I have a table in my PostgreSQL database in which a column type is set to bytea in order to store zipped files.
The storing procedure works fine. I have problems when I need to retrieve the zipped file I uploaded.
def getAnsibleByLibrary(projectId):
    con = psycopg2.connect(
        database="xyz",
        user="user",
        password="pwd",
        host="localhost",
        port="5432",
    )
    print("Database opened successfully")
    cur = con.cursor()
    query = "SELECT ansiblezip FROM library WHERE library.id = (SELECT libraryid from project WHERE project.id = '"
    query += str(projectId)
    query += "')"
    cur.execute(query)
    rows = cur.fetchall()
    repository = rows[0][0]
    con.commit()
    con.close()
    print(repository, type(repository))
    with open("zippedOne.zip", "wb") as fin:
        fin.write(repository)
This code creates a zippedOne.zip file, but it seems to be an invalid archive.
I also tried saving repository.tobytes(), but it gives the same result.
I don't understand how to handle memoryview objects.
If I try:
print(repository, type(repository))
the result is:
<memory at 0x7f6b62879348> <class 'memoryview'>
If I try to unzip the file:
chain#wraware:~$ unzip zippedOne.zip
The result is:
Archive: zippedOne.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of zippedOne.zip or
zippedOne.zip.zip, and cannot find zippedOne.zip.ZIP, period.
Trying to extract it in windows gives me the error: "The compressed (zipped) folder is invalid"
This code, based on the example in the question, works for me:
import io
import zipfile

import psycopg2

DROP = """DROP TABLE IF EXISTS so69434887"""
CREATE = """\
CREATE TABLE so69434887 (
    id serial primary key,
    ansiblezip bytea
)
"""

buf = io.BytesIO()
with zipfile.ZipFile(buf, mode='w') as zf:
    zf.writestr('so69434887.txt', 'abc')

with psycopg2.connect(database="test") as conn:
    cur = conn.cursor()
    cur.execute(DROP)
    cur.execute(CREATE)
    conn.commit()
    cur.execute("""INSERT INTO so69434887 (ansiblezip) VALUES (%s)""", (buf.getvalue(),))
    conn.commit()
    cur.execute("""SELECT ansiblezip FROM so69434887""")
    memview, = cur.fetchone()

with open('so69434887.zip', 'wb') as f:
    f.write(memview)
and is unzippable (on Linux, at least)
$ unzip -p so69434887.zip so69434887.txt
abc
So perhaps the data is not being inserted correctly.
FWIW I got the "End-of-central-directory signature not found" until I made sure I closed the zipfile object before writing to the database.
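The effect of closing the zipfile object can be checked without a database. The sketch below is only an illustration under a few assumptions: the archive member name is made up, and a plain memoryview stands in for the value psycopg2 returns for a bytea column.

```python
import io
import zipfile

buf = io.BytesIO()
zf = zipfile.ZipFile(buf, mode="w")
zf.writestr("example.txt", "abc")  # made-up member name

# Snapshot taken before close(): the central directory is not written yet.
incomplete = buf.getvalue()
zf.close()
# Snapshot taken after close(): this is what should go into the bytea column.
complete = buf.getvalue()

# Stand-in for the memoryview that comes back from the SELECT.
fetched = memoryview(complete)
restored = bytes(fetched)

incomplete_ok = zipfile.is_zipfile(io.BytesIO(incomplete))
restored_ok = zipfile.is_zipfile(io.BytesIO(restored))
with zipfile.ZipFile(io.BytesIO(restored)) as archive:
    content = archive.read("example.txt")

print(incomplete_ok, restored_ok, content)
```

If the bytes stored in the table were captured before the archive was closed, writing the memoryview to disk correctly still yields an invalid zip file, which would match the error in the question.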
I am working on a new project. I want to connect to the database, download a post from it, and upload it again after making changes. But there is a problem. When I pull the data with Python, the result should be exactly the same as the original. However, when I open the file, I see that the spacing is lost, parentheses are added at the beginning and end, the UTF-8 characters are broken, and the line breaks are gone. Why is this happening and how can it be resolved?
My Code:
# -*- coding: utf-8 -*-
f = open('sonuc.txt','w', encoding='utf-8')
import MySQLdb
db=MySQLdb.connect(host='host',user='usr',password='ps',db='db',)
mycursor = db.cursor()
mycursor.execute('SELECT message FROM mybb_posts WHERE pid=1;')
sonuc = mycursor.fetchall()
f.write(str(sonuc))
f.close()
The original data is as follows:
Lets Try This!
Line 2
Line 3
Try other charecter:
like "ş", "i", "ü", "ğ", "İ"
Line 6
Python result (sonuc.txt):
(('Lets Try This!\r\nLine 2\r\nLine 3\r\nTry other charecter:\r\nlike "?", "i", "ü", "?", "?"\r\nLine 6\r\n',),)
Edit:
for UTF-8 problem, add charset='utf8mb4', to MySQLdb.connect()
There's nothing corrupt about that. That's just the Python representation of a 1-tuple containing a 1-tuple containing a string, since .fetchall() returns a tuple of tuples with the columns you requested.
If you want to write the first column of each row returned by your query:
for row in mycursor:
    message = row[0]
    f.write(message)
f.close()
While you're at it, you should practice proper open hygiene:
import MySQLdb

with MySQLdb.connect(
    host="host",
    user="usr",
    password="ps",
    db="db",
) as db:
    mycursor = db.cursor()
    mycursor.execute("SELECT message FROM mybb_posts WHERE pid=1;")
    with open("sonuc.txt", "w", encoding="utf-8") as f:
        for row in mycursor:
            message = row[0]
            f.write(message)
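To see the difference between str() of the whole result and the column value itself, here is a small sketch. It uses sqlite3 as a stand-in for MySQLdb so it runs without a server (an assumption for the demo; sqlite3 returns a list of tuples where MySQLdb returns a tuple of tuples), and the posts table is made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
conn.execute("CREATE TABLE posts (pid INTEGER PRIMARY KEY, message TEXT)")
original = "Lets Try This!\r\nLine 2\r\n"
conn.execute("INSERT INTO posts (pid, message) VALUES (?, ?)", (1, original))

rows = conn.execute("SELECT message FROM posts WHERE pid = 1").fetchall()

# str() of the result is the repr of the container: brackets, quotes
# and escaped \r\n sequences instead of real line breaks.
as_repr = str(rows)

# Taking the first column of each row gives back the stored text unchanged.
as_text = "".join(row[0] for row in rows)

print(as_repr)
print(as_text == original)
```

Writing as_text to the file keeps the line breaks, while writing as_repr produces the parenthesised single-line output shown in the question.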
I am trying to follow a copy_from example described on Stack Overflow, modified slightly because I need to read the data from a CSV file. Following that example, I wrote a small program that should read a file stored on disk and copy its data into a table I created. My code is:
def importFile():
    path = "C:\myfile.csv"
    curs = conn.cursor()
    curs.execute("Drop table if exists test_copy; ")
    data = StringIO.StringIO()
    data.write(path)
    data.seek(0)
    curs.copy_from(data, 'MyTable')
    print("Data copied")
But I get an error:
psycopg2.DataError: invalid input syntax for integer:
Does this mean there is a mismatch between the CSV file and my table? Is this syntax enough to copy a CSV file, or do I need more code? I am new to Python, so any help will be appreciated.
Look at your .csv file with a text editor. You want to be sure that:
- the field separator is a tab character
- there are no quote characters
- there is no header row
If all of this is true, the following should work:
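import psycopg2

def importFromCsv(conn, fname, table):
    # Pass the open file itself, not its path, so copy_from reads the contents.
    with open(fname) as inf:
        conn.cursor().copy_from(inf, table)
    conn.commit()  # without a commit the copied rows are not saved

def main():
    conn = ?? # set up database connection
    importFromCsv(conn, "c:/myfile.csv", "MyTable")
    print("Data copied")

if __name__=="__main__":
    main()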
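Note that the code in the question writes the path string itself into the StringIO buffer, so copy_from receives the text of the path rather than the file's contents, which would explain the integer syntax error. If the file really is comma-separated, has a header row, or uses quotes, one option is to convert it to the tab-separated form first. A rough sketch, where csv_to_copy_buffer is a hypothetical helper and the fields are assumed to contain no tabs, newlines, or backslashes:

```python
import csv
import io


def csv_to_copy_buffer(csv_text, has_header=True):
    """Turn comma-separated text into a tab-separated StringIO (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # skip the header row
    out = io.StringIO()
    for row in reader:
        # csv.reader has already removed the quote characters.
        out.write("\t".join(row) + "\n")
    out.seek(0)  # rewind so the consumer reads from the start
    return out


sample = 'id,name\n1,"Smith, Jane"\n2,Bob\n'  # made-up data
buf = csv_to_copy_buffer(sample)
print(repr(buf.getvalue()))
```

The returned buffer can then be passed to copy_from in place of the open file. For a plain comma-separated file with no quoting, passing sep=',' to copy_from is the simpler route.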