MySQL - LOAD DATA LOCAL

MySQL - LOAD DATA LOCAL - python

Running a MySQL (8.0) database on a Ubuntu (20.04) VPS. My current objective is trying to load a .CSV automatically into a table via a Python script. The script is theoretically correct and should work, it's the ability to process the data from the CSV into the table.
dbupdate.py:
import mysql.connector
import os
import string
db = mysql.connector.connect (
host="localhost",
user="root",
passwd="********",
db="Rack_Info"
)
sqlLoadData = "LOAD DATA LOCAL INFILE '/home/OSA_ADVA_Dashboard/Processed_CSV/DownloadedCSV.csv' INTO TABLE BerT FIELDS TERMINATED BY ',' ENCLOSED BY '*' IGNORE 1 LINES;"
try:
curs = db.cursor()
curs.execute(sqlLoadData)
db.commit()
print ("SQL execution complete")
resultSet = curs.fetchall()
except IOError:
print ("Error incurred: ")
db.rollback()
db.close()
print ("Data loading complete.\n")
I have consulted the official documentation and enabled local_infile on both the server and client, configured the my.cnf and in SQL.
The my.cnf file:
#
# The MySQL database server configuration file.
#
# You can copy this to one of:
# - "/etc/mysql/my.cnf" to set global options,
# - "~/.my.cnf" to set user-specific options.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://dev.mysql.com/doc/mysql/en/server-system-variables.html
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
#
!includedir /etc/mysql/conf.d/
!includedir /etc/mysql/mysql.conf.d/
[client]
local_infile=1
[mysql]
local_infile=1
[mysqld]
local_infile=1
I have restarted both php and MySQL services to no avail, as well as the server. At a loss here at what to do. Any help would be much appreciated.

if im not mistaken php has its own config file where you have to enable load data local infile

I investigated the php.ini file and uncommented the load data lines, still nothing.
Turns out one of mysqld's variable, secure_file_priv, was pointing at an empty/default directory. All I had to do was change the directory to the where my files were located. All working now.

Related

Correct Format for Absolute Path for SQLite Database in Python

I have a small Python application that references a folder on my Raspberry Pi. I'm using a direct link to the .db file, and want to update to use an absolute link. However, my tries have failed and the path isn't correct. How do I format this for the database link to work anywhere it's installed? The code creates a new database.db file if one doesn't exist, but I want to use code that will always work even without the /home/name/code/... location specified. Thanks!
(I read this, and it didn't help me.)
#Connect to database
sqliteConnection = sqlite3.connect('/home/name/code/bot/database.db')
cursor = sqliteConnection.cursor()
I'm trying to do something more like this, but it isn't working:
#Connect to database
dbdir = os.path.direname(__file__)
dbpath = os.path.join(dbdir, "..", "database.db")
sqliteConnection = sqlite3.connect(dbpath)

#Dillon Davis got me on the right track. I'm not sure if this is the most elegant way to handle this, but it works:
filename = os.path.abspath(__file__)
dbdir = filename.rstrip('filename.py')
dbpath = os.path.join(dbdir, "database.db")
sqliteConnection = sqlite3.connect(dbpath)
cursor = sqliteConnection.cursor()

Accessing SQLite DB /w python and getting malformed DBs

I have some python code that copies a SQLite db across sftp. However, it is a highly active db, so many of the times I am running into a malformed db. I'm thinking of these possible options, but I don't know how to implement them because I am newer to python.
Alternate method of getting the sqlite db copied?
Maybe there is a way to query the sqlite file from the device? Not sure if that would work since sqlite is more of a local db not sure how I can query it like I could w mysql etc...
Create a loop? I could call the function again in the exception, but not sure how to retry the rest of the code.
Also, the malformed db issue can possibly occur in other sections im thinking? Maybe I need to run a pragma quick_check?
This is commonly what I am seeing.... The other catch is why am I seeing it as often as I am? Because if I load the sqlite file from my main machine, and it runs the query files?
(venv) dulanic#mediaserver:/opt/python_scripts/rpi$ cd /opt/python_scripts/rpi ; /usr/bin/env /opt/python_scripts/rpi/venv/bin/python /home/dulanic/.vscode-server/extensions/ms-python.python-2021.2.636928669/pythonFiles/lib/python/debugpy/launcher 37599 -- /opt/python_scripts/rpi/rpdb.py
An error occurred: database disk image is malformed
This is my current code:
#!/usr/bin/env python3
import psycopg2, sqlite3, sys, paramiko, sys, os, socket, time
scpuser=os.getenv('scpuser')
scppw = os.getenv('scppw')
sqdb = os.getenv('sqdb')
sqlike = os.getenv('sqlike')
pgdb = os.getenv('pgdb')
pguser = os.getenv('pguser')
pgpswd = os.getenv('pgpswd')
pghost = os.getenv('pghost')
pgport = os.getenv('pgport')
pgschema = os.getenv('pgschema')
database = r"./pihole.db"
pihole = socket.gethostbyname('pi.hole')
tabnames=[]
tabgrab = ''
def pullsqlite():
sftp.get('/etc/pihole/pihole-FTL.db','pihole.db')
sftp.close()
# SFTP pull config
ssh_client=paramiko.SSHClient()
ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh_client.connect(hostname=pihole,username=scpuser,password=scppw)
sftp=ssh_client.open_sftp()
# Pull SQlite
pullsqlite()
# Load sqlite tables to list
consq=sqlite3.connect(sqdb)
cursq=consq.cursor()
cursq.execute(f"SELECT name FROM sqlite_master WHERE type='table' AND name in ({sqlike})" )
tabgrab = cursq.fetchall()
# postgres connection
conpg = psycopg2.connect(database=pgdb, user=pguser, password=pgpswd,
host=pghost, port=pgport)
#Load data to postgres from sqlite
for item in tabgrab:
tabnames.append(item[0])
start = time.perf_counter()
for table in tabnames:
curpg = conpg.cursor()
if table=='queries':
curpg.execute(f"SELECT max(id) FROM {table};")
max_id = curpg.fetchone()[0]
cursq.execute(f"SELECT * FROM {table} where id > {max_id};")
else:
cursq.execute(f"SELECT * FROM {table};")
try:
rows=cursq.fetchall()
except sqlite3.Error as e:
print("An error occurred:", e.args[0])
colcount=len(rows[0])
pholder=('%s,'*colcount)[:-1]
try:
curpg.execute(f"SET search_path TO {pgschema};" )
curpg.executemany(f"INSERT INTO {table} VALUES ({pholder}) ON CONFLICT DO NOTHING;" ,rows)
conpg.commit()
print(f'Inserted {len(rows)} rows into {table}')
except psycopg2.DatabaseError as e:
print (f'Error {e}')
sys.exit(1)
if 'start' in locals():
elapsed = time.perf_counter() - start
print(f'Time {elapsed:0.4}')
consq.close()

copy data from csv to postgresql using python

I am on windows 7 64 bit.
I have a csv file 'data.csv'.
I want to import data to a postgresql table 'temp_unicommerce_status' via a python script.
My Script is:
import psycopg2
conn = psycopg2.connect("host='localhost' port='5432' dbname='Ekodev' user='bn_openerp' password='fa05844d'")
cur = conn.cursor()
cur.execute("""truncate table "meta".temp_unicommerce_status;""")
cur.execute("""Copy temp_unicommerce_status from 'C:\Users\n\Desktop\data.csv';""")
conn.commit()
conn.close()
I am getting this error
Traceback (most recent call last):
File "C:\Users\n\Documents\NetBeansProjects\Unicommerce_Status_Update\src\unicommerce_status_update.py", line 5, in <module>
cur.execute("""Copy temp_unicommerce_status from 'C:\\Users\\n\\Desktop\\data.csv';""")
psycopg2.ProgrammingError: must be superuser to COPY to or from a file
HINT: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.

Use the copy_from cursor method
f = open(r'C:\Users\n\Desktop\data.csv', 'r')
cur.copy_from(f, temp_unicommerce_status, sep=',')
f.close()
The file must be passed as an object.
Since you are coping from a csv file it is necessary to specify the separator as the default is a tab character

The way I solved this problem particular to use psychopg2 cursor class function copy_expert (Docs: http://initd.org/psycopg/docs/cursor.html). copy_expert allows you to use STDIN therefore bypassing the need to issue a superuser privilege for the postgres user. Your access to the file then depends on the client (linux/windows/mac) user's access to the file
From Postgres COPY Docs (https://www.postgresql.org/docs/current/static/sql-copy.html):
Do not confuse COPY with the psql instruction \copy. \copy invokes
COPY FROM STDIN or COPY TO STDOUT, and then fetches/stores the data in
a file accessible to the psql client. Thus, file accessibility and
access rights depend on the client rather than the server when \copy
is used.
You can also leave the permissions set strictly for access to the development_user home folder and the App folder.
csv_file_name = '/home/user/some_file.csv'
sql = "COPY table_name FROM STDIN DELIMITER '|' CSV HEADER"
cursor.copy_expert(sql, open(csv_file_name, "r"))

#sample of code that worked for me
import psycopg2 #import the postgres library
#connect to the database
conn = psycopg2.connect(host='localhost',
dbname='database1',
user='postgres',
password='****',
port='****')
#create a cursor object
#cursor object is used to interact with the database
cur = conn.cursor()
#create table with same headers as csv file
cur.execute("CREATE TABLE IF NOT EXISTS test(**** text, **** float, **** float, ****
text)")
#open the csv file using python standard file I/O
#copy file into the table just created
with open('******.csv', 'r') as f:
next(f) # Skip the header row.
#f , <database name>, Comma-Seperated
cur.copy_from(f, '****', sep=',')
#Commit Changes
conn.commit()
#Close connection
conn.close()
f.close()

Here is an extract from relevant PostgreSQL documentation : COPY with a file name instructs the PostgreSQL server to directly read from or write to a file. The file must be accessible to the server and the name must be specified from the viewpoint of the server. When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server
That's the reason why the copy command to or from a file a restricted to a PostgreSQL superuser : the file must be present on server and is loaded directly by the server process.
You should instead use :
cur.copy_from(r'C:\Users\n\Desktop\data.csv', temp_unicommerce_status)
as suggested by this other answer, because internally it uses COPY from stdin.

You can use d6tstack which makes this simple
import d6tstack
import glob
c = d6tstack.combine_csv.CombinerCSV([r'C:\Users\n\Desktop\data.csv']) # single-file
c = d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv')) # multi-file
c.to_psql_combine('postgresql+psycopg2://psqlusr:psqlpwdpsqlpwd#localhost/psqltest', 'tablename')
It also deals with data schema changes, create/append/replace table and allows you to preprocess data with pandas.

I know this question has been answered, but here are my two cent. I am adding little more description:
You can use cursor.copy_from method :
First you have to create a table with same no of columns as your csv file.
Example:
My csv looks like this:
Name, age , college , id_no , country , state , phone_no
demo_name 22 , bdsu , 1456 , demo_co , demo_da , 9894321_
First create a table:
import psycopg2
from psycopg2 import Error
connection = psycopg2.connect(user = "demo_user",
password = "demo_pass",
host = "127.0.0.1",
port = "5432",
database = "postgres")
cursor = connection.cursor()
create_table_query = '''CREATE TABLE data_set
(Name TEXT NOT NULL ,
age TEXT NOT NULL ,
college TEXT NOT NULL ,
id_no TEXT NOT NULL ,
country TEXT NOT NULL ,
state TEXT NOT NULL ,
phone_no TEXT NOT NULL);'''
cursor.execute(create_table_query)
connection.commit()
Now you can simply use cursor.copy_from where you need three parameters :
first file object , second table_name , third sep type
you can copy now :
f = open(r'final_data.csv', 'r')
cursor.copy_from(f, 'data_set', sep=',')
f.close()
done

I am going to post some of the errors I ran into trying to copy a csv file to a database on a linux based system....
here is an example csv file:
Name Age Height
bob 23 59
tom 56 67
You must install the library psycopg2 (i.e. pip install psycopg2 or sudo apt install python3-psycopg2 )
You must have postgres installed on your system before you can use psycopg2 (sudo apt install postgresql-server postgresql-contrib )
Now you must create a database to store the csv unless you already have postgres setup with a pre-existing database
COPY CSV USING POSTGRES COMMANDS
After installing postgres it creates a default user account which gives you access to postgres commands
To switch to the postgres account issue: sudo -u postgres psql
Acess the prompt by issuing: psql
#command to create a database
create database mytestdb;
#connect to the database to create a table
\connect mytestdb;
#create a table with same csv column names
create table test(name char(50), age char(50), height char(50));
#copy csv file to table
copy mytestdb 'path/to/csv' with csv header;
COPY CSV USING PYTHON
The main issue I ran into with copying the CSV file to a database was I didn't have the database created yet, however this can be done with python still.
import psycopg2 #import the Postgres library
#connect to the database
conn = psycopg2.connect(host='localhost',
dbname='mytestdb',
user='postgres',
password='')
#create a cursor object
#cursor object is used to interact with the database
cur = conn.cursor()
#create table with same headers as csv file
cur.execute('''create table test(name char(50), age char(50), height char(50));''')
#open the csv file using python standard file I/O
#copy file into the table just created
f = open('file.csv','r')
cursor.copy_from(f, 'test', sep=',')
f.close()

Try to do the same as the root user - postgres. If it were linux system, you could change file's permissions or move the file to /tmp. The problem results from missing credentials to read from the filesystem.

Fast MySQL Import

Writing a script to convert raw data for MySQL import I worked with a temporary textfile so far which I later imported manually using the LOAD DATA INFILE... command.
Now I included the import command into the python script:
db = mysql.connector.connect(user='root', password='root',
host='localhost',
database='myDB')
cursor = db.cursor()
query = """
LOAD DATA INFILE 'temp.txt' INTO TABLE myDB.values
FIELDS TERMINATED BY ',' LINES TERMINATED BY ';';
"""
cursor.execute(query)
cursor.close()
db.commit()
db.close()
This works but temp.txt has to be in the database directory which isn't suitable for my needs.
Next approch is dumping the file and commiting directly:
db = mysql.connector.connect(user='root', password='root',
host='localhost',
database='myDB')
sql = "INSERT INTO values(`timestamp`,`id`,`value`,`status`) VALUES(%s,%s,%s,%s)"
cursor=db.cursor()
for line in lines:
mode, year, julian, time, *values = line.split(",")
del values[5]
date = datetime.strptime(year+julian, "%Y%j").strftime("%Y-%m-%d")
time = datetime.strptime(time.rjust(4, "0"), "%H%M" ).strftime("%H:%M:%S")
timestamp = "%s %s" % (date, time)
for i, value in enumerate(values[:20], 1):
args = (timestamp,str(i+28),value, mode)
cursor.execute(sql,args)
db.commit()
Works as well but takes around four times as long which is too much. (The same for construct was used in the first version to generate temp.txt)
My conclusion is that I need a file and the LOAD DATA INFILE command to be faster. To be free where the textfile is placed the LOCAL option seems useful. But with MySQL Connector (1.1.7) there is the known error:
mysql.connector.errors.ProgrammingError: 1148 (42000): The used command is not allowed with this MySQL version
So far I've seen that using MySQLdb instead of MySQL Connector can be a workaround. Activity on MySQLdb however seems low and Python 3.3 support will probably never come.
Is LOAD DATA LOCAL INFILE the way to go and if so is there a working connector for python 3.3 available?
EDIT: After development the database will run on a server, script on a client.

I may have missed something important, but can't you just specify the full filename in the first chunk of code?
LOAD DATA INFILE '/full/path/to/temp.txt'
Note the path must be a path on the server.

To use LOAD DATA INFILE with every accessible file you have to set the
LOCAL_FILES client flag while creating the connection
import mysql.connector
from mysql.connector.constants import ClientFlag
db = mysql.connector.connect(client_flags=[ClientFlag.LOCAL_FILES], <other arguments>)

using an sqlite3 database with WAL enabled -Python

I'm trying to modify the two database files used by Google Drive to redirect my sync folder via a script (snapshot.db and sync_conf.db). While I can open the files in certain sqlite browsers (not all) I cant get python to execute a query. I just get the message: sqlite3.DatabaseError: file is encrypted or is not a database
Apparently google is using a Write-Ahead-logging (WAL) configuration on the databases and it can be turned off by running PRAGMA journal_mode=DELETE; (according to sqlite.org) against the database, but I can't figure out how to run that against the database if python can't read it.
heres what I have (I tried executing the PRAGMA command and commiting and then reopening but it didnt work):
import sqlite3
snapShot = 'C:\Documents and Settings\user\Local Settings\Application Data\Google\Drive\snapshot.db'
sync_conf = 'C:\Documents and Settings\user\Local Settings\Application Data\Google\Drive\sync_config.db'
sync_folder_path = 'H:\Google Drive'
conn = sqlite3.connect(snapShot)
cursor = conn.cursor()
#cursor.execute('PRAGMA journal_mode=DELETE;')
#conn.commit()
#conn= sqlite3.connect(snapShot)
#cursor = conn.cursor()
query = "UPDATE local_entry SET filename = '\\?\\" + sync_folder_path +"' WHERE filename ='\\?\C:Users\\admin\Google Drive'"
print query
cursor.execute(query)

problem solved. I just downloaded the latest version of sqlite from http://www.sqlite.org/download.html and overwrote the old .dll in my python27/DLL directory. Works fine now.
What a nusance.

I don't think the journal_mode pragma should keep sqlite3 from being able to open the db at all. Perhaps you're using an excessively old version of the sqlite3 lib? What version of Python are you using, and what version of the sqlite3 library?
import sqlite3
print sqlite3.version

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

MySQL - LOAD DATA LOCAL - python

if im not mistaken php has its own config file where you have to enable load data local infile

I investigated the php.ini file and uncommented the load data lines, still nothing. Turns out one of mysqld's variable, secure_file_priv, was pointing at an empty/default directory. All I had to do was change the directory to the where my files were located. All working now.

Related

Correct Format for Absolute Path for SQLite Database in Python

Accessing SQLite DB /w python and getting malformed DBs

copy data from csv to postgresql using python

Fast MySQL Import

using an sqlite3 database with WAL enabled -Python

Categories

Resources