I'm aware that the best way to prevent SQL injection is to write Python queries of this form (or similar):
query = 'SELECT %s %s from TABLE'
fields = ['ID', 'NAME']
cur.execute(query, fields)
The above will work for a single query, but what if we want to do a UNION of 2 SQL commands? I've set this up via sqlite3 for the sake of reproducibility, though technically I'm using pymysql. It looks as follows:
import sqlite3

conn = sqlite3.connect('dummy.db')
cur = conn.cursor()
query = 'CREATE TABLE DUMMY1(ID int AUTO_INCREMENT, VALUE varchar(255))'
query2 = 'CREATE TABLE DUMMY2(ID int AUTO_INCREMENT, VALUE varchar(255))'
try:
    cur.execute(query)
    cur.execute(query2)
except:
    print('Already made table!')

tnames = ['DUMMY1', 'DUMMY2']
sqlcmds = []
for i in range(0, 2):
    query = 'SELECT %s FROM {}'.format(tnames[i])
    sqlcmds.append(query)
fields = ['VALUE', 'VALUE']
sqlcmd = ' UNION '.join(sqlcmds)
cur.execute(sqlcmd, fields)
When I run this, I get a sqlite Operational Error:
sqlite3.OperationalError: near "%": syntax error
I've validated the query prints as expected with this output:
INSERT INTO DUMMY VALUES(%s) UNION INSERT INTO DUMMY VALUES(%s)
All looks good there. What is the issue with the string substitutions here? I can confirm that running a query with direct string substitution works fine. I've tried it with both selects and inserts.
EDIT: I'm aware there are multiple ways to do this with executemany and a few others. I need to do this with UNION for the purposes I'm using this for, because this is a very, very simplified example of the operational code I'm using.
The code below executes a few INSERTs at once:
import sqlite3

conn = sqlite3.connect('dummy.db')
cur = conn.cursor()
query = 'CREATE TABLE DUMMY(ID int AUTO_INCREMENT NOT NULL, VALUE varchar(255))'
try:
    cur.execute(query)
except:
    print('Already made table!')

valid_fields = [('ya dummy',), ('stupid test example',)]
cur.executemany('INSERT INTO DUMMY (VALUE) VALUES (?)', valid_fields)
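The underlying error in the question, by the way, is the placeholder style: sqlite3 uses qmark (`?`) parameters rather than `%s`, and no DB-API driver can bind identifiers (table or column names) as parameters; those have to be validated separately and formatted into the statement. A minimal sketch of the UNION case under that approach (the in-memory database and the whitelist of allowed columns are assumptions for illustration):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE DUMMY1(ID INTEGER PRIMARY KEY, VALUE TEXT)')
cur.execute('CREATE TABLE DUMMY2(ID INTEGER PRIMARY KEY, VALUE TEXT)')
cur.execute('INSERT INTO DUMMY1(VALUE) VALUES (?)', ('ya dummy',))
cur.execute('INSERT INTO DUMMY2(VALUE) VALUES (?)', ('stupid test example',))

# Identifiers cannot be bound with ? placeholders, so whitelist them
# before formatting them into the statement text.
ALLOWED_COLUMNS = {'ID', 'VALUE'}
col = 'VALUE'
if col not in ALLOWED_COLUMNS:
    raise ValueError('bad column name')

sqlcmd = ' UNION '.join(
    'SELECT {} FROM {}'.format(col, t) for t in ('DUMMY1', 'DUMMY2'))
rows = cur.execute(sqlcmd).fetchall()
```

Row *values*, by contrast, should still go through `?` binding as usual.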
I have the following Python code:
import pandas as pd
from sqlalchemy import create_engine
import mysql.connector
# Give the location of the file
loc = ("C:\\Users\\27826\\Desktop\\11Sixteen\\Models and Reports\\Historical results files\\EPL 1993-94.csv")
df = pd.read_csv(loc)
# Remove empty columns then rows
df = df.dropna(axis=1, how='all')
df = df.dropna(axis=0, how='all')
# Create DataFrame and then import to db (new game results table)
engine = create_engine("mysql://root:xxx@localhost/11sixteen")
df.to_sql('new_game_results', con=engine, if_exists="replace")
# Move from new games results table to game results table
db = mysql.connector.connect(host="localhost",
                             user="root",
                             passwd="xxx",
                             database="11sixteen")
my_cursor = db.cursor()
my_cursor.execute("INSERT INTO 11sixteen.game_results "
                  "SELECT * FROM 11sixteen.new_game_results WHERE "
                  "NOT EXISTS (SELECT date, HomeTeam "
                  "FROM 11sixteen.game_results WHERE "
                  "11sixteen.game_results.date = 11sixteen.new_game_results.date AND "
                  "11sixteen.game_results.HomeTeam = 11sixteen.new_game_results.HomeTeam)")
print("complete")
Basically the objective is to copy data from several CSV files into a SQL table (one at a time) and then transfer it from there to the fuller table where ALL the data will be aggregated (hopefully without duplicates).
Everything works 100% except the SQL query as below:
INSERT INTO 11sixteen.game_results
SELECT * FROM 11sixteen.new_game_results
WHERE NOT EXISTS ( SELECT date, HomeTeam
FROM 11sixteen.game_results WHERE
11sixteen.game_results.date = 11sixteen.new_game_results.date AND
11sixteen.game_results.HomeTeam = 11sixteen.new_game_results.HomeTeam)
If I run the same query in MySQL Workbench it works perfectly. Any ideas why I can't get Python to execute the query as expected?
Add a commit at the end:
db.commit()
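To see why the commit matters: most DB-API drivers, mysql.connector included, start an implicit transaction, so other connections (and MySQL Workbench) won't see the inserted rows until `commit()` is called. A small sqlite3 sketch of the same effect (the temp-file database and table are stand-ins for illustration):

```python
import os
import sqlite3
import tempfile

# Two separate connections to the same database file.
path = os.path.join(tempfile.mkdtemp(), 'demo.db')
writer = sqlite3.connect(path)
writer.execute('CREATE TABLE game_results (HomeTeam TEXT)')
writer.commit()

writer.execute("INSERT INTO game_results VALUES ('Arsenal')")  # not committed yet

reader = sqlite3.connect(path)
before = reader.execute('SELECT COUNT(*) FROM game_results').fetchone()[0]

writer.commit()  # the insert now becomes visible to other connections
after = reader.execute('SELECT COUNT(*) FROM game_results').fetchone()[0]
```

Here `before` is 0 and `after` is 1: without the commit, the insert only ever existed inside the writer's open transaction.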
Summary
I have a Python program (2.7) that connects to a SQL Server database using SQLAlchemy. I want to copy the entire SQL table into a local CSV file (including column headers). I'm new to SQLAlchemy (version 0.7) and so far I'm able to dump the entire csv file, but I have to explicitly list my column headers.
Question
How do I copy an entire SQL table into a local CSV file (including column headers)? I don't want to explicitly type in my column headers. The reason is that I want to avoid changing the code if there's changes in the table's columns.
Code
import csv
import sqlalchemy

# Setup connection info, assume database connection info is correct
SQLALCHEMY_CONNECTION = (DB_DRIVER_SQLALCHEMY + '://'
                         + DB_UID + ":" + DB_PWD + "@" + DB_SERVER + "/" + DB_DATABASE
                         )
engine = sqlalchemy.create_engine(SQLALCHEMY_CONNECTION, echo=True)
metadata = sqlalchemy.MetaData(bind=engine)
vw_AllCenterChatOverview = sqlalchemy.Table(
    'vw_AllCenterChatOverview', metadata, autoload=True)
metadata.create_all(engine)
conn = engine.connect()

# Run the SQL Select Statement
result = conn.execute("""SELECT * FROM
    [LifelineChatDB].[dbo].[vw_AllCenterChatOverview]""")

# Open file 'output.csv' and write SQL query contents to it
f = csv.writer(open('output.csv', 'wb'))
f.writerow(['StartTime', 'EndTime', 'Type', 'ChatName', 'Queue', 'Account',
            'Operator', 'Accepted', 'WaitTimeSeconds', 'PreChatSurveySkipped',
            'TotalTimeInQ', 'CrisisCenterKey'])  # Where I explicitly list table headers
for row in result:
    try:
        f.writerow(row)
    except UnicodeError:
        print "Error running this line ", row
result.close()
Table Structure
In my example, 'vw_AllCenterChatOverview' is the table. Here's the Table Headers:
StartTime, EndTime, Type, ChatName, Queue, Account, Operator, Accepted, WaitTimeSeconds, PreChatSurveySkipped, TotalTimeInQ, CrisisCenterKey
Thanks in advance!
Use ResultProxy.keys:
# Run the SQL Select Statement
result = conn.execute("""SELECT * FROM
[LifelineChatDB].[dbo].[vw_AllCenterChatOverview]""")
# Get column names
column_names = result.keys()
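With the column names in hand, the header row can come straight from the result instead of being typed out. The same idea works with any plain DB-API cursor via `cursor.description`; a self-contained sketch with sqlite3 (the in-memory table stands in for the real view):

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE chats (StartTime TEXT, Operator TEXT)')
cur.execute("INSERT INTO chats VALUES ('09:00', 'alice')")

result = cur.execute('SELECT * FROM chats')
# First element of each description tuple is the column name.
column_names = [d[0] for d in result.description]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(column_names)   # header row taken from the result itself
writer.writerows(result.fetchall())
```

If the view gains or loses columns, the CSV header tracks it automatically with no code change.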
I would like to dump only one table but by the looks of it, there is no parameter for this.
I found this example of the dump but it is for all the tables in the DB:
# Convert file existing_db.db to SQL dump file dump.sql
import sqlite3

con = sqlite3.connect('existing_db.db')
with open('dump.sql', 'w') as f:
    for line in con.iterdump():
        f.write('%s\n' % line)
You can copy just the single table into an in-memory db:
import sqlite3

def getTableDump(db_file, table_to_dump):
    conn = sqlite3.connect(':memory:')
    cu = conn.cursor()
    cu.execute("attach database '" + db_file + "' as attached_db")
    cu.execute("select sql from attached_db.sqlite_master "
               "where type='table' and name='" + table_to_dump + "'")
    sql_create_table = cu.fetchone()[0]
    cu.execute(sql_create_table)
    cu.execute("insert into " + table_to_dump +
               " select * from attached_db." + table_to_dump)
    conn.commit()
    cu.execute("detach database attached_db")
    return "\n".join(conn.iterdump())

TABLE_TO_DUMP = 'table_to_dump'
DB_FILE = 'db_file'
print(getTableDump(DB_FILE, TABLE_TO_DUMP))
Pro:
Simplicity and reliability: you don't have to rewrite any library method, and you can be more confident the code will stay compatible with future versions of the sqlite3 module.
Con:
You need to load the whole table in memory, which may or may not be a big deal depending on how big the table is, and how much memory is available.
The dump implementation lives at http://coverage.livinglogic.de/Lib/sqlite3/dump.py.html (local path: PythonPath/Lib/sqlite3/dump.py).
You can modify it a little:
# Mimic the sqlite3 console shell's .dump command
# Author: Paul Kippes <kippesp@gmail.com>
def _iterdump(connection, table_name):
    """
    Returns an iterator to the dump of the database in an SQL text format.

    Used to produce an SQL dump of the database. Useful to save an in-memory
    database for later restoration. This function should not be called
    directly but instead called from the Connection method, iterdump().
    """
    cu = connection.cursor()
    yield('BEGIN TRANSACTION;')

    # sqlite_master table contains the SQL CREATE statements for the database.
    q = """
        SELECT name, type, sql
        FROM sqlite_master
        WHERE sql NOT NULL AND
              type == 'table' AND
              name == :table_name
        """
    schema_res = cu.execute(q, {'table_name': table_name})
    for table_name, type, sql in schema_res.fetchall():
        if table_name == 'sqlite_sequence':
            yield('DELETE FROM sqlite_sequence;')
        elif table_name == 'sqlite_stat1':
            yield('ANALYZE sqlite_master;')
        elif table_name.startswith('sqlite_'):
            continue
        else:
            yield('%s;' % sql)

        # Build the insert statement for each row of the current table
        res = cu.execute("PRAGMA table_info('%s')" % table_name)
        column_names = [str(table_info[1]) for table_info in res.fetchall()]
        q = "SELECT 'INSERT INTO \"%(tbl_name)s\" VALUES("
        q += ",".join(["'||quote(" + col + ")||'" for col in column_names])
        q += ")' FROM '%(tbl_name)s'"
        query_res = cu.execute(q % {'tbl_name': table_name})
        for row in query_res:
            yield("%s;" % row[0])

    # Now when the type is 'index', 'trigger', or 'view':
    # q = """
    #     SELECT name, type, sql
    #     FROM sqlite_master
    #     WHERE sql NOT NULL AND
    #           type IN ('index', 'trigger', 'view')
    #     """
    # schema_res = cu.execute(q)
    # for name, type, sql in schema_res.fetchall():
    #     yield('%s;' % sql)

    yield('COMMIT;')
Now it accepts the table name as its second argument.
You can use it like this:
with open('dump.sql', 'w') as f:
    for line in _iterdump(con, 'GTS_vehicle'):
        f.write('%s\n' % line)
You will get something like this:
BEGIN TRANSACTION;
CREATE TABLE "GTS_vehicle" ("id" integer NOT NULL PRIMARY KEY, "name" varchar(20) NOT NULL, "company_id" integer NULL, "license_plate" varchar(20) NULL, "icon" varchar(100) NOT NULL DEFAULT 'baseicon.png', "car_brand" varchar(30) NULL, "content_type_id" integer NULL, "modemID" varchar(100) NULL, "distance" integer NULL, "max_speed" integer NULL DEFAULT 100, "max_rpm" integer NULL DEFAULT 4000, "fuel_tank_volume" integer NULL DEFAULT 70, "max_battery_voltage" integer NULL, "creation_date" datetime NOT NULL, "last_RFID" text NULL);
INSERT INTO "GTS_vehicle" VALUES(1,'lan1_op1_car1',1,'03115','baseicon.png','UFP',16,'lan_op1_car1',NULL,100,4000,70,12,'2011-06-23 11:54:32.395000',NULL);
INSERT INTO "GTS_vehicle" VALUES(2,'lang_op1_car2',1,'03','baseicon.png','ыва',16,'lan_op1_car2',NULL,100,4000,70,12,'2011-06-23 11:55:02.372000',NULL);
INSERT INTO "GTS_vehicle" VALUES(3,'lang_sup_car1',1,'0000','baseicon.png','Fiat',16,'lan_sup_car1',NULL,100,4000,70,12,'2011-06-23 12:32:09.017000',NULL);
INSERT INTO "GTS_vehicle" VALUES(4,'lang_sup_car2',1,'123','baseicon.png','ЗАЗ',16,'lan_sup_car2',NULL,100,4000,70,12,'2011-06-23 12:31:38.108000',NULL);
INSERT INTO "GTS_vehicle" VALUES(9,'lang_op2_car1',1,'','baseicon.png','',16,'1233211234',NULL,100,4000,70,12,'2011-07-05 13:32:09.865000',NULL);
INSERT INTO "GTS_vehicle" VALUES(11,'Big RIder',1,'','baseicon.png','0311523',16,'111',NULL,100,4000,70,20,'2011-07-07 12:12:40.358000',NULL);
COMMIT;
With iterdump(), all of the information is emitted like this:
INSERT INTO "name" VALUES(1, 'John')
INSERT INTO "name" VALUES(2, 'Jane')
INSERT INTO "phone" VALUES(1, '111000')
INSERT INTO "phone" VALUES(2, '111001')
An easy way is to filter on the line prefix with the str.startswith() method.
For example, if the table name is 'phone':
# Convert file existing_db.db to SQL dump file dump.sql
import sqlite3

con = sqlite3.connect('existing_db.db')
with open('dump.sql', 'w') as f:
    for line in con.iterdump():
        if line.startswith('INSERT INTO "phone"'):
            f.write('%s\n' % line)
It's not very smart, but it fits the objective.
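A self-contained illustration of that filter (the two tables and their rows are made up to mirror the example above):

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE name (id INTEGER, who TEXT)')
con.execute('CREATE TABLE phone (id INTEGER, num TEXT)')
con.execute("INSERT INTO name VALUES (1, 'John')")
con.execute("INSERT INTO phone VALUES (1, '111000')")
con.commit()

# Keep only the INSERT lines that target the phone table.
phone_lines = [line for line in con.iterdump()
               if line.startswith('INSERT INTO "phone"')]
```

One caveat: a text value containing a newline would split its INSERT across several dump lines, and only the first line carries the prefix, so the filter is best treated as a quick hack rather than a robust parser.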