Here's a question for you mysql + python folks out there.
Why does this sequence of MySQL commands work when I execute it via the mysql CLI, but not when I execute it through Python?
#!/usr/bin/env python
import oursql as mysql
import sys, traceback as tb
import logging

# some other stuff...

class MySqlAuth(object):
    def __init__(self, host=None, db=None, user=None, pw=None, port=None):
        self.host = 'localhost' if host is None else host
        self.db = 'mysql' if db is None else db
        self.user = 'root' if user is None else user
        self.pw = pw
        self.port = 3306 if port is None else port

    @property
    def cursor(self):
        auth_dict = dict()
        auth_dict['host'] = self.host
        auth_dict['user'] = self.user
        auth_dict['passwd'] = self.pw
        auth_dict['db'] = self.db
        auth_dict['port'] = self.port
        conn = mysql.connect(**auth_dict)
        cur = conn.cursor(mysql.DictCursor)
        return cur

def ExecuteNonQuery(auth, sql):
    try:
        cur = auth.cursor
        log.debug('SQL: ' + sql)
        cur.execute(sql)
        cur.connection.commit()
        return cur.rowcount
    except:
        cur.connection.rollback()
        log.error("".join(tb.format_exception(*sys.exc_info())))
    finally:
        cur.connection.close()

def CreateTable(auth, table_name):
    CREATE_TABLE = """
        CREATE TABLE IF NOT EXISTS %(table)s (
            uid VARCHAR(120) PRIMARY KEY
            , k VARCHAR(1000) NOT NULL
            , v BLOB
            , create_ts TIMESTAMP NOT NULL
            , mod_ts TIMESTAMP NOT NULL
            , UNIQUE(k)
            , INDEX USING BTREE(k)
            , INDEX USING BTREE(mod_ts) );
        """
    ExecuteNonQuery(auth, CREATE_TABLE % { 'table' : table_name })
    CREATE_BEFORE_INSERT_TRIGGER = """
        DELIMITER //
        CREATE TRIGGER %(table)s_before_insert BEFORE INSERT ON %(table)s
        FOR EACH ROW
        BEGIN
            SET NEW.create_ts = NOW();
            SET NEW.mod_ts = NOW();
            SET NEW.uid = UUID();
        END;// DELIMIETER ;
        """
    ExecuteNonQuery(auth, CREATE_BEFORE_INSERT_TRIGGER % { 'table' : table_name })
    CREATE_BEFORE_UPDATE_TRIGGER = """
        DELIMITER //
        CREATE TRIGGER %(table)s_before_update BEFORE UPDATE ON %(table)s
        FOR EACH ROW
        BEGIN
            SET NEW.mod_ts = NOW();
        END;// DELIMIETER ;
        """
    ExecuteNonQuery(auth, CREATE_BEFORE_UPDATE_TRIGGER % { 'table' : table_name })
# some other stuff
The error that I get when I run the Python script is this:
2012-01-15 11:53:00,138 [4214 MainThread mynosql.py] DEBUG SQL:
DELIMITER //
CREATE TRIGGER nosql_before_insert BEFORE INSERT ON nosql
FOR EACH ROW
BEGIN
SET NEW.create_ts = NOW();
SET NEW.mod_ts = NOW();
SET NEW.uid = UUID();
END;// DELIMIETER ;
2012-01-15 11:53:00,140 [4214 MainThread mynosql.py] ERROR Traceback (most recent call last):
File "./mynosql.py", line 39, in ExecuteNonQuery
cur.execute(sql)
File "cursor.pyx", line 120, in oursql.Cursor.execute (oursqlx/oursql.c:15856)
File "cursor.pyx", line 111, in oursql.execute (oursqlx/oursql.c:15728)
File "statement.pyx", line 157, in oursql._Statement.prepare (oursqlx/oursql.c:7750)
File "statement.pyx", line 127, in oursql._Statement._raise_error (oursqlx/oursql.c:7360)
ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'DELIMITER //\n CREATE TRIGGER nosql_before_insert BEFORE INSERT ON nosql\n F' at line 1", None)
Although the error you are getting seems to be generated by the first DELIMITER // statement, you have a typo at the last mention of DELIMITER - you wrote DELIMIETER ; - try changing that and see if it solves your issue.
Update
You have the same DELIMIETER ; typo in two places - I believe you are getting the error just after the interpreter finds the first one:
DELIMITER //
CREATE TRIGGER %(table)s_before_insert BEFORE INSERT ON %(table)s
FOR EACH ROW
BEGIN
SET NEW.create_ts = NOW();
SET NEW.mod_ts = NOW();
SET NEW.uid = UUID();
END;// DELIMIETER ; <-- this one is wrong, it should be DELIMITER
You can only pass queries to mysql one at a time; it's up to the client to ensure that the query text is just one valid statement.
The MySQL client does this by tokenizing the entered query and looking for statement separators. In the case of a trigger definition, this doesn't work because the definition can contain semicolons (the default statement separator), and so you have to tell the cli to separate statements in another way, using the DELIMITER command.
The MySQLdb (and other) Python APIs require no such statement separation; the programmer is obligated to pass statements one at a time to query.
Try removing the DELIMITER statements altogether from your queries (when passed through the python api).
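A minimal sketch of that approach, reusing the ExecuteNonQuery helper and the %(table)s substitution from the question: drop the DELIMITER lines and the trailing //, and send each CREATE TRIGGER as its own single statement. The semicolons inside BEGIN ... END are fine, because the server parses the whole statement at once; DELIMITER only ever mattered to the CLI tokenizer.
CREATE_BEFORE_INSERT_TRIGGER = """
    CREATE TRIGGER %(table)s_before_insert BEFORE INSERT ON %(table)s
    FOR EACH ROW
    BEGIN
        SET NEW.create_ts = NOW();
        SET NEW.mod_ts = NOW();
        SET NEW.uid = UUID();
    END
    """
ExecuteNonQuery(auth, CREATE_BEFORE_INSERT_TRIGGER % { 'table' : table_name })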
Related
I have the following SQL being sent from pyodbc using bound parameters.
IF NOT EXISTS (
SELECT *
FROM dbo.tApplicationCMSNegativeFactors2
WHERE N'transientkey' = N'67'
)
INSERT dbo.tApplicationCMSNegativeFactors2 (
N'transientkey, nfid, active, applicationnum, modstamp, source_database_tablename_for_kafka_connector, source_identifier_for_kafka_connector'
)
VALUES (
N'''67'', ''5'', ''1'', ''52'', ''2022-10-01 03:28:25.000372'', ''tapplicationcmsnegativefactors'', ''transientkey'''
)
ELSE
UPDATE dbo.tApplicationCMSNegativeFactors2
SET N'transientkey = ''67'', nfid = ''5'', active = ''1'', applicationnum = ''52'', modstamp = ''2022-10-01 03:28:25.000372'', source_database_tablename_for_kafka_connector = ''tapplicationcmsnegativefactors'', source_identifier_for_kafka_connector = ''transientkey'''
WHERE N'transientkey' = N'67';
I am unsure why, but when trying to execute this code, an error shows next to the SET clause in SSMS. What can I do to make this SQL execute successfully while still retaining the N prefix, so that it works with pyodbc?
I was expecting this code to execute successfully, seeing as removing the N prefix allows the code to execute.
I've included the Python code below.
import pyodbc
# Auth.
server = ""
database = ""
username = ""
password = ""
# Set up the database connection
cnxn = pyodbc.connect(f'DRIVER={"SQL Server"};SERVER={server};DATABASE={database};UID={username};PWD={password}')
cursor = cnxn.cursor()
ls = [
    "transientkey",
    67,
    "transientkey, nfid, active, applicationnum, modstamp, source_database_tablename_for_kafka_connector, source_identifier_for_kafka_connector",
    "'67', '5', '1', '52', '2022-10-01 03:28:25.000372', 'tapplicationcmsnegativefactors', 'transientkey'",
    "transientkey = '67', nfid = '5', active = '1', applicationnum = '52', modstamp = '2022-10-01 03:28:25.000372', source_database_tablename_for_kafka_connector = 'tapplicationcmsnegativefactors', source_identifier_for_kafka_connector = 'transientkey'",
]
# Execute SQL
def exec_sql(kv, join_kv, col_inst, val_inst, val_upd):
    cursor.execute(
        "IF NOT EXISTS (SELECT * FROM dbo.tApplicationCMSNegativeFactors2 WHERE ? = ?) INSERT tApplicationCMSNegativeFactors2 (?) VALUES (?) ELSE UPDATE dbo.tApplicationCMSNegativeFactors2 SET ? WHERE ? = ?",
        kv, join_kv, col_inst, val_inst, val_upd, kv, join_kv
    )
exec_sql(str(ls[0]), str(ls[1]), str(ls[2]), str(ls[3]), str(ls[4]))
You're mixing up parameters and dynamic SQL. You can't change the structure of the SQL with parameters, so this
tApplicationCMSNegativeFactors2 (?) VALUES (?)
needs to be done with string interpolation (accounting for SQL injection vulnerabilities) before the string with parameter markers ? is sent to the cursor.
The reason I'm using ? here is to prevent SQL injection attacks from occurring from the Python code.
You just can't do that. If you want to avoid dynamic SQL you can have a number of static SQL queries with parameter markers for the data, like
... INSERT tApplicationCMSNegativeFactors2 (transientkey, nfid, active, applicationnum, modstamp, source_database_tablename_for_kafka_connector, source_identifier_for_kafka_connector) VALUES (?,?,?,?,?,?,?) ...
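As a hedged sketch of that split (the upsert helper and the ALLOWED_COLUMNS allowlist below are hypothetical, not from the original post): identifiers are interpolated into the SQL only after being checked against a fixed allowlist, while every value still travels as a ? parameter.
ALLOWED_COLUMNS = {
    "transientkey", "nfid", "active", "applicationnum", "modstamp",
    "source_database_tablename_for_kafka_connector",
    "source_identifier_for_kafka_connector",
}

def upsert(cursor, columns, values, key, key_value):
    # Dynamic SQL part: identifiers, validated against the allowlist.
    # columns and values must align one-to-one.
    if not set(columns) <= ALLOWED_COLUMNS or key not in ALLOWED_COLUMNS:
        raise ValueError("unexpected column name")
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in values)
    set_list = ", ".join(f"{c} = ?" for c in columns)
    sql = (f"IF NOT EXISTS (SELECT * FROM dbo.tApplicationCMSNegativeFactors2 "
           f"WHERE {key} = ?) "
           f"INSERT dbo.tApplicationCMSNegativeFactors2 ({col_list}) "
           f"VALUES ({placeholders}) "
           f"ELSE UPDATE dbo.tApplicationCMSNegativeFactors2 "
           f"SET {set_list} WHERE {key} = ?")
    # Parameter part: values only, in marker order
    cursor.execute(sql, (key_value, *values, *values, key_value))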
I have created a database and I am trying to fetch data from it. I have a class Query, and inside the class I have a function that queries a table called forecasts. The function is as follows:
def forecast(self, provider: str, zone: str='Mainland'):
    self.date_start = date_start
    self.date_end = date_end
    self.df_forecasts = pd.DataFrame()
    fquery = """
        SELECT dp.name AS provider_name, lf.datetime_from AS date, fr.name AS run_name, lf.value AS value
        FROM load_forecasts lf
        INNER JOIN bidding_zones bz ON lf.zone_id = bz.zone_id
        INNER JOIN data_providers dp ON lf.provider_id = dp.provider_id
        INNER JOIN forecast_runs fr ON lf.run_id = fr.run_id
        WHERE bz.name = '{zone}'
        AND dp.name = '{provider}'
        AND date(lf.datetime_from) BETWEEN '{self.date_start}' AND '{self.date_end}'
    """
    df_forecasts = pd.read_sql_query(fquery, self.connection)
    return df_forecasts
In the script that I run, I call the Query class, giving it my inputs:
query = Query(date_start, date_end)
And then call the function:
forecast_df = query.forecast(provider='Meteologica')
I run my script in the command line in the classic way
python myscript.py '2022-11-10' '2022-11-18'
My script shows the error
sqlalchemy.exc.DataError: (psycopg2.errors.InvalidDatetimeFormat) invalid input syntax for type date: "{self.date_start}"
LINE 9: AND date(lf.datetime_from) BETWEEN '{self.date_start...
when I use this syntax, but when I manually input the string for date_start and date_end it works.
I cannot find a way to solve the problem with sqlalchemy, so I opened a cursor with psycopg2.
# Returns the datetime, value, provider name and issue date of the forecasts in the load_forecasts table
# The date range is specified by the user when the class is called
def forecast(self, provider: str, zone: str='Mainland'):
    # Opens a cursor to get the data
    cursor = self.connection.cursor()
    # Query to run
    query = """
        SELECT dp.name, lf.datetime_from, fr.name, lf.value, lf.issue_date
        FROM load_forecasts lf
        INNER JOIN bidding_zones bz ON lf.zone_id = bz.zone_id
        INNER JOIN data_providers dp ON lf.provider_id = dp.provider_id
        INNER JOIN forecast_runs fr ON lf.run_id = fr.run_id
        WHERE bz.name = %s
        AND dp.name = %s
        AND date(lf.datetime_from) BETWEEN %s AND %s
    """
    # Execute the query, bring the data and close the cursor
    cursor.execute(query, (zone, provider, self.date_start, self.date_end))
    self.df_forecasts = cursor.fetchall()
    cursor.close()
    return self.df_forecasts
If anyone finds the answer with sqlalchemy, I would love to see it!
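For what it's worth, the error message quotes the braces literally ({self.date_start}), which points at the root cause: the triple-quoted query is a plain string, never an f-string, so the placeholders are sent to Postgres unformatted. A hedged sketch of a SQLAlchemy-side equivalent of the psycopg2 fix, assuming self.connection is a SQLAlchemy engine or connection: keep the values as named bind parameters instead of interpolating them.
from sqlalchemy import text

fquery = text("""
    SELECT dp.name AS provider_name, lf.datetime_from AS date, fr.name AS run_name, lf.value AS value
    FROM load_forecasts lf
    INNER JOIN bidding_zones bz ON lf.zone_id = bz.zone_id
    INNER JOIN data_providers dp ON lf.provider_id = dp.provider_id
    INNER JOIN forecast_runs fr ON lf.run_id = fr.run_id
    WHERE bz.name = :zone
    AND dp.name = :provider
    AND date(lf.datetime_from) BETWEEN :date_start AND :date_end
""")
df_forecasts = pd.read_sql_query(
    fquery, self.connection,
    params={"zone": zone, "provider": provider,
            "date_start": self.date_start, "date_end": self.date_end})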
I am trying to get the ROW_COUNT() from a MySQL stored procedure into Python.
Here is what I have, but I don't know what I am missing.
DELIMITER //
CREATE OR REPLACE PROCEDURE sp_refresh_mytable(
    OUT row_count INT
)
BEGIN
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;
    END;
    DECLARE EXIT HANDLER FOR SQLWARNING
    BEGIN
        ROLLBACK;
    END;
    DECLARE EXIT HANDLER FOR NOT FOUND
    BEGIN
        ROLLBACK;
    END;

    START TRANSACTION;

    DELETE FROM mytable;

    INSERT INTO mytable
    (
        col1
        , col2
    )
    SELECT
        col1
        , col2
    FROM othertable;

    SET row_count = ROW_COUNT();

    COMMIT;
END //
DELIMITER ;
If I call this via normal SQL as follows, I get the correct row_count for the insert operation (e.g. 26 rows inserted):
CALL sp_refresh_mytable(@rowcount);
select @rowcount as t;
-- output: 26
Then in Python/SQLAlchemy:
def call_procedure(engine, function_name, params=None):
    connection = engine.raw_connection()
    try:
        cursor = connection.cursor()
        result = cursor.callproc('sp_refresh_mytable', [0])
        ## try result outputs
        resultfetch = cursor.fetchone()
        logger.info(result)
        logger.info(result[0])
        logger.info(resultfetch)
        cursor.close()
        connection.commit()
        connection.close()
        logger.info(f"Running procedure {function_name} success!")
        return result
    except Exception as e:
        logger.error(f"Running procedure {function_name} failed!")
        logger.exception(e)
        return None
    finally:
        connection.close()
So I tried logging different variations of getting the out value, but it is always 0 or None.
[INFO] db_update [0]
[INFO] db_update 0
[INFO] db_update None
What am I missing?
Thanks!
With the help of this answer I found the following solution that worked for me.
a) Working solution using engine.raw_connection() and cursor.callproc:
def call_procedure(engine, function_name):
    connection = engine.raw_connection()
    try:
        cursor = connection.cursor()
        cursor.callproc(function_name, [0])
        cursor.execute(f"""SELECT @_{function_name}_0""")
        results = cursor.fetchone()  ## returns a tuple e.g. (285,)
        rows_affected = results[0]
        cursor.close()
        connection.commit()
        logger.info(f"Running procedure {function_name} success!")
        return rows_affected
    except Exception as e:
        logger.error(f"Running procedure {function_name} failed!")
        logger.exception(e)
        return None
    finally:
        connection.close()
And with this answer I found this solution also:
b) Instead of using a raw connection, this worked as well:
def call_procedure(engine, function_name, params=None):
    try:
        with engine.begin() as db_conn:
            db_conn.execute(f"""CALL {function_name}(@out)""")
            results = db_conn.execute('SELECT @out').fetchone()  ## returns a tuple e.g. (285,)
            rows_affected = results[0]
            logger.debug(f"Running procedure {function_name} success!")
            return rows_affected
    except Exception as e:
        logger.error(f"Running procedure {function_name} failed!")
        logger.exception(e)
        return None
    ## no finally/close needed: engine.begin() commits and returns the
    ## connection to the pool when the with-block exits
If there are any advantages or drawbacks of using one of these methods over the other, please let me know in a comment.
I just wanted to add another piece of code, since I was trying to get callproc to work (using sqlalchemy) with multiple in- and out-params.
For this case I went with the callproc method on a raw connection [solution a) in my previous answer], since that function accepts params as a list.
It could probably be done more elegantly or more pythonically in some parts, but it was mainly about getting it to work, and I will probably turn this into a function so I can use it generically for calling an SP with multiple in and out params.
I included comments in the code below to make it easier to understand what is going on.
In my case I decided to put the out-params in a dict so I can pass it along to the calling app in case I need to react to the results. Of course you could also include the in-params which could make sense for error logging maybe.
from copy import copy
from pprint import pformat

## some in params
function_name = 'sp_upsert'
in_param1 = 'my param 1'
in_param2 = 'abcdefg'
in_param3 = 'some-name'
in_param4 = 'some display name'
in_params = [in_param1, in_param2, in_param3, in_param4]
## out params
out_params = [
    'out1_row_count'
    , 'out2_row_count'
    , 'out3_row_count'
    , 'out4_row_count_ins'
    , 'out5_row_count_upd'
]
params = copy(in_params)
## adding the out params as integers from out_params indices
params.extend([i for i, x in enumerate(out_params)])
## the params list will look like
## ['my param 1', 'abcdefg', 'some-name', 'some display name', 0, 1, 2, 3, 4]
logger.info(params)
## build query to get results from callproc (including in and out params)
res_qry_params = []
for i in range(len(params)):
    res_qry_params.append(f"@_{function_name}_{i}")
res_qry = f"SELECT {', '.join(res_qry_params)}"
## the query to fetch the results (in and out params) will look like
## SELECT @_sp_upsert_0, @_sp_upsert_1, @_sp_upsert_2, @_sp_upsert_3, @_sp_upsert_4, @_sp_upsert_5, @_sp_upsert_6, @_sp_upsert_7, @_sp_upsert_8
logger.info(res_qry)
try:
    connection = engine.raw_connection()
    ## calling the sp
    cursor = connection.cursor()
    cursor.callproc(function_name, params)
    ## get the results (includes in and out params), the 0/1 in the end are the row_counts from the sp
    ## fetchone is enough since all results come as one result record like
    ## ('my param 1', 'abcdefg', 'some-name', 'some display name', 1, 0, 1, 1, 0)
    cursor.execute(res_qry)
    results = cursor.fetchone()
    logger.info(results)
    ## adding just the out params to a dict
    res_dict = {}
    for i, element in enumerate(out_params):
        res_dict.update({
            element: results[i + len(in_params)]
        })
    ## the result dict in this case only contains the out param results and will look like
    ## { 'out1_row_count': 1,
    ##   'out2_row_count': 0,
    ##   'out3_row_count': 1,
    ##   'out4_row_count_ins': 1,
    ##   'out5_row_count_upd': 0}
    logger.info(pformat(res_dict, indent=2, sort_dicts=False))
    cursor.close()
    connection.commit()
    logger.debug(f"Running procedure {function_name} success!")
except Exception as e:
    logger.error(f"Running procedure {function_name} failed!")
    logger.exception(e)
Just to complete the picture, here is a shortened version of my stored procedure. After BEGIN I declare some error handlers and set the out params to a default of 0, since otherwise they could come back as NULL/None if the procedure never sets them (e.g. because no insert was made):
DELIMITER //
CREATE OR REPLACE PROCEDURE sp_upsert(
    IN in_param1 VARCHAR(32),
    IN in_param2 VARCHAR(250),
    IN in_param3 VARCHAR(250),
    IN in_param4 VARCHAR(250),
    OUT out1_row_count INTEGER,
    OUT out2_row_count INTEGER,
    OUT out3_row_count INTEGER,
    OUT out4_row_count_ins INTEGER,
    OUT out5_row_count_upd INTEGER
)
BEGIN
    -- declare variables, do NOT declare the out params here!
    DECLARE dummy INTEGER DEFAULT 0;
    -- declare error handlers (e.g. continue handler for not found)
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET dummy = 1;
    -- set out params defaulting to 0
    SET out1_row_count = 0;
    SET out2_row_count = 0;
    SET out3_row_count = 0;
    SET out4_row_count_ins = 0;
    SET out5_row_count_upd = 0;
    -- do inserts and updates and set the out param variables accordingly
    INSERT INTO some_table ...;
    SET out1_row_count = ROW_COUNT();
    -- commit if no errors
    COMMIT;
END //
DELIMITER ;
I have a weird behaviour with postgres + sqlalchemy.
I call a function that inserts into a table, but when it is called from sqlalchemy it rolls back at the end, while when it is called from psql it succeeds.
Logs when called via sqlalchemy:
Jan 21 13:17:28 intersec.local postgres[3466]: [18-9] STATEMENT: SELECT name, suffix
Jan 21 13:17:28 intersec.local postgres[3466]: [18-10] FROM doc_codes('195536d95bd155b9ea412154b3e920761495681a')
Jan 21 13:17:28 intersec.local postgres[3466]: [19-9] STATEMENT: ROLLBACK
Jan 21 13:17:28 intersec.local postgres[3465]: [13-9] STATEMENT: COMMIT
If using psql:
Jan 21 13:28:47 intersec.local postgres[3561]: [20-9] STATEMENT: SELECT name, suffix FROM doc_codes('195536d95bd155b9ea412154b3e920761495681a');
Note: no transaction statements at all.
This is my python code:
def getCon(self):
    conStr = "postgresql+psycopg2://%(USER)s:%(PASSWORD)s@%(HOST)s/%(NAME)s"
    config = settings.DATABASES['default']
    #print conStr % config
    con = sq.create_engine(
        conStr % config,
        echo=ECHO
    )
    event.listen(con, 'checkout', self.set_path)
    self.con = con
    self.meta.bind = con
    return con
def getDocPrefixes(self, deviceId):
    f = sq.sql.func.doc_codes(deviceId, type_=types.String)
    columns = [
        sq.Column('name', types.String),
        sq.Column('suffix', types.String)
    ]
    return [dict(x.items()) for x in self.con.execute(
        select(columns).select_from(f)
    ).fetchall()]

sync = dbSync('malab')
for k in sync.getDocPrefixes('195536d95bd155b9ea412154b3e920761495681a'):
    print k['name'], '=', k['suffix']
What could trigger the ROLLBACK?
P.S.: My DB functions:
CREATE OR REPLACE FUNCTION next_letter (
    table_name TEXT,
    OUT RETURNS TEXT
)
AS
$$
DECLARE
    result TEXT = 'A';
    nextLetter TEXT;
    num INTEGER;
BEGIN
    SELECT INTO num nextval('letters');
    nextLetter := chr(num);
    result := nextLetter;
    WHILE true LOOP
        --RAISE NOTICE '%', result;
        IF EXISTS(SELECT 1 FROM DocPrefix WHERE Name=result AND TableName=table_name) THEN
            SELECT max(SUBSTRING(name FROM '\d+'))
            FROM DocPrefix WHERE Name=result AND TableName=table_name
            INTO num;
            result := nextLetter || (coalesce(num,0) + 1);
        ELSE
            EXIT;
        END IF;
    END LOOP;
    RETURNS = result;
END;
$$
LANGUAGE 'plpgsql';
-- Returns the unique prefix for the table/device.
CREATE OR REPLACE FUNCTION prefix_fordevice (
    table_name TEXT,
    device_id TEXT,
    OUT RETURNS TEXT
)
AS
$$
DECLARE
    result TEXT = NULL;
    row RECORD;
BEGIN
    IF NOT(EXISTS(SELECT 1 FROM DocPrefix WHERE MachineId=device_id AND TableName=table_name)) THEN
        INSERT INTO DocPrefix
        (Name, MachineId, TableName)
        VALUES
        (next_letter(table_name), device_id, table_name);
    END IF;
    SELECT name FROM DocPrefix WHERE
    MachineId=device_id AND TableName=table_name
    INTO result;
    RETURNS = result;
END;
$$
LANGUAGE 'plpgsql';
-- Return the prefixes exclusive to the device ID
CREATE OR REPLACE FUNCTION doc_codes(device_id TEXT) RETURNS TABLE("name" TEXT, "suffix" TEXT) AS $$
SELECT name, prefix_fordevice(name, device_id) AS suffix FROM doccode;
$$ LANGUAGE SQL;
the antipattern here is that you're confusing a SQLAlchemy Engine for a connection, when you do something like this:
con = sq.create_engine(<url>)
result = con.execute(statement)
the Engine is associated with a connection pool as a source of connections. When you call the execute() method on Engine, it checks out a connection from the pool, runs the statement, and returns the results; when the result set is exhausted, it returns the connection to the pool. At that stage, the pool will either close the connection fully, or it will re-pool it. Storing the connection in the pool means that any remaining transactional state must be cleared (note that DBAPI connections are always in a transaction when they are used), so it emits a rollback.
Your program should create a single Engine per URL at the module level, and when it needs a connection, should call upon engine.connect().
the document Working with Engines and Connections explains all of this.
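A minimal sketch of that pattern, reusing the conStr, config, columns, types, and select names from the question (the glue code below is hypothetical, not the answerer's):
import sqlalchemy as sq

# One Engine per URL, created once at module level; it owns the pool.
engine = sq.create_engine(conStr % config)

def getDocPrefixes(deviceId):
    f = sq.sql.func.doc_codes(deviceId, type_=types.String)
    # Check out a connection explicitly and run the statement inside a
    # transaction, so the work is committed instead of rolled back when
    # the connection goes back to the pool.
    with engine.connect() as con:
        with con.begin():
            rows = con.execute(select(columns).select_from(f)).fetchall()
            return [dict(x.items()) for x in rows]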
I finally found the answer here:
Make SQLAlchemy COMMIT instead of ROLLBACK after a SELECT query
def getDocPrefixes(self, deviceId):
    f = sq.sql.func.doc_codes(deviceId, type_=types.String)
    columns = [
        sq.Column('name', types.String),
        sq.Column('suffix', types.String)
    ]
    with self.con.begin():
        return [dict(x.items()) for x in self.con.execute(
            select(columns).select_from(f)
        ).fetchall()]
The thing is, the function can insert data and also return a SELECT, so sqlalchemy thinks this is a plain SELECT when in fact the function also changes data and needs a commit.
I have the following:
ora_wet = oracle_connection()
cursor = ora_wet.cursor()
sqlQuery = u"SELECT * FROM web_cities WHERE cty_name = 'София'"
cursor.execute(sqlQuery)
sqlResult = cursor.fetchone()
When I do this I get the following error:
TypeError: expecting None or a string, raised at line 18, which is the cursor.execute(sqlQuery) call.
If I make the query non-unicode (without the u) it goes through but it returns nothing
edit: in reply to first comment:
NLS_LANGUAGE is BULGARIAN,
NLS_CHARACTERSET is CL8MSWIN1251
language is Python...
yes there is a record with cty_name = 'София'
connection is just:
def oracle_connection():
    return cx_Oracle.connect('user/pass@server')

ora_wet = oracle_connection()
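The thread ends without an accepted fix, but a hedged sketch of what is commonly tried with Python 2 cx_Oracle against a CL8MSWIN1251 database: keep the statement itself a plain byte string and pass the unicode literal as a bind variable (the :name bind and the cp1251 encoding below are assumptions, not from the original post). Setting NLS_LANG (e.g. os.environ['NLS_LANG'] = '.CL8MSWIN1251') before connecting is another commonly suggested route.
sqlQuery = "SELECT * FROM web_cities WHERE cty_name = :name"
# CL8MSWIN1251 corresponds to windows-1251, so encode the value to match
cursor.execute(sqlQuery, name=u'София'.encode('cp1251'))
sqlResult = cursor.fetchone()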