I've got a program which has a database with a certain schema, v0.1.0.
In my next version (v0.1.1) I've made changes to the database schema.
So when I update to (0.1.1) I would like those changes to take effect without effecting the original data in (0.1.0) and subsequent versions.
How do I go about making the changes without affecting (0.1.0) database data and keeping track of those changes in subsequent versions?
I'm using python with sqlite3.
Update
There is no support for multiple versions of the software. The database is dependent on the verion that you are using.
The is no concurrent access to the database, 1 database per version.
So the user can use the old version but when they upgrade to the new version the .sqlite schema will be changed.
Track the schema version in the database with the user_version PRAGMA, and keep a sequence of upgrade steps:
def get_schema_version(conn):
cursor = conn.execute('PRAGMA user_version')
return cursor.fetchone()[0]
def set_schema_version(conn, version):
conn.execute('PRAGMA user_version={:d}'.format(version))
def initial_schema(conn):
# install initial schema
# ...
def ugrade_1_2(conn):
# modify schema, alter data, etc.
# map target schema version to upgrade step from previous version
upgrade_steps = {
1: initial_schema,
2: ugrade_1_2,
}
def check_upgrade(conn):
current = get_schema_version(conn)
target = max(upgrade_steps)
for step in range(current + 1, target + 1):
if step in upgrade_steps:
upgrade_steps[step](conn)
set_schema_version(conn, step)
There are a few ways to do this, I will only mention one.
It seems like you already track the version within the database. On your application start, you will want to check this version against the running application's version and run any sql scripts that will perform the schema changes.
update
An example of this in action:
import os
import sqlite3 as sqlite
def make_movie_table(cursor):
cursor.execute('CREATE TABLE movies(id INTEGER PRIMARY KEY, title VARCHAR(20), link VARCHAR(20))')
def make_series_table(cursor):
cursor.execute('CREATE TABLE series(title VARCHAR(30) PRIMARY KEY,series_link VARCHAR(60),number_of_episodes INTEGER,number_of_seasons INTEGER)')
def make_episode_table(cursor):
cursor.execute('CREATE TABLE episodes(id INTEGER PRIMARY KEY,title VARCHAR(30),episode_name VARCHAR(15), episode_link VARCHAR(40), Date TIMESTAMP, FOREIGN KEY (title) REFERENCES series(title) ON DELETE CASCADE)')
def make_version_table(cursor):
cursor.execute('CREATE TABLE schema_versions(version VARCHAR(6))')
cursor.execute('insert into schema_versions(version) values ("0.1.0")')
def create_database(sqlite_file):
if not os.path.exists(sqlite_file):
connection = sqlite.connect(sqlite_file)
cursor = connection.cursor()
cursor.execute("PRAGMA foreign_keys = ON")
make_movie_table(cursor)
make_series_table(cursor)
make_episode_table(cursor)
make_version_table(cursor)
connection.commit()
connection.close()
def upgrade_database(sqlite_file):
if os.path.exists(sqlite_file):
connection = sqlite.connect(sqlite_file)
cursor = connection.cursor()
cursor.execute("select max(version) from schema_versions")
row = cursor.fetchone()
database_version = row[0]
print('current version is %s' % database_version)
if database_version == '0.1.0':
print('upgrading version to 0.1.1')
cursor.execute('alter table series ADD COLUMN new_column1 VARCHAR(10)')
cursor.execute('alter table series ADD COLUMN new_column2 INTEGER')
cursor.execute('insert into schema_versions(version) values ("0.1.1")')
#if database_version == '0.1.1':
#print('upgrading version to 0.1.2')
#etc cetera
connection.commit()
connection.close()
#I need to add 2 columns to the series table, when the user upgrade the software.
if __name__ == '__main__':
create_database('/tmp/db.sqlite')
upgrade_database('/tmp/db.sqlite')
Each upgrade script will take care of making your database changes, and then update the version inside the DB to the latest version. Note that we do not use elif statements, this is so that you can upgrade a database over multiple versions if it needs to.
There are some caveats to watch out for:
run upgrades inside transactions, you will want to rollback on any errors to avoid leaving the database in an unusable state. -- update this is incorrect as pointed out, thanks Martijn!
avoid renames and column deletes if you can, and if you must, ensure any views using them are updated too.
Long term you're going to be better off using an ORM such as SQLAlchemy with a migration tool like Alembic.
Related
I'm using sqlalchemy in combination with sqlite and the databases library and I'm trying to wrap my head around what that combination returns when doing update queries. I'm running a testcase and I have sqlalchemy set up to roll back upon execution of each testcase via force_rollback=True.
db = databases.Database(DB_URL, force_rollback=True)
query = update(my_table).where(my_table.columns.id == some_id_to_update).values(**values)
res = await db.execute(query)
When working with psql, I'd expect res to be the number of rows that were affected by the UPDATE query, but from reading the documentation, sqlite seems to behave differently in that it doesn't return anything. I tested this manually by connecting to the database via sqlite3 and as expected, there is no return when doing UPDATE queries. sqlalchemy however does return something, which I assume is the number of total rows in the table, but I'm not sure. Can anybody shed some light into what is actually returned?
What's more, when I tried to get the number of rows affected by the UPDATE query via SELECT changes(), I'm also getting the number of total rows in the table and not the rows affected by the most recent query. Do I have a misunderstanding of what changes() does?
"The changes() function returns the number of database rows that were changed or inserted or deleted by the most recently completed INSERT, DELETE, or UPDATE statement, exclusive of statements in lower-level triggers."
When you use the Python sqlite3 module, you use .executeXXX interfaces to evaluate/prepare your query. If the query is supposed to modify the database, it does it at this stage. You have to use the same interface to prepare a SELECT statement. In either case, the .executeXXX interfaces never return anything. To get the result of a SELECT query, you have to use a .fetchXXX interface after running .executeXXX.
To get the number of changed rows after INSERT, DELETE, or UPDATE statement via sqlite3, you can also take the difference in con.total_changes before/after running .executeXXX.
Background
I maintain a Python application that automatically applies SQL schema migrations (adding/removing tables and columns, adjusting the data, etc) to our database (SQL2016). Each migration is executed via PyODBC within a transaction so that it can be rolled back if something goes wrong. Sometimes a migration requires one or more batch statements (GO) to execute correctly. Since GO is not actually a T-SQL command but rather a special keyword in SSMS, I've been splitting each SQL migration on GO and executing each SQL fragment separately within the same transaction.
import pyodbc
import re
conn_args = {
'driver': '{ODBC Driver 17 for SQL Server}',
'hostname': 'MyServer',
'port': 1298,
'server': r'MyServer\MyInstance',
'database': 'MyDatabase',
'user': 'MyUser',
'password': '********',
'autocommit': False,
}
connection = pyodbc.connect(**conn_args)
cursor = connection.cursor()
sql = '''
ALTER TABLE MyTable ADD NewForeignKeyID INT NULL FOREIGN KEY REFERENCES MyParentTable(ID)
GO
UPDATE MyTable
SET NewForeignKeyID = 1
'''
sql_fragments = re.split(r'^\s*GO;?\s*$', sql, flags=re.IGNORECASE|re.MULTILINE)
for sql_frag in sql_fragments:
cursor.execute(sql_frag)
# Wait for the command to complete. This is necessary for some database system commands
# (backup, restore, etc). Probably not necessary for schema migrations, but included
# for completeness.
while cursor.nextset():
pass
connection.commit()
Problem
SQL statement batches aren't being executed like I expected. When the above schema migration is executed in SSMS, it succeeds. When executed in Python, the first batch (adding the foreign key) executes just fine, but the second batch (setting the foreign key value) fails because it isn't aware of the new foreign key.
('42S22', "[42S22] [FreeTDS][SQL Server]Invalid column name 'NewForeignKeyID'. (207) (SQLExecDirectW)")
Goal
Execute a hierarchy of SQL statement batches (i.e. where each statement batch depends upon the previous batch) within a single transaction in PyODBC.
What I've Tried
Searching the PyODBC documentation for information on how PyODBC supports or doesn't support batch statements / the GO command. No references found.
Searching StackOverflow & Google for how to batch statements within PyODBC.
Introducing a small sleep between SQL fragment executions just in case there's some sort of race condition. Seemed unlikely to be a solution, and didn't change the behavior.
I've considered separating each batch of statements out into a separate transaction that is committed before the next batch is executed, but that would reduce/eliminate our ability to automatically roll back a schema migration that fails.
EDIT: I just found this question, which is pretty much exactly what I want to do. However, upon testing (in SSMS) the answer that recommends using EXEC I discovered that the second EXEC command (setting the value) fails because it isn't aware of the new foreign key. I'm bad at testing and it actually does succeed. This solution might work but isn't ideal since EXEC isn't compatible with parameters. Also, this won't work if variables are used across fragments.
BEGIN TRAN
EXEC('ALTER TABLE MyTable ADD NewForeignKeyID INT NULL FOREIGN KEY REFERENCES MyParentTable(ID)')
EXEC('UPDATE MyTable SET NewForeignKeyID = 1')
ROLLBACK TRAN
Invalid column name 'FK_TestID'.
If you are reading the SQL statements from a text file (such as one produced by scripting objects in SSMS) then you could just use Python's subprocess module to run the sqlcmd utility with that file as the input (-i). In its simplest form that would look like
server = "localhost"
port = 49242
uid = "scott"
pwd = "tiger^5HHH"
database = "myDb"
script_file = r"C:\__tmp\batch_test.sql"
"""contents of the above file:
DROP TABLE IF EXISTS so69020084;
CREATE TABLE so69020084 (src varchar(10), var_value varchar(10));
INSERT INTO so69020084 (src, var_value) VALUES ('1st batch', 'foo');
GO
INSERT INTO so69020084 (src, var_value) VALUES ('2nd batch', 'bar');
GO
"""
import subprocess
cmd = [
"sqlcmd",
"-S", f"{server},{port}",
"-U", uid,
"-P", pwd,
"-d", database,
"-i", script_file,
]
subprocess.run(cmd)
I have a stored procedure.
calling it via MySQL workbench as follows working;
CALL `lobdcapi`.`escalatelobalarm`('A0001');
But not from the python program. (means it is not throwing any exception, process finish execution silently) if I make any error in column names, then at python I get an error. So it calls my stored procedure but not working as expected. (it is an update query .it needs SAFE update )
Why through the python sqlalchemy this update didn't update any records?
CREATE DEFINER=`lob`#`%` PROCEDURE `escalatelobalarm`(IN client_id varchar(50))
BEGIN
SET SQL_SAFE_UPDATES = 0;
update lobdcapi.alarms
set lobalarmescalated=1
where id in (
SELECT al.id
from (select id,alarmoccurredhistoryid from lobdcapi.alarms where lobalarmpriorityid=1 and lobalarmescalated=0 and clientid=client_id
and alarmstatenumber='02' ) as al
inner join lobdcapi.`alarmhistory` as hi on hi.id=al.alarmoccurredhistoryid
and hi.datetimestamp<= current_timestamp() )
);
SET SQL_SAFE_UPDATES = 1;
END
I call it like;
from sqlalchemy import and_, func,text
db.session.execute(text("CALL escalatelobalarm(:param)"), {'param': clientid})
I suspect the param I pass via code didn't get bind properly?
I haven't called stored procs from SQLAlchemy, but it seems possible that this could be within a transaction because you're using the session. Perhaps calling db.session.commit() at the end would help?
If that fails, SQLAlchemy calls out calling stored procs here. Perhaps try their method of using callproc. Adapting to your use-case, something like:
connection = db.session.connection()
try:
cursor = connection.cursor()
cursor.callproc("escalatelobalarm", [clientid])
results = list(cursor.fetchall())
cursor.close()
connection.commit()
finally:
connection.close()
I can't seem to correctly connect and pull from a test postgreSQL database in python. I installed PostgreSQL using Homebrew. Here's how I have been accessing the database table and value from the terminal:
xxx-macbook:~ xxx$ psql
psql (9.4.0)
Type "help" for help.
xxx=# \dn
List of schemas
Name | Owner
--------+---------
public | xxx
(1 row)
xxx=# \connect postgres
You are now connected to database "postgres" as user "xxx".
postgres=# SELECT * from test.test;
coltest
-----------
It works!
(1 row)
But when trying to access it from python, using the code below, it doesn't work. Any suggestions?
########################################################################################
# Importing variables from PostgreSQL database via SQL commands
db_conn = psycopg2.connect(database='postgres',
user='xxx')
cursor = db_conn.cursor()
#querying the database
result = cursor.execute("""
Select * From test.test
""")
print "Result: ", result
>>> Result: None
It should say: Result: It works!
You need to fetch the results.
From the docs:
The [execute()-]method returns None. If a query was executed, the returned values can be retrieved using fetch*() methods.
Example:
result = cursor.fetchall()
For reference:
http://initd.org/psycopg/docs/cursor.html#execute
http://initd.org/psycopg/docs/cursor.html#fetch
Note that (unlike psql) psycopg2 wraps anything in transactions. So if you intend to issue persistent changes to the database (INSERT, UPDATE, DELETE, ...) you need to commit them explicitly. Otherwise changes will be rolled back automatically when the connection object is destroyed. Read more on that topic here:
http://initd.org/psycopg/docs/usage.html
http://initd.org/psycopg/docs/usage.html#transactions-control
I'm having some trouble updating a row in a MySQL database. Here is the code I'm trying to run:
import MySQLdb
conn=MySQLdb.connect(host="localhost", user="root", passwd="pass", db="dbname")
cursor=conn.cursor()
cursor.execute("UPDATE compinfo SET Co_num=4 WHERE ID=100")
cursor.execute("SELECT Co_num FROM compinfo WHERE ID=100")
results = cursor.fetchall()
for row in results:
print row[0]
print "Number of rows updated: %d" % cursor.rowcount
cursor.close()
conn.close()
The output I get when I run this program is:
4Number of rows updated: 1
It seems like it's working but if I query the database from the MySQL command line interface (CLI) I find that it was not updated at all. However, if from the CLI I enter UPDATE compinfo SET Co_num=4 WHERE ID=100; the database is updated as expected.
What is my problem? I'm running Python 2.5.2 with MySQL 5.1.30 on a Windows box.
I am not certain, but I am going to guess you are using a INNODB table, and you haven't done a commit. I believe MySQLdb enable transactions automatically.
Call conn.commit() before calling close.
From the FAQ: Starting with 1.2.0, MySQLdb disables autocommit by default
MySQLdb has autocommit off by default, which may be confusing at first. Your connection exists in its own transaction and you will not be able to see the changes you make from other connections until you commit that transaction.
You can either do conn.commit() after the update statement as others have pointed out, or disable this functionality altogether by setting conn.autocommit(True) right after you create the connection object.
You need to commit changes manually or turn auto-commit on.
The reason SELECT returns the modified (but not persisted) data is because the connection is still in the same transaction.
I've found that Python's connector automatically turns autocommit off, and there doesn't appear to be any way to change this behaviour. Of course you can turn it back on, but then looking at the query logs, it stupidly does two pointless queries after connect to turn autocommit off then back on.
Connector/Python Connection Arguments
Turning on autocommit can be done directly when you connect to a database:
import mysql.connector as db
conn = db.connect(host="localhost", user="root", passwd="pass", db="dbname", autocommit=True)
MySQLConnection.autocommit Property
Or separately:
import MySQLdb
conn = MySQLdb.connect(host="localhost", user="root", passwd="pass", db="dbname")
cursor = conn.cursor()
conn.get_autocommit() # will return **False**
conn.autocommit(True) # will make it True
conn.get_autocommit() # Should return **True** now
cursor = conn.cursor()
Explicitly committing the changes is done with
conn.commit()
I have to execute SET autocommit=true on my mysqlWorkbench app script