Can you achieve a case insensitive 'unique' constraint in Sqlite3 (with Django)? - python

So let's say I'm using the sqlite3 module that ships with Python 2.5, and I have a Django model class with the following code:
class SomeEntity(models.Model):
    some_field = models.CharField(max_length=50, db_index=True, unique=True)
I've got the admin interface set up and everything appears to be working fine, except that I can create two SomeEntity records, one with some_field='some value' and one with some_field='Some Value', because the unique constraint on some_field appears to be case sensitive.
Is there some way to force sqlite to perform a case insensitive comparison when checking for uniqueness?
I can't seem to find an option for this in Django's docs and I'm wondering if there's something that I can do directly to sqlite to get it to behave the way I want. :-)

Yes, this can easily be done by adding a unique index to the table with the following command:
CREATE UNIQUE INDEX uidxName ON mytable (myfield COLLATE NOCASE)
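As a quick illustration, here is a minimal sketch in plain sqlite3 (no Django) showing the index doing its job; the table and index names mirror the command above:
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mytable (myfield TEXT)")
con.execute("CREATE UNIQUE INDEX uidxName ON mytable (myfield COLLATE NOCASE)")
con.execute("INSERT INTO mytable VALUES ('some value')")
try:
    # differs only by case, so the NOCASE unique index rejects it
    con.execute("INSERT INTO mytable VALUES ('Some Value')")
except sqlite3.IntegrityError as e:
    print(e)
con.close()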
If you need case insensitivity for non-ASCII letters, you will need to register your own collation. The following example shows a custom collation that sorts "the wrong way":
import sqlite3

def collate_reverse(string1, string2):
    return -cmp(string1, string2)

con = sqlite3.connect(":memory:")
con.create_collation("reverse", collate_reverse)
cur = con.cursor()
cur.execute("create table test(x)")
cur.executemany("insert into test(x) values (?)", [("a",), ("b",)])
cur.execute("select x from test order by x collate reverse")
for row in cur:
    print row
con.close()
Additional Python documentation for the sqlite3 module is shown here.
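For completeness, a minimal sketch of a genuinely case-insensitive collation for non-ASCII letters; note this assumes Python 3, where str.casefold() is available (unlike the Python 2.5 setup in the question):
import sqlite3

def collate_unicode_nocase(s1, s2):
    # casefold() handles non-ASCII case mappings, e.g. German 'ß' -> 'ss'
    s1, s2 = s1.casefold(), s2.casefold()
    return (s1 > s2) - (s1 < s2)

con = sqlite3.connect(":memory:")
con.create_collation("unicode_nocase", collate_unicode_nocase)
con.execute("CREATE TABLE t (name TEXT)")
con.execute("CREATE UNIQUE INDEX uidx_t ON t (name COLLATE unicode_nocase)")
con.execute("INSERT INTO t VALUES ('straße')")
# inserting 'STRASSE' now raises sqlite3.IntegrityError
con.close()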

Perhaps you can create and use a custom model field; it would be a subclass of CharField but provide a db_type method returning "text collate nocase".
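A rough sketch of that idea (CICharField is a made-up name, only the SQLite backend is handled, and it assumes a modern Django where db_type receives a connection argument):
from django.db import models

class CICharField(models.CharField):
    # hypothetical custom field: case-insensitive on SQLite only
    def db_type(self, connection):
        if connection.vendor == 'sqlite':
            return 'varchar(%s) COLLATE NOCASE' % self.max_length
        return super().db_type(connection)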

For anyone in 2021: with the help of Django 4.0's UniqueConstraint expressions, you can add a Meta class to your model like this (Lower comes from django.db.models.functions):
class Meta:
    constraints = [
        models.UniqueConstraint(
            Lower('<field name>'),
            name='<constraint name>',
        ),
    ]

Related

How to get a Base from an existing SQL DDL file?

I'm using SQLAlchemy for MySQL.
The common pattern with SQLAlchemy is:
Define model classes by the table structure. (class User(Base))
Migrate to the database with db.create_all (or alembic, etc.)
Import the model class and use it. (db.session.query(User))
But what if I want to use a raw SQL file instead of defined model classes?
I did read that automap does something similar, but I want to get the mapper objects from a raw SQL file, not from an already-created database.
Is there any best practice for doing this?
This is an example of the DDL:
-- ddl.sql
-- This is just an example, so please ignore some grammar issues
CREATE TABLE `card` (
    `card_id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'card',
    `card_company_id` bigint(20) DEFAULT NULL COMMENT 'card_company_id',
    PRIMARY KEY (`card_id`),
    KEY `card_ix01` (`card_company_id`),
    KEY `card_ix02` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='card table'
And I want to do something like:
Base = raw_sql_base('ddl.sql')  # some kind of automap_base, but from an SQL file
# engine, suppose it has two tables 'user' and 'address' set up
engine = create_engine("mysql://user@localhost/program")
# reflect the tables
Base.prepare(engine)
# mapped classes are now created, named after the tables in the SQL file
Card = Base.classes.card
session = Session(engine)
session.add(Card(card_id=1, card_company_id=1))
session.commit()  # Insert
SQLAlchemy is not an SQL parser, but the exact opposite; its reflection works against existing databases only. In other words you must execute your DDL and then use reflection / automap to create the necessary Python models:
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

# engine, suppose it has two tables 'user' and 'address' set up
engine = create_engine("mysql://user@localhost/program")
# execute the DDL in order to populate the DB
with open('ddl.sql') as ddl:
    engine.execute(ddl.read())
Base = automap_base()
# reflect the tables
Base.prepare(engine, reflect=True)
# mapped classes are now created, named after the tables in the SQL file
Card = Base.classes.card
session = Session(engine)
session.add(Card(card_id=1, card_company_id=1))
session.commit()  # Insert
This of course may fail if you have already executed the same DDL against your database, so you would have to handle that case as well. Another possible caveat is that some DB-API drivers may not like executing multiple statements at a time, if your ddl.sql happens to contain more than one CREATE TABLE statement, etc.
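For that last caveat, a naive sketch that feeds the driver one statement at a time; splitting on ';' will break on semicolons inside string literals, so treat it as a starting point only:
with open('ddl.sql') as ddl:
    for statement in ddl.read().split(';'):
        if statement.strip():
            engine.execute(statement)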
...but I want to get mapper object from raw SQL file.
Ok, in that case what you need is the aforementioned parser. A cursory search produced two candidates:
sqlparse: Generic, but the issue tracker is a testament to how nontrivial parsing SQL is. It is often confused; for example, it parses ... COMMENT 'card', `card_company_id` ... as a keyword and an identifier list, not as a keyword, a literal, punctuation, and an identifier (or, even better, the column definitions as their own nodes).
mysqlparse: A MySQL-specific solution, but with limited support for just about anything, and it seems abandoned.
Parsing would be just the first step, though. You'd then have to convert the resulting trees to models.
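If you do go down that road, a minimal sketch of the first step with sqlparse; everything past get_type(), i.e. recovering table and column definitions and emitting models, is the nontrivial part:
import sqlparse

with open('ddl.sql') as f:
    statements = sqlparse.parse(f.read())

for stmt in statements:
    print(stmt.get_type())  # e.g. 'CREATE'
    # walking stmt.tokens to recover the table name and column
    # definitions is where the caveats above start to bite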

How to convert a MySQL table to the utf8 character set with Alembic?

My database is MySQL. I use the SQLAlchemy ORM to define and access it, and Alembic for migrations. I have a model with a field that used to contain just English text (ASCII/latin-1). Now this field needs to contain Unicode text. In order to convert my model to support Unicode for MySQL, I need to add the following class-level attribute: mysql_character_set = 'utf8'
class MyModel(Base):
    __tablename__ = 'mymodel'
    mysql_character_set = 'utf8'
    id = Column(Integer, primary_key=True)
    name = Column(String(64), unique=True, nullable=False)
So far so good. I want to add this attribute as part of an Alembic migration script. I normally use Alembic's excellent auto-generate command:
alembic revision --autogenerate
The problem is that this command doesn't capture every model change, and in particular not the addition of the mysql_character_set attribute.
How do I add this attribute manually to the Alembic migration script?
I did it like this:
from alembic import op
import sqlalchemy as sa

def upgrade():
    conn = op.get_bind()
    conn.execute(sa.sql.text(
        'ALTER TABLE my_table CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci'))
You should use the utf8mb4 character set, as utf8 (aka utf8mb3) is broken.
To change the default character set for a table and convert all character columns (CHAR, VARCHAR, TEXT) to the new character set, you can use ALTER TABLE in a migration (but see the docs for possible side effects):
from alembic import op

def upgrade():
    op.execute(
        'ALTER TABLE mytable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci'
    )

def downgrade():
    op.execute(
        'ALTER TABLE mytable CONVERT TO CHARACTER SET latin1 COLLATE latin1_swedish_ci'
    )
Just specify these parameters in your MyModel class: mysql_charset='utf8mb4', mysql_collate='utf8mb4_bin'. You'll need to create Alembic migrations as well to incorporate those changes into the DB.
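In a declarative model these dialect-specific parameters are passed through __table_args__, e.g.:
from sqlalchemy import Column, Integer, String

class MyModel(Base):
    __tablename__ = 'mymodel'
    __table_args__ = {'mysql_charset': 'utf8mb4', 'mysql_collate': 'utf8mb4_bin'}
    id = Column(Integer, primary_key=True)
    name = Column(String(64), unique=True, nullable=False)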

Auto_increment custom Primary Key in Peewee model

I want the primary key id field to be a BIGINT:
class Tweets(Model):
    id = BigIntegerField(primary_key=True)
    ...
But it needs to be auto-incremented and I can't find a way in the Peewee docs.
Please suggest if it's possible.
Update: I'm using a MySQL db.
Peewee automatically generates an integer id column serving as the primary key, with the auto_increment property. This is true for any table you create with Peewee.
It is very likely that IntegerField is enough for your needs; BigIntegerField is very rarely useful. Will you really need numbers bigger than 2147483647? Will you insert more than two billion rows?
See: http://dev.mysql.com/doc/refman/5.5/en/integer-types.html
Peewee, as of 3.1, includes a BigAutoField, which is an auto-incrementing integer field using 64-bit integer storage. It should do the trick:
http://docs.peewee-orm.com/en/latest/peewee/api.html#BigAutoField
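A minimal sketch (the database name is made up; an AutoField subclass is implicitly the primary key):
import peewee

db = peewee.MySQLDatabase('tweets_db')  # hypothetical database

class Tweets(peewee.Model):
    id = peewee.BigAutoField()  # BIGINT AUTO_INCREMENT PRIMARY KEY

    class Meta:
        database = db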
I think the most convenient answer is using SQL constraints:
import peewee

class MyModel(peewee.Model):
    id = peewee.BigIntegerField(primary_key=True, unique=True,
                                constraints=[peewee.SQL('AUTO_INCREMENT')])
Looks like this should help.
After creating the table, do:
db.register_fields({'primary_key': 'BIGINT AUTOINCREMENT'})
After that, when you say:
class Tweets(Model):
    id = PrimaryKeyField()
    ...

    class Meta:
        database = db
then in MySQL that field will appear as BIGINT with auto_increment.

Creating a case insensitive SQLAlchemy query for MS-SQL

I'm currently trying to transfer a program designed for a MySQL database onto an MS-SQL database, and I've run into some trouble. I discovered that MySQL is not case sensitive by default, whereas MS-SQL is. This has led to some problems with code similar to that listed below.
class Employee(Base):
    __tablename__ = "Employees"
    Id = Column(Integer(unsigned=True),
                primary_key=True, nullable=False, unique=True)
    DisplayName = Column(String(64),
                         nullable=False)
    # more columns

def get_employees(sql_session, param, columns=None, partial_match=True):
    if not columns:
        columns = [Employee.Id, Employee.DisplayName]
    clauses = []
    if partial_match:
        clauses.append(Employee.DisplayName.startswith(param))
    whereclause = and_(*clauses)
    stmt = select(columns, whereclause)
    return sql_session.execute(stmt)
I know of the SQL keyword COLLATE but I'm not sure how to implement that, or if it's even the best option to use in this situation. What recommendations would you give to create a case insensitive LIKE query using SQLAlchemy?
Python 2.7.7
SQLAlchemy 0.7.7
That's a bit odd; in my experience MS SQL Server is case insensitive by default, although you can optionally set it to case sensitive using the database's collation setting.
You can use COLLATE with SQLAlchemy (see here), so you should be able to do this (I have not tried it myself):
clauses.append(Employee.DisplayName.startswith(collate(param, 'SQL_Latin1_General_CP1_CI_AS')))
SQL Server also supports regex-like pattern matching in LIKE queries, so alternatively you could make use of this in your param value, e.g. '[vV]alue%'.
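Put together, the COLLATE suggestion might look roughly like this (an untested sketch; get_employees_ci is an illustrative name, and the collation shown is SQL Server's common default, which may differ on your server):
from sqlalchemy import collate, select

def get_employees_ci(sql_session, param):
    # force a case-insensitive comparison by collating the parameter
    clause = Employee.DisplayName.startswith(
        collate(param, 'SQL_Latin1_General_CP1_CI_AS'))
    stmt = select([Employee.Id, Employee.DisplayName], clause)
    return sql_session.execute(stmt)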

Is there a standard way to store a database schema outside a Python app?

I am working on a small database application in Python (currently targeting 2.5 and 2.6) using sqlite3.
It would be helpful to be able to provide a series of functions that could set up the database and validate that it matches the current schema. Before I reinvent the wheel, I thought I'd look around for libraries that provide something similar. I'd love to have something akin to RoR's migrations. xml2ddl doesn't appear to be meant as a library (although it could be used that way), and more importantly doesn't support sqlite3. I'm also worried about the need to move to Python 3 one day, given the lack of recent attention to xml2ddl.
Are there other tools around that people are using to handle this?
You can find the schema of a sqlite3 table this way:
import sqlite3
db = sqlite3.connect(':memory:')
c = db.cursor()
c.execute('create table foo (bar integer, baz timestamp)')
c.execute("select sql from sqlite_master where type = 'table' and name = 'foo'")
r = c.fetchone()
print(r)
# (u'CREATE TABLE foo (bar integer, baz timestamp)',)
Take a look at SQLAlchemy migrate. I see no problem using it as a migration tool only, but comparing the configuration to the current database state is still experimental.
I use this to keep schemas in sync.
Keep in mind that it adds a metadata table to keep track of the versions.
South is the closest thing I know to RoR migrations. But just as you need Rails for those migrations, you need Django to use South.
Not sure if it is standard, but I just saved all my schema queries in a txt file like so (tables_creation.txt):
CREATE TABLE "Jobs" (
"Salary" TEXT,
"NumEmployees" TEXT,
"Location" TEXT,
"Description" TEXT,
"AppSubmitted" INTEGER,
"JobID" INTEGER NOT NULL UNIQUE,
PRIMARY KEY("JobID")
);
CREATE TABLE "Questions" (
"Question" TEXT NOT NULL,
"QuestionID" INTEGER NOT NULL UNIQUE,
PRIMARY KEY("QuestionID" AUTOINCREMENT)
);
CREATE TABLE "FreeResponseQuestions" (
"Answer" TEXT,
"FreeResponseQuestionID" INTEGER NOT NULL UNIQUE,
PRIMARY KEY("FreeResponseQuestionID"),
FOREIGN KEY("FreeResponseQuestionID") REFERENCES "Questions"("QuestionID")
);
...
Then I used this function, taking advantage of the fact that I delimited each query with two newline characters:
def create_db_schema(self):
    with open("./tables_creation.txt", "r") as db_schema:
        sql_qs = db_schema.read().split('\n\n')
    c = self.conn.cursor()
    for sql_q in sql_qs:
        c.execute(sql_q)
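As an aside (not part of the original answer), sqlite3 can also run a whole multi-statement script in one call, which avoids the manual splitting:
import sqlite3

conn = sqlite3.connect('app.db')  # hypothetical database file
with open('tables_creation.txt') as f:
    conn.executescript(f.read())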
