How to convert a MySQL table to the utf8 character set with Alembic? - python

My database is MySQL. I use the SQLAlchemy ORM to define and access it, and Alembic for migrations. I have a model with a field that used to contain only English text (ASCII/latin-1). Now this field needs to contain Unicode text. To make my model support Unicode on MySQL, I need to add the following class-level attribute: mysql_character_set = 'utf8'
class MyModel(Base):
    __tablename__ = 'mymodel'
    mysql_character_set = 'utf8'
    id = Column(Integer, primary_key=True)
    name = Column(String(64), unique=True, nullable=False)
So far so good. I want to add this attribute as part of an Alembic migration script. I normally use Alembic's excellent auto-generate command:
alembic revision --autogenerate
The problem is that this command doesn't capture every model change and in particular not the addition of the mysql_character_set attribute.
How do I add this attribute manually to the alembic migration script?

I did it like this:
from alembic import op
import sqlalchemy as sa
def upgrade():
    conn = op.get_bind()
    conn.execute(sa.sql.text('ALTER table my_table CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci'))

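You should use the utf8mb4 character set, because MySQL's utf8 (an alias for utf8mb3) stores at most three bytes per character and therefore cannot represent characters outside the Basic Multilingual Plane, such as most emoji.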
To change the default character set for a table and convert all character columns (CHAR, VARCHAR, TEXT) to the new character set, you can use ALTER TABLE in a migration (but see the docs for possible side effects):
from alembic import op
def upgrade():
    op.execute(
        'ALTER TABLE mytable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci'
    )
def downgrade():
    op.execute(
        'ALTER TABLE mytable CONVERT TO CHARACTER SET latin1 COLLATE latin1_swedish_ci'
    )

Just specify these parameters in your MyModel class. You'll need to create alembic migrations as well to incorporate those changes into DB.
mysql_charset='utf8mb4', mysql_collate='utf8mb4_bin'
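In a declarative model, table-level keyword options like these are normally passed through __table_args__. The following is a minimal sketch, assuming SQLAlchemy 1.4 or later and reusing the MyModel columns from the question. It only compiles the CREATE TABLE statement for the MySQL dialect, so no database connection is needed:

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.dialects import mysql
from sqlalchemy.orm import declarative_base
from sqlalchemy.schema import CreateTable

Base = declarative_base()


class MyModel(Base):
    __tablename__ = "mymodel"
    # Table-level MySQL options go into __table_args__ as a dict.
    __table_args__ = {
        "mysql_charset": "utf8mb4",
        "mysql_collate": "utf8mb4_bin",
    }

    id = Column(Integer, primary_key=True)
    name = Column(String(64), unique=True, nullable=False)


# Render the DDL that create_all() would emit against MySQL.
ddl = str(CreateTable(MyModel.__table__).compile(dialect=mysql.dialect()))
print(ddl)
```

These options only affect the CREATE TABLE statement for new tables. An existing table still needs an ALTER TABLE ... CONVERT TO CHARACTER SET migration.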

Related

How to set the start of auto increment in flask-sqlalchemy [duplicate]

The autoincrement argument in SQLAlchemy seems to accept only True and False, but I want to set a pre-defined starting value, aid = 1001, so that the next insert gets aid = 1002 via autoincrement.
In SQL, it can be changed like this:
ALTER TABLE article AUTO_INCREMENT = 1001;
I'm using MySQL and I have tried following, but it doesn't work:
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class Article(Base):
    __tablename__ = 'article'
    aid = Column(INTEGER(unsigned=True, zerofill=True),
                 autoincrement=1001, primary_key=True)
So, how can I get that? Thanks in advance!
You can achieve this by using DDLEvents. This will allow you to run additional SQL statements just after the CREATE TABLE ran. Look at the examples in the link, but I am guessing your code will look similar to below:
from sqlalchemy import event
from sqlalchemy import DDL
event.listen(
    Article.__table__,
    "after_create",
    DDL("ALTER TABLE %(table)s AUTO_INCREMENT = 1001;")
)
According to the docs:
autoincrement –
This flag may be set to False to indicate an integer primary key column that should not be considered to be the “autoincrement” column, that is the integer primary key column which generates values implicitly upon INSERT and whose value is usually returned via the DBAPI cursor.lastrowid attribute. It defaults to True to satisfy the common use case of a table with a single integer primary key column.
So, autoincrement is only a flag to let SQLAlchemy know whether it's the primary key you want to increment.
What you're trying to do is to create a custom autoincrement sequence.
So, your example, I think, should look something like:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.schema import Sequence
Base = declarative_base()
class Article(Base):
    __tablename__ = 'article'
    aid = Column(INTEGER(unsigned=True, zerofill=True),
                 Sequence('article_aid_seq', start=1001, increment=1),
                 primary_key=True)
Note, I don't know whether you're using PostgreSQL or not, so you should make note of the following if you are:
The Sequence object also implements special functionality to accommodate Postgresql’s SERIAL datatype. The SERIAL type in PG automatically generates a sequence that is used implicitly during inserts. This means that if a Table object defines a Sequence on its primary key column so that it works with Oracle and Firebird, the Sequence would get in the way of the “implicit” sequence that PG would normally use. For this use case, add the flag optional=True to the Sequence object - this indicates that the Sequence should only be used if the database provides no other option for generating primary key identifiers.
I couldn't get the other answers to work using mysql and flask-migrate so I did the following inside a migration file.
from app import db
db.engine.execute("ALTER TABLE myDB.myTable AUTO_INCREMENT = 2000;")
Be warned that if you regenerate your migration files, this will get overwritten.
I know this is an old question, but I recently had to figure this out and none of the available answers were quite what I needed. The solution I found relies on Sequence in SQLAlchemy. For whatever reason, I could not get it to work when I called the Sequence constructor within the Column constructor, as has been suggested above. As a note, I am using PostgreSQL.
For your answer I would have put it as such:
import os
from sqlalchemy import Column, Integer, Sequence, create_engine
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
def connection():
    engine = create_engine(f"postgresql://postgres:{os.getenv('PGPASSWORD')}@localhost:{os.getenv('PGPORT')}/test")
    return engine
engine = connection()
class Article(Base):
    __tablename__ = 'article'
    seq = Sequence('article_aid_seq', start=1001)
    aid = Column('aid', Integer, seq, server_default=seq.next_value(), primary_key=True)
Base.metadata.create_all(engine)
This then can be called in PostgreSQL with:
insert into article (aid) values (DEFAULT);
select * from article;
aid
------
1001
(1 row)
Hope this helps someone as it took me a while
You can do it using the mysql_auto_increment table create option. There are mysql_engine and mysql_default_charset options too, which might also be handy:
article = Table(
    'article', metadata,
    Column('aid', INTEGER(unsigned=True, zerofill=True), primary_key=True),
    mysql_engine='InnoDB',
    mysql_default_charset='utf8',
    mysql_auto_increment='1001',
)
The above will generate:
CREATE TABLE article (
    aid INTEGER UNSIGNED ZEROFILL NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (aid)
)ENGINE=InnoDB AUTO_INCREMENT=1001 DEFAULT CHARSET=utf8
If your database supports Identity columns*, the starting value can be set like this:
import sqlalchemy as sa
tbl = sa.Table(
    't10494033',
    sa.MetaData(),
    sa.Column('id', sa.Integer, sa.Identity(start=200, always=True), primary_key=True),
)
Resulting in this DDL output:
CREATE TABLE t10494033 (
    id INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 200),
    PRIMARY KEY (id)
)
Identity(..) is ignored if the backend does not support it.
* PostgreSQL 10+, Oracle 12+ and MSSQL, according to the linked documentation above.

Why does a SQLite database created in Python have the VARCHAR data type?

When I create an SQLite database from a python data model, any column defined as a String in Python is displayed as VARCHAR in SQLite (viewing with DB Browser for SQLite). Here is an example of the data model in Python:
class Users(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    role = db.Column(db.String(10))
    name_first = db.Column(db.String(50), nullable=False)
    name_last = db.Column(db.String(50), nullable=False)
This may not be relevant, but I should clarify that I'm doing this as part of a website hosted with Flask. The database is initially created by dropping to a python prompt and:
from app import db
db.create_all()
I have a basic understanding of MS SQL and SQLite datatypes (NULL,INTEGER,REAL,TEXT,BLOB), but I don't understand why I'm seeing the columns defined as Strings in Python classified as VARCHAR in DB Browser for SQLite. If I attempt to modify the table, I see all of the expected datatypes for SQLite and also VARCHAR as an option. If I create a new database/table, then VARCHAR doesn't exist as an option for datatypes. Why wouldn't these columns be displayed as TEXT datatypes?
Strings in Python classified as VARCHAR in DB Browser for SQLite.
In Flask you are actually using the SQLAlchemy ORM (through Flask-SQLAlchemy), which translates your model classes into the CREATE TABLE statements that build the corresponding tables.
SQLAlchemy renders a String(n) column as VARCHAR(n) in the generated DDL, and SQLite keeps the declared type name as written, which is what DB Browser for SQLite displays.
SQLite has no separate VARCHAR storage type: a column declared as VARCHAR gets TEXT affinity and the declared length is not enforced, so in practice VARCHAR behaves the same as TEXT.
Also, if you check section 3.1, Determination Of Column Affinity, in the documentation, the second rule states:
If the declared type of the column contains any of the strings "CHAR",
"CLOB", or "TEXT" then that column has TEXT affinity. Notice that the
type VARCHAR contains the string "CHAR" and is thus assigned TEXT
affinity.
For more info check : http://www.sqlite.org/datatype3.html
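The affinity rule quoted above can be checked with Python's built-in sqlite3 module. This is a small sketch using a throwaway in-memory table (the table and column names are made up for illustration). It stores an integer and an over-long string in a VARCHAR(10) column and asks SQLite what it actually kept:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (role VARCHAR(10), note TEXT)")

# An integer and a 25-character string both go into the VARCHAR(10) column.
con.execute("INSERT INTO users (role, note) VALUES (?, ?)", (12345, "x"))
con.execute("INSERT INTO users (role, note) VALUES (?, ?)", ("a" * 25, "y"))

# typeof() reports the storage class SQLite chose for each stored value.
rows = con.execute(
    "SELECT role, typeof(role), length(role) FROM users ORDER BY rowid"
).fetchall()
con.close()

for row in rows:
    print(row)
```

Both values come back with the storage class text, and the 25-character string is not truncated, which is the same behavior a column declared as TEXT would show.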

How to get base from existing sql DDL file?

I'm using SQLAlchemy for MySQL.
The common way to use SQLAlchemy is:
1. Define model classes that describe the table structure (class User(Base)).
2. Migrate to the database with db.create_all (or Alembic, etc.).
3. Import the model class and use it (db.session.query(User)).
But what if I want to use a raw SQL file instead of defined model classes?
I read that automap does something similar, but I want to get the mapper objects from a raw SQL file, not from an already created database.
Is there a best practice for this?
This is an example of the DDL:
-- ddl.sql
-- This is just an example, so please ignore some issues related to a grammar
CREATE TABLE `card` (
    `card_id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'card',
    `card_company_id` bigint(20) DEFAULT NULL COMMENT 'card_company_id',
    PRIMARY KEY (`card_id`),
    KEY `card_ix01` (`card_company_id`),
    KEY `card_ix02` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='card table'
And I want to do like
Base = raw_sql_base('ddl.sql') # Some kinda automap_base but from SQL file
# engine, suppose it has two tables 'user' and 'address' set up
engine = create_engine("mysql://user@localhost/program")
# reflect the tables
Base.prepare(engine)
# mapped classes are now created with names by sql file
Card = Base.classes.card
session = Session(engine)
session.add(Card(card_id=1, card_company_id=1))
session.commit() # Insert
SQLAlchemy is not an SQL parser, but the exact opposite; its reflection works against existing databases only. In other words you must execute your DDL and then use reflection / automap to create the necessary Python models:
from sqlalchemy import create_engine, text
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
# engine, suppose it has two tables 'user' and 'address' set up
engine = create_engine("mysql://user@localhost/program")
# execute the DDL in order to populate the DB (pass the file contents, not the file object)
with open('ddl.sql') as ddl, engine.begin() as conn:
    conn.execute(text(ddl.read()))
Base = automap_base()
# reflect the tables (SQLAlchemy 1.4+; older versions used Base.prepare(engine, reflect=True))
Base.prepare(autoload_with=engine)
# mapped classes are now created with names by sql file
Card = Base.classes.card
session = Session(engine)
session.add(Card(card_id=1, card_company_id=1))
session.commit() # Insert
This of course may fail, if you have already executed the same DDL against your database, so you would have to handle that case as well. Another possible caveat is that some DB-API drivers may not like executing multiple statements at a time, if your ddl.sql happens to contain more than one CREATE TABLE statement etc.
...but I want to get mapper object from raw SQL file.
Ok, in that case what you need is the aforementioned parser. A cursory search produced two candidates:
sqlparse: Generic, but the issue tracker is a testament to how nontrivial parsing SQL is. It is often confused; for example, it parses ... COMMENT 'card', `card_company_id` ... as a keyword and an identifier list, not as a keyword, a literal, punctuation, and an identifier (or even better, the column definitions as their own nodes).
mysqlparse: A MySQL specific solution, but with limited support for just about anything, and it seems abandoned.
Parsing would be just the first step, though. You'd then have to convert the resulting trees to models.

Django vs MySQL uuid

Django noob here
I have created a model using
customer_id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
After migrating the model to MySQL, I tried to add data into mysql using
insert into customer_customer (customer_id, ...) values (uuid(), ...)
The data gets inserted properly in MySQL with a unique code, however, when I try to display this via Django admin tool (this table feeds into a property for users), it throws a badly formatted uuid error.
  File "/usr/lib/python2.7/uuid.py", line 134, in __init__
    raise ValueError('badly formed hexadecimal UUID string')
ValueError: badly formed hexadecimal UUID string
Please discuss if there is another way of creating seed data directly in MySQL.
A field for storing universally unique identifiers. Uses Python’s UUID
class. When used on PostgreSQL, this stores in a uuid datatype,
otherwise in a char(32).
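So with MySQL, Django generates the UUID itself and stores it in a char(32) column as 32 hexadecimal digits without hyphens. MySQL's UUID() returns a 36-character hyphenated string, so its output cannot be stored in that column unchanged.
If you have to create the UUID on the MySQL side, use a CharField in the Django model and populate it: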
class MyModel(models.Model):
    fld = models.CharField(max_length=36)
Then when saving:
import uuid
obj = MyModel(fld=str(uuid.uuid4()))  # set the value on an instance, not on the class
obj.save()
As a default:
fld = models.CharField(max_length=36, default=uuid.uuid4)
Try this:
insert into customer_customer (customer_id, ...) values (Replace(uuid(),'-',''), ...)
then it will work.
Documentation states that if you use a MySQL database, Django will store a string (char32):
UUIDField.
A field for storing universally unique identifiers. Uses Python’s UUID class. When used on PostgreSQL, this stores in a uuid datatype, otherwise in a char(32).
Python's uuid module gives you the following options to generate UUIDs:
>>> import uuid
>>> uuid.uuid4()
UUID('bd65600d-8669-4903-8a14-af88203add38')
>>> str(uuid.uuid4())
'f50ec0b7-f960-400d-91f0-c42a6d44e3d0'
>>> uuid.uuid4().hex
'9fe2c4e93f654fdbb24c02b15259716c'
In your case (using uuid4 as the default in the Django model), you will need the uuid.uuid4().hex form in order to save the UUID as a string, just as Django would save it in your MySQL database.
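The difference between the two string forms can be reproduced with the standard uuid module alone. This sketch rests on one assumption that the question does not confirm: that MySQL was running in a non-strict SQL mode and silently truncated the 36-character output of uuid() to fit the char(32) column. The helper name to_django_hex is made up for illustration:

```python
import uuid


def to_django_hex(hyphenated: str) -> str:
    """Return the 32-character hex form that fits a char(32) column."""
    return uuid.UUID(hyphenated).hex


hyphenated = "bd65600d-8669-4903-8a14-af88203add38"  # shape of MySQL's uuid() output
hex32 = to_django_hex(hyphenated)

# Assumption: the column kept only the first 32 characters of the 36-character string.
truncated = hyphenated[:32]
try:
    uuid.UUID(truncated)
    truncated_parses = True
except ValueError:
    # Only 28 hex digits remain once the hyphens are removed.
    truncated_parses = False

print(hex32, len(hex32), truncated_parses)
```

Stripping the hyphens before the insert, as Replace(uuid(),'-','') does, produces the same 32-character form as the .hex attribute.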

Can you achieve a case insensitive 'unique' constraint in Sqlite3 (with Django)?

So let's say I'm using Python 2.5's built-in default sqlite3 and I have a Django model class with the following code:
class SomeEntity(models.Model):
some_field = models.CharField(max_length=50, db_index=True, unique=True)
I've got the admin interface setup and everything appears to be working fine except that I can create two SomeEntity records, one with some_field='some value' and one with some_field='Some Value' because the unique constraint on some_field appears to be case sensitive.
Is there some way to force sqlite to perform a case insensitive comparison when checking for uniqueness?
I can't seem to find an option for this in Django's docs and I'm wondering if there's something that I can do directly to sqlite to get it to behave the way I want. :-)
Yes, this can easily be done by adding a unique index to the table with the following command:
CREATE UNIQUE INDEX uidxName ON mytable (myfield COLLATE NOCASE)
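A quick way to see the effect of such an index is to try it on an in-memory database with the built-in sqlite3 module. The sketch below reuses the placeholder names mytable and myfield from the command above and the two values from the question:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mytable (myfield VARCHAR(50))")
con.execute("CREATE UNIQUE INDEX uidxName ON mytable (myfield COLLATE NOCASE)")

con.execute("INSERT INTO mytable (myfield) VALUES (?)", ("some value",))
try:
    # Differs from the first row only by case, so the index should reject it.
    con.execute("INSERT INTO mytable (myfield) VALUES (?)", ("Some Value",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = con.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
con.close()
print(duplicate_rejected, row_count)
```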
If you need case insensitivity for non-ASCII letters, you will need to register your own collation with commands similar to the following:
The following example shows a custom collation that sorts “the wrong way”:
import sqlite3
def collate_reverse(string1, string2):
    # cmp() was removed in Python 3; this returns -1, 0 or 1 in reverse order
    return (string1 < string2) - (string1 > string2)
con = sqlite3.connect(":memory:")
con.create_collation("reverse", collate_reverse)
cur = con.cursor()
cur.execute("create table test(x)")
cur.executemany("insert into test(x) values (?)", [("a",), ("b",)])
cur.execute("select x from test order by x collate reverse")
for row in cur:
    print(row)
con.close()
Additional python documentation for sqlite3 shown here
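Applied to the uniqueness problem, the same create_collation mechanism might look like the sketch below. It assumes Python 3 and uses str.casefold() for the comparison. The collation name unicode_nocase and the table layout are made up for illustration:

```python
import sqlite3


def collate_casefold(a: str, b: str) -> int:
    """Compare two strings case-insensitively, including non-ASCII letters."""
    a, b = a.casefold(), b.casefold()
    return (a > b) - (a < b)


con = sqlite3.connect(":memory:")
# "unicode_nocase" is a hypothetical name chosen for this sketch.
con.create_collation("unicode_nocase", collate_casefold)
con.execute("CREATE TABLE mytable (myfield TEXT)")
con.execute("CREATE UNIQUE INDEX uidx ON mytable (myfield COLLATE unicode_nocase)")

con.execute("INSERT INTO mytable (myfield) VALUES (?)", ("élan",))
try:
    # The built-in NOCASE folds only ASCII letters and would accept this row.
    con.execute("INSERT INTO mytable (myfield) VALUES (?)", ("ÉLAN",))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
con.close()
print(rejected)
```

A custom collation exists only on the connection that registered it, so every connection that reads or writes the indexed table has to register it first.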
Perhaps you can create and use a custom model field; it would be a subclass of CharField but providing a db_type method returning "text collate nocase"
For anyone in 2021, with the help of Django 4.0 UniqueConstraint expressions you could add a Meta class to your model like this:
class Meta:
    constraints = [
        models.UniqueConstraint(
            Lower('<field name>'),
            name='<constraint name>'
        ),
    ]
