I want to get an object from the database if it already exists (based on provided parameters) or create it if it does not.
Django's get_or_create (or source) does this. Is there an equivalent shortcut in SQLAlchemy?
I'm currently writing it out explicitly like this:
def get_or_create_instrument(session, serial_number):
    instrument = session.query(Instrument).filter_by(serial_number=serial_number).first()
    if instrument:
        return instrument
    else:
        instrument = Instrument(serial_number)
        session.add(instrument)
        return instrument
Following the solution of @WoLpH, this is the code that worked for me (simple version):
def get_or_create(session, model, **kwargs):
    instance = session.query(model).filter_by(**kwargs).first()
    if instance:
        return instance
    else:
        instance = model(**kwargs)
        session.add(instance)
        session.commit()
        return instance
With this, I'm able to get_or_create any object of my model.
Suppose my model object is:
class Country(Base):
    __tablename__ = 'countries'
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True)
To get or create my object, I write:
myCountry = get_or_create(session, Country, name=countryName)
That's basically the way to do it; there is no shortcut readily available, AFAIK.
You could generalize it, of course:
from sqlalchemy.sql.expression import ClauseElement
def get_or_create(session, model, defaults=None, **kwargs):
    instance = session.query(model).filter_by(**kwargs).one_or_none()
    if instance:
        return instance, False
    else:
        params = {k: v for k, v in kwargs.items() if not isinstance(v, ClauseElement)}
        params.update(defaults or {})
        instance = model(**params)
        try:
            session.add(instance)
            session.commit()
        # The actual exception depends on the specific database, so we catch all
        # exceptions. This is similar to the official documentation:
        # https://docs.sqlalchemy.org/en/latest/orm/session_transaction.html
        except Exception:
            session.rollback()
            instance = session.query(model).filter_by(**kwargs).one()
            return instance, False
        else:
            return instance, True
2020 update (Python 3.9+ ONLY)
Here is a cleaner version using Python 3.9's new dict union operator (|):
def get_or_create(session, model, defaults=None, **kwargs):
    instance = session.query(model).filter_by(**kwargs).one_or_none()
    if instance:
        return instance, False
    else:
        # Merge into a new dict so that kwargs still holds only the lookup
        # fields when the query is retried below.
        params = kwargs | (defaults or {})
        instance = model(**params)
        try:
            session.add(instance)
            session.commit()
        # The actual exception depends on the specific database, so we catch all
        # exceptions. This is similar to the official documentation:
        # https://docs.sqlalchemy.org/en/latest/orm/session_transaction.html
        except Exception:
            session.rollback()
            instance = session.query(model).filter_by(**kwargs).one()
            return instance, False
        else:
            return instance, True
Note:
Similar to the Django version this will catch duplicate key constraints and similar errors. If your get or create is not guaranteed to return a single result it can still result in race conditions.
To alleviate some of that issue you would need to add another one_or_none() style fetch right after the session.commit(). This still is no 100% guarantee against race conditions unless you also use a with_for_update() or serializable transaction mode.
I've been playing with this problem and have ended up with a fairly robust solution:
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm.exc import NoResultFound
def get_one_or_create(session,
                      model,
                      create_method='',
                      create_method_kwargs=None,
                      **kwargs):
    try:
        return session.query(model).filter_by(**kwargs).one(), False
    except NoResultFound:
        kwargs.update(create_method_kwargs or {})
        created = getattr(model, create_method, model)(**kwargs)
        try:
            session.add(created)
            session.flush()
            return created, True
        except IntegrityError:
            session.rollback()
            return session.query(model).filter_by(**kwargs).one(), False
I just wrote a fairly expansive blog post on all the details, but here are a few quick ideas on why I used this.
It unpacks to a tuple that tells you if the object existed or not. This can often be useful in your workflow.
The function gives the ability to work with @classmethod-decorated creator functions (and attributes specific to them).
The solution protects against race conditions when you have more than one process connected to the datastore.
EDIT: I've changed session.commit() to session.flush() as explained in this blog post. Note that these decisions are specific to the datastore used (Postgres in this case).
EDIT 2: I've updated the function to avoid using {} as a default value, as this is a typical Python gotcha. Thanks for the comment, Nigel! If you're curious about this gotcha, check out this StackOverflow question and this blog post.
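For readers unfamiliar with the gotcha mentioned in EDIT 2, here is a minimal, database-free sketch of why a mutable default such as {} or [] is risky and why None is used as the sentinel instead. The function names append_bad and append_good are made up for this illustration.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list object.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Using None as the sentinel gives each call its own fresh list.
    bucket = [] if bucket is None else bucket
    bucket.append(item)
    return bucket


first_bad = append_bad(1)
second_bad = append_bad(2)    # same list object as first_bad
first_good = append_good(1)
second_good = append_good(2)  # independent list
```

This is the same reason create_method_kwargs defaults to None and is replaced with {} inside the function body.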
A modified version of erik's excellent answer:
def get_one_or_create(session,
                      model,
                      create_method='',
                      create_method_kwargs=None,
                      **kwargs):
    # Note: in this version the second tuple element is True when the
    # object already existed and False when it was newly created.
    try:
        return session.query(model).filter_by(**kwargs).one(), True
    except NoResultFound:
        kwargs.update(create_method_kwargs or {})
        try:
            with session.begin_nested():
                created = getattr(model, create_method, model)(**kwargs)
                session.add(created)
            return created, False
        except IntegrityError:
            return session.query(model).filter_by(**kwargs).one(), True
Use a nested transaction to only roll back the addition of the new item instead of rolling back everything (See this answer to use nested transactions with SQLite)
Move create_method. If the created object has relations and it is assigned members through those relations, it is automatically added to the session. E.g. create a book, which has user_id and user as corresponding relationship, then doing book.user=<user object> inside of create_method will add book to the session. This means that create_method must be inside with to benefit from an eventual rollback. Note that begin_nested automatically triggers a flush.
Note that if using MySQL, the transaction isolation level must be set to READ COMMITTED rather than REPEATABLE READ for this to work. Django's get_or_create (and here) uses the same stratagem, see also the Django documentation.
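To make the nested-transaction point concrete, here is a small sketch of the underlying mechanism, a SAVEPOINT, using the standard-library sqlite3 module directly rather than SQLAlchemy. The countries table and the savepoint name add_country are assumptions made up for this example; session.begin_nested() emits comparable SAVEPOINT statements on your behalf.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so
# BEGIN / SAVEPOINT / COMMIT are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

conn.execute("BEGIN")
conn.execute("INSERT INTO countries (name) VALUES ('France')")  # earlier work

conn.execute("SAVEPOINT add_country")
try:
    conn.execute("INSERT INTO countries (name) VALUES ('France')")  # duplicate
    conn.execute("RELEASE SAVEPOINT add_country")
    created = True
except sqlite3.IntegrityError:
    # Undo only the work done since the savepoint; the outer
    # transaction and its earlier insert stay intact.
    conn.execute("ROLLBACK TO SAVEPOINT add_country")
    conn.execute("RELEASE SAVEPOINT add_country")
    created = False

conn.execute("INSERT INTO countries (name) VALUES ('Spain')")  # later work
conn.execute("COMMIT")

names = [row[0] for row in conn.execute("SELECT name FROM countries ORDER BY name")]
```

A plain rollback at the point of the IntegrityError would also have discarded the first insert, which is the behaviour the nested transaction avoids.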
This SQLAlchemy recipe does the job nicely and elegantly.
The first thing to do is to define a function that is given a Session to work with, and associates a dictionary with the Session() which keeps track of current unique keys.
def _unique(session, cls, hashfunc, queryfunc, constructor, arg, kw):
    cache = getattr(session, '_unique_cache', None)
    if cache is None:
        session._unique_cache = cache = {}
    key = (cls, hashfunc(*arg, **kw))
    if key in cache:
        return cache[key]
    else:
        with session.no_autoflush:
            q = session.query(cls)
            q = queryfunc(q, *arg, **kw)
            obj = q.first()
            if not obj:
                obj = constructor(*arg, **kw)
                session.add(obj)
        cache[key] = obj
        return obj
An example of utilizing this function would be in a mixin:
class UniqueMixin(object):
    @classmethod
    def unique_hash(cls, *arg, **kw):
        raise NotImplementedError()
    @classmethod
    def unique_filter(cls, query, *arg, **kw):
        raise NotImplementedError()
    @classmethod
    def as_unique(cls, session, *arg, **kw):
        return _unique(
            session,
            cls,
            cls.unique_hash,
            cls.unique_filter,
            cls,
            arg, kw
        )
And finally creating the unique get_or_create model:
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
engine = create_engine('sqlite://', echo=True)
Session = sessionmaker(bind=engine)
class Widget(UniqueMixin, Base):
    __tablename__ = 'widget'
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True, nullable=False)
    @classmethod
    def unique_hash(cls, name):
        return name
    @classmethod
    def unique_filter(cls, query, name):
        return query.filter(Widget.name == name)
Base.metadata.create_all(engine)
session = Session()
w1, w2, w3 = Widget.as_unique(session, name='w1'), \
    Widget.as_unique(session, name='w2'), \
    Widget.as_unique(session, name='w3')
w1b = Widget.as_unique(session, name='w1')
assert w1 is w1b
assert w2 is not w3
assert w2 is not w1
session.commit()
The recipe goes deeper into the idea and provides different approaches but I've used this one with great success.
The closest semantically is probably:
def get_or_create(model, **kwargs):
    """SqlAlchemy implementation of Django's get_or_create.
    """
    session = Session()
    instance = session.query(model).filter_by(**kwargs).first()
    if instance:
        return instance, False
    else:
        instance = model(**kwargs)
        session.add(instance)
        session.commit()
        return instance, True
Not sure how kosher it is to rely on a globally defined Session in SQLAlchemy, but the Django version doesn't take a connection, so...
The tuple returned contains the instance and a boolean indicating if the instance was created (i.e. it's False if we read the instance from the db).
Django's get_or_create is often used to make sure that global data is available, so I'm committing at the earliest point possible.
I slightly simplified @Kevin's solution to avoid wrapping the whole function in an if/else statement. This way there's only one return, which I find cleaner:
def get_or_create(session, model, **kwargs):
    instance = session.query(model).filter_by(**kwargs).first()
    if not instance:
        instance = model(**kwargs)
        session.add(instance)
    return instance
There is a Python package that has @erik's solution as well as a version of update_or_create(): https://github.com/enricobarzetti/sqlalchemy_get_or_create
Depending on the isolation level you have adopted, none of the above solutions may work.
The best solution I have found is raw SQL in the following form:
INSERT INTO table(f1, f2, unique_f3)
SELECT 'v1', 'v2', 'v3'
WHERE NOT EXISTS (SELECT 1 FROM table WHERE unique_f3 = 'v3')
This is transactionally safe whatever the isolation level and the degree of parallelism are.
Beware: to make it efficient, it would be wise to have an INDEX on the unique column.
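As a rough illustration of the statement above, here is a sketch that runs the same INSERT ... SELECT ... WHERE NOT EXISTS pattern through the standard-library sqlite3 module. The countries table and the helper get_or_create_country are invented for this example, and it makes no claim about concurrency behaviour on other databases; the UNIQUE constraint is kept as a backstop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

INSERT_IF_MISSING = """
    INSERT INTO countries (name)
    SELECT :name
    WHERE NOT EXISTS (SELECT 1 FROM countries WHERE name = :name)
"""


def get_or_create_country(conn, name):
    """Return (id, created) for the row with the given name."""
    cur = conn.execute(INSERT_IF_MISSING, {"name": name})
    created = cur.rowcount == 1  # 0 rows inserted means it already existed
    row = conn.execute(
        "SELECT id FROM countries WHERE name = :name", {"name": name}
    ).fetchone()
    return row[0], created


first = get_or_create_country(conn, "France")
second = get_or_create_country(conn, "France")
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM countries").fetchone()[0]
```

The second call inserts nothing and returns the id of the row created by the first call.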
One problem I regularly encounter is that when a field has a max length (say, STRING(40)) and you perform a get or create with a longer string, the above solutions will fail.
Building on the above solutions, here's my approach:
from sqlalchemy import Column, String
from sqlalchemy.exc import IntegrityError
# `session` is assumed to be available in the enclosing scope.
def get_or_create(self, add=True, flush=True, commit=False, **kwargs):
    """
    Get an entity based on the kwargs, or create an entity with those kwargs.
    Params:
        add: (default True) should the instance be added to the session?
        flush: (default True) flush the instance to the session?
        commit: (default False) commit the session?
        kwargs: key, value pairs of parameters to lookup/create.
    Ex: SocialPlatform.get_or_create(**{'name':'facebook'})
        returns --> existing record or, will create a new record
    ---------
    NOTE: I like to add this as a classmethod in the base class of my tables, so that
    all data models inherit the base class --> functionality is transmitted across
    all orm defined models.
    """
    # Truncate values if necessary
    for key, value in kwargs.items():
        # Only use strings
        if not isinstance(value, str):
            continue
        # Only use if it's a column
        my_col = getattr(self.__table__.columns, key)
        if not isinstance(my_col, Column):
            continue
        # Skip non strings again here
        if not isinstance(my_col.type, String):
            continue
        # Get the max length
        max_len = my_col.type.length
        if value and max_len and len(value) > max_len:
            # Update the value
            value = value[:max_len]
            kwargs[key] = value
    # -------------------------------------------------
    # Make the query...
    instance = session.query(self).filter_by(**kwargs).first()
    if instance:
        return instance
    else:
        # Max length isn't accounted for here.
        # The assumption is that auto-truncation will happen on the child-model
        # Or directly in the db
        instance = self(**kwargs)
    # You'll usually want to add to the session
    if add:
        session.add(instance)
    # Navigate these with caution
    if add and commit:
        try:
            session.commit()
        except IntegrityError:
            session.rollback()
    elif add and flush:
        session.flush()
    return instance
Related
I have a Flask, SQLAlchemy webapp which uses a single mysql server. I want to expand the database setup to have a read-only slave server such that I can spread the reads between both master and slave while continuing to write to the master db server.
I have looked at a few options and I believe I can't do this with plain SQLAlchemy. Instead, I'm planning to create two database handles in my webapp, one each for the master and slave db servers. Then, using a simple random value, I would pick either the master or the slave db handle for "SELECT" operations.
However, I'm not sure if this is the right way to go with using SQLAlchemy. Any suggestion/tips on how to pull this off?
I have an example of how to do this on my blog at http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/. Basically, you can enhance the Session so that it chooses from master or slave on a query-by-query basis. One potential glitch with that approach is that if you have one transaction that calls six queries, you might end up using both slaves in one request... but there we're just trying to imitate Django's feature :)
A slightly less magic approach that I've used, which also establishes the scope of usage more explicitly, is a decorator on view callables (whatever they're called in Flask), like this:
@with_slave
def my_view(...):
    # ...
with_slave would do something like this, assuming you have a Session and some engines set up:
master = create_engine("some DB")
slave = create_engine("some other DB")
Session = scoped_session(sessionmaker(bind=master))
def with_slave(fn):
    def go(*arg, **kw):
        s = Session(bind=slave)
        return fn(*arg, **kw)
    return go
The idea is that calling Session(bind=slave) invokes the registry to get at the actual Session object for the current thread, creating it if it doesn't exist - however since we're passing an argument, scoped_session will assert that the Session we're making here is definitely brand new.
You point it at the "slave" for all subsequent SQL. Then, when the request is over, you'd ensure that your Flask app is calling Session.remove() to clear out the registry for that thread. When the registry is next used on the same thread, it will be a new Session bound back to the "master".
Or, as a variant, if you want to use the "slave" just for that call, this is "safer" in that it restores any existing bind back to the Session:
def with_slave(fn):
    def go(*arg, **kw):
        s = Session()
        oldbind = s.bind
        s.bind = slave
        try:
            return fn(*arg, **kw)
        finally:
            s.bind = oldbind
    return go
For each of these decorators you can reverse things, have the Session be bound to a "slave" where the decorator puts it on "master" for write operations. If you wanted a random slave in that case, if Flask had some kind of "request begin" event you could set it up at that point.
Or we can try another way: declare two different classes with all the same instance attributes but a different __bind_key__ class attribute. Then we can use the rw class for reads and writes and the r class for read-only access. :)
I think this way is easier and more reliable. :)
We declare two db models because we can have tables with the same name in two different databases. This also bypasses the 'extend_existing' error raised when two models share the same __tablename__.
Here is an example:
app = Flask(__name__)
app.config['SQLALCHEMY_BINDS'] = {'rw': 'rw', 'r': 'r'}
db = SQLAlchemy(app)
db.Model_RW = db.make_declarative_base()
class A(db.Model):
    __tablename__ = 'common'
    __bind_key__ = 'r'
# Named differently here so that it does not shadow the read-only class A above.
class A_RW(db.Model_RW):
    __tablename__ = 'common'
    __bind_key__ = 'rw'
Maybe this answer is too late! I use a slave_session to query the slave DB:
class RoutingSession(SignallingSession):
    def __init__(self, db, bind_name=None, autocommit=False, autoflush=True, **options):
        self.app = db.get_app()
        if bind_name:
            bind = options.pop('bind', None)
        else:
            bind = options.pop('bind', None) or db.engine
        self._bind_name = bind_name
        SessionBase.__init__(
            self, autocommit=autocommit, autoflush=autoflush,
            bind=bind, binds=None, **options
        )
    def get_bind(self, mapper=None, clause=None):
        if self._bind_name is not None:
            state = get_state(self.app)
            return state.db.get_engine(self.app, bind=self._bind_name)
        else:
            if mapper is not None:
                try:
                    persist_selectable = mapper.persist_selectable
                except AttributeError:
                    persist_selectable = mapper.mapped_table
                info = getattr(persist_selectable, 'info', {})
                bind_key = info.get('bind_key')
                if bind_key is not None:
                    state = get_state(self.app)
                    return state.db.get_engine(self.app, bind=bind_key)
            return SessionBase.get_bind(self, mapper, clause)
class RouteSQLAlchemy(SQLAlchemy):
    def __init__(self, *args, **kwargs):
        SQLAlchemy.__init__(self, *args, **kwargs)
        self.slave_session = self.create_scoped_session({'bind_name': 'slave'})
    def create_session(self, options):
        return orm.sessionmaker(class_=RoutingSession, db=self, **options)
db = RouteSQLAlchemy(metadata=metadata, query_class=orm.Query)
The question really is how to update a SQLAlchemy declarative model so that it runs the validators. In my case using setters like User.name = name is not really an option.
Below is a runnable example of what I mean
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy import Column, String, Integer
from sqlalchemy.orm import validates
from sqlalchemy.ext.declarative import declarative_base
some_engine = create_engine('sqlite://')
Session = sessionmaker(bind=some_engine)
session = Session()
Base = declarative_base()
class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    @validates('name')
    def validate_name(self, key, value):
        if value != 'asd':
            raise ValueError('not asd')
        return value
Base.metadata.create_all(bind=some_engine)
user = User(id=1, name='qwe')
# >>> ValueError: not asd
user = User(id=1, name='asd')
session.add(user)
session.commit()
session.query(User).filter(User.id == 1).update({'name': 'qwe'})
session.query(User).filter(User.id==1)[0].name
# >>> 'qwe'
You could add a mixin to your models that provides a rather simple update method that just uses setattr() to set attributes of an instance.
class UpdateMixin:
    """
    Add a simple update() method to instances that accepts
    a dictionary of updates.
    """
    def update(self, values):
        for k, v in values.items():
            setattr(self, k, v)
The User class would then be defined as
class User(UpdateMixin, Base):
    ...
And to update a single instance from a given dictionary you could for example run
session.query(User).get(1).update({ 'name': 'qwe' })
# or since you have the user instance from before
user.update({ 'name': 'qwe' })
Note the use of Query.get(). If there is no user with the given id, it will return None, and trying to call the method update on it will raise an AttributeError. Another caveat: if an exception is raised partway through and you do not roll back, only some of the updates will have been applied to the instance (and on Python versions where dictionaries are unordered, you cannot predict which ones). So always roll back on any errors.
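The partial-update caveat can be seen without a database. In the sketch below, Account is a plain-Python stand-in for a mapped class, and its name property plays the role of an @validates hook; both are made up for this illustration and are not SQLAlchemy API.

```python
class Account:
    """Plain-Python stand-in for a mapped class with a validated attribute."""

    def __init__(self, name, email):
        self.name = name
        self.email = email

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # Runs on every attribute set, like a validator would.
        if not value:
            raise ValueError("name must not be empty")
        self._name = value

    def update(self, values):
        # Same idea as UpdateMixin.update(): set attributes one by one.
        for key, value in values.items():
            setattr(self, key, value)


account = Account("asd", "old@example.com")
try:
    # "email" is applied first; "name" then fails validation.
    account.update({"email": "new@example.com", "name": ""})
except ValueError:
    failed = True
else:
    failed = False
```

After the failed call the object is left half-updated (new email, old name), which is why rolling back the session on any error is the safer habit.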
I'd also recommend actually naming the method updateSelf or some such to prevent risk of confusing it with Query.update().
The short answer is not to use query.update when you want model level constraints. It's exactly for the times when performance is more important than enforcing those sorts of model level constraints. Other answers have provided specifics on solutions, but the fundamental answer is that Query.update is not intended to enforce python-level constraints.
General categories of solutions are:
Use some session-level method and Query.get or a loop on Query.filter.all
Check constraints
Triggers and stored procedures
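As one illustration of the database-side options in the list above, a CHECK constraint is enforced by the database itself, so it also applies to bulk UPDATE statements that never touch Python-level validators. This sketch uses the standard-library sqlite3 module rather than SQLAlchemy; the table layout mirrors the question's User model, and the rule name = 'asd' simply copies the question's toy validator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL CHECK (name = 'asd'))"
)
conn.execute("INSERT INTO user (id, name) VALUES (1, 'asd')")

try:
    # A bulk-style UPDATE that bypasses any Python-side validation.
    conn.execute("UPDATE user SET name = 'qwe' WHERE id = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

current_name = conn.execute("SELECT name FROM user WHERE id = 1").fetchone()[0]
```

The UPDATE is refused and the stored value is unchanged, at the cost of moving the rule out of the model and into the schema.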
I am using the SQLAlchemy versioned object example as a reference.
Example: http://docs.sqlalchemy.org/en/rel_0_7/orm/examples.html#versioned-objects
When I update a record I am getting no errors. The case_history table is being created, but the version number is staying at '1' and the case_history table is empty.
(Yes I am aware that I am using 'case' as a class name. Is that bad?)
Here are my code snippets:
models.py:
from history_meta import Versioned, versioned_session
# set up the base class
class Base(object):
    @declared_attr
    def __tablename__(cls):
        return cls.__name__.lower()
    id = Column(Integer, primary_key=True)
    header_row = Column(SmallInteger)
    def to_dict(self):
        serialized = dict((column_name, getattr(self, column_name))
                          for column_name in self.__table__.c.keys())
        return serialized
Base = declarative_base(cls=Base)
class case(Versioned, Base):
    title = Column(String(32))
    description = Column(Text)
    def __repr__(self):
        return self.title
app.py:
engine = create_engine(SQLALCHEMY_DATABASE_URI)
Session = sessionmaker(bind=engine)
versioned_session(Session)
db = Session()
...
@app.route('/<name>/:record', method='POST')
def default(name, record):
    myClass = getattr(sys.modules[__name__], name)
    db.query(myClass).filter(myClass.id == record).update(request.json)
    for u in db.query(case).filter(case.id == record):
        print u.version # Version is always 1
    db.commit() # I added this just to test versioning.
Any clue as to why the versioning isn't happening?
For others who find their way here...
Remember: even if filter() returns a single object, the update() method is a bulk operation, and acts differently. It is possible the version is only incremented on an event like after_update() which does not trigger on bulk operations.
Read more on caveats for the update() operation here.
An update query will not cause the version to increment even though the data changes. There might be ways to 'listen' for that type of change, but I don't know.
You have to change an attribute of a mapped class:
#Get an instance of the class
myItem = db.query(myClass).get(record)
#Change an attribute
myItem.title="foo"
#Commit if necessary
db.commit()
Using the SQLAlchemy ORM, I want to make sure values are the right type for their columns.
For example, say I have an Integer column. I try to insert the value “hello”, which is not a valid integer. SQLAlchemy will allow me to do this. Only later, when I execute session.commit(), does it raise an exception: sqlalchemy.exc.DataError: (DataError) invalid input syntax integer: "hello"….
I am adding batches of records, and I don’t want to commit after every single add(…), for performance reasons.
So how can I:
Raise the exception as soon as I do session.add(…)
Or, make sure the value I am inserting can be converted to the target Column datatype, before adding it to the batch?
Or any other way to prevent one bad record from spoiling an entire commit().
SQLAlchemy doesn't build this in as it defers to the DBAPI/database as the best and most efficient source of validation and coercion of values.
To build your own validation, usually TypeDecorator or ORM-level validation is used. TypeDecorator has the advantage that it operates at the core and can be pretty transparent, though it only occurs when SQL is actually emitted.
Validation and coercion that need to happen sooner are done at the ORM level.
Validation can be ad-hoc, at the ORM layer, via @validates:
http://docs.sqlalchemy.org/en/latest/orm/mapped_attributes.html#simple-validators
The event system that @validates uses is also available directly. You can write a generalized solution that links validators of your choosing to the types being mapped:
from sqlalchemy import Column, Integer, String, DateTime
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import event
import datetime
Base = declarative_base()
# Note: str is used here; on Python 2 this would be basestring.
def validate_int(value):
    if isinstance(value, str):
        value = int(value)
    else:
        assert isinstance(value, int)
    return value
def validate_string(value):
    assert isinstance(value, str)
    return value
def validate_datetime(value):
    assert isinstance(value, datetime.datetime)
    return value
validators = {
    Integer: validate_int,
    String: validate_string,
    DateTime: validate_datetime,
}
# this event is called whenever an attribute
# on a class is instrumented
@event.listens_for(Base, 'attribute_instrument')
def configure_listener(class_, key, inst):
    if not hasattr(inst.property, 'columns'):
        return
    # this event is called whenever a "set"
    # occurs on that instrumented attribute
    @event.listens_for(inst, "set", retval=True)
    def set_(instance, value, oldvalue, initiator):
        validator = validators.get(inst.property.columns[0].type.__class__)
        if validator:
            return validator(value)
        else:
            return value
class MyObject(Base):
    __tablename__ = 'mytable'
    id = Column(Integer, primary_key=True)
    svalue = Column(String)
    ivalue = Column(Integer)
    dvalue = Column(DateTime)
m = MyObject()
m.svalue = "ASdf"
m.ivalue = "45"
m.dvalue = "not a date"  # fails the assert in validate_datetime
Validation and coercion can also be built at the type level using TypeDecorator, though this is only when SQL is being emitted, such as this example which coerces utf-8 strings to unicode:
http://docs.sqlalchemy.org/en/latest/core/custom_types.html#coercing-encoded-strings-to-unicode
Improving on the answer of @zzzeek, I suggest the following solution:
from sqlalchemy import String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.event import listens_for
Base = declarative_base()
@listens_for(Base, 'attribute_instrument')
def configure_listener(table_cls, attr, col_inst):
    if not hasattr(col_inst.property, 'columns'):
        return
    validator = getattr(col_inst.property.columns[0].type, 'validator', None)
    if validator:
        # Only decorate columns that need to be decorated
        @listens_for(col_inst, "set", retval=True)
        def set_(instance, value, oldvalue, initiator):
            return validator(value)
That lets you do things like:
class Name(String):
    def validator(self, name):
        if isinstance(name, str):
            return name.upper()
        raise TypeError("name must be a string")
This has two benefits. First, a set listener is only registered when a validator is actually attached to the column type, so no CPU cycles are wasted on set events for attributes that have no validation defined. Second, it allows you to define your own field types and just add a validator method there, so not everything you store as, say, an Integer runs through the same checks, only the columns that use your new field type.
My question does not really have much to do with sqlalchemy but rather with pure python.
I'd like to control the instantiation of sqlalchemy Model instances. This is a snippet from my code:
class Tag(db.Model):
    __tablename__ = 'tags'
    query_class = TagQuery
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(), unique=True, nullable=False)
    def __init__(self, name):
        self.name = name
I want to achieve that whenever an entry is instantiated (Tag('django')) that a new instance should be created only if there is not yet another tag with the name django inside the database. Otherwise, instead of initializing a new object, a reference to the already existent row inside the database should be returned by (Tag('django')).
As of now I am ensuring the uniqueness of tags inside the Post Model:
class Post(db.Model):
    # ...
    # code code code
    # ...
    def _set_tags(self, taglist):
        """Associate tags with this entry. The taglist is expected to be already
        normalized without duplicates."""
        # Remove all previous tags
        self._tags = []
        for tag_name in taglist:
            exists = Tag.query.filter(Tag.name==tag_name).first()
            # Only add tags to the database that don't exist yet
            # TODO: Put this in the init method of Tag (if possible)
            if not exists:
                self._tags.append(Tag(tag_name))
            else:
                self._tags.append(exists)
It does its job but still I'd like to know how to ensure the uniqueness of tags inside the Tag class itself so that I could write the _set_tags method like this:
def _set_tags(self, taglist):
    # Remove all previous tags
    self._tags = []
    for tag_name in taglist:
        self._tags.append(Tag(tag_name))
While writing this question and testing I learned that I need to use the __new__ method. This is what I've come up with (it even passes the unit tests and I didn't forget to change the _set_tags method):
class Tag(db.Model):
    __tablename__ = 'tags'
    query_class = TagQuery
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(), unique=True, nullable=False)
    def __new__(cls, *args, **kwargs):
        """Only add tags to the database that don't exist yet. If tag already
        exists return a reference to the tag otherwise a new instance"""
        exists = Tag.query.filter(Tag.name==args[0]).first() if args else None
        if exists:
            return exists
        else:
            return super(Tag, cls).__new__(cls, *args, **kwargs)
What bothers me are two things:
First: I get a warning:
DeprecationWarning: object.__new__() takes no parameters
Second: When I write it like so, I get errors (I also tried renaming the parameter name to n, but it did not change anything):
def __new__(cls, name):
    """Only add tags to the database that don't exist yet. If tag already
    exists return a reference to the tag otherwise a new instance"""
    exists = Tag.query.filter(Tag.name==name).first()
    if exists:
        return exists
    else:
        return super(Tag, cls).__new__(cls, name)
Errors (or similar):
TypeError: __new__() takes exactly 2 arguments (1 given)
I hope you can help me!
I use a class method for that.
class Tag(Declarative):
    ...
    @classmethod
    def get(cls, tag_name):
        tag = cls.query.filter(cls.name == tag_name).first()
        if not tag:
            tag = cls(tag_name)
        return tag
And then
def _set_tags(self, taglist):
    self._tags = []
    for tag_name in taglist:
        self._tags.append(Tag.get(tag_name))
As for __new__, you should not confuse it with __init__. It is expected to be called w/out args, so even if your own constructor asks for some, you should not pass them to super/object unless you know that your super needs them. Typical invocation would be:
def __new__(cls, name=None):
    tag = cls.query.filter(cls.name == name).first()
    if not tag:
        tag = object.__new__(cls)
    return tag
However this will not work as expected in your case, since it calls __init__ automatically if __new__ returns instance of cls. You would need to use metaclass or add some checks in __init__.
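The __init__ re-run described above is easy to reproduce without a database. In this sketch, the class-level _pool dictionary is a made-up stand-in for the database lookup, and init_calls is a counter added only to make the effect visible.

```python
class Tag:
    _pool = {}       # hypothetical in-memory stand-in for the database lookup
    init_calls = 0   # counts how often __init__ runs

    def __new__(cls, name):
        existing = cls._pool.get(name)
        if existing is not None:
            return existing
        instance = object.__new__(cls)  # note: no extra arguments passed on
        cls._pool[name] = instance
        return instance

    def __init__(self, name):
        # Runs on every Tag(...) call, even when __new__ returned a pooled object.
        type(self).init_calls += 1
        self.name = name


a = Tag("django")
b = Tag("django")
```

Both names refer to the same object, yet __init__ ran twice, so any state set in __init__ would be reset on every lookup; a classmethod factory sidesteps this.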
Don't embed this within the class itself.
Option 1. Create a factory that has the pre-existing pool of objects.
tag_pool = {}
def makeTag( name ):
    if name not in tag_pool:
        tag_pool[name]= Tag(name)
    return tag_pool[name]
Life's much simpler.
tag= makeTag( 'django' )
This will create the item if necessary.
Option 2. Define a "get_or_create" version of the makeTag function. This will query the database. If the item is found, return the object. If no item is found, create it, insert it and return it.
Given the OP's latest error msg:
TypeError: __new__() takes exactly 2 arguments (1 given)
it seems that somewhere the class is getting instantiated without the name parameter, i.e. just Tag(). The traceback for that exception should tell you where that "somewhere" is (but we're not shown it, so that's as far as we can go;-).
That being said, I agree with other answers that a factory function (possibly nicely dressed up as a classmethod -- making factories is one of the best uses of classmethod, after all;-) is the way to go, avoiding the complication that __new__ entails (such as forcing __init__ to find out whether the object's already initialized to avoid re-initializing it!-).