So there has been a lot of hating on singletons in Python. I generally agree that having a singleton is usually no good, but what about stuff that has side effects, like using/querying a database? Why would I make a new instance for every simple query when I could reuse a connection that is already set up? What would be a Pythonic approach/alternative to this?
Thank you!
Normally, you have some kind of object representing the thing that uses a database (e.g., an instance of MyWebServer), and you make the database connection a member of that object.
If you instead have all your logic inside some kind of function, make the connection local to that function. (This isn't too common in many other languages, but in Python, there are often good ways to wrap up multi-stage stateful work in a single generator function.)
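As a rough illustration of the generator idea, here is a minimal sketch that keeps the connection local to one generator function. The function name, the users table and the sample rows are all made up for the example, and sqlite3 merely stands in for whatever database you actually use.

```python
import sqlite3


def stream_user_names(db_path=":memory:"):
    """Yield user names one at a time; the connection lives only in this generator."""
    conn = sqlite3.connect(db_path)  # local to the function: no global, no singleton
    try:
        # Hypothetical schema and data, created here only so the sketch runs on its own.
        conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")
        conn.executemany("INSERT INTO users VALUES (?)", [("ada",), ("grace",)])
        for (name,) in conn.execute("SELECT name FROM users ORDER BY name"):
            yield name
    finally:
        conn.close()  # runs when the generator is exhausted or closed early


names = list(stream_user_names())
```

The caller never sees the connection at all; it just iterates over the results.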
If you have all the database stuff spread out all over the place, then just use a global variable instead of a singleton. Yes, globals are bad, but singletons are just as bad, and more complicated. There are a few cases where they're useful, but very rare. (That's not necessarily true for other languages, but it is for Python.) And the way to get rid of the global is to rethink your design. There's a good chance you're effectively using a module as a (singleton) object, and if you think it through, you can probably come up with a good class or function to wrap it up in.
Obviously just moving all of your globals into class attributes and @classmethods is just giving you globals under a different namespace. But moving them into instance attributes and methods is a different story. That gives you an object you can pass around, and, if necessary, an object you can have 2 of (or maybe even 0 under some circumstances), attach a lock to, serialize, etc.
In many types of applications, you're still going to end up with a single instance of something—every Qt GUI app has exactly one MyQApplication, nearly every web server has exactly one MyWebServer, etc. No matter what you call it, that's effectively a singleton or global. And if you want to, you can just move everything into attributes of that god object.
But just because you can do so doesn't mean you should. You've still got function parameters, local variables, globals in each module, other (non-megalithic) classes with their own instance attributes, etc., and you should use whatever is appropriate for each value.
For example, say your MyWebServer creates a new ClientConnection instance for each new client that connects to you. You could make the connections write MyWebServer.instance.db.execute whenever they want to execute a SQL query… but you could also just pass self.db to the ClientConnection constructor, and each connection then just does self.db.execute. So, which one is better? Well, if you do it the latter way, it makes your code a lot easier to extend and refactor. If you want to load-balance across 4 databases, you only need to change code in one place (where the MyWebServer initializes each ClientConnection) instead of 100 (every time the ClientConnection accesses the database). If you want to convert your monolithic web app into a WSGI container, you don't have to change any of the ClientConnection code except maybe the constructor. And so on.
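A minimal sketch of the second approach might look like this. MyWebServer and ClientConnection are the names used above; the accept_client and count_visits methods, the visits table and the use of sqlite3 are assumptions made only so the example runs on its own.

```python
import sqlite3


class ClientConnection:
    def __init__(self, db):
        self.db = db  # handed in by whoever creates the connection

    def count_visits(self):
        # Uses self.db, never MyWebServer.instance.db
        return self.db.execute("SELECT COUNT(*) FROM visits").fetchone()[0]


class MyWebServer:
    def __init__(self, db):
        self.db = db

    def accept_client(self):
        # The one place that decides which database a client talks to.
        return ClientConnection(self.db)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE visits (path TEXT)")  # hypothetical table
db.execute("INSERT INTO visits VALUES ('/index')")

server = MyWebServer(db)
client = server.accept_client()
visit_count = client.count_visits()
db.close()
```

To load-balance across several databases you would only touch accept_client; ClientConnection stays exactly as it is.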
If you're using an object oriented approach, then abamet's suggestion of attaching the database connection parameters as class attributes makes sense to me. The class can then establish a single database connection which all methods of the class refer to as self.db_connection, for example.
If you're not using an object oriented approach, a separate database connection module can provide a functional-style equivalent. Devote a module to establishing a database connection, and simply import that module everywhere you want to use it. Your code can then refer to the connection as db.connection, for example. Since modules are effectively singletons, and the module code is only run on the first import, you will be re-using the same database connection each time.
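Here is a small sketch of why that works. To keep it in one runnable piece, the hypothetical db.py file is simulated by registering a module object by hand; in a real project you would simply create the file with the two lines shown in the comment. The module name db and the in-memory sqlite3 connection are assumptions for illustration.

```python
import sqlite3
import sys
import types

# Stand-in for a hypothetical file db.py whose whole body would be:
#     import sqlite3
#     connection = sqlite3.connect(":memory:")
db_module = types.ModuleType("db")
db_module.connection = sqlite3.connect(":memory:")
sys.modules["db"] = db_module  # what the first real import would do for you

import db              # one part of your program
import db as db_again  # another part: gets the cached module, nothing is re-run

same_connection = db.connection is db_again.connection
```

Every importer gets the same module object back from the import cache, so db.connection is the same connection everywhere.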
Inside of the declarative base I define a function like this:
def update_me(self):
    if self.raw_info == 1:
        self.changed_info = 10
    else:
        self.changed_info = 20
I know this can be done with hybrid_property but I actually do more complicated manipulations, the above is just for illustrative purposes, this has to be done through a method. How can I commit these changes, from inside the declarative base, without passing it the session object? It seems logical that there would be a way, if I can access the object and change its values without a session object, then it seems like I should be able to save it here somehow. Of course adding this code to the end of the function above fails
self.commit()
It seems to me that you might want to reconsider your design. However, if you are sure you want to go this way, you can use the object_session function from sqlalchemy.orm:
object_session(self).commit()
Warning: This will commit your whole session (not just one object) and it will only work if self is already attached to a session (by Session.add or having queried for it, for example). Thus, it will be a fragile operation and I would not recommend it unless absolutely necessary.
But you are right, when you say
It seems logical that there would be a way, if I can access the object and change its values without a session object, then it seems like I should be able to save it here somehow.
The object and the session are connected and thus changes will be delivered to the database. We get this connected session with the method above.
I've been making a lot of classes in Python recently and I usually just access instance variables like this:
object.variable_name
But often I see that objects from other modules will make wrapper methods to access variables like this:
object.getVariable()
What are the advantages/disadvantages to these different approaches and is there a generally accepted best practice (even if there are exceptions)?
There should never be any need in Python to use a method call just to get an attribute. The people who have written this are probably ex-Java programmers, where that is idiomatic.
In Python, it's considered proper to access the attribute directly.
If it turns out that you need some code to run when accessing the attribute, for instance to calculate it dynamically, you should use the @property decorator.
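For instance, here is a minimal sketch with a made-up Circle class, where radius is a plain attribute and area is calculated on each access:

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius  # plain attribute, accessed directly

    @property
    def area(self):
        # Computed on every access, but read like an attribute: circle.area
        return math.pi * self.radius ** 2


circle = Circle(2.0)
area_before = circle.area
circle.radius = 3.0       # no setter method needed
area_after = circle.area  # recomputed from the new radius
```

Callers write circle.area either way, so an attribute can be turned into a property later without changing any code that uses it.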
The main advantage of "getters" (the getVariable form), in my modest opinion, is that it's much easier to add functionality or evolve your objects without changing the signatures.
For instance, let's say that my object changes from implementing some functionality to encapsulating another object and providing the same functionality via Proxy Pattern (composition). If I'm using getters to access the properties, it doesn't matter where that property is being fetched from, and no change whatsoever is visible to the "clients" using your code.
I use getters and such methods especially when my code is being reused (as a library for instance), by others. I'm much less picky when my code is self-contained.
In Java this is almost a requirement: you should never access your object fields directly. In Python it's perfectly legitimate to do so, but you may want to take into consideration the possible benefits of encapsulation that I mentioned. Still, keep in mind that direct access is not considered bad form in Python; on the contrary.
Making getVariable() and setVariable() methods is called encapsulation.
There are many advantages to this practice and it is the preferred style in object-oriented programming.
By accessing your variables through methods you can add another layer of "error checking/handling" by making sure the value you are trying to set/get is correct.
The setter method is also used for other tasks, like notifying listeners that the variable has changed.
At least in Java/C#/C++ and so on.
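For comparison, in Python the same error checking and listener notification can sit behind a property setter while callers keep using plain attribute syntax. The sketch below is only an illustration; the Player class, its health attribute, the 0 to 100 range and the subscribe method are all invented for the example.

```python
class Player:
    def __init__(self, health):
        self._listeners = []  # callables invoked as listener(new_value)
        self._health = 0
        self.health = health  # goes through the setter below

    def subscribe(self, listener):
        self._listeners.append(listener)

    @property
    def health(self):
        return self._health

    @health.setter
    def health(self, value):
        if not 0 <= value <= 100:
            raise ValueError("health must be between 0 and 100")
        self._health = value
        for listener in self._listeners:
            listener(value)  # notify everyone that the value changed


seen = []
player = Player(50)
player.subscribe(seen.append)
player.health = 75  # plain assignment; validation and notification still run
try:
    player.health = 500
except ValueError:
    rejected = True
else:
    rejected = False
```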
If my class uses a database quite a few times (a lot of functions/properties use data from the DB), what's the best practice: to create a DB connection once at the start of the class, do whatever so many times, and then close the DB connection on exit (using global variables); or to create/use/close DB connections in every property (using local variables)?
If it's better to start a connection once and close it on class destruction, how can I do this?
def __del__(self):
    self.connection.close()
doesn't work.
Thank you.
The __del__ method is only called when the object is destroyed, which happens once nothing references it any more and the garbage collector reclaims it.
Either find out what is still referencing your instance when you let it go, or implement an explicit shutdown method on your class.
It can be dangerous to rely on the __del__ method to release resources, because the object may not be destroyed when we think it is.
From the Python documentation:
Some objects contain references to “external” resources such as open files or windows. It is understood that these resources are freed when the object is garbage-collected, but since garbage collection is not guaranteed to happen, such objects also provide an explicit way to release the external resource, usually a close() method. Programs are strongly recommended to explicitly close such objects. The ‘try...finally‘ statement provides a convenient way to do this.
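A minimal sketch of what that advice could look like for a class that owns a connection follows. The Report class, its answer method and the in-memory sqlite3 database are assumptions for illustration; the point is the explicit close() plus try...finally, with a context manager as an equivalent shorthand.

```python
import sqlite3


class Report:
    """Hypothetical class that owns one connection for its whole lifetime."""

    def __init__(self, db_path=":memory:"):
        self.connection = sqlite3.connect(db_path)
        self.closed = False

    def answer(self):
        return self.connection.execute("SELECT 6 * 7").fetchone()[0]

    def close(self):
        # Explicit release; safe to call more than once.
        if not self.closed:
            self.connection.close()
            self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


# Explicit form, as the documentation suggests
report = Report()
try:
    first = report.answer()
finally:
    report.close()

# Equivalent form using the with statement
with Report() as other:
    second = other.answer()
```

Either way the connection is closed at a point you control, rather than whenever (or if) __del__ happens to run.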
If other classes will also use database connections, then you can create a class that includes methods for creating the db, connecting to and closing the db, retrieving information from the db, etc., and then inherit from this class.
Create a database close-request function that can be accessed by the main window class.
You can then call this within the window's closeEvent, and perhaps take different actions depending on the function's return value.
I'm programming a game in Python, where all IO activities are done by an IO object (in the hope that it will be easy to swap that object out for another which implements a different user interface). Nearly all the other objects in the game need to access the IO system at some point (e.g. printing a message, updating the position of the player, showing a special effect caused by an in-game action), so my question is this:
Does it make sense for a reference to the IO object to be available globally?
The alternative is passing a reference to the IO object into the __init__() of every object that needs to use it. I understand that this is good from a testing point of view, but is this worth the resulting "function signature pollution"?
Thanks.
Yes, this is a legitimate use of a global variable. If you'd rather not, passing around a context object that is equivalent to this global is another option, as you mentioned.
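To show what the passing-it-around option might look like in practice, here is a small sketch. ConsoleIO, RecordingIO, Player and the show method are hypothetical names, not part of the question; the point is only that any object with the same methods can be swapped in.

```python
class ConsoleIO:
    """One possible user interface: plain text on the terminal."""

    def show(self, text):
        print(text)


class RecordingIO:
    """Same interface, but it only records messages (handy in tests)."""

    def __init__(self):
        self.messages = []

    def show(self, text):
        self.messages.append(text)


class Player:
    def __init__(self, name, io):
        self.name = name
        self.io = io  # the IO object is passed in, not looked up globally

    def move(self, direction):
        self.io.show(f"{self.name} moves {direction}")


game_io = RecordingIO()  # swap in ConsoleIO() to print to the terminal instead
Player("hero", game_io).move("north")
```

The cost is one extra constructor parameter per class; the benefit is that a test can hand in RecordingIO without touching any global state.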
Since I assume you're using multiple files (modules), why not do something like:
import io
io.print('hello, world')
io.clear()
This is a common way for programs whose I/O needs go beyond simple printing to handle things like logging.
Yes, I think so.
Another possibility would be to create a module loggerModule that has functions like print() and write(), but this would only marginally be better.
Nope.
Variables are too specific to be passed around in the global namespace. Hide them inside static functions/classes instead that can do magic things to them at run time (or call other ones entirely).
Consider what happens if the IO can periodically change state or if it needs to block for a while (like many sockets do).
Consider what happens if the same block of code is included multiple times. Does the variable instance get duplicated as well?
Consider what happens if you want to have a version 2 of the same variable. What if you want to change its interface? Do you have to modify all the code that references it?
Does it really make sense to infect all the code that uses the variable with knowledge of all the ways it can go bad?
Let's say that I have a Python module to control a videoconference system. In that module I have some global variables and functions to control the states of the videoconference, the calls, a phone book, etc.
To start the control system, the module self-executes a function to initialize the videoconference (Ethernet connection, polling states and so on).
Now, if I need to start controlling a second videoconference system, I'm not sure how to approach the problem. I thought about making the videoconference module a class, creating two instances (one for each videoconference system) and then initializing both. The problem is that I don't really need two instances of a videoconference class, since I won't do anything with those objects: I only need to initialize the systems, and after that I don't need to call or keep them for anything else.
example code:
Videoconference.py
class Videoconference:
    def __init__(self):
        self.state = 0
        # Initialization code
Main.py
from Videoconference import Videoconference
vidC1 = Videoconference()
vidC2 = Videoconference()
# vidC1 and vidC2 will never be used again
So, the question is: should I convert the videoconference module to a class and create instances (like in the example), even if I'm not going to use them for anything else apart from the initialization process? Or is there another solution without creating a class?
Perhaps this is a matter of preference, but I think having a class in the above case would be the safer bet. Often I'll write a function and, when it gets too complicated, think that I should have created a class (and often do so), but I've never created a class that was too simple and thought, "This is too easy, why didn't I just create a function?"
Even if you have one object instead of two, it often helps readability to create a class. For example:
vid = VideoConference()
# vid.initialize_old_system() # Suppose you have an old system that you no longer use
# But want to keep its method for reference
vid.initialize_new_system()
vid.view_call_history(since=yesterday)
This sounds like the perfect use case for a VideoConferenceSystem object. You say you have globals (ew!) that govern state (yuck!) and functions for control.
Sounds to me like you've got the chance to convert that all to an object that has attributes that hold state and methods to mutate it. Sounds like you should be refactoring more than just the initialization code, so those vidC1 and vidC2 objects are useful.
I think you're approaching this problem the right way in your example. In this way, you can have multiple video conferences, each of which may have different attribute states (e.g. vidC1.conference_duration, etc.).
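As a rough sketch of what moving the module-level state into instances buys you: each object below keeps its own state and phone book. The host argument, the addresses, the phone_book attribute and the start_call method are assumptions added for illustration; only the class name and state come from the question.

```python
class Videoconference:
    def __init__(self, host):
        self.host = host      # hypothetical address of this particular system
        self.state = 0        # was a module-level global before
        self.phone_book = {}  # likewise, now one per system

    def start_call(self, number):
        self.state = 1
        return f"{self.host}: calling {number}"


vidC1 = Videoconference("10.0.0.11")
vidC2 = Videoconference("10.0.0.12")
message = vidC1.start_call("555-0100")  # only vidC1 changes state
```

Because nothing is shared between the two objects, controlling a second (or third) system needs no changes to the class itself.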