I'm using Eclipse with the autopep8 plugin and I find it very helpful: it saves me a lot of time I would otherwise spend fixing code style by hand. But for some coding patterns I don't know how to avoid the pep8 rules I don't want applied. For example, using Django (1.5.4) I need to connect the signals of an installed application. I always put import signals at the end of the models.py file, but pep8 doesn't allow imports at the end of a file, and a # noqa comment doesn't help. I can't move import signals to the top of models.py, because signals uses some models that are not yet defined at that point.
What can you suggest in this situation? Maybe there is a more appropriate way to connect signals?
Firstly, everything in PEP8 is a recommendation, not a hard-and-fast rule. If your code needs a certain structure, you should feel free to ignore the recommendation.
That said, importing signals at the end of your models file feels a bit strange. Rather, import both models and signals from a separate file that is itself imported at startup. The app's __init__.py file might be a good candidate, or you can use the new AppConfig functionality in 1.7.
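For example, with the 1.7 AppConfig hook, a minimal sketch (assuming the app is called myapp and its handlers live in myapp/signals.py) could look like this:

# myapp/apps.py
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # All models are registered by the time ready() runs,
        # so signals.py can import them safely.
        from . import signals  # noqa

# myapp/__init__.py
default_app_config = 'myapp.apps.MyAppConfig'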
While developing an application using PyQt5, I'm still trying to find the workflow that works best for me. At the moment I'm mostly working on the GUI (to demo to the client), leaving the back-end code for later, so I'm connecting the button signals to empty functions that I need to write later. Up until now, I've added print('#TODO: xxx') in each function that is empty. This gives me some feedback in the terminal as to how often certain functions are called and which to prioritize. This sort of works for me.
At the same time I am using the logging module as it is intended to be used. It seems like I could add a new logging level and use the logging module for this purpose as well, something like this:
def print_badge(self, pers: Person):
    # TODO: print_badge
    # print('#TODO: print badge')  # <- not needed anymore
    self.log.TODO('print badge')
The logging documentation seems to discourage creating your own logging levels, though. Is there a reason why I shouldn't do this, or a better option?
The reason why custom logging levels are discouraged in general is that people who configure logging have to take these levels into account - if multiple libraries did this with multiple levels, one might have to know all of the custom levels used by all of the libraries in use in order to configure logging a particular way. Of course the logging package allows you to do it for those cases where it might be really necessary - perhaps for an application and libraries which are self-contained and not public, so the question of configuring those libraries by someone else doesn't arise.
It seems that one could easily use the DEBUG level for your use case - just include TODO in the message itself, and one can grep logs for that just as easily as a custom level. Or you can have a logger called TODO to which you log these messages, and handle those in a separate handler/destination (e.g. a file todo.log).
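A minimal sketch of the second suggestion; the TODO logger name and the todo.log file are just illustrative choices:

import logging

# A dedicated logger for TODO markers, routed to its own file.
todo_log = logging.getLogger('TODO')
todo_log.setLevel(logging.DEBUG)
todo_log.addHandler(logging.FileHandler('todo.log'))

todo_log.debug('print badge')   # grep todo.log later to prioritize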
I've noticed a few Python packages that use config files written in Python. Apart from the obvious privilege escalation, what are the pros and cons of this approach?
Is there much precedent for this? Are there any guides on the best way to implement this?
Just to clarify: In my particular use case, this will only be used by programmers or people who know what they're doing. It's not a config file in a piece of software that will be distributed to end users.
The best example I can think of for this is the django settings.py file, but I'm sure there are tons of other examples for using a Python file for configuration.
There are a couple of key advantages to using Python as a config file over other solutions, for example:
There is no need to parse the file: Since the file is already Python, you don't have to write or import a parser to extract the key value pairs from the file.
Configuration settings can be more than just key/values: While it would be folly to have settings define their own classes, you can use them to define tuples, lists or dictionaries of settings, allowing for more options and configuration than other formats. This is especially true with django, where the settings file has to accommodate all manner of plug-ins that weren't originally known by the framework designers.
Writing configuration files is easy: This may sound spurious, but since the configuration is a Python file it can be edited and debugged within the same IDE as the program itself.
Implicit error-checking: If your program requires an option called FILE_NAME and that isn't in the settings, the program will throw an exception. This means that settings become mandatory and the error handling of settings can be more explicit. This can be a double-edged sword, but manually changing config files should be left to power users who can handle the consequences of exceptions.
Config options are easily accessed and namespaced: Once you import settings you can freely start calling settings.UI_COLOR or settings.TIMEOUT. These are clear, and with the right IDE, tracking where these settings are made becomes easier than with flat files.
But the most powerful reason: Overrides, overrides, overrides. This is quite an advanced situation and can be use-case specific, but one that is encouraged by django in a few places.
Picture that you are building a web application where there is a development and a production server. Each of these needs its own settings, but 90% of them are the same. In that case you can define a config file that covers all of development, make it (since it's safer) the default settings, and then override for production, like so:
PORT = 8080
HOSTNAME = "dev.example.com"
COLOR = "0000FF"
if SITE_IS_LIVE:
    from production_settings import *
Doing a from production_settings import * will cause any settings that have been declared in production_settings.py to override the declarations in the settings file.
I've not seen a best-practice guideline or PEP document that covers how to do this, but if you want some general guidelines, the django settings.py is a good example to follow.
Use consistent variable names, preferably UPPER CASE as they are understood to be settings or constants.
Expect odd data structures: if you are using Python as the configuration language, be prepared to handle all of the basic data types.
Don't try to make an interface for changing settings that isn't a simple text editor.
When shouldn't you use this approach? When you are dealing with simple key/value pairs that need to be changed by novice users. Python configs are a power-user option only. Novice users will forget to close quotes or lists, will not be consistent, will delete options they think don't apply, and will commit the unholiest of unholies and mix tabs and spaces. Because you are essentially dealing with code rather than config files, all of these will break your program. On the other side, writing a tool that parses through a Python file to find the appropriate options and update them is probably more trouble than it is worth, and you'd be better off reusing an existing module like ConfigParser.
I think Python code gets used directly for configuration mostly because it's just such an easy, quick, powerful and flexible way to get it done. There is currently no other tool in the Python ecosystem that provides all these benefits together. The ConfigParserShootout can give you enough reasons why it may be better to roll Python code as config.
There are some security considerations that can be worked around either by defensive code evaluation or by policies such as properly setting the filesystem permissions in deployment.
I've seen so much struggle with rather complex configuration being done in various formats, using various parsers, but in the end being the easiest when done in code.
The only real downside I came upon is that people managing the configuration have to be somewhat aware of Python, at least the syntax, to be able to do anything and not break anything. It may or may not matter, case by case.
Also the fact that some serious projects, such as Django and Sphinx, are using this very approach should be soothing enough:
https://docs.djangoproject.com/en/dev/topics/settings/
http://sphinx-doc.org/config.html
There are many options for writing configuration files, with well-written parsers:
ini
json
yaml
xml
csv
there's no good reason to have any kind of configuration parsed as a Python script directly. That could lead to many kinds of problems, from security issues to hard-to-debug errors that can be raised late in the life of the running program.
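For comparison, a plain-data format like json can be loaded in a couple of lines with no code execution at all (settings.json is just an example name):

import json

# Nothing in the file can be executed; it is only data.
with open('settings.json') as f:
    config = json.load(f)

print(config['hostname'])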
There are even discussions about building an alternative to setup.py for Python packages, which is pretty close to a Python-source-code-based configuration from a Python coder's point of view.
Otherwise, you may just have seen Python objects exported as strings, which look a bit like json, though a little more flexible… That is perfectly fine as long as you don't eval()/exec() them or even import them, but instead pass them through a parser like ast.literal_eval, so you can make sure you only load static data, not executable code.
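A minimal sketch of that idea, assuming the file holds a single Python literal (the config.txt name is just illustrative):

import ast

# The file must contain one Python literal, e.g.:
# {'port': 8080, 'hostname': 'dev.example.com'}
with open('config.txt') as f:
    config = ast.literal_eval(f.read())

print(config['port'])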
The only case where I'd understand having something close to a config file written in Python is a module, included in a library, that defines constants used by that library and meant to be edited by the library's user. I'm not even sure that would be a good design decision, but I'd understand such a thing.
edit:
I wouldn't consider django's settings.py an example of good practice, though I do consider it a configuration file for coding-literate users, which works fine because django is aimed at being used mostly by coders and sysadmins. Also, django offers a way of doing configuration through a webpage.
To take @lego's arguments:
There is no need to parse the file
there's no need to explicitly parse it, but the cost of parsing is anecdotal anyway, even more so given the extra safety and the ability to detect problems early on
Configuration settings can be more than just key/values
ini files aside, you can define almost any fundamental Python type using json/yaml or xml. And you don't want to define classes or instantiate complex objects in a configuration file…
Writing configuration files is easy:
but with a good editor, json/yaml or even xml syntax can be checked and verified, so you get a perfectly parsable file.
Implicit error-checking:
not an argument either; as you say, it's double-edged: you can have something that parses fine but raises an exception after many hours of running.
Config options are easily accessed and namespaces:
using json/yaml or xml, options can easily be namespaced, and used as python objects naturally.
But the most powerful reason: Overrides, overrides, overrides
It's not a good argument in favor of Python code either. Say your code is made of several interdependent modules that share a common configuration file, and each of them also has its own configuration: then it's pretty easy to load the main configuration file first as a good old Python dictionary, and have the other configuration files simply update that dictionary.
If you want to track changes, then there are many recipes for organizing a hierarchy of dicts that falls back to another dict when it does not contain the value.
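One such recipe ships with the standard library as collections.ChainMap; here is a minimal sketch, with the file names purely illustrative:

import json
from collections import ChainMap

with open('defaults.json') as f:
    defaults = json.load(f)
with open('production.json') as f:
    overrides = json.load(f)

# Lookups consult overrides first and fall back to defaults.
config = ChainMap(overrides, defaults)
print(config['PORT'])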
And finally, configuration values changed at runtime can't be (actually shouldn't be) serialized in python correctly, as doing so would mean changing the currently running program.
I'm not saying you shouldn't use python to store configuration variables, I'm just saying that whatever syntax you choose, you should get it through a parser before getting it as instances in your program. Never, ever load user modifiable content without double checking. Never trust your users!
If the django people are doing it, it's because they've built a framework that only makes sense when gathering many plugins together to build an application. And then, to configure the application, you're using a database (which is a kind of configuration file… on steroids), or actual files.
HTH
I've done this frequently in company-internal tools and games. The primary reason is simplicity: you just import the file and don't need to care about formats or parsers. Usually it has been exactly what @zmo said: constants meant for non-programmers on the team to modify (say, the size of the grid of the game level, or the display resolution).
Sometimes it has been useful to be able to have logic in the configuration. For example alternative functions that populate the initial configuration of the board in the game. I've found this a great advantage actually.
I acknowledge that this could lead to hard-to-debug problems. Perhaps in these cases those modules have been more like game level init modules than typical config files. Anyhow, I've been really happy with this straightforward way of making clear textual config files with the ability to have logic in them too, and I haven't been bitten by it.
This is yet another config file option. There are several quite adequate config file formats available.
Please take a moment to understand the system administrator's viewpoint or some 3rd party vendor supporting your product. If there is yet another config file format they might drop your product. If you have a product that is of monumental importance then people will go through the hassle of learning the syntax just to read your config file. (like X.org, or apache)
If you plan on another programming language accessing/writing the config file info then a python based config file would be a bad idea.
ConfigParser is the much debated vanilla configuration parser for Python.
However you can simply import config where config.py has python code which sets configuration parameters.
What are the pros/cons of these two approaches to configuration?
When should I choose each?
The biggest issue I see with import config is that you don't know what will happen when you import it. Yes, you will get a set of symbols that are naturally referenced using a . style interface. But the code in the configuration file can also do who-knows-what. Now, if you completely trust your users, then allowing them to do whatever they feel like in the config file is possibly a good thing. However, if you have unknown quantities, or you want to protect users from themselves, then having a configuration file in a more traditional format will be safer and more secure.
This completely depends on your needs and goals for the script. One way really isn't "better", just different. For a very detailed discussion of most of Python's config parsers (including ConfigParser and config modules), see:
Python Wiki - ConfigParserShootout
"import config" is very simple, flexible and powerfull but, since it can do anything, it might be dangerous if the config.py is not in a safe place.
IMO it comes down to a matter of personal style. Do you intend for 3rd parties to edit your config? If so, maybe it makes sense to have a more "natural" configuration style a la ConfigParser that is not as technical and that may not be too far over the heads of your target audience.
Many popular projects such as Fabric and Django use the "native" configuration style, which is essentially just a Python module. Fabric has fabfile.py and Django has settings.py.
Overall, you're going to have a lot more flexibility using a native approach of importing a module simply because you can do anything you want in that file, including defining functions, classes, etc. because it's just another Python module you're importing.
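To make the contrast concrete, here is a minimal sketch of both styles (file names are illustrative; on Python 2 the module is spelled ConfigParser, on Python 3 configparser):

# Style 1: an ini file read through configparser
import configparser

parser = configparser.ConfigParser()
parser.read('settings.ini')        # contains: [server] / timeout = 30
timeout = parser.getint('server', 'timeout')

# Style 2: a plain Python module
import config                      # config.py contains: TIMEOUT = 30
timeout = config.TIMEOUT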
My module is all in one big file that is getting hard to maintain. What is the standard way of breaking things up?
I have one module in a file my_module.py, which I import like this:
import my_module
"my_module" will soon be a thousand lines, which is pushing the limits of my ability to keep everything straight. I was thinking of adding files my_module_base.py, my_module_blah.py, etc. And then, replacing my_module.py with
from my_module_base import *
from my_module_blah import *
# etc.
Then, the user code does not need to change:
import my_module # still works...
Is this the standard pattern?
It depends on what your module is actually doing. It is usually a good idea to make your module a directory with an __init__.py file inside. So you would first transform your_module.py into something like your_module/__init__.py.
After that you continue according to your business logic. Here are some examples:
Do you have utility functions which are not directly used by the module's API? Put them in a file called utils.py.
Do you have classes dealing with the database or representing your database models? Put them in models.py.
Do you have some internal configuration? It might make sense to put it into an extra file called settings.py or config.py.
These are just examples (a little bit stolen from the Django approach of reusable apps ^^). As said, it depends a lot on what your module does. If it is still too big afterwards, it also makes sense to create submodules (as subdirectories with their own __init__.py).
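For instance, a layout along these lines (the re-exported names are hypothetical) keeps a caller's import your_module working unchanged:

your_module/
    __init__.py     # re-exports the public API
    utils.py
    models.py
    settings.py

# your_module/__init__.py
from .models import MyModel    # hypothetical public names
from .utils import my_helper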
I'm sure there are lots of opinions on this, but I'd say you should break it into more well-defined functional units (modules), contained in a package. Then you use:
from mypackage import modulex
Then use the package name to reference the object:
modulex.MyClass()
etc.
You should (almost) never use
from mypackage import *
since that can introduce bugs (duplicate names from different modules will end up clobbering one another).
No, that is not the standard pattern. from something import * is usually not good practice, as it will import lots of things you did not intend. Instead, follow the same approach as you did, but import the names explicitly from one module into another, e.g.:
If base.py has def myfunc, then in main.py use from base import myfunc, so that for your users main.myfunc works too. Of course, you need to take care that you don't end up creating a circular import.
Also, if you see that from something import * is required, then control the imported names using the __all__ construct.
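A minimal sketch of the __all__ approach, following the base.py/main.py naming above:

# base.py
__all__ = ['myfunc']        # only these names are exported by import *

def myfunc():
    return 42

def _internal_helper():     # stays private to base.py
    pass

# main.py
from base import *          # brings in myfunc only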
I need to perform some fairly simple tasks after my Django environment has been "fully loaded".
More specifically I need to do things like Signal.disconnect() some Django Signals that are setup by my third party library by default and connect my own Signals and I need to do some "monkey patching" to add convenience functions to some Django models from another library.
I've been doing this stuff in my Django app's __init__.py file, which seems to work fine for the monkey patching, but doesn't work for my Signal disconnecting. The problem appears to be one of timing--for whatever reason the Third Party Library always seems to call its Signal.connect() after I try to Signal.disconnect() it.
So two questions:
Does the order of my INSTALLED_APPS give me any guarantee about when my app's __init__.py module is loaded?
Is there a proper place to put logic that needs to run after Django apps have been fully loaded into memory?
In Django 1.7 Apps can implement the ready() method: https://docs.djangoproject.com/en/dev/ref/applications/#django.apps.AppConfig.ready
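A minimal sketch of that hook; the third-party module, model and receiver names here are hypothetical:

# myapp/apps.py
from django.apps import AppConfig
from django.db.models.signals import post_save

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # Every app's models are loaded by the time ready() runs,
        # so disconnecting the third-party receiver works reliably.
        from thirdparty.models import Thing              # hypothetical
        from thirdparty.handlers import their_receiver   # hypothetical
        from .handlers import my_receiver                # hypothetical
        post_save.disconnect(their_receiver, sender=Thing)
        post_save.connect(my_receiver, sender=Thing)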
My question is a more poorly phrased duplicate of this question: Where To Put Django Startup Code. The answer comes from that question:
Write middleware that does this in __init__ and afterwards raise django.core.exceptions.MiddlewareNotUsed from the __init__; django will remove the middleware for all requests...
See the Django documentation on writing your own middleware.
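A minimal sketch of that trick in the old-style middleware form this era of Django used; the startup function is a placeholder for your own logic:

# myapp/middleware.py
from django.core.exceptions import MiddlewareNotUsed

class StartupMiddleware(object):
    def __init__(self):
        # Runs once, on the first request, after all apps are imported.
        run_startup_code()          # hypothetical placeholder
        raise MiddlewareNotUsed     # django then drops this middleware

The class still has to be listed in MIDDLEWARE_CLASSES for its __init__ to run.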
I had to do the following monkey patching. I use django 1.5 from a github branch. I don't know if that's the proper way to do it, but it works for me.
I couldn't use middleware, because I also wanted the manage.py scripts to be affected.
Anyway, here's the rather simple patch:
import django.dispatch
from django.db.models.loading import AppCache

django_apps_loaded = django.dispatch.Signal()

def populate_with_signal(cls):
    ret = cls._populate_orig()
    if cls.app_cache_ready():
        if not hasattr(cls, '__signal_sent'):
            cls.__signal_sent = True
            django_apps_loaded.send(sender=None)
    return ret

if not hasattr(AppCache, '_populate_orig'):
    AppCache._populate_orig = AppCache._populate
    AppCache._populate = populate_with_signal
and then you could use this signal like any other:
def django_apps_loaded_receiver(sender, *args, **kwargs):
    # put your code here.
django_apps_loaded.connect(django_apps_loaded_receiver)
As far as I know there's no such thing as "fully loaded". Plenty of Django functions include import something right in the function. Those imports will only happen if you actually invoke that function. The only way to do what you want would be to explicitly import the things you want to patch (which you should be able to do anywhere) and then patch them. Thereafter any other imports will re-use them.
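A minimal sketch of that import-then-patch approach; the third-party names are hypothetical:

# Run this somewhere that executes early, e.g. your app's __init__.py.
from thirdparty import models as tp_models    # hypothetical module

def convenience_method(self):
    # Extra behaviour bolted onto the third-party model.
    return self.name.upper()

tp_models.SomeModel.convenience_method = convenience_method
# Any later import of thirdparty.models sees the patched class.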