I am writing a Python program with many (roughly 30-40) parameters, all of which have default values, and all of which should be adjustable at run time by the user. The way I set it up, these parameters are grouped into 4 dictionaries, corresponding to 4 different modules of the program. However, I have encountered a few cases of a single parameter required by more than one of these modules, leading me to consider just unifying the dictionaries into one big config dictionary, or perhaps even one config object, passed to each module.
My questions are
Would this have any effect on run time? I suspect not, but want to be sure.
Is this considered good practice? Is there some other solution to the problem I have described?
Probably no effect on run time. Larger dictionaries could take longer to look things up in, but in your case we are talking about 40 items. That's nothing.
We use a single settings file in which we initialize globals by calling a method that reads the config from the environment, a file or a Python file (as globals). The method that reads the config can be given the desired type and default value. Others use YAML or TOML for representing configuration and, I'm guessing, then store the values in a globally accessible object. If your settings can be changed at run time, you have to protect this object in terms of thread-safety (if you have threads, of course).
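A minimal sketch of the kind of reader method described above, assuming the settings come from environment variables. The names get_setting, parse_bool and the APP_* variables are made up for illustration and are not from any particular library:

```python
import os


def get_setting(name, cast=str, default=None, environ=None):
    """Return setting `name`, converted with `cast`, or `default` if unset.

    `environ` defaults to os.environ; passing a plain dict makes testing easy.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return cast(raw)


def parse_bool(text):
    """Interpret common truthy spellings; everything else is False."""
    return text.strip().lower() in {"1", "true", "yes", "on"}


# Hypothetical environment, used here instead of the real os.environ.
fake_env = {"APP_TIMEOUT": "30", "APP_DEBUG": "yes"}

TIMEOUT = get_setting("APP_TIMEOUT", int, 10, fake_env)
DEBUG = get_setting("APP_DEBUG", parse_bool, False, fake_env)
RETRIES = get_setting("APP_RETRIES", int, 3, fake_env)  # unset, so the default is used
```

Every module can then import these module-level names from the one settings file, which also takes care of parameters shared by several modules.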
I've noticed a few Python packages that use config files written in Python. Apart from the obvious privilege escalation, what are the pros and cons of this approach?
Is there much of a precedent for this? Are there any guides as to the best way to implement this?
Just to clarify: In my particular use case, this will only be used by programmers or people who know what they're doing. It's not a config file in a piece of software that will be distributed to end users.
The best example I can think of for this is the django settings.py file, but I'm sure there are tons of other examples for using a Python file for configuration.
There are a couple of key advantages for using Python as config file over other solutions, for example:
There is no need to parse the file: Since the file is already Python, you don't have to write or import a parser to extract the key value pairs from the file.
Configuration settings can be more than just key/values: While it would be folly to have settings define their own classes, you can use them to define tuples, lists or dictionaries of settings allowing for more options and configuration than other options. This is especially true with django, where the settings file has to accommodate for all manner of plug-ins that weren't originally known by the framework designers.
Writing configuration files is easy: This is spurious, but since the configuration is a Python file it can be edited and debugged within the IDE of the program itself.
Implicit error-checking: If your program requires an option called FILE_NAME and that isn't in the settings the program will throw an exception. This means that settings become mandatory and error handling of the settings can be more explicit. This can be a double edged sword, but manually changing config files should be for power editors who should be able to handle the consequences of exceptions.
Config options are easily accessed and namespaces: Once you go import settings you can wildly start calling settings.UI_COLOR or settings.TIMEOUT. These are clear, and with the right IDE, tracking where these settings are made becomes easier than with flat files.
But the most powerful reason: Overrides, overrides, overrides. This is quite an advanced situation and can be use-case specific, but one that is encouraged by django in a few places.
Picture that you are building a web application, where there is a development and a production server. Each of these needs its own settings, but 90% of them are the same. In that case you can do things like define a config file that covers all of development and make it (if it's safer) the default settings, and then override if it's production, like so:
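PORT = 8080
HOSTNAME = "dev.example.com"
COLOR = "0000FF"
# SITE_IS_LIVE is assumed to be defined earlier in this file.
if SITE_IS_LIVE:
    # Names defined in production_settings.py replace the defaults above.
    from production_settings import *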
Doing a from production_settings import * will cause any settings that have been declared in the production_settings.py file to override the declarations in the settings file.
I've not seen a best practise guideline or PEP document that covers how to do this, but if you wanted some general guidelines, the django settings.py is a good example to follow.
Use consistent variable names, preferably UPPER CASE as they are understood to be settings or constants.
Expect odd data structures, if you are using Python as the configuration language, then try to handle all basic data types.
Don't try and make an interface to change settings, that isn't a simple text editor.
When shouldn't you use this approach? When you are dealing with simple key/value pairs that need to be changed by novice users. Python configs are a power user option only. Novice users will forget to end quotes or lists, not be consistent, will delete options they think don't apply and will commit the unholiest of unholies and mix tabs and spaces. Because you are essentially dealing with code, not config files, all of these will break your program. On the other side, writing a tool that would parse through a Python file to find the appropriate options and update them is probably more trouble than it is worth, and you'd be better off reusing an existing module like ConfigParser.
I think Python code gets directly used for configuration mostly because it's such an easy, quick, powerful and flexible way to get it done. There is currently no other tool in the Python ecosystem that provides all these benefits together. The ConfigParserShootout can give you enough reasons why it may be better to roll Python code as config.
There are some security considerations that can be worked around either by defensive code evaluation or by policies such as properly setting the filesystem permissions in deployment.
I've seen so much struggle with rather complex configuration being done in various formats, using various parsers, but in the end being the easiest when done in code.
The only real downside I came upon is that people managing the configuration have to be somewhat aware of Python, at least the syntax, to be able to do anything and not break anything. This may or may not matter, case by case.
Also the fact that some serious projects, such as Django and Sphinx, are using this very approach should be soothing enough:
https://docs.djangoproject.com/en/dev/topics/settings/
http://sphinx-doc.org/config.html
There are many options for writing configuration files, with well written parsers:
ini
json
yaml
xml
csv
There's no good reason to have any kind of configuration be parsed as a Python script directly. That could lead to many kinds of problems, from the security aspects to hard-to-debug errors that could be raised late in the run of the program.
There's even discussions to build an alternative to the setup.py for python packages, which is pretty close to a python source code based configuration from a python coder's point of view.
Otherwise, you may just have seen python objects exported as strings, that looks a bit like json, though a little more flexible… Which is then perfectly fine as long as you don't eval()/exec() them or even import them, but pass it through a parser, like 'ast.literal_eval' or parsing, so you can make sure you only load static data not executable code.
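As a hedged illustration of the point about ast.literal_eval: it accepts only Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, None) and raises instead of executing code. The helper names load_literal_config and is_rejected are hypothetical:

```python
import ast


def load_literal_config(text):
    """Parse `text` as a Python dict literal without executing any code."""
    value = ast.literal_eval(text)
    if not isinstance(value, dict):
        raise ValueError("configuration must be a dict literal")
    return value


def is_rejected(text):
    """Return True if `text` is refused as a configuration."""
    try:
        load_literal_config(text)
    except (ValueError, SyntaxError):
        return True
    return False


config = load_literal_config("{'PORT': 8080, 'HOSTS': ['a', 'b'], 'DEBUG': False}")

# A function call is not a literal, so this is refused rather than run.
refused = is_rejected("__import__('os').getcwd()")
```

Static data loads normally, while anything that would need evaluation (calls, names, attribute access) is rejected up front.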
The only few times I'd understand having something close to a config file written in python, is a module included in a library that defines constants used by that library designed to be handled by the user of the library. I'm not even sure that would be a good design decision, but I'd understand such a thing.
edit:
I wouldn't consider django's settings.py an example of good practice, though I do consider it part of what I'd call a configuration file for coding-literate users; it works fine because django is aimed at being used mostly by coders and sysadmins. Also, django offers a way of configuration through a webpage.
To take #lego's arguments:
There is no need to parse the file
there's no need to explicitly parse it, though the cost of parsing is negligible, even more so given the extra safety and the ability to detect problems early on
Configuration settings can be more than just key/values
ini files apart, you can define almost any fundamental python type using json/yaml or xml. And you don't want to define classes, or instantiate complex objects in a configuration file…
Writing configuration files is easy:
but using a good editor, json/yaml or even xml syntax can be checked and verified, to have a perfectly parsable file.
Implicit error-checking:
not an argument either; as you say, it's double-edged: you can have something that parses fine, but causes an exception after many hours of running.
Config options are easily accessed and namespaces:
using json/yaml or xml, options can easily be namespaced, and used as python objects naturally.
But the most powerful reason: Overrides, overrides, overrides
It's not a good argument in favor of python code either. Suppose your code is made of several modules that are interdependent and use a common configuration file, and each of them has its own configuration; then it's pretty easy to first load the main configuration file as a good old python dictionary, and then load the other configuration files by updating that dictionary.
If you want to track changes, then there are many recipes to organize a hierarchy of dicts that falls back to another dict if it does not contain the value.
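A small sketch of both ideas, assuming the configuration files have already been parsed into plain dicts (the values reuse the example settings from the earlier answer; the production hostname is made up). collections.ChainMap from the standard library is one ready-made "dict that falls back to another dict":

```python
from collections import ChainMap

# Assumed to have been loaded from the main and the override config files.
defaults = {"PORT": 8080, "HOSTNAME": "dev.example.com", "COLOR": "0000FF"}
production = {"HOSTNAME": "www.example.com"}  # hypothetical override

# Option 1: flatten by updating a copy of the main dictionary.
merged = dict(defaults)
merged.update(production)

# Option 2: keep the layers separate; lookups try `production` first and
# fall back to `defaults`, so it stays visible which layer set a value.
layered = ChainMap(production, defaults)
```

With the layered version, layered.maps[0] holds only the overridden keys, which makes it easy to see what differs from the defaults.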
And finally, configuration values changed at runtime can't be (actually shouldn't be) serialized in python correctly, as doing so would mean changing the currently running program.
I'm not saying you shouldn't use python to store configuration variables, I'm just saying that whatever syntax you choose, you should get it through a parser before getting it as instances in your program. Never, ever load user modifiable content without double checking. Never trust your users!
If the django people are doing it, it's because they've built a framework that only makes sense when gathering many plugins together to build an application. And then, to configure the application, you're using a database (which is a kind of configuration file… on steroids), or actual files.
HTH
I've done this frequently in company internal tools and games. Primary reason being simplicity: you just import the file and don't need to care about formats or parsers. Usually it has been exactly what #zmo said, constants meant for non programmers in the team to modify (say the size of the grid of the game level. or the display resolution).
Sometimes it has been useful to be able to have logic in the configuration. For example alternative functions that populate the initial configuration of the board in the game. I've found this a great advantage actually.
I acknowledge that this could lead to hard to debug problems. Perhaps in these cases those modules have been more like game level init modules than typical config files. Anyhow I've been really happy about the straightforward way to make clear textual config files with the ability to have logic there too and haven't gotten bit by it.
This is yet another config file option. There are several quite adequate config file formats available.
Please take a moment to understand the system administrator's viewpoint or some 3rd party vendor supporting your product. If there is yet another config file format they might drop your product. If you have a product that is of monumental importance then people will go through the hassle of learning the syntax just to read your config file. (like X.org, or apache)
If you plan on another programming language accessing/writing the config file info then a python based config file would be a bad idea.
Does anyone know how pydev determines what to use for code completion? I'm trying to define a set of classes specifically to enable code completion. I've tried using __new__ to set __dict__ and also __slots__, but neither seems to get listed in pydev autocomplete.
I've got a set of enums I want to list in autocomplete, but I'd like to set them in a generator, not hardcode them all for each class.
So rather than
class TypeA(object):
    ValOk = 1
    ValSomethingSpecificToThisClassWentWrong = 4
    def __call__(self):
        return 42
I'd like to do something like
def TYPE_GEN(name, val, enums={}):
    def call(self):
        return val
    dct = {}
    dct["__call__"] = call
    dct['__slots__'] = enums.keys()
    for k, v in enums.items():
        dct[k] = v
    return type(name, (), dct)

TypeA = TYPE_GEN("TypeA",42,{"ValOk":1,"ValSomethingSpecificToThisClassWentWrong":4})
What can I do to help the processing out?
edit:
The comments seem to be about questioning what I am doing. Again, a big part of what I'm after is code completion. I'm using python binding to a protocol to talk to various microcontrollers. Each parameter I can change (there are hundreds) has a name conceptually, but over the protocol I need to use its ID, which is effectively random. Many of the parameters accept values that are conceptually named, but are again represented by integers. Thus the enum.
I'm trying to autogenerate a python module for the library, so the group can specify what they want to change using the names instead of the error prone numbers. The __call__ property will return the id of the parameter, the enums are the allowable values for the parameter.
Yes, I can generate the verbose version of each class. One line for each type seemed clearer to me, since the point is autocomplete, not viewing these classes.
Ok, as pointed, your code is too dynamic for this... PyDev will only analyze your own code statically (i.e.: code that lives inside your project).
Still, there are some alternatives there:
Option 1:
You can force PyDev to analyze code that's in your library (i.e.: in site-packages) dynamically, in which case it could get that information dynamically through a shell.
To do that, you'd have to create a module in site-packages and in your interpreter configuration you'd need to add it to the 'forced builtins'. See: http://pydev.org/manual_101_interpreter.html for details on that.
Option 2:
Another option would be putting it into your predefined completions (but in this case it also needs to be in the interpreter configuration, not in your code -- and you'd have to make the completions explicit there anyways). See the link above for how to do this too.
Option 3:
Generate the actual code. I believe that Cog (http://nedbatchelder.com/code/cog/) is the best alternative for this, as you can write python code to output the contents of the file and you can later change the code/rerun cog to update what's needed. If you want proper completions without having to set your code up as if it were a library in PyDev, I believe that'd be the best alternative, and you'd be able to grasp better what you have, as your structure would be explicit there.
Note that cog also works if you're in other languages such as Java/C++, etc. So, it's something I'd recommend adding to your tool set regardless of this particular issue.
Fully general code completion for Python isn't actually possible in an "offline" editor (as opposed to in an interactive Python shell).
The reason is that Python is too dynamic; basically anything can change at any time. If I type TypeA.Val and ask for completions, the system has to know what object TypeA is bound to, what its class is, and what the attributes of both are. All 3 of those facts can change (and do; TypeA starts undefined and is only bound to an object at some specific point during program execution).
So the system would have to know at what point in the program run you want the completions from. And even if there were some unambiguous way of specifying that, there's no general way to know what the state of everything in the program is like at that point without actually running it to that point, which you probably don't want your editor to do!
So what pydev does instead is guess, when it's pretty obvious. If you have a class block in a module foo defining class Bar, then it's a safe bet that the name Bar imported from foo is going to refer to that class. And so you know something about what names are accessible under Bar., or on an object created by obj = Bar(). Sure, the program could be rebinding foo.Bar (or altering its set of attributes) at runtime, or could be run in an environment where import foo is hitting some other file. But that sort of thing happens rarely, and the completions are useful in the common case.
What that means though is that you basically lose completions whenever you use "too much" of Python's dynamic language flexibility. Defining a class by calling a function is one of those cases. It's not ready to guess that TypeA has names ValOk and ValSomethingSpecificToThisClassWentWrong; after all, there's presumably lots of other objects that result from calls to TYPE_GEN, but they all have different names.
So if your main goal is to have completions, I think you'll have to make it easy for pydev and write these classes out in full. Of course, you could use similar code to generate the python files (textually) if you wanted. It looks though like there's actually more "syntactic overhead" of defining these with dictionaries than as a class, though; you're writing "a": b, per item rather than a = b. Unless you can generate these more systematically or parse existing definition files or something, I think I'd find the static class definition easier to read and write than the dictionary driving TYPE_GEN.
The simpler your code, the more likely completion is to work. Would it be reasonable to have this as a separate tool that generates Python code files containing the class definitions like you have above? This would essentially be the best of both worlds. You could even put the name/value pairs in a JSON or INI file or what have you, eliminating the clutter of the methods call among the name/value pairs. The only downside is needing to run the tool to regenerate the code files when the codes change, but at least that's an automated, simple process.
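A rough sketch of such a generator tool, assuming the name/value pairs live in JSON. The JSON layout and the function name render_class are invented for illustration; the output mirrors the verbose class from the question:

```python
import json


def render_class(name, param_id, enums):
    """Return Python source text for one explicit, statically analyzable class."""
    lines = [f"class {name}(object):"]
    for enum_name, value in enums.items():
        lines.append(f"    {enum_name} = {value!r}")
    lines.append("")
    lines.append("    def __call__(self):")
    lines.append(f"        return {param_id!r}")
    return "\n".join(lines) + "\n"


# Hypothetical definition file content; in practice this would be read from disk.
spec = json.loads(
    '{"TypeA": {"id": 42, "enums": '
    '{"ValOk": 1, "ValSomethingSpecificToThisClassWentWrong": 4}}}'
)

source = "\n\n".join(
    render_class(name, entry["id"], entry["enums"]) for name, entry in spec.items()
)
# `source` would then be written to a .py file inside the project.
```

Because the generated file contains ordinary class blocks, a static analyzer sees ValOk and the other names directly; the generator only needs to be rerun when the definitions change.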
Personally, I would just go with making things more verbose and writing out the classes manually, but that's just my opinion.
On a side note, I don't see much benefit in making the classes callable vs. just having an id class variable. Both require knowing what to type: TypeA() vs TypeA.id. If you want to prevent instantiation, I think throwing an exception in __init__ would be a bit more clear about your intentions.
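For illustration, the id-as-class-variable variant with instantiation blocked could look like this (a sketch only; the choice of TypeError and the message are arbitrary):

```python
class TypeA(object):
    id = 42  # protocol ID of the parameter, read as TypeA.id
    ValOk = 1
    ValSomethingSpecificToThisClassWentWrong = 4

    def __init__(self):
        # The class is only a namespace for constants.
        raise TypeError("TypeA is a namespace of constants; do not instantiate it")
```

Everything is plain class-level assignment, so it stays visible to static completion.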
I have a bunch of Objects from the same Class in Python.
I've decided to put each object in a different file since it's
easier to manage them (If I plan to add more objects or edit them individually)
However, I'm not sure how to run through all of them, they are in another Package
So if I look at Netbeans I have TopLevel... and there's also a Package named Shapes
in Shapes I have Ball.py, Circle.py, Triangle.py (inside the files is a call for a constructor with the details of the specific shape) and they are all from class GraphicalShape
That is configured in GraphicalShape.py in the TopLevel Package.
Now, I have also on my Toplevel Package a file named newpythonproject.py, which would start the
process of calling each shape and doing things with it, how do I run through all of the shapes?
also: Is it a good way to do this?
p.s. never mind the uppercase lowercase stuff...
Just to clarify, I added a picture of the Project Tree
http://i47.tinypic.com/2i1nomw.png
It seems that you're misunderstanding the Python jargon. The Python term "object" means an actual run-time instance of a class. As far as I can tell, you have "sub-classes" of the Shape class called ball, circle and triangle. Note that a sub-class is also a class. You are keeping the code for each such sub-class in a separate file, which is fine.
I think you're getting mixed up because you're focusing on the file layout of your project far too early. With Python it is often easier to start with just one file, writing everything you need in that file (functions, classes, etc.). Just get things working first. Later, when you've got working code and you just want to split a part of it into another file for organizational reasons, it will be much more obvious (to you!) how this should be done.
In Python, every class does not have to be defined in its own separate file. You can do this if you like, but it is not compulsory.
it's not clear what you mean when you say "run through them all".
If you mean "import them for use", then you should:
Make sure the parent folder of shapes is on the PYTHONPATH environment variable; then use
from shapes import ball.
I cannot understand it. Very simple, and obvious functionality:
You have code in any programming language, and you run it. In this code you generate variables, then you save them (the values, names, namely everything) to a file, with one command. Once it's saved, you can open such a file in your code, also with a simple command.
It works perfectly in Matlab (save Workspace, load Workspace) - in Python there's some weird "pickle" protocol, which produces errors all the time, while all I want to do is save a variable and load it again in another session (?????)
E.g. you cannot save a class with variables (in Matlab there's no problem).
You cannot load arrays in cPickle (but you can save them (?????)).
Why not make it easier?
Is there a way to save the current variables with values, and then load them?
What you are describing is a feature of the Matlab environment, not of a programming language.
What you need is a way to store serialized state of some object which could be easily done in almost any programming language. In python world pickle is the easiest way to achieve it and if you could provide more details about the errors it produces for you people would probably be able to give you more details on that.
In general, for object-oriented languages (including Python) it is always a good approach to encapsulate your state in a single object that can be serialized and de-serialized, and then store/load an instance of such a class. Pickling and unpickling of such objects works perfectly for many developers, so this must be something specific to your implementation.
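A minimal sketch of that approach, using a plain dict as the single state object and a temporary file (the names save_state, load_state and the sample values are made up). An instance of your own class works the same way, provided the class can be imported when the file is loaded:

```python
import os
import pickle
import tempfile


def save_state(state, path):
    """Serialize the whole state object to `path` in one call."""
    with open(path, "wb") as fh:
        pickle.dump(state, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_state(path):
    """Load a state object previously written by save_state."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Everything worth keeping between sessions lives in one object.
state = {"alpha": 0.5, "samples": [1, 2, 3], "label": "run-1"}

path = os.path.join(tempfile.mkdtemp(), "workspace.pkl")
save_state(state, path)
restored = load_state(path)
```

Note that pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.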
Since you're talking about Matlab, you probably want to try out IPython, which is a shell for Python offering much more functionality than the standard interpreter shell you get when executing Python.
Among this functionality is the ability to load/save workspace sessions, create macros out of session input etc., which is probably more like what you are used to in Matlab (I actually use both and find IPython to be much more elegant, but YMMV):
http://ipython.scipy.org
PiCloud has implemented a fancier pickle, but I can't find the code. I saw a poster session.
Generally in Python, instantiated objects don't have any one way to recreate them, and in some cases it's particularly difficult (like an open file), as it takes several steps to recreate them.
I take issue with the statement that the saving of variables in Matlab is an environment function. The "save" statement in Matlab is a function and part of the Matlab language, not just a command. It is a very useful function, as you don't have to worry about the trivial minutiae of file I/O, and it handles all sorts of variables: scalars, matrices, objects, structures.
It's an old thread, but thought I should throw it out there anyway - Spyder the Scientific Python development environment allows you to do just this through the Variable explorer. There's a button there Save data that packs your whole workspace up in a .spydata file that you can later reload. Works like a charm when you're switching between projects!