I am working on a Pylons app that runs on top of Apache with mod_wsgi. I would like to send logging messages that my app generates to files in my app's directory, instead of to Apache's logs. Further, I would like to specify the location of logfiles via a relative path so that it'll be easier to deploy my app on other people's servers. Right now I can log to files, but only via a fragile absolute path.
Here is the relevant part of my development.ini file:
# Logging configuration
[loggers]
keys = root, routes, myapp, sqlalchemy, debugging-logger
[handlers]
keys = console, debugging-logger-file
[formatters]
keys = generic
[logger_debugging-logger]
level = DEBUG
handlers = debugging-logger-file
qualname = myapp.controllers.logging-test-controller.debugging-logger
[handler_debugging-logger-file]
class = FileHandler
args = ('/var/pylons/myapp/logs/myapp-debugging-errors.log', 'a')
level = DEBUG
formatter = generic
Although the .ini helpfully advises using %(here)s to refer to the current path, using %(here)s in the "args = ('foo')" line of the error handler does not behave the way that I expect it to. The syntax of this ini file is documented on the Paste Deploy site, but that documentation does not specify how %(here)s can be used in relation to quoted strings.
What syntax should I use in the "args = ('foo')" line to specify the current path?
The problem is that Paste Deploy creates one ConfigParser object to store the 'here' tag in its set of defaults, and logging.config.fileConfig() is never passed that set of defaults. Therefore, when fileConfig() reads the .ini file, it doesn't have access to the 'here' tag, and the ConfigParser's interpolation can't find it.
You could do something like this:
[DEFAULT]
my_log_dir = '/var/pylons/myapp/logs'
...
[handler_debugging-logger-file]
args = (%(my_log_dir)s + '/myapp-debugging-errors.log', 'a')
Not exactly what you're looking for, but a tiny bit more configurable.
Another possibility is:
args = (os.getcwd() + '/myapp-debugging-errors.log', 'a')
(This works because 'os' is a valid variable in the logging module's namespace when it calls eval() on the args value. But this is an implementation detail of the logging package that may not be reliable long term.) However, this most likely won't give you what you want: it will probably use the Apache process's working directory.
You could even set an environment variable outside the program, and use it like:
args = (os.environ['MY_LOG_DIR'] + '/myapp-debugging-errors.log', 'a')
And yet another possibility is overriding the behavior of some of the functions or class methods in the logging module or paste package.
Hope those give you some ideas.
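If you control the call to fileConfig() yourself (for example in the .wsgi script, which is an assumption about your setup), the root cause described above can be sidestepped by passing 'here' in explicitly. A minimal sketch follows; the helper name configure_logging and the stripped-down ini text are made up for illustration, and a path containing backslashes or quotes would need extra care because the args value is evaluated as Python:

```python
import logging
import logging.config
import os

# Illustrative ini content; only the handler's args line matters here.
INI_TEXT = """\
[loggers]
keys = root

[handlers]
keys = logfile

[formatters]
keys = generic

[logger_root]
level = DEBUG
handlers = logfile

[handler_logfile]
class = FileHandler
args = ('%(here)s/myapp-debugging-errors.log', 'a')
level = DEBUG
formatter = generic

[formatter_generic]
format = %(levelname)s %(message)s
"""


def configure_logging(ini_path):
    """Hypothetical helper: load logging config with 'here' defined."""
    here = os.path.dirname(os.path.abspath(ini_path))
    # 'defaults' is handed to the ConfigParser, so %(here)s can be interpolated.
    logging.config.fileConfig(
        ini_path, defaults={"here": here}, disable_existing_loggers=False
    )
    return here
```

With this, the log file ends up next to the ini file regardless of the process's working directory.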
Configuration files for Paste Deploy allow a 'here' tag that indicates the directory where the configuration file is located. You can then work relative to that.
For example, I might try the following config:
from enum import Enum

class Defaults(Enum):
    a = 1
    b = 2
Then from my main file, I can refer to it with:
import myconfig
windowSize = myconfig.Defaults.a.value  # .value is the plain 1; Defaults.a is the Enum member
This would allow me to change the enum values whenever I want to vary how my program runs. Is this a common way to use Enums in python configuration?
I think you're asking whether it's common to hold the configuration settings as members of an enumeration. As a more explicit example:
class Defaults(Enum):
    window_width = 600
    window_height = 480
    font_size = 14
Technically, I think that would work, but what benefit is the enumeration providing? Enum is useful for providing choices to pick from. If you really want to do this, I think a plain class, a data class, or just module-level variables would be less confusing. Django's settings.py configuration file seems to be the closest thing to your idea that I've seen in common use.
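As a rough illustration of the data class alternative mentioned above (this needs Python 3.7 or later, and the field names are simply carried over from the enumeration example), the defaults could look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Defaults:
    """Settings with default values; frozen so they are not changed by accident."""
    window_width: int = 600
    window_height: int = 480
    font_size: int = 14


# Use the defaults as they are...
defaults = Defaults()
# ...or override a single value for one run.
large_print = Defaults(font_size=18)
```

Unlike Enum members, the attributes are the plain values, so defaults.window_width can be used directly as a number.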
Your broader question is how to read configuration values for a Python program. Personally, I like the style recommended by The Twelve-Factor App.
The twelve-factor app stores config in environment variables (often shortened to env vars or env). Env vars are easy to change between deploys without changing any code; unlike config files, there is little chance of them being checked into the code repo accidentally; and unlike custom config files, or other config mechanisms such as Java System Properties, they are a language- and OS-agnostic standard.
The most flexible way I've found is to use the argparse module, and use the environment variables as the defaults. That way, you can override the environment variables on the command line. Be careful about putting passwords on the command line, though, because other users can probably see your command line arguments in the process list.
Here's an example that uses argparse and environment variables:
import os
from argparse import ArgumentParser, ArgumentDefaultsHelpFormatter, SUPPRESS

def parse_args(argv=None):
    parser = ArgumentParser(description='Watch the raw data folder for new runs.',
                            formatter_class=ArgumentDefaultsHelpFormatter)
    parser.add_argument(
        '--kive_server',
        default=os.environ.get('MICALL_KIVE_SERVER', 'http://localhost:8000'),
        help='server to send runs to')
    parser.add_argument(
        '--kive_user',
        default=os.environ.get('MICALL_KIVE_USER', 'kive'),
        help='user name for Kive server')
    parser.add_argument(
        '--kive_password',
        default=SUPPRESS,
        help='password for Kive server (default not shown)')
    args = parser.parse_args(argv)
    # SUPPRESS leaves the attribute unset, so the password default never appears in --help.
    if not hasattr(args, 'kive_password'):
        args.kive_password = os.environ.get('MICALL_KIVE_PASSWORD', 'kive')
    return args
Setting those environment variables can be a bit confusing, particularly for system services. If you're using systemd, look at the service unit, and be careful to use EnvironmentFile instead of Environment for any secrets. Environment values can be viewed by any user with systemctl show.
I usually make the default values useful for a developer running on their workstation, so they can start development without changing any configuration.
If you do want to put the configuration settings in a settings.py file, just be careful not to commit that file to source control. I have often committed a settings_template.py file that users can copy.
Working with scientific data, specifically climate data, I am constantly hard-coding paths to data directories in my Python code. Even if I were to write the most extensible code in the world, the hard-coded file paths prevent it from ever being truly portable. I also feel like having information about the file system of your machine coded in your programs could be a security issue.
What solutions are out there for handling the configuration of paths in Python to avoid having to code them out explicitly?
One solution is to use configuration files.
You can store all your paths in a JSON file like so:
{
"base_path" : "/home/bob/base_folder",
"low_temp_area_path" : "/home/bob/base/folder/low_temp"
}
and then in your Python code, you could just do:
import json
with open("conf.json") as json_conf:
    CONF = json.load(json_conf)
and then you can use your paths (or any configuration variable you like) like so:
print("The base path is {}".format(CONF["base_path"]))
First off, it's always good practice to add a main function to each file so you can test the classes or functions in it. Along with this, you determine the directory the script lives in (called the working directory below). This becomes incredibly important when running Python from a cron job or from a directory that is not the current working directory. No JSON files or environment variables are then needed, and the same code will run on Mac, RHEL and Debian distributions.
This is how you do it, and it will also work on Windows if you use '\' instead of '/' (if that is even necessary in your case).
import os
import sys
if "__main__" == __name__:
    workingDirectory = os.path.dirname(os.path.realpath(sys.argv[0]))  # directory containing the script
As you can see when you run your command, the working directory is calculated whether you provide a full path or a relative path, meaning it will work in a cron job automatically.
After that if you want to work with data that is stored in the current directory use:
fileName = os.path.join(workingDirectory, './sub-folder-of-current-directory/filename.csv')
fp = open(fileName, 'r')
or, for a folder one level above the working directory (a sibling of your project directory):
fileName = os.path.join(workingDirectory, '../folder-at-same-level-as-my-project/filename.csv')
fp = open(fileName, 'r')
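The same idea can be wrapped in a small helper so that every data path is built relative to the script's own directory. This is only a sketch; the function name data_path and its arguments are hypothetical:

```python
import os
import sys


def data_path(script_path, *parts):
    """Build a path relative to the directory that contains script_path."""
    script_dir = os.path.dirname(os.path.realpath(script_path))
    # normpath collapses any '..' segments passed in parts.
    return os.path.normpath(os.path.join(script_dir, *parts))


if __name__ == "__main__":
    # sys.argv[0] is the script being run, whether given as a full or relative path.
    print(data_path(sys.argv[0], "sub-folder-of-current-directory", "filename.csv"))
```

Passing the path components separately leaves the choice of separator to os.path.join.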
I believe there are many ways around this, but here is what I would do:
Create a JSON config file with all the paths I need defined.
For even more portability, I'd have a default path where I look for this config file but also have a command line input to change it.
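The two steps above could be sketched roughly as follows. The default location (DEFAULT_CONFIG), the --config option name and the load_paths function are all assumptions made for the example:

```python
import argparse
import json
import os

# Hypothetical default location for the config file.
DEFAULT_CONFIG = os.path.join(os.path.expanduser("~"), ".myproject", "conf.json")


def load_paths(argv=None):
    """Read the JSON file of paths, letting the command line override its location."""
    parser = argparse.ArgumentParser(description="Load data paths from a JSON file.")
    parser.add_argument(
        "-c", "--config",
        default=DEFAULT_CONFIG,
        help="JSON file that defines the data paths")
    args = parser.parse_args(argv)
    with open(args.config) as config_file:
        return json.load(config_file)
```

A caller would then use something like load_paths()["base_path"], and nothing in the code itself names a directory on a particular machine.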
In my opinion, passing arguments from the command line would be the best solution. You should take a look at argparse. It gives you a nice way to handle arguments from the command line. For example:
myDataScript.py /home/userName/datasource1/
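A minimal sketch of how the script might accept that directory with argparse; the argument name data_dir is an assumption:

```python
import argparse


def parse_args(argv=None):
    """Parse the data directory given as the first positional argument."""
    parser = argparse.ArgumentParser(description="Process a data directory.")
    parser.add_argument("data_dir", help="directory that contains the input data")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print("Reading data from {}".format(args.data_dir))
```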
I have a Pyramid application which I can start using pserve some.ini. The ini file contains the usual paste configuration and everything works fine. In production, I use uwsgi, having a paste = config:/path/to/some.ini entry, which works fine too.
But instead of reading my configuration from a static ini file, I want to retrieve it from some external key-value store. Reading the paste documentation and source code, I figured out that there is a call scheme, which calls a Python function to retrieve the "settings".
I implemented some get_conf method and try to start my application using pserve call:my.module:get_conf. If the module/method do not exist, I get an appropriate error, so the method seems to be used. But whatever I return from the method, I end up with this error message:
AssertionError: Protocol None unknown
I have no idea what return value of the method is expected and how to implement it. I tried to find documentation or examples, but without success. How do I have to implement this method?
While not the answer to your exact question, I think this is the answer to what you want to do. When Pyramid starts, the variables from your ini file just get parsed into the settings object that is set on your registry, and you access them through the registry from the rest of your app. So if you want to get settings from somewhere else (say env vars, or some other third-party source), all you need to do is build yourself a factory component for getting them, and use that in the server startup function that is typically in your base __init__.py file. You don't need to get anything from the ini file if that's not convenient, and if you don't, it doesn't matter how you deploy it. The rest of your app doesn't need to know where the settings came from. Here's an example of how I do this for getting settings from env vars, because I have a distributed app with three separate processes and I don't want to be mucking about with three sets of ini files (instead I have one file of env vars that doesn't go in git and gets sourced before anything gets turned on):
import os
from pyramid.config import Configurator
# MessageClient is the author's own class; its import is not shown.

# the method that runs on server startup, no matter
# how you deploy.
def main(global_config, **settings):
    """ This function returns a Pyramid WSGI application."""
    # settings has your values from the ini file
    # go ahead and stick things in it from any source you'd like
    settings['db_url'] = os.environ.get('DB_URL')
    config = Configurator(
        settings=settings,
        # your other configurator args
    )
    # you can also just stick things directly on the registry
    # for other components to use, as everyone has access to
    # request.registry.
    # here we look in an env var and fall back to the ini file
    amqp_url = os.environ.get('AMQP_URL', settings['amqp.url'])
    config.registry.bus = MessageClient(amqp_url=amqp_url)
    # rest of your server start up code.... below
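The "factory component" mentioned above does not need anything from Pyramid itself. As a sketch, it could be a plain function that overlays values from another source on top of the ini settings; the function name overlay_env and the MYAPP_ prefix are invented for this example:

```python
import os


def overlay_env(settings, environ=None, prefix="MYAPP_"):
    """Return a copy of settings with prefixed environment variables laid on top.

    MYAPP_DB_URL becomes the 'db_url' key, and so on.
    """
    environ = os.environ if environ is None else environ
    merged = dict(settings)
    for key, value in environ.items():
        if key.startswith(prefix):
            merged[key[len(prefix):].lower()] = value
    return merged
```

Inside main(), the result of overlay_env(settings) could be passed to the Configurator in place of the raw settings, and a key-value store client could be swapped in for the environment lookup in the same spot.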
Some background (not mandatory, but might be nice to know): I am writing a Python command-line module which is a wrapper around latexdiff. It basically replaces all \cite{ref1, ref2, ...} commands in LaTeX files with written-out and properly formatted references before passing the files to latexdiff, so that latexdiff will properly mark changes to references in the text (otherwise, it treats the whole \cite{...} command as a single "word"). All the code is currently in a single file which can be run with python -m latexdiff-cite, and I have not yet decided how to package or distribute it. To make the script useful for anybody else, the citation formatting needs to be configurable. I have implemented an optional command-line argument -c CONFIGFILE to allow the user to point to their own JSON config file (a default file resides in the module folder and is loaded if the argument is not used).
Current implementation: My single-file command-line Python module currently parses command-line arguments in if __name__ == '__main__', and loads the config file (specified by the user in -c CONFIGFILE) here before running the main function of the program. The config variable is thus available in the entire module and all is well. However, I'm considering publishing to PyPI by following this guide which seems to require me to put the command-line parsing in a main() function, which means the config variable will not be available to the other functions unless passed down as arguments to where it's needed. This "passing down by arguments" method seems a little cluttered to me.
Question: Is there a more pythonic way to set some configuration globals in a module or otherwise accomplish what I'm trying to? (I don't want to rely on 3rd party modules.) Am I perhaps completely off the tracks in some fundamental way?
One way to do it is to have the configurations defined in a class or a simple dict:
import json

class Config(object):
    setting1 = "default_value"
    setting2 = "default_value"

    @staticmethod
    def load_config(json_file):
        """ load settings from config file """
        with open(json_file) as f:
            config = json.load(f)
        # items() works on both Python 2 and Python 3
        for k, v in config.items():
            setattr(Config, k, v)
Then your application can access the settings via this class: Config.setting1 ...
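To show how this fits the main() layout from the question, here is a hedged sketch: the setting cite_style, the function format_citation and the output format are all invented for the example, and only the -c CONFIGFILE option comes from the question.

```python
import argparse
import json


class Config(object):
    cite_style = "numeric"  # hypothetical setting with a built-in default

    @staticmethod
    def load_config(json_file):
        """Overwrite the class-level defaults with values from a JSON file."""
        with open(json_file) as f:
            for key, value in json.load(f).items():
                setattr(Config, key, value)


def format_citation(key):
    # Reads the setting from Config; nothing is passed down as an argument.
    return "[{}:{}]".format(Config.cite_style, key)


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("-c", dest="configfile", default=None,
                        help="JSON config file")
    args = parser.parse_args(argv)
    if args.configfile:
        Config.load_config(args.configfile)
    return format_citation("ref1")
```

The command-line parsing lives in main(), and the helper functions still see the loaded values because they are stored on the class rather than in a local variable.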
I print out an (m x n) table of values for debugging; however, I do not want the debug messages to be printed in non-debugging mode. In C, this can be done with "#ifdef _DEBUG" in the code and by defining _DEBUG in the preprocessor definitions. What is the equivalent way in Python?
Python has a module called "logging".
See this question:
Using print statements only to debug
Or the basic tutorial:
http://docs.python.org/2/howto/logging.html
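As a rough sketch of how the logging module covers the table-printing case from the question (the logger name and the helper functions are made up for the example):

```python
import logging

logger = logging.getLogger("table_debug")  # hypothetical logger name


def format_table(rows):
    """Render an m x n table as one line of text per row."""
    return "\n".join(" ".join(str(value) for value in row) for row in rows)


def dump_table(rows):
    # Skip the formatting work entirely unless debug output is enabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("table:\n%s", format_table(rows))


def configure(debug):
    """Roughly the counterpart of defining or not defining _DEBUG."""
    logging.basicConfig(level=logging.DEBUG if debug else logging.INFO)
```

Calling configure(True) once at startup turns the table output on; with configure(False), dump_table() does nothing.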
You could define a global variable someplace, if that's what you want. However, probably the cleaner and more standard way is to read a config file (easy because you can write a config file in plain Python) and define DEBUG in there. So you've got a config file that looks like this:
# program.cfg
# Other comments
# And maybe other configuration settings
DEBUG = True # Or False
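And then in your code, you can either import your config file (if it's in a directory on the Python path and has a Python extension), or else you can execute it into a dictionary. Python 2 has execfile for this; it was removed in Python 3, so the version below uses exec on the file's contents, which works on both.
cfg = {}
with open('program.cfg') as f:
    exec(f.read(), cfg)  # Execute the config file in the new "cfg" namespace.
print(cfg.get('DEBUG'))  # Access configuration settings like this.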
Try this:
import settings
if settings.DEBUG:
    print(testval)
This prints testval if, and only if, DEBUG=True in settings.py