I am programming a website in which users will have a number of settings, such as their choice of colour scheme, etc. I'm happy to store these as plain text files, and security is not an issue.
The way I currently see it is: there is a dictionary, where all the keys are users and the values are dictionaries with the users' settings in them.
For example, userdb["bob"]["colour_scheme"] would have the value "blue".
What is the best way to store it on file? Pickling the dictionary?
Are there better ways of doing what I am trying to do?
I would use the ConfigParser module, which produces some pretty readable and user-editable output for your example:
[bob]
colour_scheme: blue
british: yes
[joe]
color_scheme: that's 'color', silly!
british: no
The following code would produce the config file above, and then print it out:
import sys
from ConfigParser import *
c = ConfigParser()
c.add_section("bob")
c.set("bob", "colour_scheme", "blue")
c.set("bob", "british", str(True))
c.add_section("joe")
c.set("joe", "color_scheme", "that's 'color', silly!")
c.set("joe", "british", str(False))
c.write(sys.stdout) # this outputs the configuration to stdout
# you could put a file-handle here instead
for section in c.sections():    # this is how you read the options back in
    print section
    for option in c.options(section):
        print "\t", option, "=", c.get(section, option)
print c.get("bob", "british") # To access the "british" attribute for bob directly
Note that ConfigParser only supports strings, so you'll have to convert as I have above for the Booleans. See effbot for a good run-down of the basics.
Using cPickle on the dictionary would be my choice. Dictionaries are a natural fit for this kind of data, so given your requirements I see no reason not to use them. That is, unless you are thinking about reading them from non-Python applications, in which case you'd have to use a language-neutral text format. And even then you could get away with the pickle plus an export tool.
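For what it's worth, here is a minimal sketch of that approach using the standard pickle module (cPickle has the same interface); the file name is just an example:
import pickle

userdb = {"bob": {"colour_scheme": "blue", "british": True}}

# write the whole dictionary in one go
with open("user_settings.pickle", "wb") as f:
    pickle.dump(userdb, f)

# read it back as a plain dict
with open("user_settings.pickle", "rb") as f:
    userdb = pickle.load(f)

print(userdb["bob"]["colour_scheme"])   # -> blue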
I won't tackle the question of which one is best. If you want to handle text files, I'd consider the ConfigParser module. Others you could give a try are simplejson or YAML. You could also consider a real db table.
For instance, you could have a table called userattrs, with three columns:
Int user_id
String attribute_name
String attribute_value
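A rough sketch of that table using the built-in sqlite3 module; the file, table, and column names simply follow the description above:
import sqlite3

conn = sqlite3.connect("settings.db")
conn.execute("""CREATE TABLE IF NOT EXISTS userattrs (
                    user_id INTEGER,
                    attribute_name TEXT,
                    attribute_value TEXT)""")

# store one attribute for user 1
conn.execute("INSERT INTO userattrs VALUES (?, ?, ?)", (1, "colour_scheme", "blue"))
conn.commit()

# read it back
row = conn.execute(
    "SELECT attribute_value FROM userattrs WHERE user_id = ? AND attribute_name = ?",
    (1, "colour_scheme")).fetchone()
print(row[0])   # -> blue
conn.close()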
If there are only a few, you could store them in cookies for quick retrieval.
Here's the simplest way. Use simple variables and import the settings file.
Call the file userprefs.py
# a user prefs file
color = 0x010203
font = "times new roman"
position = ( 12, 13 )
size = ( 640, 480 )
In your application, you need to be sure that you can import this file. You have many choices.
1. Using PYTHONPATH: require PYTHONPATH to be set to include the directory with the preferences file.
2. An explicit command-line parameter to name the file (not the best, but simple).
3. An environment variable to name the file.
4. Extending sys.path to include the user's home directory.
Example
import sys
import os
sys.path.insert(0,os.path.expanduser("~"))
import userprefs
print userprefs.color
For a database-driven website, of course, your best option is a db table. I'm assuming that you are not doing the database thing.
If you don't care about human-readable formats, then pickle is a simple and straightforward way to go. I've also heard good reports about simplejson.
If human readability is important, two simple options present themselves:
Module: Just use a module. If all you need are a few globals and nothing fancy, then this is the way to go. If you really got desperate, you could define classes and class variables to emulate sections. The downside here: if the file will be hand-edited by a user, errors could be hard to catch and debug.
INI format: I've been using ConfigObj for this, with quite a bit of success. ConfigObj is essentially a replacement for ConfigParser, with support for nested sections and much more. Optionally, you can define expected types or values for a file and validate it, providing a safety net (and important error feedback) for users/administrators.
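If it helps, a small sketch of basic ConfigObj usage (a third-party package, installed separately; the file name and keys are illustrative):
from configobj import ConfigObj

config = ConfigObj("users.ini")
config["bob"] = {"colour_scheme": "blue", "british": True}   # one section per user
config.write()                                               # writes an INI-style file

config = ConfigObj("users.ini")
print(config["bob"]["colour_scheme"])   # values come back as strings, e.g. "blue"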
I would use shelve or an SQLite database if I had to store these settings on the file system. Although, since you are building a website, you probably already use some kind of database, so why not just use that?
The built-in sqlite3 module would probably be far simpler than most alternatives, and gets you ready to update to a full RDBMS should you ever want or need to.
If human readability of config files matters, an alternative might be the ConfigParser module, which allows you to read and write .ini-like files. But then you are restricted to one nesting level.
If you have a database, I might suggest storing the settings in the database. However, it sounds like ordinary files might suit your environment better.
You probably don't want to store all the users' settings in the same file, because you might run into trouble with concurrent access to that one file. If you stored each user's settings as a dictionary in their own pickled file, then they would be able to act independently.
Pickling is a reasonable way to store such data, but unfortunately the pickle data format is notoriously not human-readable. You might be better off storing it as repr(dictionary), which is a more readable format. To reload the user settings, use eval(open("file").read()) or something like that.
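A sketch of that repr()-based idea; ast.literal_eval is shown here as a safer stand-in for eval() when the file only ever contains plain literals:
import ast

settings = {"colour_scheme": "blue", "british": True}

# human-readable on disk: {'colour_scheme': 'blue', 'british': True}
with open("bob.settings", "w") as f:
    f.write(repr(settings))

with open("bob.settings") as f:
    settings = ast.literal_eval(f.read())   # safer than eval() for plain literals

print(settings["colour_scheme"])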
Is there a particular reason you're not using the database for this? It seems the normal and natural thing to do - or store a pickle of the settings in the db keyed on user id or something.
You haven't described the usage patterns of the website, but just thinking of a general website, I would think that keeping the settings in a database would cause much less disk I/O than using files.
OTOH, for settings that might be used by client-side code, storing them as JavaScript in a static file that can be cached would be handy - at the expense of having multiple places where you might have settings. (I'd probably store those settings in the db and rebuild the static files as necessary.)
I agree with the reply about using a pickled dictionary. It's very simple and effective for storing simple data in a dictionary structure.
If you don't care about being able to edit the file yourself, and want a quick way to persist python objects, go with pickle. If you do want the file to be readable by a human, or readable by some other app, use ConfigParser. If you need anything more complex, go with some sort of database, be it relational (sqlite), or object-oriented (axiom, zodb).
I am trying to edit Terraform configuration files with Python. I am parsing Terraform files (.tf) using the python-hcl2 library, which returns a Python dictionary. I want to add new key/value pairs or change some values in the dictionary. Directly writing to the file is not good practice, since the returned Python dictionary is not in HashiCorp Configuration Language format. Also, there can be multiple configuration files, like variables.tf, etc., which are linked together. Should I implement my own serializer which converts a Python dictionary to a Terraform configuration file, or is there an easier way to do it?
The python-hcl2 library implements a parser for the HCL syntax, but it doesn't have a serializer, and its API is designed to drop all of the HCL specifics and retain only a basic Python data structure, so it doesn't seem to retain enough information to surgically modify the input without losing details such as comments, ordering of attributes, etc.
At the time I'm writing this, the only HCL implementation that explicitly supports updating existing configuration files in-place is the Go package hclwrite. It allows callers to load in arbitrary HCL source, surgically modify parts of it, and then re-serialize that updated version with only minor whitespace normalization to the unchanged parts of the input.
In principle it would be possible to port hclwrite to Python, or to implement a serializer from a dictionary like python-hcl2 returns if you are not concerned with preserving unchanged input, but both of these seem like a significant project.
If you do decide to do it, one part that warrants careful attention is serialization of strings into HCL syntax, because the required escaping isn't exactly the same as any other language. You might wish to refer to the escapeQuotedStringLit function from hclwrite to see all of the cases to handle, so you can potentially implement compatible logic in Python.
As #mark-b mentioned, Terraform supports json. So once you have imported your hcl2 file with the python-hcl2 library, you can modify the data structure internally and then dump it with json.dump() to a file with .tf.json extension.
I just ran this script to have my main.tf change from a GCS backend to a local one.
import hcl2
import json
with open('main.tf', 'r') as file:
    config = hcl2.load(file)          # hcl2.load returns a plain Python dict

# swap the first backend block for a local one
config['terraform'][0]['backend'][0] = {
    'local': {'path': 'default.tfstate'}
}

with open('main.tf.json', 'w') as file:
    json.dump(config, file, indent=4)
After running it, delete the main.tf file.
Following an earlier question I asked here (Most appropriate way to combine features of a class to another?), I got an answer that I have finally grown to understand. In short, what I intend to do now is have a bunch of dictionaries, each of which will look somewhat like this:
{ "url": "http://....", "parser": SomeParserClass }
More properties might be added later, but they will be either strings or some other classes.
Now my question is: what's the best way to save these objects?
I thought of three solutions, and I'm not sure which one is best or whether there are other, more acceptable solutions.
Use pickle: while it seems efficient, it would make editing any of these dictionaries a pain, since they're saved in a binary format.
Save each dictionary in a separate module and import these modules dynamically from a single directory; each module would either have a function inside it that returns the dictionary, or a specially crafted variable name to hold it, so I could call it from my loading code. This seems the easiest to edit, but it doesn't sound very efficient or Pythonic.
Use some sort of database like MongoDB or Riak to save these objects. My problem with this one is partly the editing, which is doable but doesn't sound like fun, and partly that, while the former two are equipped to correctly save my parser class inside the dictionary, I have no idea how these databases serialize or 'pickle' such objects.
As you can see, my main concerns are how easy it would be to edit them, the efficiency of saving and retrieving the data (though not a huge concern, since I only have a couple of hundred of these), and the correctness of the solution.
So, any thoughts?
Thank you in advance for any help you might be able to provide.
Use JSON. It supports python dictionaries and can be easily edited.
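One caveat: JSON can only hold plain data, so the parser class itself can't go in directly. A common workaround (just a sketch, using html.parser.HTMLParser as a stand-in for SomeParserClass) is to store the class's dotted import path and resolve it on load:
import json
import importlib

entry = {"url": "http://example.com", "parser": "html.parser.HTMLParser"}

with open("entries.json", "w") as f:
    json.dump([entry], f, indent=4)          # easy to hand-edit later

with open("entries.json") as f:
    loaded = json.load(f)[0]

# turn the dotted path back into the actual class
module_name, class_name = loaded["parser"].rsplit(".", 1)
parser_cls = getattr(importlib.import_module(module_name), class_name)
print(parser_cls)   # <class 'html.parser.HTMLParser'>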
You can try shelve. It's built on top of pickle and lets you serialize objects and associate them with string keys.
Because it is based on dbm, it will only access key/values as you need them. So if you only need to access a few items from a large dictionary, shelve may be a better choice than json, which has to load the entire JSON file into a dictionary first.
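A minimal shelve sketch (Python 3; the file name and key are illustrative). Values are pickled under the hood, so a class such as the question's SomeParserClass can be stored as long as it is importable when the shelf is reopened; HTMLParser stands in for it here:
import shelve
from html.parser import HTMLParser   # stand-in for SomeParserClass

with shelve.open("entries") as db:
    db["feed1"] = {"url": "http://example.com", "parser": HTMLParser}

with shelve.open("entries") as db:
    print(db["feed1"]["parser"])      # only this key is unpickled, not the whole shelf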
I have written a generic framework in python for some particular type of task. It is a webserver which serves different requests and operations. This framework can be used by many projects and each one has a different set of validation rules. Right now, I'm just updating my script for each project.
I'm thinking of externalizing this validation part; how do I go about it? The validations are more than mere field-content validations. I'm thinking of having a config file which maps incoming request <-> validation module, something like /site1/a/b.xml=validateSite1.py, and importing that module in an if condition when the request is for site1. So I'd have the generic framework scripts plus individual scripts for each site.
Is there a cleaner way to do this?
I think it'd be better to use Python itself as the top-level mapping from URL paths to validation modules. A configuration might look like this:
import site1
import site2

def dispatch(uri):
    if uri.startswith('/site1/'):
        return site1.validate(uri)
    elif uri.startswith('/site2/'):
        return site2.validate(uri)
This simple example might tempt you to "abstract" it out into a more "generic framework" that turns strings into filenames to use as validation scripts. Here are some advantages of doing the above instead:
Performance: site modules are imported just once, we don't look up filenames per request.
Flexibility: if you decide later that the dispatching logic is more complicated, you can easily use arbitrary Python code to deal with it. There will never be a need to extend your mapping system itself--only the config files that require more complexity.
Single language.
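For completeness, a sketch of what one of those site modules might look like; only the validate(uri) signature is implied by the dispatch example above, and the rule itself is just a placeholder:
# site1.py
def validate(uri):
    # project-specific validation rules go here
    return uri.endswith(".xml")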
I have a need to store a Python set in a database for accessing later. What's the best way to go about doing this? My initial plan was to use a textfield on my model and just store the set as a comma- or pipe-delimited string, then when I need to pull it back out for use in my app I could initialize a set by calling split on the string. Obviously, if there is a simple way to serialize the set for storage in the db so I can pull it back out as a set when I need to use it later, that would be best.
If your database is better at storing blobs of binary data, you can pickle your set. Actually, pickle stores data as text by default, so it might be better than the delimited string approach anyway. Just pickle.dumps(your_set) and unpickled = pickle.loads(database_string) later.
There are a number of options here, depending on what kind of data you wish to store in the set.
If it's regular integers, CommaSeparatedIntegerField might work fine, although it often feels like a clumsy storage method to me.
If it's other kinds of Python objects, you can try pickling it before saving it to the database, and unpickling it when you load it again. That seems like a good approach.
If you want something human-readable in your database though, you could even JSON-encode it into a TextField, as long as the data you're storing doesn't include Python objects.
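A sketch of that JSON route for a set of simple values; json has no set type, so the set is converted to a list on the way in and back to a set on the way out:
import json

tags = {"red", "green", "blue"}

stored = json.dumps(sorted(tags))    # e.g. '["blue", "green", "red"]' -> goes in the TextField
restored = set(json.loads(stored))   # back to a set when the model is loaded

assert restored == tags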
Redis natively stores sets (as well as other data structures: lists, hashes, and so on) and provides set operations - and it's rocket fast too. I find it's the Swiss Army knife for Python development.
I know it's not a relational database per se, but it does solve this problem very concisely.
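A quick sketch with the redis-py client (pip install redis); the key name is made up for illustration and a local Redis server is assumed:
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

r.sadd("user:42:tags", "python", "django", "redis")   # store the set members
print(r.smembers("user:42:tags"))                     # returns a Python set of strings
print(r.sismember("user:42:tags", "python"))          # set operations happen server-side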
What about CommaSeparatedIntegerField?
If you need another type (strings, for example), you can create your own field that works like CommaSeparatedIntegerField but uses strings (which must not contain commas).
Or, if you need another type, a probably better way of doing it is to have a dictionary which maps integers to your values.
I have some things that do not need to be indexed or searched (game configurations) so I was thinking of storing JSON on a BLOB. Is this a good idea at all? Or are there alternatives?
If you need to query based on the values within the JSON, it would be better to store the values separately.
If you are just loading a set of configurations like you say you are doing, storing the JSON directly in the database works great and is a very easy solution.
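A small sketch of that, using the built-in sqlite3 module; the table and column names are made up for illustration (a TEXT column works fine for JSON strings; use BLOB if you store bytes):
import json
import sqlite3

conn = sqlite3.connect("games.db")
conn.execute("CREATE TABLE IF NOT EXISTS game_config (id INTEGER PRIMARY KEY, config TEXT)")

config = {"difficulty": "hard", "fullscreen": True}
conn.execute("INSERT INTO game_config (config) VALUES (?)", (json.dumps(config),))
conn.commit()

raw = conn.execute("SELECT config FROM game_config").fetchone()[0]
print(json.loads(raw))   # -> {'difficulty': 'hard', 'fullscreen': True}
conn.close()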
No different than people storing XML snippets in a database (that doesn't have XML support). Don't see any harm in it, if it really doesn't need to be searched at the DB level. And the great thing about JSON is how parseable it is.
I don't see why not. As a related real-world example, WordPress stores serialized PHP arrays as a single value in many instances.
I think it's better to serialize your data; if you are using Python, cPickle is a good choice.