Edit Terraform configuration files programmatically with Python - python

I am trying to edit Terraform configuration files with Python. I am parsing Terraform files (.tf) using python hcl2 library which returns a python dictionary. I want to add new key/value pairs or change some values in the dictionary. Directly writing to the file is not a good practice since the returned python dictionary is not in Hashicorp Configuration Language format. Also there can be multiple configuration files like variables.tf etc. which are linked together. Should I implement my own serializer which converts python dictionary to terraform configuration file or is there an easier way to do it?

The python-hcl2 library implements a parser for the HCL syntax, but it doesn't have a serializer, and its API is designed to drop all of the HCL specifics and retain only a basic Python data structure, so it doesn't seem to retain enough information to surgically modify the input without losing details such as comments, ordering of attributes, etc.
At the time I'm writing this, the only HCL implementation that explicitly supports updating existing configuration files in-place is the Go package hclwrite. It allows callers to load in arbitrary HCL source, surgically modify parts of it, and then re-serialize that updated version with only minor whitespace normalization to the unchanged parts of the input.
In principle it would be possible to port hclwrite to Python, or to implement a serializer from a dictionary like python-hcl2 returns if you are not concerned with preserving unchanged input, but both of these seem like a significant project.
If you do decide to do it, one part that warrants careful attention is serialization of strings into HCL syntax, because the required escaping isn't exactly the same as any other language. You might wish to refer to the escapeQuotedStringLit function from hclwrite to see all of the cases to handle, so you can potentially implement compatible logic in Python.

As #mark-b mentioned, Terraform supports json. So once you have imported your hcl2 file with the python-hcl2 library, you can modify the data structure internally and then dump it with json.dump() to a file with .tf.json extension.
I just ran this script to have my main.tf change from a GCS backend to a local one.
import hcl2
import json
with open('main.tf', 'r') as file:
dict = hcl2.load(file)
dict['terraform'][0]['backend'][0]= {
'local' : { 'path' : 'default.tfstate' }
}
with open('main.tf.json', 'w') as file:
json.dump(dict, file, indent=4)
After running it, delete the main.tf file.

Related

Mapping fields inside other fields

Hello I would like to make an app that allows the user to import data from a source of his choice (Airtable, xls, csv, JSON) and export to a JSON which will be pushed to an Sqlite database using an API.
The "core" of the functionality of the app is that it allows the user to create a "template" and "map" of the source columns inside the destination columns. Which source column(s) go to which destination column is up to the user. I am attaching two photos here (used in airtable/zapier), so you can get a better idea of the end result:
adding fields inside fields - airtableadding fields inside fields - zapier
I would like to know if you can recommend a library or a way to come about this problem? I have tried to look for some python or nodejs libraries, I am lost between using ETL libraries, some recommended using mapping/zipping features, others recommend coding my own classes. Do you know any libraries that allow to do the same thing as airtable/zapier ? Any suggestions ?
Save file on databases is really a bad practice since it takes up a lot of database storage space and would add latency in the communication.
I hardly recommend saving it on disk and store the path on database.

dictionary | class | namedtuple from YAML

I have a large-ish YAML file (~40 lines) that I'm loading using PyYAML. This is of course parsed into a large-ish dictionary plus a couple of arrays.
My question is: how to manage the data. I can of course leave it in the output dictionary and work through the data. But I was wondering if it's better instead to mangle the data in a class or use a nametuple to hold the data.
Any first-hand experience about that?
Whether you post-process the data structure into a class or not primarily has to do with how you are using that data. The same applies to the decision whether to use a tag or not and load (some off) the data from the YAML file into a specific instance of a class that way.
The primary advantage of using a class in both cases (post-processing, tagging) is that you can do additional tests during initialisation for consistency, that are not done on the key-value pairs of a dict or on the items of list.
A class also allows you to provide methods to check values before they are set, e.g. to make sure they are of the right type.
Whether that overhead is necessary depends on the project, who is using and/or updating the data etc and how long this project and its data is going to live (i.e. are you still going to understand the data and its implicit structure a year from now). These are all issues for which a well designed (and documented) class can help, at the cost of some extra work up-front.

Proper way to save python dictionaries and retrieve them at a later stage

following an earlier question I asked here (Most appropriate way to combine features of a class to another?) I got an answer that I finally grown to understand. In short what I intend to now is have a bunch of dictionaries, each dictionary will look somewhat like this:
{ "url": "http://....", "parser": SomeParserClass }
though more properties might be added later but will include either strings or some other classes.
Now my question is: what's the best way to save these objects?
I thought up of 3 solutions, not sure which one is the best and if there are any other more acceptable solutions.
Use pickle, while it seems efficient to use it would make editing any of these dictionaries a pain, since it's saved in binary format.
Save each dictionary in a separate module and import these modules dynamically from a single directory, each module would either have a function inside it to return the dictionary or a specially crafted variable name to hold it so I could call it from my loading code. This seems the easier the edit but doesn't sound very efficient or pythonic
Use some sort of database like MongoDB or Riak to save these objects, my problem with this one is either editing which is doable but doesn't sound like fun and the fact that the former 2 are equipped with means to correctly save my parser class inside the dictionary, I have no idea how these databases serialize or 'pickle' such objects.
As you see my main concerns are how easy would it be to edit them, the efficiency of saving and retrieving the data (though not a huge concern since I only have a couple of hundreds of these) and the correctness of the solution.
So, any thoughts?
Thank you in advance for any help you might be able to provide.
Use JSON. It supports python dictionaries and can be easily edited.
You can try shelve. It's built on top of pickle and let's you serialize objects and associate them to string keys.
Because it is based on dbm, it will only access key/values as you need them. So if you only need to access a few items from a large dictionary, shelve may be a better choice than json, which has to load the entire JSON file into a dictionary first.

How should I append to small file-like objects in mongodb?

I need an interface to mongodb by which I can treat data in a collection like a standard python file-like object. These will be fairly small files (measured in kilobytes, at most) and in particular I need the ability to append to these so-called files. (So this question is not a dupe.)
I have read the GridFS documentation, and in particular it says I should not use it for small files. The only other implementations I've been able to find have all been PHP. I'm not really looking for help writing any specifics of the code, but implementing the entire file api seems a daunting task.
Are there any shortcuts or tools to make it easier to implement file-like objects in python 2?
Am I missing that someone has already done this?
(Why am I doing this? Because I received an eleventh-hour requirement that we deploy a pre-existing application that produces csv files on a multinode cloud environment that cannot transparently handle files.)
For question 1: check out the io module, and especially IOBase. It implements all of the file-likes in terms of a fairly sensible set of methods.
You could just store the data as binary, or text, in a MongoDB collection. But you'd have two problems:
You'd have to implement as much of the Python file protocol as your other code expects to have implemented.
When you append to the "file", the document would grow in MongoDB and possibly need to be moved on disk to a location with enough space to hold the larger document. Moving documents is expensive.
Go with GridFS -- the documentation discourages you from using for static files but for your case it's perfect because PyMongo has done the work for you of implementing Python's file protocol for MongoDB data. To append to a GridFS file you must read it, save a new version with the additional data, and delete the previous version. But this isn't much more expensive than moving a grown document anyway.

Store simple user settings in Python

I am programming a website in which users will have a number of settings, such as their choice of colour scheme, etc. I'm happy to store these as plain text files, and security is not an issue.
The way I currently see it is: there is a dictionary, where all the keys are users and the values are dictionaries with the users' settings in them.
For example, userdb["bob"]["colour_scheme"] would have the value "blue".
What is the best way to store it on file? Pickling the dictionary?
Are there better ways of doing what I am trying to do?
I would use the ConfigParser module, which produces some pretty readable and user-editable output for your example:
[bob]
colour_scheme: blue
british: yes
[joe]
color_scheme: that's 'color', silly!
british: no
The following code would produce the config file above, and then print it out:
import sys
from ConfigParser import *
c = ConfigParser()
c.add_section("bob")
c.set("bob", "colour_scheme", "blue")
c.set("bob", "british", str(True))
c.add_section("joe")
c.set("joe", "color_scheme", "that's 'color', silly!")
c.set("joe", "british", str(False))
c.write(sys.stdout) # this outputs the configuration to stdout
# you could put a file-handle here instead
for section in c.sections(): # this is how you read the options back in
print section
for option in c.options(section):
print "\t", option, "=", c.get(section, option)
print c.get("bob", "british") # To access the "british" attribute for bob directly
Note that ConfigParser only supports strings, so you'll have to convert as I have above for the Booleans. See effbot for a good run-down of the basics.
Using cPickle on the dictionary would be my choice. Dictionaries are a natural fit for these kind of data, so given your requirements I see no reason not to use them. That, unless you are thinking about reading them from non-python applications, in which case you'd have to use a language neutral text format. And even here you could get away with the pickle plus an export tool.
I don't tackle the question which one is best. If you want to handle text-files, I'd consider ConfigParser -module. Another you could give a try would be simplejson or yaml. You could also consider a real db table.
For instance, you could have a table called userattrs, with three columns:
Int user_id
String attribute_name
String attribute_value
If there's only few, you could store them into cookies for quick retrieval.
Here's the simplest way. Use simple variables and import the settings file.
Call the file userprefs.py
# a user prefs file
color = 0x010203
font = "times new roman"
position = ( 12, 13 )
size = ( 640, 480 )
In your application, you need to be sure that you can import this file. You have many choices.
Using PYTHONPATH. Require PYTHONPATH be set to include the directory with the preferences files.
a. An explicit command-line parameter to name the file (not the best, but simple)
b. An environment variable to name the file.
Extending sys.path to include the user's home directory
Example
import sys
import os
sys.path.insert(0,os.path.expanduser("~"))
import userprefs
print userprefs.color
For a database-driven website, of course, your best option is a db table. I'm assuming that you are not doing the database thing.
If you don't care about human-readable formats, then pickle is a simple and straightforward way to go. I've also heard good reports about simplejson.
If human readability is important, two simple options present themselves:
Module: Just use a module. If all you need are a few globals and nothing fancy, then this is the way to go. If you really got desperate, you could define classes and class variables to emulate sections. The downside here: if the file will be hand-edited by a user, errors could be hard to catch and debug.
INI format: I've been using ConfigObj for this, with quite a bit of success. ConfigObj is essentially a replacement for ConfigParser, with support for nested sections and much more. Optionally, you can define expected types or values for a file and validate it, providing a safety net (and important error feedback) for users/administrators.
I would use shelve or an sqlite database if I would have to store these setting on the file system. Although, since you are building a website you probably use some kind of database so why not just use that?
The built-in sqlite3 module would probably be far simpler than most alternatives, and gets you ready to update to a full RDBMS should you ever want or need to.
If human readablity of configfiles matters an alternative might be the ConfigParser module which allows you to read and write .ini like files. But then you are restricted to one nesting level.
If you have a database, I might suggest storing the settings in the database. However, it sounds like ordinary files might suit your environment better.
You probably don't want to store all the users settings in the same file, because you might run into trouble with concurrent access to that one file. If you stored each user's settings as a dictionary in their own pickled file, then they would be able to act independently.
Pickling is a reasonable way to store such data, but unfortunately the pickle data format is notoriously not-human-readable. You might be better off storing it as repr(dictionary) which will be a more readable format. To reload the user settings, use eval(open("file").read()) or something like that.
Is there are particular reason you're not using the database for this? it seems the normal and natural thing to do - or store a pickle of the settings in the db keyed on user id or something.
You haven't described the usage patterns of the website, but just thinking of a general website - but I would think that keeping the settings in a database would cause much less disk I/O than using files.
OTOH, for settings that might be used by client-side code, storing them as javascript in a static file that can be cached would be handy - at the expense of having multiple places you might have settings. (I'd probably store those settings in the db, and rebuild the static files as necessary)
I agree with the reply about using Pickled Dictionary. Very simple and effective for storing simple data in a Dictionary structure.
If you don't care about being able to edit the file yourself, and want a quick way to persist python objects, go with pickle. If you do want the file to be readable by a human, or readable by some other app, use ConfigParser. If you need anything more complex, go with some sort of database, be it relational (sqlite), or object-oriented (axiom, zodb).

Categories

Resources