I have a Puppet manifest file, init.pp, for my Puppet module.
The class parameters are declared in this file, and in most cases they are written the same way:
Example Input:
class test_module(
  $first_param = 'test',
  $second_param = 'new' )
What is the best way that I can parse this file with Python and get a dict object like this, which includes all the class parameters?
Example output:
param_dict = {'first_param':'test', 'second_param':'new'}
Thanks in Advance :)
Puppet Strings is a Ruby gem that can be installed on top of Puppet and can output a JSON document listing the class parameters, documentation, and so on.
After installing it (see above link), run this command either in a shell or from your Python program to generate JSON:
puppet strings generate --emit-json-stdout init.pp
This will generate:
{
  "puppet_classes": [
    {
      "name": "test_module",
      "file": "init.pp",
      "line": 1,
      "docstring": {
        "text": "",
        "tags": [
          {
            "tag_name": "param",
            "text": "",
            "types": [
              "Any"
            ],
            "name": "first_param"
          },
          {
            "tag_name": "param",
            "text": "",
            "types": [
              "Any"
            ],
            "name": "second_param"
          }
        ]
      },
      "defaults": {
        "first_param": "'test'",
        "second_param": "'new'"
      },
      "source": "class test_module(\n $first_param = 'test',\n $second_param = 'new' ) {\n}"
    }
  ]
}
(JSON trimmed slightly for brevity)
You can load the JSON in Python with json.loads. Note that "puppet_classes" is a list with one entry per class, so for the first class you extract the parameter names from root["puppet_classes"][0]["docstring"]["tags"] (the entries whose tag_name is param) and any default values from root["puppet_classes"][0]["defaults"].
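As a rough sketch of that last step, the snippet below loads a trimmed copy of the JSON shown above and builds the parameter dictionary. The helper name class_params is made up for illustration, and stripping the surrounding single quotes assumes the defaults are simple quoted strings, as in the example.

```python
import json

# Trimmed copy of the Puppet Strings output shown above.
STRINGS_JSON = """
{
  "puppet_classes": [
    {
      "name": "test_module",
      "docstring": {
        "tags": [
          {"tag_name": "param", "types": ["Any"], "name": "first_param"},
          {"tag_name": "param", "types": ["Any"], "name": "second_param"}
        ]
      },
      "defaults": {
        "first_param": "'test'",
        "second_param": "'new'"
      }
    }
  ]
}
"""


def class_params(root, class_name):
    """Return {param_name: default} for one class (hypothetical helper)."""
    for cls in root.get("puppet_classes", []):  # puppet_classes is a list
        if cls.get("name") != class_name:
            continue
        defaults = cls.get("defaults", {})
        params = {}
        for tag in cls.get("docstring", {}).get("tags", []):
            if tag.get("tag_name") != "param":
                continue
            default = defaults.get(tag["name"])  # None when no default is set
            if default is not None:
                # Assumption: defaults are simple single-quoted strings.
                default = default.strip("'")
            params[tag["name"]] = default
        return params
    return {}


param_dict = class_params(json.loads(STRINGS_JSON), "test_module")
print(param_dict)
```

In practice the JSON text would come from the standard output of the puppet strings command rather than from a literal string.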
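You can use a regular expression (straightforward but fragile):
import re

def parse(data):
    # Grab everything between the first pair of parentheses (the parameter list).
    # re.DOTALL lets "." match newlines, because the list spans several lines.
    mm = re.search(r'\((.*?)\)', data, re.DOTALL)
    dd = {}
    if not mm:
        return dd
    # Each parameter looks like: $name = 'value'
    for match in re.finditer(r"\$(\w+)\s*=\s*'(.*?)'", mm.group(1)):
        dd[match.group(1)] = match.group(2)
    return dd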
You can use it as follows:
with open(filename) as ff:
    dd = parse(ff.read())
I don't know about the "best" way, but one way would be:
1) Set up Rspec-puppet (see google or my blog post for how to do that).
2) Compile your code and generate a Puppet catalog. See my other blog post for that.
Now, the Puppet catalog you compiled is a JSON document.
3) Visually inspect the JSON document to find the data you are looking for. Its precise location in the JSON document depends on the version of Puppet you are using.
4) You can now use Python to extract the data as a dictionary from the JSON document.
I am looking for working, syntactically correct Python samples for provisioning unified alerts in Grafana.
I have a pure Terraform config file, provided by Grafana; however, the Python syntax complicates it further.
I have not had time to post my code yet; however, the only problem is defining the model of RuleGroupRuleData.
You may want to copy the model of a GUI-defined alert from here:
/api/ruler/grafana/api/v1/rules
Just copy it into a Python triple-quoted string (the Python counterpart of a heredoc):
grafana_RuleGroupRuleData = [RuleGroupRuleData(
    ref_id = "A",
    query_type = "",
    relative_time_range = dict(from_ = 600, to = 0),
    datasource_uid = grafana_dataSource.uid,
    model = """{
        "alias": "$col",
        "datasource": {
            "type": "influxdb",
            "uid": "datasource_influxdb"
        },
        "groupBy": [ ...
    """
I currently have JSON in the format below.
Some of the keys are not properly formatted: they are missing double quotes (").
How do I fix these keys so that they are double-quoted?
{
  Name: "test",
  Address: "xyz",
  "Age": 40,
  "Info": "test"
}
Required:
{
  "Name": "test",
  "Address": "xyz",
  "Age": 40,
  "Info": "test"
}
Using the post below, I was able to find such keys in the invalid JSON above.
However, I could not find an efficient way to wrap the matched keys in double quotes.
s = "Example: String"
out = re.findall(r'\w+:', s)
How to Escape Double Quote inside JSON
Using Regex:
import re
data = """{ Name: "test", Address: "xyz"}"""
print(re.sub(r"(\w+):", r'"\1":', data))
Output:
{ "Name": "test", "Address": "xyz"}
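To check that the repaired text really parses, a sketch along these lines may help. It uses a slightly stricter pattern than the one above, which is an assumption and not part of the answer: a bare word is treated as a key only when it follows { or a comma. The helper name quote_bare_keys is hypothetical, and a string value that itself contains a comma followed by word: would still be mangled.

```python
import json
import re

dirty = """{
Name: "test",
Address: "xyz",
"Age": 40,
"Info": "test"
}"""

# A key is a bare word that follows "{" or "," and precedes ":".
BARE_KEY = re.compile(r'([{,]\s*)(\w+)(\s*:)')


def quote_bare_keys(text):
    """Wrap unquoted keys in double quotes (hypothetical helper)."""
    return BARE_KEY.sub(r'\1"\2"\3', text)


# json.loads raises ValueError if the repaired text is still invalid.
fixed = json.loads(quote_bare_keys(dirty))
print(fixed)
```

Keys that are already quoted are left alone, because the character after the brace or comma is then a double quote rather than a word character.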
You can use PyYAML. JSON is very nearly a subset of YAML, and YAML allows unquoted keys, so PyYAML can parse the text despite the missing quotes.
Example
import yaml
dirty_json = """
{
key: "value",
"key2": "value"
}
"""
yaml.load(dirty_json, yaml.SafeLoader)
I ran into a few more issues with my JSON, so I thought I would share the final solution that worked for me.
jsonStr = re.sub(r"((?=\D)\w+):", r'"\1":', jsonStr)
jsonStr = re.sub(r": ((?=\D)\w+)", r':"\1"', jsonStr)
The first line adds the missing double quotes around keys, e.g. Name: "test".
The second line adds the missing double quotes around values, e.g. "Info": test.
Because (?=\D) requires a token to start with a non-digit, purely numeric tokens followed by a colon, such as the hours and minutes in a timestamp, are not quoted. Note that the second line also quotes bare literals such as true, false and null.
You can use an online formatter. Most of them throw an error when double quotes are missing, but the one below seems to handle it nicely:
JSON Formatter
The regex approach can be brittle. I suggest you find a library that can parse the JSON text that is missing quotes.
For example, in Kotlin 1.4, the standard way to parse a JSON string is using Json.decodeFromString. However, you can use Json { isLenient = true }.decodeFromString to relax the requirements for quotes. Here is a complete example in JUnit.
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.json.Json
import org.junit.jupiter.api.Assertions
import org.junit.jupiter.api.Test

@Serializable
data class Widget(val x: Int, val y: String)

class JsonTest {
    @Test
    fun `Parsing Json`() {
        val w: Widget = Json.decodeFromString("""{"x":123, "y":"abc"}""")
        Assertions.assertEquals(123, w.x)
        Assertions.assertEquals("abc", w.y)
    }

    @Test
    fun `Parsing Json missing quotes`() {
        // Json.decodeFromString("{x:123, y:abc}") failed to decode due to missing quotes
        val w: Widget = Json { isLenient = true }.decodeFromString("{x:123, y:abc}")
        Assertions.assertEquals(123, w.x)
        Assertions.assertEquals("abc", w.y)
    }
}
I work on a lot of JavaScript projects, and I miss a great feature from PhpStorm. I don't know Python that well, so I hope you can help me.
This is what I want:
'test'.log => console.log('test');
test.log => console.log(test);
So with a single tab trigger, .log, I want to retrieve everything before the .log and then transform it. How can I do this?
You can just create a plugin to retrieve the text before .log and replace it in the view:
import re
import sublime
import sublime_plugin

class PostSnippetLogCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        view = self.view
        for sel in view.sel():
            pos = sel.b
            text_before = view.substr(sublime.Region(view.line(pos).a, pos))
            # match the text before in reversed order
            m = re.match(r"gol\.(\S*)", text_before[::-1])
            if not m:
                continue
            # retrieve the text before .log and reestablish the correct order
            text_content = m.group(1)[::-1]
            # create the replacements text and region
            replace_text = "console.log({});".format(text_content)
            replace_reg = sublime.Region(pos - len(m.group(0)), pos)
            # replace the text
            view.replace(edit, replace_reg, replace_text)
Afterwards, add this keybinding to trigger the command when the cursor is directly preceded by .log inside a JavaScript document.
{
    "keys": ["tab"],
    "command": "post_snippet_log",
    "context":
    [
        { "key": "selector", "operand": "source.js" },
        { "key": "preceding_text", "operator": "regex_contains", "operand": "\\.log$" },
    ],
},
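The reversed-regex step of the plugin can be tried outside Sublime Text. The following is only a sketch of that one step: expand_log is a hypothetical helper, and its text_before argument stands in for the line content up to the cursor that the plugin reads from the view.

```python
import re


def expand_log(text_before):
    """Rewrite a trailing `<expr>.log` into `console.log(<expr>);`.

    text_before stands in for the line content up to the cursor.
    """
    # Match against the reversed text so the regex can anchor at the cursor.
    m = re.match(r"gol\.(\S*)", text_before[::-1])
    if not m:
        return text_before  # no trailing .log, nothing to expand
    expr = m.group(1)[::-1]  # restore the original character order
    # Everything before the matched "<expr>.log" is kept unchanged.
    prefix = text_before[:len(text_before) - len(m.group(0))]
    return prefix + "console.log({});".format(expr)


print(expand_log("'test'.log"))
print(expand_log("x = test.log"))
```

Reversing the string lets re.match, which only anchors at the start, effectively anchor at the cursor position; \S* then stops at the first whitespace to the left of the expression.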
I want to get "path" from the JSON file below. I used json.load to read the file and then walked it level by level with for key, value in data.items(), which leads to a lot of nested for loops (about six) to reach the value of "path". Is there a simpler way to retrieve the value of path?
The complete JSON file can be found here; a snippet of it is below.
{
  "products": {
    "com.ubuntu.juju:12.04:amd64": {
      "version": "2.0.1",
      "arch": "amd64",
      "versions": {
        "20161129": {
          "items": {
            "2.0.1-precise-amd64": {
              "release": "precise",
              "version": "2.0.1",
              "arch": "amd64",
              "size": 23525972,
              "path": "released/juju-2.0.1-precise-amd64.tgz",
              "ftype": "tar.gz",
              "sha256": "f548ac7b2a81d15f066674365657d3681e3d46bf797263c02e883335d24b5cda"
            }
          }
        }
      }
    },
    "com.ubuntu.juju:14.04:amd64": {
      "version": "2.0.1",
      "arch": "amd64",
      "versions": {
        "20161129": {
          "items": {
            "2.0.1-trusty-amd64": {
              "release": "trusty",
              "version": "2.0.1",
              "arch": "amd64",
              "size": 23526508,
              "path": "released/juju-2.0.1-trusty-amd64.tgz",
              "ftype": "tar.gz",
              "sha256": "7b86875234477e7a59813bc2076a7c1b5f1d693b8e1f2691cca6643a2b0dc0a2"
            }
          }
        }
      }
    },
You can use a recursive generator:
def get_paths(data):
    # yield the path stored at this level, if any
    if 'path' in data:
        yield data['path']
    # then descend into every nested dict
    for k in data.keys():
        if isinstance(data[k], dict):
            for i in get_paths(data[k]):
                yield i

for path in get_paths(json_data):  # loaded json data
    print(path)
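A self-contained variant of the same idea is sketched below, run against a trimmed copy of the snippet from the question. The name find_values is hypothetical, and descending into lists is an addition that the answer above does not make; it only matters if the real file contains arrays.

```python
def find_values(data, key):
    """Yield every value stored under `key`, at any depth."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield v
            # Keep descending; scalars simply yield nothing.
            yield from find_values(v, key)
    elif isinstance(data, list):
        for item in data:
            yield from find_values(item, key)


# Trimmed copy of the JSON from the question, already loaded as Python objects.
sample = {
    "products": {
        "com.ubuntu.juju:12.04:amd64": {
            "versions": {"20161129": {"items": {"2.0.1-precise-amd64": {
                "release": "precise",
                "path": "released/juju-2.0.1-precise-amd64.tgz",
            }}}},
        },
        "com.ubuntu.juju:14.04:amd64": {
            "versions": {"20161129": {"items": {"2.0.1-trusty-amd64": {
                "release": "trusty",
                "path": "released/juju-2.0.1-trusty-amd64.tgz",
            }}}},
        },
    }
}

paths = list(find_values(sample, "path"))
print(paths)
```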
Is the path key always at the same depth in the loaded JSON (which is a dict)? If so, what about doing:
products = loaded_json['products']
for product in products.values():
    for version in product['versions'].values():
        for item in version['items'].values():
            print(item['path'])
If not, Yevhen Kuzmovych's answer is clearly better, cleaner and more general than mine.
If you only care about the path, I think using any JSON parser is overkill; you can just use the built-in re module with the following pattern: (\"path\":\s*\")(.*\s*)(?=\",). I didn't test it against the whole file, but you should be able to figure out the best pattern fairly easily.
If you only need the file names present in path field, you can easily get them by simply parsing the file:
import re

files = []
pathre = re.compile(r'\s*"path"\s*:\s*"(.*?)"')
with open('file.json') as fd:
    for line in fd:
        if "path" in line:
            m = pathre.match(line)
            if m is not None:
                files.append(m.group(1))
If you need to process simultaneously the path and sha256 fields:
files = []
pathre = re.compile(r'\s*"path"\s*:\s*"(.*?)"')
share = re.compile(r'\s*"sha256"\s*:\s*"(.*?)"')
path = None
with open('file.json') as fd:
    for line in fd:
        if "path" in line:
            m = pathre.match(line)
            if m is not None:
                path = m.group(1)
        elif "sha256" in line:
            m = share.match(line)
            # pair each sha256 with the path seen just before it
            if m is not None and path is not None:
                files.append((path, m.group(1)))
                path = None
You can use a query language like JSONPath. A Python implementation is available here: https://pypi.python.org/pypi/jsonpath-rw
Assuming you have your JSON content already loaded, you can do something like the following:
from jsonpath_rw import jsonpath, parse

# Load your JSON content first from a file or from a string
# json_data = ...
jsonpath_expr = parse('products..path')
for match in jsonpath_expr.find(json_data):
    print(match.value)
For a further discussion you can read this: Is there a query language for JSON?
I'm new to JSON and Python, so any help on this would be greatly appreciated.
I read about json.loads but am confused.
How do I read a file into Python using json.loads?
Below is my JSON file format:
{
  "header": {
    "platform":"atm"
    "version":"2.0"
  }
  "details":[
    {
      "abc":"3"
      "def":"4"
    },
    {
      "abc":"5"
      "def":"6"
    },
    {
      "abc":"7"
      "def":"8"
    }
  ]
}
My requirement is to read the values of "abc" and "def" for every entry in details and add them to a new list like this: [(1,2),(3,4),(5,6),(7,8)]. The new list will be used to create a Spark DataFrame.
Open the file and get a file handle:
fh = open('thefile.json')
https://docs.python.org/2/library/functions.html#open
Then pass the file handle to json.load() (don't use loads; that's for strings):
import json
data = json.load(fh)
https://docs.python.org/2/library/json.html#json.load
From there, you can easily deal with a python dictionary that represents your json-encoded data.
new_list = [(detail['abc'], detail['def']) for detail in data['details']]
Note that your JSON is also malformed: it needs comma delimiters in many places, but that's not the question.
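Putting those steps together, a minimal sketch might look like this. It assumes the commas have been fixed; so that it can run anywhere, it first writes the corrected data to a temporary file that stands in for thefile.json. The function name read_pairs is hypothetical, and converting the values to int is an assumption based on the integer tuples requested in the question.

```python
import json
import os
import tempfile

# Corrected version of the data from the question (commas added).
sample = {
    "header": {"platform": "atm", "version": "2.0"},
    "details": [
        {"abc": "3", "def": "4"},
        {"abc": "5", "def": "6"},
        {"abc": "7", "def": "8"},
    ],
}

# Write it to a temporary file that stands in for thefile.json.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(sample, tmp)
    json_path = tmp.name


def read_pairs(path):
    """Load a JSON file and return the (abc, def) pairs as integers."""
    with open(path) as fh:      # the with block closes the file afterwards
        data = json.load(fh)    # json.load takes a file object, not a string
    return [(int(d["abc"]), int(d["def"])) for d in data["details"]]


pairs = read_pairs(json_path)
os.remove(json_path)
print(pairs)
```

The resulting list of tuples has the shape the question asks for as input to a Spark DataFrame.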
I'm trying to understand your question as best I can, but it looks like it was formatted poorly.
First off, your JSON blob is not valid JSON; it is missing quite a few commas. This is probably what you are looking for:
{
  "header": {
    "platform": "atm",
    "version": "2.0"
  },
  "details": [
    {
      "abc": "3",
      "def": "4"
    },
    {
      "abc": "5",
      "def": "6"
    },
    {
      "abc": "7",
      "def": "8"
    }
  ]
}
Now, assuming you are trying to parse this in Python, you can do the following:
import json

json_blob = '{"header": {"platform": "atm","version": "2.0"},"details": [{"abc": "3","def": "4"},{"abc": "5","def": "6"},{"abc": "7","def": "8"}]}'
json_obj = json.loads(json_blob)
final_list = []
for single in json_obj['details']:
    final_list.append((int(single['abc']), int(single['def'])))
print(final_list)
This will print the following: [(3, 4), (5, 6), (7, 8)]