I have already created a tag namespace and applied defined tags to my resources. When I try to add new tags to a resource, the old tags get deleted.
I would like to keep the existing tags and have them returned along with the new ones. How can I achieve this?
get volume details from a specific compartment
import oci
config = oci.config.from_file("~/.oci/config")
core_client = oci.core.BlockstorageClient(config)
get_volume_response = core_client.get_volume(
volume_id="ocid1.test.oc1..<unique_ID>EXAMPLE-volumeId-Value")
# Get the data from response
print(get_volume_response.data)
output
{
"availability_domain": "eto:PHX-AD-1",
"compartment_id": "ocid1.compartment.oc1..aaaaaaaapmj",
"defined_tags": {
"OMCS": {
"CREATOR": "xyz#gmail.com"
},
"Oracle-Tags": {
"CreatedBy": "xyz#gmail.com",
"CreatedOn": "2022-07-5T08:29:24.865Z"
}
},
"display_name": "test_VG",
"freeform_tags": {},
"id": "ocid1.volumegroup.oc1.phx.abced",
"is_hydrated": null,
"lifecycle_state": "AVAILABLE",
"size_in_gbs": 100,
"size_in_mbs": 102400,
"source_details": {
"type": "volumeIds",
"volume_ids": [
"ocid1.volume.oc1.phx.xyz"
]
}
I want the API below to update the tag along with the old data.
old tag
"defined_tags": {
"OMCS": {
"CREATOR": "xyz#gmail.com"
},
"Oracle-Tags": {
"CreatedBy": "xyz#gmail.com",
"CreatedOn": "2022-07-5T08:29:24.865Z"
import oci
config = oci.config.from_file("~/.oci/config")
core_client = oci.core.BlockstorageClient(config)
update_volume_response = core_client.update_volume(
volume_id="ocid1.test.oc1..<unique_ID>EXAMPLE-volumeId-Value",
update_volume_details=oci.core.models.UpdateVolumeDetails(
defined_tags={
'OMCS':{
'INSTANCE': 'TEST',
'COMPONENT': 'temp1.mt.exy.vcn.com'
}
},
display_name = "TEMPMT01"))
print(update_volume_response.data)
I also tried the following, but got an attribute error.
for tag in get_volume_response.data:
def_tag.appened(tag.defined_tags)
return (def_tag)
How can I append to the defined_tags?
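Tags are returned as a dict by the OCI SDK, so adding to them works the same way as adding keys to any dict. As the question shows, the update call replaces the defined_tags you send, so you have to send the existing tags together with the new ones.
Below is the code for updating the defined_tags of a Block Volume in OCI: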
import oci
from oci.config import from_file

configAPI = from_file()  # Config file is read from the user's home location, i.e. ~/.oci/config
core_client = oci.core.BlockstorageClient(configAPI)

get_volume_response = core_client.get_volume(
    volume_id="ocid1.volume.oc1.ap-hyderabad-1.ameen")

# Get the data from the response
volume_details = get_volume_response.data
defined_tags = volume_details.defined_tags
freeform_tags = volume_details.freeform_tags

# Add new tags as required. As defined_tags is a dict, adding a new key/value pair works as shown below.
# If there are multiple tags to add, use the update() method of dict instead.
defined_tags["OMCS"]["INSTANCE"] = "TEST"
defined_tags["OMCS"]["COMPONENT"] = "temp1.mt.exy.vcn.com"

# Send the existing tags together with the new ones
update_volume_response = core_client.update_volume(
    volume_id="ocid1.volume.oc1.ap-hyderabad-1.ameen",
    update_volume_details=oci.core.models.UpdateVolumeDetails(
        defined_tags=defined_tags,
        freeform_tags=freeform_tags))
print(update_volume_response.data)
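The merge step itself can be tried without an OCI account. The following is only a sketch in plain Python: merge_defined_tags is a hypothetical helper (not part of the OCI SDK), and the tag values are copied from the question. It assumes defined tags are a dict of namespace to dict, as in the output above.

```python
import copy


def merge_defined_tags(existing, new):
    """Return a copy of `existing` with `new` merged in, namespace by namespace.

    Hypothetical helper, not an OCI SDK function.
    """
    merged = copy.deepcopy(existing or {})
    for namespace, tags in new.items():
        # Create the namespace if it is missing, then add or overwrite its keys
        merged.setdefault(namespace, {}).update(tags)
    return merged


existing_tags = {
    "OMCS": {"CREATOR": "xyz#gmail.com"},
    "Oracle-Tags": {"CreatedBy": "xyz#gmail.com"},
}
new_tags = {"OMCS": {"INSTANCE": "TEST", "COMPONENT": "temp1.mt.exy.vcn.com"}}

merged_tags = merge_defined_tags(existing_tags, new_tags)
print(merged_tags)
```

The merged dict is what you would pass as defined_tags to UpdateVolumeDetails. Working on a deep copy leaves the dict read from the response untouched, which makes it easier to compare old and new tags.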
I have this script which I use to pull in some data from an API call.
# list of each api url to use
link = []
# for every device id, create a new url link in the link list
for i in deviceIDList:
    link.append('https://website/v2/accounts/accountid/devices/' + i)
# create a list with all the different requests
deviceReq = []
for i in link:
    deviceReq.append(requests.get(i, headers=headers).json())
# write to a txt file
with open('masterSheet.txt', 'x') as f:
    for i in deviceReq:
        devices = [i['data']]
        for x in devices:
            models = [x['provision']]
            for data in models:
                sheet = (data['endpoint_model'] + " ", x['name'])
                f.write(str(sheet) + "\n")
Some devices do not have the provision key.
Here is what the sample data looks like for a device that does not have it.
Let's say I want to grab the device_type key value instead if provision key is non-existent.
"data": {
"sip": {
"username": "xxxxxxxxxxxxxxxx",
"password": "xxxxxxxxxxxxxxxx",
"expire_seconds": xxxxxxxxxxxxxxxx,
"invite_format": "xxxxxxxxxxxxxxxx",
"method": "xxxxxxxxxxxxxxxx",
"route": "xxxxxxxxxxxxxxxx"
},
"device_type": "msteams",
"enabled": xxxxxxxxxxxxxxxx,
"suppress_unregister_notifications": xxxxxxxxxxxxxxxx,
"owner_id": "xxxxxxxxxxxxxxxx",
"name": "xxxxxxxxxxxxxxxx",
}
How do I cater for missing keys?
You can use .get(key, default_value) to get a value from a dict; if the key is not present, it returns the default instead, like this:
provision = x.get('provision', None)
if provision is None:
provision = x.get('device_type')
models = [provision]
Or, if you prefer, you can do the same on one line without the extra if or assignment (though some people might find it harder to read and understand):
models = [x.get('provision', x.get('device_type'))]
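Note that provision is a dict (the question reads endpoint_model from it) while device_type is a plain string, so the two cases need slightly different handling further down. A minimal sketch, assuming made-up sample devices and a hypothetical describe helper:

```python
# Two made-up devices: one with a "provision" dict, one with only "device_type"
devices = [
    {"name": "desk phone", "provision": {"endpoint_model": "t46s"}},
    {"name": "teams user", "device_type": "msteams"},
]


def describe(device):
    """Return 'model name', falling back to device_type when provision is missing."""
    provision = device.get("provision")
    if provision is not None:
        model = provision.get("endpoint_model", "unknown")
    else:
        model = device.get("device_type", "unknown")
    return f"{model} {device.get('name', '')}"


lines = [describe(d) for d in devices]
print(lines)
```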
I have a new project where I get JSON data back from a REST API. I'm trying to convert this data to pipe-delimited CSV so I can import it into our legacy software.
I can't seem to get all the value pairs parsed properly. This is my first exposure to JSON, and I've tried many things but only get a little right at a time.
I have used Python and can get some of the items I need, but not the whole JSON tree. It comes across as a list and contains dictionaries and lists as well.
I know my code is incomplete; I'm just looking for someone to point me in the right direction on which Python tools can get the job done.
import json
import csv
with open('tenants.json') as access_json:
read_content = json.load(access_json)
for rm_access in read_content:
rm_data = rm_access
print(rm_data)
contacts_data = rm_data['Contacts']
leases_data = rm_data['Leases']
udfs_data = rm_data['UserDefinedValues']
for contacts_access in contacts_data:
rm_contacts = contacts_access
UPDATED:
import pandas as pd
with open('tenants.json') as access_json:
read_content = json.load(access_json)
for rm_access in read_content:
rm_data = rm_access
pd.set_option('display.max_rows', 10000)
pd.set_option('display.max_columns', 150)
TenantID = []
TenantDisplayID = []
Name = []
FirstName = []
LastName = []
WebMessage = []
Comment = []
RentDueDay = []
RentPeriod = []
FirstContact = []
PropertyID = []
PostingStartDate = []
CreateDate = []
CreateUserID = []
UpdateDate = []
UpdateUserID = []
Contacts = []
for rm_access in read_content:
rm_data = rm_access
TenantID.append(rm_data["TenantID"])
TenantDisplayID.append(rm_data["TenantDisplayID"])
Name.append(rm_data["Name"])
FirstName.append(rm_data["FirstName"])
LastName.append(rm_data["LastName"])
WebMessage.append(rm_data["WebMessage"])
Comment.append(rm_data["Comment"])
RentDueDay.append(rm_data["RentDueDay"])
RentPeriod.append(rm_data["RentPeriod"])
# FirstContact.append(rm_data["FirstContact"])
PropertyID.append(rm_data["PropertyID"])
PostingStartDate.append(rm_data["PostingStartDate"])
CreateDate.append(rm_data["CreateDate"])
CreateUserID.append(rm_data["CreateUserID"])
UpdateUserID.append(rm_data["UpdateUserID"])
Contacts.append(rm_data["Contacts"])
df = pd.DataFrame({"TenantID": TenantID, "TenantDisplayID": TenantDisplayID, "Name": Name,
                   "FirstName": FirstName, "LastName": LastName, "WebMessage": WebMessage,
                   "Comment": Comment, "RentDueDay": RentDueDay, "RentPeriod": RentPeriod,
                   "PropertyID": PropertyID, "PostingStartDate": PostingStartDate,
                   "CreateDate": CreateDate, "CreateUserID": CreateUserID,
                   "UpdateUserID": UpdateUserID, "Contacts": Contacts})
print(df)
Here is a sample of the file
[
{
"TenantID": 115,
"TenantDisplayID": 115,
"Name": "Jane Doe",
"FirstName": "Jane",
"LastName": "Doe",
"WebMessage": "",
"Comment": "",
"RentDueDay": 1,
"RentPeriod": "Monthly",
"FirstContact": "2015-11-01T15:30:00",
"PropertyID": 17,
"PostingStartDate": "2010-10-01T00:00:00",
"CreateDate": "2014-04-16T13:35:37",
"CreateUserID": 1,
"UpdateDate": "2017-03-22T11:31:48",
"UpdateUserID": 1,
"Contacts": [
{
"ContactID": 128,
"FirstName": "Jane",
"LastName": "Doe",
"MiddleName": "",
"IsPrimary": true,
"DateOfBirth": "1975-02-27T00:00:00",
"FederalTaxID": "111-11-1111",
"Comment": "",
"Email": "jane.doe#mail.com",
"License": "ZZT4532",
"Vehicle": "BMW 3 Series",
"IsShowOnBill": true,
"Employer": "REW",
"ApplicantType": "Applicant",
"CreateDate": "2014-04-16T13:35:37",
"CreateUserID": 1,
"UpdateDate": "2017-03-22T11:31:48",
"AnnualIncome": 0.0,
"UpdateUserID": 1,
"ParentID": 115,
"ParentType": "Tenant",
"PhoneNumbers": [
{
"PhoneNumberID": 286,
"PhoneNumberTypeID": 2,
"PhoneNumber": "703-555-5610",
"Extension": "",
"StrippedPhoneNumber": "7035555610",
"IsPrimary": true,
"ParentID": 128,
"ParentType": "Contact"
}
]
}
],
"UserDefinedValues": [
{
"UserDefinedValueID": 1,
"UserDefinedFieldID": 4,
"ParentID": 115,
"Name": "Emerg Contact Name",
"Value": "Terry Harper",
"UpdateDate": "2016-01-22T15:41:53",
"FieldType": "Text",
"UpdateUserID": 2,
"CreateUserID": 2
},
{
"UserDefinedValueID": 174,
"UserDefinedFieldID": 5,
"ParentID": 115,
"Name": "Emerg Contact Phone",
"Value": "703-555-3568",
"UpdateDate": "2016-01-22T15:42:03",
"FieldType": "Text",
"UpdateUserID": 2,
"CreateUserID": 2
}
],
"Leases": [
{
"LeaseID": 115,
"TenantID": 115,
"UnitID": 181,
"PropertyID": 17,
"MoveInDate": "2010-10-01T00:00:00",
"SortOrder": 1,
"CreateDate": "2014-04-16T13:35:37",
"UpdateDate": "2017-03-22T11:31:48",
"CreateUserID": 1,
"UpdateUserID": 1
}
],
"Addresses": [
{
"AddressID": 286,
"AddressTypeID": 1,
"Address": "14393 Montgomery Road Lot #102\r\nCincinnati, OH 45122",
"Street": "14393 Montgomery Road Lot #102",
"City": "Cincinnati",
"State": "OH",
"PostalCode": "45122",
"IsPrimary": true,
"ParentID": 115,
"ParentType": "Tenant"
}
],
"OpenReceivables": [],
"Status": "Current"
},
Not all tenants will have all elements, which is also tricky.
I need the data from the top level, where there is TenantID, TenantDisplayID, etc.
I also need the data from the nested Contacts, PhoneNumbers, Leases, etc. values.
Each line should have a fixed layout, so if a record doesn't have certain keys I'd like a Null or None in their place. It would look like
TenantID|TenantDisplayID|FirstName….etc so each line has the same number of fields
Something like this should work:
import pandas as pd
pd.set_option('display.max_rows', 10000)
pd.set_option('display.max_columns', 100000)
TenantID = []
TenantDisplayID = []
Name = []
FirstName = []
LastName = []
WebMessage = []
Comment = []
RentDueDay = []
RentPeriod = []
FirstContact = []
PropertyID = []
PostingStartDate = []
CreateDate = []
CreateUserID = []
UpdateDate = []
UpdateUserID = []
Contacts = []
for rm_access in read_content:
rm_data = rm_access
print(rm_data)
TenantID.append(rm_data["TenantID"])
TenantDisplayID.append(rm_data["TenantDisplayID"])
Name.append(rm_data["Name"])
FirstName.append(rm_data["FirstName"])
LastName.append(rm_data["LastName"])
WebMessage.append(rm_data["WebMessage"])
Comment.append(rm_data["Comment"])
RentDueDay.append(rm_data["RentDueDay"])
RentPeriod.append(rm_data["RentPeriod"])
FirstContact.append(rm_data["FirstContact"])
PropertyID.append(rm_data["PropertyID"])
PostingStartDate.append(rm_data["PostingStartDate"])
CreateDate.append(rm_data["CreateDate"])
CreateUserID.append(rm_data["CreateUserID"])
UpdateUserID.append(rm_data["UpdateUserID"])
Contacts.append(rm_data["Contacts"])
df = pd.DataFrame({"TenantID":TenantID,"TenantDisplayID":TenantDisplayID, "Name": Name,
"FirstName":FirstName, "LastName": LastName,"WebMessage": WebMessage,
"Comment": Comment, "RentDueDay": RentDueDay, "RentPeriod": RentPeriod,
"FirstContact": FirstContact, "PropertyID": PropertyID, "PostingStartDate": PostingStartDate,
"CreateDate": CreateDate, "CreateUserID": CreateUserID,"UpdateUserID": UpdateUserID,
"Contacts": Contacts})
print(df)
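Because some tenants lack some keys, indexing with rm_data["..."] raises a KeyError for them. One way to get the fixed-width, pipe-delimited lines asked for is dict.get plus the standard csv module. This is only a sketch for the top-level fields: the field list is a shortened assumption, tenants_to_pipe_delimited is a hypothetical helper, and the second tenant is invented to show a missing key.

```python
import csv
import io

# Shortened, assumed field list; extend it with the columns you need
FIELDS = ["TenantID", "TenantDisplayID", "FirstName", "LastName", "FirstContact"]


def tenants_to_pipe_delimited(tenants, fields=FIELDS, missing="None"):
    """Write one pipe-delimited line per tenant, padding absent keys with `missing`."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerow(fields)  # header line
    for tenant in tenants:
        row = []
        for field in fields:
            value = tenant.get(field)  # None when the key is absent
            row.append(missing if value is None else value)
        writer.writerow(row)
    return buffer.getvalue()


sample = [
    {"TenantID": 115, "TenantDisplayID": 115, "FirstName": "Jane",
     "LastName": "Doe", "FirstContact": "2015-11-01T15:30:00"},
    # Invented tenant without a FirstContact key
    {"TenantID": 116, "TenantDisplayID": 116, "FirstName": "John", "LastName": "Roe"},
]

output = tenants_to_pipe_delimited(sample)
print(output)
```

Every line has the same number of fields regardless of which keys a tenant carries. Nested lists such as Contacts still need a decision on how they map to columns, which this sketch does not make.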
The General Problem
The problem with this task (and other similar ones) is not just how to create an algorithm. I am sure you could, in theory, solve this with a (not so) nice number of nested for-loops. The problem is to organise the code in a way that does not give you a headache, i.e. so that you can fix bugs easily, write unit tests, understand the code when reading it six months from now, and change it easily when you need to.
I do not know anybody who does not make mistakes when wrapping their head around a deeply nested structure. And chasing bugs in code that is heavily nested because it mirrors the nested structure of the data can be quite frustrating.
The Quick (and most probably: Best) Solution
Rely on packages that are made for your exact use case, such as
https://github.com/cwacek/python-jsonschema-objects
In case you have a formal definition of the API schema, you could use packages for that. If, for instance, your API has a Swagger schema definition, you can use swagger-py (https://github.com/digium/swagger-py) to get your JSON response into Python objects.
The Principle Solution: Object Oriented Programming and Recursion
Even if there might be some libraries for your concrete use case, I would like to explain the principle of how to deal with "that kind" of tasks:
A good way to organise code for this kind of problem is Object Oriented Programming. The nesting hassle can be laid out much more clearly by making use of recursion. This also makes it easier to change the code in case the JSON schema of your API response changes for any reason (an update of the API, for instance). In your case I would suggest you create something like the following:
class JsonObject:
    """Parent class for any object that will be retrieved from the JSON
    and potentially has nested JsonObjects inside.
    This class takes care of parsing the JSON into Python objects and deals
    with the recursion into the nested structures."""

    primitives = []
    json_objects = {
        # For each class, this dict defines all the "embedded" classes which
        # live directly "under" that class in the nested JSON. It will have the
        # following structure:
        # attribute_name : class
        # In your case the JSON schema does not have any "single" objects
        # in the nesting structure, but only lists of nested objects. I
        # still include this dict to demonstrate how you would handle a
        # single "embedded" object.
    }
    json_object_lists = {
        # For each class, this dict defines all the "embedded" subclasses which
        # are provided in a list "under" that class in the nested JSON.
        # It will have the following structure:
        # attribute_name : class
    }

    @classmethod
    def from_dict(cls, d: dict) -> "JsonObject":
        instance = cls()
        for attribute in cls.primitives:
            # Here we just parse all the primitives; missing keys become None
            setattr(instance, attribute, d.get(attribute))
        for attribute, klass in cls.json_object_lists.items():
            # Here we parse all lists of embedded JSON objects
            nested_objects = []
            for nested_dict in d.get(attribute) or []:
                nested_objects.append(klass.from_dict(nested_dict))
            setattr(instance, attribute, nested_objects)
        for attribute, klass in cls.json_objects.items():
            # Here we parse all "single" embedded JSON objects
            nested_dict = d.get(attribute)
            setattr(
                instance,
                attribute,
                klass.from_dict(nested_dict) if nested_dict is not None else None,
            )
        return instance

    def to_csv(self) -> str:
        pass
Since you didn't explain how exactly you want to create a csv from the JSON, I didn't implement that method and left this to you. It is also not necessary to explain the overall approach.
Now we have the general parent class that all our specific classes will inherit from, so that we can apply recursion to our problem. We only need to define these concrete structures according to the JSON schema we want to parse. I got the following from your sample, but you can easily change whatever you need to:
class Address(JsonObject):
primitives = [
"AddressID",
"AddressTypeID",
"Address",
"Street",
"City",
"State",
"PostalCode",
"IsPrimary",
"ParentID",
"ParentType",
]
json_objects = {}
json_object_lists = {}
class Lease(JsonObject):
primitives = [
"LeaseID",
"TenantID",
"UnitID",
"PropertyID",
"MoveInDate",
"SortOrder",
"CreateDate",
"UpdateDate",
"CreateUserID",
"UpdateUserID",
]
json_objects = {}
json_object_lists = {}
class UserDefinedValue(JsonObject):
primitives = [
"UserDefinedValueID",
"UserDefinedFieldID",
"ParentID",
"Name",
"Value",
"UpdateDate",
"FieldType",
"UpdateUserID",
"CreateUserID",
]
json_objects = {}
json_object_lists = {}
class PhoneNumber(JsonObject):
primitives = [
"PhoneNumberID",
"PhoneNumberTypeID",
"PhoneNumber",
"Extension",
"StrippedPhoneNumber",
"IsPrimary",
"ParentID",
"ParentType",
]
json_objects = {}
json_object_lists = {}
class Contact(JsonObject):
primitives = [
"ContactID",
"FirstName",
"LastName",
"MiddleName",
"IsPrimary",
"DateOfBirth",
"FederalTaxID",
"Comment",
"Email",
"License",
"Vehicle",
"IsShowOnBill",
"Employer",
"ApplicantType",
"CreateDate",
"CreateUserID",
"UpdateDate",
"AnnualIncome",
"UpdateUserID",
"ParentID",
"ParentType",
]
json_objects = {}
json_object_lists = {
"PhoneNumbers": PhoneNumber,
}
class Tenant(JsonObject):
primitives = [
"TenantID",
"TenantDisplayID",
"Name",
"FirstName",
"LastName",
"WebMessage",
"Comment",
"RentDueDay",
"RentPeriod",
"FirstContact",
"PropertyID",
"PostingStartDate",
"CreateDate",
"CreateUserID",
"UpdateDate",
"UpdateUserID",
"OpenReceivables", # Maybe this is also a nested Object? Not clear from your sample.
"Status",
]
json_object_lists = {
"Contacts": Contact,
"UserDefinedValues": UserDefinedValue,
"Leases": Lease,
"Addresses": Address,
}
json_objects = {}
You might imagine the "beauty" (at least: order) of that approach, which lies in the following: with this structure, we could tackle any level of nesting in the JSON response of your API without additional headache. Our code would not deepen its indentation level, because we have separated the nasty nesting into the recursive definition of JsonObject's from_dict method. That is why it is now much easier to identify bugs or apply changes to our code.
To finally parse the JSON now into our Objects, you would do something like the following:
import typing
import json


def tenants_from_json(json_string: str) -> typing.Iterable["Tenant"]:
    tenants = [
        Tenant.from_dict(tenant_dict)
        for tenant_dict in json.loads(json_string)
    ]
    return tenants
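To see the recursion at work end to end, here is a condensed, self-contained sketch of the same idea. It is trimmed to two classes and a few attributes, and the base class is a shortened restatement written for this example (tuples instead of lists for the class attributes, no "single" embedded objects). The JSON string is a cut-down version of the sample from the question.

```python
import json


class JsonObject:
    """Condensed base class: primitives plus lists of nested objects."""
    primitives = ()
    json_object_lists = {}

    @classmethod
    def from_dict(cls, d):
        instance = cls()
        for name in cls.primitives:
            setattr(instance, name, d.get(name))  # missing keys become None
        for name, klass in cls.json_object_lists.items():
            # Recurse into each nested dict; a missing or null list becomes []
            setattr(instance, name, [klass.from_dict(x) for x in d.get(name) or []])
        return instance


class PhoneNumber(JsonObject):
    primitives = ("PhoneNumberID", "PhoneNumber")


class Contact(JsonObject):
    primitives = ("ContactID", "FirstName", "Email")
    json_object_lists = {"PhoneNumbers": PhoneNumber}


raw = (
    '[{"ContactID": 128, "FirstName": "Jane", '
    '"PhoneNumbers": [{"PhoneNumberID": 286, "PhoneNumber": "703-555-5610"}]}]'
)
contacts = [Contact.from_dict(item) for item in json.loads(raw)]
first = contacts[0]
print(first.FirstName, first.Email, first.PhoneNumbers[0].PhoneNumber)
```

The Email key is absent from the input on purpose: it ends up as None on the object, which is the behaviour wanted for fixed-width output lines.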
Important Final Side Note: This is just the basic Principle
My code example is just a very brief introduction to the idea of using objects and recursion to deal with an overwhelming (and nasty) nesting of a structure. The code has some flaws. For instance, one should avoid defining mutable class variables. And of course the code should validate the data it gets from the API. You might also want to add the type of each attribute and represent it correctly in the Python objects (your sample has integers, datetimes and strings, for instance).
I really only wanted to show you the very principle of Object Oriented Programming here.
I didn't take the time to test my code. So there are probably bugs left. Again, I just wanted to demonstrate the principle.
I am new to Python and would like to extract data from JSON with pandas.
The nested JSON structure is as follows:
{
"idDriver": "100001",
"defaultTripType": "private",
"fleetManagerRole": null,
"identifications": [
{
"code": "90-00-00-77-20",
"from": "2019-08-08T10:38:15Z",
"rawId": "",
"vehicle": {
"isBusinessCar": "0",
"id": "10000",
"licensePlate": "ABCD",
"class": "Suziki 1.6 CDTI",
}
}
]
}
As output I need, on one line, 'idDriver' from level 0 and 'licensePlate' from the identifications/vehicle node.
What I have tried is:
(after loading the data from the API, which works fine)
json_data = json.loads(myResponse.text)
#only unwrapping 'identifications' – works 100% fine
workdata = json_normalize(json_data, record_path= ['identifications'],
meta=['idDriver'])
#unwrapping 'identifications'\'vehicle' - is NOT working
workdata = json_normalize(json_data, record_path= ['identifications','vehicle'],
meta=['idDriver'])
I would appreciate any hint on that.
Kind Regards,
Arek
I would go for rebuilding your dictionary like this:
New_Data = {
"id" : [],
"licensePlate" : []
}
New_Data["id"].append(data["idDriver"])
New_Data["licensePlate"].append(data["identifications"][0]["vehicle"]["licensePlate"])
If you have many entries in data["identifications"] you can easily loop over them, and if you have many drivers you can do the same.
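A minimal sketch of that looping, assuming the API returns a list of driver objects shaped like the one in the question. The second identification and the second driver are invented for illustration.

```python
# Made-up input: a list of drivers shaped like the sample in the question
drivers = [
    {"idDriver": "100001", "identifications": [
        {"code": "90-00-00-77-20", "vehicle": {"id": "10000", "licensePlate": "ABCD"}},
        {"code": "90-00-00-77-21", "vehicle": {"id": "10001", "licensePlate": "EFGH"}},
    ]},
    {"idDriver": "100002", "identifications": []},
]

rows = []
for driver in drivers:
    for identification in driver.get("identifications", []):
        # Tolerate identifications without a vehicle node
        vehicle = identification.get("vehicle") or {}
        rows.append((driver["idDriver"], vehicle.get("licensePlate")))

print(rows)
```

Each tuple is one driver/license plate combination, so rows can be passed to pd.DataFrame(rows, columns=["idDriver", "licensePlate"]) if a DataFrame is needed.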
For me your first code works nicely; if necessary, just remove the vehicle. prefix from the column names:
json_data = {
"idDriver": "100001",
"defaultTripType": "private",
"fleetManagerRole": 'null',
"identifications": [
{
"code": "90-00-00-77-20",
"from": "2019-08-08T10:38:15Z",
"rawId": "",
"vehicle": {
"isBusinessCar": "0",
"id": "10000",
"licensePlate": "ABCD",
"class": "Suziki 1.6 CDTI",
}
}
]
}
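from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize
workdata = json_normalize(json_data, record_path=['identifications'], meta=['idDriver'])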
print (workdata)
code from rawId vehicle.isBusinessCar \
0 90-00-00-77-20 2019-08-08T10:38:15Z 0
vehicle.id vehicle.licensePlate vehicle.class idDriver
0 10000 ABCD Suziki 1.6 CDTI 100001
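# regex=False replaces the literal text; since pandas 2.0 the default is no longer a regex
workdata.columns = workdata.columns.str.replace('vehicle.', '', regex=False)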
print (workdata)
code from rawId isBusinessCar id \
0 90-00-00-77-20 2019-08-08T10:38:15Z 0 10000
licensePlate class idDriver
0 ABCD Suziki 1.6 CDTI 100001
I recently wrote a package to deal with tasks like this easily, it's called cherrypicker. I think the following snippet would achieve your task with CherryPicker:
from cherrypicker import CherryPicker
json_data = json.loads(myResponse.text)
picker = CherryPicker(json_data)
flat_data = picker.flatten['idDriver', 'identifications_0_vehicle_licensePlate'].get()
flat_data would then look like this (I'm assuming that your data is actually a list of objects like the one you described above):
[['100001', 'ABCD'], ...]
You can then load this into a dataframe as follows:
import pandas as pd
df = pd.DataFrame(flat_data, columns=["idDriver", "licensePlate"])
If you want to flatten your data in slightly different ways (e.g. you want every license plate/driver ID combination, not just the first license plate for each driver), then you should be able to do this too, although it may require two or three lines rather than just one. Check out the docs for examples of other ways of using it: https://cherrypicker.readthedocs.io.
To install cherrypicker, it's just pip install --user cherrypicker.
I have a puppet manifest file - init.pp for my puppet module
In this file there are parameters for the class and in most cases they're written in the same way:
Example Input:
class test_module(
$first_param = 'test',
$second_param = 'new' )
What is the best way that I can parse this file with Python and get a dict object like this, which includes all the class parameters?
Example output:
param_dict = {'first_param':'test', 'second_param':'new'}
Thanks in Advance :)
Puppet Strings is a rubygem that can be installed on top of Puppet and can output a JSON document containing lists of the class parameters, documentation etc.
After installing it (see above link), run this command either in a shell or from your Python program to generate JSON:
puppet strings generate --emit-json-stdout init.pp
This will generate:
{
"puppet_classes": [
{
"name": "test_module",
"file": "init.pp",
"line": 1,
"docstring": {
"text": "",
"tags": [
{
"tag_name": "param",
"text": "",
"types": [
"Any"
],
"name": "first_param"
},
{
"tag_name": "param",
"text": "",
"types": [
"Any"
],
"name": "second_param"
}
]
},
"defaults": {
"first_param": "'test'",
"second_param": "'new'"
},
"source": "class test_module(\n $first_param = 'test',\n $second_param = 'new' ) {\n}"
}
]
}
(JSON trimmed slightly for brevity)
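You can load the JSON in Python with json.loads, and extract the parameter names from root["puppet_classes"][0]["docstring"]["tags"] (where tag_name is param) and any default values from root["puppet_classes"][0]["defaults"]. Note that puppet_classes is a list, so index it as shown or loop over it if the output contains more than one class.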
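As a sketch, the extraction could look like this. The JSON string is a trimmed copy of the output above, params_from_strings_json is a hypothetical helper, and stripping the surrounding quotes from the defaults is a naive assumption that only suits simple string values.

```python
import json

# Trimmed copy of the Puppet Strings output shown above
strings_output = """
{
  "puppet_classes": [
    {
      "name": "test_module",
      "docstring": {"tags": [
        {"tag_name": "param", "name": "first_param"},
        {"tag_name": "param", "name": "second_param"}
      ]},
      "defaults": {"first_param": "'test'", "second_param": "'new'"}
    }
  ]
}
"""


def params_from_strings_json(text):
    """Map each class name to a dict of parameter name -> default (or None)."""
    root = json.loads(text)
    result = {}
    for puppet_class in root["puppet_classes"]:
        defaults = puppet_class.get("defaults", {})
        params = {}
        for tag in puppet_class["docstring"]["tags"]:
            if tag["tag_name"] != "param":
                continue
            default = defaults.get(tag["name"])
            if default is not None:
                # Naive: drop the Puppet quoting around simple string defaults
                default = default.strip("'\"")
            params[tag["name"]] = default
        result[puppet_class["name"]] = params
    return result


param_dict = params_from_strings_json(strings_output)["test_module"]
print(param_dict)
```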
You can use a regular expression (straightforward but fragile):
import re

def parse(data):
    # re.DOTALL lets "." match newlines, since the parameter list spans several lines
    params_match = re.search(r'\((.*?)\)', data, re.DOTALL)
    dd = {}
    if not params_match:
        return dd
    matches = re.finditer(r"\s*\$(.*?)\s*=\s*'(.*?)'", params_match.group(1))
    for mm in matches:
        dd[mm.group(1)] = mm.group(2)
    return dd
You can use it as follows:
with open(filename) as ff:  # filename is the path to your init.pp
    dd = parse(ff.read())
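A quick self-contained check of the regular-expression approach against the example input from the question. This is a variation written for illustration, not the code above verbatim: parse_params is a hypothetical name, parameter names are matched with \w+, and only single-quoted string defaults are handled. The re.DOTALL flag matters because the parameter list spans several lines.

```python
import re

SAMPLE = """class test_module(
  $first_param = 'test',
  $second_param = 'new' )
"""


def parse_params(manifest_text):
    """Extract $name = 'value' pairs from the first parenthesised block."""
    # DOTALL so that "." also matches the newlines inside the parentheses
    block = re.search(r"\((.*?)\)", manifest_text, re.DOTALL)
    if block is None:
        return {}
    pairs = re.findall(r"\$(\w+)\s*=\s*'(.*?)'", block.group(1))
    return dict(pairs)


param_dict = parse_params(SAMPLE)
print(param_dict)
```

Defaults that contain a closing parenthesis, double-quoted strings, or non-string values are not handled, which is the fragility mentioned above.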
I don't know about the "best" way, but one way would be:
1) Set up Rspec-puppet (see google or my blog post for how to do that).
2) Compile your code and generate a Puppet catalog. See my other blog post for that.
Now, the Puppet catalog you compiled is a JSON document.
3) Visually inspect the JSON document to find the data you are looking for. Its precise location in the JSON document depends on the version of Puppet you are using.
4) You can now use Python to extract the data as a dictionary from the JSON document.