I want to share some ideas about how to access a key in a yaml file if the key may be empty or absent and see if there are other ways.
Consider three possible yaml files called 'dummy.yml':
model: phone
mode: production
model: phone
station_type: tester1
mode: production
model: phone
station_type:
mode: production
In Python we load the yaml into a python dictionary:
import yaml
with open("dummy.yml", 'r') as stream:
try:
config = yaml.load(stream)
except yaml.YAMLError as exc:
print(exc)
Now I want to access the station_type key. If the key is missing or the value is empty, we should get an empty string as the default value.
Here are some options for accessing station_type:
# KeyError if station_type missing:
st = config["station_type"]
# provide an empty string as the default value
# if key is present or absent, but return None # if value is empty:
st = config.get(
"station_type", "")
# my solution that returns a string in all three cases
st = ('' if not config.get("station_type") else config.get("station_type"))
Does this seem about right?
Here is some code testing your solution against a working solution:
import yaml
def orig_solution(config):
return ('' if not config.get("station_type") else config.get("station_type"))
def working_solution(config):
return tmp if (tmp := config.get("station_type")) is not None else ""
def test(name, solution, config):
print("\n{}:".format(name))
val = solution(config)
print("got value: {}".format(val))
config1 = yaml.safe_load("""
model: phone
mode: production
""")
config2 = yaml.safe_load("""
model: phone
station_type:
mode: production
""")
config3 = yaml.safe_load("""
model: phone
station_type: false
mode: production
""")
test("config1 with orig solution", orig_solution, config1)
test("config2 with orig solution", orig_solution, config2)
test("config3 with orig solution", orig_solution, config3)
test("config1 with working solution", working_solution, config1)
test("config2 with working solution", working_solution, config2)
test("config3 with working solution", working_solution, config3)
Output:
config1 with orig solution:
got value:
config2 with orig solution:
got value:
config3 with orig solution:
got value:
config1 with working solution:
got value:
config2 with working solution:
got value:
config3 with working solution:
got value: False
As you can see, your code returns the empty string on config3, which has false as the value. The working solution returns Python's False. The other inputs work with both solutions.
Now I am not completely sure which is the expected behavior in this case, but according to how you describe it, False should be the correct return value here and therefore your solution has a bug.
Related
I'm trying to delete certain registry keys, via python script.
i have no problems reading and deleting keys from the "HKEY_CURRENT_USER", but trying to do the same from the "HKEY_LOCAL_MACHINE", gives me the dreaded WindowsError: [Error 5] Access is denied.
i'm running the script via the IDLE IDE, with admin privileges.
here's the code:
from _winreg import *
ConnectRegistry(None,HKEY_LOCAL_MACHINE)
OpenKey(HKEY_LOCAL_MACHINE,r'software\wow6432node\App',0,KEY_ALL_ACCESS)
DeleteKey(OpenKey(HKEY_LOCAL_MACHINE,r'software\wow6432node'),'App')
You need to remove all subkeys before you can delete the key.
def deleteSubkey(key0, key1, key2=""):
import _winreg
if key2=="":
currentkey = key1
else:
currentkey = key1+ "\\" +key2
open_key = _winreg.OpenKey(key0, currentkey ,0,_winreg.KEY_ALL_ACCESS)
infokey = _winreg.QueryInfoKey(open_key)
for x in range(0, infokey[0]):
#NOTE:: This code is to delete the key and all subkeys.
# If you just want to walk through them, then
# you should pass x to EnumKey. subkey = _winreg.EnumKey(open_key, x)
# Deleting the subkey will change the SubKey count used by EnumKey.
# We must always pass 0 to EnumKey so we
# always get back the new first SubKey.
subkey = _winreg.EnumKey(open_key, 0)
try:
_winreg.DeleteKey(open_key, subkey)
print "Removed %s\\%s " % ( currentkey, subkey)
except:
deleteSubkey( key0, currentkey, subkey )
# no extra delete here since each call
#to deleteSubkey will try to delete itself when its empty.
_winreg.DeleteKey(open_key,"")
open_key.Close()
print "Removed %s" % (currentkey)
return
Here is an how you run it:
deleteSubkey(_winreg.HKEY_CURRENT_USER, "software\\wow6432node", "App")
deleteSubkey(_winreg.HKEY_CURRENT_USER, "software\\wow6432node\\App")
Just my two cents on the topic, but I recurse to the lowest subkey and delete on unravel:
def delete_sub_key(root, sub):
try:
open_key = winreg.OpenKey(root, sub, 0, winreg.KEY_ALL_ACCESS)
num, _, _ = winreg.QueryInfoKey(open_key)
for i in range(num):
child = winreg.EnumKey(open_key, 0)
delete_sub_key(open_key, child)
try:
winreg.DeleteKey(open_key, '')
except Exception:
# log deletion failure
finally:
winreg.CloseKey(open_key)
except Exception:
# log opening/closure failure
The difference between the other posts is that I do not try to delete if num is >0 because it will fail implicitly (as stated in the docs). So I don't waste time to try if there are subkeys.
[EDIT]
I have created a pip package that handles registry keys.
Install with: pip install windows_tools.registry
Usage:
from windows_tools.registry import delete_sub_key, KEY_WOW64_32KEY, KEY_WOW64_64KEY
keys = ['SOFTWARE\MyInstalledApp', 'SOFTWARE\SomeKey\SomeOtherKey']
for key in keys:
delete_sub_key(key, arch=KEY_WOW64_32KEY | KEY_WOW64_64KEY)
[/EDIT]
Unburying this old question, here's an updated version of ChrisHiebert's recursive function that:
Handles Python 3 (tested with Python 3.7.1)
Handles multiple registry architectures (eg Wow64 for Python 32 on Windows 64)
Is PEP-8 compliant
The following example shows function usage to delete two keys in all registry architectures (standard and redirected WOW6432Node) by using architecture key masks.
Hopefully this will help someone:
import winreg
def delete_sub_key(key0, current_key, arch_key=0):
open_key = winreg.OpenKey(key0, current_key, 0, winreg.KEY_ALL_ACCESS | arch_key)
info_key = winreg.QueryInfoKey(open_key)
for x in range(0, info_key[0]):
# NOTE:: This code is to delete the key and all sub_keys.
# If you just want to walk through them, then
# you should pass x to EnumKey. sub_key = winreg.EnumKey(open_key, x)
# Deleting the sub_key will change the sub_key count used by EnumKey.
# We must always pass 0 to EnumKey so we
# always get back the new first sub_key.
sub_key = winreg.EnumKey(open_key, 0)
try:
winreg.DeleteKey(open_key, sub_key)
print("Removed %s\\%s " % (current_key, sub_key))
except OSError:
delete_sub_key(key0, "\\".join([current_key,sub_key]), arch_key)
# No extra delete here since each call
# to delete_sub_key will try to delete itself when its empty.
winreg.DeleteKey(open_key, "")
open_key.Close()
print("Removed %s" % current_key)
return
# Allows to specify if operating in redirected 32 bit mode or 64 bit, set arch_keys to 0 to disable
arch_keys = [winreg.KEY_WOW64_32KEY, winreg.KEY_WOW64_64KEY]
# Base key
root = winreg.HKEY_LOCAL_MACHINE
# List of keys to delete
keys = ['SOFTWARE\MyInstalledApp', 'SOFTWARE\SomeKey\SomeOtherKey']
for key in keys:
for arch_key in arch_keys:
try:
delete_sub_key(root, key, arch_key)
except OSError as e:
print(e)
Figured it out!
turns out the registry key wasn't empty and contained multiple subkeys.
i had to enumerate and delete the subkeys first, and only then i was able to delete the main key from HKLM.
(also added "try...except", so it wouldn't break the whole code, it case there were problems).
This is my solution. I like to use with statements in order to not have to close the key manually. First I check for sub keys and delete them before I delete the key itself. EnumKey raises an OSError if no sub key exists. I use this to break out of the loop.
from winreg import *
def delete_key(key: Union[HKEYType, int], sub_key_name: str):
with OpenKey(key, sub_key_name) as sub_key:
while True:
try:
sub_sub_key_name = EnumKey(sub_key, 0)
delete_key(sub_key, sub_sub_key_name)
except OSError:
break
DeleteKey(key, sub_key_name)
config.yml example,
DBtables:
CurrentMinuteLoad:
CSV_File: trend.csv
Table_Name: currentminuteload
GUI image,
This may not be the cleanest route to take.
I'm making a GUI that creates a config.yml file for another python script I'm working with.
Using pysimplegui, My button isn't functioning the way I'd expect it to. It currently and accurately checks for the Reference name (example here would be CurrentMinuteLoad) and will kick it back if it exists, but will skip the check for the table (so the ELIF statement gets skipped). Adding the table still works, I'm just not getting the double-check that I want. Also, I have to hit the Okay button twice in the GUI for it to work?? A weird quirk that doesn't quite make sense to me.
def add_table():
window2.read()
with open ("config.yml","r") as h:
if values['new_ref'] in h.read():
sg.popup('Reference name already exists')
elif values['new_db'] in h.read():
sg.popup('Table name already exists')
else:
with open("config.yml", "a+") as f:
f.write("\n " + values['new_ref'] +":")
f.write("\n CSV_File:" + values['new_csv'])
f.write("\n Table_Name:" + values['new_db'])
f.close()
sg.popup('The reference "' + values['new_ref'] + '" has been included and will add the table "' + values['new_db'] + '" to PG Admin during the next scheduled upload')
When you use h.read(), you should save the value since it will read it like a stream, and subsequent calls for this method will result in an empty string.
Try editing the code like this:
with open ("config.yml","r") as h:
content = h.read()
if values['new_ref'] in content:
sg.popup('Reference name already exists')
elif values['new_db'] in content:
sg.popup('Table name already exists')
else:
# ...
You should update the YAML file using a real YAML parser, that will allow you
to check on duplicate values, without using in, which will give you false
positives when a new value is a substring of an existing value (or key).
In the following I add values twice, and show the resulting YAML. The
first time around the check on new_ref and new_db does not find
a match although it is a substring of existing values. The second time
using the same values there is of course a match on the previously added
values.
import sys
import ruamel.yaml
from pathlib import Path
def add_table(filename, values, verbose=False):
error = False
yaml = ruamel.yaml.YAML()
data = yaml.load(filename)
dbtables = data['DBtables']
if values['new_ref'] in dbtables:
print(f'Reference name "{values["new_ref"]}" already exists') # use sg.popup in your code
error = True
for k, v in dbtables.items():
if values['new_db'] in v.values():
print(f'Table name "{values["new_db"]}" already exists')
error = True
if error:
return
dbtables[values['new_ref']] = d = {}
for x in ['new_cv', 'new_db']:
d[x] = values[x]
yaml.dump(data, filename)
if verbose:
sys.stdout.write(filename.read_text())
values = dict(new_ref='CurrentMinuteL', new_cv='trend_csv', new_db='currentminutel')
add_table(Path('config.yaml'), values, verbose=True)
print('========')
add_table(Path('config.yaml'), values, verbose=True)
which gives:
DBtables:
CurrentMinuteLoad:
CSV_File: trend.csv
Table_Name: currentminuteload
CurrentMinuteL:
new_cv: trend_csv
new_db: currentminutel
========
Reference name "CurrentMinuteL" already exists
Table name "currentminutel" already exists
I am trying to append duplicate key: value pair to a nested dictionary in a YAML file through python script. Following is the snippet of the code which I have written to achieve this:
import click
import ruamel.yaml
def organization():
org_num = int(input("Please enter the number of organizations to be created: "))
org_val = 0
while org_val!= org_num:
print ("")
print("Please enter values to create Organizations")
print ("")
for org in range(org_num):
organization.org_name = str(raw_input("Enter the Organization Name: "))
organization.org_description = str(raw_input("Enter the Description of Organization: "))
print ("")
if click.confirm("Organization Name: "+ organization.org_name + "\nDescription: "+ organization.org_description + "\nIs this Correct?", default=True):
if org_val == 0:
org_val = org_val + 1
yaml = ruamel.yaml.YAML()
org_data = dict(
organizations=dict(
name=organization.org_name,
description=organization.org_description,
)
)
with open('input.yml', 'a') as outfile:
yaml.indent(mapping=2, sequence=4, offset=2)
yaml.dump(org_data, outfile)
else:
org_val = org_val + 1
yaml = ruamel.yaml.YAML()
org_data = dict(
name=organization.org_name,
description=organization.org_description,
)
with open('input.yml', 'r') as yamlfile:
cur_yaml = yaml.load(yamlfile)
cur_yaml['organizations'].update(org_data)
if cur_yaml:
with open('input.yml','w') as yamlfile:
yaml.indent(mapping=2, sequence=4, offset=2)
yaml.dump(cur_yaml, yamlfile)
return organization.org_name, organization.org_description
organization()
At the end of python script my input.yml file should look like the following:
version: x.x.x
is_enterprise: 'true'
license: secrets/license.txt
organizations:
- description: xyz
name: abc
- description: pqr
name: def
However every time the script is running, instead of appending the value to organizations it overwrites it.
I also have tried using append instead of update but I am getting the following error:
AttributeError: 'CommentedMap' object has no attribute 'append'
What can I do to solve this?
Also since I am new to development, any suggestion on making this code better will be really helpful.
If I understand correctly, what you want is cur_yaml['organizations'] += [org_data].
Note that if you run the script many times, you will have the same entry multiple times.
Using update is not going to work, because the value for for the key
organizations is a sequence and loads as the list-like type
CommentedSeq. So append-ing would be the right thing to do.
That that doesn't work is a bit unclear since you don't provide that input
that you start with, nor the code used when doing the appending which gets
an AttributeError on CommentedMap.
Here is what works if you have one organization and add another:
import sys
import ruamel.yaml
yaml_str = """\
version: x.x.x
is_enterprise: 'true'
license: secrets/license.txt
organizations:
- description: xyz
name: abc
"""
org_data = dict(
description='pqr',
name='def',
)
yaml = ruamel.yaml.YAML()
yaml.indent(mapping=4, sequence=4, offset=2)
cur_yaml = yaml.load(yaml_str)
cur_yaml['organizations'].append(org_data)
yaml.dump(cur_yaml, sys.stdout)
This gives:
version: x.x.x
is_enterprise: 'true'
license: secrets/license.txt
organizations:
- description: xyz
name: abc
- description: pqr
name: def
If you have no organizations yet, make sure your input YAML looks like:
version: x.x.x
is_enterprise: 'true'
license: secrets/license.txt
organizations: []
On older versions of Python the order of the keys in the data you added is not guaranteed. To
enforce that order on older version as well do:
org_data = ruamel.yaml.comments.CommentedMap((('description', 'pqr'), ('name', 'def')))
or
org_data = ruamel.yaml.comments.CommentedMap()
org_data['description'] = 'pqr'
org_data['name'] = 'def'
Found the issue and its working fine now. Since name and description are list objects for organizations, I have added [] to the below code and it started working.
org_data = dict(
organizations=[dict(
name=tower_organization.org_name,
description=tower_organization.org_description,
)
]
)
In addition to above, I guess append wasn't working because of the hyphen '-' which was missing from the first object as part of the identation. After fixing the above code, append is working fine as well.
Thank you all for your answers.
How can I get the comments when I iterate through the YAML object
yaml = YAML()
with open(path, 'r') as f:
yaml_data = yaml.load(f)
for obj in yaml_data:
# how to get the comments here?
This is the source data (an ansible playbook)
---
- name: gather all complex custom facts using the custom module
hosts: switches
gather_facts: False
connection: local
tasks:
# There is a bug in ansible 2.4.1 which prevents it loading
# playbook/group_vars
- name: ensure we're running a known working version
assert:
that:
- 'ansible_version.major == 2'
- 'ansible_version.minor == 4'
After Anthon comments, this is the way I found to access the comments in the child nodes (needs to be refined):
for idx, obj in enumerate(yaml_data):
for i, item in enumerate(obj.items()):
pprint(yaml_data[i].ca.items)
You did not specify your input, but since your code expects an obj and
not a key, I assume the root level of your YAML is a sequence and not mapping.
If you want to get the comments after each element (i.e nr 1 and the last) you can do:
import ruamel.yaml
yaml_str = """\
- one # nr 1
- two
- three # the last
"""
yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
for idx, obj in enumerate(data):
comment_token = data.ca.items.get(idx)
if comment_token is None:
continue
print(repr(comment_token[0].value))
which gives:
'# nr 1\n'
'# the last\n'
You might want to strip of the leading octothorpe and trailing newline.
Please note that this works with the current version (0.15.61), but
there is no guarantee it might not to change.
Using the example from Anthon as well as an issue in ruamel.yaml on sourceforge, here's a set of methods which should allow you to retrieve (almost - see below) all the comments in your documents:
from ruamel.yaml import YAML
from ruamel.yaml.comments import CommentedMap, CommentedSeq
# set attributes
def get_comments_map(self, key):
coms = []
comments = self.ca.items.get(key)
if comments is None:
return coms
for token in comments:
if token is None:
continue
elif isinstance(token, list):
coms.extend(token)
else:
coms.append(token)
return coms
def get_comments_seq(self, idx):
coms = []
comments = self.ca.items.get(idx)
if comments is None:
return coms
for token in comments:
if token is None:
continue
elif isinstance(token, list):
coms.extend(token)
else:
coms.append(token)
return coms
setattr(CommentedMap, 'get_comments', get_comments_map)
setattr(CommentedSeq, 'get_comments', get_comments_seq)
# load string
yaml_str = """\
- name: gather all complex custom facts using the custom module
hosts: switches
gather_facts: False
connection: local
tasks:
# There is a bug in ansible 2.4.1 which prevents it loading
# playbook/group_vars
- name: ensure we're running a known working version
assert:
that:
- 'ansible_version.major == 2'
- 'ansible_version.minor == 4'
"""
yml = YAML(typ='rt')
data = yml.load(yaml_str)
def walk_data(data):
if isinstance(data, CommentedMap):
for k, v in data.items():
print(k, [ comment.value for comment in data.get_comments(k)])
if isinstance(v, CommentedMap) or isinstance(v, CommentedSeq):
walk_data(v)
elif isinstance(data, CommentedSeq):
for idx, item in enumerate(data):
print(idx, [ comment.value for comment in data.get_comments(idx)])
if isinstance(item, CommentedMap) or isinstance(item, CommentedSeq):
walk_data(item)
walk_data(data)
Here's the output:
0 []
name []
hosts []
gather_facts []
connection []
tasks ['# There is a bug in ansible 2.4.1 which prevents it loading\n', '# playbook/group_vars\n']
0 []
name []
assert []
that []
0 []
1 []
Unfortunately, there are two is one problems that I have encountered which are not covered by this method:
You will notice that there is no leading \n in the comments for tasks. As a result, it is not possible with this method to differentiate between comments which start on the same line as tasks or on the next line. Since the CommentToken.start_mark.line only contains the absolute line of the comment, it might be able to be compared to the line of tasks. But, I have not yet found a way to retrieve the line associated with tasks inside the loaded data.
There does not seem to be a way that I have found yet to retrieve comments at the head of the document. So, any initial comments would need to be discovered using a method other than to retrieve them outside the yaml reader. But, related to problem #1, these head comments are included in the absolute line count of other comments. To add the comments at the head of the document, you need to use [comment.value for comment in data.ca.comment[1] as per this explanation by Anthon.
I have a yaml file of the form below:
Solution:
- number of solutions: 1
number of solutions displayed: 1
- Gap: None
Status: optimal
Message: bonmin\x3a Optimal
Objective:
objective:
Value: 0.010981105395
Variable:
battery_E[b1,1,1]:
Value: 0.25
battery_E[b1,1,2]:
Value: 0.259912707017
battery_E[b1,2,1]:
Value: 0.120758408109
battery_E[b2,1,1]:
Value: 0.0899999972181
battery_E[b2,2,3]:
Value: 0.198967393893
windfarm_L[w1,2,3]:
Value: 1
windfarm_L[w1,3,1]:
Value: 1
windfarm_L[w1,3,2]:
Value: 1
Using Python27, I would like to import all battery_E values from this YAML file. I know I can iterate over the keys of battery_E dictionary to retrieve them one by one (I am already doing it using PyYAML) but I would like to avoid iterating and do it in one go!
It's not possible "in one go" - there will still be some kind of iteration either way, and that's completely OK.
However, if the memory is a concern, you can load only values of the keys of interest during YAML loading:
from __future__ import print_function
import yaml
KEY = 'battery_E'
class Loader(yaml.SafeLoader):
def __init__(self, stream):
super(Loader, self).__init__(stream)
self.values = []
def compose_mapping_node(self, anchor):
start_event = self.get_event()
tag = start_event.tag
if tag is None or tag == '!':
tag = self.resolve(yaml.MappingNode, None, start_event.implicit)
node = yaml.MappingNode(tag, [],
start_event.start_mark, None,
flow_style=start_event.flow_style)
if anchor is not None:
self.anchors[anchor] = node
while not self.check_event(yaml.MappingEndEvent):
item_key = self.compose_node(node, None)
item_value = self.compose_node(node, item_key)
if (isinstance(item_key, yaml.ScalarNode)
and item_key.value.startswith(KEY)
and item_key.value[len(KEY)] == '['):
self.values.append(self.construct_object(item_value, deep=True))
else:
node.value.append((item_key, item_value))
end_event = self.get_event()
node.end_mark = end_event.end_mark
return node
with open('test.yaml') as f:
loader = Loader(f)
try:
loader.get_single_data()
finally:
loader.dispose()
print(loader.values)
Note however, that this code does not assume anything about the position of battery_E keys in the tree inside the YAML file - it will just load all of their values.
There is no need to retrieve each entry using PyYAML, you can load the data once, and then use Pythons to select the key-value pairs with the following two lines:
data = yaml.safe_load(open('input.yaml'))
kv = {k:v['Value'] for k, v in data['Solution'][1]['Variable'].items() if k.startswith('battery_E')}
after that kv contains:
{'battery_E[b2,2,3]': 0.198967393893, 'battery_E[b1,1,1]': 0.25, 'battery_E[b1,1,2]': 0.259912707017, 'battery_E[b2,1,1]': 0.0899999972181, 'battery_E[b1,2,1]': 0.120758408109}