I have a dictionary that I am using to populate a YAML config file for each key.
{'id': ['HP:000111'], 'id1': ['HP:000111'], 'id2': ['HP:0001111', 'HP:0001123']}
Code to insert each key:value pair into the YAML template using ruamel.yaml:
import ruamel.yaml
import sys
yaml = ruamel.yaml.YAML()
with open('yaml.yml') as fp:
    data = yaml.load(fp)
for k in start.keys():
    data['analysis']['hpoIds'] = start.get(k)
    with open(f"path/yaml-{k}.yml", "w+") as f:
        yaml.dump(data, sys.stdout)
This is the output I am getting:
analysis:
  # hg19 or hg38 - ensure that the application has been configured to run the specified assembly otherwise it will halt.
  genomeAssembly: hg38
  vcf:
  ped:
  proband:
  hpoIds: "['HP:000111','HP:000112','HP:000113']"
But this is what I need:
hpoIds: ['HP:000111','HP:000112','HP:000113']
I've tried string methods, i.e. strip and replace, but that didn't work.
Output from ast.literal_eval:
hpoIds:
- HP:000111
- HP:000112
- HP:000113
Output from repr:
hpoIds: "\"['HP:000111','HP: 000112','HP:000113']\""
Any help would be greatly appreciated.
It is not entirely clear to me what you are trying to do and why you e.g. open files
'w+' for dumping.
However if you have something that comes out block style and unquoted, that can easily be remedied
by using a small function:
import sys
from pathlib import Path
import ruamel.yaml
SQ = ruamel.yaml.scalarstring.SingleQuotedScalarString
def flow_seq_single_quoted(lst):
    # wrap each string so it is dumped single-quoted, then switch the sequence to flow style
    res = ruamel.yaml.CommentedSeq([SQ(x) if isinstance(x, str) else x for x in lst])
    res.fa.set_flow_style()
    return res
in_file = Path('yaml.yaml')
in_file.write_text("""\
hpoIds:
- HP:000111
- HP:000112
- HP:000113
""")
yaml = ruamel.yaml.YAML()
data = yaml.load(in_file)
data['hpoIds'] = flow_seq_single_quoted(data['hpoIds'])
yaml.dump(data, sys.stdout)
which gives:
hpoIds: ['HP:000111', 'HP:000112', 'HP:000113']
The recommended extension for YAML files has been .yaml since at least September 2006.
Related
I have a dictionary which looks like:
{'ab':8082 , 'bc': 8082}
When I dump it to python yaml, I want it to look like:
ab:8082
and not like:
ab: 8082
Is there a way to achieve this?
Your output is not valid YAML, as that requires a space after the colon in block style.
So what I recommend is post-processing the output using ruamel.yaml's transform argument to dump:
import sys
import ruamel.yaml
data = {'ab':8082 , 'bc': 8082}
def remove_space_after_colon(s):
    res = []
    for line in s.splitlines(True):
        res.append(line.replace(': ', ':', 1))  # 1, to prevent replacing in values
    return ''.join(res)
yaml = ruamel.yaml.YAML()
yaml.dump(data, sys.stdout, transform=remove_space_after_colon)
which gives:
ab:8082
bc:8082
I am using ruamel.yaml for dumping a dict to a YAML file. While doing so, I want to keep the order of the dictionary. That is how I came across the question Keep YAML file order with ruamel. But this solution is not working in my case:
The order is not preserved.
Tags like !!python/object/apply:ruamel.yaml.comments.CommentedMap and dictitems are added to the output.
import os
import ruamel.yaml
from ruamel.yaml.comments import CommentedMap as ordereddict
generated_file = os.path.join('data_TEST.yaml')
data_dict = {'Sources': {'coil': None}, 'Magnet': 'ABC', 'Current': ordereddict({'heat': {'i': [[]], 'h': None, }})}
data_dict = ordereddict(data_dict)
with open(generated_file, 'w') as yaml_file:
    ruamel.yaml.dump(data_dict, yaml_file, default_flow_style=False)
The used dictionary is just an arbitrary one and in the end an automatically created array that could look different is going to be used. So, we cannot hard-code the mapping of the dictionaries in the dictionary like in my example.
Result:
!!python/object/apply:ruamel.yaml.comments.CommentedMap
dictitems:
  Current: !!python/object/apply:ruamel.yaml.comments.CommentedMap
    dictitems:
      heat:
        h: null
        i:
        - []
  Magnet: ABC
  Sources:
    coil: null
Desired result:
Sources:
  coil: null
Magnet: ABC
Current:
  heat:
    h: null
    i:
    - []
You should really not be using the old PyYAML API that sorts keys when dumping.
Instantiate a YAML instance and use its dump method:
yaml = ruamel.yaml.YAML()
yaml.dump(data, stream)
I have a few small dictionaries which I would like to include in one yaml file and access each one of them separately with PyYAML.
After having some trouble finding out how to write the YAML file, I ended up with this version, where --- is supposed to distinguish the two dictionaries, which are named as elements and as parameters. This was inspired by this post and answer
--- !elements
n: 'N'
p: 'P'
k: 'K'
--- !parameters
ph: 'pH'
org_mat : 'organic matter'
To continue, I created a variable with the name of the path of the file: yaml_fpath = r"\Users\user1\Desktop\yaml_file" and I tried several methods to access the dictionaries, such as:
for item in yaml.safe_load_all(yaml_fpath):
    print(item)
or
yaml.safe_load(open(yaml_fpath, 'r', encoding = 'utf-8'))
but none does what I need. In fact, I would like to load the file and be able to call each dictionary by each name when I need to use it.
What am I missing from the documentation of PyYAML?
Assume you have multiple YAML files in a directory that consist of a root level mapping that is tagged:
!elements
n: 'N'
p: 'P'
k: 'K'
If these files have the recommended suffix for YAML files you can combine them using:
import sys
import ruamel.yaml
from pathlib import Path
file_out = Path('out.yaml')
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.explicit_start = True
data = []
for file_name in Path('.').glob('*.yaml'):
    if file_name.name == file_out.name:
        continue  # skip the combined output file itself
    print('appending', file_name)
    data.append(yaml.load(file_name))
yaml.dump_all(data, file_out)
print(file_out.read_text())
which gives:
appending file1.yaml
appending file2.yaml
--- !elements
n: 'N'
p: 'P'
k: 'K'
--- !parameters
ph: 'pH'
org_mat: 'organic matter'
There is no need to register any classes that can handle the tag if you use the (default)
roundtrip mode. You do have to set explicit_start to get the leading directives end indicator
(---, often incorrectly called document separator, although it doesn't have to
appear at the beginning of a document). The other directives end indicators are a
result of using dump_all() instead of dump().
If you want to access some value, assuming you don't know where it is in out.yaml, but knowing
the tag and key, you can do:
import sys
import ruamel.yaml
from pathlib import Path
file_in = Path('out.yaml')
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load_all(file_in)
my_tag = 'parameters'
my_key = 'org_mat'
for d in data:
    if d.tag.value == '!' + my_tag:
        if my_key in d:
            print('found ->', d[my_key])
which gives:
found -> organic matter
I am trying to dump a dict object as YAML using the snippet below:
from ruamel.yaml import YAML
# YAML settings
yaml = YAML(typ="rt")
yaml.default_flow_style = False
yaml.explicit_start = False
yaml.indent(mapping=2, sequence=4, offset=2)
rip= {"rip_routes": ["23.24.10.0/15", "23.30.0.10/15", "50.73.11.0/16", "198.0.0.0/16"]}
file = 'test.yaml'
with open(file, "w") as f:
    yaml.dump(rip, f)
It dumps correctly, but I am getting a newline appended to the end of the list:
rip_routes:
  - 23.24.10.0/15
  - 23.30.0.10/15
  - 50.73.11.0/16
  - 198.0.0.0/16
I don't want the new line to be inserted at the end of file. How can I do it?
The newline is part of the representation code for block style sequence elements. And since that code
doesn't have much knowledge about context, and certainly not about representing the last element to be dumped
in a document, it is almost impossible for the final newline not to be output.
However, the .dump() method has an optional transform parameter that allows you to
run the output of the dumped text through some filter:
import pathlib
import ruamel.yaml
# YAML settings
yaml = ruamel.yaml.YAML(typ="rt")
yaml.default_flow_style = False
yaml.explicit_start = False
yaml.indent(mapping=2, sequence=4, offset=2)
rip = {"rip_routes": ["23.24.10.0/15", "23.30.0.10/15", "50.73.11.0/16", "198.0.0.0/16"]}
def strip_final_newline(s):
    # return the dumped text unchanged unless it ends in a newline
    if not s or s[-1] != '\n':
        return s
    return s[:-1]
file = pathlib.Path('test.yaml')
yaml.dump(rip, file, transform=strip_final_newline)
print(repr(file.read_text()))
which gives:
'rip_routes:\n  - 23.24.10.0/15\n  - 23.30.0.10/15\n  - 50.73.11.0/16\n  - 198.0.0.0/16'
It is better to use Path() instances as in the code above,
especially if your YAML document is going to contain non-ASCII characters.
I am using Ruamel to preserve quote styles in human-edited YAML files.
I have example input data as:
---
a: '1'
b: "2"
c: 3
I read in data using:
def read_file(f):
    with open(f, 'r') as _f:
        return ruamel.yaml.round_trip_load(_f.read(), preserve_quotes=True)
I then edit that data:
data = read_file('in.yaml')
data['foo'] = 'bar'
I write back to disk using:
def write_file(f, data):
    with open(f, 'w') as _f:
        _f.write(ruamel.yaml.dump(data, Dumper=ruamel.yaml.RoundTripDumper, width=1024))
write_file('out.yaml', data)
And the output file is:
a: '1'
b: "2"
c: 3
foo: bar
Is there a way I can enforce hard quoting of the string 'bar' without also enforcing that quoting style throughout the rest of the file?
(Also, can I stop it from deleting the three dashes --- ?)
In order to preserve quotes (and literal block style) for string scalars, ruamel.yaml¹—in round-trip-mode—represents these scalars as SingleQuotedScalarString, DoubleQuotedScalarString and PreservedScalarString. The class definitions for these very thin wrappers can be found in scalarstring.py.
When serializing, such instances are written "as they were read", although sometimes the representer falls back to double quotes when things get difficult, as that can represent any string.
To get this behaviour when adding new key-value pairs (or when updating an existing pair), you just have to create these instances yourself:
import sys
from ruamel.yaml import YAML
from ruamel.yaml.scalarstring import SingleQuotedScalarString, DoubleQuotedScalarString
yaml_str = """\
---
a: '1'
b: "2"
c: 3
"""
yaml = YAML()
yaml.preserve_quotes = True
yaml.explicit_start = True
data = yaml.load(yaml_str)
data['foo'] = SingleQuotedScalarString('bar')
data.yaml_add_eol_comment('# <- single quotes added', 'foo', column=20)
yaml.dump(data, sys.stdout)
gives:
---
a: '1'
b: "2"
c: 3
foo: 'bar' # <- single quotes added
Setting yaml.explicit_start = True recreates the (superfluous) document start marker. Whether such a marker was in the original file or not is not "known" by the top-level dictionary object, so you have to re-add it by hand.
Please note that without preserve_quotes, there would be (single) quotes around the values 1 and 2 anyway to make sure they are seen as string scalars and not as integers.
¹ Of which I am the author.
Since Ruamel 0.15, set the preserve_quotes flag like this:
from ruamel.yaml import YAML
from pathlib import Path
yaml = YAML(typ='rt') # Round trip loading and dumping
yaml.preserve_quotes = True
data = yaml.load(Path("in.yaml"))
yaml.dump(data, Path("out.yaml"))