Python string formatting of multi-level dict

A common technique for rendering values from template strings looks like this:
>>> num = 7
>>> template = 'there were {num} dwarves'
>>> print template.format(**locals())
there were 7 dwarves
This approach works for any data type that has a __str__ method, e.g. dicts:
>>> data = dict(name='Bob', age=43)
>>> template = 'goofy example 1 {data}'
>>> print template.format(**locals())
goofy example 1 {'age': 43, 'name': 'Bob'}
However it doesn't work when a dict item is referenced by key:
>>> template = 'goofy example 2 {data["name"]}'
>>> print template.format(**locals())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: '"name"'
It's inconvenient, and seems odd, that an identifier that's valid in code outside the format string is invalid when used within the format string. Am I missing something? Is there a way to do this?
I'd like to be able to reference an element several layers down in a nested dictionary structure, like somedict['level1key']['level2key']['level3key']. So far my only workable approach has been to copy these values to a scalar variable just for string formatting, which is icky.

You can do it by using {data[name]} instead of {data["name"]} in the string.
The types of things you can specify in a format string are restricted. Arbitrary expressions aren't allowed, and keys are interpreted in a simplified way, as described in the documentation:
it is not possible to specify arbitrary dictionary keys (e.g., the strings '10' or ':-]') within a format string.
In this case, you can get your key out because it's a simple alphanumeric string, but, as the docs suggest, you can't always do so. If you have unusual dict keys, you may have to change them to use them in a format string, or resort to other methods (like concatenating string values explicitly with +).
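For the nested case from the question, the same unquoted-key syntax can be chained, one bracket per level. A minimal sketch, assuming Python 3 and a made-up somedict whose keys are simple alphanumeric strings (the contents are hypothetical, only for illustration):

```python
# Hypothetical nested structure standing in for somedict from the question.
somedict = {"level1key": {"level2key": {"level3key": "deep value"}}}

# Keys are written without quotes; each [...] indexes one level down.
template = "found: {d[level1key][level2key][level3key]}"
result = template.format(d=somedict)
print(result)  # found: deep value

# With quotes, the quote characters become part of the key that is looked up.
quoted_error = None
try:
    'found: {d["level1key"]}'.format(d=somedict)
except KeyError as exc:
    quoted_error = exc
print(repr(quoted_error))
```

Passing the dict as a named argument (d=somedict here) also avoids relying on locals().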

Related

Store formatted strings, pass in values later?

I have a dictionary with a lot of strings.
Is it possible to store a formatted string with placeholders and pass in the actual values later?
I'm thinking of something like this:
d = {
"message": f"Hi There, {0}"
}
print(d["message"].format("Dave"))
The above code obviously doesn't work but I'm looking for something similar.

You used an f-string, so the 0 was interpolated immediately. Remove the f prefix:
d = {
# no f here
"message": "Hi There, {0}"
}
print(d["message"].format("Dave"))
Hi There, Dave

Issue: mixing f-String with str.format
Technique     Python version
f-String      since 3.6
str.format    since 2.6
Your dict-value contains an f-String which is immediately evaluated.
So the expression inside the curly braces ({0}) is interpolated immediately (it becomes 0), and the value assigned is "Hi There, 0".
The .format argument "Dave" is then ignored, because the string no longer contains any {} placeholder. The string is printed as is:
Hi There, 0
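The difference can be checked side by side. A small sketch, assuming Python 3.6 or later (the names eager and deferred are only illustrative):

```python
# f-string: {0} is the expression 0 and is evaluated right here.
eager = f"Hi There, {0}"

# Plain string: the {0} placeholder survives until .format() is called.
deferred = "Hi There, {0}"

print(eager)                     # Hi There, 0
print(eager.format("Dave"))      # Hi There, 0  (no placeholder left to fill)
print(deferred.format("Dave"))   # Hi There, Dave
```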
Attempt to use f-String
What happens if we use a variable name like name instead of the constant integer 0 ?
Let's try on Python's console (REPL):
>>> d = {"message": f"Hi There, {name}"}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'name' is not defined
OK, we must define the variable before. Let's assume we did:
>>> name = "Dave"; d = {"message": f"Hi There, {name}"}
>>> print(d["message"])
Hi There, Dave
This works. But it requires the variable or expression inside the curly-braces to be valid at runtime, at location of definition: name is required to be defined before.
The case for str.format
There are reasons to prefer str.format:
when you need to read templates from external sources (e.g. a file or database)
when placeholders, rather than variables, are configured independently of your source code
In those cases, indexed placeholders should be preferred to named variables.
Consider a given database column message with value "Hello, {1}. You are {0}.". It can be read and used independently from the implementation (programming-language, surrounding code).
For example
in Java: MessageFormat.format(message, 74, "Eric")
in Python: message.format(74, 'Eric').
See also:
Format a message using MessageFormat.format() in Java
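As a rough illustration of a template that lives outside the code, the sketch below fakes the database column with a plain dict. stored_templates and its key are hypothetical stand-ins for whatever storage is actually used:

```python
# Hypothetical stand-in for a value read from a database column or a file.
stored_templates = {"message": "Hello, {1}. You are {0}."}

# The template is loaded first and filled in later, by position.
message = stored_templates["message"]
greeting = message.format(74, "Eric")
print(greeting)  # Hello, Eric. You are 74.
```

Because the placeholders are positional, the stored text does not depend on any variable names in the calling code.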

Slicing a string from inside a formatted string gives 'TypeError: string indices must be integers'

Shouldn't both these commands do the same thing?
>>> "{0[0:5]}".format("lorem ipsum")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: string indices must be integers
>>> "{0}".format("lorem ipsum"[0:5])
'lorem'
The commands
>>> "{0[0]}".format("lorem ipsum")
'l'
and
>>> "{0}".format("lorem ipsum"[0])
'l'
evaluate to the same result. (I know that I can use other methods to do this; I am mainly just curious as to why it doesn't work.)

The str.format syntax is handled by the library and supports only a few “expression” syntaxes that are not the same as regular Python syntax. For example,
"{0[foo]}".format(dict(foo=2)) # "2"
works without quotes around the dictionary key. Of course, there are limitations from this simplicity, like not being able to refer to a key with a ] in it, or interpreting a slice, as in your example.
Note that the f-strings mentioned by kendall are handled by the compiler and (fittingly) use (almost) unrestricted expression syntax. They need that power since they lack the obvious alternative of placing those expressions in the argument list to format.
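A short sketch of the alternatives, assuming Python 3.6 or later. The variable names are illustrative, and the precision variant is an extra option not mentioned above:

```python
text = "lorem ipsum"

# Slice in the argument list, where normal Python syntax applies.
via_argument = "{0}".format(text[0:5])

# f-strings accept the slice expression directly.
via_fstring = f"{text[0:5]}"

# A precision in the format spec truncates a string to that many characters.
via_precision = "{0:.5}".format(text)

# Inside a str.format replacement field, "0:5" is passed as a string index,
# which str.__getitem__ rejects.
try:
    "{0[0:5]}".format(text)
    slice_failed = False
except TypeError:
    slice_failed = True

print(via_argument, via_fstring, via_precision, slice_failed)
```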

How does a format specifier take its values when a tuple from a list is passed?

I have a piece of code as below:
tupvalue = [('html', 96), ('css', 115), ('map', 82)]
So while printing the above tuple in the desired format for a particular index I found a code like this:
>>> '%s:%d' % tupvalue[0]
'html:96'
I'm wondering how the single value tupvalue[0] is recognised as a tuple of two values by the format specifier '%s:%d'? Please explain this mechanism with a documentation reference.
How can I use a comprehension to format all the values in tupvalue in the required format as in the example shown?

First, the easy question:
How can I use a comprehension to format all the values in tupvalue in the required format as in the example shown?
That's a list comprehension: ['%s:%d' % t for t in tupvalue]
Now, the harder question!
how the single value tupvalue[0] is recognised as a tuple of two values by the format specifier '%s:%d'?
Your intuition that something a bit strange is going on here is correct. Tuples are special-cased in the language for use with string formatting.
>>> '%s:%d' % ('css', 115) # tuple is seen as two elements
'css:115'
>>> '%s:%d' % ['css', 115] # list is just seen as one object!
TypeError: not enough arguments for format string
The percent-style string formatting does not duck-type properly. So, if you actually wanted to format a tuple, you'll have to wrap it in another tuple, unlike any other kind of object:
>>> '%s' % []
'[]'
>>> '%s' % ((),)
'()'
>>> '%s' % ()
TypeError: not enough arguments for format string
The relevant section of the documentation is at section 4.7.2. printf-style String Formatting, where it is mentioned:
If format requires a single argument, values may be a single non-tuple object. Otherwise, values must be a tuple with exactly the number of items specified by the format string
The odd handling of tuples is one of the quirks called out in the note at the beginning of that section of the documentation, and one of the reasons that the newer string formatting method str.format is recommended instead.
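To compare the two styles on the data from the question, here is a minimal sketch (the result variable names are made up for illustration):

```python
tupvalue = [('html', 96), ('css', 115), ('map', 82)]

# Percent style: each tuple is unpacked into the two specifiers.
percent_style = ['%s:%d' % t for t in tupvalue]

# str.format: the unpacking is explicit via *t, so no special-casing is needed.
format_style = ['{}:{}'.format(*t) for t in tupvalue]

# Formatting a tuple as a single value:
wrapped = '%s' % (('css', 115),)     # must be wrapped in a 1-tuple
direct = '{}'.format(('css', 115))   # works as-is

print(percent_style)  # ['html:96', 'css:115', 'map:82']
print(wrapped)        # ('css', 115)
```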
Note that the handling of the string formatting happens at runtime†. You can verify this with the abstract syntax tree:
>>> import ast
>>> ast.dump(ast.parse('"%s" % val'))
"Module(body=[Expr(value=BinOp(left=Str(s='%s'), op=Mod(), right=Name(id='val', ctx=Load())))])"
'%s' % val parses to a binary operation on '%s' and val, which is handled like '%s'.__mod__(val); in CPython before 3.11 that's a BINARY_MODULO opcode (later versions use the generic BINARY_OP). This means it's usually up to the str type to decide what to do when the val received is incorrect*, which occurs only once the expression is evaluated, i.e. once the interpreter has reached that line. So, it doesn't really matter whether the val is the wrong type or has too few/too many elements - that's a runtime error, not a syntax error.
† Except in some special cases where CPython's peephole optimizer is able to "constant fold" it at compile time.
* Unless val's type subclasses str, in which case type(val).__rmod__ should be able to control the result.
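The runtime nature of the check can be seen with a small function. This is only a sketch; render is a hypothetical helper, not something from the question:

```python
def render(val):
    # This compiles fine; the argument count is only checked when the line runs.
    return '%s:%d' % val

ok = render(('css', 115))

# Too few and too many elements both fail only at call time, with TypeError.
errors = []
for bad in (('css',), ('css', 115, 'extra')):
    try:
        render(bad)
    except TypeError as exc:
        errors.append(str(exc))

print(ok)
for message in errors:
    print(message)
```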

Use Python reserved words in an XML File

I'm currently trying to use python's (3.6) xml.etree.ElementTree commands to write an xml file. Some of the Elements and Subelements I need to write must have "id" and "map" fields, which are reserved python words.
My problem is contained in the following line of code:
ET.SubElement(messages,'trigger',thing='1',bob='a', max='5')
But "max" is a function and I can't use it. Is there a character I can place there to allow me to write this field as I desire? Or some sort of known workaround?
EDIT: I am aware that an '_' stops the python from processing the word, but unfortunately this underscore will show up in my file...so I am trying to see if there is an 'invisible' option for the file I will later be writing.
Thanks much!

Python functions are no problem in the left side of a keyword expression:
>>> def abc(**kwargs):
...     print(kwargs)
...
>>> abc(id=2)
{'id': 2}
>>>
id, map, int, float, str, repr, etc. are built-in symbols, not reserved words. You may use them like any other name, but assigning one of them another value replaces the built-in symbol:
>>> int(2.5)
2
>>> int = "5"
>>> int(2.5)
Traceback (most recent call last):
File "<pyshell#10>", line 1, in <module>
int(2.5)
TypeError: 'str' object is not callable
Notice how the assignment int = "5" is entirely legal, but will trigger a warning if you have a good IDE like PyCharm.
If you want to send an actual reserved word to a function, like None, yield, or try (or print in Python 2), you can use the double star ** to convert a dictionary into keyword arguments, for example:
>>> abc(**{"print":2, "None":3})
{'print': 2, 'None': 3}
I hope this answers your question!
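Applied to the ElementTree call from the question, this could look roughly as follows. It is a sketch assuming Python 3; the element names root and messages and the attribute values are placeholders:

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")
messages = ET.SubElement(root, "messages")

# Built-in names such as max, id and map are ordinary identifiers,
# so they work directly as keyword arguments.
ET.SubElement(messages, "trigger", thing="1", bob="a", max="5", id="t1")

# Real reserved words (class, for, ...) cannot be written as keywords,
# but they can be passed in a dict as the attrib argument.
ET.SubElement(messages, "trigger", {"class": "x", "for": "y"})

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Unpacking the dict with ** would work for the second call as well, since SubElement accepts extra attributes as keyword arguments.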

Converting string to tuple and adding to tuple

I have a config file like this.
[rects]
rect1=(2,2,10,10)
rect2=(12,8,2,10)
I need to loop through the values and convert them to tuples.
I then need to make a tuple of the tuples like
((2,2,10,10), (12,8,2,10))

Instead of using a regex or int/string functions, you could also use the ast module's literal_eval function, which only evaluates strings that are valid Python literals. This function is safe (according to the docs).
http://docs.python.org/library/ast.html#ast.literal_eval
import ast
ast.literal_eval("(1,2,3,4)") # (1,2,3,4)
And, like others have said, ConfigParser works for parsing the INI file.

To turn the strings into tuples of ints (which is, I assume, what you want), you can use a regex like this:
import re
x = "(1,2,3)"
t = tuple(int(v) for v in re.findall("[0-9]+", x))
And you can use, say, configparser to parse the config file.

Assuming cp is the ConfigParser object for a cfg file containing this config:
[rects]
rect1=(2,2,10,10)
rect2=(12,8,2,10)
>>> import ast
>>> tuple(ast.literal_eval(v[1]) for v in cp.items('rects'))
((2, 2, 10, 10), (12, 8, 2, 10))
Edit: Changed eval() to a safer version, literal_eval()
From the Python docs, literal_eval() does the following:
Safely evaluate an expression node or a string containing a Python
expression. The string or node provided may only consist of the following
Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
and None
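A self-contained version of the above, as a sketch assuming Python 3 (where the module is named configparser). The config is read from a string here only so that the example runs without a file:

```python
import ast
import configparser

# Same content as the config file in the question, inlined for the example.
config_text = """
[rects]
rect1=(2,2,10,10)
rect2=(12,8,2,10)
"""

cp = configparser.ConfigParser()
cp.read_string(config_text)

# items() yields (name, value) pairs; literal_eval turns "(2,2,10,10)" into a tuple.
rects = tuple(ast.literal_eval(value) for _, value in cp.items("rects"))
print(rects)  # ((2, 2, 10, 10), (12, 8, 2, 10))
```

For a real file, cp.read(path) would replace the read_string call.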

You can simply make a tuple of tuples like
new_tuple = (rect1,rect2) # ((2,2,10,10), (12,8,2,10))
If you want to loop through values
for i in rect1 + rect2:
    print(i)
If you want to regroup the numbers you could do
tuple_regrouped = tuple(zip(rect1, rect2))  # ((2, 12), (2, 8), (10, 2), (10, 10))
EDIT:
Didn't notice the string part. If you have lines in strings, like from reading a config file, you can do something like
# line = "rect1 = (1,2,3,4)"
config_dict = {}
var_name, tuple_as_str = line.replace(" ","").split("=")
config_dict[var_name] = tuple([int(i) for i in tuple_as_str[1:-1].split(',')])
# and now you'd have config_dict['rect1'] = (1,2,3,4)

The easiest way to do this would be to use Michael Foord's ConfigObj library. It has an unrepr mode, which will directly convert the string into a tuple for you.
