Get list in dictionary with string interpolation - python

I'm trying to get a value from a dictionary with string interpolation.
The dictionary has some digits, and a list of digits:
d = { "year" : 2010, \
"scenario" : 2, \
"region" : 5, \
"break_points" : [1,2,3,4,5] }
Is it possible to reference the list in a string interpolation, or do I need to identify unique keys for each?
Here's what I've tried:
str = "Year = %(year)d, \
Scenario = %(scenario)d, \
Region = %(region)d, \
Break One = %(break_points.0)d..." % d
I've also tried %(break_points[0])d, and %(break_points{'0'})d
Is this possible to do, or do I need to give them keys and save them as integers in the dictionary?

This is possible with new-style formatting:
print "{0[break_points][0]:d}".format(d)
or
print "{break_points[0]:d}".format(**d)
 
str.format documentation
The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument.
Format string syntax
The field_name itself begins with an arg_name that is either a number or a keyword. If it’s a number, it refers to a positional argument, and if it’s a keyword, it refers to a named keyword argument.
...
The arg_name can be followed by any number of index or attribute expressions. An expression of the form '.name' selects the named attribute using getattr(), while an expression of the form '[index]' does an index lookup using __getitem__().
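As a minimal sketch of the index lookup described above, written for Python 3 (where print is a function), the whole line from the question can be produced in one call. The variable name line is chosen here for illustration and avoids shadowing the built-in str:

```python
# Dictionary from the question: plain integers plus a list of integers.
d = {"year": 2010, "scenario": 2, "region": 5, "break_points": [1, 2, 3, 4, 5]}

# {break_points[0]:d} looks up d["break_points"], then applies [0] via __getitem__().
template = (
    "Year = {year:d}, Scenario = {scenario:d}, "
    "Region = {region:d}, Break One = {break_points[0]:d}"
)
line = template.format(**d)
print(line)  # Year = 2010, Scenario = 2, Region = 5, Break One = 1
```

Note that the index inside the brackets is written without quotes, and only simple keys or integer indexes are supported there, not arbitrary expressions.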

Related

I want to replace a special character with a space

Here is the code i have until now :
dex = tree.xpath('//div[@class="cd-timeline-topic"]/text()')
names = filter(lambda n: n.strip(), dex)
table = str.maketrans(dict.fromkeys('?:,'))
for index, name in enumerate(dex, start = 0):
    print('{}.{}'.format(index, name.strip().translate(table)))
The problem is that the output also includes strings that contain a special character, such as "My name is/Richard". What I need is to replace that special character with a space, so that the printed output is "My name is Richard". Can anyone help me?
Thanks!
Your call to dict.fromkeys() does not include the character / in its argument.
If you want to map all the special characters to None, passing your string of special chars to dict.fromkeys() is enough. If you want to replace them with a space, you could then iterate over the dict and set the value to " " for each key.
For example:
special_chars = "?:/"
special_char_dict = dict.fromkeys(special_chars)
for k in special_char_dict:
    special_char_dict[k] = " "
You can do this by extending your translation table:
dex = ["My Name is/Richard????::,"]
table = str.maketrans({'?':None,':':None,',':None,'/':' '})
for index, name in enumerate(dex, start = 0):
    print('{}.{}'.format(index, name.strip().translate(table)))
OUTPUT
0.My Name is Richard
You want to replace most special characters with None BUT forward slash with a space. You could use a different method to replace forward slashes as the other answers here do, or you could extend your translation table as above, mapping all the other special characters to None and forward slash to space. With this you could have a whole bunch of different replacements happen for different characters.
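The same mixed mapping (delete some characters, turn others into a space) can also be sketched with the three-argument form of str.maketrans. The helper name clean is an assumption made for this example only:

```python
# Three-argument form: characters in the first string map to the characters at
# the same position in the second string; characters in the third are deleted.
table = str.maketrans("/", " ", "?:,")

def clean(name):
    """Strip surrounding whitespace, then apply the translation table."""
    return name.strip().translate(table)

result = clean("My Name is/Richard????::,")
print(result)  # My Name is Richard
```

Adding another replacement is then a matter of extending the first two strings in step, and adding another deletion means extending the third.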
Alternatively, you could use the re.sub function in the following way:
import re
s = 'Te/st st?ri:ng,'
out = re.sub(r'\?|:|,|/',lambda x:' ' if x.group(0)=='/' else '',s)
print(out) #Te st string
The arguments of re.sub are as follows. The first is the pattern, which tells re.sub which substrings to replace; ? needs to be escaped because it otherwise has a special meaning in a pattern, and | means "or", so re.sub will look for ? or : or , or /. The second argument is a function that returns the text to use in place of each matched substring: a space for / and an empty string for anything else. The third argument is the string to be changed.
>>> a = "My name is/Richard"
>>> a.replace('/', ' ')
'My name is Richard'
To replace a character or sequence of characters in a string, you can use the `.replace()` method. So the solution to your question is:
name.replace("/", " ")
here you can find details

Parsing a text file to store into class objects and attributes

2 part question. How to parse text and save as class object/attributes and best way to rewrite text from the classes in a specific format.
I want to parse a text file, extract sections of text, and create class objects and attributes from them. There will be several classes (Polygons, space, zone, system, schedule) involved. In the original file each "Object" and its "attributes" are separated by '..'. An example of one is below.
"Office PSZ" = SYSTEM
TYPE = PSZ
HEAT-SOURCE = FURNACE
FAN-SCHEDULE = "HVAC Yr Schedule"
COOLING-EIR = 0.233207
..
I'd like to read this text and store into class objects. So "Office PSZ" would be of the HVACsystem or SYSTEM class, haven't decided. 'SYSTEM' would be a class variable. For this instance ("Office PSZ"), self.TYPE would be PSZ. self.HEAT-SOURCE would equal FURNACE,etc.
I want to manipulate these objects based on their attributes. The end result though would be to write all the data that was manipulated back into a text file with the original format. End result for this instance may be.
"Office PSZ" = SYSTEM
TYPE = PSZ
HEAT-SOURCE = ELECTRIC
FAN-SCHEDULE = "Other Schedule"
COOLING-EIR = 0.200
..
Is there a way to print the attribute name/title (idk what to call it)? Because the attribute name (i.e. TYPE,HEAT-SOURCE) comes from the original file and it would be easier to not have to manually anticipate all of the attributes associated with every class.
I suppose I could create an array of all of the values on the left side of "=" and another array for the values on the right and loop through those as I'm writing/formatting a new text file. But I'm not sure if that's a good way to go.
I'm still quite the amateur so I might be overreaching but any suggestions on how I should proceed?
Pyparsing makes it easy to write custom parsers for data like this, and gives back
parsed data in a pyparsing data structure called ParseResults. ParseResults give you
access to your parsed values by position (like a list), by key (like a dict), or for
names that work as Python identifiers, by attribute (like an object).
I've simplified my parsing of your data to pretty much just take every key = value line
and build up a structure using the key strings as keys. The '..' lines work great
as terminators for each object.
A simple BNF for this might look like:
object ::= attribute+ end
attribute ::= key '=' value
key ::= word composed of letters 'A'..'Z' and '-', starting with 'A'..'Z',
or a quoted string
value ::= value_string | value_number | value_word
value_word ::= a string of non-whitespace characters
value_string ::= a string of any characters in '"' quotes
value_number ::= an integer or float numeric value
end ::= '..'
To implement a pyparsing parser, we work bottom up to define pyparsing sub-expressions.
Then we use Python '+' and '|' operators to assemble lower-level expressions to higher-level
ones:
import pyparsing as pp
END = pp.Suppress("..")
EQ = pp.Suppress('=')
pyparsing includes some predefined expressions for quoted strings and numerics;
the numerics will be automatically converted to ints or floats.
value_number = pp.pyparsing_common.number
value_string = pp.quotedString
value_word = pp.Word(pp.printables)
value = value_string | value_number | value_word
For our attribute key, we will use the two-argument form for Word. The first
argument is a string of allowable leading characters, and the second argument is a
string of allowable body characters. If we just wrote `Word(alphas + '-')`, then
our parser would accept '---' as a legal key.
key = pp.Word(pp.alphas, pp.alphas + '-') | pp.quotedString
An attribute definition is just a key, an '=' sign, and a value
attribute = key + EQ + value
Lastly we will use some of the more complex features of pyparsing. The simplest form
would just be "pp.OneOrMore(attribute) + END", but this would just give us back a
pile of parsed tokens with no structure. The Group class structures the enclosed expressions
so that their results will be returned as a sub-list. We will catch every attribute as
its own sub-list using Group. Dict will apply some naming to the results, using
the text from each key expression as the key for that group. Finally, the whole collection
of attributes will be Group'ed again, this time representing all the attributes for a
single object:
object_defn = pp.Group(pp.Dict(pp.OneOrMore(pp.Group(attribute)))) + END
To use this expression, we'll define our parser as:
parser = pp.OneOrMore(object_defn)
and parse the sample string using:
objs = parser.parseString(sample)
The objs variable we get back will be a pyparsing ParseResults, which will work like
a list of the grouped object attributes. We can view just the parsed attributes as a list
of lists using asList():
for obj in objs:
    print(obj.asList())
[['"Office PSZ"', 'SYSTEM'], ['TYPE', 'PSZ'], ['HEAT-SOURCE', 'FURNACE'],
['FAN-SCHEDULE', '"HVAC Yr Schedule"'], ['COOLING-EIR', 0.233207]]
If we had not used the Dict class, this would be all we would get, but since we
did use Dict, we can also see the attributes as a Python dict:
for obj in objs:
    print(obj.asDict())
{'COOLING-EIR': 0.233207, '"Office PSZ"': 'SYSTEM', 'TYPE': 'PSZ',
'FAN-SCHEDULE': '"HVAC Yr Schedule"', 'HEAT-SOURCE': 'FURNACE'}
We can even access named fields by name, if they work as Python identifiers. In your
sample, "TYPE" is the only legal identifier, so you can see how to print it here. There
is also a dump() method that will give the results in list form, followed by an
indented list of defined key pairs. (I've also shown how you can use list and dict
type access directly on the ParseResults object, without having to convert to list
or dict types):
for obj in objs:
    print(obj[0])
    print(obj['FAN-SCHEDULE'])
    print(obj.TYPE)
    print(obj.dump())
['"Office PSZ"', 'SYSTEM']
"HVAC Yr Schedule"
PSZ
[['"Office PSZ"', 'SYSTEM'], ['TYPE', 'PSZ'], ['HEAT-SOURCE', 'FURNACE'],
['FAN-SCHEDULE', '"HVAC Yr Schedule"'], ['COOLING-EIR', 0.233207]]
- "Office PSZ": 'SYSTEM'
- COOLING-EIR: 0.233207
- FAN-SCHEDULE: '"HVAC Yr Schedule"'
- HEAT-SOURCE: 'FURNACE'
- TYPE: 'PSZ'
Here is the full parser code for you to work from:
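import pyparsing as pp
# Sample input, copied from the question
sample = '''
"Office PSZ" = SYSTEM
TYPE = PSZ
HEAT-SOURCE = FURNACE
FAN-SCHEDULE = "HVAC Yr Schedule"
COOLING-EIR = 0.233207
..
'''
# Punctuation that is matched but left out of the results
END = pp.Suppress("..")
EQ = pp.Suppress('=')
# Values: quoted string, number (converted to int/float), or any other word
value_number = pp.pyparsing_common.number
value_string = pp.quotedString
value_word = pp.Word(pp.printables)
value = value_string | value_number | value_word
# Keys: letters and '-', starting with a letter, or a quoted string
key = pp.Word(pp.alphas, pp.alphas+"-") | pp.quotedString
attribute = key + EQ + value
# One object: a Dict of grouped attributes, terminated by '..'
object_defn = pp.Group(pp.Dict(pp.OneOrMore(pp.Group(attribute)))) + END
parser = pp.OneOrMore(object_defn)
objs = parser.parseString(sample)
for obj in objs:
    print(obj.asList())
for obj in objs:
    print(obj.asDict())
for obj in objs:
    print(obj[0])
    print(obj['FAN-SCHEDULE'])
    print(obj.TYPE)
    print(obj.dump())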
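For the second half of the question (changing attributes and writing the text back out in the original format), here is a rough sketch that uses only the standard library instead of pyparsing. The Block class and parse_blocks function are hypothetical names invented for this example. It assumes one KEY = VALUE pair per line, that keys contain no '=' character, and it keeps every value as a string so the round trip does not alter number formatting:

```python
class Block:
    """Hypothetical container for one '..'-terminated object from the file."""

    def __init__(self, name, kind):
        self.name = name   # e.g. '"Office PSZ"'
        self.kind = kind   # e.g. 'SYSTEM'
        self.attrs = {}    # attribute name -> value; dicts keep insertion order

    def to_text(self):
        """Write the object back out in the original KEY = VALUE layout."""
        lines = ["{} = {}".format(self.name, self.kind)]
        lines += ["{} = {}".format(k, v) for k, v in self.attrs.items()]
        lines.append("..")
        return "\n".join(lines)


def parse_blocks(text):
    """Split text into Block objects; the first pair of each block names it."""
    blocks, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "..":            # terminator: close the current block
            if current is not None:
                blocks.append(current)
            current = None
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if current is None:
            current = Block(key, value)
        else:
            current.attrs[key] = value
    return blocks


sample = '''"Office PSZ" = SYSTEM
TYPE = PSZ
HEAT-SOURCE = FURNACE
FAN-SCHEDULE = "HVAC Yr Schedule"
COOLING-EIR = 0.233207
..
'''

blocks = parse_blocks(sample)
system = blocks[0]
system.attrs["HEAT-SOURCE"] = "ELECTRIC"   # edit by attribute name
print(system.to_text())
```

Because the attribute names are stored as dictionary keys taken from the file, there is no need to anticipate them in advance, and names such as HEAT-SOURCE that are not valid Python identifiers cause no trouble.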

How to reference the empty string key in the Format String Syntax?

The empty string ('') is a perfectly valid key for a dictionary, but I cannot reference it using the Format String Syntax
data = { 'a' : 'hello' , '' : 'bye' }
print '{a:<14s}'.format(**data)
print '{:<14s}'.format(**data)
Which outputs:
hello
Traceback (most recent call last):
File "xxx.py", line 3, in <module>
print '{:<14s}'.format(**data)
IndexError: tuple index out of range
Is there any way of referencing that key ... as a dictionary key! I can not convert the data to tuples; a bit of background: I am doing some auto-formatting based on a generic format spec which gets converted to Format String Syntax using dicts as data for the formatting. That is, I can not do this:
print '{0:<14s}'.format(data[''])
The data must always be passed to format as **data (basically, because I am doing a generic .format(**data) in my formatter class)
You can't use an empty string. The format strictly limits keys to valid Python identifiers, which means they have to have at least 1 letter or underscore at the start.
From the grammar in the Format String Syntax documentation:
replacement_field ::= "{" [field_name] ["!" conversion] [":" format_spec] "}"
field_name ::= arg_name ("." attribute_name | "[" element_index "]")*
arg_name ::= [identifier | integer]
So the field_name is either an integer or a valid Python identifier.
Note that empty strings are not the only strings that are not valid identifiers; strings that contain spaces or that start with a digit are not valid identifiers either. Such strings can be used as keys in a dictionary, just not as keyword arguments in Python code nor as field names in string formats.
An 'arg_name' can be an identifier, so you can assign the empty string to a variable and use it as a key in a dictionary. This example assumes that the RETURN key is pressed to produce the empty string for a comparison:
empty = ''
test = {
    "notso" : "You pressed some other key",
    empty: "You pressed the RETURN key"
}
char = input("Press the RETURN key: ")
print(test[empty]) if char == empty else print(test["notso"])
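If the data really must be passed as **data, one possible workaround is a string.Formatter subclass. This is only a sketch: the class name EmptyKeyFormatter is invented here, and it relies on string.Formatter numbering an empty field name {} as positional index 0, 1, and so on. It redirects any positional index that has no matching positional argument to the '' key, so it is only suitable when no real positional arguments are expected for those fields:

```python
import string


class EmptyKeyFormatter(string.Formatter):
    """Hypothetical formatter: '{}' falls back to kwargs[''] when no
    positional argument exists for it."""

    def get_value(self, key, args, kwargs):
        # An empty field name arrives here as an auto-numbered int index.
        if isinstance(key, int) and key >= len(args):
            return kwargs[""]
        return super().get_value(key, args, kwargs)


data = {"a": "hello", "": "bye"}
formatter = EmptyKeyFormatter()
result = formatter.format("{a:<6s}|{:<6s}|", **data)
print(result)  # hello |bye   |
```

A generic formatter class could call formatter.format(spec, **data) instead of spec.format(**data) without changing how the data is passed.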

Catching selected portion of the string in python

I have a variable Field in which a string is stored like this:
Field= "In Field 'fieldname':(Value1) from (DC) to (deleted)"
or it can also be:
Field= "In Field 'fieldname':(Value1) has changed from (DC) to (1)"
From this string stored in variable Field, I want to catch the values (DC) and (deleted) or (DC) to (1) in two different variables like:
OldValue=DC
NewValue=deleted
OldValue=DC
NewValue=1
I am handling these variables in Python like this:
OldValue,NewValue=re.findall(r'\((\d+)\)',Field)
But this catches only digits, not strings. Can anyone help?
\d in a regular expression matches digits only. \w matches letters, digits, and the underscore, so it already covers everything \d does; the character class [\d\w] used below is therefore equivalent to plain \w.
Note: this will also catch the (Value1) in your string. You'll need some code to filter it out; or, just modify the tuple:
ValueName, OldValue, NewValue = tuple(re.findall(r'\(([\d\w]+?)\)', Field))
Notes:
The +? modifier matches the least possible nonzero number of [\d\w]s (unlike +, which matches the highest possible number).
The tuple() call is optional: re.findall returns a list, and Python can unpack a list directly into the names on the left side.
You could change your existing re to be:
>>> re.search(r'from \((.*?)\) to \((.*?)\)$', Field).groups()
('DC', 'deleted')
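Wrapped in a small function, the anchored pattern handles both example strings from the question. The function name old_and_new is an assumption made for this sketch:

```python
import re

# Capture the two parenthesised values around "to" at the end of the string.
PATTERN = re.compile(r"from \((.*?)\) to \((.*?)\)$")


def old_and_new(field):
    """Return (old_value, new_value), or raise ValueError if nothing matches."""
    match = PATTERN.search(field)
    if match is None:
        raise ValueError("unexpected field format: {!r}".format(field))
    return match.groups()


first = "In Field 'fieldname':(Value1) from (DC) to (deleted)"
second = "In Field 'fieldname':(Value1) has changed from (DC) to (1)"

OldValue, NewValue = old_and_new(first)
print(OldValue, NewValue)   # DC deleted
print(old_and_new(second))  # ('DC', '1')
```

Because the pattern is tied to the words "from" and "to", the earlier (Value1) group is skipped without any extra filtering.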

Use Python format string in reverse for parsing

I've been using the following python code to format an integer part ID as a formatted part number string:
pn = 'PN-{:0>9}'.format(id)
I would like to know if there is a way to use that same format string ('PN-{:0>9}') in reverse to extract the integer ID from the formatted part number. If that can't be done, is there a way to use a single format string (or regex?) to create and parse?
The parse module "is the opposite of format()".
Example usage:
>>> import parse
>>> format_string = 'PN-{:0>9}'
>>> id = 123
>>> pn = format_string.format(id)
>>> pn
'PN-000000123'
>>> parsed = parse.parse(format_string, pn)
>>> parsed
<Result ('123',) {}>
>>> parsed[0]
'123'
You might find simulating scanf interesting.
Here's a solution in case you don't want to use the parse module. It converts format strings into regular expressions with named groups. It makes a few assumptions (described in the docstring) that were okay in my case, but may not be okay in yours.
import re
def match_format_string(format_str, s):
    """Match s against the given format string, return dict of matches.
    We assume all of the arguments in format string are named keyword arguments (i.e. no {} or
    {:0.2f}). We also assume that all chars are allowed in each keyword argument, so separators
    need to be present which aren't present in the keyword arguments (i.e. '{one}{two}' won't work
    reliably as a format string but '{one}-{two}' will if the hyphen isn't used in {one} or {two}).
    We raise if the format string does not match s.
    Example:
    fs = '{test}-{flight}-{go}'
    s = fs.format(test='first', flight='second', go='third')
    match_format_string(fs, s) -> {'test': 'first', 'flight': 'second', 'go': 'third'}
    """
    # First split on any keyword arguments, note that the names of keyword arguments will be in the
    # 1st, 3rd, ... positions in this list
    tokens = re.split(r'\{(.*?)\}', format_str)
    keywords = tokens[1::2]
    # Now replace keyword arguments with named groups matching them. We also escape between keyword
    # arguments so we support meta-characters there. Re-join tokens to form our regexp pattern
    tokens[1::2] = map(u'(?P<{}>.*)'.format, keywords)
    tokens[0::2] = map(re.escape, tokens[0::2])
    pattern = ''.join(tokens)
    # Use our pattern to match the given string, raise if it doesn't match
    matches = re.match(pattern, s)
    if not matches:
        raise Exception("Format string did not match")
    # Return a dict with all of our keywords and their values
    return {x: matches.group(x) for x in keywords}
How about:
id = int(pn.split('-')[1])
This splits the part number at the dash, takes the second component and converts it to integer.
P.S. I've kept id as the variable name so that the connection to your question is clear. It is a good idea to rename that variable so that it doesn't shadow the built-in function.
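If some validation is wanted on top of the split, a regular expression kept next to the format string is one option. This is a sketch under the assumption that IDs never exceed nine digits; the names PN_FORMAT, PN_PATTERN, format_pn and parse_pn are invented for the example:

```python
import re

PN_FORMAT = "PN-{:0>9}"                   # format string from the question
PN_PATTERN = re.compile(r"PN-(\d{9})")    # assumed mirror of that format


def format_pn(part_id):
    """Integer ID -> formatted part number."""
    return PN_FORMAT.format(part_id)


def parse_pn(pn):
    """Formatted part number -> integer ID; raise ValueError on bad input."""
    match = PN_PATTERN.fullmatch(pn)
    if match is None:
        raise ValueError("not a valid part number: {!r}".format(pn))
    return int(match.group(1))   # int() discards the zero padding


pn = format_pn(123)
print(pn)            # PN-000000123
print(parse_pn(pn))  # 123
```

The format string and the pattern still have to be kept in sync by hand, which is the trade-off compared with a library that derives the parser from the format string.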
Use lucidity
import lucidity
template = lucidity.Template('model', '/jobs/{job}/assets/{asset_name}/model/{lod}/{asset_name}_{lod}_v{version}.{filetype}')
path = '/jobs/monty/assets/circus/model/high/circus_high_v001.abc'
data = template.parse(path)
print(data)
# Output
# {'job': 'monty',
# 'asset_name': 'circus',
# 'lod': 'high',
# 'version': '001',
# 'filetype': 'abc'}
