Using Python's string.Template class, how can I use ${} placeholders for dictionary keys that contain spaces?
E.g.
import string

t = string.Template("hello ${some field}")
d = { "some field": "world" }
print( t.substitute(d) )  # raises ValueError: Invalid placeholder in string
Edit: Here's the closest I could get, with the caveat that all variables need to be wrapped in braces (otherwise all space-separated words would be matched).
class MyTemplate(string.Template):
    delimiter = '$'
    idpattern = r'[_a-z][\s_a-z0-9]*'

t = MyTemplate("${foo foo} world ${bar}")
s = t.substitute({ "foo foo": "hello", "bar": "goodbye" })
print(s)  # hello world goodbye
Just in case this might be helpful to somebody else: in Python 3 you can use format_map:
t = "hello {some field}"
d = { "some field": "world" }
print( t.format_map(d) )
# hello world
The docs show that you can customize the Template class for this:
https://docs.python.org/dev/library/string.html#template-strings
import string

class MyTemplate(string.Template):
    delimiter = '%'
    idpattern = r'[a-z]+_[a-z]+'

t = MyTemplate('%% %with_underscore %notunderscored')
d = { 'with_underscore': 'replaced',
      'notunderscored': 'not replaced',
    }
print(t.safe_substitute(d))
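With that idpattern, only %with_underscore is a valid placeholder, so safe_substitute replaces it and leaves %notunderscored untouched. The expected output (my own note, not part of the original answer) is:
% replaced %notunderscored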
I need to modify a YAML file and add several fields. I am using the ruamel.yaml package.
First I load the YAML file:
data = yaml.load(file_name)
I can easily add new simple fields, like-
data['prop1'] = "value1"
The problem I face is that I need to add a nested dictionary combined with an array:
prop2:
  prop3:
    - prop4:
        prop5: "Some title"
        prop6: "Some more data"
I tried to define-
record_to_add = dict(prop2 = dict(prop3 = ['prop4']))
This works, but when I try to add prop5 beneath it, it fails:
record_to_add = dict(prop2 = dict(prop3 = ['prop4'= dict(prop5 = "Value")]))
I get
SyntaxError: expression cannot contain assignment, perhaps you meant "=="?
What am I doing wrong?
The problem has little to do with ruamel.yaml. This:
['prop4'= dict(prop5 = "Value")]
is invalid Python, as a list ([ ]) expects comma-separated values. You would need to use something like:
record_to_add = dict(prop2 = dict(prop3 = dict(prop4= [dict(prop5 = "Some title"), dict(prop6='Some more data'),])))
As your program is incomplete I am not sure if you are using the old API or not. Make sure to use
import ruamel.yaml
yaml = ruamel.yaml.YAML()
and not
import ruamel.yaml as yaml
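For completeness, a minimal round-trip sketch with the new API (the file name config.yaml is an assumption, not from the question):
import ruamel.yaml

yaml = ruamel.yaml.YAML()   # new API, round-trip loader/dumper

with open("config.yaml") as f:          # hypothetical input file
    data = yaml.load(f)

data['prop1'] = "value1"
# prop3 holds a list with one mapping; prop4 maps to the prop5/prop6 mapping
data['prop2'] = dict(prop3=[dict(prop4=dict(prop5="Some title", prop6="Some more data"))])

with open("config.yaml", "w") as f:
    yaml.dump(data, f)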
It's because of the ['prop4'= <>] part. Instead, record_to_add = dict(prop2=dict(prop3=[dict(prop4=dict(prop5="Value"))])) should work.
Another alternative would be:
import yaml

data = {
    "prop1": {
        "prop3": [
            {
                "prop4": {
                    "prop5": "some title",
                    "prop6": "some more data"
                }
            }
        ]
    }
}

with open(filename, 'w') as outfile:
    yaml.dump(data, outfile, default_flow_style=False)
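With PyYAML's defaults this should dump roughly as follows (my own expectation of the output, assuming the default indentation and key order):
prop1:
  prop3:
  - prop4:
      prop5: some title
      prop6: some more data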
So, I've been tasked with converting a string into a dict (it has to use regex). I've done a findall to separate each element, but I'm not sure how to put it together.
I have the following code:
import re
def edata():
    with open("employeedata.txt", "r") as file:
        employeedata = file.read()
    IP_field = re.findall(r"\d+[.]\d+[.]\d+[.]\d+", employeedata)
    username_field = re.findall(r"[a-z]+\d+|- -", employeedata)
    date_field = re.findall(r"\d+\/[A-Z][a-z][0-9]+\/\d\d\d\d:\d+:\d+:\d+ -\d+", employeedata)
    type_field = re.findall(r'"(.*)?"', employeedata)
    Fields = ["IP", "username", "date", "type"]
    Fields2 = IP_field, username_field, date_field, type_field
    dictionary = dict(zip(Fields, Fields2))
    return dictionary
print(edata())
Current output:
{ "IP": ["190.912.120.151", "190.912.120.151"], "username": ["skynet10001", "skynet10001"] etc }
Expected output:
[{ "IP": "190.912.120.151", "username": "skynet10001" etc },
{ "IP": "190.912.120.151", "username": "skynet10001" etc }]
Here is another solution that uses the dictionary you have already constructed. It uses a list comprehension and the zip function to produce a list of dictionaries from the existing dictionary variable.
import re
def edata():
    with open("employeedata.txt", "r") as file:
        employeedata = file.read()
    IP_field = re.findall(r"\d+[.]\d+[.]\d+[.]\d+", employeedata)
    username_field = re.findall(r"[a-z]+\d+|- -", employeedata)
    date_field = re.findall(r"\[(.*?)\]", employeedata)  ## changed your regex for the date field
    type_field = re.findall(r'"(.*)?"', employeedata)
    Fields = ["IP", "username", "date", "type"]
    Fields2 = IP_field, username_field, date_field, type_field
    dictionary = dict(zip(Fields, Fields2))
    ## convert to list of dictionaries
    result_dictionary = [dict(zip(dictionary, i)) for i in zip(*dictionary.values())]
    return result_dictionary
print(edata())
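For illustration, the same idiom on made-up sample data (the values here are invented, not from the question's file):
d = {"IP": ["1.1.1.1", "2.2.2.2"], "username": ["alice1", "bob2"]}
rows = [dict(zip(d, values)) for values in zip(*d.values())]
print(rows)
# [{'IP': '1.1.1.1', 'username': 'alice1'}, {'IP': '2.2.2.2', 'username': 'bob2'}]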
You can use
import re
rx = re.compile(r'^(?P<IP>\d+(?:\.\d+){3})\s+\S+\s+(?P<Username>[a-z]+\d+)\s+\[(?P<Date>[^][]+)]\s+"(?P<Type>[^"]*)"')
def edata():
    results = []
    with open("downloads/employeedata.txt", "r") as file:
        for line in file:
            match = rx.search(line)
            if match:
                results.append(match.groupdict())
    return results
print(edata())
See the online Python demo. For the file = ['190.912.120.151 - skynet10001 [19/Jan/2012] "Temp"', '221.143.119.260 - terminator002 [16/Feb/2021] "Temp 2"'] input, the output will be:
[{'IP': '190.912.120.151', 'Username': 'skynet10001', 'Date': '19/Jan/2012', 'Type': 'Temp'}, {'IP': '221.143.119.260', 'Username': 'terminator002', 'Date': '16/Feb/2021', 'Type': 'Temp 2'}]
The regex is
^(?P<IP>\d+(?:\.\d+){3})\s+\S+\s+(?P<Username>[a-z]+\d+)\s+\[(?P<Date>[^][]+)]\s+"(?P<Type>[^"]*)"
See the regex demo. Details:
^ - start of string
(?P<IP>\d+(?:\.\d+){3}) - Group "IP": one or more digits and then three occurrences of a . and one or more digits
\s+\S+\s+ - one or more non-whitespace chars enclosed with one or more whitespace chars on both ends
(?P<Username>[a-z]+\d+) - Group "Username": one or more lowercase ASCII letters and then one or more digits
\s+ - one or more whitespaces
\[ - a [ char
(?P<Date>[^][]+) - Group "Date": one or more chars other than [ and ]
]\s+" - a ] char, one or more whitespaces, "
(?P<Type>[^"]*) - Group "Type": zero or more chars other than "
" - a " char.
I have a string like this:
/acommand foo='bar' msg='Hello World!' -debugMode
or like this:
/acommand
foo='bar'
msg='Hello, World!'
-debugMode
How can I parse this string to a dict and a list like this:
{"command": "/acommand", "foo": "bar", "msg": "Hello World!"}
["-debugMode"]
I've tried to use str.split to parse it, but it doesn't seem feasible.
argparse seems to be designed for command-line interfaces, so it doesn't apply here.
How to achieve this with Python? Thanks!
You can try something like this:
s = "/acommand foo='bar' msg='Hello World!' -debugMode"
debug = [s.split(" ")[-1]]
s_ = "command=" + ' '.join(s.split(" ")[:-1]).replace("'","")
d = dict(x.split("=") for x in s_.split(" ",2))
print(d)
print(debug)
{'command': '/acommand', 'foo': 'bar', 'msg': 'Hello World!'}
['-debugMode']
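A more general sketch (my own addition, not from the original answer) using the standard library shlex module, which keeps quoted strings together and also handles the multi-line form:
import shlex

def parse(s):
    tokens = shlex.split(s)   # whitespace-split, but quoted substrings stay intact
    options, flags = {"command": tokens[0]}, []
    for tok in tokens[1:]:
        if tok.startswith("-"):
            flags.append(tok)
        elif "=" in tok:
            key, value = tok.split("=", 1)
            options[key] = value
    return options, flags

print(parse("/acommand foo='bar' msg='Hello World!' -debugMode"))
# ({'command': '/acommand', 'foo': 'bar', 'msg': 'Hello World!'}, ['-debugMode'])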
Is there a regex (Python re compatible) that I can use for parsing csv?
EDIT: I didn't realize there was a csv module in Python's standard library
Here's the regex: (?<!,\"\w)\s*,(?!\w\s*\",). It's Python compatible and JavaScript compatible. Here's the full parsing script (as a Python function):
def parseCSV(csvDoc, output_type="dict"):
    from re import compile as c
    from json import dumps
    from numpy import array
    # This is where all the parsing happens
    """
    To parse csv files.
    Arguments:
        csvDoc - The csv document to parse.
        output_type - the output type this
                      function will return
    """
    csvparser = c('(?<!,\"\\w)\\s*,(?!\\w\\s*\",)')
    lines = str(csvDoc).split('\n')
    # Keep only the lines that are not empty
    necessary_lines = [line for line in lines if line != ""]
    All = array([csvparser.split(line) for line in necessary_lines])
    if output_type.lower() in ("dict", "json"):  # If you want JSON or dict
        # All the python dict keys required (At the top of the file or top row)
        top_line = list(All[0])
        main_table = {}  # The parsed data will be here
        main_table[top_line[0]] = {
            name[0]: {
                thing: name[
                    # The 'actual value' counterpart
                    top_line.index(thing)
                ] for thing in top_line[1:]  # The requirements
            } for name in All[1:]
        }
        return dumps(main_table, skipkeys=True, ensure_ascii=False, indent=1)
    elif output_type.lower() in ("list",
                                 "numpy",
                                 "array",
                                 "matrix",
                                 "np.array",
                                 "np.ndarray",
                                 "numpy.array",
                                 "numpy.ndarray"):
        return All
    else:
        # All the python dict keys required (At the top of the file or top row)
        top_line = list(All[0])
        main_table = {}  # The parsed data will be here
        main_table[top_line[0]] = {
            name[0]: {
                thing: name[
                    # The 'actual value' counterpart
                    top_line.index(thing)
                ] for thing in top_line[1:]  # The requirements
            } for name in All[1:]
        }
        return dumps(main_table, skipkeys=True, ensure_ascii=False, indent=1)
Dependencies: NumPy
All you need to do is pass in the raw text of the csv file, and the function will return JSON (or a 2-dimensional array if you wish) in this format:
{"top-left-corner name":{
"foo":{"Item 1 left to foo":"Item 2 of the top row",
"Item 2 left to foo":"Item 3 of the top row",
...}
"bar":{...}
}
}
And here's an example of it:
CSV.csv
foo,bar,zbar
foo_row,foo1,,
barie,"2,000",,
and it outputs:
{
 "foo": {
  "foo_row": {
   "bar": "foo1",
   "zbar": ""
  },
  "barie": {
   "bar": "\"2,000\"",
   "zbar": ""
  }
 }
}
It should work if your csv file is formatted correctly (the ones I tested were made with Apple's Numbers).
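As the question's edit notes, the standard library csv module already handles quoted fields; a minimal sketch (the file name CSV.csv is taken from the example above):
import csv

with open("CSV.csv", newline="") as f:
    rows = list(csv.reader(f))   # each row becomes a list of fields; "2,000" stays one field
print(rows)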
I'm using json2yaml to convert a YAML doc to a JSON string. Now I want to pass the JSON doc as a single-line argument as ansible_extravars in order to override the repository settings.
For example:
Yaml:
container:
  name: "summary"
  version: "1.0.0"
  some_level: "3"
  another_nested:
    schema: "summary"
    items_string: "really just a string of words"
json doc: (this was generated by the 'json2yaml' webpage)
{ "container": {
"name": "summary",
"version": "1.0.0",
"some_level": "3",
"another_nested": {
"schema": "summary",
"items_string": "really just a string of words"
}
}
}
I was using a shell command as follows:
% cat json_text.txt | tr -d '[:space:]'
This obviously also strips the white-space inside container.another_nested.items_string.
Output:
{"container":{"name":"summary","version":"1.0.0","some_level":"3","another_nested":{"schema":"summary","items_string":"reallyjustastringofwords"}}}
How can I convert a JSON doc to a single line and preserve white-space in quoted strings?
You only need to remove the line breaks, e.g. with tr -d '\r\n', instead of all white space.
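An alternative sketch (my own addition, not from the original answers) is to let Python's json module re-serialize the document; whitespace between tokens is dropped while string contents are untouched (json_text.txt is the file from the question):
import json

with open("json_text.txt") as f:
    doc = json.load(f)

# separators=(",", ":") produces the most compact single-line output
print(json.dumps(doc, separators=(",", ":")))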
I came up with the following python script:
#!/usr/bin/env python
import sys

filter_chars = [' ', '\t', '\n', '\r']

def is_filtered_char(c):
    return c in filter_chars

def is_filtered_eol_char(c):
    return c in ['\n', '\r']

inside_double_quoted_string = inside_single_quoted_string = False
some_chars = sys.stdin.read()
for c in some_chars:
    if c == '"':
        if not inside_double_quoted_string:
            inside_double_quoted_string = True
        else:
            inside_double_quoted_string = False
    if c == "'":
        if not inside_single_quoted_string:
            inside_single_quoted_string = True
        else:
            inside_single_quoted_string = False
    if not inside_double_quoted_string and not inside_single_quoted_string and not is_filtered_char(c):
        sys.stdout.write(c)
    elif (inside_double_quoted_string or inside_single_quoted_string) and not is_filtered_eol_char(c):
        sys.stdout.write(c)
Usage:
% cat blah.txt | ./remove_ws.py
{"container":{"name":"summary","version":"1.0.0","some_level":"3","another_nested":{"schema":"summary","items_string":"really just a string of words"}}}