Format String of Dictionary - python

I have a string that holds a dictionary, as follows:
CREDENTIALS = "{\"aaaUser\": {\"attributes\": {\"pwd\": \"cisco123\", \"name\": \"admin\"}}}"
Now I want to format this string to replace the pwd and name dynamically. What I've tried is:
CREDENTIALS = "{\"aaaUser\": {\"attributes\": {\"pwd\": \"{0}\", \"name\": \"{1}\"}}}".format('password', 'username')
But this gives following error:
Traceback (most recent call last):
File ".\ll.py", line 4, in <module>
CREDENTIALS = "{\"aaaUser\": {\"attributes\": {\"pwd\": \"{0}\", \"name\": \"{1}\"}}}".format('password', 'username')
KeyError: '"aaaUser"'
It is possible to load the string as a dict with json.loads() and then set the attributes as required, but that is not what I want. I want to format the string, so that I can use it in other files/modules.
What am I missing here? Any help would be appreciated.

Don't try to work with the JSON string directly; decode it, update the data structure, and re-encode it:
import json
# Use single quotes instead of escaping all the double quotes
CREDENTIALS = '{"aaaUser": {"attributes": {"pwd": "cisco123", "name": "admin"}}}'
d = json.loads(CREDENTIALS)
attributes = d["aaaUser"]["attributes"]
attributes["name"] = username  # username and password hold the new values
attributes["pwd"] = password
CREDENTIALS = json.dumps(d)
With string formatting, you would need to change your string to look like
CREDENTIALS = '{{"aaaUser": {{"attributes": {{"pwd": "{0}", "name": "{1}"}}}}}}'
doubling all the literal braces so that the format method doesn't mistake them for placeholders.
However, formatting also means that the password needs to be pre-escaped if it contains anything that could be mistaken for JSON syntax, such as a double quote.
# This produces invalid JSON
NEW_CREDENTIALS = CREDENTIALS.format('new"password', 'bob')
# This produces valid JSON
NEW_CREDENTIALS = CREDENTIALS.format('new\\"password', 'bob')
It's far easier and safer to just decode and re-encode.
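If the original JSON string is not needed as a starting point, the same idea can be sketched by building the structure as a dict and encoding it once. The helper name make_credentials is hypothetical and not part of the question; it only illustrates that json.dumps takes care of the escaping.

```python
import json

def make_credentials(username, password):
    """Return the aaaUser login payload as a JSON string (hypothetical helper)."""
    payload = {"aaaUser": {"attributes": {"pwd": password, "name": username}}}
    return json.dumps(payload)

# A password containing a double quote is escaped automatically,
# so the result still decodes back to the original value.
creds = make_credentials("bob", 'new"password')
decoded = json.loads(creds)
print(decoded["aaaUser"]["attributes"]["pwd"])
```

The resulting string can be imported and used from other modules just like a formatted one.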

str.format treats text enclosed in braces {} as a replacement field. CREDENTIALS starts with an opening brace, so str.format reads what follows as a field: it takes the text up to the colon, "aaaUser" (quotes included), as the field name, finds no argument with that name, and raises the KeyError.
The documentation says: "The string on which this method is called can contain literal text or replacement fields delimited by braces {}."
To emit literal braces and substitute only the intended fields, double each literal brace:
'{{ Hey Escape }} {0}'.format(12) # O/P '{ Hey Escape } 12'
If you escape the parent and grandparent braces this way, it works.
Example:
'{{Escape Me {n} }}'.format(n='Yes') # {Escape Me Yes }
So, following the str.format rule, I escape each enclosing brace by doubling it:
"{{\"aaaUser\": {{\"attributes\": {{\"pwd\": \"{0}\", \"name\": \"{1}\"}}}}}}".format('password', 'username')
#O/P '{"aaaUser": {"attributes": {"pwd": "password", "name": "username"}}}'
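As a quick check of the rule above, the sketch below formats a doubled-brace template and confirms that the result parses as JSON, then shows the un-doubled string failing with the KeyError from the question. The name TEMPLATE is illustrative only.

```python
import json

# Literal braces are doubled; only {0} and {1} remain replacement fields.
TEMPLATE = '{{"aaaUser": {{"attributes": {{"pwd": "{0}", "name": "{1}"}}}}}}'

formatted = TEMPLATE.format("password", "username")
parsed = json.loads(formatted)  # valid JSON again after formatting

# The un-doubled string fails: '"aaaUser"' is read as a field name.
try:
    '{"aaaUser": {"attributes": {"pwd": "{0}", "name": "{1}"}}}'.format("password", "username")
    raised_key_error = False
except KeyError:
    raised_key_error = True
```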
There is another way to make the string formatting work. It is not recommended in your case, though: you have to make sure the string always has exactly the format you showed and contains nothing else that the conversion could touch (a literal % sign, for example), otherwise the result could change drastically.
The approach is to convert each {0}-style placeholder to %(0)s, so that %-formatting does the substitution and the braces no longer matter.
'Hello %(0)s' % {'0': 'World'} # Hello World
Here I'm using re.sub to replace every occurrence:
import re
def myReplace(obj):
    # Turn a matched '{N}' into '%(N)s'
    found = obj.group(0)
    if found:
        found = found.replace('{', '%(')
        found = found.replace('}', ')s')
    return found
CREDENTIALS = re.sub(r'\{\d{1}\}', myReplace, "{\"aaaUser\": {\"attributes\": {\"pwd\": \"{0}\", \"name\": \"{1}\"}}}") % {'0': 'password', '1': 'username'}
print(CREDENTIALS) # {"aaaUser": {"attributes": {"pwd": "password", "name": "username"}}}

Related

Python unicode escape for RethinkDB match (regex) query

I am trying to perform a rethinkdb match query with an escaped unicode user provided search param:
import re
from rethinkdb import RethinkDB
r = RethinkDB()
search_value = u"\u05e5" # provided by user via flask
search_value_escaped = re.escape(search_value) # results in u'\\\u05e5',
# which, encoded as "utf-8", gives "\ץ" as expected.
conn = r.connect(...)
results_cursor_a = r.db(...).table(...).order_by(index="id").filter(
    lambda doc: doc.coerce_to("string").match(search_value)
).run(conn) # search_value works fine
results_cursor_b = r.db(...).table(...).order_by(index="id").filter(
    lambda doc: doc.coerce_to("string").match(search_value_escaped)
).run(conn) # search_value_escaped spits an error
The error for search_value_escaped is the following:
ReqlQueryLogicError: Error in regexp `\ץ` (portion `\ץ`): invalid escape sequence: \ץ in:
r.db(...).table(...).order_by(index="id").filter(lambda var_1: var_1.coerce_to('string').match(u'\\\u05e5m'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I tried encoding with "utf-8" before/after re.escape(), but got the same result with different errors. What am I missing? Is it something in my code, or some kind of bug?
EDIT: .coerce_to('string') converts the document to a "utf-8" encoded string. RethinkDB also converts the query to "utf-8" and then matches the two, which is why the first query works even though it looks like a unicode match inside a string.
From what I can tell, RethinkDB rejects escaped unicode characters, so I wrote a simple workaround: a custom escape that leaves non-ASCII characters alone and delegates everything else to re.escape, rather than implementing my own character-replacement logic (for fear that I would miss a character and create a security issue).
import re
def no_unicode_escape(u):
    escaped_list = []
    for i in u:
        if ord(i) < 128:
            escaped_list.append(re.escape(i))
        else:
            escaped_list.append(i)
    rv = "".join(escaped_list)
    return rv
or a one-liner:
import re
def no_unicode_escape(u):
    return "".join(re.escape(i) if ord(i) < 128 else i for i in u)
Which yields the required result of escaping "dangerous" characters and works with RethinkDB as I wanted.
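A short usage sketch of the workaround, run against Python's own re module rather than RethinkDB (so it only shows what the helper produces, not how the server reacts). The sample string is made up for illustration. Note that from Python 3.7 on, re.escape itself only escapes regex-special characters, so plain re.escape may already leave non-ASCII letters alone there.

```python
import re

def no_unicode_escape(text):
    # Escape ASCII characters with re.escape; pass non-ASCII through untouched.
    return "".join(re.escape(ch) if ord(ch) < 128 else ch for ch in text)

sample = u"a.b*\u05e5"  # hypothetical user input: ASCII specials plus a Hebrew letter
escaped = no_unicode_escape(sample)

# The dot and star gain a backslash; the Hebrew letter is left as-is,
# and the escaped pattern still matches the original text literally.
matches_literally = re.fullmatch(escaped, sample) is not None
```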

Opposite single/double quotes using pymongo command

I am trying to take user input, create a URI, and add it to a collection in Pymongo, but whenever I try to do this, the format gets messed up and I can't figure out how to fix it.
When running the line:
print(db.command("create", "storage", someStorage={ "URI": {FS_URI}}))
where "Storage" is the collection,
I want the object to be {"fs" : "something://a:b"} or {'fs' : 'something://a:b'}
FS_URI = ('\"fs\" : \"'+URI+'\"')
gives the error: Cannot encode object: {'"fs" : "something://a:b"'}
FS_URI = ("fs\" : \"%s" % URI)
gives the error: Cannot encode object: {'fs" : "something://a:b'}
FS_URI = ("fs\' : \'%s" % URI)
gives the error: Cannot encode object: {"fs' : 'something://a:b"}
The quotes are always unmatching, or have extra quotes around them.
I have tried the command with the actual URI in the quote format I want, and it runs perfectly.
I found that using a dict solved this problem. Changing
FS_URI = ("fs\" : \"%s" % URI)
to a dict rather than a string:
FS_URI = {"fs": "{}".format(URI)}
made the command work.
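The error messages above hint at the cause: wrapping a single string in braces builds a Python set, not a key/value pair. The sketch below shows the difference using only the standard library; json.dumps stands in for the encoding step (pymongo's own BSON encoding is not shown), and the URI value is a placeholder.

```python
import json

URI = "something://a:b"  # placeholder standing in for the user-supplied URI

as_set = {'"fs" : "%s"' % URI}  # braces around one string build a set
as_dict = {"fs": URI}           # key: value builds a dict

print(type(as_set).__name__)    # set
print(type(as_dict).__name__)   # dict

# A dict nests cleanly inside another document.
encoded = json.dumps({"URI": as_dict})
print(encoded)                  # {"URI": {"fs": "something://a:b"}}
```

Since a dict cannot itself be placed inside a set, the dict would be passed as the value directly, for example {"URI": FS_URI} rather than {"URI": {FS_URI}}.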

How to check the Emoji property of a character in Python?

In Unicode, a character can have an Emoji property.
Is there a standard way in Python to determine if a character is an Emoji?
I know of unicodedata, but it doesn't appear to expose all these extra character details.
Note: I'm asking about the specific attribute called "Emoji" in the Unicode standard, as provided in the link. I don't want to have an arbitrary list of pattern ranges, and I would prefer to use a standard library.
This is the code I ended up creating to load the Emoji information. The get_emoji function gets the data file, parses it, and calls the enumeration callback. The rest of the code uses this to produce a JSON file of the information I needed.
#!/usr/bin/env python3
# Generates a list of emoji characters and names in JS format
import urllib.request
import unicodedata
import re, json
def get_emoji(on_emoji, attribute):
    '''
    Enumerates the Emoji characters that match an attribute from the Unicode standard (the Emoji list).
    #param on_emoji A callback that is called with each found character. Signature `on_emoji( code_point_value )`
    #param attribute The attribute that is desired, such as `Emoji` or `Emoji_Presentation`
    '''
    with urllib.request.urlopen('http://www.unicode.org/Public/emoji/5.0/emoji-data.txt') as f:
        # Fall back to UTF-8 if the server does not declare a charset
        content = f.read().decode(f.headers.get_content_charset() or 'utf-8')
    # Matches "XXXX[..YYYY] ; Property # comment"
    cldr = re.compile(r'^([0-9A-F]+)(\.\.([0-9A-F]+))?([^;]*);([^#]*)#(.*)$')
    for line in content.splitlines():
        m = cldr.match(line)
        if m is None:
            continue
        line_attribute = m.group(5).strip()
        if line_attribute != attribute:
            continue
        code_point = int(m.group(1),16)
        if m.group(3) is None:
            # Single code point
            on_emoji(code_point)
        else:
            # Inclusive range of code points
            to_code_point = int(m.group(3),16)
            for i in range(code_point,to_code_point+1):
                on_emoji(i)
# Dumps the values into a JSON format
def print_emoji(value):
    c = chr(value)
    try:
        obj = {
            'code': value,
            'name': unicodedata.name(c).lower(),
        }
        print(json.dumps(obj),',')
    except ValueError:
        # No name found: Unicode DB is likely outdated in installed Python
        pass
print( "module.exports = [" )
get_emoji(print_emoji, "Emoji_Presentation")
print( "]" )
That solved my original problem. To answer the question itself it'd just be a matter of sticking the results into a dictionary and doing a lookup.
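A minimal sketch of that lookup, assuming the file contents are already available as text. It uses a set rather than a dictionary, since only membership is needed. The SAMPLE lines are a short illustrative excerpt in the emoji-data.txt layout, not the full file, and the names load_property and is_emoji are hypothetical.

```python
# Illustrative excerpt in the "codepoints ; property # comment" layout.
SAMPLE = "\n".join([
    "# sample lines, not the full emoji-data.txt",
    "231A..231B    ; Emoji                # watch..hourglass done",
    "1F600         ; Emoji                # grinning face",
    "231A..231B    ; Emoji_Presentation   # watch..hourglass done",
])

def load_property(text, attribute):
    """Collect every code point whose property column equals `attribute`."""
    points = set()
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        code_range, _, prop = data.partition(";")
        if prop.strip() != attribute:
            continue
        first, _, last = code_range.strip().partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # single point or inclusive range
        points.update(range(start, end + 1))
    return points

EMOJI = load_property(SAMPLE, "Emoji")

def is_emoji(char):
    """True if the single character has the loaded property."""
    return ord(char) in EMOJI
```

In real use, the text passed to load_property would be the downloaded data file instead of SAMPLE.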
I have used the following regex pattern successfully before
import re
emoji_pattern = re.compile("["
    u"\U0001F600-\U0001F64F" # emoticons
    u"\U0001F300-\U0001F5FF" # symbols & pictographs
    u"\U0001F680-\U0001F6FF" # transport & map symbols
    u"\U0001F1E0-\U0001F1FF" # flags (iOS)
    "]+", flags=re.UNICODE)
Also check out this question: removing emojis from a string in Python
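For illustration, here is how such a range-based pattern might be used to strip matches from a string. The sample text is made up. The last check shows the trade-off raised in the question: a hand-written range list covers only the listed blocks, so a character outside them (here U+231B) is not matched.

```python
import re

emoji_pattern = re.compile(
    "["
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F300-\U0001F5FF"  # symbols & pictographs
    "\U0001F680-\U0001F6FF"  # transport & map symbols
    "\U0001F1E0-\U0001F1FF"  # flags
    "]+",
    flags=re.UNICODE,
)

text = "hello \U0001F600\U0001F680 world"
stripped = emoji_pattern.sub("", text)  # removes the run of two matched characters

# U+231B lies outside every range above, so the pattern does not match it.
misses_hourglass = emoji_pattern.search("\u231b") is None
```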

Handling JSON string in python

I have a string, for example:
var = "{"name":"angelo","apellido":"enriquez"}"
but when I call the following function I get an error:
data = json.loads(var)
Error : No JSON object could be decoded
Any help?
Replace your var with:
var = '{"name":"angelo","apellido":"enriquez"}'
i.e. put the content (the {} and everything inside it) within single quotes (') instead of double quotes.
Hope that helps.
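A small sketch of why this works: the single-quoted literal and a backslash-escaped double-quoted literal are the same Python string, and both keep the double quotes that JSON requires. The last part shows that single-quoted keys inside the JSON text are rejected by json.loads.

```python
import json

single_quoted = '{"name":"angelo","apellido":"enriquez"}'
escaped = "{\"name\":\"angelo\",\"apellido\":\"enriquez\"}"
same_string = single_quoted == escaped  # two spellings of one string

data = json.loads(single_quoted)
print(data["name"])  # angelo

# JSON itself requires double quotes; single-quoted keys do not parse.
try:
    json.loads("{'name': 'angelo'}")
    rejected = False
except ValueError:
    rejected = True
```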
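The nested double quotes are the problem: the first inner " ends the Python string literal early. Don't swap the inner double quotes for single quotes, though, because JSON itself requires double quotes around keys and strings, so json.loads would reject the result. Keep the double quotes inside and either wrap the literal in single quotes (as above) or escape them:
var = "{\"name\":\"angelo\",\"apellido\":\"enriquez\"}"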

Writing a txt file after using pylast - encoding error

I am having problems writing to a file where I am using pylast. Following a template given in pylast, I added a regular expression to extract what I need (which is doing ok), but when I tried to print to a file, I get an error, and don't know how to fix it (I am teaching myself python and some of its libraries).
I suspect there is an encoding specification I need to make somewhere (some of the output to screen also shows non-standard characters). I don't know how to solve my problem.
Can anybody help?
Thanks
import re
import pylast
RawArtistList = []
ArtistList = []
# You have to have your own unique two values for API_KEY and API_SECRET
# Obtain yours from http://www.last.fm/api/account for Last.fm
API_KEY = "XXX"
API_SECRET = "YYY"
###### In order to perform a write operation you need to authenticate yourself
username = "username"
password_hash = pylast.md5("password")
network = pylast.LastFMNetwork(api_key = API_KEY, api_secret = API_SECRET, username = username, password_hash = password_hash)
## _________INIT__________
COUNTRY = "Germany"
#---------------------- Get Geo Country --------------------
geo_country = network.get_country(COUNTRY)
#---------------------- Get artist --------------------
top_artists_of_country = str(geo_country.get_top_artists())
RawArtistList = re.findall(r"u'(.*?)'", top_artists_of_country)
top_artists_file = open("C:\artist.txt", "w")
for artist in RawArtistList:
    print artist
    top_artists_file.write(artist + "\n")
top_artists_file.close()
The name of the file I am trying to create "artist.txt" changes to "x07rtist.txt" and the error kicks in. I get this:
Traceback (most recent call last):
File "C:\music4A.py", line 32, in <module>
top_artists_file = open("C:\artist.txt", "w")
IOError: [Errno 22] invalid mode ('w') or filename:'C:\x07rtist.txt'
Thank you very much for any help! Cheers.
The Python docs say:
The backslash (\) character is used to escape characters that
otherwise have a special meaning, such as newline, backslash itself,
or the quote character.
...so when you say
top_artists_file = open("C:\artist.txt", "w")
that string literal is being interpreted as
C: \a rtist.txt
...where \a is a single character that has a value of 0x07.
...that line should instead be:
# doubling the backslash prevents misinterpreting the 'a'
top_artists_file = open("C:\\artist.txt", "w")
or
# define the string literal as a raw string to prevent the escape behavior
top_artists_file = open(r"C:\artist.txt", "w")
or
# forward slashes work just fine as path separators on Windows.
top_artists_file = open("C:/artist.txt", "w")
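The effect is easy to see without touching the file system. This sketch only compares the string literals; nothing is opened or written.

```python
broken = "C:\artist.txt"    # "\a" collapses into the single BEL character (0x07)
doubled = "C:\\artist.txt"  # escaped backslash
raw = r"C:\artist.txt"      # raw string literal

print(repr(broken))             # 'C:\x07rtist.txt'
print(doubled == raw)           # True
print(len(broken), len(raw))    # 12 13
```

The repr of the first literal matches the mangled filename shown in the traceback.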
