Convert unicode string dictionary into dictionary in python - python

I have the unicode string u"{'code1':1,'code2':1}" and I want to convert it to a dictionary, i.e. {'code1':1,'code2':1}.
I tried unicodedata.normalize('NFKD', my_data).encode('ascii','ignore'), but it returns a string, not a dictionary.
Can anyone help me?

You can use the built-in ast module:
import ast
d = ast.literal_eval("{'code1':1,'code2':1}")
Help on function literal_eval in module ast:
literal_eval(node_or_string)
Safely evaluate an expression node or a string containing a Python expression. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.

You can use literal_eval. You may also want to check that the result is a dict and not something else. In real code, replace the assert with your own error handling.
from ast import literal_eval
from collections.abc import MutableMapping  # on Python 2: from collections import MutableMapping
my_dict = literal_eval(my_str_dict)
assert isinstance(my_dict, MutableMapping)
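As a minimal sketch of what "your own error handling" could look like, assuming Python 3: the helper name parse_dict_literal is hypothetical, and the choice to report every failure as a ValueError is an assumption, not something the answer prescribes.

```python
from ast import literal_eval


def parse_dict_literal(text):
    """Parse a string holding a Python dict literal, or raise ValueError."""
    try:
        value = literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        # The two most common failures: ValueError for expressions that are
        # not plain literals, SyntaxError for text that is not valid Python.
        raise ValueError("not a valid Python literal: %r" % (text,)) from exc
    if not isinstance(value, dict):
        raise ValueError("expected a dict, got %s" % type(value).__name__)
    return value


my_dict = parse_dict_literal(u"{'code1':1,'code2':1}")
print(my_dict)  # {'code1': 1, 'code2': 1}
```

Unlike a bare assert, the explicit check still runs when Python is started with -O.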

EDIT: Turns out my assumption was incorrect; because the keys are not wrapped in double-quote marks ("), the string isn't JSON. See here for some ways around this.
I'm guessing that what you have might be JSON, a.k.a. JavaScript Object Notation.
You can use Python's built-in json module to do this:
import json
result = json.loads(u"{'code1':1,'code2':1}") # will NOT work; see above
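To illustrate the point of the edit note, here is a small sketch (Python 3 assumed) that shows the json module rejecting the single-quoted string while ast.literal_eval accepts it. The variable names are only for illustration.

```python
import ast
import json

text = u"{'code1':1,'code2':1}"

try:
    json.loads(text)
    parsed_as_json = True
except ValueError:
    # In Python 3, json.JSONDecodeError is a subclass of ValueError.
    parsed_as_json = False

result = ast.literal_eval(text)

print(parsed_as_json)  # False
print(result)          # {'code1': 1, 'code2': 1}
```

If the same data arrived with double-quoted keys, as in '{"code1":1,"code2":1}', json.loads would parse it directly.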

I was getting a unicode error when reading JSON from a file. Here json.loads returned a unicode string rather than a dict, so I passed the result through ast.literal_eval, and this worked for me:
import ast
import json
with open('hostdata2.json') as f:  # the with block closes the file
    job1 = json.loads(f.read())
# before conversion the type is <type 'unicode'> (Python 2)
print(type(job1))
job1 = ast.literal_eval(job1)
print("printing type after ast")
print(type(job1))  # this should print <type 'dict'>
for each in job1:
    print(each)
print("printing keys")
print(job1.keys())
print("printing values")
print(job1.values())

You can use the builtin eval function to convert the string to a python object
>>> string_dict = u"{'code1':1, 'code2':1}"
>>> eval(string_dict)
{'code1': 1, 'code2': 1}

Related

Python output a json style string

I define a variable:
sms_param = '{\"website\":\"hello\"}'
It prints fine as {"website":"hello"}, but I want to pass a dynamic value into it, so I tried '{\"website\":\"{0}\"}'.format(msg). This raises a KeyError that I don't understand. I have tried other string formats, such as triple quotes and replacing {0} with %s, but nothing works. How can I solve this?
My suggestion is to use json.loads():
>>> sms_param = '{\"website\":\"hello\"}'
>>> import json
>>> json.loads(sms_param)
{'website': 'hello'}
You can use json.loads() to convert the JSON string to a dictionary, change the value, and finally convert it back to a string with json.dumps().
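The KeyError in the question most likely comes from str.format treating the outer braces of the JSON text as a replacement field. A hedged sketch of two ways around it, assuming Python 3 and a hypothetical msg value:

```python
import json

msg = "hello"  # hypothetical dynamic value

# Option 1: build a dict and serialize it; json.dumps handles quoting
# and escaping of the value.
sms_param = json.dumps({"website": msg})

# Option 2: keep str.format, but double the literal braces so that only
# {0} is treated as a replacement field.
sms_param_fmt = '{{"website":"{0}"}}'.format(msg)

print(sms_param)
print(sms_param_fmt)  # {"website":"hello"}
```

Option 1 is usually the safer choice, because a msg containing a double quote or a backslash would produce invalid JSON with option 2.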

How can I convert string to dict or list?

I have strings such as:
'[1, 2, 3]'
and
"{'a': 1, 'b': 2}"
How do I convert them to list/dict?
Someone mentions that ast.literal_eval or eval can parse a string that converts to list/dict.
What's the difference between ast.literal_eval and eval?
The ast module works with abstract syntax trees, and ast.literal_eval evaluates a string that contains only Python literals. You nearly have JSON there, for which you could use json.loads, but JSON requires double quotes, not single quotes, around strings such as dictionary keys.
import ast
result = ast.literal_eval("{'a': 1, 'b': 2}")
assert type(result) is dict
result = ast.literal_eval("[1, 2, 3]")
assert type(result) is list
As a plus, this has none of the risk of eval, because it doesn't get into the business of evaluating functions. eval("subprocess.call(['sudo', 'rm', '-rf', '/'])") could remove your root directory, but ast.literal_eval("subprocess.call(['sudo', 'rm', '-rf', '/'])") fails predictably, with your file system intact.
Use the eval function:
l = eval('[1, 2, 3]')
d = eval("{'a':1, 'b': 2}")
Just make sure you know where these strings came from and that you aren't allowing user input to be evaluated and do something malicious.
A Python script to convert this string to a dict:
import json
inp_string = '{"1":"one", "2":"two"}'
out = json.loads(inp_string)
print(out["1"])
The output is:
one
You can use eval(), but only with trusted data. If you need to parse untrusted data, look into the safer ast.literal_eval().
A JSON parser is also a possibility, since most Python dicts and lists share JSON's syntax, as long as the strings use double quotes.
You can convert string to list/dict by ast.literal_eval() or eval() function. ast.literal_eval() only considers a small subset of Python's syntax to be valid:
The string or node provided may only consist of the following Python
literal structures: strings, numbers, tuples, lists, dicts, booleans,
and None.
Passing __import__('os').system('rm -rf /') into ast.literal_eval() will raise an error, but eval() will happily wipe your drive.
Since it looks like you're only letting the user input a plain dictionary, use ast.literal_eval(). It safely does what you want and nothing more.
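The difference can be seen without risking anything. In this sketch (Python 3 assumed) the untrusted string is a harmless stand-in for a malicious payload, chosen for illustration only:

```python
import ast

# Harmless stand-in for a malicious expression; eval() would execute it.
untrusted = "__import__('os').getcwd()"

try:
    ast.literal_eval(untrusted)
    rejected = False
except ValueError:
    # A function call is not a literal, so literal_eval refuses it.
    rejected = True

parsed_list = ast.literal_eval('[1, 2, 3]')
parsed_dict = ast.literal_eval("{'a': 1, 'b': 2}")

print(rejected)     # True
print(parsed_list)  # [1, 2, 3]
print(parsed_dict)  # {'a': 1, 'b': 2}
```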

JSON like string with unicode to valid JSON

I get a string which resembles JSON and I'm trying to convert it to valid JSON using python.
It looks like this example, but the real data gets very long:
{u'key':[{
u'key':u'object',
u'something':u'd\xfcabc',
u'more':u'\u2023more',
u'boolean':True
}]
}
So there are also a lot of special characters, as well as the "wrong" boolean which should be just lowercase letters.
I don't have any influence over the data I get, I just have to parse it somehow and extract some stuff from it.
I tried to replace the special characters and everything and force it to be a valid JSON, but it is not at all elegant and I could easily forget to replace one type of special character.
You can use literal_eval from the ast module for this.
ast.literal_eval(yourString)
You can then convert this Object back to JSON.
The JSON spec only allows JSON values: true and false for booleans, null where Python uses None, and so on.
The string in this question is a Python literal, not JSON, so as @florian-dreschsler says, you must use literal_eval from the ast module:
>>> import ast
>>> json_string = """
... {u'key':[{
... u'key':u'object',
... u'something':u'd\xfcabc',
... u'more':u'\u2023more',
... u'boolean':True, #this property fails with json module
... u'null':None, #this property too
... }]
... }
... """
>>> ast.literal_eval(json_string)
{u'key': [{u'boolean': True, u'null': None, u'something': u'd\xfcabc', u'key': u'object', u'more': u'\u2023more'}]}
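To finish the round trip mentioned above (parse the Python literal, then emit valid JSON), a minimal sketch assuming Python 3.3 or later, where the u'' prefix is accepted again. The shortened sample string and the variable names are illustrative only.

```python
import ast
import json

# Raw string so the backslash escapes reach literal_eval untouched.
python_repr = (
    r"{u'key': [{u'something': u'd\xfcabc', u'more': u'\u2023more', "
    r"u'boolean': True, u'null': None}]}"
)

data = ast.literal_eval(python_repr)  # a regular Python dict
valid_json = json.dumps(data)         # True -> true, None -> null

print(valid_json)
```

By default json.dumps escapes non-ASCII characters; passing ensure_ascii=False keeps them as-is in the output.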

How to convert a dictionary to query string in Python?

After using cgi.parse_qs(), how to convert the result (dictionary) back to query string? Looking for something similar to urllib.urlencode().
Python 3
urllib.parse.urlencode(query, doseq=False, [...])
Convert a mapping object or a sequence of two-element tuples, which may contain str or bytes objects, to a percent-encoded ASCII text string.
— Python 3 urllib.parse docs
A dict is a mapping.
Legacy Python
urllib.urlencode(query[, doseq])
Convert a mapping object or a sequence of two-element tuples to a “percent-encoded” string... a series of key=value pairs separated by '&' characters...
— Python 2.7 urllib docs
In Python 3 it is slightly different:
from urllib.parse import urlencode
urlencode({'param1': 'foo', 'param2': 'bar'})
# output: 'param1=foo&param2=bar'
For Python 2 and Python 3 compatibility, try this:
try:
    # Python 2
    from urllib import urlencode
except ImportError:
    # Python 3
    from urllib.parse import urlencode
You're looking for something exactly like urllib.urlencode()!
However, when you call parse_qs() (distinct from parse_qsl()), the dictionary keys are the unique query variable names and the values are lists of values for each name.
In order to pass this information into urllib.urlencode(), you must "flatten" these lists. Here is how you can do it with a list comprehension of tuples:
query_pairs = [(k,v) for k,vlist in d.iteritems() for v in vlist]
urllib.urlencode(query_pairs)
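On Python 3 the flattening step can likely be skipped, because urlencode accepts doseq=True and expands list values itself. A small sketch with an illustrative query string:

```python
from urllib.parse import parse_qs, urlencode

# parse_qs maps each name to a list of values.
parsed = parse_qs("a=1&a=2&b=3")  # {'a': ['1', '2'], 'b': ['3']}

# doseq=True emits one key=value pair per list element.
query = urlencode(parsed, doseq=True)

print(query)  # a=1&a=2&b=3
```

Without doseq=True, each list would be converted with str() and percent-encoded as a single value.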
Maybe you're looking for something like this:
def dictToQuery(d):
    query = ''
    for key in d.keys():
        query += str(key) + '=' + str(d[key]) + "&"
    return query
It takes a dictionary and converts it to a query string, much like urlencode, except that it does not percent-encode keys or values. It appends a trailing "&" to the query string; returning query[:-1] instead fixes that, if it's an issue.

Converting string to tuple and adding to tuple

I have a config file like this.
[rects]
rect1=(2,2,10,10)
rect2=(12,8,2,10)
I need to loop through the values and convert them to tuples.
I then need to make a tuple of the tuples like
((2,2,10,10), (12,8,2,10))
Instead of using a regex or int/string functions, you could also use the ast module's literal_eval function, which only evaluates strings that are valid Python literals. This function is safe (according to the docs).
http://docs.python.org/library/ast.html#ast.literal_eval
import ast
ast.literal_eval("(1,2,3,4)") # (1,2,3,4)
And, like others have said, ConfigParser works for parsing the INI file.
To turn the strings into tuples of ints (which is, I assume, what you want), you can use a regex like this:
import re
x = "(1,2,3)"
t = tuple(int(v) for v in re.findall("[0-9]+", x))
And you can use, say, configparser to parse the config file.
Assume cp is the ConfigParser object that has read the following config file:
[rects]
rect1=(2,2,10,10)
rect2=(12,8,2,10)
>>> import ast
>>> tuple(ast.literal_eval(v[1]) for v in cp.items('rects'))
((2, 2, 10, 10), (12, 8, 2, 10))
Edit: Changed eval() to the safer literal_eval().
From the Python docs, literal_eval() does the following:
Safely evaluate an expression node or a string containing a Python
expression. The string or node provided may only consist of the following
Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
and None
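Put together, the approach could look like the following self-contained sketch. It assumes Python 3's configparser and reads the config from an inline string instead of a file, purely so the example runs on its own.

```python
import ast
import configparser

config_text = """
[rects]
rect1=(2,2,10,10)
rect2=(12,8,2,10)
"""

cp = configparser.ConfigParser()
cp.read_string(config_text)  # use cp.read(path) for a real file

# Each value is a string such as "(2,2,10,10)"; literal_eval turns it
# into a tuple of ints.
rects = tuple(ast.literal_eval(value) for _, value in cp.items("rects"))

print(rects)  # ((2, 2, 10, 10), (12, 8, 2, 10))
```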
You can simply make a tuple of tuples like
new_tuple = (rect1,rect2) # ((2,2,10,10), (12,8,2,10))
If you want to loop through the values:
for i in rect1+rect2:
    print(i)
If you want to regroup the numbers, you could do:
tuple_regrouped = tuple(zip(rect1,rect2)) # ((2,12),(2,8),(10,2),(10,10))
EDIT:
Didn't notice the string part. If you have lines in strings, like from reading a config file, you can do something like
# line = "rect1 = (1,2,3,4)"
config_dict = {}
var_name, tuple_as_str = line.replace(" ","").split("=")
config_dict[var_name] = tuple([int(i) for i in tuple_as_str[1:-1].split(',')])
# and now you'd have config_dict['rect1'] = (1,2,3,4)
The easiest way to do this would be to use Michael Foord's ConfigObj library. It has an unrepr mode, which will directly convert the string into a tuple for you.
