I have strings such as:
'[1, 2, 3]'
and
"{'a': 1, 'b': 2}"
How do I convert them to list/dict?
Someone mentions that ast.literal_eval or eval can parse a string that converts to list/dict.
What's the difference between ast.literal_eval and eval?
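ast.literal_eval lives in the ast ("abstract syntax trees") module; it parses a string and evaluates it only if it consists of Python literals. You nearly have JSON there, for which you could use json.loads, but JSON requires double quotes, not single quotes, around strings such as dictionary keys.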
import ast
result = ast.literal_eval("{'a': 1, 'b': 2}")
assert type(result) is dict
result = ast.literal_eval("[1, 2, 3]")
assert type(result) is list
As a plus, this has none of the risk of eval, because it doesn't get into the business of evaluating functions. eval("subprocess.call(['sudo', 'rm', '-rf', '/'])") could remove your root directory, but ast.literal_eval("subprocess.call(['sudo', 'rm', '-rf', '/'])") fails predictably, with your file system intact.
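As a minimal sketch of that failure mode (the harmless expression below is a stand-in chosen for illustration, not taken from the question): ast.literal_eval rejects any string containing a function call by raising ValueError, so the call never runs.

```python
import ast

# A harmless stand-in for a malicious expression (illustrative only).
expression = "len([1, 2, 3])"

try:
    ast.literal_eval(expression)
    rejected = False
except ValueError:
    # Function calls are not literals, so literal_eval refuses to evaluate them.
    rejected = True

print(rejected)
```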
Use the eval function:
l = eval('[1, 2, 3]')
d = eval("{'a':1, 'b': 2}")
Just make sure you know where these strings came from and that you aren't allowing user input to be evaluated and do something malicious.
A Python script to convert this string to a dict:
import json
inp_string = '{"1":"one", "2":"two"}'
out = json.loads(inp_string)
print(out["1"])
The output is:
one
You can use eval(), but only with safe data. If you have to parse unsafe data, look into the safer ast.literal_eval().
A JSON parser is also a possibility, since most Python dicts and lists share the same syntax as JSON.
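To illustrate where the two syntaxes overlap and where they differ, here is a rough sketch. The helper name parse_text is hypothetical, and trying JSON first is just one possible ordering. JSON spells its constants true/false/null and requires double quotes, while Python literals use True/False/None and accept either quote style.

```python
import ast
import json


def parse_text(text):
    """Hypothetical helper: try JSON first, then fall back to Python literals."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return ast.literal_eval(text)


json_style = parse_text('{"a": [1, 2, true, null]}')
python_style = parse_text("{'a': [1, 2, True, None]}")
print(json_style == python_style)
```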
You can convert string to list/dict by ast.literal_eval() or eval() function. ast.literal_eval() only considers a small subset of Python's syntax to be valid:
The string or node provided may only consist of the following Python
literal structures: strings, numbers, tuples, lists, dicts, booleans,
and None.
Passing __import__('os').system('rm -rf /') into ast.literal_eval() will raise an error, but eval() will happily wipe your drive.
Since it looks like you're only letting the user input a plain dictionary, use ast.literal_eval(). It safely does what you want and nothing more.
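A small sketch of the literal structures listed in that quote, one sample per type (the sample strings are made up for illustration):

```python
import ast

# One illustrative source string per literal structure named in the docs.
samples = {
    "'text'": str,
    "42": int,
    "3.5": float,
    "(1, 2)": tuple,
    "[1, 2]": list,
    "{'a': 1}": dict,
    "True": bool,
    "None": type(None),
}

for source, expected_type in samples.items():
    value = ast.literal_eval(source)
    # Each string evaluates to a plain Python object of the expected type.
    assert type(value) is expected_type
    print(source, "->", type(value).__name__)
```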
So given a textual representation of a list
['cellular organisms', 'Bacteria', 'Bacteroidetes/Chlorobi group', 'Bacteroidetes', 'Bacteroidia', 'Bacteroidales', 'Bacteroidaceae', 'Bacteroides', 'Bacteroides vulgatus']
What is the easiest way to convert this text back into an actual list within a python script?
Is split really the best way? Thanks!
>>> import ast
>>> a = "['cellular organisms', 'Bacteria', 'Bacteroidetes/Chlorobi group', 'Bacteroidetes', 'Bacteroidia', 'Bacteroidales', 'Bacteroidaceae', 'Bacteroides', 'Bacteroides vulgatus']"
>>> ast.literal_eval(a)
['cellular organisms', 'Bacteria', 'Bacteroidetes/Chlorobi group', 'Bacteroidetes', 'Bacteroidia', 'Bacteroidales', 'Bacteroidaceae', 'Bacteroides', 'Bacteroides vulgatus']
From the ast module:
ast.literal_eval(node_or_string)
Safely evaluate an expression node or a Unicode or Latin-1 encoded string containing a Python expression. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
This can be used for safely evaluating strings containing Python expressions from untrusted sources without the need to parse the values oneself.
I'm new to Python and blocking on this problem:
trying to go from a string like this:
mystring = '[ [10, 20], [20,50], [ [0,400], [50, 328], [22, 32] ], 30, 12 ]'
to the nested list that is represented by the string. Basically, the reverse of
str(mylist)
If I try the obvious option
list(mystring)
it separates each character into a different element and I lose the nesting.
Is there an attribute to the list or str types that does this that I missed in the doc (I use Python 3.3)? Or do I need to code a function that does this?
Additionally, how would you go about implementing that function? I have no clue what would be required to create nested lists of arbitrary depth...
Thanks,
--Louis H.
Call the ast.literal_eval function on the string.
To implement it by oneself, one could use a recursive function which would convert the string into a list of strings which represent lists. Then those strings would be passed to the function and so on.
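For the do-it-yourself route, one possible recursive approach is sketched below. It is only an illustration under narrow assumptions: the input contains nothing but integers and square-bracketed lists, and the function name parse_nested is made up here. In practice ast.literal_eval is the simpler choice.

```python
def parse_nested(text):
    """Parse a string such as '[1, [2, 3]]' into nested lists of ints."""
    pos = 0  # current read position in text

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_value():
        nonlocal pos
        skip_spaces()
        if pos < len(text) and text[pos] == "[":
            pos += 1
            items = []
            skip_spaces()
            if pos < len(text) and text[pos] == "]":  # empty list
                pos += 1
                return items
            while True:
                items.append(parse_value())  # recurse for each element
                skip_spaces()
                if pos >= len(text):
                    raise ValueError("unterminated list")
                if text[pos] == ",":
                    pos += 1
                elif text[pos] == "]":
                    pos += 1
                    return items
                else:
                    raise ValueError("unexpected character %r" % text[pos])
        # Otherwise expect an integer, optionally signed.
        start = pos
        if pos < len(text) and text[pos] in "+-":
            pos += 1
        while pos < len(text) and text[pos].isdigit():
            pos += 1
        if not text[start:pos].lstrip("+-"):
            raise ValueError("expected an integer at position %d" % start)
        return int(text[start:pos])

    result = parse_value()
    skip_spaces()
    if pos != len(text):
        raise ValueError("trailing characters after the value")
    return result


mystring = '[ [10, 20], [20,50], [ [0,400], [50, 328], [22, 32] ], 30, 12 ]'
print(parse_nested(mystring))
```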
If I try the obvious solution list(mystring) it separates each character into a different element and I lose the nesting.
This is because list() builds a list from an iterable, which it turns into an iterator via the string's __iter__() method. Iterating over a string yields one character at a time.
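A quick sketch of that behaviour, using a short made-up string:

```python
text = "[1, [2]]"

# list() consumes the string's iterator, which yields single characters.
chars = list(text)
same = [c for c in iter(text)]

print(chars)
print(chars == same)
```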
Alternatively, if you want a more general conversion from strings to objects, I would suggest the json module. It works with dictionaries too, and JSON is a tried-and-true format that is readily used throughout the developer and web space.
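import json
nested_list = json.loads(mystring)
# You can even go the other way
# (json.dumps normalizes spacing, so compare the parsed values rather than the strings)
json.loads(json.dumps(nested_list)) == nested_list
# True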
Additionally, there are convenient methods for dealing directly with files that contain this kind of string representation:
# Instead of
data_structure = json.loads(open(filename).read())
# Just
with open(filename) as f:
    data_structure = json.load(f)  # json.load takes a file object, not a filename
The same works in reverse with dump instead of load
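A minimal round-trip sketch with json.dump and json.load. The temporary file is created only for this example; both functions take an open file object.

```python
import json
import os
import tempfile

nested_list = [[10, 20], [20, 50], [[0, 400], [50, 328], [22, 32]], 30, 12]

# Write the structure to a temporary file created just for this example.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(nested_list, f)
    path = f.name

# Read it back from the same path.
with open(path) as f:
    restored = json.load(f)

os.remove(path)  # clean up the temporary file
print(restored == nested_list)
```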
If you want to know why you should use json instead of ast.literal_eval(), it's an extremely established point and you should read this question.
I have unicode u"{'code1':1,'code2':1}" and I want it in dictionary format.
I want it in {'code1':1,'code2':1} format.
I tried unicodedata.normalize('NFKD', my_data).encode('ascii','ignore') but it returns string not dictionary.
Can anyone help me?
You can use built-in ast package:
import ast
d = ast.literal_eval("{'code1':1,'code2':1}")
Help on function literal_eval in module ast:
literal_eval(node_or_string)
Safely evaluate an expression node or a string containing a Python expression. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
You can use literal_eval. You may also want to be sure you are creating a dict and not something else. Instead of assert, use your own error handling.
from ast import literal_eval
from collections.abc import MutableMapping
my_dict = literal_eval(my_str_dict)
assert isinstance(my_dict, MutableMapping)
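One way the suggested error handling might look, as a sketch only: the helper name parse_dict and the choice of exception types are assumptions made for illustration.

```python
from ast import literal_eval
from collections.abc import MutableMapping


def parse_dict(text):
    """Hypothetical helper: parse text and insist that the result is a dict."""
    try:
        value = literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError("not a valid Python literal: %r" % (text,)) from exc
    if not isinstance(value, MutableMapping):
        raise TypeError("expected a dict, got %s" % type(value).__name__)
    return value


print(parse_dict("{'code1':1,'code2':1}"))
```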
EDIT: Turns out my assumption was incorrect; because the keys are not wrapped in double-quote marks ("), the string isn't JSON. See here for some ways around this.
I'm guessing that what you have might be JSON, a.k.a. JavaScript Object Notation.
You can use Python's built-in json module to do this:
import json
result = json.loads(u"{'code1':1,'code2':1}") # will NOT work; see above
I was getting a unicode error when reading JSON from a file, so this is what worked for me.
import ast
import json
job1 = {}
with open('hostdata2.json') as f:
    job1 = json.loads(f.read())
# the with block closes the file, so no f.close() is needed
# before converting, the type is <type 'unicode'> (Python 2), not dict
print(type(job1))
job1 = ast.literal_eval(job1)
print("printing type after ast")
print(type(job1))
# this should print <type 'dict'>
for each in job1:
    print(each)
print("printing keys")
print(job1.keys())
print("printing values")
print(job1.values())
You can use the builtin eval function to convert the string to a python object
>>> string_dict = u"{'code1':1, 'code2':1}"
>>> eval(string_dict)
{'code1': 1, 'code2': 1}
I'm trying to (slightly) improve a script that does a quick-and-hacky parse of some config files.
Upon recognising "an item" read from the file, I need to try to convert it into a simple python value. The value could be a number or a string.
To convert strings read from the file into Python numbers I can just use int or float and catch the ValueError if it wasn't actually a number. Is there something similar for Python strings? i.e.
s1 = 'Goodbye World. :('
s2 = repr(s1)
s3 = ' "not a string literal" '
s4 = s3.strip()
v1 = parse_string_literal(s1) # throws ValueError
v2 = parse_string_literal(s2) # returns 'Goodbye World. :('
v3 = parse_string_literal(s3) # throws ValueError
v4 = parse_string_literal(s4) # returns 'not a string literal'
In the file, string values are represented very similarly to Python string literals; they can be quoted with either ' or ", and could contain backslash escapes, etc. I could roll my own parser with regexes, but if there's something already existing I'd rather not re-invent the wheel.
I could use eval of course, but that's always somewhat dangerous.
... And sure enough, I just found the answer after I posted.
Even better than what I was looking for is ast.literal_eval: ast — Abstract Syntax Trees
It can evaluate any Python expression consisting solely of literals, which makes it safe. It also means I can recognise items from the config file that are potentially numbers or strings without having to attempt multiple conversions, falling back to the next conversion on a ValueError exception. I don't even have to figure out what type the item is.
It's even way more flexible than I need, which could be a problem if I cared about making sure the item was only a number or a string, but I don't:
>>> ast.literal_eval('{"foo": [23.8, 170, (1, 2, 3)]}')
{'foo': [23.8, 170, (1, 2, 3)]}
ast.literal_eval() handles all simple Python literals, and most compound literals.
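If restricting the result to strings did matter, the parse_string_literal function from the question could be sketched on top of ast.literal_eval roughly as follows. The body is an assumption, not an existing library function. Surrounding whitespace is rejected explicitly because literal_eval's handling of leading whitespace differs between Python versions.

```python
import ast


def parse_string_literal(text):
    """Return the value of a Python string literal, or raise ValueError."""
    if text != text.strip():
        raise ValueError("surrounding whitespace is not allowed")
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError("not a Python literal: %r" % (text,)) from exc
    if not isinstance(value, str):
        # Numbers, lists, dicts and so on are literals, but not strings.
        raise ValueError("not a string literal: %r" % (text,))
    return value


s1 = 'Goodbye World. :('
print(parse_string_literal(repr(s1)) == s1)
```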
I have a config file like this.
[rects]
rect1=(2,2,10,10)
rect2=(12,8,2,10)
I need to loop through the values and convert them to tuples.
I then need to make a tuple of the tuples like
((2,2,10,10), (12,8,2,10))
Instead of using a regex or int/string functions, you could also use the ast module's literal_eval function, which only evaluates strings that are valid Python literals. This function is safe (according to the docs).
http://docs.python.org/library/ast.html#ast.literal_eval
import ast
ast.literal_eval("(1,2,3,4)") # (1,2,3,4)
And, like others have said, ConfigParser works for parsing the INI file.
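Putting the two pieces together might look like the sketch below. It assumes Python 3's configparser module and feeds the config from an in-memory string instead of a file so that the example is self-contained.

```python
import ast
import configparser

# The config from the question, inlined here instead of read from a file.
config_text = """
[rects]
rect1=(2,2,10,10)
rect2=(12,8,2,10)
"""

cp = configparser.ConfigParser()
cp.read_string(config_text)

# items() yields (name, value) pairs; each value is still a string here.
rects = tuple(ast.literal_eval(value) for _, value in cp.items("rects"))
print(rects)
```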
To turn the strings into tuples of ints (which is, I assume, what you want), you can use a regex like this:
import re
x = "(1,2,3)"
t = tuple(int(v) for v in re.findall("[0-9]+", x))
And you can use, say, configparser to parse the config file.
Assume cp is the ConfigParser object for the config file containing:
[rects]
rect1=(2,2,10,10)
rect2=(12,8,2,10)
>>> import ast
>>> tuple(ast.literal_eval(v[1]) for v in cp.items('rects'))
((2, 2, 10, 10), (12, 8, 2, 10))
Edit: Changed eval() to the safer literal_eval().
From the Python docs, literal_eval() does the following:
Safely evaluate an expression node or a string containing a Python
expression. The string or node provided may only consist of the following
Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
and None
You can simply make a tuple of tuples like
new_tuple = (rect1,rect2) # ((2,2,10,10), (12,8,2,10))
If you want to loop through values
for i in rect1 + rect2:
    print(i)
If you want to regroup the numbers you could do
tuple_regrouped = tuple(zip(rect1, rect2)) # ((2,12),(2,8),(10,2),(10,10))
EDIT:
Didn't notice the string part. If you have lines in strings, like from reading a config file, you can do something like
# line = "rect1 = (1,2,3,4)"
config_dict = {}
var_name, tuple_as_str = line.replace(" ","").split("=")
config_dict[var_name] = tuple([int(i) for i in tuple_as_str[1:-1].split(',')])
# and now you'd have config_dict['rect1'] = (1,2,3,4)
The easiest way to do this would be to use Michael Foord's ConfigObj library. It has an unrepr mode, which will directly convert the string into a tuple for you.