A naive question, but is there a robust or better way to do the following?
Say it actually has nothing to do with JSON.
Let's say I have a list (read from a file):
string_list = [ "foo",1,None, "null","[]","bar"]
Now, "null" and "[]" are essentially equivalent to null, but different data formats have different representations of "None", right?
So rather than writing a regex for all these rules, is there a better way to convert "null", "[]", etc. to None?
Thanks
Define a set of values that should be replaced with None and use list comprehension to "replace" them:
>>> string_list = [ "foo",1,None, "null","[]","bar"]
>>> none_items = {"null", "[]"} # or set(("null", "[]"))
>>> [None if item in none_items else item for item in string_list]
['foo', 1, None, None, None, 'bar']
Or, use map() (in Python 3 it returns an iterator, so wrap it in list()):
>>> list(map(lambda x: None if x in none_items else x, string_list))
['foo', 1, None, None, None, 'bar']
A set is used because membership tests on it are O(1) on average.
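The set-based replacement above can be wrapped in a small helper. This is only a minimal sketch: the name normalize_nulls and the default token set are assumptions made for illustration, not part of any library.

```python
# Hypothetical helper: the function name and the default tokens are
# illustrative assumptions, not taken from the original answer.
NULL_TOKENS = frozenset({"null", "[]"})

def normalize_nulls(items, null_tokens=NULL_TOKENS):
    """Return a new list with null-like string tokens replaced by None."""
    # The isinstance check keeps unhashable items (e.g. a real list) from
    # raising TypeError in the set membership test.
    return [
        None if isinstance(item, str) and item in null_tokens else item
        for item in items
    ]

string_list = ["foo", 1, None, "null", "[]", "bar"]
cleaned = normalize_nulls(string_list)
print(cleaned)
```

Passing a different null_tokens set lets the same function cover other spellings, such as "None" or "{}", without touching the comprehension.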
You could try:
string_list = [ "foo",1,None, "null","[]","bar"]
nones = [ "null", "[]" ]
print([None if s in nones else s for s in string_list])
1) You shouldn't be converting anything to None.
2) The first thing you want to do is parse the data as JSON. The json module converts null to None, so you don't have to worry about null. Empty JSON strings, arrays, and objects are converted to empty Python strings, lists, and dicts, so you won't be dealing with strings like "[]" at all.
3) Then if you want to filter out the empty objects, you can do things like this:
import json
my_data = json.loads("""
[
"hello",
"",
[],
{},
[1, 2, 3],
{"a": 1, "b": 2}
]
""")
print(my_data)
print([x for x in my_data if x])
--output:--
['hello', '', [], {}, [1, 2, 3], {'a': 1, 'b': 2}]
['hello', [1, 2, 3], {'a': 1, 'b': 2}]
Empty objects(including 0) evaluate to False.
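Because 0, False and None are falsy as well, the truthiness filter above drops them along with the empty containers. If only empty strings, lists, and dicts should go, a stricter test might look like the sketch below; is_empty_container is a hypothetical helper name and the sample JSON is made up.

```python
import json

def is_empty_container(value):
    """True only for empty strings, lists, and dicts (not for 0, False, or None)."""
    return isinstance(value, (str, list, dict)) and len(value) == 0

# Made-up sample data that mixes empty containers with other falsy values.
my_data = json.loads('["hello", "", [], {}, 0, false, null, [1, 2, 3]]')
kept = [x for x in my_data if not is_empty_container(x)]
print(kept)
```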
I'm trying to print a specific value in this dictionary, which is made up of several nested lists.
the_dic = {'k1':[1,2,3,{'tricky':['oh','man','inception',{'target':[1,2,3,'hello']}]}]}
Expected output: hello
Can I expand on the simple code below?
print(the_dic["k1"])
You can keep indexing further by referring to the specific index of the list ([3]) or to further keys in the dictionary!
This works because each index or key lookup returns the inner value, which in turn supports its own indexing or key lookup depending on its type (or is a single value, which ends the chain).
Continuing the chain a little further:
>>> d = {'k1':[1,2,3,{'tricky':['oh','man','inception',{'target':[1,2,3,'hello']}]}]}
>>> d["k1"]
[1, 2, 3, {'tricky': ['oh', 'man', 'inception', {'target': [1, 2, 3, 'hello']}]}]
>>> d["k1"][3]
{'tricky': ['oh', 'man', 'inception', {'target': [1, 2, 3, 'hello']}]}
>>> d["k1"][3]["tricky"]
['oh', 'man', 'inception', {'target': [1, 2, 3, 'hello']}]
A simpler example may make this clearer still:
>>> d = {'a': {'b': [1,2,3]}}
>>> d['a']['b'] # list referenced by key 'b' within key 'a' of d
[1, 2, 3]
>>> d['a']['b'][0] # first member of the very innermost list
1
I think for your desired output your final code will be:
the_dic = {'k1':[1,2,3,{'tricky':['oh','man','inception',{'target':[1,2,3,'hello']}]}]}
print(the_dic['k1'][3]['tricky'][3]['target'][3])
Result:
hello
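The chain of lookups above can also be written as data: a list of keys and indexes followed one step at a time. A minimal sketch, where get_path is a hypothetical helper name:

```python
from functools import reduce
from operator import getitem

def get_path(data, path):
    """Follow a sequence of dict keys / list indexes into nested data."""
    # reduce applies data[step] for each step, feeding the result forward.
    return reduce(getitem, path, data)

the_dic = {'k1': [1, 2, 3, {'tricky': ['oh', 'man', 'inception', {'target': [1, 2, 3, 'hello']}]}]}
value = get_path(the_dic, ['k1', 3, 'tricky', 3, 'target', 3])
print(value)
```

A missing key or an out-of-range index still raises KeyError or IndexError, exactly as the hand-written chain would.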
A helper recursive function such as this should do the trick:
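def get_last_val(d):
    # Descend into the last element of each list or dict until a plain value is reached.
    if isinstance(d, list):
        return get_last_val(d[-1])
    if isinstance(d, dict):
        # list(d.values())[-1] avoids popitem(), which would remove the item from the dict.
        return get_last_val(list(d.values())[-1])
    return d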
Example:
>>> get_last_val(the_dic)
'hello'
I am trying to create new variables depending on how many elements are stored inside a list.
It would have to look something like this:
dict = {}
list = [1, 2, 3, 4]
for obj in list:
    if obj not in dict:
        dict[obj] = var + obj = None
How do I do this?
You could do something like this:
>>> l = [1, 2, 3, 4]
>>> d = {'var%d' % el: None for el in l}
>>> d
{'var4': None, 'var1': None, 'var3': None, 'var2': None}
If you want to add a new item to the dictionary using a value from the list as the key, you can try this:
dict[obj] = new_value
You can use the constructor classmethod dict.fromkeys to pre-add any number of keys to a dictionary:
d = dict.fromkeys(f'var{x}' for x in lst)
OR
d = dict.fromkeys(map('var{}'.format, lst))
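For a runnable version of the dict.fromkeys approach, here is a short sketch. The name lst and the var prefix come from the answers above; the sample values are an assumption.

```python
lst = [1, 2, 3, 4]  # sample values, assumed for illustration

# Every key maps to the same default object (None unless a second argument
# is given), so avoid passing a mutable default such as a list here.
d = dict.fromkeys(f"var{x}" for x in lst)
print(d)
```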
Please don't use the variable names dict and list. They shadow the built-in names in your namespace, so you won't be able to use the built-in classes until you do del list and del dict.
Here is an example with a longer list with some duplicates:
myDict = {}
myList = [1, 2, 3, 4, 3, 2, 8, 1]
for obj in myList:
    key = "var" + str(obj)
    if key not in myDict:  # test the generated key, not the raw number
        myDict[key] = None
The output is:
myDict {'var1': None, 'var2': None, 'var3': None, 'var4': None, 'var8': None}
my_list = [ [1,2], [2,3], [3,4] ]
# my attempt
output = { {'a':k[0], 'b':k[1]} for k in my_list }
#desired output
[ {a:1, b:2}, {a:2, b:3}, {a:3,b:4} ]
Is there a way to get the dict comprehension to return a dict, with multiple keys?
Maybe you wanted to do this:
output = [ {'a':k[0], 'b':k[1]} for k in my_list ]
#        ^                                       ^
which is called a list comprehension in Python.
Your outer structure ought to be a list for your output, and you are incorrectly attempting to perform a set comprehension as opposed to a list comprehension. This fails because set elements must be hashable, and dicts are not as they are mutable. Additionally, you can unpack the list items to be a bit more clear in this case.
>>> [dict(a=x, b=y) for x, y in my_list]
[{'a': 1, 'b': 2}, {'a': 2, 'b': 3}, {'a': 3, 'b': 4}]
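To see the hashability problem directly, the original attempt can be wrapped in a try block. This is an illustrative sketch; the exact wording of the error message varies between Python versions, so it is printed rather than quoted here.

```python
my_list = [[1, 2], [2, 3], [3, 4]]

error_message = None
try:
    # Curly braces around a bare expression make a set comprehension,
    # and dicts cannot be set members because they are unhashable.
    output = {{'a': k[0], 'b': k[1]} for k in my_list}
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Square brackets build a list instead, which accepts any element type.
output = [{'a': a, 'b': b} for a, b in my_list]
print(output)
```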
my_list = [ [1,2], [2,3], [3,4] ]
[dict(zip(['a', 'b'], x)) for x in my_list]
index = {
u'when_air': 0,
u'chrono': 1,
u'age_marker': 2,
u'name': 3
}
How can I write this in a more elegant (and clearer) way than manually setting each value?
Something like:
index = dict_from_range(
[u'when_air', u'chrono', u'age_marker', u'name'],
range(4)
)
You can feed the results of zip() to the builtin dict():
>>> names = [u'when_air', u'chrono', u'age_marker', u'name']
>>> print(dict(zip(names, range(4))))
{'chrono': 1, 'name': 3, 'age_marker': 2, 'when_air': 0}
zip() pairs the ith element of names with the ith element of range(4) (as a list of tuples in Python 2, as an iterator of tuples in Python 3). dict() knows how to create a dictionary from those pairs.
Notice that if you give sequences of uneven lengths to zip(), the results are truncated. Thus it might be smart to use range(len(names)) as the argument, to guarantee an equal length.
>>> print(dict(zip(names, range(len(names)))))
{'chrono': 1, 'name': 3, 'age_marker': 2, 'when_air': 0}
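The truncation behaviour is easy to check. A small sketch, with a deliberately too-short range chosen for illustration:

```python
names = ['when_air', 'chrono', 'age_marker', 'name']

# zip() stops at the shortest input, so a too-short range silently drops keys.
truncated = dict(zip(names, range(2)))
print(truncated)

# Deriving the range from the list keeps the two lengths in step.
index = dict(zip(names, range(len(names))))
print(index)
```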
You can use a dict comprehension together with the built-in function enumerate to build the dictionary from the keys in the desired order.
Example:
keys = [u'when_air', u'chrono', u'age_marker', u'name']
d = {k: i for i,k in enumerate(keys)}
print(d)
The output is:
{u'age_marker': 2, u'when_air': 0, u'name': 3, u'chrono': 1}
Note that with Python 3.4 the enum module was added. It may provide the desired semantics more conveniently than a dictionary.
For reference:
http://legacy.python.org/dev/peps/pep-0274/
https://docs.python.org/2/library/functions.html#enumerate
https://docs.python.org/3/library/enum.html
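As a rough illustration of the enum suggestion above, the functional API of IntEnum can number the names from zero. The class name Index and the sample row are assumptions for this sketch; it needs Python 3.4 or later.

```python
from enum import IntEnum

# Functional API: members are numbered from `start` in the order given.
Index = IntEnum('Index', ['when_air', 'chrono', 'age_marker', 'name'], start=0)

row = ['2014-01-01', 'first', 'adult', 'example']  # made-up sample row

# IntEnum members behave like ints, so they can index a list directly.
print(row[Index.chrono])
# Lookup by string is also possible, as with the dictionary version.
print(Index['name'].value)
```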
index = {k:v for k,v in zip(['when_air','chrono','age_marker','name'],range(4))}
This?
from itertools import count
keys = [u'when_air', u'chrono', u'age_marker', u'name']
print(dict(zip(keys, count())))
I use this code to pretty print a dict into JSON:
import json
d = {'a': 'blah', 'b': 'foo', 'c': [1,2,3]}
print(json.dumps(d, indent=2, separators=(',', ': ')))
Output:
{
  "a": "blah",
  "c": [
    1,
    2,
    3
  ],
  "b": "foo"
}
This is a little bit too much (newline for each list element!).
Which syntax should I use to have this:
{
  "a": "blah",
  "c": [1, 2, 3],
  "b": "foo"
}
instead?
I ended up using jsbeautifier:
import jsbeautifier
opts = jsbeautifier.default_options()
opts.indent_size = 2
jsbeautifier.beautify(json.dumps(d), opts)
Output:
{
  "a": "blah",
  "c": [1, 2, 3],
  "b": "foo"
}
After years, I found a solution with the built-in pprint module:
import pprint
d = {'a': 'blah', 'b': 'foo', 'c': [1,2,3]}
pprint.pprint(d) # default width=80 so this will be printed in a single line
pprint.pprint(d, width=20) # here it will be wrapped exactly as expected
Output:
{'a': 'blah',
 'b': 'foo',
 'c': [1, 2, 3]}
Another alternative is print(json.dumps(d, indent=None, separators=(',\n', ': ')))
The output will be:
{"a": "blah",
"c": [1,
2,
3],
"b": "foo"}
Note that although the official docs at https://docs.python.org/2.7/library/json.html#basic-usage give the default as separators=None, that actually means "use the default of separators=(', ', ': ')". Note also that the item separator doesn't distinguish between key/value pairs and list elements.
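The effect of the separators argument can be seen with indent left at None. A short sketch with made-up sample dicts, assuming Python 3.7+ so that key order is preserved:

```python
import json

d = {'c': [1, 2, 3]}

# With indent=None the separators alone decide the layout.
default_text = json.dumps(d)
print(default_text)

# (item_separator, key_separator) without spaces gives the most compact form.
compact = json.dumps(d, separators=(',', ':'))
print(compact)

# The item separator is shared by list elements and key/value pairs.
multi = json.dumps({'a': 1, 'c': [1, 2]}, separators=(',\n', ': '))
print(multi)
```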
I couldn't get jsbeautifier to do much, so I used regular expressions. I had JSON like
'{\n "string": [\n 4,\n 1.0,\n 6,\n 1.0,\n 8,\n 1.0,\n 9,\n 1.0\n ],\n...'
that I wanted as
'{\n "string": [ 4, 1.0, 6, 1.0, 8, 1.0, 9, 1.0],\n'
so
t = json.dumps(apriori, indent=4)
t = re.sub(r'\[\n {7}', '[', t)
t = re.sub(r'(?<!\]),\n {7}', ',', t)
t = re.sub(r'\n {4}\]', ']', t)
outfile.write(t)
So instead of one "dump(apriori, t, indent=4)", I had those 5 lines.
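A self-contained run of the same three substitutions, using made-up sample data in place of apriori and printing instead of writing to a file:

```python
import json
import re

apriori = {"string": [4, 1.0, 6, 1.0]}  # made-up sample data

t = json.dumps(apriori, indent=4)
# Join "[" with its first element, leaving one of the eight indent spaces.
t = re.sub(r'\[\n {7}', '[', t)
# Pull each following element up onto the same line.
t = re.sub(r'(?<!\]),\n {7}', ',', t)
# Attach the closing bracket to the last element.
t = re.sub(r'\n {4}\]', ']', t)
print(t)
```

The patterns are tied to indent=4 and to lists sitting directly under a top-level key. Deeper nesting, or string values that happen to contain the same character sequences, would need different expressions.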
This has been bugging me for a while as well, I found a 1 liner I'm almost happy with:
print json.dumps(eval(str(d).replace('[', '"[').replace(']', ']"').replace('(', '"(').replace(')', ')"')), indent=2).replace('\"\\"[', '[').replace(']\\"\"', ']').replace('\"\\"(', '(').replace(')\\"\"', ')')
That essentially converts all lists or tuples to strings, then uses json.dumps with indent to format the dict. Then you just need to remove the quotes and you're done!
Note: I convert the dict to string to easily convert all lists/tuples no matter how nested the dict is.
PS. I hope the Python Police won't come after me for using eval... (use with care)
Perhaps not quite as efficient, but consider a simpler case (somewhat tested in Python 3, but probably would work in Python 2 also):
def dictJSONdumps( obj, levels, indentlevels = 0 ):
    import json
    if isinstance( obj, dict ):
        res = []
        for ix in sorted( obj, key=lambda x: str( x )):
            temp = ' ' * indentlevels + json.dumps( ix, ensure_ascii=False ) + ': '
            if levels:
                temp += dictJSONdumps( obj[ ix ], levels-1, indentlevels+1 )
            else:
                temp += json.dumps( obj[ ix ], ensure_ascii=False )
            res.append( temp )
        return '{\n' + ',\n'.join( res ) + '\n}'
    else:
        return json.dumps( obj, ensure_ascii=False )
This might give you some ideas, short of writing your own serializer completely. I used my own favorite indent technique, and hard-coded ensure_ascii, but you could add parameters and pass them along, or hard-code your own, etc.
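Along the same lines, here is a minimal sketch of a formatter that puts each dict key on its own line but leaves lists on a single line. It is not the function above: compact_dumps is a hypothetical name, and it assumes string keys and Python 3.7+ for insertion-ordered dicts.

```python
import json

def compact_dumps(obj, indent=2, level=0):
    """Serialize dicts one key per line, but keep lists on a single line."""
    if isinstance(obj, dict):
        if not obj:
            return '{}'
        pad = ' ' * (indent * (level + 1))
        items = [
            # Assumes string keys; json.dumps adds the quotes and escaping.
            pad + json.dumps(key) + ': ' + compact_dumps(value, indent, level + 1)
            for key, value in obj.items()
        ]
        return '{\n' + ',\n'.join(items) + '\n' + ' ' * (indent * level) + '}'
    # Lists and scalars go through json.dumps, which emits them inline.
    return json.dumps(obj)

d = {'a': 'blah', 'c': [1, 2, 3], 'b': 'foo'}
text = compact_dumps(d)
print(text)
```

Dicts nested inside lists are also emitted inline by this sketch, which may or may not be what you want.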