json dump with dict as value - python

This question clearly explains how to send a response using json and Python dicts. The example, however, uses a string as the value type in the dict. How would one do this with a dict as the value type? That is, a dict with a dict as a value.

To clarify adityasdarma1's comment: this is not a limitation of Python or Django, but of JSON. In JSON, object keys must always be strings. There is no "tuple" type in JSON or JavaScript anyway; and arrays can't be keys because they are mutable. (In Python, tuples can be dict keys, but lists can't.)
I'm not sure why you would need that, though. You can either concatenate the values in some way to make a string key - eg "bar-baz" - or alternatively you might need a more complex nested structure, with bar as the key of the outer dict and baz as an inner one. Without seeing your full data structure, it's hard to advise further.
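Both suggestions can be sketched roughly like this. The sample values and the "-" separator are assumptions made for illustration; only the names bar and baz come from the discussion above.

```python
import json

# Hypothetical data keyed by (bar, baz) tuples; json.dumps rejects tuple keys.
data = {("bar", "baz"): 42, ("bar", "qux"): 7}

# Option 1: join each tuple into a single string key.
flat = {"-".join(key): value for key, value in data.items()}
print(json.dumps(flat, sort_keys=True))

# Option 2: nest the second element under the first.
nested = {}
for (outer, inner), value in data.items():
    nested.setdefault(outer, {})[inner] = value
print(json.dumps(nested, sort_keys=True))
```

The nested form keeps both parts recoverable without splitting a string again, which matters if the parts can themselves contain the separator.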

By giving it a dict with a dict as a value type.
>>> json.dumps({'foo': {'bar': 42}})
'{"foo": {"bar": 42}}'

Related

How do i access the sub details in the JSON using python?

the python program that i am writing calls to an api that returns this json:
Code Output
How do i access the subdetails? When i run the .keys() it only lists those three top levels. I want to be able to get specific items, e.g. "Utility"
I've tried several solutions but none parse correctly. I have tried calling the list inside the dictionary, to no avail. Originally i thought it was a dictionary inside of a dictionary, but Python thinks its a list nested into a dictionary.
Any help would be appreciated!
The keys() function only returns the keys of a dictionary, so if you call keys(), it will only return those three results. The "subdetails" you are referring to are the values of those keys. Taking the key "SUMMARY" as an example, its value is a list rather than a dict (note the "[" after the key). However, the list has only a single element, which is quite common in JSON. To retrieve "Utility", all you need is data['SUMMARY'][0]['Utility']
To understand the data structure better, call values() and items() and look at what they return.
Since it's a dict of lists of dicts, simply use an index of 0 to access the first item of the list if there is always only one item in each list. For example, if your JSON object is stored as variable data, then the value of Utility can be accessed with data['SUMMARY'][0]['Utility'].
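As a minimal sketch of that access pattern: the JSON below is a made-up stand-in for the API response (only the names SUMMARY and Utility come from the question), so adjust it to the real payload.

```python
import json

# Hypothetical response; only "SUMMARY" and "Utility" come from the question.
raw = '{"SUMMARY": [{"Utility": "Example Electric", "Zone": "A"}]}'
data = json.loads(raw)

print(list(data.keys()))        # top-level keys only
print(type(data["SUMMARY"]))    # a list, not a dict

# Index into the list first, then into the dict it contains.
utility = data["SUMMARY"][0]["Utility"]
print(utility)

# If a list may hold several entries, loop instead of hard-coding index 0.
utilities = [entry["Utility"] for entry in data["SUMMARY"]]
```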

How to print dict sorted by keys?

My use case involves printing a json. To aid legibility I want to print it sorted by key. dict comes into the picture as in my case json.loads returns a dict.
Things I tried:
dict.__str__ = myStrFn, which results in TypeError: can't set attributes of built-in/extension type 'dict'
Write myDict along the lines of https://stackoverflow.com/a/931822/438758. This does not work for nested dictionaries, as the nested dictionaries are of type dict and not myDict.
What are my options here? I would prefer something which makes print(json.loads(json_str)) work. But would settle for print(str_func(json.loads(json_str))).
If you have a solution specific to my json use case, that would be great too. But I would prefer a generic answer. I am aware that dict keys only need to be hashable and not "comparable" (in the sense that there might not be a total order), so an absolutely generic solution might not be possible. But I am inclined to believe that we can have a solution for all valid JSON types.
I am using python3
print(json.dumps(your_dict, sort_keys=True))
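A small sketch of how that behaves on nested data. The sample string is an assumption; sort_keys applies to every nested object, and the optional indent argument is added here only to make the result easier to read.

```python
import json

# Sample input with unsorted keys at several nesting levels.
json_str = '{"b": {"z": 1, "a": 2}, "a": [3, {"y": 4, "x": 5}]}'
loaded = json.loads(json_str)

compact = json.dumps(loaded, sort_keys=True)
pretty = json.dumps(loaded, sort_keys=True, indent=2)

print(compact)
print(pretty)
```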

Why does CouchDb-python (or do I) confuse strings and dictionaries?

I'm trying to use the Python wrapper for CouchDB to update a database. The file is structured as a nested dictionary as follows.
doc = { ...,
    'RLSoo': {'RT_freq': 2, 'tweet': "They're going to play monopoly now. This makes me feel like an excellent mother. #Sandy #NYC"},
    'GiltCityNYC': {},
    ....}
I would like to put each entry of the larger dictionary, for example RLSoo, into its own document. However, I get an error message when I try the following code.
for key in doc:
    db.update(doc[key],all_or_nothing=True)
Error Message
TypeError: expected dict, got <type 'str'>
I don't understand why CouchDB won't accept the dictionary.
According to the implementation and documentation of Database.update(), the first argument should be a list of document objects (e.g. a list of dicts). Since the value you pass is a dict, iterating over it directly yields its keys, which are strings. If I understood your case correctly, doc contains the nested documents as its values. So just try:
db.update(doc.values(), all_or_nothing=True)
If all first-level values are dicts, it should work!
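To see why the loop fails without needing a CouchDB server, here is a rough sketch. check_documents is a hypothetical stand-in for the kind of type check described above, not part of the couchdb package, and the tweet text is shortened.

```python
def check_documents(documents):
    """Hypothetical stand-in for a 'list of dicts' check; not the couchdb API."""
    for item in documents:
        if not isinstance(item, dict):
            raise TypeError("expected dict, got %r" % type(item))
    return len(documents)

# Sample data shaped like the question's doc (tweet text shortened).
doc = {
    "RLSoo": {"RT_freq": 2, "tweet": "They're going to play monopoly now."},
    "GiltCityNYC": {},
}

# Passing one nested dict: iterating it yields its string keys.
try:
    check_documents(doc["RLSoo"])
    rejected = False
except TypeError as exc:
    rejected = True
    print(exc)

# Passing all nested dicts as a list is accepted.
accepted = check_documents(list(doc.values()))
print(accepted)
```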

Python, checksum of a dict

I'm thinking to create a checksum of a dict to know if it was modified or not
For the moment i have that:
>>> import hashlib
>>> import pickle
>>> d = {'k': 'v', 'k2': 'v2'}
>>> z = pickle.dumps(d)
>>> hashlib.md5(z).hexdigest()
'8521955ed8c63c554744058c9888dc30'
Perhaps a better solution exists?
Note: I want to create an unique id of a dict to create a good Etag.
EDIT: I can have abstract data in the dict.
Something like this:
from functools import reduce  # needed in Python 3; reduce is a builtin only in Python 2
reduce(lambda x,y : x^y, [hash(item) for item in d.items()])
Take the hash of each (key, value) tuple in the dict and XOR them all together.
@katrielalex
If the dict contains unhashable items you could do this:
hash(str(d))
or maybe even better
hash(repr(d))
In Python 3, the hash function is initialized with a random number, which is different for each python session. If that is not acceptable for the intended application, use e.g. zlib.adler32 to build the checksum for a dict:
import zlib
d={'key1':'value1','key2':'value2'}
checksum=0
for item in d.items():
    c1 = 1
    for t in item:
        c1 = zlib.adler32(bytes(repr(t),'utf-8'), c1)
    checksum=checksum ^ c1
print(checksum)
I would recommend an approach very similar to the one you propose, but with some extra guarantees:
import hashlib, json
hashlib.md5(json.dumps(d, sort_keys=True, ensure_ascii=True).encode('utf-8')).hexdigest()
sort_keys=True: keep the same hash if the order of your keys changes
ensure_ascii=True: in case you have some non-ascii characters, to make sure the representation does not change
We use this for our ETag.
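Wrapped up as a helper, that approach might look like the sketch below. The name dict_etag is made up for this example, and it assumes everything in the dict is JSON-serializable, so it will raise TypeError for arbitrary objects.

```python
import hashlib
import json

def dict_etag(d):
    """MD5 hex digest of a canonical JSON dump; assumes d is JSON-serializable."""
    canonical = json.dumps(d, sort_keys=True, ensure_ascii=True)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

a = {"k": "v", "k2": {"x": 1, "y": 2}}
b = {"k2": {"y": 2, "x": 1}, "k": "v"}   # same content, different key order

print(dict_etag(a) == dict_etag(b))            # key order does not matter
print(dict_etag(a) == dict_etag({"k": "v"}))   # different content differs
```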
I don't know whether pickle guarantees that the dict is serialized the same way every time.
If you only have dictionaries, I would go for a combination of calls to keys() and sorted(), build a string based on the sorted key/value pairs, and compute the checksum on that.
I think you may not realise some of the subtleties that go into this. The first problem is that the order that items appear in a dict is not defined by the implementation. This means that simply asking for str of a dict doesn't work, because you could have
str(d1) == "{'a':1, 'b':2}"
str(d2) == "{'b':2, 'a':1}"
and these will hash to different values. If you have only hashable items in the dict, you can hash them and then join up their hashes, as @Bart does, or simply
hash(tuple(sorted(hash(x) for x in d.items())))
Note the sorted, because you have to ensure that the hashed tuple comes out in the same order irrespective of which order the items appear in the dict. If you have dicts in the dict, you could recurse this, but it will be complicated.
BUT it would be easy to break any implementation like this if you allow arbitrary data in the dictionary, since you can simply write an object with a broken __hash__ implementation and use that. And you can't use id, because then you might have equal items which compare different.
The moral of the story is that hashing dicts isn't supported in Python for a reason.
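If you still want to try the recursive route, one possible sketch is below. freeze is a hypothetical helper, it assumes every leaf value is already hashable, and it uses a frozenset instead of sorting so that keys do not have to be comparable. Because hash() of strings changes between Python 3 sessions, the result is only useful for comparisons inside one process.

```python
def freeze(obj):
    """Recursively convert dicts, lists and sets into hashable equivalents."""
    if isinstance(obj, dict):
        # A frozenset of pairs ignores insertion order, so no sorting is needed.
        return frozenset((key, freeze(value)) for key, value in obj.items())
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, set):
        return frozenset(obj)
    return obj  # assumed to be hashable already

d1 = {"a": 1, "b": {"c": [1, 2], "d": None}}
d2 = {"b": {"d": None, "c": [1, 2]}, "a": 1}   # same content, other order

print(hash(freeze(d1)) == hash(freeze(d2)))
```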
Since you want to generate an ETag based on the dictionary content, an OrderedDict, which preserves insertion order, may be a better candidate here. Just iterate through the key/value pairs and construct your ETag string.

Python's json module, converts int dictionary keys to strings

I have found that when the following is run, python's json module (included since 2.6) converts int dictionary keys to strings.
>>> import json
>>> releases = {1: "foo-v0.1"}
>>> json.dumps(releases)
'{"1": "foo-v0.1"}'
Is there any easy way to preserve the key as an int, without needing to parse the string on dump and load.
I believe it would be possible using the hooks provided by the json module, but again this still requires parsing.
Is there possibly an argument I have overlooked?
Sub-question:
Thanks for the answers. Seeing as json works as I feared, is there an easy way to convey key type by maybe parsing the output of dumps?
Also I should note the code doing the dumping and the code downloading the json object from a server and loading it, are both written by me.
This is one of those subtle differences among various mapping collections that can bite you. JSON treats keys as strings; Python supports distinct keys differing only in type.
In Python (and apparently in Lua) the keys to a mapping (dictionary or table, respectively) are object references. In Python they must be immutable types, or they must be objects which implement a __hash__ method. (The Lua docs suggest that it automatically uses the object's ID as a hash/key even for mutable objects and relies on string interning to ensure that equivalent strings map to the same objects).
In Perl, Javascript, awk and many other languages the keys for hashes, associative arrays or whatever they're called for the given language, are strings (or "scalars" in Perl). In perl $foo{1}, $foo{1.0}, and $foo{"1"} are all references to the same mapping in %foo --- the key is evaluated as a scalar!
JSON started as a Javascript serialization technology. (JSON stands for JavaScript Object Notation.) Naturally it implements semantics for its mapping notation which are consistent with its mapping semantics.
If both ends of your serialization are going to be Python then you'd be better off using pickles. If you really need to convert these back from JSON into native Python objects I guess you have a couple of choices. First you could try (try: ... except: ...) to convert any key to a number in the event of a dictionary look-up failure. Alternatively, if you add code to the other end (the serializer or generator of this JSON data) then you could have it perform a JSON serialization on each of the key values, providing those as a list of keys. (Then your Python code would first iterate over the list of keys, deserializing them into native Python objects, and then use those to access the values in the mapping.)
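A quick sketch of the difference, assuming both ends are Python and the data comes from a source you trust (unpickling untrusted data is unsafe). The releases dict is the one from the question.

```python
import json
import pickle

releases = {1: "foo-v0.1"}

via_json = json.loads(json.dumps(releases))
via_pickle = pickle.loads(pickle.dumps(releases))

print(via_json)     # keys come back as strings
print(via_pickle)   # keys keep their original type
```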
No, there is no such thing as a Number key in JavaScript. All object properties are converted to String.
var a= {1: 'a'};
for (k in a)
    alert(typeof k); // 'string'
This can lead to some curious-seeming behaviours:
a[999999999999999999999]= 'a'; // this even works on Array
alert(a[1000000000000000000000]); // 'a'
alert(a['999999999999999999999']); // fail
alert(a['1e+21']); // 'a'
JavaScript Objects aren't really proper mappings as you'd understand it in languages like Python, and using keys that aren't String results in weirdness. This is why JSON always explicitly writes keys as strings, even where it doesn't look necessary.
Answering your subquestion:
It can be accomplished by using json.loads(jsonDict, object_hook=jsonKeys2int)
def jsonKeys2int(x):
    if isinstance(x, dict):
        return {int(k):v for k,v in x.items()}
    return x
This function will also work for nested dicts and uses a dict comprehension.
If you want to cast the values too, use:
def jsonKV2int(x):
    if isinstance(x, dict):
        return {int(k):(int(v) if isinstance(v, unicode) else v) for k,v in x.items()}
    return x
This tests the type of each value and casts it only if it is a string object (unicode, to be exact).
Both functions assume the keys (and values) to be integers.
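For Python 3, where the unicode type no longer exists, a sketch of the same idea could look like this. keys_to_int is a name chosen for this example. json.loads calls object_hook for every decoded object, innermost first, which is why nested dicts are handled without explicit recursion.

```python
import json

def keys_to_int(obj):
    """object_hook: convert every key of a decoded JSON object to int.

    Assumes all keys are integer strings; int() raises ValueError otherwise.
    """
    return {int(key): value for key, value in obj.items()}

nested = json.loads('{"1": {"2": "a", "3": "b"}, "4": "c"}',
                    object_hook=keys_to_int)
print(nested)
```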
Thanks to:
How to use if/else in a dictionary comprehension?
Convert a string key to int in a Dictionary
Alternatively, you can convert the dictionary to a list of pairs in the format [(k1,v1),(k2,v2)] before encoding it with json, and convert it back to a dictionary after decoding.
>>> import json
>>> releases = {1: "foo-v0.1"}
>>> json.dumps(list(releases.items()))
'[[1, "foo-v0.1"]]'
>>> releases == dict(json.loads(json.dumps(list(releases.items()))))
True
I believe this will need some more work, such as some sort of flag to identify which lists should be converted back to dictionaries after decoding the JSON.
I've gotten bitten by the same problem. As others have pointed out, in JSON, the mapping keys must be strings. You can do one of two things. You can use a less strict JSON library, like demjson, which allows integer strings. If no other programs (or none written in other languages) are going to read it, then you should be okay. Or you can use a different serialization language. I wouldn't suggest pickle. It's hard to read, and is not designed to be secure. Instead, I'd suggest YAML, which is (nearly) a superset of JSON, and does allow integer keys. (At least PyYAML does.)
Here is my solution! I used object_hook, it is useful when you have nested json
>>> import json
>>> json_data = '{"1": "one", "2": {"-3": "minus three", "4": "four"}}'
>>> py_dict = json.loads(json_data, object_hook=lambda d: {int(k) if k.lstrip('-').isdigit() else k: v for k, v in d.items()})
>>> py_dict
{1: 'one', 2: {-3: 'minus three', 4: 'four'}}
This filter only converts JSON keys to int. You can apply the same filter, int(v) if v.lstrip('-').isdigit() else v, to string values too.
Convert the dictionary to be string by using str(dict) and then convert it back to dict by doing this:
import ast
ast.literal_eval(string)
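A minimal sketch of that round trip. The sample dict is an assumption, and note that the intermediate string is a Python literal rather than JSON, so only Python code can read it back.

```python
import ast

# Sample dict with an int key and a tuple key, neither of which JSON preserves.
releases = {1: "foo-v0.1", (2, 0): {"beta": True}}

text = str(releases)                 # Python literal syntax, not JSON
restored = ast.literal_eval(text)    # safely evaluates literals only

print(restored == releases)
```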
I made a very simple extension of Murmel's answer which I think will work on a pretty arbitrary dictionary (including nested) assuming it can be dumped by JSON in the first place. Any keys which can be interpreted as integers will be cast to int. No doubt this is not very efficient, but it works for my purposes of storing to and loading from json strings.
def convert_keys_to_int(d: dict):
    new_dict = {}
    for k, v in d.items():
        try:
            new_key = int(k)
        except ValueError:
            new_key = k
        if type(v) == dict:
            v = convert_keys_to_int(v)  # recurse into nested dicts
        new_dict[new_key] = v
    return new_dict
Assuming that all keys in the original dict are integers if they can be cast to int, then this will return the original dictionary after storing as a json.
e.g.
>>> d = {1: 3, 2: 'a', 3: {1: 'a', 2: 10}, 4: {'a': 2, 'b': 10}}
>>> convert_keys_to_int(json.loads(json.dumps(d))) == d
True
[NSFW] You can write your own json.dumps; here is an example from djson: encoder.py. You can use it like this:
assert dumps({1: "abc"}) == '{1: "abc"}'
