Python force dict entries to be utf-8

I spent the better part of an afternoon trying to patch dictionary objects to be utf-8 encoded instead of unicode. I am trying to find the fastest, best-performing way to extend a dictionary object and ensure that its entries, both keys and values, are utf-8.
Here is what I have come up with. It does the job, but I'm wondering what improvements could be made.
class UTF8Dict(dict):
    def __init__(self, *args, **kwargs):
        d = dict(*args, **kwargs)
        d = _decode_dict(d)
        super(UTF8Dict,self).__init__(d)
    def __setitem__(self,key,value):
        if isinstance(key,unicode):
            key = key.encode('utf-8')
        if isinstance(value,unicode):
            value = value.encode('utf-8')
        return super(UTF8Dict,self).__setitem__(key,value)
def _decode_list(data):
    rv = []
    for item in data:
        if isinstance(item, unicode):
            item = item.encode('utf-8')
        elif isinstance(item, list):
            item = _decode_list(item)
        elif isinstance(item, dict):
            item = _decode_dict(item)
        rv.append(item)
    return rv
def _decode_dict(data):
    rv = {}
    for key, value in data.iteritems():
        if isinstance(key, unicode):
            key = key.encode('utf-8')
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        elif isinstance(value, list):
            value = _decode_list(value)
        elif isinstance(value, dict):
            value = _decode_dict(value)
        rv[key] = value
    return rv
Suggestions that improve any of the following would be very helpful:
Performance
Cover more edge-cases
Error handling

I agree with the comments that say that this may be misguided. That said, here are some holes in your current scheme:
d.setdefault can be used to add unicode objects to your dict:
>>> d = UTF8Dict()
>>> d.setdefault(u'x', u'y')
d.update can be used to add unicode objects to your dict:
>>> d = UTF8Dict()
>>> d.update({u'x': u'y'})
the list values contained in a dict could be modified to include unicode objects, using any standard list operations. E.g.:
>>> d = UTF8Dict(x=[])
>>> d['x'].append(u'x')
Why do you want to ensure that your data structure contains only utf-8 strings?
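One way to close the setdefault and update holes listed above is to route every write path through a single encoding helper. The following is only a minimal Python 3 sketch, not the asker's code: str stands in for Python 2's unicode, and the names EncodingDict and _encode are hypothetical. Lists are copied and encoded at insertion time, so later mutation of a stored list (the third hole) is still not prevented.

```python
def _encode(obj):
    """Recursively encode str to UTF-8 bytes inside dicts and lists."""
    if isinstance(obj, str):
        return obj.encode("utf-8")
    if isinstance(obj, list):
        return [_encode(item) for item in obj]
    if isinstance(obj, dict):
        return {_encode(k): _encode(v) for k, v in obj.items()}
    return obj


class EncodingDict(dict):
    """Hypothetical dict subclass whose write paths all pass through _encode."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(_encode(key), _encode(value))

    def setdefault(self, key, default=None):
        key = _encode(key)
        if key not in self:
            super().__setitem__(key, _encode(default))
        return self[key]

    def update(self, *args, **kwargs):
        # dict(...) normalises mappings, iterables of pairs and keyword arguments
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = EncodingDict(x=[])
d.setdefault("a", "b")
d.update({"c": ["d", {"e": "f"}]})
print(d)
```

Overriding update and setdefault explicitly is needed because the built-in dict versions do not call an overridden __setitem__.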

Related

Use dictionary as key for another dictionary? [duplicate]

Python doesn't allow dictionaries to be used as keys in other dictionaries. Is there a workaround for using non-nested dictionaries as keys?
The general problem with more complicated non-hashable objects and my specific use case has been moved here. My original description of my use case was incorrect.
If you have a really immutable dictionary (although it isn't clear to me why you don't just use a list of pairs: e.g. [('content-type', 'text/plain'), ('host', 'example.com')]), then you may convert your dict into:
A tuple of pairs. You've already done that in your question. A tuple is required instead of list because the results rely on the ordering and the immutability of the elements.
>>> tuple(sorted(a.items()))
A frozen set. It is a more suitable approach from the mathematical point of view, as it requires only the equality relation on the elements of your immutable dict, while the first approach requires the ordering relation besides equality.
>>> frozenset(a.items())
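As a small illustration of the two conversions, here is a sketch in Python 3 showing that either form gives the same key for two dicts with equal items, regardless of insertion order. The dict contents and the name cache are made up for the example.

```python
a = {"content-type": "text/plain", "host": "example.com"}
b = {"host": "example.com", "content-type": "text/plain"}  # same items, other order

tuple_key = tuple(sorted(a.items()))  # needs the items to be orderable
set_key = frozenset(a.items())        # needs the items only to be hashable

cache = {tuple_key: "by tuple", set_key: "by frozenset"}

# Both lookups succeed even though b was built in a different order.
print(cache[tuple(sorted(b.items()))])
print(cache[frozenset(b.items())])

# The tuple form converts back to a dict directly.
print(dict(tuple_key) == a)
```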
If I needed to use dictionaries as keys, I would flatten the dictionary into a tuple of tuples.
You might find this SO question useful: What is the best way to implement nested dictionaries?
And here is an example of a flatten module that will flatten dictionaries: http://yawpycrypto.sourceforge.net/html/public/Flatten.Flatten-module.html
I don't fully understand your use case and I suspect that you are trying to prematurely optimize something that doesn't need optimization.
To turn a someDictionary into a key, do this
key = tuple(sorted(someDictionary.items()))
You can easily reverse this with dict( key )
One way to do this would be to subclass the dict and provide a hash method. ie:
class HashableDict(dict):
    def __hash__(self):
        return hash(tuple(sorted(self.iteritems())))
>>> d = HashableDict(a=1, b=2)
>>> d2 = { d : "foo"}
>>> d2[HashableDict(a=1, b=2)]
"foo"
However, bear in mind the reasons why dicts (or any mutable types) don't do this: mutating the object after it has been added to a hashtable will change the hash, which means the dict will now have it in the wrong bucket, and so incorrect results will be returned.
If you go this route, either be very sure that dicts will never change after they have been put in the other dictionary, or actively prevent them (eg. check that the hash never changes after the first call to __hash__, and throw an exception if not.)
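To make the hazard concrete, here is a short sketch (Python 3, so items() replaces iteritems(); the variable names are made up) of what happens when a HashableDict is mutated after being used as a key.

```python
class HashableDict(dict):
    def __hash__(self):
        # Python 3 spelling of self.iteritems()
        return hash(tuple(sorted(self.items())))


d = HashableDict(a=1, b=2)
outer = {d: "foo"}

# Before mutation, an equal dict finds the entry.
found_before = HashableDict(a=1, b=2) in outer

# Mutate the key after it was stored; outer still remembers the old hash.
d["c"] = 3

# Old contents: the hash matches the stored one, but d no longer compares equal.
found_old_contents = HashableDict(a=1, b=2) in outer
# New contents: compares equal to d, but hashes to a different value.
found_new_contents = HashableDict(a=1, b=2, c=3) in outer

print(found_before, found_old_contents, found_new_contents)
```

After the mutation neither lookup finds the entry, although outer still contains exactly one item.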
Hmm, isn't your use case just memoizing function calls? Using a decorator, you will have easy support for arbitrary functions. And yes, they often pickle the arguments, and using circular reasoning, this works for non-standard types as long as they can be pickled.
See e.g. this memoization sample
I'll sum up the options and add one of my own,
you can :
make a subclass to dict and provide a hash function
flatten the dict into a tuple
pickle the dict
convert the Dict into a string using the json module (as shown below)
import json
Dict = {'key' :'value123'}
stringifiedDict = json.dumps(Dict)
print(stringifiedDict)
# {"key": "value123"}
newDict = {stringifiedDict: 12345}
print(newDict[stringifiedDict])
# 12345
for key, val in newDict.items():
print(json.loads(key))
# {'key': 'value123'}
print(json.loads(key)['key'])
# value123
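One caveat with the JSON approach: two equal dicts built in a different order serialize to different strings by default. A small sketch, assuming the keys are strings and the values are JSON-serializable, using the sort_keys option of json.dumps to get one canonical key:

```python
import json

first = {"a": 1, "b": 2}
second = {"b": 2, "a": 1}  # equal to first, different insertion order

plain_first = json.dumps(first)
plain_second = json.dumps(second)

# sort_keys=True makes the output independent of insertion order.
canonical_first = json.dumps(first, sort_keys=True)
canonical_second = json.dumps(second, sort_keys=True)

lookup = {canonical_first: 12345}
print(plain_first == plain_second)
print(lookup[canonical_second])
```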
I don't see why you'd ever want to do this, but if you really really do need to, you could try pickling the dictionary:
mydict = {"a":1, "b":{"c":10}}
import pickle
key = pickle.dumps(mydict)
d[key] = value
this function will convert a nested dictionary to an immutable tuple of tuples which you can use as a key:
def convert_dictionary_tuple(input_dict):
"""
this function receives a nested dictionary and convert it to an immutable tuple of tuples with all the given
dictionary data
:param input_dict: a nested dictionary
:return: immutable tuple of tuples with all the given dictionary data
"""
tuples_dict = {}
for key, value in input_dict.iteritems():
if isinstance(value, dict):
tuples_dict[key] = convert_dictionary_tuple(value)
elif isinstance(value, list):
tuples_dict[key] = tuple([convert_dictionary_tuple(v) if isinstance(v, dict) else v for v in value])
else:
tuples_dict[key] = value
return tuple(sorted(tuples_dict.items()))
Class name... OK :/
My solution is to create a class, with dict features, but implemented as a list with {key, value} objects. key and value can be anything then.
class DictKeyDictException(Exception):
pass
class DictKeyDict():
def __init__(self, *args):
values = [self.__create_element(key, value) for key, value in args]
self.__values__ = values
def __setitem__(self, key, value):
self.set(key, value)
def __getitem__(self, key):
return self.get(key)
def __len__(self):
return len(self.__values__)
def __delitem__(self, key):
keys = self.keys()
if key in keys:
index = keys.index(key)
del self.__values__[index]
def clear(self):
self.__values__ = []
def copy(self):
return self.__values__.copy()
def has_key(self, k):
return k in self.keys()
def update(self, *args, **kwargs):
if kwargs:
raise DictKeyDictException(f"no kwargs allowed in '{self.__class__.__name__}.update' method")
for key, value in args:
self[key] = value
return self.__values__
def __repr__(self) -> list:
return repr(self.__values__)
@classmethod
def __create_element(cls, key, value):
return {"key": key, "value": value}
def set(self, key, value) -> None:
keys = self.keys()
if key in keys:
index = keys.index(key)
self.__values__[index] = self.__create_element(key, value)
else:
self.__values__.append(self.__create_element(key, value))
return self.__values__
def keys(self):
return [dict_key_value["key"] for dict_key_value in self.__values__]
def values(self):
return [value["value"] for value in self.__values__]
def items(self):
return [(dict_key_value["key"], dict_key_value["value"]) for dict_key_value in self.__values__]
def pop(self, key, default=None):
keys = self.keys()
if key in keys:
index = keys.index(key)
value = self.__values__.pop(index)["value"]
else:
value = default
return value
def get(self, key, default=None):
keys = self.keys()
if key in keys:
index = keys.index(key)
value = self.__values__[index]["value"]
else:
value = default
return value
def __iter__(self):
return iter(self.keys())
and usage :
dad = {"name": "dad"}
mom = {"name": "mom"}
boy = {"name": "son"}
girl = {"name": "daughter"}
# set
family = DictKeyDict()
family[dad] = {"age": 44}
family[mom] = {"age": 43}
# or
family.set(dad, {"age": 44, "children": [boy, girl]})
# or
family = DictKeyDict(
(dad, {"age": 44, "children": [boy, girl]}),
(mom, {"age": 43, "children": [boy, girl]}),
)
# update
family.update((mom, {"age": 33})) # oups sorry miss /!\ loose my children
family.set({"pet": "cutty"}, "cat")
del family[{"pet": "cutty"}] # cutty left...
family.set({"pet": "buddy"}, "dog")
family[{"pet": "buddy"}] = "wolf" # buddy was not a dog
print(family.keys())
print(family.values())
for k, v in family.items():
print(k, v)
I don't know whether I understand your question correctly, but i'll give it a try
d[repr(a)]=value
You can iterate over the dictionary like this
for el1 in d:
for el2 in eval(el1):
print el2,eval(el1)[el2]

Python: Can't find unicode field causing bson.errors.InvalidDocument during mongo insert

I am using pymongo to insert a complex structure as a row in a collection. The structure is a dict of list of dicts of lists of dicts etc..
Is there a way to find which field is unicode instead of str, that causes the error? I have tried:
def dump(obj):
with open('log', 'w') as flog:
for attr in dir(obj):
t, att = type(attr), getattr(obj, attr)
output = "obj.%s = %s" % (t, att)
flog.write(output)
but no luck so far.
Any clever recursive way to print everything maybe?
Thanks
The following helped me to find out which dict contained unicode values, since a dict can be identified by its keys. The list-case doesn't help.
def find_the_damn_unicode(obj):
    if isinstance(obj, unicode):
        ''' The following conversion probably doesn't do anything meaningful since
        obj is probably a primitive type, thus passed by value. That's why encoding
        is also performed inside the for loops below'''
        obj = obj.encode('utf-8')
        return obj
    if isinstance(obj, dict):
        for k, v in obj.items():
            if isinstance(v, unicode):
                print 'UNICODE value with key ', k
                obj[k] = obj[k].encode('utf-8')
            else:
                obj[k] = find_the_damn_unicode(v)
    if isinstance(obj, list):
        for i, v in enumerate(obj):
            if isinstance(v, unicode):
                print 'UNICODE inside a ... list'
                obj[i] = obj[i].encode('utf-8')
            else:
                obj[i] = find_the_damn_unicode(v)
    return obj
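For the list case, one option is to report the full path to each offending value rather than only the enclosing key. This is a hedged Python 3 sketch with a hypothetical helper named find_paths; it is not part of pymongo or bson. Here str plays the role that unicode plays in the Python 2 code above, and only values are inspected, not dict keys.

```python
def find_paths(obj, wanted_type, path=()):
    """Yield the path (tuple of keys and indexes) to every value of wanted_type."""
    if isinstance(obj, wanted_type):
        yield path
    elif isinstance(obj, dict):
        for key, value in obj.items():
            yield from find_paths(value, wanted_type, path + (key,))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield from find_paths(value, wanted_type, path + (index,))


# Made-up document: bytes values are "fine", str values are the ones to locate.
doc = {"rows": [{"name": b"ok"}, {"name": "suspect", "tags": [b"a", "b"]}]}
for p in find_paths(doc, str):
    print(p)
```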

Finding a key recursively in a dictionary

I'm trying to write a very simple function to recursively search through a possibly nested (in the most extreme cases ten levels deep) Python dictionary and return the first value it finds from the given key.
I cannot understand why my code doesn't work for nested dictionaries.
def _finditem(obj, key):
    if key in obj: return obj[key]
    for k, v in obj.items():
        if isinstance(v,dict):
            _finditem(v, key)
print _finditem({"B":{"A":2}},"A")
It returns None.
It does work, however, for _finditem({"B":1,"A":2},"A"), returning 2.
I'm sure it's a simple mistake but I cannot find it. I feel like there already might be something for this in the standard library or collections, but I can't find that either.
If you are looking for a general explanation of what is wrong with code like this, the canonical is Why does my recursive function return None?. The answers here are mostly specific to the task of searching in a nested dictionary.
When you recurse, you need to return the result of _finditem:
def _finditem(obj, key):
    if key in obj: return obj[key]
    for k, v in obj.items():
        if isinstance(v,dict):
            return _finditem(v, key) #added return statement
To fix the actual algorithm, you need to realize that _finditem returns None if it didn't find anything, so you need to check that explicitly to prevent an early return:
def _finditem(obj, key):
    if key in obj: return obj[key]
    for k, v in obj.items():
        if isinstance(v,dict):
            item = _finditem(v, key)
            if item is not None:
                return item
Of course, that will fail if you have None values in any of your dictionaries. In that case, you could set up a sentinel object() for this function and return that in the case that you don't find anything -- Then you can check against the sentinel to know if you found something or not.
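A minimal sketch of the sentinel idea, in Python 3. The names _MISSING, _find and find_item are hypothetical, and the default parameter is an addition for illustration rather than part of the answer above.

```python
_MISSING = object()  # unique marker that can never be a stored value


def _find(obj, key):
    """Return the first value stored under key, or _MISSING if there is none."""
    if key in obj:
        return obj[key]
    for value in obj.values():
        if isinstance(value, dict):
            item = _find(value, key)
            if item is not _MISSING:
                return item
    return _MISSING


def find_item(obj, key, default=None):
    """Public wrapper: hide the sentinel and return default when nothing is found."""
    result = _find(obj, key)
    return default if result is _MISSING else result


print(find_item({"B": {"A": 2}}, "A"))
print(find_item({"B": {"A": None}}, "A", default="not found"))
print(find_item({"B": {}}, "A", default="not found"))
```

Because the comparison is against the sentinel rather than None, a key whose stored value is None is still reported as found.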
Here's a function that searches a dictionary that contains both nested dictionaries and lists. It creates a list of the values of the results.
def get_recursively(search_dict, field):
"""
Takes a dict with nested lists and dicts,
and searches all dicts for a key of the field
provided.
"""
fields_found = []
for key, value in search_dict.iteritems():
if key == field:
fields_found.append(value)
elif isinstance(value, dict):
results = get_recursively(value, field)
for result in results:
fields_found.append(result)
elif isinstance(value, list):
for item in value:
if isinstance(item, dict):
more_results = get_recursively(item, field)
for another_result in more_results:
fields_found.append(another_result)
return fields_found
Here is a way to do this using a "stack" and the "stack of iterators" pattern (credits to Gareth Rees):
def search(d, key, default=None):
"""Return a value corresponding to the specified key in the (possibly
nested) dictionary d. If there is no item with that key, return
default.
"""
stack = [iter(d.items())]
while stack:
for k, v in stack[-1]:
if isinstance(v, dict):
stack.append(iter(v.items()))
break
elif k == key:
return v
else:
stack.pop()
return default
The print(search({"B": {"A": 2}}, "A")) would print 2.
Just trying to make it shorter:
def get_recursively(search_dict, field):
if isinstance(search_dict, dict):
if field in search_dict:
return search_dict[field]
for key in search_dict:
item = get_recursively(search_dict[key], field)
if item is not None:
return item
elif isinstance(search_dict, list):
for element in search_dict:
item = get_recursively(element, field)
if item is not None:
return item
return None
Here's a Python 3 solution (yield from needs 3.3+, and the typing import needs 3.5+) which can handle lists of lists of dicts.
It also uses duck typing, so it can handle any iterable, or object implementing the 'items' method.
from typing import Iterator
def deep_key_search(obj, key: str) -> Iterator:
""" Do a deep search of {obj} and return the values of all {key} attributes found.
:param obj: Either a dict type object or an iterator.
:return: Iterator of all {key} values found"""
if isinstance(obj, str):
# When duck-typing iterators recursively, we must exclude strings
return
try:
# Assume obj is a like a dict and look for the key
for k, v in obj.items():
if k == key:
yield v
else:
yield from deep_key_search(v, key)
except AttributeError:
# Not a dict type object. Is it iterable like a list?
try:
for v in obj:
yield from deep_key_search(v, key)
except TypeError:
pass # Not iterable either.
Pytest:
import pytest
@pytest.mark.parametrize(
"data, expected, dscr", [
({}, [], "Empty dict"),
({'Foo': 1, 'Bar': 2}, [1], "Plain dict"),
([{}, {'Foo': 1, 'Bar': 2}], [1], "List[dict]"),
([[[{'Baz': 3, 'Foo': 'a'}]], {'Foo': 1, 'Bar': 2}], ['a', 1], "Deep list"),
({'Foo': 1, 'Bar': {'Foo': 'c'}}, [1, 'c'], "Dict of Dict"),
(
{'Foo': 1, 'Bar': {'Foo': 'c', 'Bar': 'abcdef'}},
[1, 'c'], "Contains a non-selected string value"
),
])
def test_deep_key_search(data, expected, dscr):
assert list(deep_key_search(data, 'Foo')) == expected
I couldn't add a comment to the accepted solution proposed by @mgilston because of lack of reputation. The solution doesn't work if the key being searched for is inside a list.
Looping through the elements of the lists and calling the recursive function should extend the functionality to find elements inside nested lists:
def _finditem(obj, key):
if key in obj: return obj[key]
for k, v in obj.items():
if isinstance(v,dict):
item = _finditem(v, key)
if item is not None:
return item
elif isinstance(v,list):
for list_item in v:
item = _finditem(list_item, key)
if item is not None:
return item
print(_finditem({"C": {"B": [{"A":2}]}}, "A"))
I had to create a general-case version that finds a uniquely-specified key (a minimal dictionary that specifies the path to the desired value) in a dictionary that contains multiple nested dictionaries and lists.
For the example below, a target dictionary is created to search, and the key is created with the wildcard "???". When run, it returns the value "D"
from typing import Dict, List
def lfind(query_list:List, target_list:List, targ_str:str = "???"):
for tval in target_list:
#print("lfind: tval = {}, query_list[0] = {}".format(tval, query_list[0]))
if isinstance(tval, dict):
val = dfind(query_list[0], tval, targ_str)
if val:
return val
elif tval == query_list[0]:
return tval
def dfind(query_dict:Dict, target_dict:Dict, targ_str:str = "???"):
for key, qval in query_dict.items():
tval = target_dict[key]
#print("dfind: key = {}, qval = {}, tval = {}".format(key, qval, tval))
if isinstance(qval, dict):
val = dfind(qval, tval, targ_str)
if val:
return val
elif isinstance(qval, list):
return lfind(qval, tval, targ_str)
else:
if qval == targ_str:
return tval
if qval != tval:
break
def find(target_dict:Dict, query_dict:Dict):
result = dfind(query_dict, target_dict)
return result
target_dict = {"A":[
{"key1":"A", "key2":{"key3": "B"}},
{"key1":"C", "key2":{"key3": "D"}}]
}
query_dict = {"A":[{"key1":"C", "key2":{"key3": "???"}}]}
result = find(target_dict, query_dict)
print("result = {}".format(result))
Thought I'd throw my hat in the ring, this will allow for recursive requests on anything that implements a __getitem__ method.
def _get_recursive(obj, args, default=None):
    """Apply successive requests to an obj that implements __getitem__ and
    return result if something is found, else return default"""
    if not args:
        return obj
    try:
        key, *args = args
        # Plain subscription; object itself has no __getitem__ to call.
        _obj = obj[key]
        return _get_recursive(_obj, args, default=default)
    except (KeyError, IndexError, TypeError, AttributeError):
        # TypeError covers objects that cannot be subscripted at all.
        return default

Checking a nested dictionary using a dot notation string "a.b.c.d.e", automatically create missing levels

Given the following dictionary:
d = {"a":{"b":{"c":"winning!"}}}
I have this string (from an external source, and I can't change this metaphor).
k = "a.b.c"
I need to determine if the dictionary has the key 'c', so I can add it if it doesn't.
This works swimmingly for retrieving a dot notation value:
reduce(dict.get, k.split("."), d)
but I can't figure out how to 'reduce' a has_key check or anything like that.
My ultimate problem is this: given "a.b.c.d.e", I need to create all the elements necessary in the dictionary, but not stomp them if they already exist.
You could use an infinite, nested defaultdict:
>>> from collections import defaultdict
>>> infinitedict = lambda: defaultdict(infinitedict)
>>> d = infinitedict()
>>> d['key1']['key2']['key3']['key4']['key5'] = 'test'
>>> d['key1']['key2']['key3']['key4']['key5']
'test'
Given your dotted string, here's what you can do:
>>> import operator
>>> keys = "a.b.c".split(".")
>>> lastplace = reduce(operator.getitem, keys[:-1], d)
>>> lastplace.has_key(keys[-1])
False
You can set a value:
>>> lastplace[keys[-1]] = "something"
>>> reduce(operator.getitem, keys, d)
'something'
>>> d['a']['b']['c']
'something'
... or using recursion:
def put(d, keys, item):
    if "." in keys:
        key, rest = keys.split(".", 1)
        if key not in d:
            d[key] = {}
        put(d[key], rest, item)
    else:
        d[keys] = item
def get(d, keys):
    if "." in keys:
        key, rest = keys.split(".", 1)
        return get(d[key], rest)
    else:
        return d[keys]
How about an iterative approach?
def create_keys(d, keys):
    for k in keys.split("."):
        if not k in d: d[k] = {} #if the key isn't there yet add it to d
        d = d[k] #go one level down and repeat
If you need the last key value to map to anything else than a dictionary you could pass the value as an additional argument and set this after the loop:
def create_keys(d, keys, value):
    keys = keys.split(".")
    for k in keys[:-1]:
        if not k in d: d[k] = {}
        d = d[k]
    d[keys[-1]] = value
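The same loop can be written with dict.setdefault, which also covers the "don't stomp existing values" requirement from the question. This is a sketch in Python 3 with hypothetical names has_path and ensure_path; it assumes every intermediate level that already exists is a dict.

```python
def has_path(d, dotted):
    """Return True if every key of the dotted path exists, level by level."""
    for k in dotted.split("."):
        if not isinstance(d, dict) or k not in d:
            return False
        d = d[k]
    return True


def ensure_path(d, dotted, value):
    """Create missing levels; set value at the leaf only if it is absent."""
    *parents, last = dotted.split(".")
    for k in parents:
        d = d.setdefault(k, {})  # reuse an existing level or create an empty one
    d.setdefault(last, value)    # never overwrites an existing leaf


d = {"a": {"b": {"c": "winning!"}}}
ensure_path(d, "a.b.x.y", 1)        # creates a.b.x, then sets y
ensure_path(d, "a.b.c", "stomped")  # leaf exists, so it is left alone
print(d)
print(has_path(d, "a.b.x.y"), has_path(d, "a.b.c.d"))
```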
I thought this discussion was very useful, but for my purpose to only get a value (not setting it), I ran into issues when a key was not present. So, just to add my flair to the options, you can use reduce in combination of an adjusted dict.get() to accommodate the scenario that the key is not present, and then return None:
from functools import reduce
import re
from typing import Any, Optional
def find_key(dot_notation_path: str, payload: dict) -> Any:
"""Try to get a deep value from a dict based on a dot-notation"""
def get_despite_none(payload: Optional[dict], key: str) -> Any:
"""Try to get value from dict, even if dict is None"""
if not payload or not isinstance(payload, (dict, list)):
return None
# can also access lists if needed, e.g., if key is '[1]'
if (num_key := re.match(r"^\[(\d+)\]$", key)) is not None:
try:
return payload[int(num_key.group(1))]
except IndexError:
return None
else:
return payload.get(key, None)
found = reduce(get_despite_none, dot_notation_path.split("."), payload)
# compare to None, as the key could exist and be empty
if found is None:
raise KeyError()
return found
In my use case, I need to find a key within an HTTP request payload, which can often include lists as well. The following examples work:
payload = {
"haystack1": {
"haystack2": {
"haystack3": None,
"haystack4": "needle"
}
},
"haystack5": [
{"haystack6": None},
{"haystack7": "needle"}
],
"haystack8": {},
}
find_key("haystack1.haystack2.haystack4", payload)
# "needle"
find_key("haystack5.[1].haystack7", payload)
# "needle"
find_key("[0].haystack5.[1].haystack7", [payload, None])
# "needle"
find_key("haystack8", payload)
# {}
find_key("haystack1.haystack2.haystack4.haystack99", payload)
# KeyError
EDIT: added list accessor
d = {"a":{}}
k = "a.b.c".split(".")
def f(d, i):
if i >= len(k):
return "winning!"
c = k[i]
d[c] = f(d.get(c, {}), i + 1)
return d
print f(d, 0)
"{'a': {'b': {'c': 'winning!'}}}"

Serializing a suds object in python

Ok I'm working on getting better with python, so I'm not sure this is the right way to go about what I'm doing to begin with, but here's my current problem...
I need to get some information via a SOAP method, and only use part of the information now but store the entire result for future uses (we need to use the service as little as possible). Looking up the best way to access the service I figured suds was the way to go, and it was simple and worked like a charm to get the data. But now I want to save the result somehow, preferably serialized / in a database so I can pull it out later and use it the same.
What's the best way to do this, it looks like pickle/json isn't an option? Thanks!
Update
Reading the top answer at How can I pickle suds results? gives me a better idea of why this isn't an option, I guess I'm stuck recreating a basic object w/ the information I need?
I have been using following approach to convert Suds object into JSON:
import json
from suds.sudsobject import asdict
def recursive_asdict(d):
"""Convert Suds object into serializable format."""
out = {}
for k, v in asdict(d).items():
if hasattr(v, '__keylist__'):
out[k] = recursive_asdict(v)
elif isinstance(v, list):
out[k] = []
for item in v:
if hasattr(item, '__keylist__'):
out[k].append(recursive_asdict(item))
else:
out[k].append(item)
else:
out[k] = v
return out
def suds_to_json(data):
return json.dumps(recursive_asdict(data))
Yep, I confirm the explanation I gave in the answer you refer to -- dynamically generated classes are not easily picklable (nor otherwise easily serializable), you need to extract all the state information, pickle that state, and reconstruct the tricky sudsobject on retrieval if you really insist on using it;-).
Here is what I came up with before researching and finding this answer. This actually works well for me on complex suds responses and also on other objects such as __builtins__ since the solution is suds agnostic:
import datetime
def object_to_dict(obj):
if isinstance(obj, (str, unicode, bool, int, long, float, datetime.datetime, datetime.date, datetime.time)):
return obj
data_dict = {}
try:
all_keys = obj.__dict__.keys() # vars(obj).keys()
except AttributeError:
return obj
fields = [k for k in all_keys if not k.startswith('_')]
for field in fields:
val = getattr(obj, field)
if isinstance(val, (list, tuple)):
data_dict[field] = []
for item in val:
data_dict[field].append(object_to_dict(item))
else:
data_dict[field] = object_to_dict(val)
return data_dict
This solution works and is actually faster. It also works on objects that don't have the __keylist__ attribute.
I ran a benchmark 100 times on a complex suds output object. This solution's run time was 0.04 to 0.052 seconds (0.045724287 average), while the recursive_asdict solution above ran in 0.082 to 0.102 seconds (0.0829765582 average), so nearly double.
I then went back to the drawing board and reworked the function to get more performance out of it; it also no longer needs the datetime import. It relies on the __keylist__ attribute, so it will not work on other objects such as __builtins__, but it works nicely for suds object output:
def fastest_object_to_dict(obj):
if not hasattr(obj, '__keylist__'):
return obj
data = {}
fields = obj.__keylist__
for field in fields:
val = getattr(obj, field)
if isinstance(val, list): # tuple not used
data[field] = []
for item in val:
data[field].append(fastest_object_to_dict(item))
else:
data[field] = fastest_object_to_dict(val)
return data
The run time was 0.018 to 0.033 seconds (0.0260889721 average), so roughly 3x faster than the recursive_asdict solution.
I made an implementation of a dummy class for the suds Object instance, which can then be serialized. The FakeSudsInstance behaves like an original Suds Object instance, see below:
from suds.sudsobject import Object as SudsObject
class FakeSudsNode(SudsObject):
def __init__(self, data):
SudsObject.__init__(self)
self.__keylist__ = data.keys()
for key, value in data.items():
if isinstance(value, dict):
setattr(self, key, FakeSudsNode(value))
elif isinstance(value, list):
l = []
for v in value:
if isinstance(v, list) or isinstance(v, dict):
l.append(FakeSudsNode(v))
else:
l.append(v)
setattr(self, key, l)
else:
setattr(self, key, value)
class FakeSudsInstance(SudsObject):
def __init__(self, data):
SudsObject.__init__(self)
self.__keylist__ = data.keys()
for key, value in data.items():
if isinstance(value, dict):
setattr(self, key, FakeSudsNode(value))
else:
setattr(self, key, value)
@classmethod
def build_instance(cls, instance):
suds_data = {}
def node_to_dict(node, node_data):
if hasattr(node, '__keylist__'):
keys = node.__keylist__
for key in keys:
if isinstance(node[key], list):
lkey = key.replace('[]', '')
node_data[lkey] = node_to_dict(node[key], [])
elif hasattr(node[key], '__keylist__'):
node_data[key] = node_to_dict(node[key], {})
else:
if isinstance(node_data, list):
node_data.append(node[key])
else:
node_data[key] = node[key]
return node_data
else:
if isinstance(node, list):
for lnode in node:
node_data.append(node_to_dict(lnode, {}))
return node_data
else:
return node
node_to_dict(instance, suds_data)
return cls(suds_data)
Now, after a suds call, for example below:
>>> import cPickle as pickle
>>> suds_intance = client.service.SomeCall(account, param)
>>> fake_suds = FakeSudsInstance.build_instance(suds_intance)
>>> dumped = pickle.dumps(fake_suds)
>>> loaded = pickle.loads(dumped)
I hope it helps.
The solutions suggested above lose valuable information about class names. That information can be of value in some libraries, like the DFP client https://github.com/googleads/googleads-python-lib, where entity types might be encoded in dynamically generated class names (i.e. TemplateCreative/ImageCreative).
Here's the solution I used, which preserves class names and restores dict-serialized objects without data loss (except suds.sax.text.Text, which would be converted into regular unicode objects, and maybe some other types I haven't run into):
from suds.sudsobject import asdict, Factory as SudsFactory
def suds2dict(d):
"""
Suds object serializer
Borrowed from https://stackoverflow.com/questions/2412486/serializing-a-suds-object-in-python/15678861#15678861
"""
out = {'__class__': d.__class__.__name__}
for k, v in asdict(d).iteritems():
if hasattr(v, '__keylist__'):
out[k] = suds2dict(v)
elif isinstance(v, list):
out[k] = []
for item in v:
if hasattr(item, '__keylist__'):
out[k].append(suds2dict(item))
else:
out[k].append(item)
else:
out[k] = v
return out
def dict2suds(d):
"""
Suds object deserializer
"""
out = {}
for k, v in d.iteritems():
if isinstance(v, dict):
out[k] = dict2suds(v)
elif isinstance(v, list):
out[k] = []
for item in v:
if isinstance(item, dict):
out[k].append(dict2suds(item))
else:
out[k].append(item)
else:
out[k] = v
return SudsFactory.object(out.pop('__class__'), out)
I updated the recursive_asdict example above to be compatible with python3 (items instead of iteritems).
from suds.sudsobject import asdict
from suds.sax.text import Text
def recursive_asdict(d):
"""
Recursively convert Suds object into dict.
We convert the keys to lowercase, and convert sax.Text
instances to Unicode.
Taken from:
https://stackoverflow.com/a/15678861/202168
Let's create a suds object from scratch with some lists and stuff
>>> from suds.sudsobject import Object as SudsObject
>>> sudsobject = SudsObject()
>>> sudsobject.Title = "My title"
>>> sudsobject.JustAList = [1, 2, 3]
>>> sudsobject.Child = SudsObject()
>>> sudsobject.Child.Title = "Child title"
>>> sudsobject.Child.AnotherList = ["4", "5", "6"]
>>> childobject = SudsObject()
>>> childobject.Title = "Another child title"
>>> sudsobject.Child.SudObjectList = [childobject]
Now see if this works:
>>> result = recursive_asdict(sudsobject)
>>> result['title']
'My title'
>>> result['child']['anotherlist']
['4', '5', '6']
"""
out = {}
for k, v in asdict(d).items():
k = k.lower()
if hasattr(v, '__keylist__'):
out[k] = recursive_asdict(v)
elif isinstance(v, list):
out[k] = []
for item in v:
if hasattr(item, '__keylist__'):
out[k].append(recursive_asdict(item))
else:
out[k].append(
item.title() if isinstance(item, Text) else item)
else:
out[k] = v.title() if isinstance(v, Text) else v
return out
I like this way. We don't do the iteration ourselves, it is python that iterates when converting it to string
class Ob:
    def __init__(self, J) -> None:
        self.J = J
    def __str__(self):
        if hasattr(self.J, "__keylist__"):
            self.J = {key: Ob(value) for key, value in dict(self.J).items()}
        if hasattr(self.J, "append"):
            self.J = [Ob(data) for data in self.J]
        return str(self.J)
    # dicts and lists render their items with repr(), so reuse __str__ for it
    __repr__ = __str__
result = Ob(result_soap)
