I want to apply a function f to a collection xs but keep its type. If I use map, I get a 'map object':
def apply1(xs, f):
    return map(f, xs)
If I know that xs is something like a list or tuple I can force it to have the same type:
def apply2(xs, f):
    return type(xs)(map(f, xs))
However, that quickly breaks down for namedtuple (which I am currently in the habit of using) -- because to my knowledge a namedtuple needs to be constructed with unpack syntax or by calling its _make function. Also, namedtuples are immutable, so I cannot iterate over all entries and just change them.
Further problems arise from the use of a dict.
Is there a generic way to express such an apply function that works for everything that is iterable?
Looks like a perfect task for the functools.singledispatch decorator:
from functools import singledispatch

@singledispatch
def apply(xs, f):
    return map(f, xs)

@apply.register(list)
def apply_to_list(xs, f):
    return type(xs)(map(f, xs))

@apply.register(tuple)
def apply_to_tuple(xs, f):
    try:
        # handle `namedtuple` case
        constructor = xs._make
    except AttributeError:
        constructor = type(xs)
    return constructor(map(f, xs))
After that, the apply function can simply be used like this:
>>> apply([1, 2], lambda x: x + 1)
[2, 3]
>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(10, 5)
>>> apply(p, lambda x: x ** 2)
Point(x=100, y=25)
I'm not aware of what the desired behavior for dict objects is, but the great thing about this approach is that it is easy to extend.
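For example, if the desired semantics for a dict are to keep the keys and map over the values (an assumption on my part), a handler can be registered like this:

@apply.register(dict)
def apply_to_dict(xs, f):
    return type(xs)((k, f(v)) for k, v in xs.items())

>>> apply({'a': 1, 'b': 2}, lambda x: x + 1)
{'a': 2, 'b': 3}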
I have a hunch you're coming from Haskell -- is that right? (I'm guessing because you use f and xs as variable names.) The answer to your question in Haskell would be "yes, it's called fmap, but it only works with types that have a defined Functor instance."
Python, on the other hand, has no general concept of "Functor." So strictly speaking, the answer is no. To get something like this, you'd have to fall back on other abstractions that Python does provide.
ABCs to the rescue
One pretty general approach would be to use abstract base classes. These provide a structured way to specify and check for particular interfaces. A Pythonic version of the Functor typeclass would be an abstract base class that defines a special fmap method, allowing individual classes to specify how they are to be mapped. But no such thing exists. (I think it would be a really cool addition to Python though!)
Now, you can define your own abstract base classes, so you could create a Functor ABC that expects a fmap interface, but you'd still have to write all your own functorized subclasses of list, dict, and so on, so that's not really ideal.
A better approach would be to use the existing interfaces to cobble together a generic definition of mapping that seems reasonable. You'd have to think pretty carefully about what aspects of the existing interfaces you'd need to combine. Just checking to see whether a type defines __iter__ isn't enough, because as you've already seen, a definition of iteration for a type doesn't necessarily translate into a definition of construction. For example, iterating over a dictionary only gives you the keys, but to map a dictionary in this precise way would require iteration over items.
Concrete examples
Here's a map method, defined on an abstract base class, with special cases for namedtuple and three abstract base classes -- Sequence, Mapping, and Set. It will behave as expected for any type that defines one of those interfaces in the expected way, and it falls back to the generic behavior for other iterables. In the latter case, the output won't have the same type as the input, but at least it will work.
from abc import ABC
from collections.abc import Sequence, Mapping, Set, Iterator

class Mappable(ABC):
    def map(self, f):
        if hasattr(self, '_make'):
            return type(self)._make(f(x) for x in self)
        elif isinstance(self, Sequence) or isinstance(self, Set):
            return type(self)(f(x) for x in self)
        elif isinstance(self, Mapping):
            return type(self)((k, f(v)) for k, v in self.items())
        else:
            return map(f, self)
I've defined this as an ABC because that way you can create new classes that inherit from it. But you can also just call it on an existing instance of any class and it will behave as expected. You could also just use the map method above as a stand-alone function.
>>> from collections import namedtuple
>>>
>>> def double(x):
...     return x * 2
...
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(5, 10)
>>> Mappable.map(p, double)
Point(x=10, y=20)
>>> d = {'a': 5, 'b': 10}
>>> Mappable.map(d, double)
{'a': 10, 'b': 20}
The cool thing about defining an ABC is that you can use it as a "mix-in." Here's a MappablePoint derived from a Point namedtuple:
>>> class MappablePoint(Point, Mappable):
...     pass
...
>>> p = MappablePoint(5, 10)
>>> p.map(double)
MappablePoint(x=10, y=20)
You could also modify this approach slightly in light of Azat Ibrakov's answer, using the functools.singledispatch decorator. (It was new to me -- he should get all credit for this part of the answer, but I thought I'd write it up for the sake of completeness.)
This would look something like the below. Notice that we still have to special-case namedtuples because they break the tuple constructor interface. That hadn't bothered me before, but now it feels like a really annoying design flaw. Also, I set things up so that the final fmap function uses the expected argument order. (I wanted to use mmap instead of fmap because "Mappable" is a more Pythonic name than "Functor" IMO. But mmap is already a built-in library! Darn.)
import functools

@functools.singledispatch
def _fmap(obj, f):
    raise TypeError('obj is not mappable')

@_fmap.register(Sequence)
def _fmap_sequence(obj, f):
    if isinstance(obj, str):
        return ''.join(map(f, obj))
    if hasattr(obj, '_make'):
        return type(obj)._make(map(f, obj))
    else:
        return type(obj)(map(f, obj))

@_fmap.register(Set)
def _fmap_set(obj, f):
    return type(obj)(map(f, obj))

@_fmap.register(Mapping)
def _fmap_mapping(obj, f):
    return type(obj)((k, f(v)) for k, v in obj.items())

def fmap(f, obj):
    return _fmap(obj, f)
A few tests:
>>> fmap(double, [1, 2, 3])
[2, 4, 6]
>>> fmap(double, {1, 2, 3})
{2, 4, 6}
>>> fmap(double, {'a': 1, 'b': 2, 'c': 3})
{'a': 2, 'b': 4, 'c': 6}
>>> fmap(double, 'double')
'ddoouubbllee'
>>> Point = namedtuple('Point', ['x', 'y', 'z'])
>>> fmap(double, Point(x=1, y=2, z=3))
Point(x=2, y=4, z=6)
A final note on breaking interfaces
Neither of these approaches can guarantee that this will work for all things recognized as Sequences, and so on, because the ABC mechanism doesn't check function signatures. This is a problem not only for constructors, but also for all other methods. And it's unavoidable without type annotations.
In practice, however, it probably doesn't matter much. If you find yourself using a tool that breaks interface conventions in weird ways, consider using a different tool. (I'd actually say that goes for namedtuples too, as much as I like them!) This is the "consenting adults" philosophy behind many Python design decisions, and it has worked pretty well for the last couple of decades.
How to increment d['a']['b']['c'][1][2][3] if d is defaultdict of defaultdict without code duplication?
from collections import defaultdict
nested_dict_type = lambda: defaultdict(nested_dict_type)
nested_dict = nested_dict_type()
# incrementation
if type(nested_dict['a']['b']['c']['d'][1][2][3][4][5][6]) != int:
nested_dict['a']['b']['c']['d'][1][2][3][4][5][6] = 0
nested_dict['a']['b']['c']['d'][1][2][3][4][5][6] += 1 # ok, now it contains 1
Here we can see that we duplicated (in the code) a chain of keys 3 times.
Question: Is it possible to write a function inc that will take nested_dict['a']['b']...[6] and do the same job as above? So:
def inc(x):
    if type(x) != int:
        x = 0
    x += 1

inc(nested_dict['a']['b']['c']['d'][1][2][3][4][5][6]) # ok, now it contains 1
Update (20 Aug 2018):
There is still no answer to the question. It's clear that there are options for "how to do what I want", but the question is straightforward: there is a "value", we pass it to a function, and the function modifies it. It looks like it's not possible.
Just a value, without any "additional keys", etc.
If it is so, can we make an answer more generic?
Notes:
What is defaultdict of defaultdicts - SO.
This question is not about "storing of integers in a defaultdict", so I'm not looking for a hierarchy of defaultdicts with an int type at the leaves.
Assume that type (int in the examples) is known in advance / can be even parametrized (including the ability to perform += operator) - the question is how to dereference the object, pass it for modification and store back in the context of defaultdict of defaultdicts.
Is the answer to this question related to the mutability? See example below:
Example:
def inc(x):
    x += 1
d = {'a': int(0)}
inc(d['a'])
# d['a'] == 0, immutable
d = {'a': Int(0)}
inc(d['a'])
# d['a'] == 1, mutated
Where Int is:
class Int:
    def __init__(self, value):
        self.value = value
    def __add__(self, v):
        self.value += v
        return self
    def __repr__(self):
        return str(self.value)
It's not exactly about mutability; it's more about how assignment performs name binding.
When you do x = 0 in your inc function you bind a new object to the name x, and any connection between that name and the previous object bound to that name is lost. That doesn't depend on whether or not x is mutable.
But since x is an item in a mutable object we can achieve what you want by passing the parent mutable object to inc along with the key needed to access the desired item.
from collections import defaultdict

nested_dict_type = lambda: defaultdict(nested_dict_type)
nested_dict = nested_dict_type()

# incrementation
def inc(ref, key):
    if not isinstance(ref[key], int):
        ref[key] = 0
    ref[key] += 1

d = nested_dict['a']['b']['c']['d'][1][2][3][4][5]
inc(d, 6)
print(d)
output
defaultdict(<function <lambda> at 0xb730553c>, {6: 1})
Now we aren't binding a new object, we're merely mutating an existing one, so the original d object gets updated correctly.
BTW, that deeply nested dict is a bit painful to work with. Maybe there's a better way to organize your data... But anyway, one thing that can be handy when working with deep nesting is to use lists or tuples of keys. Eg,
q = nested_dict
keys = 'a', 'b', 'c', 'd', 1, 2, 3, 4, 5
for k in keys:
    q = q[k]
q now refers to nested_dict['a']['b']['c']['d'][1][2][3][4][5]
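If that comes up often, the traversal and the increment can be combined into a small helper (inc_at is a hypothetical name, reusing the inc function from above):

def inc_at(nested, keys):
    # walk to the parent of the leaf, then increment via the last key
    q = nested
    for k in keys[:-1]:
        q = q[k]
    inc(q, keys[-1])

inc_at(nested_dict, ['a', 'b', 'c', 'd', 1, 2, 3, 4, 5, 6])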
You can't have multiple default types with defaultdict. You have the following options:
Nested defaultdict of defaultdict objects indefinitely;
defaultdict of int objects, which likely won't suit your needs;
defaultdict of defaultdict down to a specific level with int defined for the last level, e.g. d = defaultdict(lambda: defaultdict(int)) for a single nesting;
Similar to (3), but for counting you can use collections.Counter instead, i.e. d = defaultdict(Counter).
I recommend the 3rd or 4th options if you are always going to go down to a set level. In other words, a scalar value will only be supplied at the nth level, where n is constant.
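For illustration, here is roughly what options 3 and 4 look like in use (the keys are made up):

from collections import defaultdict, Counter

# option 3: ints at a fixed depth (one nesting level here)
d3 = defaultdict(lambda: defaultdict(int))
d3['a']['b'] += 1  # missing leaves default to 0, so no type check is needed

# option 4: a Counter at the last level
d4 = defaultdict(Counter)
d4['a']['b'] += 1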
Otherwise, one manual option is to have a function perform the type-testing. In this case, try / except may be a good alternative. Here we also define a recursive algorithm to allow you to feed a list of keys rather than defining manual __getitem__ calls.
from collections import defaultdict
from functools import reduce
from operator import getitem

nested_dict_type = lambda: defaultdict(nested_dict_type)
d = nested_dict_type()
d[1][2] = 10

def inc(d_in, L):
    try:
        reduce(getitem, L[:-1], d_in)[L[-1]] += 1
    except TypeError:
        reduce(getitem, L[:-1], d_in)[L[-1]] = 1

inc(d, [1, 2])
inc(d, [1, 3])
print(d)
defaultdict({1: defaultdict({2: 11, 3: 1})})
Let's say I've got a simple class in python
class Wharrgarbl(object):
    def __init__(self, a, b, c, sum, version='old'):
        self.a = a
        self.b = b
        self.c = c
        self.sum = 6
        self.version = version
    def __int__(self):
        return self.sum + 9000
    def __what_goes_here__(self):
        return {'a': self.a, 'b': self.b, 'c': self.c}
I can cast it to an integer very easily
>>> w = Wharrgarbl('one', 'two', 'three', 6)
>>> int(w)
9006
Which is great! But, now I want to cast it to a dict in a similar fashion
>>> w = Wharrgarbl('one', 'two', 'three', 6)
>>> dict(w)
{'a': 'one', 'c': 'three', 'b': 'two'}
What do I need to define for this to work? I tried substituting both __dict__ and dict for __what_goes_here__, but dict(w) resulted in a TypeError: Wharrgarbl object is not iterable in both cases. I don't think simply making the class iterable will solve the problem. I also attempted many googles with as many different wordings of "python cast object to dict" as I could think of but couldn't find anything relevant :{
Also! Notice how calling w.__dict__ won't do what I want because it's going to contain w.version and w.sum. I want to customize the cast to dict in the same way that I can customize the cast to int by using def int(self).
I know that I could just do something like this
>>> w.__what_goes_here__()
{'a': 'one', 'c': 'three', 'b': 'two'}
But I am assuming there is a pythonic way to make dict(w) work since it is the same type of thing as int(w) or str(w). If there isn't a more pythonic way, that's fine too, just figured I'd ask. Oh! I guess since it matters, this is for python 2.7, but super bonus points for a 2.4 old and busted solution as well.
There is another question Overloading __dict__() on python class that is similar to this one but may be different enough to warrant this not being a duplicate. I believe that OP is asking how to cast all the data in his class objects as dictionaries. I'm looking for a more customized approach in that I don't want everything in __dict__ included in the dictionary returned by dict(). Something like public vs private variables may suffice to explain what I'm looking for. The objects will be storing some values used in calculations and such that I don't need/want to show up in the resulting dictionaries.
UPDATE:
I've chosen to go with the asdict route suggested, but it was a tough choice selecting what I wanted to be the answer to the question. Both @RickTeachey and @jpmc26 provided the answer I'm going to roll with, but the former had more info and options, landed on the same result, and was upvoted more, so I went with it. Upvotes all around though, and thanks for the help. I've lurked long and hard on stackoverflow and I'm trying to get my toes in the water more.
There are at least six ways. The preferred way depends on your use case.
Option 1:
Simply add an asdict() method.
Based on the problem description I would very much consider the asdict way of doing things suggested by other answers. This is because it does not appear that your object is really much of a collection:
class Wharrgarbl(object):
    ...
    def asdict(self):
        return {'a': self.a, 'b': self.b, 'c': self.c}
Using the other options below could be confusing for others unless it is very obvious exactly which object members would and would not be iterated or specified as key-value pairs.
Option 1a:
Inherit your class from 'typing.NamedTuple' (or the mostly equivalent 'collections.namedtuple'), and use the _asdict method provided for you.
from typing import NamedTuple

class Wharrgarbl(NamedTuple):
    a: str
    b: str
    c: str
    sum: int = 6
    version: str = 'old'
Using a named tuple is a very convenient way to add lots of functionality to your class with a minimum of effort, including an _asdict method. However, a limitation is that, as shown above, the NT will include all the members in its _asdict.
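For example, with the NamedTuple version above (shown on a recent Python, where _asdict returns a plain dict):

>>> Wharrgarbl('one', 'two', 'three')._asdict()
{'a': 'one', 'b': 'two', 'c': 'three', 'sum': 6, 'version': 'old'}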
If there are members you don't want to include in your dictionary, you'll need to specify which members you want the named tuple _asdict result to include. To do this, you could either inherit from a base namedtuple class using the older collections.namedtuple API:
from collections import namedtuple as nt

class Wharrgarbl(nt("Basegarble", "a b c")):
    # note that the typing info below isn't needed for the old API
    a: str
    b: str
    c: str
    sum: int = 6
    version: str = 'old'
...or you could create a base class using the newer API, and inherit from that, using only the dictionary members in the base class:
from typing import NamedTuple

class Basegarbl(NamedTuple):
    a: str
    b: str
    c: str

class Wharrgarbl(Basegarbl):
    sum: int = 6
    version: str = 'old'
Another limitation is that NT is read-only. This may or may not be desirable.
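For instance, with the NamedTuple version above (a quick sketch; the exact error message may vary by Python version):

>>> w = Wharrgarbl('one', 'two', 'three')
>>> w.a = 'changed'
Traceback (most recent call last):
  ...
AttributeError: can't set attribute
>>> w._replace(a='changed')  # returns a new instance instead of mutating
Wharrgarbl(a='changed', b='two', c='three', sum=6, version='old')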
Option 2:
Implement __iter__.
Like this, for example:
def __iter__(self):
    yield 'a', self.a
    yield 'b', self.b
    yield 'c', self.c
Now you can just do:
dict(my_object)
This works because the dict() constructor accepts an iterable of (key, value) pairs to construct a dictionary. Before doing this, ask yourself the question whether iterating the object as a series of key,value pairs in this manner- while convenient for creating a dict- might actually be surprising behavior in other contexts. E.g., ask yourself the question "what should the behavior of list(my_object) be...?"
Additionally, note that accessing values directly using the get item obj["a"] syntax will not work, and keyword argument unpacking won't work. For those, you'd need to implement the mapping protocol.
Option 3:
Implement the mapping protocol. This allows access-by-key behavior, casting to a dict without using __iter__, and also provides two types of unpacking behavior:
mapping unpacking behavior: {**my_obj}
keyword unpacking behavior, but only if all the keys are strings: dict(**my_obj)
The mapping protocol requires that you provide (at minimum) two methods together: keys() and __getitem__.
class MyKwargUnpackable:
    def keys(self):
        return list("abc")
    def __getitem__(self, key):
        return dict(zip("abc", "one two three".split()))[key]
Now you can do things like:
>>> m=MyKwargUnpackable()
>>> m["a"]
'one'
>>> dict(m) # cast to dict directly
{'a': 'one', 'b': 'two', 'c': 'three'}
>>> dict(**m) # unpack as kwargs
{'a': 'one', 'b': 'two', 'c': 'three'}
As mentioned above, if you are using a new enough version of python you can also unpack your mapping-protocol object into a dictionary comprehension like so (and in this case it is not required that your keys be strings):
>>> {**m}
{'a': 'one', 'b': 'two', 'c': 'three'}
Note that the mapping protocol takes precedence over the __iter__ method when casting an object to a dict directly (without using kwarg unpacking, i.e. dict(m)). So it is possible- and might be sometimes convenient- to cause the object to have different behavior when used as an iterable (e.g., list(m)) vs. when cast to a dict (dict(m)).
But note also that with regular dictionaries, if you cast to a list, it will give the KEYS back, and not the VALUES as you require. If you implement another nonstandard behavior for __iter__ (returning values instead of keys), it could be surprising for other people using your code unless it is very obvious why this would happen.
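To make that precedence concrete, here is a toy sketch (Both is a made-up class) that defines the mapping protocol and a values-yielding __iter__ side by side:

class Both:
    def keys(self):
        return ['a', 'b']
    def __getitem__(self, key):
        return {'a': 1, 'b': 2}[key]
    def __iter__(self):
        # nonstandard: yields VALUES, purely for demonstration
        return iter([1, 2])

>>> dict(Both())  # mapping protocol wins
{'a': 1, 'b': 2}
>>> list(Both())  # __iter__ is used
[1, 2]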
EMPHASIZED: Just because you CAN use the mapping protocol, does NOT mean that you SHOULD do so. Does it actually make sense for your object to be passed around as a set of key-value pairs, or as keyword arguments and values? Does accessing it by key- just like a dictionary- really make sense? Would you also expect your object to have other standard mapping methods such as items, values, get? Do you want to support the in keyword and equality checks (==)?
If the answer to these questions is yes, it's probably a good idea to not stop here, and consider the next option instead.
Option 4:
Look into using the 'collections.abc' module.
Inheriting your class from collections.abc.Mapping or collections.abc.MutableMapping signals to other users that, for all intents and purposes, your class is a mapping* and can be expected to behave that way. It also provides the methods items, values, get and supports the in keyword and equality checks (==) "for free".
You can still cast your object to a dict just as you require, but there would probably be little reason to do so. Because of duck typing, bothering to cast your mapping object to a dict would just be an additional unnecessary step the majority of the time.
This answer from me about how to use ABCs might also be helpful.
As noted in the comments below: it's worth mentioning that doing this the abc way essentially turns your object class into a dict-like class (assuming you use MutableMapping and not the read-only Mapping base class). Everything you would be able to do with dict, you could do with your own class object. This may be, or may not be, desirable.
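For illustration, here is a minimal sketch of option 4 for the OP's class (which members count as mapping keys is my assumption, mirroring the asdict choice above):

from collections.abc import Mapping

class Wharrgarbl(Mapping):
    def __init__(self, a, b, c, sum, version='old'):
        self.a = a
        self.b = b
        self.c = c
        self.sum = 6
        self.version = version
    def __int__(self):
        return self.sum + 9000
    def __getitem__(self, key):
        if key not in ('a', 'b', 'c'):
            raise KeyError(key)
        return getattr(self, key)
    def __iter__(self):
        return iter(('a', 'b', 'c'))
    def __len__(self):
        return 3

With just __getitem__, __iter__, and __len__ defined, dict(Wharrgarbl('one', 'two', 'three', 6)) produces {'a': 'one', 'b': 'two', 'c': 'three'}, and items, values, get, ==, and in all come along for free.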
Also consider looking at the numerical abcs in the numbers module:
https://docs.python.org/3/library/numbers.html
Since you're also casting your object to an int, it might make more sense to essentially turn your class into a full fledged int so that casting isn't necessary.
Option 5:
Look into using the dataclasses module (Python 3.7+ only), which includes a convenient asdict() utility method.
from dataclasses import dataclass, asdict, field, InitVar

@dataclass
class Wharrgarbl(object):
    a: int
    b: int
    c: int
    sum: InitVar[int]  # note: InitVar will exclude this from the dict
    version: InitVar[str] = "old"

    def __post_init__(self, sum, version):
        self.sum = 6  # this looks like an OP mistake?
        self.version = str(version)
Now you can do this:
>>> asdict(Wharrgarbl(1,2,3,4,"X"))
{'a': 1, 'b': 2, 'c': 3}
Option 6:
Use typing.TypedDict, which has been added in python 3.8.
NOTE: option 6 is likely NOT what the OP, or other readers based on the title of this question, are looking for. See additional comments below.
from typing import TypedDict

class Wharrgarbl(TypedDict):
    a: str
    b: str
    c: str
Using this option, the resulting object is a dict (emphasis: it is not a Wharrgarbl). There is no reason at all to "cast" it to a dict (unless you are making a copy).
And since the object is a dict, the initialization signature is identical to that of dict and as such it only accepts keyword arguments or another dictionary.
>>> w = Wharrgarbl(a=1, b=2, c=3)
>>> w
{'a': 1, 'b': 2, 'c': 3}
>>> type(w)
<class 'dict'>
Emphasized: the above "class" Wharrgarbl isn't actually a new class at all. It is simply syntactic sugar for creating typed dict objects with specific keys ONLY and value fields of different types for the type checker. At run time, it is still nothing more than a dict.
As such this option can be pretty convenient for signaling to readers of your code (and also to a type checker such as mypy) that such a dict object is expected to have specific keys with specific value types.
But this means you cannot, for example, add other methods, although you can try:
class MyDict(TypedDict):
    def my_fancy_method(self):
        return "world changing result"
...but it won't work:
>>> MyDict().my_fancy_method()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'my_fancy_method'
* "Mapping" has become the standard "name" of the dict-like duck type
There is no magic method that will do what you want. The answer is simply to name it appropriately. asdict is a reasonable choice for a plain conversion to dict, inspired primarily by namedtuple. However, your method will obviously contain special logic that might not be immediately obvious from that name; you are returning only a subset of the class' state. If you can come up with a slightly more verbose name that communicates the concepts clearly, all the better.
Other answers suggest using __iter__, but unless your object is truly iterable (represents a series of elements), this really makes little sense and constitutes an awkward abuse of the method. The fact that you want to filter out some of the class' state makes this approach even more dubious.
Something like this would probably work:
class MyClass:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __iter__(self):  # overriding this to return tuples of (key, value)
        return iter([('x', self.x), ('y', self.y), ('z', self.z)])

dict(MyClass(5, 6, 7))  # because dict knows how to deal with tuples of (key, value)
I think this will work for you.
class A(object):
    def __init__(self, a, b, c, sum, version='old'):
        self.a = a
        self.b = b
        self.c = c
        self.sum = 6
        self.version = version
    def __int__(self):
        return self.sum + 9000
    def __iter__(self):
        return self.__dict__.iteritems()

a = A(1, 2, 3, 4, 5)
print dict(a)
Output
{'a': 1, 'c': 3, 'b': 2, 'sum': 6, 'version': 5}
Like many others, I would suggest implementing a to_dict() function rather than (or in addition to) allowing casting to a dictionary. I think it makes it more obvious that the class supports that kind of functionality. You could easily implement such a method like this:
def to_dict(self):
    class_vars = vars(MyClass)  # get any "default" attrs defined at the class level
    inst_vars = vars(self)  # get any attrs defined on the instance (self)
    all_vars = dict(class_vars)
    all_vars.update(inst_vars)
    # filter out private attributes
    public_vars = {k: v for k, v in all_vars.items() if not k.startswith('_')}
    return public_vars
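Here is a hedged usage sketch (Config, retries, and timeout are hypothetical names). Note that vars() on the class also picks up public methods such as to_dict itself, so this variant additionally filters out callables, and uses vars(type(self)) so it isn't tied to one class name -- both adjustments to the method above:

class Config:
    retries = 3  # class-level "default" attribute, included

    def __init__(self):
        self.timeout = 30
        self._cache = {}  # filtered out by the underscore check

    def to_dict(self):
        class_vars = vars(type(self))
        inst_vars = vars(self)
        all_vars = {**class_vars, **inst_vars}
        # also drop callables so methods don't end up in the result
        return {k: v for k, v in all_vars.items()
                if not k.startswith('_') and not callable(v)}

>>> Config().to_dict()
{'retries': 3, 'timeout': 30}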
It's hard to say without knowing the whole context of the problem, but I would not override __iter__.
I would implement __what_goes_here__ on the class.
def as_dict(self):
    d = {}  # ...whatever you need...
    return d
I am trying to write a class that is "both" a list or a dict. I want the programmer to be able to both "cast" this object to a list (dropping the keys) or dict (with the keys).
Looking at the way Python currently does the dict() cast: It calls Mapping.update() with the object that is passed. This is the code from the Python repo:
def update(self, other=(), /, **kwds):
    ''' D.update([E, ]**F) -> None. Update D from mapping/iterable E and F.
    If E present and has a .keys() method, does: for k in E: D[k] = E[k]
    If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v
    In either case, this is followed by: for k, v in F.items(): D[k] = v
    '''
    if isinstance(other, Mapping):
        for key in other:
            self[key] = other[key]
    elif hasattr(other, "keys"):
        for key in other.keys():
            self[key] = other[key]
    else:
        for key, value in other:
            self[key] = value
    for key, value in kwds.items():
        self[key] = value
The last subcase of the if statement, where it iterates over other, is the one most people have in mind. However, as you can see, it is also possible to have a keys() method. That, combined with a __getitem__() method, should make it easy for a class to be properly cast to a dictionary:
class Wharrgarbl(object):
    def __init__(self, a, b, c, sum, version='old'):
        self.a = a
        self.b = b
        self.c = c
        self.sum = 6
        self.version = version
    def __int__(self):
        return self.sum + 9000
    def keys(self):
        return ["a", "b", "c"]
    def __getitem__(self, key):
        # have obj["a"] -> obj.a
        return self.__getattribute__(key)
Then this will work:
>>> w = Wharrgarbl('one', 'two', 'three', 6)
>>> dict(w)
{'a': 'one', 'c': 'three', 'b': 'two'}
Here is a very clean and fast solution. I created a function that converts any custom class to a dict:
def convert_to_dict(args: dict):
    json = dict()
    for key, value in args.items():
        # keep only the part of the key after the last underscore
        # (strips prefixes such as name-mangled `_ClassName__attr` keys)
        key_vals = str(key).split("_")
        last_index = len(key_vals)
        json[str(key_vals[last_index - 1])] = value
    return json
What you need to do is supply it with object.__dict__; it will then do the cleaning for you and help you store the result in a database.
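For example, assuming the Wharrgarbl class from the question:

>>> w = Wharrgarbl('one', 'two', 'three', 6)
>>> convert_to_dict(w.__dict__)
{'a': 'one', 'b': 'two', 'c': 'three', 'sum': 6, 'version': 'old'}

Note that, unlike the asdict approaches above, this passes through every instance attribute, including sum and version.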
Going off of the asdict solution, here's a useful mixin for if you want to add asdict to several different classes.
Adapted from: https://www.pythontutorial.net/python-oop/python-mixin/
class DictMixin:
    def asdict(self):
        # build a {name: value} mapping from the attribute names in _attrs
        return self._traverse_dict({key: getattr(self, key) for key in self._attrs})

    def _traverse_dict(self, attributes: dict) -> dict:
        result = {}
        for key, value in attributes.items():
            result[key] = self._traverse(value)
        return result

    def _traverse(self, value):
        if isinstance(value, DictMixin):
            return value.asdict()
        elif isinstance(value, dict):
            return self._traverse_dict(value)
        elif isinstance(value, list):
            return [self._traverse(v) for v in value]
        else:
            return value
Which you can then use:
class FooBar(DictMixin):
    _attrs = ["foo", "hello"]

    def __init__(self):
        self.foo = "bar"
        self.hello = "world"
>>> a = FooBar()
>>> a.asdict()
{
"foo": "bar",
"hello": "world"
}
You can create a folder (e.g. 'Strategy'), then use pickle to save and load the objects of your class:
import pickle
import os

# Load object as dictionary ---------------------------------------------------
def load_object():
    file_path = 'Strategy\\All_Pickles.hd5'
    if not os.path.isfile(file_path):
        return {}
    with open(file_path, 'rb') as file:
        unpickler = pickle.Unpickler(file)
        return dict(unpickler.load())

# Save object as dictionary ---------------------------------------------------
def save_object(name, value):
    file_path = 'Strategy\\All_Pickles.hd5'
    object_dict = load_object()
    with open(file_path, 'wb') as file:
        object_dict[name] = value
        pickle.dump(object_dict, file)
    return True

class MyClass:
    def __init__(self, name):
        self.name = name

    def show(self):
        print(self.name)

save_object('1', MyClass('Test1'))
save_object('2', MyClass('Test2'))
objects = load_object()
obj1 = objects['1']
obj2 = objects['2']
obj1.show()
obj2.show()
I created two objects of one class and called a method of the class. I hope it can help you.
I need to compare hundreds of objects stored in a unique list to find duplicates:
object_list = {Object_01, Object_02, Object_03, Object_04, Object_05, ...}
I've written a custom function, which returns True, if the objects are equal and False if not:
object_01.compare(object_02)
>>> True
Compare method works well, but takes a lot of time per execution. I'm currently using itertools.combinations(x, 2) to iterate through all combinations. I've thought it's a good idea to use a dict for storing already compared objects and create new sets dynamically like:
dct = {'Compared': {}}
dct['Compared'] = set()

import itertools

for a, b in itertools.combinations(x, 2):
    if b.name not in dct['Compared']:
        if compare(a, b) == True:
            #print (a,b)
            key = a.name
            value = b.name
            if key not in dct:
                dct[key] = set()
                dct[key].add(value)
            else:
                dct[key].add(value)
                dct[key].add(key)
            dct['Compared'].add(b)
Current Output:
Compared: {'Object_02', 'Object_01', 'Object_03', 'Object_04', 'Object_05'}
Object_01: {'Object_02', 'Object_03', 'Object_01'}
Object_04: {'Object_05', 'Object_04'}
Object_05: {'Object_04'}
...
I would like to know: Is there a faster way to iterate through all combinations and how to break/prevent the iteration of an object, which is already assigned to a list of duplicates?
Desired Output:
Compared: {'Object_02', 'Object_01', 'Object_03', 'Object_04', 'Object_05'}
Object_01: {'Object_02', 'Object_03', 'Object_01'}
Object_04: {'Object_05', 'Object_04'}
...
Note: Compare method is a c-wrapper. Requirement is to find an algorithm around it.
You don't need to calculate all combinations, you just need to check if a given item is a duplicate:
for i, a in enumerate(x):
    if any(a.compare(b) for b in x[:i]):
        ...  # a is a duplicate of an already seen item, so do something
This is still technically O(n^2), but you've cut out at least half the checks required, and should be a bit faster.
In short, x[:i] returns all items in the list before index i. If the item x[i] appears in that list, you know it's a duplicate. If not, there may be a duplicate after it in the list, but you worry about that when you get there.
Using any is also important here: if it finds any true item, it will immediately stop, without checking the rest of the iterable.
You could also improve the number of checks by removing known duplicates from the list you're checking against:
x_copy = x[:]
removed = 0
for i, a in enumerate(x):
    if any(a.compare(b) for b in x_copy[:i-removed]):
        del x_copy[i-removed]
        removed += 1
        # a is a duplicate of an already seen item, so do something
Note that we use a copy, to avoid changing the sequence we're iterating over, and we need to take account for the number of items we've removed when using indexes.
Next, we just need to figure out how to build the dictionary.
This might be a little more complex. The first step is to figure out exactly which element is a duplicate. This can be done by realising any is just a wrapper around a for loop:
def any(iterable):
    for item in iterable:
        if item: return True
    return False
We can then make a minor change, and pass in a function:
def first(iterable, fn):
    for item in iterable:
        if fn(item): return item
    return None
Now, we change our duplicate finder as follows:
import collections

d = collections.defaultdict(list)
x_copy = x[:]
removed = 0
for i, a in enumerate(x):
    b = first(x_copy[:i-removed], a.compare)
    if b is not None:
        # b is the first occurring duplicate of a
        del x_copy[i-removed]
        removed += 1
        d[b.name].append(a)
    else:
        # we've not seen a yet, but might see it later
        d[a.name].append(a)
This will put every element in the list into a dict(-like). If you only want the duplicates, it's then just a case of getting all the entries with a length greater than 1.
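For example, filtering the resulting dict down to the actual duplicate groups is a one-liner:

duplicates = {name: items for name, items in d.items() if len(items) > 1}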
Group the objects, keyed on their attributes, if you want to find the dups:
class Foo:
    def __init__(self, i, j):
        self.i = i
        self.j = j

object_list = {Foo(1,2), Foo(3,4), Foo(1,2), Foo(3,4), Foo(5,6)}

from collections import defaultdict

d = defaultdict(list)
for obj in object_list:
    d[(obj.i, obj.j)].append(obj)

print(d)
defaultdict(<type 'list'>, {(1, 2): [<__main__.Foo instance at 0x7fa44ee7d098>, <__main__.Foo instance at 0x7fa44ee7d128>],
(5, 6): [<__main__.Foo instance at 0x7fa44ee7d1b8>],
(3, 4): [<__main__.Foo instance at 0x7fa44ee7d0e0>, <__main__.Foo instance at 0x7fa44ee7d170>]})
If not the name then use a tuple to store all the attributes you use to check for comparison.
Or sort the list by the attributes that matter and use groupby to group:
class Foo:
    def __init__(self, i, j):
        self.i = i
        self.j = j

object_list = {Foo(1,2), Foo(3,4), Foo(1,2), Foo(3,4), Foo(5,6)}

from itertools import groupby
from operator import attrgetter

groups = [list(v) for k, v in groupby(sorted(object_list, key=attrgetter("i", "j")), key=attrgetter("i", "j"))]
print(groups)
[[<__main__.Foo instance at 0x7f794a944d40>, <__main__.Foo instance at 0x7f794a944dd0>], [<__main__.Foo instance at 0x7f794a944d88>, <__main__.Foo instance at 0x7f794a944e18>], [<__main__.Foo instance at 0x7f794a944e60>]]
You could also implement __lt__, __eq__ and __hash__ to make your objects sortable and hashable:
class Foo(object):
    def __init__(self, i, j):
        self.i = i
        self.j = j
    def __lt__(self, other):
        return (self.i, self.j) < (other.i, other.j)
    def __hash__(self):
        return hash((self.i, self.j))
    def __eq__(self, other):
        return (self.i, self.j) == (other.i, other.j)

print(set(object_list))
object_list.sort()
print(map(lambda x: (getattr(x, "i"), getattr(x, "j")), object_list))
set([<__main__.Foo object at 0x7fdff2fc08d0>, <__main__.Foo object at 0x7fdff2fc09d0>, <__main__.Foo object at 0x7fdff2fc0810>])
[(1, 2), (1, 2), (3, 4), (3, 4), (5, 6)]
Obviously the attributes need to be hashable, if you had lists you could change to tuples etc..
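For instance, if the class had a hypothetical list-valued attribute tags, converting it to a tuple inside __hash__ would be enough:

def __hash__(self):
    # tuples are hashable even when the underlying attribute is a list
    return hash((self.i, self.j, tuple(self.tags)))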
When we use def, we can use **kwargs and *args to define dynamic inputs to the function.
Is there anything similar for the returned tuples? I've been looking for something that behaves like this:
def foo(data):
    return 2, 1

a, b = foo(5)  # a == 2, b == 1

a = foo(5)  # what I'd like: a == 2

However, if I only declare one value to unpack, it sends the whole tuple over there:

a = foo(5)  # what actually happens: a == (2, 1)
I could use 'if' statements, but I was wondering if there was something less cumbersome. I could also use some hold variable to store that value, but my return value might be kind of large to have just some place holder for that.
Thanks
If you need to fully generalize the return value, you could do something like this:
def function_that_could_return_anything(data):
    # do stuff
    return_args = ['list', 'of', 'return', 'values']
    return_kwargs = {'dict': 0, 'of': 1, 'return': 2, 'values': 3}
    return return_args, return_kwargs

a, b = function_that_could_return_anything(...)
for thing in a:
    ...  # do stuff
for item in b.items():
    ...  # do stuff
In my opinion it would be simpler to just return a dictionary, then access parameters with get():
dict_return_value = foo()
a = dict_return_value.get('key containing a', None)
if a:
    ...  # do stuff with a
I couldn't quite understand exactly what you're asking, so I'll take a couple guesses.
If you want to use a single value sometimes, consider a namedtuple:
from collections import namedtuple

AAndB = namedtuple('AAndB', 'a b')

def foo(data):
    return AAndB(2, 1)

# Unpacking all items.
a, b = foo(5)

# Using a single value.
foo(5).a
Or, if you're using Python 3.x, there's extended iterable unpacking to easily unpack only some of the values:
def foo(data):
    return 3, 2, 1

a, *remainder = foo(5)        # a==3, remainder==[2,1]
a, *remainder, c = foo(5)     # a==3, remainder==[2], c==1
a, b, c, *remainder = foo(5)  # a==3, b==2, c==1, remainder==[]
Sometimes the name _ is used to indicate that you are discarding the value:
a, *_ = foo(5)