I want to add a dictionary that maps an object to a list of objects as an instance variable to a class. What is the idiomatic way to do it in Python? Here's how I've done it:
class MyClass:
    def __init__(self):
        self.myDict = { None : [None] }
You have already hit upon the idiomatic way to do it!
Personally, I'd do it this way if you want the option of starting out with some initial objects and want the intent to be clearer:
class MyClass:
    def __init__(self, objlistdict=None):
        """Takes an optional initial dictionary with objects as keys and lists of objects as values."""  # docstrings are very useful for documentation
        # use the initial dict if one was passed, otherwise start with an empty one
        self.myDict = {} if objlistdict is None else objlistdict
Note that the objects you use as keys need to be hashable. I believe it's also recommended to use immutable objects as keys, even if the mutable object you're using defines a __hash__ method. Objects that compare equal are supposed to have the same hash, so a hash computed from mutable state changes when the object changes, and the dictionary may then be unable to find the value stored under the original key.
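To illustrate the concern, here is a minimal sketch using a hypothetical `Version` class (not from the question) whose hash depends on a mutable attribute. Once the attribute changes, the dictionary can no longer find the entry:

```python
class Version:
    """Hypothetical key type whose hash depends on mutable state."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        return isinstance(other, Version) and self.number == other.number

    def __hash__(self):
        return hash(self.number)


key = Version(1)
lookup = {key: ["some", "objects"]}
found_before = key in lookup    # found while the key is unchanged

key.number = 2                  # mutating the key changes its hash
found_after = key in lookup     # lookup now searches under the new hash
still_stored = len(lookup) == 1 # the entry itself has not gone anywhere

print(found_before, found_after, still_stored)
```

The entry is still in the dictionary, but it was filed under the old hash, which is why immutable keys are the safer choice.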
Related
How can I access a Python object by its address, which I get from id(self.__dict__)?
The background is the following: I want to access one class instance from another (both classes mimic custom dictionaries):
class class1:
    def __init__(self):
        self.a = class2(self)
        # I also tried self.a = class2(self.__dict__) which also does not seem to be a good idea

class class2:
    def __init__(self, parentobject):
        self.parentobject = parentobject
As both classes implement dictionaries, I can't iterate over them because of the infinite self-referencing.
One way to get it working, I thought, would be to pass and save only the address of the "parent" class instance.
But in order to access it, I would somehow need to turn the address back into something I can access.
Edit: Some people are suggesting that this sounds like an XY problem. To me it does too, a bit. So here is the original problem: I want to implement a nested dictionary. The 'outer' dictionary contains only dictionaries of specific different types. These inner dictionaries hold some values, so:
dictionary["parameter1"]
gives something like {"bla": 42, "blubb":15}
but this custom dictionary also allows for executing functions like
dictionary["parameter1"].evaluate()
This function depends on another parameter, parameter2. So to access parameter2, I somehow need either the other dictionary that holds parameter2 or a reference to the outer dictionary. What I tried to do above was to pass self of the outer dictionary to the inner dictionary. To hide this data, I reimplemented __repr__, where I iterate over __dict__ in order to skip the hidden keys. This is where the error is happening. I would be happy to find a better solution, but the most obvious one to me was to pass the address instead of the object.
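For reference, the setup described above can be sketched as follows. This is only an illustration under assumptions: the class names `Outer` and `Inner`, the `add` helper, and the `"offset"` key are hypothetical, and the parent is stored as a plain instance attribute rather than as a dictionary item, so the inherited `__repr__` never visits it.

```python
class Inner(dict):
    """Inner dictionary that keeps a reference to its outer dictionary."""

    def __init__(self, parent, **values):
        super().__init__(**values)
        self._parent = parent  # instance attribute, not a dict item

    def evaluate(self):
        # Look up a sibling parameter through the outer dictionary
        other = self._parent["parameter2"]
        return self["bla"] + other["offset"]


class Outer(dict):
    """Outer dictionary whose values are Inner dictionaries."""

    def add(self, name, **values):
        self[name] = Inner(self, **values)


dictionary = Outer()
dictionary.add("parameter1", bla=42, blubb=15)
dictionary.add("parameter2", offset=1)

print(dictionary["parameter1"])             # {'bla': 42, 'blubb': 15}
print(dictionary["parameter1"].evaluate())  # 43
```

Because the back-reference lives outside the dictionary's items, printing or iterating either dictionary does not recurse into the parent.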
This is a way to do what you're asking, but I'm not sure this works on just any implementation of Python besides the standard CPython, which uses the memory address of objects as their id. That's an implementation detail entirely specific to CPython though. As for whether you should even be trying to do this, I'm not here to judge that:
import ctypes
# create an object
a = [1, 2, 3]
# get the id of the object
b = id(a)
# access the object by its id
c = ctypes.cast(b, ctypes.py_object).value
# print the object
print(c) # output: [1, 2, 3]
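If the aim is only to hold a handle on the parent without a strong reference to it, the standard-library `weakref` module is an alternative that does not depend on `id()` being a memory address. A minimal sketch, with hypothetical class names `Parent` and `Child`:

```python
import weakref


class Child:
    def __init__(self, parent):
        # Store a weak reference instead of the object or its id
        self._parent_ref = weakref.ref(parent)

    @property
    def parent(self):
        # Calling the weak reference returns the object,
        # or None once the object has been garbage collected
        return self._parent_ref()


class Parent:
    def __init__(self):
        self.child = Child(self)


p = Parent()
print(p.child.parent is p)  # True
```

The `repr` of a weak reference does not recurse into the referenced object, so iterating over `__dict__` should not loop. Note that `weakref.ref` works for instances of ordinary classes but not for plain `dict` or `list` objects.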
My dataclass has a field that holds an array of data in a custom type (actually it is a PyROOT std vector). However, for the user it is supposed to be visible as a list. This is simple enough with dataclass getters and setters, that convert the vector to list and vice versa. However, this works only if the user initialises the field with a full list. If the user wants to append to the list, it, obviously, doesn't work, as there is no permanent list associated with the field.
I wonder if there is a way to inhibit the ".append()" call on the field and call instead the vector's push_back()? Or perhaps there is a good Pythonic way to deal with it in general?
The context is, that I need the dataclass fields in the PyROOT format, as later I am storing the data in ROOT TTrees. However, I am creating this interface, so that the user does not need to know ROOT to use the dataclass. I know that I could create both the vector and the list that would hold the same data, but that seems like a waste of memory, and I am not certain how to update the vector each time the list is modified, anyway.
According to the Python Docs, “Lists are mutable sequences, typically used to store collections of homogeneous items (where the precise degree of similarity will vary by application).” (emphasis added)
With that in mind, I would start off with something like this:
from collections.abc import MutableSequence

class ListLike(MutableSequence):
    def __init__(self):
        self.backing_data = object()  # Replace with the type you're using

ListLike()
When you run that code, you’ll get an error such as: TypeError: Can't instantiate abstract class ListLike with abstract methods __delitem__, __getitem__, __len__, __setitem__, insert (the exact wording varies between Python versions). Once you implement those methods, you’ll have a type that acts a lot like list, but isn’t one.
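A minimal sketch of what implementing those five methods might look like. A plain Python `list` stands in for the real backing container here (an assumption, since the actual vector type is not shown); with a real vector, `insert` and the other methods would call its own API instead.

```python
from collections.abc import MutableSequence


class ListLike(MutableSequence):
    def __init__(self, initial=()):
        # Stand-in backing store; replace with the real container type
        self._backing = list(initial)

    def __len__(self):
        return len(self._backing)

    def __getitem__(self, index):
        return self._backing[index]

    def __setitem__(self, index, value):
        self._backing[index] = value

    def __delitem__(self, index):
        del self._backing[index]

    def insert(self, index, value):
        self._backing.insert(index, value)


values = ListLike([1, 2])
values.append(3)       # provided by MutableSequence, implemented via insert()
values.extend([4, 5])  # also provided by the mixin
print(list(values))    # [1, 2, 3, 4, 5]
```

`MutableSequence` supplies `append`, `extend`, `pop`, `remove`, `reverse`, and `+=` on top of the five abstract methods, so `.append()` reaches the backing container without any extra code.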
To make ListLikes act even more like lists, use this code to compare the two:
example_list = list()
example_list_like = ListLike()
list_attributes = dir(example_list)
list_like_attributes = dir(example_list_like)
for attribute in list_attributes:
    if attribute not in list_like_attributes:
        print(f"ListLikes don't have {attribute}")
print("-----------")
for attribute in list_like_attributes:
    if attribute not in list_attributes:
        print(f"lists don't have {attribute}")
and change your implementation accordingly.
Let's say I have this class (simplified for the sake of clarity):
class Foo:
    def __init__(self, creator_id):
        self._id = get_unique_identifier()
        self._owner = creator_id
        self._members = set()
        self._time = datetime.datetime.now()
        get_creator(creator_id).add_foo(self._id)

    def add_member(self, mbr_id):
        self._members.add(mbr_id)
and I want to make a __deepcopy__() method for it. From what I can tell, the way that these copies are generally made is to create a new instance using the same constructor parameters as the old one, however in my case, that will result in a different identifier, a different time, and a different member set, as well as the object being referenced by the creator's object twice, which will result in breakages.
One possible workaround would be to create the new instance and then modify the incorrect internal data to match, but this doesn't fix the issue that the new object's ID will still be present in the creator's data structure. Of course, that could be removed manually, but that wouldn't be clean or logical to follow.
Another workaround is to have an optional copy_from parameter in the constructor, but this would add complexity to the constructor in a way that could be confusing, especially since it would only be used implicitly by the object's __deepcopy__() method. This still looks like the best option if there isn't a better way.
#...
    def __init__(self, creator_id, copy_from=None):
        if isinstance(copy_from, Foo):
            # copy all the parameters manually
            pass
        else:
            # normal constructor
            pass
#...
Basically, I'm looking for something similar to the copy constructor in C++, where I can get a reference to the original object and then copy across its parameters without having to add unwanted complexity to the original constructor.
Any ideas are appreciated. Let me know if I've overlooked something really simple.
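One commonly used pattern, sketched here under assumptions, is to allocate the copy with `cls.__new__(cls)`, which skips `__init__` entirely, and then deep-copy the attributes across. In this sketch `itertools.count` is a hypothetical stand-in for `get_unique_identifier()`, and the registration with the creator is left out because `get_creator` is not shown in the question.

```python
import copy
import datetime
import itertools

_ids = itertools.count(1)  # hypothetical stand-in for get_unique_identifier()


class Foo:
    def __init__(self, creator_id):
        self._id = next(_ids)
        self._owner = creator_id
        self._members = set()
        self._time = datetime.datetime.now()
        # the original also registers the new id with the creator here

    def add_member(self, mbr_id):
        self._members.add(mbr_id)

    def __deepcopy__(self, memo):
        cls = self.__class__
        clone = cls.__new__(cls)  # allocate without running __init__
        memo[id(self)] = clone    # record the clone to handle reference cycles
        for name, value in self.__dict__.items():
            setattr(clone, name, copy.deepcopy(value, memo))
        return clone


original = Foo("creator-1")
original.add_member("member-1")
duplicate = copy.deepcopy(original)
print(duplicate._id == original._id)            # True
print(duplicate._members is original._members)  # False
```

Since `__init__` never runs for the copy, no new identifier is generated and nothing is registered a second time, which keeps the normal constructor free of a `copy_from` parameter.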
Let's suppose I have a class like this:
class MyClass:
    def __init__(self, a):
        self._a = a
And I construct such instances:
obj1 = MyClass(5)
obj2 = MyClass(12)
obj3 = MyClass(5)
Is there a general way to hash my objects such that objects constructed with same values have equal hashes? In this case:
myhash(obj1) != myhash(obj2)
myhash(obj1) == myhash(obj3)
By general I mean a Python function that can work with objects created by any class I can define. For different classes and same values the hash function must return different results, of course; otherwise this question would be about hashing of several arguments instead.
def myhash(obj):
    items = sorted(obj.__dict__.items(), key=lambda it: it[0])
    return hash((type(obj),) + tuple(items))
This solution obviously has limitations:
It assumes that all fields in __dict__ are important.
It assumes that __dict__ is present, e.g. this won't work with __slots__.
It assumes that all values are hashable.
It breaks the Liskov substitution principle.
The question is badly formed for a couple of reasons:
Hashes don't test equality, just inequality. That is, they guarantee that hash(a) != hash(b) implies a != b, but the reverse does not hold true. For example, checking "aKey" in myDict will do a linear search through all keys in myDict that have the same hash as "aKey".
You seem to want to do something with storage. Note that the hash of "aKey" will change between runs, because string hashing is randomized per process by default, so don't write it to a file. See the note at the bottom of the __hash__ documentation for more information.
In general, you need to think carefully about subclasses, hashes, and equality. There is a pit here, so even the official documentation quietly sidesteps what the hash of an instance means. Do note that each instance has a __dict__ for its own attributes and a __class__ with more information.
Hope this helps those who come after you.
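For a single class, the usual way to get equal hashes for equal values is to define `__eq__` and `__hash__` together from the same key. A minimal sketch based on the `MyClass` from the question; the `_key` helper is a hypothetical name introduced here:

```python
class MyClass:
    def __init__(self, a):
        self._a = a

    def _key(self):
        # Include the type so another class with the same values hashes differently
        return (type(self), self._a)

    def __eq__(self, other):
        if not isinstance(other, MyClass):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


obj1 = MyClass(5)
obj2 = MyClass(12)
obj3 = MyClass(5)

print(hash(obj1) == hash(obj3))  # True: equal objects have equal hashes
print(obj1 == obj2)              # False
print(len({obj1, obj2, obj3}))   # 2
```

Unequal objects are not guaranteed to have different hashes, so code should rely on `==` for equality and treat the hash only as a fast pre-check.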
I've been switching from Matlab to NumPy/Scipy, and I think NumPy is great in many aspects.
But one thing I'm not comfortable with is that I cannot find a data structure similar to struct in C/C++.
For example, I may want to do the following thing:
struct Parameters {
    double frame_size_sec;
    double frame_step_sec;
};
The simplest way is to use a dictionary, as follows.
parameters = {"frame_size_sec": 0.0, "frame_step_sec": 0.0}
But in case of a dictionary, unlike struct, any keys may be added. I'd like to restrict keys.
The other option might be using a class as follows. But it also has the same type of problems.
class Parameters:
    frame_size_sec = 0.0
    frame_step_sec = 0.0
From a thread, I saw that there is a data structure called named tuple, which looks great, but the biggest problem with it is that fields are immutable. So it is still different from what I want.
In sum, what would be the best way to use a struct-like object in python?
If you don't need actual memory layout guarantees, user-defined classes can restrict their set of instance members to a fixed list using __slots__. So for example:
class Parameters:  # On Python 2, class Parameters(object):, as __slots__ only applies to new-style classes
    __slots__ = 'frame_size_sec', 'frame_step_sec'
    def __init__(self, frame_size_sec=0., frame_step_sec=0.):
        self.frame_size_sec = float(frame_size_sec)
        self.frame_step_sec = float(frame_step_sec)
gets you a class where on initialization, it's guaranteed to assign two float members, and no one can add new instance attributes (accidentally or on purpose) to any instance of the class.
Please read the caveats at the __slots__ documentation; in inheritance cases for instance, if a superclass doesn't define __slots__, then the subclass will still have __dict__ and therefore can have arbitrary attributes defined on it.
If you need memory layout guarantees and stricter (C) types for variables, you'll want to look at ctypes Structures, but from what you're saying, it sounds like you're just trying to enforce a fixed, limited set of attributes, not specific types or memory layouts.
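A short usage sketch of the `__slots__` class above, showing that existing fields stay mutable while a misspelled or new attribute is rejected. The values and the misspelled name are purely illustrative:

```python
class Parameters:
    __slots__ = ("frame_size_sec", "frame_step_sec")

    def __init__(self, frame_size_sec=0.0, frame_step_sec=0.0):
        self.frame_size_sec = float(frame_size_sec)
        self.frame_step_sec = float(frame_step_sec)


params = Parameters(0.025, 0.010)
params.frame_size_sec = 0.05      # existing fields can be reassigned

try:
    params.frame_sise_sec = 0.1   # typo: not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

print(rejected)  # True
```

Unlike a named tuple, the fields remain writable; unlike a plain class or dictionary, a stray name raises `AttributeError` instead of silently creating a new attribute.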
At the risk of not being very Pythonic, you can create a dictionary with a fixed set of keys by subclassing the dict class and overriding some of its methods (existing values can still be changed, but keys cannot be added or removed):
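def not_supported(*args, **kwargs):
    raise NotImplementedError('ImmutableDict is immutable')

class ImmutableDict(dict):
    # Block every operation that could add or remove keys
    __delitem__ = not_supported
    __setattr__ = not_supported
    update = not_supported
    clear = not_supported
    pop = not_supported
    popitem = not_supported
    setdefault = not_supported  # would otherwise add keys without going through __setitem__

    def __getattr__(self, item):
        # Only called when normal attribute lookup fails: expose keys as attributes
        try:
            return self[item]
        except KeyError:
            raise AttributeError(item)

    def __setitem__(self, key, value):
        # Existing keys may be reassigned; new keys are rejected
        if key in self:
            dict.__setitem__(self, key, value)
        else:
            raise NotImplementedError('ImmutableDict is immutable')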
Some usage examples:
my_dict = ImmutableDict(a=1, b=2)
print(my_dict['a'])  # 1
my_dict['a'] = 3     # works: an existing key can be modified
print(my_dict.a)     # 3 (attribute access works because we defined __getattr__)
my_dict['c'] = 1     # raises NotImplementedError: a new key can't be added