I am writing a small library and want to offer users two ways to call the same functionality: as an instance method and as a static method. Here is a simplified example:
class ClassTimesAdd(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def TimesAdd(self, c):
        return self.a * self.b + c
    @staticmethod
    def TimesAdd(a, b, c):
        return a * b + c
print(ClassTimesAdd.TimesAdd(1, 3, 7))
ins = ClassTimesAdd(2, 5)
print(ins.TimesAdd(7))
As you can see, the earlier definition is overwritten and only the last one is usable. Is there a simple way to make both approaches work?
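One possible way to get both call styles under a single name is a small descriptor that checks whether the attribute is looked up on the class or on an instance. The sketch below is only an illustration: `hybridmethod`, `_times_add_static` and `_times_add_instance` are hypothetical names, not part of the standard library or of the code above.

```python
import functools


class hybridmethod:
    """Hypothetical descriptor: picks an implementation based on how it is accessed."""

    def __init__(self, static_func, instance_func):
        self.static_func = static_func
        self.instance_func = instance_func

    def __get__(self, instance, owner=None):
        if instance is None:
            # Looked up on the class: behave like a static method.
            return self.static_func
        # Looked up on an instance: bind the instance as the first argument.
        return functools.partial(self.instance_func, instance)


def _times_add_static(a, b, c):
    return a * b + c


def _times_add_instance(self, c):
    return self.a * self.b + c


class ClassTimesAdd:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    # One public name, two implementations.
    TimesAdd = hybridmethod(_times_add_static, _times_add_instance)


print(ClassTimesAdd.TimesAdd(1, 3, 7))  # 10
ins = ClassTimesAdd(2, 5)
print(ins.TimesAdd(7))                  # 17
```

Whether one name with two signatures is a good API is a separate question; two differently named methods are usually easier to document.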
I have a function find() that needs to loop through a lot of objects to identify a similar object by comparing a bunch of properties.
class Target:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
class Source:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
def find(target: Target, source_set: set):
    for s in source_set:
        if s.a == target.a:
            if s.b == target.b:
                if s.c == target.c:
                    print("Found!")
source_set = {
    Source(a=1, b=2, c=3),
    Source(a=4, b=2, c=4)
}
target = Target(a=4, b=2, c=4)
find(target, source_set)
The current function is very slow because source_set can contain millions of objects.
The way source_set and its Source objects are created can be adjusted (e.g. the type). The source_set itself is not modified after initialisation.
Each Source object is created from a dict with the same properties. The raw input data for one Source looks like this:
{'a': '1', 'b': '2', 'c': '3'}
The source_set is searched with many targets.
Is there a nice way to be more efficient? I'm hoping to not need to change the data structure.
Without any external libraries, you can define a __hash__ method on each class:
class Target:
    ...
    def __hash__(self):
        return hash(frozenset(self.__dict__.items()))
class Source:
    ...
    def __hash__(self):
        return hash(frozenset(self.__dict__.items()))
Now try:
count = len({hash(target),}.intersection(map(hash, source_set)))
print(count)
# Output
1
Using Pandas:
# Python env: pip install pandas
# Miniconda env: conda install pandas
import pandas as pd
df = pd.DataFrame([s.__dict__ for s in source_set])
sr = pd.Series(target.__dict__)
print(df)
print(sr)
# Output of source_set
   a  b  c
0  4  2  4
1  1  2  3
# Output of target
a    4
b    2
c    4
dtype: int64
Find same rows:
>>> sr.eq(df).all(axis=1).sum()
1
Since the source_set is only created once but searched with many targets (as stated in your question), it pays to invest time up front in building a data structure for the source_set if that makes each later comparison faster.
Python's set provides the desired functionality. In CPython it is implemented as a hash table. To make use of the in operator, the elements in the set and the objects that are tested against the set have to be hashable and comparable, i.e. both provide a __hash__ method that returns equal values for equal objects, and at least one of them provides an __eq__ method.
class Target:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def __hash__(self):
        return hash((self.a, self.b, self.c))
class Source:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def __eq__(self, other):
        return self.a == other.a and self.b == other.b and self.c == other.c
    def __hash__(self):
        return hash((self.a, self.b, self.c))
Now building the set of Source elements is a time investment, because for each element that is added, the __hash__ method is applied.
source_set = {
    Source(a=1, b=2, c=3),
    Source(a=4, b=2, c=4),
}
However, the reward is that checking whether a target is in the source_set now takes constant time on average, whereas your current approach compares the target to each Source in turn, which takes linear time.
target = Target(a=4, b=2, c=4)
target in source_set
# returns True
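If the Source objects are not needed for anything else, the same idea can be sketched without custom classes by indexing plain tuples built from the raw dicts. This is a minimal sketch under assumptions: `raw_rows`, `FIELDS`, `make_key` and `source_keys` are hypothetical names, and converting the string values to int is a guess about the data.

```python
# Hypothetical raw input: one dict per Source, shaped like the example in the question.
raw_rows = [
    {'a': '1', 'b': '2', 'c': '3'},
    {'a': '4', 'b': '2', 'c': '4'},
]

FIELDS = ('a', 'b', 'c')  # properties that take part in the comparison


def make_key(row):
    """Build a hashable key; values are converted to int (an assumption about the data)."""
    return tuple(int(row[name]) for name in FIELDS)


# Built once: linear time and memory in the number of sources.
source_keys = {make_key(row) for row in raw_rows}


def find(target_row):
    """Average constant-time membership test instead of a scan over all sources."""
    return make_key(target_row) in source_keys


print(find({'a': 4, 'b': 2, 'c': 4}))  # True
print(find({'a': 9, 'b': 9, 'c': 9}))  # False
```

Tuples already implement `__hash__` and `__eq__` consistently, so there is no risk of the two methods drifting apart.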
I have a method (dosomething) that defines an attribute (self.b). Dummy code below:
class foo:
    def __init__(self):
        self.a = 1
    def dosomething(self, i):
        self.b = 2 * self.a + i
        return self.b ** 2
testobj = foo()
Attribute a can change - so dosomething is called to determine b given the current value of a.
I want to write a list comprehension like the one below. Except, I need to call dosomething for b to change. The dummy code below would just repeat the current value of self.b 20 times.
[testobj.b for i in range(20)] # pass i to dosomething then store self.b
The quick way would be to just return self.b, but the return statement is already used for another value that's much more complicated. If I could return self.b, then the following statement would work:
[testobj.dosomething(i) for i in range(20)]
Attribute b is just an intermediate value that I want to access. Is there a one liner list comprehension for this situation? I was considering defining a function within the method that returns self.b but, I'm not sure how I would be able to access it properly. So something like foo().dosomething(1).getb() wouldn't work because dosomething(1) evaluates to a number.
class foo:
    def __init__(self):
        self.a = 1
    def dosomething(self, i):
        self.b = 2 * self.a + i
        def getb():
            return self.b
        return self.b ** 2
I guess I should also add that I don't want to return a data structure of different values. It would affect much of my code elsewhere.
Not a good use case for list comprehensions.
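A plain loop keeps the side effect and the read as two visible steps. The sketch below reuses the foo class from the question; `b_values` is a hypothetical name.

```python
class foo:
    def __init__(self):
        self.a = 1

    def dosomething(self, i):
        self.b = 2 * self.a + i
        return self.b ** 2


testobj = foo()

b_values = []
for i in range(20):
    testobj.dosomething(i)      # called for its side effect; the return value is ignored
    b_values.append(testobj.b)  # read the intermediate value it just stored

print(b_values[:3])  # [2, 3, 4]
```

A comprehension is meant to build a list from expressions without side effects, which is why hiding the call to dosomething inside one reads poorly.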
Let's say I have the following class containing a Numpy array a.
class MyClass():
    def __init__(self, a, b):
        self.a = a
        self.other_attributes = b
    def transpose(self):
        return MyClass(self.a.T, self.other_attributes)
Since this "transpose the data, keep the rest unchanged" method will be used quite often, I would like to implement a short-named attribute like NumPy's .T. My problem is that I don't know how to do it without calling .transpose at initialization, i.e., I only want to compute the transpose when it is required, instead of saving it in another attribute. Is this possible?
Use a property to compute attributes. This can also be used to cache computed results for later use.
class MyClass():
    def __init__(self, a, b):
        self.a = a
        self.other_attributes = b
    @property
    def T(self):
        try:
            return self._cached_T  # attempt to read cached attribute
        except AttributeError:
            self._cached_T = self._transpose()  # compute and cache
            return self._cached_T
    def _transpose(self):
        return MyClass(self.a.T, self.other_attributes)
Since Python 3.8, the standard library provides functools.cached_property to automatically cache computed attributes.
from functools import cached_property
class MyClass():
    def __init__(self, a, b):
        self.a = a
        self.other_attributes = b
    @cached_property
    def T(self):
        return self._transpose()
    def _transpose(self):
        return MyClass(self.a.T, self.other_attributes)
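One thing to keep in mind with caching is that the cached value does not follow later changes to the underlying data. The sketch below illustrates this with a hypothetical stand-in class, `Grid`, that stores a list of rows instead of a NumPy array so that it runs with the standard library only; `transpose_calls` is instrumentation added for the demo.

```python
from functools import cached_property


class Grid:
    """Hypothetical stand-in for MyClass: stores rows in a list of lists, not a NumPy array."""

    def __init__(self, rows, other_attributes):
        self.rows = rows
        self.other_attributes = other_attributes
        self.transpose_calls = 0  # instrumentation for this demo only

    @cached_property
    def T(self):
        self.transpose_calls += 1
        return Grid([list(col) for col in zip(*self.rows)], self.other_attributes)


g = Grid([[1, 2, 3], [4, 5, 6]], "meta")

first = g.T
second = g.T
print(first is second)    # True: the second access reuses the cached object
print(g.transpose_calls)  # 1
print(first.rows)         # [[1, 4], [2, 5], [3, 6]]

# The cache does not notice changes to the underlying data ...
g.rows = [[7, 8]]
stale = g.T
print(stale.rows)         # [[1, 4], [2, 5], [3, 6]]

# ... so delete the cached attribute to force a recomputation.
del g.T
print(g.T.rows)           # [[7], [8]]
```

If the data attribute is reassigned often, a plain property that recomputes on every access may be the safer choice.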
A transpose is just a view of the data, so "computing" the transpose at instantiation doesn't actually cost anything.
In [11]: a = np.random.rand(2, 1)
In [12]: a
Out[12]:
array([[0.22316214],
       [0.69797139]])
In [13]: b = a.T
In [14]: a[0] = 1
In [15]: b
Out[15]: array([[1.        , 0.69797139]])
I'm new to Python OOP and for the purpose of this question I have simplified my problem to this:
class Foo:
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def add(self):
        # some arbitrary change
        return self.a + self.b
    def subtract(self):
        # some arbitrary change
        return self.a - self.b
a = Foo(a=1, b=2).add()
b = Foo(a=1, b=3).subtract()
So I have an object with two methods that do different things. To get some output, I have created two separate instances of Foo because the value of b has changed.
Is there a way to set b dynamically and call obj.method() without just listing the calls one after the other? That is, some sort of generic class that I can use to dynamically set the attributes and the methods that are present in the object? Or is there anything built in that I can use?
Edit
Here is another example:
class Foo:
    def __init__(self, a, b):
        self.a = list(a)
        self.b = list(b)
    def method1(self):
        # some arbitrary change in data
        return self.a * 2
    def method2(self):
        return self.b + [5, 6, 4]
a = Foo(a=[1, 2, 3], b=[]).method1()
b = Foo(b=[1, 2, 3], a=[]).method2()
print(a)
print(b)
So here, the input list changes based on the method called. Is there a way for me to package this up so I could feed just one instance some data and then it 'knows' that list a is for method1() and list b is for method2()? I want to use the word reflection but I feel like that might not be accurate.
Again, I'm new to OOP so any advice is appreciated.
class Foo:
    def add(self, a, b):
        return a + b
    def subtract(self, a, b):
        return a - b
fo = Foo()
a = fo.add(1, 2)
b = fo.subtract(1, 3)
You don't need two instances of Foo to achieve this.
Just do something like this:
foo = Foo(a = 1, b = 2)
# Perform addition (now 'a' is 1 and 'b' is 2)
a = foo.add()
# Change 'b'
foo.b = 3
# Now perform subtraction (now 'a' is 1 and 'b' is 3)
b = foo.subtract()
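For the second example in the question, where each method should receive its own input list, one option is to look the methods up by name with the built-in getattr. The following is only a sketch under assumptions: the methods are rewritten to take their data as a parameter, and `jobs` and `results` are hypothetical names.

```python
class Foo:
    def method1(self, data):
        # some arbitrary change in data
        return data * 2

    def method2(self, data):
        return data + [5, 6, 4]


foo = Foo()

# Hypothetical job table: which method should receive which input list.
jobs = {
    'method1': [1, 2, 3],
    'method2': [1, 2, 3],
}

# getattr looks each method up by name on the single instance.
results = {name: getattr(foo, name)(data) for name, data in jobs.items()}

print(results['method1'])  # [1, 2, 3, 1, 2, 3]
print(results['method2'])  # [1, 2, 3, 5, 6, 4]
```

Looking attributes up by name at run time is what is usually meant by reflection in Python, so the term in the question is reasonable.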
I just want to be able to unpack the instance variables of class foo, for example:
x = foo("name", "999", "24", "0.222")
a, b, c, d = *x
a, b, c, d = [*x]
I am not sure as to which is the correct method for doing so when implementing my own __iter__ method, however, the latter is the one that has worked with mixed "success". I say mixed because doing so with the presented code appears to alter the original instance object x, such that it is no longer valid.
class foo:
    def __init__(self, a, b, c, d):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
    def __iter__(self):
        return iter([a, b, c, d])
I have read the myriad posts on this site regarding __iter__, __next__, generators etc., and also a Python book and docs.python.org, and seem unable to figure out what I am not understanding. I've gathered that __iter__ needs to return an iterable (which can just be self, but I am not sure how that works for what I want). I've also tried various ways of implementing __next__ and iterating over vars(foo).items(), either by casting to a list or as a dictionary, with no success.
I don't believe this is a duplicate post, because the only similar questions I've seen present a single list sequence object attribute or employ a range of numbers instead of four non-container variables.
If you want the instance's variables, you should access them through self:
def __iter__(self):
    return iter([self.a, self.b, self.c, self.d])
with this change,
a, b, c, d = list(x)
will get you the variables.
You could take the riskier route of using vars(x) or x.__dict__ and extracting the values. Since Python 3.7 a dict preserves insertion order, so the values come back in the order the attributes were first assigned, but that order changes silently if __init__ is reordered or extra attributes are added. I would say the explicit iterator is definitely better.
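Put together, a minimal runnable version might look like the sketch below. It uses a generator-style __iter__ instead of iter([...]), which is an equivalent variation rather than the code from the answer, and it shows that the instance can be unpacked directly without wrapping it in list().

```python
class foo:
    def __init__(self, a, b, c, d):
        self.a = a
        self.b = b
        self.c = c
        self.d = d

    def __iter__(self):
        # Yield the attributes in a fixed, explicit order.
        yield self.a
        yield self.b
        yield self.c
        yield self.d


x = foo("name", "999", "24", "0.222")

a, b, c, d = x       # unpacking calls iter(x) itself
print(a, b, c, d)    # name 999 24 0.222

first, *rest = x     # starred targets work too; x is unchanged and reusable
print(first, rest)   # name ['999', '24', '0.222']
```

Because __iter__ creates a fresh iterator on every call, the instance can be unpacked any number of times.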
You can store the arguments in an attribute (self.e below) or return them on function call:
class foo:
    def __init__(self, *args):
        self.a, self.b, self.c, self.d = self.e = args
    def __call__(self):
        return self.e
x = foo("name", "999", "24", "0.222")
a, b, c, d = x.e
# or
a, b, c, d = x()