Python: Comparison in function argument - python

I am currently working on the following problem:
I want to write a function that accepts comparisons as arguments, but I want to prevent these comparisons from being evaluated before they reach the function (which leads to potentially unexpected results).
To be more precise, I have the following challenge (first the original one in PySpark, then a similar, more general one):
def test_func(*comparisons):
    for item in comparisons:
        left_hand = item[0]
        comparison = item[1]
        right_hand = item[2]
Example 1:
test_func(F.col('a') == F.col('b'))
left_hand -> F.col('a')
right_hand -> F.col('b')
comparison -> ==
Example 2:
test_func(1 <=> 2)
left_hand -> 1
right_hand -> 2
comparison -> <=>
Right now, the expression is evaluated before it reaches the function, so I cannot split the comparison into its individual parts.
Is this even possible?

The Python operator module provides the operators as functions, so you can pass the operator separately from its operands:
from operator import eq
def test_func(*args):
    # Unpack in one step so that no name is overwritten before it is read.
    left_hand, comparison, right_hand = args
test_func(F.col('a'), eq, F.col('b'))
The variables would be (remember they would still be local to test_func):
left_hand = F.col('a')
comparison = eq -> operator.eq
right_hand = F.col('b')
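The same idea can be tried without PySpark. The following is only a minimal sketch with plain integers; the function name `describe_comparison` is made up for illustration and is not part of the original answer.

```python
import operator


def describe_comparison(left_hand, comparison, right_hand):
    """Return the three parts plus the result of applying the operator."""
    # comparison is an ordinary function such as operator.le, so nothing
    # has been evaluated before this call.
    result = comparison(left_hand, right_hand)
    return left_hand, comparison.__name__, right_hand, result


print(describe_comparison(1, operator.le, 2))  # (1, 'le', 2, True)
```

The caller decides when (and whether) the comparison is actually applied, which is the point of passing the operator as a function.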

As a quick proof of concept:
>>> import operator
>>> class Col:
...     def __init__(self, col):
...         self.col = col
...
...     def __eq__(self, other):
...         return self, operator.eq, other
...
>>> Col('a') == Col('b')
(<__main__.Col object at 0x11134d5b0>, <built-in function eq>, <__main__.Col object at 0x11147cbe0>)
>>> lh, comp, rh = Col('a') == Col('b')
>>> comp(lh.col, rh.col)
False
You'll need to overload all special methods for all operators you want to support, and return the equivalent operator function (or whatever you want, perhaps '==', or a custom object).
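As a rough sketch of that suggestion, the class below overloads several comparison methods and returns a small record instead of a boolean. `Col`, `Comparison`, and `split_comparisons` are hypothetical names chosen for this example; `Col` is a stand-in and not the PySpark column class.

```python
import operator
from typing import Any, Callable, NamedTuple


class Comparison(NamedTuple):
    """An unevaluated comparison: left operand, operator function, right operand."""
    left: Any
    op: Callable[[Any, Any], Any]
    right: Any


class Col:
    """Hypothetical stand-in for a column object."""

    def __init__(self, name):
        self.name = name

    # Each method records the comparison instead of evaluating it.
    def __eq__(self, other):
        return Comparison(self, operator.eq, other)

    def __ne__(self, other):
        return Comparison(self, operator.ne, other)

    def __lt__(self, other):
        return Comparison(self, operator.lt, other)

    def __gt__(self, other):
        return Comparison(self, operator.gt, other)

    # Defining __eq__ sets __hash__ to None, so restore identity hashing.
    __hash__ = object.__hash__


def split_comparisons(*comparisons):
    """Return (left name, operator name, right name or value) per comparison."""
    parts = []
    for left, op, right in comparisons:
        right_part = right.name if isinstance(right, Col) else right
        parts.append((left.name, op.__name__, right_part))
    return parts


print(split_comparisons(Col('a') == Col('b'), Col('a') < 3))
# [('a', 'eq', 'b'), ('a', 'lt', 3)]
```

One caveat of this design: because `==` no longer returns a boolean, an expression such as `if Col('a') == Col('b'):` is always truthy (a non-empty tuple), so such objects should not be used in ordinary conditions.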

Related

Python function argument that switches between IF NOT and plain IF

Example:
s = [1, 2]
def func(argument):
    for x in s:
        argument x == 1:
            print(x)
If the argument stands for the plain if keyword, the result should be 1; if it stands for if not, the result should be 2.
I want to write a function in which I can choose which form of the if statement is used.
How can a function argument select between if not and plain if?
Somewhat reading between the lines here, but you basically want this?
def foo(bar):
    if bar:
        if baz == 1:
            print(baz)
    elif not bar:
        if baz != 1:
            print(baz)
Of course that can be simplified:
def foo(bar=True):
    if (baz == 1) == bar:
        print(baz)
Note that this requires the parentheses, otherwise you'll get a chained comparison, which won't do what you might think.
Alternatively:
from operator import eq, ne
def foo(bar=True):
    op = (ne, eq)[bar]
    if op(baz, 1):
        print(baz)
That (ne, eq)[bar] can be written more explicitly as eq if bar else ne, or if bar: op = eq else: op = ne. operator.eq and operator.ne are the function equivalents to == and !=.
This is also a possible alternative:
def foo(op=eq):
    if op(baz, 1):
        print(baz)
Then call the function like foo(eq), foo(ne), or with any other comparison function you want. This may be a good idea in some circumstances, but maybe not as a general API design for generic functions.
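The snippets above leave `baz` undefined, so here is a self-contained sketch of the last variant applied to the list from the question. The function name `print_matching` and its `target` parameter are assumptions made for this example.

```python
from operator import eq, ne


def print_matching(values, target, op=eq):
    """Print and return the values for which op(value, target) is true."""
    matches = [v for v in values if op(v, target)]
    for v in matches:
        print(v)
    return matches


s = [1, 2]
print_matching(s, 1)      # behaves like "if x == 1", prints 1
print_matching(s, 1, ne)  # behaves like "if not x == 1", prints 2
```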

How to emulate a C-style function pointer with Python functions

Suppose I have a function that is hard-coded to make a substring lowercase, when instances of that substring are found in a larger string, e.g.:
import re
import sys
def process_record(header, needle, pattern):
    sequence = needle
    for idx in [m.start() for m in re.finditer(pattern, needle)]:
        offset = idx + len(pattern)
        sequence = sequence[:idx] + needle[idx:offset].lower() + sequence[offset:]
    sys.stdout.write('%s\n%s\n' % (header, sequence))
This works fine, e.g.:
>>> process_record('>foo', 'ABCDEF', 'BCD')
>foo
AbcdEF
What I'd like to do is generalize this, to pass in a string function (lower, in this case, but it could be any function of a primitive type or class) as a parameter. Something like:
def process_record(header, needle, pattern, fn):
    sequence = needle
    for idx in [m.start() for m in re.finditer(pattern, needle)]:
        offset = idx + len(pattern)
        sequence = sequence[:idx] + needle[idx:offset].fn() + sequence[offset:]
    sys.stdout.write('%s\n%s\n' % (header, sequence))
This doesn't work (which is why I'm asking the question), but hopefully this demonstrates the idea, to try to generalize what the function does in a way that is readable.
One option I suppose is to write a helper function that wraps stringInstance.lower() and passes copies of strings around, which is inefficient and clumsy. I'm hoping there's a more elegant approach that Python experts know about.
With C, for instance, I'd pass a pointer to the function I want to run as a parameter to process_record(), and run the function pointer directly on the variable of interest.
What is the syntax for doing the same when using string primitive functions (or similar on primitive or other classes) in Python?
In general, use this approach:
def call_fn(arg, fn):
    return fn(arg)
call_fn('FOO', str.lower)  # 'foo'
The definition of a method in Python always starts with self as its first argument. By calling the method as an attribute of the class, you can supply the value of that argument yourself.
Your example is a little complex, so I would break this into two different questions:
1) How can you provide functions as arguments?
Functions are objects like everything else, and can be passed around as expected, e.g.:
def apply(val, func):
    # e.g. ("X", str.lower) -> "x"
    #      ("X", lambda x: x * 2) -> "XX"
    return func(val)
In your example, you might do
def process_record(..., func):
    ...
    sequence = ... func(needle[idx:offset]) ...
    ...
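Filled in, that outline might look like the sketch below. It follows the function from the question; returning the result in addition to writing it is an assumption added here to make the function easier to check, and the approach only works for functions that preserve the length of the match (such as `str.lower` or `str.upper` on ASCII text), because positions are taken from the original string.

```python
import re
import sys


def process_record(header, needle, pattern, func):
    """Write header and needle, with func applied to each match of pattern."""
    sequence = needle
    for match in re.finditer(pattern, needle):
        start, end = match.span()
        # func is called like any other function object.
        sequence = sequence[:start] + func(needle[start:end]) + sequence[end:]
    sys.stdout.write('%s\n%s\n' % (header, sequence))
    return sequence


process_record('>foo', 'ABCDEF', 'BCD', str.lower)   # writes ">foo" and "AbcdEF"
process_record('>bar', 'abcabc', 'bc', str.swapcase)  # writes ">bar" and "aBCaBC"
```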
An alternative method that I wouldn't recommend would be something like
def apply_by_name(val, method_name):
    # e.g. ("X", "lower") -> "x"
    return getattr(val, method_name)()
2) How can I apply an effect to each match of a regular expression in a string?
For this I would recommend the re.sub function, which accepts either a string or a function as the replacement.
>>> re.sub('[aeiou]', '!', 'the quick brown fox')
'th! q!!ck br!wn f!x'
def foo(match):
    v = match.group()
    if v == 'i': return '!!!!!!!'
    elif v in 'eo': return v * 2
    else: return v.upper()
>>> re.sub('[aeiou]', foo, 'the quick brown fox')
'thee qU!!!!!!!ck broown foox'
Hope this helps!

"subtracting" strings, classes in python

I am learning about classes in Python. I want the difference between two strings, a sort of subtraction, e.g.:
a = "abcdef"
b = "abcde"
c = a - b
This would give the output f.
I was looking at this class and I am new to this so would like some clarification on how it works.
class MyStr(str):
    def __init__(self, val):
        return str.__init__(self, val)
    def __sub__(self, other):
        if self.count(other) > 0:
            return self.replace(other, '', 1)
        else:
            return self
and this will work in the following way:
>>> a = MyStr('thethethethethe')
>>> b = a - 'the'
>>> a
'thethethethethe'
>>> b
'thethethethe'
>>> b = a - 2 * 'the'
>>> b
'thethethe'
So a string is passed to the class and the constructor is called __init__. This runs the constructor and an object is returned, which contains the value of the string? Then a new subtraction function is created, so that when you use - with the MyStr object it is just defining how subtract works with that class? When sub is called with a string, count is used to check if that string is a substring of the object created. If that is the case, the first occurrence of the passed string is removed. Is this understanding correct?
Edit: basically this class could be reduced to:
class MyStr(str):
    def __sub__(self, other):
        return self.replace(other, '', 1)
Yes, your understanding is entirely correct.
Python will call a .__sub__() method if present on the left-hand operand; if not, a corresponding .__rsub__() method on the right-hand operand can also hook into the operation.
See emulating numeric types for a list of hooks Python supports for providing more arithmetic operators.
Note that the .count() call is redundant; .replace() will not fail if the other string is not present; the whole function could be simplified to:
    def __sub__(self, other):
        return self.replace(other, '', 1)
The reverse version would be:
    def __rsub__(self, other):
        return other.replace(self, '', 1)
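Both hooks can be combined in one class, as in the sketch below. Wrapping the result in `MyStr` is an assumption added here (the versions above return a plain `str`); it lets subtractions be chained.

```python
class MyStr(str):
    """A str subclass where a - b removes the first occurrence of b from a."""

    def __sub__(self, other):
        # Called for MyStr - str; replace() is a no-op if other is absent.
        return MyStr(self.replace(other, '', 1))

    def __rsub__(self, other):
        # Called for str - MyStr, because plain str defines no __sub__.
        return MyStr(other.replace(self, '', 1))


print(MyStr('abcdef') - 'abcde')              # f
print('abcdef' - MyStr('abcde'))              # f
print(MyStr('thethethe') - 'the' - 'the')     # the
```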

How to use infix operators as higher order functions?

Is there any way to use infix operators (like +,-,*,/) as higher order functions in python without creating "wrapper" functions?
def apply(f, a, b):
    return f(a, b)
def plus(a, b):
    return a + b
# This will work fine
apply(plus,1,1)
# Is there any way to get this working?
apply(+,1,1)
You can use the operator module, which has the "wrapper" functions written for you already.
import operator
def apply(f, a, b):
    return f(a, b)
print(apply(operator.add, 1, 1))
Result:
2
You can also define the wrapper using lambda functions, which saves you the trouble of a standalone def:
print(apply(lambda a, b: a + b, 1, 1))
Use operator module and a dictionary:
>>> from operator import add, mul, sub, truediv, mod
>>> dic = {'+': add, '*': mul, '/': truediv, '%': mod, '-': sub}
>>> def apply(op, x, y):
...     return dic[op](x, y)
...
>>> apply('+',1,5)
6
>>> apply('-',1,5)
-4
>>> apply('%',1,5)
1
>>> apply('*',1,5)
5
Note that you can't use +, -, etc directly as they are not valid identifiers in python.
You can use the operator module this way:
import operator
def apply(op, a, b):
    return op(a, b)
print(apply(operator.add, 1, 2))
print(apply(operator.lt, 1, 2))
Output:
3
True
The other solution is to use a lambda function, but "there should be one -- and preferably only one -- obvious way to do it", so I prefer to use the operator module.
You can also use an anonymous function: apply(lambda x, y: x + y, 1, 1)
# Is there any way to get this working?
apply(+,1,1)
No. As others have already mentioned, there are function forms of all of the operators in the operator module. But, you can't use the operators themselves as that is a SyntaxError and there is no way to dynamically change python's core syntax. You can get close though using dictionaries and passing strings:
import operator
_mapping = {'+': operator.add}
def apply(op, *args):
    return _mapping[op](*args)
apply('+', 1, 1)
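A slightly fuller sketch of the dictionary approach for Python 3 follows. The names `OPERATORS` and `apply_op` are invented for this example; note that Python 3 has no `operator.div`, so `/` maps to `operator.truediv`.

```python
import operator

# Map operator symbols (as strings) to their function equivalents.
OPERATORS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,
    '%': operator.mod,
    '<': operator.lt,
}


def apply_op(symbol, a, b):
    """Look up the function for symbol and call it with a and b."""
    try:
        func = OPERATORS[symbol]
    except KeyError:
        raise ValueError(f"unsupported operator: {symbol!r}") from None
    return func(a, b)


print(apply_op('+', 1, 5))  # 6
print(apply_op('/', 1, 4))  # 0.25
print(apply_op('<', 1, 2))  # True
```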
It is possible to give the operators +, -, *, and / special behavior for a class using magic methods, you can read about this here: http://www.rafekettler.com/magicmethods.html
This isn't exactly what you were asking for because this still requires the creation of a method for each operator, but it does allow you to use the operators by symbol in your code. Note that I don't think this is better than the other methods, it is just an illustration of how you can define behavior for operators:
class Prefix:
    def __add__(self, other):
        """ Prefix() + (a, b) == a + b """
        return other[0] + other[1]
    def __sub__(self, other):
        """ Prefix() - (a, b) == a - b """
        return other[0] - other[1]
    def __mul__(self, other):
        """ Prefix() * (a, b) == a * b """
        return other[0] * other[1]
    def __truediv__(self, other):
        """ Prefix() / (a, b) == a / b (Python 3 name; Python 2 used __div__) """
        return other[0] / other[1]
And examples:
>>> prefix = Prefix()
>>> prefix + (12, 3)
15
>>> prefix - (12, 3)
9
>>> prefix * (12, 3)
36
>>> prefix / (12, 3)
4.0
Of course this method can't be used for a more complex prefix equation like * / 6 2 5 because there is no way to define behavior for adjacent operators, which will always give a SyntaxError (except for a few special cases where + or - are interpreted as making the next element positive or negative).

Python, lazy list

Is it possible to have a list be evaluated lazily in Python?
For example
a = 1
list = [a]
print(list)
# [1]
a = 2
print(list)
# [1]
If the list were evaluated lazily, the final line would be [2].
The concept of "lazy" evaluation normally comes with functional languages -- but in those you could not reassign two different values to the same identifier, so, not even there could your example be reproduced.
The point is not about laziness at all -- it is that using an identifier is guaranteed to be identical to getting a reference to the same value that identifier is referencing, and re-assigning an identifier, a bare name, to a different value is guaranteed to make the identifier refer to a different value from then on. The reference to the first value (object) is not lost.
Consider a similar example where re-assignment to a bare name is not in play, but rather any other kind of mutation (for a mutable object, of course -- numbers and strings are immutable), including an assignment to something else than a bare name:
>>> a = [1]
>>> list = [a]
>>> print(list)
[[1]]
>>> a[:] = [2]
>>> print(list)
[[2]]
Since there is no a = ... that reassigns the bare name a, but rather an a[:] = ... that reassigns a's contents, it's trivially easy to make Python as "lazy" as you wish (and indeed it would take some effort to make it "eager"!-)... if laziness vs eagerness had anything to do with either of these cases (which it doesn't;-).
Just be aware of the perfectly simple semantics of "assigning to a bare name" (vs assigning to anything else, which can be variously tweaked and controlled by using your own types appropriately), and the optical illusion of "lazy vs eager" might hopefully vanish;-)
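The two kinds of assignment can be put side by side in a short sketch (the name `outer` is used here instead of `list` to avoid shadowing the built-in):

```python
a = [1]
outer = [a]            # outer holds a reference to the same object as a

a[:] = [2]             # mutates that object in place
print(outer)           # [[2]]  (outer sees the change)

a = [3]                # rebinds the bare name a to a new object
print(outer)           # [[2]]  (outer still refers to the old object)
print(outer[0] is a)   # False
```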
Came across this post when looking for a genuine lazy list implementation, but it sounded like a fun thing to try and work out.
The following implementation does basically what was originally asked for:
from collections.abc import Sequence
class LazyClosureSequence(Sequence):
    def __init__(self, get_items):
        self._get_items = get_items
    def __getitem__(self, i):
        return self._get_items()[i]
    def __len__(self):
        return len(self._get_items())
    def __repr__(self):
        return repr(self._get_items())
You use it like this:
>>> a = 1
>>> l = LazyClosureSequence(lambda: [a])
>>> print(l)
[1]
>>> a = 2
>>> print(l)
[2]
This is obviously horrible.
Python is not really very lazy in general.
You can use generators to emulate lazy data structures (like infinite lists, et cetera), but as far as things like using normal list syntax, et cetera, you're not going to have laziness.
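As a small illustration of the generator approach, the sketch below defines an unbounded sequence whose items are computed only when requested. The function name `squares` is made up for this example.

```python
from itertools import count, islice


def squares():
    """Yield 1, 4, 9, ... without ever building the whole sequence."""
    for n in count(1):
        yield n * n


lazy = squares()                      # nothing has been computed yet
first_five = list(islice(lazy, 5))    # computes only the first five items
print(first_five)                     # [1, 4, 9, 16, 25]
next_one = next(lazy)                 # continues where islice stopped
print(next_one)                       # 36
```

Unlike a list, a generator cannot be indexed or rewound, which is the trade-off the answer above alludes to.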
Here is a read-only lazy list that needs only a pre-defined length and a cache-update function:
import copy
import operator
from collections.abc import Sequence
from functools import partialmethod
from typing import Dict, Iterator, Union

# Note: Value (the element type) and _eq_list (the equality helper) are
# not defined in this snippet.

def _cmp_list(a: list, b: list, op, if_eq: bool, if_long_a: bool) -> bool:
    """utility to implement gt|ge|lt|le class operators"""
    if a is b:
        return if_eq
    for ia, ib in zip(a, b):
        if ia == ib:
            continue
        return op(ia, ib)
    la, lb = len(a), len(b)
    if la == lb:
        return if_eq
    if la > lb:
        return if_long_a
    return not if_long_a

class LazyListView(Sequence):
    def __init__(self, length):
        self._range = range(length)
        self._cache: Dict[int, Value] = {}

    def __len__(self) -> int:
        return len(self._range)

    def __getitem__(self, ix: Union[int, slice]) -> Value:
        length = len(self)
        if isinstance(ix, slice):
            clone = copy.copy(self)
            clone._range = self._range[slice(*ix.indices(length))]  # slicing
            return clone
        else:
            if ix < 0:
                ix += len(self)  # negative indices count from the end
            if not (0 <= ix < length):
                raise IndexError(f"list index {ix} out of range [0, {length})")
            if ix not in self._cache:
                ...  # update cache
            return self._cache[ix]

    def __iter__(self) -> Iterator[Value]:
        for i, _row_ix in enumerate(self._range):
            yield self[i]

    __eq__ = _eq_list
    __gt__ = partialmethod(_cmp_list, op=operator.gt, if_eq=False, if_long_a=True)
    __ge__ = partialmethod(_cmp_list, op=operator.ge, if_eq=True, if_long_a=True)
    __le__ = partialmethod(_cmp_list, op=operator.le, if_eq=True, if_long_a=False)
    __lt__ = partialmethod(_cmp_list, op=operator.lt, if_eq=False, if_long_a=False)

    def __add__(self, other):
        """BREAKS laziness and returns a plain-list"""
        return list(self) + other

    def __mul__(self, factor):
        """BREAKS laziness and returns a plain-list"""
        return list(self) * factor

    __radd__ = __add__
    __rmul__ = __mul__
Note that this class is also discussed in another Stack Overflow post.
