Developing a heuristic to test simple anonymous Python functions for equivalency - python

I know how function comparison works in Python 3 (just comparing address in memory), and I understand why.
I also understand that "true" comparison (do functions f and g return the same result given the same arguments, for any arguments?) is practically impossible.
I am looking for something in between. I want the comparison to work on the simplest cases of identical functions, and possibly some less trivial ones:
lambda x : x == lambda x : x # True
lambda x : 2 * x == lambda y : 2 * y # True
lambda x : 2 * x == lambda x : x * 2 # True or False is fine, but must be stable
lambda x : 2 * x == lambda x : x + x # True or False is fine, but must be stable
Note that I'm interested in solving this problem for anonymous functions (lambda), but wouldn't mind if the solution also works for named functions.
The motivation for this is that inside blist module, it would be nice to verify that two sortedset instances have the same sort function before performing a union, etc. on them.
Named functions are of less interest because I can assume them to be different when they are not identical. After all, suppose someone created two sortedsets with a named function in the key argument. If they intend these instances to be "compatible" for the purposes of set operations, they'd probably use the same function, rather than two separate named functions that perform identical operations.
I can only think of three approaches. All of them seem hard, so any ideas appreciated.
Comparing bytecodes might work but it might be annoying that it's implementation dependent (and hence the code that worked on one Python breaks on another).
Comparing tokenized source code seems reasonable and portable. Of course, it's less powerful (since identical functions are more likely to be rejected).
A solid heuristic borrowed from some symbolic computation textbook is theoretically the best approach. It might seem too heavy for my purpose, but it actually could be a good fit since lambda functions are usually tiny and so it would run fast.
EDIT
A more complicated example, based on the comment by @delnan:
# global variable
fields = ['id', 'name']
def my_function():
    global fields
    s1 = sortedset(key = lambda x : x[fields[0].lower()])
    # some intervening code here
    # ...
    s2 = sortedset(key = lambda x : x[fields[0].lower()])
Would I expect the key functions for s1 and s2 to evaluate as equal?
If the intervening code contains any function call at all, the value of fields may be modified, resulting in different key functions for s1 and s2. Since we clearly won't be doing control flow analysis to solve this problem, we have to evaluate these two lambda functions as different if we are trying to perform this evaluation before runtime. (Even if fields wasn't global, it might have had another name bound to it, etc.) This would severely curtail the usefulness of this whole exercise, since few lambda functions have no dependence on the environment.
EDIT 2:
I realized it's very important to compare the function objects as they exist at runtime. Without that, all the functions that depend on variables from an outer scope cannot be compared, and most useful functions do have such dependencies. Considered at runtime, all functions with the same signature are comparable in a clean, logical way, regardless of what they depend on, whether they are impure, etc.
As a result, I need not just the bytecode but also the global state as of the time the function object was created (presumably __globals__). Then I have to match all variables from outer scope to the values from __globals__.
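A minimal sketch of such a runtime comparison for Python 3 follows. It assumes CPython-style code objects; the helper names _code_key and same_key_function are made up for illustration, and the result is only a heuristic: functions it reports as different may still be equivalent.

```python
import inspect


def _code_key(func):
    """Name-independent summary of a function's compiled body."""
    code = func.__code__
    # Pair each constant with its type so that 1, 1.0 and True stay distinct.
    consts = tuple((type(c), c) for c in code.co_consts)
    return (code.co_code, consts, code.co_names, code.co_argcount)


def same_key_function(f, g):
    """Heuristic: True only when f and g have identical bytecode and their
    outer-scope names currently resolve to equal values."""
    if _code_key(f) != _code_key(g):
        return False
    # getclosurevars resolves referenced names against __globals__ and
    # __closure__ as they are right now, i.e. at comparison time.
    vars_f = inspect.getclosurevars(f)
    vars_g = inspect.getclosurevars(g)
    return (vars_f.nonlocals == vars_g.nonlocals
            and vars_f.globals == vars_g.globals)
```

With this sketch the first two pairs from the question compare equal, while 2 * x versus x * 2 is reported as different. Local variable names do not matter because the bytecode refers to locals by index.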

Edited to check whether external state will affect the sorting function as well as if the two functions are equivalent.
I hacked up dis.dis and friends to output to a global file-like object. I then stripped out line numbers and normalized variable names (without touching constants) and compared the result.
You could clean this up so dis.dis and friends yielded out lines so you wouldn't have to trap their output. But this is a working proof-of-concept for using dis.dis for function comparison with minimal changes.
import types
from opcode import *
_have_code = (types.MethodType, types.FunctionType, types.CodeType,
              types.ClassType, type)
def dis(x):
    """Disassemble classes, methods, functions, or code.
    With no argument, disassemble the last traceback.
    """
    if isinstance(x, types.InstanceType):
        x = x.__class__
    if hasattr(x, 'im_func'):
        x = x.im_func
    if hasattr(x, 'func_code'):
        x = x.func_code
    if hasattr(x, '__dict__'):
        items = x.__dict__.items()
        items.sort()
        for name, x1 in items:
            if isinstance(x1, _have_code):
                print >> out, "Disassembly of %s:" % name
                try:
                    dis(x1)
                except TypeError, msg:
                    print >> out, "Sorry:", msg
                print >> out
    elif hasattr(x, 'co_code'):
        disassemble(x)
    elif isinstance(x, str):
        disassemble_string(x)
    else:
        raise TypeError, \
              "don't know how to disassemble %s objects" % \
              type(x).__name__
def disassemble(co, lasti=-1):
    """Disassemble a code object."""
    code = co.co_code
    labels = findlabels(code)
    linestarts = dict(findlinestarts(co))
    n = len(code)
    i = 0
    extended_arg = 0
    free = None
    while i < n:
        c = code[i]
        op = ord(c)
        if i in linestarts:
            if i > 0:
                print >> out
            print >> out, "%3d" % linestarts[i],
        else:
            print >> out, ' ',
        if i == lasti: print >> out, '-->',
        else: print >> out, ' ',
        if i in labels: print >> out, '>>',
        else: print >> out, ' ',
        print >> out, repr(i).rjust(4),
        print >> out, opname[op].ljust(20),
        i = i+1
        if op >= HAVE_ARGUMENT:
            oparg = ord(code[i]) + ord(code[i+1])*256 + extended_arg
            extended_arg = 0
            i = i+2
            if op == EXTENDED_ARG:
                extended_arg = oparg*65536L
            print >> out, repr(oparg).rjust(5),
            if op in hasconst:
                print >> out, '(' + repr(co.co_consts[oparg]) + ')',
            elif op in hasname:
                print >> out, '(' + co.co_names[oparg] + ')',
            elif op in hasjrel:
                print >> out, '(to ' + repr(i + oparg) + ')',
            elif op in haslocal:
                print >> out, '(' + co.co_varnames[oparg] + ')',
            elif op in hascompare:
                print >> out, '(' + cmp_op[oparg] + ')',
            elif op in hasfree:
                if free is None:
                    free = co.co_cellvars + co.co_freevars
                print >> out, '(' + free[oparg] + ')',
        print >> out
def disassemble_string(code, lasti=-1, varnames=None, names=None,
                       constants=None):
    labels = findlabels(code)
    n = len(code)
    i = 0
    while i < n:
        c = code[i]
        op = ord(c)
        if i == lasti: print >> out, '-->',
        else: print >> out, ' ',
        if i in labels: print >> out, '>>',
        else: print >> out, ' ',
        print >> out, repr(i).rjust(4),
        print >> out, opname[op].ljust(15),
        i = i+1
        if op >= HAVE_ARGUMENT:
            oparg = ord(code[i]) + ord(code[i+1])*256
            i = i+2
            print >> out, repr(oparg).rjust(5),
            if op in hasconst:
                if constants:
                    print >> out, '(' + repr(constants[oparg]) + ')',
                else:
                    print >> out, '(%d)'%oparg,
            elif op in hasname:
                if names is not None:
                    print >> out, '(' + names[oparg] + ')',
                else:
                    print >> out, '(%d)'%oparg,
            elif op in hasjrel:
                print >> out, '(to ' + repr(i + oparg) + ')',
            elif op in haslocal:
                if varnames:
                    print >> out, '(' + varnames[oparg] + ')',
                else:
                    print >> out, '(%d)' % oparg,
            elif op in hascompare:
                print >> out, '(' + cmp_op[oparg] + ')',
        print >> out
def findlabels(code):
    """Detect all offsets in a byte code which are jump targets.
    Return the list of offsets.
    """
    labels = []
    n = len(code)
    i = 0
    while i < n:
        c = code[i]
        op = ord(c)
        i = i+1
        if op >= HAVE_ARGUMENT:
            oparg = ord(code[i]) + ord(code[i+1])*256
            i = i+2
            label = -1
            if op in hasjrel:
                label = i+oparg
            elif op in hasjabs:
                label = oparg
            if label >= 0:
                if label not in labels:
                    labels.append(label)
    return labels
def findlinestarts(code):
    """Find the offsets in a byte code which are start of lines in the source.
    Generate pairs (offset, lineno) as described in Python/compile.c.
    """
    byte_increments = [ord(c) for c in code.co_lnotab[0::2]]
    line_increments = [ord(c) for c in code.co_lnotab[1::2]]
    lastlineno = None
    lineno = code.co_firstlineno
    addr = 0
    for byte_incr, line_incr in zip(byte_increments, line_increments):
        if byte_incr:
            if lineno != lastlineno:
                yield (addr, lineno)
                lastlineno = lineno
            addr += byte_incr
        lineno += line_incr
    if lineno != lastlineno:
        yield (addr, lineno)
class FakeFile(object):
    """Minimal file-like object that records everything written to it."""
    def __init__(self):
        self.store = []
    def write(self, data):
        self.store.append(data)
a = lambda x : x
b = lambda x : x # True
c = lambda x : 2 * x
d = lambda y : 2 * y # True
e = lambda x : 2 * x
f = lambda x : x * 2 # True or False is fine, but must be stable
g = lambda x : 2 * x
h = lambda x : x + x # True or False is fine, but must be stable
funcs = a, b, c, d, e, f, g, h
outs = []
for func in funcs:
    out = FakeFile()  # dis() and friends print to this global
    dis(func)
    outs.append(out.store)
import ast
def outfilter(out):
    for i in out:
        # drop line numbers, offsets and raw opargs
        if i.strip().isdigit():
            continue
        # keep literal constants, but normalize every variable name to "(x)"
        if '(' in i:
            try:
                ast.literal_eval(i)
            except ValueError:
                i = "(x)"
        yield i
# LOAD_GLOBAL reads global state, LOAD_DEREF reads a closure variable
processed_outs = [(out, 'LOAD_GLOBAL' in out or 'LOAD_DEREF' in out)
                  for out in (''.join(outfilter(out)) for out in outs)]
for (out1, polluted1), (out2, polluted2) in zip(processed_outs[::2], processed_outs[1::2]):
    print 'Bytecode Equivalent:', out1 == out2, '\nPolluted by state:', polluted1 or polluted2
The output is True, True, False, and False and is stable. The "Polluted" bool is true if the output will depend on external state -- either global state or a closure.
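On Python 3 the same "polluted by state" check can be sketched without patching dis, because dis.get_instructions returns structured instructions. The function name depends_on_outer_state is hypothetical, and the opcode set is an assumption that covers plain global and closure reads in CPython rather than every opcode of every version.

```python
import dis

# Opcodes that read a name from outside the function's own locals
# (assumed set; other interpreter versions may add further variants).
_OUTER_READS = {"LOAD_GLOBAL", "LOAD_NAME", "LOAD_DEREF"}


def depends_on_outer_state(func):
    """True if func reads a global, a builtin or a closure variable."""
    return any(ins.opname in _OUTER_READS
               for ins in dis.get_instructions(func))
```

Note that a builtin such as len also counts as outer state here, since it is looked up through LOAD_GLOBAL and could be rebound.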

So, let's address some technical issues first.
1) Byte code: it is probably not a problem because, instead of inspecting the .pyc (binary) files, you can use the dis module to get the "bytecode", e.g.
>>> import dis
>>> f = lambda x, y : x+y
>>> dis.dis(f)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                1 (y)
              6 BINARY_ADD
              7 RETURN_VALUE
No need to worry about the platform, although the opcodes themselves can differ between Python versions and implementations.
2) Tokenized source code. Again, Python has all you need to do the job. You can use the ast module to parse the code and obtain the AST.
>>> import ast
>>> a = ast.parse("f = lambda x, y : x+y")
>>> ast.dump(a)
"Module(body=[Assign(targets=[Name(id='f', ctx=Store())], value=Lambda(args=arguments(args=[Name(id='x', ctx=Param()), Name(id='y', ctx=Param())], vararg=None, kwarg=None, defaults=[]), body=BinOp(left=Name(id='x', ctx=Load()), op=Add(), right=Name(id='y', ctx=Load()))))])"
So, the question we should really address is: is it feasible to determine that two functions are equivalent analytically?
It is easy for a human to see that 2*x equals x+x, but how can we create an algorithm to prove it?
If it is what you want to achieve, you may want to check this out: http://en.wikipedia.org/wiki/Computer-assisted_proof
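Short of a proof, a purely syntactic check is cheap. The sketch below parses the source text of two lambdas with ast and renames positional parameters by position, so that lambda x: 2*x and lambda y: 2*y compare equal while 2*x and x+x do not. The names _RenameParams and normalized_lambda_dump are illustrative, and the sketch ignores keyword-only parameters and nested lambdas that shadow a name.

```python
import ast


class _RenameParams(ast.NodeTransformer):
    """Rename parameters and their uses according to a mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node


def normalized_lambda_dump(source):
    """Dump a lambda's AST with positional parameters renamed to _0, _1, ..."""
    node = ast.parse(source.strip(), mode="eval").body
    if not isinstance(node, ast.Lambda):
        raise ValueError("expected a single lambda expression")
    mapping = {a.arg: "_%d" % i for i, a in enumerate(node.args.args)}
    return ast.dump(_RenameParams(mapping).visit(node))
```

Two lambdas are then treated as equal when their normalized dumps are equal strings.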
However, if ultimately you simply want to assert that two different data sets are sorted in the same order, you just need to run sort function A on dataset B and vice versa, and then check the outcome. If the results are identical, then the functions are probably functionally identical. Of course, the check is only valid for those datasets.
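That empirical check fits in a few lines. This is only a sketch; keys_agree_on is a made-up name, and agreement on the samples says nothing about other inputs.

```python
def keys_agree_on(key_a, key_b, *datasets):
    """True if both key functions sort every sample dataset identically."""
    return all(sorted(data, key=key_a) == sorted(data, key=key_b)
               for data in datasets)


print(keys_agree_on(lambda x: 2 * x, lambda x: x + x, [3, 1, 2], [10, -5, 0]))  # True
print(keys_agree_on(lambda x: x, lambda x: -x, [3, 1, 2]))                      # False
```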

Related

Evaluate an (almost algebraic) expression without the '*' symbol in python

I have the following content in the value.txt:
2A-25X-8A+34X-5B+11B
If I run MetaFont from a bash terminal as shown below:
#mf
This is METAFONT, Version 2.7182818 (TeX Live 2019/Arch Linux) (preloaded base=mf)
**expr
(/usr/share/texmf-dist/fonts/source/public/knuth-lib/expr.mf
gimme an expr: 2A-25X-8A+34X-5B+11B
>> 6B+9X-6A
gimme an expr:
I can evaluate the expression without the '*' symbol between letters and numbers.
What I want is to do this using Python as cleanly and economically as possible but still without using '*'.
I haven't found anything about it yet.
I also hope it is a syntax that can be implemented with with open, print = and r.
EDIT
A possible idea would be like this:
with open ("value.txt", "r") as value:
    data = value.read()
    #some python method for evaluate value.txt expression and save in variable value2
print = (value2)
Always interested in questions regarding parsing arithmetic. Here is a pyparsing-based solution (albeit a bit longer than you were hoping, and using more than just with, open, etc.).
The first 30 lines define a class for tallying up the variables, with support for adding, subtracting, and multiplying by an integer. (Integers are modeled as a Tally with a variable of ''.)
The next 30 lines define the actual parser, and the parse-time actions to convert the parsed tokens into cumulative Tally objects.
The final 25 lines are tests, including your sample expression.
The real "smarts" of the parser are in the infixNotation method, which implements the parsing of the various operators, including handling of operator precedence and grouping
with ()'s. The use of "3A" to indicate "3 times A" is done by passing None as the multiplication operator. This also supports constructs like "2(A+2B)" to give "2A+4B".
import pyparsing as pp
# special form of dict to support addition, subtraction, and multiplication, plus a nice repr
class Tally(dict):
    def __add__(self, other):
        ret = Tally(**self)
        for k, v in other.items():
            ret[k] = ret.get(k, 0) + v
            if k and ret[k] == 0:
                ret.pop(k)
        return ret
    def __mul__(self, other):
        if self[''] == 0:
            return Tally()
        ret = Tally(**other)
        for k in ret:
            ret[k] *= self['']
        return ret
    def __sub__(self, other):
        return self + MINUS_1 * other
    def __repr__(self):
        ret = ''.join("{}{}{}".format("+" if coeff > 0 else "-", str(abs(coeff)) if abs(coeff) != 1 else "", var)
                      for var, coeff in sorted(self.items()) if coeff)
        # leading '+' signs are unnecessary
        ret = ret.lstrip("+")
        return ret
MINUS_1 = Tally(**{'': -1})
var = pp.oneOf(list("ABCDEFGHIJKLMNOPQRSTUVWXYZ"))
# convert var to a Tally of 1
var.addParseAction(lambda t: Tally(**{t[0]: 1}))
integer = pp.pyparsing_common.integer().addParseAction(lambda tokens: Tally(**{'': tokens[0]}))
def add_terms(tokens):
    parsed = tokens[0]
    ret = parsed[0]
    for op, term in zip(parsed[1::2], parsed[2::2]):
        if op == '-':
            ret -= term
        else:
            ret += term
    return ret
def mult_terms(tokens):
    coeff, var = tokens[0]
    return coeff * var
# only the leading minus needs to be handled this way, all others are handled
# as binary subtraction operators
def leading_minus(tokens):
    parsed = tokens[0]
    return MINUS_1 * parsed[1]
leading_minus_sign = pp.StringStart() + "-"
operand = var | integer
expr = pp.infixNotation(operand,
    [
    (leading_minus_sign, 1, pp.opAssoc.RIGHT, leading_minus),
    (None, 2, pp.opAssoc.LEFT, mult_terms),
    (pp.oneOf("+ -"), 2, pp.opAssoc.LEFT, add_terms),
    ])
expr.runTests("""\
B
B+C
B+C+3B
2A
-2A
-3Z+42B
2A+4A-6A
2A-25X-8A+34X-5B+11B
3(2A+B)
-(2A+B)
-3(2A+B)
2A+12
12
-12
2A-12
(5-3)(A+B)
(3-3)(A+B)
""")
Gives the output (runTests echoes each test line, followed by the parsed result):
B
[B]
B+C
[B+C]
B+C+3B
[4B+C]
2A
[2A]
-2A
[-2A]
-3Z+42B
[42B-3Z]
2A+4A-6A
[]
2A-25X-8A+34X-5B+11B
[-6A+6B+9X]
3(2A+B)
[6A+3B]
-(2A+B)
[-2A-B]
-3(2A+B)
[-6A-3B]
2A+12
[12+2A]
12
[12]
-12
[-12]
2A-12
[-12+2A]
(5-3)(A+B)
[2A+2B]
(3-3)(A+B)
[]
To show how to use expr to parse your expression string, see this code:
result = expr.parseString("2A-25X-8A+34X-5B+11B")
print(result)
print(result[0])
print(type(result[0]))
# convert back to dict
print({**result[0]})
Prints:
[-6A+6B+9X]
-6A+6B+9X
<class '__main__.Tally'>
{'B': 6, 'A': -6, 'X': 9}
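For flat expressions like the one in value.txt (no parentheses), a much shorter standard-library alternative is possible. The sketch below is a different technique from the pyparsing grammar above: it scans signed terms with a regular expression and sums the coefficients per variable. The names TERM and collect_terms are illustrative, and the input is not validated.

```python
import re
from collections import Counter

# One term: optional sign, optional integer coefficient, optional variable.
TERM = re.compile(r"([+-]?)(\d*)([A-Z]?)")


def collect_terms(expression):
    """Sum coefficients per variable; plain constants are stored under ''."""
    totals = Counter()
    for sign, digits, var in TERM.findall(expression.replace(" ", "")):
        if not digits and not var:
            continue  # empty match or a stray sign
        value = int(digits) if digits else 1
        totals[var] += -value if sign == "-" else value
    # drop variables whose coefficients cancelled out
    return {var: n for var, n in totals.items() if n}


print(collect_terms("2A-25X-8A+34X-5B+11B"))  # {'A': -6, 'X': 9, 'B': 6}
```

Reading the expression from a file would then be a matter of passing the result of value.read() to collect_terms inside the with open block.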

forcing ndiff on very dissimilar strings

The ndiff function from difflib allows a nice interface to detect differences in lines. It does a great job when the lines are close enough:
>>> print '\n'.join(list(ndiff(['foo*'], ['foot'], )))
- foo*
?    ^
+ foot
?    ^
But when the lines are too dissimilar, the rich reporting is no longer possible:
>>> print '\n'.join(list(ndiff(['foo'], ['foo*****'], )))
- foo
+ foo*****
This is the use case I am hitting, and I am trying to find ways to use ndiff (or the underlying class Differ) to force the reporting even if the strings are too dissimilar.
For the failing example, I would like to have a result like:
>>> print '\n'.join(list(ndiff(['foo'], ['foo*****'], )))
- foo
+ foo*****
?    +++++
The function responsible for printing the context (i.e. those lines starting with ?) is Differ._fancy_replace. That function works by checking whether the two lines are equal by at least 75% (see the cutoff variable). Unfortunately, that 75% cutoff is hard-coded and cannot be changed.
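The two examples from the question sit on either side of that cutoff, which can be checked with SequenceMatcher directly. This is a small sketch; line_similarity is a made-up helper, and it passes None where Differ passes its charjunk filter, which makes no difference for strings without blanks.

```python
from difflib import SequenceMatcher


def line_similarity(a, b):
    """Ratio in [0, 1] that difflib computes for two lines: 2*M/T, where M is
    the number of matched characters and T the combined length of both lines."""
    return SequenceMatcher(None, a, b).ratio()


print(line_similarity("foo*", "foot"))     # 0.75    (2*3/8, hints are shown)
print(line_similarity("foo", "foo*****"))  # 0.5454... (2*3/11, below the cutoff)
```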
What I can suggest is to subclass Differ and provide a version of _fancy_replace that simply ignores the cutoff. Here it is:
from difflib import Differ, SequenceMatcher
class FullContextDiffer(Differ):
    def _fancy_replace(self, a, alo, ahi, b, blo, bhi):
        """
        Copied and adapted from https://github.com/python/cpython/blob/3.6/Lib/difflib.py#L928
        """
        best_ratio = 0
        cruncher = SequenceMatcher(self.charjunk)
        for j in range(blo, bhi):
            bj = b[j]
            cruncher.set_seq2(bj)
            for i in range(alo, ahi):
                ai = a[i]
                if ai == bj:
                    continue
                cruncher.set_seq1(ai)
                if cruncher.real_quick_ratio() > best_ratio and \
                        cruncher.quick_ratio() > best_ratio and \
                        cruncher.ratio() > best_ratio:
                    best_ratio, best_i, best_j = cruncher.ratio(), i, j
        yield from self._fancy_helper(a, alo, best_i, b, blo, best_j)
        aelt, belt = a[best_i], b[best_j]
        atags = btags = ""
        cruncher.set_seqs(aelt, belt)
        for tag, ai1, ai2, bj1, bj2 in cruncher.get_opcodes():
            la, lb = ai2 - ai1, bj2 - bj1
            if tag == 'replace':
                atags += '^' * la
                btags += '^' * lb
            elif tag == 'delete':
                atags += '-' * la
            elif tag == 'insert':
                btags += '+' * lb
            elif tag == 'equal':
                atags += ' ' * la
                btags += ' ' * lb
            else:
                raise ValueError('unknown tag %r' % (tag,))
        yield from self._qformat(aelt, belt, atags, btags)
        yield from self._fancy_helper(a, best_i+1, ahi, b, best_j+1, bhi)
And here is an example of how it works:
a = [
'foo',
'bar',
'foobar',
]
b = [
'foo',
'bar',
'barfoo',
]
print('\n'.join(FullContextDiffer().compare(a, b)))
# Output:
#
# foo
# bar
# - foobar
# ? ---
#
# + barfoo
# ? +++
It seems what you want to do here is not to compare across multiple lines, but across strings. You can then pass your strings directly, without a list, and you should get a behaviour close to the one you are looking for.
>>> print ('\n'.join(list(ndiff('foo', 'foo*****'))))
f
o
o
+ *
+ *
+ *
+ *
+ *
Even though the output format is not the exact one you are looking for, it encapsulates the correct information. We can make an output adapter to give the correct format.
def adapter(out):
    chars = []
    symbols = []
    for c in out:
        chars.append(c[2])    # the character itself
        symbols.append(c[0])  # the diff marker: ' ', '+' or '-'
    return ''.join(chars), ''.join(symbols)
This can be used like so.
>>> print ('\n'.join(adapter(ndiff('foo', 'foo*****'))))
foo*****
+++++

Decimal To Binary Python Getting an Extra Zero In Return String

This is for a school project. I need to create a function using recursion to convert an integer to binary string. It must be a str returned, not an int. The base case is n==0, and then 0 would need to be returned. There must be a base case like this, but this is where I think I am getting the extra 0 from (I could be wrong). I am using Python 3.6 with the IDLE and the shell to execute it.
The function works just fine, except for this additional zero that I need gone.
Here is my function, dtobr:
def dtobr(n):
    """
    (int) -> (str)
    This function has the parameter n, which is a non-negative integer,
    and it will return the string of 0/1's
    which is the binary representation of n. No side effects.
    Returns binary string as mentioned. This is like the function
    dtob (decimal to binary) but this is using recursion.
    Examples:
    >>> dtobr(27)
    '11011'
    >>> dtobr(0)
    '0'
    >>> dtobr(1)
    '1'
    >>> dtobr(2)
    '10'
    """
    if n == 0:
        return str(0)
    return dtobr(n // 2) + str(n % 2)
This came from the function I already wrote which converted it just fine, but without recursion. For reference, I will include this code as well, but this is not what I need for this project, and there are no errors with this:
def dtob(n):
    """
    (int) -> (str)
    This function has the parameter n, which is a non-negative integer,
    and it will return the string of 0/1's
    which is the binary representation of n. No side effects.
    Returns binary string as mentioned.
    Examples:
    >>> dtob(27)
    '11011'
    >>> dtob(0)
    '0'
    >>> dtob(1)
    '1'
    >>> dtob(2)
    '10'
    """
    string = ""
    if n == 0:
        return str(0)
    while n > 0:
        remainder = n % 2
        string = str(remainder) + string
        n = n // 2
    return string
Hopefully someone can help me get rid of that additional left-hand zero. Thanks!
You need to change the condition to recursively handle both the n // 2 and n % 2:
if n <= 1:
    return str(n) # per @pault's suggestion, only needed str(n) instead of str(n % 2)
else:
    return dtobr(n // 2) + dtobr(n % 2)
Test case:
for i in [0, 1, 2, 27]:
    print(dtobr(i))
# 0
# 1
# 10
# 11011
FYI you can easily convert to binary format like so:
'{0:b}'.format(x) # where x is your number
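To see where the extra zero comes from, it helps to put the question's base case next to the corrected one. In the sketch below the names dtobr_extra_zero and dtobr_fixed are made up for the comparison: with the n == 0 base case every positive number eventually recurses down to 0 and picks up a leading '0', while stopping at n <= 1 lets the leading digit come from the 1 itself.

```python
def dtobr_extra_zero(n):
    """The question's version: the n == 0 base case is reached for every n > 0."""
    if n == 0:
        return "0"
    return dtobr_extra_zero(n // 2) + str(n % 2)


def dtobr_fixed(n):
    """Stop one step earlier, so the leading digit comes from n == 1."""
    if n <= 1:
        return str(n)
    return dtobr_fixed(n // 2) + str(n % 2)


print(dtobr_extra_zero(2))  # 010  ->  dtobr(0) + '1' + '0'
print(dtobr_fixed(2))       # 10
```

The built-in formatting shown above makes a convenient cross-check for the recursive version.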
Since there is already an answer that identifies and resolves the issue with the recursive approach, let's look at some other interesting ways to achieve the same goal.
Let's define a generator that gives us an iterative way of getting the binary digits. It yields the least significant digit first, so the consumer has to reverse the order.
def to_binary(n):
    if n == 0:
        yield "0"
    while n > 0:
        yield str(n % 2)
        n = n // 2  # integer division; n / 2 would produce floats on Python 3
Then you can use this iterable to get decimal to binary conversion in multiple ways.
Example 1.
reduce is used to concatenate the chars received from the to_binary iterable (generator). Each new digit is prepended, because the digits arrive least significant first.
from functools import reduce
print(reduce(lambda x, y: y + x, to_binary(0))) # 0
print(reduce(lambda x, y: y + x, to_binary(15))) # 1111
print(reduce(lambda x, y: y + x, to_binary(27))) # 11011
Example 2.
join takes an iterable, unrolls it and joins the items with ''. reversed(list(...)) puts the most significant digit first.
print(''.join(reversed(list(to_binary(0))))) # 0
print(''.join(reversed(list(to_binary(1))))) # 1
print(''.join(reversed(list(to_binary(15))))) # 1111
print(''.join(reversed(list(to_binary(27))))) # 11011

How to stop a loop?

def sum_div(x, y):
    for k in range(x,y+1):
        for z in range(x,y+1):
            sx = 0
            sy = 0
            for i in range(1, k+1):
                if k % i == 0:
                    sx += i
            for j in range(1, z+1):
                if z % j == 0:
                    sy += j
            if sx == sy and k!= z:
                print "(", k ,",", z, ")"
x = input("Dati x : ")
y = input("Dati y : ")
sum_div(x, y)
How do I stop the looping if the value of z == y?
The loops print a pair of numbers in a range from x to y, but when it hit the y value the loop prints a reverse pair of numbers that I don't need it to.
The break command will break out of the loop. So a line like this:
if (z == y):
    break
should do what you want.
What you think you are asking for is the break command, but what you're actually looking for is removal of duplication.
Your program lacks some clarity. For instance:
for i in range(1, k+1):
    if k % i == 0:
        sx += i
for j in range(1, z+1):
    if z % j == 0:
        sy += j
These two things are doing essentially the same thing, which can be written more cleanly with a list comprehension (in the REPL):
>>> def get_divisors(r: int) -> list:
...     return [i if r % i == 0 else 0 for i in range(1, r+1)]
...
>>> get_divisors(4)
[1, 2, 0, 4]
>>> sum(get_divisors(4))
7
Your line:
while y:
... will infinitely loop if you find a match. You should just remove it. while y means "while y is true", and any value there will evaluate as true.
This reduces your program to the following:
def get_divisors(r: int) -> list:
    return [i if r % i == 0 else 0 for i in range(1, r+1)]
def sum_div(x, y):
    for k in range(x,y+1):
        sum_of_x_divisors = sum(get_divisors(k)) # Note this is moved here to avoid repeating work.
        for z in range(x,y+1):
            sum_of_y_divisors = sum(get_divisors(z))
            if sum_of_x_divisors == sum_of_y_divisors and k!= z:
                print("({},{})".format(k, z))
Testing this in the REPL it seems correct based on the logic of the code:
>>> sum_div(9,15)
(14,15)
(15,14)
>>> sum_div(21, 35)
(21,31)
(31,21)
(33,35)
(35,33)
But it's possible that for sum_div(9,15) you want only one of (14,15) and (15,14). However, this has nothing to do with breaking your loop, but the fact that what you're attempting to do has two valid values when k and z don't equal each other. This is demonstrated by the second test case, where (33,35) is a repeated value, but if you broke the for loop on (21,31) you would not get that second set of values.
One way we can account for this is by reordering when work is done:
def sum_div(x, y):
    result_set = set() # Sets cannot have duplicate values
    for k in range(x,y+1):
        sum_of_x_divisors = sum(get_divisors(k))
        for z in range(x,y+1):
            sum_of_y_divisors = sum(get_divisors(z))
            if sum_of_x_divisors == sum_of_y_divisors and k!= z:
                result_set.add(tuple(sorted((k,z)))) # compile the result set by sorting it and casting to a tuple, so duplicates are implicitly removed.
    for k, z in result_set: # Print result set after it's been compiled
        print("({},{})".format(k, z))
And we see a correct result:
>>> sum_div(9,15)
(14,15)
>>> sum_div(21,35)
(21,31)
(33,35)
Or, the test case you provided in comments. Note the lack of duplicates:
>>> sum_div(10,25)
(16,25)
(14,15)
(15,23)
(10,17)
(14,23)
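Another way to avoid the mirrored pairs, sketched here as an alternative to the set-based version, is to only pair each k with larger values of z and to compute every divisor sum once. The names divisor_sum and equal_divisor_sum_pairs are illustrative; the function returns the pairs in ascending order instead of printing them.

```python
def divisor_sum(n):
    """Sum of all positive divisors of n, including 1 and n itself."""
    return sum(i for i in range(1, n + 1) if n % i == 0)


def equal_divisor_sum_pairs(x, y):
    """All pairs (k, z) with x <= k < z <= y that share a divisor sum."""
    sums = {n: divisor_sum(n) for n in range(x, y + 1)}  # computed once per n
    return [(k, z)
            for k in range(x, y + 1)
            for z in range(k + 1, y + 1)  # z > k, so no reversed duplicates
            if sums[k] == sums[z]]


print(equal_divisor_sum_pairs(9, 15))   # [(14, 15)]
print(equal_divisor_sum_pairs(21, 35))  # [(21, 31), (33, 35)]
```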
Some takeaways:
Break out functions that are doing the same thing so you can reason more easily about it.
Name your variables in a human-readable format so that we, the readers of your code (which includes you), understand what is going on.
Don't use loops unless you're actually looping over something. for, while, etc. only need to be used if you're planning on going over a list of things.
When asking questions, be sure to always include test input, expected output and what you're actually getting back.
The current best practice for printing strings is to use the .format() method (or, on Python 3.6+, f-strings), to make it really clear what you're printing.

Sympy: working with equalities manually

I'm currently doing a maths course where my aim is to understand the concepts and process rather than crunch through problem sets as fast as possible. When solving equations, I'd like to be able to poke at them myself rather than have them solved for me.
Let's say we have the very simple equation z + 1 = 4. If I were to solve this myself, I would obviously subtract 1 from both sides, but I can't figure out if sympy provides a simple way to do this. At the moment the best solution I can come up with is:
from sympy import *
z = symbols('z')
eq1 = Eq(z + 1, 4)
Eq(eq1.lhs - 1, eq1.rhs - 1)
# Output:
# z == 3
The more obvious expression eq1 - 1 only subtracts from the left-hand side. How can I use sympy to work through equalities step-by-step like this (i.e. without getting the solve() method to just give me the answer)? Any pointers to the manipulations that are actually possible with sympy equalities would be appreciated.
There is a "do" method and discussion at https://github.com/sympy/sympy/issues/5031#issuecomment-36996878 that would allow you to "do" operations to both sides of an Equality. It's not been accepted as an addition to SymPy but it is a simple add-on that you can use. It is pasted here for convenience:
from sympy import Dummy, Lambda, S
from sympy.core.function import FunctionClass
def do(self, e, i=None, doit=False):
    """Return a new Eq using function given or a model
    model expression in which a variable represents each
    side of the expression.
    Examples
    ========
    >>> from sympy import Dummy, Eq
    >>> from sympy.abc import i, x, y, z
    >>> eq = Eq(x, y)
    When the argument passed is an expression with one
    free symbol that symbol is used to indicate a "side"
    in the Eq and an Eq will be returned with the sides
    from self replaced in that expression. For example, to
    add 2 to both sides:
    >>> eq.do(i + 2)
    Eq(x + 2, y + 2)
    To add x to both sides:
    >>> eq.do(i + x)
    Eq(2*x, x + y)
    In the preceding it was actually ambiguous whether x or i
    was to be added but the rule is that any symbols that are
    already in the expression are not to be interpreted as the
    dummy variable. If we try to add z to each side, however, an
    error is raised because now it is unclear whether i or z is being
    added:
    >>> eq.do(i + z)
    Traceback (most recent call last):
    ...
    ValueError: not sure what symbol is being used to represent a side
    The ambiguity must be resolved by indicating with another parameter
    which is the dummy variable representing a side:
    >>> eq.do(i + z, i)
    Eq(x + z, y + z)
    Alternatively, if only one Dummy symbol appears in the expression then
    it will be automatically used to represent a side of the Eq.
    >>> eq.do(2*Dummy() + z)
    Eq(2*x + z, 2*y + z)
    Operations like differentiation must be passed as a
    lambda:
    >>> Eq(x, y).do(lambda i: i.diff(x))
    Eq(1, 0)
    Because doit=False by default, the result is not evaluated. To
    evaluate it, either use the doit method or pass doit=True.
    >>> _.doit() == Eq(x, y).do(lambda i: i.diff(x), doit=True)
    True
    """
    if not isinstance(e, (FunctionClass, Lambda, type(lambda:1))):
        e = S(e)
        # candidates for the placeholder: symbols not already in the Eq
        imaybe = e.free_symbols - self.free_symbols
        if not imaybe:
            raise ValueError('expecting a symbol')
        if imaybe and i and i not in imaybe:
            raise ValueError('indicated i not in given expression')
        if len(imaybe) != 1 and not i:
            d = [s for s in imaybe if isinstance(s, Dummy)]
            if len(d) != 1:
                raise ValueError(
                    'not sure what symbol is being used to represent a side')
            i = d[0]
        elif not i:
            i = next(iter(imaybe))
        # an explicitly passed i is used as-is
        f = lambda side: e.subs(i, side)
    else:
        f = e
    return self.func(*[f(side) for side in self.args], evaluate=doit)
from sympy.core.relational import Equality
Equality.do = do
