Hashing a Python function - python

def funcA(x):
return x
Is funcA.__code__.__hash__() a suitable way to check whether funcA has changed?
I know that funcA.__hash__() won't work as it the same as id(funcA) / 16. I checked and this isn't true for __code__.__hash__(). I also tested the behaviour in a ipython terminal and it seemed to hold. But is this guaranteed to work?
Why
I would like to have a way of comparing an old version of function to a new version of the same function.
I'm trying to create a decorator for disk-based/long-term caching. Thus I need a way to identify if a function has changed. I also need to look at the call graph to check that none of the called functions have changed but that is not part of this question.
Requirements:
Needs to be stable over multiple calls and machines. 1 says that in Python 3.3 hash() is randomized on each start of a new instance. Although it also says that "HASH RANDOMIZATION IS DISABLED BY DEFAULT". Ideally, I'd like a function that does is stable even with randomization enabled.
Ideally, it would yield the same hash for def funcA: pass and def funcB: pass, i.e. when only the name of the function changes. Probably not necessary.
I only care about Python 3.
One alternative would be to hash the text inside the file that contains the given function.

Yes, it seems that func_a.__code__.__hash__() is unique to the specific functionality of the code. I could not find where this is implemented, or where it is __code__.__hash__() defined.
The perfect way would be to use func_a.__code__.co_code.__hash__() because co_code has the byte code as a string. Note that in this case, the function name is not part of the hash and two functions with the same code but names func_a and func_b will have the same hash.
hash(func_a.__code__.co_code)
Source.

Related

Setting a variable to a parameter value inline when calling a function

In other languages, like Java, you can do something like this:
String path;
if (exists(path = "/some/path"))
my_path = path;
the point being that path is being set as part of specifying a parameter to a method call. I know that this doesn't work in Python. It is something that I've always wished Python had.
Is there any way to accomplish this in Python? What I mean here by "accomplish" is to be able to write both the call to exists and the assignment to path, as a single statement with no prior supporting code being necessary.
I'll be OK with it if a way of doing this requires the use of an additional call to a function or method, including anything I might write myself. I spent a little time trying to come up with such a module, but failed to come up with anything that was less ugly than just doing the assignment before calling the function.
UPDATE: #BrokenBenchmark's answer is perfect if one can assume Python 3.8 or better. Unfortunately, I can't yet do that, so I'm still searching for a solution to this problem that will work with Python 3.7 and earlier.
Yes, you can use the walrus operator if you're using Python 3.8 or above:
import os
if os.path.isdir((path := "/some/path")):
my_path = path
I've come up with something that has some issues, but does technically get me where I was looking to be. Maybe someone else will have ideas for improving this to make it fully cool. Here's what I have:
# In a utility module somewhere
def v(varname, arg=None):
if arg is not None:
if not hasattr(v, 'vals'):
v.vals = {}
v.vals[varname] = arg
return v.vals[varname]
# At point of use
if os.path.exists(v('path1', os.path.expanduser('~/.harmony/mnt/fetch_devqa'))):
fetch_devqa_path = v('path1')
As you can see, this fits my requirement of no extra lines of code. The "variable" involved, path1 in this example, is stored on the function that implements all of this, on a per-variable-name basis.
One can question if this is concise and readable enough to be worth the bother. For me, the verdict is still out. If not for the need to call the v() function a second time, I think I'd be good with it structurally.
The only functional problem I see with this is that it isn't thread-safe. Two copies of the code could run concurrently and run into a race condition between the two calls to v(). The same problem is greatly magnified if one fails to choose unique variable names every time this is used. That's probably the deal killer here.
Can anyone see how to use this to get to a similar solution without the drawbacks?

how to reference previous arguments of a function call in later arguments?

This is in micropython
I'm creating an API to control some hardware. The API will be implemented in C with an interface in micropython.
One example of my API is:
device.set(curr_chan.BipolarRange, curr_chan.BipolarRange.state.ON)
I'd like to be able to achieve the same functionality but shorten the second path by somehow implicitly referencing the first argument:
device.set(curr_chan.BipolarRange, <first arg?>.state.ON)
Is there anyway to do this?
The only way to do something like this now would be
device.set(curr_chan.BipolarRange.state.ON)
and then put an upward pointing C-pointer on both the ON C-object and state C-object so that I know which entry in curr_chan is being referenced.
The micropython runtime - and I assume CPython one - doesn't keep the entire object "tree" available to the developer in memory.
You could have special values for the second (state) argument which tell the function implementation to derive the state from the first argument. You could also introduce a completely separate function which has this behavior.
Or you could have a helper function which determines the state and passes it down to the set function, something like this:
device.set(*state_ON(curr_chan.BipolarRange))
Here, state_ON would return a tuple (curr_chan.BipolarRange, curr_chan.BipolarRange.state.ON).
In any case, there is no direct support for what you are trying to do in Python itself.
Pass the name of the attribute you want as the second argument. Call getattr (or PObject_GetAttr repeatedly to get each element of the .-separated string:
device.set(curr_chan.BipolarRange, 'state.ON')

Python Method Signature for Different Runtime Execution Data

Could someone tell me whether this idea is feasible in Python?
I want to have a method and the datatype of the signature is not fixed.
For example:
Foo(data1, data2) <-- Method Definition in Code
Foo(2,3) <---- Example of what would be executed in runtime
Foo(s,t) <---- Example of what would be executed in runtime
I know the code could work if i change the Foo(s,t) to Foo("s","t"). But I am trying to make the code smarter to recognize the command without the "" ...
singledispatch might be an answer, which transforms a function into a generic function, which can have different behaviors depending upon the type of its first argument.
You could see a concrete example in the above link. And you should do some special things if you want to do generic dispatch on more than one arguments.

Does python remember the result of an operation if it appears twice in the code?

I've been developing a sudoku solver in Python and the following question came up while trying to improve performance:
Does python remember the result of a calculation if the same calculation has to be performed multiple times throughout the code? Example: compare the following 2 bits of code:
if get_single(foo, bar) is not None:
position = get_single(foo, bar)
single = get_single(foo, bar)
if single is not None:
position = single
Are these 2 pieces of code equal in performance or does the second piece perform faster because the calculation is only performed once?
No, Python does not remember function calls or other calculations automatically. In general, it would be very bad if it did—imagine if every call to, say, random.randrange(6) returned the same value as the first call.
However, it's not hard to explicitly make it remember calls for specific functions where it's useful. This is usually called "memoization".
See the lru_cache decorator in the docs, for a nice example built into the stdlib.* All you have to do to make it remember every call to get_single(foo, bar) is change the definition of get_single like this;
#functools.lru_cache(maxsize=None)
def get_single(foo, bar):
# etc.
Or, if get_single is someone else's code that you're importing and can't touch, you can just wrap it:
get_single = functools.lru_cache(maxsize=None)(othermod.get_single)
… and then call your wrapper instead of the module's version.
* Note that lru_cache was added in Python 3.2. If you're using 2.7 (or, for some reason, 3.0-3.1), you can install the backport from PyPI, or find any of dozens of other memoizing caches on PyPI or ActiveState—or even, noticing that the functools docs link to the source, like many other stdlib modules meant to also serve as example code, copy the source to your own project. Although, IIRC, the 3.2 code needs a small change to work with 2.7 because it relies on nonlocal to hide its internals.
That being said, even if you know get_single is memoized, it's still not very good style to call it twice. If you only need to do this once, just write the three lines of code. If you need to do it repeatedly, write a wrapper function that wraps up those three lines or code, and then calling that function will be shorter than even the two-line version.

Analogue of devar in Python

When writing Python code, I often find myself wanting to get behavior similar to Lisp's defvar. Basically, if some variable doesn't exist, I want to create it and assign a particular value to it. Otherwise, I don't want to do anything, and in particular, I don't want to override the variable's current value.
I looked around online and found this suggestion:
try:
some_variable
except NameError:
some_variable = some_expensive_computation()
I've been using it and it works fine. However, to me this has the look of code that's not paradigmatically correct. The code is four lines, instead of the 1 that would be required in Lisp, and it requires exception handling to deal with something that's not "exceptional."
The context is that I'm doing interactively development. I'm executing my Python code file frequently, as I improve it, and I don't want to run some_expensive_computation() each time I do so. I could arrange to run some_expensive_computation() by hand every time I start a new Python interpreter, but I'd rather do something automated, particularly so that my code can be run non-interactively. How would a season Python programmer achieve this?
I'm using WinXP with SP3, Python 2.7.5 via Anaconda 1.6.2 (32-bit), and running inside Spyder.
It's generally a bad idea to rely on the existence or not of a variable having meaning. Instead, use a sentinel value to indicate that a variable is not set to an appropriate value. None is a common choice for this kind of sentinel, though it may not be appropriate if that is a possible output of your expensive computation.
So, rather than your current code, do something like this:
# early on in the program
some_variable = None
# later:
if some_variable is None:
some_variable = some_expensive_computation()
# use some_variable here
Or, a version where None could be a significant value:
_sentinel = object()
some_variable = _sentinel # this means it doesn't have a meaningful value
# later
if some_variable is _sentinel:
some_variable = some_expensive_computation()
It is hard to tell which is of greater concern to you, specific language features or a persistent session. Since you say:
The context is that I'm doing interactively development. I'm executing my Python code file frequently, as I improve it, and I don't want to run some_expensive_computation() each time I do so.
You may find that IPython provides a persistent, interactive environment that is pleasing to you.
Instead of writing Lisp in Python, just think about what you're trying to do. You want to avoid calling an expensive function twice and having it run two times. You can write your function do to that:
def f(x):
if x in cache:
return cache[x]
result = ...
cache[x] = result
return result
Or make use of Python's decorators and just decorate the function with another function that takes care of the caching for you. Python 3.3 comes with functools.lru_cache, which does just that:
import functools
#functools.lru_cache()
def f(x):
return ...
There are quite a few memoization libraries in the PyPi for 2.7.
For the use case you give, guarding with a try ... except seems like a good way to go about it: Your code is depending on leftover variables from a previous execution of your script.
But I agree that it's not a nice implementation of the concept "here's a default value, use it unless the variable is already set". Python does not directly support this for variables, but it does have a default-setter for dictionary keys:
myvalues = dict()
myvalues.setdefault("some_variable", 42)
print some_variable # prints 42
The first argument of setdefault must be a string containing the name of the variable to be defined.
If you had a complicated system of settings and defaults (like emacs does), you'd probably keep the system settings in their own dictionary, so this is all you need. In your case, you could also use setdefault directly on global variables (only), with the help of the built-in function globals() which returns a modifiable dictionary:
globals().setdefault("some_variable", 42)
But I would recommend using a dictionary for your persistent variables (you can use the try... except method to create it conditionally). It keeps things clean and it seems more... pythonic somehow.
Let me try to summarize what I've learned here:
Using exception handling for flow control is fine in Python. I could do it once to set up a dict in which I can store what ever I want.
There are libraries and language features that are designed for some form of persistence; these can provide "high road" solutions for some applications. The shelve module is an obvious candidate here, but I would construe "some form of persistence" broadly enough to include #Blender's suggest to use memoization.

Categories

Resources