Numba & CUDA noob here. I'd like to be able to have one numba.cuda function programmatically call another one from the device, without having to pass any data back to the host. For example, given the setup
from numba import cuda
#cuda.jit('int32(int32)', device=True)
def a(x):
return x+1
#cuda.jit('int32(int32)', device=True)
def b(x):
return 2*x
I'd like to be able to define a composition kernel function like
#cuda.jit('void(int32, __device__, int32)')
def b_comp(x, inner, result):
y = inner(x)
result = b(y)
and successfully obtain
b_comp(1, a, result)
assert result == 4
Ideally I'd like b_comp to accept varying function arguments after it compiles [e.g. after the above call, to still accept b_comp(1, b, result)] -- but a solution where the function arguments become fixed at compile time will still work for me.
From what I've read, it seems that CUDA supports passing function pointers. This post suggests that numba.cuda has no such support, but the post isn't convincing, and is also a year old. The page for supported Python in numba.cuda doesn't mention function pointer support. But it links to the supported Python in numba page, which makes it clear that numba.jit() does support functions as arguments, although they get fixed at compile time. If numba.cuda.jit() does the same, like I said above, that'll work. In that case, when specifying the signature for comp, how should I state the variable type? Or could I use numba.cuda.autojit()?
If numba doesn't support any such direct approach, is metaprogramming a reasonable option? E.g. once I know the inner function, my script could create a new script containing a python function that composes those specific functions, and then apply numba.cuda.jit(), and then import the result. It seems convoluted, but it's the only other numba-based option I could think of.
If numba won't do the trick at all, or at least not without serious cludgery, I'd be happy with an answer that gave a few details, plus a rec like "switch to PyCuda".
Here's what worked for me:
Not decorating my functions with cuda.jit initially, so that they still possess the __name__ attribute
Getting the __name__ attribute
Now applying cuda.jit to my functions by directly calling the decorator
Creating the python for the composition function in a string, and passing it to exec
The exact code:
from numba import cuda
import numpy as np
def a(x):
return x+1
def b(x):
return 2*x
# Here, pretend we've been passed the inner function and the outer function as arguments
inner_fun = a
outer_fun = b
# And pretend we have noooooo idea what functions these guys actually point to
inner_name = inner_fun.__name__
outer_name = outer_fun.__name__
# Now manually apply the decorator
a = cuda.jit('int32(int32)', device=True)(a)
b = cuda.jit('int32(int32)', device=True)(b)
# Now construct the definition string for the composition function, and exec it.
exec_string = '#cuda.jit(\'void(int32, int32[:])\')\n' \
'def custom_comp(x, out_array):\n' \
' out_array[0]=' + outer_name + '(' + inner_name + '(x))\n'
exec(exec_string)
out_array = np.array([-1])
custom_comp(1, out_array)
print(out_array)
As expected, the output is
[4]
Related
To start with, my question here is about the semantics and the logic behind why the Python language was designed like this in the case of chained decorators. Please notice the nuance how this is different from the question
How decorators chaining work?
Link: How decorators chaining work? It seems quite a number of other users had the same doubts, about the call order of chained Python decorators. It is not like I can't add a __call__ and see the order for myself. I get this, my point is, why was it designed to start from the bottom, when it comes to chained Python decorators?
E.g.
def first_func(func):
def inner():
x = func()
return x * x
return inner
def second_func(func):
def inner():
x = func()
return 2 * x
return inner
#first_func
#second_func
def num():
return 10
print(num())
Quoting the documentation on decorators:
The decorator syntax is merely syntactic sugar, the following two function definitions are semantically equivalent:
def f(arg):
...
f = staticmethod(f)
#staticmethod
def f(arg):
...
From this it follows that the decoration in
#a
#b
#c
def fun():
...
is equivalent to
fun = a(b(c(fun)))
IOW, it was designed like that because it's just syntactic sugar.
For proof, let's just decorate an existing function and not return a new one:
def dec1(f):
print(f"dec1: got {vars(f)}")
f.dec1 = True
return f
def dec2(f):
print(f"dec2: got {vars(f)}")
f.dec2 = True
return f
#dec1
#dec2
def foo():
pass
print(f"Fully decked out: {vars(foo)}")
prints out
dec2: got {}
dec1: got {'dec2': True}
Fully decked out: {'dec2': True, 'dec1': True}
TL;DR
g(f(x)) means applying f to x first, then applying g to the output.
Omit the parentheses, add # before and line break after each function name:
#g
#f
x
(Syntax only valid if x is the definition of a function/class.)
Abstract explanation
The reasoning behind this design decision becomes fairly obvious IMHO, if you remember what the decorator syntax - in its most abstract and general form - actually means. So I am going to try the abstract approach to explain this.
It is all about syntax
To be clear here, the distinguishing factor in the concept of the "decorator" is not the object underneath it (so to speak) nor the operation it performs. It is the special syntax and the restrictions for it. Thus, a decorator at its core is nothing more than feature of Python grammar.
The decorator syntax requires a target to be decorated. Initially (see PEP 318) the target could only be function definitions; later class definitions were also allowed to be decorated (see PEP 3129).
Minimal valid syntax
Syntactically, this is valid Python:
def f(): pass
#f
class Target: pass # or `def target: pass`
However, this will (perhaps unsuprisingly) cause a TypeError upon execution. As has been reiterated multiple times here and in other posts on this platform, the above is equivalent to this:
def f(): pass
class Target: pass
Target = f(Target)
Minimal working decorator
The TypeError stems from the fact that f lacks a positional argument. This is the obvious logical restriction imposed by what a decorator is supposed to do. Thus, to achieve not only syntactically valid code, but also have it run without errors, this is sufficient:
def f(x): pass
#f
class Target: pass
This is still not very useful, but it is enough for the most general form of a working decorator.
Decoration is just application of a function to the target and assigning the output to the target's name.
Chaining functions ⇒ Chaining decorators
We can ignore the target and what it is or does and focus only on the decorator. Since it merely stands for applying a function, the order of operations comes into play, as soon as we have more than one. What is the order of operation, when we chain functions?
def f(x): pass
def g(x): pass
class Target: pass
Target = g(f(Target))
Well, just like in the composition of purely mathematical functions, this implies that we apply f to Target first and then apply g to the result of f. Despite g appearing first (i.e. further left), it is not what is applied first.
Since stacking decorators is equivalent to nesting functions, it seems obvious to define the order of operation the same way. This time, we just skip the parentheses, add an # symbol in front of the function name and a line break after it.
def f(x): pass
def g(x): pass
#g
#f
class Target: pass
But, why though?
If after the explanation above (and reading the PEPs for historic background), the reasoning behind the order of operation is still not clear or still unintuitive, there is not really any good answer left, other than "because the devs thought it made sense, so get used to it".
PS
I thought I'd add a few things for additional context based on all the comments around your question.
Decoration vs. calling a decorated function
A source of confusion seems to be the distinction between what happens when applying the decorator versus calling the decorated function.
Notice that in my examples above I never actually called target itself (the class or function being decorated). Decoration is itself a function call. Adding #f above the target is calling the f and passing the target to it as the first positional argument.
A "decorated function" might not even be a function
The distinction is very important because nowhere does it say that a decorator actually needs to return a callable (function or class). f being just a function means it can return whatever it wants. This is again valid and working Python code:
def f(x): return 3.14
#f
def target(): return "foo"
try:
target()
except Exception as e:
print(repr(e))
print(target)
Output:
TypeError("'float' object is not callable")
3.14
Notice that the name target does not even refer to a function anymore. It just holds the 3.14 returned by the decorator. Thus, we cannot even call target. The entire function behind it is essentially lost immediately before it is even available to the global namespace. That is because f just completely ignores its first positional argument x.
Replacing a function
Expanding this further, if we want, we can have f return a function. Not doing that seems very strange, considering it is used to decorate a function. But it doesn't have to be related to the target at all. Again, this is fine:
def bar(): return "bar"
def f(x): return bar
#f
def target(): return "foo"
print(target())
print(target is bar)
Output:
bar
True
It comes down to convention
The way decorators are actually overwhelmingly used out in the wild, is in a way that still keeps a reference to the target being decorated around somewhere. In practice it can be as simple as this:
def f(x):
print(f"applied `f({x.__name__})`")
return
#f
def target(): return "foo"
Just running this piece of code outputs applied f(target). Again, notice that we don't call target here, we only called f. But now, the decorated function is still target, so we could add the call print(target()) at the bottom and that would output foo after the other output produced by f.
The fact that most decorators don't just throw away their target comes down to convention. You (as a developer) would not expect your function/class to simply be thrown away completely, when you use a decorator.
Decoration with wrapping
This is why real-life decorators typically either return the reference to the target at the end outright (like in the last example) or they return a different callable, but that callable itself calls the target, meaning a reference to the target is kept in that new callable's local namespace . These functions are what is usually referred to as wrappers:
def f(x):
print(f"applied `f({x.__name__})`")
def wrapper():
print(f"wrapper executing with {locals()=}")
return x()
return wrapper
#f
def target(): return "foo"
print(f"{target()=}")
print(f"{target.__name__=}")
Output:
applied `f(target)`
wrapper executing with locals()={'x': <function target at 0x7f1b2f78f250>}
target()='foo'
target.__name__='wrapper'
As you can see, what the decorator left us is wrapper, not what we originally defined as target. And the wrapper is what we call, when we write target().
Wrapping wrappers
This is the kind of behavior we typically expect, when we use decorators. And therefore it is not surprising that multiple decorators stacked together behave the way they do. The are called from the inside out (as explained above) and each adds its own wrapper around what it receives from the one applied before:
def f(x):
print(f"applied `f({x.__name__})`")
def wrapper_from_f():
print(f"wrapper_from_f executing with {locals()=}")
return x()
return wrapper_from_f
def g(x):
print(f"applied `g({x.__name__})`")
def wrapper_from_g():
print(f"wrapper_from_g executing with {locals()=}")
return x()
return wrapper_from_g
#g
#f
def target(): return "foo"
print(f"{target()=}")
print(f"{target.__name__=}")
Output:
applied `f(target)`
applied `g(wrapper_from_f)`
wrapper_from_g executing with locals()={'x': <function f.<locals>.wrapper_from_f at 0x7fbfc8d64f70>}
wrapper_from_f executing with locals()={'x': <function target at 0x7fbfc8d65630>}
target()='foo'
target.__name__='wrapper_from_g'
This shows very clearly the difference between the order in which the decorators are called and the order in which the wrapped/wrapping functions are called.
After the decoration is done, we are left with wrapper_from_g, which is referenced by our target name in global namespace. When we call it, wrapper_from_g executes and calls wrapper_from_f, which in turn calls the original target.
I'm trying to create a function that chains results from multiple arguments.
def hi(string):
print(string)<p>
return hi
Calling hi("Hello")("World") works and becomes Hello \n World as expected.
the problem is when I want to append the result as a single string, but
return string + hi produces an error since hi is a function.
I've tried using __str__ and __repr__ to change how hi behaves when it has not input. But this only creates a different problem elsewhere.
hi("Hello")("World") = "Hello"("World") -> Naturally produces an error.
I understand why the program cannot solve it, but I cannot find a solution to it.
You're running into difficulty here because the result of each call to the function must itself be callable (so you can chain another function call), while at the same time also being a legitimate string (in case you don't chain another function call and just use the return value as-is).
Fortunately Python has you covered: any type can be made to be callable like a function by defining a __call__ method on it. Built-in types like str don't have such a method, but you can define a subclass of str that does.
class hi(str):
def __call__(self, string):
return hi(self + '\n' + string)
This isn't very pretty and is sorta fragile (i.e. you will end up with regular str objects when you do almost any operation with your special string, unless you override all methods of str to return hi instances instead) and so isn't considered very Pythonic.
In this particular case it wouldn't much matter if you end up with regular str instances when you start using the result, because at that point you're done chaining function calls, or should be in any sane world. However, this is often an issue in the general case where you're adding functionality to a built-in type via subclassing.
To a first approximation, the question in your title can be answered similarly:
class add(int): # could also subclass float
def __call__(self, value):
return add(self + value)
To really do add() right, though, you want to be able to return a callable subclass of the result type, whatever type it may be; it could be something besides int or float. Rather than trying to catalog these types and manually write the necessary subclasses, we can dynamically create them based on the result type. Here's a quick-and-dirty version:
class AddMixIn(object):
def __call__(self, value):
return add(self + value)
def add(value, _classes={}):
t = type(value)
if t not in _classes:
_classes[t] = type("add_" + t.__name__, (t, AddMixIn), {})
return _classes[t](value)
Happily, this implementation works fine for strings, since they can be concatenated using +.
Once you've started down this path, you'll probably want to do this for other operations too. It's a drag copying and pasting basically the same code for every operation, so let's write a function that writes the functions for you! Just specify a function that actually does the work, i.e., takes two values and does something to them, and it gives you back a function that does all the class munging for you. You can specify the operation with a lambda (anonymous function) or a predefined function, such as one from the operator module. Since it's a function that takes a function and returns a function (well, a callable object), it can also be used as a decorator!
def chainable(operation):
class CallMixIn(object):
def __call__(self, value):
return do(operation(self, value))
def do(value, _classes={}):
t = type(value)
if t not in _classes:
_classes[t] = type(t.__name__, (t, CallMixIn), {})
return _classes[t](value)
return do
add = chainable(lambda a, b: a + b)
# or...
import operator
add = chainable(operator.add)
# or as a decorator...
#chainable
def add(a, b): return a + b
In the end it's still not very pretty and is still sorta fragile and still wouldn't be considered very Pythonic.
If you're willing to use an additional (empty) call to signal the end of the chain, things get a lot simpler, because you just need to return functions until you're called with no argument:
def add(x):
return lambda y=None: x if y is None else add(x+y)
You call it like this:
add(3)(4)(5)() # 12
You are getting into some deep, Haskell-style, type-theoretical issues by having hi return a reference to itself. Instead, just accept multiple arguments and concatenate them in the function.
def hi(*args):
return "\n".join(args)
Some example usages:
print(hi("Hello", "World"))
print("Hello\n" + hi("World"))
Here I came up with the solution to the other question asked by me on how to remove all costly calling to debug output function scattered over the function code (slowdown was 25 times with using empty function lambda *p: None).
The solution is to edit function code dynamically and prepend all function calls with comment sign #.
from __future__ import print_function
DEBUG = False
def dprint(*args,**kwargs):
'''Debug print'''
print(*args,**kwargs)
def debug(on=False,string='dprint'):
'''Decorator to comment all the lines of the function code starting with string'''
def helper(f):
if not on:
import inspect
source = inspect.getsource(f)
source = source.replace(string, '#'+string) #Beware! Swithces off the whole line after dprint statement
with open('temp_f.py','w') as file:
file.write(source)
from temp_f import f as f_new
return f_new
else:
return f #return f intact
return helper
def f():
dprint('f() started')
print('Important output')
dprint('f() ended')
f = debug(DEBUG,'dprint')(f) #If decorator #debug(True) is used above f(), inspect.getsource somehow includes #debug(True) inside the code.
f()
The problems I see now are these:
# commets all line to the end; but there may be other statements separated by ;. This may be addressed by deleting all pprint calls in f, not commenting, still it may be not that trivial, as there may be nested parantheses.
temp_f.py is created, and then new f code is loaded from it. There should be a better way to do this without writing to hard drive. I found this recipe, but haven't managed to make it work.
if decorator is applied with special syntax used #debug, then inspect.getsource includes the line with decorator to the function code. This line can be manually removed from string, but it may lead to bugs if there are more than one decorator applied to f. I solved it with resorting to old-style decorator application f=decorator(f).
What other problems do you see here?
How can all these problems be solved?
What are upsides and downsides of this approach?
What can be improved here?
Is there any better way to do what I try to achieve with this code?
I think it's a very interesting and contentious technique to preprocess function code before compilation to byte-code. Strange though that nobody got interested in it. I think the code I gave may have a lot of shaky points.
A decorator can return either a wrapper, or the decorated function unaltered. Use it to create a better debugger:
from functools import wraps
def debug(enabled=False):
if not enabled:
return lambda x: x # Noop, returns decorated function unaltered
def debug_decorator(f):
#wraps(f)
def print_start(*args, **kw):
print('{0}() started'.format(f.__name__))
try:
return f(*args, **kw)
finally:
print('{0}() completed'.format(f.__name__))
return print_start
return debug_decorator
The debug function is a decorator factory, when called it produces a decorator function. If debugging is disabled, it simply returns a lambda that returns it argument unchanged, a no-op decorator. When debugging is enabled, it returns a debugging decorator that prints when a decorated function has started and prints again when it returns.
The returned decorator is then applied to the decorated function.
Usage:
DEBUG = True
#debug(DEBUG)
def my_function_to_be_tested():
print('Hello world!')
To reiterate: when DEBUG is set to false, the my_function_to_be_tested remains unaltered, so runtime performance is not affected at all.
Here is the solution I came up with after composing answers from another questions asked by me here on StackOverflow.
This solution don't comment anything and just deletes standalone dprint statements. It uses ast module and works with Abstract Syntax Tree, it lets us avoid parsing source code. This idea was written in the comment here.
Writing to temp_f.py is replaced with execution f in necessary environment. This solution was offered here.
Also, the last solution addresses the problem of decorator recursive application. It's solved by using _blocked global variable.
This code solves the problem asked to be solved in the question. But still, it's suggested not to be used in real projects:
You are correct, you should never resort to this, there are so many
ways it can go wrong. First, Python is not a language designed for
source-level transformations, and it's hard to write it a transformer
such as comment_1 without gratuitously breaking valid code. Second,
this hack would break in all kinds of circumstances - for example,
when defining methods, when defining nested functions, when used in
Cython, when inspect.getsource fails for whatever reason. Python is
dynamic enough that you really don't need this kind of hack to
customize its behavior.
from __future__ import print_function
DEBUG = False
def dprint(*args,**kwargs):
'''Debug print'''
print(*args,**kwargs)
_blocked = False
def nodebug(name='dprint'):
'''Decorator to remove all functions with name 'name' being a separate expressions'''
def helper(f):
global _blocked
if _blocked:
return f
import inspect, ast, sys
source = inspect.getsource(f)
a = ast.parse(source) #get ast tree of f
class Transformer(ast.NodeTransformer):
'''Will delete all expressions containing 'name' functions at the top level'''
def visit_Expr(self, node): #visit all expressions
try:
if node.value.func.id == name: #if expression consists of function with name a
return None #delete it
except(ValueError):
pass
return node #return node unchanged
transformer = Transformer()
a_new = transformer.visit(a)
f_new_compiled = compile(a_new,'<string>','exec')
env = sys.modules[f.__module__].__dict__
_blocked = True
try:
exec(f_new_compiled,env)
finally:
_blocked = False
return env[f.__name__]
return helper
#nodebug('dprint')
def f():
dprint('f() started')
print('Important output')
dprint('f() ended')
print('Important output2')
f()
Other relevant links:
Switching off debug prints
I have searched a little bit to try to figure this one out but didn't get a solution that I was exactly looking for.
This is my use case:
I would like to evaluate expressions from a functions/methods doc-string against the f/m's parameters and values, but from outside the function (when being called but outside execution of the function
I can't statically change the source code I am evaluating (cant write in new functionality) but dynamically changing (i.e. wrapping the function or adding attributes at run-time) is acceptable
I would prefer to stick with tools in the standard library but am willing to try external libraries if it will make the task a breeze
Here is a simple example of what I am trying to do:
def f1(a,b):
"""a==b"""
pass
def f2(f):
f_locals = "get f's args and values before f is executed"
return eval(f.__doc__,None,f_locals)
>>> f2(f1(2,2))
While I have no clue why you would want to do this, what you've described can be achieved with the inspect module. This example is as close to your original example that I can come up with.
from inspect import getcallargs
def f1(a,b):
"""a==b"""
pass
def f2(f, *f_args, **f_kwargs):
f_callargs = getcallargs(f, *f_args, **f_kwargs)
return eval(f.__doc__, None, f_callargs)
f2(f1, 2, 2)
This should output True.
Keep in mind that this assumes a great many things about the arguments and docstrings of any such functions passed to f2, not the least of which is that none of the examined functions are malicious or malformed. Why don't you want to call functions normally, and why don't you want to change functions?
Edit: As Pajton pointed out, getcallargs is more appropriate here, and removes the calls to both dict and zip. The above code has been updated to reflect this.
I'm not sure if this is what you are looking for, but here's an alternative without inspect module.
#!/usr/bin/python
# -*- coding: utf-8-unix -*-
"""
This is a sample implementation of Inline.pm (Perl) in Python.
Using #inline decorator, it is now possible to write any code
in any language in docstring, and let it compile down to executable
Python code at runtime.
For this specific example, it simply evals input docstring, so
code in docstring must be in Python as well.
"""
# Language compiler for MyLang
class MyLang:
#classmethod
def compile(self, docstring):
# For this example, this simply generates code that
# evals docstring.
def testfunc(*arg, **kw):
return eval(docstring, None, kw)
return testfunc
# #inline decorator
def inline(lang):
def decorate(func):
parm = func.__code__.co_varnames[0:func.__code__.co_argcount]
fgen = lang.compile(func.__doc__)
def wrap(*arg, **kw):
# turn all args into keyword-args
kw.update(dict(zip(parm, arg)))
return fgen(**kw)
return wrap
return decorate
#inline(MyLang)
def myadd(a, b):
"""a + b"""
print(myadd(1, 9))
print(myadd(b = 8, a = 2))
print(myadd(a = 3, b = 7))
I want to profile my Python code. I am well-aware of cProfile, and I use it, but it's too low-level. (For example, there isn't even a straightforward way to catch the return value from the function you're profiling.)
One of the things I would like to do: I want to take a function in my program and set it to be profiled on the fly while running the program.
For example, let's say I have a function heavy_func in my program. I want to start the program and have the heavy_func function not profile itself. But sometime during the runtime of my program, I want to change heavy_func to profile itself while it's running. (If you're wondering how I can manipulate stuff while the program is running: I can do it either from the debug probe or from the shell that's integrated into my GUI app.)
Is there a module already written which does stuff like this? I can write it myself but I just wanted to ask before so I won't be reinventing the wheel.
It may be a little mind-bending, but this technique should help you find the "bottlenecks", it that's what you want to do.
You're pretty sure of what routine you want to focus on.
If that's the routine you need to focus on, it will prove you right.
If the real problem(s) are somewhere else, it will show you where they are.
If you want a tedious list of reasons why, look here.
I wrote my own module for it. I called it cute_profile. Here is the code. Here are the tests.
Here is the blog post explaining how to use it.
It's part of GarlicSim, so if you want to use it you can install garlicsim and do from garlicsim.general_misc import cute_profile.
If you want to use it on Python 3 code, just install the Python 3 fork of garlicsim.
Here's an outdated excerpt from the code:
import functools
from garlicsim.general_misc import decorator_tools
from . import base_profile
def profile_ready(condition=None, off_after=True, sort=2):
'''
Decorator for setting a function to be ready for profiling.
For example:
#profile_ready()
def f(x, y):
do_something_long_and_complicated()
The advantages of this over regular `cProfile` are:
1. It doesn't interfere with the function's return value.
2. You can set the function to be profiled *when* you want, on the fly.
How can you set the function to be profiled? There are a few ways:
You can set `f.profiling_on=True` for the function to be profiled on the
next call. It will only be profiled once, unless you set
`f.off_after=False`, and then it will be profiled every time until you set
`f.profiling_on=False`.
You can also set `f.condition`. You set it to a condition function taking
as arguments the decorated function and any arguments (positional and
keyword) that were given to the decorated function. If the condition
function returns `True`, profiling will be on for this function call,
`f.condition` will be reset to `None` afterwards, and profiling will be
turned off afterwards as well. (Unless, again, `f.off_after` is set to
`False`.)
`sort` is an `int` specifying which column the results will be sorted by.
'''
def decorator(function):
def inner(function_, *args, **kwargs):
if decorated_function.condition is not None:
if decorated_function.condition is True or \
decorated_function.condition(
decorated_function.original_function,
*args,
**kwargs
):
decorated_function.profiling_on = True
if decorated_function.profiling_on:
if decorated_function.off_after:
decorated_function.profiling_on = False
decorated_function.condition = None
# This line puts it in locals, weird:
decorated_function.original_function
base_profile.runctx(
'result = '
'decorated_function.original_function(*args, **kwargs)',
globals(), locals(), sort=decorated_function.sort
)
return locals()['result']
else: # decorated_function.profiling_on is False
return decorated_function.original_function(*args, **kwargs)
decorated_function = decorator_tools.decorator(inner, function)
decorated_function.original_function = function
decorated_function.profiling_on = None
decorated_function.condition = condition
decorated_function.off_after = off_after
decorated_function.sort = sort
return decorated_function
return decorator