Have trouble using numba atomic operation functions (cuda.atomic.compare_and_swap)


I am trying to use Numba to write CUDA kernels for my code. I want to use an atomic operation in part of it, so I wrote a test kernel to see how cuda.atomic.compare_and_swap works. The documentation describes the function in a screenshot (image not reproduced here). This is my test code:
from numba import cuda
import numpy as np
@cuda.jit
def atomicCAS(N,out1):
    idx = cuda.threadIdx.x + cuda.blockIdx.x * cuda.blockDim.x
    if idx >= N:
        return
    A = out1[idx:]
    cuda.atomic.compare_and_swap(A,idx,0)
N = 1024
out1 = np.arange(N)
out1 = np.zeros(N)
dout1 = cuda.to_device(out1)
tpb = 32
bpg = int(np.ceil(N/tpb))
atomicCAS[bpg,tpb](N,dout1)
hout1 = dout1.copy_to_host()
Then I got this error:
TypingError: Invalid use of Function(<class 'numba.cuda.stubs.atomic.compare_and_swap'>) with argument(s) of type(s): (array(float64, 1d, A), int64, Literal[int](0))
* parameterized
In definition 0:
All templates rejected with literals.
In definition 1:
All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function(<class 'numba.cuda.stubs.atomic.compare_and_swap'>)
[2] During: typing of call at /home/qinyu/test.py (20)
This is pretty naive code, and I think I am passing the right types of variables, but I still get this TypingError. The other atomic operations in Numba worked fine for me; this is the only one that does not. Can somebody help me figure out the problem, or is there an alternative way to do this? Thanks!

The key part of the error message is the list of argument types:
(array(float64, 1d, A), int64, Literal[int](0))
CUDA atomicCAS only supports integer types, so you cannot pass a floating-point array. Here out1 is float64 because np.zeros(N) defaults to that dtype.
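To make the semantics concrete, here is a plain-Python model of what compare_and_swap does. This is only a sketch and not Numba code: it assumes the documented behaviour (compare the first element of the array with old, store val on a match, and return the previous value), and the lock stands in for the hardware atomicity. The names compare_and_swap, out1 and view are illustrative. In the real kernel the array would have to be created with an integer dtype, for example np.zeros(N, dtype=np.int32), assuming your Numba version accepts that dtype.

```python
import threading
from array import array

_cas_lock = threading.Lock()  # stands in for the atomicity the GPU provides


def compare_and_swap(ary, old, val):
    """Model only: if ary[0] == old, store val; always return the prior ary[0]."""
    with _cas_lock:
        loaded = ary[0]
        if loaded == old:
            ary[0] = val
        return loaded


# An integer buffer; slicing a memoryview gives a view, like out1[idx:] in the kernel.
out1 = array("i", [0] * 8)
view = memoryview(out1)

# Each "thread" swaps its own element from 0 to idx + 1.
previous = [compare_and_swap(view[idx:], 0, idx + 1) for idx in range(len(out1))]

# A second attempt on element 3 fails because it no longer holds 0.
second_try = compare_and_swap(view[3:], 0, 99)
```

Note that the function always looks at element 0 of whatever it is given, which is why the question's kernel passes the slice out1[idx:] rather than an index.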

Related

Numba fails to compile np.select based function in nopython mode

I am attempting to compile what is effectively a piecewise function using numba.njit. The Python function is defined as follows, using Numpy:
(for anyone interested in the Sympy origins of this issue, see the Notes below.)
Minimal Example
from numpy import select, less, nan
def f(t):
    condlist = [less(t, 5), less(t, 15), less(t, 20), True]
    choicelist = [1, 0, 1, 0]
    return select(condlist, choicelist, default=nan)
See below for confirmation that this function works in Python.
The Issue: However, Numba fails to JIT this function in nopython mode:
from numba import njit
jit_f = njit(f)
x = np.linspace(0, 50, 500)
jit_f(x)
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
Input In [86], in <cell line: 5>()
2 jit_f = njit(f)
4 x = np.linspace(0,50,500)
----> 5 jit_f(x)
File /usr/local/lib/python3.8/site-packages/numba/core/dispatcher.py:468, in _DispatcherBase._compile_for_args(self, *args, **kws)
464 msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
465 f"by the following argument(s):\n{args_str}\n")
466 e.patch_message(msg)
--> 468 error_rewrite(e, 'typing')
469 except errors.UnsupportedError as e:
470 # Something unsupported is present in the user code, add help info
471 error_rewrite(e, 'unsupported_error')
File /usr/local/lib/python3.8/site-packages/numba/core/dispatcher.py:409, in _DispatcherBase._compile_for_args.<locals>.error_rewrite(e, issue_type)
407 raise e
408 else:
--> 409 raise e.with_traceback(None)
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function select at 0x105f4d310>) found for signature:
>>> select(LiteralList((array(bool, 1d, C), array(bool, 1d, C), array(bool, 1d, C), Literal[bool](True))), list(int64)<iv=[1, 0, 1, 0]>, default=float64)
There are 2 candidate implementations:
- Of which 1 did not match due to:
Overload in function 'np_select': File: numba/np/arraymath.py: Line 4358.
With argument(s): '(Poison<LiteralList((array(bool, 1d, C), array(bool, 1d, C), array(bool, 1d, C), Literal[bool](True)))>, list(int64)<iv=None>, default=float64)':
Rejected as the implementation raised a specific error:
TypingError: Poison type used in arguments; got Poison<LiteralList((array(bool, 1d, C), array(bool, 1d, C), array(bool, 1d, C), Literal[bool](True)))>
raised from /usr/local/lib/python3.8/site-packages/numba/core/types/functions.py:236
- Of which 1 did not match due to:
Overload in function 'np_select': File: numba/np/arraymath.py: Line 4358.
With argument(s): '(LiteralList((array(bool, 1d, C), array(bool, 1d, C), array(bool, 1d, C), Literal[bool](True))), list(int64)<iv=[1, 0, 1, 0]>, default=float64)':
Rejected as the implementation raised a specific error:
NumbaTypeError: condlist must be a List or a Tuple
raised from /usr/local/lib/python3.8/site-packages/numba/np/arraymath.py:4375
During: resolving callee type: Function(<function select at 0x105f4d310>)
During: typing of call at /var/folders/dt/q6vbs0g56s70g4p2kyfj4tvh0000gn/T/ipykernel_61924/130570246.py (5)
File "../../../../../../../../../var/folders/dt/q6vbs0g56s70g4p2kyfj4tvh0000gn/T/ipykernel_61924/130570246.py", line 5:
<source missing, REPL/exec in use?>
I'm not a Numba expert, but my feeling is that there is some syntax error. I've played around passing Numpy arrays and different formats of condlist and choicelist, but no luck so far.
Other Notes
The Python function behaves as expected, in this case giving some binary oscillations and then zero:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 50, 500)
plt.plot(x, f(x))
For any Sympy aficionados, the overlying problem here is with using Numba to JIT compile a lambda generated via Sympy from sympy.Piecewise. A lambda very similar to f(t) in the above example can be autogenerated by sympy.lambdify on a Piecewise function.

Numba does not currently implement all NumPy functions, and the support is sometimes limited. You can find the list of supported functions in the documentation. For np.select, the documentation states that the support is limited to:
only using homogeneous lists or tuples for the first two arguments, condlist and choicelist. Additionally, these two arguments can only contain arrays (unlike Numpy that also accepts tuples).
The problem is that condlist is not homogeneous: its first three items are arrays while the last is a plain boolean. In addition, choicelist contains integers, whereas it must contain arrays.
One solution to fix this problem is to use the following code:
def f(t):
    condlist = [less(t, 5), less(t, 15), less(t, 20), np.full(t.size, True)]
    all_zeros = np.zeros(t.size)
    all_ones = np.ones(t.size)
    choicelist = [all_ones, all_zeros, all_ones, all_zeros]
    return select(condlist, choicelist, default=nan)
However, please do not use this code, as it is inefficient: it creates many temporary arrays that are slow to allocate and fill. The code will almost certainly be memory bound, and memory throughput has improved only slowly over the last decades (this is known as the "memory wall"). Such code is hard to optimize, and Numba is not faster than NumPy at it. In fact, NumPy already handles this quite efficiently, since it is implemented in C and most of its functions are carefully optimized. Numba is fast when you write loops and avoid creating (useless) temporary arrays. In short: Numba likes loops, unlike NumPy. Here is a much faster solution:
def f(t):
    result = np.empty(t.size)
    for i in range(t.size):
        result[i] = t[i] < 5 or 15 <= t[i] < 20
    return result
Note that using a boolean or a short integer (e.g. int8) output type should be even faster, since floating-point values are slower to compute and take more space in memory.
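If you want to convince yourself that the loop body really matches the original condlist/choicelist, a scalar model in plain Python is enough. The sketch below is not NumPy or Numba code: select_first_match is a hypothetical helper that mimics the "first true condition wins" rule of np.select for a single value, and the sample points are arbitrary.

```python
def select_first_match(conditions, choices, default):
    """Scalar model of np.select: return the choice paired with the first true condition."""
    for cond, choice in zip(conditions, choices):
        if cond:
            return choice
    return default


def piecewise_select(t):
    # Same conditions and choices as the question's f(t), for one scalar t.
    return select_first_match([t < 5, t < 15, t < 20, True], [1, 0, 1, 0], float("nan"))


def piecewise_loop_body(t):
    # The expression used inside the loop of the faster solution.
    return float(t < 5 or 15 <= t < 20)


samples = [i * 0.5 for i in range(101)]  # 0.0, 0.5, ..., 50.0
mismatches = [t for t in samples if piecewise_select(t) != piecewise_loop_body(t)]
print(mismatches)
```

An empty list means the two formulations agree on every sample, including the boundaries at 5, 15 and 20.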

Foobar - Test cases not passing - Disorderly escape

0/10 test cases are passing.
Here is the challenge description:
(to keep formatting nice - i put the description in a paste bin)
Link to challenge description: https://pastebin.com/UQM4Hip9
Here is my trial code - PYTHON - (0/10 test cases passed)
from math import factorial
from collections import Counter
from fractions import gcd
def cycle_count(c, n):
    cc=factorial(n)
    for a, b in Counter(c).items():
        cc//=(a**b)*factorial(b)
    return cc
def cycle_partitions(n, i=1):
    yield [n]
    for i in range(i, n//2 + 1):
        for p in cycle_partitions(n-i, i):
            yield [i] + p
def solution(w, h, s):
    grid=0
    for cpw in cycle_partitions(w):
        for cph in cycle_partitions(h):
            m=cycle_count(cpw, w)*cycle_count(cph, h)
            grid+=m*(s**sum([sum([gcd(i, j) for i in cpw]) for j in cph]))
    return grid//(factorial(w)*factorial(h))
print(solution(2, 2, 2)) #Outputs 7
This code works in my python compiler on my computer but not on the foobar challenge??
Am I losing accuracy or something?

NOT HELPFUL
Just a guess: wrong types of the return values of the functions? str vs. int?
Improved Answer
First a more detailed explanation for the 'type of values' thing:
In python, values are typed also if you do not have to declare a type. E.g.:
>>> i = 7
>>> s = '7'
>>> print(i, s, type(i), type(s), i == s)
7 7 <class 'int'> <class 'str'> False
I do not know the exact tests that are applied to the code, but typically they involve an equality test with ==. If the type of the expected value and the type of the returned value do not match, the equality test fails. Perhaps Elegant ways to support equivalence ("equality") in Python classes might help.
The challenge instructions (see the pastebin) also explicitly state the expected return type: the returned value must be a string (see lines 41 and 47).
In addition, the instructions state that one should 'write a function answer(w, h, s)' (line 6). The posted solution implements a function named solution instead.

I was also stuck on this problem.
The error is caused by a data type mismatch, i.e. your function returns an integer but the question specifically asks for a string.
Wrapping the return statement in str() will solve the problem.

Your return type is an integer. Convert it to a string, and it will work.
return str(grid//(factorial(w)*factorial(h)))
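Putting the answers together, a version of the posted code with the string return type might look like the sketch below. Two things here are assumptions rather than part of the original post: math.gcd is used because fractions.gcd no longer exists in recent Python 3 releases (if the grader runs Python 2, keep the original fractions import), and the variable names have been made more descriptive. The function is still called solution, as in the question; rename it if your version of the challenge asks for answer.

```python
from collections import Counter
from math import factorial, gcd  # assumption: Python 3; fractions.gcd is gone in recent versions


def cycle_count(c, n):
    """Number of permutations of n items whose cycle lengths are the parts in c."""
    cc = factorial(n)
    for length, multiplicity in Counter(c).items():
        cc //= (length ** multiplicity) * factorial(multiplicity)
    return cc


def cycle_partitions(n, start=1):
    """Yield every partition of n into non-decreasing parts, each at least start."""
    yield [n]
    for first in range(start, n // 2 + 1):
        for rest in cycle_partitions(n - first, first):
            yield [first] + rest


def solution(w, h, s):
    total = 0
    for cpw in cycle_partitions(w):
        for cph in cycle_partitions(h):
            m = cycle_count(cpw, w) * cycle_count(cph, h)
            total += m * s ** sum(gcd(i, j) for i in cpw for j in cph)
    # The challenge expects a string, so convert before returning.
    return str(total // (factorial(w) * factorial(h)))


print(solution(2, 2, 2))  # prints 7
```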

multivariable linearization in python: 'Pow' object has no attribute 'sqrt'

As a newcomer to Python, I am trying to linearize the following two-variable function:
[image: function, not reproduced]
using the fairly routine Newton method:
[image: linearization method, not reproduced]
Here is what I've tried so far:
import numpy as np
import math
from sympy import symbols, diff
d = 1.7
def f(arg1, arg2):
    return (arg1 - arg2)/(np.power(np.linalg.norm(arg1 - arg2),2) - np.power(d,2))
def linearize_f(f, arg1, arg2, equi_arg1, equi_arg2):
    arg1, arg2 = symbols('arg1 arg2', real=True)
    der_1 = diff(f(arg1,arg2), arg1)
    der_2 = diff(f(arg1,arg2), arg2)
    constant_term = f(equi_arg1, equi_arg2)
    vars = sympy.symbols('arg1, arg2')
    par_term_1 = sympy.evalf(der_1, subs = dict(zip(vars,[equi_arg1, equi_arg2])))
    par_term_2 = sympy.evalf(der_2, subs = dict(zip(vars,[equi_arg1, equi_arg2])))
    result = constant_term + par_term_1*(arg1-equi_arg1) + par_term_2*(arg2-equi_arg2)
    return result
q0, q1 = symbols('q0 q1', real=True)
result = linearize_f(f,q0,q1,0,0)
print(result)
The interpreter returns a 'Pow' object has no attribute 'sqrt'. However, I've never used any sqrt in my code.
Would you please help me to resolve the case?

You have not called sqrt but np.linalg.norm has. The arg1, arg2 arguments are of type sympy.Symbol. The function expects to get an array-like argument. However, it gets a sympy symbol, which it does not know how to handle.
I looked in the np.linalg source code, and it seems that it checks for some known types and tries to find the square root. Otherwise, it relies on the argument itself to know its own square root. sympy.Symbol has no such thing, and hence the error.
There is no way to avoid this. numpy works with numbers, sympy works with (its own) symbols. You are not supposed to mix them. Most likely sympy will have its own functions for handling its own symbols, but, if not, you are out of luck, unless you add them yourself.
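As a rough illustration of staying entirely inside sympy, the function from the question can be written with ordinary Python operators instead of np.power and np.linalg.norm. This is a sketch under assumptions: the arguments are treated as real scalars (so the squared norm is just the squared difference), d is kept as the exact rational 17/10, and linearize is a hypothetical helper, not a sympy function.

```python
import sympy as sp

d = sp.Rational(17, 10)  # d = 1.7 from the question, kept exact


def f(arg1, arg2):
    diff = arg1 - arg2
    # Assumption: real scalars, so norm(diff)**2 is simply diff**2.
    return diff / (diff**2 - d**2)


def linearize(func, variables, point):
    """First-order Taylor expansion of func around point (hypothetical helper)."""
    expr = func(*variables)
    at_point = dict(zip(variables, point))
    result = expr.subs(at_point)
    for var, value in zip(variables, point):
        result += sp.diff(expr, var).subs(at_point) * (var - value)
    return sp.expand(result)


q0, q1 = sp.symbols("q0 q1", real=True)
linear_form = linearize(f, (q0, q1), (0, 0))
print(linear_form)
```

Because every operation is a sympy operation, diff and subs work as expected and no NumPy function ever sees a symbol.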

I've narrowed your error to this:
q0, q1 = symbols('q0 q1', real=True)
np.linalg.norm(q0 - q1) # Throws the same error
Here's the source code in np.linalg where it threw the error:
# Immediately handle some default, simple, fast, and common cases.
if axis is None:
    ndim = x.ndim
    if ((ord is None) or
        (ord in ('f', 'fro') and ndim == 2) or
        (ord == 2 and ndim == 1)):
        x = x.ravel(order='K')
        if isComplexType(x.dtype.type):
            sqnorm = dot(x.real, x.real) + dot(x.imag, x.imag)
        else:
            sqnorm = dot(x, x)
        ret = sqrt(sqnorm)
        if keepdims:
            ret = ret.reshape(ndim*[1])
        return ret
Apparently, after your sympy object has been processed by dot, it becomes a Pow object, which is a sympy object that np.sqrt does not know how to handle. When NumPy meets an object it does not recognize, it falls back to calling a method named sqrt on the object itself; Pow has no such method, hence the error.
In short, you cannot use NumPy functions on sympy objects.
After more research, it appears that this older question, sympy AttributeError: 'Pow' object has no attribute 'sin', points to the same cause.

Rentry to numpy function returns Error: float object has no attribute exp

I finally figured out why I was getting a "weird" error in a function call. But I do not understand WHY I got the error or how to avoid it in the future.
The error was:
rates = Qi*np.exp(-Di*(days-T0))
AttributeError: 'float' object has no attribute 'exp'
A similar question was asked over 5 years ago (Numpy AttributeError: 'float' object has no attribute 'exp'), and it is related to Different behavior of arithmetics on dtype float 'object' and 'float', but neither answer tells me how to avoid the problem in general.
I am using pandas and numpy. The function is called as follows:
df3['Model_EXP'], df3['Model_EXPcum'] =arps.arps(days = df3[dayscol], Qi=parmexp[0], Di=parmexp[1], T0=parmexp[2])
where df3 is a dataframe, the column df3[dayscol] is a column of type int64, and parmexp is a tuple containing parameters for the function.
The function was originally written for numpy (not pandas) and expects numpy vectors. The start of the function def is as follows:
def arps(days = None, Qi = None, Di = None, Bi = 0., T0=None, DTMIN=-5., QMIN = 1., Oil=True):
    if (not Di or not Qi):
        print("Nominal Decline or Rate not entered in routine arps. Stopping.")
        sys.exit()
    if (np.isnan(np.sum(days))):
        print("Error in days passed to routine arps. Some non-days in vector. Stopping.")
        sys.exit()
    if (Di < 0.):
        Di = -1.*Di # just in case someone got their signs wrong
    #
    rates = np.zeros(len(days))
    cums = np.zeros(len(days))
    dtval = np.zeros(len(days))
    if (T0 == None):
        T0 = days[0]
    if (Bi == 0.): # Exponential Decline
        rates = Qi*np.exp(-Di*(days-T0)) # THIS IS WHERE THINGS GET INTERESTING
I have called this function multiple times in my application. It had always worked. I added another function to call this routine with another dataframe. However, even though this works during the same execution, I started getting the following error:
File "...Local\conda\conda\envs\py36\lib\tkinter\__init__.py", line 1699, in __call__
return self.func(*args)
File "decline_curve_analysis.py", line 993, in predpred
df3['Model_EXP'], df3['Model_EXPcum'] =arps.arps(days = df3[dayscol], Qi=parmexp[0], Di=parmexp[1], T0=parmexp[2])
File "...\PetroPy\DeclineCurves.py", line 110, in arps
rates = Qi*np.exp(-Di*(days-T0))
AttributeError: 'float' object has no attribute 'exp'
I didn't understand why, so I printed out the dataframe just before the function call. The "dayscol" column had different values, but in both the working and non-working cases the days vector (the df3[dayscol] column) is int64. However, when I made the following change in the called function:
print("Type of days",days.dtype)
newexp = (-Di*(days-T0))
print("Type of newexp",newexp.dtype)
rates = Qi*np.exp(-Di*(days-T0)) # THIS IS WHERE THINGS GET INTERESTING
I discovered that I got the following response on the calls which worked:
Type of days int64
Type of newexp float64
but got the following for those which failed:
Type of days int64
Type of newexp object
which I've now solved by rewriting my original function to be:
newexp = (-Di*(days-T0))
newexp = newexp.astype(np.float64)
rates = Qi*np.exp(newexp) # THIS IS WHERE THINGS GET INTERESTING
QUESTION:
Is there a way I can ensure that this doesn't occur again? It's taken me over a day to debug this, and I'm concerned because I have a lot of non-pandas functions that I don't want to fail when I use them in other projects. Is this even related to the way the function is called?
Thanks.

The element type (dtype) of a NumPy array cannot be fixed in advance by the function that receives it.
Python is dynamically, though strongly, typed (see Is Python strongly typed?).
That is one of its main strengths and also one of its most frustrating weaknesses.
There is no real way around it in your case. In some cases you can force a parameter type (see Force a function parameter type in Python?), but you cannot specify the underlying element type of a numpy.ndarray that way.
What is troubling is
It's taken me over a day to debug this
The error should tip you off to check the types on line 110 directly:
File "...\PetroPy\DeclineCurves.py", line 110, in arps
rates = Qi * np.exp(-Di*(days-T0))
AttributeError: 'float' object has no attribute 'exp'
Granted, with big data and slow loading times this is harder to spot. Still, check the input types wherever your code calls transcendental functions.
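One defensive pattern, sketched below, is to coerce the inputs to float64 at the top of any routine that calls NumPy's transcendental functions, so an object-dtype array can never reach np.exp. This is an illustration, not the original arps routine: exponential_rates is a hypothetical stand-in for its exponential branch, and the object-dtype array merely imitates the failing case (the post does not say how the object dtype arose).

```python
import numpy as np


def exponential_rates(days, Qi, Di, T0):
    """Hypothetical stand-in for the exponential branch of arps."""
    # Coerce everything up front so NumPy ufuncs never see dtype=object.
    days = np.asarray(days, dtype=np.float64)
    Qi, Di, T0 = float(Qi), float(Di), float(T0)
    return Qi * np.exp(-Di * (days - T0))


# An object-dtype input, imitating the case that failed in the question.
object_days = np.array([0, 1, 2], dtype=object)
rates = exponential_rates(object_days, Qi=100.0, Di=0.1, T0=0)
print(rates.dtype)
```

The conversion accepts a pandas Series as well as a NumPy array, and it raises a clear error at the function boundary if a value cannot be interpreted as a number.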

CudaAPIError 716 when trying to copy data from gpu

I'm learning Numba and CUDA Python. I've been following a set of YouTube tutorials and have (I believe) understood the principles. My issue is with copying computed values back from my GPU. I use the following line to do this:
aVals = retVal.copy_to_host()
I've also tried using this line:
retVal.copy_to_host( aVals[:] )
Neither works and both give the same error:
numba.cuda.cudadrv.driver.CudaAPIError: [716] Call to cuMemcpyDtoH results in UNKNOWN_CUDA_ERROR
I'm reasonably confident the above lines are the issue as if I comment out the line the code runs without errors. Is there some underlying issue I'm overlooking with copying an array from GPU to CPU? Have I screwed up my arrays somewhere?
There's a lot of messing around in my code but here's a bare bones version:
import numpy as np
import time
from math import sin, cos, tan, sqrt, pi, floor
from numba import vectorize, cuda
@cuda.jit('void(double[:],double[:],double[:],double)')
def CalculatePreValues(retVal,ecc, incl, ke):
    i= cuda.grid(1)
    if i >= ecc.shape[0]:
        return
    retVal[i] = (ke/ecc[i])**(2/3)
def main():
    eccen = np.ones(num_lines, dtype=np.float32)
    inclin = np.ones(num_lines, dtype=np.float32)
    ke = 0.0743669161
    aVals = np.zeros(eccen.shape[0])
    start = time.time()
    retVal = cuda.device_array(aVals.shape[0])
    ecc = cuda.to_device(eccen)
    inc = cuda.to_device(inclin)
    threadsPerBlock = 256
    numBlocks = int((ecc.shape[0]+threadsPerBlock-1)/threadsPerBlock)
    CalculatePreValues[numBlocks, threadsPerBlock](retVal,ecc,inc)
    aVals = retVal.copy_to_host()
    preCalcTime = time.time() - start
    print ("Precalculation took % seconds" % preCalcTime)
    print (aVals.shape[0])
if __name__=='__main__':
    main()

There are several points to make here.
Firstly, the source of the error you are seeing is a runtime error coming from the kernel execution. If I run a hacky "fixed" version of your code using cuda-memcheck, I see this:
$ cuda-memcheck python ./error.py
========= CUDA-MEMCHECK
========= Invalid __global__ read of size 8
========= at 0x00000178 in cudapy::__main__::CalculatePreValues$241(Array<double, int=1, A, mutable, aligned>, Array<double, int=1, A, mutable, aligned>, Array<double, int=1, A, mutable, aligned>)
========= by thread (255,0,0) in block (482,0,0)
========= Address 0x7061317f8 is out of bounds
The reason is that the bounds checking in your kernel is broken:
    if i > ecc.shape[0]:
        return
should be
    if i >= ecc.shape[0]:
        return
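The off-by-one is easy to see by simulating the thread indices on the CPU. The sketch below is plain Python, not a kernel: threads_passing_guard is a hypothetical helper, and n = 1000 with 256 threads per block are arbitrary example values.

```python
def threads_passing_guard(n, threads_per_block, strict):
    """Return the global thread indices that get past the bounds check.

    strict=True models the broken guard `if i > n: return`,
    strict=False models the corrected guard `if i >= n: return`.
    """
    blocks = (n + threads_per_block - 1) // threads_per_block
    total_threads = blocks * threads_per_block
    passing = []
    for i in range(total_threads):
        out_of_range = (i > n) if strict else (i >= n)
        if not out_of_range:
            passing.append(i)
    return passing


n = 1000
broken = threads_passing_guard(n, 256, strict=True)
fixed = threads_passing_guard(n, 256, strict=False)
print(max(broken), max(fixed))  # prints 1000 999
```

With the broken guard, the thread with index n gets through and touches element n, which is one past the end of an array of length n.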
When the question was updated to include a MCVE, it became evident that there was another issue. The kernel signature specifies double for all the arrays:
@cuda.jit('void(double[:],double[:],double[:],double)')
                ^^^^^^    ^^^^^^    ^^^^^^
but the type of arrays created were actually float (i.e. np.float32):
eccen = np.ones(num_lines, dtype=np.float32)
inclin = np.ones(num_lines, dtype=np.float32)
                                  ^^^^^^^^^^
This is a mismatch. Reading the arrays as 8-byte double elements when they were created with 4-byte float values will likely cause out-of-bounds accesses.
The solution is to create the arrays with dtype=np.float64, or else to change the arrays in the signature to float:
@cuda.jit('void(float[:],float[:],float[:],double)')
to eliminate the out-of-bounds accesses.
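The size mismatch can be illustrated on the CPU with the standard library alone. This sketch only demonstrates the byte arithmetic; it does not reproduce what Numba does internally, and num_lines = 8 is an arbitrary example value.

```python
from array import array

num_lines = 8  # arbitrary example size

# A buffer of single-precision values, like np.ones(num_lines, dtype=np.float32).
float32_values = array("f", [1.0] * num_lines)

# Reinterpret the same bytes as double-precision elements.
raw_bytes = memoryview(float32_values).cast("B")
as_doubles = raw_bytes.cast("d")

print(len(raw_bytes), len(as_doubles))
```

The buffer holds only half as many doubles as floats, so a kernel that indexes it as num_lines doubles would read past the end of the allocation for the upper half of the indices.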
