I am trying to write my first function using Numba's jit. I have a pandas DataFrame that I need to iterate through to compute the root mean square over each window of 350 points. Since Python's for loop is quite slow, I decided to try Numba. The code is:
@jit(nopython=True)
def find_rms(data, length):
    res = []
    for i in range(length, len(data)):
        interval = np.array(data[i-length:i])
        interval =np.power(interval, 2)
        sum = interval.sum()
        resI = sum/length
        resI = np.sqrt(res)
        res.appennd(resI)
    return res
mydf = np.array(df.iloc[:]['c0'], dtype=np.float64)
df.iloc[350:]['rms'] = find_rms(mydf, 350)
I read somewhere that I need to specify data types, so I wrote "dtype=np.float64", but I still get the following error:
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
<ipython-input-39-4d388f72efdc> in <module>
----> 1 df.iloc[350:]['rms'] = find_rms(mydf, 350.0)
c:\users\1\appdata\local\programs\python\python35\lib\site-packages\numba\dispatcher.py in _compile_for_args(self, *args, **kws)
346 e.patch_message(msg)
347
--> 348 error_rewrite(e, 'typing')
349 except errors.UnsupportedError as e:
350 # Something unsupported is present in the user code, add help info
c:\users\1\appdata\local\programs\python\python35\lib\site-packages\numba\dispatcher.py in error_rewrite(e, issue_type)
313 raise e
314 else:
--> 315 reraise(type(e), e, None)
316
317 argtypes = []
c:\users\1\appdata\local\programs\python\python35\lib\site-packages\numba\six.py in reraise(tp, value, tb)
656 value = tp()
657 if value.__traceback__ is not tb:
--> 658 raise value.with_traceback(tb)
659 raise value
660
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of Function(<built-in function array>) with argument(s) of type(s): (array(float64, 1d, C))
* parameterized
In definition 0:
TypingError: array(float64, 1d, C) not allowed in a homogeneous sequence
raised from c:\users\1\appdata\local\programs\python\python35\lib\site-packages\numba\typing\npydecl.py:463
In definition 1:
TypingError: array(float64, 1d, C) not allowed in a homogeneous sequence
raised from c:\users\1\appdata\local\programs\python\python35\lib\site-packages\numba\typing\npydecl.py:463
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function(<built-in function array>)
[2] During: typing of call at <ipython-input-34-edd252715b2d> (5)
File "<ipython-input-34-edd252715b2d>", line 5:
def find_rms(data, length):
<source elided>
for i in range(length, len(data)):
interval = np.array(data[i-length:i])
^
This is not usually a problem with Numba itself but instead often caused by
the use of unsupported features or an issue in resolving types.
To see Python/NumPy features supported by the latest release of Numba visit:
http://numba.pydata.org/numba-doc/dev/reference/pysupported.html
and
http://numba.pydata.org/numba-doc/dev/reference/numpysupported.html
For more information about typing errors and how to debug them visit:
http://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile
If you think your code should work with Numba, please report the error message
and traceback, along with a minimal reproducer at:
https://github.com/numba/numba/issues/new
Does anybody know what the problem is?
You have a typo in append (appennd), and I think you also take the square root of the wrong variable (it should be resI, not res).
Other than that, the only problem is the initialization of interval. In nopython mode, Numba rejects np.array() called on something that is already a NumPy array, which is what the "not allowed in a homogeneous sequence" message refers to. Wrapping the slice in np.array adds nothing, because a slice of an array is already an array. Plain Python accepts the redundant call, but Numba in nopython mode raises a TypingError. Removing the wrapper solves the problem.
import numpy as np
from numba import jit

@jit(nopython=True)
def find_rms(data, length):
    res = []
    for i in range(length, len(data)):
        # A slice of a NumPy array is already an array; no np.array() wrapper needed
        interval = data[i-length:i]
        interval = np.power(interval, 2)
        sum = interval.sum()
        resI = sum/length
        resI = np.sqrt(resI)
        res.append(resI)
    return res

mydf = np.array(df.iloc[:]['c0'], dtype=np.float64)
target = find_rms(mydf, 350)
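The loop above recomputes the sum of squares for every window. As an alternative sketch (not the approach from the answer), a running sum of squares can be updated as the window slides, so each step costs O(1) instead of O(length). The name rolling_rms is hypothetical, and the example uses only the standard library so that it runs without Numba or NumPy:

```python
import math


def rolling_rms(data, length):
    """RMS of each window data[i-length:i] for i in range(length, len(data)).

    Hypothetical helper: same windows as find_rms, but with a running sum.
    """
    out = []
    if length <= 0 or len(data) <= length:
        return out

    # Sum of squares of the first window, data[0:length].
    sq_sum = 0.0
    for x in data[:length]:
        sq_sum += x * x

    for i in range(length, len(data)):
        # max() guards against a tiny negative value caused by rounding drift.
        out.append(math.sqrt(max(sq_sum, 0.0) / length))
        # Slide the window: add data[i], drop data[i - length].
        sq_sum += data[i] * data[i] - data[i - length] * data[i - length]
    return out


print(rolling_rms([3.0, 4.0, 0.0, 0.0], 2))
```

The running sum can accumulate floating-point error over very long inputs, so comparing it against the direct computation on a sample of windows is a reasonable sanity check.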
I am attempting to compile what is effectively a piecewise function using numba.njit. The Python function is defined as follows, using Numpy:
(for anyone interested in the Sympy origins of this issue, see the Notes below.)
Minimal Example
from numpy import select, less, nan
def f(t):
    condlist = [less(t, 5), less(t, 15), less(t, 20), True]
    choicelist = [1, 0, 1, 0]
    return select(condlist, choicelist, default=nan)
See below for confirmation that this function works in Python.
The Issue: However, Numba fails to JIT this function in nopython mode:
from numba import njit
jit_f = njit(f)
x = np.linspace(0, 50, 500)
jit_f(x)
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
Input In [86], in <cell line: 5>()
2 jit_f = njit(f)
4 x = np.linspace(0,50,500)
----> 5 jit_f(x)
File /usr/local/lib/python3.8/site-packages/numba/core/dispatcher.py:468, in _DispatcherBase._compile_for_args(self, *args, **kws)
464 msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
465 f"by the following argument(s):\n{args_str}\n")
466 e.patch_message(msg)
--> 468 error_rewrite(e, 'typing')
469 except errors.UnsupportedError as e:
470 # Something unsupported is present in the user code, add help info
471 error_rewrite(e, 'unsupported_error')
File /usr/local/lib/python3.8/site-packages/numba/core/dispatcher.py:409, in _DispatcherBase._compile_for_args.<locals>.error_rewrite(e, issue_type)
407 raise e
408 else:
--> 409 raise e.with_traceback(None)
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function select at 0x105f4d310>) found for signature:
>>> select(LiteralList((array(bool, 1d, C), array(bool, 1d, C), array(bool, 1d, C), Literal[bool](True))), list(int64)<iv=[1, 0, 1, 0]>, default=float64)
There are 2 candidate implementations:
- Of which 1 did not match due to:
Overload in function 'np_select': File: numba/np/arraymath.py: Line 4358.
With argument(s): '(Poison<LiteralList((array(bool, 1d, C), array(bool, 1d, C), array(bool, 1d, C), Literal[bool](True)))>, list(int64)<iv=None>, default=float64)':
Rejected as the implementation raised a specific error:
TypingError: Poison type used in arguments; got Poison<LiteralList((array(bool, 1d, C), array(bool, 1d, C), array(bool, 1d, C), Literal[bool](True)))>
raised from /usr/local/lib/python3.8/site-packages/numba/core/types/functions.py:236
- Of which 1 did not match due to:
Overload in function 'np_select': File: numba/np/arraymath.py: Line 4358.
With argument(s): '(LiteralList((array(bool, 1d, C), array(bool, 1d, C), array(bool, 1d, C), Literal[bool](True))), list(int64)<iv=[1, 0, 1, 0]>, default=float64)':
Rejected as the implementation raised a specific error:
NumbaTypeError: condlist must be a List or a Tuple
raised from /usr/local/lib/python3.8/site-packages/numba/np/arraymath.py:4375
During: resolving callee type: Function(<function select at 0x105f4d310>)
During: typing of call at /var/folders/dt/q6vbs0g56s70g4p2kyfj4tvh0000gn/T/ipykernel_61924/130570246.py (5)
File "../../../../../../../../../var/folders/dt/q6vbs0g56s70g4p2kyfj4tvh0000gn/T/ipykernel_61924/130570246.py", line 5:
<source missing, REPL/exec in use?>
I'm not a Numba expert, but my feeling is that there is some syntax error. I've played around passing Numpy arrays and different formats of condlist and choicelist, but no luck so far.
Other Notes
The Python function behaves as expected, in this case giving some binary oscillations and then zero:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 50, 500)
plt.plot(x, f(x))
For any Sympy aficionados, the overlying problem here is with using Numba to JIT compile a lambda generated via Sympy from sympy.Piecewise. A lambda very similar to f(t) in the above example can be autogenerated by sympy.lambdify on a Piecewise function.
Numba does not currently implement every NumPy function, and where it does, support is sometimes limited. You can find the list of supported functions in the documentation. For np.select, the documentation states that support is limited to:
only using homogeneous lists or tuples for the first two arguments, condlist and choicelist. Additionally, these two arguments can only contain arrays (unlike Numpy that also accepts tuples).
The problem is that condlist is not homogeneous: its first three items are arrays, while the last is a plain boolean. In addition, choicelist contains integers, whereas it must contain arrays.
One solution to fix this problem is to use the following code:
def f(t):
    condlist = [less(t, 5), less(t, 15), less(t, 20), np.full(t.size, True)]
    all_zeros = np.zeros(t.size)
    all_ones = np.ones(t.size)
    choicelist = [all_ones, all_zeros, all_ones, all_zeros]
    return select(condlist, choicelist, default=nan)
However, please do not use this code, as it is inefficient: it creates many temporary arrays that are slow to allocate and fill. The code will almost certainly be memory-bound, and memory bandwidth has improved only slowly over the last decades (the so-called "memory wall"). Such code is hard to optimize, and Numba is not faster than NumPy here. In fact, NumPy already handles this efficiently, since it is implemented in C and most of its functions are carefully optimized. Numba is fast when you write loops and avoid creating (useless) temporary arrays. In short: Numba likes loops, unlike NumPy. Here is a much faster solution:
def f(t):
    result = np.empty(t.size)
    for i in range(t.size):
        result[i] = t[i] < 5 or 15 <= t[i] < 20
    return result
Note that using a boolean or a small integer (e.g. int8) output type should be even faster (floating-point numbers are slower to compute and take more space in memory).
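If Numba is not a requirement, the same piecewise rule can also be written as a single boolean mask in plain NumPy, without np.select. This is only a sketch, assuming the output needs to be just 0 or 1; f_mask is a hypothetical name. It still allocates temporary arrays, so it is a NumPy-only alternative rather than a replacement for the loop:

```python
import numpy as np


def f_mask(t):
    """Return 1 where t < 5 or 15 <= t < 20, else 0, as int8 (hypothetical helper)."""
    t = np.asarray(t, dtype=np.float64)
    # Same rule as the piecewise function: 1 on (-inf, 5) and [15, 20), 0 elsewhere.
    mask = (t < 5) | ((t >= 15) & (t < 20))
    return mask.astype(np.int8)


x = np.linspace(0, 50, 500)
print(f_mask(x)[:10])
```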
I need to define a dictionary with integers as keys and arrays of float64 as values. In Python I can define it with:
import numpy as np
d = {3: np.array([0, 1, 2, 3, 4])}
To create the same type of dictionary in a Numba-compiled function I do
import numba
@numba.njit()
def generate_d():
    d = Dict.empty(types.int64, types.float64[:])
    return d
but I get an error at compile time.
I don't understand why it errors, given the very simple instructions.
This is the error when I run generate_d():
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
/tmp/ipykernel_536115/3907784652.py in <module>
----> 1 generate_d()
~/envs/oasis/lib/python3.8/site-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
466 e.patch_message(msg)
467
--> 468 error_rewrite(e, 'typing')
469 except errors.UnsupportedError as e:
470 # Something unsupported is present in the user code, add help info
~/envs/oasis/lib/python3.8/site-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
407 raise e
408 else:
--> 409 raise e.with_traceback(None)
410
411 argtypes = []
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function getitem>) found for signature:
>>> getitem(class(float64), slice<a:b>)
There are 22 candidate implementations:
- Of which 22 did not match due to:
Overload of function 'getitem': File: <numerous>: Line N/A.
With argument(s): '(class(float64), slice<a:b>)':
No match.
During: typing of intrinsic-call at /tmp/ipykernel_536115/3046996983.py (4)
During: typing of static-get-item at /tmp/ipykernel_536115/3046996983.py (4)
File "../../../../tmp/ipykernel_536115/3046996983.py", line 4:
<source missing, REPL/exec in use?>
I get the same error even if I specify the signature explicitly:
@numba.njit("float64[:]()")
def generate_d():
    d = Dict.empty(types.int64, types.float64[:])
    return d
I'm using numba v 0.55.1, numpy 1.20.3
How can I get this to work?
As far as I know type expressions are not supported in JIT functions yet (Numba version 0.54.1). You need to create the type outside the function. Here is an example:
import numba
from numba.typed import Dict
# Type defined outside the JIT function
FloatArrayType = numba.types.float64[:]
@numba.njit
def generate_d():
    d = Dict.empty(numba.types.int64, FloatArrayType) # <-- and used here
    return d
The overall problem that I am trying to solve is to develop code which accepts string equations from user input or files, parses the equations, and solves the equations given a valid set of known values for variables. The approach must allow the user to enter a thermophysical function (such as CoolProp's PropsSI or HAPropsSI) in equation(s), and ideally, any user-defined function or object. Based on initial work I thought Sympy was a way to go.
Therefore, I have been trying to understand how to sympify a numerical function for use in systems of equations in Sympy.
The function is HAPropsSI from the CoolProp library. The CoolProp functions are implemented in C++ and wrapped for use in Python. The library is not built on NumPy per se, but it is vectorized to accept 1D NumPy arrays in addition to ints, floats, and lists.
Here is an example of what I tried:
from CoolProp.HumidAirProp import HAPropsSI
from sympy import symbols, sympify
# Example calculating enthalpy as a function of temp., pressure, % RH:
T = 298.15
P = 101325
RH = 0.5
h = HAPropsSI("H", "T", T, "P", P, "R", RH)
print(h) # returns the float value h = 50423.45
# Example using Sympy:
Temp, Press, RH = symbols('Temp Press RH')
sym_h = sympify('HAPropsSI("H", "T", Temp, "P", Press, "R", RH)', {'HAPropsSI':HAPropsSI})
Sympify tries to parse the expression and then use eval on the function with symbols which results in the following traceback:
ValueError Traceback (most recent call last)
ValueError: Error from parse_expr with transformed code: 'HAPropsSI ("H","T",Symbol (\'Temp\' ),"P",Symbol (\'Press\' ),"R",Symbol (\'RH\' ))'
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
C:\Users\JIMCAR~1\AppData\Local\Temp/ipykernel_3076/1321321868.py in <module>
12
13 Temp, Press, RH = symbols('Temp Press RH')
---> 14 sym_h = sympify('HAPropsSI("H", "T", Temp, "P", Press, "R", RH)', {'HAPropsSI':HAPropsSI})
15
16 '''
~\AppData\Roaming\Python\Python38\site-packages\sympy\core\sympify.py in sympify(a, locals, convert_xor, strict, rational, evaluate)
470 try:
471 a = a.replace('\n', '')
--> 472 expr = parse_expr(a, local_dict=locals, transformations=transformations, evaluate=evaluate)
473 except (TokenError, SyntaxError) as exc:
474 raise SympifyError('could not parse %r' % a, exc)
~\AppData\Roaming\Python\Python38\site-packages\sympy\parsing\sympy_parser.py in parse_expr(s, local_dict, transformations, global_dict, evaluate)
1024 for i in local_dict.pop(None, ()):
1025 local_dict[i] = None
-> 1026 raise e from ValueError(f"Error from parse_expr with transformed code: {code!r}")
1027
1028
~\AppData\Roaming\Python\Python38\site-packages\sympy\parsing\sympy_parser.py in parse_expr(s, local_dict, transformations, global_dict, evaluate)
1015
1016 try:
-> 1017 rv = eval_expr(code, local_dict, global_dict)
1018 # restore neutral definitions for names
1019 for i in local_dict.pop(None, ()):
~\AppData\Roaming\Python\Python38\site-packages\sympy\parsing\sympy_parser.py in eval_expr(code, local_dict, global_dict)
909 Generally, ``parse_expr`` should be used.
910 """
--> 911 expr = eval(
912 code, global_dict, local_dict) # take local objects in preference
913 return expr
<string> in <module>
CoolProp\HumidAirProp.pyx in CoolProp.CoolProp.HAPropsSI()
CoolProp\HumidAirProp.pyx in CoolProp.CoolProp.HAPropsSI()
TypeError: Numerical inputs to HAPropsSI must be ints, floats, lists, or 1D numpy arrays.
An example application would be to create an equation and solve for an unknown (Press, Temp, or RH) given the value of h:
eqn = Eq(sym_h, 50423.45)
nsolve(eqn, Press, 1e5)
What I am trying to accomplish is not so different from:
Python: Using sympy.sympify to perform a safe eval() on mathematical functions
Though I admit I am unclear on the details of the subclassing.
Thanks for any insights.
I am trying to use Numba to write CUDA kernels for my code. I want to use an atomic operation in part of it, so I wrote a test kernel to see how cuda.atomic.compare_and_swap works. The documentation describes it as follows:
[image: screenshot of the cuda.atomic.compare_and_swap documentation]
from numba import cuda
import numpy as np
@cuda.jit
def atomicCAS(N,out1):
    idx = cuda.threadIdx.x + cuda.blockIdx.x * cuda.blockDim.x
    if idx >= N:
        return
    A = out1[idx:]
    cuda.atomic.compare_and_swap(A,idx,0)
N = 1024
out1 = np.arange(N)
out1 = np.zeros(N)
dout1 = cuda.to_device(out1)
tpb = 32
bpg = int(np.ceil(N/tpb))
atomicCAS[bpg,tpb](N,dout1)
hout1 = dout1.copy_to_host()
Then I got this error:
TypingError: Invalid use of Function(<class 'numba.cuda.stubs.atomic.compare_and_swap'>) with argument(s) of type(s): (array(float64, 1d, A), int64, Literal[int](0))
* parameterized
In definition 0:
All templates rejected with literals.
In definition 1:
All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function(<class 'numba.cuda.stubs.atomic.compare_and_swap'>)
[2] During: typing of call at /home/qinyu/test.py (20)
This is pretty naive code, and I think I am passing the right types of variables, but I still get this TypingError. The other atomic operations in Numba worked fine for me; this is the only one that does not. Can somebody help me figure out the problem, or suggest an alternative way to do this? Thanks!
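The key part of the error message is the list of argument types:
(array(float64, 1d, A), int64, Literal[int](0))
CUDA atomicCAS only supports integer types, so you cannot pass a floating-point array. Here np.zeros(N) creates a float64 array by default, which is why the call is rejected.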
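To make the operation itself concrete, here is a plain-Python model of what a compare-and-swap does. It assumes the semantics described in the Numba documentation: the operation acts on the first element of the array and returns the value that was stored before. The function compare_and_swap_model is hypothetical, runs on the CPU, and is not atomic; it only illustrates the logic:

```python
def compare_and_swap_model(array, old, val):
    """Non-atomic CPU model of compare-and-swap on array[0] (illustrative only).

    If array[0] equals `old`, store `val` there. Always return the previous value.
    """
    previous = array[0]
    if previous == old:
        array[0] = val
    return previous


# Integer data, mirroring the integer-only restriction of the real operation.
cells = [5, 9]
print(compare_and_swap_model(cells, 5, 7), cells)  # swap happens: 5 matched
print(compare_and_swap_model(cells, 5, 1), cells)  # no swap: cells[0] is now 7
```

In the kernel from the question, the corresponding change would presumably be to allocate the array with an integer dtype, for example np.zeros(N, dtype=np.int32), before copying it to the device.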
Using 64-bit Python 3.3.1 and 32GB RAM and this function to generate target expression 1+1/(2+1/(2+1/...)):
def sqrt2Expansion(limit):
    term = "1+1/2"
    for _ in range(limit):
        i = term.rfind('2')
        term = term[:i] + '(2+1/2)' + term[i+1:]
    return term
I'm getting MemoryError when calling:
simplify(sqrt2Expansion(100))
Shorter expressions work fine, e.g:
simplify(sqrt2Expansion(50))
Is there a way to configure SymPy to complete this calculation? Below is the error message:
MemoryError Traceback (most recent call last)
<ipython-input-90-07c1e2de29d1> in <module>()
----> 1 simplify(sqrt2Expansion(100))
C:\Python33\lib\site-packages\sympy\simplify\simplify.py in simplify(expr, ratio, measure)
2878 from sympy.functions.special.bessel import BesselBase
2879
-> 2880 original_expr = expr = sympify(expr)
2881
2882 expr = signsimp(expr)
C:\Python33\lib\site-packages\sympy\core\sympify.py in sympify(a, locals, convert_xor, strict, rational)
176 try:
177 a = a.replace('\n', '')
--> 178 expr = parse_expr(a, locals or {}, rational, convert_xor)
179 except (TokenError, SyntaxError):
180 raise SympifyError('could not parse %r' % a)
C:\Python33\lib\site-packages\sympy\parsing\sympy_parser.py in parse_expr(s, local_dict, rationalize, convert_xor)
161
162 code = _transform(s.strip(), local_dict, global_dict, rationalize, convert_xor)
--> 163 expr = eval(code, global_dict, local_dict) # take local objects in preference
164
165 if not hit:
MemoryError:
EDIT:
I wrote a version using sympy expressions instead of strings:
def sqrt2Expansion(limit):
    x = Symbol('x')
    term = 1+1/x
    for _ in range(limit):
        term = term.subs({x: (2+1/x)})
    return term.subs({x: 2})
It runs better: sqrt2Expansion(100) returns a valid result, but sqrt2Expansion(200) produces a RuntimeError with many pages of traceback and hangs the IPython interpreter, with plenty of system memory left unused. I created a new question, "Long expression crashes SymPy", for this issue.
SymPy uses eval along the way to turn your string into a SymPy object, and eval uses the built-in Python parser, which has a fixed maximum stack size that deeply nested parentheses exhaust. This isn't really a SymPy issue.
For example, for me:
>>> eval("("*100+'3'+")"*100)
s_push: parser stack overflow
Traceback (most recent call last):
File "<ipython-input-46-1ce3bf24ce9d>", line 1, in <module>
eval("("*100+'3'+")"*100)
MemoryError
Short of modifying MAXSTACK in Parser.h and recompiling Python with a different limit, probably the best way to get where you're headed is to avoid using strings in the first place. [I should mention that the PyPy interpreter can make it up to ~1100 for me.]
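One way to avoid strings is to build the continued fraction numerically from the innermost term outward. The sketch below uses the standard library's fractions.Fraction rather than SymPy, a substitution made for this example and not something from the original post. The name sqrt2_expansion is hypothetical; sympy.Rational could presumably be used the same way if a SymPy object is needed:

```python
from fractions import Fraction


def sqrt2_expansion(limit):
    """Evaluate 1 + 1/(2 + 1/(2 + ...)) exactly, with `limit` levels of nesting.

    Matches the string version: limit=0 corresponds to "1+1/2".
    """
    inner = Fraction(2)           # innermost "2"
    for _ in range(limit):
        inner = 2 + 1 / inner     # wrap one more "(2+1/...)" around it
    return 1 + 1 / inner


approx = sqrt2_expansion(100)
print(approx)
print(float(approx))
```

Because no string is parsed, the nesting depth never reaches the parser, and the loop runs in time linear in limit.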