Line Profiling inner function with Cython - python

I've had pretty good success using this answer to profile my Cython code, but it doesn't seem to work properly with nested functions. In this notebook you can see that the profile doesn't appear when the line profiler is used on a nested function. Is there a way to get this to work?

tl,dr:
This is seems to be an issue with Cython, there's a hackish way that does the trick but isn't reliable, you could use it for one-off cases until this issue has been fixed*
Change the line_profiler source:
I can't be 100% sure for this but it is working, what you need to do is download the source for line_profiler and go fiddle around in python_trace_callback. After the code object is obtained from the current frame of execution (code = <object>py_frame.f_code), add the following:
if what == PyTrace_LINE or what == PyTrace_RETURN:
code = <object>py_frame.f_code
# Add entry for code object with different address if and only if it doesn't already
# exist **but** the name of the function is in the code_map
if code not in self.code_map and code.co_name in {co.co_name for co in self.code_map}:
for co in self.code_map:
# make condition as strict as necessary
cond = co.co_name == code.co_name and co.co_code == code.co_code
if cond:
del self.code_map[co]
self.code_map[code] = {}
This will replace the code object in self.code_map with the one currently executing that matches its name and co.co_code contents. co.co_code is b'' for Cython, so in essence in matches Cython functions with that name. Here is where it can become more robust and match more attributes of a code object (for example, the filename).
You can then procceed to build it with python setup.py build_ext and install with sudo python setup.py install. I'm currently building it with python setup.py build_ext --inplace in order to work with it locally, I'd suggest you do too. If you do build it with --inplace make sure you navigate to the folder containing the source for line_profiler before importing it.
So, in the folder containing the built shared library for line_profiler I set up a cyclosure.pyx file containing your functions:
def outer_func(int n):
def inner_func(int c):
cdef int i
for i in range(n):
c+=i
return c
return inner_func
And an equivalent setup_cyclosure.py script in order to build it:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
from Cython.Compiler.Options import directive_defaults
directive_defaults['binding'] = True
directive_defaults['linetrace'] = True
extensions = [Extension("cyclosure", ["cyclosure.pyx"], define_macros=[('CYTHON_TRACE', '1')])]
setup(name = 'Testing', ext_modules = cythonize(extensions))
As previously, the build was performed with python setup_cyclosure.py build_ext --inplace.
Launching your interpreter from the current folder and issuing the following yields the wanted results:
>>> import line_profiler
>>> from cyclosure import outer_func
>>> f = outer_func(5)
>>> prof = line_profiler.LineProfiler(f)
>>> prof.runcall(f, 5)
15
>>> prof.print_stats()
Timer unit: 1e-06 s
Total time: 1.2e-05 s
File: cyclosure.pyx
Function: inner_func at line 2
Line # Hits Time Per Hit % Time Line Contents
==============================================================
2 def inner_func(int c):
3 cdef int i
4 1 5 5.0 41.7 for i in range(n):
5 5 6 1.2 50.0 c+=i
6 1 1 1.0 8.3 return c
Issue with IPython %%cython:
Trying to run this from IPython results in an unfortunate situation. While executing, the code object doesn't store the path to the file where it was defined, it simply stored the filename. Since I simply drop the code object into the self.code_map dictionary and since code objects have read-only Attributes, we lose the file path information when using it from IPython (because it stores the files generated from %%cython in a temporary directory).
Because of that, you do get the profiling statistics for your code but you get no contents for the contents. One might be able to forcefully copy the filenames between the two code objects in question but that's another issue altogether.
*The Issue:
The issue here is that for some reason, when dealing with nested and/or enclosed functions, there's an abnormality with the address of the code object when it is created and while it is being interpreted in one of Pythons frames. The issue you were facing was caused by the following condition not being satisfied:
if code in self.code_map:
Which was odd. Creating your function in IPython and adding it to the LineProfiler did indeed add it to the self.code_map dictionary:
prof = line_profiler.LineProfiler(f)
prof.code_map
Out[16]: {<code object inner_func at 0x7f5c65418f60, file "/home/jim/.cache/ipython/cython/_cython_magic_1b89b9cdda195f485ebb96a104617e9c.pyx", line 2>: {}}
When the time came to actually test the previous condition though, and the current code object was snatched from the current execution frame with code = <object>py_frame.f_code, the address of the code object was different:
# this was obtained with a basic print(code) in _line_profiler.pyx
code object inner_func at 0x7f7a54e26150
indicating it was re-created. This only happens with Cython and when a function is defined inside another function. Either this or something that I am completely missing.

Related

What does it mean to "initialize the Julia runtime" when exporting compiled .dll or .so files for use in other langauges?

I'm trying to compile a usable .dll file from Julia to be used in Python as I've already written a large GUI in Python and need some fast optimization work done. Normally I would just call PyJulia or some "live" call, however this program needs to be compiled to distribute within my research team, so whatever solution I end up with needs to be able to run on its own (without Julia or Python actually installed).
Right now I'm able to create .dll files via PackageCompiler.jl, something I learned from previous posts on StackOverflow, however when trying to run these files in Python via the following code
Julia mock package
module JuliaFunctions
# Pkg.add("BlackBoxOptim")
Base.#ccallable function my_main_function(x::Cfloat,y::Cfloat)::Cfloat
z = 0
for i in 1:x
z += i ^ y
end
return z
end
# function julia_main()
# print("Hello from a compiled executable!")
# end
export my_main_function
end # module
Julia script to use PackageCompiler
# using PackageCompiler
using Pkg
# Pkg.develop(path="JuliaFunctions") # This is how you add a local package
# include("JuliaFunctions/src/JuliaFunctions.jl") # this is how you add a local module
using PackageCompiler
# Pkg.add(path="JuliaFunctions")
#time create_sysimage(:JuliaFunctions, sysimage_path="JuliaFunctions.dll")
Trying to use the resulting .dll in CTypes in Python
import ctypes
from ctypes.util import find_library
from ctypes import *
path = os.path.dirname(os.path.realpath(__file__)) + '\\JuliaFunctions.dll'
# _lib = cdll.LoadLibrary(ctypes.util.find_library(path)) # same error
# hllDll = ctypes.WinDLL(path, winmode=0) # same error
with os.add_dll_directory(os.path.dirname(os.path.realpath(__file__))):
_lib = ctypes.CDLL(path, winmode=0)
I get
OSError: [WinError 127] The specified procedure could not be found
With my current understanding, this means that CTypes found the dll and imported it, but didn't find.. something? I've yet to fully grasp how this behaves.
I've verified the function my_main_function is exported in the .dll file via Nirsoft's DLL Export Viewer. Users from previous similar issues have noted that this sysimage is already callable and should work, but they always add at the end something along the lines of "Note that you will also in general need to initialize the Julia runtime."
What does this mean? Is this even something that can be done independently from the Julia installation? The dev docs in PackageCompiler mention this, however they just mention that julia_main is automatically included in the .dll file and gets called as a sort of launch point. This function is also being exported correctly into the .dll file the above code creates. Below is an image of the Nirsoft export viewer output for reference.
Edit 1
Inexplicably, I've rebuilt this .dll on another machine and made progress. Now, the dll is imported correctly. I'm not sure yet why this worked on a fresh Julia install + Python venv, but I'm going to reinstall them on the other one and update this if anything changes. For anyone encountering this, also note you need to specify the expected output, whatever it may be. In my case this is done by adding (after the import):
_lib.testmethod1.restype = c_double # switched from Cfloat earlier, a lot has changed.
_lib.testmethod1.argtypes = [c_double, c_double] # (defined by ctypes)
The current error is now OSError: exception: access violation writing 0x0000000000000024 when trying to actually use the function, which is specific to Python. Any help on this would also be appreciated.

Cython magic command doesn't print in jupyter notebook [duplicate]

I am new to cython(only use it for doing a little hw now).
I use the following code to see a general idea of it in jupyter notebook.
%load_ext Cython
%%cython
def cfunc(int n):
cdef int a = 0
for i in range(n):
a += i
return a
print(cfunc(10))
However, it only prints out the result 45 once. When I run the print function, the cell doesn't show 45 anyone.
Is there any problems with the code? How can I make the cell prints out 45 just the same as a normal python code? Thanks.
When running %%cython-magic a lot happens under the hood. One can see parts of it when calling the magic in verbose mode, i.e. %%cython --verbose:
A file called _cython_magic_b599dcf313706e8c6031a4a7058da2a2.pyx is generated. b599dcf313706e8c6031a4a7058da2a2 is the sha1-hash of the %%cython-cell, which is needed for example to be able to reload a %%cython-cell (see this SO-post).
This file is cythonized and build to a c-extension called _cython_magic_b599dcf313706e8c6031a4a7058da2a2.
This extension gets imported - this is the moment your code prints 45, and everything from this module is added to the global namespace.
When you execute the cell again, nothing of the above happens: given the sha-hash the machinery can see, that this cell was already executed and loaded - so nothing to be done. Only when the content of the cell is changed and thus its hash the cash will not be used but the 3 steps above executed.
To enforce that the steps above are performed one has to pass --force (or -f) options to the %%cython-magic-cell, i.e.:
%%cython --force
...
# 45 is printed
However, because building extension anew is quite time consuming one would probably prefer the following
%%cython
def cfunc(int n):
cdef int a = 0
for i in range(n):
a += i
return a
# put the code of __main__ into a function
def cython_main():
print(cfunc(10))
# execute the old main
cython_main()
and now calling cython_main() in a new cell, so it gets reevaluated the same way the normal python code would.

Use function from another file python

I have 2 files, 1 file with the classes and 1 file with the 'interface'
In the first file I have:
from second_file import *
class Catalog:
def ListOfBooks(self):
more_20 = input()
# If press 1 starts showing from 20 until 40
if more_20 == '1':
for item in C[20:41:1]:
print("ID:",item['ID'],"Title:",item['title']," Author: ", item['author'])
elif more_20 == '2':
return librarian_option()
test = Catalog()
test.ListOfBooks()
What I try to achieve is when the user presses 2, I want to go back to the function in my other file.
Second file:
def librarian_option():
.......
I don't want to use globals and I have read that the librarian_option() is in the scope of the second file and that's why I can't call it directly. I can't find a solution to it.
I get the following error:
NameError: name 'librarian_option' is not defined
Have you tried? It's just best practice to be explicit instead of using * wildcard
from second_file import librarian_option
Make sure the second_file is in the same directory.
Note: this is not an answer, but an example to help improve the question with more details.
You need to reduce your problem to the minimum amount of code (and actions) necessary to produce your problem. You also need provide how exactly you are running your script, and what version (of Python and your OS) you are using.
For example, I have created the following two scripts (named exactly as shown):
first_file.py:
from second_file import *
class Catalog:
def ListOfBooks(self):
return librarian_option()
test = Catalog()
a = test.ListOfBooks()
print(a)
second_file.py:
def librarian_option():
return 1
These two files are located in the same, random, directory on my computer (MacOS). I run this as follows:
python3.7 first_file.py
and my output is
1
Hence, I can't reproduce your problem.
See if you can still produce your problem with such simplified scripts (i.e., no extra functions or classes, no __init__.py file, etc). You probably want to do this in a temporary directory elsewhere on your system.
If your problem goes away, slowly build back up to where it reappears again. By then, possibly, you've discovered the actual problem as well. If you then don't understand why this (last) change you made caused the problem, feel free to update your question with all the new information.

Reload import in IPython

I'm trying to access some Fortran subroutines using F2PY, but I've ran into the following problem during consecutive calls from IPython. Take this minimal Fortran code (hope that I didn't code anything stupid; my Fortran is a bit rusty..):
! test.f90
module mod
integer i
contains
subroutine foo
i = i+1
print*,i
end subroutine foo
end module mod
If I compile this using F2PY (f2py3.5 -c -m test test.f90), import it in Python and call it twice:
# run.py
import test
test.mod.foo()
test.mod.foo()
The resulting output is:
$ python run.py
1
2
So on every call of foo(), i is incremented, which is supposed to happen. But between different calls of run.py (either from the command line or IPython interpreter), everything should be "reset", i.e. the printed counter should start from 1 for every call. This happens when calling run.py from the command line, but if I call the script multiple times from IPython, i keeps increasing:
In [1]: run run.py
1
2
In [2]: run run.py
3
4
I know that there are lots of posts showing how to reload imports (using autoreload in IPython, importlib.reload(), ...), but none of them seem to work for this example. Is there a way to force a clean reload/import?
Some side notes: (1) The Fortran code that I'm trying to access is quite large, old and messy, so I'd prefer not to change anything in there; (2) I could easily do test.mod.i = something in between calls, but the real Fortran code is too complex for such solutions; (3) I'd really prefer a solution which I can put in the Python code over e.g. settings (autoreload, ..) which I have to manually put in the IPython interpreter (forget it once and ...)
If you can slightly change your fortran code you may be able to reset without re-import (probably faster too).
The change is about introducing i as a common and resetting it from outside. Your changed fortran code will look this
! test.f90
module mod
common /set1/ i
contains
subroutine foo
common /set1/ i
i = i+1
print*,i
end subroutine foo
end module mod
reset the variable i from python as below:
import test
test.mod.foo()
test.mod.foo()
test.set1.i = 0 #reset here
test.mod.foo()
This should produce the result as follows:
python run.py
1
2
1

Python - How to save functions

I´m starting in python. I have four functions and are working OK. What I want to do is to save them. I want to call them whenever I want in python.
Here's the code my four functions:
import numpy as ui
def simulate_prizedoor(nsim):
sim=ui.random.choice(3,nsim)
return sims
def simulate_guess(nsim):
guesses=ui.random.choice(3,nsim)
return guesses
def goat_door(prizedoors, guesses):
result = ui.random.randint(0, 3, prizedoors.size)
while True:
bad = (result == prizedoors) | (result == guesses)
if not bad.any():
return result
result[bad] = ui.random.randint(0, 3, bad.sum())
def switch_guesses(guesses, goatdoors):
result = ui.random.randint(0, 3, guesses.size)
while True:
bad = (result == guesses) | (result == goatdoors)
if not bad.any():
return result
result[bad] = ui.random.randint(0, 3, bad.sum())
What you want to do is to take your Python file, and use it as a module or a library.
There's no way to make those four functions automatically available, no matter what, 100% percent of the time, but you can do something very close.
For example, at the top of your file, you imported numpy. numpy is a module or library which has been set up so it's available any time you run python, as long as you import it.
You want to do the same thing -- save those 4 functions into a file, and import them whenever you want them.
For example, if you copy and paste those four functions into a file named foobar.py, then you can simply do from foobar import *. However, this will only work if you're running Python in the same folder where you saved your code.
If you want to make your module available system-wide, you have to save it somewhere on the PYTHONPATH. Usually, saving it to C:\Python27\Lib\site-packages will work (assuming you're running Windows).
If you decide to put them anywhere in your project folder don`t forget to create a blank init.py file so python can see them. A better answer can be provided here : http://docs.python.org/2/tutorial/modules.html
Save them in a file - this makes them a module.
If you put them in a file called mymod.py, in python you can load them as follows
from mymod import *
simulate_prizedoor(23)
Quick solution, without having to explicitly create a file - relies on IPython and its storemagic
IPython 4.0.1 -- An enhanced Interactive Python.
details.
In [1]: def func(a):
...: print a
...:
In [2]: func = _i #gets the previous input
In [3]: store func #store(magic) the input
#(auto-magic enabled or would need '%store')
Stored 'func' (unicode)
In [4]: exit
IPython 4.0.1 -- An enhanced Interactive Python.
In [1]: store -r func #retrieve stored string
In [2]: exec func #execute string as python code
In [3]: func(10)
10
Once you had stored all your functions just once, then you can restore them all with store -r, and then exec func once for each function, in each new session.
(Came across this question while looking for a solution for 'quick saving' functions (most convenient way) while in an interactive python session - adding my current best solution for future readers)

Categories

Resources