I am new to Cython. I've written a super simple test program to assess the benefits of Cython, yet my pure Python version is a lot faster. Am I doing something wrong?
test.py:
import timeit
imp = '''
import pyximport; pyximport.install()
from hello_cy import hello_c
from hello_py import hello_p
'''
code_py = '''
hello_p()
'''
code_cy = '''
hello_c()
'''
print(timeit.timeit(stmt=code_py, setup=imp))
print(timeit.timeit(stmt=code_cy, setup=imp))
hello_py.py:
def hello_p():
    print('Hello World')
hello_cy.pyx:
from libc.stdio cimport printf
cpdef void hello_c():
    cdef char * hello_world = 'hello from C world'
    printf(hello_world)
hello_py timeit takes 14.697s
hello_cy timeit takes 98s
Am I missing something? How can I make my calls to cpdef functions run faster?
Thank you very much!
I strongly suspect a problem in your configuration.
I have (partially) reproduced your tests on Windows 10, Python 3.10.0, Cython 0.29.26, MSVC 2022, and got quite different results: in my tests the Cython code is slightly faster. I made two changes:
In hello_cy.pyx, to make the two versions closer, I added a newline:
...
printf("%s\n", hello_world)
In the main script, I split the calls to the functions from the display of the times:
...
pyp = timeit.timeit(stmt=code_py, setup=imp)
pyc = timeit.timeit(stmt=code_cy, setup=imp)
print(pyp)
print(pyc)
When I run the script I get (after pages of hello...):
...
hello from C world
hello from C world
19.135732599999756
14.712803700007498
Which looks more like what could be expected...
Anyway, we do not really know what is being tested here; as much as possible, IO should be kept out of benchmarks because it depends on a lot of things outside the programs themselves.
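To get a more meaningful comparison, the benchmark could target CPU-bound work instead of terminal IO; a minimal sketch (pure Python shown, the function name is illustrative):

```python
import timeit

def work_py(n=1000):
    # CPU-bound loop: no IO, so the timing reflects the interpreter itself
    total = 0
    for i in range(n):
        total += i
    return total

# time the function call alone; a compiled (Cython) variant of work_py
# could be benchmarked with the exact same harness
print(timeit.timeit("work_py()", globals=globals(), number=10000))
```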
It just isn't a meaningful test: the two functions aren't the same.
Python's print writes a string followed by a newline, and I believe it then flushes the output.
printf scans the string for format specifiers (e.g. %s) and then prints it without an extra newline. By default, printf is line-buffered (i.e. it flushes after each newline). Since you never print a newline or flush, it may be slowed down by managing an increasingly huge buffer.
In summary, don't be misled by fairly meaningless microbenchmarks, especially for terminal IO, which is rarely an actual limiting factor.
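The flushing behaviour can be illustrated from the Python side (a rough analogy, assuming CPython's usual stdout buffering; not the C printf machinery itself):

```python
import sys

# print() appends a newline; whether the output reaches the terminal
# immediately depends on how stdout is buffered (line-buffered on a
# terminal, block-buffered when redirected to a file or pipe)
print("hello", flush=True)          # force an immediate flush

sys.stdout.write("no newline yet")  # may linger in the buffer
sys.stdout.flush()                  # explicit flush, like fflush(stdout) in C
```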
It takes too long because pyximport compiles the Cython code on the fly (so you are also measuring the compilation from Cython to C and the compilation of the C code to a native library). You should measure calls to already-compiled code instead; see https://cython.readthedocs.io/en/latest/src/quickstart/build.html#building-a-cython-module-using-setuptools
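A minimal sketch of such an ahead-of-time build, following the linked docs (assumes hello_cy.pyx sits next to this file):

```python
# setup.py - build the Cython module once, ahead of time
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("hello_cy.pyx"))
```

After `python setup.py build_ext --inplace`, `from hello_cy import hello_c` works without pyximport, and timeit then measures only the call itself.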
Related
I am new to Cython (I'm only using it for a little homework right now).
I use the following code in a Jupyter notebook to get a general idea of it:
%load_ext Cython
%%cython
def cfunc(int n):
    cdef int a = 0
    for i in range(n):
        a += i
    return a

print(cfunc(10))
However, it only prints the result 45 once. When I run the cell again, it no longer shows 45.
Is there a problem with the code? How can I make the cell print 45 the same way normal Python code would? Thanks.
When running the %%cython magic, a lot happens under the hood. One can see parts of it by calling the magic in verbose mode, i.e. %%cython --verbose:
A file called _cython_magic_b599dcf313706e8c6031a4a7058da2a2.pyx is generated. b599dcf313706e8c6031a4a7058da2a2 is the sha1-hash of the %%cython-cell, which is needed for example to be able to reload a %%cython-cell (see this SO-post).
This file is cythonized and built into a C extension called _cython_magic_b599dcf313706e8c6031a4a7058da2a2.
This extension gets imported; this is the moment your code prints 45, and everything from this module is added to the global namespace.
When you execute the cell again, nothing of the above happens: given the sha-hash, the machinery can see that this cell was already built and loaded, so there is nothing to be done. Only when the content of the cell (and thus its hash) changes will the cache be bypassed and the three steps above executed again.
To enforce that the steps above are performed, one has to pass the --force (or -f) option to the %%cython-magic cell, i.e.:
%%cython --force
...
# 45 is printed
However, because building the extension anew is quite time-consuming, one would probably prefer the following:
%%cython
def cfunc(int n):
    cdef int a = 0
    for i in range(n):
        a += i
    return a

# put the code of __main__ into a function
def cython_main():
    print(cfunc(10))

# execute the old main
cython_main()
and then call cython_main() in a new cell, so it gets re-evaluated the same way normal Python code would.
I'm trying to access some Fortran subroutines using F2PY, but I've run into the following problem during consecutive calls from IPython. Take this minimal Fortran code (I hope I didn't code anything stupid; my Fortran is a bit rusty):
! test.f90
module mod
  integer i
contains
  subroutine foo
    i = i+1
    print*,i
  end subroutine foo
end module mod
If I compile this using F2PY (f2py3.5 -c -m test test.f90), import it in Python and call it twice:
# run.py
import test
test.mod.foo()
test.mod.foo()
The resulting output is:
$ python run.py
1
2
So on every call of foo(), i is incremented, which is supposed to happen. But between different runs of run.py (whether from the command line or the IPython interpreter), everything should be "reset", i.e. the printed counter should start from 1 on every run. This is what happens when calling run.py from the command line, but if I run the script multiple times from IPython, i keeps increasing:
In [1]: run run.py
1
2
In [2]: run run.py
3
4
I know that there are lots of posts showing how to reload imports (using autoreload in IPython, importlib.reload(), ...), but none of them seem to work for this example. Is there a way to force a clean reload/import?
Some side notes: (1) the Fortran code that I'm trying to access is quite large, old and messy, so I'd prefer not to change anything in there; (2) I could easily do test.mod.i = something in between calls, but the real Fortran code is too complex for such solutions; (3) I'd really prefer a solution that I can put in the Python code over settings (autoreload, ...) that I have to enter manually in the IPython interpreter (forget it once and ...).
If you can slightly change your Fortran code, you may be able to reset the state without re-importing (probably faster, too).
The change introduces i as a common block variable so it can be reset from outside. Your changed Fortran code will look like this:
! test.f90
module mod
  common /set1/ i
contains
  subroutine foo
    common /set1/ i
    i = i+1
    print*,i
  end subroutine foo
end module mod
Reset the variable i from Python as below:
import test
test.mod.foo()
test.mod.foo()
test.set1.i = 0  # reset here
test.mod.foo()
This should produce the result as follows:
python run.py
1
2
1
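If the Fortran side cannot be touched at all, another workaround (my own sketch, not part of the answer above) is to launch each run in a fresh interpreter, so module-level state, Python or Fortran, can never leak between runs:

```python
# sketch: run a script in a brand-new Python process each time
import subprocess
import sys

def fresh_run(script):
    """Execute `script` in a fresh interpreter and return its output."""
    out = subprocess.check_output([sys.executable, script])
    return out.decode()

# every call re-imports the f2py extension from scratch, so the counter
# in the question would restart at 1 each time:
# print(fresh_run("run.py"))
```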
My python scripts often contain "executable code" (functions, classes, &c) in the first part of the file and "test code" (interactive experiments) at the end.
I want python, py_compile, pylint &c to completely ignore the experimental stuff at the end.
I am looking for something like #if 0 for cpp.
How can this be done?
Here are some ideas and the reasons they are bad:
sys.exit(0): works for python but not py_compile and pylint
put all experimental code under def test():: I can no longer copy/paste the code into a python REPL because it has non-trivial indent
put all experimental code between lines with """: emacs no longer indents and fontifies the code properly
comment and uncomment the code all the time: I am too lazy (yes, this is a single key press, but I have to remember to do that!)
put the test code into a separate file: I want to keep the related stuff together
PS. My IDE is Emacs and my python interpreter is pyspark.
Use IPython rather than plain python for your REPL. It has better code completion and introspection, and when you paste indented code it can automatically "de-indent" it.
Thus you can put your experimental code in a test function and then paste in parts without worrying and having to de-indent your code.
If you are pasting larger chunks that would otherwise be treated as several separate blocks, you will need to use the %paste or %cpaste magics.
e.g.
for i in range(3):
    i *= 2

# after the blank line above, the loop is already a complete block
print(i)
With a normal paste:
In [1]: for i in range(3):
   ...:     i *= 2
   ...:
In [2]: print(i)
4
Using %paste
In [3]: %paste
for i in range(3):
    i *= 2
    print(i)
## -- End pasted text --
0
2
4
In [4]:
PySpark and IPython
It is also possible to launch PySpark in IPython, the enhanced Python interpreter. PySpark works with IPython 1.0.0 and later. To use IPython, set the IPYTHON variable to 1 when running bin/pyspark:
$ IPYTHON=1 ./bin/pyspark
Unfortunately, there is no widespread (or indeed any) standard describing what you are talking about, so getting a bunch of Python-specific tools to work like this will be difficult.
However, you could wrap these commands in such a way that they only read until a signifier. For example (assuming you are on a unix system):
sed '/exit(0)/q' "$file" | sed '/exit(0)/d'
The command reads until 'exit(0)' is found. You could pipe the result into your checkers, or create a temp file that your checkers read. You could also create wrapper executables on your path that may work with your editors.
Windows may be able to use a similar technique.
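A cross-platform variant of the same idea can be sketched in Python (marker and filename handling are illustrative):

```python
# head_until.py - emit a file's contents only up to a marker line,
# so checkers never see the experimental tail
import sys

def head_until(path, marker="exit(0)"):
    kept = []
    with open(path) as f:
        for line in f:
            if marker in line:
                break           # stop at the marker; drop it and the rest
            kept.append(line)
    return "".join(kept)

if __name__ == "__main__":
    sys.stdout.write(head_until(sys.argv[1]))
```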
I might advise a different approach: separate files might be best. You might also explore IPython notebooks as a possible solution, but I'm not sure exactly what your use case is.
Follow something like option 2.
I usually put experimental code in a main method.
def main():
    # experimental code goes here
Then, if you want to execute the experimental code, just call main():
main()
With python-mode.el, mark arbitrary chunks as a section, for example via py-sectionize-region.
Then call py-execute-section.
Updated after comment:
python-mode.el is delivered by MELPA:
M-x list-packages RET
Look for python-mode; the built-in python.el provides 'python, while python-mode.el provides 'python-mode.
Development has just moved here: https://gitlab.com/python-mode-devs/python-mode
I think the standard ('Pythonic') way to deal with this is to do it like so:
class MyClass(object):
    ...

def my_function():
    ...

if __name__ == '__main__':
    # testing code here
Edit after your comment
I don't think what you want is possible with a plain Python interpreter. You could have a look at the IEP Python editor (website, bitbucket): it supports something like Matlab's cell mode, where a cell can be defined with a double comment character (##):
## main code
class MyClass(object):
    ...

def my_function():
    ...

## testing code
do_some_testing_please()
All code from a ##-beginning line until either the next such line or end-of-file constitutes a single cell.
Whenever the cursor is within a particular cell and you strike a hotkey (default Ctrl+Enter), the code within that cell is executed in the currently running interpreter. An additional feature of IEP is that selected code can be executed with F9; a pretty standard feature, but the nice thing here is that IEP deals smartly with whitespace, so just selecting and pasting stuff from inside a method works automatically.
I suggest you use a proper version control system to keep the "real" and the "experimental" parts separated.
For example, using Git, you could only include the real code without the experimental parts in your commits (using add -p), and then temporarily stash the experimental parts for running your various tools.
You could also keep the experimental parts in their own branch which you then rebase on top of the non-experimental parts when you need them.
Another possibility is to put tests as doctests into the docstrings of your code, which admittedly is only practical for simpler cases.
This way, they are only treated as executable code by the doctest module, but as comments otherwise.
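A minimal doctest sketch (the function is illustrative):

```python
def double(x):
    """Return twice the value of x.

    The lines below are executable examples: the doctest module runs
    them, while every other tool treats them as documentation.

    >>> double(2)
    4
    >>> double(-3)
    -6
    """
    return 2 * x

if __name__ == "__main__":
    import doctest
    doctest.testmod()   # silent when all examples pass
```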
I wish to write a Python script that needs to do task 'A' and task 'B'. Luckily there are existing Python modules for both tasks, but unfortunately the library that can do task 'A' is Python 2 only, and the library that can do task 'B' is Python 3 only.
In my case the libraries are small and permissively-licensed enough that I could probably convert them both to Python 3 without much difficulty. But I'm wondering what is the "right" thing to do in this situation - is there some special way in which a module written in Python 2 can be imported directly into a Python 3 program, for example?
The "right" way is to translate the Py2-only module to Py3 and offer the translation upstream with a pull request (or equivalent approach for non-git upstream repos). Seriously. Horrible hacks to make py2 and py3 packages work together are not worth the effort.
I presume you know of tools such as 2to3, that aim to make the job of porting code to py3k easier, just repeating it here for others' reference.
In situations where I have had to use libraries from both Python 3 and Python 2, I've been able to work around it using the subprocess module. Alternatively, I've gotten around this issue with shell scripts that pipe output from the python2 script to the python3 script and vice versa. This of course covers only a tiny fraction of use cases, but if you're transferring text (or maybe even picklable objects) between 2 and 3, it (or a more thought-out variant) should work.
To the best of my knowledge, there isn't a best practice when it comes to mixing versions of python.
I present to you an ugly hack
Consider the following simple toy example, involving three files:
# py2.py
# this file uses python2, here illustrated by the print statement
def hello_world():
    print 'hello world'

if __name__ == '__main__':
    hello_world()
# py3.py
# there's nothing py3-specific about this, but let's assume that there is,
# and that this is a library that will work only on python3
def count_words(phrase):
    return len(phrase.split())
# controller.py
# main script that coordinates the work, written in python3
# calls the python2 library through the subprocess module
# the limitation here is that every function needed has to have a script
# associated with it that accepts command line arguments
import subprocess

import py3

if __name__ == '__main__':
    phrase = subprocess.check_output('python py2.py', shell=True)
    num_words = py3.count_words(phrase)
    print(num_words)
# If I run the following in bash, it outputs `2`:
hals-halbook: toy hal$ python3 controller.py
2
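The same pattern can exchange structured data rather than raw text, e.g. via JSON, which any Python 2 or 3 interpreter understands. A sketch (for demonstration, the "other" interpreter is just sys.executable; in the real setup it would be the python2 binary):

```python
# bridge.py - call a worker in another interpreter, exchanging JSON
import json
import subprocess
import sys

WORKER = (
    "import json, sys; "
    "d = json.load(sys.stdin); "
    "json.dump({'words': len(d['phrase'].split())}, sys.stdout)"
)

def call_worker(payload):
    """Send `payload` as JSON on stdin and read a JSON reply from stdout."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER],
        input=json.dumps(payload).encode(),
        stdout=subprocess.PIPE,
        check=True,
    )
    return json.loads(proc.stdout.decode())

print(call_worker({"phrase": "hello world"}))  # {'words': 2}
```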
When writing code in Python, I usually use cProfile, which prints the profiling results to the console:
import cProfile, pstats, StringIO
pr = cProfile.Profile()
pr.enable()
#do stuff
pr.disable()
s = StringIO.StringIO()
ps = pstats.Stats(pr, stream=s).sort_stats('cumulative')
ps.print_stats()
print s.getvalue()
Are there any alternatives in C++?
Edit: I'm using VS 2008 Express on Windows, 64-bit.
Just go to Analyze -> Profiler -> Attach/Detach.
Valgrind, specifically callgrind, is pretty much the standard way of doing this in C++. It's a little more complicated than in Python, though, since Python can basically monkey-patch every method call to generate call graphs and the like.
http://valgrind.org/docs/manual/cl-manual.html