We have some Java code we want to use with new code we plan to write in Python, hence our interest in using Jython. However we also want to use numpy and pandas libraries to do complex statistical analysis in this Python code.
Is it possible to call numpy and pandas from Jython?
Keep an eye on JyNI, which is at version alpha.2 as of March 2014.
Not directly.
One option I've used in the past is jsonrpclib (which works in both CPython and Jython) to communicate between the two interpreters. There's even a built-in server, which makes things quite simple. You'll just need to figure out whether the gains of using numpy are worth the additional overhead.
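For illustration, here is roughly what that can look like; the host, port, and function name below are made up, and the CPython side is the one with numpy installed:

    # CPython side: expose a numpy-backed function over JSON-RPC.
    from jsonrpclib.SimpleJSONRPCServer import SimpleJSONRPCServer
    import numpy as np

    def mean(values):
        # JSON can only carry plain numbers/lists, so convert the result to a float.
        return float(np.mean(values))

    server = SimpleJSONRPCServer(("localhost", 8080))
    server.register_function(mean)
    server.serve_forever()

    # Jython side: call it like a local function (commented out here,
    # since serve_forever() above blocks).
    # import jsonrpclib
    # proxy = jsonrpclib.Server("http://localhost:8080")
    # print(proxy.mean([1, 2, 3, 4]))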
Especially if you don't want to use raw NumPy, but other Python frameworks that depend on it, JyNI will be the way to go once it is mature. However, it is not yet capable of importing NumPy.
Until then, you can use NumPy from Java by embedding CPython. See the Numpy4J project for this (I haven't tested it myself, though).
You can't use numpy from Jython at this time. But if you're willing to use CPython instead of Jython, there are some open source Java projects that work with numpy (and presumably pandas).
Jep
jpy
JyNI
I have two Python modules, where one only supports Python 2.x and the other only 3.x.
Unfortunately I need both for a project.
My workaround for now is to run them as separate programs and to set up their communication via the socket module.
I will end up with two executables, which I would like to avoid.
The "connection" between both modules has to be as fast as possible.
So my question is whether there is a way to somehow combine both into one executable in the end, and whether there is a better solution for fast communication than the client-server construction I have now.
There really is no good way to avoid that workaround.
Conceptually, there's no reason that you couldn't embed two interpreters into the same process. But practically, the CPython interpreter depends on some static/global state. While 3.7 is much better about that than, say, 3.0 or 2.6 was, that state is still nowhere near eliminated.1 And, the way C linkage works, there's no way to get around that without changing the interpreter.
Also, embedding CPython isn't hard, but it's not trivial, in the way that running an interpreter as a subprocess is trivial—and it may be harder than coming up with an efficient way to pass or share state between subprocesses.
Of course there are other interpreters besides CPython. But the other major implementation with both 2.7 and 3.x versions isn't easily embeddable (PyPy), and the two that are easily embeddable don't have 3.x versions, and also can only be embedded in another VM, and can't run C extension modules (Jython and IronPython). It is possible to do something like using JEP to embed CPython 3.7 via JNI in a JVM while also using Jython 2.7 natively in that same JVM, but I doubt that approach will work for you.
Meanwhile, I mentioned that passing or sharing data between processes generally isn't that hard.
If you don't have that much data, you can usually just pass it pickled over a pipe.
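As a rough sketch of the pipe approach (the worker script name is made up), a Python 3 parent can pickle a request to a Python 2 child's stdin and unpickle the reply from its stdout; protocol 2 is the highest protocol Python 2 understands:

    import pickle
    import subprocess

    # "py2_worker.py" is a hypothetical Python 2 script that unpickles a request
    # from stdin and pickles its result to stdout.
    proc = subprocess.Popen(
        ["python2", "py2_worker.py"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )

    request = {"op": "legacy_call", "args": [1, 2, 3]}
    out, _ = proc.communicate(pickle.dumps(request, protocol=2))
    result = pickle.loads(out)   # may need encoding="latin-1" for some 2.x objects
    print(result)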
If you do have a ton of data, it usually is, or could be, stored in memory in some structured form—numpy arrays, big hunks of ASCII or UTF-8 text, arrays of ctypes structs, etc.—that you can overlay on an mmap or shared memory segment.
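For example, here is one way to overlay a numpy array on a memory-mapped file so that two interpreters see the same buffer without copying (the file name and dtype are arbitrary):

    import mmap
    import numpy as np

    N = 1024
    # Reserve space for N float64 values in a file both processes can map.
    with open("shared_block.bin", "wb") as f:
        f.write(b"\x00" * (N * 8))

    with open("shared_block.bin", "r+b") as f:
        mm = mmap.mmap(f.fileno(), N * 8)
        arr = np.frombuffer(mm, dtype=np.float64)  # zero-copy view onto the mapping
        arr[:] = np.arange(N)  # visible to any other process mapping the same file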
Or, of course, you can come up with your own protocol and communicate with it over a (UNIX or IP) socket. But you don't necessarily have to jump right to that option.
Notice that multiprocessing supports both of the first two—although to take advantage of it with independent interpreters, you have to dig into its source and pull out the bits you need. And there are also third-party libraries that can help. (For example, if you need to pickle things that don't pickle natively, the answer is often as simple as "replace pickle with dill".)
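One publicly exposed piece that already works between independently started interpreters is multiprocessing.connection, which handles the pickling and the transport for you. The address and authkey below are made up, and with a Python 2 peer you may need send_bytes/recv_bytes carrying protocol-2 pickles instead of send/recv:

    # Listener side (e.g. the Python 3 process).
    from multiprocessing.connection import Listener

    with Listener(("localhost", 6000), authkey=b"secret") as listener:
        with listener.accept() as conn:
            data = conn.recv()        # unpickles whatever the peer sent
            conn.send(sum(data))      # pickles the reply back

    # Client side, run from the other interpreter (commented out here):
    # from multiprocessing.connection import Client
    # with Client(("localhost", 6000), authkey=b"secret") as conn:
    #     conn.send([1, 2, 3])
    #     print(conn.recv())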
1. Running multiple subinterpreters in various restricted ways does sort of work with things like mod_wsgi, and PEP 554 aims to get things to the state where you can easily and cleanly run multiple 3.7 subinterpreters in the same process, but still nothing like completely independent embeddings of CPython—the subinterpreters share a GIL, a cycle collector, an atexit handler, etc.
There is a project called JyNI that is supposed to allow you to run NumPy in Jython. However, I haven't come across any instructions on how to actually get NumPy into Jython. I've tried 'pip install numpy' (which works for normal Python 3.4.3), but it gives an error about a missing py3k module. Does anybody have a bit more information about this?
JyNI does state NumPy support as its main goal, but it cannot provide that yet while it is still in alpha state.
However, until it is mature enough, you can use NumPy via
JEP (https://github.com/mrj0/jep) or
JPY (https://github.com/bcdev/jpy).
Alternatively you can use a Java numerical library for your computation, e.g. one of these:
https://github.com/mikiobraun/jblas
https://github.com/fommil/matrix-toolkits-java
Both are Java libraries that do numerical processing natively, backed by BLAS or LAPACK (i.e. the same backends NumPy uses), so the performance should more or less equal that of NumPy. However, as far as I know, they don't offer a multiarray implementation as nice as NumPy's.
If you need NumPy indirectly to fulfill dependencies of some other framework, these solutions won't do it out of the box. If the dependencies are only marginal you can maybe rewrite/substitute the corresponding calls based on one of the named projects. Otherwise you'll have to wait for JyNI...
If you can get some framework running on Jython this way, please consider making your work publicly available, ideally as a fork of the framework.
I was wondering if any of you could give me guidance on whether it'd be possible to use some of these TA-Lib functions (found here) in a Python script. I can't find the functions in any other language that I know...
I read this, so there seems to be some level of possibility; however, I have little understanding of what's going on in the article, since I don't know C at all. Oh, and in case you are wondering: TA-Lib has been ported to Python, BUT it doesn't really build on Mac, and most people say they have issues with it.
So essentially, since I can't get the whole library to work through SWIG, I was wondering if I could instead compile just the function I need (not even sure if that makes sense) and use it from a Python app (and hopefully get some guidance on how to do so).
I believe there are three simple approaches you could take:
SWIG
TA-Lib comes with a Python wrapper that is generated by SWIG. It hasn't been updated in a long time, so is hard-coded to build with Python 2.3. Andy Hawkins wrote some directions to get it to work with newer versions of Python.
Cython
I wrote a TA-Lib python wrapper that uses Cython to wrap all the functions in TA-Lib, and released it on Github. It works really well for me, uses Numpy arrays, is 2-4 times faster, more "pythonic", and easier to install (works on Mac OS X) than the SWIG interface.
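To give an idea of what using it looks like (assuming the package imports as talib and you pass it numpy arrays of prices):

    import numpy as np
    import talib

    close = np.random.random(100)               # stand-in for real closing prices
    sma = talib.SMA(close, timeperiod=10)       # simple moving average
    upper, middle, lower = talib.BBANDS(close, timeperiod=20)
    print(sma[-1], upper[-1])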
Ctypes
If you only need a small number of functions from the library, you can use ctypes to make calls into the TA-Lib library.
If all you need is a single function from the library, you may be better off using ctypes. ctypes is a module in the Python standard library that allows you to call into native-code libraries.
You just have to experiment on the Python console to figure out how to load TA-Lib as a Python object with ctypes and how to call the function you need. ctypes converts ints and strings to C automatically for you; you will need to annotate the function with argument and result types for other parameter types, though.
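Here is a rough ctypes sketch for a single TA-Lib function (TA_MA, a moving average). The library file name and the argument order are from memory, so double-check them against your installed library and ta_func.h before relying on this:

    import ctypes

    lib = ctypes.CDLL("libta_lib.so")   # e.g. "libta_lib.dylib" on Mac

    ret = lib.TA_Initialize()           # 0 means TA_SUCCESS
    assert ret == 0

    n = 10
    closes = (ctypes.c_double * n)(*range(1, n + 1))
    out = (ctypes.c_double * n)()
    out_beg = ctypes.c_int()
    out_len = ctypes.c_int()

    # TA_MA(startIdx, endIdx, inReal, optInTimePeriod, optInMAType,
    #       *outBegIdx, *outNBElement, outReal) -> TA_RetCode
    ret = lib.TA_MA(0, n - 1, closes, 3, 0,
                    ctypes.byref(out_beg), ctypes.byref(out_len), out)
    print(ret, list(out)[:out_len.value])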
I am about to embark on some signal processing work using NumPy/SciPy. However, I have never used Python before and don't know where to start.
I see there are currently two branches of Python in this world: Version 2.x and 3.x.
Being a neophile, I instinctively tend to go for the newer one, but there seems to be a lot of talk about incompatibilities between the two. Numpy seems to be compatible with Python 3. I can't find any documents on SciPy.
Would you recommend to go with Python 3 or 2?
(could you point me to some resources to get started? I know C/C++, Ruby, Matlab and some other stuff and basically want to use NumPy instead of Matlab.)
Both scipy and numpy are compatible with py3k. However, if you need to plot stuff, matplotlib is not yet officially compatible with py3k. So it'll depend on whether your signal processing involves plotting.
Syntactic differences are not that great between the two versions.
I am using Python 2.6 with NumPy. I can confirm that Python 3 is not backward compatible, so I myself am not very confident about upgrading. Have a look at the cookbook to get started:
http://www.scipy.org/Cookbook
I personally suggest you begin with 2.7, because it seems to me there is a lot of time left before 2.x becomes deprecated.
Read more here: http://docs.python.org/dev/whatsnew/2.7.html
@SilentGhost: SciPy for Python 3.2 is available in beta: http://sourceforge.net/projects/scipy/files/scipy/0.10.0b2/
I am quite conservative in this respect, and so I use Python 2.6. That's what comes pre-installed on my Linux box, and it is also the target version for the latest binary releases of SciPy.
Python 3 is without a doubt a huge step forward, but if you do mainly numerical stuff with NumPy and SciPy, I'd still go for Python 2.
I can recommend using py3k over py2.6 if possible, especially if you're a new user, since some of the syntax changes in py3k, and it'll be harder to get used to the new syntax if you start out learning the old.
The modules you mention all have support for py3k but as SilentGhost noted you might want to check for compatibility with plotting libraries too.
I've been able to use the standard Python modules from IronPython, but I haven't gotten SciPy to work yet. Has anyone been able to use SciPy from IronPython? What did you have to do to make it work?
Update: See Numerical computing in IronPython with Ironclad
Update: Microsoft is partnering with Enthought to make SciPy for .NET.
Some of my workmates are working on Ironclad, a project that will make extension modules for CPython work in IronPython. It's still in development, but parts of numpy, scipy and some other modules already work. You should try it out to see whether the parts of scipy you need are supported.
It's an open-source project, so if you're interested you could even help. In any case, some feedback about what you're trying to do and what parts we should look at next is helpful too.
Anything with components written in C (for example NumPy, on which SciPy depends) will not work on IronPython, as the external language interface works differently. Any C-language component will probably not work unless it has been explicitly ported to IronPython.
You might have to dig into the individual modules and check to see which ones work or are pure python and find out which if any of the C-based ones have been ported yet.