I am building a program with C as the UI and main process, and Python as its backend.
I want to pass variables/strings from the C program to Python.
A simple file on the hard drive could be used, but it would be better to use RAM so that changes to the variables are seen live.
What can I do?
I don't want to use any sockets or pipes.
I want to give direct access to the RAM.
I think you'll be interested in IPC with mmap and locks.
See the docs and, for example, here, and maybe here for a code example (with two Python processes only).
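For instance, here is a minimal sketch of the Python side, assuming the C program maps a RAM-backed file such as /dev/shm/c_to_py and writes a NUL-terminated string at offset 0 (the path, size and layout are illustrative, not a fixed protocol):

    import mmap
    import os
    import time

    SHM_PATH = "/dev/shm/c_to_py"   # hypothetical file the C side also maps
    SHM_SIZE = 4096

    fd = os.open(SHM_PATH, os.O_RDONLY)
    buf = mmap.mmap(fd, SHM_SIZE, prot=mmap.PROT_READ)
    try:
        while True:
            # read whatever string the C program currently has at offset 0
            value = buf[:SHM_SIZE].split(b"\x00", 1)[0].decode("utf-8", "replace")
            print("live value from C:", value)
            time.sleep(0.5)         # polling; a real setup would add a lock/semaphore
    finally:
        buf.close()
        os.close(fd)

On Linux, /dev/shm is RAM-backed, so this never touches the disk; the C side would open and mmap() the same file with MAP_SHARED and protect concurrent updates with a lock.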
Generally, using APIs is the best way to communicate between services in different languages. But if you must have one codebase with shared RAM, there is actually something for that with C and Python. It's called Cython: https://cython.org/
Related
I have a really specific need:
I want to create a Python console in a Qt widget, and to be able to have several independent interpreters.
Now let me try to explain what my problems are and everything I tried, in order from the approaches I'd most like to get working down to the one I can fall back on by default.
The first point is that all the functions in the Python C API (PyRun[...], PyEval[...], ...) need the GIL to be held, which forbids any concurrent interpretation of code from C (or I'd be really glad to be wrong!!! :D).
Therefore, I tried another approach than the "usual way": I made a loop in Python that calls read() on my special file and evals the result. This read function (implemented as a built extension) blocks until there is data to read. (Actually, it's currently a busy while loop in C code rather than a pthread-based condition.)
Then, with PyRun_SimpleString(), I launch my loop in another thread. This is where the problem is: my read function, in addition to blocking the current thread (which is totally normal), blocks the whole interpreter, and PyRun_SimpleString() doesn't return...
Finally, I have this last idea, which risks being relatively slow: have a dedicated thread in C++ that runs the interpreter, and do everything in Python to manage input/output. This could be a loop that creates jobs whenever a console needs to execute a command. It doesn't seem really hard to do, but I'd rather ask you first: is there a way to make the possibilities above work, is there another way I haven't thought about, or is my last idea the best one?
One alternative is to just reuse code from IPython and its Qt console. This assumes that by independent interpreters you mean they won't share memory. IPython runs the Python interpreter in multiple processes and communicates with them over TCP or Unix domain sockets with the help of ZeroMQ.
Also, from your question I'm not sure if you're aware of the common blocking I/O idiom in Python C extensions:
Py_BEGIN_ALLOW_THREADS      /* releases the GIL */
/* ... do the blocking I/O operation (e.g. your blocking read()) ... */
Py_END_ALLOW_THREADS        /* re-acquires the GIL before returning to Python */
This releases the GIL so that other threads can execute Python code while your function is blocking. See Python/C API Reference Manual: Thread State and the Global Interpreter Lock.
If your main requirement is to have several interpreters independent from each other, you'd probably be better served by fork() and exec() than by multithreading.
That way each interpreter lives in its own address space and doesn't disturb any of the others.
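If it helps, here is a rough sketch of that idea seen from the Python side; subprocess.Popen() does the fork()/exec() for you on Unix, and the choice of three consoles and the -i -q flags is purely illustrative:

    import subprocess
    import sys

    # one fully independent interpreter process per console widget
    consoles = [
        subprocess.Popen(
            [sys.executable, "-i", "-q"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
        for _ in range(3)
    ]

    # feed a command to the first console; in the real application its output
    # would be read back asynchronously and shown in the corresponding Qt widget
    consoles[0].stdin.write(b"print(2 + 2)\n")
    consoles[0].stdin.flush()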
I'm having a problem creating inter-process communication for my Python application. I have two Python scripts at hand, let's say A and B. A is used to open a huge file, keep it in memory and do some processing that MySQL can't do, and B is a process that queries A very often.
Since the file A needs to read is really large, I want to read it once and have A stay resident, waiting for my B's to query it.
What I do now is use CherryPy to build an HTTP server. However, that feels awkward, since what I'm trying to do is entirely local. So I'm wondering whether there is a more natural way to achieve this goal.
I don't know much about TCP/sockets etc. If possible, toy examples would be appreciated (please include the part that reads the file).
Python has good support for ZeroMQ, which is much easier and more robust than using raw sockets.
The ZeroMQ site treats Python as one of its primary languages and offers copious Python examples in its documentation. Indeed, the example in "Learn the Basics" is written in Python.
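Since you asked for a toy example, here is a hedged sketch using pyzmq's REQ/REP pattern; the file name, the tab-separated "key value" layout and the port number are assumptions you would replace with your own:

    # a_server.py -- process A: load the huge file once, then answer queries
    import zmq

    def load_table(path="huge_file.txt"):        # placeholder name and format
        table = {}
        with open(path) as fh:
            for line in fh:
                key, _, value = line.rstrip("\n").partition("\t")
                table[key] = value
        return table

    table = load_table()                         # read once, kept in memory

    ctx = zmq.Context()
    sock = ctx.socket(zmq.REP)
    sock.bind("tcp://127.0.0.1:5555")            # stays on the local machine

    while True:
        key = sock.recv_string()                 # a query from B
        sock.send_string(table.get(key, ""))     # the answer (empty if missing)

    # b_client.py -- process B: query A as often as you like
    #     import zmq
    #     ctx = zmq.Context()
    #     sock = ctx.socket(zmq.REQ)
    #     sock.connect("tcp://127.0.0.1:5555")
    #     sock.send_string("some_key")
    #     print(sock.recv_string())

On *nix you could also bind to an ipc:// endpoint (a Unix domain socket) instead of TCP if you want to keep things strictly local.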
A Python program opens a new process running the C++ program and reads that process's stdout.
No problem so far.
But is it possible to have multiple streams like this for communication? I can get two if I misuse stderr as well, but not more. An easy way to hack around this would be temporary files. Is there something more elegant that doesn't need a detour through the filesystem?
PS: *nix-specific solutions are welcome too.
On Unix systems, the usual way to open a subprocess is with fork(), which leaves any open file descriptors (small integers representing open files or sockets) available in both the child and the parent, followed by exec(), which also allows the new executable to use the file descriptors that were open in the old process. This functionality is preserved in the subprocess.Popen() call (adjustable with the close_fds argument). Thus, what you probably want to do is use os.pipe() to create pairs of file descriptors to communicate on, then use Popen() to launch the other process, passing each fd returned by pipe() as an argument so the child knows which fds to use.
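A hedged sketch of that on Python 3, where pass_fds is the modern spelling of the close_fds tweak mentioned above, and "./worker" stands in for your C++ program (assumed to take the fd numbers as command-line arguments):

    import os
    import subprocess

    r1, w1 = os.pipe()        # child writes on w1, parent reads on r1
    r2, w2 = os.pipe()        # a second, independent channel

    proc = subprocess.Popen(
        ["./worker", str(w1), str(w2)],   # tell the child which fds to use
        pass_fds=(w1, w2),                # keep just these extra fds open in the child
    )

    os.close(w1)              # the parent keeps only the read ends
    os.close(w2)

    with os.fdopen(r1) as chan1, os.fdopen(r2) as chan2:
        print("channel 1:", chan1.readline().rstrip())
        print("channel 2:", chan2.readline().rstrip())

    proc.wait()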
Sounds like what you want is to use sockets for communication. Both languages let you open raw sockets, but you might want to check out the ZeroMQ project as well, which has some additional advantages for message passing. Check out their hello world examples in C++ and Python.
Assuming a Windows machine:
You could try using the clipboard for exchanging information between the Python and C++ processes.
Assign some unique process ID, followed by your information, and write it to the clipboard on the Python side; then just parse the string on the C++ side.
It's akin to using temporary files, but all done in memory. The drawback is that you can't use the clipboard for any other application.
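For what it's worth, a minimal sketch of the Python side using only the standard library (tkinter); the PYPROC-1234 tag is made up, and note that tkinter gives up the clipboard when the process exits, so the writer has to stay alive until the C++ side has read the text:

    import time
    import tkinter

    root = tkinter.Tk()
    root.withdraw()                          # no window needed, clipboard only
    root.clipboard_clear()
    root.clipboard_append("PYPROC-1234:temperature=21.5")
    root.update()                            # hand the text to the OS clipboard
    time.sleep(10)                           # keep ownership while C++ reads it
    root.destroy()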
hope it helps
With traditional, synchronous programming and the standard Python library, what you're asking is difficult to accomplish. If, instead, you consider using an asynchronous programming model and the Twisted library, it's a piece of cake. The Using Processes HOWTO describes how to easily communicate with as many processes as you like. Admittedly, there's a bit of a learning curve to Twisted but it's well worth the effort.
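For a flavour of it, here is a hedged sketch along the lines of that HOWTO: spawn the C++ program (a placeholder "./worker") with an extra pipe on fd 3 in addition to stdout and stderr, and let the protocol tell the streams apart:

    from twisted.internet import protocol, reactor

    class MultiStreamProtocol(protocol.ProcessProtocol):
        def childDataReceived(self, childFD, data):
            # called for every pipe we asked for; childFD tells the streams apart
            print("fd %d sent: %r" % (childFD, data))

        def processEnded(self, reason):
            reactor.stop()

    reactor.spawnProcess(
        MultiStreamProtocol(),
        "./worker", ["./worker"],
        # fd 0: we write (its stdin); fds 1, 2 and 3: we read from the child
        childFDs={0: "w", 1: "r", 2: "r", 3: "r"},
    )
    reactor.run()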
I have a CPU-intensive piece of code that uses a heavy dictionary as data (around 250M of data). I have a multicore processor and want to utilize it so that I can run more than one task at a time. The dictionary is mostly read-only and may be updated once a day.
How can I write this in Python without duplicating the dictionary?
I understand that Python threads are limited by the GIL and will not offer true concurrency. Can I use the multiprocessing module without data being serialized between processes?
I come from the Java world, and my requirement would be something like Java threads, which can share data, run on multiple processors and offer synchronization primitives.
You can share read-only data among processes simply with a fork (on Unix; there's no easy way on Windows), but that won't catch the "once a day" change (you'd need to put an explicit mechanism in place for each process to update its own copy). Native Python structures like dict are just not designed to live at arbitrary addresses in shared memory (you'd have to code a dict variant supporting that in C), so they offer no solace.
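To make the fork idea concrete, here is a sketch under the assumption that you are on Unix and can build the dict before the workers are started; load_big_dict() is a stand-in for your real loading code:

    import multiprocessing as mp

    BIG_DICT = {}                      # filled before the workers are forked

    def load_big_dict():
        # placeholder for reading the ~250M data set from disk
        return {i: i * i for i in range(1000)}

    def lookup(key):
        # workers only read; copy-on-write pages mean nothing is duplicated up front
        return BIG_DICT.get(key)

    if __name__ == "__main__":
        BIG_DICT = load_big_dict()
        with mp.get_context("fork").Pool(processes=4) as pool:
            print(pool.map(lookup, [2, 3, 5, 7]))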
You could use Jython (or IronPython) to get a Python implementation with exactly the same multi-threading abilities as Java (or, respectively, C#), including multiple-processor usage by multiple simultaneous threads.
Use shelve for the dictionary. Since writes are infrequent there shouldn't be an issue with sharing it.
Take a look at this in the stdlib:
http://docs.python.org/library/multiprocessing.html
There are a bunch of wonderful features that will allow you to share data structures between processes very easily.
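For example, a hedged sketch with a Manager-backed dict (the two toy keys stand in for the 250M data set); the dict lives in a manager process, so workers query it instead of each holding a full copy, at the cost of a proxy call per access:

    from multiprocessing import Manager, Process

    def worker(shared, key, out):
        out.put((key, shared.get(key)))

    if __name__ == "__main__":
        manager = Manager()
        shared = manager.dict({"alpha": 1, "beta": 2})   # toy stand-in for the data
        results = manager.Queue()

        procs = [Process(target=worker, args=(shared, k, results))
                 for k in ("alpha", "beta")]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

        while not results.empty():
            print(results.get())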
Would it be possible to make a Python cluster by writing a telnet server, then telnetting the commands and output back and forth? Has anyone got a better idea for a Python compute cluster?
PS: Preferably for Python 3.x, if anyone knows how.
The Python wiki hosts a very comprehensive list of Python cluster computing libraries and tools. You might be especially interested in Parallel Python.
Edit: there is a new library that is, IMHO, especially good at clustering: execnet. It is small and simple, and it appears to have fewer bugs than, say, the standard multiprocessing module.
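A tiny hedged execnet sketch; the "workerhost" ssh target is a placeholder for one of your cluster nodes:

    import execnet

    # open a gateway to a remote Python over ssh
    gw = execnet.makegateway("ssh=workerhost//python=python3")
    channel = gw.remote_exec(
        "import socket\n"
        "channel.send(('hello from', socket.gethostname()))\n"
    )
    print(channel.receive())   # result comes back through the channel
    gw.exit()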
You can see most of the third-party packages available for Python 3 listed here; relevant to cluster computation is mpi4py -- most other distributed computing tools, such as Pyro, are still Python 2 only, but MPI is a leading standard for cluster distributed computation and is well worth looking into (I have no direct experience using mpi4py with Python 3 yet, but by hearsay I believe it's a good implementation).
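As a taste of the MPI route, a hedged mpi4py sketch (run with something like mpiexec -n 4 python3 hello_mpi.py; it assumes an MPI runtime plus mpi4py are installed on the nodes):

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()          # this process's id within the cluster job
    size = comm.Get_size()          # total number of processes started by mpiexec

    # a toy scatter/gather: rank 0 hands out work, everyone returns a result
    data = list(range(size)) if rank == 0 else None
    chunk = comm.scatter(data, root=0)
    result = comm.gather(chunk * chunk, root=0)

    if rank == 0:
        print("squares gathered from all ranks:", result)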
The main alternative is Python's own built-in multiprocessing, which also scales up pretty well if you have no interest in interfacing existing nodes that respect the MPI standards but may not be coded in Python.
There is no real added value in rolling your own (as Atwood says, don't reinvent the wheel, unless your purpose is just to better understand wheels!) -- use one of the solid, widespread solutions, already tested, debugged and optimized on your behalf!
Look into these:
http://www.parallelpython.com/
http://pyro.sourceforge.net/
I have used both, and both are excellent for distributed computing.
For a more detailed list of options, see
http://wiki.python.org/moin/ParallelProcessing
And if you want to auto-execute something on a remote machine, a better alternative to telnet is ssh, as in http://pydsh.sourceforge.net/
What kind of stuff do you want to do? You might want to check out Hadoop. The backend heavy lifting is done in Java, but it has a Python interface, so you can write Python scripts to create and send the input, as well as process the results.
If you need to write administrative scripts, take a look at the ClusterShell Python library too, and/or its parallel shell clush. It's also useful when dealing with node sets (man nodeset).
I think IPython.parallel is the way to go. I've been using it extensively for the last year and a half. It allows you to work interactively with as many worker nodes as you want. If you are on AWS, StarCluster is a great way to get IPython.parallel up and running quickly and easily with as many EC2 nodes as you can afford. (It can also automatically install Hadoop, and a variety of other useful tools, if needed.) There are some tricks to using it. (For example, you don't want to send large amounts of data through the IPython.parallel interface itself. Better to distribute a script that will pull down chunks of data on each engine individually.) But overall, I've found it to be a remarkably easy way to do distributed processing (WAY better than Hadoop!)
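A hedged sketch of the basic IPython.parallel usage referred to above (it assumes a cluster is already running, e.g. via ipcluster start -n 4 or StarCluster):

    from IPython.parallel import Client

    rc = Client()                    # connect to the running controller
    dview = rc[:]                    # a direct view on all engines

    def crunch(x):
        # keep the payload small; each engine should pull its own data chunks
        return x * x

    results = dview.map_sync(crunch, range(16))
    print(results)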
"Would it be possible to make a python cluster"
Yes.
I love yes/no questions. Anything else you want to know?
(Note that Python 3 has few third-party libraries yet, so you may wanna stay with Python 2 at the moment.)