How to synchronize Python and C# programs running on Windows? - python

I am currently working on a project where I need to synchronize data between Python and C#. I need to label data coming from the C# application using a Python machine learning program. To label the data, I take a timestamp in both applications and match records on the common timestamp.
The Python program runs every 0.5 to 1.5 seconds, while the C# program runs 10 times per second. Since the two processes run at different rates, there is some time lag, so labelling the data by timestamp is not very accurate. I want to analyse this lag properly, and for that I am looking into options for real-time synchronization between the two programs. I have looked at sockets, but I suspect there is a better way using IPC, which I do not know much about.
I am thinking of creating a shared variable between Python and C#. Since Python is the slower side, the Python program would update the variable and the C# program would read it, so seeing the same value in both programs would tell me they are synchronized. I could then use the value of this variable instead of the timestamp for labelling the data. I think this might solve the issue, but please let me know what the optimal solution would be to minimize the time lag between the two programs.
Since these are complex projects, I cannot implement them as a single program; I need a way to synchronize the two separate programs.
Any suggestions would be appreciated. Thank you.
I tried socket programming, but it did not work well for me and was a bit complex, so I am now considering IPC, although I am still not sure which approach is best.

First of all, I implemented a socket server in the C# program so that I can get its data over the socket. Then I used multiprocessing in Python: one process requests data from the socket while another process runs the ML model. I was able to achieve the synchronization with the multiprocessing module, using multiprocessing.Event() to wait for a signal from the other process. You can also look into shared state in Python via multiprocessing.Value, multiprocessing.Array, and multiprocessing.Event.
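For illustration, here is a minimal sketch of that arrangement, assuming the C# side exposes a plain TCP socket on 127.0.0.1:5005 that streams one numeric reading per message (the address and wire format are placeholders, not part of the original setup):

import socket
import multiprocessing as mp

HOST, PORT = "127.0.0.1", 5005   # hypothetical address of the C# socket server

def reader(latest, ready):
    """Pull readings from the C# socket server and publish the newest value."""
    with socket.create_connection((HOST, PORT)) as sock:
        while True:
            data = sock.recv(1024)           # assumes one reading per recv()
            if not data:
                break
            latest.value = float(data.decode().strip())
            ready.set()                      # tell the ML process a fresh value exists

def ml_worker(latest, ready):
    """Wait for each fresh value and run the labelling/ML step on it."""
    while True:
        ready.wait()                         # block until the reader signals new data
        ready.clear()
        value = latest.value
        # ... run the ML model / labelling on `value` here ...

if __name__ == "__main__":
    latest = mp.Value("d", 0.0)              # shared double written by the reader
    ready = mp.Event()                       # synchronization flag between the two processes
    procs = [mp.Process(target=reader, args=(latest, ready)),
             mp.Process(target=ml_worker, args=(latest, ready))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

Because both processes see the same Event and Value objects, the ML side labels exactly the reading that the socket side last received, instead of relying on timestamps.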

Related

How to read a sensor in C but then use that input in Python

I have a flow sensor that I have to read with C because Python isn't fast enough, but the rest of my code is Python. What I want is to have the C code running in the background and have Python request a value from it every now and then. I know that popen is probably the easiest way to do this, but I don't fully understand how to use it. I don't want completed code; I just want a way to send text/numbers back and forth between a Python and a C program. I am running Raspbian on a Raspberry Pi Zero W. Any help would be appreciated.
Probably not a full answer, but I expect it gives some hints, and it is far too long for a comment. You should think twice about your requirements, because this will probably not be easy, depending on your proficiency in C and on what OS you are using.
If I have understood correctly, you have a sensor that sends data (which is already unusual unless it is an intelligent sensor). You want to write a C program that reads that data, either buffers it or retains only the last value (you did not say which), and at the same time waits for requests from a Python script so it can hand back what it has received (and kept) from the sensor. That probably means a dual-threaded program with quite a bit of synchronization.
You will also need to decide how C and Python communicate. You can certainly use the subprocess module, but do not forget to use unbuffered output on the C side. Alternatively, you could write an independent program that uses a FIFO or a named pipe with a well-defined protocol for external requests, in order to completely separate the two problems.
So my opinion is that this is currently too broad for a single SO question...
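That said, a minimal sketch of the FIFO route on the Python side might look as follows (the FIFO path and the newline-terminated numeric format are assumptions; the C program would open the same FIFO for writing):

import os

FIFO_PATH = "/tmp/flow_sensor"               # hypothetical FIFO the C program writes to

# Create the FIFO if the C side has not already done so.
if not os.path.exists(FIFO_PATH):
    os.mkfifo(FIFO_PATH)

def read_latest():
    """Block until the C program writes one newline-terminated reading, then return it."""
    with open(FIFO_PATH, "r") as fifo:
        line = fifo.readline().strip()
    return float(line) if line else None

if __name__ == "__main__":
    print("latest flow reading:", read_latest())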

Optimal way of Matlab to Python communication

So I am working on a Matlab application that has to communicate with a Python script. The script being called is simple client software. As a side note, if it were possible to have a Matlab client and a Python server communicating, that would solve this issue completely, but I haven't found a way to make that work.
Anyhow, after searching the web I have found two ways to call Python scripts: either via the system() command or by editing the perl.m file to call Python scripts instead. Both ways are too slow, though (tic/toc puts them at > 20 ms, and the call must run faster than 6 ms), as this call sits in a loop that is very time sensitive.
As a workaround, I figured I could instead save a file at a certain location and have my Python script continuously check for that file, executing the command I want when it appears. After timing each of these steps and summing them up, I found this to be far faster (almost 100x, so certainly fast enough), and I can't really believe that, or rather I can't understand why calling Python scripts is so slow (admittedly, I have only superficial knowledge of the subject). I also find this solution messy and ugly, so I wanted to check: first, is it a good idea, and second, is there a better one?
Finally, I realize that Python's time.time() and Matlab's tic/toc might not be precise enough to measure time correctly on that scale, which is another reason I ask.
Spinning up new instances of the Python interpreter takes a while. If you spin up the interpreter once, and reuse it, this cost is paid only once, rather than for every run.
This is normal (expected) behaviour, since startup includes large numbers of allocations and imports. For example, on my machine, the startup time is:
$ time python -c 'import sys'
real 0m0.034s
user 0m0.022s
sys 0m0.011s
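One way to pay that startup cost only once is to keep a single Python process running and have Matlab talk to it over a socket (Matlab can act as a TCP client, e.g. via tcpclient). A minimal sketch of the Python side, assuming a line-based request/response protocol on localhost port 9999; the port and the do_work handler are placeholders:

import socketserver

class CommandHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for line in self.rfile:              # one request per line from Matlab
            command = line.decode().strip()
            result = do_work(command)        # hypothetical handler for your command
            self.wfile.write((result + "\n").encode())

def do_work(command):
    # ... run whatever the time-sensitive Matlab loop needs and return a string ...
    return "ok:" + command

if __name__ == "__main__":
    with socketserver.TCPServer(("127.0.0.1", 9999), CommandHandler) as server:
        server.serve_forever()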

Speed up feedparser

I'm using feedparser to print the top 5 Google news titles. I get all the information from the URL the same way as always.
import feedparser as fp

x = 'https://news.google.com/news/feeds?pz=1&cf=all&ned=us&hl=en&topic=t&output=rss'
feed = fp.parse(x)
My problem is that I'm running this script when I start a shell, so that ~2 second lag gets quite annoying. Is this time delay primarily from communications through the network, or is it from parsing the file?
If it's from parsing the file, is there a way to only take what I need (since that is very minimal in this case)?
If it's from the former possibility, is there any way to speed this process up?
I suppose that a few delays are adding up:
The Python interpreter needs a while to start and import the module
Network communication takes a bit
Parsing probably takes only a little time, but it still contributes.
I think there is no straightforward way of speeding things up, especially not for the first point. My suggestion is to download your feeds on a regular basis (you could set up a cron job or write a Python daemon) and store them somewhere on disk (e.g. a plain text file), so at your terminal's startup you only need to display them (echo would probably be the easiest and fastest way).
I personally have had good experiences with feedparser; I use it to download ~100 feeds every half hour with a Python daemon.
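A minimal sketch of such a fetch-and-cache script, meant to be run from cron or a daemon rather than from the shell startup itself (the cache path is a placeholder; the feed URL is the one from the question):

import feedparser

FEED_URL = 'https://news.google.com/news/feeds?pz=1&cf=all&ned=us&hl=en&topic=t&output=rss'
CACHE_FILE = '/tmp/google_news_titles.txt'   # hypothetical cache location

feed = feedparser.parse(FEED_URL)
titles = [entry.title for entry in feed.entries[:5]]

with open(CACHE_FILE, 'w') as f:
    f.write('\n'.join(titles) + '\n')

# In your shell startup file you would then only need: cat /tmp/google_news_titles.txt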
Parsing in real time is not the best option if you want faster results.
You can try doing it asynchronously with Celery or a similar solution. I like Celery; it gives you many capabilities, such as running tasks on a cron-like schedule or asynchronously, and more.
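As a rough sketch of that idea (it assumes the code lives in a module named feeds.py, a Redis broker at its default address, and the same cache file as above; all of these are placeholders), a Celery beat schedule could refresh the cached titles every 30 minutes:

import feedparser
from celery import Celery
from celery.schedules import crontab

app = Celery('feeds', broker='redis://localhost:6379/0')

@app.task
def refresh_titles():
    feed = feedparser.parse('https://news.google.com/news/feeds?pz=1&cf=all&ned=us&hl=en&topic=t&output=rss')
    titles = [entry.title for entry in feed.entries[:5]]
    with open('/tmp/google_news_titles.txt', 'w') as f:
        f.write('\n'.join(titles) + '\n')

# Refresh every 30 minutes so the shell startup only reads the cached file.
app.conf.beat_schedule = {
    'refresh-google-news': {
        'task': 'feeds.refresh_titles',
        'schedule': crontab(minute='*/30'),
    },
}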

RRD library for handling time series data in a Python application

I am working on a simulation engine in Python where I collect a lot of metrics. The simulation runs at high speed and generates around 100K events/second (I can do some processing by consolidating these events on a per-second basis). I am looking for a mechanism to record these metrics as a time series.
My requirements are:
I would like to have this logging mechanism in the same process as the simulation, as opposed to an external process such as Graphite.
The mechanism must be able to handle 100K events/second without slowing down the simulation.
I would like to store data as follows: each metric's data should be stored at 1-second granularity for 60 minutes, 1-minute granularity for 1 day, 5-minute granularity for two days, 1-hour granularity for 6 months, and 1-day granularity for 3 years. I would like this mechanism to handle the consolidation of data over the ranges specified.
Ideally, I want to maintain one file that holds the metrics information for one simulation run. For another run of the simulation a separate file would have to be created.
It would be nice to have a well-tested library/module that is readily available :)
BTW, I took a cursory look at RRDTool but from what I understand it seems like the Python library is a thin wrapper around the RRDTool binary. I'm looking for a tighter integration if possible.
TIA
The functionality provided by RRDTool fits my requirements. Initially I found the Python library https://pypi.python.org/pypi/python-rrdtool/ and misunderstood the nature of the integration: I thought it executed the RRDTool binary as a separate process, but the documentation says it is a proper Python binding that invokes the functionality in the same process space.
Later on I found another Python library (https://pypi.python.org/pypi/PyRRD) that wraps RRDTool functionality in a more Pythonic, object-oriented fashion, which I found comfortable to work with. The documentation on the linked page was good, so I faced no roadblocks in using it.
This link (http://www.vandenbogaerdt.nl/rrdtool/tutorial/rrdcreate.php) was helpful in figuring out how to configure the RRD database during creation.
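For reference, a sketch of an RRD laid out to match the retention scheme described in the question, using the python-rrdtool bindings (the filename and data-source name are placeholders, and the row counts for the longer archives are approximate):

import rrdtool

rrdtool.create(
    'simulation_run.rrd',
    '--step', '1',                       # base resolution: 1 second
    'DS:events:GAUGE:2:0:U',             # one data source named "events"
    'RRA:AVERAGE:0.5:1:3600',            # 1-second samples kept for 60 minutes
    'RRA:AVERAGE:0.5:60:1440',           # 1-minute averages kept for 1 day
    'RRA:AVERAGE:0.5:300:576',           # 5-minute averages kept for 2 days
    'RRA:AVERAGE:0.5:3600:4392',         # 1-hour averages kept for ~6 months
    'RRA:AVERAGE:0.5:86400:1096',        # 1-day averages kept for ~3 years
)

# During the simulation, push one consolidated value per second:
rrdtool.update('simulation_run.rrd', 'N:12345')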

Pythonistas, please help convert this to utilize Python Threading concepts

Update: for anyone wondering what I went with in the end -
I divided the result set into 4 parts and ran 4 instances of the same program, each with one argument indicating which part to process. That did the trick for me. I also considered the PP module; though it worked, I preferred sticking with the same program. Please pitch in if this is a horrible implementation! Thanks.
The following is what my program does. Nothing memory intensive; it is serial processing, and boring. Could you help me convert this into a more efficient and exciting process? Say I process 1000 records this way; with 4 threads I could get it to run in 25% of the time!
I have read articles on how Python threading can be inefficient if done wrong; even Python's creator says the same. So I am wary, and while I read more about it, I want to see if the bright folks here can steer me in the right direction. Muchas gracias!
def startProcessing(sernum, name):
    '''
    Bunch of statements depending on result,
    will write to database (one update statement)
    Try Catch blocks which upon failing,
    will call this function until the request succeeds.
    '''

for record in result:
    startProc = startProcessing(str(record[0]), str(record[1]))
Python threads can't run at the same time due to the Global Interpreter Lock; you want new processes instead. Look at the multiprocessing module.
(I was instructed to post this as an answer =p.)
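A minimal sketch of that multiprocessing route, using a pool of 4 worker processes to map over the records (result and startProcessing are the asker's names; the pool size and wrapper function are illustrative):

from multiprocessing import Pool

def process_record(record):
    # Wrap the asker's startProcessing so each record becomes one task.
    return startProcessing(str(record[0]), str(record[1]))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        pool.map(process_record, result)     # result is the asker's record set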
