I am working on a Python script designed to control multiple actuators at the same time. To simplify, let's say I need to control 2 electric motors.
With the multiprocessing module I create a process for each motor, and a process to save the data to a sheet.
This part of the script works fine; however, I need to command my motors at precise times, every millisecond, and the time.time() and time.clock() functions seem to be unreliable (the interval between triggers varies between 0.05 and 30 milliseconds!).
Is it "normal" for these functions to be so erratic, or is this caused by another part of my script?
EDIT: I used the datetime function (see below) to improve the precision, but I still get several discrete levels of error. For example, if I want 1 ms, I also get 1.25, 0.75, 1.5...
So IMO this is due to the computer hardware (as Serge Ballesta said).
As I "only" need a relative time (1 ms between each command), do you know a way to do that precisely?
The best you can hope for is datetime.datetime.now(), which will give you microsecond resolution:
>>> import datetime
>>> datetime.datetime.now()
datetime.datetime(2014, 7, 15, 14, 31, 1, 521368)
>>> i = datetime.datetime.now()
>>> q = datetime.datetime.now()
>>> (i-q).seconds
86389
>>> (i-q).microseconds
648299
>>> i.microsecond
513160
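For relative timing of the kind asked about (1 ms between commands), a monotonic clock such as time.perf_counter() (Python 3.3+) is usually a safer reference than wall-clock timestamps. Below is a minimal sketch, assuming you can afford to busy-wait on one CPU core; run_periodic and its parameters are illustrative names, not part of any library. On a non-real-time OS the scheduler can still preempt the loop, so this reduces jitter but does not guarantee it away.

```python
import time


def run_periodic(action, period_s=0.001, n_ticks=5):
    """Call action() once per period and return when each call was issued.

    Deadlines are computed from the start time, so a late tick does not
    push every later tick back (no accumulated drift).
    """
    start = time.perf_counter()
    issued = []
    for tick in range(n_ticks):
        deadline = start + tick * period_s
        # Busy-wait: burns CPU, but avoids the wake-up latency of time.sleep().
        while time.perf_counter() < deadline:
            pass
        issued.append(time.perf_counter() - start)
        action()
    return issued


if __name__ == "__main__":
    # The lambda is a stand-in for the real motor command.
    stamps = run_periodic(lambda: None)
    print(["%.3f ms" % (s * 1e3) for s in stamps])
```

Each returned value is the offset from the start at which the command was issued, so the spread around multiples of the period shows the jitter actually achieved.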
IMHO the problem is not in your script but more probably in your machine. I assume you are using a standard computer running Linux or Windows. Even though the computer can do many things in a single millisecond, it is constantly busy with:
network
mouse or keyboard events
screen (including screen savers ...)
antivirus software (mainly on Windows)
system or daemon processes (and there are plenty of them)
multi-task management
You cannot reliably expect one-millisecond accuracy without dedicated hardware equipment or a real-time system.
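To see how much of this applies to a given machine, one option is to measure how far time.sleep() overshoots a 1 ms request. This is only a rough sketch; measure_sleep_overshoot is a made-up helper name, and the numbers will differ between operating systems and load conditions.

```python
import time


def measure_sleep_overshoot(requested_s=0.001, samples=20):
    """Return, for each sample, how much longer sleep() took than requested."""
    overshoots = []
    for _ in range(samples):
        before = time.perf_counter()
        time.sleep(requested_s)
        elapsed = time.perf_counter() - before
        overshoots.append(elapsed - requested_s)
    return overshoots


if __name__ == "__main__":
    values = measure_sleep_overshoot()
    print("min overshoot: %.3f ms" % (min(values) * 1e3))
    print("max overshoot: %.3f ms" % (max(values) * 1e3))
```

A large gap between the minimum and maximum suggests the variation comes from the scheduler rather than from the script itself.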
Edit:
As a complement, here is a quote from an answer by Ulrich Eckhardt to this other post, Inconsistent Python Performance on Windows:
You have code that performs serial IO, which will block your process and possibly invoke the scheduler. If that happens, it will first give control to a different process and only when that one yields or exceeds its timeslice will it re-schedule your process.
The question for you is: What is the size of the scheduler timeslice of the systems you are running? I believe that this will give you an insight into what is happening.
I can't comment yet (reputation), so I'm posting this as an answer:
It's just a clue:
Try using the cProfile module.
https://docs.python.org/2/library/profile.html
It lets you check how long your script takes to execute, and how long each function in it takes. The run() function in the cProfile module prints detailed statistics about your script.
Maybe it can help you.
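A small sketch of what that can look like, using cProfile.Profile and pstats from the standard library. busy_work is only a placeholder for the function you actually want to inspect, and profile_call is an illustrative helper name.

```python
import cProfile
import io
import pstats


def busy_work(n=20000):
    """Placeholder workload standing in for the real function."""
    return sum(i * i for i in range(n))


def profile_call(func, *args):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(busy_work)
    print(report)
```

Note that the profiler adds overhead of its own, so it shows where time goes rather than giving exact millisecond timings.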
I would like to know what the Python interpreter is doing in my production environments.
Some time ago I wrote a simple tool called live-trace, which runs a daemon thread that collects stack traces every N milliseconds.
But signal handling in the interpreter itself has one disadvantage:
Although Python signal handlers are called asynchronously as far as the Python user is concerned, they can only occur between the “atomic” instructions of the Python interpreter. This means that signals arriving during long calculations implemented purely in C (such as regular expression matches on large bodies of text) may be delayed for an arbitrary amount of time.
Source: https://docs.python.org/2/library/signal.html
How could I work around the above constraint and get a stack trace, even if the interpreter is in some C code for several seconds?
Related: https://github.com/23andMe/djdt-flamegraph/issues/5
I use py-spy with speedscope now. It is a very cool combination.
py-spy works on Windows/Linux/macOS, can output flame graphs on its own, and is actively developed; e.g. subprocess profiling support was added in October 2019.
Have you tried Pyflame? It's based on ptrace, so it shouldn't be affected by CPython's signal handling subtleties.
Maybe the perf-tools collection from Brendan Gregg can help.
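Besides the external tools mentioned above, the standard library's faulthandler module (Python 3.3+) may be worth a try. Its dump_traceback_later() uses a watchdog thread implemented in C, so as far as I know it does not depend on the interpreter reaching the next bytecode instruction. The sketch below is only an illustration: dump_stacks_periodically is a made-up name, and time.sleep() stands in for the long-running work.

```python
import faulthandler
import tempfile
import time


def dump_stacks_periodically(log_file, every_s=0.05, run_for_s=0.3):
    """Write the tracebacks of all threads to log_file every every_s seconds."""
    # log_file must be a real file with a file descriptor (not io.StringIO).
    faulthandler.dump_traceback_later(every_s, repeat=True, file=log_file)
    try:
        time.sleep(run_for_s)  # stand-in for the real long-running work
    finally:
        faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    with tempfile.TemporaryFile(mode="w+") as log:
        dump_stacks_periodically(log)
        log.seek(0)
        print(log.read())
```

The output is plain text tracebacks rather than aggregated samples, so it would still need post-processing to build something like a flame graph.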
So I am working on a Matlab application that has to communicate with a Python script. The script that is called is a simple client program. As a side note, if it were possible to have a Matlab client and a Python server communicating, this would solve the issue completely, but I haven't found a way to work that out.
Anyhow, after searching the web I have found two ways to call Python scripts: either with the system() command, or by editing the perl.m file to call Python scripts instead. Both ways are too slow, though (timing them with tic/toc gives > 20 ms, and the call must take less than 6 ms), as this call will be in a loop that is very time sensitive.
As a solution I figured I could instead save a file at a certain location and have my Python script continuously check for this file and, when it finds it, execute the command I want. After timing each of these steps and summing them up, I found this to be much faster (almost 100x, so certainly fast enough). I can't really believe that, or rather I can't understand why calling Python scripts is so slow (not that I have more than a superficial knowledge of the subject). I also find this solution really messy and ugly, so I wanted to check: first, is it a good idea, and second, is there a better one?
Finally, I realize that Python's time.time() and Matlab's tic/toc might not be precise enough to measure time correctly on that scale, which is another reason why I ask.
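On the Python side, one way to cross-check such timings is time.perf_counter(), averaged over several repetitions. The sketch below compares launching a fresh interpreter with calling a function inside an already running one; mean_seconds, launch_interpreter and call_in_process are illustrative names, and the workloads are placeholders.

```python
import subprocess
import sys
import time


def mean_seconds(func, repeats=5):
    """Average wall-clock duration of func() over several repetitions."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


def launch_interpreter():
    """Start a new Python process that does nothing and wait for it to exit."""
    subprocess.run([sys.executable, "-c", "pass"], check=True)


def call_in_process():
    """Trivial work done inside the interpreter that is already running."""
    sum(range(100))


if __name__ == "__main__":
    print("new interpreter: %.3f ms" % (mean_seconds(launch_interpreter) * 1e3))
    print("in-process call: %.6f ms" % (mean_seconds(call_in_process) * 1e3))
```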
Spinning up new instances of the Python interpreter takes a while. If you spin up the interpreter once, and reuse it, this cost is paid only once, rather than for every run.
This is normal (expected) behaviour, since startup includes large numbers of allocations and imports. For example, on my machine, the startup time is:
$ time python -c 'import sys'
real 0m0.034s
user 0m0.022s
sys 0m0.011s
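One way to pay the startup cost only once is to keep a Python process running and send it commands over a local TCP socket. The following is a rough sketch, not a finished protocol: CommandHandler, start_server and send_command are made-up names, the one-line-per-command format is an assumption, and the upper-casing is just a placeholder for the real work. The client side is written in Python here only to keep the example self-contained; any TCP client could play that role.

```python
import socket
import socketserver
import threading


class CommandHandler(socketserver.StreamRequestHandler):
    """Reads newline-terminated commands and answers each with one line."""

    def handle(self):
        for raw_line in self.rfile:
            command = raw_line.decode("utf-8").strip()
            reply = command.upper()  # placeholder for the real command logic
            self.wfile.write((reply + "\n").encode("utf-8"))


def start_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), CommandHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_command(address, command):
    """Send one command and return the one-line reply."""
    with socket.create_connection(address, timeout=2.0) as conn:
        conn.sendall((command + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


if __name__ == "__main__":
    server = start_server()
    print(send_command(server.server_address, "ping"))
    server.shutdown()
    server.server_close()
```

In a tight loop it would be better to keep one connection open and reuse it rather than reconnecting for every command as send_command does.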
How exactly do I use the function get_cpu_percent()?
My code is:
SDKTestSuite.DijSDK_CalculateFps(int(timeForFPS),int(index),cameraName)
cpuUsage = process.get_cpu_percent()
Here I call a function named SDKTestSuite.DijSDK_CalculateFps() and then call get_cpu_percent() to get the CPU usage of that call. I call this function with different inputs, and sometimes the reported CPU usage is 0.0%, which is not expected.
So am I using get_cpu_percent() correctly? How exactly should this function be used? Does the interval parameter matter here?
In the actual definition of this function it just sleeps for the given interval and compares the CPU times, but how does it measure my function call here?
If you read the docs, psutil.cpu_percent will:
Return a float representing the current system-wide CPU utilization as a percentage… When interval is 0.0 or None compares system CPU times elapsed since last call or module import, returning immediately…
I'm pretty sure that's not what you want.
First, if you want to know the CPU usage during a specific call, you have to either (a) call it before and after the function, or (b) call it from another thread or process running in parallel with the call, and for the same time as the call (by passing an interval).
Second, if you want to know how much CPU time that call is using, as opposed to how much that call plus everything else being done by every program on your computer are using, you're not even calling the right function, so there's no way to get it to do what you want.
And there's nothing in psutil that's designed to do that. The whole point of psutil is that it provides information about your system, and about your program from the "outside" perspective. It doesn't know anything about what functions you ran when.
However, there are things that come with the stdlib that do things like that, like resource.getrusage or the cProfile module. Without knowing exactly what you're trying to accomplish, it's hard to tell you exactly what to do, but maybe if you read those linked docs they'll give you some ideas.
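As one illustration of the stdlib route, here is a sketch that uses time.process_time() (Python 3.3+) instead of resource.getrusage, mainly because it also works on Windows. cpu_seconds_for is a made-up helper name. It reports CPU seconds consumed by the whole current process during the call, not a percentage, and it does not include time spent in child processes.

```python
import time


def cpu_seconds_for(func, *args, **kwargs):
    """Run func and return (result, CPU seconds used by this process)."""
    before = time.process_time()
    result = func(*args, **kwargs)
    return result, time.process_time() - before


if __name__ == "__main__":
    # sum(range(...)) is a stand-in for the call you want to measure.
    total, cpu = cpu_seconds_for(sum, range(1000000))
    print("result=%d, CPU time=%.4f s" % (total, cpu))
```

Dividing the CPU seconds by the wall-clock duration of the same call gives a rough per-call utilization figure, if a percentage is what you are after.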
you need psutil
pip install psutil
then
import psutil,os
p = psutil.Process(os.getpid())
print(p.cpu_percent())
os.getpid() returns the PID of the program running this code. For example, when you run it with pythonw.exe, it returns the PID of pythonw.exe.
Hence p.cpu_percent() (named get_cpu_percent() in psutil versions before 2.0) returns the CPU usage of that process.
If you want system-wide CPU usage, psutil.cpu_percent() can do the job.
Hope this helps, Cheers!
Thanks all...
I got the solution for my query.
>>> p = psutil.Process(os.getpid())
>>> # blocking
>>> p.get_cpu_percent(interval=1)
2.0
>>> # non-blocking (percentage since last call)
>>> p.get_cpu_percent(interval=0)
2.9
Here the interval value matters a lot. The final call gives the CPU usage, as a percentage, for my actual function call.
I'm using feedparser to print the top 5 Google news titles. I get all the information from the URL the same way as always.
x = 'https://news.google.com/news/feeds?pz=1&cf=all&ned=us&hl=en&topic=t&output=rss'
feed = fp.parse(x)
My problem is that I'm running this script when I start a shell, so that ~2 second lag gets quite annoying. Is this time delay primarily from communications through the network, or is it from parsing the file?
If it's from parsing the file, is there a way to only take what I need (since that is very minimal in this case)?
If it's from the former possibility, is there any way to speed this process up?
I suppose that a few delays are adding up:
The Python interpreter needs a while to start and import the module
Network communication takes a bit
Parsing probably takes only a little time, but it still adds to the total
I think there is no straightforward way of speeding things up, especially not the first point. My suggestion is that you have your feeds downloaded on a regular basis (you could set up a cron job or write a Python daemon) and stored somewhere on your disk (e.g. a plain text file), so you just need to display them at your terminal's startup (echo would probably be the easiest and fastest).
I personally have had good experiences with feedparser. I use it to download ~100 feeds every half hour with a Python daemon.
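The split between "download in the background" and "display at startup" could look roughly like this. refresh_cache and read_cache are hypothetical helper names, and fetch_titles is any callable returning a list of strings; with feedparser it might be something like a lambda returning [entry.title for entry in fp.parse(x).entries], which is an assumption about your setup rather than tested code.

```python
import os


def refresh_cache(cache_path, fetch_titles, limit=5):
    """Run from cron or a daemon: fetch titles and store the first few."""
    titles = list(fetch_titles())[:limit]
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(titles) + "\n")
    # Swap the file in one step so a reader never sees a half-written cache.
    os.replace(tmp_path, cache_path)


def read_cache(cache_path):
    """Run at shell startup: return cached titles, or [] if none exist yet."""
    try:
        with open(cache_path, encoding="utf-8") as handle:
            return handle.read().splitlines()
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    path = os.path.join(os.path.expanduser("~"), ".news_titles")
    # Stand-in fetcher; replace with the real feed download.
    refresh_cache(path, lambda: ["Example title %d" % i for i in range(1, 8)])
    print("\n".join(read_cache(path)))
```

Since the cache is plain text, the shell startup file does not need Python at all; printing the file with cat avoids the interpreter startup cost as well.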
Parsing in real time is not the best approach if you want faster results.
You can try doing it asynchronously with Celery or a similar solution. I like Celery; it offers many features, such as cron-like periodic tasks, asynchronous tasks, and more.
Update : For anyone wondering what I went with at the end -
I divided the result set into 4 and ran 4 instances of the same program, each with one argument indicating which set to process. It did the trick for me. I also considered the PP module. Though it worked, I preferred running the same program. Please pitch in if this is a horrible implementation! Thanks.
The following is what my program does. Nothing memory intensive. It is serial processing and boring. Could you help me convert this to a more efficient and exciting process? Say I process 1000 records this way; with 4 threads, I could get it to run in 25% of the time!
I read articles on how Python threading can be inefficient if done wrong. Even Python's creator says the same. So I am scared and, while I am reading more about them, I want to see if the bright folks on here can steer me in the right direction. Muchas gracias!
def startProcessing(sernum, name):
    '''
    Bunch of statements depending on result,
    will write to database (one update statement)
    Try Catch blocks which upon failing,
    will call this function until the request succeeds.
    '''

for record in result:
    startProc = startProcessing(str(record[0]), str(record[1]))
Python threads can't execute Python bytecode at the same time due to the Global Interpreter Lock; you want separate processes instead. Look at the multiprocessing module.
(I was instructed to post this as an answer =p.)
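A minimal sketch of what that could look like with multiprocessing.Pool. process_record and process_all are illustrative names, the string formatting is a placeholder for the real per-record work, and the sample list stands in for the query result. If each record writes to a database, each worker process would most likely need to open its own connection, which is not shown here.

```python
import multiprocessing


def process_record(record):
    """Placeholder for the real per-record work (database update, retries)."""
    sernum, name = str(record[0]), str(record[1])
    return "%s:%s" % (sernum, name)


def process_all(records, workers=4):
    """Distribute the records over a pool of worker processes."""
    with multiprocessing.Pool(processes=workers) as pool:
        # map() splits the records across workers and keeps the input order.
        return pool.map(process_record, records)


if __name__ == "__main__":
    # The guard is required so worker processes do not re-run this block.
    sample = [(1, "alpha"), (2, "beta"), (3, "gamma"), (4, "delta")]
    print(process_all(sample))
```

Whether this actually reaches 25% of the serial time depends on how much of the work is waiting on the database; the pool only helps with the part that can really run in parallel.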