I am writing a Python script to get some basic system stats. I am using psutil for most of it, and it is working fine except for one thing that I need.
I'd like to log the current average CPU wait time.
In top's output it would be in the CPU section under %wa.
I can't seem to find how to get that in psutil. Does anyone know how to get it? I am about to go down a road I really don't want to go on....
That entire CPU row is rather nice, since it totals to 100 and it is easy to log and plot.
Thanks in advance.
%wa is giving you the iowait of the CPU. If you are using times = psutil.cpu_times() or times = psutil.cpu_times_percent(), it is available as times.iowait on the returned value (assuming you are on a Linux system).
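A minimal sketch of logging that whole CPU row (Linux assumed, since iowait is not reported on every platform):

import psutil

# Percentages over a 1-second sampling interval; the fields add up to roughly 100,
# matching the CPU row in top's output.
times = psutil.cpu_times_percent(interval=1)
print("user=%s system=%s iowait=%s idle=%s"
      % (times.user, times.system, times.iowait, times.idle))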
I'm writing a Python script that checks the computer's current network usage while downloading something. I've done a lot of research, and most of the things I find online are getting the MAX speed of the PC's NIC. In this case I want only the current speed (in Mbps or something similar). The most promising solution I have come across so far is the library "psutil". So the piece of code goes like this:
import psutil
# bytes_recv (index 1) for the "Ethernet" interface: cumulative bytes received since boot
download = psutil.net_io_counters(pernic=True)["Ethernet"][1]
print(download)
The output I get is '1392877555', which means it is definitely giving me something, but no matter what I do to try to alter this number, it is ALWAYS very close to this value; only the last 3 digits vary. If I download something at the maximum speed my ISP allows, I get this number. If I reduce network usage to a minimum (I can monitor it in Task Manager for testing), I still get this number.
Any ideas why this is happening, or do I need to do something else with this data?
To get the current network speed you can use the speedtest-cli library. Using this library gives you detailed info on your network speed and its configuration. For more details you can refer to this article.
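A minimal sketch of that suggestion (requires pip install speedtest-cli); note that it measures the link's available bandwidth by running a test transfer, not the bytes currently flowing through the NIC:

import speedtest

st = speedtest.Speedtest()
st.get_best_server()          # pick the nearest test server
download_bps = st.download()  # measured download speed in bits per second
upload_bps = st.upload()      # measured upload speed in bits per second
print("Download: %.2f Mbps, Upload: %.2f Mbps" % (download_bps / 1e6, upload_bps / 1e6))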
I am trying to use LDA Mallet to assign my tweets to topics, and it works perfectly well when I feed it with up to 500,000 tweets, but it seems to stop working when I use my whole data set, which is about 2,500,000 tweets. Do you have any solutions for that?
I am monitoring my CPU and RAM usage when I run my codes as one way to make sure the code is actually running (I use Jupyter notebook). I use the code below to assign my tweets to topics.
import os
from gensim.models.wrappers import LdaMallet
os.environ.update({'MALLET_HOME': r'C:/new_mallet/mallet-2.0.8/'})  # tell gensim where the local Mallet lives
mallet_path = 'C:/new_mallet/mallet-2.0.8/bin/mallet'
# corpus and id2word are built earlier from the tweets
ldamallet = LdaMallet(mallet_path, corpus=corpus, num_topics=10, id2word=id2word)
The code seems to work when my data set contains fewer than 500,000 tweets: it spits out the results, and I can see python and/or java use my RAM and CPU. However, when I feed the code my entire data set, Java and Python temporarily show some CPU and RAM usage in the first few seconds, but after that the CPU usage drops to below 1 percent and the RAM usage starts to shrink gradually. I tried running the code several times, but after waiting on the code for 6-7 hours, I saw no increase in the CPU usage and the RAM usage dropped after a while. Also, the code did not produce any results. I finally had to stop the code.
Has this happened to you? Do you have any solutions for it?
Thank you!
This sounds like a memory issue, but the interaction with gensim may be masking the error? I don't know enough about gensim's java interaction to be able to suggest anything. You might try running from the command line directly in hopes that errors might be propagated more clearly.
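A hedged sketch of that suggestion, driving Mallet directly from Python with subprocess so that any Java errors (for example an OutOfMemoryError on the full 2.5M-tweet corpus) print straight to the console; the input and output file names here are placeholders:

import subprocess

mallet_path = 'C:/new_mallet/mallet-2.0.8/bin/mallet'  # on Windows this may need to be bin/mallet.bat

# Convert the tweets (one document per line) into Mallet's binary format
subprocess.run([mallet_path, 'import-file',
                '--input', 'tweets.txt',
                '--output', 'tweets.mallet',
                '--keep-sequence'], check=True)

# Train the topic model; progress and errors appear directly in the terminal
subprocess.run([mallet_path, 'train-topics',
                '--input', 'tweets.mallet',
                '--num-topics', '10',
                '--output-topic-keys', 'topic_keys.txt',
                '--output-doc-topics', 'doc_topics.txt'], check=True)

If Mallet itself runs out of Java heap, the MEMORY setting in the mallet launcher script can usually be raised as well.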
I have code that repeats itself every 10 seconds, but I can't test it for a long time because my PowerShell keeps hanging and the code just stops for no particular reason (the code is still running but it doesn't give out results). Is there a way to test the code, or to run it safely without it being interrupted? I tried to search, but it seems that a library like unittest would just crash along with my code because of the Windows shell if I wanted to run it for, let's say, a day, since it usually hangs just a few hours after I start testing manually.
The code is something like this:
import time
import requests
while True:
    getting = requests.get(some_url)
    result = getting.json()
    posting = requests.post(another_url, headers=headers, json=result)
    time.sleep(10)
Thank you for your help.
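A minimal sketch of one way to make such a long unattended run observable, assuming some_url, another_url and headers are defined as above (the log file name is a placeholder); requests calls without a timeout can block indefinitely, which also looks like a silent hang:

import logging
import time
import requests

logging.basicConfig(filename='poller.log', level=logging.INFO)

while True:
    try:
        getting = requests.get(some_url, timeout=30)
        result = getting.json()
        posting = requests.post(another_url, headers=headers, json=result, timeout=30)
        logging.info("posted, status %s", posting.status_code)
    except requests.RequestException as exc:
        logging.warning("request failed: %s", exc)
    time.sleep(10)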
So, after testing and experimenting, it seems like resource mismanagement by Windows when I use PowerShell, cmd.exe or the default Python IDE. Thus, in case someone wants to test their code for a prolonged period of time, I recommend using PyCharm, as the code has now been running for more than a day for me. This suggests that PyCharm manages resources better in this specific area.
For more details check the comments under the question itself.
scipy.fft seems to hang when running this simple script:
import scipy
from scipy.io import wavfile
sound = 'sounds/silence/iPhone5.wav'
fs, data = wavfile.read(sound)
print scipy.fft(data)
on certain files. Try this file for example.
A few things I noticed:
Running the individual commands from the interactive interpreter does not hang.
Running with other sound files does not always hang the script (it's not just this file that isn't working though)
Sometimes I get WavFileWarning: chunk not understood, but it doesn't seem to be correlated with when the hang happens
If I terminate the script with Ctrl+C I get the result as if it never got stuck.
Opening the file with wave or audiolab leads to the same result.
Is this a bug or am I doing something wrong?
Check the value of data.shape for the files that hang up the system. If your data length happens to be a prime number, or the product of several large prime numbers, there isn't much that the FFT algorithm can do to speed up calculation of the DFT. If you pad with zeros, or trim your data to the nearest power of 2, everything should run much, much faster.
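A minimal sketch of that check and fix, using numpy's FFT, which zero-pads when n is larger than the data length (the file path is the one from the question):

import numpy as np
from scipy.io import wavfile

fs, data = wavfile.read('sounds/silence/iPhone5.wav')
print(data.shape)  # a prime (or product of large primes) length is the slow case

n = 1 << (len(data) - 1).bit_length()     # next power of two >= len(data)
spectrum = np.fft.fft(data, n=n, axis=0)  # zero-padded FFT along the sample axis
print(spectrum)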
This should have been a comment, but there's just not enough space there...
You could do a bit more debugging, which might help a bit.
(Assuming you're on some sort of unix-like OS)
When the program gets stuck, does it idle or use a lot of CPU? You could use "top" or similar to check.
What is the program doing when it appears stuck? Can you get a stack trace? Either using a debugger like gdb or some other tool.
And I guess what really should be step one: search the net for your symptoms. If it is a bug, it has likely already been found and reported. It might even be fixed already.
By looking at a stack trace it should be possible to see if the program is stuck waiting for something, stuck in a loop somewhere or just doing lots of work.
It might also tell you whether the problem is in Python code, in C extensions or somewhere else. Being used to reading stack traces is of course a plus. :)
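As one concrete way to get such a stack trace from Python itself (my own addition, not something the answer above mentions): on Python 3 and a Unix-like OS, the standard-library faulthandler module can dump the stack of a stuck script on demand.

import faulthandler
import signal

# After this, sending the process SIGUSR1 (kill -USR1 <pid>) dumps the current
# stack of every thread to stderr, even while the script appears stuck.
faulthandler.register(signal.SIGUSR1)

# ... the rest of the script that sometimes hangs ...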
I'm using feedparser to print the top 5 Google news titles. I get all the information from the URL the same way as always.
import feedparser as fp

x = 'https://news.google.com/news/feeds?pz=1&cf=all&ned=us&hl=en&topic=t&output=rss'
feed = fp.parse(x)
My problem is that I'm running this script when I start a shell, so that ~2 second lag gets quite annoying. Is this time delay primarily from communications through the network, or is it from parsing the file?
If it's from parsing the file, is there a way to only take what I need (since that is very minimal in this case)?
If it's from the former possibility, is there any way to speed this process up?
I suppose that a few delays are adding up:
The Python interpreter needs a while to start and import the module
Network communication takes a bit
Parsing probably takes only a little time, but it does take some
I think there is no straightforward way of speeding things up, especially not the first point. My suggestion is that you have your feeds downloaded on a regular basis (you could set up a cron job or write a Python daemon) and stored somewhere on your disk (e.g. in a plain text file), so you just need to display them at your terminal's startup (echo would probably be the easiest and fastest).
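A minimal sketch of that idea (the cache path is a placeholder): run it from cron or a small daemon, and have the shell startup just cat the file:

import feedparser

FEED_URL = 'https://news.google.com/news/feeds?pz=1&cf=all&ned=us&hl=en&topic=t&output=rss'
CACHE_FILE = '/tmp/google_news_titles.txt'

feed = feedparser.parse(FEED_URL)
titles = [entry.title for entry in feed.entries[:5]]

with open(CACHE_FILE, 'w') as f:
    f.write('\n'.join(titles) + '\n')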
I have personally had good experiences with feedparser. I use it to download ~100 feeds every half hour with a Python daemon.
Parsing in real time is not the best approach if you want faster results.
You can try doing it asynchronously with Celery or a similar solution. I like Celery; it gives you many capabilities, such as scheduled (cron-like) tasks, asynchronous execution, and more.
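A hedged sketch of that approach (the broker URL and task module name are placeholders): fetch the feed in a background Celery task, optionally on a cron-like beat schedule, so the shell prompt never waits on the network.

import feedparser
from celery import Celery

app = Celery('feeds', broker='redis://localhost:6379/0')

@app.task
def fetch_titles(url):
    # Runs in a Celery worker, off the interactive shell's critical path
    feed = feedparser.parse(url)
    return [entry.title for entry in feed.entries[:5]]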