Jupyter Notebook Counter when processing - python

I am using Jupyter Notebook and Python 3.
I have a block of code that takes a while to execute in Jupyter Notebook, and to see its current status I would like a counter showing which loop iteration it is on, something like this:
large_number = 1000
for i in range(large_number):
    print('{} / {} complete.'.format(i, large_number))
The problem with this is that it prints a new line for each iteration, which I do not want; instead I just want to update the value in place.
Is there any way I can do this in Jupyter Notebook?

The de facto standard for this functionality in Jupyter is tqdm, specifically tqdm_notebook. It is simple to use, provides informative output, and has a multitude of options. You can also use it for command-line work.
from tqdm import tqdm_notebook
from time import sleep
for i in tqdm_notebook(range(100)):
    sleep(.05)
The output is a live progress-bar widget rendered below the cell.
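Note that in recent versions of tqdm, tqdm_notebook is deprecated in favor of the tqdm.notebook submodule; a minimal equivalent of the snippet above:
from tqdm.notebook import tqdm
from time import sleep

for i in tqdm(range(100)):
    sleep(.05)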

I'm fond of making an ASCII status bar. Say you need to run 1000 iterations and want to see 20 updates before it's done:
num_iter = 1000
num_updates = 20
update_per = num_iter // num_updates  # make sure it's an integer
print('|{}|'.format(' ' * (num_updates - 2)))  # gives you a reference
for i in range(num_iter):
    # code stuff
    if i % update_per == 0:
        print('*', end='', flush=True)
Gives you an update that looks like:
|                  |
*******
as it runs.

import sys
large_number = 1000
for i in range(large_number):
    print('{} / {} complete.'.format(i, large_number), end='\r')
    sys.stdout.flush()

I always find that Alexander Kukushkin's Jupyter widget is the best for these sorts of things. It creates a nice-looking progress bar, works on generators, and lets you set how often the progress value updates.
To use it, copy the code from the GitHub link into a cell, then run your code like so:
large_number = 1000
for i in log_progress(range(large_number)):
    # your code here - no need to print anything manually!


How to identify where my python script is slow running

I am returning to a working Python script with the intent of optimizing the runtime. For the most part, I have been using timeit and tqdm to track how long individual functions take to run, but is there a way to run a single function and track the performance of all the commands in the Python script to get a single output?
For example:
def funct_a(a):
    print(a)

def funct_b(b):
    complex_function(b)

def funct_c(c):
    return c - 5

funct_a(5)
funct_b('Oregon')
funct_c(873)
Ideally I would like to see some output from a performance check that reads like this:
funct_a runtime: .000000001 ms
funct_b runtime: 59 ms
funct_c runtime: .00000002 ms
Any ideas would be greatly appreciated.
Use a profiler.
I like to use the default profiler (already included in Python) called cProfile.
You can then visualise the data using snakeviz.
Here is a rough way to use it:
import cProfile
import pstats

with cProfile.Profile() as pr:
    {CODE OR FUNCTION HERE}

stats = pstats.Stats(pr)
stats.sort_stats(pstats.SortKey.TIME)
# Now you have two options: either print the data or save it as a file
stats.print_stats()  # print the stats
stats.dump_stats("File/path.prof")  # saves the data in a file, which can be used to view the data visually
Now to visualise it:
1. Install snakeviz.
2. Go to the directory containing the .prof file.
3. Open cmd/terminal and type snakeviz filename.prof.
For further clarification, watch this video:
https://www.youtube.com/watch?v=m_a0fN48Alw&t=188s&ab_channel=mCoding
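If you prefer a one-liner, cProfile.run can profile a single statement and dump the stats in one call; a minimal sketch, using the question's hypothetical funct_c:
import cProfile

def funct_c(c):
    return c - 5

# profile one call and write the raw stats to output.prof
cProfile.run('funct_c(873)', filename='output.prof')
The resulting output.prof can then be opened with snakeviz as described above.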
import time
start = time.time()
#code goes here
end = time.time()
print('Time for code to run: ', end - start)
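Building on that idea, a small decorator gives you per-function timings in roughly the format the question asks for; a minimal sketch (funct_c mirrors the question's hypothetical function):
import time
from functools import wraps

def timed(func):
    """Print how long each call to func takes, in milliseconds."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        print('{} runtime: {} ms'.format(func.__name__, elapsed_ms))
        return result
    return wrapper

@timed
def funct_c(c):
    return c - 5

funct_c(873)  # prints e.g.: funct_c runtime: 0.0009 ms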
Use the timeit module:
import timeit

def funct_a(a):
    return a

def funct_b(b):
    return [b]*20

def funct_c(c):
    return c-5
>>> print(timeit.timeit('funct_a(5)', globals=globals()))
0.09223939990624785
>>> print(timeit.timeit('funct_b("Oregon")', globals=globals()))
0.260303599992767
>>> print(timeit.timeit('funct_c(873)', globals=globals()))
0.14657660003285855
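Note that timeit.timeit runs the statement 1,000,000 times by default, so the numbers above are totals for all runs, not per-call times; pass number= to control that:
# time 1000 calls instead of the default 1,000,000
print(timeit.timeit('funct_c(873)', globals=globals(), number=1000))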

How to catch the terminal output?

I'm working on https://github.com/JsBergbau/MiTemperature2 with a Raspberry Pi 3 Model B. It works properly in its own infinite loop, but I am not able to capture the output from the terminal. How can I get at the output using Python?
Here is the part that does the printing:
measurement_time = datetime.datetime.fromtimestamp(measurement.timestamp)
print(measurement_time)
humidity=int.from_bytes(data[2:3],byteorder='little')
print("Temperature: " + str(temp))
print("Humidity: " + str(humidity))
voltage=int.from_bytes(data[3:5],byteorder='little') / 1000.
print("Battery voltage:",voltage,"V")
measurement.temperature = temp
measurement.humidity = humidity
measurement.voltage = voltage
measurement.sensorname = args.name
batteryLevel = min(int(round((voltage - 2.1),2) * 100), 100) #3.1 or above --> 100% 2.1 --> 0 %
measurement.battery = batteryLevel
print("Battery level:",batteryLevel)
Here is the script I run on terminal:
python3 LYWSD03MMC.py -d AA:BB:CC:DD:EE:FF
And here is the output:
2021-08-05 11:21:24
Temperature: 24.79
Humidity: 47
Battery voltage: 3.092 V
Battery level: 99
That is the run command and sample output. Thanks for your help, best regards.
Change your code so it returns the information instead of just printing it. If you have code which looks like
something = some_function_call(123)
print(something)
other_one = different_function("some data here?").strip()
print(other_one)
you would probably refactor it to
def get_something(number):
    return some_function_call(number)

def get_other_one():
    return different_function("some data here?").strip()

if __name__ == '__main__':
    print(get_something(123))
    print(get_other_one())
Now, you can create additional code which retrieves these values without printing them, and does whatever it wants with them. Put them on a web site? Upload them to a database? Rot13 encrypt them and send an email to Bill Gates? Your imagination is the limit.
How exactly you design your code is a broad topic where many books have been written, and more will be. A common arrangement is to make sure the useful parts are in modular functions which do one thing only (ideally without any side effects) so you can import this code and use it from other programs. (That's why the if __name__ part is useful. It makes sure code inside the block doesn't run when you import this file.)
Have you had a closer look at the code? There is a callback option; this is the easiest way to get values from this script. Or is this question more about how to capture Python output in general?
If not, this should help you:
Documentation where the callback is described:
https://github.com/JsBergbau/MiTemperature2#callback-for-processing-the-data
Accessing the single values:
sendToInflux.sh (https://github.com/JsBergbau/MiTemperature2/blob/master/sendToInflux.sh) is an example in which the values, like temperature and so on, are passed as arguments.
Or, when using sendToFile.sh, it gives output line by line:
sensorname,temperature,humidity,voltage,humidityCalibrated,timestamp
MySensor 20.61 54 2.944 49 1582120122
That data should be easy to process with Python or awk.
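For illustration, a minimal sketch of parsing one such line in Python, with the field names taken from the header above:
fields = ['sensorname', 'temperature', 'humidity', 'voltage',
          'humidityCalibrated', 'timestamp']
line = 'MySensor 20.61 54 2.944 49 1582120122'

# pair the header names with the whitespace-separated values
reading = dict(zip(fields, line.split()))
print(reading['temperature'])  # -> 20.61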
Add this to the command line:
2>&1 | tee result.txt
It saves the command's output to result.txt while still displaying it in the terminal.
If you are running a command from Python, you can use subprocess.check_output to get the output from the terminal. This won't work if the called script runs forever.
Like this:
import subprocess
import sys

output = subprocess.check_output([sys.executable, 'LYWSD03MMC.py', '-d', 'AA:BB:CC:DD:EE:FF']).decode()
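For a script that runs forever, a sketch using subprocess.Popen instead, reading the output line by line as it is produced (same script and arguments as above):
import subprocess
import sys

proc = subprocess.Popen(
    [sys.executable, 'LYWSD03MMC.py', '-d', 'AA:BB:CC:DD:EE:FF'],
    stdout=subprocess.PIPE,
    text=True,
)

# each line arrives as soon as the child process prints it
for line in proc.stdout:
    print('captured:', line.rstrip())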

How to always add all imports to new Python files?

I am working on a Python project in PyCharm.
Every time I create a new file, I have to import all of the basic libraries again. I keep getting kicked out of my workflow for a few seconds every time I notice that I forgot to import a library.
As far as I can tell, there is no reason not to just import every library I use anywhere in every single file, or is there? I would much rather import a library I don't need than risk losing my concentration because I forgot to import something and have to waste a few seconds on it. Since the auto-sorting of libraries doesn't seem to work properly, I even have to add the import statement manually.
Is there a way to give PyCharm a large list of libraries and just import those in absolutely every file by default?
For comparison, this is possible to do in Jupyter Lab: You can have one notebook that contains all import statements, and just call "%run import_everything.ipynb" in all the other notebooks. This has saved me a lot of time, and is also much more readable than having different imports in every notebook.
(I care specifically about how to do this in PyCharm, but if there is a more generic way to do it, that information would also be appreciated)
I wrote a short script to go over all my files and combine all basic import statements. It also wraps them in comments that say "BASIC IMPORTS", so in the future I can just use PyCharm's "replace string in whole project" feature to mass-replace all imports by looking for these comments with a regex.
from pathlib import Path
import re

p = Path(__file__).absolute().parent.parent
excluded_files = ['__init__.py', 'setup.py']
affected_files = [a for a in p.glob('**/*.py') if a.name not in excluded_files]

# collect every unique import line from the top block of each file
acc = {}
for a in affected_files:
    b = a.read_text()
    b = b[:b.index("\n\n")]  # only the header block, up to the first blank line
    finds = re.findall("(import |from )(.*)", b)
    for c in finds:
        acc[c] = True
        assert "intnet" not in c[1], a  # sanity check against a typo in my own imports

# sort the imports and wrap them in marker comments
lst = list(acc.keys())
lst.sort(key=lambda a: a[1])
res = '\n'.join(f"{a[0]}{a[1]}" for a in lst)
res = "# BASIC IMPORTS\n" + res + "\n# / BASIC IMPORTS"

# replace each file's header block with the combined import block
for a in affected_files:
    b = a.read_text()
    b = res + b[b.index("\n\n"):]
    a.write_text(b)

Python network bandwidth monitor

I am developing a program in Python, and one element tells the user how much bandwidth they have used since the program was opened (not just within the program, but regular web browsing while the program has been open). The output should be displayed in GTK.
Is there anything in existence? If not, can you point me in the right direction? It seems like I would have to edit an existing proxy script like pythonproxy, but I can't see how I would use it.
Thanks,
For my task I wrote a very simple solution using psutil:
import time
import psutil

def main():
    old_value = 0
    while True:
        new_value = psutil.net_io_counters().bytes_sent + psutil.net_io_counters().bytes_recv
        if old_value:
            send_stat(new_value - old_value)
        old_value = new_value
        time.sleep(1)

def convert_to_gbit(value):
    return value / 1024. / 1024. / 1024. * 8

def send_stat(value):
    print("%0.3f" % convert_to_gbit(value))

main()
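If you only care about one interface, psutil can also report per-NIC counters; a small sketch (the interface name is a placeholder):
import psutil

# counters keyed by interface name
counters = psutil.net_io_counters(pernic=True)
wlan = counters['wlan0']  # placeholder interface name
print(wlan.bytes_sent, wlan.bytes_recv)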
import time

def get_bytes(t, iface='wlan0'):
    with open('/sys/class/net/' + iface + '/statistics/' + t + '_bytes', 'r') as f:
        data = f.read()
    return int(data)

while True:
    tx1 = get_bytes('tx')
    rx1 = get_bytes('rx')
    time.sleep(1)
    tx2 = get_bytes('tx')
    rx2 = get_bytes('rx')
    tx_speed = round((tx2 - tx1) / 1000000.0, 4)
    rx_speed = round((rx2 - rx1) / 1000000.0, 4)
    print("TX: %fMbps RX: %fMbps" % (tx_speed, rx_speed))
This should work.
Well, I'm not quite sure if there is something in existence (written in Python), but you may want to have a look at the following:
Bandwidth Monitoring (not really an active project, but it may give you an idea)
Munin Monitoring (a Perl-based network monitoring project)
ntop (written in C/C++, based on libpcap)
Also, just to give you pointers if you are looking to do something on your own: one way could be to count and store packets using sudo cat /proc/net/dev.
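For illustration, a minimal sketch of reading the byte counters from /proc/net/dev directly (the interface name is a placeholder; the first two lines of the file are headers):
def read_proc_net_dev(iface='wlan0'):
    """Return (rx_bytes, tx_bytes) for iface from /proc/net/dev."""
    with open('/proc/net/dev') as f:
        for line in f.readlines()[2:]:  # skip the two header lines
            name, data = line.split(':', 1)
            if name.strip() == iface:
                fields = data.split()
                return int(fields[0]), int(fields[8])  # rx_bytes, tx_bytes
    raise ValueError('interface not found: ' + iface)

print(read_proc_net_dev())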
A proxy would only cover network applications that were configured to use it. You could set, e.g. a web browser to use a proxy, but what happens when your proxy exits?
I think the best thing to do is to hook in lower down the stack. There is a program that does this already, iftop. http://en.wikipedia.org/wiki/Iftop
You could start by reading the source code of iftop, perhaps wrap that into a Python C extension. Or rewrite iftop to log data to disk and read it from Python.
Would something like WireShark (https://wiki.wireshark.org/FrontPage) do the trick? I am tackling a similar problem now, and am inclined to use pyshark, a WireShark/TShark wrapper, for the task. That way you can get capture file info readily.
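For what it's worth, a rough sketch of summing captured frame sizes with pyshark (the interface name is a placeholder, and TShark must be installed):
import pyshark

capture = pyshark.LiveCapture(interface='wlan0')  # placeholder interface

total_bytes = 0
# sniff_continuously yields packets as they arrive
for packet in capture.sniff_continuously(packet_count=100):
    total_bytes += int(packet.length)  # frame length in bytes

print('captured %d bytes in 100 packets' % total_bytes)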

How to split python script in parts and to import the parts in a loop?

First, sorry for my silly title :) And here is my problem... Actually it's not a problem. Everything works, but I want to have a better structure.
I have a python script with a loop that runs each second.
In the loop there are many, many IFs. Is it possible to put each IF in a separate file and then to include it in the loop? That way, every time the loop runs, all the IFs will be evaluated too.
There are too many conditions in my script, and all of them are generally different from the others, so I want to have some kind of folder with modules: mod_weather.py, mod_sport.py, mod_horoscope.py, etc.
Thanks in advance. I hope I wrote everything understandably.
EDIT:
Here is a structural example of what I have now:
while True:
    if condition == 'news':
        # do something
    if condition == 'sport':
        # do something else
    time.sleep(1)
It would be good if I could have something like this:
while True:
    import mod_news
    import mod_sport
    time.sleep(1)
And the IFs from the first example would be separated into the files mod_news.py, mod_sport.py...
Perhaps you wonder how to work with your own modules in general.
Make one file named 'weather.py' and have it contain the appropriate if-statements, like:
""" weather.py - conditions to check """

def check_all(*args, **kwargs):
    """ check all conditions """
    if check_temperature(kwargs['temperature']):
        ...  # your code

def check_temperature(temp):
    # perhaps some code including temp or whatever...
    return temp > 40

Same for sport.py, horoscope.py, etc.
Then your main script would look like:
import time, weather, sport, horoscope

kwargs = {'temperature': 30}
condition = 'weather'
while True:
    if condition == 'weather':
        weather.check_all(**kwargs)
    elif condition == 'sport':
        sport.check_all()
    elif condition == 'horoscope':
        horoscope.check_all()
    time.sleep(1)
Edit: edited according to the edit in your question. Note that I suggest importing all modules only once, at the beginning of the script, and using their functions. This is better than executing code by importing. But if you insist, you could use reload(weather) (importlib.reload in Python 3), which actually performs a reload, including code execution. But I cannot stress enough that calling functions of external modules is the better way to go!
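For completeness, a minimal sketch of the reload approach mentioned above (Python 3 spelling):
import importlib
import weather

# re-executes weather.py's module-level code and rebinds the name
weather = importlib.reload(weather)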
Put them in functions in separate files and then import them:
"""thing1.py
A function to demonstrate
"""

def do_things(some_var):
    print("Doing things with %s" % (some_var))

"""thing2.py
Demonstrates the same thing with a condition
"""

def do_things(some_var):
    if len(some_var) < 10:
        print("%s is < 10 characters long" % (some_var))
    else:
        print("too long")

"""main_program.py"""
import thing1, thing2

myvar = "cats"
thing1.do_things(myvar)
thing2.do_things(myvar)
I believe you are looking for some kind of PHP-like include() or C preprocessor #include. You would have a file such as the included.py below:
a = 2
print("ok")
and another file which has the following code:
for i in values:
    import included
and you want the result to be equivalent to:
for i in values:
    a = 2
    print("ok")
Is this what you are looking for? If so... no, it is not possible. Once Python imports a module, the code of the module is executed, and subsequent imports of the same module only retrieve the already imported instance. The code of a module is not executed every time it is imported.
I can invent some crazy ways of doing it (let us say, file.read() + eval(), or calling reload() on an imported module), but it would be a bad idea anyway. I bet we can think of a better solution to your real problem :)
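Purely for illustration, one of those crazy ways could look like this; don't actually do it:
values = range(3)
for i in values:
    # re-runs included.py's source on every iteration - fragile and slow
    exec(open('included.py').read())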
Perhaps all you need is to call functions in your loop, and have those functions in other modules, which you import as needed.
while True:
    if condition:
        from module_a import f
        f()
    if condition2:
        from module_b import g
        g()
Though the above is legal Python, and so answers your question, you should in practice write all the imports at the top of your file.
You could import the needed modules only when they're needed, for example:
if condition:
    import weather
    # ... do something
However, I'm not sure if that's what you really want.
I have a python script with a loop "looped" each second. In the loop
there are many many IFs.
Then you must optimize the repeatedly executed tests. Suppose there are 50 IF blocks in your code and that, in one turn of the loop, the N-th condition is True: that means the N-1 other conditions must be tested before the N-th is tested and triggers the execution of the corresponding code.
It would be preferable to do this:
# to_include.py

def func_weather(*args, **kwargs):
    # code
    return "I'm the weather"

def func_horoscope(*args, **kwargs):
    # code
    return "Give me your birthdate"

def func_gastronomy(*args, **kwargs):
    # code
    return 'Miam crunch'

def func_sports(*args, **kwargs):
    # code
    return 'golf, swimming and canoeing in the resort station'

didi = {'weather': func_weather, 'horoscope': func_horoscope,
        'gastronomy': func_gastronomy, 'sports': func_sports}
and the main module:
# use_to_include.py
import to_include

x = 'sports'
y = to_include.didi[x]()
# instead of
# if x == 'weather':      y = func_weather()
# elif x == 'horoscope':  y = func_horoscope()
# elif x == 'gastronomy': y = func_gastronomy()
# elif x == 'sports':     y = func_sports()
print(y)
result
golf, swimming and canoeing in the resort station
