Trying to get CPU usage in Python without using PSUtil.
I've tried the following but it always seems to report the same figure...
import os

def getCPUuse():
    return str(os.popen("top -n1 | awk '/Cpu\(s\):/ {print $2}'").readline().strip())

print(getCPUuse())
This always seems to report 3.7% even when I load up the CPU.
I have also tried the following...
str(round(float(os.popen('''grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$4+$5)} END {print usage }' ''').readline()),2))
This always seems to return 5.12. Must admit I don't really know what the above does. If I enter grep cpu /proc/stat into the command line I get something like this...
cpu 74429 1 19596 1704779 5567 0 284 0 0 0
cpu0 19596 0 4965 422508 1640 0 279 0 0 0
cpu1 18564 1 4793 427115 1420 0 1 0 0 0
cpu2 19020 0 4861 426916 1206 0 2 0 0 0
cpu3 17249 0 4977 428240 1301 0 2 0 0 0
I'm guessing my command isn't properly extracting the values for all of my CPU cores from the above output?
My objective is to get the total CPU % of my device (a Raspberry Pi) without using PSUtil. The figure should reflect what is displayed in the OS task manager.
What PSUtil, htop, mpstat and the like do is read the line starting with "cpu" (actually the first line) from /proc/stat, and then calculate a percentage from the values on that line. You can find the meaning of the values on that line in man 5 proc (search for "proc/stat").
That's also what the grep cpu /proc/stat | awk .... command you mentioned does. But the values in /proc/stat represent the times spent since last boot! Maybe they wrap around after a while, I'm not sure, but the point is that these are numbers measured over a really long time.
So if you run that command, and run it again a few seconds (, minutes or even hours) later, they won't have changed much! That's why you saw it always return 5.12.
Programs like top remember the previous values and subtract them from the newly read values. From the result a 'live' percentage can be calculated that actually reflects recent CPU load.
To do something like that in Python as simply as possible, but without running external commands to read /proc/stat and do the calculations for us, we can store the values we've read in a file. On the next run we read them back in and subtract them from the new values.
#!/usr/bin/env python2.7

import os.path

# Read first line from /proc/stat. It should start with "cpu"
# and contains times spent in various modes by all CPUs totalled.
#
with open("/proc/stat") as procfile:
    cpustats = procfile.readline().split()

# Sanity check
#
if cpustats[0] != 'cpu':
    raise ValueError("First line of /proc/stat not recognised")

#
# Refer to "man 5 proc" (search for /proc/stat) for information
# about which field means what.
#
# Here we do the calculation as simply as possible:
# CPU% = 100 * time_doing_things / (time_doing_things + time_doing_nothing)
#
user_time = int(cpustats[1])    # time spent in user space
nice_time = int(cpustats[2])    # 'nice' time spent in user space
system_time = int(cpustats[3])  # time spent in kernel space
idle_time = int(cpustats[4])    # time spent idly
iowait_time = int(cpustats[5])  # time spent waiting is also doing nothing

time_doing_things = user_time + nice_time + system_time
time_doing_nothing = idle_time + iowait_time

# The times read from /proc/stat are total times, i.e. *all* times spent
# doing things and doing nothing since last boot.
#
# So to calculate a meaningful CPU % we need to know how much these values
# have *changed*. So we store them in a file which we read back the next
# time the script is run.
#
previous_values_file = "/tmp/prev.cpu"

prev_time_doing_things = 0
prev_time_doing_nothing = 0
try:
    with open(previous_values_file) as prev_file:
        prev1, prev2 = prev_file.readline().split()
    prev_time_doing_things = int(prev1)
    prev_time_doing_nothing = int(prev2)
except IOError:  # To prevent error/exception if file does not exist. We don't care.
    pass

# Write the new values to the file to use next run
#
with open(previous_values_file, 'w') as prev_file:
    prev_file.write("{} {}\n".format(time_doing_things, time_doing_nothing))

# Calculate difference, i.e. how much the numbers have changed
#
diff_time_doing_things = time_doing_things - prev_time_doing_things
diff_time_doing_nothing = time_doing_nothing - prev_time_doing_nothing

# Calculate a percentage of change since last run:
#
cpu_percentage = 100.0 * diff_time_doing_things / (diff_time_doing_things + diff_time_doing_nothing)

# Finally, output the result
#
print "CPU", cpu_percentage, "%"
Here's a version that, not unlike top, prints CPU usage every second, remembering CPU times from previous measurement in variables instead of a file:
#!/usr/bin/env python2.7

import os.path
import time

def get_cpu_times():
    # Read first line from /proc/stat. It should start with "cpu"
    # and contains times spent in various modes by all CPUs totalled.
    #
    with open("/proc/stat") as procfile:
        cpustats = procfile.readline().split()

    # Sanity check
    #
    if cpustats[0] != 'cpu':
        raise ValueError("First line of /proc/stat not recognised")

    # Refer to "man 5 proc" (search for /proc/stat) for information
    # about which field means what.
    #
    # Here we do the calculation as simply as possible:
    #
    # CPU% = 100 * time_doing_things / (time_doing_things + time_doing_nothing)
    #
    user_time = int(cpustats[1])    # time spent in user space
    nice_time = int(cpustats[2])    # 'nice' time spent in user space
    system_time = int(cpustats[3])  # time spent in kernel space
    idle_time = int(cpustats[4])    # time spent idly
    iowait_time = int(cpustats[5])  # time spent waiting is also doing nothing

    time_doing_things = user_time + nice_time + system_time
    time_doing_nothing = idle_time + iowait_time

    return time_doing_things, time_doing_nothing

def cpu_percentage_loop():
    prev_time_doing_things = 0
    prev_time_doing_nothing = 0

    while True:  # loop forever printing CPU usage percentage
        time_doing_things, time_doing_nothing = get_cpu_times()

        diff_time_doing_things = time_doing_things - prev_time_doing_things
        diff_time_doing_nothing = time_doing_nothing - prev_time_doing_nothing

        cpu_percentage = 100.0 * diff_time_doing_things / (diff_time_doing_things + diff_time_doing_nothing)

        # Remember current values to subtract next iteration of the loop
        #
        prev_time_doing_things = time_doing_things
        prev_time_doing_nothing = time_doing_nothing

        # Output latest percentage
        #
        print "CPU", cpu_percentage, "%"

        # Loop delay
        #
        time.sleep(1)

if __name__ == "__main__":
    cpu_percentage_loop()
That's not really easy, since most of the approaches you describe provide the cumulative, or total, average of the CPU usage.
Maybe you can try the mpstat command that comes with the sysstat package.
So, the steps I used for the following script are:
Ask mpstat to generate 2 reports, one right now and the other after 1 second (mpstat 1 2)
Then we get the Average line (the last line)
The last column is the %idle column, so we get that with the $NF variable from awk
We use Popen from subprocess, setting shell=True so that our pipes (|) work.
We execute the command (communicate())
Clean the output with a strip
And subtract that (the idle percentage) from 100, so we can get the used value.
Since it will sleep for 1 second, don't worry that it is not an instant command.
import subprocess
cmd = "mpstat 1 2 | grep Average | awk '{print $NF}'"
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
out, err = p.communicate()
idle = float(out.strip())
print(100-idle)
I am trying to run a simple experiment using Python. I want to present two different types of audio stimuli, a higher and a lower pitch. The higher pitch has a fixed duration of 200ms, while the lower pitch comes in pairs: the first with a fixed duration of 250ms and the second with a variable duration that can take the following values [.4, .6, .8, 1, 1.2]. I need to know at what (machine) time the stimuli start and end, and their duration (precision is not the most important issue; I have a tolerance of ~10ms), so I log this information.
I am using the library audiomath to create and present the stimuli, and I have created several custom functions to manage the other aspects of the task. I have 3 scripts: one in which I define the functions, one in which I set the specific parameters of the experiment for each subject (source), and one with the main().
My problem is that the main() works erratically: it works sometimes, and other times it seems to enter an infinite loop and a certain sound is presented and never stops playing. This behavior seems to be truly random, with the problem presenting itself on different trials, or not at all, even with the exact same parameters.
This is my code:
source file
#%%imports
from exp_funcs import tone440Hz, tone880Hz
import numpy as np
#%%global var
n_long = 10
n_short = 10
short_duration = .2
long_durations = [.4, .6, .8, 1, 1.2]
#%%calculations
n_tot = n_short + n_long
trial_types = ['short_blink'] * n_short + ['long_blink'] * n_long
sounds = [tone880Hz] * n_short + [tone440Hz] * n_long
np.random.seed(10)
durations = [short_duration] * n_short + [el for el in np.random.choice(long_durations, n_long)]
durations = [.5 if el < .2 else el for el in durations]
cue_duration = [.25] * n_tot
spacing = [1.25] * n_tot
np.random.seed(10)
iti = [el for el in (3 + np.random.normal(0, .25, n_tot))]
functions
import numpy as np
import audiomath as am
import time
import pandas as pd

TWO_PI = 2.0 * np.pi

#am.Synth(fs=22050)
def tone880Hz(fs, sampleIndices, channelIndices):
    timeInSeconds = sampleIndices / fs
    return np.sin(TWO_PI * 880 * timeInSeconds)

#am.Synth(fs=22050)
def tone440Hz(fs, sampleIndices, channelIndices):
    timeInSeconds = sampleIndices / fs
    return np.sin(TWO_PI * 440 * timeInSeconds)

def short_blink(sound, duration):
    p = am.Player(sound)
    init = time.time()
    while time.time() < init + duration:
        p.Play()
    end = time.time()
    p.Stop()
    print(f'start {init} end {end} duration {end - init}')
    return (init, end, end - init)

def long_blink(sound, duration, cue_duration, spacing):
    p = am.Player(sound)
    i_ = time.time()
    while time.time() < i_ + cue_duration:
        p.Play()
    p.Stop()
    time.sleep(spacing)
    init = time.time()
    while time.time() < init + duration:
        p.Play()
    end = time.time()
    p.Stop()
    print(f'start {init} end {end} duration {end - init}')
    return (init, end, end - init)

def run_trial(ttype, sound, duration, cue_duration, spacing):
    if ttype == 'short_blink':
        init, end, effective_duration = short_blink(sound, duration)
    else:
        init, end, effective_duration = long_blink(sound, duration,
                                                   cue_duration, spacing)
    otp_df = pd.DataFrame([[ttype, init, end, effective_duration]],
                          columns=['trial type', 'start', 'stop',
                                   'effective duration'])
    return otp_df
main
import pandas as pd
import sys
import getopt
import os
import time
import random
from exp_funcs import run_trial
from pathlib import PurePath

def main(argv):
    try:
        opts, args = getopt.getopt(argv, 'hs:o:', ['help', 'source_file=', 'output_directory='])
    except getopt.GetoptError:
        print('experiment.py -s source file -o output directory')
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print('experiment.py -s source file')
            sys.exit()
        elif opt in ("-s", "--source_file"):
            source_file = arg
        elif opt in ("-o", "--output_directory"):
            output_dir = arg
    os.chdir(os.getcwd())
    if not os.path.isfile(f'{source_file}.py'):
        raise FileNotFoundError(f'{source_file} does not exist')
    else:
        source = __import__('source')
    complete_param = list(zip(source.trial_types,
                              source.sounds,
                              source.durations,
                              source.cue_duration,
                              source.spacing,
                              source.iti))
    # shuffle_param = random.sample(complete_param, len(complete_param))
    shuffle_param = complete_param
    dfs = []
    for ttype, sound, duration, cue_duration, spacing, iti in shuffle_param:
        time.sleep(iti)
        df = run_trial(ttype, sound, duration, cue_duration, spacing)
        dfs.append(df)
    dfs = pd.concat(dfs)
    dfs.to_csv(PurePath(f'{output_dir}/{source_file}.csv'), index=False)

if __name__ == "__main__":
    main(sys.argv[1:])
The 3 files are in the same directory; I cd into that directory in the terminal and run the main as follows: python experiment.py -s source -o /whatever/output/directory.
Any help would be more than appreciated
This is too big/complex a program to hope for help on non-specific "erratic" behavior here on stackoverflow. You need to boil it down into a small reproducible example that behaves unexpectedly. If it works sometimes and not others, systematically home in on the conditions that make it fail. I did make one attempt to run the whole thing, but after fixing a few missing imports there was still the matter of the unspecified "source file" content.
So I don't know specifically what your problem is. However, from the audiomath and general real-time-performance perspectives, I can certainly identify a few things you shouldn't be doing:
Although Player instances are designed to be played, stopped or manipulated at time-critical moments, they are not (by default) designed to be created and destroyed at time-critical moments. If you want to create/destroy them fast, pre-initialize a persistent Stream() instance and pass it as the stream argument when creating the Player, as described towards the end of https://audiomath.readthedocs.io/en/release/auto/Examples.html#play-sounds
If you are using Synth instances, you could take advantage of their .duration attribute instead of checking the clock explicitly in a while loop. For example, you can set tone880Hz.duration = 0.5, and then play the sound synchronously with p.Play(wait=True). The big problem with your clock-watching while loops is that they are currently "busy-wait" loops that will thrash the CPU, likely leading to sporadic disruption to your sound (Python's multithreading is far from perfect). However, before you fix this problem you should know...
The strategy "Play(), wait, sleep, Play()" is never going to achieve precise timing of one stimulus relative to the other anyway. First, whenever you issue a command to play a sound in any software, there will unavoidably be a non-zero (and randomly varying!) latency between the command and the physical onset of the sound. Second, sleep() is unlikely to be as precise as you think it is. This applies both to the sleep() you’ve been using to create a gap, and also to the sleep() that would be used internally by Play(wait=True). Sleep implementations suspend operation for "at least" the specified amount of time but they don't guarantee an upper bound on that. This is very hardware- and OS-dependent; on some Windows systems you may even find that the granularity never gets any better than 10ms.
If you really want to use the Synth approach I suppose you could program the gap procedurally into the function definitions of tone440Hz() and tone880Hz(), accessing cue_duration, duration and spacing as global variables (in fact, while you're at it, why not make frequency a global variable too, and only write one function). But I don't see any great advantage in this, either in performance or in code maintainability.
What I would do instead is pre-initialize the following (once, at the start of your program):
max_duration = 1 # length, in seconds, of the longest continuous tone we'll need
tone440Hz = am.Sound(fs=22050).GenerateWaveform(freq_hz=440, duration_msec=max_duration*1000)
tone880Hz = am.Sound(fs=22050).GenerateWaveform(freq_hz=880, duration_msec=max_duration*1000)
m = am.Stream()
Then compose each "long blink" stimulus as a static Sound using the parameters you want.
This will ensure that the tone and gap durations are precise:
s = tone440Hz[:cue_duration] % spacing % tone440Hz[:duration]
For best real-time performance, you could pre-compute a whole set of these stimuli with different parameters. Or, if it turns out that those composition operations (slicing and splicing) happen fast enough, you might decide you can get away with doing that at trial time, in your long_blink() function.
Either way, when it comes to playing the stimulus at trial time:
p = am.Player(s, stream=m) # to make Player() initialization fast, use a pre-initialized Stream() instance
p.Play(wait=True)
Finally: in implementing this, start from scratch. Start simple, and test the performance of a few simple cases before compounding things.
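As a first simple case, you could time the synchronous playback itself; a sketch reusing the pre-initialized tone440Hz and m from above (the cue/gap/target values here are arbitrary examples):

import time

# Compose one stimulus: 0.25 s cue, 0.75 s gap, 0.5 s target = 1.5 s nominal.
s = tone440Hz[:0.25] % 0.75 % tone440Hz[:0.5]
p = am.Player(s, stream=m)

t0 = time.time()
p.Play(wait=True)
# The difference from the 1.5 s nominal duration tells you your overhead.
print('playback blocked for {:.3f} s'.format(time.time() - t0))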
Background
I have been trying to write a reliable timer with resolution of at least microseconds in Python (3.7). The purpose is to run some specific task every few us, continuously over long period of time.
After some research I settled on perf_counter_ns because of its higher consistency and tested resolution among the alternatives (monotonic_ns, time_ns, process_time_ns, and thread_time_ns), details of which can be found in the time module documentation and PEP 564.
Test
To ensure the precision (and accuracy) of perf_counter_ns, I set up a test to collect the delays between consecutive timestamps, as shown below.
import time
import statistics as stats
# import resource

def practical_res_test(clock_timer_ns, count, expected_res):
    counter = 0
    diff = 0
    timestamp = clock_timer_ns()  # initial timestamp
    diffs = []
    while counter < count:
        new_timestamp = clock_timer_ns()
        diff = new_timestamp - timestamp
        if (diff > 0):
            diffs.append(diff)
            timestamp = new_timestamp
            counter += 1
    print('Mean: ', stats.mean(diffs))
    print('Mode: ', stats.mode(diffs))
    print('Min: ', min(diffs))
    print('Max: ', max(diffs))
    outliers = list(filter(lambda diff: diff >= expected_res, diffs))
    print('Outliers Total: ', len(outliers))

if __name__ == '__main__':
    count = 10000000
    # ideally, resolution of at least 1 us is expected
    # but let's just do 10 us for the sake of this test
    expected_res = 10000

    practical_res_test(time.perf_counter_ns, count, expected_res)

    # other method benchmarks
    # practical_res_test(time.time_ns, count, expected_res)
    # practical_res_test(time.process_time_ns, count, expected_res)
    # practical_res_test(time.thread_time_ns, count, expected_res)
    # practical_res_test(
    #     lambda: int(resource.getrusage(resource.RUSAGE_SELF).ru_stime * 10**9),
    #     count,
    #     expected_res
    # )
Problem and Question
Question: Why are there occasional significant skips in time between timestamps?
Multiple tests with a count of 10,000,000 on my Raspberry Pi 3 Model B V1.2 yielded similar results, one of which is as follows (time is of course in nanoseconds):
Mean: 2440.1013097
Mode: 2396
Min: 1771
Max: 1450832 # huge skip as I mentioned
Outliers Total: 8724 # delays that are more than 10 us
Another test on my Windows desktop:
Mean: 271.05812 # higher end machine - better resolution
Mode: 200
Min: 200
Max: 30835600 # but there're still skips, even more significant
Outliers Total: 49021
Although I am aware that resolution will differ on different systems, it is easy to notice a much lower resolution in my test compared to what is rated in PEP 564. Most importantly, occasional skips are observed.
Please let me know if you have any insight into why this is happening. Does it have anything to do with my test, or is perf_counter_ns bound to fail in such use cases? If so do you have any suggestions for a better solution?
Do let me know if there is any other info I need to provide.
Additional Info
For completion, here is the clock info from time.get_clock_info()
On my raspberry pi:
Clock: perf_counter
Adjustable: False
Implementation: clock_gettime(CLOCK_MONOTONIC)
Monotonic: True
Resolution(ns): 1
On my Windows desktop:
Clock: perf_counter
Adjustable: False
Implementation: QueryPerformanceCounter()
Monotonic: True
Resolution(ns): 100
It is also worth mentioning that I am aware of time.sleep(), but from my tests and for my use case it is not particularly reliable, as others have discussed here.
If you plot the list of time differences, you will see a rather low baseline with peaks that increase over time.
This is caused by the append() operation that occasionally has to reallocate the underlying array (which is how the Python list is implemented).
By pre-allocating the array, the result will improve:
import time
import statistics as stats
import gc
import matplotlib.pyplot as plt

def practical_res_test(clock_timer_ns, count, expected_res):
    counter = 0
    diffs = [0] * count
    gc.disable()
    timestamp = clock_timer_ns()  # initial timestamp
    while counter < count:
        new_timestamp = clock_timer_ns()
        diff = new_timestamp - timestamp
        if diff > 0:
            diffs[counter] = diff
            timestamp = new_timestamp
            counter += 1
    gc.enable()
    print('Mean: ', stats.mean(diffs))
    print('Mode: ', stats.mode(diffs))
    print('Min: ', min(diffs))
    print('Max: ', max(diffs))
    outliers = list(filter(lambda diff: diff >= expected_res, diffs))
    print('Outliers Total: ', len(outliers))
    plt.plot(diffs)
    plt.show()

if __name__ == '__main__':
    count = 10000000
    # ideally, resolution of at least 1 us is expected
    # but let's just do 10 us for the sake of this test
    expected_res = 10000
    practical_res_test(time.perf_counter_ns, count, expected_res)
These are the results I get:
Mean: 278.6002
Mode: 200
Min: 200
Max: 1097700
Outliers Total: 3985
In comparison, these are the results on my system with the original code:
Mean: 333.92254
Mode: 300
Min: 200
Max: 50507300
Outliers Total: 2590
To get even better performance, you might want to run on Linux and use SCHED_FIFO. But always remember that real-time tasks with microsecond precision are not done in Python.
If your problem is soft real-time, you can get away with it but it all depends on the penalty for missing a deadline and your understanding of the time complexities of both your code and the Python interpreter.
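For reference, requesting SCHED_FIFO from Python looks roughly like this (a sketch: Linux only, needs root or the CAP_SYS_NICE capability, and it still won't make Python deterministic):

import os

# Ask the kernel to schedule this process with the real-time FIFO policy.
# Higher-priority threads preempt lower ones; use with care.
param = os.sched_param(os.sched_get_priority_max(os.SCHED_FIFO))
try:
    os.sched_setscheduler(0, os.SCHED_FIFO, param)  # 0 = this process
except PermissionError:
    print('insufficient privileges to set SCHED_FIFO')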
Hey guys (and of course ladies),
I have this little script which should show me some nice RRD graphs, but it seems like I can't find a way to get it to show me any stats. This is my script:
# Function: Simple ping plotter for rrd

import rrdtool, tempfile, commands, time, sys
from model import hosts

sys.path.append('/home/dirk/devel/python/stattool/stattool/lib')
import nurrd
from nurrd import RRDplot

class rrdPing(RRDplot):
    def __init__(self):
        self.DAY = 86400
        self.YEAR = 365 * self.DAY
        self.rrdfile = 'hostname.rrd'
        self.interval = 300
        self.probes = 5
        self.rrdList = []

    def create_rrd(self, interval):
        ret = rrdtool.create("%s" % self.rrdfile, "--step", "%s" % self.interval,
                             "DS:packets:COUNTER:600:U:U",
                             "RRA:AVERAGE:0.5:1:288",
                             "RRA:AVERAGE:0.5:1:336")

    def getHosts(self, userID):
        myHosts = hosts.query.filter_by(uid=userID).all()
        return myHosts.pop(0)

    def _doPing(self, host):
        for x in xrange(0, self.probes):
            ans, unans = commands.getstatusoutput("ping -c 3 -w 6 %s | grep rtt | awk -F '/' '{ print $5 }'" % host)
            print x
            self.probes -= 1
            self.rrdList.append(unans)
        return self.rrdList

    def plotRRD(self):
        self.create_rrd(self.interval)
        times = self._doPing(self.getHosts(3))
        for x in xrange(0, len(times)):
            loc = times.pop(0)
            rrdtool.update(self.rrdfile, '%d:%d' % (int(time.time()), int(float(loc))))
            print '%d:%d' % (int(time.time()), int(float(loc)))
            time.sleep(5)
        self.graph(60)

    def graph(self, mins):
        ret = rrdtool.graph("%s.png" % self.rrdfile, "--start", "-1", "--end", "+1", "--step", "300",
                            "--vertical-label=Bytes/s",
                            "DEF:inoctets=%s:packets:AVERAGE" % self.rrdfile,
                            "AREA:inoctets#7113D6:In traffic",
                            "CDEF:inbits=inoctets,8,*",
                            "COMMENT:\\n",
                            "GPRINT:inbits:AVERAGE:Avg In traffic\: %6.2lf \\r",
                            "COMMENT: ",
                            "GPRINT:inbits:MAX:Max In traffic\: %6.2lf")

if __name__ == "__main__":
    ping = rrdPing()
    ping.plotRRD()
    info = rrdtool.info('hostname.rrd')
    print info['last_update']
Could somebody please give me some advice or some tips on how to solve this?
(Sorry, the code is a little messy.)
Thanks in advance.
Kind regards,
Dirk
Several issues.
Firstly, you appear to only be collecting a single data sample and storing it before you try to generate a graph. You will need at least two samples, separated by about 300s, before you can get a single Primary Data Point and therefore something to graph.
Secondly, you do not post any information as to what data you are actually storing. Are you sure your rrdPing function is returning valid data to store? You are not testing the error status of the write, either.
Thirdly, the data you are collecting seems to be ping times or similar, which is a GAUGE type value. Yet your RRD DS definition uses a COUNTER type, and your graph call is treating it as network traffic data. A COUNTER type assumes increasing values and converts them to a rate of change, so if you give it ping RTT data you'll get either unknowns or zeroes stored, which will not show up on a graph (see the sketch at the end of this answer).
Fourthly, your call to RRDGraph is specifying a start of -1 and an end of +1. From 1 second in the past to 1 second in the future? Since your step is 300s, this is an odd graph. Maybe you should have --end 'now-1' --start 'end-1day' or similar?
You should make your code test the return values for any error messages produced by the RRDTool library -- this is good practice anyway. When testing, print out the values you are updating with to be sure you are giving valid values. With RRDTool, you should collect several data samples at the step interval and store them before expecting to see a line on the graph. Also, make sure you are using the correct data type, GAUGE or COUNTER.
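Putting the GAUGE and time-window points together, the create and graph calls might look roughly like this (an untested sketch with illustrative DS and label names; adapt it to your class):

# Store ping RTTs as a GAUGE (stored as-is, no rate conversion).
rrdtool.create(self.rrdfile, "--step", "300",
               "DS:rtt:GAUGE:600:0:U",
               "RRA:AVERAGE:0.5:1:288")  # 288 x 300s = one day

# Graph the last day instead of a 2-second window.
rrdtool.graph("%s.png" % self.rrdfile,
              "--end", "now", "--start", "end-1d",
              "--vertical-label=RTT (ms)",
              "DEF:rtt=%s:rtt:AVERAGE" % self.rrdfile,
              "LINE1:rtt#7113D6:ping RTT")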
I am writing a script to convert a picture into MIDI notes based on the RGBA values of the individual pixels. However, I cannot seem to get the last step working, which is to actually output the notes to a file.
I have tried using the MIDIUtil library, however its documentation is not the greatest and I can't seem to figure it out.
If anyone could tell me how to sequence the notes (so that they don't all begin at the beginning) it would be greatly appreciated.
Looking at the sample, something like
from midiutil.MidiFile import MIDIFile
# create your MIDI object
mf = MIDIFile(1) # only 1 track
track = 0 # the only track
time = 0 # start at the beginning
mf.addTrackName(track, time, "Sample Track")
mf.addTempo(track, time, 120)
# add some notes
channel = 0
volume = 100
pitch = 60 # C4 (middle C)
time = 0 # start on beat 0
duration = 1 # 1 beat long
mf.addNote(track, channel, pitch, time, duration, volume)
pitch = 64 # E4
time = 2 # start on beat 2
duration = 1 # 1 beat long
mf.addNote(track, channel, pitch, time, duration, volume)
pitch = 67 # G4
time = 4 # start on beat 4
duration = 1 # 1 beat long
mf.addNote(track, channel, pitch, time, duration, volume)
# write it to disk
with open("output.mid", 'wb') as outf:
mf.writeFile(outf)
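To keep the notes from all starting at the beginning, just advance time as you iterate over your pixel data. A rough sketch (using Pillow for pixel access and a made-up red-channel-to-pitch mapping, neither of which is part of MIDIUtil):

from PIL import Image
from midiutil.MidiFile import MIDIFile

img = Image.open("picture.png").convert("RGBA")

mf = MIDIFile(1)
mf.addTrackName(0, 0, "Pixels")
mf.addTempo(0, 0, 120)

time = 0
for r, g, b, a in list(img.getdata())[:64]:  # first 64 pixels, for brevity
    pitch = 21 + r * 88 // 256             # map red 0-255 onto the 88 piano keys
    mf.addNote(0, 0, pitch, time, 1, 100)  # each note starts one beat later
    time += 1

with open("pixels.mid", 'wb') as outf:
    mf.writeFile(outf)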
I know this is an old post, but I'm the author of the library, and I wanted to mention that python 2 and 3 support have now been unified and with the demise of Google Code the code is now hosted on GitHub and can be installed via pip, ie:
pip install MIDIUtil
Documentation is available at Read The Docs.
(Tried to comment but I lacked the experience points.)
The end-of-track message is created automatically when the file is written to disk.
I've got a little script for sorting out my downloaded files and it works great, but I'd like to print out the progress of a file move, for when it's doing the big ones. Right now I do something like:
print "moving..."
os.renames(pathTofile, newName)
print "done"
But I'd like to be able to see something like a progress bar ( [..... ] style) or a percentage printed to stdout.
I don't need/want a gui of any sort, just the simplest / least-work ( :) ) way to get the operation progress).
Thanks!
You won't be able to get that kind of information using os.renames. Your best bet is to replace it with a home-grown file copy operation, calling stat on the file beforehand to get the complete size so you can track how far through you are.
Something like this:
import os

source_size = os.stat(SOURCE_FILENAME).st_size
copied = 0
source = open(SOURCE_FILENAME, 'rb')
target = open(TARGET_FILENAME, 'wb')
while True:
    chunk = source.read(32768)
    if not chunk:
        break
    target.write(chunk)
    copied += len(chunk)
    print '\r%02d%%' % (copied * 100 / source_size),
source.close()
target.close()
Note however that this will more than likely be markedly slower than using os.rename.
There isn't any way to get a progress bar because the "rename" call that moves the file is a single OS call.
It's worth noting that the "rename" call only takes time if the source and destination are on different physical volumes. If they're on the same volume, then the rename will take almost no time. If you know that you're copying data between volumes, you may wish to use functions from the shutil module such as copyfileobj. There is no callback for progress monitoring, however you can implement your own source or destination file-like object to track progress.
This example method expands on the answer by Benno by estimating the time remaining and removing the progress line when the copy is complete.
import os
import sys
import time

def copy_large_file(src, dst):
    '''
    Copy a large file showing progress.
    '''
    print('copying "{}" --> "{}"'.format(src, dst))

    # Start the timer and get the size.
    start = time.time()
    size = os.stat(src).st_size
    print('{} bytes'.format(size))

    # Adjust the chunk size to the input size.
    divisor = 10000  # .1%
    chunk_size = size // divisor
    while chunk_size == 0 and divisor > 0:
        divisor //= 10
        chunk_size = size // divisor
    print('chunk size is {}'.format(chunk_size))

    # Copy.
    try:
        with open(src, 'rb') as ifp:
            with open(dst, 'wb') as ofp:
                copied = 0  # bytes
                chunk = ifp.read(chunk_size)
                while chunk:
                    # Write and calculate how much has been written so far.
                    ofp.write(chunk)
                    copied += len(chunk)
                    per = 100. * float(copied) / float(size)

                    # Calculate the estimated time remaining.
                    elapsed = time.time() - start  # elapsed so far
                    avg_time_per_byte = elapsed / float(copied)
                    remaining = size - copied
                    est = remaining * avg_time_per_byte
                    est1 = size * avg_time_per_byte
                    eststr = 'rem={:>.1f}s, tot={:>.1f}s'.format(est, est1)

                    # Write out the status.
                    sys.stdout.write('\r{:>6.1f}% {} {} --> {} '.format(per, eststr, src, dst))
                    sys.stdout.flush()

                    # Read in the next chunk.
                    chunk = ifp.read(chunk_size)
    except IOError as obj:
        print('\nERROR: {}'.format(obj))
        sys.exit(1)

    sys.stdout.write('\r\033[K')  # clear to EOL
    elapsed = time.time() - start
    print('copied "{}" --> "{}" in {:>.1f}s'.format(src, dst, elapsed))
You can see a fully functioning version in the gist entry here: https://gist.github.com/jlinoff/0f7b290dc4e1f58ad803.