Slow processing time of a list

Slow processing time of a list - python

Why is my code so sluggish (inefficient)? I need to make two methods to record the time it takes to process a list of a given size. I have a search_fast and search_slow method. Even though there is a difference between those two search times. Search_fast is still pretty slow. I'd like to optimise the processing time so instead of getting 8.99038815498 with search_fast and 65.0739619732 with search_slow. It would only take a fraction of a second. What can I do? I'd be eternally grateful for some tips as coding is still pretty new to me. :)
from timeit import Timer
def fillList(l, n):
l.extend(range(1, n + 1))
l = []
fillList(l, 100)
def search_fast(l):
for item in l:
if item == 10:
return True
return False
def search_slow(l):
return_value = False
for item in l:
if item == 10:
return_value = True
return return_value
t = Timer(lambda: search_fast(l))
print t.timeit()
t = Timer(lambda: search_slow(l))
print t.timeit()

The fastest way is using in operator, which tests membership of a value in a sequence.
if value in some_container:
…
Reference: https://docs.python.org/3/reference/expressions.html#membership-test-operations
Update: also, if you frequently need to test the membership, consider using sets instead of lists.
Some pros and cons can be found here: https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset

Adding the following code to above:
t = Timer(lambda: 10 in l)
print(t.timeit())
produces the following on my system:
0.6166538814701169
3.884095008084452
0.29087270299795875
>>>
Hope this helps. The basic idea is to tap into underlying C code and not make your own Python code.

I managed to find out what made the code sluggish. It was a simple mistake of adding to the list byextend instead of append.
def fillList(l, n):
l.**append**(range(1, n + 1))
l = []
fillList(l, 100)
Now search_slowclocks in at 3.91826605797 instead of 65.0739619732. But I have no idea why it changes the performance so much.

Related

a to the power of b without (a**b), Python

Struggling with an exercise that asks me to write a**b without this operator. Tried to write something myself but am not getting correct results. Instead of one value am getting two, both incorrect. Seems like the counter doesnt really increase. May I ask for help? Thanks!
def powerof(base,exp):
result=1
counter=0
# until counter reaches exponent, go on
if counter<=exp:
# result multiplies itself by base, starting at 1
result=result*base
# increase counter
counter=counter+1
return result
return counter # here it says "unreachable code". Can I not return more variables at the same time?
else: # counter already reached exponent, stop
return
# I want to print 2**8. Suprisingly getting two (incorrect) values as a result
print(powerof(2,8))

Try with recursion:
def powerof(base,exp):
if exp == 0:
return 1
if exp == 1:
return base
return base * powerof(base, exp-1)
# I want to print 2**8. Suprisingly getting two (incorrect) values as a result
print(powerof(2,8))
So what it does, it calls itself while decreasing the exponent, thus the call will look like:
2*(2*(2*2))) ... when being executed.
You could also do this in a for-loop, but recursion is more compact.

Naive implementation(not the the best of solutions but i think you should be able to follow this one):
def powerof(base, exp):
results = 1
for n in range(exp):
results *= base
return results
print(powerof(5,2))
Hope it helps.

I would certainly recommend recursion too, but obviously that's not an option ;-)
So, let's try to fix your code. Why are you trying to return something in your if statement ?
return result
return counter # here it says "unreachable code". Can I not return more variables at the same time?
You know that when you return, you exit from your function ? This is not what you meant. What I guess you want is to multiply result as long as you did not do it exp times. In other words, you want to repeat the code inside your if statement until you did it exp times. You have a keyword for that : while.
And while certainly includes that condition you tried to provide with your if.
Good luck!
edit: btw I don't understand why you say you are getting two results. This is suspicious, are you sure of that ?

You can solve the task "raise a to the power of b without using a**b" in one of the following ways:
>>> a, b = 2, 8
>>>
>>> pow(a, b)
>>> a.__pow__(b)
>>>
>>> sum(a**i for i in range(b)) + 1 # Okay, technically this uses **.
>>>
>>> import itertools as it
>>> from functools import reduce
>>> import operator as op
>>> reduce(op.mul, it.repeat(a, b))
>>>
>>> eval('*'.join(str(a) * b)) # Don't use that one.

Why does 'while' cause the program to run longer in Python

I wrote a program to find out the primes in a list of numbers, just so practice formatting and all. Here is my code:
from math import *
#Defining range to be checked
a = range(1,10)
#Opening empty list
b = []
#Wilsons method for primes
for n in a:
if ((factorial(n-1))+1)%n == 0:
b.append(n)
This code runs without issue and fairly fast, at least at this stage of use. However, when I include a while statement(see below), it is substantially slower.
from math import *
#Defining range to be checked
a = range(1,10)
#Opening empty list
b = []
#Wilson't method for primes
for n in a:
while n>1:
if ((factorial(n-1))+1)%n == 0:
b.append(n)
Could anyone explain why that would be the case?
n.b: I know there are more efficient method to find primes. I am just practicing formatting and all, although I don't mind improving my code.
edit: mistaken addition of less than symbol rather than the appropriate greater than symbol. Corrected.

As pointed out by several people your code will result in an infinite loop as the value of n does not change within your while-loop.
You are probably not looking for a while loop in the first place. It should be sufficient to use the for loop without the first iteration (n = 1). If you insist on including n=1, using an if statement is a possible workaround:
a=range(1,10)
b=[]
for n in a:
if n>1:
if ((factorial(n-1))+1)%n == 0:
b.append(n)

Most efficient way to check if numbers in a list is divisible by another number

So I just finished a coding test yesterday and I'm a tad neurotic. I was asked to create a class or function to check if elements in a list were all divisible by a scalable list of elements. This is the best I could come up with and was wondering if this could be improved. Thanks! And to get in front of it, I deliberately used a partial instead of lambda. To me it is much cleaner, and allows for better code re-use. Plus, I think Guido strongly discourages the use of Lambda and advises people switch to partials.
from functools import partial
def is_divisible(mod_vals, num):
"""A partial that runs a number against the list of modulus arguments, returning a bool value"""
for mod in mod_vals:
if num%mod != 0:
return False
return True
def divisible_by_factor(*mod_vals):
"""Returns a list based off a scalable amount of modulus arguments, no range support currently"""
comparison_list = []
div_partial = partial(is_divisible, (mod_vals))
for i in range(1, 100):
if div_partial(num=i):
comparison_list.append(i)
return comparison_list

>>> def divisible_by_factor(mod_vals):
>>> return [i for i in range(1, 100) if all(i % j == 0 for j in mod_vals)]
>>> print divisible_by_factor([2, 3, 5])
[30, 60, 90]
For every i test whether it's divisible by all provided values. Keep only values that pass this test.

How to decrease running time of this particular solution?

I am trying to write a piece of code that will generate a permutation, or some series of characters that are all different in a recursive fashion.
def getSteps(length, res=[]):
if length == 1:
if res == []:
res.append("l")
res.append("r")
return res
else:
for i in range(0,len(res)):
res.append(res[i] + "l")
res.append(res[i] + "r")
print(res)
return res
else:
if res == []:
res.append("l")
res.append("r")
return getSteps(length-1,res)
else:
for i in range(0,len(res)):
res.append(res[i] + "l")
res.append(res[i] + "r")
print(res)
return getSteps(length-1,res)
def sanitize(length, res):
return [i for i in res if len(str(i)) == length]
print(sanitize(2,getSteps(2)))
So this would return
"LL", "LR", "RR, "RL" or some permutation of the series.
I can see right off the bat that this function probably runs quite slowly, seeing as I have to loop through an entire array. I tried to make the process as efficient as I could, but this is as far as I can get. I know that some unnecessary things happen during the run, but I don't know how to make it much better. So my question is this: what would I do to increase the efficiency and decrease the running time of this code?
edit = I want to be able to port this code to java or some other language in order to understand the concept of recursion rather than use external libraries and have my problem solved without understanding it.

Your design is broken. If you call getSteps again, res won't be an empty list, it will have garbage left over from the last call in it.
I think you want to generate permutations recursively, but I don't understand where you are going with the getSteps function
Here is a simple recursive function
def fn(x):
if x==1:
return 'LR'
return [j+i for i in fn(x-1) for j in "LR"]

Is there a way to combine the binary approach and a recursive approach?
Yes, and #gribbler came very close to that in the post to which that comment was attached. He just put the pieces together in "the other order".
How can you construct all the bitstrings of length n, in increasing order (when viewed as binary integers)? Well, if you already have all the bitstrings of length n-1, you can prefix them all with 0, and then prefix them all again with 1. It's that easy.
def f(n):
if n == 0:
return [""]
return [a + b for a in "RL" for b in f(n-1)]
print(f(3))
prints
['RRR', 'RRL', 'RLR', 'RLL', 'LRR', 'LRL', 'LLR', 'LLL']
Replace R with 0, and L with 1, and you have the 8 binary integers from 0 through 7 in increasing order.

You should look into itertools. There is a function there called permutations which does exactly what you want to achieve here.

How to retrieve an element from a set without removing it?

Suppose the following:
>>> s = set([1, 2, 3])
How do I get a value (any value) out of s without doing s.pop()? I want to leave the item in the set until I am sure I can remove it - something I can only be sure of after an asynchronous call to another host.
Quick and dirty:
>>> elem = s.pop()
>>> s.add(elem)
But do you know of a better way? Ideally in constant time.

Two options that don't require copying the whole set:
for e in s:
break
# e is now an element from s
Or...
e = next(iter(s))
But in general, sets don't support indexing or slicing.

Least code would be:
>>> s = set([1, 2, 3])
>>> list(s)[0]
1
Obviously this would create a new list which contains each member of the set, so not great if your set is very large.

I wondered how the functions will perform for different sets, so I did a benchmark:
from random import sample
def ForLoop(s):
for e in s:
break
return e
def IterNext(s):
return next(iter(s))
def ListIndex(s):
return list(s)[0]
def PopAdd(s):
e = s.pop()
s.add(e)
return e
def RandomSample(s):
return sample(s, 1)
def SetUnpacking(s):
e, *_ = s
return e
from simple_benchmark import benchmark
b = benchmark([ForLoop, IterNext, ListIndex, PopAdd, RandomSample, SetUnpacking],
{2**i: set(range(2**i)) for i in range(1, 20)},
argument_name='set size',
function_aliases={first: 'First'})
b.plot()
This plot clearly shows that some approaches (RandomSample, SetUnpacking and ListIndex) depend on the size of the set and should be avoided in the general case (at least if performance might be important). As already shown by the other answers the fastest way is ForLoop.
However as long as one of the constant time approaches is used the performance difference will be negligible.
iteration_utilities (Disclaimer: I'm the author) contains a convenience function for this use-case: first:
>>> from iteration_utilities import first
>>> first({1,2,3,4})
1
I also included it in the benchmark above. It can compete with the other two "fast" solutions but the difference isn't much either way.

tl;dr
for first_item in muh_set: break remains the optimal approach in Python 3.x. Curse you, Guido.
y u do this
Welcome to yet another set of Python 3.x timings, extrapolated from wr.'s excellent Python 2.x-specific response. Unlike AChampion's equally helpful Python 3.x-specific response, the timings below also time outlier solutions suggested above – including:
list(s)[0], John's novel sequence-based solution.
random.sample(s, 1), dF.'s eclectic RNG-based solution.
Code Snippets for Great Joy
Turn on, tune in, time it:
from timeit import Timer
stats = [
"for i in range(1000): \n\tfor x in s: \n\t\tbreak",
"for i in range(1000): next(iter(s))",
"for i in range(1000): s.add(s.pop())",
"for i in range(1000): list(s)[0]",
"for i in range(1000): random.sample(s, 1)",
]
for stat in stats:
t = Timer(stat, setup="import random\ns=set(range(100))")
try:
print("Time for %s:\t %f"%(stat, t.timeit(number=1000)))
except:
t.print_exc()
Quickly Obsoleted Timeless Timings
Behold! Ordered by fastest to slowest snippets:
$ ./test_get.py
Time for for i in range(1000):
for x in s:
break: 0.249871
Time for for i in range(1000): next(iter(s)): 0.526266
Time for for i in range(1000): s.add(s.pop()): 0.658832
Time for for i in range(1000): list(s)[0]: 4.117106
Time for for i in range(1000): random.sample(s, 1): 21.851104
Faceplants for the Whole Family
Unsurprisingly, manual iteration remains at least twice as fast as the next fastest solution. Although the gap has decreased from the Bad Old Python 2.x days (in which manual iteration was at least four times as fast), it disappoints the PEP 20 zealot in me that the most verbose solution is the best. At least converting a set into a list just to extract the first element of the set is as horrible as expected. Thank Guido, may his light continue to guide us.
Surprisingly, the RNG-based solution is absolutely horrible. List conversion is bad, but random really takes the awful-sauce cake. So much for the Random Number God.
I just wish the amorphous They would PEP up a set.get_first() method for us already. If you're reading this, They: "Please. Do something."

To provide some timing figures behind the different approaches, consider the following code.
The get() is my custom addition to Python's setobject.c, being just a pop() without removing the element.
from timeit import *
stats = ["for i in xrange(1000): iter(s).next() ",
"for i in xrange(1000): \n\tfor x in s: \n\t\tbreak",
"for i in xrange(1000): s.add(s.pop()) ",
"for i in xrange(1000): s.get() "]
for stat in stats:
t = Timer(stat, setup="s=set(range(100))")
try:
print "Time for %s:\t %f"%(stat, t.timeit(number=1000))
except:
t.print_exc()
The output is:
$ ./test_get.py
Time for for i in xrange(1000): iter(s).next() : 0.433080
Time for for i in xrange(1000):
for x in s:
break: 0.148695
Time for for i in xrange(1000): s.add(s.pop()) : 0.317418
Time for for i in xrange(1000): s.get() : 0.146673
This means that the for/break solution is the fastest (sometimes faster than the custom get() solution).

Since you want a random element, this will also work:
>>> import random
>>> s = set([1,2,3])
>>> random.sample(s, 1)
[2]
The documentation doesn't seem to mention performance of random.sample. From a really quick empirical test with a huge list and a huge set, it seems to be constant time for a list but not for the set. Also, iteration over a set isn't random; the order is undefined but predictable:
>>> list(set(range(10))) == range(10)
True
If randomness is important and you need a bunch of elements in constant time (large sets), I'd use random.sample and convert to a list first:
>>> lst = list(s) # once, O(len(s))?
...
>>> e = random.sample(lst, 1)[0] # constant time

Yet another way in Python 3:
next(iter(s))
or
s.__iter__().__next__()

Seemingly the most compact (6 symbols) though very slow way to get a set element (made possible by PEP 3132):
e,*_=s
With Python 3.5+ you can also use this 7-symbol expression (thanks to PEP 448):
[*s][0]
Both options are roughly 1000 times slower on my machine than the for-loop method.

I use a utility function I wrote. Its name is somewhat misleading because it kind of implies it might be a random item or something like that.
def anyitem(iterable):
try:
return iter(iterable).next()
except StopIteration:
return None

Following #wr. post, I get similar results (for Python3.5)
from timeit import *
stats = ["for i in range(1000): next(iter(s))",
"for i in range(1000): \n\tfor x in s: \n\t\tbreak",
"for i in range(1000): s.add(s.pop())"]
for stat in stats:
t = Timer(stat, setup="s=set(range(100000))")
try:
print("Time for %s:\t %f"%(stat, t.timeit(number=1000)))
except:
t.print_exc()
Output:
Time for for i in range(1000): next(iter(s)): 0.205888
Time for for i in range(1000):
for x in s:
break: 0.083397
Time for for i in range(1000): s.add(s.pop()): 0.226570
However, when changing the underlying set (e.g. call to remove()) things go badly for the iterable examples (for, iter):
from timeit import *
stats = ["while s:\n\ta = next(iter(s))\n\ts.remove(a)",
"while s:\n\tfor x in s: break\n\ts.remove(x)",
"while s:\n\tx=s.pop()\n\ts.add(x)\n\ts.remove(x)"]
for stat in stats:
t = Timer(stat, setup="s=set(range(100000))")
try:
print("Time for %s:\t %f"%(stat, t.timeit(number=1000)))
except:
t.print_exc()
Results in:
Time for while s:
a = next(iter(s))
s.remove(a): 2.938494
Time for while s:
for x in s: break
s.remove(x): 2.728367
Time for while s:
x=s.pop()
s.add(x)
s.remove(x): 0.030272

What I usually do for small collections is to create kind of parser/converter method like this
def convertSetToList(setName):
return list(setName)
Then I can use the new list and access by index number
userFields = convertSetToList(user)
name = request.json[userFields[0]]
As a list you will have all the other methods that you may need to work with

You can unpack the values to access the elements:
s = set([1, 2, 3])
v1, v2, v3 = s
print(v1,v2,v3)
#1 2 3

I f you want just the first element try this:
b = (a-set()).pop()

How about s.copy().pop()? I haven't timed it, but it should work and it's simple. It works best for small sets however, as it copies the whole set.

Another option is to use a dictionary with values you don't care about. E.g.,
poor_man_set = {}
poor_man_set[1] = None
poor_man_set[2] = None
poor_man_set[3] = None
...
You can treat the keys as a set except that they're just an array:
keys = poor_man_set.keys()
print "Some key = %s" % keys[0]
A side effect of this choice is that your code will be backwards compatible with older, pre-set versions of Python. It's maybe not the best answer but it's another option.
Edit: You can even do something like this to hide the fact that you used a dict instead of an array or set:
poor_man_set = {}
poor_man_set[1] = None
poor_man_set[2] = None
poor_man_set[3] = None
poor_man_set = poor_man_set.keys()

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Slow processing time of a list - python

Adding the following code to above: t = Timer(lambda: 10 in l) print(t.timeit()) produces the following on my system: 0.6166538814701169 3.884095008084452 0.29087270299795875 >>> Hope this helps. The basic idea is to tap into underlying C code and not make your own Python code.

Related

a to the power of b without (a**b), Python

Why does 'while' cause the program to run longer in Python

Most efficient way to check if numbers in a list is divisible by another number

How to decrease running time of this particular solution?

How to retrieve an element from a set without removing it?

Categories

Resources