I am currently learning Python 3, coming from R as my main programming language. While for-loops in R have largely the same functionality as in Python, I was taught to avoid using them for big operations and to use apply instead, which is more efficient.
My question is: how efficient are for-loops in Python, are there alternatives, and is it worth exploring those possibilities as a Python newbie?
For example:
p = some_candidate_parameter_generator(data)
for i in p:
    fit_model_with_parameter(data, i)
Bear with me, it is tricky to give an example without going too deep into specific code. But this is something that in R I would have written with apply, especially if p is large.
The comments correctly point out that for loops are "only as efficient as your logic"; however, the choice between range and xrange does have performance implications, and this may be what you had in mind when asking this question. These methods have nothing to do with the intrinsic performance of for loops, though.
In Python 3, range behaves the way xrange used to, and xrange no longer exists. In Python versions before 3.0 there was a distinction: range built the entire list in memory and then iterated over each item, while xrange was more akin to a generator, where each item was loaded into memory only when needed and discarded after it was iterated over.
After your updated question:
In other words, if you have a giant list of items that you need to iterate over via a for loop, it is often more memory efficient to use a generator, not a list or a tuple, etc. Again though, this has nothing to do with how the Python for-loop operates, but more to do with what you're iterating over. If in doubt, use a generator, and your memory-efficiency will be as good as it will get with Python.
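As a minimal sketch of that difference (the sizes printed are CPython implementation details and will vary by version):

```python
import sys

# A list comprehension materializes all 100,000 results at once...
squares_list = [n * n for n in range(100_000)]

# ...while a generator expression is a small, fixed-size object that
# produces each value only when the consumer asks for it.
squares_gen = (n * n for n in range(100_000))

print(sys.getsizeof(squares_list))  # large: holds references to every element
print(sys.getsizeof(squares_gen))   # small: just the generator's state

# Consuming a generator in a loop never holds more than one value at a time:
total = sum(n * n for n in range(100_000))
```

The loop at the end does exactly the same arithmetic either way; only the peak memory footprint differs.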
Related
Newbie programmer here. I just started learning some functional programming, and I was wondering what's going on behind the scenes in the various scenarios of reduce, a for loop, and built-in functions. One thing I noticed when I timed each of these was that reduce() took the longest, the for loop inside a function took the second longest, and the built-in function max() was the fastest. Can somebody explain what's going on behind the scenes that causes these speed differences?
I defined the for loop as:
def f(iterable):
    j = next(iterable)
    for i in iterable:
        if i > j:
            j = i
    return j
and then compared it with
max(iterable)
and
reduce(lambda x, y: x if x>y else y, iterable)
and noticed, as stated previously, that reduce() took the longest, the for loop inside a function took the second longest, and the built-in max() was the fastest.
Python is an interpreted language. (At least, it's partly interpreted: technically, source code is compiled into bytecode, which is then interpreted.) Code running in an interpreter is almost always going to be a lot slower than native code running on the raw hardware of your machine.
But a lot of the built-in functions and objects of Python are not written in the Python language itself. A function like max is implemented in C, so it can be pretty fast. It can be a lot faster than pure Python code that the interpreter has to work through.
Furthermore, some parts of pure Python code are faster than other parts. Function calls are notoriously slower than most other bits of code, so doing a lot of function calls is generally to be avoided if possible in performance-sensitive sections of your code.
So let's examine your three examples again with these performance thoughts in mind. The max function is implemented in C, so it's fastest. The pure-Python function is slower because its loop and comparisons all need to be interpreted, but while it contains several function calls, most of them are to builtins (like next, which in turn calls the __next__ method of your iterator; both are likely implemented in C). The slowest example is the one using reduce, which, although it is a builtin itself, keeps calling back out to the lambda function you gave it as an argument. The repeated calls to the relatively slow lambda are what make it the slowest of your three examples.
Note that none of these speed differences change the asymptotic performance of your code. All three of your examples are O(N), where N is the number of items in the iterable. And often asymptotic performance matters a lot more than raw per-item speed if you need your code to scale up to a larger problem. If you were instead comparing an exponentially scaling algorithm with an alternative that was linear (or even polynomial), you'd see vastly different performance numbers once the input size got large enough. Of course, it's also possible that you won't care about scalability, if you only need the code to work once on a relatively modest data set. But in that case, the performance differences between builtins and lambdas probably don't matter all that much either.
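For completeness, a small timing harness along the lines of the comparison described above (the list size and repetition count are illustrative choices; exact numbers will vary by machine and Python version):

```python
import timeit
from functools import reduce  # reduce lives in functools in Python 3

def f(iterable):
    it = iter(iterable)  # make sure next() has an iterator to work on
    j = next(it)
    for i in it:
        if i > j:
            j = i
    return j

data = list(range(10_000))

t_max = timeit.timeit(lambda: max(data), number=200)
t_loop = timeit.timeit(lambda: f(data), number=200)
t_reduce = timeit.timeit(
    lambda: reduce(lambda x, y: x if x > y else y, data), number=200
)
print(f"max: {t_max:.4f}s  loop: {t_loop:.4f}s  reduce: {t_reduce:.4f}s")
```

All three produce the same answer; only the per-item overhead differs.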
I am currently trying to implement fold/reduce in Python, since I don't like the version from functools. This naturally involved implementing something like Lisp's CDR function, since Python doesn't seem to have an equivalent. Here is what I am thinking of trying:
def tail(lat):
    # all elements of list except first
    acc = []
    for i in range(1, len(lat)):
        acc = acc + [lat[i]]
    return acc
Would this be an efficient way of implementing this function? Am I missing some kind of built-in function? Thanks in advance!
"Something like the Lisp CDR function" is trivial:
lat[1:]
This will be significantly faster than your attempt, but only by a constant factor.
However, it doesn't make much sense to do this in the first place. The whole point of CDR is that, when your lists are linked lists stored in CONS cells, going from one cell to its tail is a single machine-language operation. But with arrays (which is what Python lists are), lat[1:] (or the more complicated thing you tried to write, or in fact any possible implementation) allocates a whole new array of size N-1 and copies over N-1 values.
The efficiency cost of doing that over and over again (in an algorithm that was expecting it to be nearly free) is going to be so huge that the constant-factor speedup of using the slice is unlikely to be nearly enough of an improvement to make it acceptable.
Most algorithms that are fast with CDR are going to be slow with this kind of slicing, and most algorithms that are fast with this kind of slicing would be slow with CDR. That's why we have multiple data structures in the first place: because they're good for different things.
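To make the cost concrete, here is a sketch contrasting a CDR-style loop built on slicing with a plain single pass (the function names and sizes are illustrative):

```python
def sum_by_slicing(xs):
    # Each xs[1:] allocates a new list of the remaining items,
    # so the whole loop does O(N^2) copying.
    total = 0
    while xs:
        total += xs[0]
        xs = xs[1:]
    return total

def sum_by_iterating(xs):
    # A single pass touches each item once: O(N), no copies.
    total = 0
    for x in xs:
        total += x
    return total
```

For a list of a few thousand items both finish quickly, but the slicing version's running time grows quadratically as the list grows, while the plain loop stays linear.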
If you want to know the most efficient way to fold/reduce over an array, it's the way functools.reduce (and the variations of it that libraries like toolz offer) does it: just iterate.
And just iterating has another huge advantage. Python doesn't just have lists, it has an abstraction called iterables, which include iterators and other types that can generate their contents lazily. If you're folding forward, you can take advantage of that laziness. (Folding backward does of course take linear space, either explicitly or on the stack—but it's still better than quadratic copying.) Ignoring that fact defeats the purpose.
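A minimal sketch of that "just iterate" approach (a hypothetical foldl helper, not the actual functools implementation):

```python
def foldl(f, acc, iterable):
    # Works on any iterable, including lazy generators:
    # no slicing, no copying, a single forward pass.
    for x in iterable:
        acc = f(acc, x)
    return acc

# e.g. a left fold of + over a generator, which never materializes a list
result = foldl(lambda a, b: a + b, 0, (n for n in range(5)))
```

Because the fold only ever asks for "the next item," it gets the laziness benefit described above for free.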
I understand the concept behind generators and why one would choose them over lists, but I'm struggling to get quality practice actually implementing them in my code. Any suggestions on the types of problems I should play around with? I did the Fibonacci exercise already, but I'd like to practice with other problems that put generators to good use. Thanks!
How about this one: implement a generator that reads chunks from a large file or a big database (so big that it wouldn't fit into memory). Alternatively, consider a stream of infinitely many values as input.
As you might already have learned, this is a common use case in real world applications:
https://docs.python.org/3/howto/functional.html
With a list comprehension, you get back a Python list; [...] Generator expressions return an iterator that computes the values as necessary, not needing to materialize all the values at once. This means that list comprehensions aren’t useful if you’re working with iterators that return an infinite stream or a very large amount of data. Generator expressions are preferable in these situations.
http://naiquevin.github.io/python-generators-and-being-lazy.html
Now you may ask how does this differ from an ordinary list and what is the use of all this anyway? The key difference is that the generator gives out new values on the fly and doesn't keep the elements in memory.
https://wiki.python.org/moin/Generators
The performance improvement from the use of generators is the result of the lazy (on demand) generation of values, which translates to lower memory usage. Furthermore, we do not need to wait until all the elements have been generated before we start to use them. This is similar to the benefits provided by iterators, but the generator makes building iterators easy.
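A sketch of the suggested exercise, reading a file-like object in fixed-size chunks (the function name and chunk size are made up for illustration):

```python
import io

def read_in_chunks(fileobj, chunk_size=4096):
    # Yield successive chunks so that only one chunk is in memory at a time.
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # an empty read signals end of file
            return
        yield chunk

# Usage with an in-memory file standing in for a huge one:
fake_file = io.StringIO("x" * 10_000)
sizes = [len(c) for c in read_in_chunks(fake_file)]
```

The same loop works unchanged whether the underlying file is 10 KB or 100 GB, because the generator never holds more than one chunk.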
I am new-ish to Python and I am finding that I am writing the same pattern of code over and over again:
def foo(list):
    results = []
    for n in list:
        # do some or a lot of processing on n and possibly other variables
        nprime = operation(n)
        results.append(nprime)
    return results
I am thinking in particular about the creation of the empty list followed by the append call. Is there a more Pythonic way to express this pattern? append might not have the best performance characteristics, but I am not sure how else I would approach it in Python.
I often know exactly the length of my output, so calling append each time seems like it might be causing memory fragmentation, or performance problems, but I am also wondering if that is just my old C ways tripping me up. I am writing a lot of text parsing code that isn't super performance sensitive on any particular loop or piece because all of the performance is really contained in gensim or NLTK code and is in much more capable hands than mine.
Is there a better/more pythonic pattern for doing this type of operation?
First, a list comprehension may be all you need (if all the processing mentioned in your comment occurs in operation):
def foo(list):
    return [operation(n) for n in list]
If a list comprehension will not work in your situation, consider whether foo really needs to build the list and could be a generator instead.
def foo(list):
    for n in list:
        # Processing...
        yield operation(n)
In this case, you can iterate over the sequence, and each value is calculated on demand:
for x in foo(myList):
    ...
or you can let the caller decide if a full list is needed:
results = list(foo(myList))
If neither of the above is suitable, then building up the return list in the body of the loop as you are now is perfectly reasonable.
[..] so calling append each time seems like it might be causing memory fragmentation, or performance problems, but I am also wondering if that is just my old C ways tripping me up.
If you are worried about this, don't be. Python over-allocates whenever the list needs to be resized (lists are dynamically resized based on their length) in order to make appends amortized O(1). Whether you call list.append manually or build the list with a comprehension (which internally also uses append), the effect, memory-wise, is similar.
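You can observe that over-allocation directly (the exact sizes reported are CPython implementation details and vary by version):

```python
import sys

lst = []
sizes = []
for n in range(64):
    lst.append(n)
    sizes.append(sys.getsizeof(lst))

# The reported size stays flat for stretches and jumps occasionally:
# CPython grows the underlying array in chunks, so most appends never
# have to allocate anything.
distinct_sizes = len(set(sizes))
print(distinct_sizes, "distinct sizes across 64 appends")
```

If every append reallocated the array, you would see 64 distinct sizes; instead you see only a handful.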
The list comprehension just performs a bit better speed-wise; it is optimized for creating lists with specialized bytecode instructions (mainly LIST_APPEND, which calls the list's append directly in C).
Of course, if memory usage is a concern, you could always opt for the generator approach, as highlighted in chepner's answer, to lazily produce your results.
In the end, for loops are still great. They might seem clunky in comparison to comprehensions and maps but they still offer a recognizable and readable way to achieve a goal. for loops deserve our love too.
I am going through a link about generators that someone posted. At the beginning, the author compares the two functions below; on his setup he showed a speed increase of 5% with the generator.
I'm running Windows XP with Python 3.1.1 and cannot seem to duplicate the results. I keep finding the "old way" (logs1) to be slightly faster when tested with the provided logs and up to 1 GB of duplicated data.
Can someone help me understand what's happening differently?
Thanks!
def logs1():
    wwwlog = open("big-access-log")
    total = 0
    for line in wwwlog:
        bytestr = line.rsplit(None, 1)[1]
        if bytestr != '-':
            total += int(bytestr)
    return total
def logs2():
    wwwlog = open("big-access-log")
    bytecolumn = (line.rsplit(None, 1)[1] for line in wwwlog)
    getbytes = (int(x) for x in bytecolumn if x != '-')
    return sum(getbytes)
For what it's worth, the main purpose of the speed comparison in the presentation was to point out that using generators does not introduce a huge performance overhead. Many programmers, when first seeing generators, might start wondering about the hidden costs. For example, is there all sorts of fancy magic going on behind the scenes? Is using this feature going to make my program run twice as slow?
In general that's not the case. The example is meant to show that a generator solution can run at essentially the same speed, if not slightly faster in some cases (although it depends on the situation, version of Python, etc.). If you are observing huge differences in performance between the two versions though, then that would be something worth investigating.
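If you want to run the comparison yourself without the original log file, a sketch like this on synthetic in-memory lines works (the line format and counts here are made up):

```python
import timeit

def total_loop(lines):
    # logs1-style: one explicit for loop
    total = 0
    for line in lines:
        bytestr = line.rsplit(None, 1)[1]
        if bytestr != '-':
            total += int(bytestr)
    return total

def total_gen(lines):
    # logs2-style: chained generator expressions
    bytecolumn = (line.rsplit(None, 1)[1] for line in lines)
    return sum(int(x) for x in bytecolumn if x != '-')

lines = [f'host - - "GET /" 200 {n}' for n in range(10_000)]
lines += ['host - - "GET /" 304 -'] * 100  # responses with no byte count

t1 = timeit.timeit(lambda: total_loop(lines), number=20)
t2 = timeit.timeit(lambda: total_gen(lines), number=20)
print(f"loop: {t1:.4f}s  generators: {t2:.4f}s")
```

Working from a list removes disk I/O from the measurement, so any remaining difference is the loop-versus-generator overhead itself.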
In David Beazley's slides that you linked to, he states that all tests were run with "Python 2.5.1 on OS X 10.4.11," and you say you're running your tests with Python 3.1 on Windows XP. So realize you're making an apples-to-oranges comparison. I suspect that of the two variables, the Python version matters much more.
Python 3 is a different beast from Python 2. Many things have changed under the hood (even within the Python 2 branch), including performance optimizations as well as performance regressions (see, for example, Beazley's own recent blog post on I/O in Python 3). For this reason, the Python Performance Tips page states explicitly:
You should always test these tips with your application and the version of Python you intend to use and not just blindly accept that one method is faster than another.
I should mention that one area where you can count on generators helping is in reducing memory consumption rather than CPU consumption. If you have a large amount of data where you calculate or extract something from each individual piece, and you don't need the data afterwards, generators will shine. See generator expressions for more details.
You don't have an answer after almost half an hour, so I'm posting something that makes sense to me, though not necessarily the right answer. I figure this is better than nothing:
The first algorithm reads the file lazily: iterating over the file object loads one line at a time into memory, continuing until there is nothing left to read from the input.
The second algorithm uses two generators, each with an if statement for a total of two comparisons per loop as opposed to the first algorithm's one comparison.
Also the second algorithm calls the sum function at the end as opposed to the first algorithm that simply keeps adding relevant integers as it keeps encountering them.
As such, for sufficiently large inputs, the second algorithm has more comparisons and an extra function call than the first. This could possibly explain why it takes longer to finish than the first algorithm.
Hope this helps