Function in Python list comprehension, don't eval twice

I'm composing a Python list from an input list run through a transforming function. I would like to include only those items in the output list for which the result isn't None. This works:
def transform(n):
    # expensive irl, so don't execute twice
    return None if n == 2 else n**2

a = [1, 2, 3]
lst = []
for n in a:
    t = transform(n)
    if t is not None:
        lst.append(t)
print(lst)
[1, 9]
I have a hunch that this can be simplified with a comprehension. However, the straightforward solution
def transform(n):
    return None if n == 2 else n**2

a = [1, 2, 3]
lst = [transform(n) for n in a if transform(n) is not None]
print(lst)
is no good since transform() is applied twice to each entry. Any way around this?

Use the := (walrus) operator, available in Python >= 3.8:
lst = [t for n in a if (t := transform(n)) is not None]
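Put together with transform() from the question, a complete runnable version (a minimal sketch) looks like this:
def transform(n):
    # expensive irl, so don't execute twice
    return None if n == 2 else n**2

a = [1, 2, 3]
# transform() runs exactly once per element; the walrus binds its result to t
lst = [t for n in a if (t := transform(n)) is not None]
print(lst)
# [1, 9]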

If you can't or don't want to use the walrus operator, you can use @functools.lru_cache to cache the result of calling the function and avoid calling it twice:
import functools

eggs = [2, 4, 5, 3, 2]

@functools.lru_cache
def spam(foo):
    print(foo)  # to demonstrate each call
    return None if foo % 2 else foo

print([spam(n) for n in eggs if spam(n) is not None])
output
2
4
5
3
[2, 4, 2]
Compared with the walrus operator (the currently accepted answer), this is the better option if there are duplicate values in the input list, since the walrus operator always runs the function once per element of the input list. Note that you may also combine functools.lru_cache with the walrus operator, e.g. for readability (a sketch follows the output below).
eggs = [2, 4, 5, 3, 2]

def spam(foo):
    print(foo)  # to demonstrate each call
    return None if foo % 2 else foo

print([bar for n in eggs if (bar := spam(n)) is not None])
output
2
4
5
3
2
[2, 4, 2]
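For completeness, the combined variant mentioned above (caching plus the walrus operator) might look like the following sketch, reusing this answer's spam/eggs names:
import functools

eggs = [2, 4, 5, 3, 2]

@functools.lru_cache
def spam(foo):
    print(foo)  # to demonstrate each call
    return None if foo % 2 else foo

# the walrus avoids calling spam() twice per element, and the cache
# avoids re-running the body for the duplicate 2
print([bar for n in eggs if (bar := spam(n)) is not None])
# prints 2, 4, 5, 3 (one call per unique value), then [2, 4, 2]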


Unexpected behaviour with a conditional generator expression [duplicate]

I was running a piece of code that unexpectedly gave a logic error at one part of the program. When investigating the section, I created a test file to test the set of statements being run and found out an unusual bug that seems very odd.
I tested this simple code:
array = [1, 2, 2, 4, 5] # Original array
f = (x for x in array if array.count(x) == 2) # Filters original
array = [5, 6, 1, 2, 9] # Updates original to something else
print(list(f)) # Outputs filtered
And the output was:
>>> []
Yes, nothing. I was expecting the filter comprehension to get items in the array with a count of 2 and output this, but I didn't get that:
# Expected output
>>> [2, 2]
When I commented out the third line to test it once again:
array = [1, 2, 2, 4, 5] # Original array
f = (x for x in array if array.count(x) == 2) # Filters original
### array = [5, 6, 1, 2, 9] # Ignore line
print(list(f)) # Outputs filtered
The output was correct (you can test it for yourself):
>>> [2, 2]
At one point I outputted the type of the variable f:
array = [1, 2, 2, 4, 5] # Original array
f = (x for x in array if array.count(x) == 2) # Filters original
array = [5, 6, 1, 2, 9] # Updates original
print(type(f))
print(list(f)) # Outputs filtered
And I got:
>>> <class 'generator'>
>>> []
Why is updating a list in Python changing the output of another generator variable? This seems very odd to me.
Python's generator expressions are late binding (see PEP 289 -- Generator Expressions) (what the other answers call "lazy"):
Early Binding versus Late Binding
After much discussion, it was decided that the first (outermost) for-expression [of the generator expression] should be evaluated immediately and that the remaining expressions be evaluated when the generator is executed.
[...] Python takes a late binding approach to lambda expressions and has no precedent for automatic, early binding. It was felt that introducing a new paradigm would unnecessarily introduce complexity.
After exploring many possibilities, a consensus emerged that binding issues were hard to understand and that users should be strongly encouraged to use generator expressions inside functions that consume their arguments immediately. For more complex applications, full generator definitions are always superior in terms of being obvious about scope, lifetime, and binding.
That means it only evaluates the outermost for when creating the generator expression. So it binds the value that the name array refers to in the "subexpression" in array (in fact it binds the equivalent of iter(array) at this point). But when you iterate over the generator, the if array.count call actually refers to whatever is currently named array.
Since it's actually a list not an array I changed the variable names in the rest of the answer to be more accurate.
In your first case the list you iterate over and the list you count in will be different. It's as if you used:
list1 = [1, 2, 2, 4, 5]
list2 = [5, 6, 1, 2, 9]
f = (x for x in list1 if list2.count(x) == 2)
So you check for each element in list1 if its count in list2 is two.
You can easily verify this by modifying the second list:
>>> lst = [1, 2, 2]
>>> f = (x for x in lst if lst.count(x) == 2)
>>> lst = [1, 1, 2]
>>> list(f)
[1]
If it iterated over the first list and counted in the first list it would've returned [2, 2] (because the first list contains two 2). If it iterated over and counted in the second list the output should be [1, 1]. But since it iterates over the first list (containing one 1) but checks the second list (which contains two 1s) the output is just a single 1.
Solution using a generator function
There are several possible solutions; I generally prefer not to use generator expressions if they aren't iterated over immediately. A simple generator function will suffice to make it work correctly:
def keep_only_duplicated_items(lst):
    for item in lst:
        if lst.count(item) == 2:
            yield item
And then use it like this:
lst = [1, 2, 2, 4, 5]
f = keep_only_duplicated_items(lst)
lst = [5, 6, 1, 2, 9]
print(list(f))
# [2, 2]
Note that the PEP (see the link above) also states that for anything more complicated a full generator definition is preferable.
A better solution using a generator function with a Counter
A better solution (avoiding the quadratic runtime behavior because you iterate over the whole array for each element in the array) would be to count (collections.Counter) the elements once and then do the lookup in constant time (resulting in linear time):
from collections import Counter

def keep_only_duplicated_items(lst):
    cnts = Counter(lst)
    for item in lst:
        if cnts[item] == 2:
            yield item
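It is used the same way as before; a quick sketch (the parameter is bound to the original list object when the generator is created, so rebinding the outer name doesn't affect it):
lst = [1, 2, 2, 4, 5]
f = keep_only_duplicated_items(lst)
lst = [5, 6, 1, 2, 9]
print(list(f))
# [2, 2]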
Appendix: Using a subclass to "visualize" what happens and when it happens
It's quite easy to create a list subclass that prints when specific methods are called, so one can verify that it really works like that.
In this case I just override the methods __iter__ and count because I'm interested in which list the generator expression iterates over and in which list it counts. The method bodies actually just delegate to the superclass and print something (since it uses super without arguments and f-strings it requires Python 3.6, but it should be easy to adapt for other Python versions):
class MyList(list):
    def __iter__(self):
        print(f'__iter__() called on {self!r}')
        return super().__iter__()

    def count(self, item):
        cnt = super().count(item)
        print(f'count({item!r}) called on {self!r}, result: {cnt}')
        return cnt
This is a simple subclass that just prints when the __iter__ and count methods are called:
>>> lst = MyList([1, 2, 2, 4, 5])
>>> f = (x for x in lst if lst.count(x) == 2)
__iter__() called on [1, 2, 2, 4, 5]
>>> lst = MyList([5, 6, 1, 2, 9])
>>> print(list(f))
count(1) called on [5, 6, 1, 2, 9], result: 1
count(2) called on [5, 6, 1, 2, 9], result: 1
count(2) called on [5, 6, 1, 2, 9], result: 1
count(4) called on [5, 6, 1, 2, 9], result: 0
count(5) called on [5, 6, 1, 2, 9], result: 1
[]
As others have mentioned Python generators are lazy. When this line is run:
f = (x for x in array if array.count(x) == 2) # Filters original
nothing actually happens yet. You've just declared how the generator f will work; array is not examined yet. Then you create a new array that replaces the first one, and finally, when you call
print(list(f)) # Outputs filtered
the generator now needs actual values and starts pulling them from f. But at this point, array already refers to the second list, so you get an empty list.
If you need to reassign the list, and can't use a different variable to hold it, consider creating the list instead of a generator in the second line:
f = [x for x in array if array.count(x) == 2] # Filters original
...
print(f)
Others have already explained the root cause of the issue - the generator is binding to the name of the array local variable, rather than its value.
The most pythonic solution is definitely the list comprehension:
f = [x for x in array if array.count(x) == 2]
However, if there is some reason that you don't want to create a list, you can also force a scope close over array:
f = (lambda array=array: (x for x in array if array.count(x) == 2))()
What's happening here is that the lambda captures the reference to array at the time the line is run, ensuring that the generator sees the variable you expect, even if the variable is later redefined.
Note that this still binds to the variable (reference), not the value, so, for example, the following will print [2, 2, 4, 4]:
array = [1, 2, 2, 4, 5] # Original array
f = (lambda array=array: (x for x in array if array.count(x) == 2))() # Close over array
array.append(4) # This *will* be captured
array = [5, 6, 1, 2, 9] # Updates original to something else
print(list(f)) # Outputs [2, 2, 4, 4]
This is a common pattern in some languages, but it's not very pythonic, so only really makes sense if there's a very good reason for not using the list comprehension (e.g., if array is very long, or is being used in a nested generator comprehension, and you're concerned about memory).
You are not using a generator correctly if this is the primary use of this code. Use a list comprehension instead of a generator expression: just replace the parentheses with brackets. It evaluates to a list immediately.
array = [1, 2, 2, 4, 5]
f = [x for x in array if array.count(x) == 2]
array = [5, 6, 1, 2, 9]
print(f)
#[2, 2]
You are getting this response because of the nature of a generator: you are consuming the generator at a point where its contents evaluate to [].
Generators are lazy, they won't be evaluated until you iterate through them. In this case that's at the point you create the list with the generator as input, at the print.
The root cause of the problem is that generators are lazy; variables are evaluated each time:
>>> l = [1, 2, 2, 4, 5, 5, 5]
>>> filtered = (x for x in l if l.count(x) == 2)
>>> l = [1, 2, 4, 4, 5, 6, 6]
>>> list(filtered)
[4]
It iterates over the original list and evaluates the condition with the current list. In this case, 4 appeared twice in the new list, causing it to appear in the result. It only appears once in the result because it only appeared once in the original list. The 6s appear twice in the new list, but never appear in the old list and are hence never shown.
Full function introspection for the curious (the line with the comment is the important line):
>>> l = [1, 2, 2, 4, 5]
>>> filtered = (x for x in l if l.count(x) == 2)
>>> l = [1, 2, 4, 4, 5, 6, 6]
>>> list(filtered)
[4]
>>> def f(original, new, count):
        current = original
        filtered = (x for x in current if current.count(x) == count)
        current = new
        return list(filtered)
>>> from dis import dis
>>> dis(f)
2 0 LOAD_FAST 0 (original)
3 STORE_DEREF 1 (current)
3 6 LOAD_CLOSURE 0 (count)
9 LOAD_CLOSURE 1 (current)
12 BUILD_TUPLE 2
15 LOAD_CONST 1 (<code object <genexpr> at 0x02DD36B0, file "<pyshell#17>", line 3>)
18 LOAD_CONST 2 ('f.<locals>.<genexpr>')
21 MAKE_CLOSURE 0
24 LOAD_DEREF 1 (current)
27 GET_ITER
28 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
31 STORE_FAST 3 (filtered)
4 34 LOAD_FAST 1 (new)
37 STORE_DEREF 1 (current)
5 40 LOAD_GLOBAL 0 (list)
43 LOAD_FAST 3 (filtered)
46 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
49 RETURN_VALUE
>>> f.__code__.co_varnames
('original', 'new', 'count', 'filtered')
>>> f.__code__.co_cellvars
('count', 'current')
>>> f.__code__.co_consts
(None, <code object <genexpr> at 0x02DD36B0, file "<pyshell#17>", line 3>, 'f.<locals>.<genexpr>')
>>> f.__code__.co_consts[1]
<code object <genexpr> at 0x02DD36B0, file "<pyshell#17>", line 3>
>>> dis(f.__code__.co_consts[1])
3 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 32 (to 38)
6 STORE_FAST 1 (x)
9 LOAD_DEREF 1 (current) # This loads the current list every time, as opposed to loading a constant.
12 LOAD_ATTR 0 (count)
15 LOAD_FAST 1 (x)
18 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
21 LOAD_DEREF 0 (count)
24 COMPARE_OP 2 (==)
27 POP_JUMP_IF_FALSE 3
30 LOAD_FAST 1 (x)
33 YIELD_VALUE
34 POP_TOP
35 JUMP_ABSOLUTE 3
>> 38 LOAD_CONST 0 (None)
41 RETURN_VALUE
>>> f.__code__.co_consts[1].co_consts
(None,)
To reiterate: The list to be iterated is only loaded once. Any closures in the condition or expression, however, are loaded from the enclosing scope each iteration. They are not stored in a constant.
The best solution for your problem would be to create a new variable referencing the original list and use that in your generator expression.
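A minimal sketch of that suggestion (the name snapshot is my choice):
lst = [1, 2, 2, 4, 5]
snapshot = lst  # a second name for the same list object
f = (x for x in snapshot if snapshot.count(x) == 2)
lst = [5, 6, 1, 2, 9]  # rebinding lst no longer affects the generator
print(list(f))
# [2, 2]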
Generator evaluation is "lazy" -- it doesn't get executed until you actualize it with a proper reference.
Look again at your output with the type of f: that object is a generator, not a sequence. It's waiting to be used, an iterator of sorts.
Your generator isn't evaluated until you start requiring values from it. At that point, it uses the values available at that point, not those at the point where it was defined.
Code to "make it work"
That depends on what you mean by "make it work". If you want f to be a filtered list, then use a list, not a generator:
f = [x for x in array if array.count(x) == 2] # Filters original
Generators are lazy, and your newly defined array is used when you exhaust your generator after redefining. Therefore, the output is correct. A quick fix is to use a list comprehension: replace the parentheses () with brackets [].
Moving on to how to write your logic better: counting a value in a loop has quadratic complexity. For an algorithm that works in linear time, you can use collections.Counter to count values, and keep a copy of your original list:
from collections import Counter
array = [1, 2, 2, 4, 5] # original array
counts = Counter(array) # count each value in array
old_array = array.copy() # make copy
array = [5, 6, 1, 2, 9] # updates array
# order relevant
res = [x for x in old_array if counts[x] >= 2]
print(res)
# [2, 2]
# order irrelevant
from itertools import chain
res = list(chain.from_iterable([x]*count for x, count in counts.items() if count >= 2))
print(res)
# [2, 2]
Notice the second version doesn't even require old_array and is useful if there is no need to maintain ordering of values in your original array.

Python difference between filter() and map()

Being new to python I am just trying to figure out the difference between filter() and map().
I wrote a sample script as follows:
def f(x): return x % 2 == 0
def m(y): return y * 2
list = [1,2,3,4]
flist = filter(f, list)
print(list)
print(flist)
mlist = map(m, list)
print(list)
print(mlist)
We see that we pass a list to both filter and map and assign their output to a new list.
Output of this script is
[1, 2, 3, 4]
[2, 4]
[1, 2, 3, 4]
[2, 4, 6, 8]
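(Note that this output is from Python 2, where filter() and map() return lists. In Python 3 they return lazy iterator objects, so you would wrap them in list() to see the same values; the sketch below also renames the list variable, since it shadows the built-in list we now need.)
nums = [1, 2, 3, 4]
flist = list(filter(f, nums))
mlist = list(map(m, nums))
print(flist)  # [2, 4]
print(mlist)  # [2, 4, 6, 8]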
The question arises: the calls to filter and map look the same, so how will they behave if we interchange the functions passed to them?
def f(x): return x * 2
def m(y): return y % 2 == 0
list = [1,2,3,4]
flist = filter(f, list)
print(list)
print(flist)
mlist = map(m, list)
print(list)
print(mlist)
This results in
[1, 2, 3, 4]
[1, 2, 3, 4]
[1, 2, 3, 4]
[False, True, False, True]
This shows that filter evaluates the function and, if the result is true, returns the passed element.
Here the function
def f(x): return x * 2
evaluates to
def f(x): return x * 2 != 0
In contrast, map evaluates the function expression and returns the result as the items.
So filter always expects its function to do a comparison-type task to filter out elements, while map expects its function to evaluate an expression to get some result.
Is this understanding correct?
They both work a little bit differently, but you've got the right idea.
Map takes all objects in a list and applies a function to each of them.
Filter takes all objects in a list and runs each through a function, creating a new list with all the objects that return True in that function.
Here's an example
def square(num):
    return num * num
nums = [1, 2, 3, 4, 5]
mapped = map(square, nums)
print(*nums)
print(*mapped)
The output of this is
1 2 3 4 5
1 4 9 16 25
Here's an example of filter
def is_even(num):
    return num % 2 == 0
nums = [2, 4, 6, 7, 8]
filtered = filter(is_even, nums)
print(*nums)
print(*filtered)
The output of this would be
2 4 6 7 8
2 4 6 8
In map: the function is applied to every object of the iterable.
In filter: the function is applied to every object of the iterable, but only those objects for which it returns True are kept.
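A minimal illustration of both statements (my own example):
nums = [1, 2, 3, 4]
print(list(map(lambda n: n * 10, nums)))         # [10, 20, 30, 40] -- applied to every item
print(list(filter(lambda n: n % 2 == 0, nums)))  # [2, 4] -- keeps items where the condition is True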
As per my understanding, below is the difference between map and filter:
def even(num):
    if num % 2 == 0:
        return 'Even'

num_list = [1, 2, 3, 4, 5]

print(list(filter(even, num_list)))  # output: [2, 4]
print(list(map(even, num_list)))     # output: [None, 'Even', None, 'Even', None]
So, we can say that:
filter(): forms a new list that contains only the elements which satisfy a specific condition.
map(): iterates through all items in the given iterable and executes the function we passed as an argument on each of them.
I think yes, you've pretty much got the picture.
Both map and filter are ways of applying a function to iterables.
In map you can use multiple iterables:
definition: map(function_object, iterable1, iterable2, ...)
whereas in filter only one iterable can be used:
definition: filter(function_object, iterable)
Further, in filter the function_object has to return a boolean. For the sake of example, the following is a map with multiple iterables as input:
list_a = [1, 2, 3]
list_b = [10, 20, 30]
print(list(map(lambda x, y: x + y, list_a, list_b)))  # Output: [11, 22, 33]
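And, for symmetry, a filter counterpart with its single iterable (my own example, reusing list_a):
print(list(filter(lambda x: x % 2 == 0, list_a)))  # Output: [2]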
The filter() and map() functions are a little bit different.
While map takes a normal function, filter takes a Boolean function. As a matter of fact, filters are maps with conditional Boolean logic.
Your example is quite accurate.
To the filter function you're supposed to pass a function and a list (the function must evaluate to true or false). If the function returns true for an element, filter puts that element into a new list. Whereas the map function takes an element, passes it through the function, and stores the function's output in the new list.
map(): the function is applied to all objects of the iterable; we can use as many iterables as we need.
filter(): the function is applied to all objects of the iterable, but only those for which the result is True are added to the result; we can use only one iterable.
In the code below, 0 is not included in the filter result, because 0 is a representation of False, but it is included in the map result:
def check(num):
    return num * 1

nums = [0, 2, 4, 6, 7, 8]
result = filter(check, nums)
print(list(result))

def check(num):
    return num * 1

nums = [0, 2, 4, 6, 7, 8]
result = map(check, nums)
print(list(result))
map() applies the given logic to any number of arguments of type list and returns an iterable containing the values computed from the respective members of the argument list(s).
example:
m = map(lambda x,y: 10+x+y, [1,2,3,4],[10,20,30,40])
print(list(m))
output:
[21, 32, 43, 54]
filter() applies the specified condition to one argument of type list and returns an iterable containing the values from the argument that satisfy the condition.
example:
f = filter(lambda x: x<3, [1,2,3,4])
print(list(f))
output:
[1, 2]
The main difference between map and filter is the values they return. A map always produces a result for each element of the list. A filter keeps only the elements that meet the condition in the function.
def checkElementIn(a):
    nameList = ['b', 'a', 'l', 'l']
    if a in nameList:
        return a

testList = ['r', 'e', 'd', 'b', 'a', 'l', 'l']
m_list = map(checkElementIn, testList)
for i in m_list:
    print(i)
None
None
None
b
a
l
l
f_list = filter(checkElementIn, testList)
for i in f_list:
    print(i)
b
a
l
l
Those are completely different; just take a look at this clear example below:
def sqr(x):
    return x % 2 == 0

mp = map(sqr, [-1, 0, 1, 2, 3, 4, 5, 6])
print(list(mp))
[False, True, False, True, False, True, False, True]

fl = filter(sqr, [-1, 0, 1, 2, 3, 4, 5, 6])
print(list(fl))
[0, 2, 4, 6]
As you can see in this example, filter doesn't return the function's results: it checks for which list items the function returns a true value, and returns those items themselves. That's why the result is [0, 2, 4, 6], the numbers for which sqr() is true.

Python 3: Removing list item with for loop, is this the right way? [duplicate]

I am still learning the basics of python, and I have just spent a while reading about how to remove an item from a list in python from within a for loop. Everything I've read suggests complex ways of doing this, and they say that you cannot remove an item from a list while you're iterating over it. However... this seems to work:
class Object():
    def __init__(self):
        self.y = 0

object_list = [Object(), Object(), Object()]

for thing in object_list:
    thing.y += 1
    if thing.y > 10:
        object_list.remove(thing)
Why is this working when others say it isn't and write complicated workarounds? Is it because you aren't allowed to do it in Python 2 but can in Python 3?
And is this the right way to do this? Will it work as I want it or will it be prone to bugs? Would it be advisable to iterate over the list in reverse order if I plan to remove items?
Sorry if this has been answered before, but it's hard to know which resources refer to what as they all just say "python" in the tag (at least, the ones I've been reading, maybe that's because all the ones I have read are python 2?)
Thanks!
EDIT:
Sorry, there were a couple of copy and paste errors... I've fixed them...
EDIT:
I've been watching another one of Raymond Hettinger's videos... He mentions a way of removing items from a dictionary while iterating over it by using dict.keys(). Something like:
d = {'text': 'moreText', 'other': 'otherText', 'blah': 'moreBlah'}
for k in d.keys():
    if k.startswith('o'):
        del d[k]
Apparently using the keys makes it safe to remove the item while iterating. Is there an equivalent for lists? If there was I could iterate backwards over the list and remove items safely
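(For reference: in Python 3, dict.keys() returns a live view, so the loop above needs a snapshot of the keys to be safe -- in Python 2, d.keys() already returned a list. A sketch of the Python 3 version:)
d = {'text': 'moreText', 'other': 'otherText', 'blah': 'moreBlah'}
for k in list(d.keys()):  # materialize the keys first, then it's safe to delete
    if k.startswith('o'):
        del d[k]
print(d)
# {'text': 'moreText', 'blah': 'moreBlah'}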
Here are some examples
def example1(lst):
    for item in lst:
        if item < 4:
            lst.remove(item)
    return lst

def example2(lst):
    for item in lst[:]:
        if item < 4:
            lst.remove(item)
    return lst

def example3(lst):
    i = 0
    while i < len(lst):
        if lst[i] < 4:
            lst.pop(i)
        else:
            i += 1
    return lst

def example4(lst):
    return [item for item in lst if not item < 4]

def example5(lst):
    for item in reversed(lst):
        if item < 4:
            lst.remove(item)
    return lst

def example6(lst):
    for i, item in reversed(list(enumerate(lst))):
        if item < 4:
            lst.pop(i)
    return lst

def example7(lst):
    size = len(lst) - 1
    for i, item in enumerate(reversed(lst)):
        if item < 4:
            lst.pop(size - i)
    return lst

def example8(lst):
    return list(filter(lambda item: not item < 4, lst))

import itertools

def example9(lst):
    return list(itertools.filterfalse(lambda item: item < 4, lst))
# Output
>>> lst = [1, 1, 2, 3, 2, 3, 4, 5, 6, 6]
>>> example1(lst[:])
[1, 3, 3, 4, 5, 6, 6]
>>> example2(lst[:])
[4, 5, 6, 6]
>>> example3(lst[:])
[4, 5, 6, 6]
>>> example4(lst[:])
[4, 5, 6, 6]
>>> example5(lst[:])
[4, 5, 6, 6]
>>> example6(lst[:])
[4, 5, 6, 6]
>>> example7(lst[:])
[4, 5, 6, 6]
>>> example8(lst[:])
[4, 5, 6, 6]
>>> example9(lst[:])
[4, 5, 6, 6]
Example 1
This example involves iterating through the list and removing values from it. The issue is that you are modifying the list as you go through it, so the list changes during iteration and some elements get skipped over.
Example 2
Here we are iterating over a shallow copy of the list instead of the list itself. The issue is that copying the list can be expensive if it is large.
Example 3
The following is an example using pop instead of remove. The issue with remove is that it removes the first instance of the value it finds in the list. That is typically of no consequence unless you have distinct objects which compare equal (see example 10).
Example 4
Instead of modifying the list here instead we create a new list using list comprehension allowing only specified values.
Example 5
This is an example of iterating through the list in reverse; the difference is that we use the built-in reversed function to drive a for-loop instead of a while loop with a counter.
Example 6
Similar example using pop instead.
Example 7
A better example using pop, as we don't have to convert back to a list to use the reversed function.
Example 8
Example using the built-in filter method to remove the specified values.
Example 9
A similar example using the filterfalse function from itertools.
class Example(object):
    ID = 0

    def __init__(self, x):
        self._x = x
        self._id = str(Example.ID)
        Example.ID += 1

    def __eq__(self, other):
        return self._x == other._x

    def __repr__(self):
        return 'Example({})'.format(self._id)

def example10():
    lst = [Example(5), Example(5)]
    print(lst)
    lst.remove(lst[1])
    return lst
#Output
>>> example10()
[Example(0), Example(1)]
[Example(1)]
Example 10
Here we create two Example objects with the same values and by the equality method they are equal. The ID variable is there to help us differentiate between the two. Now we have specified that we want to remove the 2nd object from the list, however because both are equal the first item is actually removed instead.
Timings
These are pretty rough times and can vary slightly depending on your device. Although these identify which one is faster, this was tested with a list of 10,000 items so if you don't have anything close to that then any choice is fine really.
import timeit
import random

# Code from above is here

def test(func_name):
    global test_lst
    test_lst = lst[:]
    return timeit.timeit("{}(test_lst)".format(func_name),
                         setup="from __main__ import {}, test_lst".format(func_name),
                         number=1)

if __name__ == '__main__':
    NUM_TRIALS = 1000
    lst = list(range(10000))
    random.shuffle(lst)  # Don't have to but makes it a bit interesting
    test_list = lst[:]
    for func in ('example2', 'example3', 'example4', 'example5',
                 'example6', 'example7', 'example8', 'example9'):
        trials = []
        for _ in range(NUM_TRIALS):
            trials.append(test(func))
        print(func, sum(trials) / len(trials) * 10000)
#Output
example2 8.487979147454494
example3 20.407155912623292
example4 5.4595031069025035
example5 7.945100572479213
example6 14.43537688078149
example7 9.088818018676008
example8 14.898256300967116
example9 13.865010859443247
It will work. However, it's never a good idea to modify an object while you're iterating over it. You're likely to get unexpected behaviour.
If I did this:
my_list = [1, 2, 3, 4]
for x in my_list:
    if x + 1 in my_list:
        my_list.remove(x + 1)
I'd expect my_list = [1] at the end: 1 removes 2, 2 removes 3, and 3 removes 4. If I check, though, I find my_list = [1, 3]. This is because 2 was removed from the list in the first iteration, so the second iteration used 3 to remove 4, and 3 itself is still in the list.
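Iterating over a shallow copy gives the expected result (a small sketch of the usual fix):
my_list = [1, 2, 3, 4]
for x in my_list[:]:  # iterate over a copy; removals don't shift the iteration
    if x + 1 in my_list:
        my_list.remove(x + 1)
print(my_list)
# [1]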

concatenate an arbitrary number of lists in a function in Python

I hope to write the join_lists function to take an arbitrary number of lists and concatenate them. For example, if the inputs are
m = [1, 2, 3]
n = [4, 5, 6]
o = [7, 8, 9]
then when I call print join_lists(m, n, o), it should return [1, 2, 3, 4, 5, 6, 7, 8, 9]. I realize I should use *args as the argument in join_lists, but I'm not sure how to concatenate an arbitrary number of lists. Thanks.
Although you can use something which invokes __add__ sequentially, that is very much the wrong thing (for starters you end up creating as many new lists as there are lists in your input, which ends up having quadratic complexity).
The standard tool is itertools.chain:
import itertools

def concatenate(*lists):
    return itertools.chain(*lists)

or

def concatenate(*lists):
    return itertools.chain.from_iterable(lists)
This will return a generator which yields each element of the lists in sequence. If you need it as a list, use list: list(itertools.chain.from_iterable(lists))
If you insist on doing this "by hand", then use extend:
def concatenate(*lists):
    newlist = []
    for l in lists:
        newlist.extend(l)
    return newlist
Actually, don't use extend like that - it's still inefficient, because it has to keep extending the original list. The "right" way (it's still really the wrong way):
def concatenate(*lists):
    lengths = list(map(len, lists))
    newlen = sum(lengths)
    newlist = [None] * newlen
    start = 0
    end = 0
    for l, n in zip(lists, lengths):
        end += n
        newlist[start:end] = l
        start += n
    return newlist
http://ideone.com/Mi3UyL
You'll note that this still ends up doing as many copy operations as there are total slots in the lists. So, this isn't any better than using list(chain.from_iterable(lists)), and is probably worse, because list can make use of optimisations at the C level.
Finally, here's a version using extend (suboptimal) in one line, using reduce:
concatenate = lambda *lists: reduce((lambda a,b: a.extend(b) or a),lists,[])
One way would be this (using reduce) because I currently feel functional:
import operator
from functools import reduce

def concatenate(*lists):
    return reduce(operator.add, lists)
However, a better functional method is given in Marcin's answer:
from itertools import chain

def concatenate(*lists):
    return chain(*lists)
although you might as well use itertools.chain(*iterable_of_lists) directly.
A procedural way:
def concatenate(*lists):
    new_list = []
    for i in lists:
        new_list.extend(i)
    return new_list
A golfed version: j=lambda*x:sum(x,[]) (do not actually use this).
You can use sum() with an empty list as the start argument:
def join_lists(*lists):
    return sum(lists, [])
For example:
>>> join_lists([1, 2, 3], [4, 5, 6])
[1, 2, 3, 4, 5, 6]
Another way:
>>> m = [1, 2, 3]
>>> n = [4, 5, 6]
>>> o = [7, 8, 9]
>>> p = []
>>> for (i, j, k) in (m, n, o):
...     p.append(i)
...     p.append(j)
...     p.append(k)
...
>>> p
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Note that this only works because each input list happens to have exactly three elements.
This seems to work just fine:
def join_lists(*args):
    output = []
    for lst in args:
        output += lst
    return output
It returns a new list with all the items of the previous lists. Is using + not appropriate for this kind of list processing?
Or you could do it by hand: make a variable (here 'z') refer to the first list passed to the join_lists function, copy the items of that list (not the list itself) into a new list, and then add the elements of the other lists to it:
m = [1, 2, 3]
n = [4, 5, 6]
o = [7, 8, 9]
def join_lists(*x):
    z = x[0]
    new_list = list(z)     # the items of z, not z itself
    for item in x:
        if item is not z:  # skip the first list; its items are already there
            new_list += item
    return new_list
then
print(join_lists(m, n, o))
would output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]

A cleaner/shorter way to solve this problem?

This exercise is taken from Google's Python Class:
D. Given a list of numbers, return a list where
all adjacent == elements have been reduced to a single element,
so [1, 2, 2, 3] returns [1, 2, 3]. You may create a new list or
modify the passed in list.
Here's my solution so far:
def remove_adjacent(nums):
    if not nums:
        return nums
    result = [nums[0]]
    for num in nums[1:]:
        if num != result[-1]:
            result.append(num)
    return result
But this looks more like a C program than a Python script, and I have a feeling this can be done much more elegantly.
EDIT
So [1, 2, 2, 3] should give [1, 2, 3] and [1, 2, 3, 3, 2] should give [1, 2, 3, 2]
There is a function in itertools that works here:
import itertools
[key for key,seq in itertools.groupby([1,1,1,2,2,3,4,4])]
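For the sample input this evaluates to:
print([key for key, seq in itertools.groupby([1, 1, 1, 2, 2, 3, 4, 4])])
# [1, 2, 3, 4]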
You can also write a generator:
def remove_adjacent(items):
    it = iter(items)         # iterate the items
    last = next(it)          # get the first one
    yield last               # yield it in any case
    for current in it:
        if current != last:  # if the next item is different, yield it
            yield current
            last = current
        # else: it's a duplicate, do nothing with it

print(list(remove_adjacent([1, 1, 1, 2, 2, 3, 4, 4])))
itertools to the rescue.
import itertools

def remove_adjacent(lst):
    i = iter(lst)
    yield next(i)
    for x, y in zip(lst, i):  # itertools.izip in Python 2
        if x != y:
            yield y

L = [1, 2, 2, 3]
print(list(remove_adjacent(L)))
A solution using a list comprehension, zipping the list with itself shifted by one and keeping each element that differs from its successor. Inefficient, but short and sweet. It has the wart of extending a[1:] with a sentinel so the last element is compared too.
a = [ 1,2,2,2,3,4,4,5,3,3 ]
b = [ i for i,j in zip(a,a[1:] + [None]) if not i == j ]
This works, but I'm not quite happy with it yet because of the +[None] bit to ensure that the last element is also returned...
>>> mylist=[1,2,2,3,3,3,3,4,5,5,5]
>>> [x for x, y in zip(mylist, mylist[1:]+[None]) if x != y]
[1, 2, 3, 4, 5]
The most Pythonic way is probably to go the path of least resistance and use itertools.groupby() as suggested by THC4K and be done with it.
>>> def collapse(data):
...     return list(sorted(set(data)))
...
>>> collapse([1, 2, 2, 3])
[1, 2, 3]
Second attempt after the additional requirement was added:
>>> def remove_adjacent(data):
...     last = None
...     for datum in data:
...         if datum != last:
...             last = datum
...             yield datum
...
>>> list(remove_adjacent([1, 2, 2, 3, 2]))
[1, 2, 3, 2]
You may want to look at itertools. Also, here's a tutorial on Python iterators and generators (pdf).
This is also somewhat functional; it could be written as a one-liner using lambdas but that would just make it more confusing. In Python 3 you'd need to import reduce from functools.
from functools import reduce  # needed in Python 3

def remove_adjacent(nums):
    def maybe_append(l, x):
        return l + ([] if len(l) and l[-1] == x else [x])
    return reduce(maybe_append, nums, [])
