I'm sitting here for almost 5 hours trying to solve the problem and now I'm hoping for your help.
Here is my Python Code:
def powerset3(a):
if (len(a) == 0):
return frozenset({})
else:
s=a.pop()
b=frozenset({})
b|=frozenset({})
b|=frozenset({s})
for subset in powerset3(a):
b|=frozenset({str(subset)})
b|=frozenset({s+subset})
return b
If I run the program with:
print(powerset3(set(['a', 'b'])))
I get following solution
frozenset({'a', 'b', 'ab'})
But I want to have
{frozenset(), frozenset({'a'}), frozenset({'b'}), frozenset({'b', 'a'})}
I don't want to use libraries and it should be recursive!
Thanks for your help
Here's a slightly more readable implementation using itertools, if you don't want to use a lib for the combinations, you can replace the combinations code with its implementation e.g. from https://docs.python.org/2/library/itertools.html#itertools.combinations
def powerset(l):
result = [()]
for i in range(len(l)):
result += itertools.combinations(l, i+1)
return frozenset([frozenset(x) for x in result])
Testing on IPython, with different lengths
In [82]: powerset(['a', 'b'])
Out[82]:
frozenset({frozenset(),
frozenset({'b'}),
frozenset({'a'}),
frozenset({'a', 'b'})})
In [83]: powerset(['x', 'y', 'z'])
Out[83]:
frozenset({frozenset(),
frozenset({'x'}),
frozenset({'x', 'z'}),
frozenset({'y'}),
frozenset({'x', 'y'}),
frozenset({'z'}),
frozenset({'y', 'z'}),
frozenset({'x', 'y', 'z'})})
In [84]: powerset([])
Out[84]: frozenset({frozenset()})
You sort of have the right idea. If a is non-empty, then the powerset of a can be formed by taking some element s from a, and let's called what's left over rest. Then build up the powerset of s from the powerset of rest by adding to it, for each subset in powerset3(rest) both subset itself and subset | frozenset({s}).
That last bit, doing subset | frozenset({s}) instead of string concatenation is half of what's missing with your solution. The other problem is the base case. The powerset of the empty set is not the empty set, is the set of one element containing the empty set.
One more issue with your solution is that you're trying to use frozenset, which is immutable, in mutable ways (e.g. pop(), b |= something, etc.)
Here's a working solution:
from functools import partial
def helper(x, accum, subset):
return accum | frozenset({subset}) | frozenset({frozenset({x}) | subset})
def powerset(xs):
if len(xs) == 0:
return frozenset({frozenset({})})
else:
# this loop is the only way to access elements in frozenset, notice
# it always returns out of the first iteration
for x in xs:
return reduce(partial(helper, x), powerset(xs - frozenset({x})), frozenset({}))
a = frozenset({'a', 'b'})
print(powerset(a))
Related
The list.index(x) function returns the index in the list of the first item whose value is x.
Is there a function, list_func_index(), similar to the index() function that has a function, f(), as a parameter. The function, f() is run on every element, e, of the list until f(e) returns True. Then list_func_index() returns the index of e.
Codewise:
>>> def list_func_index(lst, func):
for i in range(len(lst)):
if func(lst[i]):
return i
raise ValueError('no element making func True')
>>> l = [8,10,4,5,7]
>>> def is_odd(x): return x % 2 != 0
>>> list_func_index(l,is_odd)
3
Is there a more elegant solution? (and a better name for the function)
You could do that in a one-liner using generators:
next(i for i,v in enumerate(l) if is_odd(v))
The nice thing about generators is that they only compute up to the requested amount. So requesting the first two indices is (almost) just as easy:
y = (i for i,v in enumerate(l) if is_odd(v))
x1 = next(y)
x2 = next(y)
Though, expect a StopIteration exception after the last index (that is how generators work). This is also convenient in your "take-first" approach, to know that no such value was found --- the list.index() function would throw ValueError here.
One possibility is the built-in enumerate function:
def index_of_first(lst, pred):
for i,v in enumerate(lst):
if pred(v):
return i
return None
It's typical to refer a function like the one you describe as a "predicate"; it returns true or false for some question. That's why I call it pred in my example.
I also think it would be better form to return None, since that's the real answer to the question. The caller can choose to explode on None, if required.
#Paul's accepted answer is best, but here's a little lateral-thinking variant, mostly for amusement and instruction purposes...:
>>> class X(object):
... def __init__(self, pred): self.pred = pred
... def __eq__(self, other): return self.pred(other)
...
>>> l = [8,10,4,5,7]
>>> def is_odd(x): return x % 2 != 0
...
>>> l.index(X(is_odd))
3
essentially, X's purpose is to change the meaning of "equality" from the normal one to "satisfies this predicate", thereby allowing the use of predicates in all kinds of situations that are defined as checking for equality -- for example, it would also let you code, instead of if any(is_odd(x) for x in l):, the shorter if X(is_odd) in l:, and so forth.
Worth using? Not when a more explicit approach like that taken by #Paul is just as handy (especially when changed to use the new, shiny built-in next function rather than the older, less appropriate .next method, as I suggest in a comment to that answer), but there are other situations where it (or other variants of the idea "tweak the meaning of equality", and maybe other comparators and/or hashing) may be appropriate. Mostly, worth knowing about the idea, to avoid having to invent it from scratch one day;-).
Not one single function, but you can do it pretty easily:
>>> test = lambda c: c == 'x'
>>> data = ['a', 'b', 'c', 'x', 'y', 'z', 'x']
>>> map(test, data).index(True)
3
>>>
If you don't want to evaluate the entire list at once you can use itertools, but it's not as pretty:
>>> from itertools import imap, ifilter
>>> from operator import itemgetter
>>> test = lambda c: c == 'x'
>>> data = ['a', 'b', 'c', 'x', 'y', 'z']
>>> ifilter(itemgetter(1), enumerate(imap(test, data))).next()[0]
3
>>>
Just using a generator expression is probably more readable than itertools though.
Note in Python3, map and filter return lazy iterators and you can just use:
from operator import itemgetter
test = lambda c: c == 'x'
data = ['a', 'b', 'c', 'x', 'y', 'z']
next(filter(itemgetter(1), enumerate(map(test, data))))[0] # 3
A variation on Alex's answer. This avoids having to type X every time you want to use is_odd or whichever predicate
>>> class X(object):
... def __init__(self, pred): self.pred = pred
... def __eq__(self, other): return self.pred(other)
...
>>> L = [8,10,4,5,7]
>>> is_odd = X(lambda x: x%2 != 0)
>>> L.index(is_odd)
3
>>> less_than_six = X(lambda x: x<6)
>>> L.index(less_than_six)
2
you could do this with a list-comprehension:
l = [8,10,4,5,7]
filterl = [a for a in l if a % 2 != 0]
Then filterl will return all members of the list fulfilling the expression a % 2 != 0. I would say a more elegant method...
Intuitive one-liner solution:
i = list(map(lambda value: value > 0, data)).index(True)
Explanation:
we use map function to create a list containing True or False based on if each element in our list pass the condition in the lambda or not.
then we convert the map output to list
then using the index function, we get the index of the first true which is the same index of the first value passing the condition.
This question already has answers here:
How to efficiently compare two unordered lists (not sets)?
(12 answers)
Closed 6 years ago.
Sorry for the simple question, but I'm having a hard time finding the answer.
When I compare 2 lists, I want to know if they are "equal" in that they have the same contents, but in different order.
Ex:
x = ['a', 'b']
y = ['b', 'a']
I want x == y to evaluate to True.
You can simply check whether the multisets with the elements of x and y are equal:
import collections
collections.Counter(x) == collections.Counter(y)
This requires the elements to be hashable; runtime will be in O(n), where n is the size of the lists.
If the elements are also unique, you can also convert to sets (same asymptotic runtime, may be a little bit faster in practice):
set(x) == set(y)
If the elements are not hashable, but sortable, another alternative (runtime in O(n log n)) is
sorted(x) == sorted(y)
If the elements are neither hashable nor sortable you can use the following helper function. Note that it will be quite slow (O(n²)) and should generally not be used outside of the esoteric case of unhashable and unsortable elements.
def equal_ignore_order(a, b):
""" Use only when elements are neither hashable nor sortable! """
unmatched = list(b)
for element in a:
try:
unmatched.remove(element)
except ValueError:
return False
return not unmatched
Determine if 2 lists have the same elements, regardless of order?
Inferring from your example:
x = ['a', 'b']
y = ['b', 'a']
that the elements of the lists won't be repeated (they are unique) as well as hashable (which strings and other certain immutable python objects are), the most direct and computationally efficient answer uses Python's builtin sets, (which are semantically like mathematical sets you may have learned about in school).
set(x) == set(y) # prefer this if elements are hashable
In the case that the elements are hashable, but non-unique, the collections.Counter also works semantically as a multiset, but it is far slower:
from collections import Counter
Counter(x) == Counter(y)
Prefer to use sorted:
sorted(x) == sorted(y)
if the elements are orderable. This would account for non-unique or non-hashable circumstances, but this could be much slower than using sets.
Empirical Experiment
An empirical experiment concludes that one should prefer set, then sorted. Only opt for Counter if you need other things like counts or further usage as a multiset.
First setup:
import timeit
import random
from collections import Counter
data = [str(random.randint(0, 100000)) for i in xrange(100)]
data2 = data[:] # copy the list into a new one
def sets_equal():
return set(data) == set(data2)
def counters_equal():
return Counter(data) == Counter(data2)
def sorted_lists_equal():
return sorted(data) == sorted(data2)
And testing:
>>> min(timeit.repeat(sets_equal))
13.976069927215576
>>> min(timeit.repeat(counters_equal))
73.17287588119507
>>> min(timeit.repeat(sorted_lists_equal))
36.177085876464844
So we see that comparing sets is the fastest solution, and comparing sorted lists is second fastest.
This seems to work, though possibly cumbersome for large lists.
>>> A = [0, 1]
>>> B = [1, 0]
>>> C = [0, 2]
>>> not sum([not i in A for i in B])
True
>>> not sum([not i in A for i in C])
False
>>>
However, if each list must contain all the elements of other then the above code is problematic.
>>> A = [0, 1, 2]
>>> not sum([not i in A for i in B])
True
The problem arises when len(A) != len(B) and, in this example, len(A) > len(B). To avoid this, you can add one more statement.
>>> not sum([not i in A for i in B]) if len(A) == len(B) else False
False
One more thing, I benchmarked my solution with timeit.repeat, under the same conditions used by Aaron Hall in his post. As suspected, the results are disappointing. My method is the last one. set(x) == set(y) it is.
>>> def foocomprehend(): return not sum([not i in data for i in data2])
>>> min(timeit.repeat('fooset()', 'from __main__ import fooset, foocount, foocomprehend'))
25.2893661496
>>> min(timeit.repeat('foosort()', 'from __main__ import fooset, foocount, foocomprehend'))
94.3974742993
>>> min(timeit.repeat('foocomprehend()', 'from __main__ import fooset, foocount, foocomprehend'))
187.224562545
As mentioned in comments above, the general case is a pain. It is fairly easy if all items are hashable or all items are sortable. However I have recently had to try solve the general case. Here is my solution. I realised after posting that this is a duplicate to a solution above that I missed on the first pass. Anyway, if you use slices rather than list.remove() you can compare immutable sequences.
def sequences_contain_same_items(a, b):
for item in a:
try:
i = b.index(item)
except ValueError:
return False
b = b[:i] + b[i+1:]
return not b
Say I have a list, and I want to produce a list of all unique pairs of elements without considering the order. One way to do this is:
mylist = ['W','X','Y','Z']
for i in xrange(len(mylist)):
for j in xrange(i+1,len(mylist)):
print mylist[i],mylist[j]
W X
W Y
W Z
X Y
X Z
Y Z
I want to do this with iterators, I thought of the following, even though it doesn't have brevity:
import copy
it1 = iter(mylist)
for a in it1:
it2 = copy.copy(it1)
for b in it2:
print a,b
But this doesn't even work. What is a more pythonic and efficient way of doing this, with iterators or zip, etc.?
This has already been done and is included in the standard library as of Python 2.6:
import itertools
mylist = ['W', 'X', 'Y', 'Z']
for pair in itertools.combinations(mylist, 2):
print pair # pair is a tuple of 2 elements
Seems pretty Pythonic to me ;-)
Note that even if you're calculating a lot of combinations, the combinations() function returns an iterator so that you can start printing them right away. See the docs.
Also, you are referring to the result as a Cartesian product between the list and itself, but this is not strictly correct: The Cartesian product would have 16 elements (4x4). Your output is a subset of that, namely only the 2-element combinations (with repetition not allowed) of the values of the list.
#Cameron's answer is correct.
I just wanted to point out that
for i in range(len(mylist)):
do_something_to(mylist[i])
is nastily un-Pythonic; if your operation is read-only (does not have to be stored back to the array), do
for i in mylist:
do_something_to(i)
otherwise
mylist = [do_something_to(i) for i in mylist]
I have a list of strings. I have a function that given a string returns 0 or 1. How can I delete all strings in the list for which the function returns 0?
[x for x in lst if fn(x) != 0]
This is a "list comprehension", one of Python's nicest pieces of syntactical sugar that often takes lines of code in other languages and additional variable declarations, etc.
See:
http://docs.python.org/tutorial/datastructures.html#list-comprehensions
I would use a generator expression over a list comprehension to avoid a potentially large, intermediate list.
result = (x for x in l if f(x))
# print it, or something
print list(result)
Like a list comprehension, this will not modify your original list, in place.
edit: see the bottom for the best answer.
If you need to mutate an existing list, for example because you have another reference to it somewhere else, you'll need to actually remove the values from the list.
I'm not aware of any such function in Python, but something like this would work (untested code):
def cull_list(lst, pred):
"""Removes all values from ``lst`` which for which ``pred(v)`` is false."""
def remove_all(v):
"""Remove all instances of ``v`` from ``lst``"""
try:
while True:
lst.remove(v)
except ValueError:
pass
values = set(lst)
for v in values:
if not pred(v):
remove_all(v)
A probably more-efficient alternative that may look a bit too much like C code for some people's taste:
def efficient_cull_list(lst, pred):
end = len(lst)
i = 0
while i < end:
if not pred(lst[i]):
del lst[i]
end -= 1
else:
i += 1
edit...: as Aaron pointed out in the comments, this can be done much more cleanly with something like
def reversed_cull_list(lst, pred):
for i in range(len(lst) - 1, -1, -1):
if not pred(lst[i]):
del lst[i]
...edit
The trick with these routines is that using a function like enumerate, as suggested by (an) other responder(s), will not take into account the fact that elements of the list have been removed. The only way (that I know of) to do that is to just track the index manually instead of allowing python to do the iteration. There's bound to be a speed compromise there, so it may end up being better just to do something like
lst[:] = (v for v in lst if pred(v))
Actually, now that I think of it, this is by far the most sensible way to do an 'in-place' filter on a list. The generator's values are iterated before filling lst's elements with them, so there are no index conflict issues. If you want to make this more explicit just do
lst[:] = [v for v in lst if pred(v)]
I don't think it will make much difference in this case, in terms of efficiency.
Either of these last two approaches will, if I understand correctly how they actually work, make an extra copy of the list, so one of the bona fide in-place solutions mentioned above would be better if you're dealing with some "huge tracts of land."
>>> s = [1, 2, 3, 4, 5, 6]
>>> def f(x):
... if x<=2: return 0
... else: return 1
>>> for n,x in enumerate(s):
... if f(x) == 0: s[n]=None
>>> s=filter(None,s)
>>> s
[3, 4, 5, 6]
With a generator expression:
alist[:] = (item for item in alist if afunction(item))
Functional:
alist[:] = filter(afunction, alist)
or:
import itertools
alist[:] = itertools.ifilter(afunction, alist)
All equivalent.
You can also use a list comprehension:
alist = [item for item in alist if afunction(item)]
An in-place modification:
import collections
indexes_to_delete= collections.deque(
idx
for idx, item in enumerate(alist)
if afunction(item))
while indexes_to_delete:
del alist[indexes_to_delete.pop()]
I want to filter repeated elements in my list
for instance
foo = ['a','b','c','a','b','d','a','d']
I am only interested with:
['a','b','c','d']
What would be the efficient way to do achieve this ?
Cheers
list(set(foo)) if you are using Python 2.5 or greater, but that doesn't maintain order.
Cast foo to a set, if you don't care about element order.
Since there isn't an order-preserving answer with a list comprehension, I propose the following:
>>> temp = set()
>>> [c for c in foo if c not in temp and (temp.add(c) or True)]
['a', 'b', 'c', 'd']
which could also be written as
>>> temp = set()
>>> filter(lambda c: c not in temp and (temp.add(c) or True), foo)
['a', 'b', 'c', 'd']
Depending on how many elements are in foo, you might have faster results through repeated hash lookups instead of repeated iterative searches through a temporary list.
c not in temp verifies that temp does not have an item c; and the or True part forces c to be emitted to the output list when the item is added to the set.
>>> bar = []
>>> for i in foo:
if i not in bar:
bar.append(i)
>>> bar
['a', 'b', 'c', 'd']
this would be the most straightforward way of removing duplicates from the list and preserving the order as much as possible (even though "order" here is inherently wrong concept).
If you care about order a readable way is the following
def filter_unique(a_list):
characters = set()
result = []
for c in a_list:
if not c in characters:
characters.add(c)
result.append(c)
return result
Depending on your requirements of speed, maintanability, space consumption, you could find the above unfitting. In that case, specify your requirements and we can try to do better :-)
If you write a function to do this i would use a generator, it just wants to be used in this case.
def unique(iterable):
yielded = set()
for item in iterable:
if item not in yielded:
yield item
yielded.add(item)
Inspired by Francesco's answer, rather than making our own filter()-type function, let's make the builtin do some work for us:
def unique(a, s=set()):
if a not in s:
s.add(a)
return True
return False
Usage:
uniq = filter(unique, orig)
This may or may not perform faster or slower than an answer that implements all of the work in pure Python. Benchmark and see. Of course, this only works once, but it demonstrates the concept. The ideal solution is, of course, to use a class:
class Unique(set):
def __call__(self, a):
if a not in self:
self.add(a)
return True
return False
Now we can use it as much as we want:
uniq = filter(Unique(), orig)
Once again, we may (or may not) have thrown performance out the window - the gains of using a built-in function may be offset by the overhead of a class. I just though it was an interesting idea.
This is what you want if you need a sorted list at the end:
>>> foo = ['a','b','c','a','b','d','a','d']
>>> bar = sorted(set(foo))
>>> bar
['a', 'b', 'c', 'd']
import numpy as np
np.unique(foo)
You could do a sort of ugly list comprehension hack.
[l[i] for i in range(len(l)) if l.index(l[i]) == i]