Make a copy of a set and exclude one item - python

I'm trying to make a set based on another set, excluding only one item...
(I run a for loop inside another for loop over the same set of objects, and the inner loop should not visit the object the outer loop is currently on.)
Code:
for animal in self.animals:
    self.exclude_animal = set((animal,))
    self.new_animals = set(self.animals)
    self.new_animals.discard(self.exclude_animal) # parameters for a set needs to be a set?
    for other_animal in (self.new_animals):
        if animal.pos[0] == other_animal.pos[0]:
            if animal.pos[1] == other_animal.pos[1]:
                self.animals_to_die.add(animal)
                print other_animal
                print animal
                self.animals_to_die.add(other_animal)
The point is, my print statements show the same object id for both objects, so I know they are the same object. They should not be, because I discard that item from the new_animals set.
Any insight into why this doesn't exclude the item?

set.discard() removes one item from the set, but you pass in a whole set object.
You need to remove the element itself, not another set with the element inside:
self.new_animals.discard(animal)
Demo:
>>> someset = {1, 2, 3}
>>> someset.discard({1})
>>> someset.discard(2)
>>> someset
set([1, 3])
Note how 2 was removed, but 1 remained in the set.
It would be easier to just loop over the set difference here:
for animal in self.animals:
    for other_animal in set(self.animals).difference([animal]):
        if animal.pos == other_animal.pos:
            self.animals_to_die.add(animal)
            print other_animal
            print animal
            self.animals_to_die.add(other_animal)
(I assume here that .pos is a tuple of two values, so you can test for equality directly.)
You don't need to store new_animals on self; a local name suffices, and here even that is not needed.
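As a minimal sketch of the set-difference idea with local names only, assuming the animals are hashable objects with a .pos tuple. The Animal namedtuple and the colliding_animals function are hypothetical stand-ins for the asker's classes, and the code targets Python 3:

```python
from collections import namedtuple

# Hypothetical stand-in for the asker's animal objects.
Animal = namedtuple("Animal", ["name", "pos"])

def colliding_animals(animals):
    """Return every animal that shares a position with another animal."""
    animals = set(animals)
    to_die = set()
    for animal in animals:
        # animals - {animal} builds a new set; the original is not mutated.
        for other in animals - {animal}:
            if animal.pos == other.pos:
                to_die.update((animal, other))
    return to_die

cat = Animal("cat", (0, 0))
dog = Animal("dog", (0, 0))
owl = Animal("owl", (3, 1))
doomed = colliding_animals([cat, dog, owl])
```

Here doomed contains cat and dog but not owl, and nothing is stored on an instance between iterations.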

As you mark both animals to die, you don't need to compare A to B and also B to A (which your current code does). You can ensure you get only unique pairs of animals by using itertools.combinations():
for animal, other_animal in itertools.combinations(self.animals, 2):
    if animal.pos == other_animal.pos:
        self.animals_to_die.update([animal, other_animal])
Just for fun, I'll point out you can even do it as a single expression (though I think it reads better as an explicit loop):
self.animals_to_die.update(sum(((animal, other_animal)
                                for animal, other_animal in itertools.combinations(self.animals, 2)
                                if animal.pos == other_animal.pos), ()))
For clarity, itertools.combinations() gives you all the unique combinations of its input. The second argument controls how many elements are selected each time:
>>> list(itertools.combinations(["A", "B", "C", "D"], 2))
[('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
>>> list(itertools.combinations(["A", "B", "C", "D"], 3))
[('A', 'B', 'C'), ('A', 'B', 'D'), ('A', 'C', 'D'), ('B', 'C', 'D')]
That works well in this case as the code given appears to be symmetric (two animals on the same location mutually annihilate each other). If it had been asymmetric (one kills the other) then you would have to remember to test both ways round on each iteration.
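For the asymmetric case, one option is itertools.permutations(), which yields each pair in both orders so a single loop tests both directions. This is only a sketch: find_kills, the kills predicate, and the toy (name, size, pos) tuples are made up for illustration.

```python
from itertools import permutations

def find_kills(animals, kills):
    """Return (killer, victim) pairs for an asymmetric rule.

    `kills(a, b)` is a hypothetical predicate. It is checked in both
    orders because permutations() yields (a, b) as well as (b, a).
    """
    return [(a, b) for a, b in permutations(animals, 2) if kills(a, b)]

# Toy data: (name, size, position).
animals = [("wolf", 5, (0, 0)), ("hare", 1, (0, 0)), ("deer", 3, (2, 2))]

def bigger_on_same_square(a, b):
    # Toy rule: on the same square, the larger animal kills the smaller one.
    return a[2] == b[2] and a[1] > b[1]

pairs = find_kills(animals, bigger_on_same_square)
```

With combinations() the same rule would need two explicit checks per pair, kills(a, b) and kills(b, a).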

Related

Merge two sorted lists and preserve order

Given two lists e.g.
l1 = ["A","D","B","C"]
l2 = ["X","A","Y","B"]
I want the result to preserve the order given by both input lists i.e.
["X","A","D","Y","B","C"]
The result is not unique, as "D" and "Y" could also be switched (where the order is not determined, lexicographic order should resolve the conflict).
Also, if I had something like
l1 = ["B","A"]
l2 = ["X","A","Y","B"]
the positions of B and A should be treated as interchangeable (no unique order can be constructed, so the conflicting constraints B < A and B > A should be treated as A = B), which would lead to the accepted results
["X","A","Y","B"]
(preferably as it is in deterministic lexicographic order of A and B) or
["X","B","Y","A"]
Put another way, I want the combination of the lists to preserve the order where it is unique; where it is not, the result should be deterministic, e.g. according to lexicographic ordering.
Is there a library for python that accomplishes that or would I have to implement it on my own?
I looked at OrderedDict and OrderedSet but both do not handle the merge as I want.
You might want to use topological sorting. If you don't want to implement the algorithm from scratch, you can use the NetworkX Python package:
from itertools import chain
from networkx import DiGraph, topological_sort
l1 = ["A", "D", "B", "C"]
l2 = ["X", "A", "Y", "B"]
# Build the graph
G = DiGraph(chain(zip(l1[:-1], l1[1:]), zip(l2[:-1], l2[1:])))
# Return a list of nodes in topological sort order
result = list(topological_sort(G))
# result: ['X', 'A', 'Y', 'D', 'B', 'C']
Basically, you build a graph where every directed edge from vertex u to vertex v implies that u comes before v in the ordering. In this specific example, "A" comes before "D", "D" comes before "B" etc:
>>> G.edges
[('A', 'D'), ('A', 'Y'), ('D', 'B'), ('B', 'C'), ('X', 'A'), ('Y', 'B')]
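If pulling in NetworkX is not an option, or ties must be broken lexicographically as the question asks, the same idea can be sketched with the standard library only. This is Kahn's algorithm with a heap for tie-breaking, not the NetworkX implementation; merge_preserving_order is a hypothetical name, and the sketch simply raises an error on conflicting orders rather than treating them as equal.

```python
import heapq
from collections import defaultdict

def merge_preserving_order(*lists):
    """Topologically merge lists, breaking ties in lexicographic order."""
    successors = defaultdict(set)
    indegree = {}
    for lst in lists:
        for item in lst:
            indegree.setdefault(item, 0)
        # Each adjacent pair (before, after) becomes an edge before -> after.
        for before, after in zip(lst, lst[1:]):
            if after not in successors[before]:
                successors[before].add(after)
                indegree[after] += 1
    # The heap always hands back the smallest item that has no pending
    # predecessors, which makes the result deterministic.
    ready = [item for item, degree in indegree.items() if degree == 0]
    heapq.heapify(ready)
    result = []
    while ready:
        item = heapq.heappop(ready)
        result.append(item)
        for nxt in successors[item]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, nxt)
    if len(result) != len(indegree):
        raise ValueError("inputs contain conflicting orderings (a cycle)")
    return result

merged = merge_preserving_order(["A", "D", "B", "C"], ["X", "A", "Y", "B"])
```

For the first example in the question this yields ['X', 'A', 'D', 'Y', 'B', 'C']. The second example (["B", "A"] against ["X", "A", "Y", "B"]) contains a cycle and raises ValueError here; collapsing such conflicts into "A = B" would need extra handling that this sketch does not attempt.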

How to represent multiple level combinations?

The book Introduction to Probability by Blitzstein and Hwang provides an example of combinations using ice cream.
The first level is cone: Waffle or Cake
The second level is flavour: chocolate, vanilla or strawberry
The basic example is 2 * 3 = 6 separate choices.
I can represent each level of choice separately:
from sympy.functions.combinatorial.numbers import nC
from sympy.utilities.iterables import combinations, combinations_with_replacement
cones = combinations('CW', 1)
list(cones)
>>> [('C',), ('W',)]
flavours = combinations('cvs', 1)
list(flavours)
>>> [('c',), ('v',), ('s',)]
# how to get a list representing all choices? (Cc, Cv, Cs, Wc, Wv, Ws)
# how to return a count of the choices, e.g. with nC()?
I was wondering if it is possible to combine the levels with sympy and return a list of each combination and a count of the available combinations?
What you need is cartes:
from sympy.utilities.iterables import cartes
print list(cartes('CW', 'csv'))
# >>> [('C', 'c'), ('C', 's'), ('C', 'v'), ('W', 'c'), ('W', 's'), ('W', 'v')]
print [''.join(x) for x in list(cartes('CW', 'csv'))]
# >>> ['Cc', 'Cs', 'Cv', 'Wc', 'Ws', 'Wv']
print len(list(cartes('CW', 'csv')))
# >>> 6
What you are doing is set multiplication (a Cartesian product), e.g. {A,B}*{1,2} -> {(A,1), (A,2), (B,1), (B,2)}. In Python you do this with itertools.product:
from itertools import product
allChoices = set(product(set('CW'), set('csv')))
allChoicesPretty = set(a+b for a, b in allChoices)
numberOfChoices = len(allChoices)
print(allChoices)
print(allChoicesPretty)
print(numberOfChoices)
output:
{('C', 'v'), ('W', 's'), ('W', 'c'), ('C', 'c'), ('C', 's'), ('W', 'v')}
{'Wv', 'Ws', 'Cs', 'Wc', 'Cc', 'Cv'}
6
Actually you don't need sympy at all for this; furthermore, cartes is just an alias for itertools.product [1].
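If the order of the choices matters, or only the count is needed, a variant with lists may be handy. This is a small sketch for Python 3.8+ (math.prod); the variable names are illustrative.

```python
from itertools import product
from math import prod

cones = ["C", "W"]
flavours = ["c", "v", "s"]

# Lists keep the input order, unlike sets.
choices = ["".join(pair) for pair in product(cones, flavours)]

# The count is just the product of the level sizes, so the
# combinations never have to be generated to count them.
count = prod(len(level) for level in (cones, flavours))
```

Here choices lists the six cone/flavour pairs in input order and count is 6, matching the 2 * 3 from the book example.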
Comments
In a set each element occurs only once, and a set has no order. If you need duplicates, order, or both, replace set with list and {} with []. This can matter in probability, for instance when you pick an object from a bag and put it back. But operations on sets, such as membership tests, are faster. One way to have the "same event" several times while still using sets is to add a tag, such as a number, e.g. A,A,A -> A1,A2,A3. This is a very practical way to think, because it is usually simpler to reason about events (or compute probabilities) with tags and then remove the tags (equivalent to saying "the order doesn't matter") at the end.
This is also related to the fact that in mathematics we can express (interpret, build) everything in set theory [2], which needs only a few deductive elements from proof theory [4]. That is one way to build the foundations of mathematics (ZFC := Zermelo-Fraenkel with Choice). Nearly all mathematical proofs happen inside ZFC, but it is almost never mentioned in the proof.
There are other ways to interpret all of mathematics, such as category theory [3], which is closely related to programming languages. The third possible foundation I know of is homotopy type theory [5,6]. To date we can do fewer things with it because the field is very new, but what we can do is more natural and is incredibly interesting conceptually.
[1] https://github.com/sympy/sympy/blob/da9fdef5e00f40dfd500bfa356c61ce6bad1b559/sympy/utilities/iterables.py#L6
[2] https://en.wikipedia.org/wiki/Set_theory
[3] https://en.wikipedia.org/wiki/Category_theory
[4] https://en.wikipedia.org/wiki/Proof_theory
[5] https://en.wikipedia.org/wiki/Homotopy_type_theory
[6] https://homotopytypetheory.org/book/

Most efficient way to list all oriented cycles given n elements in Python

I have a list of elements which can be quite big (100+ elements): elements = [a, b, c, d, e, f, g...].
and I need to build the list of all possible directed cycles, considering that the sequences
[a,b,c,d,e], [b,c,d,e,a], [c,d,e,a,b], [d,e,a,b,c], [e,a,b,c,d] are considered identical since they are different representations of the same directed cycle. Only the starting point differs.
Also, since direction matters, [a,b,c,d,e] and [e,d,c,b,a] are different.
I am looking for all the oriented cycles of all lengths, from 2 to len(elements). What's the most Pythonic way to do it, leveraging the optimization of the built-in permutations, combinations, etc.?
Maybe I'm missing something, but this seems straightforward to me:
def gen_oriented_cycles(xs):
    from itertools import combinations, permutations
    for length in range(2, len(xs) + 1):
        for pieces in combinations(xs, length):
            first = pieces[0], # 1-tuple
            for rest in permutations(pieces[1:]):
                yield first + rest
Then, e.g.,
for c in gen_oriented_cycles('abcd'):
    print c
displays:
('a', 'b')
('a', 'c')
('a', 'd')
('b', 'c')
('b', 'd')
('c', 'd')
('a', 'b', 'c')
('a', 'c', 'b')
('a', 'b', 'd')
('a', 'd', 'b')
('a', 'c', 'd')
('a', 'd', 'c')
('b', 'c', 'd')
('b', 'd', 'c')
('a', 'b', 'c', 'd')
('a', 'b', 'd', 'c')
('a', 'c', 'b', 'd')
('a', 'c', 'd', 'b')
('a', 'd', 'b', 'c')
('a', 'd', 'c', 'b')
Is that missing some essential property you're looking for?
EDIT
I thought it might be missing this part of your criteria:
Also, since direction matters, [a,b,c,d,e] and [e,d,c,b,a] are different.
but on second thought I think it meets that requirement, since [e,d,c,b,a] is the same as [a,e,d,c,b] to you.
Is there any good reason to have a canonical representation in memory of this? It's going to be huge, and possibly whatever use case you have for this may have a better way of dealing with it.
It looks like, for your source material, you would use any combination of X elements, not necessarily even homogeneous ones (i.e. you would have (a,e,g,x,f) etc.)? Then I would do this as a nested loop. The outer loop selects a length and the subsets of that length from the entire list. The inner loop constructs the permutations of each subset and throws out those that match a cycle already seen. I would use a dictionary keyed by a frozenset of the items (for immutability and fast lookup), with each value being a list of already-detected cycles. It's going to be slow and long-running no matter how you do it, but this is one way.
First, you need a way to determine if two tuples (or lists) represent the same cycle. You can do that like this:
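def match_cycle(test_cycle, ref_cycle):
    try:
        # Position in ref_cycle where test_cycle's first element occurs
        refi = ref_cycle.index(test_cycle[0])
        partlen = len(ref_cycle) - refi
        # Compare with != rather than subtraction so that non-numeric
        # elements (e.g. strings) work too
        return not (any(test_cycle[i] != ref_cycle[i+refi] for i in range(partlen)) or
                    any(test_cycle[i+partlen] != ref_cycle[i] for i in range(refi)))
    except (ValueError, IndexError):
        # First element not found, or the cycles differ in length
        return False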
Then, the rest.
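from collections import defaultdict
from itertools import combinations, permutations
def all_cycles(elements):
    # Lengths 2 .. len(elements) - 1; use len(elements) + 1 to include full-length cycles
    for tuple_length in range(2, len(elements)):
        testdict = defaultdict(list)
        for outer in combinations(elements, tuple_length):
            for inner in permutations(outer):
                testset = frozenset(inner)
                if not any(match_cycle(inner, x) for x in testdict[testset]):
                    testdict[testset].append(inner)
                    yield inner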
This produced 60 items for elements of length 5, which seems about right and looked OK from inspection. Note that this is going to be exponential though: length(5) took 1.34 ms/loop, length(6) took 22.1 ms, length(7) took 703 ms, and length(8) took 33.5 s. length(100) might finish before you retire, but I wouldn't bet on it.
There might be a better way, and probably is, but in general the number of subsets of 100 elements is pretty large, even when reduced somewhat for cycles. So this is probably not the right way to approach whatever problem you are trying to solve.
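To put a number on "pretty large": choosing k of n elements and arranging them around a directed cycle gives C(n, k) * (k - 1)! cycles of length k. The sketch below sums that over the lengths (Python 3.8+ for math.comb); count_oriented_cycles and its max_length parameter are illustrative names, not part of any answer above.

```python
from math import comb, factorial

def count_oriented_cycles(n, max_length=None):
    """Count oriented cycles of length 2..max_length over n distinct elements."""
    if max_length is None:
        max_length = n
    # Choose k elements, then fix one as the start and permute the rest:
    # that gives (k - 1)! distinct directed cycles per subset.
    return sum(comb(n, k) * factorial(k - 1) for k in range(2, max_length + 1))

small = count_oriented_cycles(4)
partial = count_oriented_cycles(5, max_length=4)
full = count_oriented_cycles(5)
```

This gives 20 for four elements, matching the 20 tuples listed for 'abcd' earlier. For five elements it gives 60 when lengths stop at 4, which is what range(2, len(elements)) covers, and 84 when the full-length cycles are included. For n = 100 the total exceeds 99!, a number with more than 150 digits, so enumerating everything is out of reach whatever the implementation.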
This may work:
import itertools
import collections
class Cycle(object):
    def __init__(self, cycle):
        self.all_possible = self.get_all_possible(cycle)
        self.canonical = self.get_canonical(self.all_possible)
    def __eq__(self, other):
        return self.canonical == other.canonical
    def __hash__(self):
        return hash(self.canonical)
    def get_all_possible(self, cycle):
        output = []
        cycle = collections.deque(cycle)
        for i in xrange(len(cycle)):
            cycle.rotate(1)
            output.append(list(cycle))
        return output
    def get_canonical(self, cycles):
        return min(map(tuple, cycles), key=lambda item: hash(item))
    def __repr__(self):
        return 'Cycle({0})'.format(self.canonical)
def list_cycles(elements):
    output = set()
    for i in xrange(2, len(elements) + 1):
        output.update(set(map(Cycle, itertools.permutations(elements, i))))
    return list(output)
def display(seq):
    for cycle in seq:
        print cycle.canonical
        print '\n'.join(' ' + str(item) for item in cycle.all_possible)
def main():
    elements = 'abcdefghijkl'
    final = list_cycles(elements)
    display(final)
if __name__ == '__main__':
    main()
It creates a class to represent any given cycle, which will be hashed and checked for equality against a canonical representation of the cycle. This lets a Cycle object be placed in a set, which will automatically filter out any duplicates. Unfortunately, it's not going to be highly efficient, since it generates every single possible permutation first.
This should give you the right answer with cycles with length 2 to len(elements). Might not be the fastest way to do it though. I used qarma's hint of rotating it to always start with the smallest element.
from itertools import permutations
def rotate_min(l):
    '''Rotates the list so that the smallest element comes first '''
    minIndex = l.index(min(l))
    rotatedTuple = l[minIndex:] + l[:minIndex]
    return rotatedTuple
def getCycles(elements):
    elementIndicies = tuple(range(len(elements))) # a tuple is hashable, so it works with set
    cyclesIndices = set()
    cycles = []
    for length in range(2, len(elements)+1):
        allPermutation = permutations(elementIndicies, length)
        for perm in allPermutation:
            rotated_perm = rotate_min(perm)
            if rotated_perm not in cyclesIndices:
                # If the cycle of indices is not in the set, add it.
                cyclesIndices.add(rotated_perm)
                # Convert indices to the respective elements and append
                cycles.append([elements[i] for i in rotated_perm])
    return cycles
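The canonical-rotation trick is what makes the deduplication work, so here is a tiny check of it in isolation. This is a sketch assuming the elements are distinct and comparable; canonical is a hypothetical helper equivalent in spirit to rotate_min.

```python
def canonical(cycle):
    """Rotate a cycle so that its smallest element comes first."""
    cycle = tuple(cycle)
    start = cycle.index(min(cycle))
    return cycle[start:] + cycle[:start]

rotations = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
forms = {canonical(c) for c in rotations}

forward = canonical(("a", "b", "c"))
backward = canonical(("c", "b", "a"))
```

All three rotations collapse to a single canonical form, while the reversed cycle keeps a different one, which is the behaviour the question asks for: the starting point is ignored but the direction is not.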

Generate combinations of elements from multiple lists

I'm making a function that takes a variable number of lists as input (i.e., an arbitrary argument list).
I need to compare each element from each list to each element of all other lists, but I couldn't find any way to approach this.
Depending on your goal, you can make use of some of the itertools utilities. For example, you can use itertools.product on *args:
from itertools import product
for comb in product(*args):
    if len(set(comb)) < len(comb):
        # there are equal values....
        pass
But currently it's not very clear from your question what you want to achieve. If I didn't understand you correctly, you can try to state the question in a more specific way.
I think @LevLeitsky's answer is the best way to do a loop over the items from your variable number of lists. However, if the purpose of the loop is just to find common elements between pairs of items from the lists, I'd do it a bit differently.
Here's an approach that finds the common elements between each pair of lists:
import itertools
def func(*args):
    sets = [set(l) for l in args]
    for a, b in itertools.combinations(sets, 2):
        common = a & b # set intersection
        # do stuff with the set of common elements...
I'm not sure what you need to do with the common elements, so I'll leave it there.
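One possible way to finish that idea is to return the intersections keyed by which two lists they came from. This is only a sketch; common_elements and the index-pair keys are choices made for the example, not something the question specifies.

```python
from itertools import combinations

def common_elements(*lists):
    """Map each pair of list positions (i, j) to the elements both lists share."""
    sets = [set(lst) for lst in lists]
    return {
        (i, j): sets[i] & sets[j]
        for i, j in combinations(range(len(sets)), 2)
    }

shared = common_elements([1, 2, 3], [2, 3, 4], [5])
```

Each list is converted to a set once, so every pairwise comparison is a set intersection rather than a nested loop over elements.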
The itertools module provides a lot of useful tools just for such tasks. You can adapt the following example to your task by integrating it into your specific comparison logic.
Note that the following assumes a commutative function. That is, about half of the tuples are omitted for reasons of symmetry.
Example:
import itertools
def generate_pairs(*args):
    # assuming function is commutative
    for i, l in enumerate(args, 1):
        for x, y in itertools.product(l, itertools.chain(*args[i:])):
            yield (x, y)
# you can use lists instead of strings as well
for x, y in generate_pairs("ab", "cd", "ef"):
    print (x, y)
# e.g., apply your comparison logic
print any(x == y for x, y in generate_pairs("ab", "cd", "ef"))
print all(x != y for x, y in generate_pairs("ab", "cd", "ef"))
Output:
$ python test.py
('a', 'c')
('a', 'd')
('a', 'e')
('a', 'f')
('b', 'c')
('b', 'd')
('b', 'e')
('b', 'f')
('c', 'e')
('c', 'f')
('d', 'e')
('d', 'f')
False
True
If you want the keyword arguments as a dictionary:
def kw(**kwargs):
    for key, value in kwargs.items():
        print key, value
If you want all the positional arguments as a tuple:
def arg(*args):
    for item in args:
        print item
You can use both:
def using_both(*args, **kwargs):
    kw(**kwargs)  # unpack again, since kw() accepts only keyword arguments
    arg(*args)    # unpack so that each positional argument is printed separately
Call it like this:
using_both([1,2,3,4,5],a=32,b=55)
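The detail that is easy to miss when forwarding arguments is the second unpacking: inside the function, args is a tuple and kwargs is a dict, so they have to be passed on with * and ** again. A minimal Python 3 sketch, with describe and forward as hypothetical names:

```python
def describe(*args, **kwargs):
    """Return the positional and keyword arguments that were received."""
    return list(args), dict(kwargs)

def forward(*args, **kwargs):
    # Unpack again with * and ** so describe() sees the original arguments
    # instead of one tuple and one dict.
    return describe(*args, **kwargs)

result = forward([1, 2, 3, 4, 5], a=32, b=55)
```

Here result holds one positional argument (the list) and two keyword arguments, exactly as they were passed to forward().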

Is there a 'multimap' implementation in Python?

I am new to Python, and I am familiar with implementations of Multimaps in other languages. Does Python have such a data structure built-in, or available in a commonly-used library?
To illustrate what I mean by "multimap":
a = multidict()
a[1] = 'a'
a[1] = 'b'
a[2] = 'c'
print(a[1]) # prints: ['a', 'b']
print(a[2]) # prints: ['c']
Such a thing is not present in the standard library. You can use a defaultdict though:
>>> from collections import defaultdict
>>> md = defaultdict(list)
>>> md[1].append('a')
>>> md[1].append('b')
>>> md[2].append('c')
>>> md[1]
['a', 'b']
>>> md[2]
['c']
(Instead of list you may want to use set, in which case you'd call .add instead of .append.)
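A short sketch of the set variant, which also shows one thing to watch for with defaultdict: indexing a missing key creates an empty entry, whereas .get() does not.

```python
from collections import defaultdict

md = defaultdict(set)
md[1].add('a')
md[1].add('b')
md[1].add('a')   # duplicate value, ignored by the set
md[2].add('c')

# .get() looks a key up without creating an empty entry for it.
missing = md.get(3)
keys_after_get = sorted(md)

# Plain indexing, by contrast, inserts an empty set under the new key.
created = md[4]
keys_after_index = sorted(md)
```

Whether duplicates under one key should be kept (list) or collapsed (set) depends on what the multimap is used for.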
As an aside: look at these two lines you wrote:
a[1] = 'a'
a[1] = 'b'
This seems to indicate that you want the expression a[1] to be equal to two distinct values. This is not possible with dictionaries because their keys are unique and each of them is associated with a single value. What you can do, however, is extract all values inside the list associated with a given key, one by one. You can use iter followed by successive calls to next for that. Or you can just use two loops:
>>> for k, v in md.items():
...     for w in v:
...         print("md[%d] = '%s'" % (k, w))
...
md[1] = 'a'
md[1] = 'b'
md[2] = 'c'
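The iter/next alternative mentioned above could look like the following sketch, which rebuilds a small md so that it runs on its own:

```python
from collections import defaultdict

md = defaultdict(list)
md[1].append('a')
md[1].append('b')

values = iter(md[1])
first = next(values)         # first value stored under key 1
second = next(values)        # second value stored under key 1
third = next(values, None)   # iterator exhausted: the default is returned
```

Passing a default to next() avoids the StopIteration that would otherwise be raised once the values run out.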
Just for future visitors: there is now a Python implementation of a multimap. It's available via PyPI.
Stephan202 has the right answer, use defaultdict. But if you want something with the interface of C++ STL multimap and much worse performance, you can do this:
multimap = []
multimap.append( (3,'a') )
multimap.append( (2,'x') )
multimap.append( (3,'b') )
multimap.sort()
Now when you iterate through multimap, you'll get pairs like you would in a std::multimap. Unfortunately, that means your loop code will start to look as ugly as C++.
def multimap_iter(multimap,minkey,maxkey=None):
    maxkey = minkey if (maxkey is None) else maxkey
    for k,v in multimap:
        if k<minkey: continue
        if k>maxkey: break
        yield k,v
# this will print 'a','b'
for k,v in multimap_iter(multimap,3,3):
    print v
In summary, defaultdict is really cool and leverages the power of python and you should use it.
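If the sorted-list layout is kept anyway, the linear scan can be replaced with a binary search from the standard bisect module. This is a swapped-in technique, not part of the answer above: values_for is a hypothetical helper, and the sketch assumes a parallel list of keys is kept in sync with the sorted pairs.

```python
from bisect import bisect_left, bisect_right

pairs = sorted([(3, 'a'), (2, 'x'), (3, 'b')])
keys = [k for k, _ in pairs]   # parallel list of keys, same order as pairs

def values_for(key):
    """Return all values stored under key, located by binary search."""
    lo = bisect_left(keys, key)    # first position holding key
    hi = bisect_right(keys, key)   # one past the last position holding key
    return [v for _, v in pairs[lo:hi]]

threes = values_for(3)
```

The lookup itself is logarithmic, but every insertion still has to keep both lists sorted, so defaultdict remains the simpler choice for most uses.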
You can take a list of tuples and then sort it, as if it were a multimap.
listAsMultimap=[]
Let's append some elements (tuples):
listAsMultimap.append((1,'a'))
listAsMultimap.append((2,'c'))
listAsMultimap.append((3,'d'))
listAsMultimap.append((2,'b'))
listAsMultimap.append((5,'e'))
listAsMultimap.append((4,'d'))
Now sort it.
listAsMultimap=sorted(listAsMultimap)
After printing it you will get:
[(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd'), (4, 'd'), (5, 'e')]
That means it is working as a Multimap!
Please note that, as in a multimap, the values are also sorted in ascending order when the keys are the same (for key=2, 'b' comes before 'c' although we didn't append them in this order).
If you want to get them in descending order just change the sorted() function like this:
listAsMultimap=sorted(listAsMultimap,reverse=True)
And after you will get output like this:
[(5, 'e'), (4, 'd'), (3, 'd'), (2, 'c'), (2, 'b'), (1, 'a')]
Similarly here values are in descending order if the keys are the same.
The standard way to write this in Python is with a dict whose elements are each a list or set. As stephan202 says, you can somewhat automate this with a defaultdict, but you don't have to.
In other words I would translate your code to
a = dict()
a[1] = ['a', 'b']
a[2] = ['c']
print(a[1]) # prints: ['a', 'b']
print(a[2]) # prints: ['c']
Or subclass dict:
class Multimap(dict):
    def __setitem__(self, key, value):
        if key not in self:
            dict.__setitem__(self, key, [value]) # call super method to avoid recursion
        else:
            self[key].append(value)
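A Python 3 sketch of the same subclass, with a usage check that mirrors the multidict example from the question:

```python
class Multimap(dict):
    """dict subclass in which assignment appends to a list of values."""

    def __setitem__(self, key, value):
        if key in self:
            self[key].append(value)
        else:
            # Go through the parent implementation to avoid recursing
            # into this method.
            super().__setitem__(key, [value])

a = Multimap()
a[1] = 'a'
a[1] = 'b'
a[2] = 'c'
```

After these assignments a[1] holds both 'a' and 'b'. One caveat with this approach: dict methods such as update() and the constructor do not go through an overridden __setitem__, so values added that way are stored as-is rather than wrapped in a list.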
There is no multi-map in the Python standard libs currently.
WebOb has a MultiDict class used to represent HTML form values, and it is used by a few Python Web frameworks, so the implementation is battle tested.
Werkzeug also has a MultiDict class, and for the same reason.
