Checking if a set of tuples contains items from another set - python

Let's say I have a set of tuples like this:
foo = {('A', 'B'), ('C', 'D'), ('B', 'C'), ('A', 'C')}
var = {'A', 'C', 'B'}
I want to check whether every item in var appears somewhere in the set of tuples, returning True if so and False otherwise.
I tried this, but have had no luck so far:
all((x for x in var) in (a,b) for (a,b) in foo)
Desired output : True
Actual output : False
However if:
var = {'A','C','D'}
I want it to return False. The logic is to check whether the strings all 'know' each other.
To explain, take my last var:
A is paired with C and C is paired with D, but D is not paired with A.
For my first var:
A is paired with B, B is paired with C, and C is paired with A, so everyone 'knows' each other.

Generate all the pairs you expect to be present and see if they're there with a subset check:
from itertools import combinations
def _norm(it):
    return {tuple(sorted(t)) for t in it}
def set_contains(foo, var):
    return _norm(combinations(var, 2)) <= _norm(foo)
print(set_contains({('A', 'B'), ('C', 'D'), ('B', 'C'), ('A', 'C')},
                   {'A', 'C', 'B'}))  # True
print(set_contains({('A', 'B'), ('C', 'D'), ('B', 'C'), ('A', 'C')},
                   {'A', 'C', 'D'}))  # False
It may be possible to reduce the amount of sorting, depending on the order in which combinations emits its tuples (I'm not 100% sure what to make of the docs). If you reuse either foo or var several times, you can also normalize that part just once beforehand.
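If foo is reused across many checks, its pairs can be normalized once up front. A minimal sketch, assuming only unordered pairs matter; normalize_pairs and all_know_each_other are names made up for this illustration, and frozenset stands in for the sorted-tuple normalization used above:

```python
from itertools import combinations

def normalize_pairs(pairs):
    """Turn each pair into a frozenset so ('A', 'B') and ('B', 'A') compare equal."""
    return {frozenset(p) for p in pairs}

def all_know_each_other(known, var):
    """known: pre-normalized pairs; var: items that must all be pairwise connected."""
    return all(frozenset(c) in known for c in combinations(var, 2))

foo = {('A', 'B'), ('C', 'D'), ('B', 'C'), ('A', 'C')}
known = normalize_pairs(foo)  # done once, reused for every check below

print(all_know_each_other(known, {'A', 'C', 'B'}))  # True
print(all_know_each_other(known, {'A', 'C', 'D'}))  # False
```

With frozensets on both sides, no sorting is needed at all, at the cost of building one small frozenset per candidate pair.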

Try this:
foo = {('A', 'B'), ('C', 'D'), ('B', 'C'), ('A', 'C')}
var = {'A', 'C', 'B'}
for elem in var:
    if any(elem in tuples for tuples in foo):
        print(True)
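The loop above prints once per element. A hedged sketch of collapsing it into a single boolean follows; appears_somewhere is a hypothetical name. Note that this tests only whether each item occurs in some tuple, not whether every pair of items occurs together, so it also returns True for the second var in the question.

```python
def appears_somewhere(foo, var):
    """True if every item of var occurs in at least one tuple of foo."""
    return all(any(elem in pair for pair in foo) for elem in var)

foo = {('A', 'B'), ('C', 'D'), ('B', 'C'), ('A', 'C')}
print(appears_somewhere(foo, {'A', 'C', 'B'}))  # True
print(appears_somewhere(foo, {'A', 'C', 'D'}))  # True: membership only, not pairwise
print(appears_somewhere(foo, {'A', 'Z'}))       # False
```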

This is not as 'compact' as the others but works the same.
for x in var:
    for y in foo:
        if x in y:
            print('Found %s in %s' % (x, y))
        else:
            print('%s not in %s' % (x, y))
B not in ('C', 'D')
B not in ('A', 'C')
Found B in ('A', 'B')
Found B in ('B', 'C')
A not in ('C', 'D')
Found A in ('A', 'C')
Found A in ('A', 'B')
A not in ('B', 'C')
Found C in ('C', 'D')
Found C in ('A', 'C')
C not in ('A', 'B')
Found C in ('B', 'C')

Related

How can I generate all unique nested 2-tuples (nested pairings) of a set of n objects in Python?

By nested 2-tuples, I mean something like this: ((a,b),(c,(d,e))) where all tuples have two elements. I don't need different orderings of the elements, just the different ways of putting parentheses around them. For items = [a, b, c, d], there are 5 unique pairings, which are:
(((a,b),c),d)
((a,(b,c)),d)
(a,((b,c),d))
(a,(b,(c,d)))
((a,b),(c,d))
In a perfect world I'd also like to have control over the maximum depth of the returned tuples, so that if I generated all pairings of items = [a, b, c, d] with max_depth=2, it would only return ((a,b),(c,d)).
This problem turned up because I wanted to find a way to generate the results of addition on non-commutative, non-associative numbers. If a+b doesn't equal b+a, and a+(b+c) doesn't equal (a+b)+c, what are all the possible sums of a, b, and c?
I have made a function that generates all pairings, but it also returns duplicates.
import itertools
def all_pairings(items):
    if len(items) == 2:
        yield (*items,)
    else:
        for i, pair in enumerate(itertools.pairwise(items)):
            for pairing in all_pairings(items[:i] + [pair] + items[i+2:]):
                yield pairing
For example, it returns ((a,b),(c,d)) twice for items=[a, b, c, d], since it pairs up (a,b) first in one case and (c,d) first in the second case.
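The duplication can be observed directly by counting results. This is a sketch for illustration only: it swaps itertools.pairwise for zip(items, items[1:]), which yields the same adjacent pairs and also runs on Python versions before 3.10, and it uses collections.Counter to tally repeats.

```python
from collections import Counter

def all_pairings(items):
    # Same recursion as above; zip(items, items[1:]) stands in for itertools.pairwise.
    if len(items) == 2:
        yield (*items,)
    else:
        for i, pair in enumerate(zip(items, items[1:])):
            yield from all_pairings(items[:i] + [pair] + items[i + 2:])

counts = Counter(all_pairings(['a', 'b', 'c', 'd']))
print(sum(counts.values()))  # 6 results in total
print(len(counts))           # 5 distinct pairings
print([p for p, n in counts.items() if n > 1])  # [(('a', 'b'), ('c', 'd'))]
```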
Returning duplicates becomes a bigger and bigger problem for larger numbers of items. With duplicates, the number of pairings grows factorially, and without duplicates it grows exponentially, according to the Catalan Numbers (https://oeis.org/A000108).
n     With duplicates: (n-1)!     Without duplicates: (2(n-1))!/(n!(n-1)!)
1     1                           1
2     1                           1
3     2                           2
4     6                           5
5     24                          14
6     120                         42
7     720                         132
8     5040                        429
9     40320                       1430
10    362880                      4862
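The two counts can be reproduced from the formulas with the standard library. A small sketch (the helper names are made up here, and math.comb needs Python 3.8+):

```python
from math import comb, factorial

def with_duplicates(n):
    """(n-1)!: number of results when duplicates are kept."""
    return factorial(n - 1)

def without_duplicates(n):
    """Catalan number C(n-1) = (2(n-1))! / (n! (n-1)!)."""
    return comb(2 * (n - 1), n - 1) // n

for n in range(1, 11):
    print(n, with_duplicates(n), without_duplicates(n))
```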
Because of this, I have been trying to come up with an algorithm that doesn't need to search through all the possibilities, only the unique ones. Again, it would also be nice to have control over the maximum depth, but that could probably be added to an existing algorithm. So far I've been unsuccessful in coming up with an approach, and I also haven't found any resources that cover this specific problem. I'd appreciate any help or links to helpful resources.
Using a recursive generator:
items = ['a', 'b', 'c', 'd']
def split(l):
    if len(l) == 1:
        yield l[0]
    for i in range(1, len(l)):
        for a in split(l[:i]):
            for b in split(l[i:]):
                yield (a, b)
list(split(items))
Output:
[('a', ('b', ('c', 'd'))),
('a', (('b', 'c'), 'd')),
(('a', 'b'), ('c', 'd')),
(('a', ('b', 'c')), 'd'),
((('a', 'b'), 'c'), 'd')]
Check of uniqueness:
assert len(list(split(list(range(10))))) == 4862
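The assertion above checks the count for 10 items. A slightly broader sketch, under the assumption that the items are distinct and hashable, also confirms that no pairing is produced twice and compares the count against the Catalan formula for several sizes:

```python
from math import comb

def split(l):
    # Recursive generator from above: choose a split point, then pair every
    # bracketing of the left part with every bracketing of the right part.
    if len(l) == 1:
        yield l[0]
    for i in range(1, len(l)):
        for a in split(l[:i]):
            for b in split(l[i:]):
                yield (a, b)

for n in range(1, 9):
    results = list(split(list(range(n))))
    assert len(results) == len(set(results))               # no duplicates
    assert len(results) == comb(2 * (n - 1), n - 1) // n   # Catalan number C(n-1)
print("all sizes check out")
```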
Reversed order of the items:
items = ['a', 'b', 'c', 'd']
def split(l):
    if len(l) == 1:
        yield l[0]
    for i in range(len(l)-1, 0, -1):
        for a in split(l[:i]):
            for b in split(l[i:]):
                yield (a, b)
list(split(items))
[((('a', 'b'), 'c'), 'd'),
(('a', ('b', 'c')), 'd'),
(('a', 'b'), ('c', 'd')),
('a', (('b', 'c'), 'd')),
('a', ('b', ('c', 'd')))]
With maxdepth:
items = ['a', 'b', 'c', 'd']
def split(l, maxdepth=None):
    if len(l) == 1:
        yield l[0]
    elif maxdepth is not None and maxdepth <= 0:
        yield tuple(l)
    else:
        for i in range(1, len(l)):
            for a in split(l[:i], maxdepth=maxdepth and maxdepth-1):
                for b in split(l[i:], maxdepth=maxdepth and maxdepth-1):
                    yield (a, b)
list(split(items))
# or
list(split(items, maxdepth=3))
# or
list(split(items, maxdepth=2))
[('a', ('b', ('c', 'd'))),
('a', (('b', 'c'), 'd')),
(('a', 'b'), ('c', 'd')),
(('a', ('b', 'c')), 'd'),
((('a', 'b'), 'c'), 'd')]
list(split(items, maxdepth=1))
[('a', ('b', 'c', 'd')),
(('a', 'b'), ('c', 'd')),
(('a', 'b', 'c'), 'd')]
list(split(items, maxdepth=0))
[('a', 'b', 'c', 'd')]
Full credit to mozway for the algorithm. My original idea was to represent the pairing in reverse Polish notation, which would not have lent itself to the following optimizations:
First, we replace the two nested loops:
for a in split(l[:i]):
    for b in split(l[i:]):
        yield (a, b)
with itertools.product, which will itself cache the results of the inner split(...) call, as well as produce the pairing in internal C code, which will run much faster.
yield from product(split(l[:i]), split(l[i:]))
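Put together, the generator with this first change alone might look like the following sketch. Laziness is kept at the outer level, while product reads each inner generator to the end before it yields its first pair.

```python
from itertools import product

def split(l):
    if len(l) == 1:
        yield l[0]
    for i in range(1, len(l)):
        # product pairs every left bracketing with every right bracketing,
        # in the same order as the two nested for-loops.
        yield from product(split(l[:i]), split(l[i:]))

print(list(split(['a', 'b', 'c'])))
# [('a', ('b', 'c')), (('a', 'b'), 'c')]
```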
Next, we cache the results of the previous split(...) calls. To do this we must sacrifice the laziness of generators, as well as ensure that our function parameters are hashable. Explicitly, this means creating a wrapper that casts the input list to a tuple, and to modify the function body to return lists instead of yielding.
def split(l):
    return _split(tuple(l))
def _split(l):
    if len(l) == 1:
        return l[:1]
    res = []
    for i in range(1, len(l)):
        res.extend(product(_split(l[:i]), _split(l[i:])))
    return res
We then decorate the function with functools.cache, to perform the caching. So putting it all together:
from itertools import product
from functools import cache
def split(l):
    return _split(tuple(l))
@cache
def _split(l):
    if len(l) == 1:
        return l[:1]
    res = []
    for i in range(1, len(l)):
        res.extend(product(_split(l[:i]), _split(l[i:])))
    return res
Testing with the following input:
test = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n']
produces the following timings:
Original: 5.922573089599609
Revised: 0.08888077735900879
I also verified that the results matched the original exactly, order and all.
Again, full credit to mozway for the algorithm. I've just applied a few optimizations to speed it up a bit.
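The timings above depend on the machine. A rough way to reproduce the comparison is sketched below; split_original and split_cached are names chosen for this illustration, a shorter input is used to keep the run quick, and no particular numbers are implied.

```python
import time
from functools import cache  # Python 3.9+
from itertools import product

def split_original(l):
    if len(l) == 1:
        yield l[0]
    for i in range(1, len(l)):
        for a in split_original(l[:i]):
            for b in split_original(l[i:]):
                yield (a, b)

def split_cached(l):
    return _split(tuple(l))

@cache
def _split(l):
    if len(l) == 1:
        return l[:1]
    res = []
    for i in range(1, len(l)):
        res.extend(product(_split(l[:i]), _split(l[i:])))
    return res

test = list('abcdefghij')

start = time.perf_counter()
original = list(split_original(test))
print("Original:", time.perf_counter() - start)

start = time.perf_counter()
revised = split_cached(test)
print("Revised:", time.perf_counter() - start)

assert original == revised  # same pairings, same order
```

Because _split hands back the cached list object itself, callers should treat the result as read-only (or copy it) so that later calls are not affected.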

Function does not store all the output

I have the following list,
p=[list(['a', 'b', 'c']), list(['d', 'e'])]
I would like to generate all subsets of size 2 of each element and list them, which would give the following output:
[[('a', 'b'), ('a', 'c'), ('b', 'c')],[('d', 'e')]]
To achieve this I wrote the following function,
def x(m,n):
    for i in m:
        z=list(itertools.combinations(i, n))
    return(z)
Yet when I call it, i.e. x(p, 2), I only get the last element:
[('d', 'e')]
I wonder what am I doing wrong?
It is because you are overwriting z on each iteration instead of appending to it:
import itertools
def x(m, n):
    z = []
    for i in m:
        z.append(list(itertools.combinations(i, n)))
    return z
yields:
[[('a', 'b'), ('a', 'c'), ('b', 'c')], [('d', 'e')]]
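The same result can also be written as a list comprehension. A sketch, with pairs_per_group as a hypothetical name for the function:

```python
from itertools import combinations

def pairs_per_group(groups, n):
    """Return, for each inner list, the list of its n-element combinations."""
    return [list(combinations(group, n)) for group in groups]

p = [['a', 'b', 'c'], ['d', 'e']]
print(pairs_per_group(p, 2))
# [[('a', 'b'), ('a', 'c'), ('b', 'c')], [('d', 'e')]]
```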

How to convert a Python multilevel dictionary into tuples?

I have a multi level dictionary, example below, which needs to be converted into tuples in reverse order i.e, the innermost elements should be used to create tuple first.
{a: {b:c, d:{e:f, g:h, i:{j:['a','b']}}}}
Output should be something like this:
[(j,['a','b']), (i,j), (g,h), (e,f), (d,e), (d,g), (d,i), (b,c), (a,b), (a,d)]
There you go, this will produce what you want (also tested):
def create_tuple(d):
    def create_tuple_rec(d, arr):
        for k in d:
            if type(d[k]) is not dict:
                arr.append((k, d[k]))
            else:
                for subk in d[k]:
                    arr.append((k, subk))
                create_tuple_rec(d[k], arr)
        return arr
    return create_tuple_rec(d, [])
# Running this
d = {'a': {'b':'c', 'd':{'e':'f', 'g':'h', 'i':{'j':['a','b']}}}}
print(create_tuple(d))
# Will print (tuple order follows dict iteration order, so it can differ between Python versions):
[('a', 'b'), ('a', 'd'), ('b', 'c'), ('d', 'i'), ('d', 'e'), ('d', 'g'), ('i', 'j'), ('j', ['a', 'b']), ('e', 'f'), ('g', 'h')]
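The order printed above differs from the order requested in the question. One way to read the requested order is: visit keys last to first, and emit everything from a nested dictionary before the (key, subkey) pairs that point into it. The sketch below follows that reading; it is an assumption about the intent, innermost_first is a hypothetical name, and it relies on dictionaries preserving insertion order (Python 3.7+).

```python
def innermost_first(d):
    """Collect tuples so that deeper levels come before the pairs that lead to them."""
    out = []
    for k in reversed(list(d)):          # last key first
        v = d[k]
        if isinstance(v, dict):
            out.extend(innermost_first(v))       # deeper tuples first
            out.extend((k, subk) for subk in v)  # then links from k to its subkeys
        else:
            out.append((k, v))
    return out

d = {'a': {'b': 'c', 'd': {'e': 'f', 'g': 'h', 'i': {'j': ['a', 'b']}}}}
print(innermost_first(d))
# [('j', ['a', 'b']), ('i', 'j'), ('g', 'h'), ('e', 'f'),
#  ('d', 'e'), ('d', 'g'), ('d', 'i'), ('b', 'c'), ('a', 'b'), ('a', 'd')]
```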

Checking that the geometry for a triangle is contained in a list of lines

I have a list of lines Lines=([('B', 'C'), ('D', 'A'), ('D', 'C'), ('A', 'B'), ('D', 'B')]) and geometry = ('B', 'C', 'D') is a list of points that set up the triangle (B,C,D).
I want to check whether geometry can be set up from list of lines in Lines. How can I create a function to check that status? True or False.
Sample Functionality with input Lines:
>> Lines=([('B', 'C'), ('D', 'A'), ('D', 'C'), ('A', 'B'), ('D', 'B'),])
>> geometry1 = ('B', 'C', 'D')
>> check_geometry(Lines, geometry1)
True
>> geometry2 = ('A', 'B', 'E')
>> check_geometry(Lines, geometry2)
False
This is my code, but the result is wrong:
import itertools
def check_geometry(line, geometry):
    dataE = [set(x) for x in itertools.combinations(geometry, 2)]
    for data in dataE:
        if data not in line:
            return False
    return True
Lines = [('B', 'C'), ('D', 'A'), ('D', 'C'), ('A', 'B'), ('D', 'B'),]
geometry1 = ('B', 'C', 'D')
print(check_geometry(Lines, geometry1))
Output:
False
For triangles:
You could use the built-in all to do this, making sure to first sort the contents of each line, since their order might differ from that generated by itertools.combinations:
sLines = [tuple(sorted(l)) for l in Lines]
dataE = itertools.combinations('BCD', 2)
Now you can call all which will check that every value in dataE is present in sLines:
all(l1 in sLines for l1 in dataE)
Which will return True.
So, your check_geometry function could look something like:
def check_geometry(line, geometry):
    sLines = [tuple(sorted(l)) for l in line]
    dataE = itertools.combinations(geometry, 2)
    return all(l1 in sLines for l1 in dataE)
Calls made will now check if the Lines contain the geometry:
check_geometry(Lines, 'BCD')
# returns True
check_geometry(Lines, 'ABE')
# returns False
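The original attempt returns False because a set never compares equal to a tuple, so data not in line is always true. The sorted-tuple version above also depends on geometry listing its points in sorted order, since combinations emits pairs in input order. A sketch that avoids both issues by comparing frozensets on each side (check_geometry_unordered is a name invented for this example):

```python
from itertools import combinations

def check_geometry_unordered(lines, geometry):
    """True if every pair of points in geometry is joined by some line."""
    known = {frozenset(line) for line in lines}
    return all(frozenset(pair) in known for pair in combinations(geometry, 2))

Lines = [('B', 'C'), ('D', 'A'), ('D', 'C'), ('A', 'B'), ('D', 'B')]

print({'B', 'C'} == ('B', 'C'))                # False: set vs. tuple
print(check_geometry_unordered(Lines, 'BCD'))  # True
print(check_geometry_unordered(Lines, 'DCB'))  # True, point order is irrelevant
print(check_geometry_unordered(Lines, 'ABE'))  # False
```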
A bit more general:
To generalize this a bit, we can drop itertools.combinations and use zip instead. The following adjusts the function to accommodate zip but otherwise works the same way:
def check_geometry(line, geometry):
    sLines = [sorted(l) for l in line]
    dataE = [sorted(x) for x in zip(geometry, geometry[1:] + geometry[:1])]
    return all(l1 in sLines for l1 in dataE)
The key difference here is:
dataE is now a list of lists built from zip(geometry, geometry[1:] + geometry[:1]). Here zip pairs a string like "BCDA" with the same string rotated by one position, geometry[1:] + geometry[:1] (i.e. "CDAB"), producing one entry per side of the shape:
>>> s = "BCDA"
>>> s[1:] + s[:1]
'CDAB'
>>> list(zip(s, s[1:] + s[:1]))
[('B', 'C'), ('C', 'D'), ('D', 'A'), ('A', 'B')]
Now we can check that a geometry with points "BCDA" can be constructed by the lines in Lines:
check_geometry(Lines, "BCD")
# True
check_geometry(Lines, "BCDA")
# True
check_geometry(Lines, "BCDF")
# False
Note 1: Lines can be written as:
Lines=[('B', 'C'), ('D', 'A'), ('D', 'C'), ('A', 'B'), ('D', 'B')]
The outer parentheses () and the trailing comma , have no effect here; you can drop them :-)
Note 2: The geometry parameter for check_geometry can be any sequence (tuples, lists, strings):
check_geometry(Lines, "BCD") == check_geometry(Lines, ('B', 'C', 'D'))
Creating and passing a tuple to it seems somewhat odd in this case (though you might have a good reason to do so). Unless something requires it, I would suggest using strings as the value for the geometry parameter.
I think A,B,C can be string or whatever which define a point that set up a line
Okay, I'll be using strings for my answer then, you should be able to adjust the code to your needs.
def check_for_triangle(tri, lines):
    lines_needed = zip(tri, (tri[1], tri[2], tri[0]))
    return all(line in lines or line[::-1] in lines for line in lines_needed)
lines=[('B', 'C'), ('D', 'A'), ('D', 'C'), ('A', 'B'), ('D', 'B')]
tri1 = ('B', 'C', 'D')
tri2 = ('A', 'B', 'E')
print(check_for_triangle(tri1, lines)) # True
print(check_for_triangle(tri2, lines)) # False
The idea is to generate all lines (represented by a pair of points) we need to find in lines for a given triangle with zip. After that, we check whether all these lines can be found in lines.
Checking for line[::-1] as well is needed because the line ('A', 'B') is the same line as ('B', 'A').

Removing duplicates from tuples within a list

I have a list of tuples:
lst = [('a','b'), ('c', 'b'), ('a', 'd'), ('e','f'), ('a', 'b')]
I want the following output list:
output = [('a','b'), ('e','f')]
i.e I want to compare the elements of first tuple with remaining tuples and remove the tuple which contains either one or more duplicate elements.
My attempt:
I was thinking of using for loops, but that won't be feasible once I have a very large list. I browsed through the following posts but could not find the right solution:
Removing duplicates members from a list of tuples
How do you remove duplicates from a list in whilst preserving order?
If somebody could guide me the right direction, it will be very helpful. Thanks!
Assuming that you want "duplicates" of all elements to be suppressed, and not just the first one, you could use:
lst = [('a','b'), ('c', 'b'), ('a', 'd'), ('e','f'), ('a', 'b')]
def merge(x):
    s = set()
    for i in x:
        if not s.intersection(i):
            yield i
            s.update(i)
gives
>>> list(merge(lst))
[('a', 'b'), ('e', 'f')]
>>> list(merge([('a', 'b'), ('c', 'd'), ('c', 'e')]))
[('a', 'b'), ('c', 'd')]
>>> list(merge([('a', 'b'), ('a', 'c'), ('c', 'd')]))
[('a', 'b'), ('c', 'd')]
Sets should help:
>>> s = map(set, lst)
>>> first = s[0]
>>> [first] + [i for i in s if not i & first]
[set(['a', 'b']), set(['e', 'f'])]
Or with ifilterfalse:
>>> from itertools import ifilterfalse
>>> s = map(set, lst)
>>> [first] + list(ifilterfalse(first.intersection, s))
[set(['a', 'b']), set(['e', 'f'])]
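The two snippets above are written for Python 2, where map returns a list and ifilterfalse lives in itertools. A sketch of a Python 3 equivalent, assuming the same lst as in the question (the sets are sorted on output only to make the printed order deterministic):

```python
from itertools import filterfalse  # Python 3 name for ifilterfalse

lst = [('a', 'b'), ('c', 'b'), ('a', 'd'), ('e', 'f'), ('a', 'b')]

s = list(map(set, lst))  # map is lazy in Python 3, so materialize it
first = s[0]

# Keep the first set plus every set that shares no element with it.
kept = [first] + [i for i in s if not i & first]
same = [first] + list(filterfalse(first.intersection, s))

print([sorted(x) for x in kept])  # [['a', 'b'], ['e', 'f']]
print(kept == same)               # True
```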
