I'm new to Python and got stuck while implementing an algorithm for drawing graphs. I have a list of tuples that contain certain nodes and their 'level' (for layering). Initially that looks like this:
[(0, 10), (1, 'empty'), (2, 'empty'), (3, 'empty'), (4, 'empty'), (5, 'empty'), (6, 'empty'), (7, 'empty'), (8, 'empty'), (9, 'empty')] (node,level)
Now I need to keep assigning levels to these nodes as long as any node has no level, i.e. still carries the 'empty' attribute.
I tried several possible while conditions that were syntactically valid but semantically didn't make any sense, since the while loop didn't terminate:
while (True for node, level in g.nodes(data='level') if level == 'empty'):
or
while ('empty' in enumerate(g.nodes(data='level'))):
and a few other similar constructs that didn't work and that I no longer remember.
So far it isn't clear to me why this doesn't work; Python didn't even enter the while loop with these conditions. Can you explain why and give me a clue how to fix it?
hint:
g.nodes(data='level') is a networkx function that returns the list of tuples shown above.
So the things you have in the parentheses are generator expressions and they just always evaluate to True.
You have the right idea but maybe need to learn a bit more about these expressions and what to do with them. For your first attempt, here is what I think you mean:
while any(level == 'empty' for node, level in g.nodes(data='level')):
...
In your version, you create a generator expression that will contain a number of True values, one for each level that is empty. However, even if that generator yields no elements at all, the generator object itself is not an empty container, and thus it evaluates to True.
You can try this out:
bool([]) # --> False because empty sequence
bool(list(range(1, -10))) # --> False because empty sequence
bool((i for i in range(1,-10))) # --> True because a generator object is always truthy
So you need to turn your generator expression into a truth value that actually reflects whether it has any true elements in it, and that's what the any function does: Take an iterator (or a generator expression) and return true if any of its elements are true.
You're trying to make an elemental check do your searching for you. Instead, concentrate on extracting exactly the values you need to check. Let's start with the list of tuples as
g = [(0, 10), (1, 'empty'), (2, 'empty'), (3, 'empty'), (4, 'empty'),
(5, 'empty'), (6, 'empty'), (7, 'empty'), (8, 'empty'), (9, 'empty')]
Now get a list of only the levels:
level = [node[1] for node in g]
Now, your check is simply
while "empty" in level:
    # Assign levels as appropriate
    # Repeat the check for "empty"
    level = [node[1] for node in g]
If you're consistently updating the main graph, g, then fold the level extraction into the while:
while "empty" in [node[1] for node in g]:
    # Assign levels
Originally, your while failed because the condition is a generator object, which is truthy no matter what is in g:
while (True for node, level in ...)
You have a good approach, but made it one level too complex, and got stuck with True.
Your enumerate attempt fails because you search for a string in an enumerator, rather than in its returned values. You could make a list of the returned values, but then we're back where we started, with a list of tuples. This would compare "empty" against each tuple, and fail to find the desired string -- it's one level farther down.
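To make the failure concrete, here is a small sketch. It assumes a plain list of tuples as a stand-in for g.nodes(data='level'), so the names below are illustrative only:

```python
# Hypothetical stand-in for g.nodes(data='level'): a list of (node, level) tuples.
nodes = [(0, 10), (1, 'empty'), (2, 'empty')]

# enumerate yields (index, item) pairs, so 'empty' is compared against
# pairs such as (1, (1, 'empty')) and never matches.
in_enumerate = 'empty' in enumerate(nodes)

# Searching the list of tuples directly also fails: 'empty' != (1, 'empty').
in_tuples = 'empty' in nodes

# Extracting the levels first gives the intended check.
in_levels = 'empty' in [level for _, level in nodes]

print(in_enumerate, in_tuples, in_levels)  # False False True
```

The first two results being False would also explain why the loop body was never entered with that condition.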
The other answers may be best. But the specific test you requested can be written a few ways:
This is similar to what you had, but creates a list instead of a generator. The list will evaluate as False if it is empty, but the generator doesn't.
while [True for node, level in g.nodes(data='level') if level == 'empty']:
...
This is an equally effective version. The list just has to be empty or not; it doesn't matter whether the elements are True or False or something else:
while [node for node, level in g.nodes(data='level') if level == 'empty']:
...
Or you can use Python's any function, which was made for this and will be a little more efficient (it stops checking elements after the first match). This can use a generator or a list.
while any(level == 'empty' for node, level in g.nodes(data='level')):
...
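For completeness, a runnable sketch of the any() version with a loop body that actually terminates. It uses a plain list instead of a networkx graph, and next_level is a made-up rule purely for illustration:

```python
# Plain list of (node, level) tuples standing in for g.nodes(data='level').
nodes = [(0, 10), (1, 'empty'), (2, 'empty'), (3, 'empty')]

def next_level(assigned):
    """Hypothetical rule: one less than the smallest level assigned so far."""
    return min(assigned) - 1

while any(level == 'empty' for _, level in nodes):
    assigned = [level for _, level in nodes if level != 'empty']
    # Find the first node without a level and replace its whole tuple
    # (tuples are immutable, so the level cannot be changed in place).
    i = next(i for i, (_, level) in enumerate(nodes) if level == 'empty')
    nodes[i] = (nodes[i][0], next_level(assigned))

print(nodes)  # [(0, 10), (1, 9), (2, 8), (3, 7)]
```

The loop ends because every pass removes one 'empty' entry; if the body never changed a level, the condition would stay true forever.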
It is often more Pythonic just to use a for loop:
nodes = [(0, 10),
(1, 'empty'),
(2, 'empty'),
(3, 'empty'),
(4, 'empty'),
(5, 'empty'),
(6, 'empty'),
(7, 'empty'),
(8, 'empty'),
(9, 'empty')]
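for i, (node, level) in enumerate(nodes):
    if level == "empty":
        # tuples are immutable, so replace the whole tuple in the list
        nodes[i] = (node, set_level_func())  # customise this function as you wish
If you're just trying to assign a level to each node whose level is empty, then a simple for loop should suffice: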
data = [(0, 10), (1, 'empty'), (2, 'empty'), (3, 'empty'), (4, 'empty'), (5, 'empty'), (6, 'empty'), (7, 'empty'), (8, 'empty'), (9, 'empty')]
output = []
for node, level in data:
    if level == 'empty':
        level = ?  # question mark is whatever level logic you wish to insert
    output.append((node, level))
I have a list of tuples that I want to get all the combinations for as a set and filter out certain sets based on criteria.
For example
from itertools import combinations
pairs = [(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]
combs = []
for i in range(len(pairs)):
    intermediate = (set(list(combinations(pairs, i))))
    if ((2,3) and (2,2) and (3,2)) in intermediate:
        combs.append(intermediate)
But it doesn't detect if any of the tuples are in the set, so I tried a more basic test version.
pairs = [(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]
test = set(combinations(pairs,1))
if (1,1) in test:
    print("true")
This also doesn't work even though I can clearly see in my variable explorer that the set contains (1,1).
I've also tried adding an integer 1 as one of the elements in pairs and checking if 1 is in the set but that still doesn't work. I've run out of ideas and some help would be appreciated.
There are two issues here...
First, it would seem like you are not testing membership at the correct depth of a nested data structure. When you call set(combinations(pairs, i)), you get a structure that is 3 levels deep: A set of tuples of tuples of ints (ints 3 containers deep).
>>> pairs = [(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]
>>> test = set(combinations(pairs,1))
>>> test
{((3, 2),), ((2, 3),), ((1, 1),), ((2, 2),), ((3, 1),), ((1, 3),), ((1, 2),), ((3, 3),), ((2, 1),)}
It is perfectly valid to test if a specific tuple of tuples of ints is contained within the set, but those tuples aren't automatically flattened for you to be able to test against a simple tuple of ints.
>>> ((1,1),) in test
True
>>> (1,1) in test
False
If you want to check if any tuples within the set contain a specific sub-tuple, you'll have to iterate over the set and check each top level tuple individually (hint: things like map can make this iteration a little shorter and sweeter)
for top_tuple in test:
    if (1,1) in top_tuple:
        print("found it!")
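If you only need a yes/no answer, the same iteration can be folded into any(). This is just a compact variant of the loop above, not a different technique:

```python
from itertools import combinations

pairs = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)]
test = set(combinations(pairs, 1))

# True if at least one top-level tuple in the set contains (1, 1).
found = any((1, 1) in top_tuple for top_tuple in test)
print(found)  # True
```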
The second issue is a fairly common trap for new Python programmers: chaining logical operators. Think of and, or, in, etc. as operators much like the arithmetic operators + - * /: each one takes its operands and produces a single value. The other important point is how the logical operators treat values that aren't True or False. In general, Python treats empty things (empty lists, strings, tuples, sets, etc.) as false, along with None and anything equal to 0. Basically everything else is treated as true. When Python evaluates an and, if the value on the left is truthy, the result of the expression is whatever is on the right; if the left value is falsy, the result is that left value. When you chain them together, they are evaluated left to right. So ((2,3) and (2,2) and (3,2)) evaluates to (3, 2) alone, and your test only checks whether (3, 2) is in intermediate.
>>> (1,1) and "cookies"
'cookies'
>>> False and "cookies"
False
>>> (2,3) and (2,2) and (3,2)
(3, 2)
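One way to express "contains all three pairs" is to test each pair separately with all(). The sketch below assumes you want combinations of exactly three pairs; the original loop over every size would need the same check applied per combination:

```python
from itertools import combinations

pairs = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)]
required = {(2, 3), (2, 2), (3, 2)}

# Keep only the 3-element combinations that contain every required pair.
combs = [combo for combo in combinations(pairs, 3)
         if all(pair in combo for pair in required)]
print(combs)  # [((2, 2), (2, 3), (3, 2))]
```

Note that the membership test is done against each combination (a tuple of pairs), not against the set of all combinations.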
Here is your test set, which does not contain (1,1) as an isolated tuple. It is a tuple inside a tuple.
{((3, 2),), ((2, 3),), ((1, 1),), ((2, 2),), ((3, 1),), ((1, 3),), ((1, 2),), ((3, 3),), ((2, 1),)}
To detect it, you can:
for combo in test:
    if (1,1) in combo:
        print("true")
#output: true
list=[(2, 5), (1, 2), (4, 4), (2, 3), (2, 1)]
def st(element):
    return (element[1])
print(sorted(list,key=st))
Can someone explain how this program works? When the return statement returns 5, 2, 4, 3, 1 to the sorted function, how do we get the list of tuples sorted?
Shouldn't the output instead be [1, 2, 3, 4, 5], since only the second value is returned to the sorted function?
The sorted function will sort the list (i.e., the entire list of tuples) using the function st() to provide the keys for sorting. The st() function in no way limits what gets sorted, only what gets used to decide how to sort.
You can think of it as you providing an answer to the question the sorted function needs answered for each element: What is the value for this element I should use when deciding where to put it?
The output is:
[(2, 1), (1, 2), (2, 3), (4, 4), (2, 5)]
which is the sorted list using the second item of each tuple to decide order.
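A rough way to picture what the key function does is to pair each element with its key by hand, sort on the key alone, and then throw the key away. This is only a mental model of the behaviour, not a description of how sorted is implemented internally:

```python
pairs = [(2, 5), (1, 2), (4, 4), (2, 3), (2, 1)]

def st(element):
    return element[1]

# Pair each key with its element, sort by the key only, then drop the key.
decorated = [(st(p), p) for p in pairs]
decorated.sort(key=lambda kp: kp[0])
by_hand = [p for _, p in decorated]

print(by_hand)                            # [(2, 1), (1, 2), (2, 3), (4, 4), (2, 5)]
print(by_hand == sorted(pairs, key=st))   # True
```

The keys 5, 2, 4, 3, 1 only decide the order; the values that end up in the result are still the original tuples.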
Summary
Sorting in Python is guaranteed to be stable since Python 2.2, as documented here and here.
Wikipedia explains what the property of being stable means for the behavior of the algorithm:
A sorting algorithm is stable if whenever there are two records R and S with the same key, and R appears before S in the original list, then R will always appear before S in the sorted list.
However, when sorting objects, such as tuples, sorting appears to be unstable.
For example,
>>> a = [(1, 3), (3, 2), (2, 4), (1, 2)]
>>> sorted(a)
[(1, 2), (1, 3), (2, 4), (3, 2)]
However, to be considered stable, I thought the new sequence should've been
[(1, 3), (1, 2), (2, 4), (3, 2)]
because, in the original sequence, the tuple (1, 3) appears before tuple (1, 2). The sorted function is relying on the 2-ary "keys" when the 1-ary "keys" are equal. (To clarify, the 1-ary key of some tuple t would be t[0] and the 2-ary t[1].)
To produce the expected result, we have to do the following:
>>> sorted(a, key=lambda t: t[0])
[(1, 3), (1, 2), (2, 4), (3, 2)]
I'm guessing there's a false assumption on my part, either about sorted or maybe on how tuple and/or list types are treated during comparison.
Questions
Why is the sorted function said to be "stable" even though it alters the original sequence in this manner?
Wouldn't setting the default behavior to that of the lambda version be more consistent with what "stable" means? Why is it not set this way?
Is this behavior simply a side-effect of how tuples and/or lists are inherently compared (i.e. the false assumption)?
Thanks.
Please note that this is not about whether the default behavior is or isn't useful, common, or something else. It's about whether the default behavior is consistent with the definition of what it means to be stable (which, IMHO, does not appear to be the case) and the guarantee of stability mentioned in the docs.
Think about it - (1, 2) comes before (1, 3), does it not? Sorting a list by default does not automatically mean "just sort it based off the first element". Otherwise you could say that apple comes before aardvark in the alphabet. In other words, this has nothing to do with stability.
The docs also have a nice explanation about how data structures such as lists and tuples are sorted lexicographically:
In particular, tuples and lists are compared lexicographically by comparing corresponding elements. This means that to compare equal, every element must compare equal and the two sequences must be of the same type and have the same length.
Stable sort keeps the order of those elements which are considered equal from the sorting point of view. Because tuples are compared element by element lexicographically, (1, 2) precedes (1, 3), so it should go first:
>>> (1, 2) < (1, 3)
True
A tuple's key is made out of all of its items.
>>> (1,2) < (1,3)
True
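Putting the two cases from the question side by side may help. Stability only matters when two elements compare equal under whatever is being compared, and that differs between the two calls:

```python
a = [(1, 3), (3, 2), (2, 4), (1, 2)]

# Default: whole tuples are compared, so (1, 2) < (1, 3) and nothing ties.
default_order = sorted(a)
print(default_order)   # [(1, 2), (1, 3), (2, 4), (3, 2)]

# Key on the first item only: (1, 3) and (1, 2) now tie, so the stable
# sort keeps them in their original relative order.
first_only = sorted(a, key=lambda t: t[0])
print(first_only)      # [(1, 3), (1, 2), (2, 4), (3, 2)]
```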
I'm trying to add a new tuple to a list of tuples (sorted by first element in tuple), where the new tuple contains elements from both the previous and the next element in the list.
Example:
oldList = [(3, 10), (4, 7), (5,5)]
newList = [(3, 10), (4, 10), (4, 7), (5, 7), (5, 5)]
(4,10) was constructed from and added in between (3,10) and (4,7).
Construct (x,y) from (a,y) and (x,b)
I've tried using enumerate() to insert at the specific position, but that doesn't really let me access the next element.
oldList = [(3, 10), (4, 7), (5,5)]
def pair(lst):
    # create two iterators
    it1, it2 = iter(lst), iter(lst)
    # move second to the second tuple
    next(it2, None)
    for ele in it1:
        # yield original
        yield ele
        # yield first item of the next tuple and second item of the current one
        nxt = next(it2, None)
        if nxt is not None:
            yield (nxt[0], ele[1])
Which will give you:
In [3]: oldList = [(3, 10), (4, 7), (5, 5)]
In [4]: list(pair(oldList))
Out[4]: [(3, 10), (4, 10), (4, 7), (5, 7), (5, 5)]
Obviously we need to do some error handling to handle different possible situations.
You could also do it using a single iterator if you prefer:
def pair(lst):
    it = iter(lst)
    prev = next(it, None)
    if prev is None:
        return
    for ele in it:
        yield prev
        # first item of the next tuple, second item of the previous one
        yield (ele[0], prev[1])
        prev = ele
    yield prev
You can use itertools.tee in place of calling iter:
from itertools import tee
def pair(lst):
    # create two iterators
    it1, it2 = tee(lst)
    # move second to the second tuple
    next(it2, None)
    for ele in it1:
        # yield original
        yield ele
        # yield first item of the next tuple and second item of the current one
        nxt = next(it2, None)
        if nxt is not None:
            yield (nxt[0], ele[1])
You can use a list comprehension and itertools.chain():
>>> from itertools import chain
>>> list(chain.from_iterable([((i, j), (x, j)) for (i, j), (x, y) in zip(oldList, oldList[1:])])) + oldList[-1:]
[(3, 10), (4, 10), (4, 7), (5, 7), (5, 5)]
Not being a big fan of one-liners (or complexity) myself, I will propose a very explicit and readable (which is usually a good thing!) solution to your problem.
So, in a very simplistic approach, you could do this:
def insertElements(oldList):
    """
    Return a new list, alternating oldList tuples with
    new tuples in the form (oldList[i+1][0],oldList[i][1])
    """
    newList = []
    for i in range(len(oldList)-1):
        # take one tuple as is
        newList.append(oldList[i])
        # then add a new one with items from current and next tuple
        newList.append((oldList[i+1][0],oldList[i][1]))
    else:
        # don't forget the last tuple
        newList.append(oldList[-1])
    return newList
oldList = [(3, 10), (4, 7), (5, 5)]
newList = insertElements(oldList)
That will give you the desired result in newList:
print(newList)
[(3, 10), (4, 10), (4, 7), (5, 7), (5, 5)]
This is not much longer code than other more sophisticated (and memory efficient!) solutions, like using generators, AND I consider it a lot easier to read than intricate one-liners. Also, it would be easy to add some checks to this simple function (like making sure you have a list of tuples).
Unless you already know you need to optimize this particular piece of your code (assuming this is part of a bigger project), this should be good enough. At the same time it is: easy to implement, easy to read, easy to explain, easy to maintain, easy to extend, easy to refactor, etc.
Note: all other previous answers to your question are also better solutions than this simple one, in many ways. Just wanted to give you another choice. Hope this helps.
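As one more option, here is a sketch that sits between the explicit loop and the one-liner: a generator built on zip that also copes with empty and single-element input. The name interleave is made up for this example, and it assumes the input is a list (or another sliceable sequence):

```python
def interleave(pairs):
    """Yield each tuple followed by (next[0], current[1]); hypothetical helper."""
    for current, following in zip(pairs, pairs[1:]):
        yield current
        yield (following[0], current[1])
    # The last tuple has no successor; emit it unchanged (nothing for empty input).
    yield from pairs[-1:]

result = list(interleave([(3, 10), (4, 7), (5, 5)]))
print(result)                        # [(3, 10), (4, 10), (4, 7), (5, 7), (5, 5)]
print(list(interleave([])))          # []
print(list(interleave([(1, 2)])))    # [(1, 2)]
```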
In the list of tuples called mixed_sets, three separate sets exist. Each set contains tuples with values that intersect. A tuple from one set will not intersect with a tuple from another set.
I've come up with the following code to sort out the sets. I found that the python set functionality was limited when tuples are involved. It would be nice if the set intersection operation could look into each tuple index and not stop at the enclosing tuple object.
Here's the code:
mixed_sets= [(1,15),(2,22),(2,23),(3,13),(3,15),
(3,17),(4,22),(4,23),(5,15),(5,17),
(6,21),(6,22),(6,23),(7,15),(8,12),
(8,15),(9,19),(9,20),(10,19),(10,20),
(11,14),(11,16),(11,18),(11,19)]
def sort_sets(a_set):
    idx= 0
    idx2=0
    while len(mixed_sets) > idx and len(a_set) > idx2:
        if a_set[idx2][0] == mixed_sets[idx][0] or a_set[idx2][1] == mixed_sets[idx][1]:
            a_set.append(mixed_sets[idx])
            mixed_sets.pop(idx)
            idx=0
        else:
            idx+=1
        if idx == len(mixed_sets):
            idx2+=1
            idx=0
    a_set.pop(0) #remove first item; duplicate
    print a_set, 'a returned set'
    return a_set
sorted_sets=[]
for new_set in mixed_sets:
    sorted_sets.append(sort_sets([new_set]))
print mixed_sets #Now empty.
OUTPUT:
[(1, 15), (3, 15), (5, 15), (7, 15), (8, 15), (3, 13), (3, 17), (5, 17), (8, 12)] a returned set
[(2, 22), (2, 23), (4, 23), (6, 23), (4, 22), (6, 22), (6, 21)] a returned set
[(9, 19), (10, 19), (10, 20), (11, 19), (9, 20), (11, 14), (11, 16), (11, 18)] a returned set
Now this doesn't look like the most pythonic way of doing this task. This code is intended for large lists of tuples (approx 2E6) and I felt the program would run quicker if it didn't have to check tuples already sorted. Therefore I used pop() to shrink the mixed_sets list. I found using pop() made list comprehensions, for loops or any iterators problematic, so I've used the while loop instead.
It does work, but is there a more pythonic way of carrying out this task that doesn't use while loops and the idx and idx2 counters?
Probably you can increase the speed by first computing a set of all the first elements in the tuples in the mixed_sets, and a set of all the second elements. Then in your iteration you can check if the first or the second element is in one of these sets, and find the correct complete tuple using binary search.
Actually you'd need multi-sets, which you can simulate using dictionaries.
Something like this (currently untested):
from collections import defaultdict
# define the mixed_sets list.
mixed_sets.sort()
first_els = defaultdict(int)
second_els = defaultdict(int)
for first,second in mixed_sets:
    first_els[first] += 1
    second_els[second] += 1
def sort_sets(a_set):
    index= 0
    while mixed_sets and len(a_set) > index:
        first, second = a_set[index]
        if first in first_els or second in second_els:
            if first in first_els:
                element = find_tuple(mixed_sets, first, index=0)
                first_els[first] -= 1
                if first_els[first] <= 0:
                    del first_els[first]
            else:
                element = find_tuple(mixed_sets, second, index=1)
                second_els[second] -= 1
                if second_els[second] <= 0:
                    del second_els[second]
            a_set.append(element)
            mixed_sets.remove(element)
        index += 1
    a_set.pop(0) #remove first item; duplicate
    print a_set, 'a returned set'
    return a_set
Here "find_tuple(mixed_sets, first, index=0)" (or index=1 for the second element) returns the tuple belonging to mixed_sets that has "first" at the given index.
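A minimal sketch of such a helper, assuming a plain linear scan is acceptable. This is not the binary search mentioned earlier; it only pins down the behaviour that find_tuple is expected to have:

```python
def find_tuple(tuples, value, index=0):
    """Return the first tuple whose item at `index` equals `value`, else None.

    Hypothetical helper: a linear scan, not the binary search suggested above.
    """
    for candidate in tuples:
        if candidate[index] == value:
            return candidate
    return None

sample = [(1, 15), (2, 22), (3, 15)]
print(find_tuple(sample, 15, index=1))  # (1, 15)
print(find_tuple(sample, 3, index=0))   # (3, 15)
print(find_tuple(sample, 99))           # None
```

A binary-search version would need the list sorted by the element being searched, which is why two differently ordered copies come up next.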
You'll probably also have to duplicate mixed_sets and order one of the copies by the first element and the other one by the second element.
Or maybe you could play with dictionaries again, adding a sorted list of tuples to the values in "first_els" and "second_els".
I don't know how the performance will scale, but I think that if the data is on the order of 2 million tuples you shouldn't have too much to worry about.
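To illustrate the dictionary idea, here is a sketch that indexes the tuples by first and by second element and then walks outwards from each unseen tuple with a queue. This is a different technique from the code above (a breadth-first search over connected tuples); the function name group_tuples is made up, and it assumes the input contains no duplicate tuples:

```python
from collections import defaultdict, deque

def group_tuples(pairs):
    """Group tuples that are linked by a shared first or second value."""
    by_first = defaultdict(list)
    by_second = defaultdict(list)
    for pair in pairs:
        by_first[pair[0]].append(pair)
        by_second[pair[1]].append(pair)

    seen = set()
    groups = []
    for start in pairs:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:
            first, second = queue.popleft()
            group.append((first, second))
            # Neighbours share either the first or the second value.
            for neighbour in by_first[first] + by_second[second]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(sorted(group))
    return groups

mixed_sets = [(1, 15), (2, 22), (2, 23), (3, 13), (3, 15),
              (3, 17), (4, 22), (4, 23), (5, 15), (5, 17),
              (6, 21), (6, 22), (6, 23), (7, 15), (8, 12),
              (8, 15), (9, 19), (9, 20), (10, 19), (10, 20),
              (11, 14), (11, 16), (11, 18), (11, 19)]
groups = group_tuples(mixed_sets)
for group in groups:
    print(group)
```

Each tuple is looked at once and each dictionary lookup is constant time on average, so this should avoid rescanning the whole list; mixed_sets is also left unmodified.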