I'm taking a data structures course in Python, and a suggestion for a solution includes this code which I don't understand.
This is a sample of a dictionary:
vc_metro = {
    'Richmond-Brighouse': set(['Lansdowne']),
    'Lansdowne': set(['Richmond-Brighouse', 'Aberdeen'])
}
It is suggested that to remove some of the elements in the value, we use this code:
vc_metro['Lansdowne'] -= set(['Richmond-Brighouse'])
I have never seen such a structure, and using it in a basic situation such as:
my_list = [1, 2, 3, 4, 5, 6]
other_list = [1, 2]
my_list -= other_list
doesn't work. Where can I learn more about this recommended strategy?
You can't subtract lists, but you can subtract set objects meaningfully. Sets are hash-based containers, somewhat similar to dict.keys(), which allow only one instance of each object.
On sets, the -= operator is equivalent to the difference_update method, i.e. the in-place version of difference: it removes from the left operand every element that is also present in the right operand.
Your simple example with sets would look like this:
>>> my_set = {1, 2, 3, 4, 5, 6}
>>> other_set = {1, 2}
>>> my_set -= other_set
>>> my_set
{3, 4, 5, 6}
Curly braces with commas but no colons are interpreted as a set object. So the direct constructor call
set(['Richmond-Brighouse'])
is equivalent to
{'Richmond-Brighouse'}
Notice that you can't do set('Richmond-Brighouse'): that would add all the individual characters of the string to the set, since strings are iterable.
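A quick check in the interpreter (the order of elements in the printed set may differ, since sets are unordered):
>>> set('Aberdeen')
{'A', 'b', 'e', 'r', 'd', 'n'}
>>> {'Aberdeen'}
{'Aberdeen'}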
The reason to use -=/difference instead of remove is that differencing only removes existing elements, and silently ignores others. The discard method does this for a single element. Differencing allows removing multiple elements at once.
The original line vc_metro['Lansdowne'] -= set(['Richmond-Brighouse']) could be rewritten as
vc_metro['Lansdowne'].discard('Richmond-Brighouse')
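For a side-by-side comparison of the three spellings (a minimal sketch; 'Oakridge' here is just a made-up station that is not in the set):
>>> stations = {'Richmond-Brighouse', 'Aberdeen'}
>>> stations -= {'Oakridge'}        # missing element: silently ignored
>>> stations.discard('Oakridge')    # missing element: silently ignored
>>> stations.remove('Oakridge')     # missing element: raises an error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'Oakridge'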
Related
I have been having this annoying problem: I have an int, and I want to see if that int already appears in a set. If it does, I don't want it in my "nset" values. However, when I attempt this, which seems pretty straightforward, it acts like the item has not been filtered properly.
Example Logs:
RDM is 8
RDM is not in
{1, 5, 6, 7, 8, 9}
{1, 3, 4, 5, 7, 8}
{2, 5, 6, 7, 8}
Code:
nset = list(range(1, self.n2 + 1))
for i in nset:
    if(i in self.valuesInRows[space[0]]):
        nset.remove(i)
    elif(i in self.valuesInCols[space[1]]):
        nset.remove(i)
    elif(i in self.valuesInBoxes[self.spaceToBox(space[0], space[1])]):
        nset.remove(i)
rdm = -1
while(rdm not in nset):
    rdm = random.randint(0, self.n2)
    print("RDM {}".format(rdm))
print("RDM is {}".format(rdm))
print("RDM is not in")
print(self.valuesInBoxes[self.spaceToBox(space[0], space[1])])
print(self.valuesInRows[space[0]])
print(self.valuesInCols[space[1]])
print()
return rdm
Any explanation would be fantastic, because I've looked at the documentation and it shows that this is the approach I would want to take, but I seem to be missing something.
You are using a list instead of a set. Lists are a standard Python data type that stores values in a sequence. Sets are another standard data type that also stores values, but unlike lists or tuples, a set cannot contain multiple occurrences of the same element and does not keep its values in any particular order.
By definition, sets do not allow duplicate values. With a list, you have to check for an existing occurrence yourself, using the in keyword, before inserting a value.
https://docs.python.org/3/reference/expressions.html#membership-test-operations
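As a rough sketch of how the question's filtering could look with sets instead of a list (the row/column/box values below are stand-ins for self.valuesInRows, self.valuesInCols and self.valuesInBoxes, and n2 plays the role of self.n2):
import random

n2 = 9
row_values = {1, 5, 9}   # stand-in for self.valuesInRows[space[0]]
col_values = {3, 4, 8}   # stand-in for self.valuesInCols[space[1]]
box_values = {2, 5, 6}   # stand-in for self.valuesInBoxes[...]

# Start from every possible value and subtract the ones already used.
candidates = set(range(1, n2 + 1)) - row_values - col_values - box_values
print(candidates)  # {7}

if candidates:
    rdm = random.choice(list(candidates))
    print(rdm)     # 7 in this example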
The frozenset docs say:
The frozenset type is immutable and hashable — its contents cannot be altered after it is created; it can therefore be used as a dictionary key or as an element of another set.
However, the docs for Python sets say:
Since sets only define partial ordering (subset relationships), the output of the list.sort() method is undefined for lists of sets.
This makes me ask: why is this the case? And, if I wanted to sort a list of sets by set content, how could I do this? I know that the extension intbitset (https://pypi.python.org/pypi/intbitset/2.3.0) has a function for returning a bit sequence that represents the set contents. Is there something comparable for Python sets?
Tuples, lists, strings, etc. have a natural lexicographic ordering and can be sorted because you can always compare two elements of a given collection. That is, either a < b, b < a, or a == b.
A natural comparison between two sets is having a <= b mean a is a subset of b, which is what the expression a <= b actually does in Python. What the documentation means by "partial ordering" is that not all sets are comparable. Take, for example, the following sets:
a = {1, 2, 3}
b = {4, 5, 6}
Is a a subset of b? No. Is b a subset of a? No. Are they equal? No. If you can't compare them at all, you clearly can't sort them.
The only way you can sort a collection of sets is if your comparison function actually can compare any two elements (a total order). This means you can still sort a collection of sets using the above subset relation, but you will have to ensure that all of the sets are comparable (e.g. [{1}, {1, 2, 4}, {1, 2}]).
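For instance, a chain of nested sets sorts fine because every pair is comparable under the subset relation, whereas incomparable sets don't raise an error but don't end up in any meaningful order either (a quick sketch):
>>> sorted([{1, 2, 4}, {1}, {1, 2}])    # a chain: each set is a subset of the next
[{1}, {1, 2}, {1, 2, 4}]
>>> sorted([{1, 2, 3}, {4, 5, 6}])      # incomparable: the result order is not meaningful
[{1, 2, 3}, {4, 5, 6}]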
The easiest way to do what you want is to transform each individual set into something that you actually can compare. Basically, you do f(a) <= f(b) (where <= is obvious) for some simple function f. This is done with the key keyword argument:
In [10]: def f(some_set):
    ...:     return max(some_set)
    ...:
In [11]: sorted([{1, 2, 3, 999}, {4, 5, 6}, {7, 8, 9}], key=f)
Out[11]: [{4, 5, 6}, {7, 8, 9}, {1, 2, 3, 999}]
You're sorting [f(set1), f(set2), f(set3)] and applying the resulting ordering to [set1, set2, set3].
Take an example: say you wanted to sort a list of sets by the "first element" of each set. The issue is that Python sets or frozensets don't have a "first element." They have no sense of their own ordering. A set is an unordered collection with no duplicate elements.
Furthermore, list.sort() sorts the list in place, using only the < operator between items.
If you just use a.sort() without passing any key parameter, comparing set_a < set_b (i.e. set_a.__lt__(set_b)) is insufficient, because for sets the < operator is a subset test (is set_a a proper subset of set_b?). As mentioned by @Blender and referenced in your question, this provides a partial rather than a total ordering, which is not enough to define whatever sequence should hold the sets.
From the docs:
set < other: Test whether the set is a proper subset of other, that is, set <= other and set != other.
You could pass a key to sort(); it just couldn't refer to any internal "ordering" of the sets, because, remember, there is none.
>>> a = {2, 3, 1}
>>> b = {6, 9, 0, 1}
>>> c = {0}
>>> i = [b, a, c]
>>> i.sort(key=len)
>>> i
[{0}, {1, 2, 3}, {0, 9, 6, 1}]
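If the goal is specifically to sort by set content, one option is to use the sorted contents of each set as the key; this gives a total, lexicographic ordering, assuming the elements themselves are comparable with one another:
>>> sets = [{2, 3}, {1, 5}, {1, 2}]
>>> sorted(sets, key=lambda s: sorted(s))
[{1, 2}, {1, 5}, {2, 3}]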
In my program I have many lines where I need to both iterate over something and modify it in that same for loop.
However, I know that modifying the thing you're iterating over is bad, because it may, and probably will, produce an undesired result.
So I've been doing something like this:
for el_idx, el in enumerate(theList):
    if theList[el_idx].IsSomething() is True:
        theList[el_idx].SetIt(False)
Is this the best way to do this?
This is a conceptual misunderstanding.
It is dangerous to modify the list itself from within the iteration, because of the way Python translates the loop to lower-level code. This can cause unexpected side effects during the iteration; there's a good example here:
https://unspecified.wordpress.com/2009/02/12/thou-shalt-not-modify-a-list-during-iteration/
But modifying mutable objects stored in the list is acceptable, and common practice.
I suspect you're thinking that because the list is made up of those objects, modifying the objects modifies the list. This is understandable, but it's just not how it's normally thought of. If it helps, consider that the list only really contains references to those objects. When you modify the objects within the loop, you are merely using the list to reach the objects; you are not modifying the list itself.
What you should not do is add or remove items from the list during the iteration.
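Here is a minimal sketch of that distinction, using a made-up Flag class standing in for the objects with the IsSomething/SetIt methods from the question:
class Flag:
    """Toy stand-in for the objects stored in theList."""
    def __init__(self, value):
        self.value = value

    def IsSomething(self):
        return self.value

    def SetIt(self, value):
        self.value = value


theList = [Flag(True), Flag(False), Flag(True)]

# Fine: we only mutate the objects that the list refers to.
for el in theList:
    if el.IsSomething():
        el.SetIt(False)

print([el.value for el in theList])  # [False, False, False]

# Not fine: adding or removing elements of theList itself inside the loop.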
Your problem is a bit unclear to me, but if we are talking about the harm of modifying a list during a for loop iteration in Python, I can think of two scenarios.
First, you modify some elements of the list that are supposed to be used, with their original values, in a later round of the computation.
For example, say you want to write a program with the following input and expected output.
Input:
[1, 2, 3, 4]
Expected output:
[1, 3, 6, 10] #[1, 1 + 2, 1 + 2 + 3, 1 + 2 + 3 + 4]
But you write the code this way:
#!/usr/bin/env python
mylist = [1, 2, 3, 4]
for idx, n in enumerate(mylist):
    mylist[idx] = sum(mylist[:idx + 1])
print(mylist)
Result is:
[1, 3, 7, 15] # undesired result
Second, you change the size of the list during a for loop iteration.
For example, from python-delete-all-entries-of-a-value-in-list:
>>> s = [1, 4, 1, 4, 1, 4, 1, 1, 0, 1]
>>> for i in s:
...     if i == 1: s.remove(i)
...
>>> s
[4, 4, 4, 0, 1]
The example shows the undesired result arising from the side effect of changing the list's size. It demonstrates that a for loop in Python cannot properly handle a list whose size changes during iteration. Below is a simple way to overcome this problem:
#!/usr/bin/env python
s = [1, 4, 1, 4, 1, 4, 1, 1, 0, 1]
list_size = len(s)
i = 0
while i != list_size:
    if s[i] == 1:
        del s[i]
        list_size = len(s)
    else:
        i = i + 1
print(s)
Result:
[4, 4, 4, 0]
Conclusion: it is definitely not harmful to modify elements of a list during a loop iteration, as long as you don't 1) change the size of the list or 2) introduce your own computational side effects.
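For the size-changing case, a common alternative to the manual while loop above is to build a new, filtered list instead of deleting in place; a minimal sketch:
s = [1, 4, 1, 4, 1, 4, 1, 1, 0, 1]
s = [x for x in s if x != 1]  # keep everything that is not 1
print(s)                      # [4, 4, 4, 0]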
You could get the indices first:
idx = [el_idx for el_idx, el in enumerate(theList) if el.IsSomething()]
for i in idx:
    theList[i].SetIt(False)
Is there any pre-made optimized tool/library in Python to cut/slice lists for values "less than" something?
Here's the issue: Let's say I have a list like:
a=[1,3,5,7,9]
and I want to delete all the numbers which are <= 6, so the resulting list would be
[7,9]
6 is not in the list, so I can't use the built-in index(6) method of the list. I can do things like:
#!/usr/bin/env python
a = [1, 3, 5, 7, 9]
cut = 6
for i in range(len(a) - 1, -2, -1):
    if a[i] <= cut:
        break
b = a[i + 1:]
print("Cut list: %s" % b)
which would be a fairly quick method if the cut point is close to the end of the list, but which will be inefficient if it is close to the beginning (say I only want to delete the items which are <= 2: there will be a lot of iterations).
I can also implement my own find method using binary search or the like, but I was wondering whether there's a more general built-in library for handling this type of thing that I could reuse in other cases (for instance, if I need to delete all the numbers which are >= 6).
Thank you in advance.
You can use the bisect module to perform a sorted search:
>>> import bisect
>>> a[bisect.bisect_left(a, 6):]
[7, 9]
bisect.bisect_left is what you are looking for, I guess.
If you just want to filter the list for all elements that fulfil a certain criterion, then the most straightforward way is to use the built-in filter function.
Here is an example:
a_list = [10, 2, 3, 8, 1, 9]
# filter all elements smaller than 6
# (in Python 3, filter returns an iterator, so wrap it in list()):
filtered_list = list(filter(lambda x: x < 6, a_list))
the filtered_list will contain:
[2, 3, 1]
Note: this method does not rely on the list being ordered, so for very large lists a method optimised for ordered searching (such as bisect) may perform better in terms of speed.
Bisect left and right helper function
#!/usr/bin/env python3
import bisect
def get_slice(list_, left, right):
    return list_[
        bisect.bisect_left(list_, left):
        bisect.bisect_left(list_, right)
    ]
assert get_slice([0, 1, 1, 3, 4, 4, 5, 6], 1, 5) == [1, 1, 3, 4, 4]
Tested in Ubuntu 16.04, Python 3.5.2.
Adding to Jon's answer: if you need to actually delete the elements less than or equal to 6, and want to keep the same reference to the list rather than creating a new one:
del a[:bisect.bisect_right(a,6)]
You should note as well that bisect will only work on a sorted list.
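A quick demonstration with the list from the question:
>>> import bisect
>>> a = [1, 3, 5, 7, 9]
>>> del a[:bisect.bisect_right(a, 6)]
>>> a
[7, 9]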
Python does not support adding a tuple to a list:
>>> [1,2,3] + (4,5,6)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "tuple") to list
What are the disadvantages of providing such support in the language? Note that I would expect this to be symmetric: [1, 2] + (3, 4) and (1, 2) + [3, 4] would both evaluate to a brand-new list [1, 2, 3, 4]. My rationale is that once someone has applied operator + to a mix of tuples and lists, they are likely to do it again (very possibly in the same expression), so we might as well provide the list to avoid extra conversions.
Here's my motivation for this question.
It happens quite often that I have small collections that I prefer to store as tuples to avoid accidental modification and to help performance. I then need to combine such tuples with lists, and having to convert each of them to a list makes for very ugly code.
Note that += or extend may work in simple cases. But in general, when I have an expression
columns = default_columns + columns_from_user + calculated_columns
I don't know which of these are tuples and which are lists. So I either have to convert everything to lists:
columns = list(default_columns) + list(columns_from_user) + list(calculated_columns)
Or use itertools:
columns = list(itertools.chain(default_columns, columns_from_user, calculated_columns))
Both of these solutions are uglier than a simple sum; and the chain may also be slower (since it must iterate through the inputs an element at a time).
This is not supported because the + operator would have to work for both operand orders, and it is not obvious what return type you would expect. The Zen of Python includes the rule:
In the face of ambiguity, refuse the temptation to guess.
The following works, though:
a = [1, 2, 3]
a += (4, 5, 6)
There is no ambiguity about what type to use here.
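This works because += on a list behaves like extend and therefore accepts any iterable on the right-hand side, not just a tuple (a quick check):
>>> a = [1, 2, 3]
>>> a += (4, 5, 6)
>>> a += range(7, 9)
>>> a
[1, 2, 3, 4, 5, 6, 7, 8]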
Why doesn't Python support adding different types? The simple answer is that they are different types: what if you try to add an arbitrary iterable and expect a list out? I myself would rather get back another iterable. Also consider ['a', 'b'] + 'cd': what should the output be? Since explicit is better than implicit, all such implicit conversions are disallowed.
To overcome this limitation, use the extend method of list to add any iterable, e.g.
l = [1,2,3]
l.extend((4,5,6))
If you have to add many lists/tuples, write a function:
def adder(*iterables):
    l = []
    for i in iterables:
        l.extend(i)
    return l

print(adder([1, 2, 3], (3, 4, 5), range(6, 10)))
output:
[1, 2, 3, 3, 4, 5, 6, 7, 8, 9]
You can use the += operator, if that helps:
>>> x = [1,2,3]
>>> x += (1,2,3)
>>> x
[1, 2, 3, 1, 2, 3]
You can also use the list constructor explicitly, but like you mentioned, readability might suffer:
>>> list((1,2,3)) + list((1,2,3))
[1, 2, 3, 1, 2, 3]