Python List Help

I have a list of lists that looks like:
floodfillque = [[1,1,'e'], [1,2,'w'], [1,3,'e'], [2,1,'e'], [2,2,'e'], [2,3,'w']]
for each in floodfillque:
    if each[2] == 'w':
        floodfillque.remove(each)
    else:
        tempfloodfill.append(floodfillque[each[0+1][1]])
That is a simplified but, I think, relevant part of the code.
Does the floodfillque[each[0+1]] part do what I think it is doing and taking the value at that location and adding one to it or no? The reason why I ask is I get this error:
TypeError: 'int' object is unsubscriptable
And I think I am misunderstanding what that code is actually doing or doing it wrong.

In addition to the bug in your code that other answers have already spotted, you have at least one more:
for each in floodfillque:
    if each[2] == 'w':
        floodfillque.remove(each)
Don't add or remove items from the very container you're looping on. While such a bug will usually be diagnosed only for certain types of containers (not including lists), it's just as terrible for lists: it will end up altering your intended semantics by skipping some items or seeing some items twice.
If you can't substantially alter and enhance your logic (generally by building a new, separate container rather than mucking with the one you're looping on), the simplest workaround is usually to loop on a copy of the container you must alter:
for each in list(floodfillque):
Now, your additions and removals won't alter what you're actually looping on (because what you're looping on is a copy, a "snapshot", made once and for all at loop's start) so your semantics will work as intended.
Your specific approach to altering floodfillque also has a performance bug -- it behaves quadratically, while sound logic (building a new container rather than altering the original one) would behave linearly. But, that bug is harder to fix without refactoring your code from the current not-so-nice logic to the new, well-founded one.
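To make the difference concrete, here is a minimal sketch comparing the three approaches on a small list. The sample data and the names buggy, snapshot and rebuilt are made up for illustration; they are not from the original code.

```python
queue = [[1, 1, 'e'], [1, 2, 'w'], [1, 3, 'w'], [2, 1, 'e']]

# 1) Removing while looping on the same list: the item right after
#    each removed one is skipped, so [1, 3, 'w'] survives.
buggy = list(queue)
for row in buggy:
    if row[2] == 'w':
        buggy.remove(row)

# 2) Looping on a copy: every item is visited, but each remove()
#    still has to search the list, so the whole thing is quadratic.
snapshot = list(queue)
for row in list(snapshot):
    if row[2] == 'w':
        snapshot.remove(row)

# 3) Building a new list: one linear pass, nothing mutated mid-loop.
rebuilt = [row for row in queue if row[2] != 'w']

print(buggy)     # [[1, 1, 'e'], [1, 3, 'w'], [2, 1, 'e']]
print(snapshot)  # [[1, 1, 'e'], [2, 1, 'e']]
print(rebuilt)   # [[1, 1, 'e'], [2, 1, 'e']]
```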

Here's what's happening:
On the first iteration of the loop, each is [1, 1, 'e']. Since each[2] != 'w', the else is executed.
In the else, you take each[0+1][1], which is the same as (each[0+1])[1]. each[0+1] is 1, and so you are doing (1)[1]. int objects can't be indexed, which is what's raising the error.

"Does the floodfillque[each[0+1]] part do what I think it is doing and taking the value at that location and adding one to it or no?"
No, it sounds like you want each[0] + 1.
Either way, the error you're getting is because you're trying to take the second item of an integer... each[0+1][1] resolves to each[1][1] which might be something like 3[1], which doesn't make any sense.
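A small sketch of the difference, using the first sublist from the question. The variable names second_item, error_message and incremented are only for illustration.

```python
each = [1, 1, 'e']

# 0 + 1 is evaluated first, so each[0 + 1] is simply each[1].
second_item = each[0 + 1]

# Indexing that integer again is what raises the TypeError.
try:
    each[0 + 1][1]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Adding one to the value stored at index 0 looks like this instead.
incremented = each[0] + 1

print(second_item)    # 1
print(error_message)  # exact wording depends on the Python version
print(incremented)    # 2
```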

The other posters are correct. However, there is another bug in this code, which is that you are modifying floodfillque as you are iterating over it. This will cause problems, because Python internally maintains a counter to handle the loop, and deleting elements does not modify the counter.
The safe way to do this is to iterate over a copy of the list:
for each in floodfillque[ : ]:
([ : ] is Python's notation for a copy.)

Here is how I understand NoahClark's intentions:
Remove those sublists whose third element is 'w'
For the remaining sublists, add 1 to the second item
If this is the case, the following will do:
# Here is the original list
floodfillque = [[1,1,'e'], [1,2,'w'], [1,3,'e'], [2,1,'e'], [2,2,'e'], [2,3,'w']]
# Remove those sublists which have 'w' as the third element
# For the rest, add one to the second element
floodfillque = [[a,b+1,c] for a,b,c in floodfillque if c != 'w']
This solution works fine, although it creates a new list instead of patching up the original one, so it uses some extra memory.

Related

I have a list with 7 elements in Python, but the len operator returns a length of 1

New to Python, this is my first application. I've been staring at this a while, and I'm sure I have some fundamental misunderstanding about what's going on.
In this example I have a list of 7 str (entries), and an assignment statement:
listLen = len(entries)
followed by a breakpoint. At that breakpoint the debugger shows listLen assigned a value of 1, while entries is shown as a {list: 7}.
I'd expect len(entries) to return a value of 7, but I can't seem to get the expected behavior. What am I missing?
UPDATE: I thought the answer was in the for loop modifying the list but apparently not.
If I set a breakpoint before assigning entries and single step through with the debugger including the for loop everything looks good and works.
If I set a breakpoint ON the for loop and single step once, entries again appears to be a {list: 7} but the len(entries) appears to be 1. The for loop executes one loop and exits.
The deep copy entriesCopy I made for debug is used nowhere else, and gets changed to [''], but I assume that since it's not used it gets optimized out or garbage collected, though it doesn't when single-stepping from an earlier breakpoint.
After breaking on the 'for' loop and single stepping once to the beginning of the 'while' loop:
Why would single stepping through the code work fine, but breaking at the for loop cause len(entries) to be wrong?
Single stepping from earlier breakpoint works fine, and the program returns the correct result:
I'm still struggling to get a minimum reproducible sample of code.
Here's more of the code:
entries = self.userQuery.getEntries()
entriesCopy = copy.deepcopy(self.userQuery.getEntries())
entryList = list()
listLen = len(entries)
for ii in range(0, listLen):
    while "\n\n" in entries[ii]: entries[ii] = entries[ii].replace("\n\n", "\n")  # strip double newlines
    while "\t" in entries[ii]: entries[ii] = entries[ii].replace("\t", "")  # strip tabs
    entryList = entries[ii].split("\n")
    while "" in entryList: entryList.remove('')
    self.SCPIDictionary[self.instructions[ii][1].replace("\n", "")] = entryList

Look a little higher in your debug output: you can see on line 42 entries: ['']
I can't read the code in your for loop so I don't know what's happening, but you seem to be modifying the list in there. If you use the "hover" to look at the value, you get the current value of that variable. You set the breakpoint on the "for" part of the loop; try setting it on the first line of the loop and on the line before the loop, and watch for that entries list to get mutated.
--- edit ---
You provided more code. It's... kind of insane. Why are you modifying the "entries" object repeatedly in while loops? Then you copy the entry into another object, and then replace a value in some dictionary with the entry you just copied (with the key determined after running string transformations on a matrix dictionary?)
Two things:
To debug this, I am concerned about the types. Does "getEntries" actually return a list of strings, or is it a result proxy or something similar? SQLAlchemy, for example, does not actually return a list. The Python debugger is great, but you're doing so much mutation here that print statements will serve you better: do print(entries) after every line. That will let you see when things are changing, and at least how many times your loop is executing. If it is something like a result proxy, then after you have finished iterating over it there may just not be anything left in there when you look at it in the debugger.
Consider this: instead of modifying all these mutable objects, pull out the values and modify those. As a rough draft:
for entry in entries:
    values = []
    for val in entry.replace("\n\n", "\n").replace("\t", "").split("\n"):
        if val:
            values.append(val)
    self.SCPIDictionary[something] = values  # "something" is a placeholder for the real key
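The same idea as a self-contained sketch that can be run outside the class. The helper name clean_entry and the sample strings are assumptions for illustration; the real entries come from getEntries(), whose return type is not known here.

```python
def clean_entry(text):
    """Return the non-empty lines of text with all tab characters removed."""
    # Dropping empty strings after the split also takes care of
    # doubled (or tripled) newlines, so no while loop is needed.
    return [line for line in text.replace("\t", "").split("\n") if line]

entries = ["a\n\n\tb\n", "", "c"]

# Build new values instead of mutating entries in place.
cleaned = [clean_entry(entry) for entry in entries]

print(cleaned)  # [['a', 'b'], [], ['c']]
print(entries)  # the original list is left unchanged
```

Because entries is never written to, printing it before and after the loop should show the same seven strings, which makes it easier to tell whether something else is mutating it.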

Python delete multiple element(s) in list if in another list

I have two arrays. If an element exists in the array received from a client, the matching element should be deleted from the other array. This works when the client array has just a single element but not when it has more than one.
This is the code:
projects = ['xmas','easter','mayday','newyear','vacation']
for i in self.get_arguments('del[]'):
    try:
        if i in projects:
            print 'PROJECTS', projects
            print 'DEL', self.get_arguments('del[]')
            projects.remove(i)
    except ValueError:
        pass
self.get_arguments('del[]') returns an array from the client side in the format:
[u'xmas , newyear, mayday']
So it is read as one element rather than three, because only one unicode string is present.
How can I get this to delete multiple elements?
EDIT: I've had to make the list into one with several individual elements.

How about filter?
projects = filter(lambda a: a not in self.get_arguments('del[]'), projects)

Could try something uber pythonic like a list comprehension:
new_list = [i for i in projects if i not in array_two]
You'd have to overwrite your original projects, which isn't the most elegant, but this should work.

The reason this doesn't work is that remove just removes the first element that matches. You could fix that by just repeatedly calling remove until it doesn't exist anymore—e.g., by changing your if to a while, like this:
while i in projects:
    print 'PROJECTS', projects
    print 'DEL', self.get_arguments('del[]')
    projects.remove(i)
But in general, using remove is a bad idea—especially when you already searched for the element. Now you're just repeating the search so you can remove it. Besides the obvious inefficiency, there are many cases where you're going to end up trying to delete the third instance of i (because that's the one you found) but actually deleting the first instead. It just makes your code harder to reason about. You can improve both the complexity and the efficiency by just iterating over the list once and removing as you go.
But even this is overly complicated—and still inefficient, because every time you delete from a list, you're moving all the other elements of the list. It's almost always simpler to just build a new list from the values you want to keep, using filter or a list comprehension:
arguments = set(self.get_arguments('del[]'))
projects = [project for project in projects if project not in arguments]
Making arguments into a set isn't essential here, but it's conceptually cleaner (you don't care about the order of the arguments, or need to retain any duplicates) and it's more efficient: a set tests membership in roughly constant time on average instead of comparing against each element.
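Since the question shows the client sending a single comma-separated string, the names probably need to be split apart before any of the filtering above can match. A minimal sketch, assuming that format; parse_names is a hypothetical helper and raw_arguments stands in for the value returned by self.get_arguments('del[]').

```python
def parse_names(raw_arguments):
    """Split each comma-separated string into stripped, non-empty names."""
    names = set()
    for raw in raw_arguments:
        for name in raw.split(','):
            name = name.strip()
            if name:
                names.add(name)
    return names

projects = ['xmas', 'easter', 'mayday', 'newyear', 'vacation']
raw_arguments = ['xmas , newyear, mayday']  # one string, as in the question

to_delete = parse_names(raw_arguments)
projects = [project for project in projects if project not in to_delete]

print(projects)  # ['easter', 'vacation']
```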

Difference between two "contains" operations for python lists

I'm fairly new to python and have found that I need to query a list about whether it contains a certain item.
The majority of the postings I have seen on various websites (including this similar stackoverflow question) have all suggested something along the lines of
for i in list:
    if i == thingIAmLookingFor:
        return True
However, I have also found from one lone forum that
if thingIAmLookingFor in list:
    # do work
works.
I am wondering whether the if thing in list method is shorthand for the for i in list method, or whether it is implemented differently.
I would also like to know which, if either, is preferred.

In your simple example it is of course better to use in.
However... in the question you link to, in doesn't work (at least not directly) because the OP does not want to find an object that is equal to something, but an object whose attribute n is equal to something.
One answer does mention using in on a list comprehension, though I'm not sure why a generator expression wasn't used instead:
if 5 in (data.n for data in myList):
    print "Found it"
But this is hardly much of an improvement over the other approaches, such as this one using any:
if any(data.n == 5 for data in myList):
    print "Found it"

The "if x in thing:" format is strongly preferred, not just because it takes less code, but because it also works on other data types and is (to me) easier to read.
I'm not sure how it's implemented, but I'd expect it to be quite a lot more efficient on data types that are stored in a more searchable form, e.g. sets or dictionary keys.

The if thing in somelist is the preferred and fastest way.
Under-the-hood that use of the in-operator translates to somelist.__contains__(thing) whose implementation is equivalent to: any((x is thing or x == thing) for x in somelist).
Note the condition tests identity and then equality.
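The identity check is visible with a value that is not equal to itself, such as NaN. A short sketch (the variable names are only for illustration):

```python
nan = float('nan')

# NaN never compares equal, not even to itself...
equal_to_itself = (nan == nan)

# ...but the very same object is still found, thanks to the identity test.
same_object_found = nan in [nan]

# A different NaN object is neither identical nor equal, so it is not found.
other_object_found = float('nan') in [nan]

# Ordinary values fall through to the equality test: 1.0 == 1.
equal_value_found = 1.0 in [1]

print(equal_to_itself, same_object_found, other_object_found, equal_value_found)
# False True False True
```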

for i in list:
    if i == thingIAmLookingFor:
        return True
The above is a terrible way to test whether an item exists in a collection. It returns True from the function, so if you need the test as part of some code you'd need to move this into a separate utility function, or add thingWasFound = False before the loop and set it to True in the if statement (and then break), either of which is several lines of boilerplate for what could be a simple expression.
Plus, if you just use thingIAmLookingFor in list, this might execute more efficiently by doing fewer Python level operations (it'll need to do the same operations, but maybe in C, as list is a builtin type). But even more importantly, if list is actually bound to some other collection like a set or a dictionary thingIAmLookingFor in list will use the hash lookup mechanism such types support and be much more efficient, while using a for loop will force Python to go through every item in turn.
Obligatory post-script: list is a terrible name for a variable that contains a list as it shadows the list builtin, which can confuse you or anyone who reads your code. You're much better off naming it something that tells you something about what it means.

Easy way to delete elements from a list in Python [duplicate]

Possible Duplicate:
Removing an element from a list based on a predicate
Suppose I have a list and I want to delete from it the elements that satisfy a condition. How can I implement this more easily?
I tried with:
for i in range(len(list)):
    if [condition]:
        del(list[i]);
Obviously it does not work. The only solution I can think of is to shift elements to the left to overwrite the element I want to delete, and then delete the last element. Is there a faster solution?

How about using a list comprehension:
mylist = [x for x in mylist if not condition(x)]

The simplest way is to create a copy of the list using filter:
list_removed = filter(lambda item: not condition(item), list)

I recommend @Space_C0wb0y's solution; however, for completeness I want to point out that
for i in range(len(lst)-1, -1, -1):
    if (condition):
        del lst[i]
works properly.
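A small sketch of why the direction matters. The helper names delete_forward, delete_backward and is_even are made up for illustration, and the condition is passed in as a function.

```python
def delete_forward(items, condition):
    # Mirrors the loop from the question: the index range is fixed up
    # front, but the list shrinks, so a later index falls off the end.
    for i in range(len(items)):
        if condition(items[i]):
            del items[i]

def delete_backward(items, condition):
    # Deleting at index i only shifts elements at higher indices,
    # and those have already been visited.
    for i in range(len(items) - 1, -1, -1):
        if condition(items[i]):
            del items[i]

def is_even(n):
    return n % 2 == 0

forward = [1, 2, 3, 4]
try:
    delete_forward(forward, is_even)
    forward_failed = False
except IndexError:
    forward_failed = True

backward = [1, 2, 3, 4]
delete_backward(backward, is_even)

print(forward_failed)  # True
print(backward)        # [1, 3]
```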

If you need to modify the list in-place (so that the suggestions others have made of filter or list comprehension won't help you) then:
1. You can avoid the outright failure of the code you give by processing elements in reverse order, so that deleting one doesn't affect the numbering of the ones processed later.
2. Shifting the elements around to put the "dead" ones at the end is almost certainly not worth it, but if you do it you can get one more little optimization by doing a single deletion at the end to remove all the dead elements, rather than removing each one as you see it. (The gain from this is likely to be small. Deleting the last element of a list is cheap unless it happens to trigger an actual resize, which by design it doesn't do very often.)
3. If it happens that you're deleting a lot of elements -- a substantial fraction of all the elements in the list -- then the "not worth it"s in 2 above are less obvious and you should benchmark it both ways.
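For reference, the "shift, then delete once" idea could look roughly like the sketch below. It is one possible reading of that suggestion, not a benchmarked recommendation; remove_in_place and should_remove are hypothetical names.

```python
def remove_in_place(items, should_remove):
    """Compact the kept elements to the front, then trim the tail once."""
    write = 0
    for item in items:
        if not should_remove(item):
            # write never runs ahead of the read position, so this
            # only overwrites slots that have already been read.
            items[write] = item
            write += 1
    # A single deletion removes all the leftover "dead" slots.
    del items[write:]
    return items

data = [5, 12, 7, 30, 8]
alias = data  # a second reference, to show the same list object is modified
remove_in_place(data, lambda n: n >= 10)

print(data)           # [5, 7, 8]
print(alias is data)  # True
```

If keeping the same list object is the only requirement, the slice assignment lst[:] = [x for x in lst if not condition(x)] is a shorter alternative, at the cost of building a temporary list.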

List Comprehensions: References to the Components

In sum: I need to write a List Comprehension in which i refer to the list that is being created by the List Comprehension.
This might not be something you need to do every day, but i don't think it's unusual either.
Maybe there's no answer here--still, please don't tell me i ought to use a for loop. That might be correct, but it's not helpful. The reason is the problem domain: this line of code is part of an ETL module, so performance is relevant, and so is the need to avoid creating a temporary container--hence my wish to code this step in a L/C. If a for loop would work for me here, i would just code one.
In any event, i am unable to write this particular list comprehension. The reason: the expression i need to write has this form:
[ some_function(s) for s in raw_data if s not in this_list ]
In that pseudo-code, "this_list" refers to the list created by evaluating that list comprehension. And that's why i'm stuck--because this_list isn't built until my list comprehension is evaluated, and because this list isn't yet built by the time i need to refer to it, i don't know how to refer to it.
What i have considered so far (and which might be based on one or more false assumptions, though i don't know exactly where):
doesn't the python interpreter have to give this list-under-construction a name? i think so
that temporary name is probably taken from some bound method used to build my list ('sum'?)
but even if i went to the trouble to find that bound method and assuming that it is indeed the temporary name used by the python interpreter to refer to the list while it is under construction, i am pretty sure you can't refer to bound methods directly; i'm not aware of such an explicit rule, but those methods (at least the few that i've actually looked at) are not valid python syntax. I'm guessing one reason why is so that we do not write them into our code.
so that's the chain of my so-called reasoning, and which has led me to conclude, or at least guess, that i have coded myself into a corner. Still i thought i ought to verify this with the Community before turning around and going a different direction.

There used to be a way to do this using the undocumented fact that while the list was being built its value was stored in a local variable named _[1].__self__. However that quit working in Python 2.7 (maybe earlier, I wasn't paying close attention).
You can do what you want in a single list comprehension if you set up an external data structure first. Since all your pseudo code seemed to be doing with this_list was checking it to see if each s was already in it -- i.e. a membership test -- I've changed it into a set named seen as an optimization (checking for membership in a list can be very slow if the list is large). Here's what I mean:
raw_data = [c for c in 'abcdaebfc']
seen = set()
def some_function(s):
    seen.add(s)
    return s
print [ some_function(s) for s in raw_data if s not in seen ]
# ['a', 'b', 'c', 'd', 'e', 'f']
If you don't have access to some_function, you could put a call to it in your own wrapper function that added its return value to the seen set before returning it.
Even though it wouldn't be a list comprehension, I'd encapsulate the whole thing in a function to make reuse easier:
def some_function(s):
    # do something with or to 's'...
    return s
def add_unique(function, data):
    result = []
    seen = set(result) # init to empty set
    for s in data:
        if s not in seen:
            t = function(s)
            result.append(t)
            seen.add(t)
    return result
print add_unique(some_function, raw_data)
# ['a', 'b', 'c', 'd', 'e', 'f']
In either case, I find it odd that the list being built in your pseudo code that you want to reference isn't comprised of a subset of raw_data values, but rather the result of calling some_function on each of them -- i.e. transformed data -- which naturally makes one wonder what some_function does such that its return value might match an existing raw_data item's value.

I don't see why you need to do this in one go. Either iterate through the initial data first to eliminate duplicates - or, even better, convert it to a set as KennyTM suggests - then do your list comprehension.
Note that even if you could reference the "list under construction", your approach would still fail because s is not in the list anyway - the result of some_function(s) is.

As far as I know, there is no way to access a list comprehension as it's being built.
As KennyTM mentioned (and if the order of the entries is not relevant), then you can use a set instead. If you're on Python 2.7/3.1 and above, you even get set comprehensions:
{ some_function(s) for s in raw_data }
Otherwise, a for loop isn't that bad either (although it will scale terribly, because each membership test scans the whole list):
l = []
for s in raw_data:
    item = some_function(s)
    if item not in l:
        l.append(item)
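If the order of first appearance matters and the transformed values are hashable, one way to avoid the slow list membership test on Python 3.7 or later is to let a dict do the de-duplication, since dicts keep insertion order there. This is a sketch under those assumptions; some_function below is a stand-in (it just upper-cases its argument), not the asker's real function.

```python
def some_function(s):
    # Stand-in transformation, for illustration only.
    return s.upper()

raw_data = list('abcdaebfc')

# dict.fromkeys keeps the first occurrence of each key, in order,
# and each key lookup is a hash lookup rather than a list scan.
unique_in_order = list(dict.fromkeys(some_function(s) for s in raw_data))

print(unique_in_order)  # ['A', 'B', 'C', 'D', 'E', 'F']
```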

Why don't you simply do: [ some_function(s) for s in set(raw_data) ]
That should do what you are asking for. Except when you need to preserve the order of the previous list.
