python for loop list plus one item - python

I'm handling mouse clicks on objects based on the location of the object on the screen. I record the xy coord of the mouse click and see if it matches any of the objects that are allowed to be clicked on. The objects are in different lists or just single objects, but I want them in one big list so I can just loop through the whole thing once, if the click is on one of the objects, do the work, break. You can only click on one object.
First method: how i'm doing it now:
list = [obj1, obj2, obj3]
singleobj
copylist = list
copylist.append(singleobj)
for item in copylist:
    if item.pos == mouseclick.pos:
        doWork(item)
        break
Second method: I'd rather do something like below, but obviously the list+singleobj is not valid:
for item in list+singleobj:
    if item.pos == mouseclick.pos:
        doWork(item)
        break
Third method: Or if I absolutely had to, I could do this horrible terrible code:
list = [obj1, obj2, obj3]
foundobj = None
for item in list:
    if item.pos == mouseclick.pos:
        foundobj = item
        break
if foundobj is None:
    if singleobj.pos == mouseclick.pos:
        foundobj = singleobj
#possibly repeated several times here....
if foundobj is not None:
    doWork(foundobj)
The first method seems slow because I have to copy all of the (possibly many) lists and single objects into one list.
The second method seems ideal because it's compact and easy to maintain. Although, as it stands now it's simply pseudo code.
The third method is bulky and clunky.
Which method should I use? If the second, can you give the actual code?

For 2nd method you need itertools.chain
for item in itertools.chain(list, [singleobj]):
    ...

DrTyrsa's answer is what I was about to say about your precise question, but let me give you a few other tips:
copylist = list
copylist.append(singleobj)
This does not create a copy of the list: copylist and list are two names for the same object, so the append also changes the original. You might want copylist = lst[:] or copylist = list(lst) instead (I changed the name to lst here, because list is a built-in, you know).
about your second method, you can do:
for item in list + [singleobj]:
    ...
and if you are going to use itertools, a tiny improvement is to use a tuple rather than a list to hold the extra object, as it's a little bit more lightweight:
for item in itertools.chain(list, (singleobj,)):
    ...
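A minimal runnable sketch of the chain approach, for illustration only. The Obj type, the positions, and the find_clicked helper are hypothetical stand-ins for the asker's objects; they are not part of the original code.

```python
import itertools
from collections import namedtuple

# Hypothetical stand-in for the asker's clickable objects.
Obj = namedtuple("Obj", ["name", "pos"])

def find_clicked(click_pos, *groups):
    """Return the first object whose pos equals click_pos, or None."""
    # chain.from_iterable walks each group in turn without copying any of them.
    for item in itertools.chain.from_iterable(groups):
        if item.pos == click_pos:
            return item
    return None

buttons = [Obj("ok", (10, 10)), Obj("cancel", (50, 10))]
single = Obj("logo", (0, 0))

# A lone object is wrapped in a one-element tuple so it can be chained.
hit = find_clicked((0, 0), buttons, (single,))
miss = find_clicked((99, 99), buttons, (single,))
```

Returning from a helper takes the place of the break in the loop, and the caller can call doWork on the result if it is not None.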
The other thing is that you should not have to loop over your objects to see if the coordinates match: you can index them by their boundaries (in something like a BSP tree or a quadtree) and do a faster lookup.

Related

How to search for strings within nested lists

One of the questions for an assignment I'm doing involves looking within a nested list consisting of "an ultrashort story and its author" to find a string that was entered by a user. I'm not too sure how to go about this; the assignment brief is below if anyone would like more clarification. There are also more questions I'm not too sure about, e.g. "find all stories by a certain author". Some explanations, or a pointer in the right direction, would be greatly appreciated :)
list = []
mylist = [['a','b','c'],['d','e','f']]
string = input("String?")
if string in [elem for sublist in mylist for elem in sublist] == True:
    list.append(elem)
This is just an example of something I've tried; the list above is similar enough to the one I'm actually using for the question. I've been going through different methods of iterating over nested lists and adding matching items to another list. The code above is just one example of an attempt I've made at this process.
""" the image above states that the data is in the
form of an list of sublists, with each sublist containing
two strings
"""
stories = [
    ['story string 1', 'author string 1'],
    ['story string 2', 'author string 2']
]
""" find stories that contain a given string
"""
stories_with_substring = []
substring = 'some string' # search string
for story, author in stories:
    # if the substring is not in the story, a ValueError is raised
    try:
        story.index(substring)
        stories_with_substring.append((story, author))
    except ValueError:
        continue
""" find stories by a given author
"""
stories_by_author = []
target_author = 'first last'
for story, author in stories:
    if author == target_author:
        stories_by_author.append((story, author))
This line here
for story, author in stories:
'Unpacks' the array. It's equivalent to
for pair in stories:
    story = pair[0]
    author = pair[1]
Or to go even further:
i = 0
while i < len(stories):
    pair = stories[i]
    story = pair[0]
    author = pair[1]
    i += 1 # advance to the next pair, otherwise the loop never ends
I'm sure you can see how useful this is when dealing with lists that contain lists/tuples.
You may need to call .lower() on some of the strings if you want the search to be case insensitive
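As a rough sketch of the case-insensitive variant, both searches can be written with .lower() applied to each side of the comparison. The sample stories and the helper names stories_containing and stories_by are made up for illustration.

```python
# Made-up sample data in the same [story, author] shape as above.
stories = [
    ["The Last Train Home", "Ann Lee"],
    ["A quiet train ride", "Bob Ray"],
    ["Nothing to see", "Ann Lee"],
]

def stories_containing(stories, substring):
    """Return (story, author) pairs whose story contains substring, ignoring case."""
    needle = substring.lower()
    return [(story, author) for story, author in stories
            if needle in story.lower()]

def stories_by(stories, target_author):
    """Return (story, author) pairs written by target_author, ignoring case."""
    wanted = target_author.lower()
    return [(story, author) for story, author in stories
            if author.lower() == wanted]

matches = stories_containing(stories, "TRAIN")
by_ann = stories_by(stories, "ann lee")
```

Here the in operator replaces the try/index pattern; both test for a substring, but in avoids the exception handling.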
You can do a few things here. Your example showed the use of a list comprehension, so let's focus on some other aspects of this problem.
Recursion
You can define a function that iterates through all the items in the top level list. Assuming you know for sure all items are either strings or more lists, you can use type() to check if each item is another list, or is a string. If it's a string, do your search; if it's a list, have your function call itself. Let's look at an example. Please note that we should never use variables named list or string: list is a built-in type and string is a standard library module, and we don't want to accidentally shadow them!
mylist = [['a','b','c'],['d','e','f']]
def find_nested_items(my_list, my_input):
    results = []
    # iterate over the parameter my_list, not the global mylist,
    # otherwise the recursion never reaches the nested items
    for i in my_list:
        if type(i) == list:
            items = find_nested_items(i, my_input)
            results += items
        elif my_input in i:
            results.append(i)
    return results
We're doing a few things here:
Creating an empty list named results
Iterating through the top level items of my_list
If one of those items is another list, we have our function call itself - at some point this will trigger the condition where an item is not a list, and will eventually return the results from that. For now, we assume the results we're getting back are going to be correct, so we concatenate those results to our top level results list
If the item is not a list, we simply check for the existence of our input and if so, add it to our results list
This kind of recursion is typically very safe, because it's inherently limited by our data structure. It can't run forever unless the data structure itself is infinitely deep.
Generators
Next, let's look at a much cooler feature of Python: generators (the yield from syntax used below requires Python 3.3 or later). Right now, we're doing all the work of collecting the results in one go. If we later on want to iterate through those results, we need to iterate over them separately.
Instead of doing that, we can define a generator. This works almost the same, practically speaking, but instead of collecting the results in one loop and then using them in a second, we can collect and use each result all within a single loop. A generator "yields" a value, then stops until it is called the next time. Let's modify our example to make it a generator:
mylist = [['a','b','c'],['d','e','f']]
def find_nested_items(my_list, my_input):
    # iterate over the parameter my_list, not the global mylist
    for i in my_list:
        if type(i) == list:
            yield from find_nested_items(i, my_input)
        elif my_input in i:
            yield i
You'll notice this version is a fair bit shorter. There's no need to hold items in a temporary list - each item is "yielded", which means it's passed directly to the caller to use immediately, and the caller will stop our generator until it needs the next value.
yield from basically does the same recursion, it simply sets up a generator within a generator to return those nested items back up the chain to the caller.
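To see the generator in use, here is a small self-contained sketch. The nested sample data is invented, and isinstance is used in place of type() == list purely as a stylistic assumption; the behaviour on plain lists is the same.

```python
def find_nested_items(my_list, my_input):
    """Yield every string in an arbitrarily nested list that contains my_input."""
    for item in my_list:
        if isinstance(item, list):
            # Delegate to the nested list and pass its results up the chain.
            yield from find_nested_items(item, my_input)
        elif my_input in item:
            yield item

# Invented sample data with two levels of nesting.
nested = [["apple", "banana"], ["cherry", ["grape", "pineapple"]]]

# Take only the first match; the rest of the structure is never searched.
first = next(find_nested_items(nested, "apple"))

# Or collect every match into a list when all of them are needed.
all_matches = list(find_nested_items(nested, "apple"))
```

Calling next() on the generator is where the laziness pays off: the search stops as soon as one match has been produced.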
These are some good techniques to try - please give them a go!

Search tuple elements within in list

I have a list in Python as
list_data = [('a','b',5),('aa','bb',50)]
and some variables:
a = ('a','b','2')
c = ('aaa','bbb','500')
Now how can I search if a is already there in list_data?
If yes add 2 to the value of a, if not append to list_data?
The result should be as
list_data = [('a','b',7),('aa','bb',50),('aaa','bbb','500')]
Actually, this question is a good way to demonstrate several Pythonic ways of doing things. So let's see what we can do.
In order to check if something is in python list you can just use operator in:
if a in list_data:
    do_stuff()
But what you ask is a bit different. You want to do something like a search by multiple keys, if I understand correctly. In this case you can 'trim' your tuple by discarding last entry.
Slicing is handy for this:
value_trimmed = value[:-1]
Now you can make a list of trimmed tuples:
list_trimmed = []
for item in list_data: # use a new name so the query tuple a is not overwritten
    list_trimmed.append(item[:-1])
And then search there:
if a[:-1] in list_trimmed:
    do_smth()
This list can be constructed in a less verbose way using a list comprehension:
list_trimmed = [item[:-1] for item in list_data]
To find where your item exactly is you can use index() method of list:
list_trimmed.index(a[:-1])
This will return the index of the first occurrence of a[:-1] in list_trimmed, or raise ValueError if it can't be found. We can avoid explicitly checking whether the item is in the list, and do the insertion only if the exception is caught.
Your full code will look like this:
list_data = [('a','b',5), ('aa','bb',50)]
values_to_find = [('a','b','2'), ('aaa','bbb','500')]
list_trimmed = [item[:-1] for item in list_data]
for val in values_to_find:
    val_trimmed = val[:-1]
    try:
        ind = list_trimmed.index(val_trimmed)
        src_tuple = list_data[ind]
        # we can't edit a tuple in place, since tuples are immutable in Python
        list_data[ind] = (src_tuple[0], src_tuple[1], src_tuple[2]+2)
    except ValueError:
        list_data.append(val)
print(list_data)
Of course, if speed or memory efficiency is your main concern, this code is not very appropriate, but you haven't mentioned these in your question, and that is not what Python is really about, in my opinion.
Edit:
You haven't specified what happens when you check for ('aaa','bbb','500') second time - should we use the updated list and increment matching tuple's last element, or should we stick to the original list and insert another copy?
If we use updated list, it is not clear how to handle incrementing string '500' by 2 (we can convert it to integer, but you should have constructed your query appropriately in the first place).
Or maybe you meant add last element of tuple being searched to the tuple in list if found ? Please edit your question to make it clear.
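If the last interpretation is the intended one, a dictionary keyed on the first two elements is one possible way to avoid the repeated index() scans. This is only a sketch under two assumptions that the question does not confirm: the last element of the searched tuple is the amount to add, and it is converted to int before being stored. The name index_by_key is made up.

```python
list_data = [("a", "b", 5), ("aa", "bb", 50)]
values_to_find = [("a", "b", "2"), ("aaa", "bbb", "500")]

# Map the (first, second) key of each tuple to its position in list_data.
index_by_key = {item[:-1]: i for i, item in enumerate(list_data)}

for val in values_to_find:
    # Assumption: the last element is a numeric string holding the amount to add.
    key, amount = val[:-1], int(val[-1])
    if key in index_by_key:
        i = index_by_key[key]
        # Tuples are immutable, so build a replacement with the new total.
        list_data[i] = key + (list_data[i][-1] + amount,)
    else:
        # Record the new key so a later duplicate would be found too.
        index_by_key[key] = len(list_data)
        list_data.append(key + (amount,))
```

Unlike the version above, a key that is appended here is also registered in the lookup, so searching for it a second time increments the existing entry instead of inserting another copy.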

Python Function Not Working

I am trying to create a function, new_function, that takes a number as an argument.
This function will manipulate values in a list based on what number I pass as an argument. Within this function, I will place another function, new_sum, that is responsible for manipulating values inside the list.
For example, if I pass 4 into new_function, I need new_function to run new_sum on each of the first four elements. The corresponding value will change, and I need to create four new lists.
example:
listone=[1,2,3,4,5]
def new_function(value):
for i in range(0,value):
new_list=listone[:]
variable=new_sum(i)
new_list[i]=variable
return new_list
# running new_function(4) should return four new lists
# [(new value for index zero, based on new_sum),2,3,4,5]
# [1,(new value for index one, based on new_sum),3,4,5]
# [1,2,(new value for index two, based on new_sum),4,5]
# [1,2,3,(new value for index three, based on new_sum),5]
My problem is that i keep on getting one giant list. What am I doing wrong?
Fix the indentation of return statement:
listone=[1,2,3,4,5]
def new_function(value):
for i in range(0,value):
new_list=listone[:]
variable=new_sum(i)
new_list[i]=variable
return new_list
The problem with return new_list is that once you return, the function is done.
You can make things more complicated by accumulating the results and returning them all at the end:
listone=[1,2,3,4,5]
def new_function(value):
    new_lists = []
    for i in range(0,value):
        new_list=listone[:]
        variable=new_sum(i)
        new_list[i]=variable
        new_lists.append(new_list)
    return new_lists
However, this is exactly what generators are for: if you yield instead of return, that gives the caller one value, and the function then resumes when the caller asks for the next value. So:
listone=[1,2,3,4,5]
def new_function(value):
    for i in range(0,value):
        new_list=listone[:]
        variable=new_sum(i)
        new_list[i]=variable
        yield new_list
The difference is that the first version gives the caller a list of four lists, while the second gives the caller an iterator of four lists. Often, you don't care about the difference—and, in fact, an iterator may be better for responsiveness, memory, or performance reasons.*
If you do care, it often makes more sense to just make a list out of the iterator at the point you need it. In other words, use the second version of the function, then just write:
new_lists = list(new_function(4))
By the way, you can simplify this by not trying to mutate new_list in-place, and instead just change the values while copying. For example:
def new_function(value):
    for i in range(value):
        yield listone[:i] + [new_sum(i)] + listone[i+1:]
* Responsiveness is improved because you get the first result as soon as it's ready, instead of only after they're all ready. Memory use is improved because you don't need to keep all of the lists in memory at once, just one at a time. Performance may be improved because interleaving the work can result in better cache behavior and pipelining.
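Putting the generator version together into something runnable might look like the sketch below. The question never shows new_sum, so the body used here (multiplying the element by 10) is a pure placeholder.

```python
listone = [1, 2, 3, 4, 5]

def new_sum(i):
    # Hypothetical placeholder: the real new_sum is not shown in the question.
    return listone[i] * 10

def new_function(value):
    """Yield one modified copy of listone for each of the first value indexes."""
    for i in range(value):
        # Build the copy with element i replaced, leaving listone untouched.
        yield listone[:i] + [new_sum(i)] + listone[i + 1:]

# Materialise the iterator only at the point where a real list is needed.
new_lists = list(new_function(4))
```

Each yielded list is a fresh object, so changing one of the results does not affect listone or the other results.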

What is the fastest way to add data to a list without duplication in python (2.5)

I have about half a million items that need to be placed in a list. I can't have duplicates, and if an item is already there I need to get its index. So far I have
if Item in List:
    ItemNumber=List.index(Item)
else:
    List.append(Item)
    ItemNumber=List.index(Item)
The problem is that as the list grows it gets progressively slower until at some point it just isn't worth doing. I am limited to python 2.5 because it is an embedded system.
You can use a set (in CPython since version 2.4) to efficiently look up duplicate values. If you really need an indexed system as well, you can use both a set and list.
Doing your lookups using a set will remove the overhead of if Item in List, but not that of List.index(Item)
Please note ItemNumber=List.index(Item) will be very inefficient to do after List.append(Item). You know the length of the list, so your index can be retrieved with ItemNumber = len(List)-1.
To completely remove the overhead of List.index (because that method will search through the list - very inefficient on larger sets), you can use a dict mapping Items back to their index.
I might rewrite it as follows:
# earlier in the program, NOT inside the loop
Dup = {}
# inside your loop to add items:
if Item in Dup:
    ItemNumber = Dup[Item]
else:
    List.append(Item)
    Dup[Item] = ItemNumber = len(List)-1
If you really need to keep the data in an array, I'd use a separate dictionary to keep track of duplicates. This requires twice as much memory, but won't slow down significantly.
existing = dict()
if Item in existing:
    ItemNumber = existing[Item]
else:
    ItemNumber = existing[Item] = len(List)
    List.append(Item)
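Wrapped in a function, the list-plus-dictionary idea could be sketched as follows. The name add_unique and the sample values are invented for illustration; the code avoids newer syntax in the hope of staying compatible with old interpreters, but it has not been checked on Python 2.5.

```python
def add_unique(items, index_of, item):
    """Append item to items if it is new; return its index either way."""
    if item in index_of:
        # Dictionary lookup: no scan of the list is needed.
        return index_of[item]
    index_of[item] = len(items)  # the index the item is about to receive
    items.append(item)
    return index_of[item]

items = []      # keeps insertion order, addressable by index
index_of = {}   # maps each item to its position in items

numbers = [add_unique(items, index_of, x) for x in ["a", "b", "a", "c", "b"]]
```

The two containers have to be updated together; mutating items elsewhere (for example with insert or remove) would leave the stored indexes stale.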
However, if you don't need to save the order of items you should just use a set instead. This will take almost as little space as a list, yet will be as fast as a dictionary.
Items = set()
# ...
Items.add(Item) # will do nothing if Item is already added
Both of these require that your object is hashable. In Python, most types are hashable unless they are a container whose contents can be modified. For example: lists are not hashable because you can modify their contents, but tuples are hashable because you cannot.
If you were trying to store values that aren't hashable, there isn't a fast general solution.
You can improve the check a lot:
check = set(List)
for Item in NewList:
    if Item in check:
        ItemNumber = List.index(Item)
    else:
        ItemNumber = len(List)
        List.append(Item)
        check.add(Item) # keep the set in sync with the list
Or, even better, if order is not important you can do this:
oldlist = set(List)
addlist = set(AddList)
newlist = list(oldlist | addlist)
And if you need to loop over the items that were duplicated:
for item in (oldlist & addlist):
    pass # do stuff

Python: Adding element to list while iterating

I know that it is not allowed to remove elements while iterating over a list, but is it allowed to add elements to a Python list while iterating? Here is an example:
for a in myarr:
    if somecond(a):
        myarr.append(newObj())
I have tried this in my code and it seems to work fine, however I don't know if it's because I am just lucky and that it will break at some point in the future?
EDIT: I prefer not to copy the list since "myarr" is huge, and therefore it would be too slow. Also I need to check the appended objects with "somecond()".
EDIT: At some point "somecond(a)" will be false, so there can not be an infinite loop.
EDIT: Someone asked about the "somecond()" function. Each object in myarr has a size, and each time "somecond(a)" is true and a new object is appended to the list, the new object will have a size smaller than a. "somecond()" has an epsilon for how small objects can be and if they are too small it will return "false"
Why don't you just do it the idiomatic C way? This ought to be bullet-proof. Python lists are backed by arrays rather than linked lists, so indexing takes constant time and this is not a "Shlemiel the Painter" algorithm, although an explicit index loop is usually somewhat slower than a plain for loop. In any case, I tend not to worry about optimization until it becomes clear that a particular section of code is really a problem. First make it work; then worry about making it fast, if necessary.
If you want to iterate over all the elements:
i = 0
while i < len(some_list):
    more_elements = do_something_with(some_list[i])
    some_list.extend(more_elements)
    i += 1
If you only want to iterate over the elements that were originally in the list:
i = 0
original_len = len(some_list)
while i < original_len:
    more_elements = do_something_with(some_list[i])
    some_list.extend(more_elements)
    i += 1
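As a concrete, runnable illustration of the first variant (visiting appended elements too), the sketch below mimics the asker's description: every element larger than an epsilon spawns a smaller one. The function name split_until_small and the halving rule are invented stand-ins for somecond() and newObj().

```python
def split_until_small(sizes, epsilon):
    """Append half of every size above epsilon, visiting the new sizes as well."""
    i = 0
    # len(sizes) is re-evaluated on every pass, so appended items are reached.
    while i < len(sizes):
        if sizes[i] > epsilon:
            # Stand-in for myarr.append(newObj()): the new size is strictly smaller.
            sizes.append(sizes[i] / 2.0)
        i += 1
    return sizes

result = split_until_small([8.0, 3.0], 2.0)
```

The loop terminates because each appended size is half of its parent and eventually falls to epsilon or below, which matches the asker's guarantee that somecond() becomes false at some point.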
well, according to http://docs.python.org/tutorial/controlflow.html
It is not safe to modify the sequence
being iterated over in the loop (this
can only happen for mutable sequence
types, such as lists). If you need to
modify the list you are iterating over
(for example, to duplicate selected
items) you must iterate over a copy.
You could use islice from itertools to create an iterator over only the elements that are in the list when the loop starts. Then you can append entries to the list without the loop ever reaching them:
from itertools import islice
islice(myarr, 0, len(myarr)) # stop is exclusive, so len(myarr) - 1 would skip the last element
Even better, you don't even have to iterate over all the elements. You can increment a step size.
In short: if you're absolutely sure that all new objects fail the somecond() check, then your code works fine; it just wastes some time iterating over the newly added objects.
Before giving a proper answer, you have to understand why it is considered a bad idea to change a list or dict while iterating over it. When you use a for statement, Python does not take a snapshot of the sequence; it computes the next item on each step. Take a list as an example: Python remembers an index, and each time it returns l[index] to you. If you change l in the meantime, the result of l[index] can be messy.
NOTE: Here is a stackoverflow question to demonstrate this.
The worst case for adding elements while iterating is an infinite loop. Try the following in a Python REPL (or don't, if you can spot the bug by reading it):
import random
l = [0]
for item in l:
    l.append(random.randint(1, 1000))
    print(item)
It will print numbers non-stop until memory is used up or the process is killed by the system or the user.
Now that we understand the underlying reason, let's discuss the solutions. Here are a few:
1. Make a copy of the original list
Iterate over the original list and modify the copy.
result = l[:]
for item in l:
    if somecond(item):
        result.append(Obj())
2. Control when the loop ends
Instead of handing control to Python, you decide how to iterate over the list:
length = len(l)
for index in range(length):
    if somecond(l[index]):
        l.append(Obj())
Before iterating, calculate the list length, and only loop length times.
3. Store added objects in a new list
Instead of modifying the original list, store the new objects in a new list and concatenate the two afterward.
added = [Obj() for item in l if somecond(item)]
l.extend(added)
You can do this.
bonus_rows = []
for a in myarr:
    if somecond(a):
        bonus_rows.append(newObj())
myarr.extend(bonus_rows)
Access your list elements directly by index i. Then you can append to your list:
for i in range(len(myarr)): # the range is computed once, so only the original elements are visited
    if somecond(myarr[i]):
        myarr.append(newObj())
Make a copy of your original list and iterate over it.
See the modified code below:
for a in myarr[:]:
    if somecond(a):
        myarr.append(newObj())
I had a similar problem today. I had a list of items that needed checking; if the objects passed the check, they were added to a result list. If they didn't pass, I changed them a bit and if they might still work (size > 0 after the change), I'd add them on to the back of the list for rechecking.
I went for a solution like
items = [...what I want to check...]
result = []
while items:
    recheck_items = []
    for item in items:
        if check(item):
            result.append(item)
        else:
            item = change(item) # Note that this always lowers the integer size(),
                                # so no danger of an infinite loop
            if item.size() > 0:
                recheck_items.append(item)
    items = recheck_items # Let the loop restart with these, if any
My list is effectively a queue, so I probably should have used some sort of queue. But my lists are small (like 10 items) and this works too.
You can use an index and a while loop instead of a for loop if you want the loop to also cover the elements that are added to the list during the loop:
i = 0
while i < len(myarr):
    a = myarr[i]
    i = i + 1
    if somecond(a):
        myarr.append(newObj())
Expanding S.Lott's answer so that new items are processed as well:
todo = myarr
done = []
while todo:
    added = []
    for a in todo:
        if somecond(a):
            added.append(newObj())
    done.extend(todo)
    todo = added
The final list is in done.
Alternate solution:
from functools import reduce # built in on Python 2, must be imported on Python 3
reduce(lambda acc, a: acc + [newObj()] if somecond(a) else acc, myarr, myarr)
Assuming you are adding at the end of the list arr, you can try this method, which I often use:
arr = [...The list I want to work with]
current_length = len(arr)
i = 0
while i < current_length:
    current_element = arr[i]
    do_something(current_element)
    # Time to insert
    insert_count = 1 # How many items you are adding at the end
    arr.append(item_to_be_inserted)
    # IMPORTANT!!!! increase the current limit and indexer
    i += 1
    current_length += insert_count
This is just boilerplate, and if you run it as is, your program will freeze in an infinite loop, because one item is appended on every pass. DO NOT FORGET TO TERMINATE THE LOOP, unless an endless loop is what you need.
