I am trying to create a function, new_function, that takes a number as an argument.
This function will manipulate values in a list based on what number I pass as an argument. Within this function, I will place another function, new_sum, that is responsible for manipulating values inside the list.
For example, if I pass 4 into new_function, I need new_function to run new_sum on each of the first four elements. The corresponding value will change, and I need to create four new lists.
example:
listone=[1,2,3,4,5]
def new_function(value):
    for i in range(0,value):
        new_list=listone[:]
        variable=new_sum(i)
        new_list[i]=variable
    return new_list
# running new_function(4) should return four new lists
# [(new value for index zero, based on new_sum),2,3,4,5]
# [1,(new value for index one, based on new_sum),3,4,5]
# [1,2,(new value for index two, based on new_sum),4,5]
# [1,2,3,(new value for index three, based on new_sum),5]
My problem is that I keep on getting one giant list. What am I doing wrong?
Fix the indentation of the return statement:
listone=[1,2,3,4,5]
def new_function(value):
    for i in range(0,value):
        new_list=listone[:]
        variable=new_sum(i)
        new_list[i]=variable
        return new_list
The problem with return new_list is that once you return, the function is done.
You can make things more complicated by accumulating the results and returning them all at the end:
listone=[1,2,3,4,5]
def new_function(value):
    new_lists = []
    for i in range(0,value):
        new_list=listone[:]
        variable=new_sum(i)
        new_list[i]=variable
        new_lists.append(new_list)
    return new_lists
However, this is exactly what generators are for: if you yield instead of return, that gives the caller one value, and the function then resumes when the caller asks for the next value. So:
listone=[1,2,3,4,5]
def new_function(value):
    for i in range(0,value):
        new_list=listone[:]
        variable=new_sum(i)
        new_list[i]=variable
        yield new_list
The difference is that the first version gives the caller a list of four lists, while the second gives the caller an iterator of four lists. Often, you don't care about the difference—and, in fact, an iterator may be better for responsiveness, memory, or performance reasons.*
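For instance (assuming new_sum is defined, as in the question), the generator version can be consumed lazily in a plain for loop, getting one list at a time:

for new_list in new_function(4):
    print(new_list)  # one modified copy of listone per iteration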
If you do care, it often makes more sense to just make a list out of the iterator at the point you need it. In other words, use the second version of the function, then just write:
new_lists = list(new_function(4))
By the way, you can simplify this by not trying to mutate new_list in-place, and instead just change the values while copying. For example:
def new_function(value):
    for i in range(value):
        yield listone[:i] + [new_sum(i)] + listone[i+1:]
* Responsiveness is improved because you get the first result as soon as it's ready, instead of only after they're all ready. Memory use is improved because you don't need to keep all of the lists in memory at once, just one at a time. Performance may be improved because interleaving the work can result in better cache behavior and pipelining.
I'm writing a function to flatten a nested array (Python list). e.g turn [1,2,[3]] into [1,2,3], [[1,2,[3]],4] into [1,2,3,4] etc.
I have the following:
def flatten_array(array):
    flattened_array = []
    for item in array:
        if not isinstance(item, list):
            flattened_array.append(item)
        else:
            flatten_array(item)
    return flattened_array
So the idea is to have the function be recursive, to handle situations where there is nesting to an unknown depth. My problem is that flattened_array is getting re-initialized each time a nested list is encountered (when flatten_array is called recursively).
print flatten_array([1,2,[3]])
[1,2]
How can I maintain the state of flattened_array when recursive calls are made?
Change the lines
else:
    flatten_array(item)
to
else:
    flattened_array+=flatten_array(item)
So the full function reads like
def flatten_array(array):
    flattened_array = []
    for item in array:
        if not isinstance(item, list):
            flattened_array.append(item)
        else:
            flattened_array+=flatten_array(item)
    return flattened_array
This gives:
flatten_array([1,2,[3]]) # [1,2,3]
flatten_array([1,2,[3,[4,5]]]) # [1,2,3,4,5]
flatten_array([1,2,[3,[4,5]],6,7,[8]]) # [1,2,3,4,5,6,7,8]
Your original code is not doing anything with the recursive call. You get the flattened result back from it, but then just discard it. What we want to do is attach it to the end of the existing list.
Additionally, if you don't want to keep creating temporary arrays, we can create one array with the first call to the function and just append to it.†
def flatten_array(array, flattened_array=None):
    if flattened_array is None:
        flattened_array = []
    for item in array:
        if not isinstance(item, list):
            flattened_array.append(item)
        else:
            flatten_array(item, flattened_array)
    return flattened_array
The results of this version are the same, and it can be used the same way, but in the original, each call to the function creates a new empty array to work with. Normally this isn't a problem, but depending on the depth or how large the sub-arrays are this can build up in memory.
This version flattens the array into a given array. When called with just the input (like flatten_array([1,2,[3]])), it creates an empty array to work with, otherwise it just adds to the given array (thus the recursive call just needs to give the array to add to), modifying it in place.
This has the advantage of allowing you to add to an existing array if we want:
a = [1,2,3]
b = [2,3,[4]] # we want to flatten this onto the end of a
flatten_array(b,a) # we don't bother catching the return result here
print(a) # [1,2,3,2,3,4]
† There is a subtle point here. You may ask why we didn't define the function as def flatten_array(array,flattened_array=[]) and get rid of the test inside the function. Try that and call the function a few times. What happens is that the default value is created once, at function definition time, and not each time the function is called. This means that the default array, which is modified in place, is shared by every call to the function, resulting in it accumulating the results.
This is likely not what we want. By setting the default value to None and creating a new empty array inside the function each time, we ensure that each call to the function has a unique empty array to work with.
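A quick sketch of that pitfall, using a hypothetical bad_flatten that takes the mutable-default shortcut:

def bad_flatten(array, flattened_array=[]):  # the default list is created only once
    for item in array:
        if not isinstance(item, list):
            flattened_array.append(item)
        else:
            bad_flatten(item, flattened_array)
    return flattened_array

print(bad_flatten([1, 2, [3]]))  # [1, 2, 3]
print(bad_flatten([4, [5]]))     # [1, 2, 3, 4, 5] -- the first call's results leak into the second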
I'm learning about data structures and algorithm efficiency in my CS class right now, and I've created an algorithm to return a new list of items that is in reverse order from the original list. I'm trying to figure out how to do this in place with Python lists using recursion, but the solution is eluding me. Here is my code:
def reverse_list(in_list):
    n = len(in_list)
    print('List is: ', in_list)
    if n <= 2:
        in_list[0], in_list[-1] = in_list[-1], in_list[0]
        return in_list
    else:
        first = [in_list[0]]
        last = [in_list[-1]]
        temp = reverse_list(in_list[1:-1])
        in_list = last + temp + first
        print('Now list is: ', in_list)
        return in_list

if __name__ == '__main__':
    list1 = list(range(1, 12))
    list2 = reverse_list(list1)
    print(list2)
As a side note, this is O(n) in average case due to it being n/2 right?
You really don't want to be using recursion here, as it doesn't simplify the problem enough to justify the extra overhead of the function calls:
list3 = list(range(1, 10))
for i in range(len(list3) // 2):
    list3[i], list3[len(list3)-i-1] = list3[len(list3)-i-1], list3[i]
print(list3)
If you wanted to approach it using recursion, you'd pass an index counter into each call of your function until you had gotten halfway through the list. That is n/2 swaps, which is still O(n) runtime.
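A minimal sketch of that recursive idea (the function name and the index parameter are my own, not from the original answer):

def reverse_in_place(in_list, i=0):
    j = len(in_list) - 1 - i
    if i >= j:                      # met (or crossed) in the middle: done
        return in_list
    in_list[i], in_list[j] = in_list[j], in_list[i]
    return reverse_in_place(in_list, i + 1)

list1 = list(range(1, 12))
reverse_in_place(list1)
print(list1)  # [11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]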
Your n <= 2 case is good. It is "in-place" in both of the usual senses. It modifies the same list as is passed in, and it only uses a constant amount of memory in addition to the list passed in.
Your general case is where you have problems. It uses a lot of extra memory (n/2 levels of recursion, each of which holds two lists across the recursive call, so all those lists at all levels have to exist at once), and it does not modify the list passed in (rather, it returns a new list).
I don't want to do too much for you, but the key to solving this is to pass one (or two if you prefer) extra parameter(s) into the recursive step. Either an optional parameter to your function, or define a recursive helper function that has them. Use this to indicate what region of the list to reverse in-place.
As someone mentioned in comments, CPython doesn't have a tail-recursion optimization. This means that no algorithm that uses call-recursion like this is ever going to be truly "in-place", since it will use linear amounts of stack. But "in-place" in the sense of "modifies the original list instead of returning a new one" is certainly doable.
I'm trying to change the values of the list that I pass as an argument to the function.
This is the code:
def shuffle(xs, n=1):
    if xs: # if list isn't empty
        if n > 0:
            # gets the index of the middle of the list
            sizel = len(xs)
            midindex = int((sizel-1)/2)
            for times in range(n):
                xs = interleave(xs[0:midindex], xs[midindex:sizel])
    return None
The interleave code returns a list with the values of both lists mixed up.
However, when I run:
t=[1,2,3,4,5,6,7]
shuffle(t,n=2)
print t
The list t didn't change its order. The function needs to return None, so I can't just use t = shuffle(t, n). Is there any way I can do this?
Your problem is right here:
xs=interleave(xs[0:midindex],xs[midindex:sizel])
You're making slices of the list to pass to your interleave() function. These are essentially copies of part of the list. There's no way that what comes back from the function can be anything other than a different list from xs.
Fortunately, you can just reassign the new list you get back into the original list. That is, keep xs pointing to the same list, but replace all the items in it with what you get back from the interleave() function.
xs[:]=interleave(xs[0:midindex],xs[midindex:sizel])
This is called a slice assignment. Since xs remains the same list that was passed in, all references to the list outside the function will also see the changes.
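A small demonstration of the difference (the values here are arbitrary): slice assignment keeps the list object the same, so outside references see the new contents.

t = [1, 2, 3, 4]
alias = t
t[:] = [4, 3, 2, 1]   # replace the contents in place
print(alias)          # [4, 3, 2, 1] -- same object, new contents
print(alias is t)     # True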
xs is a reference local to the function, and is independent of t. When you reassign xs, t still points to the original list.
Since you must not return anything from the function, a workaround is to keep a reference to the original list and repopulate it using slice assignment:
orig_xs = xs
# do stuff here
orig_xs[:] = xs
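Putting that workaround into the original shuffle might look roughly like this (interleave is assumed to be defined elsewhere, as in the question):

def shuffle(xs, n=1):
    orig_xs = xs  # keep a reference to the caller's list
    if xs:        # if list isn't empty
        if n > 0:
            sizel = len(xs)
            midindex = int((sizel - 1) / 2)
            for times in range(n):
                xs = interleave(xs[0:midindex], xs[midindex:sizel])
    orig_xs[:] = xs  # copy the result back into the original list object
    return None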
I am trying to wrap my head around recursion and have posted a working algorithm to produce all the subsets of a given list.
def genSubsets(L):
    res = []
    if len(L) == 0:
        return [[]]
    smaller = genSubsets(L[:-1])
    extra = L[-1:]
    new = []
    for i in smaller:
        new.append(i+extra)
    return smaller + new
Let's say my list is L = [0,1], correct output is [[],[0],[1],[0,1]]
Using print statements I have narrowed down that genSubsets is called twice before I ever get to the for loop. That much I get.
But why does the first for loop initiate a value of L as just [0] and the second for loop use [0,1]? How exactly do the recursive calls work that incorporate the for loop?
I think this would actually be easier to visualize with a longer source list. If you use [0, 1, 2], you'll see that the recursive calls repeatedly cut off the last item from the list. That is, recursion builds up a stack of recursive calls like this:
genSubsets([0,1,2])
    genSubsets([0,1])
        genSubsets([0])
            genSubsets([])
At this point it hits the "base case" of the recursive algorithm. For this function, the base case is when the list given as a parameter is empty. Hitting the base case means it returns a list containing an empty list, [[]]. Here's how the stack looks when it returns:
genSubsets([0,1,2])
    genSubsets([0,1])
        genSubsets([0]) <- gets [[]] returned to it
So that return value gets back to the previous level, where it is saved in the smaller variable. The variable extra gets assigned to be a slice including only the last item of the list, which in this case is the whole contents, [0].
Now, the loop iterates over the values in smaller, and adds their concatenation with extra to new. Since there's just one value in smaller (the empty list), new ends up with just one value too, []+[0] which is [0]. I assume this is the value you're printing out at some point.
Then the last statement returns the concatenation of smaller and new, so the return value is [[],[0]]. Another view of the stack:
genSubsets([0,1,2])
    genSubsets([0,1]) <- gets [[],[0]] returned to it
The return value gets assigned to smaller again, extra is [1], and the loop happens again. This time, new gets two values, [1] and [0,1]. They get concatenated onto the end of smaller again, and the return value is [[],[0],[1],[0,1]]. The last stack view:
genSubsets([0,1,2]) <- gets [[],[0],[1],[0,1]] returned to it
The same thing happens again, this time adding 2s onto the end of each of the items found so far. new ends up as [[2],[0,2],[1,2],[0,1,2]].
The final return value is [[],[0],[1],[0,1],[2],[0,2],[1,2],[0,1,2]].
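You can confirm this by running the function from the question directly:

print(genSubsets([0, 1, 2]))
# [[], [0], [1], [0, 1], [2], [0, 2], [1, 2], [0, 1, 2]]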
I am no big fan of trying to visualize the entire call graph of a recursive function to understand what it does.
I believe there is a much simpler way:
Enter fairy tale land where recursive functions do the right thing™.
Just assume that genSubsets(L) works:
# This computes the powerset of the list L minus the last element
smaller = genSubsets(L[:-1])
Because this magically worked, the only entries that are missing are those that contain the last element.
This fragment constructs all those missing subsets:
new = []
for i in smaller:
    new.append(i+extra)
Now we have those subsets containing the last element in new and we have those subsets not containing the last element in smaller.
It follows that we must now have all subsets, so we can return smaller + new.
The only thing left is the base case, to make sure the recursion stops. Because the empty set (or list, in this case) is an element of every power set, we can use it to stop the recursion: the power set of the empty set is a set containing just the empty set, so our base case is correct. Since every recursive step removes one element from the list, the base case must be reached at some point.
Thus, the code really does produce the power set.
Note: The principle behind this is induction. If something works for some known n0, and we can prove that it working for n implies it works for n+1, then it must work for all n ≥ n0.
I know that it is not allowed to remove elements while iterating over a list, but is it allowed to add elements to a Python list while iterating? Here is an example:
for a in myarr:
    if somecond(a):
        myarr.append(newObj())
I have tried this in my code and it seems to work fine, however I don't know if it's because I am just lucky and that it will break at some point in the future?
EDIT: I prefer not to copy the list since "myarr" is huge, and therefore it would be too slow. Also I need to check the appended objects with "somecond()".
EDIT: At some point "somecond(a)" will be false, so there can not be an infinite loop.
EDIT: Someone asked about the "somecond()" function. Each object in myarr has a size, and each time "somecond(a)" is true and a new object is appended to the list, the new object will have a size smaller than a. "somecond()" has an epsilon for how small objects can be, and if they are too small it will return False.
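For concreteness, here is a hypothetical reconstruction of that setup (the Item class, the size-halving rule, and the epsilon value are all assumptions, not the original code); it shows why the loop terminates:

EPSILON = 1e-3

class Item:
    def __init__(self, size):
        self.size = size

def somecond(a):
    # True only while the object is still larger than the threshold.
    return a.size > EPSILON

myarr = [Item(1.0)]
for a in myarr:
    if somecond(a):
        myarr.append(Item(a.size / 2))  # each new object is smaller than a

print(len(myarr))  # 11 -- sizes halve until they drop below EPSILON, then appending stops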
Why don't you just do it the idiomatic C way? This ought to be bullet-proof, though it may not be the fastest approach. (Note that indexing into a Python list is O(1) -- lists are arrays under the hood, not linked lists -- so this is not actually a "Shlemiel the Painter" algorithm.) But I tend not to worry about optimization until it becomes clear that a particular section of code is really a problem. First make it work; then worry about making it fast, if necessary.
If you want to iterate over all the elements:
i = 0
while i < len(some_list):
    more_elements = do_something_with(some_list[i])
    some_list.extend(more_elements)
    i += 1
If you only want to iterate over the elements that were originally in the list:
i = 0
original_len = len(some_list)
while i < original_len:
    more_elements = do_something_with(some_list[i])
    some_list.extend(more_elements)
    i += 1
Well, according to http://docs.python.org/tutorial/controlflow.html:

"It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy."
You could use islice from itertools to create an iterator over a portion of the list. Then you can append entries to the list without affecting the items you're iterating over:
islice(myarr, 0, len(myarr)-1)
Even better, you don't even have to iterate over all of the elements: islice also takes a step argument, so you can skip through the list.
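A sketch of that idea (the condition and the appended values here are placeholders; I use len(myarr) as the stop so that every element present at the start is visited):

from itertools import islice

myarr = [3, 1, 4, 1, 5]

# islice fixes the stop up front, so items appended during the loop are not visited.
for a in islice(myarr, len(myarr)):
    if a > 2:                # placeholder for somecond(a)
        myarr.append(a - 1)  # placeholder for newObj()

print(myarr)  # [3, 1, 4, 1, 5, 2, 3, 4] -- the appended items were not iterated over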
In short: if you're absolutely sure all new objects fail the somecond() check, then your code works fine; it just wastes some time iterating over the newly added objects.
Before giving a proper answer, you have to understand why it is considered a bad idea to change a list/dict while iterating over it. When using a for statement, Python tries to be clever and returns a dynamically calculated item each time. Take a list as an example: Python remembers an index, and each time through the loop it returns l[index] to you. If you are changing l, the result of l[index] can be messy.
NOTE: Here is a stackoverflow question to demonstrate this.
The worst case for adding elements while iterating is an infinite loop. Try the following in a Python REPL (or don't, if you can already spot the bug):
import random
l = [0]
for item in l:
    l.append(random.randint(1, 1000))
    print item
It will print numbers non-stop until memory is used up or the process is killed by the system or the user.
Now that we understand the internal reason, let's discuss the solutions. Here are a few:
1. Make a copy of the original list
Iterate over the original list, and modify the copied one:
result = l[:]
for item in l:
    if somecond(item):
        result.append(Obj())
2. Control when the loop ends
Instead of handing control to Python, decide yourself how to iterate the list:
length = len(l)
for index in range(length):
    if somecond(l[index]):
        l.append(Obj())
Before iterating, calculate the list length and only loop that many times.
3. Store added objects in a new list
Instead of modifying the original list, store the new objects in a new list and concatenate them afterward:
added = [Obj() for item in l if somecond(item)]
l.extend(added)
You can do this.
bonus_rows = []
for a in myarr:
    if somecond(a):
        bonus_rows.append(newObj())
myarr.extend(bonus_rows)
Access your list elements directly by index i. Then you can append to your list:
for i in xrange(len(myarr)):
    if somecond(myarr[i]):
        myarr.append(newObj())
Make a copy of your original list and iterate over the copy; see the modified code below:
for a in myarr[:]:
    if somecond(a):
        myarr.append(newObj())
I had a similar problem today. I had a list of items that needed checking; if the objects passed the check, they were added to a result list. If they didn't pass, I changed them a bit and if they might still work (size > 0 after the change), I'd add them on to the back of the list for rechecking.
I went for a solution like
items = [...what I want to check...]
result = []
while items:
    recheck_items = []
    for item in items:
        if check(item):
            result.append(item)
        else:
            item = change(item)  # Note that this always lowers the integer size(),
                                 # so no danger of an infinite loop
            if item.size() > 0:
                recheck_items.append(item)
    items = recheck_items  # Let the loop restart with these, if any
My list is effectively a queue; I probably should have used some sort of queue. But my lists are small (around 10 items) and this works too.
You can use an index and a while loop instead of a for loop if you want the loop to also cover the elements that are added to the list during the loop:
i = 0
while i < len(myarr):
    a = myarr[i]
    i = i + 1
    if somecond(a):
        myarr.append(newObj())
Expanding S.Lott's answer so that new items are processed as well:
todo = myarr
done = []
while todo:
    added = []
    for a in todo:
        if somecond(a):
            added.append(newObj())
    done.extend(todo)
    todo = added
The final list is in done.
Alternate solution:
from functools import reduce
reduce(lambda acc, a: acc + [newObj()] if somecond(a) else acc, myarr, myarr)
Assuming you are appending at the end of the list arr, you can try this method I often use:
arr = [...The list I want to work with]
current_length = len(arr)
i = 0
while i < current_length:
    current_element = arr[i]
    do_something(arr[i])

    # Time to insert
    insert_count = 1  # how many items you are appending at the end
    arr.append(item_to_be_inserted)

    # IMPORTANT!!!! increase the current limit and the index
    i += 1
    current_length += insert_count
This is just boilerplate; if you run it as-is, your program will freeze in an infinite loop because it appends a new item on every iteration. DO NOT FORGET TO TERMINATE THE LOOP unless you actually need it to run forever.