Accessing the list while being sorted - python

Can I access a list while it is being sorted by list.sort()?
b = ['b', 'e', 'f', 'd', 'c', 'g', 'a']
f = 'check this'
def m(i):
    print i, b, f
    return None
b.sort(key=m)
print b
This prints:
b [] check this
e [] check this
f [] check this
d [] check this
c [] check this
g [] check this
a [] check this
Note that the individual items of list b are sent to function m. But inside m the list b is empty, even though m can see the variable f, which has the same scope as list b. Why does function m print b as []?

Looking at the source code (of CPython, maybe different behaviour for other implementations) the strange output of your script becomes obvious:
/* The list is temporarily made empty, so that mutations performed
 * by comparison functions can't affect the slice of memory we're
 * sorting (allowing mutations during sorting is a core-dump
 * factory, since ob_item may change).
 */
saved_ob_size = Py_SIZE(self);
saved_ob_item = self->ob_item;
saved_allocated = self->allocated;
Py_SET_SIZE(self, 0);
The comment says it all: when you begin sorting, the list is emptied. Well, it is "empty" in the eyes of an external observer.
I quite like the term "core-dump factory".
Compare this with sorted(), which sorts a new list and leaves b itself untouched while the key function runs:
b = ['b','e','f','d','c','g','a']
f = 'check this'
def m(i):
    print i, b, f
    return None
b = sorted(b, key= m)
print b
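The same effect can be sketched in Python 3, assuming CPython (other implementations may behave differently). The key function returns the item itself here, because Python 3 cannot order None values; the names spy_key, observed_during_sort and observed_during_sorted are illustrative only.

```python
data = ['b', 'e', 'f', 'd', 'c', 'g', 'a']
observed_during_sort = []

def spy_key(item):
    # Record what `data` looks like at the moment the key is computed.
    observed_during_sort.append(list(data))
    return item

data.sort(key=spy_key)

# CPython detail: the list appears empty while list.sort() is running.
print(observed_during_sort[0])
print(data)

original = ['b', 'e', 'f']
observed_during_sorted = []

def spy_key_2(item):
    # Same idea, but this time the sort goes through sorted().
    observed_during_sorted.append(list(original))
    return item

result = sorted(original, key=spy_key_2)

# sorted() sorts a new list, so `original` stays visible and unchanged.
print(observed_during_sorted[0], result)
```

In the first half every snapshot is an empty list; in the second half every snapshot is the full original list.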

This is something you can't rely on in general, not just for lists, unless the documentation for the method you're using explicitly says otherwise. Accessing an object in an intermediate state (i.e., after some operation has started but before it has finished) is a problem that concurrent code runs into a lot. You've found a rare non-concurrent case of it, but the advice is the same: avoid this situation. The intermediate state is not guaranteed to be meaningful to you, and is not guaranteed to be a "valid" state according to the rules of that object (in which case it tends to be called an "inconsistent" state).

Related

Creating a recursive function to cycle through lists that produce lists that produce lists... and so on

First off I'm using python.
I have a list of items called tier1 it looks like this.
tier1 = ['a1', 'a2', 'a3', ..., 'an']
I have 2 functions called functionA and functionZ.
They both take a string as their argument and produce a list output like this. The lists must be produced during execution time and are not available from the start. Only tier1 is available.
listOutput = functionA(tier1[0]).
listOutput looks like this
listOutput = ['b1', 'b2', 'b3', ..., 'bn']
The next time functionA is used, on a listOutput item, let's say 'b1', it will produce
listOutput = functionA('b1')
output:
listOutput = ['bc1', 'bc2', 'bc3', ..., 'bcn']
This time, when functionA is used on, let's say, 'bc1', it might come up empty, so functionZ is used on 'bc1' instead and the output is stored somewhere.
listOutput = functionA('bc1')
output
listOutput = []
So I use
listOutput = functionZ('bc1')
output
listOutput = ['series1','series2','series3',....,'seriesn']
Now I have to go back and try bc2, until bcn doing the same logic. Once that's done, I will use functionA on 'b2'. and so on.
The depth of each item is variable.
It looks something like this
As long as listOutput is not empty, functionA must be used on the listOutput items or tier1 items until it comes up empty. Then functionZ must be used on whichever item in the list on which functionA comes up empty.
After tier1, listOutput will also always be a list, which must also be cycled through one by one and the same logic must be used.
I am trying to make a recursive function based on this but I'm stuck.
So far I have,
def recursivefunction(idnum):  # idnum will be one of the list items from tier1 or the listOutputs produced
    listOutput = functionA(idnum)
    if not listOutput:
        return functionZ(idnum)
    else:
        return recursivefunction(listOutput)
But my functions return lists. How do I get them to go deeper into each list until functionZ is used and, once it's used, move on to the next item in the list?
Do I need to create a new kind of data structure?
I have no idea where to start, should I be looking to create some kind of class with linked lists?
The way I understand your problem:
- there is an input list tier1, which is a list of strings
- there are two functions, A and Z
- A, when applied to a string, returns a list of strings
- Z, when applied to a string, returns some value (the type is unclear; assume a list of strings as well)
The algorithm:
- for each element of tier1, apply A to the element
- if the result is an empty list, apply Z to the element instead, with no further processing
- otherwise, if the result is not empty, apply the algorithm to that list
So, in Python:
from random import randint
# since you haven't shared what A and Z do,
# I'm just having them do random stuff that matches your description
def function_a(s):
    # giving it a 75% chance to be empty
    if randint(1, 4) != 1:
        return []
    else:
        # otherwise between 1 and 4 random strings from some selection
        return [['a', 'b', 'c'][randint(0, 2)] for _ in range(randint(1, 4))]
    # in the real case, I'm sure the result depends on `s` but it doesn't matter
def function_z(s):
    # between 0 and 4 random strings from some selection
    return [['x', 'y', 'z'][randint(0, 2)] for _ in range(randint(0, 4))]
def solution(xs):
    # this is really the answer to your question:
    rs = []
    for x in xs:
        # first compute A of x
        r = function_a(x)
        # if that's the empty list
        if not r:
            # then we want Z of x instead
            r = function_z(x)
        else:
            # otherwise, it's the same algorithm applied to all of r
            r = solution(r)
        # whatever the result, append it to rs
        rs.append(r)
    return rs
tier1 = ['a1', 'a2', 'a3', 'a4']
print(solution(tier1))
Note that function_a and function_z are just functions generating random results with the types of results you specified. You didn't share what the logic of A and Z really is, so it's hard to verify if the results are what you want.
However, the function solution does exactly what you say it should, if I understand your somewhat complicated explanation of it correctly.
Given that the solution to your question is basically this:
def solution(xs):
    rs = []
    for x in xs:
        r = function_a(x)
        if not r:
            r = function_z(x)
        else:
            r = solution(r)
        rs.append(r)
    return rs
Which can even be rewritten to:
def solution_brief(xs):
    # Z must be applied to the element x, not to the empty result r
    return [function_z(x) if not r else solution_brief(r) for x, r in ((x, function_a(x)) for x in xs)]
You should reexamine your problem description. The key with programming is understanding the problem and breaking it down to its essential steps. Once you've done that, code is quick to follow. Whether you prefer the first or second solution probably depends on experience and possibly on tiny performance differences.
By the way, any solution written as a recursive function can also be written purely iteratively. That's often preferable from a memory and performance perspective, but recursive functions can have the advantage of being very clean and simple and therefore easier to maintain.
Putting my coding where my mouth is, here's an iterative solution of the same problem, just for fun (not optimal by any means):
def solution_iterative(xs):
    if not xs:
        return xs
    rs = xs.copy()
    stack_rs = [rs]
    stack_is = [0]
    while stack_rs:
        r = function_a(stack_rs[-1][stack_is[-1]])
        if not r:
            stack_rs[-1][stack_is[-1]] = function_z(stack_rs[-1][stack_is[-1]])
            stack_is[-1] += 1
        else:
            stack_rs[-1][stack_is[-1]] = r
            stack_rs.append(r)
            stack_is.append(0)
        while stack_is and stack_is[-1] >= len(stack_rs[-1]):
            stack_is.pop()
            stack_rs.pop()
            if stack_is:
                stack_is[-1] += 1
    return rs
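Because the stand-in functions above are random, the result is hard to check by eye. Here is a small deterministic sketch of the recursive algorithm. The CHILDREN table and the bodies of function_a and function_z are made up for illustration; they are not the asker's real functions.

```python
# Hypothetical stand-in data: which items each item expands into.
CHILDREN = {
    'a1': ['b1', 'b2'],
    'b1': ['bc1'],
}

def function_a(s):
    # Stand-in for functionA: the children of s, or [] if there are none.
    return list(CHILDREN.get(s, []))

def function_z(s):
    # Stand-in for functionZ: some final result for an item A cannot expand.
    return ['series-' + s]

def solution(xs):
    rs = []
    for x in xs:
        r = function_a(x)
        # Recurse into a non-empty result, otherwise fall back to Z of x.
        rs.append(solution(r) if r else function_z(x))
    return rs

result = solution(['a1', 'a2'])
print(result)
# [[[['series-bc1']], ['series-b2']], ['series-a2']]
```

The nesting of the result mirrors the depth at which each item stopped expanding.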

Python - Questions for the different use of comma

Recently, I was solving problem No. 113, Path Sum II, on LeetCode and found a solution online.
Given a binary tree and a sum, find all root-to-leaf paths where each path's sum equals the given sum.
Code as below:
class Solution:
    def pathSum(self, R: TreeNode, S: int) -> List[List[int]]:
        A, P = [], []
        def dfs(N):
            if N == None: return
            P.append(N.val)
            if (N.left,N.right) == (None,None) and sum(P) == S: A.append(list(P))
            else: dfs(N.left), dfs(N.right)
            P.pop()
        dfs(R)
        return A
- Junaid Mansuri
- Chicago, IL
I would like to ask some questions based on the above code to help me better understand how Python works.
Why do we need to use list(), as in A.append(list(P)), to successfully append the list into A if P itself is already a list?
What happens when the interpreter runs dfs(N.left), dfs(N.right)? Both calls append a value to P, but they don't seem to affect each other (as if they were running at the exact same time with the exact same P). Is it something like multithreading?
A related question: does A, P = [ ], [ ] work on the same concept as dfs(N.left), dfs(N.right)? If not, what is the difference?
What does P.pop() actually pop? I mean, which value will be popped if both dfs(N.left) and dfs(N.right) run? Will there be two P after the two functions run?
Updates (more question)
10 while head != None:
11     if id(head.next) in hashMap: return True
12     head = head.next
13     hashMap.add(id(head.next))
Line 13: AttributeError: 'NoneType' object has no attribute 'next'
The above is part of the code. It simply walks through a linked list. It raises the error shown above, which I think is normal when it reaches the end of the linked list.
What I want to understand is why, if the code is changed as below, there is no error and the code runs successfully. Is that related to the comma, or is there another reason that makes it run?
10 while head != None:
11     if id(head.next) in hashMap: return True
12     head, _ = head.next, hashMap.add(id(head.next))
An internet search will point you to lots of articles on depth-first search.
To answer your immediate questions.
Why do we need to use list(), A.append(list(P)), to successfully
append the list into A if P itself is already a list?
A.append(list(P))
This uses the list constructor to make a shallow copy of P to add to A.
Otherwise, if you just used:
A.append(P)
then the list stored in A would change every time P changes, because A would hold a reference to P itself.
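A minimal sketch of the difference, using made-up variable names (path, aliased, copied) rather than the A and P from the solution:

```python
path = [1, 2]
aliased = []
copied = []

aliased.append(path)        # stores a reference to the same list object
copied.append(list(path))   # stores a shallow copy, i.e. a snapshot

path.pop()                  # a later mutation, like P.pop() in dfs

print(aliased)  # [[1]]
print(copied)   # [[1, 2]]
```

Since dfs pops every value it appended, appending P itself would leave A holding references to one list that ends up empty.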
What happens when the interpreter runs dfs(N.left), dfs(N.right)? Both
calls append a value to P, but they don't seem to affect each other
(as if they were running at the exact same time with the exact same P).
Is it something like multithreading?
These functions are run sequentially. First dfs(N.left) followed by dfs(N.right).
This performs a depth-first search (DFS) on the left subtree, followed by a DFS on right subtree.
Each function is run for its side-effect of updating A and P.
A related question: does A, P = [ ], [ ] work on the same concept as
dfs(N.left), dfs(N.right)? If not, what is the difference?
What does P.pop() actually pop? I mean, which value will be popped if
both dfs(N.left) and dfs(N.right) run? Will there be two P
after the two functions run?
Variable A and P are local variables of pathSum. dfs being a nested function within pathSum has access to these local variables. Thus there is only one A and one P which dfs updates as it is called recursively.
A, P = [], []
Is initializing A, P (done once within pathSum).
dfs(N.left), dfs(N.right)
Is calling the dfs methods on the left and right subnodes, which performs updates on A and P as the recursive calls run.
What does P.pop() actually pop? I mean, which value will be popped if
both dfs(N.left) and dfs(N.right) run? Will there be two P
after the two functions run?
P.pop()
Removes the last value appended to list P.
dfs(N.left) and dfs(N.right) are run one after the other. For example, with N corresponding to the node with value 1:
dfs(N.left), dfs(N.right)
First dfs(N.left) will recursively traverse nodes with values of:
2, 4, 5
Then, dfs(N.right) will traverse the node with value 3.
The values of A, P will be updated during the traversal. P contains the path to the current node. When we branch left (i.e. dfs(N.left)), the left path is added to P. When we return, we need to remove this (thus P.pop()). Similarly, dfs(N.right) is run.
When we have traversed both the left and right children, we remove the current node with P.pop() and control flow returns to the parent.
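The traversal described above can be tried on the example tree (1 at the root, 2 with children 4 and 5 on the left, 3 on the right). This is a sketch only: TreeNode here is a minimal stand-in for LeetCode's class, and path_sum is the same algorithm written as a plain function with longer names.

```python
class TreeNode:
    # Minimal stand-in for LeetCode's TreeNode.
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def path_sum(root, target):
    answers, path = [], []   # one shared pair of lists for all dfs calls
    visited = []             # extra bookkeeping, only to show the visit order

    def dfs(node):
        if node is None:
            return
        visited.append(node.val)
        path.append(node.val)
        if node.left is None and node.right is None and sum(path) == target:
            answers.append(list(path))   # snapshot of the current path
        else:
            dfs(node.left)               # runs to completion first
            dfs(node.right)              # then the right subtree
        path.pop()                       # remove this node before returning

    dfs(root)
    return answers, visited

root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
answers, visited = path_sum(root, 7)
print(answers)   # [[1, 2, 4]]
print(visited)   # [1, 2, 4, 5, 3]
```

The visit order confirms that the calls run sequentially: the whole left subtree is finished before the right one starts.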
Tuple Abbreviation
a, b is an abbreviation for tuple (a, b)
Thus:
[], [] computes tuple ([], [])
b, a computes tuple (b, a)
dfs(N.left), dfs(N.right) computes tuple (dfs(N.left), dfs(N.right))
Tuples can be unpacked
Example:
t = (2, 4, 6, 8)
x, y, z, w = t
Will have x = 2, y = 4, z = 6, w = 8
With:
A, P = [], [] equivalent to A, P = ([], [])
Unpacking then has:
A = [], P = []
With:
a, b = b, a is equivalent to a, b = (b, a)
Unpacking then has:
a = b and b = a (both using the old values of a and b)
In terms of the result, think of these assignments as occurring in parallel.
So with:
(a, b) = (5, 3)
a, b = (b, a) will have a = 3, b = 5 (thus swapping a & b)
dfs(N.left), dfs(N.right) computes the tuple (dfs(N.left), dfs(N.right)).
This necessitates running DFS on N.left and N.right.
The resulting tuple is discarded, but this has the desired effect of updating A & P as DFS is run recursively.
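The same rule (the whole right-hand side is evaluated before anything is assigned) also explains the linked-list update in the question. Below is a sketch with a hypothetical minimal Node class and a set named seen standing in for hashMap.

```python
class Node:
    # Hypothetical minimal linked-list node, for illustration only.
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

a, b = 5, 3
a, b = b, a          # the tuple (3, 5) is built first, then unpacked
print(a, b)          # 3 5

seen = set()

# Two separate statements: head is rebound to None before head.next is read.
head = Node(1)       # a one-node list, so head.next is None
error = None
try:
    head = head.next
    seen.add(id(head.next))
except AttributeError as exc:
    error = exc
print(error)

# One tuple assignment: both right-hand expressions still see the old head.
head = Node(1)
head, _ = head.next, seen.add(id(head.next))
print(head)          # None, and no AttributeError was raised
```

So the comma matters only because it puts both expressions on the right-hand side of a single assignment.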

Recursively generating a list of lists in a triangular format given a height and value

I recently started looking into recursion to clean up my code and "up my game" as it were. As such, I'm trying to do things which could normally be accomplished rather simply with loops, etc., but practicing them with recursive algorithms instead.
Currently, I am attempting to generate a two-dimensional array which should theoretically resemble a sort of right-triangle in an NxN formation given some height n and the value which will get returned into the 2D-array.
As an example, say I call my_function(3, 'a'), so n = 3 and value = 'a'.
My output returned should be: [['a'], ['a', 'a'], ['a', 'a', 'a']]
[['a'],
 ['a', 'a'],
 ['a', 'a', 'a']]
Wherein n determines both how many lists will be within the outermost list, as well as how many elements should successively appear within those inner-lists in ascending order.
As it stands, my code currently looks as follows:
def my_function(n, value):
    base_val = [value]
    if n == 0:
        return [base_val]
    else:
        return [base_val] + [my_function(n-1, value)]
Unfortunately, using my above example n = 3 and value = 'a', this currently outputs: [['a'], [['a'], [['a'], [['a']]]]]
Now, this doesn't have to get formatted or printed the way I showed above in a literal right-triangle formation (that was just a visualization of what I want to accomplish).
I will answer any clarifying questions you need, of course!
return [base_val]
Okay, for n == 0 we get [[value]]. Solid. Er, sort of. That's the result with one row in it, right? So, our condition for the base case should be n == 1 instead.
Now, let's try the recursive case:
return [base_val] + [my_function(n-1, value)]
We had [[value]], and we want to end up with [[value], [value, value]]. Similarly, when we have [[value], [value, value]], we want to produce [[value], [value, value], [value, value, value]] from it. And so on.
The plan is that we get one row at the moment, and all the rest of the rows by recursing, yes?
Which rows will we get by recursing? Answer: the ones at the beginning, because those are the ones that still look like a triangle in isolation.
Therefore, which row do we produce locally? Answer: the one at the end.
Therefore, how do we order the results? Answer: we need to get the result from the recursive call, and add a row to the end of it.
Do we need to wrap the result of the recursive call? Answer: No. It is already a list of lists. We're just going to add one more list to the end of it.
How do we produce the last row? Answer: we need to repeat the value, n times, in a list. Well, that's easy enough.
Do we need to wrap the local row? Answer: Yes, because we want to append it as a single item to the recursive result - not concatenate all its elements.
Okay, let's re-examine the base case. Can we properly handle n == 0? Yes, and it makes perfect sense as a request, so we should handle it. What does our triangle look like with no rows in it? Well, it's still a list of rows, but it doesn't have any rows in it. So that's just []. And we can still append the first row to that, and proceed recursively. Great.
Let's put it all together:
if n == 0:
    return []
else:
    return my_function(n-1, value) + [[value] * n]
Looks like base_val isn't really useful any more. Oh well.
We can condense that a little further, with a ternary expression:
return [] if n == 0 else (my_function(n-1, value) + [[value] * n])
You have a couple logic errors: off-by-1 with n, growing the wrong side (critically, the non-base implementation should not use a base-sized array), growing by an array of the wrong size. A fixed version:
#!/usr/bin/env python3
def my_function(n, value):
    if n <= 0:
        return []
    return my_function(n-1, value) + [[value]*n]
def main():
    print(my_function(3, 'a'))
if __name__ == '__main__':
    main()
Since you're returning a mutable list, you can get some more efficiency by using .append rather than +, although that would make the function no longer purely functional. Also note that the inner mutable objects don't get copied (but since the recursion is internal this doesn't really matter in this case).
It would be possible to write a tail-recursive version of this instead, by adding a parameter.
But Python is a weird language for using unnecessary recursion.
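For reference, here is a sketch of what a tail-recursive version with an extra parameter could look like. The name triangle and the accumulator parameter rows are illustrative choices, not part of the original code. Note that CPython does not eliminate tail calls, so this is a matter of style rather than a way around the recursion limit.

```python
def triangle(n, value, rows=None):
    # `rows` is the accumulator: the part of the triangle built so far.
    rows = [] if rows is None else rows
    if len(rows) >= n:
        return rows
    # The recursive call is the last thing done (tail position);
    # the next row is one element longer than the number of rows so far.
    return triangle(n, value, rows + [[value] * (len(rows) + 1)])

print(triangle(3, 'a'))   # [['a'], ['a', 'a'], ['a', 'a', 'a']]
print(triangle(0, 'a'))   # []
```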
The easiest way for me to think about recursive algorithms is in terms of the base case and how to build on that.
The base case (case where no recursion is necessary) is when n = 1 (or n = 0, but I'm going to ignore that case). A 1x1 "triangle" is just a 1x1 list: [[a]].
So how do we build on that? Well, if n = 2, we can assume we already have that base case value (from calling f(1)) of [[a]]. So we need to add [a, a] to that list.
We can generalize this as:
f(1) = [[a]]
f(n > 1) = f(n - 1) + [[a] * n]
Or, in Python:
def my_function(n, value):
    if n == 1:
        return [[value]]
    else:
        return my_function(n - 1, value) + [[value] * n]
While the other answers proposed another algorithm for solving your problem, it could also have been solved by correcting your solution, using a helper function such as:
def indent(x, lst):
    # prepend x to every row of lst
    new_lst = []
    for val in lst:
        new_lst.append([x] + val)
    return new_lst
With the base case changed to n == 1 (returning [base_val]), you can implement the return in the original function as:
return [base_val] + indent(value, my_function(n-1, value))
The other solutions are more elegant though so feel free to accept them.
Here is an image explaining this solution.
The red part is your current function call and the green one the previous function call.
As you can see, we also need to add the yellow part in order to complete the triangle.
These are the other solutions.
In these solutions you only need to add a new row, so that it's more elegant overall.

Associate two items of list/dict in Python (change the value of one would modify the other)

In Python, is it possible to associate two items in a list/dict so that value changes of one would trigger the modification of the other?
For example, I have a list:
# Python example
# item_one = 'item_one'
# item_two = 'item_one'
list_with_associated_items = [item_one, item_two]
list_with_associated_items[0] = 'item_two'
# that's it, the value of list_with_associated_items[1] should turn to
# 'item_two' now.
So, what does the structure of the list/dict need to be to make the value of list_with_associated_items[1] turn into 'item_two' as well?
I can use pointers in C language to achieve such a goal, like:
/* C example */
char buffer[40] = "associated_item";
char *item_one_pointer = buffer;
char *item_two_pointer = buffer;
char *associated_list[2] = {item_one_pointer, item_two_pointer};
/* Change the content of item one */
strcpy(associated_list[0], "item_one");
printf("%s\n", associated_list[1]);
/* The printing result will be 'item_one', which means the second
item has been modified because it is "associated" with item
one (sharing the same memory) */
Are there similar techniques in Python? Thanks a lot.
Yes, it is possible to have two (or more) variable names bound to the same object.
A = [1,2,3]
B = A
B[1] = 42
print(A) # prints: [1, 42, 3]
C = [A, B]
C[0][1] = 9
print(A, B) # prints: [1, 9, 3] [1, 9, 3]
print(C[0], C[1]) # prints: [1, 9, 3] [1, 9, 3]
As long as A, B, C[0], and C[1] are all bound to (think: pointing to) the same list, changing the list's elements changes it for all bindings.
However: B = ['a', 'b'] binds B to a new list. It does not change the old list. So:
B = ['a', 'b']
print(A, B) # prints: [1, 9, 3] ['a', 'b']
It's kind of like the difference between these two C statements:
item_two_pointer[1] = 'Z';
item_two_pointer = another_buffer;
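Applied to the strings in the question: Python strings are immutable, so the shared object has to be some mutable holder. A one-element list is used below purely as an illustration; a small class or a dict would work just as well.

```python
shared = ['item_one']            # one-element list used as a mutable holder
items = [shared, shared]         # both slots refer to the same holder

items[0][0] = 'item_two'         # mutate the holder; the slot is not rebound
print(items[1][0])               # item_two

items[0] = ['something else']    # rebinding slot 0 breaks the association
print(items[1][0])               # still item_two
```

Plain assignment to items[0] only rebinds that slot, which is the Python counterpart of the second C statement above.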

Multiple-Target Assignments

I am reading a book about Python and there is a special part in the book about Multiple-Target Assignments. Now the book explains it like this:
But I don't see the use of this. It makes no sense to me. Why would you use more variables?
Is there a reason to do this? What makes this so different from using a = 'spam' and then printing out a 3 times?
I can only think of using it for emptying variables in one line.
A very good use for multiple assignment is setting a bunch of variables to the same number.
Below is a demonstration:
>>> vowels = consonants = total = 0
>>> mystr = "abcdefghi"
>>> for char in mystr:
...     if char in "aeiou":
...         vowels += 1
...     elif char in "bcdfghjklmnpqrstvwxyz":
...         consonants += 1
...     total += 1
...
>>> print("Vowels: {}\nConsonants: {}\nTotal: {}".format(vowels, consonants, total))
Vowels: 3
Consonants: 6
Total: 9
>>>
Without multiple assignment, I'd have to do this:
>>> vowels = 0
>>> consonants = 0
>>> total = 0
As you can see, this is a lot more long-winded.
Summed up, multiple assignment is just Python syntax sugar to make things easier/cleaner.
It's mainly just for convenience. If you want to initialize a bunch of variables, it's more convenient to do them all on one line than several. The book even mentions that at the end of the snippet that you quoted: "for example, when initializing a set of counters to zero".
Besides that, though, the book is actually wrong. The example shown
a = b = c = 'spam'
is NOT equivalent to
c = 'spam'
b = c
a = b
What it REALLY does is basically
tmp = 'spam'
a = tmp
b = tmp
c = tmp
del tmp
Notice the order of the assignments! This makes a difference when some of the targets depend on each other. For example,
>>> x = [3, 5, 7]
>>> a = 1
>>> a = x[a] = 2
>>> a
2
>>> x
[3, 5, 2]
According to the book, x[1] would become 2, but clearly this is not the case.
For further reading, see these previous Stack Overflow questions:
How do chained assignments work?
What is this kind of assignment in Python called? a = b = True
Python - are there advantages/disadvantages to assignment statements with multiple (target list "=") groups?
And probably several others (check out the links on the right sidebar).
You might need to initialize several variables with the same value, but then use them differently.
It could be for something like this:
def fibonacci(n):
    a = b = 1
    while a < n:
        c = a
        a = a + b
        b = c
    return a
(variable swapping with tuple unpacking omitted to avoid confusion as with the downvoted answer)
An important note:
>>> a = b = []
is dangerous. It probably doesn't do what you think it does.
>>> b.append(7)
>>> print(b)
[7]
>>> print(a)
[7] # ???????
This is due to how variables work as names, or labels, in Python, rather than as containers of values as in other languages. See this answer for a full explanation.
Presumably you go on to do something else with the different variables.
a = do_something_with(a)
b = do_something_else_with(b)
#c is still 'spam'
This is a trivial example, and the initialization step arguably didn't save you any work, but it's a valid way to code. There are certainly places where initializing a significant number of variables is needed, and as long as they're immutable this idiom can save space.
Though, as the book pointed out, you should only use this kind of syntax for immutable types. For mutable types you need to explicitly create multiple objects:
a,b,c = [mutable_type() for _ in range(3)]
Otherwise you end up with surprising results since you have three references to the same object rather than three objects.
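A short sketch of the difference, checked with the `is` operator; the variable names are arbitrary.

```python
a = b = []                         # one list object, two names
a.append(7)
print(a is b, b)                   # True [7]

x, y, z = [[] for _ in range(3)]   # three separate list objects
x.append(7)
print(x is y, y, z)                # False [] []

s = t = 'spam'                     # sharing an immutable string is harmless
s += '!'                           # builds a new string and rebinds s only
print(s, t)                        # spam! spam
```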
