Python Lists and Yielding - python

I have the following (correct) solution to Project Euler problem 24. I'm relatively new to Python, and am stumped on a couple of Python points.
First, the code:
# A permutation is an ordered arrangement of objects. For example, 3124 is one possible permutation of the digits 1, 2, 3 and 4.
# If all of the permutations are listed numerically or alphabetically, we call it lexicographic order.
# The lexicographic permutations of 0, 1 and 2 are: 012 021 102 120 201 210
# What is the millionth lexicographic permutation of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9?
permutations = []
def getLexicographicPermutationsOf(digits, state):
    if len(digits) == 0:
        permutations.append(str(state))
    for i in range(len(digits)):
        state.append(digits[i])
        rest = digits[:i] + digits[i+1:]
        getLexicographicPermutationsOf(rest, state)
        state.pop()
digits = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
getLexicographicPermutationsOf(digits, [])
print(permutations[999999])
My first query is regarding the use of the yield statement. Instead of defining the permutations list at the top, my first design was to replace the permutations.append line with yield state. I would then assign the return value of the method to a variable. I checked, and the return value was a generator, as expected. However, looping over its contents indicated that no values were being generated. Am I missing something here?
My second query is about the final line, which prints a value from the list. When I run this, it outputs the value as though it were a list, whereas it should be a string. In fact, replacing print(permutations[999999]) with print(type(permutations[999999])) results in <class 'str'>. So why is it being printed like a list (with square brackets, separated by commas)?

When you recursively call getLexicographicPermutationsOf, you need to yield results from there too.
for result in getLexicographicPermutationsOf(rest, state):
    yield result
permutations.append(str(state)) creates a string representation of state, which is a list. This explains why it looks like a list when printed.
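Putting both points together, a generator version might look like the sketch below. It assumes Python 3.3 or later, where `yield from` is shorthand for the loop above; the function name `lexicographic_permutations` and the `state=None` default are illustrative choices, not part of the original code. It yields a joined string rather than `state` itself, because `state` is one list that keeps being mutated.

```python
from itertools import islice

def lexicographic_permutations(digits, state=None):
    """Yield each permutation of digits as a string, in lexicographic order."""
    if state is None:
        state = []
    if not digits:
        # Build a fresh string; yielding `state` would hand out the same
        # mutable list every time.
        yield "".join(str(d) for d in state)
        return
    for i, digit in enumerate(digits):
        state.append(digit)
        rest = digits[:i] + digits[i + 1:]
        # Forward everything the recursive generator produces.
        yield from lexicographic_permutations(rest, state)
        state.pop()

small = list(lexicographic_permutations([0, 1, 2]))
print(small)

# The millionth permutation has zero-based index 999999.
millionth = next(islice(lexicographic_permutations(list(range(10))), 999999, None))
print(millionth)
```

Because the generator is lazy, `islice` stops after the millionth value and no list of all permutations is ever built.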

There is a much less computationally intensive way to calculate this. It might actually not be so easy to write a program, but it lets you work out the answer by hand. :) (Hint: how many permutations are there? How many of them start with 0?)
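One way to turn that hint into code is sketched below, assuming the digits are distinct. The name `nth_permutation` is hypothetical. With k digits left, each choice of leading digit accounts for (k-1)! permutations, so integer division picks the digit directly.

```python
from math import factorial

def nth_permutation(digits, n):
    """Return the permutation at zero-based index n in lexicographic order."""
    remaining = sorted(digits)
    result = []
    while remaining:
        # Each possible leading digit heads a block of this many permutations.
        block = factorial(len(remaining) - 1)
        index, n = divmod(n, block)
        result.append(remaining.pop(index))
    return result

answer = "".join(str(d) for d in nth_permutation(range(10), 999999))
print(answer)
```

This does ten divisions instead of enumerating a million permutations.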
Also, range(len(x)) is highly un-Pythonic. Granted, it would be nice to have the indices in order to slice the list to remove the 'current' element... but there is another way: just ask Python to remove the elements with that value (since there is only one such element). That allows us to loop over element values directly:
for digit in digits:
    state.append(digit)
    rest = digits[:]
    rest.remove(digit) # a copy with the current value removed.
    getLexicographicPermutationsOf(rest, state)
    state.pop()
range is primarily useful for actually creating data ranges - such as you initialize digits with. :)
Am I missing something here?
You're missing that just calling a function recursively won't magically put its results anywhere. In fact, even if you 'yield' the results of a recursive call, it still won't do what you want - you'll end up with a generator that returns generators (that return generators, etc.... down to the base of the recursion) when you want one generator. (FogleBird's answer explains how to deal with this: you must take the generator from the recursive call, and explicitly "feed" its yielded elements into the current generator.)
But there is a much simpler way anyway: the library already has this algorithm built in.
The entire program can be done thus:
from itertools import permutations, islice
print(next(islice(permutations(range(10)), 999999, None)))
why is it being printed like a list (with square brackets, separated by commas)?
Because the string contains square brackets and commas. That's what you get when you use str on a list (state, in this case).
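A small sketch of the difference, assuming the goal is a plain string of digits (the variable names are illustrative):

```python
state = [2, 7, 8]

as_repr = str(state)                        # the list's printable form
as_digits = "".join(str(d) for d in state)  # each digit converted, then joined

print(as_repr)    # [2, 7, 8]
print(as_digits)  # 278
```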

Related

Accessing the lowest value when comparing two python lists

I am comparing two lists of integers and am trying to access the lowest value without using a for-loop as the lists are quite large. I have tried using set comparison, yet I receive an empty set when doing so. Currently my approach is:
differenceOfIpLists = list(set(reservedArray).difference(set(ipChoicesArray)))
I have also tried:
differenceOfIpLists = list(set(reservedArray) - set(ipChoicesArray))
And the lists are defined as such:
reservedArray = [169017344, 169017345, 169017346, 169017347, 169017348, 169017349, 169017350, 169017351, 169017352, 169017353, 169017354, 169017355, 169017356, 169017357, 169017358, 169017359, 169017360, 169017361, 169017362, 169017363, 169017364, 169017365, 169017366, 169017367, 169017368, 169017369, 169017600, 169017601, 169017602, 169017603, 169017604, 169017605, 169017606, 169017607, 169017608, 169017609, 169017610, 169017611, 169017612, 169017613, 169017614, 169017615, 169017616, 169017617, 169017618, 169017619...]
ipChoicesArray = [169017344, 169017345, 169017346, 169017347, 169017348, 169017349, 169017350, 169017351, 169017352, 169017353, 169017354, 169017355, 169017356, 169017357, 169017358, 169017359, 169017360, 169017361, 169017362, 169017363, 169017364, 169017365, 169017366, 169017367, 169017368, 169017369, 169017370, 169017371, 169017372, 169017373, 169017374, 169017375, 169017376, 169017377, 169017378, 169017379, 169017380, 169017381, 169017382...]
Portions of these lists are the same, yet they are vastly different as the lengths are:
reservedArrayLength = 6658
ipChoicesArrayLength = 65536
I have also tried converting these values to strings and doing the same style of comparison, also to no avail.
Once I am able to extract a list of the elements in the ipChoicesArray that are not in the reservedArray, I will return the smallest element after sorting.
I do not believe that I am facing a max length issue...
Subtracting the sets should work as you desire, see below:
ipChoicesArray = [1,3,4,7,1]
reservedArray = [1,2,5,7,8,2,1]
min(list(set(ipChoicesArray) - set(reservedArray)))
###Output###
3
By the way, the maximum list length is 536,870,912 elements (on a 32-bit Python build), so you are nowhere near that limit.
without using a for-loop as the lists are quite large
The presumption that a for-loop is a poor choice because the list is large is likely incorrect. Creating a set from a list (and vice versa) not only iterates through the containers under the hood anyway, just like a for-loop, but also allocates new containers and takes up more memory. Profile your code before you assume something won't perform well.
That aside, in your code it seems the reason you are getting an empty result is because your difference is inverted. To get the elements in ipChoicesArray but not in reservedArray you want to difference the latter from the former:
diff = set(ipChoicesArray) - set(reservedArray)
The obvious solution (you just did the set difference in the wrong direction):
print(min(set(ipChoicesArray) - set(reservedArray)))
You said they're always sorted, and your reverse difference being empty (and thinking about what you're doing) suggests that the "choices" are a superset of the "reserved", so then this also works and could be faster:
print(next(c for c, r in zip(ipChoicesArray, reservedArray) if c != r))
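A minimal sketch with made-up data, showing why the direction of the difference matters. The short lists and snake_case names are stand-ins for the real arrays, and it assumes, as above, that both lists are sorted and the choices are a superset of the reserved values.

```python
ip_choices = [10, 11, 12, 13, 14, 15]
reserved = [10, 11, 13]

# Reserved values that are not choices: empty, as in the question.
wrong_direction = set(reserved) - set(ip_choices)

# Choices that are not reserved: the values we actually want.
available = set(ip_choices) - set(reserved)
lowest = min(available)

# The zip variant finds the first position where the sorted lists diverge.
first_mismatch = next(c for c, r in zip(ip_choices, reserved) if c != r)

print(wrong_direction, lowest, first_mismatch)
```

Note that `zip` stops at the shorter list, so if the reserved list is an exact prefix of the choices, `next` raises StopIteration; passing a default as the second argument to `next` avoids that.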
Disclaimer: Python docs states that
A set is an unordered collection with no duplicate elements.
But I can see that the output of an unordered set is an ordered set:
s = {'z', 1, 0, 'a'}
s #=> {0, 1, 'a', 'z'}
next(iter(s)) #=> 0
So, I don't know if this approach is reliable. Maybe some other user can deny or confirm this with an appropriate reference to the set behaviour.
Having said this...
Don't know if I'm getting the point, but..
Not knowing where the smallest value is, you could use this approach (here using smaller values and shorter list):
a = [2, 5, 5, 1, 6, 7, 8, 9]
b = [2, 3, 4, 5, 6, 6, 1]
set_a = set(a)
set_b = set(b)
Find the smallest of the union:
union = set_a | set_b
next(iter(union))
#=> 1
Or just:
min([next(iter(set_a)), next(iter(set_b))])
#=> 1
Or, maybe this fits better your question:
next(iter(set_a-set_b)) #=> 8
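Since the language does not specify the iteration order of a set, a safer sketch asks for the minimum explicitly instead of relying on whichever element iter() happens to return first. The lists are the same as above.

```python
a = [2, 5, 5, 1, 6, 7, 8, 9]
b = [2, 3, 4, 5, 6, 6, 1]

only_in_a = set(a) - set(b)
smallest = min(only_in_a)  # does not depend on iteration order

print(sorted(only_in_a), smallest)
```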

When slicing a Python list, how do I specify an end index so that the result includes all the elements following the first one specified?

numbers = [1, 2, 3, 4, 5]
print(numbers[1:])
print(numbers[1:0])
I want to get the same result as the first print statement but with a number after the colon. My question is what number would I put after that colon to get the same result(new list with numbers from index 1 to end of list)?
You'd need to do something like
print(numbers[1:len(numbers)])
Anything else leads to a shorter list than you want.
numbers[1:0] # empty list
numbers[1:-1] # loses the last value
Of course, numbers[1:] is actually the best way to go about it, so I'm curious why you explicitly need the number included.
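The variants side by side, as a quick sketch. It also shows that None is accepted as an explicit "no end" marker, which can be handy when the end index is held in a variable.

```python
numbers = [1, 2, 3, 4, 5]

to_end = numbers[1:]                   # [2, 3, 4, 5]
explicit_end = numbers[1:len(numbers)] # [2, 3, 4, 5]
none_end = numbers[1:None]             # [2, 3, 4, 5]
zero_end = numbers[1:0]                # []
minus_one_end = numbers[1:-1]          # [2, 3, 4]

print(to_end, explicit_end, none_end, zero_end, minus_one_end)
```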

Deleting items that have the place swapped around in a returned list

My function takes a number, and a list of numbers.
If 2 numbers in the list add up to the original number, it returns them in the form [Num1, Num2].
Now I don't want any "duplicates" i.e. I only want [4, -7] returned and not [4, -7], [-7, 4].
def pairs(n, num_list):
    newest_list = []
    for j in range(len(num_list)):
        for i in range(len(num_list)-1):
            if num_list[j] + num_list[i+1] == n:
                newest_list.append([num_list[j], num_list[i+1]])
    return newest_list
Now I'd like a simple hint rather than posted code.
My question is:
Do I have the ability to do that within my code, and if so, a hint would be great, or will I need to define another function to do that for me?
You definitely have the ability to do that within your code.
A hint to complete this would be to think about at what point in your code it makes sense to stop searching for further matches and to return what you've found. Let me know if that's too cryptic!
You can still do that in your current code by putting the two numbers into a set.
if you have 2 lists l1, and l2 where:
l1=[1,2]
l2=[2,1]
If you convert them to sets, you can compare them and they will evaluate to True if they have the same elements, no matter what the order is:
set(l1) == set(l2) # this evaluates to True
In your if condition, before appending the numbers, you can check whether set([num_list[j], num_list[i+1]]) equals the set of any pair already in newest_list.
I am tempted to write some code, but you said not to, so I'll leave it here :p
You can leave your code the way it is, but before you return the list, you can filter the list with a predicate that the pair [a,b] is only accepted if pair [b,a] is not in the list
When adding a pair [a, b] to the result list, sort the pair, then see if it's in the result list. If so, don't add it.
Also, consider using a Python set.
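For readers who do want to see the hints spelled out, here is one possible sketch of the "sort the pair, then check a set" idea. The name `unique_pairs` and the `seen` helper set are illustrative, not from the original code. It also starts the inner loop after the outer index, an assumption that each element may only be paired with a later one.

```python
def unique_pairs(n, num_list):
    """Return pairs from num_list summing to n, ignoring order within a pair."""
    seen = set()
    result = []
    for i in range(len(num_list)):
        for j in range(i + 1, len(num_list)):
            a, b = num_list[i], num_list[j]
            if a + b == n:
                # A sorted tuple is the same for [a, b] and [b, a].
                key = tuple(sorted((a, b)))
                if key not in seen:
                    seen.add(key)
                    result.append([a, b])
    return result

print(unique_pairs(-3, [4, -7, 1, -4, -7]))
```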

Insert number to a list

I have an ordered dictionary like following:
source =([('a',[1,2,3,4,5,6,7,11,13,17]),('b',[1,2,3,12])])
I want to calculate the length of each key's value first, then calculate the sqrt of it, say it is L.
Insert L to the positions which can be divided without remainder and insert "1" after other number.
For example, source['a'] = [1,2,3,4,5,6,7,11,13,17]; the length of it is 10.
Thus the sqrt of len(source['a']), rounded down to a whole number, is 3.
Insert number 3 at the position which can be divided exactly by 3 (eg. position 3, position 6, position 9) if the position of the number can not be divided exactly by 3 then insert 1 after it.
To get a result like the following:
result=([('a',["1,1","2,1","3,3","4,1","5,1","6,3","7,1","11,1","13,3","17,1"]),('b',["1,1","2,2","3,1","12,2"])])
I don't know how to change the items in the list to string pairs. BTW, this is not my homework assignment. I was trying to build a boolean retrieval engine, and the source data is too big, so I just created a simple sample here to explain what I want to achieve :)
As this seems to be homework, I will try to help you with the part you are having problems with:
I dont know how to change the item in the list to a string pair.
As the entire list needs to be updated, it's better to recreate it rather than update it in place, though that is possible since lists are mutable.
Consider a list
lst = [1,2,3,4,5]
to convert it to a list of strings, you can use list comprehension
lst = [str(e) for e in lst]
You may also use the built-in map as map(str,lst), but you need to remember that in Py3.X, map returns a map object, so it needs to be handled accordingly
A condition in a comprehension is best expressed as a conditional expression
<TRUE-STATEMENT> if <condition> else <FALSE-STATEMENT>
To get the index of any item in a list, your best bet is to use the built-in enumerate
If you need to create a formatted string expression from a sequence of items, it's suggested to use the format string specifier
"{},{}".format(a,b)
The length of any sequence including a list can be calculated through the built-in len
You can use the operator ** with fractional power or use the math module and invoke the sqrt function to calculate the square-root
Now you just have to combine each of the above suggestion to solve your problem.
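Combining those pieces could look something like the sketch below. It assumes source is a collections.OrderedDict, that positions are counted from 1, and that the square root is rounded down (both inferred from the sample result in the question). The helper name `tag_positions` is hypothetical.

```python
from collections import OrderedDict
from math import sqrt

source = OrderedDict([
    ('a', [1, 2, 3, 4, 5, 6, 7, 11, 13, 17]),
    ('b', [1, 2, 3, 12]),
])

def tag_positions(values):
    """Pair each value with L at positions divisible by L, otherwise with 1."""
    step = int(sqrt(len(values)))  # square root of the length, rounded down
    return [
        "{},{}".format(value, step if position % step == 0 else 1)
        for position, value in enumerate(values, start=1)
    ]

result = OrderedDict((key, tag_positions(values)) for key, values in source.items())
print(result['a'])
print(result['b'])
```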

Recursion in Python 3.2

I am trying to wrap my head around recursion and have posted a working algorithm to produce all the subsets of a given list.
def genSubsets(L):
    res = []
    if len(L) == 0:
        return [[]]
    smaller = genSubsets(L[:-1])
    extra = L[-1:]
    new = []
    for i in smaller:
        new.append(i+extra)
    return smaller + new
Let's say my list is L = [0,1], correct output is [[],[0],[1],[0,1]]
Using print statements I have narrowed down that genSubsets is called twice before I ever get to the for loop. That much I get.
But why does the first for loop initiate a value of L as just [0] and the second for loop use [0,1]? How exactly do the recursive calls work that incorporate the for loop?
I think this would actually be easier to visualize with a longer source list. If you use [0, 1, 2], you'll see that the recursive calls repeatedly cut off the last item from the list. That is, recursion builds up a stack of recursive calls like this:
genSubsets([0,1,2])
genSubsets([0,1])
genSubsets([0])
genSubsets([])
At this point it hits the "base case" of the recursive algorithm. For this function, the base case is when the list given as a parameter is empty. Hitting the base case means it returns a list containing an empty list, [[]]. Here's how the stack looks when it returns:
genSubsets([0,1,2])
genSubsets([0,1])
genSubsets([0]) <- gets [[]] returned to it
So that return value gets back to the previous level, where it is saved in the smaller variable. The variable extra gets assigned to be a slice including only the last item of the list, which in this case is the whole contents, [0].
Now, the loop iterates over the values in smaller, and adds their concatenation with extra to new. Since there's just one value in smaller (the empty list), new ends up with just one value too, []+[0] which is [0]. I assume this is the value you're printing out at some point.
Then the last statement returns the concatenation of smaller and new, so the return value is [[],[0]]. Another view of the stack:
genSubsets([0,1,2])
genSubsets([0,1]) <- gets [[],[0]] returned to it
The return value gets assigned to smaller again, extra is [1], and the loop happens again. This time, new gets two values, [1] and [0,1]. They get concatenated onto the end of smaller again, and the return value is [[],[0],[1],[0,1]]. The last stack view:
genSubsets([0,1,2]) <- gets [[],[0],[1],[0,1]] returned to it
The same thing happens again, this time adding 2s onto the end of each of the items found so far. new ends up as [[2],[0,2],[1,2],[0,1,2]].
The final return value is [[],[0],[1],[0,1],[2],[0,2],[1,2],[0,1,2]]
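To watch this happen, one option is a traced variant of the function, sketched below. The name `gen_subsets_traced` and the `depth` parameter are additions for illustration; the logic is otherwise the same as genSubsets. Each call and return is printed, indented by recursion depth.

```python
def gen_subsets_traced(items, depth=0):
    """Same algorithm as genSubsets, printing each call and return value."""
    indent = "  " * depth
    print(f"{indent}call   {items}")
    if not items:
        print(f"{indent}return [[]]")
        return [[]]
    smaller = gen_subsets_traced(items[:-1], depth + 1)
    extra = items[-1:]
    new = [subset + extra for subset in smaller]
    result = smaller + new
    print(f"{indent}return {result}")
    return result

subsets = gen_subsets_traced([0, 1, 2])
```

All the "call" lines print first, going deeper, and only then do the "return" lines appear from the innermost level outward. This is why the loop first runs with [0] and only later with [0, 1].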
I am no big fan of trying to visualize the entire call graph for recursive function to understand what they do.
I believe there is a much simpler way:
Enter fairy tale land where recursive functions do the right thing™.
Just assume that genSubsets(L) works:
# This computes the powerset of the list L minus the last element
smaller = genSubsets(L[:-1])
Because this magically worked, the only entries still missing are those that contain the last element.
This fragment constructs all those missing subsets:
new = []
for i in smaller:
new.append(i+extra)
Now we have those subsets containing the last element in new and we have those subsets not containing the last element in smaller.
It follows that we must now have all subsets, so we can return smaller + new.
The only thing left is the base case to make sure the recursion stops. Because the empty set (or list in this case) is an element of every power set, we can use that to stop the recursion: Requesting the powerset of an empty set is a set containing the empty set. So our base case is correct. Since every recursive step removes one element off the list, the base case must be encountered at some time.
Thus, the code really does produce the power set.
Note: The principle behind this is that of induction. If something works for some known n0, and we can prove that the algorithm working for n implies it works for n+1, then it must work for all n ≥ n0.
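The inductive argument can also be spot-checked for small n. The sketch below uses a compact restatement of the function under the illustrative name `gen_subsets` and compares its output with itertools.combinations. This is a sanity check for a few sizes, not a proof.

```python
from itertools import combinations

def gen_subsets(items):
    """Compact restatement of genSubsets."""
    if not items:
        return [[]]
    smaller = gen_subsets(items[:-1])
    extra = items[-1:]
    return smaller + [subset + extra for subset in smaller]

checked_sizes = []
for n in range(6):
    items = list(range(n))
    subsets = gen_subsets(items)
    # Every subset of size r, for every r from 0 to n.
    expected = {c for r in range(n + 1) for c in combinations(items, r)}
    assert len(subsets) == 2 ** n
    assert {tuple(s) for s in subsets} == expected
    checked_sizes.append(n)

print(checked_sizes)
```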
