Data Structures - lists - python

This is homework but the lesson gives me the answer already. I'm having trouble putting the words from the answer to the line of code
#Calculate all the primes below 1000
result = [1]
candidates = range(3, 1000)
base = 2
product = base
while candidates:
    while product < 1000:
        if product in candidates:
            candidates.remove(product)
        product = product + base
    result.append(base)
    base = candidates[0]
    product = base
    del candidates[0]
result.append(base)
print result
This is a version of "The Sieve of Eratosthenes."
This is the explanation that was given to me.
New things in this example…
The built-in function range actually returns a list that can be used like all other lists. (It includes the first index, but not the last.)
A list can be used as a logic variable. If it is not empty, then it is true; if it is empty, then it is false. Thus, while candidates means “while the list named candidates is not empty” or simply “while there are still candidates”.
You can write if someElement in someList to check if an element is in a list.
You can write someList.remove(someElement) to remove someElement from someList.
You can append an element to a list by using someList.append(something). Actually, you can use + too (as in someList = someList+[something]) but it is not as efficient.
You can get at an element of a list by giving its position as a number (where the first element, strangely, is element 0) in brackets after the name of the list. Thus someList[3] is the fourth element of the list someList. (More on this below.)
You can delete variables by using the keyword del. It can also be used (as here) to delete elements from a list. Thus del someList[0] deletes the first element of someList. If the list was [1,2,3] before the deletion, it would be [2,3] afterwards.
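For anyone following along in Python 3, here is a small sketch of the list operations listed above. The variable names are made up for illustration. One difference from the tutorial's Python 2 code: in Python 3, range() no longer returns a list, so it is wrapped in list() here.

```python
# Python 3: range() is a lazy sequence, so convert it to a real list first.
numbers = list(range(3, 8))       # [3, 4, 5, 6, 7] (8 itself is excluded)

# Membership test followed by removal of a value.
if 4 in numbers:
    numbers.remove(4)             # numbers is now [3, 5, 6, 7]

numbers.append(10)                # numbers is now [3, 5, 6, 7, 10]

first = numbers[0]                # indexing starts at 0, so this is 3
del numbers[0]                    # numbers is now [5, 6, 7, 10]

# A list used as a truth value: non-empty is true, empty is false.
non_empty_is_true = bool(numbers)
empty_is_false = bool([])
```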
Before going on to explaining the mysteries of indexing list elements, I will give a brief explanation of the example.
This is a version of the ancient algorithm called “The Sieve of Erastothenes” (or something close to that). It considers a set (or in this case, a list) of candidate numbers, and then systematically removes the numbers known not to be primes. How do we know? Because they are products of two other numbers.
We start with a list of candidates containing numbers [2..999]; we know that 1 is a prime (actually, it may or may not be, depending on who you ask), and we wanted all primes below 1000. (Actually, our list of candidates is [3..999], but 2 is also a candidate, since it is our first base). We also have a list called result which at all times contains the updated results so far. To begin with, this list contains only the number 1. We also have a variable called base. For each iteration (“round”) of the algorithm, we remove all numbers that are some multiple of this base number (which is always the smallest of the candidates). After each iteration, we know that the smallest number left is a prime (since all the numbers that were products of the smaller ones are removed; get it?). Therefore, we add it to the result, set the new base to this number, and remove it from the candidate list (so we won’t process it again.) When the candidate list is empty, the result list will contain all the primes. Clever, huh?
What I don't understand is where they say, 'we remove all numbers that are some multiple of this base number.' Where is that in the line of code? Can someone explain line by line what the program is doing? I am a newb at this trying to understand the mechanics on each line of code and why. Thanks for any assistance.

At the start of each pass through the while candidates: loop, product equals base. Inside that loop there is another loop, while product < 1000. At the end of each pass through the inner loop, product is incremented by base, so product steps through every multiple of base (base, 2*base, 3*base, ...). Each of those values is removed from candidates if it is present, and that is where you "remove multiples of the base number".
Basically what the program is doing is:
...
set product to base
for each candidate
    for each multiple of base, referred to as 'product'
        remove product from candidates
    set base to new value
    reset product to new base
...
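To make the "remove multiples" step easier to spot, here is a rough Python 3 port of the homework code wrapped in a function. The name primes_below and the limit parameter are my own additions, not part of the lesson, and the sketch assumes limit is at least 3. Like the original, it keeps 1 in the result even though most definitions do not count 1 as a prime.

```python
def primes_below(limit):
    """Sketch of the lesson's sieve; assumes limit >= 3."""
    result = [1]                          # the lesson treats 1 as a prime
    candidates = list(range(3, limit))    # Python 3 needs list() here
    base = 2
    while candidates:
        # This inner loop is the "remove all multiples of base" step:
        # product takes the values base, 2*base, 3*base, ...
        product = base
        while product < limit:
            if product in candidates:
                candidates.remove(product)
            product += base
        result.append(base)               # base survived, so it is prime
        base = candidates[0]              # smallest survivor is the next prime
        del candidates[0]                 # do not process it again
    result.append(base)                   # the last base is never appended in the loop
    return result
```

Calling primes_below(30) walks through bases 2, 3, 5, ... and strips out 4, 6, 8, ... then 9, 15, 21, 27, and so on.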

Python - leave only n items in a list

There are many ways to remove n items from the list, but I couldn't find a way to keep n items.
lst = ["ele1", "ele2", "ele3", "ele4", "ele5", "ele6", "ele7", "ele8", "ele9", "ele10"]
n = 5
lst = lst[:len(lst)-(len(lst)-n)]
print(lst)
So I tried to solve it in the same way as above, but the problem is that the value of 'lst' always changes in the work I am trying to do, so that method is not valid.
I want to know how to leave only n elements in a list and remove all elements after that.
The simplest/fastest solution is:
del lst[n:]
which tells it to delete any elements at index n or above (implicitly keeping indices 0 through n - 1, a total of n elements).
If you must preserve the original list (e.g. maybe you received it as an argument, and it's poor form to change what they passed you most of the time), you can just reverse the approach (slice out what you want to keep, rather than remove what you want to discard) and do:
truncated = lst[:n] # You have access to short form and long form
so you have both the long and short form, or if you don't need the original list anymore, but it might be aliased elsewhere and you want the aliases unmodified:
lst = lst[:n] # Replaces your local alias, but leaves other aliases unchanged
Very similar to ShadowRanger's solution, but using a slice assignment on the left-hand side of the assignment operator:
lst[n:] = []
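A quick sketch of how the three approaches differ when the list is aliased elsewhere. The names alias and truncated are just for illustration.

```python
lst = ["ele%d" % i for i in range(1, 11)]   # "ele1" .. "ele10"
n = 5

alias = lst                  # a second name for the same list object

truncated = lst[:n]          # slicing builds a NEW list; lst is untouched
length_after_slice = len(lst)

del lst[n:]                  # in-place: the alias sees the change too
length_of_alias = len(alias)

other = ["a", "b", "c", "d"]
other[2:] = []               # slice assignment, same in-place effect as del
```

If n is larger than the list, all three forms simply leave every element in place rather than raising an error.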

Finding the middle of a list

So I am trying to find the median of the list "revenues" which will be named "base_revenue".
Comments:
#Assume that revenues always has an odd number of members
#Write code to set base_revenue = midpoint of the revenue list
#Hint: Use the int and len functions
revenues = [280.00, 382.50, 500.00, 632.50, 780.00]
{def findMiddle(revenues):
    middle = float(len(revenues))/2
    if middle % 2 != 0:
        return revenues[int(middle - .5)]
    else:
        return (revenues[int(middle)], revenues[int(middle-1)])}
I'm getting invalid syntax. The median function itself works, but maybe there is a more efficient way to do it.
Hint: the answer to this is far simpler than you've made it. You can even do it in a single line, unless your instructor specifically requires you to define a function.
You're told the list will always have an odd number of items; all you need is the index of the middle item. Remember that in Python, indices start at 0. So, for instance, a list of length 5 will have its middle element at index 2. A list of length 7 will have its middle element at index 3. Notice a pattern?
Your assignment also reminds you about len(), which finds the length of something (such as a list), and int(), which turns things (if possible) into integers. Notably, it truncates a floating-point number toward zero, which for positive numbers is the same as a "floor" function; for instance it turns 2.5 into 2.
Can you see how you might put those together to programmatically find the midpoint index?
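For anyone who wants to check their answer afterwards, one way to combine the two hints looks like this. It is a sketch that relies on the stated assumption that the list has an odd number of members; middle_index is a name I made up.

```python
revenues = [280.00, 382.50, 500.00, 632.50, 780.00]

# len(revenues) is 5, 5 / 2 is 2.5, and int(2.5) truncates to 2,
# which is the index of the middle element of a 5-item list.
middle_index = int(len(revenues) / 2)
base_revenue = revenues[middle_index]
```

Integer division, len(revenues) // 2, gives the same index without going through a float.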

Shuffling with constraints on pairs

I have n lists, each of length m. Assume n*m is even. I want to get a randomly shuffled list containing all the elements, under the constraint that the elements at positions i and i+1, where i=0,2,...,n*m-2, never come from the same list. Edit: other than this constraint, I do not want to bias the distribution of random lists. That is, the solution should be equivalent to a completely random shuffle that is repeated until the constraint holds.
example:
list1: a1,a2
list2: b1,b2
list3: c1,c2
allowed: b1,c1,c2,a2,a1,b2
disallowed: b1,c1,c2,b2,a1,a2
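For reference, the baseline described in the question (shuffle, check, reshuffle on failure) can be sketched as below. The function names are hypothetical, and the sketch assumes at least two lists and an even total number of elements; otherwise it would fail or loop forever.

```python
import random

def is_valid(tagged):
    """True if no pair at positions (i, i+1), i even, shares a source list."""
    return all(tagged[i][0] != tagged[i + 1][0]
               for i in range(0, len(tagged), 2))

def constrained_shuffle(lists, rng=random):
    """Reshuffle until the pair constraint holds (rejection sampling)."""
    # Tag every item with the index of the list it came from.
    tagged = [(idx, item) for idx, lst in enumerate(lists) for item in lst]
    while True:
        rng.shuffle(tagged)
        if is_valid(tagged):
            return [item for _, item in tagged]

example = constrained_shuffle([["a1", "a2"], ["b1", "b2"], ["c1", "c2"]],
                              rng=random.Random(0))
```

This defines the target distribution exactly, so it is useful for comparing any faster method against, even if it is too slow for large inputs.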
A possible solution is to think of your number set as n chunks of items, each chunk having length m. If, for each chunk, you randomly select exactly one item from each list, then you will never hit dead ends. Just make sure that the first item in each chunk (except the first chunk) comes from a different list than the last element of the previous chunk.
You can also iteratively randomize numbers, always making sure you pick from a different list than the previous number, but then you can hit some dead ends.
Finally, another possible solution is to randomize a number on each position sequentially, but only from those which "can be put there", that is, if you put a number, none of the constraints will be violated, that is, you will have at least a possible solution.
A variation of b above that avoids dead ends: at each step you choose twice. First, randomly choose an item. Second, randomly choose where to place it. At the kth step there are k possible places to put the item (the new item can be inserted between two existing items). Naturally, you only choose from the allowed places.
Money!
arrange your lists into a list of lists
save each item in the list as a tuple with the list index in the list of lists
loop n*m times
on even turns - flatten into one list and just rand pop - yield the item and the item group
on odd turns - temporarily remove the last item group and pop as before - in the end add the removed group back
important - how to avoid deadlocks?
a deadlock can occur if all the remaining items are from one group only.
to avoid that, check in each iteration the lengths of all the lists
and check if the longest list is longer than the sum of all the others.
if true - pull from that list
that way you are never left with only one list full
here's a gist with an attempt to solve this in python
https://gist.github.com/YontiLevin/bd32815a0ec62b920bed214921a96c9d
A very quick and simple method i am trying is:
random shuffle
loop over the pairs in the list:
    if pair is bad:
        loop over the pairs in the list:
            if both elements of the new pair are different than the bad pair:
                swap the second elements
                break
will this always find a solution? will the solutions have the same distribution as naive shuffling until finding a legit solution?
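Here is one possible reading of that pseudocode as runnable Python, in case someone wants to experiment with it. It assumes items are stored as (list_index, item) tuples and that the total length is even; repair_by_swapping is a made-up name. It only implements the swap step and says nothing about whether the resulting distribution matches naive reshuffling.

```python
import random

def repair_by_swapping(tagged):
    """Fix bad pairs in place; tagged is a list of (list_index, item) tuples."""
    pair_starts = range(0, len(tagged), 2)
    for i in pair_starts:
        bad_group = tagged[i][0]
        if bad_group != tagged[i + 1][0]:
            continue                      # this pair is already fine
        # Look for a pair in which neither element comes from bad_group.
        for j in pair_starts:
            if tagged[j][0] != bad_group and tagged[j + 1][0] != bad_group:
                # Swap the second elements of the two pairs.
                tagged[i + 1], tagged[j + 1] = tagged[j + 1], tagged[i + 1]
                break
    return tagged

# Example: three lists of four items each, shuffled and then repaired.
rng = random.Random(1)
sample = [(g, "%s%d" % ("abc"[g], k)) for g in range(3) for k in range(1, 5)]
rng.shuffle(sample)
repaired = repair_by_swapping(list(sample))
```

Comparing the output frequencies against plain reshuffling on a small case would be one empirical way to look at the bias question.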

Why list created by code does not include values greater than 99,999 (Python)

This code is meant to find the largest palindrome created by the product of two 3 digit numbers.
I'm sure that there are more efficient ways to solve this problem, and you're welcome to post them, but at this stage in my learning, I'm most interested in how I could edit the code that I've written to make it work correctly.
When I run this code, it correctly creates a sorted list of palindromes, but the largest number in the list is 99,999. I can't see why the list doesn't extend past this.
def palindromes():
    product_list=[]
    palindrome_list=[]
    for a in range(100,1000):
        for b in range(100,1000):
            product_list.append(a*b)
    for product in product_list:
        product = str(product)
        if len(product) % 2 == 0:
            if product[0]==product[5] and product[1]==product[4] and product[2]==product[3]:
                palindrome_list.append(product)
        if len(product) % 2 != 0:
            if product[0]==product[4] and product[1]==product[3]:
                palindrome_list.append(product)
    palindrome_list = sorted(set(palindrome_list))
    return palindrome_list
print(palindromes())
Your code is doing what you told it to do.
Contrary to your assumption, there are lots of numbers larger than 99,999 in your list.
I have executed your code, and the first result in the list is:101101
which obviously is > 99999.
But there are other bigger like 561165 and 888888, and they are also in your list.
In total the list contains 650 palindromes. These are the only ones you could generate with your start condition;
you cannot reach 999,999 because it cannot be reached in your for loops...
Python just did what you told it.
EDIT: Like Oscar's answer says, you should put your limit to 1001, then the palindrome 999,999 will come to you.
That's because this part is incorrect:
palindrome_list.append(product)
You were sorting strings, not numbers - even though all the results were appearing in the list, they were being sorted as strings, and appeared in a different order than you expected. Change the above code in the two places where it appears, it should look like this:
palindrome_list.append(int(product))
Now it's easy to see what's the largest palindrome created by the product of two 3 digit numbers (the last one in the list):
palindromes()[-1]
=> 906609
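A small sketch of why the strings ended up in an unexpected order, followed by a shorter variant that compares numbers directly. The function name largest_palindrome_product is my own, not from the question.

```python
# Strings are compared character by character, so "99999" sorts AFTER
# "906609" (at the second character, "9" is greater than "0").
as_strings = sorted(["99999", "101101", "906609"])
as_numbers = sorted([99999, 101101, 906609])

def largest_palindrome_product():
    """Largest palindrome that is a product of two 3-digit numbers."""
    best = 0
    for a in range(100, 1000):
        for b in range(a, 1000):          # b >= a skips mirrored duplicates
            product = a * b
            text = str(product)
            # A palindrome reads the same reversed, whatever its length.
            if product > best and text == text[::-1]:
                best = product
    return best
```

Here as_strings ends with "99999" while as_numbers ends with 906609, which matches what the original code was showing.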

Recursion in Python 3.2

I am trying to wrap my head around recursion and have posted a working algorithm to produce all the subsets of a given list.
def genSubsets(L):
    res = []
    if len(L) == 0:
        return [[]]
    smaller = genSubsets(L[:-1])
    extra = L[-1:]
    new = []
    for i in smaller:
        new.append(i+extra)
    return smaller + new
Let's say my list is L = [0,1], correct output is [[],[0],[1],[0,1]]
Using print statements I have narrowed down that genSubsets is called twice before I ever get to the for loop. That much I get.
But why does the first for loop initiate a value of L as just [0] and the second for loop use [0,1]? How exactly do the recursive calls work that incorporate the for loop?
I think this would actually be easier to visualize with a longer source list. If you use [0, 1, 2], you'll see that the recursive calls repeatedly cut off the last item from the list. That is, recursion builds up a stack of recursive calls like this:
genSubsets([0,1,2])
    genSubsets([0,1])
        genSubsets([0])
            genSubsets([])
At this point it hits the "base case" of the recursive algorithm. For this function, the base case is when the list given as a parameter is empty. Hitting the base case means it returns a list containing an empty list, [[]]. Here's how the stack looks when it returns:
genSubsets([0,1,2])
    genSubsets([0,1])
        genSubsets([0]) <- gets [[]] returned to it
So that return value gets back to the previous level, where it is saved in the smaller variable. The variable extra gets assigned to be a slice including only the last item of the list, which in this case is the whole contents, [0].
Now, the loop iterates over the values in smaller, and adds their concatenation with extra to new. Since there's just one value in smaller (the empty list), new ends up with just one value too, []+[0] which is [0]. I assume this is the value you're printing out at some point.
Then the last statement returns the concatenation of smaller and new, so the return value is [[],[0]]. Another view of the stack:
genSubsets([0,1,2])
    genSubsets([0,1]) <- gets [[],[0]] returned to it
The return value gets assigned to smaller again, extra is [1], and the loop happens again. This time, new gets two values, [1] and [0,1]. They get concatenated onto the end of smaller again, and the return value is [[],[0],[1],[0,1]]. The last stack view:
genSubsets([0,1,2]) <- gets [[],[0],[1],[0,1]] returned to it
The same thing happens again, this time adding 2s onto the end of each of the items found so far. new ends up as [[2],[0,2],[1,2],[0,1,2]].
The final return value is [[],[0],[1],[0,1],[2],[0,2],[1,2],[0,1,2]]
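If you want to watch this happen rather than add print statements by hand, a traced variant along these lines records each call and return with indentation matching the stack depth. The depth and log parameters and the name gen_subsets_traced are additions for illustration only; the subset logic is the same as in the question.

```python
def gen_subsets_traced(items, depth=0, log=None):
    """Same algorithm as genSubsets, but records calls and returns in log."""
    if log is None:
        log = []
    indent = "    " * depth
    log.append(f"{indent}call {items}")
    if len(items) == 0:
        result = [[]]                       # base case
    else:
        smaller = gen_subsets_traced(items[:-1], depth + 1, log)
        extra = items[-1:]                  # one-element list with the last item
        new = [subset + extra for subset in smaller]
        result = smaller + new
    log.append(f"{indent}return {result}")
    return result

trace = []
subsets = gen_subsets_traced([0, 1], log=trace)
print("\n".join(trace))
```

The printed trace shows all the "call" lines first, going deeper, and only then the "return" lines coming back up, which is why the loop first runs with L equal to [0].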
I am no big fan of trying to visualize the entire call graph of recursive functions to understand what they do.
I believe there is a much simpler way:
Enter fairy tale land where recursive functions do the right thing™.
Just assume that genSubsets(L) works:
# This computes the powerset of the list L minus the last element
smaller = genSubsets(L[:-1])
Because this magically worked, the only entries that are missing are those that contain the last element.
This fragment constructs all those missing subsets:
new = []
for i in smaller:
    new.append(i+extra)
Now we have those subsets containing the last element in new and we have those subsets not containing the last element in smaller.
It follows that we must now have all subsets, so we can return smaller + new.
The only thing left is the base case to make sure the recursion stops. Because the empty set (or list in this case) is an element of every power set, we can use that to stop the recursion: the powerset of an empty set is a set containing only the empty set. So our base case is correct. Since every recursive step removes one element from the list, the base case must be reached at some point.
Thus, the code really does produce the power set.
Note: The principle behind this is that of induction. If something works for some known n0, and we can prove that the algorithm working for n implies it works for n+1, then it must work for all n ≥ n0.
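The induction argument can also be spot-checked by brute force for small n. This sketch compares the recursive function against subsets built with itertools.combinations; gen_subsets is just the question's function renamed, and the check is a sanity test, not a proof.

```python
from itertools import combinations

def gen_subsets(items):
    """The question's algorithm: subsets without the last item, plus with it."""
    if len(items) == 0:
        return [[]]
    smaller = gen_subsets(items[:-1])
    extra = items[-1:]
    return smaller + [subset + extra for subset in smaller]

def matches_reference(n):
    """Compare against all combinations of every size for a list of n items."""
    items = list(range(n))
    expected = {frozenset(c)
                for size in range(n + 1)
                for c in combinations(items, size)}
    got = gen_subsets(items)
    # Each step doubles the number of subsets, so there should be 2**n of them.
    return len(got) == 2 ** n and {frozenset(s) for s in got} == expected

all_match = all(matches_reference(n) for n in range(8))
```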
