Loop until all elements have been accessed N times in Python

I have a group of buckets, each with a certain number of items in them. I want to make combinations with one item from each bucket. The loop should keep making different combinations until each item has participated at least some defined number of times.
I can easily see how to run the loop and stop once a single element has been accessed a certain number of times. However I can't see how to set a minimum cutoff point beyond searching through all the elements in all the buckets to check their access number after every iteration of the loop.

itertools.product is one (very systematic) way to make the "combinations" you request (not to be confused with itertools.combinations, of course), or you could make them randomly with random.choice from each bucket; I'm not sure which one is right for you since I don't know what your real purpose is.
Anyway, I'd keep track of how many combos each item has been in with a dict (or one dict per bucket, if items can overlap among buckets). Or you could use a collections.Counter, which is available from Python 2.7 onward.
At any rate, one possibility to do what you request is: the moment an item's count reaches N, remove that item from its bucket (or from all buckets, if there's overlap and that's the semantics you require). If this leaves the bucket empty, restore the bucket's contents and mark that bucket as "done" (you don't need to remove items from a done bucket), e.g. by adding the bucket's index to a set.
You're done when all buckets are done (whether it be randomly or systematically).
Need some code to explain this better? Then please specify the overlap semantics (if overlap is possible) and the systematic-or-random requirements you have.
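As a rough illustration of the "remove the item once it reaches N, mark the bucket done when it empties" idea, here is a minimal sketch of the random variant. It assumes no overlap between buckets (counts are kept per bucket), distinct items within each bucket, and no empty buckets. The function name draw_until_covered and its parameters are made up for this example.

```python
import random

def draw_until_covered(buckets, n, rng=random):
    """Draw one item per bucket until every item has been in at least n combos."""
    # Assumption: counts are tracked per (bucket index, item), i.e. no overlap semantics.
    counts = {(b, item): 0 for b, bucket in enumerate(buckets) for item in bucket}
    # Working copies holding only the items that are still below n.
    pending = [list(bucket) for bucket in buckets]
    done = set()   # indexes of buckets whose items have all reached n
    combos = []
    while len(done) < len(buckets):
        combo = []
        for b, bucket in enumerate(buckets):
            # A done bucket is "restored": it draws from its full contents again.
            pool = bucket if b in done else pending[b]
            item = rng.choice(pool)
            combo.append(item)
            counts[(b, item)] += 1
            if b not in done and counts[(b, item)] >= n:
                pending[b].remove(item)
                if not pending[b]:
                    done.add(b)
        combos.append(combo)
    return combos, counts

buckets = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 0]]
combos, counts = draw_until_covered(buckets, 2)
print(len(combos), min(counts.values()))
```

Because a bucket that is not yet done only ever draws items still below n, the loop needs exactly n times the size of the largest bucket rounds, and no full scan of all counts is required after each combination.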

Try:
from collections import defaultdict
visits = defaultdict(int)
# do this each time a node n is visited (inside the loop that makes combinations)
visits[n] += 1
if visits[n] >= MAX_VISITS:
    break
print('done')

Use a dictionary with the items as keys. Every time an item is used, update its count. Then check whether all the values have reached the threshold, i.e.:
counter = {item: 0 for bucket in buckets for item in bucket}  # every item starts at zero
while min(counter.values()) < threshold:
    # make a combination
    # and update the dictionary
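Checking min() over the whole dictionary rescans every count on each pass, which is what the question hoped to avoid. One possible variant, sketched here under the assumption that items are distinct across buckets and no bucket is empty, keeps a tally of how many items are still below the threshold instead. The name combos_until_threshold is hypothetical, and itertools.product is used only as one way to generate the combinations.

```python
from itertools import cycle, product

def combos_until_threshold(buckets, threshold):
    """Cycle through itertools.product until every item has been used `threshold` times."""
    counter = {item: 0 for bucket in buckets for item in bucket}
    # Number of items whose count is still below the threshold.
    below = len(counter) if threshold > 0 else 0
    combos = []
    all_combos = cycle(product(*buckets))  # restart the product if one pass is not enough
    while below:
        combo = next(all_combos)
        combos.append(combo)
        for item in combo:
            counter[item] += 1
            if counter[item] == threshold:
                below -= 1  # this item just reached the threshold
    return combos, counter

buckets = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 0]]
combos, counter = combos_until_threshold(buckets, 2)
print(len(combos), combos[-1])
```

The loop stops as soon as the tally reaches zero, so each combination costs only as many updates as there are buckets.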

In vanilla Python, this seems to do the job:
buckets = [ [1,2,3],[4],[5,6],[7,8,9,0] ]
def combo(b, i=0, pref=[]):
    if len(b) > i:
        c = b[i]
        for v in c:
            # extend the prefix with one item from bucket i, then recurse into the next bucket
            combo(b, i + 1, pref + [v])
    else:
        print(pref)
combo(buckets)
Output:
[1, 4, 5, 7]
[1, 4, 5, 8]
[1, 4, 5, 9]
[1, 4, 5, 0]
[1, 4, 6, 7]
[1, 4, 6, 8]
[1, 4, 6, 9]
[1, 4, 6, 0]
[2, 4, 5, 7]
[2, 4, 5, 8]
[2, 4, 5, 9]
[2, 4, 5, 0]
[2, 4, 6, 7]
[2, 4, 6, 8]
[2, 4, 6, 9]
[2, 4, 6, 0]
[3, 4, 5, 7]
[3, 4, 5, 8]
[3, 4, 5, 9]
[3, 4, 5, 0]
[3, 4, 6, 7]
[3, 4, 6, 8]
[3, 4, 6, 9]
[3, 4, 6, 0]
There is no doubt a more Pythonic way of doing it.
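For comparison, the same listing can probably be produced with itertools.product from the standard library, which yields one tuple per combination in the same order as the recursive version above. This is a sketch rather than the answerer's own code.

```python
from itertools import product

buckets = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 0]]

# product(*buckets) takes one item from each bucket, varying the last bucket fastest.
combos = [list(c) for c in product(*buckets)]
for c in combos:
    print(c)
```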

Related

Removing duplicate integers in python

As a beginner python learner I bumped up to a wall at this point and couldn't figure it out.
What I am trying to do is to be able to pick integers in multiple lists and remove the duplicates among them. Then make a copy list which does not include the duplicates.
def my_function(x):
    return list(dict.fromkeys(x))
liss = [[1,2],[3,4,5,6],[1,4,3,99]]
list2 = my_function(str(liss))
list1 = [x for i in list2 for x in i]
print(list1)
Consider using a set comprehension:
>>> xss = [[1, 2], [3, 4, 5, 6], [1, 4, 3, 99]]
>>> xs_no_dups = list({x for xs in xss for x in xs})
>>> xs_no_dups
[1, 2, 3, 4, 5, 6, 99]
Please write code to implement the algorithm named "merge sort".
The Wikipedia page on merge sort describes it in detail.
Write code to accomplish the following tasks:
Sort each list. For example, [4, 2, 1, 2, 5, 0] becomes [0, 1, 2, 2, 4, 5].
Merge the sorted lists. For example, merge [1, 3, 4, 9] and [2, 6, 20] to form [1, 2, 3, 4, 6, 9, 20].
Traverse the final sorted, merged list from left to right. If the current element is different from the previous element, then output the current element.
Consider the following list, named the_list:
OUTPUTS (VALUES):  1  1  1  5  5  6  9  9  9  9  14  14  14
INPUTS (INDICES):  0  1  2  3  4  5  6  7  8  9  10  11  12
Go through the list from left-to-right.
Notice that the_list[0] is 1.
1 has not been seen before.
So, you should send 1 to the output stream.
We are finished processing the_list[0]
Next, take a look at the_list[1].
the_list[1] is equal to the previous value.
So, do NOT send the_list[1] to the output stream.
Only when the input changes do we send the input to the output stream.
As long as the input is the same as the previous value, we send nothing to the output stream.
def foobar(the_list):
    # `prev` stands for the English word `previous`
    the_list = iter(the_list)
    prev = next(the_list)
    yield prev
    for elem in the_list:
        if elem != prev:
            yield elem
        prev = elem
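The three steps (sort each list, merge, emit an element only when it differs from the previous one) can be sketched end to end as follows. Note the substitutions: this uses the built-in sorted() and heapq.merge from the standard library instead of a hand-written merge sort, and the function name unique_sorted is made up for the example.

```python
import heapq

def unique_sorted(lists):
    """Sort each list, merge them, and keep an element only when it differs from the previous one."""
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    merged = heapq.merge(*(sorted(xs) for xs in lists))
    result = []
    for elem in merged:
        # In sorted order all duplicates are adjacent, so comparing with the last kept value is enough.
        if not result or elem != result[-1]:
            result.append(elem)
    return result

print(unique_sorted([[1, 2], [3, 4, 5, 6], [1, 4, 3, 99]]))
```

Unlike the set-based approach, the result comes out sorted, but the original order of first appearance is lost.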

How to make ascending sublists in a list of integers go in descending order?

Working on some example questions, the particular one asks to make a function which would take a list and return a new one which would make every ascending sublist in the list go in descending order and leave the descending sublists as they are. For example, given the list [1,2,3,4,5], I need the list [5,4,3,2,1] or given a list like [1,2,3,5,4,6,7,9,8] would return [5,3,2,1,9,7,6,4,8]
Here's what I have so far, but it does not do anything close to what I'd like it to do:
def example3(items):
    sublst = list()
    for i in items:
        current_element = [i]
        next_element = [i+1]
        if next_element > current_element:
            sublst = items.reverse()
        else:
            return items
    return sublst
print (example3([1,2,3,2])) #[[1, 2, 3, 2], [1, 2, 3, 2], [1, 2, 3, 2], [1, 2, 3, 2]]
EDIT:
I feel like people are a little confused as to what I want to do in this case, so here's a better example of what I'd like my function to do. Given a list like [5, 7, 10, 4, 2, 7, 8, 1, 3], I would like it to return [10, 7, 5, 4, 8, 7, 2, 3, 1]. As you can see, every sublist that is in ascending order, such as [5, 7, 10], gets reversed, here to [10, 7, 5].
It was a bit challenging to figure out what you need.
I think you want something like as follows:
import random
l = [5, 7, 10, 4, 2, 7, 8, 1, 3]
bl = []
while True:
    if len(l) == 0:
        break
    r = random.randint(0, len(l))
    bl.extend(l[r:None:-1])
    l = l[r+1:]
print(bl)
Out1:
[10, 7, 5, 4, 8, 7, 2, 3, 1]
Out2:
[10, 7, 5, 2, 4, 1, 8, 7, 3]
Out3:
[3, 1, 8, 7, 2, 4, 10, 7, 5]
Out4:
[2, 4, 10, 7, 5, 3, 1, 8, 7]
etc.
If you want a specific reverse random list:
import random
loop_number = 0
while True:
    l = [5, 7, 10, 4, 2, 7, 8, 1, 3]
    bl = []
    while True:
        if len(l) == 0:
            break
        r = random.randint(0, len(l))
        bl.extend(l[r:None:-1])
        l = l[r+1:]
    loop_number += 1
    if bl == [10, 7, 5, 4, 8, 7, 2, 3, 1]:
        print(bl)
        print("I tried {} times".format(loop_number))
        break
Out:
[10, 7, 5, 4, 8, 7, 2, 3, 1]
I tried 336 times
The general algorithm is to keep track of the current ascending sublist you are processing using 2 pointers, perhaps a "start" and "curr" pointer. curr iterates over each element of the list. As long as the current element is greater than the previous element, you have an ascending sublist, and you move curr to the next number. If the curr number is less than the previous number, you know your ascending sublist has ended, so you collect all numbers from start to curr - 1 (because array[curr] is less than array[curr - 1] so it can't be part of the ascending sublist) and reverse them. You then set start = curr before incrementing curr.
You will have to deal with the details of the most efficient way of reversing them, as well as the edge cases with the pointers like what should the initial value of start be, as well as how to deal with the case that the current ascending sublist extends past the end of the array. But the above paragraph should be sufficient in getting you to think in the right direction.
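One way the two-pointer idea described above might look in code is sketched below. It treats a run as strictly ascending (an equal neighbour ends the run), which is an assumption the question does not settle, and it handles the "run extends past the end" case by letting curr go one step beyond the last index. The function name reverse_ascending_runs is made up.

```python
def reverse_ascending_runs(items):
    """Return a new list in which every strictly ascending run is reversed."""
    result = []
    start = 0  # index where the current ascending run begins
    for curr in range(1, len(items) + 1):
        # The run ends at the end of the list or when the sequence stops ascending.
        if curr == len(items) or items[curr] <= items[curr - 1]:
            result.extend(reversed(items[start:curr]))
            start = curr
    return result

print(reverse_ascending_runs([5, 7, 10, 4, 2, 7, 8, 1, 3]))
```

A descending stretch is just a sequence of runs of length one, so it passes through unchanged.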

Remove duplicate numbers from a list

I was attempting to remove all duplicated numbers in a list.
I was trying to understand what is wrong with my code.
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
for x in numbers:
    if numbers.count(x) >= 2:
        numbers.remove(x)
print(numbers)
The result I got was:
[1, 1, 6, 5, 2, 3]
I guess the idea is to write the code yourself without using library functions. In that case I would still suggest using an additional set to store the items you have already seen, and going over your list only once:
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
unique = set()
for x in numbers:
    if x not in unique:
        unique.add(x)
numbers = list(unique)
print(numbers)
If you want to use your code then the problem is that you modify collection in for each loop, which is a big NO NO in most programming languages. Although Python allows you to do that, the problem and solution are already described in this answer: How to remove items from a list while iterating?:
Note: There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,
for x in a[:]:
    if x < 0: a.remove(x)
Using a shallow copy of the list:
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
for x in numbers[:]:
    if numbers.count(x) >= 2:
        numbers.remove(x)
print(numbers) # [1, 6, 5, 2, 3]
Alternatives:
Preserving the order of the list:
Using dict.fromkeys()
print(list(dict.fromkeys(numbers).keys())) # [1, 6, 5, 2, 3]
Using more_itertools.unique_everseen(iterable, key=None):
from more_itertools import unique_everseen
print(list(unique_everseen(numbers))) # [1, 6, 5, 2, 3]
Using pandas.unique:
import pandas as pd
print(pd.unique(numbers).tolist()) # [1, 6, 5, 2, 3]
Using collections.OrderedDict([items]):
from collections import OrderedDict
print(list(OrderedDict.fromkeys(numbers))) # [1, 6, 5, 2, 3]
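Using itertools.groupby(iterable[, key]) (this only collapses adjacent duplicates, which is enough here because equal values sit next to each other):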
from itertools import groupby
print([k for k,_ in groupby(numbers)]) # [1, 6, 5, 2, 3]
Ignoring the order of the list:
Using numpy.unique:
import numpy as np
print(np.unique(numbers).tolist()) # [1, 2, 3, 5, 6]
Using set():
print(list(set(numbers))) # [1, 2, 3, 5, 6]
Using frozenset([iterable]):
print(list(frozenset(numbers))) # [1, 2, 3, 5, 6]
Why don't you simply use a set:
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
numbers = list(set(numbers))
print(numbers)
Before anything, the first advice I can give is to never modify a list while you are looping over it. All kinds of wacky stuff happens. The logic of your code is otherwise fine (I recommend reading the other answers though; there's an easier way to do this with a set, which pretty much handles the duplication for you).
Instead of removing numbers from the list you are looping over, loop over a clone of it, made with slicing right in the for statement.
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
for x in numbers[:]:
    if numbers.count(x) >= 2:
        numbers.remove(x)
    print(numbers)
print("Final")
print(numbers)
The answer there is numbers[:], which gives back a clone of the array. Here's the print output:
[1, 1, 1, 6, 5, 5, 2, 3]
[1, 1, 6, 5, 5, 2, 3]
[1, 6, 5, 5, 2, 3]
[1, 6, 5, 5, 2, 3]
[1, 6, 5, 5, 2, 3]
[1, 6, 5, 2, 3]
[1, 6, 5, 2, 3]
[1, 6, 5, 2, 3]
[1, 6, 5, 2, 3]
Final
[1, 6, 5, 2, 3]
Leaving a placeholder here until I figure out how to explain why in your particular case it's not working, like the actual step by step reason.
Another way to solve this, making use of the beautiful language that is Python, is through a set comprehension.
Why a set? Because the definition of this data structure is that its elements are unique, so even if you try to put in multiple elements that are the same, they won't appear repeated in the set. Cool, right?
A comprehension is syntactic sugar for looping in one line. Get used to it with Python: you'll either use it a lot, or see it a lot :)
With a comprehension you iterate over an iterable and return each item. In the code below, x represents each number in numbers, and x is returned to be part of the set. Because the set handles duplicates... voilà, your code is done. Note that a set does not keep the original order.
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
numbers_as_set = {x for x in numbers}
print(numbers_as_set)
This seems like homework but here is a possible solution:
import numpy as np
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
filtered = np.unique(numbers).tolist()  # tolist() converts NumPy integers to plain Python ints
print(filtered)
#[1, 2, 3, 5, 6]
This solution does not preserve the ordering. If you need also the ordering use:
filtered_with_order = list(dict.fromkeys(numbers))
Why don't you use fromkeys?
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
numbers = list(dict.fromkeys(numbers))
Output: [1,6,5,2,3]
The flow is as follows.
The list is [1, 1, 1, 1, 6, 5, 5, 2, 3] and the index is 0.
x is 1. numbers.count(1) is 4, so the 1 at index 0 is removed.
The list becomes [1, 1, 1, 6, 5, 5, 2, 3], but the index is incremented to 1.
x is 1. numbers.count(1) is 3, so the first 1 in the list is removed (remove always deletes the first match).
The list becomes [1, 1, 6, 5, 5, 2, 3], but the index is incremented to 2.
x is now 6: the loop has moved past the two remaining 1s without ever visiting them.
etc...
So that's why two 1s remain.
Please correct me if I am wrong. Thanks!
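To check the walkthrough above, the loop can be instrumented so that it records the index, the value seen, and the state of the list after each step. This is only a tracing sketch; trace_removal is a made-up name, and enumerate is used purely to expose the position the for loop is reading from.

```python
def trace_removal(numbers):
    """Replay the question's loop and record (index, x, removed?, list afterwards) per step."""
    numbers = list(numbers)  # work on a copy of the caller's list
    steps = []
    for index, x in enumerate(numbers):  # deliberately iterating the list being mutated
        removed = numbers.count(x) >= 2
        if removed:
            numbers.remove(x)  # removes the first matching element, not the one at `index`
        steps.append((index, x, removed, list(numbers)))
    return numbers, steps

final, steps = trace_removal([1, 1, 1, 1, 6, 5, 5, 2, 3])
for step in steps:
    print(step)
print(final)
```

The trace has only six steps for a nine-element list, because each removal shortens the list while the internal index keeps advancing.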
A fancy method is to use collections.Counter:
>>> from collections import Counter
>>> numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
>>> c = Counter(numbers)
>>> list(c.keys())
[1, 6, 5, 2, 3]
This method has linear time complexity (O(n)) and relies on a well-optimized standard library class.
You can try:
from more_itertools import unique_everseen
items = [1, 1, 1, 1, 6, 5, 5, 2, 3]
list(unique_everseen(items))
or
>>> from collections import OrderedDict
>>> items = [1, 1, 1, 1, 6, 5, 5, 2, 3]
>>> list(OrderedDict.fromkeys(items))
[1, 6, 5, 2, 3]
You can find more in this question:
How do you remove duplicates from a list whilst preserving order?

Different ways of Iterating over a list

Can somebody tell me what's the difference between this code:
x = [1, 2, 3, 4, 5, 6, 7]
for i in x[:]:
    if i == 5:
        x.insert(0, i)
And this code:
x = [1, 2, 3, 4, 5, 6, 7]
for i in x:
    if i == 5:
        x.insert(0, i)
Why doesn't the second one work? I know it is mentioned in the Python tutorial, but I can't quite understand it.
In the first version, you create a copy (by slicing the list from the beginning to the end), in the second one you're iterating over the original list.
If you iterate over a list, you shouldn't change its size during the iteration, for good reasons (see below). But since you call x.insert, the size of your list changes.
If you execute the second version, it doesn't throw an error immediately, but continues indefinitely, filling the list up with more and more 5s:
Once you're at list index 4 (and i is therefore 5), you insert a 5 at the beginning of the list:
[5, 1, 2, 3, 4, 5, 6, 7]
You then continue in the loop, implicitly increasing your index to 5. The element at index 5 is again 5, because the insertion shifted the whole list one place to the right, so 5 will be inserted again:
[5, 5, 1, 2, 3, 4, 5, 6, 7]
This goes on "forever" (until there's a MemoryError).
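The runaway behaviour can be observed safely by capping the number of 5s. The sketch below adds an artificial limit parameter (not part of the question's code) so the loop stops after a few insertions, and contrasts it with the version that iterates over a copy. Both function names are made up for the example.

```python
def insert_while_iterating(limit=5):
    """Iterate over the list itself; stop once it holds more than `limit` fives."""
    x = [1, 2, 3, 4, 5, 6, 7]
    visited = []
    for index, i in enumerate(x):
        visited.append((index, i))
        if i == 5:
            x.insert(0, i)
            if x.count(5) > limit:  # artificial cap so the demonstration terminates
                break
    return x, visited

def insert_over_copy():
    """Iterate over a slice copy; the insertion cannot affect the iteration."""
    x = [1, 2, 3, 4, 5, 6, 7]
    for i in x[:]:
        if i == 5:
            x.insert(0, i)
    return x

print(insert_while_iterating())
print(insert_over_copy())
```

In the first function every step from index 4 onward sees the same 5 again, while the copy version inserts exactly one 5 and finishes.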

Smallest number in a particular part of my list

I have a question about python. I have to sort a list of random numbers in a particular way (it's not allowed to use sort()). I'll try to explain:
I have to search for the smallest number, and swap this number with the number at the first position in the list.
Then, I search again for the smallest number, but this time ignore the first number in my list because this one is already sorted. So, I should start searching for the smallest number from the second number (index 1) till the end of the list. The smallest number then found, should be swapped with the second number in the list(so the index 1).
I hope you understand my problem. This is the code I wrote so far, but I get errors and/or the sorting isn't correct.
array = random_integers(10,size=10)
my_list = list(array)
for i in range(len(my_list)):
    print my_list
    a = min(my_list[i:len(my_list)])
    b = my_list.index(a)
    my_list[i],my_list[b]=my_list[b],my_list[i]
print my_list
I think there's a problem in my range, and a problem with the
a = min(my_list[i:len(my_list)])
I want to search for the smallest number, but not in the ENTIRE list how can I do this?
The problem occurs on this line:
b = my_list.index(a)
since this searches for the first occurrence of a in all of my_list. If the same number occurs twice, then b will always correspond to the smallest such index, which might be less than i. So you might end up moving a number which has already been sorted.
The obvious thing to try is to slice my_list before calling index:
my_list[i:].index(a)
but note that index will return values between 0 and N-i-1, where N is len(my_list). We want indices between i and N-1, so be sure to add i to the result:
b = my_list[i:].index(a)+i
Thus, the easiest way to fix your code as it presently exists is:
for i in range(len(my_list)):
    a = min(my_list[i:])
    b = my_list[i:].index(a)+i
    my_list[i], my_list[b] = my_list[b], my_list[i]
but notice that min is searching through all the items in my_list[i:] and then the call to index is traversing the same list a second time. You could find b in one traversal like this:
b = min(range(i, N), key=my_list.__getitem__)
Demo:
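import numpy as np
array = np.random.randint(1, 11, size=10)  # integers from 1 to 10, like the deprecated random_integers(10, size=10)
my_list = array.tolist()
N = len(my_list)
for i in range(N):
    # index of the smallest element in my_list[i:], found in a single pass
    b = min(range(i, N), key=my_list.__getitem__)
    my_list[i], my_list[b] = my_list[b], my_list[i]
    print(my_list)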
yields
[3, 10, 9, 6, 5, 3, 6, 8, 8, 4]
[3, 3, 9, 6, 5, 10, 6, 8, 8, 4]
[3, 3, 4, 6, 5, 10, 6, 8, 8, 9]
[3, 3, 4, 5, 6, 10, 6, 8, 8, 9]
[3, 3, 4, 5, 6, 10, 6, 8, 8, 9]
[3, 3, 4, 5, 6, 6, 10, 8, 8, 9]
[3, 3, 4, 5, 6, 6, 8, 10, 8, 9]
[3, 3, 4, 5, 6, 6, 8, 8, 10, 9]
[3, 3, 4, 5, 6, 6, 8, 8, 9, 10]
[3, 3, 4, 5, 6, 6, 8, 8, 9, 10]
If you want the smallest number from a list you can use min(). If you want a part of a list you can use list slicing: my_list[1:]. Put the two together and you get the smallest number from a part of your list. However, you don't need to do this, as you can .pop() from the list instead.
sorted_list = []
while my_list:
    n = min(my_list)
    sorted_list.append(my_list.pop(my_list.index(n)))
If you're using numpy arrays then instead of my_list.index(min(my_list)) you can use the .argmin() method.
While this type of sorting is good for an introduction, it is not very efficient. You may want to look at merge sort, and also at Timsort, the algorithm behind Python's built-in sort.
