Random sampling from a set of integers - python

I am working with Python 3.2. I have spent a lot of time troubleshooting this, and I still can't seem to wrap my brain around it.
number = random.randint ( x0 ,xn )
I'm generating a random number. Its purpose is to make my code come at me differently every time.
For example, I have 10 variables of text that I have written. I have solved the problem of these variables appearing in the same order at each program run.
The issue I have is that they now appear completely at random every time. It picks one out of 10 every time, instead of one out of 10 the first time and one out of 9 the next. I can't seem to find out how to exclude the previous ones.
thelist = [0]
while i < x:
    if number in thelist:
        >>>repeat<<<
    else:
        thelist.append (number)
        if ( number == x0 ):
            >>>something<<<
        elif ( number == x1 ):
            >>>something<<<
This is what I imagine the code would look like: every time you loop, one more number gets appended to the list, so that whenever it picks a number already in the list, it repeats the loop until it has used all the numbers that random.randint can pull.

Here's a shuffle function:
import random
max = 15
x = list(range(max+1))
for i in range(max, 0, -1):
    n = random.randint(0, i)
    x[n], x[i] = x[i], x[n]
This starts with a sorted list of numbers [0, 1, ... max].
Then, it chooses a random index from 0 to max, and swaps the element there with the one at index max.
Then, it chooses a random index from 0 to max-1, and swaps the element there with the one at index max-1.
And so on, for max-2, max-3, ... 1.
As yosukesabai rightly notes, this has the same effect as calling random.sample(range(max+1), max+1). This picks max + 1 unique random values from range(max+1). In other words, it just shuffles the order around. Docs: http://docs.python.org/2/library/random.html#random.sample
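As a quick check of that equivalence, here is a minimal sketch. The name upper is my own and stands in for max above, so that the built-in max() is not shadowed:

```python
import random

upper = 15  # stands in for "max" in the answer above

# Draw upper + 1 unique values from range(upper + 1): a full permutation.
order = random.sample(range(upper + 1), upper + 1)

print(order)
# Every value from 0 to upper appears exactly once; only the order changes.
print(sorted(order) == list(range(upper + 1)))
```

The second print should always show True, whatever order the first one shows.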
If you wanted something more along the lines of your proposed algorithm, you could do:
import random
max = 15
x = range(max+1)
l = []
for _ in range(max+1):
    n = random.randint(0,max)
    while n in l:
        n = random.randint(0,max)
    l.append(n)

From what I understand of your description and sample code, you want thelist to end up with every integer between x0 and xn in a random order. If so, you can achieve that very simply with random.shuffle(), which shuffles a list in place:
import random
x0 = 5
xn = 15
# range() excludes its upper bound, so add 1 to include xn, as randint(x0, xn) does
full_range = list(range(x0, xn + 1))
print(full_range)
random.shuffle(full_range)
print(full_range)
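Applied to the original problem of showing ten pieces of text in a different order on each run, without repeats, the same idea might look like the sketch below. The list texts and its contents are placeholders I made up for illustration:

```python
import random

# Placeholder data: stands in for the ten text variables from the question.
texts = ["text number %d" % i for i in range(1, 11)]

order = list(texts)    # copy, so the original list keeps its order
random.shuffle(order)  # shuffle the copy in place

# Each text is shown exactly once per run, in a different order each run.
for text in order:
    print(text)
```

Shuffling once and then iterating avoids the retry loop entirely, because nothing can be picked twice.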

Related

Why is my function returning full list instead of just a total count?

I've been working on a problem for a class that involves random number generation and returning a count for those divisible by 7 or 13. The function appears to work, or is at least returning the proper values along the steps. But, it returns a list for the count instead of a single value. What am I doing wrong? The requirements and my code so far are:
Develop and call a function that will:
Generate a random number n with a value between 1 and 1,000.
In the range of (n, n+200), iterate each number and count how many of them are divisible by 7 and how many of them are divisible by 13.
Print out the result
import random
def randDiv():
    n = random.randint(1, 1000)
    randList=[]
    for x in range(n, n+200):
        if (x%7==0) or (x%13==0):
            randList.append(str(x))
            total = len(randList)
            print(total)
randDiv()
You are printing the total in each iteration. You should print it after the for loop.
import random
def randDiv():
    n = random.randint(1, 1000)
    randList=[]
    for x in range(n, n+200):
        if (x%7==0) or (x%13==0):
            randList.append(x)
    print(len(randList))
randDiv()
I assume you mean that the function prints many values, as there is no return statement. Your print statement is nested, so each time (x%7==0) or (x%13==0) is True, it will increment the length of randList by 1 and print total. Moving the final print statement outside the for loop should solve the problem.
In the for loop, you check whether the current value is divisible by 7 or 13 and, if so, add it to the list. And every time, you print the length of the list. But if you want to print separately how many of them are divisible by 7 and how many by 13, you have to keep two different lists.
import random
def randDiv():
    # n = random.randint(1, 1000)
    n = 1
    divBySeven=[]
    divByThirteen=[]
    for x in range(n, n+200):
        if (x%7==0):
            divBySeven.append(x)
        if x % 13 == 0:
            divByThirteen.append(x)
    print(len(divBySeven))
    print(len(divByThirteen))
randDiv()
Some numbers, such as 91, are divisible by both 7 and 13. You have to consider that case as well.
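One possible way to account for that overlap is to count the multiples of 91 (7 * 13) separately. This is only a sketch; the function name count_divisible and its span parameter are my own and not part of the assignment:

```python
import random

def count_divisible(n, span=200):
    """Count multiples of 7, of 13, and of both in range(n, n + span)."""
    by_seven = by_thirteen = by_both = 0
    for x in range(n, n + span):
        if x % 7 == 0:
            by_seven += 1
        if x % 13 == 0:
            by_thirteen += 1
        if x % 91 == 0:  # 91 = 7 * 13, so x is divisible by both
            by_both += 1
    return by_seven, by_thirteen, by_both

n = random.randint(1, 1000)
sevens, thirteens, both = count_divisible(n)
print(sevens, thirteens, both)
# Numbers divisible by 7 or 13, counting the overlap only once:
print(sevens + thirteens - both)
```

Returning the counts instead of printing them inside the function also makes the result easy to reuse or check.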

how to print 100 random numbers of "set" in python and add them to empty set()

How do I print 100 random numbers of a "set" in Python? That is, I have to take 100 random numbers from a given range and add them to an empty set(). I need a solution in Python. I have tried the following, but it doesn't give exactly 100 numbers.
import random
s=set()
for i in range(200):
    s.add((random.randint(0,101)))
print(s)
print(len(s))
This will create a set that always has 100 elements (given the input range is equal to or larger than 100).
import random
set(random.sample(range(1337), 100))
As the comments said, a set can't contain duplicate numbers, so you need to run a while loop, adding a random number each time, until the set has the number of elements you need.
import random
s=set()
while len(s) < 100:
    s.add((random.randint(0,200)))
print(s)
print(len(s))
set() can only contain unique items. If you try adding an item to set() that already exists, it will be silently ignored:
>>> s = set()
>>> s.add(1)
>>> s
{1}
>>> s.add(1)
>>> s
{1}
In your original code you're actually trying to add 200 random numbers from the range 0 to 101 to a set. Conceptually this is wrong, because it's not possible to get 200 unique random numbers from that range: random.randint(0, 101) includes both end points, so it can only produce 102 distinct values.
The other issue with your code is that you're randomly choosing the number in each iteration, without checking if it has been added before.
So, in order to take N random numbers from a range of 0 to M, you would have to do the following:
import random
s = set()
N = 100 # Number of items that will be appended to the set
M = 200 # Maximum random number
random_candidates = list(range(M))
for _ in range(N):
    numbers_left = len(random_candidates)
    # Choose a random number and remove it from the candidate list
    number = random_candidates.pop(random.randrange(numbers_left))
    s.add(number)
The above will work well for small ranges. If you expect M to be a large number, then generating a large random_candidates list will not be very memory efficient.
In that case it would be better to randomly generate a number in a loop until you find one that was not chosen before:
import random
s = set()
N = 100 # Number of items that will be appended to the set
M = 2000 # Maximum random number
for _ in range(N):
    random_candidate = random.randrange(M)
    while random_candidate in s:
        random_candidate = random.randrange(M)
    s.add(random_candidate)
Sets don't allow duplicate values (that's part of the definition of a set...), and statistically you will get duplicate values when calling random.randint(). The solution here is obviously to use a while loop:
while len(s) < 100:
    s.add(random.randint(0, 101))
Note that with those values (100 ints in the 0..101 range) you won't get much variation since you're selecting 100 distinct values out of 102.
Also note that - as quamrana rightly mentions in a comment - if the range of possible values (randint() arguments) is smaller than the expected set length, the loop will never terminate.
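To guard against that endless loop, the range could be checked before drawing. A minimal sketch, assuming inclusive bounds as with randint(); the function name unique_randoms is hypothetical:

```python
import random

def unique_randoms(count, low, high):
    """Return a set of `count` distinct integers from low..high inclusive."""
    available = high - low + 1  # randint includes both end points
    if count > available:
        raise ValueError(
            "cannot draw %d distinct values from %d" % (count, available)
        )
    result = set()
    while len(result) < count:
        result.add(random.randint(low, high))
    return result

s = unique_randoms(100, 0, 101)
print(len(s))
```

With this check, asking for 200 distinct values from 0..101 raises a ValueError instead of looping forever.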

Python for loop on two dimensional index

This may be a simple task, but I am unsure of how to achieve this in Python.
I have a for loop executing on an index in Python. I have a unique value that is defined within each iteration that is cycled through the for loop.
I want to get the value of the NEXT or PREVIOUS for loop unique value. For example, I have:
counter = 0
for rect in rects:
    randomnumber = random.randint(1,101)
    if counter < len(rects)-1:
        if rects[counter] - rects[counter+1]:
            pastrand = {get random value from PREVIOUS loop iteration}
            randsubtract = randomnumber - pastrand
So how do I get the random number from the previous (or next) iteration to use in the CURRENT iteration in Python? For example:
randomnumber in rects[0]
randomnumber in rects[1]
How do I call specific values from iterations in for loops?
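It's late in the day, but this might do for you:
import random

counter = 0
randoms = []  # renamed from "random" so the list does not shadow the random module
for rect in rects:
    randomnumber = random.randint(1,101)
    if randoms and counter < len(rects)-1:
        if rects[counter] - rects[counter+1]:
            pastrand = randoms[-1]  # value from the previous iteration
            randsubtract = randomnumber - pastrand
    randoms.append(randomnumber)  # store only after pastrand has been read
    counter += 1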
Option 1
Use enumerate. If you want the current and next, you'll need to iterate till len(rects) - 1. If you want the previous and current, you'll need to start iterating from 1.
for i, r in enumerate(rects[:-1]):
    cur = r
    next = rects[i + 1]
Or,
for i, r in enumerate(rects[1:]):
    prev = rects[i]  # i starts at 0 here, so rects[i] is the element before r
    cur = r
Option 2
You can use zip to the same effect:
for cur, next in zip(rects, rects[1:]):
    ...
As written, you can save the value from the previous loop iteration before assigning a new one.
for ...:
    pastrand = randomnumber
    randomnumber = ...
Of course you will have to assign something to randomnumber before the loop starts so that the assignment works the very first time through.
An alternative would be to loop over pairs of random numbers rather than computing one random number per loop iteration. For this you can use the pairwise() tool whose implementation is given in the itertools documentation or e.g. in the more-itertools package. Looping over pairs of random numbers could be done like this:
for rand1, rand2 in pairwise(repeatfunc(random.randint, None, 1, 101)):
    ...
where I have used another itertool, repeatfunc(), to repeatedly call randint(). (You can do this without using repeatfunc() too.) At each iteration of this loop except the first, rand1 will be equal to rand2 from the previous iteration.
Now, you're going to want to pair each random number with a rectangle (assuming that's what is in rects), right? That you can do using zip(). Specifically, zip(random_numbers, rects) is an iterator over tuples of a random number and a rectangle. You could use it like so:
for randomnumber, rect in zip(random_numbers, rects):
    ...
but you're going to want to iterate over pairs, so you combine pairwise with that:
for r1, r2 in pairwise(zip(random_numbers, rects)):
    rand1, rect1 = r1
    rand2, rect2 = r2
    ...
Here random_numbers could be that thing I did earlier with repeatfunc(). This will associate one random number with each rectangle, and give you access to each set of two consecutive number/rectangle pairs.
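Put together, the pieces might look like the following Python 3 sketch. pairwise() and repeatfunc() are written out here following the itertools documentation recipes (from Python 3.10 on, itertools.pairwise is available directly); the contents of rects are placeholder data of my own:

```python
import random
from itertools import repeat, starmap, tee

def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ... (itertools recipe)."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def repeatfunc(func, times=None, *args):
    """Call func(*args) repeatedly; endlessly if times is None (itertools recipe)."""
    if times is None:
        return starmap(func, repeat(args))
    return starmap(func, repeat(args, times))

rects = ["rect A", "rect B", "rect C", "rect D"]  # placeholder data

# Endless stream of random numbers; zip() stops when rects runs out.
random_numbers = repeatfunc(random.randint, None, 1, 101)

differences = []
for (rand1, rect1), (rand2, rect2) in pairwise(zip(random_numbers, rects)):
    # rand1 belongs to the previous rectangle, rand2 to the current one
    differences.append((rect2, rand2 - rand1))

print(differences)
```

The loop body runs len(rects) - 1 times, since the first rectangle has no predecessor to compare against.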

Using python pick a random element from a list with replacement

I have a list of 40 elements. I am trying to estimate how many times I need to sample this list in order to reproduce all elements in that list. However, it is important that I replace the picked element. I.e. it is possible that I will pick the same element 20 times. So far I have the following
import random
l = range(0,40)
seen=[]
x=0
while len(seen)<len(l):
    r = random.choice(l)
    if r not in seen:
        seen.append(r)
        x=x+1
print x
However, this always returns that it took 40 times to accomplish what I want. However, this is because a single element is never selected twice.
Eventually I would run this function 1000 times to get a feel for how often I would have to sample.
as always, thanks
You just need to adjust the indentation of x=x+1. Right now you only increment when the value was not seen before.
If you are going to do this often with a lot of items, you may want to use a set for your seen variable, because membership tests are faster on average.
import random
l = range(0, 40)
seen = set()
x = 0
while len(seen) < len(l):
    r = random.choice(l)
    if r not in seen:
        seen.add(r)
    x = x + 1
print(x)
Here is a similar method to do it. Initialize a set, which by definition may only contain unique elements (no duplicates). Then keep using random.choice() to choose an element from your list. You can compare your set to the original list, and until they are the same size, you don't have every element. Keep a counter to see how many random choices it takes.
import random
def sampleValues(l):
    counter = 0
    values = set()
    while len(values) < len(l):
        values.add(random.choice(l))
        counter += 1
    return counter
>>> l = list(range(40))
This number will vary; you could run a Monte Carlo simulation to get some statistics:
>>> sampleValues(l)
180
>>> sampleValues(l)
334
>>> sampleValues(l)
179
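A Monte Carlo run of that kind could look like the sketch below. The function is restated under the name sample_values so the block is self-contained, and the 1000 runs come from the question. The comparison value is the standard result for the coupon collector's problem, which I am assuming is the right model here (uniform picks with replacement):

```python
import random

def sample_values(items):
    """Number of draws with replacement until every item was seen once."""
    seen = set()
    draws = 0
    while len(seen) < len(items):
        seen.add(random.choice(items))
        draws += 1
    return draws

items = list(range(40))
runs = 1000
results = [sample_values(items) for _ in range(runs)]

average = sum(results) / float(runs)
# Coupon collector's problem: expected draws = n * (1 + 1/2 + ... + 1/n)
expected = len(items) * sum(1.0 / k for k in range(1, len(items) + 1))

print("average over %d runs: %.1f" % (runs, average))
print("theoretical expectation: %.1f" % expected)
print("min %d, max %d" % (min(results), max(results)))
```

The simulated average should land close to the theoretical value, while individual runs spread widely around it, as the three calls above suggest.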

Pick N distinct items at random from sequence of unknown length, in only one iteration

I am trying to write an algorithm that would pick N distinct items from a sequence at random, without knowing the size of the sequence in advance, and where it is expensive to iterate over the sequence more than once. For example, the elements of the sequence might be the lines of a huge file.
I have found a solution when N=1 (that is, "pick exactly one element at random from a huge sequence"):
import random
items = range(1, 10) # Imagine this is a huge sequence of unknown length
count = 1
selected = None
for item in items:
    if random.random() * count < 1:
        selected = item
    count += 1
But how can I achieve the same thing for other values of N (say, N=3)?
If your sequence is short enough that reading it into memory and randomly sorting it is acceptable, then a straightforward approach would be to just use random.shuffle:
import random
arr=[1,2,3,4]
# In-place shuffle
random.shuffle(arr)
# Take the first 2 elements of the now randomized array
print arr[0:2]
[1, 3]
Depending upon the type of your sequence, you may need to convert it to a list by calling list(your_sequence) on it, but this will work regardless of the types of the objects in your sequence.
Naturally, if you can't fit your sequence into memory or the memory or CPU requirements of this approach are too high for you, you will need to use a different solution.
Use reservoir sampling. It's a very simple algorithm that works for any N.
Here is one Python implementation, and here is another.
Simplest I've found is this answer in SO, improved a bit below:
import random
my_list = [1, 2, 3, 4, 5]
how_big = 2
new_list = random.sample(my_list, how_big)
# To preserve the order of the list, you could do:
randIndex = random.sample(range(len(my_list)), how_big)
randIndex.sort()
new_list = [my_list[i] for i in randIndex]
If you have Python 3.6+ you can use choices. Note that choices samples with replacement, so the same item can be picked more than once:
from random import choices
items = range(1, 10)
new_items = choices(items, k = 3)
print(new_items)
[6, 3, 1]
@NPE is correct, but the implementations that are being linked to are sub-optimal and not very "pythonic". Here's a better implementation:
import random

def sample(iterator, k):
    """
    Samples k elements from an iterable object.
    :param iterator: an object that is iterable
    :param k: the number of items to sample
    """
    # fill the reservoir to start
    result = [next(iterator) for _ in range(k)]
    n = k - 1
    for item in iterator:
        n += 1
        s = random.randint(0, n)
        if s < k:
            result[s] = item
    return result
Edit: As @panda-34 pointed out, the original version was flawed, but not because I was using randint vs randrange. The issue is that my initial value for n didn't account for the fact that randint is inclusive on both ends of the range. Taking this into account fixes the issue. (Note: you could also use randrange since it's inclusive on the minimum value and exclusive on the maximum value.)
The following will give you N random items from an array X (random.choice picks with replacement, so items can repeat):
import random
list(map(lambda _: random.choice(X), range(N)))
It should be enough to accept or reject each new item just once, and, if you accept it, throw out a randomly chosen old item.
Suppose you have selected N items of K at random and you see a (K+1)th item. Accept it with probability N/(K+1) and its probabilities are OK. The current items got in with probability N/K, and get thrown out with probability (N/(K+1))(1/N) = 1/(K+1) so survive through with probability (N/K)(K/(K+1)) = N/(K+1) so their probabilities are OK too.
And yes I see somebody has pointed you to reservoir sampling - this is one explanation of how that works.
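The accept/replace rule described above can also be checked empirically. The following is a sketch under my own naming (reservoir_sample, trials, frequencies), not the code from any of the linked implementations: with 10 items and n = 3, every item should end up in the sample in roughly 3 out of 10 trials.

```python
import random
from collections import Counter

def reservoir_sample(iterable, n):
    """Pick n items uniformly at random in a single pass."""
    reservoir = []
    for seen, item in enumerate(iterable, start=1):
        if seen <= n:
            reservoir.append(item)  # the first n items are always kept
        elif random.random() < float(n) / seen:
            # Accept the new item with probability n / seen and
            # throw out a randomly chosen old item.
            reservoir[random.randrange(n)] = item
    return reservoir

trials = 20000
counts = Counter()
for _ in range(trials):
    counts.update(reservoir_sample(range(10), 3))

frequencies = {item: counts[item] / float(trials) for item in range(10)}
for item in sorted(frequencies):
    print(item, round(frequencies[item], 3))
```

Each printed frequency should be close to 0.3, regardless of whether the item appeared early or late in the sequence.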
As aix mentioned, reservoir sampling works. Another option is to generate a random number for every number you see and select the top k numbers.
To do it iteratively, maintain a heap of k (random number, number) pairs, and whenever you see a new number, insert it into the heap if its random number is greater than the smallest value in the heap.
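A minimal sketch of that heap idea, using the standard heapq module. The function name sample_with_heap and the arrival index stored as a tie-breaker are my own additions:

```python
import heapq
import random

def sample_with_heap(iterable, k):
    """Keep the k items that received the largest random keys."""
    heap = []  # min-heap of (random key, arrival index, item)
    for index, item in enumerate(iterable):
        entry = (random.random(), index, item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # New key beats the smallest key kept so far: replace it.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]

lines = ("line %d" % i for i in range(1, 1001))  # stands in for a huge file
print(sample_with_heap(lines, 3))
```

The arrival index only keeps tuple comparison from ever reaching the items themselves, so the items do not need to be orderable.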
This was my answer to a duplicate question (closed before I could post) that was somewhat related ("generating random numbers without any duplicates"). Since it is a different approach from the other answers, I'll leave it here in case it provides additional insight.
from random import randint
random_nums = []
N = # whatever number of random numbers you want
r = # lower bound of number range
R = # upper bound of number range
x = 0
while x < N:
    random_num = randint(r, R) # inclusive range
    if random_num in random_nums:
        continue
    else:
        random_nums.append(random_num)
        x += 1
The reason for the while loop over the for loop is that it allows for easier implementation of non-skipping in random generation (i.e. if you get 3 duplicates, you won't get N-3 numbers).
There's one implementation from the numpy library.
Assuming that N is no larger than the length of the array, you'd have to do the following:
import numpy as np
# my_array is the array to be sampled from
assert N <= len(my_array)
indices = np.random.permutation(N) # Generates shuffled indices from 0 to N-1
sampled_array = my_array[indices]
If you need to sample the whole array and not just the first N positions, then you can use:
import random
# random.sample needs a population to draw from, not just a length
sampled_array = my_array[random.sample(range(len(my_array)), N)]
