So I was learning how to handle probabilities and how to plot them in Python. I came across a problem where I needed to find the probability that the sum of two dice is greater than 7 or odd. I know the result is around 75%, but translating that to Python is where I had a problem. My code to solve the problem is something like this:
import random
import numpy as np
dice = [1,2,3,4,5,6]
value = 0
simulation_number = len(dice)**2
percentage = []
for i in range(0,simulation_number):
    dice1 = random.choice(dice)
    dice2 = random.choice(dice)
    random_match = dice1,dice2
    if (sum(random_match))>7 or (sum(random_match)%2) != 0:
        value += 1
percentage.append(np.round((value/simulation_number)*100,2))
print(percentage,"%")
It works just fine, but every time I run the code it gives a different result because the loop repeats outcomes for random_match. How do I add a condition to the code so that random_match values are not repeated?
Generating random values from 1 to 6 won't work. Suppose you toss a coin 10 times. Theoretically you should get 5 heads and 5 tails, but that does not happen in real life because of sampling error. When you generate random values, there is always some sampling error.
import random
import numpy as np
dice = [1,2,3,4,5,6]
value = 0
simulation_number = len(dice)**2
for i in range(len(dice)):
    for j in range(len(dice)):
        dice1 = dice[i]
        dice2 = dice[j]
        x = dice1+dice2
        if x>7 or x%2== 1:
            value += 1
percentage = np.round((value/simulation_number)*100,2)
print(f'{percentage} %')
This works because you need every case exactly once. Using random might pick the same case again and again.
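As a rough sketch of the same idea, the 36 ordered outcomes can be enumerated with itertools.product and compared against a simulated estimate. The helper name is_hit, the fixed seed, and the trial counts are illustrative assumptions, not part of the original answer.

```python
import random
from fractions import Fraction
from itertools import product

faces = range(1, 7)

def is_hit(a, b):
    """True when the sum of the two dice is greater than 7 or odd."""
    total = a + b
    return total > 7 or total % 2 == 1

# Exact answer: visit each of the 36 ordered outcomes exactly once.
outcomes = list(product(faces, repeat=2))
exact = Fraction(sum(is_hit(a, b) for a, b in outcomes), len(outcomes))
print(exact, float(exact))  # 3/4 0.75

# Monte Carlo estimate: repeats are fine, but many more rolls are needed.
rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility
for trials in (36, 1_000, 100_000):
    hits = sum(is_hit(rng.randint(1, 6), rng.randint(1, 6)) for _ in range(trials))
    print(trials, hits / trials)
```

With only 36 random rolls the estimate can be several points away from 75%; it should settle near 0.75 only as the number of trials grows.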
I guess you're looking for random.sample:
import random
dice = [1,2,3,4,5,6]
dice1, dice2 = random.sample(dice, 2)
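One caveat worth checking before using this for dice: random.sample draws without replacement, so the two values it returns come from different positions in the list and can never form a double. The sketch below (seed and sample size are arbitrary assumptions) compares it with two independent random.choice calls.

```python
import random

dice = [1, 2, 3, 4, 5, 6]
rng = random.Random(1)  # arbitrary seed so the run is repeatable

# random.sample picks distinct positions, so the two values never match.
sample_pairs = [tuple(rng.sample(dice, 2)) for _ in range(10_000)]
doubles_from_sample = sum(a == b for a, b in sample_pairs)

# Two independent choices do produce doubles, about one roll in six.
choice_pairs = [(rng.choice(dice), rng.choice(dice)) for _ in range(10_000)]
doubles_from_choice = sum(a == b for a, b in choice_pairs)

print(doubles_from_sample, doubles_from_choice)
```

Since doubles such as (4, 4) or (6, 6) are valid rolls of two dice, sampling without replacement would bias the estimate for this problem.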
The Python code below represents a small 'game':
I flip a coin 100 times and get a sequence of Heads and Tails
I try to figure out how many times in that sequence it happens that there are 6 Heads or 6 Tails after each other.
I want to run this 10000 times and then calculate the average occurrence of streaks per sequence.
I know the solution is that there are about 0.8 streaks per sequence, but I get to a number of 1.6 and I cannot figure out what I am doing wrong...
I have obviously seen other solutions, but I would like to figure out how I can make this specific code work.
Could you have a look at the below code and let me know what I am doing wrong?
import random
numberOfStreaks = 0
possib = ['H', 'T']
folge = ''
x = 0
while x < 10000:
    for i in range (100):
        folge = folge + str(random.choice(possib))
    numberOfStreaks = folge.count('TTTTTT') + folge.count('HHHHHH')
    x = x + 1
print(numberOfStreaks)
Since you "know" the answer you seek is ~= 0.8:
I believe you have misinterpreted the question. I suspect that the question you really want to answer is the (in)famous one from "Automate the Boring Stuff with Python" by Al Sweigart (emphasis mine):
If you flip a coin 100 times ...
... Write a program to find out how often a streak of six heads or a
streak of six tails comes up in a randomly generated list of heads and
tails. Your program breaks up the experiment into two parts: the first
part generates a list of randomly selected 'heads' and 'tails' values,
and the second part checks if there is a streak in it. Put all of this
code in a loop that repeats the experiment 10,000 times so we can find
out what percentage of the coin flips (experiments) contains a
streak of six heads or tails in a row.
Part 1 (generate a list of randomly selected 'heads' and 'tails' values):
observations = "".join(random.choice("HT") for _ in range(100))
Part 2 (checks if there is a streak in it.):
has_streak = observations.find("H"*6) != -1 or observations.find("T"*6) != -1
Part Do Loop (put code in a loop that repeats the experiment 10,000 times):
experimental_results = []
for _ in range(10_000):
    observations = "".join(random.choice("HT") for _ in range(100))
    has_streak = observations.find("H"*6) != -1 or observations.find("T"*6) != -1
    experimental_results.append(has_streak)
Part Get Result (find percentage of the experiments that contain a streak):
print(sum(experimental_results)/len(experimental_results))
This should give you something close to:
0.8
Full Code:
import random
experimental_results = []
for _ in range(10_000):
    observations = "".join(random.choice("HT") for _ in range(100))
    has_streak = observations.find("H"*6) != -1 or observations.find("T"*6) != -1
    experimental_results.append(has_streak)
print(sum(experimental_results)/len(experimental_results))
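As a cross-check on the simulated figure, the probability of at least one streak can also be computed exactly by tracking the length of the current run. This is a sketch under the assumptions of a fair coin and streak >= 2; the function name prob_streak is made up for illustration.

```python
def prob_streak(n_flips=100, streak=6):
    """Exact chance that a fair coin shows a run of `streak` equal flips."""
    # run[k] = probability that no streak has occurred yet and the
    # current run of identical flips has length k (index 0 is unused).
    run = [0.0] * streak
    run[1] = 1.0  # after the first flip the run length is always 1
    for _ in range(n_flips - 1):
        nxt = [0.0] * streak
        for k in range(1, streak):
            nxt[1] += run[k] / 2          # flip differs: run restarts at 1
            if k + 1 < streak:
                nxt[k + 1] += run[k] / 2  # flip matches: run grows by 1
            # otherwise the run reaches `streak` and leaves the table
        run = nxt
    return 1.0 - sum(run)

print(round(prob_streak(), 3))
```

The value it prints should land close to the 0.8 that the simulation produces.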
If however, the question you seek to answer is:
On average, how many occurrences of at least 6 consecutive
heads or tails are there in 100 flips of a coin?
Then we can count them up and average that like:
import random
def count_streaks(observations):
    streaks = 0
    streak_length = 1
    prior = observations[0]
    for current in observations[1:]:
        if prior == current:
            streak_length += 1
            if streak_length == 6:
                streaks += 1
        else:
            streak_length = 1
        prior = current
    return streaks
experimental_results = []
for _ in range(10_000):
    observations = [random.choice("HT") for _ in range(100)]
    observed_streaks = count_streaks(observations)
    experimental_results.append(observed_streaks)
print(sum(experimental_results)/len(experimental_results))
This will give you a result of about:
1.50
Note:
Your code uses folge.count('TTTTTT'). I believe this code and any answer that uses a similar strategy is likely (over the course of 10k experiments) to overestimate the answer as ("H"*12).count("H"*6) is 2 not 1.
For example:
This otherwise excellent answer by @samwise (Probability of streak of heads or tails in sequence of coin tossing) consistently generates results in the range of:
1.52
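The difference between the two counting strategies can be seen on a small example. The sketch below contrasts str.count, which counts non-overlapping matches, with counting maximal runs via itertools.groupby; the helper count_runs is a hypothetical name used only here.

```python
from itertools import groupby

def count_runs(flips, length=6):
    """Count maximal runs of identical flips that are at least `length` long."""
    return sum(1 for _, group in groupby(flips) if len(list(group)) >= length)

twelve_heads = "H" * 12
print(twelve_heads.count("H" * 6))  # 2: two non-overlapping matches
print(count_runs(twelve_heads))     # 1: a single long run
print(("H" * 11).count("H" * 6))    # 1: only one full match fits
```

Runs of 12 or more identical flips are rare in 100 tosses, which would be consistent with the small gap between 1.50 and 1.52.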
You're appending to folge each time through the x loop, so the 10000 different runs aren't independent of one another -- you don't have 10000 different sets of 100 tosses, you have a single set of 1000000 tosses (which is going to have slightly more streaks in it since you aren't "breaking" it after 100 tosses).
What you want to do is count the streaks for each set of 100 tosses, and then take the mean of all those counts:
from random import choice
from statistics import mean
def count_streaks(folge: str) -> int:
    return folge.count("TTTTTT") + folge.count("HHHHHH")
print(mean(
    count_streaks(''.join(
        choice("HT") for _ in range(100)
    ))
    for _ in range(10000)
))
import random as rd
n = 0
ListOfStreaks = []
ListOfResults = []
while n != 10:
    numberOfStreaks = 0
    for i in range(100):
        Flip = rd.randint(0,1)
        ListOfResults.append(Flip)
    for i in range(96):
        count = 0
        for j in range(6):
            if ListOfResults[i] == ListOfResults[i + j]:
                count += 1
                if count == 6:
                    numberOfStreaks += 1
                    count = 0
                else:
                    continue
            else:
                break
    ListOfStreaks.append(numberOfStreaks)
    n += 1
print(ListOfStreaks)
print(len(ListOfResults))
In the code above, I am able to successfully flip a coin 100 times and examine how many times in the 100 flips Heads or Tails came up six times in a row. I am unable to properly set up the code to run the experiment 10 times in order to examine how many times Heads or Tails came up six times in a row in each of the single experiments. The goal is not to flip the coin 1,000 times in a row, but to run 10 experiments of 100 flips each.
The exercise focuses on later being able to simulate the experiment 10,000 times in order to see what the probability is of Heads or Tails appearing six times in a row in 100 flips. Essentially, I am trying to gather enough of a sample size. While there are actual statistical/probability methods to get the exact answer, that isn't what I am trying to focus on.
Your key problem appears to be that you have ListOfResults = [] outside of your while loop, so each run adds another 100 entries to the list instead of setting up a new test.
I've replaced the initial for loop with a list comprehension which sets up a new sample each time.
import random as rd
list_of_streaks = []
for _ in range(10):
    list_of_results = [rd.randint(0,1) for _ in range(100)]
    number_of_streaks = 0
    for i in range(95):  # 95 start positions give full 6-flip windows in 100 flips
        if sum(list_of_results[i: i+6]) in (0, 6):
            number_of_streaks += 1
    list_of_streaks.append(number_of_streaks)
print(list_of_streaks)
print(len(list_of_results))
You also don't need the inner for loop to add up all of the 6 flips - you can just sum them to see if the sum is 6 or 0. You appear to have just tested for heads - I tested for 6 identical flips, either heads or tails, but you can adjust that easily enough.
It's also much easier to use a for loop with a range, rather than while with a counter if you are iterating over a set number of iterations.
The first comment from @JonSG is also worth noting. If you had set up the individual test as a function, you'd have been forced to have ListOfResults = [] inside the function, so you would have got a new sample of 100 results each time. Something like:
import random as rd
def run_test():
    list_of_results = [rd.randint(0,1) for _ in range(100)]
    number_of_streaks = 0
    for i in range(95):  # 95 start positions give full 6-flip windows in 100 flips
        if sum(list_of_results[i: i+6]) in (0, 6):
            number_of_streaks += 1
    return number_of_streaks
# list_of_results is local to run_test(), so it cannot be printed out here
print([run_test() for _ in range(10)])
I assigned my students in comp sci to write code that simulates flipping a coin 100 times, storing the results in a list. Look for a streak of 6 heads (or more), or 6 tails (or more). If you find a streak, then consider the trial a success. Repeat this experiment 10,000 times. Use this to determine the probability of finding a streak of 6 heads or 6 tails.
Theoretically this probability should be ~80%.
EDIT: It is possible I misinterpreted this theoretical probability. I found this probability here: https://math.stackexchange.com/questions/2736117/what-is-the-probability-of-getting-6-or-more-heads-or-tails-in-a-row-after-flipp
My code is giving me a probability of about 54%, the probability of getting exactly 6 in a row. However, if I got 7, 8, 9, or more in a row, my code should mark this as a success, correct?
I understand my code checks for streaks of 6, but if there is a streak of 7, 8, 9, ... it would still mark it as a success. There must be something I'm missing here...
Attached is my code:
import random
numberofstreaks = 0
for experimentnumber in range(10000):
    result = []
    for i in range(100):
        flip = random.randint(0,1)
        result.append(flip)
    for i in range(len(result)-6):
        if result[i:i+6] == ([0,0,0,0,0,0] or [1,1,1,1,1,1]):
            numberofstreaks += 1
            break
print(numberofstreaks)
print('Chance of steak:',(numberofstreaks/100))
Note: They are currently learning about lists, which is why their code must contain the use of lists.
Thanks ahead of time!
A slight correction:
import random
numberofstreaks = 0
for experimentnumber in range(10000):
    result = []
    for i in range(100):
        flip = random.randint(0,1)
        result.append(flip)
    for i in range(len(result)-6):
        if result[i:i+6] == [0,0,0,0,0,0] or result[i: i+6] == [1,1,1,1,1,1]:
            numberofstreaks += 1
            break
print(numberofstreaks)
print('Chance of steak:',(numberofstreaks/100))
The answer is now 80.22%. The expression (A or B) evaluates to A if A is truthy, and to B otherwise. A non-empty list is truthy even when it contains only 0s, so ([0,0,0,0,0,0] or [1,1,1,1,1,1]) is always [0,0,0,0,0,0], and you were only checking for streaks of 0s.
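The behaviour of or on lists is easy to confirm in isolation. In this small sketch the names zeros, ones and window are illustrative only.

```python
zeros = [0] * 6
ones = [1] * 6

# `or` returns its first truthy operand; a non-empty list is always truthy.
print(zeros or ones)  # [0, 0, 0, 0, 0, 0]

window = [1, 1, 1, 1, 1, 1]
print(window == (zeros or ones))          # False: only compared with zeros
print(window == zeros or window == ones)  # True: two separate comparisons
print(window in (zeros, ones))            # True: membership test, same idea
```

The membership form is just a shorter way of writing the two comparisons.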
From what I understood, your code doesn't account for multiple streaks in a given 100 throws of a coin.
You can do that by shifting i by 6 places once it has found a streak so that it doesn't count a >6 streak as multiple streaks.
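A minimal sketch of that skipping idea, assuming the goal is to count streaks rather than only detect one. The function name count_streaks_skipping is hypothetical.

```python
def count_streaks_skipping(flips, length=6):
    """Count windows of `length` identical flips, jumping past each one found."""
    count = 0
    i = 0
    while i + length <= len(flips):
        window = flips[i:i + length]
        if all(flip == window[0] for flip in window):
            count += 1
            i += length  # skip the whole streak so it is not counted again
        else:
            i += 1
    return count

print(count_streaks_skipping([0] * 7))       # 1
print(count_streaks_skipping([0, 1] * 10))   # 0
print(count_streaks_skipping([1] * 12))      # 2
```

Note that a run of 12 still counts as two streaks under this scheme, so it only prevents double counting for runs shorter than twice the streak length.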
Here is the problem statement.
There is a lottery where 4 random numbers are picked everyday.
I want to find out whether I have better odds of winning the lottery by picking the same numbers every day or by picking random numbers (let's say over 1 000 000 trials).
I have added the solution I wrote for this problem, but it runs very slowly. Anything over 3000 trials is very, very slow.
I have added comments to my code to show my reasoning
ADD: I need help finding the bottleneck
ADD2: Code is complete, sorry, had renamed a few variables
#lottery is 4 numbers
#lottery runs 365 days a year
#i pick the same number every day, what are my odds of winning/how many times will i win
#what are my odds of winning picking 4 random numbers
import random
my_pick = [4,4,4,7]
lotto_nums = list(range(0,9))
iterations = 3000
#function to pick 4 numbers at random
def rand_func ():
    rand_pick = [random.choice(lotto_nums) for _ in range(4)]
    return rand_pick
#pick 4 random numbers X amount of times
random_pick = [rand_func() for _ in range(iterations)]
#pick 4 random numbers for the lottery itself
def lotto ():
    lotto_pick = [random.choice(lotto_nums) for _ in range(4)]
    return lotto_pick
#check how many times I picked the correct lotto numbers v how many times i randomly generated numbers that would have won me the lottery
def lotto_picks ():
    lotto_yr =[]
    for _ in range(iterations):
        lotto_yr.append(lotto())
        my_count = 0
        random_count = 0
        for lotto_one in lotto_yr:
            if my_pick == lotto_one:
                my_count = my_count +1
            elif random_pick == lotto_one:
                random_count = random_count +1
    print('I have {} % chance of winning if pick the same numbers versus {} % if i picked random numbers. The lotto ran {} times'.format(((my_count/iterations)*100), ((random_count/iterations)*100), iterations))
lotto_picks()
Your code is slow because in each iteration you recalculate all of the simulations from scratch. In reality you only need to check whether you won the lottery once per simulation. So lotto_picks() should probably look something like this:
def lotto_picks ():
    lotto_yr = []
    my_count = 0
    random_count = 0
    for day in range(iterations):
        new_numbers = lotto()
        lotto_yr.append(new_numbers) # You can still save them for later analysis
        if my_pick == new_numbers:
            my_count = my_count +1
        # random_pick is a list of picks, so compare against the pick for this draw
        if random_pick[day] == new_numbers: # Changed from elif to if
            random_count = random_count +1
    print('I have {} % chance of winning if pick the same numbers versus {} % if i picked random numbers. The lotto ran {} times'.format(((my_count/iterations)*100), ((random_count/iterations)*100), iterations))
This will make your program run in linear time, O(n); before, your code was running in quadratic time, O(n^2).
Your problem is with the nested for loop.
Your initial running time for your first for loop is of the order O(n) (aka linear).
For each initial iteration (let's say i) your nested loop runs i times.
for i in range(iterations):
    for lotto_one in lotto_yr:  # lotto_yr holds i + 1 entries at this point
This means that in total your nested loop will run 4 501 500 times (the sum of the numbers from 1 to 3000). Add your outer loop iterations (3000) and you get 4 504 500 "real" iterations in total. The sum 1 + 2 + ... + n equals n(n+1)/2, which grows as O(n^2). That's your bottleneck.
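The iteration count can be checked directly with a stripped-down version of the loop structure. In this sketch the lottery draw is replaced by a placeholder value and the function name is made up for illustration.

```python
def count_inner_iterations(n):
    """Count inner-loop passes when the list is rescanned after every append."""
    draws = []
    passes = 0
    for day in range(n):
        draws.append(day)   # stand-in for one day's lottery draw
        for _ in draws:     # rescans everything collected so far
            passes += 1
    return passes

n = 3000
print(count_inner_iterations(n))  # 4501500
print(n * (n + 1) // 2)           # 4501500, the closed form
```

Doubling n roughly quadruples the count, which is the signature of quadratic growth.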
I'm trying to write a function that calls another function (rollDie(), which rolls a die 1000 times and tallies the results in a list for the faces [1,2,3,4,5,6], so an outcome might be [100,200,100,300,200,100]) and runs it x times. It seems my code is printing the same result over and over again x times.
#simulate rolling a six-sided die multiple times, and tabulate the results using a list
import random #import from the library random so you can generate a random int
def rollDie():
    #have 6 variables and set the counter that equals 0
    one = 0
    two = 0
    three = 0
    four = 0
    five = 0
    six = 0
    #use a for loop to code how many times you want it to run
    for i in range(0,1000):
        #generate a random integer between 1 and 6
        flip = int(random.randint(1,6))
        # the flip variable is the number you rolled each time
        #Every number has its own counter
        #if the flip is equal to the corresponding number, add one
        if flip == 1:
            one = one + 1
        elif flip == 2:
            two = two + 1
        elif flip == 3:
            three = three + 1
        elif flip == 4:
            four = four + 1
        elif flip == 5:
            five = five + 1
        elif flip == 6:
            six = six + 1
    #return the new variables as a list
    return [one,two,three,four,five,six]
The new function that I am having problems with is:
def simulateRolls(value):
    multipleGames = rollDie() * value
    return multipleGames
I would like to see a result like this if you typed in 4 for value
[100,300,200,100,100,200]
[200,300,200,100,100,100]
[100,100,100,300,200,200]
[100,100,200,300,200,100]
Can someone guide me in the right direction?
You can get what you want like this:
def simulateRolls(value):
    multipleGames = [rollDie() for _ in range(value)]
    return multipleGames
By the way, your original function seems to work perfectly fine, but if you're interested, you can remove some redundancy like this:
def rollDie():
    #one list holds the six counters, all starting at 0
    results = [0] * 6
    #use a for loop to code how many times you want it to run
    for i in range(0,1000):
        #generate a random integer between 1 and 6
        flip = int(random.randint(1,6))
        # the flip variable is the number you rolled each time
        results[flip - 1] += 1
    return results
The line
multipleGames = rollDie() * value
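will evaluate rollDie() once and then repeat the resulting list value times, because multiplying a list by an integer concatenates copies of it.
To instead repeat the call value times, do this (on Python 2, xrange can be used in place of range):
return [rollDie() for i in range(value)]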
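The difference between multiplying a list and repeating the call shows up clearly in the shape of the result. This sketch uses a compact stand-in named roll_die (an assumption, not the asker's exact function) so that it runs on its own.

```python
import random

def roll_die(rolls=1000):
    """Tally `rolls` throws of a six-sided die into a list of six counts."""
    result = [0] * 6
    for _ in range(rolls):
        result[random.randint(1, 6) - 1] += 1
    return result

repeated = roll_die() * 3                     # one call, its list copied 3 times
independent = [roll_die() for _ in range(3)]  # three separate calls

print(len(repeated))                    # 18: a flat list of 6 counts, three times over
print(repeated[:6] == repeated[6:12])   # True: the copies are identical
print(len(independent))                 # 3: a list of three 6-element tallies
```

Each inner list in independent comes from its own 1000 rolls, which is the output the question asks for.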
You can also simplify your rollDie function by working with a list throughout
import random #import from the library random so you can generate a random int
def rollDie():
    result = [0] * 6
    for i in range(0,1000):
        result[random.randint(0,5)] += 1
    return result
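Another option, not mentioned in the answers above, is to let collections.Counter do the tallying. This is only a sketch; the function name roll_die_counter and its parameters are assumptions made for illustration.

```python
import random
from collections import Counter

def roll_die_counter(rolls=1000, rng=random):
    """Tally die rolls with Counter, returned as counts for faces 1 to 6."""
    counts = Counter(rng.randint(1, 6) for _ in range(rolls))
    # Counter returns 0 for a missing key, so a face never rolled is still reported.
    return [counts[face] for face in range(1, 7)]

tally = roll_die_counter()
print(tally, sum(tally))
```

The list it returns has the same layout as the one from rollDie(), so it could be dropped into simulateRolls() unchanged.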