A little new here, and any help would be appreciated.
I have been tooling around with this code for a while now and I can't seem to wrap my head around it. I'm fairly new to Python, so I don't quite know or remember all the tricks and skills yet.
So the question at hand:
Equation: {x_(n+1) = x_n * r * (1- x_n)}
With x_n between (0,1) and r between (0,4).
The goal here is to make a loop function that will gather a value for 'x_n' and 'r' and spit out the iteration 'n' and the current 'x_n+1'; i.e. print(n , x_n+1), at each 'n' step while checking to see if the new value is within 0.0000001 of the old value.
If it settles on a fixed point (successive values within 0.0000001 of each other) in fewer than 20,000 iterations, print the final 'n' plus a message. If it does not settle and reaches 20,000 iterations, print a different message.
All I have so far is:
import math
x_o=float(input("Enter a 'seed' value: "))
r=float(input("Enter an 'r' value: "))
x_a=((x_o + 0) * r * (1-(x_o + 0)))
while x_a != (0.0000001, x_o , 0.0000001):
    for n in range(0,99):
        x_a=((x_o + n) * r * (1-(x_o + n)))
        print(n , x_a)
I'm pretty sure this is nowhere close, so any help would be awesome; if you need any more info, let me know.
Much appreciated,
Genosphere
You could write a generator function and use it directly in your for loop. If you need to keep track of the rank of intermediate values you can use enumerate on the generator.
def fnIter(fn,x,delta=0.000001):
    while True:
        yield x
        prev,x = x,fn(x)
        if abs(x-prev)<delta:break
r = 2
seed = 0.1
for i,Xn in enumerate(fnIter(lambda x:x*r*(1-x),seed)):
    print(i,Xn)
output:
0 0.1
1 0.18000000000000002
2 0.2952
3 0.41611392
4 0.4859262511644672
5 0.49960385918742867
6 0.49999968614491325
7 0.49999999999980305
To implement the maximum iteration check you can either add a conditional break in the loop or use zip with a range:
maxCount = 20000
n,Xn = max(zip(range(maxCount+1),fnIter(lambda x:x*r*(1-x),seed)))
if n < maxCount:
    print(n,Xn)
else:
    print(Xn,"not converging")
This is an exponentially-weighted moving average. Pandas has a function for this: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.ewm.html
You have a good start so far. You might be overthinking it, though.
The following approach just tries to generate this sequence for 20,000 terms. Each time, it checks whether the new value is within 0.0000001 of the previous value. If so, it breaks out of the loop and prints that value. If the loop finishes without breaking, Python's for/else construct prints a different message. Note the different levels of indentation: the first else belongs to the if, the second to the for.
x_0 = float(input("enter a 'seed' value: "))
r = float(input("enter an 'r' value: "))
x_m = x_0 # placeholder for 'previous value'
delta = 0.0000001
# Try to calculate 20 thousand terms of this sequence
# We will break out of the loop early if our x_n converges
for _ in range(20000):
    x_n = x_m * r * (1 - x_m)
    if abs(x_n - x_m) < delta:
        print("Settled on value for x_n: ", x_n)
        break
    else:
        x_m = x_n # move forward to the next value
else:
    print("x_n did not converge in 20000 terms")
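The same check can also be wrapped in a function, which makes it easier to try different seeds and r values. This is only a sketch: the name `iterate_logistic` and the parameters `tol` and `max_iter` are made up for illustration and are not part of the question.

```python
def iterate_logistic(seed, r, tol=1e-7, max_iter=20000):
    """Iterate x -> x * r * (1 - x) until successive values differ by less than tol.

    Returns (n, x, converged), where n is the number of iterations performed.
    """
    x_prev = seed
    for n in range(1, max_iter + 1):
        x_next = x_prev * r * (1 - x_prev)
        if abs(x_next - x_prev) < tol:
            return n, x_next, True
        x_prev = x_next  # move forward to the next value
    return max_iter, x_prev, False


n, x, converged = iterate_logistic(0.1, 2)
if converged:
    print("Settled on", x, "after", n, "iterations")
else:
    print("Did not converge in", n, "iterations")
```

Returning a flag instead of printing inside the loop keeps the iteration separate from the reporting, so the caller decides which message to show.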
I’d like to know whether this is the formula problem or my problem.
I’ve looked up various formulas online. This is edx’s formula
Cost * Number of Months * Monthly Rate / 1 - ((1 + Monthly Rate) ** Number of Months)
cost = 150000
rate = 0.0415
years = 15
rate = rate / 12
years = years * 12
house_value = cost * years * rate
house_value2 = (1 + rate) ** years
house_value = house_value / house_value2
house_value = round(house_value, 2)
print("The total cost of the house will be $" + str(house_value))
It should print “The total cost of the house will be $201751.36” but it prints “The total cost of the house will be $50158.98”
Going off your answer with the correct formula, you can simplify the code quite a bit and add more readability by doing the following:
# This is a function that lets you calculate the real mortgage cost over
# and over again given different inputs.
def calculate_mortgage_cost(cost, rate, years):
    # converts the yearly rate to a monthly rate
    monthly_rate = rate / 12
    # converts the years to months
    months = years * 12
    # creates the numerator to the equation
    numerator = cost * months * monthly_rate
    # creates the denominator to the equation
    denominator = 1 - (1 + monthly_rate) ** -months
    # returns the calculated amount
    return numerator / denominator
# sets the calculated amount
house_value = calculate_mortgage_cost(150000, 0.0415, 15)
# This print statement utilizes f strings, which let you format the code
# directly in the print statement and make rounding and conversion
# unnecessary. You have the variable inside the curly braces {}, and then
# after the colon : the comma , adds the comma to the number and the .2f
# ensures only two places after the decimal get printed.
print(f"The total cost of the house will be ${house_value:,.2f}")
I have now solved this. This is the edit.
cost = 150000
rate = 0.0415
years = 15
house_value = cost * (years * 12) * (rate / 12)
house_value2 = 1 - (1 + (rate / 12)) ** -(years * 12)
house_value = house_value / house_value2
house_value = round(house_value, 2)
print("The total cost of the house will be $" + str(house_value))
I have added a negative sign to the exponent (the number of months) and subtracted the result from 1.
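For reference, the corrected expression is the usual fixed-rate amortization formula: the monthly payment is P * i / (1 - (1 + i) ** -n) for principal P, monthly rate i and n months, and the total cost is that payment times n. A small sketch that separates the two steps (the function names `monthly_payment` and `total_cost` are my own, not from the exercise):

```python
def monthly_payment(principal, annual_rate, years):
    """Fixed monthly payment for a fully amortizing loan."""
    i = annual_rate / 12   # monthly interest rate
    n = years * 12         # number of monthly payments
    return principal * i / (1 - (1 + i) ** -n)


def total_cost(principal, annual_rate, years):
    """Total paid over the life of the loan: payment times number of months."""
    return monthly_payment(principal, annual_rate, years) * years * 12


print(f"The total cost of the house will be ${total_cost(150000, 0.0415, 15):,.2f}")
```

With the inputs from the question this should agree with the expected figure quoted there. Note that the exponent has to be the number of months, not the number of years.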
I am trying to write a program to calculate digits of pi using the Nilakantha series in Python. Every time it runs, though, it will not give me more than 50 decimals. Still learning Python, so any help is appreciated.
# Program using Nilakantha Series to crunch digits of pi
from math import *
from decimal import *
getcontext().prec = 200 # this is not doing anything
# epsilon is how accurate I want to be to pi
EPSILON = 0.000000000000000000000000000000000000000000000000000001
sum = float(3)
step = 0
i = 2
while abs(pi - sum) >= EPSILON:
    step += 1
    print (step)
    if step % 2 == 1:
        sum += 4.0 / (i * (i + 1) * (i + 2))
        i += 2
    else:
        sum -= 4.0 / (i * (i + 1) * (i + 2))
        i += 2
print (Decimal(sum))
print (Decimal(pi))
print ("Total itterations: ", step)
print ("Accurate to: ", EPSILON)
You are not using the Decimal class to calculate Pi, but rather the float class. getcontext() affects Decimal, not float.
If you want to use Decimal, modify your code to convert to Decimal before looping. Note that AFAIK, the value of Pi is not available as a decimal in Python, so you need to get the value from someplace else (http://www.geom.uiuc.edu/~huberty/math5337/groupe/digits.html).
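As a rough sketch of what "convert to Decimal before looping" could look like: the version below keeps every value as a Decimal and stops when the next term of the series drops below a tolerance, so it does not need a reference value of pi at all. The function name `nilakantha` and the stopping rule are my own choices, not something from the question.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # significant digits carried by Decimal arithmetic


def nilakantha(tolerance):
    """Sum the Nilakantha series until the next term is smaller than tolerance."""
    total = Decimal(3)
    i = 2
    sign = 1
    steps = 0
    while True:
        term = Decimal(4) / (i * (i + 1) * (i + 2))
        if term < tolerance:
            break  # alternating series: the error is below the first omitted term
        total += sign * term
        sign = -sign
        i += 2
        steps += 1
    return total, steps


approx, steps = nilakantha(Decimal("1e-12"))
print(approx)
print("Total iterations:", steps)
```

Keep in mind that the terms only shrink like 1/i^3, so very small tolerances need an impractical number of iterations no matter how much precision Decimal carries.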
As preparation for an upcoming bioinformatics course, I am doing some assignments from rosalind.info. I am currently stuck in the assignment "Mendel's First Law".
I think I could brute force myself through this, but that somehow my thinking must be too convoluted. My approach would be this:
Build a tree of probabilities which has three levels. There are two creatures that mate, creature A and creature B. First level is, what is the probability for picking as creature A homozygous dominant (k), heterozygous (m) or homozygous recessive (n). It seems that for example for homozygous dominant, since there are a total of (k+m+n) creatures and k of them are homozygous dominant, the probability is k/(k+m+n).
Then in this tree, under each of these would come the probability of creature B being k / m / n given that we know what creature A got picked as. For example if creature A was picked to be heterozygous (m), then the probability that creature B would also be heterozygous is (m-1)/(k+m+n-1) because there is now one less heterozygous creature left.
This would give the two levels of probabilities, and would involve a lot of code just to get this far, as I would literally be building a tree structure and for each branch have manually written code for that part.
Now after choosing creatures A and B, each of them has two chromosomes. One of these chromosomes can randomly be picked. So for A chromosome 1 or 2 can be picked and same for B. So there are 4 different options: pick 1 of A, 1 of B. Pick 2 of A, 1 of B. Pick 1 of A, 2 of B. Pick 2 of A, 2 of B. The probability of each of these would be 1/4. So finally this tree would have these leaf probabilities.
Then from there somehow by magic I would add up all of these probabilities to see what is the probability that two organisms would produce a creature with a dominant allele.
I doubt that this assignment was designed to take hours to solve. What am I thinking too hard?
Update:
Solved this in the most ridiculous brute-force way possible. Just ran thousands of simulated matings and figured out the portion that ended up having a dominant allele, until there was enough precision to pass the assignment.
import random
k = 26
m = 18
n = 25
trials = 0
dominants = 0
while True:
    s = ['AA'] * k + ['Aa'] * m + ['aa'] * n
    first = random.choice(s)
    s.remove(first)
    second = random.choice(s)
    has_dominant_allele = 'A' in [random.choice(first), random.choice(second)]
    trials += 1
    if has_dominant_allele:
        dominants += 1
    print "%.5f" % (dominants / float(trials))
Species with dominant alleles are either AA or Aa.
Your total population (k + m + n) consists of k (hom) homozygous dominant organisms with AA, m (het) heterozygous organisms with Aa, and n (rec) homozygous recessive organisms with aa. Each of these can mate with any other.
The probability for organisms with the dominant allele is:
P_dom = n_dominant/n_total or 1 - n_recessive/n_total
Doing the Punnett squares for each of these combinations is not a bad idea:
het + het
  | A  | a
-----------
A | AA | Aa
a | Aa | aa
het + rec
  | a  | a
-----------
A | Aa | Aa
a | aa | aa
Apparently, mating of two organisms results in four possible children. het + het yields 1 of 4 organisms that are homozygous recessive (aa), het + rec yields 2 of 4 organisms that are homozygous recessive.
You might want to do that for the other combinations as well.
Since we're not just mating the organisms one on one, but throw together a whole k + m + n bunch, the total number of offspring and the number of 'children' with a particular allele would be nice to know.
If you don't mind a bit of Python, comb from scipy.special (it used to be importable from scipy.misc as well) might be helpful here. In the calculation, don't forget (a) that you get 4 children from each combination and (b) that you need a factor (from the Punnett squares) to determine the recessive (or dominant) offspring from the combinations.
Update
from scipy.special import comb
# total population
pop_total = 4 * comb(hom + het + rec, 2)
# use PUNNETT squares!
# dominant organisms
dom_total = 4*comb(hom,2) + 4*hom*het + 4*hom*rec + 3*comb(het,2) + 2*het*rec
# probability for dominant organisms
phom = dom_total/pop_total
print(phom)
# probability for dominant organisms +
# probability for recessive organisms should be 1
# let's check that:
rec_total = 4 * comb(rec, 2) + 2*rec*het + comb(het, 2)
prec = rec_total/pop_total
print(1 - prec)
This is more a probability/counting question than coding. It's easier to calculate the probability of an offspring having only recessive traits first. Let me know if you have any trouble understanding anything. I ran the following code and my output passed the rosalind grader.
def mendel(x, y, z):
    # calculate the probability of recessive traits only
    total = x+y+z
    twoRecess = (z/total)*((z-1)/(total-1))
    twoHetero = (y/total)*((y-1)/(total-1))
    heteroRecess = (z/total)*(y/(total-1)) + (y/total)*(z/(total-1))
    recessProb = twoRecess + twoHetero*1/4 + heteroRecess*1/2
    return 1-recessProb # take the complement
#mendel(2, 2, 2)
# the with block closes the file on exit, so no explicit close() is needed
with open ("rosalind_iprb.txt", "r") as file:
    line =file.readline().split()
    x, y, z= [int(n) for n in line]
    print(x, y, z)
print(mendel(x, y, z))
Klaus's solution has most of it correct; however, the error occurs when calculating the number of combinations that have at least one dominant allele. This part is incorrect, because while there are 4 possibilities when combining 2 alleles to form an offspring, only one possibility is actually executed. Therefore, Klaus's solution calculates a percentage that is markedly higher than it should be.
The correct way to calculate the number of combos of organisms with at least one dominant allele is the following:
# k = number of homozygous dominant organisms
# m = number of heterozygous organisms
# n = number of homozygous recessive organisms
dom_total = comb(k, 2) + k*m + k*n + .5*m*n + .75*comb(m, 2)
# Instead of:
# 4*comb(k,2) + 4*k*m + 4*k*n + 3*comb(m,2) + 2*m*n
The above code segment works for calculating the total number of dominant combos because it multiplies each part by the fraction (100% being 1) of offspring from that pairing that carry a dominant allele. You can think of each part as being the number of Punnett squares for combos of each type (k&k, k&m, k&n, m&n, m&m).
So the entire correct code segment would look like this:
# Import comb (combination operation) from the scipy library
from scipy.special import comb
def calculateProbability(k, m, n):
    # Calculate total number of organisms in the population:
    totalPop = k + m + n
    # Calculate the number of combos that could be made (valid or not):
    totalCombos = comb(totalPop, 2)
    # Calculate the number of combos that have a dominant allele therefore are valid:
    validCombos = comb(k, 2) + k*m + k*n + .5*m*n + .75*comb(m, 2)
    probability = validCombos/totalCombos
    return probability
# Example Call:
calculateProbability(2, 2, 2)
# Example Output: 0.783333333333
You don't need to run thousands of simulations in a while loop. You can enumerate every possible mating once and calculate the probability from the results.
from itertools import product
k = 2 # AA homozygous dominant
m = 2 # Aa heterozygous
n = 2 # aa homozygous recessive
population = (['AA'] * k) + (['Aa'] * m) + (['aa'] * n)
all_children = []
for parent1 in population:
    # remove selected parent from population.
    chosen = population[:]
    chosen.remove(parent1)
    for parent2 in chosen:
        # get all possible children from 2 parents. Punnett square
        children = product(parent1, parent2)
        all_children.extend([''.join(c) for c in children])
dominants = filter(lambda c: 'A' in c, all_children)
# float for python2
print(float(len(list(dominants))) / len(all_children))
# 0.7833333
Here I am adding my answer to explain it more clearly:
We don't want the offspring to be completely recessive, so we build the probability tree and look at each case in which that can happen, together with its probability.
Then the probability that we want is 1 - p_recessive. More explanation is provided in the comment section of the following code.
"""
Let d: dominant, h: hetero, r: recessive
Let a = k+m+n
Let X = the r.v. associated with the first person randomly selected
Let Y = the r.v. associated with the second person randomly selected without replacement
Then:
k = f_d  =>  p(X=d) = k/a  =>  p(Y=d| X=d) = (k-1)/(a-1) ,
                               p(Y=h| X=d) = (m)/(a-1) ,
                               p(Y=r| X=d) = (n)/(a-1)
m = f_h  =>  p(X=h) = m/a  =>  p(Y=d| X=h) = (k)/(a-1) ,
                               p(Y=h| X=h) = (m-1)/(a-1) ,
                               p(Y=r| X=h) = (n)/(a-1)
n = f_r  =>  p(X=r) = n/a  =>  p(Y=d| X=r) = (k)/(a-1) ,
                               p(Y=h| X=r) = (m)/(a-1) ,
                               p(Y=r| X=r) = (n-1)/(a-1)
Now the joint would be:
                           | offspring possibilities given X and Y choice
-------------------------------------------------------------------------
X  Y |  P(X,Y)             | d(dominant)  h(hetero)  r(recessive)
-------------------------------------------------------------------------
d  d    k/a*(k-1)/(a-1)    | 1            0          0
d  h    k/a*(m)/(a-1)      | 1/2          1/2        0
d  r    k/a*(n)/(a-1)      | 0            1          0
                           |
h  d    m/a*(k)/(a-1)      | 1/2          1/2        0
h  h    m/a*(m-1)/(a-1)    | 1/4          1/2        1/4
h  r    m/a*(n)/(a-1)      | 0            1/2        1/2
                           |
r  d    n/a*(k)/(a-1)      | 0            1          0
r  h    n/a*(m)/(a-1)      | 0            1/2        1/2
r  r    n/a*(n-1)/(a-1)    | 0            0          1
What we don't want are the entries in the very last column, where the offspring is completely recessive.
So P = 1 - (the sum over those cases of P(X,Y) times the recessive fraction), as follows.
"""
path = 'rosalind_iprb.txt'
with open(path, 'r') as file:
    lines = file.readlines()
k, m, n = [int(i) for i in lines[0].split(' ')]
a = k + m + n
p_recessive = (1/4*m*(m-1) + 1/2*m*n + 1/2*m*n + n*(n-1))/(a*(a-1))
p_wanted = 1 - p_recessive
p_wanted = round(p_wanted, 5)
print(p_wanted)
I just found the formula for the answer. You have 8 possible mating interactions that can yield a dominant offspring:
DDxDD, DDxDd, DdxDD, DdxDd, DDxdd, ddxDD, Ddxdd, ddxDd
With the respective probabilities of producing dominant offspring of:
1.0, 1.0, 1.0, 0.75, 1.0, 1.0, 0.5, 0.5
Initially it seemed odd to me that DDxdd and ddxDD were two separate mating events, but if you think about it they are slightly different conceptually. The probability of DDxdd is k/(k+m+n) * n/((k-1)+m+n) and the probability of ddxDD is n/(k+m+n) * k/(k+m+(n-1)). Mathematically these are identical, but speaking from a probability stand point these are two separate events. So your total probability is the sum of the probabilities of each of these different mating events multiplied by the probability of that mating event producing a dominant offspring. I won't simplify it here step by step but that gives you the code:
total_probability = ((k ** 2 - k) + (2 * k * m) + (3 / 4 * (m ** 2 - m)) + (2* k * n) + (m * n)) / (total_pop ** 2 - total_pop)
All you need to do is plug in your values of k, m, and n and you'll get the probability they ask for.
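To make the formula easy to check, here is a minimal sketch that evaluates it with exact fractions. The function name `dominant_probability` is made up for this example, and using `fractions.Fraction` is my own choice to avoid rounding; the arithmetic is the expression given above.

```python
from fractions import Fraction


def dominant_probability(k, m, n):
    """Probability that a random pair produces offspring with a dominant allele.

    k, m, n are the counts of homozygous dominant, heterozygous and
    homozygous recessive organisms.
    """
    total_pop = k + m + n
    numerator = (k * (k - 1) + 2 * k * m + Fraction(3, 4) * m * (m - 1)
                 + 2 * k * n + m * n)
    return numerator / (total_pop * (total_pop - 1))


p = dominant_probability(2, 2, 2)
print(p, round(float(p), 5))
```

For k = m = n = 2 this gives 47/60, about 0.78333, which agrees with the example output quoted in the other answers.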
I doubt that this assignment was designed to take hours to solve. What am I thinking too hard?
I also had the same question. After reading the whole thread, I came up with the code.
I hope the code itself will explain the probability calculation:
def get_prob_of_dominant(k, m, n):
    # A - dominant factor
    # a - recessive factor
    # k - amount of organisms with AA factors (homozygous dominant)
    # m - amount of organisms with Aa factors (heterozygous)
    # n - amount of organisms with aa factors (homozygous recessive)
    events = ['AA+Aa', 'AA+aa', 'Aa+aa', 'AA+AA', 'Aa+Aa', 'aa+aa']
    # get the probability of dominant traits (set up Punnett square)
    punnett_probabilities = {
        'AA+Aa': 1,
        'AA+aa': 1,
        'Aa+aa': 1 / 2,
        'AA+AA': 1,
        'Aa+Aa': 3 / 4,
        'aa+aa': 0,
    }
    event_probabilities = {}
    totals = k + m + n
    # Event: AA+Aa -> P(X=k, Y=m) + P(X=m, Y=k):
    P_km = k / totals * m / (totals - 1)
    P_mk = m / totals * k / (totals - 1)
    event_probabilities['AA+Aa'] = P_km + P_mk
    # Event: AA+aa -> P(X=k, Y=n) + P(X=n, Y=k):
    P_kn = k / totals * n / (totals - 1)
    P_nk = n / totals * k / (totals - 1)
    event_probabilities['AA+aa'] = P_kn + P_nk
    # Event: Aa+aa -> P(X=m, Y=n) +P(X=n, Y=m):
    P_mn = m / totals * n / (totals - 1)
    P_nm = n / totals * m / (totals - 1)
    event_probabilities['Aa+aa'] = P_mn + P_nm
    # Event: AA+AA -> P(X=k, Y=k):
    P_kk = k / totals * (k - 1) / (totals - 1)
    event_probabilities['AA+AA'] = P_kk
    # Event: Aa+Aa -> P(X=m, Y=m):
    P_mm = m / totals * (m - 1) / (totals - 1)
    event_probabilities['Aa+Aa'] = P_mm
    # Event: aa+aa -> P(X=n, Y=n) + P(X=n, Y=n) = 0 (will be * 0, so just don't use it)
    event_probabilities['aa+aa'] = 0
    # Total probability is the sum of (prob of dominant factor * prob of the event)
    total_probability = 0
    for event in events:
        total_probability += punnett_probabilities[event] * event_probabilities[event]
    return round(total_probability, 5)
I'm running into a dilemma with a for i in range(x) loop not iterating. The purpose of my program is to simulate foxes and rabbits interacting with one another on an island and printing out the populations of each respective animal after each day. I know the equations are correct, the problem I am having is my loop will only run once for a large range.
My code:
def run_simulation():
    print()
    RABBIT_BIRTH_RATE = 0.01
    FOX_BIRTH_RATE = 0.005
    INTERACT = 0.00001
    SUCCESS = 0.01
    x = 0
    y = 1
    FOXES = eval(input("Enter the initial number of foxes: "))
    print()
    RABBITS = eval(input("Enter the initial number of rabbit: "))
    print()
    DAYS = eval(input("Enter the number of days to run the simulation: "))
    print()
    print("Day\t","Rabbits\t","Foxes\t")
    print(0,"\t",RABBITS,"\t","\t",FOXES,"\t")
    for i in range(DAYS):
        RABBITS_START = round((RABBIT_BIRTH_RATE * RABBITS) - (INTERACT * RABBITS * FOXES))
        FOXES_START = round((INTERACT * SUCCESS * RABBITS * FOXES) - (FOX_BIRTH_RATE * FOXES))
        y = y + x
        print (y,"\t",(RABBITS_START+RABBITS),"\t","\t",(FOXES_START+FOXES),"\t")
run_simulation()
When this is run with an example of 500 Foxes, 10000 Rabbits, and 1200 days, my output will look like
Day Rabbits Foxes
0 10000 500
1 10050 498
With the second output line repeating the remaining 1199 times.
Any help would be greatly appreciated I cannot figure out what I am doing wrong.
You set RABBITS and RABBIT_BIRTH_RATE at the beginning. Then, on every loop iteration, you set RABBITS_START to some formula involving these two numbers. You never change the value of RABBITS or RABBIT_BIRTH_RATE or FOXES or anything, so every time you run through the loop, you're just calculating the same thing again with the same numbers. You need to update the values of your variables on each iteration --- that is, set a new value for RABBITS, FOXES, etc.
The biggest issue for me is what you named your "change in rabbits/foxes". RABBITS_START sounds like an initial count for RABBITS, but it's not. This is why I renamed it to RABBITS_DELTA, because really it's calculating the CHANGE in rabbits for each day.
I think I got it. At the very least this behaves more like a simulation now:
def run_simulation():
    RABBIT_BIRTH_RATE = 0.01
    FOX_BIRTH_RATE = 0.005
    INTERACT = 0.00001
    SUCCESS = 0.01
    # int() is safer than eval() for reading whole numbers
    FOXES = int(input("Enter the initial number of foxes: "))
    RABBITS = int(input("Enter the initial number of rabbits: "))
    DAYS = int(input("Enter the number of days to run the simulation: "))
    print("Day\t","Rabbits\t","Foxes\t")
    print(0,"\t",RABBITS,"\t","\t",FOXES,"\t")
    count = 0
    while count < DAYS:
        RABBITS_DELTA = round((RABBIT_BIRTH_RATE * RABBITS)
                              - (INTERACT * RABBITS * FOXES))
        FOXES_DELTA = round((INTERACT * SUCCESS * RABBITS * FOXES)
                            - (FOX_BIRTH_RATE * FOXES))
        # apply the daily change so the next iteration sees new populations
        RABBITS += RABBITS_DELTA
        FOXES += FOXES_DELTA
        count += 1
        # count doubles as the day number (y = y + x never advanced, since x was 0)
        print (count,"\t",(RABBITS),"\t","\t",(FOXES),"\t")
run_simulation()
I'm going to take a wild stab at trying to interpret what you mean:
for i in range(1, DAYS + 1):
    rabbit_delta = ... # RABBITS_START
    fox_delta = ... # FOXES_START
    RABBITS += rabbit_delta
    FOXES += fox_delta
    print(i, "\t", RABBITS, "\t\t", FOXES, "\t")
edited based on others' answers. (Wild stab is less wild.)
See BrenBarn's answer for an explanation in prose.
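Putting the answers together, a self-contained sketch of the update step might look like the following. It takes the starting populations as arguments instead of reading input, and returns the day-by-day history so it can be checked; the function name `simulate` and the list-of-tuples return value are my own choices, while the constants and formulas are the ones from the question.

```python
def simulate(rabbits, foxes, days):
    """Return a list of (day, rabbits, foxes) tuples, starting with day 0."""
    RABBIT_BIRTH_RATE = 0.01
    FOX_BIRTH_RATE = 0.005
    INTERACT = 0.00001
    SUCCESS = 0.01
    history = [(0, rabbits, foxes)]
    for day in range(1, days + 1):
        # Both deltas are computed from the current populations...
        rabbit_delta = round(RABBIT_BIRTH_RATE * rabbits
                             - INTERACT * rabbits * foxes)
        fox_delta = round(INTERACT * SUCCESS * rabbits * foxes
                          - FOX_BIRTH_RATE * foxes)
        # ...and then applied, so the next day starts from updated values.
        rabbits += rabbit_delta
        foxes += fox_delta
        history.append((day, rabbits, foxes))
    return history


print("Day\t", "Rabbits\t", "Foxes\t")
for day, rabbits, foxes in simulate(10000, 500, 5):
    print(day, "\t", rabbits, "\t\t", foxes, "\t")
```

The first simulated day reproduces the 10050 rabbits and 498 foxes from the question's output; after that the numbers keep changing because the populations are reassigned on every pass through the loop.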