How do I use modular expression / work with large integers - Python

I want to make a program that calculates the population after x years,
where the population in 2002 is 6.2 billion people and increases by 1.3% each year.
The formula I will use is
population = ((1.013)**x) * 6.2B
How do I make 6.2B easier to work with?

Here is your code. Read and learn well. This is probably a problem that you could have solved with Google.
import math
def calculate_population(years_since_2002):  # the original calculation
    population_2002 = 6.2 * 10**9
    final_population = int((1.013 ** years_since_2002) * population_2002)
    return final_population
def pretty_print(num, trunc=0):
    multiplier = int(math.log10(num))  # finds the power of 10
    remainder = float(num) / (10 ** multiplier)  # finds the float after
    str_remainder = str(remainder)
    if trunc != 0:
        # slice the string (not the float); +1 accounts for the decimal point
        str_remainder = str_remainder[:trunc + 1]  # truncates to trunc digits total
    return str_remainder + 'e' + str(multiplier)  # can also be print
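As a possible alternative to hand-rolled truncation, Python's float literals and built-in string formatting can carry most of the load. The sketch below is only an illustration: the function names `population_after`, `format_sci` and `format_grouped` are made up here, and the default values simply restate the 6.2 billion and 1.3% from the question.

```python
def population_after(years, base=6.2e9, rate=0.013):
    """Projected population `years` after the base year (compound growth).

    6.2e9 is just a float literal for 6.2 billion.
    """
    return base * (1.0 + rate) ** years


def format_sci(value, digits=3):
    """Format value in scientific notation with `digits` significant digits."""
    return "{:.{}e}".format(value, digits - 1)


def format_grouped(value):
    """Format value as an integer with thousands separators."""
    return "{:,}".format(int(value))


print(format_sci(population_after(10)))     # scientific notation
print(format_grouped(population_after(0)))  # 6,200,000,000
```

Writing the constant as `6.2e9` keeps the formula readable, and the format specifiers avoid slicing strings by hand.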

Related

Monte Carlo simulation of Birthday paradox in python 3

The birthday paradox assumes that everyone has an equal probability of having a birthday on any of the 365 days of the year. We start adding people to a room. What is the probability that 2 people have birthdays on the same day, as a function of the number of people in the room? The code I wrote is as follows:
import numpy as np
import matplotlib.pyplot as plt
x=[0]
y=[0]
for j in range(1000):
    if j!=0:
        freq = []
        L1 = list(np.random.randint(low = 1, high=366, size = j))
        result = list((i, L1.count(i)) for i in L1)
        for a_tuple in result:
            freq.append(a_tuple[1])
        print(freq)
        rep = j - freq.count(1)
        prob = rep/j
        y = y + [prob]
        x = x + [j]
        print(prob)
plt.plot(x,y)
Here, in L1 = list(np.random.randint(low = 1, high=366, size = j)) I select the day on which someone would have a birthday and in result = list((i, L1.count(i)) for i in L1) I calculate the frequency of birthdays on each day. The entire thing is looped over to account for increasing number of people.
In the following for loop, I isolate the unique events and find repetitions and store the value in rep.
Next I calculated the probability as fraction of people sharing birthdays and plotted them as a function of number.
However, the question requires me to find the probability of just one shared birthday. How do I calculate that? I think I have to loop this entire thing over a number of trials, but that would just give a more accurate result with less variation from the same program. I think my program currently gives the fraction of people who share a birthday.
Birthday problem Wikipedia for better reference
NOTE
I assume that when n persons have been in the room, they are all thrown out of the room and then n+1 persons enter the room.
========================================
I would think of it this way;
First, set probs = [0]*366. Now, say 2 persons get in the room - we then write their birthdays onto a piece of paper and check if those two dates are equal. If they are, we increase probs[2] by 1 (yes, there are some indexes that we don't need, and Python is 0-indexed etc., but to keep it simple).
Now do the same for 3 persons, for 4 persons, for 5 persons ... all the way up to 365.
Your array might look something like probs==[0,0,0,0,0,1,0,1,1,0,1,1,1,1,0,1....].
You can now start over from 2 persons (still keeping the same array as before, i.e. don't create a new one with 0's!), then 3 persons etc., and start over 1000 times. Your array might look like
probs==[0,0,2,0,4,1,5,2,9,12,10,17....,967,998]
If you divide that array by 1000 (elementwise) you now have your simulated probability as a function of n persons.
import numpy as np
import matplotlib.pyplot as plt
N_TOTAL_PERS = 366
N_SIM = 10000 #number of simulations
counts = np.zeros(N_TOTAL_PERS)
for _ in range(N_SIM):
    for n in range(2,N_TOTAL_PERS):
        b_days = np.random.randint(1,366,size=n) #Get each persons birth-day
        counts[n] += len(b_days) != len(set(b_days)) #Increment if some birthdays are equal
total_probs = counts/N_SIM #convert to probabilities
total_probs[70] #Get the probability when 70 persons are together (0.9988)
plt.plot(range(N_TOTAL_PERS),total_probs)
which generates a plot of the simulated probability against the number of persons in the room.
You should run multiple experiments for different number of people in the room. Note that for N_people > 365, the probability should compute equal to 1.
Refactoring your code, and changing the logic a bit, I came up with the following:
import numpy as np
import matplotlib.pyplot as plt
def random_birthdays(n_people):
    return list(np.random.randint(low=1, high=366, size=n_people))
def check_random_room(n_people):
    """
    Generates a random sample of `n_people` and checks if at least two of them
    have the same birthday
    """
    birthdays = random_birthdays(n_people)
    return len(birthdays) != len(set(birthdays))
def estimate_probability(n_people, n_experiments):
    results = [check_random_room(n_people) for _ in range(n_experiments)]
    return sum(results)/n_experiments
N_EXPERIMENTS = 1000
x = list(range(1, 400))
y = [estimate_probability(x_i, N_EXPERIMENTS) for x_i in x]
plt.plot(x, y)
plt.show()
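As a sanity check on a simulated curve like this, the exact probability can be computed directly as one minus the probability that all birthdays are distinct. This is only a sketch under the same assumption of 365 equally likely days; the function name is made up for illustration.

```python
def exact_shared_birthday_probability(n_people, n_days=365):
    """Probability that at least two of n_people share a birthday."""
    if n_people > n_days:
        # Pigeonhole principle: more people than days forces a shared birthday
        return 1.0
    p_all_distinct = 1.0
    for k in range(n_people):
        # The (k+1)-th person must avoid the k days already taken
        p_all_distinct *= (n_days - k) / float(n_days)
    return 1.0 - p_all_distinct


print(exact_shared_birthday_probability(23))  # a little above one half
```

Plotting these values on top of the Monte Carlo estimates should show the simulated points scattering around the exact curve, with less scatter as the number of experiments grows.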

How to convert bar count to time, in midi? (music)

Given a MIDI file, how can one convert the bar count to time?
More generally, how can one easily map the bar count, in whole numbers, to the time in seconds in the song?
Using pretty_midi, this is my solution:
import pretty_midi as pm
def get_bar_to_time_dict(song):
    def get_numerator_for_sig_change(signature_change):
        # since sometimes the pretty_midi beat counts are weird
        if int(signature_change.numerator) == 6 and int(signature_change.denominator) == 8:
            # 6/8 goes to 2 for sure
            return 2
        return signature_change.numerator
    # we have to take into account time-signature changes
    changes = song.time_signature_changes
    beats = song.get_beats()
    bar_to_time_dict = dict()
    # first bar is on first position
    current_beat_index = 0
    current_bar = 1
    bar_to_time_dict[current_bar] = beats[current_beat_index]
    for index_time_sig, _ in enumerate(changes):
        numerator = get_numerator_for_sig_change(changes[index_time_sig])
        # keep adding to dictionary until the time signature changes, or we are in the last change, in that case iterate till end of beats
        while index_time_sig == len(changes) - 1 or beats[current_beat_index] < changes[index_time_sig + 1].time:
            # advance one bar at a time, i.e. in steps of numerator beats
            current_beat_index += numerator
            if current_beat_index > len(beats) - 1:
                # we labeled all beats so end function
                return bar_to_time_dict
            current_bar += 1
            bar_to_time_dict[current_bar] = beats[current_beat_index]
    return bar_to_time_dict
song = pm.PrettyMIDI('some_midi_file.midi')
get_bar_to_time_dict(song)
If anyone knows a function in pretty midi or music21 that solves the same issue please let me know, couldn't find one.
EDIT: There was also an issue with 6/8 beats, I think this covers all edge cases(not 100% sure)
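One thing that may be worth trying: pretty_midi has a `PrettyMIDI.get_downbeats()` method that returns downbeat times based on the tempo and time signature changes in the file. Assuming a downbeat corresponds to the start of a bar in your files, the mapping could be sketched as below. The helper names are made up for illustration, and I have not checked how it treats the 6/8 case mentioned above.

```python
def bars_to_times(downbeats):
    """Map 1-based bar numbers to start times in seconds.

    `downbeats` is any sequence of downbeat times, one per bar.
    """
    return {bar: float(t) for bar, t in enumerate(downbeats, start=1)}


def bar_to_time_dict_from_file(path):
    """Hypothetical wrapper: load a MIDI file and map bars to times."""
    import pretty_midi  # imported lazily so bars_to_times works without it
    song = pretty_midi.PrettyMIDI(path)
    return bars_to_times(song.get_downbeats())


# Works on plain numbers too, which makes the mapping easy to check
print(bars_to_times([0.0, 2.0, 4.0]))
```

Keeping the mapping separate from the file loading means it can be compared against the hand-written loop on the same beat data.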

Trying to calculate EMA using Python and I can't figure out why my code is always producing the same result

I am trying to calculate an exponential moving average of bitcoin in python2.7 but my result is always the same value and I have no idea why.
def calcSMA(data,counter,timeframe):
    closesum = 0
    for i in range(timeframe):
        closesum = float(closesum) + float(data[counter-i])
    return float(closesum / timeframe)
def calcEMA(price,timeframe,prevema):
    multiplier = float(2/(timeframe+1))
    ema = ((float(price) - float(prevema))*multiplier) + float(prevema)
    return float(ema)
counter = 0
closeprice = [7242.4,7240,7242.8,7253.8,7250.6,7255.7,7254.9,7251.4,7234.3,7237.4
    ,7240.7,7232,7230.2,7232.2,7236.1,7230.5,7230.5,7230.4,7236.4]
while counter < len(closeprice):
    if counter == 3:
        movingaverage = calcSMA(closeprice,counter,3)
        print movingaverage
    if counter > 3:
        movingaverage = calcEMA(closeprice[counter],3,movingaverage)
        print movingaverage
    counter +=1
This is how to calculate the EMA:
{Close - EMA(previous day)} x multiplier + EMA(previous day)
You seed the formula with a simple moving average.
Doing this in Excel works, so might it be my use of variables?
I would be really glad if someone could tell me what I am doing wrong, because I have failed on this simple problem for hours and can't figure it out. I've tried storing my previous EMA in a separate variable, and I even stored all of them in a list, but I always get the same value at every timestep.
The expression 2/(timeframe+1) is always zero, because the components are all integers and therefore Python 2 uses integer division. Wrapping that result in float() does no good; you just get 0.0 instead of 0.
Try 2.0/(timeframe+1) instead.
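A minimal sketch of the same calculation with the float multiplier, written so it behaves the same on Python 2 and Python 3. The function names are made up for illustration, and note one assumption that differs from the question: this version seeds the EMA with the SMA of the first `timeframe` prices.

```python
def calc_sma(data, end_index, timeframe):
    """Simple moving average of the `timeframe` values ending at end_index."""
    window = data[end_index - timeframe + 1:end_index + 1]
    return sum(window) / float(timeframe)


def calc_ema(price, timeframe, prev_ema):
    """One EMA step: (close - previous EMA) * multiplier + previous EMA."""
    multiplier = 2.0 / (timeframe + 1)  # 2.0 forces float division
    return (price - prev_ema) * multiplier + prev_ema


def ema_series(prices, timeframe):
    """EMA values, seeded with the SMA of the first `timeframe` prices."""
    ema = calc_sma(prices, timeframe - 1, timeframe)
    series = [ema]
    for price in prices[timeframe:]:
        ema = calc_ema(price, timeframe, ema)
        series.append(ema)
    return series


print(ema_series([1.0, 2.0, 3.0, 5.0, 7.0], 3))  # [2.0, 3.5, 5.25]
```

With a timeframe of 3 the multiplier is 0.5, so each new value lands halfway between the previous EMA and the latest price, which is easy to verify by hand.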

Python Set Birthday

So I am trying to make a program that estimates the probability that a bunch of people in a room have the same birthday... I can't figure out how to create the function. Here is what I have so far:
def birthday():
    mySet = set()
    x = 1
    for item in mySet:
        if item in mySet:
            return x
        else:
            mySet().append() # don't know what to do here.
Edit:
Alright, so what I am trying to accomplish is to make a function using a set that stores birthdays as numbers 1 through 365. For example, if you randomly pick a room with 30 people in it, they may not have the same birthday. However, if you have twins in the same room, you only need 2 people in the room to have the same birthday. So eventually I want a parameter that runs this function several times and averages the results. Unfortunately I can't figure out how to make this. I want x to be a counter of how many people are in the room, and when there is a match the loop stops. I also don't know what to append to.
Is there a reason why you're trying to simulate this rather than using the closed form solution to this problem? There's a pretty decent approximation that's fast and easy to code:
import math
def closed_form_approx_birthday_collision_probability(num_people):
    return 1 - math.exp(-num_people * (num_people - 1) / (2 * 365.0))
You could also implement a very good "exact" solution (in quotes because some fidelity is lost when converting to float):
import operator
import functools
import fractions
def slow_fac(n):
    return functools.reduce(operator.mul, range(2, n+1), 1)
def closed_form_exact_birthday_collision_probability(num_people):
    p_no_collision = fractions.Fraction(slow_fac(365), 365 ** num_people * slow_fac(365 - num_people))
    return float(1 - p_no_collision)
To do a simulation, you'd do something like this. I'm using a list rather than a set because the number of possibilities is small and this avoids some extra work that using a set would do:
import random
def birthday_collision_simulate_once(num_people):
    s = [False] * 365
    for _ in range(num_people):
        birthday = random.randint(0, 364)
        if s[birthday]:
            return True
        else:
            s[birthday] = True
    return False
def birthday_collision_simulation(num_people, runs):
    collisions = 0
    for _ in range(runs):
        if birthday_collision_simulate_once(num_people):
            collisions += 1
    return collisions / float(runs)
The numbers I get from the simulation and the closed form solution look similar to the table at http://en.wikipedia.org/wiki/Birthday_problem
>>> closed_form_approx_birthday_collision_probability(20)
0.40580512747932584
>>> closed_form_exact_birthday_collision_probability(20)
0.41143838358058
>>> birthday_collision_simulation(20, 100000)
0.41108
Of course, while the simulation with that many runs is closer to the actual 41.1%, it's much slower to calculate. I'd choose one of the closed form solutions, depending on how accurate it needs to be.
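If the goal is the set-based version described in the question (keep adding people until a birthday repeats, then average the head count over many rooms), a rough sketch could look like this. The function names and the `seed` parameter are made up for illustration. Note that a set uses `add`, not `append`.

```python
import random


def people_until_match(rng):
    """Add people one at a time; return the head count at the first repeat."""
    seen = set()
    count = 0
    while True:
        count += 1
        day = rng.randint(1, 365)  # birthdays numbered 1 through 365
        if day in seen:
            return count
        seen.add(day)  # sets use add(), not append()


def average_people_until_match(trials, seed=None):
    """Average head count at the first shared birthday over many rooms."""
    rng = random.Random(seed)
    total = sum(people_until_match(rng) for _ in range(trials))
    return total / float(trials)


print(average_people_until_match(2000))
```

The loop always terminates, because with only 365 possible days a repeat must occur by the 366th person at the latest.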

How to avoid division by zero error, when performing calculations on parsed xml data

Please be kind with your answers; I have been coding for 10 days now. I am having trouble with loops in my code, but I am fairly certain this is because I am getting a traceback.
I parse an xml file obtained from a url, using the following code:
import re
pattern4 = re.compile('title=\'Naps posted: (.*) Winners:')
pattern5 = re.compile('Winners: (.*)\'><img src=')
for row in xmlload1['rows']:
    cell = row["cell"]
    ##### defining the Keys (key is the area from which data is pulled in the XML) for use in the pattern finding/regex
    user_delimiter = cell['username']
    selection_delimiter = cell['race_horse']
    ##### the use of float here makes sure the strike rate calculation returns a decimal, otherwise Python 2 integer division truncates the result!
    user_numberofselections = float(re.findall(pattern4, user_delimiter)[0])
    user_numberofwinners = float(re.findall(pattern5, user_delimiter)[0])
    strikeratecalc1 = user_numberofwinners/user_numberofselections
    strikeratecalc2 = strikeratecalc1*100
    ##### Printing the results of the code at hand
    print "number of selections = ",user_numberofselections
    print "number of winners = ",user_numberofwinners
    print "Strike rate = ",strikeratecalc2,"%"
    print ""
getData()
This code with the rest of the code returns:
number of selections = 112.0
number of winners = 21.0
Strike rate = 18.75 %
number of selections = 146.0
number of winners = 21.0
Strike rate = 14.3835616438 %
number of selections = 163.0
number of winners = 55.0
Strike rate = 33.7423312883 %
Now the results of this xmlload suggest that there are only three users to parse; however, there is a 4th whose data would read
number of selections = 0
number of winners = 0
Strike rate = 0
For my purposes it is not necessary to pull in user stats for those with no track record. How do I make the code either skip users with 0 selections, or at least keep the division by zero error from stopping the loop?
Kind regards!
Just use a continue when you find a 0.
user_numberofselections = float(re.findall(pattern4, user_delimiter)[0])
user_numberofwinners = float(re.findall(pattern5, user_delimiter)[0])
# the divisor is the number of selections, so if it is 0, go to the next row to avoid division by 0
if user_numberofselections == 0.0:
    continue
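To see the skip in isolation, here is a small self-contained sketch. The sample strings and the simplified regular expressions are assumptions made up for illustration (the real titles in the XML may look different), and `strike_rates` is a hypothetical helper name.

```python
import re

# Simplified patterns; the real markup in the feed is an assumption here
SELECTIONS = re.compile(r"Naps posted: (\d+) Winners:")
WINNERS = re.compile(r"Winners: (\d+)")


def strike_rates(titles):
    """Return strike rates in percent, skipping users with 0 selections."""
    rates = []
    for title in titles:
        selections = float(SELECTIONS.search(title).group(1))
        winners = float(WINNERS.search(title).group(1))
        if selections == 0:
            # No track record: skip the row instead of dividing by zero
            continue
        rates.append(winners / selections * 100)
    return rates


# Made-up sample rows, including one user with no selections
sample = [
    "title='Naps posted: 112 Winners: 21'",
    "title='Naps posted: 0 Winners: 0'",
]
print(strike_rates(sample))  # [18.75]
```

Checking the divisor also keeps users who have selections but no winners yet, whose strike rate is a legitimate 0%.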
